
Copyright (C) 1991 Tom Horsley

Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies.

Permission is granted to copy and distribute modified versions of this manual under the conditions for verbatim copying, provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one.

Permission is granted to copy and distribute translations of this manual into another language, under the above conditions for modified versions, except that this permission notice may be stated in a translation.

Overview

An ID database is simply a file containing a list of file names, a list of identifiers, and a binary relation (stored as a bit matrix) indicating which of the identifiers appear in each file. With this database and some tools to manipulate the data, a host of tasks become simpler and faster. You can grep through hundreds of files for a name, skipping the files that don't contain the name. You can search for all the memos containing references to a project. You can edit every file that calls some function, adding a new required argument. Anyone with a large software project to maintain, or a large set of text files to organize, can benefit from the ID database and the tools that manipulate it.
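The three parts described above (file list, identifier list, bit matrix) can be sketched in a few lines of Python. This is only an illustration of the idea, not the on-disk format mkid writes. The names `IDENT`, `build_db` and `files_containing` are hypothetical, and the regular expression assumes C-style identifiers.

```python
import re

# Assumption: an "identifier" is a C-style name. The real scanner may differ.
IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_db(sources):
    """Build (files, idents, matrix) from a {file_name: text} mapping.

    matrix[i] is an integer used as one row of the bit matrix:
    bit j is set when idents[i] appears in files[j].
    """
    files = sorted(sources)
    rows = {}
    for j, name in enumerate(files):
        # set() so each identifier is recorded once per file
        for ident in set(IDENT.findall(sources[name])):
            rows[ident] = rows.get(ident, 0) | (1 << j)
    idents = sorted(rows)
    matrix = [rows[ident] for ident in idents]
    return files, idents, matrix


def files_containing(db, ident):
    """Return the names of the files whose bit is set in the row for ident."""
    files, idents, matrix = db
    if ident not in idents:
        return []
    row = matrix[idents.index(ident)]
    return [name for j, name in enumerate(files) if (row >> j) & 1]
```

With a database built this way, a search for a name only needs to open the files returned by `files_containing`, which is how the tools can skip files that cannot match.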

There are several programs in the ID family. The mkid program scans the files, finds the identifiers and builds the ID database. The lid and aid tools are used to generate lists of file names containing an identifier (perhaps to recompile every file that references a macro which just changed). The eid program will invoke an editor on each of the files containing an identifier and the gid program will grep for an identifier in the subset of files known to contain it. The pid tool is used to query the path names of the files in the database (rather than the contents). Finally, the iid tool is an interactive program supporting complex queries to intersect and join sets of file names.
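The set operations behind such queries can be sketched as follows. This does not show the query syntax of iid. The function names `intersect` and `join` are hypothetical, and "join" is assumed here to mean set union.

```python
def intersect(*file_lists):
    """Files that appear in every list (all of the identifiers occur)."""
    if not file_lists:
        return []
    result = set(file_lists[0])
    for names in file_lists[1:]:
        result &= set(names)
    return sorted(result)


def join(*file_lists):
    """Files that appear in at least one list (any of the identifiers occur)."""
    result = set()
    for names in file_lists:
        result |= set(names)
    return sorted(result)
```

For example, if one identifier is found in `a.c` and `b.c` and another in `b.c` and `c.c`, intersecting the two lists leaves only `b.c`, while joining them gives all three files.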

History

Greg McGary conceived of the ideas behind mkid when he began hacking the UNIX kernel in 1984. He needed a navigation tool to help him find his way around the expansive, unfamiliar landscape. The first mkid-like tools were built with shell scripts, and produced an ASCII database that looked much like the output of `lid' with no arguments. It took over an hour on a VAX 11/750 to build a database for a 4.1BSDish kernel. Lookups were done with the UNIX command look, modified to handle very long lines.

In 1986, Greg rewrote mkid, lid, fid and idx in C to improve performance. Database-build times were shortened by an order of magnitude. The mkid tools were first posted to `comp.sources.unix' in September of 1987.

Over the next few years, several versions diverged from the original source. Tom Horsley at Harris Computer Systems Division stepped forward to take over maintenance and integrated some of the fixes from divergent versions. He also wrote the iid program. A pre-release of mkid2 was posted to `alt.sources' near the end of 1990. At that time Tom wrote this texinfo manual with the encouragement of the net community. (Tom thanks Doug Scofield and Bill Leonard whom I dragooned into helping me poorf raed and edit -- they found several problems in the initial version.)

In January, 1995, Greg McGary reemerged as the primary maintainer and is hereby launching mkid-3, whose primary new feature is an efficient algorithm for building databases that is linear over the size of the input text for both time and space. (The old algorithm was quadratic for space and choked on very large source trees.) The code is now under GPL and might become a part of the GNU system. Mkid-3 is an interim release, since several significant enhancements are in the works. These include an optional coupling with GNU grep, so that grep can use an ID database for hints; a cscope work-alike query interface; incremental update of the ID database; and an automatic file-tree walker so you need not explicitly supply every file name argument to the mkid program.

