Copyright (C) 1991 Tom Horsley
Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies.
Permission is granted to copy and distribute modified versions of this manual under the conditions for verbatim copying, provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one.
Permission is granted to copy and distribute translations of this manual into another language, under the above conditions for modified versions, except that this permission notice may be stated in a translation.
An ID database is simply a file containing a list of file names, a list of
identifiers, and a binary relation (stored as a bit matrix) indicating which
of the identifiers appear in each file. With this database and some tools
to manipulate the data, a host of tasks become simpler and faster. You can
grep through hundreds of files for a name, skipping the files that
don't contain the name. You can search for all the memos containing
references to a project. You can edit every file that calls some function,
adding a new required argument. Anyone with a large software project to
maintain, or a large set of text files to organize, can benefit from the ID
database and the tools that manipulate it.
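As an illustration only, the structure described above (a list of file names, a list of identifiers, and a bit matrix relating them) can be sketched in a few lines of Python. This is a simplified in-memory model and not mkid's actual on-disk format or scanner. The function names `build_id_database` and `files_containing` are hypothetical, and the regular expression used to recognize identifiers is an assumption.

```python
import bisect
import re

# Assumed identifier syntax for this sketch (C-like names).
IDENT_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def build_id_database(files):
    """Build a toy ID database from a dict of {file name: file text}.

    Returns (file_names, identifiers, matrix), where matrix[i] is an
    integer used as a bit row: bit j is set when identifiers[i]
    appears in file_names[j].
    """
    file_names = sorted(files)
    tokens = {name: set(IDENT_RE.findall(files[name])) for name in file_names}
    identifiers = sorted(set().union(*tokens.values()))
    matrix = []
    for ident in identifiers:
        row = 0
        for j, name in enumerate(file_names):
            if ident in tokens[name]:
                row |= 1 << j
        matrix.append(row)
    return file_names, identifiers, matrix

def files_containing(db, ident):
    """Return the names of the files whose bit is set for ident."""
    file_names, identifiers, matrix = db
    i = bisect.bisect_left(identifiers, ident)
    if i == len(identifiers) or identifiers[i] != ident:
        return []  # identifier not in the database
    row = matrix[i]
    return [name for j, name in enumerate(file_names) if (row >> j) & 1]

db = build_id_database({
    "a.c": "int foo(void) { return bar; }",
    "b.c": "int bar;",
    "memo.txt": "project notes",
})
print(files_containing(db, "bar"))
```

A lookup only touches one row of the matrix, which is why a search tool can skip every file whose bit is clear without opening it.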
There are several programs in the ID family. The mkid program
scans the files, finds the identifiers and builds the ID database. The
lid and aid tools are used to generate lists of file names
containing an identifier (perhaps to recompile every file that
references a macro which just changed). The
eid program will
invoke an editor on each of the files containing an identifier and the
gid program will
grep for an identifier in the subset of
files known to contain it. The
pid tool is used to query the
path names of the files in the database (rather than the contents).
The iid tool is an interactive program supporting
complex queries to intersect and join sets of file names.
Greg McGary conceived of the ideas behind mkid when he began hacking
the UNIX kernel in 1984. He needed a navigation tool to help him find
his way around the expansive, unfamiliar landscape. The first mkid-like tools
were built with shell scripts, and produced an ASCII database that looks
much like the output of `lid' with no arguments. It took over an hour
on a VAX 11/750 to build a database for a 4.1BSDish kernel. Lookups were
done with the UNIX command
look, modified to handle very long lines.
In 1986, Greg rewrote mkid, lid, fid and idx in C to improve performance. Database-build times were shortened by an order of magnitude. The mkid tools were first posted to `comp.sources.unix' in September of 1987.
Over the next few years, several versions diverged from the original
source. Tom Horsley at Harris Computer Systems Division stepped forward
to take over maintenance and integrated some of the fixes from divergent
versions. He also wrote the
iid program. A pre-release of
mkid2 was posted to `alt.sources' near the end of 1990. At
that time Tom wrote this texinfo manual with the encouragement of the net
community. (Tom thanks Doug Scofield and Bill Leonard whom I dragooned
into helping me poorf raed and edit -- they found several problems in
the initial version.)
In January, 1995, Greg McGary reemerged as the primary maintainer and is
releasing mkid-3, whose primary new feature is an efficient
algorithm for building databases that is linear over the size of the
input text for both time and space. (The old algorithm was quadratic
for space and choked on very large source trees.) The code is now under
GPL and might become a part of the GNU system.
Mkid-3 is an
interim release, since several significant enhancements are in the works.
These include an optional coupling with GNU grep, so that grep can use
an ID database for hints; a cscope work-alike query interface;
incremental update of the ID database; and an automatic file-tree walker
so you need not explicitly supply every file name argument to