Next: The user interface Up: AUTHIDX: An Author/Editor Indexing Previous: The book bibliography problem

Preparing to solve the problem

To understand how I solved the problem, it is useful to review how citations in the running text eventually produce a reference to a bibliographic entry. While the description here assumes (LA)TEX and BIBTEX, any document formatting and bibliographic system will have to do something similar.

First, a citation in the running text, such as \cite{Lamport:1994:LDP}, causes a line containing the citation label to be written to an auxiliary file, such as

\citation{Lamport:1994:LDP}

Since the expansion of that citation in the running text may not be known until the bibliographic data has been retrieved, the amount of space that it requires is initially uncertain, and consequently, the typesetting system must make at least two passes over the document before line- and page-breaking decisions can be finalized.
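As a minimal sketch of this cycle, the following document cites one entry and requests a bibliography. The file names sample.tex and mybooks.bib, and the choice of the plain style, are assumptions made for illustration only.

```latex
% sample.tex (hypothetical file name)
\documentclass{article}
\begin{document}
The standard reference is \cite{Lamport:1994:LDP}.

\bibliographystyle{plain}  % any .bst style would do here
\bibliography{mybooks}     % assumes a database file mybooks.bib
\end{document}
% Typical run order:
%   latex sample    writes \citation{...} lines to sample.aux
%   bibtex sample   reads sample.aux, writes sample.bbl
%   latex sample    reads sample.bbl, records the citation labels
%   latex sample    resolves the labels in the running text
```

On the first pass the citation is typeset as an unresolved placeholder, which is why the space it occupies cannot be known until the later passes.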

Second, when the bibliographic software reads the auxiliary file, it must locate the requested citations in one or more bibliographic database files, and then format them for inclusion in the bibliography according to the specified style. For BIBTEX, this might mean that the database entry

@Book{Lamport:1994:LDP,
  author =          "Leslie Lamport",
  title =           "{\LaTeX}: {A} Document
                     Preparation System: User's
                     Guide and Reference
                     Manual",
  publisher =       pub-AW,
  edition =         "Second",
  pages =           "xvi + 272",
  year =            "1994",
  ISBN =            "0-201-52983-1",
  LCCN =            "Z253.4.L38L35 1994",
  acknowledgement = ack-nhfb,
  bibdate =         "Wed Aug 10 09:55:59 1994",
}


is retrieved and reformatted in an output bibliography file as

\bibitem{Lamport:1994:LDP}
Leslie Lamport.
\newblock {\em {\LaTeX}: {A} Document
Preparation System: User's Guide and
Reference Manual}.
Reading, MA, USA, second edition, 1994.
\newblock \showISBN{0-201-52983-1}.
\newblock xvi + 272 pp.
\newblock \showLCCN{Z253.4.L38L35 1994}.

which is finally typeset by LATEX in the form shown in the bibliography at the end of this document.
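The \showISBN and \showLCCN commands in that output are not part of the LATEX kernel, so the document or the bibliography style must define them before the bibliography file is read. The definitions below are only an assumed fallback for illustration; the actual macros used for this document may format their arguments differently, or suppress them entirely.

```latex
% Assumed fallback definitions, placed in the document preamble.
% \providecommand leaves any existing definition untouched.
\providecommand{\showISBN}[1]{ISBN~#1}
\providecommand{\showLCCN}[1]{LCCN~#1}
```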

Several modifications of the citation process are required for our book indexing project:

• each citation recorded in the auxiliary file must also include its page number;
• the page numbers must not interfere with the bibliographic data extraction;
• the page number lists for each cited publication must be collected, sorted, and merged to remove duplicates, and then included in the corresponding bibliography entry;
• the authors and/or editors of each cited entry must be extracted and added to an auxiliary index file;
• the author/editor index file must be sorted, merged, and formatted for inclusion in an index.
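The first two requirements can be sketched in LATEX as follows. This is an illustration under stated assumptions, not the implementation described in this paper: the names \pagecitation and \demo@... are hypothetical, as is the database file mybooks.bib. The sketch relies on BIBTEX ignoring auxiliary-file lines that it does not recognize, so the extra lines do not disturb the \citation entries.

```latex
\documentclass{article}
\makeatletter
% \pagecitation is a hypothetical name. It is defined as a no-op so
% that LaTeX ignores these lines when it reads the .aux file back in.
\newcommand*{\pagecitation}[2]{}
% Write the label(s) and page number to the .aux file. \protected@write
% postpones the expansion of \thepage until the page is shipped out.
\newcommand*{\demo@record}[1]{%
  \protected@write\@auxout{}{\string\pagecitation{#1}{\thepage}}}
% Save the original \cite, then wrap it, preserving the optional argument.
\let\demo@cite\cite
\renewcommand*{\cite}{\@ifnextchar[{\demo@citeopt}{\demo@citeplain}}
\def\demo@citeopt[#1]#2{\demo@record{#2}\demo@cite[#1]{#2}}
\def\demo@citeplain#1{\demo@record{#1}\demo@cite{#1}}
\makeatother
\begin{document}
See \cite{Lamport:1994:LDP}.

\bibliographystyle{plain}
\bibliography{mybooks}  % assumes a database file mybooks.bib
\end{document}
```

The wrapper makes \cite fragile, so a citation inside a caption or sectional heading would need \protect. A \cite with several comma-separated labels is recorded as one list, which a later processing step would have to split. Collecting, sorting, and merging the page lists, and extracting the author/editor names, are left to that later step.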

For (LA)TEX, access to page numbers is complicated by the asynchronous nature of TEX's page-breaking algorithm: the output routine can be called at any time, either implicitly or explicitly, and it may choose to delay some of the accumulated potential output material until the next page. Thus, TEX does not have a single variable that reliably records the 'current page number'. Fortunately, TEX's author saw the need for this, and provided a \write command whose argument is not evaluated until the output routine has made its page-breaking decision and the text is sent to the output file [5, pp. 215, 217]. Thus, the same mechanism that is used for writing page numbers in table-of-contents, list-of-figures, list-of-tables, and index files can be used for recording the citation page numbers.
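The difference between immediate and deferred writing can be seen in a small experiment. This is a sketch only: the stream name \pagelog and the file extension .pgs are hypothetical choices.

```latex
\documentclass{article}
\newwrite\pagelog                        % hypothetical output stream
\immediate\openout\pagelog=\jobname.pgs  % hypothetical file extension
\begin{document}
Some running text.
% Expanded at once: records the page counter as it stands while the
% paragraph is being read, before any page break has been chosen.
\immediate\write\pagelog{immediate: \thepage}%
% Stored unexpanded with the surrounding text; \thepage is expanded
% only when the page that contains this point is shipped out.
\write\pagelog{deferred: \thepage}%
More running text.
\end{document}
```

In a one-page document both lines record the same number. They can differ when the surrounding paragraph is read while the counter still holds the previous page's value, but the text ends up on the following page.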

Nelson H. F. Beebe
7/11/1998