# Overview

This chapter gives an overview of how you create fonts using these programs. It also describes conventions which are common to all the programs.

Throughout this document, we refer to various source files in the implementation. If you can read C programs, you may find these references useful as points of entry into the source when you are confused about some program's behavior, or are just curious.

## Picture

The following picture shows the typical order in which these programs are used, along with their input and output.

GSrenderfont is not in the picture since it is intended for an entirely separate purpose (namely, making bitmaps from PostScript outlines). Fontconvert also has many functions which are not needed for the basic task of font creation from scanned images.

                                             ---------------
                                             | fontconvert |
                                           / ---------------
                                 /--------/         ^
    scanned                     /                   |
    image                      /                    v
    and IFI   -----------    GF   -------------  TFM, GF   --------  BZR
    ========> | imageto | ======> | charspace | =========> | limn | ======...
      ^       -----------         -------------     ^      --------
      |                         /                   |               (continued)
      v                       CMI                   v
    -------------                               --------
    | imgrotate |                               | xbfe |
    -------------                               --------

                                      Metafont source    ------  GF, TFM
                                 |=====================> | mf | =========
     (continued)                 |                       ------
                                 |
          BZR   ---------  TFM,  |   PostScript Type 1 (gsf)
    ... ======> | bzrto |========|=======================
                ---------        |
              /                  |
            CCC                  |   PostScript Type 3 (pf3)
                                 |======================
                                 |
                                 |
                                 |    BPL    ------------  BZR
                                 |=========> | bpltobzr | =====
                                             ------------


## Creating fonts

The previous section described pictorially the usual order in which these programs are used. This section will do the same in words.

Naturally, you may not need to go through all the steps described here. For example, if you are not starting with a scanned image, but already have a bitmap font, then the first step (running Imageto) is irrelevant.

Here is a description of the usual font creation process, starting with a scanned image of a type specimen and ending with fonts which can be used by Ghostscript, TeX, etc.

1. To see what an image I consists of, run Imageto with the `-strips` option. This produces a bitmap font `Isp` in which each character is simply a constant number of scanlines from the image.
2. Run Fontconvert (see section Fontconvert) on `Isp` with the `-tfm` option to produce a TFM file, which the next step requires.
3. Run TeX on `imageto/strips.tex`, telling TeX to use the font `Isp`. This produces a DVI file which you can print or preview as you usually do with TeX documents. (If you don't know how to do this, you'll have to ask someone knowledgeable at your site, or otherwise investigate.) This will (finally) show you what is in the image. An alternative to the above steps is to run Imageto with the `-epsf` option. This outputs an Encapsulated PostScript file with the image given as a simple PostScript bitmap. Then you can use Ghostscript or some other PostScript interpreter to look at the EPS file. This method is simpler, but has the disadvantages of using much more disk space and requiring a PostScript interpreter.
4. If the original was not scanned in the normal orientation, you must rotate the image 90 degrees in some direction and/or flip it end for end to make it upright. (Sometimes we have not scanned in the normal orientation because the physical construction of the book we were scanning made it difficult or impossible.) The program IMGrotate does this, given the `-flip` or `-rotate-clockwise` option. Given an image RI, it outputs the upright image I.
5. Once you have an upright image I, you can use Imageto (see section Imageto) to extract the characters from the image and make a bitmap font `I.dpigf`, where dpi is the resolution of the image in pixels per inch. (If the image itself does not contain the resolution, you must specify it on the command line with `-dpi`.) To do this, you must first prepare an IFI file describing the image. See section IFI files, for a description of IFI files.
6. To view the resulting GF file, run Fontconvert to make a TFM file, as above. Then run TeX on `testfont.tex` and use the `\table` or `\sample` commands to produce a font table. Next, print or preview the DVI file that TeX outputs, as before. This will probably reveal problems in your IFI file, e.g., that not all the characters are present, or that they are not in the right positions. So you need to iterate until the image is correctly processed. `testfont.tex` should have come with your TeX distribution. If for some reason you do not have it, you can use the one distributed in the `data` directory.
7. Once all the characters have been properly extracted from the image, you have a bitmap font. Unlike the above, the following steps all interact with each other, in the sense that fixing problems found at one stage may imply changes in an earlier stage. As a result, you must expect to iterate them several (billion) times. At any rate, given a bitmap font f you then run Charspace (see section Charspace) to add side bearings to f, producing a new bitmap font, say g, and a corresponding TFM file `g.tfm`. To do this, you must prepare a CMI file specifying the side bearings. See section CMI files, for a description of CMI files.
8. To fit outlines to the characters in a bitmap font, run Limn (see section Limn). Given the bitmap font g, it produces the BZR (see section BZR files) outline font `g.bzr`. The side bearings in g are carried along. Although Limn will (should) always be able to fit some sort of outline to the bitmaps, you can get the best results only by fiddling with the (unfortunately numerous) parameters. See section Invoking Limn.
9. To convert from the BZR file `g.bzr` that Limn outputs to a font format that a typesetting program can use, run BZRto (see section BZRto). While developing a font, we typically convert it to a Metafont program (with the `-metafont` option). As you get closer to a finished font, you may want to prepare a CCC file (see section CCC files) to tell BZRto how to construct composite characters (pre-accented A's, for example) to complete the font.
10. Given the font in Metafont form, you can then either make the font at its true size for some device, or make an enlarged version to examine the characters closely. See section Metafont and BZRto, for the full details. Briefly, to do the former, run Metafont with a mode of whatever device you wish (the mode `localfont` will get you the most common local device, if Metafont has been installed properly). Then you can use `testfont.tex` to get a font sample, as described above. To do the latter, run Metafont with no assignment to `mode`. This should get you proof mode. You can then use GFtoDVI to get a DVI file with one character per page, showing you the control points Limn chose for the outlines.
11. Problems can arise at any stage. For example, the character spacing might look wrong; in that case, you should fix the CMI files and rerun Charspace (and all subsequent programs, naturally). Or the outlines might not match the bitmaps very well; then you can change the parameters to Limn, or use XBfe (see section XBfe) to hand-edit the bitmaps so Limn will do a better job. (To eliminate some of the tedium of fixing digitization problems in the scanned image, you might want to use the filtering options in Fontconvert before hand-editing; see section Character manipulation options.) Inevitably, as one problem gets fixed you notice new ones ...

### Font creation example

This section gives a real-life example of font creation for the Garamond roman typeface, which we worked on concomitantly with developing the programs. We started from a scanned type specimen of 30 point Monotype Garamond, scanned using a Xerox 9700 scanner loaned to us from Interleaf, Inc. (Thanks to Paul English and others at Interleaf for this loan.)

1. To begin, we used Imageto as follows to look at the image file we had scanned (see section Viewing an image). Each line is a separate command.

        imageto -strips ggmr.img
        fontconvert -tfm ggmrsp.1200
        echo ggmrsp | tex strips.tex
        xdvi -p 1200 -s 10 strips.dvi

2. Next, we created the file `ggmr.ifi` (distributed in the `data` directory), listing the characters in the order they appeared in the image, guessing at baseline offsets and (if necessary) including bounding box counts. Then we ran Imageto again, this time to get information about the baselines and spurious blotches in the image. We used the `-encoding` option since some of the characters in the image are not in the default ASCII encoding.

        imageto -print-guidelines -print-clean-info -encoding=gnulatin ggmr.img

3. Based on the information gleaned from that run, we decided on the final baselines, adjusted the bounding box counts for broken-up characters, and extracted the font (see section Image to font conversion). (In truth, this took several iterations.) The design size of the original image was stated in the book to be 30pt. We noticed several blotches in the image we needed to ignore, and so we added `.notdef` lines to `ggmr.ifi` as appropriate.

        imageto -verbose -baselines=121,130,120 \
          -designsize=30 -encoding=gnulatin ggmr.img

4. To smooth some of the rough edges caused by the scanner's rasterization errors, we filtered the bitmaps with Fontconvert (see section Fontconvert).

        fontconvert -verbose -gf -tfm -filter-passes=3 -filter-size=3 \
          ggmr30.1200 -output=ggmr30a

5. For a first attempt at intercharacter and interword spacing, we created `ggmr.1200cmi` (also distributed in the `data` directory) and ran Charspace (see section Charspace), producing `ggmr30b.1200gf` and `ggmr30b.tfm`. To see the results, we ran `ggmr30b` through `testfont.tex`, modified the CMI file, reran Charspace, etc., until the output was somewhat reasonable. We didn't try to fine-tune the spacing here, since we knew the following steps would affect the character shapes, which in turn would affect the spacing.

        charspace -verbose -cmi=ggmr.1200cmi ggmr30a.1200 -output=ggmr30b

6. Next we ran `ggmr30b.1200gf`, created by Charspace, through Limn to produce the outline font in BZR form, `ggmr30b.bzr`. We couldn't know what the best values of all the fitting parameters were the first time, so we just increased the ones which are relative to the resolution.

        limn -verbose -corner-surround=4 -filter-surround=6 \
          -filter-alternative-surround=3 -subdivide-surround=6 \
          -tangent-surround=6 ggmr30b.1200

7. Then we converted `ggmr30b.bzr` to a Metafont program using BZRto (see section BZRto), and then ran Metafont to create TFM and GF files we could typeset with (see section Metafont and BZRto). In order to keep the Metafont-generated files distinct from the original TFM and GF files, we used the output stem `ggmr30B`. To see the results at the usual 10pt, we then ran the Metafont output through `sample.tex` (a one-line wrapper for `testfont.tex`: `\input testfont \sample \end`).

        bzrto -verbose -metafont ggmr30b -output=ggmr30B
        mf '\mode:=localfont; input ggmr30B'
        echo ggmr30B | tex sample
        dvips sample

8. This 10pt output looked too small to us. So we changed the design size to 26pt (finding the value took several iterations) with Fontconvert (see section Fontconvert), then reran Charspace, Limn, BZRto, Metafont, etc., as above. We only show the Fontconvert step here; the others have only the filenames changed from the invocations above.

        fontconvert -verbose -gf -tfm -designsize=26 ggmr30b.1200 -output=ggmr26c

9. After this, the real work begins. We ran the Metafont program `ggmr26D.mf` in proof mode, followed by GFtoDVI, so we could see how well Limn did at choosing the control points for the outlines. See section Proofing with Metafont. (`nodisplays` tells Metafont not to bother displaying each character in a window online.)

        mf '\mode:=proof; nodisplays; input ggmr26D'
        gftodvi ggmr26D.3656gf

10. From this, we went and hand-edited the font `ggmr26d.1200gf` with XBfe (see section XBfe), and/or tinkered with the options to Limn, trying to make the outlines reasonable. We still haven't finished ...
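The core of the run above (extract, filter, space, fit, convert, build) can be summarized as a small script that only prints the command lines, so it can be read without the programs installed. This is a sketch, not part of the distribution: the function `pipeline_commands` and its parameters are hypothetical, the options are copied from the invocations above (minus `-verbose` and the per-image baselines), and the assumption that Imageto names its output after the stem plus the design size is taken from the filenames in this example.

```python
import shlex


def pipeline_commands(stem, design_size, dpi, encoding, cmi_file):
    """Return the example's main commands as argument lists (a dry run).

    Assumption: Imageto's output font is named <stem><design_size>, as the
    ggmr.img -> ggmr30.1200 filenames in the example suggest.
    """
    font = f"{stem}{design_size}"
    return [
        # Extract the characters from the scanned image.
        ["imageto", f"-designsize={design_size}", f"-encoding={encoding}",
         f"{stem}.img"],
        # Smooth the bitmaps; version letter `a'.
        ["fontconvert", "-gf", "-tfm", "-filter-passes=3", "-filter-size=3",
         f"{font}.{dpi}", f"-output={font}a"],
        # Add side bearings; version letter `b'.
        ["charspace", f"-cmi={cmi_file}", f"{font}a.{dpi}",
         f"-output={font}b"],
        # Fit outlines, producing <font>b.bzr.
        ["limn", f"{font}b.{dpi}"],
        # Convert the outlines to a Metafont program with a distinct stem.
        ["bzrto", "-metafont", f"{font}b", f"-output={font}B"],
        # Make the font for the local device.
        ["mf", rf"\mode:=localfont; input {font}B"],
    ]


for command in pipeline_commands("ggmr", 30, 1200, "gnulatin", "ggmr.1200cmi"):
    print(shlex.join(command))
```

Printing the commands rather than running them keeps the sketch harmless; in practice each stage is rerun many times with adjusted options, as the steps above describe.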

## Command-line options

Since these programs do not have counterparts on historical Unix systems, they need not conform to an existing interface. We chose to have all the programs use the GNU function getopt_long_only to parse command lines.

As a result, you can give the options in any order, interspersed as you wish with non-option arguments; you can use `-` or `--` to start an option; you can use any unambiguous abbreviation for an option name; you can separate option names and values with either `=` or one or more spaces; and you can use filenames that would otherwise look like options by putting them after the argument `--`.

By convention, all the programs accept only one non-option argument, which is taken to be the name of the main input file.

If a particular option with a value is given more than once, it is the last value which is used.

For example, the following command line specifies the options `foo`, `bar`, and `verbose`; gives the value `abc` to the `baz` option, and the value `xyz` to the `quux` option; and specifies the filename `-myfile-`.

    -foo --bar -verb -baz=abc -quux karl -quux xyz -- -myfile-
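To make these rules concrete, here is a minimal sketch of a parser that follows them. It is not the programs' code (they call the C function getopt_long_only); the function `parse_command_line` and its `flags`/`valued` parameters are hypothetical names used only for illustration.

```python
def parse_command_line(argv, flags, valued):
    """Parse argv following the conventions described above.

    flags  -- names of options that take no value
    valued -- names of options that take a value
    Returns (options, operands); operands are the non-option arguments.
    """
    names = sorted(set(flags) | set(valued))
    options, operands = {}, []
    i = 0
    while i < len(argv):
        arg = argv[i]
        i += 1
        if arg == "--":
            # Everything after a bare `--' is a non-option argument.
            operands.extend(argv[i:])
            break
        if arg == "-" or not arg.startswith("-"):
            operands.append(arg)
            continue
        # An option may start with one hyphen or two.
        body = arg[2:] if arg.startswith("--") else arg[1:]
        name, has_value, value = body.partition("=")
        # An exact name wins; otherwise the abbreviation must be unambiguous.
        if name in names:
            matches = [name]
        else:
            matches = [n for n in names if n.startswith(name)]
        if len(matches) != 1:
            raise ValueError(f"unknown or ambiguous option: {arg}")
        full = matches[0]
        if full in valued:
            if not has_value:
                # The value is the next argument: `-quux xyz'.
                if i == len(argv):
                    raise ValueError(f"option {arg} requires a value")
                value = argv[i]
                i += 1
            options[full] = value  # a later occurrence replaces an earlier one
        elif has_value:
            raise ValueError(f"option {arg} does not take a value")
        else:
            options[full] = True
    return options, operands


options, operands = parse_command_line(
    ["-foo", "--bar", "-verb", "-baz=abc", "-quux", "karl",
     "-quux", "xyz", "--", "-myfile-"],
    flags={"foo", "bar", "verbose"},
    valued={"baz", "quux"},
)
print(options)
# {'foo': True, 'bar': True, 'verbose': True, 'baz': 'abc', 'quux': 'xyz'}
print(operands)
# ['-myfile-']
```

Note that `-verb` is accepted only because no other option name starts with those letters; adding an option named `verbatim` would make it ambiguous.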


### The main input file

By convention, all the programs accept only one non-option argument, which they take to be the name of the main input file.

Usually this is the name of a bitmap font. By their nature, bitmap fonts are for a particular resolution. You can specify the resolution in two ways: with the `-dpi` option (see the next section), or by giving an extension to the font name on the command line.

For example, you could specify the font `foo` at a resolution of 300dpi to the program `program` in either of these two ways (`$` being the shell prompt):

    $ program foo.300
    $ program -dpi=300 foo


You can also say, e.g., `program foo.300gf`, but the `gf` is ignored. These programs always look for a given font in PK format before looking for it in GF format, under the assumption that if both fonts exist, and have the same stem, they are the same.

If the filename is absolute or explicitly relative, i.e., starts with `/` or `./` or `../`, then the programs do not use search paths to look for it, as described in section Font searching. Instead, the fonts are simply searched for in the given directory.
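The following sketch shows one way to read a main input filename according to these rules. The function names are hypothetical, and the precedence of `-dpi` over the extension is our assumption, based on the statement that the extension is only the default source of the resolution.

```python
import re


def split_font_name(name):
    """Split 'foo.300' or 'foo.300gf' into ('foo', 300).

    The resolution is None when the name has no numeric extension.
    """
    match = re.fullmatch(r"(.+)\.([0-9]+)(?:gf)?", name)
    if match is None:
        return name, None
    return match.group(1), int(match.group(2))


def resolve_font(name, dpi_option=None):
    """Combine the filename with an optional -dpi value.

    Assumption: an explicit -dpi value overrides the extension.
    """
    stem, dpi = split_font_name(name)
    if dpi_option is not None:
        dpi = dpi_option
    return stem, dpi


def uses_search_path(name):
    """Absolute and explicitly relative names bypass the search paths."""
    return not name.startswith(("/", "./", "../"))


print(split_font_name("foo.300"))       # ('foo', 300)
print(split_font_name("foo.300gf"))     # ('foo', 300)
print(resolve_font("foo", 300))         # ('foo', 300)
print(uses_search_path("./foo.300"))    # False
```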

### Common options

Certain options are available in all or most of the programs. Rather than repeat identical descriptions in the chapter for each program, we describe them here.

This first table lists common options which do not convey anything about the input. They merely direct the program to print additional output.

`-help`
Prints a usage message listing all available options on standard error. The program exits after doing so.
`-log`
Writes information about everything the program is doing to the file `foo.log`, where foo is the root part of the main input file.
`-verbose`
Prints brief status reports as the program runs, typically the character code of each character as it is processed. This usually goes to standard output; but if the program is outputting other information there, it goes to standard error.
`-version`
Prints the version number of the program on standard output. If a main input file is supplied, processing continues; otherwise, the program exits normally.

This second table lists common options which change the program's behavior in more substantive ways.

`-dpi dpi`
Look for the main input font at a resolution of dpi pixels per inch. The default is to infer the information from the main input filename (see section The main input file).
`-output-file fname`
Write the main output of the program to fname. If fname has a suffix, it is used unchanged; otherwise, it is extended with some standard suffix, such as `resolutiongf`. Unless fname is an absolute or explicitly relative pathname, the file is written in the current directory.
`-range start-end`
Only look at the characters between the character codes start and end, inclusive. The default is to look at all characters in the font. See section Specifying character codes, for the precise syntax of character codes.
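As an illustration of the suffix rule for `-output-file`, the sketch below derives an output filename for a GF font. The function `output_filename` is hypothetical, and treating any dot in the last path component as "a suffix" is our simplification of what the programs do.

```python
import os.path


def output_filename(fname, dpi, format_suffix="gf"):
    """Return fname unchanged if it has a suffix, else append e.g. '.1200gf'."""
    if "." in os.path.basename(fname):
        return fname
    return f"{fname}.{dpi}{format_suffix}"


print(output_filename("ggmr30a", 1200))      # ggmr30a.1200gf
print(output_filename("ggmr30a.gf", 1200))   # ggmr30a.gf
```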

### Specifying character codes

Most of the programs allow you to specify character codes for various purposes. Character codes are always parsed in the same way (using the routines in `lib/charcode.c` and `lib/charspec.c`).

You can specify the character code directly, as a numeric value, or indirectly, as a character name to be looked up in an encoding vector.

#### Named character codes

If a string being parsed as a character code is more than one character long, or starts with a non-digit, it is always looked up as a name in an encoding vector before being considered as a numeric code. We do this because you can always specify a particular value in one of the numeric formats, if that's what you want.

The encoding vector used varies with the program; you can always define an explicit encoding vector with the `-encoding` option. If you don't specify one explicitly, programs which must have an encoding vector use a default; programs which can proceed without one do not. See section Encoding files, for more details on encoding vectors.

As a practical matter, the only character names which have length one are the 52 letters, `A` through `Z` and `a` through `z`. In virtually all common cases, the encoding vector and the underlying character set both have these in their ASCII positions. (The exception is machines that use the EBCDIC encoding.)

#### Numeric character codes

The following variations for numeric character codes are allowed. The examples all assume the character set is ASCII.

• Octal numerals preceded by a zero are taken to be an octal number. For example, 0113 means decimal 75. If a would-be character code starts with a zero but contains any characters other than the digits `0` through `7`, it is invalid.
• Hexadecimal "digits" preceded by `0x` or `0X` are taken to be a hexadecimal number. Case is irrelevant. For example, 0x4b, 0X4b, 0x4B, and 0X4B all mean decimal 75. As with octal, a would-be character code starting with `0x` and containing any characters other than `0` through `9`, `a` through `f`, and `A` through `F` is invalid.
• A decimal number (consisting of more than one numeral) is itself. For example, 75 means the character code decimal 75. As before, a would-be character code starting with `1` through `9` and containing any characters other than `0` through `9` is invalid.
• A single digit, or a single character not in the encoding vector as a name, is taken to represent its value in the underlying character set. For example, K means the character code decimal 75, and 0 (the numeral zero) means the character code decimal 48 (if the machine uses ASCII).
• If the string being parsed as a character code starts with a digit, the appropriate one of the previous cases is applied. If it starts with any other character, the string is first looked up as a name.

Character codes must be between zero and 255 (decimal), inclusive.
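The rules above can be collected into a short sketch. It is a reading of this section, not a translation of the routines in the library: the function `parse_charcode` is hypothetical, the encoding vector is modeled as a plain dictionary from names to codes, and the underlying character set is assumed to be ASCII.

```python
import re


def parse_charcode(spec, encoding=None):
    """Return the character code (0..255) that spec denotes.

    encoding -- optional dict mapping character names to codes, standing in
                for an encoding vector (an assumption of this sketch).
    """
    if not spec:
        raise ValueError("empty character code")
    # Names are tried first for multi-character or non-digit strings.
    if encoding is not None and (len(spec) > 1 or spec[0] not in "0123456789"):
        if spec in encoding:
            return _check_range(encoding[spec], spec)
    if len(spec) == 1:
        # A single digit or character stands for itself: K -> 75, 0 -> 48.
        code = ord(spec)
    elif re.fullmatch(r"0[xX][0-9a-fA-F]+", spec):
        code = int(spec[2:], 16)
    elif re.fullmatch(r"0[0-7]+", spec):
        code = int(spec, 8)
    elif re.fullmatch(r"[1-9][0-9]+", spec):
        code = int(spec)
    else:
        raise ValueError(f"invalid character code: {spec}")
    return _check_range(code, spec)


def _check_range(code, spec):
    if not 0 <= code <= 255:
        raise ValueError(f"character code out of range: {spec}")
    return code


for spec in ["0113", "0x4b", "0X4B", "75", "K"]:
    print(spec, parse_charcode(spec))   # each prints 75
print(parse_charcode("0"))              # 48
```

With an encoding supplied, a call such as `parse_charcode("eacute", {"eacute": 233})` returns the code recorded for that name; the name and code here are only an example.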

### Common option values

The programs have a few common conventions for how to specify option values that are more complicated than simple numbers or strings.

Some options take not a single value, but a list. In this case, the individual values are separated by commas or whitespace, as in `-omit=1,2,3` or `-omit="1 2 3"`. Although using whitespace to separate the values is less convenient when typing them interactively, it is useful when you have a list that is so long you want to put it in a file. Then you can use `cat` in conjunction with shell quoting to get the value: ``-omit="`cat file`"``.

Other options take a list of values, but each value is a keyword and a corresponding quantity, as in `-fontdimens name:real,name:real`.

Finally, a few options take percentages, which you specify as an integer between 0 and 100, inclusive.
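A sketch of how such values might be split and checked follows. The helper names are hypothetical, and the keyword names in the usage lines (`space`, `quad`) are placeholders rather than a list of what any program accepts.

```python
import re


def split_list(value):
    """Split a list value on commas and/or whitespace."""
    return [item for item in re.split(r"[,\s]+", value) if item]


def parse_keyword_list(value):
    """Parse 'name:real,name:real' into a dict of floats."""
    pairs = {}
    for item in split_list(value):
        name, colon, quantity = item.partition(":")
        if not name or not colon:
            raise ValueError(f"expected name:real, got {item!r}")
        pairs[name] = float(quantity)  # raises ValueError on a bad number
    return pairs


def parse_percentage(value):
    """Parse an integer percentage between 0 and 100, inclusive."""
    if not re.fullmatch(r"[0-9]+", value):
        raise ValueError(f"not an integer percentage: {value!r}")
    percent = int(value)
    if percent > 100:
        raise ValueError(f"percentage out of range: {value!r}")
    return percent


print(split_list("1,2,3"))                       # ['1', '2', '3']
print(split_list("1 2 3\n"))                     # ['1', '2', '3']
print(parse_keyword_list("space:3.5,quad:10"))   # {'space': 3.5, 'quad': 10.0}
print(parse_percentage("50"))                    # 50
```

Accepting a trailing newline in `split_list` matters for the `cat file` idiom, since the file's contents usually contain line breaks.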

## Font searching

These programs use the same environment variables and algorithms for finding font files as does (the Unix port of) TeX and its friends.

You specify the default paths in the top-level Makefile. The environment variables TEXFONTS, PKFONTS, TEXPKS, and GFFONTS override those paths. Both the default paths and the environment variable values should consist of a colon-separated list of directories.

Specifically, a TFM file is looked for along the path specified by TEXFONTS; a GF file along GFFONTS, then TEXFONTS; a PK file along PKFONTS, then TEXPKS, then TEXFONTS.

A leading or trailing colon in an environment variable value is replaced by the default path.

A leading `~` or `~user` in a path component is replaced by the current home directory or user's home directory, respectively.

If a directory in a path does not exist, it is simply ignored.

In either the default value or the environment variable value, if a component directory d ends with two slashes, all subdirectories of d are searched: first those subdirectories directly under d, then the subsubdirectories under those, and so on. At each level, the order in which the directories are searched is unspecified.

The subdirectory searching has one known deficiency, for which we know of no good solution. If a directory d being searched for subdirectories contains plain files and symbolic links to other directories, but no real subdirectories, d will be considered a leaf directory, i.e., the symbolic links will not be followed. The only way we know of to fix this is to invoke stat (an expensive system call) on every directory entry. Since font directories often contain hundreds of files, this would make the startup time unacceptably slow.

A directory d explicitly named with two trailing slashes, however, is always searched for subdirectories, even if it is a "leaf". We do this since presumably you would not have asked d to be searched for subdirectories if you didn't want it to be, and therefore you don't have hundreds of files in d.

For example, the following value for an environment variable says to search the following: all subdirectories of the current user's `fonts` directory in his or her home directory, then the directory `fonts` in the user karl's home directory, and finally the system default directories specified at compilation time.

~/fonts//:~karl/fonts:
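The sketch below turns a path value like this one into a list of directories to search, following the rules in this section. It does not touch the filesystem, so nonexistent directories are not filtered out and subdirectories are only flagged, not enumerated. The function names are hypothetical, and reading "GFFONTS, then TEXFONTS" as "the first of these variables that is set" is our assumption.

```python
import os.path

# Environment variables consulted for each kind of font file, in order.
SEARCH_VARIABLES = {
    "tfm": ["TEXFONTS"],
    "gf": ["GFFONTS", "TEXFONTS"],
    "pk": ["PKFONTS", "TEXPKS", "TEXFONTS"],
}


def path_value(kind, environ, default):
    """Return the path string for a file kind.

    Assumption: the first variable that is set wins; otherwise the
    compiled-in default path is used.
    """
    for variable in SEARCH_VARIABLES[kind]:
        if variable in environ:
            return environ[variable]
    return default


def expand_search_path(value, default):
    """Return a list of (directory, search_subdirectories) pairs."""
    parts = value.split(":")
    # A leading or trailing colon stands for the default path.
    if parts[0] == "":
        parts[:1] = default.split(":")
    if parts[-1] == "":
        parts[-1:] = default.split(":")
    result = []
    for part in parts:
        if not part:
            continue
        # Two trailing slashes request a search of all subdirectories.
        recursive = part.endswith("//")
        directory = os.path.expanduser(part.rstrip("/") or "/")
        result.append((directory, recursive))
    return result


# The default path here is a made-up example.
print(expand_search_path("/a/fonts//:/b/fonts:", "/usr/local/lib/fonts"))
# [('/a/fonts', True), ('/b/fonts', False), ('/usr/local/lib/fonts', False)]
```

Tilde expansion is delegated to `os.path.expanduser`, which leaves `~user` untouched when no such user exists.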


## Font naming

Naming font files has always been a difficult proposition at best. On the one hand, the names should be as portable as possible, so the fonts themselves can be used on almost any platform. On the other hand, the names should be as descriptive and comprehensive as possible. The best compromise we have been able to work out is described in a separate document; see section Introduction in Filenames for TeX fonts. See section Archives, for where to obtain it.

Filenames for GNU project fonts should start with `g`, for the "source" abbreviation of "GNU".

Aside from a general font naming scheme, when developing fonts you must keep the different versions straight. We do this by appending a "version letter" `a`, `b`, ... to the main bitmap filename. For example, the original Garamond roman font we scanned was a 30 point size, so the main filename was `ggmr30` (`g` for GNU, `gm` for Garamond, `r` for roman). As we ran the font through the various programs, we named the output `ggmr30b`, `ggmr30c`, and so on.

Since the outline fonts produced by BZRto are scalable, we do not include the design size in their names. (BZRto removes a trailing number from the input name by default.)
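The two naming habits described here are simple enough to sketch. Both functions are hypothetical helpers, not part of the programs: `next_version` assumes the design size immediately precedes the version letter, and `outline_stem` only imitates the default behavior attributed to BZRto above.

```python
import re


def next_version(stem):
    """Return the stem with the next version letter.

    'ggmr30' -> 'ggmr30a', 'ggmr30a' -> 'ggmr30b'.
    """
    match = re.fullmatch(r"(.*[0-9])([a-y]?)", stem)
    if match is None:
        raise ValueError(f"cannot find a version letter slot in {stem!r}")
    base, letter = match.groups()
    return base + (chr(ord(letter) + 1) if letter else "a")


def outline_stem(stem):
    """Remove a trailing number (the design size) from a font name."""
    return re.sub(r"[0-9]+$", "", stem)


print(next_version("ggmr30"))    # ggmr30a
print(next_version("ggmr30a"))   # ggmr30b
print(outline_stem("ggmr30"))    # ggmr
```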