Installing gawk

This appendix provides instructions for installing gawk on the various platforms that are supported by the developers. The primary developers support Unix (and one day, GNU), while the other ports were contributed. The file `ACKNOWLEDGMENT' in the gawk distribution lists the electronic mail addresses of the people who did the respective ports, and they are also provided in section Reporting Problems and Bugs.

The gawk Distribution

This section first describes how to get the gawk distribution, how to extract it, and then what is in the various files and subdirectories.

Getting the gawk Distribution

There are three ways you can get GNU software.

  1. You can copy it from someone else who already has it.
  2. You can order gawk directly from the Free Software Foundation. Software distributions are available for Unix, MS-DOS, and VMS, on tape, CD-ROM, or floppies (MS-DOS only). The address is:

    Free Software Foundation
    59 Temple Place--Suite 330
    Boston, MA 02111-1307 USA
    Phone: +1-617-542-5942
    Fax (including Japan): +1-617-542-2652
    E-mail: gnu@prep.ai.mit.edu

    Ordering from the FSF directly contributes to the support of the foundation and to the production of more free software.
  3. You can get gawk by using anonymous ftp to the Internet host ftp.gnu.ai.mit.edu, in the directory `/pub/gnu'. Alternate ftp sites also carry GNU software, including sites for Australia (archie.oz or archie.oz.au for ACSnet), the Middle East, South America, Western Canada, and the USA. When a site is listed as "site:directory", the directory indicates where GNU software is kept. You should use a site that is geographically close to you.

Extracting the Distribution

gawk is distributed as a tar file compressed with the GNU Zip program, gzip.

Once you have the distribution (for example, `gawk-3.0.0.tar.gz'), first use gzip to expand the file, and then use tar to extract it. You can use the following pipeline to produce the gawk distribution:

# Under System V, add 'o' to the tar flags
gzip -d -c gawk-3.0.0.tar.gz | tar -xvpf -

This will create a directory named `gawk-3.0.0' in the current directory.

The distribution file name is of the form `gawk-V.R.n.tar.gz'. The V represents the major version of gawk, the R represents the current release of version V, and the n represents a patch level, meaning that minor bugs have been fixed in the release. The current patch level is 0, but when retrieving distributions, you should get the version with the highest version, release, and patch level. (Note that release levels greater than or equal to 90 denote "beta," or non-production software; you may not wish to retrieve such a version unless you don't mind experimenting.)
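As an illustration of the naming scheme, the following shell sketch splits a distribution file name into its three parts and applies the beta rule. The function names parse_gawk_name and is_beta are hypothetical, chosen for this example; they are not part of the gawk distribution, and the sketch assumes a name of exactly the form `gawk-V.R.n.tar.gz'.

```shell
# Hypothetical helper: split gawk-V.R.n.tar.gz into version, release, patch.
# Results are left in the shell variables version, release, and patch.
parse_gawk_name() {
    name=${1#gawk-}         # drop the leading "gawk-"
    name=${name%.tar.gz}    # drop the trailing ".tar.gz"
    version=${name%%.*}     # V: text before the first dot
    rest=${name#*.}         # "R.n"
    release=${rest%%.*}     # R: text before the next dot
    patch=${rest#*.}        # n: text after that dot
}

# Hypothetical helper: succeed if the release level is 90 or greater,
# which the text above describes as "beta" software.
is_beta() {
    parse_gawk_name "$1"
    [ "$release" -ge 90 ]
}

# Example with a made-up beta file name:
if is_beta gawk-3.90.0.tar.gz; then
    echo "beta release, retrieve only if you do not mind experimenting"
fi
```

Only POSIX parameter expansion is used, so the sketch should not depend on any particular shell beyond a Bourne-compatible one.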

If you are not on a Unix system, you will need to make other arrangements for getting and extracting the gawk distribution. You should consult a local expert.

Contents of the gawk Distribution

The gawk distribution has a number of C source files, documentation files, subdirectories and files related to the configuration process (see section Compiling and Installing gawk on Unix), and several subdirectories related to different, non-Unix, operating systems.

various `.c', `.y', and `.h' files
These files are the actual gawk source code.
Descriptive files: `README' for gawk under Unix, and the rest for the various hardware and software combinations.
A file providing an overview of the configuration and installation process.
A list of systems to which gawk has been ported, and which have successfully run the test suite.
A list of the people who contributed major parts of the code or documentation.
A detailed list of source code changes as bugs are fixed or improvements made.
A list of changes to gawk since the last release or patch.
The GNU General Public License.
A brief list of features and/or changes being contemplated for future releases, with some indication of the time frame for the feature, based on its difficulty.
A list of those factors that limit gawk's performance. Most of these depend on the hardware or operating system software, and are not limits in gawk itself.
A description of one area where the POSIX standard for awk is incorrect, and how gawk handles the problem.
A file describing known problems with the current release.
The troff source for a manual page describing gawk. This is distributed for the convenience of Unix users.
The Texinfo source file for this book. It should be processed with TeX to produce a printed document, and with makeinfo to produce an Info file.
The generated Info file for this book.
The troff source for a manual page describing the igawk program presented in section An Easy Way to Use Library Functions.
The input file used during the configuration process to generate the actual `Makefile' for creating the documentation.
These files and subdirectory are used when configuring gawk for various Unix systems. They are explained in detail in section Compiling and Installing gawk on Unix.
The `awklib' directory contains a copy of `extract.awk' (see section Extracting Programs from Texinfo Source Files), which can be used to extract the sample programs from the Texinfo source file for this book, and a `Makefile.in' file, which configure uses to generate a `Makefile'. As part of the process of building gawk, the library functions from section A Library of awk Functions, and the igawk program from section An Easy Way to Use Library Functions, are extracted into ready to use files. They are installed as part of the installation process.
Files needed for building gawk on an Amiga. See section Installing gawk on an Amiga, for details.
Files needed for building gawk on an Atari ST. See section Installing gawk on the Atari ST, for details.
Files needed for building gawk under MS-DOS and OS/2. See section MS-DOS and OS/2 Installation and Compilation, for details.
Files needed for building gawk under VMS. See section How to Compile and Install gawk on VMS, for details.
A test suite for gawk. You can use `make check' from the top level gawk directory to run your version of gawk against the test suite. If gawk successfully passes `make check' then you can be confident of a successful port.

Compiling and Installing gawk on Unix

Usually, you can compile and install gawk by typing only two commands. However, if you do use an unusual system, you may need to configure gawk for your system yourself.

Compiling gawk for Unix

After you have extracted the gawk distribution, cd to `gawk-3.0.0'. Like most GNU software, gawk is configured automatically for your Unix system by running the configure program. This program is a Bourne shell script that was generated automatically using GNU autoconf. (The autoconf software is described fully in Autoconf--Generating Automatic Configuration Scripts, which is available from the Free Software Foundation.)

To configure gawk, simply run configure:

sh ./configure

This produces a `Makefile' and `config.h' tailored to your system. The `config.h' file describes various facts about your system. You may wish to edit the `Makefile' to change the CFLAGS variable, which controls the command line options that are passed to the C compiler (such as optimization levels, or compiling for debugging).

Alternatively, you can add your own values for most make variables, such as CC and CFLAGS, on the command line when running configure:

CC=cc CFLAGS=-g sh ./configure

See the file `INSTALL' in the gawk distribution for all the details.

After you have run configure, and possibly edited the `Makefile', type:

make

and shortly thereafter, you should have an executable version of gawk. That's all there is to it! (If these steps do not work, please send in a bug report; see section Reporting Problems and Bugs.)

The Configuration Process

(This section is of interest only if you know something about using the C language and the Unix operating system.)

The source code for gawk generally attempts to adhere to formal standards wherever possible. This means that gawk uses library routines that are specified by the ANSI C standard and by the POSIX operating system interface standard. When using an ANSI C compiler, function prototypes are used to help improve the compile-time checking.

Many Unix systems do not support all of either the ANSI or the POSIX standards. The `missing' subdirectory in the gawk distribution contains replacement versions of those subroutines that are most likely to be missing.

The `config.h' file that is created by the configure program contains definitions that describe features of the particular operating system where you are attempting to compile gawk. This file describes three things: which header files are available, so that they can be correctly included; which (supposedly) standard functions are actually available in your C libraries; and other miscellaneous facts about your variant of Unix. For example, there may not be an st_blksize element in the stat structure. In this case `HAVE_ST_BLKSIZE' would be undefined.

It is possible for your C compiler to lie to configure. It may do so by not exiting with an error when a library function is not available. To get around this, you can edit the file `custom.h'. Use an `#ifdef' that is appropriate for your system, and either #define any constants that configure should have defined but didn't, or #undef any constants that configure defined and should not have. `custom.h' is automatically included by `config.h'.

It is also possible that the configure program generated by autoconf will not work on your system in some other fashion. If you do have a problem, the file `configure.in' is the input for autoconf. You may be able to change this file, and generate a new version of configure that will work on your system. See section Reporting Problems and Bugs, for information on how to report problems in configuring gawk. The same mechanism may be used to send in updates to `configure.in' and/or `custom.h'.

How to Compile and Install gawk on VMS

This section describes how to compile and install gawk under VMS.

Compiling gawk on VMS

To compile gawk under VMS, there is a DCL command procedure that will issue all the necessary CC and LINK commands, and there is also a `Makefile' for use with the MMS utility. From the source directory, use either the `vmsbuild.com' command procedure or the `descrip.mms' description file.

Depending upon which C compiler you are using, follow one of the sets of instructions in this table:

VAX C V3.x
Use either `vmsbuild.com' or `descrip.mms' as is. These use CC/OPTIMIZE=NOLINE, which is essential for Version 3.0.
VAX C V2.x
You must have Version 2.3 or 2.4; older ones won't work. Edit either `vmsbuild.com' or `descrip.mms' according to the comments in them. For `vmsbuild.com', this just entails removing two `!' delimiters. Also edit `config.h' (which is a copy of file `[.config]vms-conf.h') and comment out or delete the two lines `#define __STDC__ 0' and `#define VAXC_BUILTINS' near the end.
GNU C
Edit `vmsbuild.com' or `descrip.mms'; the changes are different from those for VAX C V2.x, but equally straightforward. No changes to `config.h' should be needed.
DEC C
Edit `vmsbuild.com' or `descrip.mms' according to their comments. No changes to `config.h' should be needed.

gawk has been tested under VAX/VMS 5.5-1 using VAX C V3.2, GNU C 1.40 and 2.3. It should work without modifications for VMS V4.6 and up.

Installing gawk on VMS

To install gawk, all you need is a "foreign" command, which is a DCL symbol whose value begins with a dollar sign. For example:

$ GAWK :== $disk1:[gnubin]GAWK

(Substitute the actual location of gawk.exe for `$disk1:[gnubin]'.) The symbol should be placed in the `login.com' of any user who wishes to run gawk, so that it will be defined every time the user logs on. Alternatively, the symbol may be placed in the system-wide `sylogin.com' procedure, which will allow all users to run gawk.

Optionally, the help entry (`vms/gawk.hlp') can be loaded into a VMS help library. (You may want to use a site-specific help library rather than the standard VMS library `HELPLIB'.) After loading the help text, the VMS help facility will provide information about both the gawk implementation and the awk programming language.

The logical name `AWK_LIBRARY' can designate a default location for awk program files. For the `-f' option, if the specified filename has no device or directory path information in it, gawk will look in the current directory first, then in the directory specified by the translation of `AWK_LIBRARY' if the file was not found. If, after searching both directories, the file still is not found, gawk appends the suffix `.awk' to the filename and retries the file search. If `AWK_LIBRARY' is not defined, that portion of the file search will fail benignly.
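The search order can be mimicked in a Unix shell. This is only a sketch of the logic as described above, not the VMS code: find_awk_program is a hypothetical name, the library directory is passed as an ordinary argument standing in for the translation of `AWK_LIBRARY', and Unix-style paths replace VMS file specifications.

```shell
# Hypothetical emulation of the lookup order for a bare file name:
#   1. current directory, 2. library directory,
#   then the same two places again with ".awk" appended.
# $1 is the file name, $2 is the library directory (may be empty).
find_awk_program() {
    file=$1
    libdir=$2
    for candidate in "$file" "$file.awk"; do
        for dir in . "$libdir"; do
            # An undefined library directory is skipped, so that part
            # of the search fails benignly.
            [ -n "$dir" ] || continue
            if [ -f "$dir/$candidate" ]; then
                printf '%s\n' "$dir/$candidate"
                return 0
            fi
        done
    done
    return 1
}
```

The function prints the first match and returns a non-zero status when nothing is found in either place, with or without the suffix.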

Running gawk on VMS

Command line parsing and quoting conventions are significantly different on VMS, so examples in this book or from other sources often need minor changes. They are minor though, and all awk programs should run correctly.

Here are a couple of trivial tests:

$ gawk -- "BEGIN {print ""Hello, World!""}"
$ gawk -"W" version
! could also be -"W version" or "-W version"

Note that upper-case and mixed-case text must be quoted.

The VMS port of gawk includes a DCL-style interface in addition to the original shell-style interface (see the help entry for details). One side-effect of dual command line parsing is that if there is only a single parameter (as in the quoted string program above), the command becomes ambiguous. To work around this, the normally optional `--' flag is required to force Unix style rather than DCL parsing. If any other dash-type options (or multiple parameters such as data files to be processed) are present, there is no ambiguity and `--' can be omitted.

The default search path when looking for awk program files specified by the `-f' option is "SYS$DISK:[],AWK_LIBRARY:". The logical name `AWKPATH' can be used to override this default. The format of `AWKPATH' is a comma-separated list of directory specifications. When defining it, the value should be quoted so that it retains a single translation, and not a multi-translation RMS searchlist.

Building and Using gawk on VMS POSIX

Ignore the instructions above, although `vms/gawk.hlp' should still be made available in a help library. Make sure that the configure script is executable; use `chmod +x' on it if necessary. Then execute the following commands:

psx> CC=vms/posix-cc.sh configure
psx> CC=c89 make gawk

The first command will construct files `config.h' and `Makefile' out of templates. The second command will compile and link gawk. Ignore the warning "Could not find lib m in lib list"; it is harmless, caused by the explicit use of `-lm' as a linker option which is not needed under VMS POSIX. Under V1.1 (but not V1.0) a problem with the yacc skeleton `/etc/yyparse.c' will cause a compiler warning for `awktab.c', followed by a linker warning about compilation warnings in the resulting object module. These warnings can be ignored.

Once built, gawk will work like any other shell utility. Unlike the normal VMS port of gawk, no special command line manipulation is needed in the VMS POSIX environment.

MS-DOS and OS/2 Installation and Compilation

If you have received a binary distribution prepared by the DOS maintainers, then gawk and the necessary support files will appear under the `gnu' directory, with executables in `gnu/bin', libraries in `gnu/lib/awk', and manual pages under `gnu/man'. This is designed for easy installation to a `/gnu' directory on your drive, but the files can be installed anywhere provided AWKPATH is set properly. Regardless of the installation directory, the first line of `igawk.cmd' and `igawk.bat' (in `gnu/bin') may need to be edited.

The binary distribution will contain a separate file describing the contents. In particular, it may include more than one version of the gawk executable. OS/2 binary distributions may have a different arrangement, but installation is similar.

The OS/2 and MS-DOS versions of gawk search for program files as described in section The AWKPATH Environment Variable. However, semicolons (rather than colons) separate elements in the AWKPATH variable. If AWKPATH is not set or is empty, then the default search path is ".;c:/lib/awk;c:/gnu/lib/awk".
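To make the separator convention concrete, here is a small shell sketch that breaks a search path into its elements. The function name split_path is hypothetical, and the sketch only illustrates how such a value is read; it is not how gawk itself parses AWKPATH.

```shell
# Hypothetical helper: print each element of a search path on its own line.
# $1 is the separator character, $2 is the path value.
split_path() {
    (
        IFS=$1      # split on the given separator only
        set -f      # no file name expansion on the elements
        for dir in $2; do
            printf '%s\n' "$dir"
        done
    )
}

# The default MS-DOS and OS/2 search path, split on semicolons:
split_path ';' '.;c:/lib/awk;c:/gnu/lib/awk'
# prints:
# .
# c:/lib/awk
# c:/gnu/lib/awk
```

The body runs in a subshell so that the changes to IFS and to file name expansion do not leak into the calling shell.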

An sh-like shell (as opposed to command.com under MS-DOS or cmd.exe under OS/2) may be useful for awk programming. Ian Stewartson has written an excellent shell for MS-DOS and OS/2, and a ksh clone and GNU Bash are available for OS/2. The file `README_d/README.pc' in the gawk distribution contains information on these shells. Users of Stewartson's shell on DOS should examine its documentation on handling of command-lines. In particular, the setting for gawk in the shell configuration may need to be changed, and the ignoretype option may also be of interest.

gawk can be compiled for MS-DOS and OS/2 using the GNU development tools from DJ Delorie (DJGPP, MS-DOS-only) or Eberhard Mattes (EMX, MS-DOS and OS/2). Microsoft C can be used to build 16-bit versions for MS-DOS and OS/2. The file `README_d/README.pc' in the gawk distribution contains additional notes, and `pc/Makefile' contains important notes on compilation options.

To build gawk, copy the files in the `pc' directory to the directory with the rest of the gawk sources. The `Makefile' contains a configuration section with comments, and may need to be edited in order to work with your make utility.

The `Makefile' contains a number of targets for building various MS-DOS and OS/2 versions. A list of targets will be printed if the make command is given without a target. As an example, to build gawk using the DJGPP tools, enter `make djgpp'.

Using make to run the standard tests and to install gawk requires additional Unix-like tools, including sh, sed, and cp. In order to run the tests, the `test/*.ok' files may need to be converted so that they have the usual DOS-style end-of-line markers. Most of the tests will work properly with Stewartson's shell along with the companion utilities or appropriate GNU utilities. However, some editing of `test/Makefile' is required. It is recommended that the file `pc/Makefile.tst' be copied to `test/Makefile' as a replacement. Details can be found in `README_d/README.pc'.
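One way to do the end-of-line conversion is sketched below, assuming an awk and a Unix-like shell are already available. The function name to_dos_eol and the `.dos' suffix in the usage comment are hypothetical, and this is not necessarily the procedure the maintainers use; `README_d/README.pc' remains the reference.

```shell
# Hypothetical helper: write file $1 to standard output with DOS-style
# (carriage return + line feed) line endings.
to_dos_eol() {
    # Strip any carriage return already present, then add exactly one,
    # so converting a file twice does not double the markers.
    awk '{ sub(/\r$/, ""); printf "%s\r\n", $0 }' "$1"
}

# Example use, writing a converted copy beside each expected-output file:
# for f in test/*.ok; do to_dos_eol "$f" > "$f.dos"; done
```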

Installing gawk on the Atari ST

There are no substantial differences when installing gawk on various Atari models. Compiled gawk executables do not require a large amount of memory with most awk programs and should run on all Motorola processor based models (referred to below as ST, even if that is not exactly right).

In order to use gawk, you need to have a shell, either text or graphics, that does not map all the characters of a command line to upper-case. Maintaining case distinction in option flags is very important (see section Command Line Options). These days this is the default, and it may only be a problem for some very old machines. If your system does not preserve the case of option flags, you will need to upgrade your tools. Support for I/O redirection is necessary to make it easy to import awk programs from other environments. Pipes are nice to have, but not vital.

Compiling gawk on the Atari ST

A proper compilation of gawk sources when sizeof(int) differs from sizeof(void *) requires an ANSI C compiler. An initial port was done with gcc. You may actually prefer executables where ints are four bytes wide, but the other variant works as well.

You may need quite a bit of memory when trying to recompile the gawk sources, as some source files (`regex.c' in particular) are quite big. If you run out of memory compiling such a file, try reducing the optimization level for this particular file; this may help.

With a reasonable shell (Bash will do), and in particular if you run Linux, MiNT or a similar operating system, you have a pretty good chance that the configure utility will succeed. Otherwise sample versions of `config.h' and `Makefile.st' are given in the `atari' subdirectory and can be edited and copied to the corresponding files in the main source directory. Even if configure produced something, it might be advisable to compare its results with the sample versions and possibly make adjustments.

Some gawk source code fragments depend on a preprocessor define `atarist'. This basically assumes the TOS environment with gcc. Modify these sections as appropriate if they are not right for your environment. Also see the remarks about AWKPATH and envsep in section Running gawk on the Atari ST.

As shipped, the sample `config.h' claims that the system function is missing from the libraries, which is not true, and an alternative implementation of this function is provided in `atari/system.c'. Depending upon your particular combination of shell and operating system, you may wish to change the file to indicate that system is available.

Running gawk on the Atari ST

An executable version of gawk should be placed, as usual, anywhere in your PATH where your shell can find it.

While executing, gawk creates a number of temporary files. When using gcc libraries for TOS, gawk looks for either of the environment variables TEMP or TMPDIR, in that order. If either one is found, its value is assumed to be a directory for temporary files. This directory must exist, and if you can spare the memory, it is a good idea to put it on a RAM drive. If neither TEMP nor TMPDIR is found, then gawk uses the current directory for its temporary files.

The ST version of gawk searches for its program files as described in section The AWKPATH Environment Variable. The default value for the AWKPATH variable is taken from DEFPATH defined in `Makefile'. The sample gcc/TOS `Makefile' for the ST in the distribution sets DEFPATH to ".,c:\lib\awk,c:\gnu\lib\awk". The search path can be modified by explicitly setting AWKPATH to whatever you wish. Note that colons cannot be used on the ST to separate elements in the AWKPATH variable, since they have another, reserved, meaning. Instead, you must use a comma to separate elements in the path. When recompiling, the separating character can be modified by initializing the envsep variable in `atari/gawkmisc.atr' to another value.

Although awk allows great flexibility in doing I/O redirections from within a program, this facility should be used with care on the ST running under TOS. In some circumstances the OS routines for file handle pool processing lose track of certain events, causing the computer to crash, and requiring a reboot. Often a warm reboot is sufficient. Fortunately, this happens infrequently, and in rather esoteric situations. In particular, avoid having one part of an awk program using print statements explicitly redirected to "/dev/stdout", while other print statements use the default standard output, and a calling shell has redirected standard output to a file.

When gawk is compiled with the ST version of gcc and its usual libraries, it will accept both `/' and `\' as path separators. While this is convenient, it should be remembered that this removes one, technically valid, character (`/') from your file names, and that it may create problems for external programs, called via the system function, which may not support this convention. Whenever it is possible that a file created by gawk will be used by some other program, use only backslashes. Also remember that in awk, backslashes in strings have to be doubled in order to get literal backslashes (see section Escape Sequences).
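The doubling rule is easy to check from a Unix-like shell. In this sketch the file name and the function name print_st_path are made up for illustration; the single quotes keep the shell from touching the backslashes, so awk sees them exactly as written.

```shell
# Each \\ inside the awk string constant yields one literal backslash.
print_st_path() {
    awk 'BEGIN { print "c:\\lib\\awk\\prog.awk" }'
}

print_st_path    # prints c:\lib\awk\prog.awk
```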

Installing gawk on an Amiga

You can install gawk on an Amiga system using a Unix emulation environment available via anonymous ftp from wuarchive.wustl.edu in the directory `pub/aminet/dev/gcc'. This includes a shell based on pdksh. The primary component of this environment is a Unix emulation library, `ixemul.lib'.

A more complete distribution for the Amiga is available on the FreshFish CD-ROM from:

Amiga Library Services
610 North Alma School Road, Suite 18
Chandler, AZ 85224 USA
Phone: +1-602-491-0048
FAX: +1-602-491-0048
E-mail: orders@amigalib.com

Once you have the distribution, you can configure gawk simply by running configure:

configure -v m68k-cbm-amigados

Then run make, and you should be all set! (If these steps do not work, please send in a bug report; see section Reporting Problems and Bugs.)

Reporting Problems and Bugs

If you have problems with gawk or think that you have found a bug, please report it to the developers; we cannot promise to do anything but we might well want to fix it.

Before reporting a bug, make sure you have actually found a real bug. Carefully reread the documentation and see if it really says you can do what you're trying to do. If it's not clear whether you should be able to do something or not, report that too; it's a bug in the documentation!

Before reporting a bug or trying to fix it yourself, try to isolate it to the smallest possible awk program and input data file that reproduces the problem. Then send us the program and data file, some idea of what kind of Unix system you're using, and the exact results gawk gave you. Also say what you expected to occur; this will help us decide whether the problem was really in the documentation.

Once you have a precise problem, there are two e-mail addresses you can send mail to.


Please include the version number of gawk you are using. You can get this information with the command `gawk --version'. You should send a carbon copy of your mail to Arnold Robbins, who can be reached at `arnold@gnu.ai.mit.edu'.
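The version and system details asked for above can be collected with a few lines of shell. This is a sketch only: report_header is a hypothetical name, the output layout is invented for this example, and the program, data file, and results still have to be added to the report by hand.

```shell
# Hypothetical helper: print the gawk version and a system description
# for inclusion at the top of a bug report.
# $1 optionally names the gawk executable to query (default: gawk).
report_header() {
    awk_cmd=${1:-gawk}
    # Only the first line of the --version output is kept.
    printf 'gawk version: %s\n' "$("$awk_cmd" --version 2>/dev/null | head -n 1)"
    printf 'system: %s\n' "$(uname -a)"
}

report_header
```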

Important! Do not try to report bugs in gawk by posting to the Usenet/Internet newsgroup comp.lang.awk. While the gawk developers do occasionally read this newsgroup, there is no guarantee that we will see your posting. The steps described above are the official, recognized ways for reporting bugs.

Non-bug suggestions are always welcome as well. If you have questions about things that are unclear in the documentation or are just obscure features, ask Arnold Robbins; he will try to help you out, although he may not have the time to fix the problem. You can send him electronic mail at the Internet address above.

If you find bugs in one of the non-Unix ports of gawk, please send an electronic mail message to the person who maintains that port. They are listed below, and also in the `README' file in the gawk distribution. Information in the README file should be considered authoritative if it conflicts with this book.

The people maintaining the non-Unix ports of gawk are:

MS-DOS
Scott Deifik, `scottd@amgen.com', and Darrel Hankerson, `hankedr@mail.auburn.edu'.
OS/2
Kai Uwe Rommel, `rommel@ars.de'.
VMS
Pat Rankin, `rankin@eql.caltech.edu'.
Atari ST
Michal Jaegermann, `michal@gortel.phys.ualberta.ca'.
Amiga
Fred Fish, `fnf@amigalib.com'.

If your bug is also reproducible under Unix, please send copies of your report to the general GNU bug list, as well as to Arnold Robbins, at the addresses listed above.

Other Freely Available awk Implementations

There are two other freely available awk implementations. This section briefly describes where to get them.

Unix awk
Brian Kernighan has been able to make his implementation of awk freely available. You can get it via anonymous ftp to the host netlib.att.com. Change directory to `/netlib/research'. Use "binary" or "image" mode, and retrieve `awk.bundle.Z'. This is a shell archive that has been compressed with the compress utility. It can be uncompressed with either uncompress or the GNU gunzip utility. This version requires an ANSI C compiler; GCC (the GNU C compiler) works quite nicely.
mawk
Michael Brennan has written an independent implementation of awk, called mawk. It is available under the GPL (see section GNU GENERAL PUBLIC LICENSE), just as gawk is. You can get it via anonymous ftp to the host oxy.edu. Change directory to `/public'. Use "binary" or "image" mode, and retrieve `mawk1.2.1.tar.gz' (or the latest version that is there). gunzip may be used to decompress this file. Installation is similar to gawk's (see section Compiling and Installing gawk on Unix).
