NAME

fpp - cpp-like reversible preprocessor filter for Fortran and SFTRAN3 code

SYNOPSIS

fpp [ +debug ] [ +version ] [ +Dname ] [ +Dname=value ] [ +Uname ] [ name1=value1 ] ... [ namek=valuek ] [ -- ] [ file1 ] ... [ filep ]

OPTIONS

Because fpp is currently implemented in the nawk(1) language, which has its own command-line switches, fpp switches are prefixed with a + instead of a -.
--
The preceding word on the command line is the last switch; all following words are to be interpreted as filenames, even if they begin with a hyphen, or contain equal signs.
+debug
Turn on debugging output, which is sent to stderr. This produces helpful intermediate output from the expression evaluator. Macro definitions are also displayed on stderr when they are executed.
+Dname
Define the symbol name to the value 1.
+Uname
Undefine the symbol name. If the name is subsequently referenced, it will silently evaluate to zero. The existence of a definition can be checked with the defined operator, or in


C#ifdef   name
C#ifndef   name

statements; see below for details.

+version
Display the current version number of fpp on stderr.

If no input file names are given on the command line, input is assumed to come from stdin.

INTRODUCTION

fpp is a preprocessor for Fortran and SFTRAN3, modelled on the ANSI C preprocessor, but tailored for Fortran use, and for reversibility. It may serve as a prototype for a future SFTRAN3 preprocessor conditional and macro facility.

The conventions of fpp ensure that preprocessing is reversible: the output always contains the complete input, except that some code sections may have become comments, or vice versa. This is more useful than the non-reversible approach used by the C preprocessor.

Reversible preprocessing is convenient when a master source file must be maintained to generate multiple versions, such as for different operating-system, compiler, and architecture variations. The source code for any of these can serve as the master file to create any of the others.

The current implementation is written in nawk(1). That language is available on several different operating systems, in both commercial and free implementations, and serves as a convenient prototyping language for a program such as fpp.


LANGUAGE OVERVIEW

fpp statements, or directives, are specially-formatted Fortran comments of the forms

C#name
C#name  args

Since all fpp directives are encoded as comments, both input and output files should be compilable without any preprocessing by fpp.

Blanks may optionally surround the initial # to permit indentation for better visibility, or to reflect conditional statement nesting.

The first column may contain any valid Fortran comment starter: C, c, or *.

Unrecognized C # word sequences are silently copied to the output, so as to permit the rare case of a # in the initial text of a Fortran comment.

Preprocessor names in conditionals and definitions, or set on the command line, consist of letters, digits, and underscores; the first character may not be a digit.

Symbols beginning with two underscores, or an underscore and an uppercase letter, are reserved for the local implementation; see the PREDEFINED SYMBOLS section below for details.
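
The naming rules above can be sketched in a few lines of Python. This is only an illustration of the rules as stated here, not code from fpp itself; the function names is_valid_name and is_reserved_name are hypothetical.

```python
import re

# Letters, digits, and underscores; the first character may not be a digit.
NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")

def is_valid_name(name: str) -> bool:
    """Return True if name follows the preprocessor naming rule."""
    return NAME_RE.match(name) is not None

def is_reserved_name(name: str) -> bool:
    """Return True for names reserved for the local implementation:
    two leading underscores, or an underscore and an uppercase letter."""
    return (
        is_valid_name(name)
        and len(name) >= 2
        and name[0] == "_"
        and (name[1] == "_" or name[1].isupper())
    )
```

Under these rules, _OS_UNIX and __DATE__ are reserved, while WORDSIZE and MAXA belong to the user's name space.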

Following standard Fortran practice, letter case is not significant in directives, or in constants and operators in expressions. However, if nawk(1) is used to implement fpp, letter case is significant in names that are defined. With gawk(1), this limitation is removed; letter case is never significant.

For portability, it is recommended that lower-case letters be used for all directives, and upper-case letters for all defined names; this conforms to two decades of widespread practice in the C programming language.


DEFINITION STATEMENTS

Definitions of names for preprocessor conditionals may be set on the command line:

fpp   _OS_UNIX=1   _SUN386=1

or in the input file text itself:

C #define   _OS_UNIX   1
C #define   _SUN386   1

The cc(1)-like forms with define and undefine switches are supported; these two invocations are equivalent:

fpp   +D_OS_UNIX   +DWORDSIZE=32   +U_OS_VAXVMS
fpp   _OS_UNIX=1   WORDSIZE=32   _OS_VAXVMS=0

If the value is omitted, as in

fpp   _OS_UNIX=   _SUN386=

or

C #define   _OS_UNIX
C #define   _SUN386

a value of 1 is assumed.

Names can be undefined by

C #undef  name
C #undefine  name

If the name was not already defined, the request is silently ignored.
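
The command-line conventions described in this section can be sketched as follows. This is a minimal Python illustration, not the nawk(1) code of fpp; the function name parse_args is hypothetical. Two points are assumptions: +Uname is modelled as removing the name from the table, and a name=value word that appears after the first filename is still treated as a definition.

```python
def parse_args(argv):
    """Split fpp-style arguments into (symbols, files).

    symbols maps each defined name to its value, kept as a string.
    """
    symbols = {}
    files = []
    switches_done = False
    for arg in argv:
        if switches_done:
            files.append(arg)              # after --, everything is a filename
        elif arg == "--":
            switches_done = True
        elif arg.startswith("+D"):
            name, _, value = arg[2:].partition("=")
            symbols[name] = value or "1"   # an omitted value means 1
        elif arg.startswith("+U"):
            symbols.pop(arg[2:], None)     # unknown names are silently ignored
        elif arg.startswith("+"):
            pass                           # +debug and +version are not modelled here
        elif "=" in arg:
            name, _, value = arg.partition("=")
            symbols[name] = value or "1"
        else:
            files.append(arg)
    return symbols, files
```

For instance, the argument list +D_OS_UNIX +DWORDSIZE=32 +U_OS_VAXVMS prog.f yields the two definitions _OS_UNIX=1 and WORDSIZE=32, and the single input file prog.f.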


CONDITIONAL STATEMENTS

The conditional statements supported are:

C#if   constant-expression
C#ifdef   name
C#ifndef   name
C#elseif   constant-expression
C#elif   constant-expression
C#else   [optional comment]
C#endif   [optional comment]

Each C#ifxxx statement must have a matching C#endif following it. The two may be separated by any number of C#elseif statements, which may be followed by a single C#else statement.

A branch of a conditional is selected when the expression evaluates to a non-zero value; see the EXPRESSIONS section below for details.

Code between these statements is preserved, but in the unselected branches of a conditional statement, a non-comment statement will be altered to a comment by prefixing it with an initial C##, shifting the statement right by three columns. In the selected branch, any initial C## in columns 1 through 3 is stripped; lines without this prefix are copied verbatim. Because of Fortran's 72-character line length limitation, this means that code lines may not exceed 69 characters in length inside an fpp conditional. For example, the input


C#if _OS_UNIX
C##C     UNIX code
C##      CALL GETENV(...)
C#elseif _OS_VAXVMS
C     VMS code
      CALL LIB$TRNLNM(...)
C#endif

when _OS_UNIX=1 produces

C#if _OS_UNIX
C     UNIX code
      CALL GETENV(...)
C#elseif _OS_VAXVMS
C##C     VMS code
C##      CALL LIB$TRNLNM(...)
C#endif

When neither _OS_UNIX nor _OS_VAXVMS is defined, the output is

C#if _OS_UNIX
C##C     UNIX code
C##      CALL GETENV(...)
C#elseif _OS_VAXVMS
C##C     VMS code
C##      CALL LIB$TRNLNM(...)
C#endif

When only _OS_VAXVMS is defined, the original input is sent to the output. If by chance both _OS_VAXVMS and _OS_UNIX were defined, only the UNIX code would be selected, because only the first branch of the conditional would be executed.
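
The commenting and uncommenting rule can be sketched in Python. This is a simplified illustration under stated assumptions, not the fpp implementation: each test is a single name rather than a full expression, only the conditional directives are recognized, only an uppercase C## prefix is handled, and lines outside any conditional are copied verbatim. The names filter_conditionals and condition_holds are hypothetical.

```python
import re

# Column 1 is C, c, or *; blanks may surround the '#'.
DIRECTIVE = re.compile(r"^[Cc*]\s*#\s*([A-Za-z]+)\b\s*(.*)$")
OPENERS = {"if", "ifdef", "ifndef"}
BRANCHES = {"elseif", "elif", "else"}

def condition_holds(word, arg, symbols):
    """Simplified test: the argument is a single name, not a full expression."""
    name = arg.split()[0] if arg.strip() else ""
    if word == "ifdef":
        return name in symbols
    if word == "ifndef":
        return name not in symbols
    try:
        return float(symbols.get(name, 0)) != 0   # undefined names count as zero
    except ValueError:
        return False                              # non-numeric strings count as zero

def filter_conditionals(lines, symbols):
    out = []
    stack = []   # one [parent_active, branch_taken, active] entry per open conditional
    for line in lines:
        m = DIRECTIVE.match(line)
        word = m.group(1).lower() if m else ""
        if word in OPENERS:
            parent = stack[-1][2] if stack else True
            active = parent and condition_holds(word, m.group(2), symbols)
            stack.append([parent, active, active])
        elif word in BRANCHES or word == "endif":
            if not stack:
                raise ValueError("out-of-place directive: " + line)
            if word == "endif":
                stack.pop()
            else:
                frame = stack[-1]
                chosen = frame[0] and not frame[1] and (
                    word == "else" or condition_holds(word, m.group(2), symbols))
                frame[1] = frame[1] or chosen
                frame[2] = chosen
        elif stack and stack[-1][2]:
            # Selected branch: strip an initial C## if present.
            line = line[3:] if line.startswith("C##") else line
        elif stack and not line.startswith("C##"):
            # Unselected branch: turn the line into a C## comment.
            line = "C##" + line
        out.append(line)
    if stack:
        raise ValueError("unclosed conditional")
    return out
```

Because directive lines are always copied and text lines only gain or lose the three-character prefix, running the sketch on its own output with a different set of definitions recovers the corresponding variant of the source.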

Preprocessor conditionals may be nested:

C # if _OS_UNIX
C   # if _SUN
C      # if _SUN4
C      # elseif _SUN3
C      # elseif _SUN386
C      # endif
C   # endif
C # elseif _OS_VAXVMS
C # endif

Any text following #else or #endif is ignored; it can be used to document the conditional, usually with the test from the preceding #if:

C # if _OS_UNIX
C # else NOT _OS_UNIX
C # endif _OS_UNIX

fpp directives are not executed if they are in a branch of a conditional that is not currently selected. However, conditional statements are processed to keep track of the current nesting.

On UNIX, the -Dname option for diff(1) can be used to get output from the comparison of two files that is almost correct input for fpp. A simple command pipe

diff -Dxxx file1 file2 | sed -e 's/^#/C#/' >file3

will produce an output file file3 from which

fpp +Dxxx file3

will recover file2 and

fpp +Uxxx file3

will recover file1.


EXPRESSIONS

Expressions are recognized and evaluated in two circumstances: in the arguments of C#if, C#elseif, and C#elif, and inside the parentheses of #(...).

In expressions, primaries are Fortran integer, floating-point, logical, and character constants, and preprocessor names.

Undefined names silently evaluate to zero in arithmetic expressions, and to empty strings in string expressions.

Character strings appearing in arithmetic expressions are converted to numbers, which are zero if the string does not look like a number. Character strings appearing by themselves evaluate to themselves.

Arithmetic expressions are evaluated in floating-point arithmetic; for Boolean (Fortran logical) tests, zero is false, and non-zero is true.
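
The conversion and truth rules above can be sketched in Python. This is an illustration only; the helper names to_number, lookup, and is_true are hypothetical. It also assumes that "looks like a number" means whatever Python's float() accepts, which differs from Fortran in places: for example, a D exponent such as 1.0D0 is rejected here and would become zero.

```python
def to_number(value) -> float:
    """Convert a constant or symbol value to a number; non-numeric strings give 0."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return 0.0

def lookup(name, symbols) -> float:
    """Undefined names silently evaluate to zero in arithmetic expressions."""
    return to_number(symbols.get(name, 0))

def is_true(value) -> bool:
    """For logical tests, zero is false and any non-zero value is true."""
    return to_number(value) != 0.0
```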

The usual Fortran arithmetic operators + - * / ** are recognized, along with the C modulus operator %; x % y is Fortran's mod(x,y). This operator is rigorously defined for all arguments to be x % y = x - int(x/y)*y.
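
The definition of the modulus operator can be written out directly. The Python sketch below uses the hypothetical name fpp_mod; it relies on Python's int() truncating toward zero, so the result takes the sign of x, unlike Python's own % operator, which takes the sign of y.

```python
def fpp_mod(x, y):
    """x % y as defined in the text: x - int(x/y)*y, with int() truncating toward zero."""
    return x - int(x / y) * y

# The two definitions agree for positive operands and differ for negative ones.
print(fpp_mod(7, 3), 7 % 3)      # prints: 1 1
print(fpp_mod(-7, 3), -7 % 3)    # prints: -1 2
```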

The Fortran logical and relational operators are supported, with convenient modern C-like synonyms: .and. (& or &&), .or. (| or ||), .not. (!), .eq. (==), .ne. (!=), .lt. (<), .le. (<=), .gt. (>), and .ge. (>=). Letter case in the dotted operators is not significant. Finally, the Fortran character string concatenation operator, //, is handled.

One special name, defined, is recognized, in any letter case; it may be used either in functional form, defined(name), or in prefix operator form, defined name. It evaluates to 1 if the name is defined (even if the value of the name is zero), and otherwise, to 0. Several defined operators can be used in a single expression; that is much more convenient than a series of nested conditionals using C#ifdef and C#ifndef.

The #(...) form is only recognized in a comment line; the line that follows is replaced by the expansion, and its original text is kept as a comment (see the section MACRO EXPANSION below). The parentheses hold an expression involving Fortran constants and preprocessor names.

Examples of expressions are

C#if  defined(_OS_UNIX)  ||  defined _OS_VAXVMS  ||  (WORDSIZE == 32)

C     REAL A(#(MAXA**2)), B(#(MAXA % 32))

C     INTEGER BITS(#(WORDSIZE))

MACRO EXPANSION

fpp supports a reversible argument-free macro expansion capability. This involves pairs of lines, the first a comment line containing the macro references as strings of the form #(constant-expression), and the second a non-comment Fortran statement.

The first line of the pair is always exactly preserved in the output, while the second is replaced by the expansion of the comment, with the first character deleted, to change the comment into a non-comment. The original contents of the second line are preserved as a comment with the prefix C-fpp- in a third output line.

This peculiar input line pairing is necessary to ensure that the expansion is reversible.

The parentheses around the expression serve to distinguish between macros and fpp preprocessor directives in comments, and serendipitously permit the extension from simple names to arbitrary constant expressions that can be evaluated by fpp.

Care must be taken in writing the input to ensure that any expected expansion does not make the line longer than 72 characters; fpp has almost no knowledge of Fortran, and therefore cannot provide correct line wrapping for it.

Similarly, macro expansion in a multi-line continued statement should be avoided, since it introduces comment lines between continuation lines. While such comments are legal in full Fortran 77, they are illegal in subset Fortran 77, and in older Fortrans, and may cause problems for other tools that process Fortran code.

Here is a small example. Given command-line definitions

FPTYPE='DOUBLE PRECISION'
MAXA=19
MAXB=25

then the input

C     #(FPTYPE) A(#(MAXA)), B(#(MAXB),#(MAXA**2))
      REAL A(100), B(255,10000)

is converted to the output

C     #(FPTYPE) A(#(MAXA)), B(#(MAXB),#(MAXA**2))
      DOUBLE PRECISION A(19), B(25,361)
C-fpp-      REAL A(100), B(255,10000)

Fortran 77 PARAMETER statements can be used to achieve similar effects, but in more restricted circumstances. In particular, fpp permits such expansions to happen in strings:

C10000 FORMAT ('Host operating system = #(OS)')
10000 FORMAT ('Host operating system = UNIX')

This may be awkward to achieve in standard Fortran.
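
The three-line output of a macro expansion can be sketched in Python. This is an illustration under stated assumptions, not the fpp implementation: it handles only names, numbers, and the operators + - * / ** %, it does not handle parentheses nested inside #(...), a lone name expands to its text, and the first character of the comment is deleted as the text above says. The names evaluate and expand_pair are hypothetical.

```python
import ast
import operator
import re

MACRO = re.compile(r"#\(([^()]*)\)")   # no nested parentheses in this sketch

BINOPS = {
    ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul,
    ast.Div: operator.truediv, ast.Pow: operator.pow,
    ast.Mod: lambda x, y: x - int(x / y) * y,   # modulus as defined under EXPRESSIONS
}

def _num(value):
    """Strings that do not look like numbers count as zero."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return 0.0

def evaluate(expr, symbols):
    """Return the expansion text for one #(...) expression."""
    tree = ast.parse(expr.strip(), mode="eval").body
    if isinstance(tree, ast.Name):               # a lone name expands to its text
        return str(symbols.get(tree.id, ""))

    def walk(node):
        if isinstance(node, ast.Name):
            return _num(symbols.get(node.id, 0))
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return float(node.value)
        if isinstance(node, ast.BinOp) and type(node.op) in BINOPS:
            return BINOPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError("unsupported expression: " + expr)

    result = walk(tree)
    return str(int(result)) if result.is_integer() else str(result)

def expand_pair(comment, old_statement, symbols):
    """Return the three output lines for one comment/statement input pair."""
    expanded = MACRO.sub(lambda m: evaluate(m.group(1), symbols), comment)
    return [comment, expanded[1:], "C-fpp-" + old_statement]
```

With OS defined as UNIX, the FORMAT pair shown above gives back the comment unchanged, then the statement with #(OS) replaced by UNIX, then the old statement prefixed with C-fpp-.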


MESSAGE OUTPUT STATEMENTS

Text can be written to stderr with either of

C#message text

C#error text

The difference between them is that #error sets an exit code of 1, and also sends the text to stdout. This can be used to ensure that a preprocessing error forces a compilation error if an attempt is made later to compile the output source program.

The output of both directives is prefixed with the file name and input line number to identify the origin of the message.

When #error is executed, processing is not terminated; instead, fpp tries to process the remaining input so as to uncover additional errors in the same run.


OUTPUT OF fpp

The output contains an initial comment header of the form

C-fpp- =================================================================
C-fpp- fpp version 1.0 [10-Dec-1990]
C-fpp- Date: Sat Dec  8 23:06:30 MST 1990
C-fpp- Directory: /u/sy/beebe
C-fpp- User: beebe@sandy.math.utah.edu
C-fpp- Macro: _OS_VAXVMS=1
C-fpp- Macro: FPTYPE=DOUBLE PRECISION
C-fpp- =================================================================

These comments provide a record of the processing, including what symbol definitions and macro values have been selected.

Input comments beginning

C-fpp-

are flushed. Thus, any existing comment header is always replaced by a new header.

Each command-line name=value or +Dname setting, and each input definition directive

C#define name value

produce an output comment of the form

C-fpp- Macro: name=value

Thus, all output lines beginning

C-fpp- Macro:

document which names have been defined.

A C#undefine statement results in output like

C-fpp- Macro: name=--UNDEFINED--
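
The record lines described in this section can be sketched in Python. This is an illustration only: the function names are hypothetical, None is used here to mark a name that has been undefined, and the sketch says nothing about where in the output fpp places each record.

```python
FPP_PREFIX = "C-fpp-"

def macro_records(symbols):
    """Build one 'C-fpp- Macro:' line per symbol; None marks an undefined name."""
    lines = []
    for name, value in symbols.items():
        shown = "--UNDEFINED--" if value is None else value
        lines.append(f"{FPP_PREFIX} Macro: {name}={shown}")
    return lines

def strip_old_records(lines):
    """Drop input comments that begin with C-fpp-, so that a new header replaces them."""
    return [line for line in lines if not line.startswith(FPP_PREFIX)]
```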

PREDEFINED SYMBOLS

Each implementation of fpp will predefine a few symbols that can be tested and used in conditionals to select machine-specific code sections. The predefined symbols are provided to fpp ahead of any user-defined ones. Since later redefinitions override earlier ones, predefined symbols can always be changed by the user.

Following ANSI C, predefined symbols always begin with two underscores, or an underscore and an uppercase letter; such names are reserved for the local implementation. Predefined symbols that do not follow this convention are forbidden. This requirement makes it possible to distinguish separate name spaces for the user and for the implementation, preventing surprises from unexpected substitutions that happen when code is moved to a new environment.

The complete set of definitions is always recorded in the output header; they can easily be displayed on stdout by giving fpp an empty input file:

fpp /dev/null

fpp always predefines exactly one major operating-system symbol:

_OS_PCDOS
_OS_TOPS20
_OS_UNIX
_OS_VAXVMS

For _OS_UNIX, exactly one minor operating-system variant may also be defined:

_AIX
_AIX370
_BSD
_GOULD
_HPUX
_MACH
_MIPS
_POSIX
_STARDENT
_SUN

Additional architectural variants may be defined on some systems:

_IBM_3090
_IBM_PS_2
_IBM_RS_6000
_IBM_RT
_STARDENT_1500
_STARDENT_3000
_SUN3
_SUN386
_SUN4

Host byte addressing order is defined by one of these:

_BIG_ENDIAN
_LITTLE_ENDIAN

Big-endian addressing is used by IBM, Motorola, and most RISC systems. Little-endian addressing is used by the Intel 80xxx and DEC VAX architectures.

Host floating-point architecture must be defined by

_IEEE_754

on those machines that have IEEE 754 floating-point arithmetic.

If the Fortran implementation supports NAMELIST I/O, the symbol

_NAMELIST

must be defined.

To ensure standardization, all such names must be registered with the author of fpp, and will be listed in these manual pages.

Following ANSI C, standard permanent symbols are always defined; each has two leading and two trailing underscores. Of the five listed here, __DATE__, __FILE__, __LINE__, and __TIME__ come from ANSI C, and __TIMEZONE__ is added by fpp. Permanent symbols always begin with two underscores, and they may not be redefined by the user.

__DATE__
Current calendar date in the form Mmm dd yyyy. The month field is alphabetic, and the day number field has a leading blank if the day is less than 10.
__FILE__
Current input file filename.
__LINE__
Current input file line number.
__TIME__
Wall-clock time in the form hh:mm:ss.
__TIMEZONE__
Three-letter time zone abbreviation, such as MDT for Mountain Daylight Time, or GMT for Greenwich Mean Time.

The symbols __DATE__, __TIME__, and __TIMEZONE__ are set only once, at the start of execution of fpp. These values can conveniently be used to generate output stamped with the time of compilation; for example,

C     WRITE (*,*) 'Compiled on #(__DATE__) at #(__TIME__) #(__TIMEZONE__)'
      WRITE (*,*) 'Compiled on ??? ?? ???? at ??:??:?? ???'

might produce

C     WRITE (*,*) 'Compiled on #(__DATE__) at #(__TIME__) #(__TIMEZONE__)'
      WRITE (*,*) 'Compiled on Dec 10 1990 at 09:10:07 MST'
C-fpp-      WRITE (*,*) 'Compiled on ??? ?? ???? at ??:??:?? ???'
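
The __DATE__ and __TIME__ formats described above can be reproduced with a short Python sketch. The function names fpp_date and fpp_time are hypothetical; the month abbreviations are spelled out rather than taken from the locale, and __TIMEZONE__ is not modelled.

```python
from datetime import datetime

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

def fpp_date(now: datetime) -> str:
    """Mmm dd yyyy, with a leading blank when the day is less than 10."""
    return f"{MONTHS[now.month - 1]} {now.day:2d} {now.year}"

def fpp_time(now: datetime) -> str:
    """Wall-clock time as hh:mm:ss."""
    return f"{now.hour:02d}:{now.minute:02d}:{now.second:02d}"

stamp = datetime(1990, 12, 10, 9, 10, 7)
print(fpp_date(stamp), fpp_time(stamp))   # prints: Dec 10 1990 09:10:07
```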


SEE ALSO

cc(1), cpp(1), diff(1), m4(1), gawk(1), nawk(1).

DIAGNOSTICS

Diagnostics are issued to stderr if unclosed conditionals, or out-of-place conditional branches, or errors in preprocessor expressions, are detected. Attempts to redefine permanent macros (any that begin with two underscores) produce an error message. Debug output requested by a command-line option will be sent to stderr.

C#error text

directives that are executed send their text argument to stderr and to stdout, and on UNIX, cause a later exit code of 1.

C#message text

directives that are executed send their text argument to stderr.


AUTHOR

Nelson H. F. Beebe, Ph.D.

Center for Scientific Computing

Department of Mathematics

University of Utah

Salt Lake City, UT 84112

Tel: +1 801 581 5254

FAX: +1 801 581 4148

Email: <beebe@math.utah.edu>