

1401 FORTRAN Notes

FORTRAN, May it live Forever ;-))

Serial compilation and the 1401 FORTRAN compiler

by L. H. Haines

This paper discusses a compiler organization in which phases act sequentially on a source program held in core storage. A brief description of each phase of the 1401 FORTRAN compiler is given to illustrate the general scheme.

The IBM 1401 FORTRAN compiler was designed as a set of phases that operate sequentially on the source program. The source program having been placed in core storage, the compiler phases enter core one at a time. Each phase overlays its predecessor, operates on the source program and, in turn, is overlaid by the next phase of the sequence. Thus, in contrast to the customary technique of passing the source program against the compiler in core, the compiler is passed against the source program which resides in core. It is assumed that the source program is more concise than the object program, and that an object program of interest can be accommodated in core.
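To make the scheme concrete, the short Python sketch below simulates the organization: the "source program" stays resident while each phase is brought in, applied, and replaced in turn. All names and the two stand-in phases are illustrative assumptions, not details taken from the compiler itself.

    # A minimal simulation of the phase-overlay scheme: the program being
    # compiled stays resident while phases pass over it one at a time.
    # Every name here is hypothetical; the real compiler had 63 phases.

    def strip_comment_cards(src):
        # Stand-in phase: drop FORTRAN comment cards ("C" in column 1).
        return [line for line in src if not line.startswith("C")]

    def normalize_case(src):
        # Stand-in phase: fold statements to upper case.
        return [line.upper() for line in src]

    PHASES = [strip_comment_cards, normalize_case]

    def compile_in_core(source):
        program = list(source)        # the source program "resides in core"
        for phase in PHASES:          # each phase overlays its predecessor
            program = phase(program)  # and operates on the in-core program
        return program

    print(compile_in_core(["C  A COMMENT CARD", "x = 1 + 2"]))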

The 1401 FORTRAN compiler was designed for a minimum of 8000 positions of core, with tape usage being optional. The fundamental design problem stems from the core storage limitation. Because the average number of instructions per phase and the number of phases selected are inversely related (at least to a significant degree), the phase organization was employed to circumvent the core limitation. The 1401 FORTRAN compiler has 63 phases, an average of 150 instructions per phase, and a maximum of 300 instructions in any phase.

Efficiency of the compilation technique

Experience with the compiler suggests that, even though the basic compilation technique requires repetitive scanning of the source program, the time required for this redundant effort is small. The experience lends some credence to the following argument.

If a phase is decomposed into two serially acting phases, assume that the average execution time is increased by a multiplicative factor r. If each component phase is again decomposed into two phases, the average total execution time of the resulting four phases increases to r² times the original. After k decompositions, 2ᵏ phases result, with an average total execution time of rᵏ times the original. Since n = 2ᵏ phases correspond to k = log₂n decompositions, it follows that an n-phase compiler takes r^(log₂n) times longer than a comparable one-phase compiler. Because some phases do not involve scanning, this estimate may tend to be high.

Based on experience with the present compiler, it is conjectured that r ≈ 1.05 and that the redundant work occasioned by the use of 63 phases increases compilation time by a factor of 1.05^(log₂63) ≈ 1.3.
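This estimate is easy to check numerically; the following lines simply evaluate the formula above:

    import math

    r = 1.05                      # conjectured cost of one decomposition
    n = 63                        # phases in the 1401 FORTRAN compiler

    slowdown = r ** math.log2(n)  # r^(log2 n): n-phase vs. one-phase time
    print(round(slowdown, 2))     # 1.34, in line with the paper's ~1.3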

However, this technique reduces compilation time elsewhere, so that the net increase can be expected to be less than conjectured. Since the phases act serially, tape searching is unnecessary. Moreover, no external sorts or merges are required. After one pass of the processor, the object program lies in core storage, ready for execution.

Also, the present technique permits an approach to coding that tends to reduce compilation time. Decomposing the compiler into an appropriate number of phases often leaves an individual phase with more core space for its execution than the minimum it needs.

The typical phase requires only on the order of 150 instructions, although twice that number are accommodated by the allocated region in core. Thus, most of the code can be written for efficient execution without regard to economy in the number of instructions employed. Other advantages inherent in this type of coding are that it is usually faster to write and easier to debug.

Compilation time

Except for certain extreme cases in which the object program fills almost all of core, the compilation time in seconds t is approximated by a linear function of the number of FORTRAN statements n (the statements assumed to be of average complexity). If the compiler is on tape, reads the source program from cards, and punches a self-loading machine language object program, then t = (0.96n + 72) seconds. As a load-and-go system, with the punching of the object program suppressed, t = (0.76n + 22) seconds. If the system is on cards, an additional 173 seconds are required.
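The timing figures above can be restated as a small helper function; the function name, parameters, and mode flags below are assumptions made for illustration, not part of the original system:

    def compile_time(n, load_and_go=False, system_on_cards=False):
        # Approximate compilation time in seconds for n FORTRAN statements
        # of average complexity, per the linear formulas quoted above.
        t = (0.76 * n + 22) if load_and_go else (0.96 * n + 72)
        if system_on_cards:
            t += 173              # extra cost when the system is on cards
        return t

    print(compile_time(200))                    # 264.0 seconds
    print(compile_time(200, load_and_go=True))  # 174.0 seconds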

Compiler phases

The principal function of each phase of the compiler is indicated below. Secondary functions are subordinated; for example, error checking occurs in almost every phase, but is seldom mentioned.

ACKNOWLEDGMENT

The author wishes to express appreciation to his colleagues G. Mokotoff, S. Smillie, and D. Macklin for their generous assistance. Parts of this paper were prepared by the author for inclusion in Reference 1.