One-pass compiler
A one-pass compiler is a compiler that processes the source code of a program in a single sequential traversal, integrating lexical analysis, syntax analysis, semantic analysis, and code generation without revisiting any part of the input.[1] This approach contrasts with multi-pass compilers, which make multiple traversals over the code to perform different phases separately.[2] One-pass compilers are valued for their efficiency, as they require less memory by avoiding the storage of intermediate representations like abstract syntax trees and enable faster compilation times due to the streamlined process.[1][2] However, they impose limitations on language design, such as requiring declarations before use to resolve symbols during the initial scan and complicating the handling of forward references, which often necessitates techniques like backpatching.[2] These constraints make them less suitable for complex optimizations or languages with extensive forward dependencies, restricting their use to simpler constructs and grammars, typically LL(1) parsable.[1][2]

Historically, one-pass compilers emerged as a practical choice for resource-constrained environments in the mid-20th century, influencing the design of early programming languages to facilitate single-pass implementation.[2] A notable example is the original Pascal language, developed in the late 1960s by Niklaus Wirth, which enforced strict declaration-before-use rules and avoided features like separate compilation to enable efficient one-pass compilation.[3] Such designs prioritized simplicity and speed, paving the way for educational and prototyping tools, though modern compilers largely favor multi-pass architectures for greater flexibility and optimization potential.[1]

Overview
Definition
A one-pass compiler is a type of compiler that processes the source code of a compilation unit in a single sequential traversal, immediately translating each encountered statement or declaration into target machine code without requiring subsequent revisits to earlier parts of the input. This approach contrasts with multi-pass compilers, which typically involve multiple traversals over the source or intermediate representations.[4] In this context, a compilation unit refers to a self-contained portion of the program, such as a single source file or module, that can be compiled independently.[5] A compiler pass, more broadly, denotes a complete processing phase that transforms input into output, often involving analysis and generation steps.[4]

Key characteristics of a one-pass compiler include the absence of intermediate storage for a full abstract syntax tree (AST), as the entire structure is not retained; instead, semantic and translation decisions are made on the fly using only the current token stream and symbol information accumulated from prior context.[6] This on-the-fly processing enables direct code generation without buffering the complete parse tree, relying on syntax-directed translation techniques where actions trigger immediate output during parsing.[7]

The basic workflow of a one-pass compiler entails reading the input source code once from start to end, interleaving lexical analysis (tokenization), syntactic parsing (structure recognition), semantic analysis (type and scope checks), and code generation (target code emission) within the same traversal.[8] This concurrent execution of phases ensures that each element of the source is handled progressively, with symbol tables updated incrementally to support forward references where possible.[9]
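To make the interleaving concrete, the following minimal sketch (hypothetical, in Python) compiles arithmetic expressions for a stack machine: the recursive-descent parser consumes tokens left to right and emits each instruction the moment a construct is recognized, so no AST is ever built.

    import re

    def tokenize(src):
        # Lexical analysis: numbers and operators, with "$" marking end of input.
        return re.findall(r"\d+|[-+*/()]", src) + ["$"]

    def compile_expr(src):
        toks, pos, code = tokenize(src), 0, []

        def peek():
            return toks[pos]

        def advance():
            nonlocal pos
            pos += 1

        def expr():                      # expr -> term (('+'|'-') term)*
            term()
            while peek() in ("+", "-"):
                op = peek()
                advance()
                term()
                code.append("ADD" if op == "+" else "SUB")  # emit immediately

        def term():                      # term -> factor (('*'|'/') factor)*
            factor()
            while peek() in ("*", "/"):
                op = peek()
                advance()
                factor()
                code.append("MUL" if op == "*" else "DIV")

        def factor():                    # factor -> NUMBER | '(' expr ')'
            if peek() == "(":
                advance()
                expr()
                advance()                # consume ')'
            else:
                code.append(f"PUSH {peek()}")
                advance()

        expr()                           # single left-to-right traversal
        return code

    print(compile_expr("1 + 2 * (3 - 4)"))
    # ['PUSH 1', 'PUSH 2', 'PUSH 3', 'PUSH 4', 'SUB', 'MUL', 'ADD']

Note that the token stream is read exactly once and each instruction is appended as soon as the corresponding subexpression has been parsed; nothing about earlier input is retained beyond the code already emitted.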
Comparison to Multi-Pass Compilers
Multi-pass compilers process the source code through multiple traversals, typically separating the front-end phases—such as lexical analysis, syntax analysis, and semantic analysis—from the back-end phases of optimization and code generation, often using intermediate representations like abstract syntax trees (ASTs) to facilitate deferred processing.[10][11] In contrast, one-pass compilers integrate all these phases into a single, linear traversal of the source code, where the syntactic analyzer invokes contextual analysis and code generation on the fly as constructs are encountered, without constructing a complete intermediate representation.[10]

A key structural difference lies in symbol handling: one-pass compilers resolve declarations and references immediately during the traversal, so any symbol that is referenced before its definition must be introduced by a forward declaration to avoid errors, whereas multi-pass compilers defer such resolution to later phases by storing symbol information in the AST or symbol tables for global linkage.[10][12] This design choice significantly influences language features; one-pass compilers are best suited to languages enforcing strict declaration-before-use rules, as seen in Pascal, where identifiers must be defined prior to their use to enable real-time resolution, thereby restricting constructs like mutual recursion unless headers or forward declarations are employed.[13][10] Multi-pass approaches, by allowing analysis across the entire program, support more flexible scoping and reference patterns in languages like Java.[10]

At a high level, these differences yield efficiency trade-offs: one-pass compilers consume less memory by processing incrementally without retaining full program structures, enabling faster compilation for straightforward languages, but they limit opportunities for the comprehensive global optimizations that multi-pass compilers achieve through repeated traversals of intermediate forms.[10][11]
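The structural difference in symbol handling can be illustrated with a small sketch (hypothetical, in Python): given the same sequence of declaration and use events in source order, a one-pass checker rejects a forward reference at the moment it is read, while a two-pass checker that first collects all declarations accepts it.

    # Each event is ("decl", name) or ("use", name), in source order.
    events = [("use", "f"), ("decl", "f")]        # "f" is used before its declaration

    def check_one_pass(events):
        symbols = set()
        for kind, name in events:                 # single traversal, immediate resolution
            if kind == "decl":
                symbols.add(name)
            elif name not in symbols:             # forward reference: error right away
                return f"error: {name} used before declaration"
        return "ok"

    def check_two_pass(events):
        symbols = {n for k, n in events if k == "decl"}   # pass 1: collect declarations
        for kind, name in events:                         # pass 2: resolve all uses
            if kind == "use" and name not in symbols:
                return f"error: {name} undeclared"
        return "ok"

    print(check_one_pass(events))   # error: f used before declaration
    print(check_two_pass(events))   # ok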
Implementation
Parsing and Semantic Analysis
In one-pass compilers, lexical analysis and syntactic parsing are tightly integrated to process the source code in a single forward traversal, without buffering the entire input or generating intermediate representations like parse trees for later phases. The lexical analyzer, or scanner, produces a stream of tokens—such as identifiers, keywords, operators, and literals—that are immediately consumed by the parser as they are generated, enabling continuous processing without revisiting prior input.[14] This direct feed supports top-down parsers, particularly LL(1) or simple recursive descent parsers, which predict the next production based on a single token of lookahead and construct the parse incrementally from left to right.[15] Such parsers are well-suited to one-pass constraints because they facilitate straightforward integration of semantic actions without backtracking. Bottom-up parsers, such as LR parsers, can also be employed in one-pass compilation with appropriate attribute evaluation.[16]

Semantic analysis is interwoven with parsing through embedded action routines in the grammar productions, allowing checks and annotations to occur as syntactic constructs are recognized, rather than in a separate phase. For instance, upon encountering a declaration like int a;, the parser triggers an action to insert the identifier "a" and its type into the symbol table, building it incrementally as scopes open and close during the parse.[15] Type checking is performed immediately when variables are used; for example, in an expression like a + b, the parser verifies that both operands are numeric types by querying the symbol table entries created earlier in the pass.[17] Error detection, such as undeclared variable usage or type mismatches, is reported on the fly, preventing propagation of invalid constructs to later stages.[14] This approach relies on attribute grammars, where synthesized attributes (computed bottom-up from children) and inherited attributes (passed top-down from parents) are evaluated during the parse to manage semantic properties without delaying resolution.[16]
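The following sketch (hypothetical, in Python) shows such action routines in isolation: a declaration action that enters the identifier into the symbol table, and an expression action that type-checks a + b against the entries accumulated earlier in the pass.

    symtab = {}                              # built incrementally during the parse

    def declare(name, typ):                  # action fired for a declaration like "int a;"
        if name in symtab:
            raise SyntaxError(f"{name} already declared")
        symtab[name] = typ

    def check_binop(lhs, rhs):               # action fired for an expression like "a + b"
        for name in (lhs, rhs):
            if name not in symtab:           # undeclared use reported on the fly
                raise SyntaxError(f"{name} undeclared")
        if symtab[lhs] != symtab[rhs]:       # immediate type check via the table
            raise TypeError(f"type mismatch: {symtab[lhs]} vs {symtab[rhs]}")
        return symtab[lhs]                   # synthesized attribute: the result type

    declare("a", "int")
    declare("b", "int")
    print(check_binop("a", "b"))             # int

In a real one-pass compiler these routines would be invoked from within the parsing functions themselves, at the grammar positions where the corresponding constructs are recognized.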
To handle dependencies where a symbol is used before its full declaration, one-pass compilers enforce top-down processing and often require forward declarations or predefined scoping rules to ensure symbols are resolvable upon first encounter. In languages supporting this, a preliminary declaration (e.g., a prototype or header) allows the parser to enter partial symbol information into the table, deferring complete details until the full definition is parsed later in the input stream.[14] Symbol table implementations, such as scope stacks or global tables with scope tags, facilitate this by pushing new scopes on entry to a block and popping them on exit, enabling nested lookups without re-scanning.[15] This mechanism avoids backtracking by assuming a linear order where definitions precede uses, or by using temporary placeholders for forward references that are resolved as the pass progresses.[17]
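A minimal scope-stack sketch of this mechanism (hypothetical, in Python): blocks push and pop scopes, a forward declaration enters a partial entry that is usable immediately, and the later full definition completes it without any rescan of earlier input.

    scopes = [{}]                                   # scope stack; global scope at index 0

    def enter_block():
        scopes.append({})                           # push a new scope on block entry

    def exit_block():
        scopes.pop()                                # pop it on block exit

    def declare(name, info, forward=False):
        scopes[-1][name] = {"info": info, "forward": forward}

    def lookup(name):
        for scope in reversed(scopes):              # innermost-out search, no re-scanning
            if name in scope:
                return scope[name]
        raise NameError(f"{name} unresolved")

    declare("g", "procedure", forward=True)         # prototype: partial entry as placeholder
    print(lookup("g"))                              # resolvable before the full definition
    declare("g", "procedure")                       # full definition completes the entry
    enter_block()
    declare("x", "int")
    print(lookup("x"), lookup("g"))                 # nested lookup reaches the outer scope
    exit_block()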
The single-pass constraint imposes challenges on analysis, primarily due to limited lookahead, which restricts the grammar to deterministic forms without left recursion or ambiguity that would require multiple tokens of preview. LL(1) parsers, for example, demand grammars in which the FIRST sets of competing alternatives are disjoint under a single lookahead token, necessitating careful design to eliminate conflicts during semantic action placement.[16] Context-sensitive dependencies, such as those spanning distant parts of the program, further complicate matters, as the parser cannot defer resolution without violating the one-pass rule, often limiting supported language features to those amenable to local, on-the-fly evaluation.[14] These restrictions ensure feasibility but demand L-attributed grammars, in which attribute dependencies flow left to right without cycles, to maintain efficient, single-sweep processing.[17]
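The disjointness requirement can be stated as a small check (hypothetical, in Python): for each nonterminal, the FIRST sets of its alternatives must not overlap, since otherwise one token of lookahead cannot select a production.

    def is_ll1(alternative_first_sets):
        """True if the FIRST sets of the alternatives are pairwise disjoint."""
        seen = set()
        for first in alternative_first_sets:
            if seen & first:              # two alternatives can begin with the same
                return False              # token, so one-token lookahead cannot choose
            seen |= first
        return True

    # stmt -> "if" ... | "while" ... | IDENT "=" ...      (disjoint: LL(1))
    print(is_ll1([{"if"}, {"while"}, {"IDENT"}]))         # True
    # stmt -> IDENT "=" ... | IDENT "(" ...               (conflict on IDENT)
    print(is_ll1([{"IDENT"}, {"IDENT"}]))                 # False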
Code Generation and Optimization
In one-pass compilers, code generation proceeds immediately upon recognition of syntactic constructs during parsing, emitting target machine code in a sequential manner without buffering the full abstract syntax tree or other program structures.[18] This approach integrates code emission directly into the parsing process, allowing for rapid translation of source elements like expressions, assignments, and control structures into instructions for a target architecture, such as a reduced instruction set computer (RISC).[18]

Symbol resolution during code generation relies on the symbol table constructed incrementally as declarations are processed in the single pass. As identifiers are encountered, their attributes—such as types, scopes, and memory addresses—are entered into the table, enabling immediate lookup and insertion of resolved values like variable offsets or procedure entry points into the generated code.[18] For instance, when a variable declaration is parsed, its address is allocated (often with alignment to optimize access), and subsequent references use this information on the fly without backtracking.[18] This forward-only resolution imposes constraints, such as requiring declarations before uses in most cases to avoid unresolved references.[1]

Due to the linear processing nature, optimizations in one-pass compilers are confined to local techniques that can be applied immediately within the current context, such as constant folding, which evaluates and replaces constant expressions at compile time (e.g., simplifying x := 5 + 3 to x := 8).[19][1] Algebraic simplifications, like eliminating identities such as x + 0 = x, may also occur locally during expression evaluation.[19] Global optimizations, including dead code elimination or interprocedural analysis across the entire program, are not feasible because the compiler lacks a complete view of the code structure.[19]
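A sketch of local constant folding in this setting (hypothetical, in Python): each operand is tagged as a compile-time constant or a runtime value, and an addition of two constants is evaluated immediately instead of emitting instructions, so x := 5 + 3 compiles as if it were x := 8.

    code = []

    def operand(tok):
        # A literal stays a compile-time constant; an identifier becomes a
        # runtime value loaded onto the stack as soon as it is parsed.
        if tok.isdigit():
            return ("const", int(tok))
        code.append(f"LOAD {tok}")
        return ("runtime", None)

    def add(lhs, rhs):
        if lhs[0] == "const" and rhs[0] == "const":
            return ("const", lhs[1] + rhs[1])    # folded: nothing emitted
        for op in (lhs, rhs):                    # materialize a remaining constant
            if op[0] == "const":
                code.append(f"PUSH {op[1]}")     # (order is safe: ADD is commutative)
        code.append("ADD")
        return ("runtime", None)

    print(add(operand("5"), operand("3")), code)   # ('const', 8) []  -- no code emitted
    print(add(operand("x"), operand("3")), code)   # runtime: LOAD x / PUSH 3 / ADD

Because the folding decision needs only the two operands just parsed, it fits entirely within the current context and never requires revisiting earlier output, which is exactly the locality constraint described above.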
The output of code generation in one-pass compilers typically consists of direct assembly code or relocatable object code, tailored for an abstract or physical machine without necessitating additional passes for assembly or basic linking in straightforward scenarios.[18] For example, instructions for stack-based operations or register loads are emitted sequentially, with symbol table data ensuring proper addressing, though complex linking for external references may still require post-processing.[18]
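For instance, a statement such as x := y + 1 might leave the compiler as the following stack-machine assembly, with operand addresses taken from the symbol table built during the same pass (a hypothetical sketch in Python; the offsets and mnemonics are illustrative):

    symtab = {"x": {"offset": 0}, "y": {"offset": 4}}   # recorded when the declarations were parsed

    asm = [
        f"LOAD  fp+{symtab['y']['offset']}",   # push y, address resolved via the symbol table
        "PUSH  1",
        "ADD",
        f"STORE fp+{symtab['x']['offset']}",   # pop the result into x's allocated slot
    ]
    print("\n".join(asm))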