LR parser

An LR parser is a bottom-up parser that performs a left-to-right scan of the input while constructing a rightmost derivation in reverse order, enabling the efficient recognition of deterministic context-free grammars in linear time. It uses a stack to manage state and a lookahead symbol (typically one token) to decide between shifting input onto the stack or reducing previously shifted symbols according to the grammar's production rules. LR parsers are particularly powerful because they can handle a broader class of grammars than top-down parsers like LL, including those with left recursion and certain resolvable ambiguities, without requiring backtracking. Developed by Donald E. Knuth in 1965, LR parsing originated as a generalization of simpler precedence parsing methods to address the limitations of early compiler technologies. In his seminal paper, Knuth introduced LR(k) grammars, where k denotes the number of lookahead symbols, proving that such parsers could deterministically recognize the largest class of context-free languages parsable in linear time by a deterministic pushdown automaton. This innovation laid the foundation for practical compiler construction, as LR parsers detect syntax errors early and support mechanical generation from grammar specifications. Common variants include canonical LR(1), which uses full lookahead sets for maximum precision but results in larger tables; simple LR (SLR), a simpler variant with smaller tables at the cost of some grammar coverage; and LALR(1), which merges lookahead sets from canonical LR to balance efficiency and power, making it suitable for real-world applications. Tools like Yacc and its open-source successor GNU Bison implement LALR(1) parsing and are widely adopted in compiler front-ends due to their ability to handle complex, real programming language grammars.

Introduction

Definition and Role in Parsing

An LR parser is a type of bottom-up parser that processes input strings from left to right while constructing a rightmost derivation in reverse, enabling the recognition of deterministic context-free languages. It operates using a deterministic finite automaton (DFA) encoded in parsing tables, where decisions for shifting, reducing, or accepting are made based on the current state (representing parser position) and a bounded lookahead of input symbols, typically one for LR(1) variants. This mechanism was formalized by Knuth in his definition of LR(k) grammars, which characterize languages parsable efficiently with k lookahead symbols. In the context of compiling, LR parsers play a crucial role in the front-end phase by analyzing the token stream to generate parse trees or abstract syntax trees (ASTs), verifying syntactic validity, and facilitating early error detection during a single left-to-right pass over the input. They are particularly suited for programming languages, where grammars often include complex structures like expressions with operator precedence and associativity, allowing the parser to build hierarchical representations that subsequent phases (e.g., semantic analysis) can use. By recognizing valid sentence structures in linear time (specifically O(n) for an input of length n), LR parsers ensure efficient processing without rescanning prior input. Compared to top-down parsers such as LL parsers, which predict derivations from the start symbol downward, LR parsers excel in handling left-recursive grammars directly without requiring grammar transformations or backtracking, as they build the parse bottom-up from terminals. This capability allows LR parsers to deterministically recognize a strictly larger class of context-free grammars than LL(k) parsers for any fixed k, encompassing all LL(k) grammars as a proper subset while supporting the left-recursive structures common in real-world languages. Overall, these advantages make LR parsers a cornerstone for practical compiler design, offering mechanical generation from grammars and robust error handling.

Historical Development

The LR parser was first formalized by Donald E. Knuth in 1965 as part of his foundational work on syntax-directed translation for programming languages. In his seminal paper, Knuth introduced the LR(k) parsing family, which enables deterministic parsing of a broad class of context-free grammars using left-to-right scans and rightmost derivations in reverse, with k symbols of lookahead. This approach generalized earlier precedence parsing techniques, providing a theoretical framework for efficient, table-driven parsers capable of handling a wide class of grammars in linear time. Building on Knuth's LR(k) foundation, subsequent developments focused on practical variants to reduce table size. The simplest form, LR(0), emerged as a variant without lookahead, suitable for a narrow class of unambiguous grammars but limited in scope. In 1971, Frank L. DeRemer proposed Simple LR (SLR(1)), which incorporates one symbol of lookahead using FOLLOW sets to resolve conflicts more effectively while maintaining smaller tables than full LR(1). Further refinement came with Look-Ahead LR (LALR(1)), positioned as a practical compromise between power and efficiency; an efficient algorithm for computing its lookahead sets was detailed by DeRemer and Thomas J. Pennello in 1982, merging LR(0) states with propagated lookaheads to achieve near-LR(1) capability with significantly reduced state space. LR parsing's influence extended rapidly to compiler construction tools, notably through its integration into Yet Another Compiler-Compiler (Yacc), developed by Stephen C. Johnson at Bell Laboratories in 1975. Yacc automated LALR(1) parser generation from grammar specifications, revolutionizing compiler development by enabling rapid prototyping of language processors. This tool's legacy evolved into open-source implementations, including GNU Bison in 1985, which extended Yacc's functionality for broader portability and enhancements like improved error reporting.
Key milestones in LR parsing highlight its recognition as a powerful method for deterministically processing grammars beyond the reach of earlier techniques, contrasting with the limitations of earlier top-down parsers. Canonical LR(1) parsers, as originally defined by Knuth, offer the highest expressiveness among the variants but at the cost of larger tables and higher construction overhead, underscoring the trade-offs that drove the adoption of SLR(1) and LALR(1) in real-world applications.

Bottom-Up Parsing Basics

Shift-Reduce Mechanism

The shift-reduce mechanism forms the core of bottom-up parsing, where the parser processes input symbols from left to right, incrementally constructing a parse tree by alternating between shifting symbols onto a stack and reducing matched sequences according to the grammar's productions. This approach simulates the reverse of a rightmost derivation, starting from the input string and working toward the grammar's start symbol. In the shift action, the parser moves the next input symbol (a terminal) from the input buffer onto the top of the parse stack without applying any production, preserving the symbol for potential future matching with production rules. This is typically performed when the current stack top does not yet form a complete syntactic construct. The reduce action reverses a derivation step by identifying a substring of symbols on the stack's top that matches the right-hand side of a grammar rule and replacing it with the non-terminal on the left-hand side, thereby recognizing a larger syntactic unit and advancing the construction upward. Reductions occur only when the matched substring qualifies as a handle in the current right-sentential form. A handle is the substring in a right-sentential form that matches the right-hand side of some production and whose reduction yields the prior right-sentential form in the rightmost derivation of the input; for unambiguous context-free grammars, every right-sentential form possesses exactly one such unique handle, guaranteeing a deterministic path through viable prefixes. General shift-reduce parsing is inherently non-deterministic, as multiple actions may compete at a configuration: a shift might continue building a construct, while one or more reduces could apply to different candidate handles, resulting in shift-reduce conflicts (ambiguity between shifting and reducing) or reduce-reduce conflicts (ambiguity among multiple applicable productions). LR parsers resolve this non-determinism through lookahead symbols and deterministic parsing tables, enabling efficient handling of a broad class of grammars.
For a concrete illustration, consider parsing the input "id + id" against an expression grammar with productions E → E + T, E → T, T → id (assuming $ as end marker). The parser shifts "id" (now stack: id), reduces it to T via T → id (stack: T), then reduces to E via E → T (stack: E); it shifts "+" (stack: E +), shifts the second "id" (stack: E + id), reduces to T (stack: E + T), and finally reduces to E via E → E + T (stack: E), accepting the input as a valid expression. This sequence highlights how shifts accumulate terminals and non-terminals, while reduces prune handles to form higher-level structures. The parse stack maintains the current partial parse, enabling these actions to interact seamlessly in the overall parse.
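The trace above can be replayed with a minimal sketch. The helper names are hypothetical and the action sequence is hardcoded; a real LR parser would choose each action from its tables.

```python
# Replays the shift-reduce sequence from the text for the grammar
# E -> E + T | T, T -> id on the input "id + id".

def shift(stack, tokens):
    # Move the next input terminal onto the parse stack.
    stack.append(tokens.pop(0))

def reduce(stack, rhs, lhs):
    # Replace a matched right-hand side on the stack top with its LHS.
    assert stack[-len(rhs):] == rhs, "handle not on top of stack"
    del stack[-len(rhs):]
    stack.append(lhs)

tokens = ["id", "+", "id"]
stack = []
shift(stack, tokens)                 # stack: ['id']
reduce(stack, ["id"], "T")           # stack: ['T']
reduce(stack, ["T"], "E")            # stack: ['E']
shift(stack, tokens)                 # stack: ['E', '+']
shift(stack, tokens)                 # stack: ['E', '+', 'id']
reduce(stack, ["id"], "T")           # stack: ['E', '+', 'T']
reduce(stack, ["E", "+", "T"], "E")  # stack: ['E']
print(stack, tokens)                 # ['E'] []
```

The final stack holds only the start symbol E with the input exhausted, which is exactly the accepting configuration described in the trace.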

Parse Stack and Actions

In LR parsing, the parse stack serves as the central data structure for maintaining the state of the parsing process, building the parse tree incrementally from the input string. The stack is initialized as empty or with an initial state (often denoted as state 0), representing the starting point of the grammar's augmented form. It grows and contracts dynamically as the parser processes input symbols, encoding both the sequence of grammar symbols derived so far and the current parsing configuration to enable deterministic decisions. The stack's composition alternates between grammar symbols (terminals or non-terminals) and parser states, which are indices into a finite automaton derived from the grammar. States capture the possible partial parses at each point, while symbols represent the actual input or derived elements accumulated in the current right-sentential form. For instance, after processing a prefix of the input, the stack might hold a sequence like [state 0, terminal 'a', state 3, non-terminal A, state 5], where the top determines the next action based on the current input lookahead. This interleaved structure ensures that the parser can track the viable prefixes of the grammar efficiently. The primary actions on the stack build upon the foundational shift-reduce mechanism of bottom-up parsing. A shift action appends the next input terminal symbol to the stack followed by the corresponding new state, extending the partial parse to include the current symbol. A reduce action identifies a complete right-hand side (RHS) of a production on the top of the stack, pops twice the length of the RHS in elements (symbols and states), pushes the production's left-hand side (LHS) non-terminal, and then pushes the state transitioned to via the LHS. Beyond these, an accept action occurs when the stack holds the start symbol and the input is fully consumed, signaling a successful parse of the entire string.
Conversely, an error action is triggered if no valid shift or reduce is possible, indicating an invalid input prefix that requires recovery or rejection. These actions ensure the stack always represents a valid prefix of some right-sentential form in the grammar. Stack growth occurs during shifts, as each shift appends two elements (symbol and state), simulating the consumption of the input prefix. Contraction happens via reduces, where popping 2 × |RHS| elements removes the matched substring and replaces it with the single non-terminal plus its state, effectively collapsing the derivation tree bottom-up. This dynamic resizing allows the parser to handle nested structures without revisiting earlier decisions. The stack's role in ambiguity resolution is crucial: by preserving the full history of symbols and states, it disambiguates multiple possible parses for the same input prefix, guiding the parser toward the unique rightmost derivation required for deterministic context-free languages. To illustrate, consider a simple evolution during parsing:
Initial: [state 0]

After shift on 'a': [state 0, 'a', state 2]

After shift on 'b': [state 0, 'a', state 2, 'b', state 4]

After reduce (e.g., A → a b): [state 0, A, state 3]  // Popped 4 elements (2 symbols + 2 states), pushed A and new state

Accept: Stack holds [start symbol S, final state] with end of input
This trace shows the stack's initialization as empty (or with state 0) and its transformation, highlighting how shifts build depth while reduces prune and restructure.
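The evolution above can be written out concretely. The state numbers are illustrative, matching the sketch rather than any particular grammar's tables:

```python
# Interleaved symbol/state stack: a shift pushes (symbol, state),
# a reduce pops 2 * |RHS| elements and pushes (LHS, goto-state).
# All state numbers here are made up for illustration.

stack = [0]                 # initialization: just the start state
stack += ["a", 2]           # shift 'a', enter state 2
stack += ["b", 4]           # shift 'b', enter state 4

# Reduce A -> a b: pop 2 * 2 = 4 elements (two symbols, two states).
rhs_len = 2
del stack[-2 * rhs_len:]
exposed_state = stack[-1]   # state 0 is exposed again
stack += ["A", 3]           # push LHS A and goto(0, A) = 3 in this sketch

print(exposed_state, stack)  # 0 [0, 'A', 3]
```

Note how the reduce collapses four stack elements into two, exactly as in the diagram.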

Core LR Parsing Algorithm

Parser Loop and Decisions

The LR parser executes an iterative algorithm that drives the parsing process, maintaining a stack to track the partial parse and processing the input from left to right. This loop continues until the parser reaches an accept state or encounters an error, with decisions guided by two precomputed tables: the action table, which dictates shifts, reductions, accepts, or errors based on the current state and lookahead token, and the goto table, which handles transitions for nonterminals after reductions. The algorithm builds on the shift-reduce mechanism, where the stack holds alternating states and symbols to represent valid prefixes of right-sentential forms. In the decision process, the parser first consults the action table using the state at the top of the stack and the current lookahead token from the input stream. If the entry specifies a shift to state i, the lookahead token is pushed onto the stack followed by state i, advancing the input. A reduce entry for production j (with a right-hand side of length r) prompts popping 2r elements (symbols and states), exposing the state below, then pushing the production's left-hand-side nonterminal A and the new state from the goto table entry for that exposed state and A. An accept entry signals successful parsing, while an empty entry triggers an error, potentially invoking recovery mechanisms or halting. These table-driven decisions ensure deterministic parsing for LR grammars without backtracking. Input handling occurs incrementally within the loop: the lookahead is the next unconsumed token, fetched initially before the loop and updated only on shifts. Reductions do not advance the input, preserving the lookahead for subsequent decisions, which allows the parser to resolve ambiguities using bounded lookahead. Termination happens upon encountering the accept action, which occurs when the stack holds the initial state followed by the augmented start symbol and the end-of-input marker serves as lookahead, confirming the input derives the start symbol. Otherwise, an error terminates unsuccessfully, reporting a syntax error. The following pseudocode illustrates the core loop structure:
input = scanner.getNextToken();  // Initial lookahead
stack = [0];  // Initial state (augmented grammar)

while true:
    current_state = top(stack)
    action = ACTION[current_state, input]

    if action == shift i:
        push(stack, input)
        push(stack, i)
        input = scanner.getNextToken()
    else if action == reduce j:  // Production A → β, |β| = r
        pop(stack, 2 * r)  // Remove states and symbols
        exposed_state = top(stack)
        push(stack, A)
        push(stack, GOTO[exposed_state, A])
        // Do not advance input; recheck with same lookahead
    else if action == accept:
        return "Parse successful"
    else:  // Error
        report_syntax_error()
        // Optional: recovery or halt
This loop can be visualized as a flowchart with a central iteration node branching to shift (advance input, update stack), reduce (pop/push via goto, loop back), accept (exit success), or error (exit failure), emphasizing the table-based determinism.
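The loop can also be instantiated concretely. The following sketch drives hand-built SLR(1) tables for the toy grammar S' → E, E → E + T, E → T, T → id; the table entries were worked out by hand for this illustration and are assumptions of the sketch, and for brevity the stack holds only states (the symbols are implicit in them).

```python
# Table-driven LR parser loop for the toy grammar
#   0: S' -> E   1: E -> E + T   2: E -> T   3: T -> id

PRODS = {1: ("E", 3), 2: ("E", 1), 3: ("T", 1)}  # production: (lhs, |rhs|)

ACTION = {
    (0, "id"): ("s", 3),
    (1, "+"): ("s", 4), (1, "$"): ("acc",),
    (2, "+"): ("r", 2), (2, "$"): ("r", 2),
    (3, "+"): ("r", 3), (3, "$"): ("r", 3),
    (4, "id"): ("s", 3),
    (5, "+"): ("r", 1), (5, "$"): ("r", 1),
}
GOTO = {(0, "E"): 1, (0, "T"): 2, (4, "T"): 5}

def parse(tokens):
    tokens = tokens + ["$"]      # append end-of-input marker
    stack = [0]                  # initial state
    pos = 0
    while True:
        act = ACTION.get((stack[-1], tokens[pos]))
        if act is None:          # empty entry: syntax error
            return False
        if act[0] == "s":        # shift: push new state, advance input
            stack.append(act[1])
            pos += 1
        elif act[0] == "r":      # reduce: pop |rhs| states, consult GOTO
            lhs, n = PRODS[act[1]]
            del stack[-n:]
            stack.append(GOTO[(stack[-1], lhs)])
        else:                    # accept
            return True

print(parse(["id", "+", "id"]))  # True
print(parse(["id", "+"]))        # False
```

Note that reductions do not advance `pos`, mirroring the "recheck with same lookahead" step in the pseudocode.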

Handles and Reductions

In bottom-up parsing, particularly within the LR framework, a handle is formally defined as the right-hand side (RHS) of a production in a rightmost derivation of the input string, specifically the substring that corresponds to the production applied in the reverse step of the derivation process. This identification ensures that reductions mimic the inverse of the rightmost derivation, preserving the structure of the parse tree while building it bottom-up. The handle's position on the parser stack represents the viable prefix that can be reduced to advance toward the start symbol. The reduction process in an LR parser involves recognizing a handle on the top of the stack, popping the stack symbols corresponding to the RHS β of the production A → β, pushing the left-hand side (LHS) A onto the stack, and transitioning to a new state via the goto function based on the exposed state and A. For a handle with RHS β of length |β|, the parser pops 2 × |β| elements from the stack, corresponding to the |β| symbols in β and the |β| states immediately following each symbol. This operation effectively replaces the handle with A, updating the stack configuration to reflect the reduced sentential form. In grammars with right recursion, such as those defining lists or expressions (e.g., E → T + E | T), multiple reductions may occur consecutively without intervening shifts, as each reduction exposes another handle on the stack. This sequence handles the recursive structure by iteratively reducing inner productions before outer ones, ensuring the parser correctly associates right-recursive constructs without backtracking. Detailed step-by-step illustrations of handles and reductions for specific grammars are provided in later sections.
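The cascade of reductions under right recursion can be sketched as follows, using the hypothetical right-recursive grammar E → T + E | T, T → id on the input "id + id":

```python
# With a right-recursive grammar, the end of the input triggers
# consecutive reductions with no shifts in between: reducing the
# inner E exposes the handle "T + E" for the outer reduction.

def reduce(stack, rhs, lhs):
    # Replace a matched right-hand side on the stack top with its LHS.
    assert stack[-len(rhs):] == rhs, "handle not on top of stack"
    del stack[-len(rhs):]
    stack.append(lhs)

stack = []
stack.append("id")                   # shift first id
reduce(stack, ["id"], "T")           # stack: ['T']
stack.append("+")                    # shift +
stack.append("id")                   # shift second id
reduce(stack, ["id"], "T")           # stack: ['T', '+', 'T']
# End of input reached: two reductions in a row, no shifts between.
reduce(stack, ["T"], "E")            # stack: ['T', '+', 'E']
reduce(stack, ["T", "+", "E"], "E")  # stack: ['E']
print(stack)                         # ['E']
```

The two final reductions occur back-to-back because each one exposes the next handle, exactly the behavior described above.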

LR Parser Variants

LR(0) Parsers

LR(0) parsers represent the simplest form of LR parsers, relying exclusively on the current parser state to determine shift or reduce actions without examining any lookahead symbols. Introduced by Knuth in his seminal work on LR(k) parsing, these parsers process input from left to right while constructing a rightmost derivation in reverse, using a stack to manage viable prefixes of the grammar. Central to LR(0) parsing are LR(0) items, which consist of a production with a dot (·) marking the current position in the right-hand side, indicating how much of the production has been recognized. For example, given a production A \to \alpha \beta, possible items include A \to \cdot \alpha \beta (before any matching begins) and A \to \alpha \cdot \beta (after recognizing \alpha). These items form the basis of parser states, where each state is a set of such items representing all possible partial parses at that point. Decisions in an LR(0) parser depend solely on the item set in the current state, aligning with the general LR algorithm's shift-reduce mechanism but without lookahead refinement. The construction of an LR(0) parser begins with an augmented grammar, adding a new start production S' \to S to ensure a unique accepting configuration for the entire input. States are generated as the collection of sets of LR(0) items, starting from the initial state containing S' \to \cdot S and its closure (all items derivable by expanding nonterminals that follow the dot). Transitions between states are computed using the goto function: for a state I and symbol X (terminal or nonterminal), the goto state is the closure of items obtained by advancing the dot past X in relevant items from I. This process yields a deterministic finite automaton (DFA) whose states correspond to parser configurations and edges to shift actions on symbols. Reduce actions occur in states containing complete items A \to \gamma \cdot, and the accept action in the state with S' \to S \cdot. The resulting action and goto tables drive the parser loop.
In terms of parsing power, LR(0) parsers recognize exactly the class of LR(0) grammars, a proper subset of the more general LR(1) grammars that allow one symbol of lookahead. This limited scope means LR(0) parsers handle only unambiguous context-free grammars where no shift-reduce or reduce-reduce conflicts arise during table construction, making them unsuitable for many practical grammars that require lookahead to resolve ambiguities, such as those involving operator precedence or the dangling-else problem. A grammar is formally LR(0) if, in the DFA constructed from its LR(0) items, no state contains both a complete item (indicating a reduce) and an item with the dot before a terminal (indicating a shift), and no state has multiple complete items (indicating a reduce-reduce conflict). LR(0) parsers offer advantages in simplicity and efficiency for the grammars they can handle: their tables are smaller and quicker to construct than those of full LR(1) parsers, as they avoid propagating lookahead sets, and they enable linear-time parsing with early error detection. However, these benefits come at the cost of reduced expressiveness; most real-world programming language grammars introduce conflicts in LR(0) tables, necessitating enhancements like SLR(1) or LALR(1) to resolve them without exponentially increasing table size. In practice, LR(0) parsers are rarely used standalone due to their frequent conflicts, though the underlying automaton serves as a foundation for more powerful variants.

SLR(1) and LALR(1) Parsers

SLR(1) parsers extend LR(0) parsers by incorporating a single symbol of lookahead to resolve reduce actions more precisely, using FOLLOW sets derived from the grammar. In an SLR(1) parser, for a state containing an item A \to \alpha \bullet (indicating a completed production), a reduce action is placed in the action table for lookahead symbol a if a \in \text{FOLLOW}(A). This approach allows SLR(1) to handle a broader class of grammars than LR(0) by avoiding some shift-reduce conflicts where the lookahead distinguishes between shifting and reducing. The construction begins with the LR(0) automaton and augments the action entries based on FOLLOW sets, making it simpler to implement than full lookahead methods. LALR(1) parsers achieve greater precision by building on LR(1) items but merging states that share the same core (the set of LR(0) items), then propagating and unioning the lookahead sets from the original LR(1) states. This merging reduces the number of states compared to canonical LR(1) while retaining much of its parsing power, resulting in compact tables suitable for practical use. The lookahead sets for reductions are computed more accurately than in SLR(1), as they reflect context-specific possibilities rather than global FOLLOW sets. LALR(1) forms the basis for tools like Yacc and Bison, which generate efficient parsers for context-free grammars in compiler construction. SLR(1) is simpler to construct due to its reliance on FOLLOW sets but often introduces more conflicts, as these sets may over-approximate valid lookaheads. In contrast, LALR(1) resolves many such conflicts by using propagated lookaheads, allowing it to parse a strict superset of SLR(1) grammars with fewer states than full LR(1). The power hierarchy is LR(0) ⊂ SLR(1) ⊂ LALR(1) ⊂ LR(1), where each inclusion is strict; LALR(1) offers the best balance of power and table size for real-world applications.
LALR(1) is sufficient for most practical parsing problems, including the grammars of many programming languages. A representative example illustrates the difference in conflict resolution: consider the grammar
S' → S
S → L = R | R
L → * R | id
R → L
In the SLR(1) parser, a shift-reduce conflict arises in the state containing R \to L \bullet because = \in \text{FOLLOW}(R), prompting a reduce on lookahead =, which conflicts with the shift for = from S \to L \bullet = R. However, in the LALR(1) parser, the lookahead for reducing R \to L in that state is propagated as only the end-of-input marker $ (not =), avoiding the conflict and allowing a correct shift on =. This demonstrates how LALR(1)'s precise lookaheads prevent unnecessary reduces that SLR(1) cannot avoid.
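The FOLLOW computation behind the SLR(1) conflict can be checked mechanically. This sketch computes FOLLOW sets for the example grammar; it relies on the grammar having no epsilon-productions, which keeps the FIRST computation trivial.

```python
# FOLLOW sets for S' -> S, S -> L = R | R, L -> * R | id, R -> L,
# confirming that '=' lands in FOLLOW(R), the source of the SLR conflict.

GRAMMAR = [("S'", ("S",)), ("S", ("L", "=", "R")), ("S", ("R",)),
           ("L", ("*", "R")), ("L", ("id",)), ("R", ("L",))]
NONTERMS = {lhs for lhs, _ in GRAMMAR}

def first(sym, seen=None):
    # FIRST of a symbol; a terminal is its own FIRST set.
    # No epsilon-productions, so only the leading RHS symbol matters.
    if sym not in NONTERMS:
        return {sym}
    seen = set() if seen is None else seen
    if sym in seen:
        return set()
    seen.add(sym)
    out = set()
    for lhs, rhs in GRAMMAR:
        if lhs == sym:
            out |= first(rhs[0], seen)
    return out

follow = {A: set() for A in NONTERMS}
follow["S'"].add("$")                      # end marker follows the start
changed = True
while changed:                             # iterate to a fixed point
    changed = False
    for lhs, rhs in GRAMMAR:
        for i, sym in enumerate(rhs):
            if sym not in NONTERMS:
                continue
            # FIRST of the next symbol, or FOLLOW(lhs) at the end of rhs.
            add = first(rhs[i + 1]) if i + 1 < len(rhs) else follow[lhs]
            if not add <= follow[sym]:
                follow[sym] |= add
                changed = True

print(sorted(follow["R"]))  # ['$', '='] -> '=' triggers the SLR conflict
```

Since = is in FOLLOW(R), the SLR(1) table places a reduce on = in the conflicting state, whereas LALR(1)'s state-specific lookaheads exclude it.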

Parsing Table Construction

Augmented Grammar and Items

In LR parsing, the grammar is first augmented to facilitate the construction of a deterministic parser. This involves introducing a new nonterminal symbol S', distinct from the original start symbol S, and adding a single production S' \to S. The resulting augmented grammar G' ensures a unique initial configuration for the parser and a distinct accept state upon recognizing the complete input string followed by the end marker. This augmentation prevents ambiguities in handling the start of the parse and clearly signals acceptance when the item [S' \to S \bullet] is reached, where \bullet denotes the position after the entire right-hand side has been processed. Central to LR parsing are LR items, which represent partial derivations in the parsing process and model the parser's state of knowledge during recognition. An LR item consists of a production rule with a \bullet inserted at some position in the right-hand side, indicating how much of the production has been matched so far. For a production A \to \alpha \beta, where \alpha and \beta are strings of symbols, possible items include [A \to \bullet \alpha \beta] (no progress made), [A \to \alpha \bullet \beta] (\alpha matched, \beta pending), and [A \to \alpha \beta \bullet] (complete match). The initial item for the augmented grammar is [S' \to \bullet S], signifying the parser's starting point. These items capture the parser's expectations: the symbols before the dot have been recognized in the input so far, while those after suggest future actions like shifting terminals or reducing nonterminals. LR items are categorized into kernels and closures to build the states of the LR automaton efficiently. The kernel of a set of items comprises those where the dot is not at the beginning of the right-hand side (plus, by convention, the initial item [S' \to \bullet S]), reflecting the core progress from prior transitions (e.g., items like [A \to \alpha \bullet \beta] with \alpha nonempty).
In contrast, the closure expands this kernel by incorporating additional items derived from any nonterminal immediately following the dot in kernel items, using the grammar's productions to anticipate possible expansions (e.g., if a kernel item has [A \to \alpha \bullet B \beta], and B \to \gamma is a production, then [B \to \bullet \gamma] is added to the closure). This distinction allows compact representation of parser states: kernels identify unique configurations reached via symbol transitions, while closures complete the set by including all viable next derivations, enabling the parser to handle nondeterminism deterministically. The purpose of these elements is to formalize the positions in potential parses, providing a foundation for computing transitions and decisions without backtracking. For example, consider a simple augmented grammar with original start production S \to A and A \to a. The initial state's kernel is \{ [S' \to \bullet S] \}, and its closure adds [S \to \bullet A] and [A \to \bullet a], modeling the full set of possibilities at the outset. This structure underpins all LR variants by representing the parser's localized view of the derivation at each step.

Closure and State Computation

In LR parsing, the closure operation computes the complete set of valid items reachable from an initial set of LR(0) items, ensuring the parser state captures all possible expansions of nonterminals immediately following the current dot. Given a set of items I, the closure \operatorname{cl}(I) is formed by initializing with the items in I and then iteratively adding, for every item [A \to \alpha \cdot B \beta] in the current set (where B is a nonterminal), all items of the form [B \to \cdot \gamma] for each production B \to \gamma in the grammar. This process continues recursively until no new items are added, reaching a fixed point. The result represents all productions that could potentially be derived after the prefixes encoded in I, akin to epsilon-like expansions in nondeterministic finite automata. The goto function defines transitions between parser states based on grammar symbols. For a set of items I and a symbol X (terminal or nonterminal), \operatorname{goto}(I, X) is the closure of the set \{ [A \to \alpha X \cdot \beta] \mid [A \to \alpha \cdot X \beta] \in I \}. This moves the dot past X in all applicable items, forming new items, and then applies the closure operation to include any further expansions. The goto function thus simulates the shift or reduction transitions in the parser's deterministic finite automaton (DFA). To construct the full set of parser states, begin with the initial state I_0 = \operatorname{cl}(\{ [S' \to \cdot S] \}), where S' \to S is the augmented start production and S is the grammar's start symbol. Subsequent states are generated by applying \operatorname{goto}(I, X) for every existing state I and every grammar symbol X, collecting unique sets until no new states are produced. This enumeration can be performed using breadth-first search (BFS) or depth-first search (DFS) to explore the state graph: start a queue or stack with I_0, compute gotos from the current state, add unvisited results as new states, and record transitions until exhaustion.
The resulting states, numbered sequentially from 0 to n, form the canonical collection of LR(0) item sets, which underlies the parser's DFA. As an illustrative example, consider a simple augmented grammar with productions S' \to S, S \to A, A \to a B, and B \to b. Starting with the initial item set I = \{ [S' \to \cdot S] \}, the closure adds [S \to \cdot A] since the dot precedes nonterminal S. Then, from [S \to \cdot A], add [A \to \cdot a B]. No further nonterminals follow the dot in the new item, so \operatorname{cl}(I) = \{ [S' \to \cdot S], [S \to \cdot A], [A \to \cdot a B] \}. Now, applying \operatorname{goto}(\operatorname{cl}(I), A) yields \{ [S \to A \cdot] \}, whose closure remains the same since no nonterminal follows the dot. This demonstrates how closure expands the item set to include potential derivations, while goto advances the parse position.
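The worked example above can be reproduced with a small sketch, representing each item as a (production index, dot position) pair:

```python
# Closure and goto over LR(0) items for the example grammar
# S' -> S, S -> A, A -> a B, B -> b. Item (0, 0) is [S' -> . S].

GRAMMAR = [("S'", ("S",)), ("S", ("A",)), ("A", ("a", "B")), ("B", ("b",))]
NONTERMS = {lhs for lhs, _ in GRAMMAR}

def closure(items):
    items = set(items)
    changed = True
    while changed:                     # iterate to a fixed point
        changed = False
        for p, d in list(items):
            rhs = GRAMMAR[p][1]
            if d < len(rhs) and rhs[d] in NONTERMS:
                # Add [B -> . gamma] for every production of the
                # nonterminal just after the dot.
                for i, (lhs, _) in enumerate(GRAMMAR):
                    if lhs == rhs[d] and (i, 0) not in items:
                        items.add((i, 0))
                        changed = True
    return frozenset(items)

def goto(items, X):
    # Advance the dot over X, then close the resulting item set.
    return closure({(p, d + 1) for p, d in items
                    if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == X})

I = closure({(0, 0)})                  # cl({[S' -> . S]})
print(sorted(I))                       # [(0, 0), (1, 0), (2, 0)]
print(sorted(goto(I, "A")))            # [(1, 1)]  i.e. {[S -> A .]}
```

The printed sets correspond to \{[S' \to \cdot S], [S \to \cdot A], [A \to \cdot a B]\} and \{[S \to A \cdot]\}, matching the hand computation in the text.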

Table Filling and Conflicts

Action and Goto Table Entries

The action and goto tables form the core decision mechanism in an LR parser, dictating shifts, reductions, accepts, or errors based on the current stack-top state and the next input symbol. These tables are populated after computing the set of parser states, each represented by a collection of LR items, using the goto function to determine transitions on grammar symbols. The rows of both tables correspond to the numbered states (typically from 0 to the total number of states), while the columns of the action table align with terminal symbols (including the end marker $), and the columns of the goto table align with non-terminal symbols. In the action table, for a given state i and terminal a, the entry is set to "shift j" if the goto function applied to state i and symbol a yields a new state j; this instructs the parser to push a onto the stack and transition to state j. For reduction actions, if state i contains a completed item [A \to \beta \bullet] (where \beta is the right-hand side of production A \to \beta), the entry action[i, a] is set to "reduce A \to \beta" for each terminal a in the appropriate lookahead set; in LR(0) parsers, this applies to all terminals, whereas variants like SLR(1), LALR(1), or LR(1) restrict reductions to lookaheads in FOLLOW(A) or propagated lookahead sets to ensure validity. The accept entry is specifically placed in the action table for the state containing the item [S' \to S \bullet] (where S' is the augmented start symbol and S the original start), setting action[i, $] to "accept" to indicate successful parsing upon encountering the end-of-input marker. The goto table is populated exclusively for non-terminals: for state i and non-terminal A, the entry goto[i, A] is set to j if the goto function from state i on A yields state j; this transition occurs after a reduction, pushing A onto the stack and moving to j.
Entries in either table that remain undefined after processing all states and symbols are marked as "error," triggering parse failure handling. The table-filling process can be formalized in pseudocode as follows, iterating over the computed states and symbols:
for each state i in 0 to number_of_states - 1:
    // Handle shifts for terminals
    for each terminal a:
        j = goto(I_i, a)  // I_i is the item set for state i
        if j is defined:
            action[i][a] = "shift j"
        else:
            action[i][a] = "error"
    
    // Handle reductions and accept
    for each completed item [A → β •] in I_i where A ≠ S':
        for each a in lookahead_set(A → β):  // e.g., FOLLOW(A) in SLR(1), all terminals in LR(0)
            action[i][a] = "reduce A → β"
    if [S' → S •] in I_i:
        action[i][$] = "accept"
    
    // Handle gotos for non-terminals
    for each non-terminal A:
        j = goto(I_i, A)
        if j is defined:
            goto[i][A] = j
        // No else needed; undefined gotos are handled during parsing
This construction ensures deterministic behavior for LR(k) grammars, as originally formalized for general left-to-right parsing with bounded lookahead.

Shift-Reduce and Reduce-Reduce Conflicts

In LR parsing, a shift-reduce conflict arises in a parser state where the LR automaton contains an item of the form A \to \alpha \cdot a \beta, indicating a potential shift action on terminal a, alongside a completed item B \to \gamma \cdot whose lookahead set includes a, suggesting a reduce action instead. This ambiguity forces the parser to choose between shifting the next input symbol onto the stack or reducing the handle using the production for B. A classic illustration is the "dangling else" problem in grammars for conditional statements, such as:
stmt → if expr then stmt
     | if expr then stmt else stmt
     | other
In a state after parsing if expr then stmt, the parser encounters else and must decide whether to shift it (associating the else with the inner if) or reduce the inner statement (associating the else with the outer if). A reduce-reduce conflict occurs when a single parser state includes two or more completed items, such as A \to \beta \cdot and C \to \delta \cdot, with overlapping lookahead sets that permit reduction by either production on the same input symbol. This situation indicates that the grammar is not LR(k) for the given lookahead, as the parser cannot deterministically select which reduction to perform without additional context. Such conflicts are less common than shift-reduce conflicts but signal inherent nondeterminism in the grammar's structure. Conflicts are detected during the construction and filling of the action and goto tables, where multiple possible entries (such as both a shift and a reduce, or two reduces) are generated for the same pair of state and input symbol. Parser generators like Yacc report these as errors unless resolved, ensuring the table remains deterministic. To resolve shift-reduce and reduce-reduce conflicts in LR(0) and SLR(1) parsers, lookahead restrictions using FOLLOW sets are applied: a reduce for B \to \gamma is only entered for symbols in FOLLOW(B), preventing overlap with shift actions where possible. In LALR(1) parsers, more precise lookahead propagation merges sets from contributing LR(1) states while avoiding spurious conflicts, as described in algorithms that compute propagated lookaheads by traversing relations over the LR(0) automaton. These methods enlarge the class of parsable grammars without the size of full LR(1) tables. Grammar refactoring offers a fundamental approach to eliminating conflicts entirely, such as declaring operator precedences and associativities to guide shift-reduce resolutions or rewriting ambiguous productions to enforce a unique parse tree.
For the dangling else, associating else with the nearest if via precedence (e.g., else higher than then) or restructuring the grammar with a matched/unmatched statement hierarchy resolves the conflict while preserving the intended semantics.
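The conflict can be observed mechanically by building the LR(0) item sets for a compressed version of the grammar. The following Python sketch (the token abbreviations are hypothetical: i stands for "if expr then", e for "else", o for "other") computes the canonical LR(0) collection and locates the state that can both shift e and reduce the inner statement:

```python
# Hypothetical abbreviations: i = "if expr then", e = "else", o = "other".
# Items are (lhs, rhs, dot) triples; states are frozensets of items.
GRAMMAR = {"S'": [("S",)], "S": [("i", "S", "e", "S"), ("i", "S"), ("o",)]}

def closure(items):
    """Add [B -> .delta] for every nonterminal B appearing after a dot."""
    items = set(items)
    while True:
        new = {(r[d], prod, 0)
               for _, r, d in items
               if d < len(r) and r[d] in GRAMMAR
               for prod in GRAMMAR[r[d]]}
        if new <= items:
            return frozenset(items)
        items |= new

def goto(state, sym):
    """Advance the dot over sym, then close the resulting kernel."""
    return closure({(l, r, d + 1) for l, r, d in state
                    if d < len(r) and r[d] == sym})

# Enumerate all reachable states from the augmented start item.
states = {closure({("S'", ("S",), 0)})}
work = list(states)
while work:
    st = work.pop()
    for sym in {r[d] for _, r, d in st if d < len(r)}:
        nxt = goto(st, sym)
        if nxt not in states:
            states.add(nxt)
            work.append(nxt)

# A shift-reduce conflict: some item can shift terminal 'e' while a
# completed item in the same state calls for a reduction.
conflicts = [st for st in states
             if any(d < len(r) and r[d] == "e" for _, r, d in st)
             and any(d == len(r) and l != "S'" for l, r, d in st)]
for st in conflicts:
    print(sorted(st))
```

Exactly one state is reported, containing both [S → i S • e S] (shift e) and [S → i S •] (reduce), which is the dangling-else conflict in item-set form.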

Practical Examples

Grammar for Arithmetic Expressions

A standard grammar for simple arithmetic expressions, commonly used as a running example in LR parsing, defines expressions with addition and multiplication operators while respecting operator precedence and left associativity. The grammar rules are as follows:
  • E \to E + T \mid T
  • T \to T * F \mid F
  • F \to (E) \mid \text{id}
Here, E represents an expression, T a term (capturing multiplication precedence), and F a factor (handling parentheses and atomic values). The terminals are the operators + and *, the parentheses ( and ), \text{id} (e.g., variables or constants), and the end marker $. The non-terminals are E (the start symbol), T, and F. This grammar avoids ambiguity by using layered, left-recursive productions for both E and T, which enforce left associativity for the operators; for instance, the expression \text{id} + \text{id} + \text{id} is parsed as (\text{id} + \text{id}) + \text{id}, and multiplication takes precedence over addition due to the hierarchical structure of the non-terminals. To illustrate, consider the input string \text{id} * \text{id} + \text{id}. A rightmost [derivation](/page/Derivation) from the start symbol E proceeds as follows, replacing the rightmost non-terminal at each step and ultimately yielding the string with the desired grouping (\text{id} * \text{id}) + \text{id}:
  1. E \Rightarrow E + T
  2. \Rightarrow E + F
  3. \Rightarrow E + \text{id}
  4. \Rightarrow T + \text{id}
  5. \Rightarrow T * F + \text{id}
  6. \Rightarrow T * \text{id} + \text{id}
  7. \Rightarrow F * \text{id} + \text{id}
  8. \Rightarrow \text{id} * \text{id} + \text{id}
This derivation respects the grammar's precedence and associativity rules.
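The derivation above can be checked mechanically. A minimal Python sketch (function names are illustrative) replays each production application on the sentential form, replacing the rightmost occurrence of the expanded nonterminal at every step:

```python
# Replay the rightmost derivation of "id * id + id" and check the result.
def replace_rightmost(form, lhs, rhs):
    """Replace the rightmost occurrence of nonterminal lhs with rhs."""
    i = max(k for k, sym in enumerate(form) if sym == lhs)
    return form[:i] + rhs + form[i + 1:]

# One (lhs, rhs) pair per derivation step listed above.
steps = [
    ("E", ["E", "+", "T"]),
    ("T", ["F"]),
    ("F", ["id"]),
    ("E", ["T"]),
    ("T", ["T", "*", "F"]),
    ("F", ["id"]),
    ("T", ["F"]),
    ("F", ["id"]),
]

form = ["E"]
for lhs, rhs in steps:
    form = replace_rightmost(form, lhs, rhs)
print(" ".join(form))  # id * id + id
```

Because each step's left-hand side is the rightmost nonterminal in the current form, replacing its rightmost occurrence reproduces the rightmost derivation exactly.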

Step-by-Step Table Construction

To construct the parsing table for the arithmetic expression grammar using SLR(1), begin by augmenting the grammar with a new start production E' → E, where E is the original start symbol; this ensures a unique starting state and facilitates end-of-input handling. The original grammar is E → E + T | T, T → T * F | F, F → (E) | id, with productions numbered as follows: (1) E → E + T, (2) E → T, (3) T → T * F, (4) T → F, (5) F → (E), (6) F → id, and the augmented production (0) E' → E. The initial state, State 0, starts with the item [E' → •E] and is expanded using the closure operation, which adds all items obtained by expanding the nonterminal immediately after the dot with its productions, repeating until no new items are added; this yields State 0: [E' → •E], [E → •E + T], [E → •T], [T → •T * F], [T → •F], [F → •(E)], [F → •id]. From State 0, compute goto transitions on symbols: goto on E leads to State 1 with kernel [E' → E•] and [E → E• + T]; goto on T leads to State 2 with [E → T•], [T → T• * F]; goto on F leads to State 3 with [T → F•]; goto on ( leads to State 4 with kernel [F → (•E)] plus the closure items for E, T, and F; and goto on id leads to State 5 with [F → id•]. Continue enumerating states by applying goto from existing states until no new states are generated, resulting in 12 states total: State 6 from State 1 or 8 on + with [E → E + •T] and closures for T and F; State 7 from State 2 or 9 on * with [T → T * •F] and closures for F; State 8 from State 4 on E with [F → (E•)] and [E → E• + T]; State 9 from State 6 on T with [E → E + T•], [T → T• * F]; State 10 from State 7 on F with [T → T * F•]; and State 11 from State 8 on ) with [F → (E)•]. These states form the canonical collection of LR(0) items, capturing all possible parser configurations for this grammar.
To fill the ACTION table, for each state and terminal, place a shift action s_k if there is a goto transition on that terminal to state k; place a reduce action r_i on every terminal in the FOLLOW set of the left-hand nonterminal whenever the state contains a completed item A → α• (production i); and place accept in State 1 on $ (the end marker). The FOLLOW sets are: FOLLOW(E') = {$}, FOLLOW(E) = {$, ), +}, FOLLOW(T) = {$, ), +, *}, FOLLOW(F) = {$, ), +, *}, computed via the standard FIRST and FOLLOW algorithms to restrict reductions in SLR(1). For example, in State 0, shift ( via s4 and [id](/page/id) via s5; in State 5 ([F → id•]), reduce r6 (F → id) on FOLLOW(F) = {$, ), +, *}; in State 3 ([T → F•]), reduce r4 (T → F) on FOLLOW(T). The GOTO table entries are the state numbers from the goto transitions on nonterminals. The resulting SLR(1) tables are as follows, where the ACTION table uses terminals {+, *, (, ), id, $} and the GOTO table uses nonterminals {E, T, F}; the SLR refinement appears in the reduce actions, which are placed solely based on FOLLOW sets rather than more precise lookaheads, potentially introducing conflicts in less suitable grammars but none here. ACTION Table:
| State | + | * | ( | ) | id | $ |
|-------|---|---|---|---|----|---|
| 0 |  |  | s4 |  | s5 |  |
| 1 | s6 |  |  |  |  | accept |
| 2 | r2 | s7 |  | r2 |  | r2 |
| 3 | r4 | r4 |  | r4 |  | r4 |
| 4 |  |  | s4 |  | s5 |  |
| 5 | r6 | r6 |  | r6 |  | r6 |
| 6 |  |  | s4 |  | s5 |  |
| 7 |  |  | s4 |  | s5 |  |
| 8 | s6 |  |  | s11 |  |  |
| 9 | r1 | s7 |  | r1 |  | r1 |
| 10 | r3 | r3 |  | r3 |  | r3 |
| 11 | r5 | r5 |  | r5 |  | r5 |
GOTO Table:
| State | E | T | F |
|-------|---|---|---|
| 0 | 1 | 2 | 3 |
| 1 |  |  |  |
| 2 |  |  |  |
| 3 |  |  |  |
| 4 | 8 | 2 | 3 |
| 5 |  |  |  |
| 6 |  | 9 | 3 |
| 7 |  |  | 10 |
| 8 |  |  |  |
| 9 |  |  |  |
| 10 |  |  |  |
| 11 |  |  |  |
This table enables deterministic parsing without conflicts, confirming the grammar's SLR(1) suitability.
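Assuming the tables above, a complete table-driven parser is just a loop over ACTION and GOTO. The following Python sketch encodes the entries literally (s = shift, r = reduce by numbered production) and keeps only states on the stack, which suffices because the grammar symbols are implied by the states:

```python
# Table-driven SLR(1) parser for the expression grammar, using the
# ACTION and GOTO tables above. Entries are ('s', state), ('r', prod),
# or 'acc'; missing entries mean a syntax error.

PRODS = {1: ("E", 3), 2: ("E", 1), 3: ("T", 3),
         4: ("T", 1), 5: ("F", 3), 6: ("F", 1)}  # lhs, RHS length

ACTION = {
    (0, "("): ("s", 4), (0, "id"): ("s", 5),
    (1, "+"): ("s", 6), (1, "$"): "acc",
    (2, "+"): ("r", 2), (2, "*"): ("s", 7), (2, ")"): ("r", 2), (2, "$"): ("r", 2),
    (3, "+"): ("r", 4), (3, "*"): ("r", 4), (3, ")"): ("r", 4), (3, "$"): ("r", 4),
    (4, "("): ("s", 4), (4, "id"): ("s", 5),
    (5, "+"): ("r", 6), (5, "*"): ("r", 6), (5, ")"): ("r", 6), (5, "$"): ("r", 6),
    (6, "("): ("s", 4), (6, "id"): ("s", 5),
    (7, "("): ("s", 4), (7, "id"): ("s", 5),
    (8, "+"): ("s", 6), (8, ")"): ("s", 11),
    (9, "+"): ("r", 1), (9, "*"): ("s", 7), (9, ")"): ("r", 1), (9, "$"): ("r", 1),
    (10, "+"): ("r", 3), (10, "*"): ("r", 3), (10, ")"): ("r", 3), (10, "$"): ("r", 3),
    (11, "+"): ("r", 5), (11, "*"): ("r", 5), (11, ")"): ("r", 5), (11, "$"): ("r", 5),
}
GOTO = {(0, "E"): 1, (0, "T"): 2, (0, "F"): 3,
        (4, "E"): 8, (4, "T"): 2, (4, "F"): 3,
        (6, "T"): 9, (6, "F"): 3, (7, "F"): 10}

def parse(tokens):
    tokens = tokens + ["$"]
    stack, pos = [0], 0                # stack holds states only
    while True:
        act = ACTION.get((stack[-1], tokens[pos]))
        if act is None:
            raise SyntaxError(f"unexpected {tokens[pos]!r} in state {stack[-1]}")
        if act == "acc":
            return True
        kind, arg = act
        if kind == "s":                # shift: push state, advance input
            stack.append(arg)
            pos += 1
        else:                          # reduce: pop |RHS| states, then GOTO
            lhs, length = PRODS[arg]
            del stack[len(stack) - length:]
            stack.append(GOTO[(stack[-1], lhs)])

print(parse(["id", "*", "id", "+", "id"]))  # True
```

Running parse on the valid input reaches the accept entry; an ill-formed input such as ["id", "+"] raises SyntaxError when action[6, $] turns out to be undefined, which is exactly the undefined-entry error detection described for LR tables.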

Parsing Process Illustration

Input Processing Walkthrough

To illustrate the operation of an LR parser during input processing, consider the sample input token sequence "id * id + id" appended with the end-of-input marker $, processed using the SLR(1) ACTION and GOTO tables constructed from the augmented arithmetic expression grammar (as detailed in the section on table construction). This input represents a valid expression with multiplication taking precedence over addition. The parse follows the standard LR algorithm, which iteratively consults the tables based on the current top-of-stack state and the lookahead symbol to perform shift, reduce, or accept actions until the input is fully processed. The process initializes with the stack containing solely the start state 0 (denoted [0]) and the input pointer at the first "id", making "id" the initial lookahead. In state 0, the ACTION table entry action[0, id] = s5 specifies a shift: the parser pushes "id" and state 5 onto the stack, yielding [0 id 5], and advances the input pointer to "*". With the stack top at state 5 and lookahead "*", action[5, *] = r6 (reduce by F → id) is applied, since "*" ∈ FOLLOW(F); the parser pops two elements (id and 5), exposes state 0, and applies goto[0, F] = 3 to push "F" and state 3, yielding [0 F 3]; the lookahead remains "*". Next, in state 3 with lookahead "*", action[3, *] = r4 (reduce by T → F) applies (using FOLLOW(T) in SLR); pop two (F 3), goto[0, T] = 2, stack [0 T 2], lookahead "*". In state 2 with lookahead "*", action[2, *] = s7 triggers a shift, appending "*" and state 7 to the stack as [0 T 2 * 7], advancing to "id". With lookahead "id", action[7, id] = s5 shifts, resulting in [0 T 2 * 7 id 5], advancing to "+". In state 5 with lookahead "+", action[5, +] = r6 (reduce F → id); pop two (id 5), goto[7, F] = 10, stack [0 T 2 * 7 F 10], lookahead "+". Immediately, in state 10 with lookahead "+", action[10, +] = r3 (reduce T → T * F); pop six elements (F 10, * 7, T 2, for RHS length 3), exposing state 0, goto[0, T] = 2, stack [0 T 2], lookahead "+". This reduction recognizes "id * id" as a term T, enforcing multiplication precedence and left associativity.
With the stack top at state 2 and lookahead "+", action[2, +] = r2 (reduce E → T, since "+" ∈ FOLLOW(E)); pop two (T 2), goto[0, E] = 1, stack [0 E 1], lookahead "+". In state 1 with lookahead "+", action[1, +] = s6 shifts, appending "+" and state 6: [0 E 1 + 6], advancing to "id". Action[6, id] = s5 shifts "id" to [0 E 1 + 6 id 5], lookahead "$". In state 5 with lookahead "$", action[5, $] = r6 reduces F → id; pop two, goto[6, F] = 3, stack [0 E 1 + 6 F 3], lookahead "$". Then in state 3, action[3, $] = r4 reduces T → F; pop two, goto[6, T] = 9, stack [0 E 1 + 6 T 9], lookahead "$". In state 9 with lookahead "$", action[9, $] = r1 (reduce E → E + T); pop six (T 9, + 6, E 1), exposing state 0, goto[0, E] = 1, stack [0 E 1], lookahead "$". Finally, action[1, $] = accept signals successful parsing. This sequence demonstrates how table-driven decisions, guided by FOLLOW sets in SLR(1), handle precedence (reducing multiplication before addition) and associativity without backtracking.

Stack Evolution and Reductions

In an LR parser, the stack serves as the core data structure for tracking the partial derivation during bottom-up parsing, alternating between parser states (which encode possible future actions) and grammar symbols (terminals or nonterminals). Shifts push a lookahead terminal and its corresponding state onto the stack, advancing the parse along potential productions, while reductions identify a handle (a rightmost substring matching a production's right-hand side), pop the associated symbols and states (twice the length of the right-hand side), and push the left-hand side nonterminal with the goto state computed from the exposed state below the handle. This process ensures unambiguous recognition of viable prefixes, with the stack depth reflecting the nesting of the emerging parse structure. The reductions drive the bottom-up assembly of the parse tree: each reduction applies a production, effectively creating a parent node for the popped subtree symbols and linking it into the structure above the handle, progressively building from leaves (terminals) to the root. For expressions, this manifests as recognizing factors, terms, and expressions in a precedence-driven manner, where associativity is handled by sequential reductions after shifts. Consider an SLR(1) parser for the augmented grammar E' → E, E → E + T | T, T → T * F | F, F → (E) | id, processing the input string "id * id + id $". The stack begins with state 0 and evolves as follows, with snapshots after each major step emphasizing the configuration just before and after key reductions (state numbers follow the table construction above).
| Configuration | Stack Snapshot | Remaining Input | Operation/Reduction Trace |
|---------------|----------------|-----------------|---------------------------|
| Initial | [0] | id * id + id $ | - |
| After shift/reduce | [0 T 2] | * id + id $ | Shift "id", reduce F → id (pop 2), reduce T → F (pop 2); first factor recognized. |
| After shift | [0 T 2 * 7] | id + id $ | Shift "*"; continue (higher precedence defers the E reduction). |
| After shift/reduce | [0 T 2 * 7 F 10] | + id $ | Shift "id", reduce F → id (pop 2); second factor. |
| After reduce | [0 T 2] | + id $ | Reduce T → T * F (pop 6); term assembled as left-associative product. |
| After reduce/shift | [0 E 1 + 6] | id $ | Reduce E → T (pop 2, on "+"); shift "+"; now at expression level. |
| After shift/reduce | [0 E 1 + 6 T 9] | $ | Shift "id", reduce F → id (pop 2), reduce T → F (pop 2); second term. |
| After reduce | [0 E 1] | $ | Reduce E → E + T (pop 6); addition applied once its right operand is complete. Accept in state 1 on $. |
This sequence demonstrates multiple consecutive reductions enforcing operator precedence and associativity: after the second "id", reductions assemble the factor and term (F → id, then T → T * F) before reducing to E on "+", forming a parse tree with * nested under T within the overall E. The final accept confirms the complete tree rooted at E'. The use of lookahead in decisions (e.g., reduce E → T only on "+", ")", or "$"; shift "*" from the T state) enforces precedence rules without backtracking. In cases of invalid input, such as "id +" followed by the end marker, the stack evolves to [0 E 1] after reducing the initial "id" through F → id, T → F, E → T, then shifts "+" to [0 E 1 + 6], but encounters "$" with no defined action[6, $] (no shift or reduce, since a term was expected), signaling a syntax error and invoking recovery (e.g., popping states or inserting tokens).

Advanced Topics

Lookahead Sets and Precision

In LR(1) parsing, lookahead sets play a crucial role by associating each production item with a specific terminal symbol that predicts the next input, enabling the parser to make unambiguous decisions during shift-reduce operations. An LR(1) item is represented as [A → α•β, a], where A → αβ is a production, the dot (•) indicates the current position in the right-hand side, α has been parsed, β remains to be parsed, and a is the lookahead terminal (or the end marker $) that must follow the completion of the production to trigger a reduction. This lookahead context resolves parsing ambiguities that arise in LR(0) parsers, such as shift-reduce or reduce-reduce conflicts, by restricting reductions to cases where the next input symbol matches the predicted lookahead, ensuring deterministic behavior for a broader class of grammars. FOLLOW sets provide the foundation for simpler lookahead computations, defined as the set of terminals that can appear immediately after a nonterminal A in any valid derivation from the start symbol. For the start symbol S, FOLLOW(S) always includes the end marker $; for other nonterminals, FOLLOW(A) is computed using a fixed-point iteration that propagates terminals through the grammar's productions: for each production B → γAδ, add FIRST(δ) (excluding ε if δ is nullable) to FOLLOW(A), and if δ is nullable or empty, add FOLLOW(B) to FOLLOW(A). This iterative process continues until no new terminals are added, ensuring FOLLOW sets capture all possible succeeding terminals. Complementing FOLLOW sets, FIRST sets determine the initial terminals derivable from a string of symbols, essential for computing lookaheads during item closure in LR(1) construction. For a string α, FIRST(α) includes all terminals that can begin strings derived from α; if α is nullable, ε is included, allowing propagation to subsequent symbols.
Computation proceeds recursively: for a terminal t, FIRST(t) = {t}; for a nonterminal X with productions X → Y₁Y₂⋯Yₖ, initialize FIRST(X) with FIRST(Y₁) (excluding ε if Y₁ is nullable), then iteratively add FIRST(Yᵢ) for subsequent symbols while the preceding prefix remains nullable; ε-productions are handled by marking nonterminals as nullable and including ε in their FIRST sets. In the closure operation for an item [A → α•Bγ, a], new items [B → •δ, b] are added where b ∈ FIRST(γa), combining FIRST computation with lookahead propagation to refine precision. Full LR(1) lookaheads achieve high precision by tailoring lookaheads to specific contexts, allowing conflict-free recognition of all deterministic context-free languages, though at the cost of potentially large growth in the number of parser states due to distinct lookahead combinations. In contrast, approximations like LALR(1) merge canonical LR(1) states that share the same LR(0) core and compute lookaheads by propagation, reducing the state count to that of the LR(0) automaton but risking introduced conflicts for some grammars. The algorithms for FIRST and FOLLOW computation are iterative fixed-point computations over the grammar's nonterminals and productions, starting from initial seeds (e.g., $ in FOLLOW(S)) and updating sets until convergence, with O(n³) worst-case behavior for n grammar symbols but efficient performance in practice for typical grammars.
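For a grammar without ε-productions, such as the running expression grammar, the fixed-point computation is short. The following Python sketch is a simplification under that assumption (nullable handling is omitted) and iterates until no set grows:

```python
# Fixed-point computation of FIRST and FOLLOW sets for the expression
# grammar. Simplified sketch: assumes no epsilon-productions.
GRAMMAR = {
    "E": [["E", "+", "T"], ["T"]],
    "T": [["T", "*", "F"], ["F"]],
    "F": [["(", "E", ")"], ["id"]],
}

def first_follow(grammar, start):
    nonterms = set(grammar)
    # FIRST of a terminal is the terminal itself.
    first = {a: {a} for prods in grammar.values() for rhs in prods
             for a in rhs if a not in nonterms}
    first.update({nt: set() for nt in nonterms})
    follow = {nt: set() for nt in nonterms}
    follow[start].add("$")  # the end marker follows the start symbol
    changed = True
    while changed:
        changed = False
        for lhs, prods in grammar.items():
            for rhs in prods:
                # FIRST(lhs) gains FIRST of the leading RHS symbol.
                if not first[rhs[0]] <= first[lhs]:
                    first[lhs] |= first[rhs[0]]
                    changed = True
                # FOLLOW(A) gains FIRST of the next symbol, or FOLLOW(lhs)
                # when A ends the production.
                for i, sym in enumerate(rhs):
                    if sym in nonterms:
                        gain = first[rhs[i + 1]] if i + 1 < len(rhs) else follow[lhs]
                        if not gain <= follow[sym]:
                            follow[sym] |= gain
                            changed = True
    return first, follow

first, follow = first_follow(GRAMMAR, "E")
print(sorted(follow["E"]))  # ['$', ')', '+']
```

The computed sets match the FOLLOW sets used earlier for the SLR(1) table: FOLLOW(E) = {$, ), +} and FOLLOW(T) = FOLLOW(F) = {$, ), +, *}.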

Error Detection and Recovery

In LR parsers, syntax errors are detected during the parsing loop when the action table entry corresponding to the current state and lookahead symbol is undefined or marked as an error, indicating that no valid shift, reduce, or accept action is possible. This detection mechanism relies on the deterministic nature of the LR automaton, where the absence of an entry signals a deviation from the viable prefixes of the grammar. Common recovery techniques aim to resume parsing after an error with minimal disruption to the overall process. In panic-mode recovery, the parser discards input tokens until it encounters a synchronizing token (typically a member of the follow set of some nonterminal), then pops the stack until it reaches a state where this token is valid, thus restoring a viable prefix. Another approach involves popping stack symbols until a viable prefix is achieved, effectively backtracking to a point where parsing can continue without further insertions or deletions. More sophisticated methods include local repairs such as inserting or deleting tokens near the error site, or augmenting the grammar with error productions (e.g., E \to \text{error}) to explicitly handle anticipated error patterns like missing semicolons. These error productions integrate seamlessly into the LR table construction, enabling the parser to reduce erroneous input to a special error nonterminal and continue. To enhance recovery accuracy, techniques often examine several upcoming tokens to evaluate candidate repairs, choosing the shift, reduction, insertion, or deletion that makes the remaining input most plausible. For instance, with an arithmetic expression grammar, an input like "id + +" triggers an error on the second "+", where the parser might skip the extraneous "+" and resume parsing at the following token, avoiding cascading failures.
However, LR recovery strategies assume errors are local and purely syntactic, limiting their usefulness for semantic errors or deeply nested ambiguities, where context beyond the parsing table may be required.
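A single panic-mode step can be sketched as a small helper. All names and the tiny ACTION table below are hypothetical stand-ins for a generated parser's tables; the helper skips input tokens until a synchronizing token, then pops states until that token is actionable again:

```python
# Panic-mode recovery sketch. The ACTION table and states here are
# hypothetical; a real parser would consult its generated tables.
def panic_recover(stack, tokens, pos, action, sync=("$", ";", ")")):
    # 1. Discard input up to the next synchronizing token
    #    (the input always ends with "$", which is in sync).
    while pos < len(tokens) and tokens[pos] not in sync:
        pos += 1
    # 2. Pop states until the synchronizing token has a defined action.
    while len(stack) > 1 and (stack[-1], tokens[pos]) not in action:
        stack.pop()
    return pos

# Hypothetical error configuration: the top state 6 has no action on
# the junk tokens, but state 1 underneath can shift ";".
action = {(0, "id"): ("s", 5), (1, ";"): ("s", 2)}
stack = [0, 1, 6]
tokens = ["+", "+", ";", "$"]
pos = panic_recover(stack, tokens, 0, action)
print(pos, stack)  # 2 [0, 1]
```

After the call, parsing resumes at the ";" with state 1 on top, so the synchronizing token can be shifted and the rest of the input processed, at the cost of discarding the two junk "+" tokens.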

References

  1. [1]
    [PDF] LR Parsing
    LR parsing is a technique for parsing deterministic context-free languages in compiling applications, used to verify input syntax.
  2. [2]
    LL vs. LR Parsing | Baeldung on Computer Science
    Feb 28, 2025 · LL parsers use a top-down approach, while LR parsers use a bottom-up approach. LL parsers cannot handle left-recursive rules, but LR parsers ...
  3. [3]
    Bison - GNU Project - Free Software Foundation
    Bison is a general-purpose parser generator that converts an annotated context-free grammar into a deterministic LR or generalized LR (GLR) parser.
  4. [4]
    On the translation of languages from left to right - ScienceDirect.com
    In this paper, we define LR(k) grammars, which are perhaps the most general ones of this type, and they provide the basis for understanding all of the special ...
  5. [5]
    [PDF] LR Parsing Bottom-Up Parsing
    • LR parsers don't need left-factored grammars and can also handle left-recursive grammars. • Consider the following grammar: E → E + ( E ) | int. – Why is ...
  6. [6]
    Efficient Computation of LALR(1) Look-Ahead Sets
    A correction to DeRemer's SLR(1) parser constructor algorithm. Unpublished manuscript. Lawrence Livermore Labs., Livermore, Calif., 1977. Google Scholar.
  7. [7]
    [PDF] Yacc: Yet Another Compiler-Compiler - ePaperPress
    Yacc: Yet Another Compiler-Compiler. Stephen C. Johnson. ABSTRACT. Computer program input generally has some structure; in fact, every computer pro- gram that ...
  8. [8]
    [PDF] Shift-Reduce Parsing - UAF CS
    Feb 17, 2025 · large parsing tables. Thus, when Shift-Reduce parsing was first introduced [D. Knuth 1965], it was considered to be impractical. As often ...
  9. [9]
    [PDF] Some definitions
    Bottom-up parsing. Goal: Given an input string w and a grammar G, construct a ... If G is unambiguous then every right-sentential form has a unique handle.
  10. [10]
  11. [11]
    [PDF] CSE 401 – Compilers - Washington
    LR Parsing Algorithm Pseudocode word = scanner.getToken(); while (true) { s = state on top of stack; if (acTon[s, word] = si ) { push word; push i; // i is ...
  12. [12]
    [PDF] Chapter 8 Introduction to LR-Parsing
    LR-parsing, invented by D. Knuth, aims to determine if a string belongs to a grammar's language and construct a rightmost derivation. It uses a deterministic ...
  13. [13]
    [PDF] Compilers: Principles, Techniques, and Tools
    In the time since the 1986 edition of this book, the world of compiler design has changed significantly. Programming languages have evolved to present new.
  14. [14]
    [PDF] The purpose of LR-parsing, invented by D. Knuth in the mid sixties ...
    Knuth's major discovery was that for a certain type of grammars, the LR(k)-grammars, a certain kind of DPDA could be constructed from the grammar (shift/reduce.
  15. [15]
    Simple LR(k) Grammars
    In this section we briefly review the results of Knuth [11] regarding LR(0) grammars. We use a combination of the terminologies of Earley [6] and McKeeman [12] ...
  16. [16]
    [PDF] PRACTICAL TRANSLATORS FOR LR(k) LANGUAGES
    PRACTICAL TRANSLATORS FOR LR(k) LANGUAGES. Franklin Lewis DeRemer. 24 October 1969. PROJECT MAC. MASSACHUSETTS INSTITUTE OF TECHNOLOGY. Massachusetts 02139 ...
  17. [17]
    Yacc Yet Another Compiler Compiler by Stephen C. Johnson
    Yacc provides a general tool for describing the input to a computer program. The Yacc user specifies the structures of his input, together with code to be ...
  18. [18]
    Shift-reduce conflicts in LR parsers
    In order of power and complexity, the LR parsers commonly in use are the non-lookahead LR(0) and the lookahead SLR(1), LALR(1) and LR(1). For a given grammar, ...
  19. [19]
    [PDF] LR parsing
    Sep 14, 2021 · Sufficient for most practical parsing problems. LR(1). Slightly more powerful than LALR(1). Not used in practice – the tables become very ...
  20. [20]
    [PDF] Bottom-Up Parsing
    Bottom-Up Parsing. Thanks to Charles E. Hughes. Page 2. Reductions. • Top-down focuses ... the handle is the substring that was replaced at the last step in a.
  21. [21]
    [PDF] Dragon-book.pdf - School of Information Science and Technology
    Chapter 4 covers the major parsing methods, top-down (recursive-descent, LL) and bottom-up (LR and its variants). Chapter 5 introduces the principal ideas ...
  22. [22]
    LR Parsing | ACM Computing Surveys
  23. [23]
    Efficient Computation of LALR(1) Look-Ahead Sets
    4, October 1982. Page 24. 638. F. DeRemer and T. Pennello. LR items are necessary, however, to produce the diagnostic debugging traces described in the next ...
  24. [24]
    [PDF] Building SLR Parse Tables - GMU CS Department
    Initial elements (I) are often referred to as the kernel elements of closure(I). Example: we can use the expression grammar: E → E + T | T. T → T * F | F.
  25. [25]
    [PDF] CSE504 Compiler Design Syntax Analysis (SLR Parser)
    LR Parsing Example. Parse id * id + id. Page 12. LR Parsing Example. Page 13. Constructing SLR Parsing Table. • States of an SLR parser represent sets of items.
  26. [26]
    Summary of FIRST and FOLLOW sets computation and their use in LR(1) lookaheads
  27. [27]
    Practical LR error recovery | Proceedings of the 1979 SIGPLAN ...
    We present a practical, language independent mechanism for error recovery in LR parsers. The method is easy to implement in existing parser generators.
  28. [28]
    Practical error recovery in LR parsing - ACM Digital Library
    The technique includes a "phrase-level" error recovery strategy augmented with certain additional features such as "local correction".
  29. [29]
    Automatic error recovery for LR parsers | Communications of the ACM
    In this paper we present a scheme for detecting and recovering from syntax errors in programs. The scheme, which is based on LR parsing, ...
  30. [30]
    A practical method for LR and LL syntactic error diagnosis and ...
    This paper presents a powerful, practical, and essentially language-independent syntactic error diagnosis and recovery method that is applicable within the ...