Control-flow graph
In computer science, a control-flow graph (CFG) is a directed graph that models all possible paths of execution through a program's code, with nodes representing basic blocks and edges indicating transfers of control between them.[1] A basic block is defined as a maximal sequence of consecutive statements in the program where control enters only at the beginning and leaves only at the end, ensuring no internal branches or jumps.[2] CFGs typically include a unique entry node for the program's starting point and may include an exit node, forming a structure where all valid execution paths connect these points.[3]
The construction of a CFG begins with identifying "leaders" in the program's intermediate representation—statements such as the first instruction, targets of branch statements, or those immediately following branches—which mark the starts of basic blocks.[2] Basic blocks are then formed by grouping consecutive statements between leaders, and directed edges are added to reflect sequential flow or conditional/unconditional branches, often labeled for true/false outcomes in conditional statements.[1] This process yields a graph that captures the program's control structure without embedding data dependencies, distinguishing it from data-flow or program dependence graphs.[3]
Control-flow graphs are essential tools in compiler construction and program analysis, enabling optimizations like dead code elimination, loop detection via back-edges and dominators (nodes through which all paths from the entry must pass), and data-flow analyses such as reaching definitions. They also support software testing by identifying structural coverage criteria, anomaly detection in execution logs, and reverse engineering of compiled binaries.[2] Properties like reducibility—where the graph becomes acyclic after removing back-edges—further aid in advanced analyses and transformations.[3]
Fundamentals
Definition
A control-flow graph (CFG) is a directed graph that represents the possible sequences of execution in a program, where each node corresponds to a basic block—a maximal sequence of consecutive statements with no branches or jumps into or out of its interior—and each directed edge indicates a possible transfer of control from one basic block to another.[1] This structure captures the flow of execution through sequential instructions, conditional branches, and jumps, providing a foundation for analyzing program behavior without considering data dependencies.[4]
Every CFG designates a unique entry node as the initial point of execution and, in many cases, an exit node as the termination point, allowing the graph to model the program's overall structure from start to finish.[4] Basic blocks serve as the atomic units in this representation, ensuring that the graph abstracts away low-level details while preserving the essential paths of control.[1]
In distinction from an abstract syntax tree (AST), which encodes the hierarchical syntactic structure derived from parsing source code, a CFG focuses on the dynamic paths of execution after syntactic analysis, emphasizing how control proceeds through sequential and conditional statements.[5] Similarly, while data-flow graphs model the movement and transformation of data between operations, CFGs are dedicated exclusively to the sequencing and branching of control, independent of variable values or dependencies.[6]
The concept of the control-flow graph originated in compiler theory during the late 1960s, with foundational contributions from researchers like Frances E. Allen, whose 1970 work formalized its use for program analysis and optimization.[7] Allen's approach built on earlier ideas, such as boolean connectivity matrices for flow diagrams introduced by Reese T. Prosser in the 1950s, establishing the CFG as a key tool in understanding and transforming program execution.[7]
Construction
The construction of a control-flow graph (CFG) involves systematically partitioning a program's intermediate representation or source code into basic blocks and then connecting them with directed edges that represent possible control transfers. This process typically begins with scanning the code to identify points where control flow may diverge or converge, ensuring the resulting graph accurately models execution paths without introducing extraneous nodes or edges.[8][9]
The first step is to identify basic block boundaries by locating "leaders," which are instructions that serve as entry points to blocks. Leaders include the program's first instruction, any target of a branch (conditional or unconditional, such as jumps or gotos), and the instruction immediately following a branch or return statement. This identification can be formalized as an algorithm that iterates through the code: initialize the set of leaders with the first instruction; then, for each branch instruction, add its targets to the set of leaders and also mark the instruction immediately following the branch as a leader. Once leaders are found, basic blocks are formed by grouping consecutive instructions starting from each leader up to but not including the next leader or the end of the program. Control statements like if, while, return, and goto are key indicators during this scan, as they end the current block and potentially start new ones at labels or successors.[8][9][1][10]
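The leader-based partitioning described above can be sketched in Python over a toy three-address code. The instruction format here is illustrative, not a fixed standard: each instruction is a tuple whose first element is an opcode, and "jump"/"branch" opcodes carry integer instruction indices as targets.

```python
# Leader identification on a toy three-address code (illustrative format).

def find_leaders(code):
    """Return sorted leader indices: the first instruction, every branch
    target, and every instruction immediately following a branch."""
    leaders = {0}
    for i, instr in enumerate(code):
        if instr[0] in ("jump", "branch"):
            leaders.update(t for t in instr[1:] if isinstance(t, int))
            if i + 1 < len(code):        # fall-through starts a new block
                leaders.add(i + 1)
    return sorted(leaders)

def form_blocks(code):
    """Group consecutive instructions from each leader up to the next."""
    bounds = find_leaders(code) + [len(code)]
    return [code[bounds[k]:bounds[k + 1]] for k in range(len(bounds) - 1)]

# Toy program: 0: x = 0   1: branch to 3   2: y = 1   3: z = 2
code = [
    ("assign", "x", 0),
    ("branch", 3),       # conditional branch whose target is instruction 3
    ("assign", "y", 1),
    ("assign", "z", 2),
]
print(find_leaders(code))  # [0, 2, 3]
```

Instruction 2 becomes a leader because it follows a branch, and instruction 3 because it is a branch target, yielding three basic blocks.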
After forming the basic blocks, which serve as nodes in the CFG, edges are inserted to capture control flow. Sequential edges connect a block to its immediate successor if the block does not end in a branch. For conditional branches, edges are added to all possible target blocks; unconditional jumps or gotos create a single edge to the target. Exception handling introduces additional edges from statements that may raise exceptions (e.g., I/O operations) to corresponding handler blocks, modeling non-local control transfers explicitly. Unstructured constructs like gotos are handled by treating them as unconditional branches, which may result in irregular edges but are incorporated directly without restructuring the code during initial construction.[8][9][11][10]
Optimizations can be applied during or immediately after construction to simplify the graph. For instance, edge contraction merges adjacent blocks connected by a single sequential edge without branches, effectively removing unnecessary intermediate nodes and reducing graph complexity while preserving semantics. This is particularly useful for eliminating trivial blocks created by labels or empty statements in unstructured code. The overall algorithm ensures the CFG is minimal yet complete, facilitating subsequent analyses.[8][9]
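The edge-contraction simplification can be sketched as follows, assuming the CFG is stored as a mapping from block names to successor lists (the representation and names are illustrative): a block with exactly one successor is merged with that successor whenever the successor has exactly one predecessor.

```python
def merge_trivial_blocks(succs):
    """Contract each edge A -> B where A has exactly one successor and B has
    exactly one predecessor, merging B into A. `succs` maps a block name to
    its list of successor names; a simplified copy is returned."""
    preds = {n: [] for n in succs}
    for a, ss in succs.items():
        for b in ss:
            preds[b].append(a)
    succs = {n: list(ss) for n, ss in succs.items()}  # work on a copy
    changed = True
    while changed:
        changed = False
        for a in list(succs):
            ss = succs[a]
            if len(ss) == 1 and ss[0] != a and len(preds[ss[0]]) == 1:
                b = ss[0]
                succs[a] = succs[b]          # A inherits B's successors
                for c in succs[b]:           # B's successors now see A
                    preds[c] = [a if p == b else p for p in preds[c]]
                del succs[b]
                changed = True
                break
    return succs

# A straight-line chain collapses to a single block; a diamond does not.
print(merge_trivial_blocks({"A": ["B"], "B": ["C"], "C": []}))  # {'A': []}
```

A diamond-shaped graph is left untouched, since its join block has two predecessors and its branch block has two successors.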
Examples
A simple example of a control-flow graph (CFG) arises from an if-then-else statement in pseudocode, which demonstrates conditional branching and a merge point. Consider the following pseudocode:
if (x > 0) {
y = x * 2;
} else {
y = x / 2;
}
z = y + 1;
The CFG consists of four basic blocks: an entry block containing the condition if (x > 0), a then-block with y = x * 2;, an else-block with y = x / 2;, and an exit block with z = y + 1;. Edges connect the entry block to the then-block (true branch) and else-block (false branch), while both branches flow to the exit block, forming a diamond-shaped structure with the entry and exit as the top and bottom nodes, respectively.[12]
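The diamond structure can be written down directly as an adjacency list, and the execution paths enumerated from it (the block names are illustrative):

```python
# The diamond-shaped CFG of the if-then-else example, as an adjacency list.
cfg = {
    "entry": ["then", "else"],  # if (x > 0): true branch, false branch
    "then":  ["exit"],          # y = x * 2;
    "else":  ["exit"],          # y = x / 2;
    "exit":  [],                # z = y + 1;
}

def paths(cfg, src, dst, seen=()):
    """Enumerate all acyclic paths from src to dst."""
    if src == dst:
        yield [src]
        return
    for s in cfg[src]:
        if s not in seen:
            for rest in paths(cfg, s, dst, seen + (src,)):
                yield [src] + rest

# Exactly two execution paths, one through each branch of the diamond:
print(list(paths(cfg, "entry", "exit")))
```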
Another common CFG example is a while loop, illustrating iterative control flow with a back edge. For the pseudocode:
x = 0;
while (x < y) {
y = f(x, y);
x = x + 1;
}
The graph has four nodes: an initialization block x = 0;, a header block testing x < y, a body block with y = f(x, y); x = x + 1;, and an exit block (often a dummy node for termination). Edges include forward flow from initialization to header, a true branch from header to body, a false branch from header to exit, and a back edge from body to header to enable looping. This creates a cycle between the header and body, with the entry at the initialization and exit after the header.[13]
In visual representations, CFG nodes are typically labeled with the sequence of instructions in each basic block, and directed arrows denote edges representing possible control transfers, such as conditional jumps or unconditional fall-throughs. The entry node is the program's starting point, often marked as such, while the exit node(s) indicate termination points where control leaves the graph.[14]
Variations in CFGs can include edges for function calls, where a call edge leads to a subroutine entry and a return edge brings control back, or exception-handling edges that divert flow to error-recovery blocks upon runtime faults like division by zero. These edge types extend the basic structure to model interprocedural or fault-tolerant control flow without altering the core node-based representation.[15]
Basic Properties
Reachability
In a control-flow graph (CFG), reachability defines whether one node can be accessed from another during potential program execution. Specifically, a node B is reachable from a node A if there exists a directed path—a sequence of one or more edges—from A to B. This concept captures the possible flow of control through the program's basic blocks, starting typically from the designated entry node, and is essential for analyzing executable portions of code.[16]
Computing the reachable set from the entry node involves standard graph traversal techniques. Depth-first search (DFS) explores as far as possible along each branch before backtracking, while breadth-first search (BFS) processes nodes level by level from the source. Both algorithms mark visited nodes to avoid cycles and efficiently identify all accessible nodes, with time complexity O(V + E), where V is the number of nodes and E the number of edges.[17] The reachability relation R, where R(A, B) holds if a path exists from A to B, can also be determined via the transitive closure of the CFG's adjacency matrix; this matrix-based approach, using algorithms like Warshall's, yields a boolean matrix indicating all pairwise reachabilities in O(V^3) time.[18]
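Both approaches can be sketched briefly, assuming the CFG is a mapping from node names to successor lists (an illustrative representation):

```python
def reachable(succs, entry):
    """All nodes reachable from the entry via iterative DFS; O(V + E)."""
    seen, stack = set(), [entry]
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(succs.get(n, []))
    return seen

def transitive_closure(nodes, succs):
    """Warshall's algorithm on a boolean adjacency matrix; O(V^3).
    R[i][j] is True iff a path of one or more edges leads from i to j."""
    idx = {n: i for i, n in enumerate(nodes)}
    R = [[False] * len(nodes) for _ in nodes]
    for a, ss in succs.items():
        for b in ss:
            R[idx[a]][idx[b]] = True
    for k in range(len(nodes)):
        for i in range(len(nodes)):
            for j in range(len(nodes)):
                R[i][j] = R[i][j] or (R[i][k] and R[k][j])
    return R

# "dead" never appears in the reachable set, marking it as dead code:
print(reachable({"entry": ["a"], "a": [], "dead": ["a"]}, "entry"))
```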
Reachability plays a key role in compiler optimizations and program analysis. For instance, nodes unreachable from the entry point represent dead code, which compilers can eliminate to reduce program size and improve performance without altering semantics.[19] Furthermore, by examining the reachable subgraph, analysts can detect potential infinite loops, such as cycles lacking any outgoing path to an exit node, enabling warnings or transformations to prevent non-termination.[20] These applications underscore reachability's foundational utility in ensuring code reliability and efficiency.
Strongly Connected Components
A strongly connected component (SCC) in a directed graph, such as a control-flow graph (CFG), is a maximal subgraph in which there exists a directed path from every node to every other node within the subgraph.[21] This property of mutual reachability ensures that the component captures tightly coupled regions of control flow, extending the basic notion of reachability by requiring bidirectional accessibility between all pairs of nodes.
Efficient algorithms exist to compute the SCCs of a CFG in linear time relative to its size. Tarjan's algorithm, based on a single depth-first search traversal with a stack to track discovery and low-link values, identifies all SCCs in O(V + E) time, where V is the number of vertices (basic blocks) and E is the number of edges. Kosaraju's algorithm similarly achieves O(V + E) time complexity through two passes of depth-first search: one on the original graph to obtain a finishing-time ordering, and another on the transpose graph to extract components in reverse post-order.[22]
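Kosaraju's two-pass method can be sketched compactly over an adjacency-list CFG (an illustrative representation, using recursion rather than an explicit stack):

```python
def kosaraju_sccs(succs):
    """Kosaraju's algorithm: DFS finishing order on G, then DFS on the
    transpose graph in reverse finishing order; O(V + E)."""
    order, seen = [], set()
    def dfs1(n):
        seen.add(n)
        for s in succs.get(n, []):
            if s not in seen:
                dfs1(s)
        order.append(n)
    for n in succs:
        if n not in seen:
            dfs1(n)

    preds = {n: [] for n in succs}        # transpose graph
    for a, ss in succs.items():
        for b in ss:
            preds[b].append(a)

    sccs, assigned = [], set()
    def dfs2(n, comp):
        assigned.add(n)
        comp.append(n)
        for p in preds[n]:
            if p not in assigned:
                dfs2(p, comp)
    for n in reversed(order):
        if n not in assigned:
            comp = []
            dfs2(n, comp)
            sccs.append(sorted(comp))
    return sccs

# While-loop CFG: E -> H, H -> B, B -> H (back edge), H -> X.
cfg = {"E": ["H"], "H": ["B", "X"], "B": ["H"], "X": []}
print(kosaraju_sccs(cfg))  # H and B form {B, H}; E and X are singletons
```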
In CFGs, SCCs hold particular significance for program analysis, as they delineate natural loops—where the component includes a header node with back edges from the body—and highlight irreducible regions that resist simple loop structuring.[21] By contracting each SCC into a single supernode, the resulting condensation forms a directed acyclic graph (DAG), which simplifies subsequent optimizations and analyses by enabling topological processing without intra-component cycles.
For illustration, consider a CFG for a basic while loop: vertices include an entry node E leading to loop header H, H to body B, B back to H via a back edge, and H to an exit node X. The SCC computation yields {H, B} as a single component due to the mutual paths (H → B → H and B → H → B), while E and X form trivial singleton SCCs, with the condensation DAG showing E → {H,B} → X.[21]
Analysis Techniques
Dominance Relationships
In a control-flow graph (CFG) with a designated entry node, a node d dominates a node n, written d dom n, if every path from the entry node to n includes d. The entry node dominates every node in the graph, and every node dominates itself (reflexivity). Strict dominance, written d sdom n, excludes the case where d = n, so d strictly dominates n only if d ≠ n and d lies on every path from the entry to n.[23]
Post-dominance is the dual concept, defined with respect to paths to an exit node: a node p post-dominates n, written p pdom n, if every path from n to the exit node passes through p. This relation can be computed as forward dominance in the reverse CFG, where edge directions are inverted and the entry and exit nodes are swapped. Strict post-dominance follows analogously by excluding reflexivity.
The dominance frontier of a node d is the set of all nodes n such that d dominates some immediate predecessor of n, but d does not strictly dominate n. Formally,
DF(d) = { n | there exists p ∈ preds(n) such that d dom p and not (d sdom n) },
where preds(n) denotes the set of predecessors of n. Dominance frontiers capture the boundaries where dominance breaks, which is crucial for identifying merge points in control flow, such as where phi functions are inserted in static single assignment (SSA) form.
Dominators can be computed using iterative data-flow analysis, which solves the fixed-point equations Dom(n) = {n} ∪ ⋂ Dom(p), intersecting over all predecessors p of n, starting from the entry node and propagating sets until convergence. This approach has time complexity O(n^2) in the worst case for dense graphs but performs well in practice for typical CFGs. For efficiency, the Lengauer-Tarjan algorithm computes the dominator relation in near-linear time, specifically O(m α(m, n)) where α is the slowly growing inverse Ackermann function, using depth-first search and union-find structures to process semi-dominators and link-eval operations.[23]
The dominator tree organizes the dominance relation hierarchically: the root is the entry node, and for each node n other than the entry, its parent is the strict dominator of n that is closest to n (i.e., the immediate dominator idom(n)). Edges in the tree represent immediate dominance, forming a tree structure that reflects nesting of control regions. This tree facilitates optimizations by enabling traversal for tasks like constant propagation and partial redundancy elimination, as subtrees correspond to dominator-based scopes.
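The iterative fixed-point computation, together with immediate dominators and per-edge dominance frontiers, can be sketched as follows, assuming an adjacency-list CFG (the helper names are illustrative):

```python
def dominators(succs, entry):
    """Iterative fixed point of Dom(n) = {n} ∪ ⋂ Dom(p) over preds p."""
    nodes = set(succs)
    preds = {n: [] for n in nodes}
    for a, ss in succs.items():
        for b in ss:
            preds[b].append(a)
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            ps = preds[n]
            new = {n} | (set.intersection(*(dom[p] for p in ps)) if ps else set())
            if new != dom[n]:
                dom[n], changed = new, True
    return dom

def immediate_dominators(dom, entry):
    """idom(n) is the deepest strict dominator: among the strict dominators
    of n, the one with the largest dominator set of its own."""
    return {n: max(ds - {n}, key=lambda d: len(dom[d]))
            for n, ds in dom.items() if n != entry}

def dominance_frontiers(succs, idom):
    """Per-edge frontier computation: from each predecessor of a join
    point, walk up the dominator tree until idom(n) is reached."""
    preds = {n: [] for n in succs}
    for a, ss in succs.items():
        for b in ss:
            preds[b].append(a)
    df = {n: set() for n in succs}
    for n in succs:
        if len(preds[n]) >= 2:            # only join points matter
            for p in preds[n]:
                runner = p
                while runner != idom[n]:
                    df[runner].add(n)
                    runner = idom[runner]
    return df

# Diamond CFG: the frontier of each branch block is the join block "exit",
# which is where SSA construction would place a phi function.
g = {"entry": ["then", "else"], "then": ["exit"], "else": ["exit"], "exit": []}
dom = dominators(g, "entry")
idom = immediate_dominators(dom, "entry")
print(dominance_frontiers(g, idom))
```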
Special Edges
In control-flow graphs (CFGs), edges are classified based on their role in representing program execution paths, which facilitates various analyses and optimizations. This classification relies on a depth-first search (DFS) traversal, producing a DFS spanning tree. Tree edges form the spanning tree itself. Forward edges connect an ancestor to a proper descendant in the DFS tree (without being a direct tree edge), progressing down the tree without forming cycles. Back edges connect a descendant to an ancestor in the DFS tree, indicating loops; in a reducible CFG, such edges have the property that the head dominates the tail. Cross edges connect nodes where neither is an ancestor or descendant of the other in the DFS tree, typically linking different branches. Abnormal edges represent exceptional control flows, such as those triggered by exceptions or signals, where the destination may be unpredictable or lead to handlers outside the normal flow.[24][25][26][27]
Critical edges form another important category, defined as edges originating from a basic block with multiple successors and terminating at a basic block with multiple predecessors; these are unmergeable without modification and pose challenges for optimizations requiring precise insertion points. To address this, compilers often split critical edges by inserting new empty basic blocks, ensuring that subsequent passes can safely place code without duplicating instructions across branches. Impossible edges, also known as fake edges, are artificially added to the CFG to maintain structural properties, such as ensuring the exit node has exactly one incoming edge or modeling unreachable paths in infinite loops; they are never executed in practice but aid in formal analysis.[28][29]
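Critical-edge splitting can be sketched as a pass that inserts a fresh empty block on each offending edge; the adjacency-list representation and the `splitN` naming scheme here are illustrative:

```python
def split_critical_edges(succs):
    """Insert a fresh empty block on every edge A -> B where A has several
    successors and B has several predecessors."""
    preds = {n: [] for n in succs}
    for a, ss in succs.items():
        for b in ss:
            preds[b].append(a)
    new_succs = {n: list(ss) for n, ss in succs.items()}
    counter = 0
    for a, ss in succs.items():
        for b in ss:
            if len(ss) > 1 and len(preds[b]) > 1:      # critical edge
                name = f"split{counter}"
                counter += 1
                new_succs[a] = [name if x == b else x for x in new_succs[a]]
                new_succs[name] = [b]
    return new_succs

# A -> C is critical (A branches, C is a join point); it gains a new block:
print(split_critical_edges({"A": ["B", "C"], "B": ["C"], "C": []}))
```

After splitting, code hoisted "onto" the former A → C edge can be placed in the new block without being executed on the A → B path.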
These special edges significantly influence compiler optimization passes. For instance, back and cross edges help identify loops for dead code elimination by revealing cyclic dependencies, while critical edges necessitate preprocessing to avoid incorrect transformations during redundancy elimination. Abnormal and impossible edges, however, can complicate analyses by introducing uncertainty, often requiring conservative approximations to preserve program correctness.[30]
Loop Structures
Loop Management
Loop identification in control-flow graphs (CFGs) primarily relies on the concept of natural loops, which are single-entry subgraphs formed by back edges connecting a tail node to a header node that dominates the tail. A back edge is defined as an edge from a node n to a node h where h dominates n, meaning every path from the entry of the CFG to n passes through h. The body of the natural loop includes the header h and all nodes from which n can be reached without traversing h, ensuring a unique entry point for optimization purposes.[31]
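Given a back edge tail → header, the loop body can be collected with a simple worklist over predecessors, as sketched below (adjacency-list representation assumed; dominance of the header over the tail is taken as given rather than checked):

```python
def natural_loop(succs, tail, header):
    """Body of the natural loop of back edge tail -> header (header is
    assumed to dominate tail): the header plus every node that reaches
    tail without passing through the header."""
    preds = {n: [] for n in succs}
    for a, ss in succs.items():
        for b in ss:
            preds[b].append(a)
    loop, work = {header, tail}, [tail]
    while work:
        for p in preds[work.pop()]:
            if p not in loop:
                loop.add(p)
                work.append(p)
    return loop

# While-loop CFG: back edge B -> H yields the loop body {H, B}.
g = {"E": ["H"], "H": ["B", "X"], "B": ["H"], "X": []}
print(natural_loop(g, "B", "H"))
```

Seeding the worklist with the tail and never expanding past the header implements exactly the "reaches the tail without traversing the header" condition.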
To simplify loop entry and facilitate analysis, compilers often insert a pre-header node immediately before the loop header, creating a dedicated entry block even if the original CFG has multiple predecessors to the header. In static single assignment (SSA) form, this pre-header placement allows phi functions—used to merge values from multiple control paths—to be positioned at the loop header without complicating the representation of loop entries. Pre-headers also enable safer insertion of loop-invariant computations outside the loop body while preserving the original control flow.
Natural loops exhibit a structured nesting property: any two natural loops in a CFG are either disjoint, one contained within the other, or identical, which aids in building a hierarchical loop nest tree for analysis.[24] This property holds because loops are defined via dominance and back edges, preventing arbitrary overlaps and allowing disjoint or nested configurations without side entries.[24] Handling nested loops involves recursively identifying inner loops within the body of outer ones, while disjoint loops can be processed independently.
One key optimization leveraging loop management is loop-invariant code motion, which relocates computations whose results do not change across loop iterations to the pre-header or just before the loop. This technique uses dominance information to verify that an expression is invariant—computed from values not modified within the loop—and can be safely hoisted without altering program semantics. For example, in a loop accumulating an array sum where the array bounds are constant, the bounds check can be moved outside, reducing redundant executions.
Reducibility
In control-flow graphs (CFGs), reducibility is a structural property that ensures all cycles can be organized into a hierarchical form resembling structured programming constructs. A CFG is reducible if every cycle is a natural loop entered only through a single dominating header node. Equivalently, in any depth-first spanning tree of the graph, every retreating edge must be a back edge, that is, an edge whose head dominates its tail.[25][32]
To test for reducibility, an iterative reduction algorithm can be applied: repeatedly identify and collapse natural loops (as defined by back edges to dominating headers) until the graph either reduces to a single node or fails to do so, indicating irreducibility. This process involves transformations such as removing self-loops and merging nodes with a unique predecessor, equivalent to collapsing intervals in the graph. The algorithm runs in O(V + E) time, where V is the number of nodes (basic blocks) and E is the number of edges, using a depth-first search to identify back edges and dominators efficiently.[33][32]
Reducible CFGs exhibit properties that align with structured control flow, commonly found in languages without unrestricted goto statements, such as those enforcing if-then-else, while, and for constructs. This structure enables simpler and more efficient analyses, including data-flow optimization and interval-based transformations, by limiting the complexity of iterative algorithms to the graph's loop-nesting depth plus a small constant.[32][25]
Examples of irreducible CFGs typically involve constructs like goto statements that introduce multiple entry points to a loop, creating cycles without a single dominating header; for instance, a goto jumping directly into a loop's body forms a retreating edge that cannot be part of a natural loop, resulting in a non-collapsible structure.[32]
Loop Connectedness
Loop connectedness quantifies the degree of interconnection among loops in a control-flow graph (CFG), particularly by measuring the prevalence of back edges within loop nests relative to a depth-first spanning tree (DFST). Formally, for a CFG G and DFST T, the loop connectedness d(G, T) is defined as the maximum number of back edges—those connecting a node to one of its ancestors in T—encountered along any cycle-free path in G.[34] This metric captures the complexity of loop structures, where higher values indicate denser nesting or multiple looping mechanisms within a single nest. In reducible CFGs, d(G, T) remains constant regardless of the chosen DFST, providing a stable property for analysis.[34]
A key aspect of loop connectedness involves assessing entry and exit points in loops, distinguishing single-entry single-exit (SESE) structures from those with multiple entries or exits. SESE loops, which have exactly one entry edge and one exit edge, represent ideal structured forms that align with high connectedness in reducible graphs, where each natural loop typically features a single back edge.[34] In contrast, multiple-entry loops arise in irreducible regions, increasing connectedness by allowing multiple back edges to converge on a loop header, complicating the flow.[34] Reducibility can be viewed as a special case of maximal connectedness limited to one back edge per cycle, enabling straightforward reduction to SESE forms.[34]
The hierarchy of loop connectedness emerges through interval analysis, where intervals—maximal single-entry subgraphs—are constructed to model nested loops as a tree-like hierarchy. In this framework, outer intervals encompass inner ones, with back edges defining the boundaries; the depth of nesting correlates with d(G), as multiple back edges signal deeper or overlapping loop interactions. For irreducible regions, which defy simple SESE decomposition due to crossing edges, intervals overlap rather than nest perfectly, requiring auxiliary nodes or transformations to maintain the hierarchy. This approach handles non-structured control by partitioning the CFG into progressively refined regions, preserving overall connectedness.
Algorithms for computing loop connectedness leverage dominator relations to identify and nest loops efficiently. Dominators, computed in near-linear time, reveal loop headers as nodes that strictly dominate back-edge tails, allowing enumeration of natural loops and their nesting: an inner loop nest exists if its header is dominated by an outer loop's header.[34] Cooper et al.'s iterative algorithm for dominators, which runs in effectively linear time on typical CFGs with low loop nesting, forms the basis for detecting these structures and deriving d(G) by counting back edges per nest.[34] For region formation, the process extends to building SESE regions via dominator and post-dominator trees, collapsing loops into hierarchical nodes; this yields a program structure tree (PST) where connectedness dictates the tree's depth and branching.[35]
The significance of loop connectedness lies in providing an upper bound on the number of iterations required for convergence in iterative data-flow analyses, such as no more than d(G) + 3 passes when traversing in reverse postorder.[34] This bound facilitates efficient optimization passes. Additionally, low connectedness can aid region-based restructuring of control flow, enabling transformations that simplify unstructured CFGs while preserving semantics.[35]
Extensions and Applications
Inter-procedural Control-Flow Graphs
Inter-procedural control-flow graphs (ICFGs) extend intra-procedural control-flow graphs to model control flow across procedure boundaries in a program, enabling whole-program analysis. They are constructed by integrating the control-flow graphs of individual procedures with a call graph that represents procedure invocations and returns. In this supergraph, nodes include statements from each procedure's CFG, along with explicit call nodes and return-site nodes; edges comprise intra-procedural control-flow edges, call edges from call sites to callee entry points, and return edges from callee exit points to the corresponding post-call sites in the caller. This structure captures the program's overall control flow while respecting procedure nesting.[36]
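Supergraph assembly can be sketched minimally as follows. The inputs are illustrative conventions, not a fixed format: each procedure is supplied as an (entry, exit, intra-edge list) triple and each call site as a (call node, return site, callee name) triple; the call-to-return edge modeling local flow past a call is the convention used in IFDS-style supergraphs.

```python
# Sketch of inter-procedural supergraph assembly (illustrative input format).

def build_icfg(procs, calls):
    """Return the labeled edge list of the inter-procedural CFG."""
    edges = []
    for entry, exit_, intra in procs.values():
        edges += [(u, v, "intra") for (u, v) in intra]
    for call, ret, callee in calls:
        callee_entry, callee_exit, _ = procs[callee]
        edges.append((call, callee_entry, "call"))       # call edge
        edges.append((callee_exit, ret, "return"))       # return edge
        edges.append((call, ret, "call-to-return"))      # local flow past call
    return edges

# main: m0 -> c1 (call f) -> r1 -> m2;  f: f0 -> f1
procs = {"main": ("m0", "m2", [("m0", "c1"), ("r1", "m2")]),
         "f": ("f0", "f1", [("f0", "f1")])}
print(build_icfg(procs, [("c1", "r1", "f")]))
```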
A key challenge in ICFG construction and analysis arises from recursion, which introduces cycles that can lead to infinite paths and unbounded graph explosion if fully unfolded. To mitigate this, analyses restrict consideration to valid paths—those with balanced calls and returns, enforceable via a context-free grammar that ensures proper nesting without infinite recursion. Parameter passing poses another challenge, as actual arguments at call sites must be mapped to formal parameters in callees, often requiring data-flow modeling of values and globals. Context-sensitivity further complicates matters by distinguishing procedure invocations based on calling contexts, improving precision but exponentially increasing the state space compared to context-insensitive approximations.[36][37]
For efficient analysis on ICFGs, summary edges approximate the effects of inter-procedural paths by representing the net transformation from a procedure's entry to its exit sites, computed via data-flow propagation without enumerating all paths. These edges summarize the intra-procedural behavior, such as variable modifications or uses, and are particularly useful in flow-sensitive analyses like reaching definitions. Integration with points-to analysis resolves indirect calls and pointer dereferences, refining the call graph and ICFG edges for greater accuracy in languages with dynamic dispatch.[37][38]
The IFDS (Interprocedural Finite Distributive Subset) framework provides a foundational algorithm for solving a broad class of inter-procedural data-flow problems on ICFGs, reducing them to graph reachability in an exploded supergraph. Here, nodes are pairs of original graph nodes and data-flow facts from a finite domain D, with edges modeling distributive data-flow functions; the framework ensures precision for subset-based problems (e.g., live variables) by computing only realizable paths in polynomial time, typically O(|E|·|D|^3), while handling recursion through path validation.[36]
Applications in Optimization and Analysis
Control-flow graphs (CFGs) play a central role in compiler optimizations by enabling the identification and elimination of unreachable code through reachability analysis, which determines whether certain nodes or basic blocks can be executed from the program's entry point.[12] This technique is fundamental to dead code elimination, where assignments to variables that are never used afterward are removed, reducing program size and improving execution speed without altering semantics.[39] For instance, in a CFG, if a node lacks incoming edges from reachable paths, it and its dependent subgraphs can be pruned entirely.[12]
Dominance relationships in CFGs further support optimizations like constant propagation, where constant values are forwarded along paths where a definition strictly dominates all uses, allowing substitution and simplification of expressions.[40] This process converges through iterative data-flow analysis on the CFG, replacing variables with constants to enable further folding and reduce runtime computations.[40] Such optimizations are implemented in production compilers, where dominance frontiers help manage phi-functions in static single assignment form for precise propagation.[41]
In static analysis, CFGs facilitate bug detection by modeling execution paths to identify potential errors, such as null pointer dereferences, where analysis traces paths from variable assignments to uses and flags those leading to unchecked nulls.[42] Tools employing flow-sensitive analysis on CFGs can pinpoint 50-80% of such defects in Java code by propagating nullness information across basic blocks.[42] Similarly, taint tracking for security leverages CFGs to propagate taint labels from untrusted sources through control and data flows, detecting leaks or violations like tainted data reaching sensitive sinks.[43] This approach reduces over-tainting by incorporating control-flow-aware propagation at the byte level, enhancing precision in vulnerability detection.[44]
Emerging applications extend CFGs to AI and machine learning compilers, where TensorFlow represents control flow using operators like tf.cond and tf.while_loop within its dataflow graphs, enabling optimizations for dynamic execution paths in neural network training.[45] In reverse engineering, CFGs aid binary analysis by reconstructing control structures from stripped executables, using graph neural networks to predict procedure boundaries and recover function calls with high accuracy.[46]
Major compilers integrate CFG-based passes for these purposes; LLVM's SimplifyCFG pass merges basic blocks and eliminates unreachable code, while GCC employs similar analyses in its optimization pipelines to streamline control flow.[47] Post-2020 advancements in just-in-time (JIT) compilation have incorporated renewable control-flow integrity, dynamically enforcing CFG policies during runtime to mitigate attacks while preserving performance.[48] Tools like JITScope visualize IR transformations on CFGs in JIT environments, aiding debugging and optimization tuning in modern systems.[49]