Program analysis
Program analysis is a branch of computer science focused on the automatic examination of computer programs to infer properties such as correctness, efficiency, security, and behavior without necessarily executing the code.[1] It serves as a foundational technique for software quality assurance, enabling developers and engineers to detect bugs, optimize performance, and verify compliance with specifications in complex systems.[1] The field addresses the inherent challenges of software complexity, where manual inspection becomes impractical, by providing scalable methods to analyze code at various scales, from individual functions to large-scale distributed applications.[2]

At its core, program analysis divides into two primary categories: static analysis, which inspects source or compiled code without running it to extract facts like potential errors or dependencies, and dynamic analysis, which observes runtime behavior through execution traces or instrumentation to capture actual program states.[3] Static approaches, such as dataflow analysis and abstract interpretation, offer broad coverage but may produce false positives due to conservative approximations, while dynamic methods provide precise insights at the cost of requiring test inputs and potentially missing unexercised paths.[2] These techniques underpin tools used in compilers for optimization, integrated development environments for real-time feedback, and security scanners for vulnerability detection.[1]

The theoretical foundations of program analysis trace back to early computability results, including Alan Turing's 1936 proof of the undecidability of the halting problem, which limits the precision of general analyses, and Rice's theorem, establishing that non-trivial properties of program behavior are undecidable.[3] Seminal advancements, like the formalization of abstract interpretation by Patrick and Radhia Cousot in 1977, provided a rigorous framework for sound static analyses by mapping concrete program semantics to abstract domains.[4] Today, the field continues to evolve with integrations of machine learning for improved precision and applications in emerging domains like probabilistic programming and distributed systems.[1]

Introduction
Definition and Goals
Program analysis is the systematic examination of software artifacts, such as source code, to infer properties about a program's behavior, structure, or performance without necessarily executing it.[5] This field in computer science employs automated techniques to approximate the dynamic behavior of programs, providing reliable insights into how software operates across possible executions.[6] Analyses can occur at different representation levels, including source code for high-level semantic understanding, bytecode for intermediate portability in virtual machines, or binary code for low-level efficiency in deployed systems.[7]

The primary goals of program analysis are to detect bugs by identifying potential defects like null pointer dereferences, verify program correctness against specifications, optimize code for better performance through techniques such as dead code elimination, ensure security by uncovering vulnerabilities like buffer overflows, and facilitate refactoring by revealing dependencies and structural insights.[7] These objectives support broader software quality assurance, enabling developers to produce reliable and efficient systems.[1] Key properties analyzed include code reachability to determine executable paths, aliasing to identify multiple references to the same memory location, and resource usage to assess demands on memory or computation.[7] Program analysis integrates throughout the software engineering lifecycle, from initial design and implementation to testing, maintenance, and evolution, thereby reducing costs and enhancing reliability.[7] Approaches encompass both static methods, which examine code without running it, and dynamic methods, which observe behavior during execution.[8]

Historical Development
The origins of program analysis trace back to the late 1950s and early 1960s, during the development of early optimizing compilers for high-level programming languages. The FORTRAN I compiler, released by IBM in 1957, incorporated rudimentary flow analysis techniques in its Sections 4 and 5 to determine the frequency of execution for different program paths, enabling basic optimizations such as common subexpression elimination and dead code removal through a Monte Carlo simulation of control flow. This marked one of the first systematic uses of program analysis for optimization, and was soon followed by more extensive control-flow and data-flow analysis integrated into the Fortran II system for the IBM 7090 in 1961 by Vyssotsky and others.[9][10]

In the 1970s, program analysis advanced significantly with foundational work on data-flow frameworks. Gary Kildall introduced a unified method for global program optimization in 1973, formalizing data-flow analysis as iterative computations over lattices to propagate information across program control structures, proving convergence under certain conditions and enabling optimizations like constant propagation and reaching definitions. Concurrently, Frances Allen at IBM developed key techniques for program optimization, including her 1970 framework for analyzing and transforming control-flow graphs, which influenced subsequent compiler optimizations and earned her the 2006 Turing Award for contributions to compiler theory.[11][12]

The late 1970s through the 1990s saw the emergence of more formal and scalable approaches, expanding program analysis beyond optimization to verification. Patrick and Radhia Cousot formalized abstract interpretation in 1977 as a unified lattice-based framework for approximating program semantics, allowing sound static analyses for properties like safety and termination by constructing fixpoints in abstract domains.[4] Independently, in the early 1980s, Edmund Clarke and E. Allen Emerson developed temporal logic model checking for algorithmic verification of finite-state concurrent systems, while Joseph Sifakis and Jean-Pierre Queille advanced similar automata-based techniques; their combined innovations, recognized with the 2007 Turing Award, enabled exhaustive exploration of state spaces for detecting errors in hardware and software.[13]

From the 2000s onward, program analysis integrated into modern toolchains and addressed security challenges, driven by formal verification and industry needs for scalability. Chris Lattner's LLVM framework, initiated in 2000 and detailed in 2004, provided a modular infrastructure for lifelong program analysis using static single assignment form, facilitating optimizations and enabling widespread adoption in tools like Clang for static analyses across languages. The 2014 Heartbleed vulnerability in OpenSSL spurred advancements in taint analysis, a static and dynamic technique for tracking untrusted data flows to prevent memory disclosures and injections, with tools like Coverity demonstrating its role in detecting similar buffer over-reads through taint propagation from network inputs. This period also reflected a shift toward scalable formal methods in industry, such as symbolic execution in tools like KLEE (built on LLVM), balancing precision with performance for verifying complex software.
In the 2020s, program analysis has increasingly incorporated large language models to assist in tasks like static analysis, bug detection, and code understanding, further enhancing automation and precision in software verification.[14][15][16]

Classification of Program Analysis
Static versus Dynamic Analysis
Static analysis examines the source code or binary of a program without executing it, enabling reasoning about all possible execution behaviors to pursue exhaustive coverage of the program's state space.[17] This approach is foundational in areas like compiler optimization, where approximations of program semantics are computed to detect errors or optimize code ahead of runtime. However, fundamental limits arise from undecidability, as Rice's theorem demonstrates that non-trivial semantic properties of programs—such as whether a program computes a specific function—are inherently undecidable, necessitating over-approximations in static analyses that may include infeasible behaviors and lead to false positives.

In contrast, dynamic analysis relies on executing the program, either in full or on selected inputs, to gather concrete observations of its runtime behavior, yielding precise results for the paths actually traversed.[17] This execution-based method excels in revealing real-world issues like performance bottlenecks or specific defects under given conditions but is inherently limited by incomplete path coverage, as the exponential growth of possible execution paths—often termed path explosion—renders exhaustive testing impractical even for modest programs.[18]

The primary trade-offs between these approaches center on soundness, precision, and efficiency. Sound static analyses guarantee detection of all errors (no false negatives) by over-approximating possible behaviors, though this conservatism reduces precision and increases false positives, while requiring substantial computational resources for scalability.[17] Dynamic analyses, conversely, offer high precision with no false positives for observed executions—since results stem directly from concrete runs—but sacrifice soundness due to potential misses in unexplored paths, though they are typically more efficient per test case.[17] These complementary strengths have spurred hybrid techniques, such as concolic execution, which integrate concrete execution with symbolic reasoning to balance coverage and precision without fully resolving the underlying challenges.[19]
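The following toy sketch in Python illustrates this trade-off; the "static" check is a deliberately naive, path-insensitive rule invented for the example (not a real analyzer), while the dynamic check simply runs the function on a handful of inputs.

# A toy illustration (not a real analyzer) of the static/dynamic trade-off.
# The "static" check is deliberately path-insensitive, so it reports a
# division by zero that can never occur; the "dynamic" check only sees the
# inputs it is given.

def scale(x):
    d = 0
    if x > 0:
        d = x          # on every path that reaches the division, d != 0
    else:
        return 0       # the d == 0 path never reaches the division
    return 100 // d

def naive_static_check():
    # Over-approximates: "d may be 0 at the division" because some assignment
    # gives d the value 0, ignoring path feasibility.
    possible_d_values = {0, "x"}          # values of d merged over all paths
    return ["possible division by zero"] if 0 in possible_d_values else []

def dynamic_check(inputs):
    # Precise for the executed runs, but silent about unexercised inputs.
    findings = []
    for x in inputs:
        try:
            scale(x)
        except ZeroDivisionError:
            findings.append(f"division by zero for input {x}")
    return findings

print(naive_static_check())       # ['possible division by zero']  (false positive)
print(dynamic_check([1, 5, -3]))  # []  (no error on the tested paths)

Here the static rule is sound for this property but imprecise (the reported error lies on an infeasible path), whereas the dynamic check is precise for the runs it sees but says nothing about inputs that were never tried.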
Soundness, Completeness, and Precision
In program analysis, soundness, completeness, and precision are fundamental properties that assess the reliability and accuracy of an analysis's results relative to the program's actual semantics, often formalized using denotational semantics where the concrete semantics denotes the set of all possible program behaviors.[4] Soundness ensures that the analysis captures all true behaviors of the program, producing no false negatives; formally, if the concrete semantics of a program is the set of all reachable states or behaviors S, a sound analysis yields an over-approximation \hat{S} such that S \subseteq \hat{S}. This property is crucial for verification tasks, as it guarantees that the absence of reported issues in the analysis implies no issues exist in reality. In the framework of abstract interpretation, soundness is established via an abstraction function \alpha and a concretization function \gamma satisfying \gamma(\alpha(x)) \sqsupseteq x for concrete values x, ensuring the abstract domain correctly approximates the concrete one without omission.[4]

Completeness requires that the analysis results match the true behaviors exactly, with \hat{S} = S, eliminating both false negatives and false positives; in abstract interpretation terms, this holds when \alpha(\gamma(y)) = y for abstract values y, meaning the approximation introduces no extraneous information. However, completeness is generally undecidable for nontrivial program properties due to the implications of the halting problem, which shows that determining exact semantic properties like termination or reachability is impossible in finite time for Turing-complete languages.[20]

Precision measures the tightness of the approximation, quantifying how closely \hat{S} adheres to S by minimizing spurious elements in over-approximations; for instance, it can be assessed via the width or cardinality of the abstract domain relative to the concrete one, where narrower domains yield finer-grained results. Achieving high precision often involves trade-offs with scalability, as more precise analyses require computationally intensive operations like narrowing to refine approximations, potentially increasing analysis time exponentially.[21][4]

These properties manifest differently across analysis types: static analyses typically employ over-approximation to ensure soundness by including all possible behaviors (e.g., "may" analyses that report potentially reachable states), while dynamic analyses often use under-approximation, capturing only observed behaviors during execution (e.g., "must" analyses confirming definite properties in tested paths). This distinction aligns with the broader static-dynamic divide, where static methods prioritize exhaustive coverage at the cost of potential imprecision.[22]
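A minimal sketch of soundness as over-approximation, assuming a toy interval abstraction over a small finite universe of integers; the functions alpha and gamma and the example set S are illustrative only.

# A minimal sketch of soundness as over-approximation, assuming a toy
# interval domain. alpha maps a concrete set of reachable values S to the
# tightest enclosing interval; gamma(alpha(S)) is a superset of S, so the
# abstraction is sound but may lose precision.

def alpha(S):                      # abstraction: set of ints -> interval
    return (min(S), max(S))

def gamma(interval, universe):     # concretization: interval -> set of ints
    lo, hi = interval
    return {v for v in universe if lo <= v <= hi}

universe = set(range(-10, 11))
S = {1, 3, 7}                      # concrete reachable values

abstract = alpha(S)                # (1, 7)
over_approx = gamma(abstract, universe)

assert S <= over_approx            # soundness: S is a subset of gamma(alpha(S))
spurious = over_approx - S         # imprecision: reported but unreachable values
print(abstract, sorted(spurious))  # (1, 7) [2, 4, 5, 6]

The assertion encodes S \subseteq \gamma(\alpha(S)): every concrete behavior is covered, while the spurious values measure the precision lost by the abstraction.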
Static Program Analysis Techniques
Control-Flow Analysis
Control-flow analysis is a fundamental static technique in program analysis that models the possible execution paths of a program by constructing a control-flow graph (CFG). A CFG represents the program as a directed graph where nodes correspond to basic blocks—maximal sequences of straight-line code without branches or jumps—and edges indicate possible control transfers, such as conditional branches, unconditional jumps, loops, or procedure returns. This graph-based representation, pioneered in early compiler optimization work, enables the identification of structured control elements like sequences, selections, and iterations without simulating execution.[23][24]

The construction of CFGs typically proceeds intraprocedurally, focusing on the control structure within a single function or procedure. Basic blocks are identified by partitioning the code at entry points, branch targets, and control-altering statements, with edges added based on the semantics of jumps, calls, and returns. For interprocedural analysis, which extends the CFG across procedure boundaries, additional edges connect call sites to entry nodes of callees and return sites to post-call nodes, often modeling the call graph to handle procedure interactions. However, precise interprocedural CFGs require approximations to avoid exponential growth, such as treating calls as simple transfers or using call-string summaries for context sensitivity.[25][26]

Once constructed, CFGs facilitate computations of structural properties essential for further analysis. Dominators are nodes through which every path from the entry must pass to reach a given node, computed efficiently using iterative dataflow algorithms that propagate dominance information forward from the entry; a simple, fast method iterates over the nodes in reverse postorder until fixed-point convergence, achieving near-linear time in practice. Post-dominators are defined analogously but backward from the exit node. These concepts underpin loop detection: natural loops are identified by back edges in a depth-first search spanning tree, where a back edge from node m to ancestor n defines a loop with header n and body consisting of all nodes that can reach m without passing through n, enabling optimizations like loop-invariant code motion.[27][25]

Introductory applications of CFGs involve solving basic data-flow problems to illustrate control propagation. Reaching definitions analysis determines, for each program point, the set of variable assignments that may reach it along some path from the entry, formulated as a forward dataflow problem where definitions "gen" at assignment sites and are "kill"ed by subsequent redefinitions; solutions are computed via iterative propagation over the CFG edges until fixed point. Similarly, live variables analysis identifies variables used before their next definition along any path to the exit, a backward problem where uses "gen" at read sites and definitions "kill" liveness, aiding register allocation by revealing variable lifetimes. These computations highlight the CFG's role as a prerequisite for propagating abstract information in more advanced analyses.[25]

Despite their utility, CFGs face limitations in handling complex control features. Recursion introduces cycles in the interprocedural call graph, potentially leading to infinite paths that require summarization or bounded call-string approximations to terminate analysis.
Exceptions and non-local transfers, such as those in languages like Java or C++, add irregular edges from throw sites to handlers, complicating graph construction and often necessitating separate exception-flow graphs or integrated modeling to capture all paths. Scalability issues arise in large codebases, where interprocedural CFGs can explode in size due to call-site multiplicity, though practical implementations mitigate this with on-demand construction and pointer analysis for call resolution, maintaining efficiency for programs up to millions of lines.[28][26]
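As a concrete illustration of these computations, the following Python sketch builds dominator sets by iterative dataflow over a small hand-made CFG and then identifies back edges, which define natural loop headers; the node names and graph shape are illustrative assumptions.

# A small sketch, assuming a hand-built CFG: iterative dominator computation
# and back-edge detection for natural-loop identification.

cfg = {                      # successor lists; "entry" -> ... -> "exit"
    "entry": ["b1"],
    "b1":    ["b2", "b3"],   # conditional branch
    "b2":    ["b4"],
    "b3":    ["b4"],
    "b4":    ["b1", "exit"], # the edge b4 -> b1 closes a loop
    "exit":  [],
}
nodes = list(cfg)
preds = {n: [m for m in cfg if n in cfg[m]] for n in nodes}

# Iterative dataflow computation of dominator sets until a fixed point.
dom = {n: set(nodes) for n in nodes}
dom["entry"] = {"entry"}
changed = True
while changed:
    changed = False
    for n in nodes:
        if n == "entry":
            continue
        new = {n} | set.intersection(*(dom[p] for p in preds[n]))
        if new != dom[n]:
            dom[n], changed = new, True

# A back edge m -> h exists when h dominates m; it defines a natural loop
# with header h.
back_edges = [(m, h) for m in nodes for h in cfg[m] if h in dom[m]]
print(sorted(dom["b4"]))     # ['b1', 'b4', 'entry']
print(back_edges)            # [('b4', 'b1')]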
Data-Flow Analysis
Data-flow analysis is a framework for statically gathering information about the possible values of variables or expressions at various points in a program by propagating facts along paths in the control-flow graph.[29] This technique enables the inference of program properties such as which definitions reach a use or which expressions are available, facilitating optimizations like dead code elimination and constant folding.[30] Originating from early compiler optimization efforts, it models data dependencies abstractly to avoid the undecidability of precise value tracking.[29]

The core framework employs a lattice structure to represent sets of facts at program points, where the lattice (L, \sqcap, \sqcup, \bot, \top) defines the possible states, with the meet operation \sqcap used to combine information at control-flow merges.[30] Transfer functions f: L \to L describe how facts transform through statements and must be monotonic, preserving the partial order: if x \sqsubseteq y, then f(x) \sqsubseteq f(y).[30] Analyses are solved via fixed-point iteration, computing the least fixed point of the data-flow equations over the lattice, which converges in finitely many steps when the lattice has finite height.[30] The worklist algorithm implements this efficiently by maintaining a queue of nodes whose information needs updating, propagating changes until stabilization.[30]

Analyses are classified as forward or backward. Forward analyses, like reaching definitions, propagate information from program entry toward exit, while backward analyses, like live variables, flow from exit toward entry.[29] For a program point p, the basic equations are \text{IN}[p] = \bigsqcap_{q \in \text{pred}(p)} \text{OUT}[q] and \text{OUT}[p] = f_p(\text{IN}[p]), where f_p is the transfer function for p and \bigsqcap is the meet over predecessors at join points.[29]

Classic problems illustrate the framework. Reaching definitions determines which variable assignments may reach a point, using a forward analysis with union (\cup) as the merge operator and sets of definitions as the lattice elements. The equations are \text{IN}[p] = \bigcup_{q \in \text{pred}(p)} \text{OUT}[q] and \text{OUT}[p] = \text{gen}[p] \cup (\text{IN}[p] \setminus \text{kill}[p]), where \text{gen}[p] are the definitions generated at p and \text{kill}[p] are those invalidated.[29] Available expressions, a forward analysis supporting common subexpression elimination, tracks expressions guaranteed to have been computed on every path from entry, using intersection (\cap) as the merge operator and analogous gen/kill sets.[29] Constant propagation forwards constant values across assignments, employing a lattice of constants (e.g., \bot for unknown, specific values, or \top for non-constant), with transfer functions substituting constants where possible.[30]

Extensions address limitations in larger programs. Flow-insensitive analyses approximate by treating statement order loosely, often via fixed-point over all statements, to reduce complexity at the cost of precision.[31] Context-sensitive variants, crucial for interprocedural analysis, distinguish call sites by modeling calling contexts (e.g., via call strings or graph reachability), enabling precise propagation across procedures while maintaining polynomial-time solvability for distributive problems.[31] These apply to dead code elimination by identifying unreachable or unused code through reaching definitions or live variables, removing it to reduce program size and execution time.[30]
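A minimal worklist sketch of reaching definitions in Python, assuming one definition per labelled statement and a hand-built CFG; the gen/kill sets and graph are illustrative only.

from collections import deque

# CFG successors; d1..d4 are labelled statements (one definition each):
# d1: x = 1    d2: y = x    d3: x = x + 1    d4: use of x and y
succs = {"d1": ["d2"], "d2": ["d3"], "d3": ["d4", "d2"], "d4": []}
preds = {n: [m for m in succs if n in succs[m]] for n in succs}

gen  = {"d1": {"d1"}, "d2": {"d2"}, "d3": {"d3"}, "d4": set()}
kill = {"d1": {"d3"}, "d2": set(), "d3": {"d1"}, "d4": set()}   # other defs of the same variable

IN  = {n: set() for n in succs}   # IN[p]  = union of OUT over predecessors
OUT = {n: set() for n in succs}   # OUT[p] = gen[p] | (IN[p] - kill[p])

worklist = deque(succs)           # forward analysis: seed with every node
while worklist:
    n = worklist.popleft()
    IN[n] = set().union(*(OUT[p] for p in preds[n]))
    new_out = gen[n] | (IN[n] - kill[n])
    if new_out != OUT[n]:
        OUT[n] = new_out
        worklist.extend(succs[n]) # successors must be re-examined

print(sorted(IN["d4"]))           # ['d2', 'd3']: definitions that may reach d4

Because d3 redefines x, the definition d1 is killed on every path into d4, so only d2 and d3 reach it; the worklist stabilizes once no OUT set changes.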
Abstract Interpretation
Abstract interpretation provides a general framework for the static analysis of programs by approximating their semantics in a way that ensures decidability and computability. Introduced by Patrick and Radhia Cousot, it formalizes program analysis as the interpretation of programs within abstract domains that over-approximate the concrete semantics of the program. This approach allows for the automatic determination of program properties by computing fixpoints in these abstract domains, which are typically lattices equipped with operations that mimic but simplify the concrete semantics.[32]

At the core of abstract interpretation lies the theory of Galois connections between a concrete domain \langle \mathcal{C}, \sqsubseteq \rangle representing the exact program semantics and an abstract domain \langle \mathcal{A}, \sqsubseteq^\sharp \rangle providing approximations. A Galois connection is defined by a pair of monotonic functions \alpha: \mathcal{C} \to \mathcal{A} (abstraction) and \gamma: \mathcal{A} \to \mathcal{C} (concretization) such that for all c \in \mathcal{C} and a \in \mathcal{A}, \alpha(c) \sqsubseteq^\sharp a if and only if c \sqsubseteq \gamma(a).[32] Soundness is ensured because the abstract domain over-approximates the concrete one: \gamma(\alpha(c)) \sqsupseteq c for all c \in \mathcal{C}, meaning that any property proven in the abstract domain holds in the concrete semantics, though the converse may not be true due to potential false positives.[33]

The process involves defining an abstract semantics \llbracket \cdot \rrbracket^\sharp that collects the effects of program statements in \mathcal{A}, often by interpreting the program's denotational semantics in the abstract domain. For terminating computations, this is straightforward, but for loops and recursion, termination is achieved using widening operators \nabla: \mathcal{A} \times \mathcal{A} \to \mathcal{A}, which are upper-bound operators satisfying x \sqsubseteq^\sharp x \nabla y and y \sqsubseteq^\sharp x \nabla y and guaranteeing that every increasing iteration sequence stabilizes after finitely many steps, yielding an over-approximation of the least fixpoint.[32]

Practical examples illustrate the framework's application. In interval analysis, the abstract domain consists of intervals [\ell, u] for numerical variables, where \alpha maps a set of concrete values to the tightest enclosing interval, and operations like addition are defined component-wise with appropriate handling for overflow or underflow.[34] For instance, analyzing x = x + 1 in a loop might yield an interval whose upper bound is widened to +\infty after a few iterations, conservatively flagging a potential overflow rather than missing it.
Sign analysis uses a simpler domain \{ -, 0, +, \top \}, where \top represents unknown signs, and abstract operations propagate sign information through assignments and conditions, effectively handling non-determinism by merging possibilities into the least upper bound in the lattice.[35] Non-determinism in inputs or control flow is managed by the join operation \sqcup^\sharp, which computes the least upper bound in \mathcal{A}, ensuring the analysis remains sound by including all possible behaviors.[33]

The advantages of abstract interpretation include its modular design, allowing analysts to define new abstract domains and operations tailored to specific properties without altering the underlying framework, and its deep connection to denotational semantics, where the concrete semantics serves as a semantic function that is lifted to the abstract level.[32] This modularity facilitates the development of analyses for diverse properties, from numerical accuracy to resource usage. Data-flow analysis can be viewed as a special case of abstract interpretation using finite-height domains without explicit widening.[33]
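The following single-variable Python sketch shows interval analysis with a standard jump-to-infinity widening for the loop x = 0; while x < 100: x = x + 1; the transfer functions and guard handling are simplified assumptions made for the example.

# A small sketch of interval analysis with widening, under the simplifying
# assumption of a single variable and the loop: x = 0; while x < 100: x = x + 1.
# The widening operator pushes unstable bounds to infinity so the iteration
# terminates.

INF = float("inf")

def join(a, b):                      # least upper bound of two intervals
    return (min(a[0], b[0]), max(a[1], b[1]))

def widen(old, new):                 # keep stable bounds, push unstable ones to +/- inf
    lo = old[0] if new[0] >= old[0] else -INF
    hi = old[1] if new[1] <= old[1] else INF
    return (lo, hi)

def add_const(iv, c):                # abstract transfer function for x = x + c
    return (iv[0] + c, iv[1] + c)

def meet_less_than(iv, k):           # refine an interval with the guard x < k
    return (iv[0], min(iv[1], k - 1))

state = (0, 0)                       # x = 0 before the loop
while True:
    body = add_const(meet_less_than(state, 100), 1)   # effect of one iteration
    new_state = widen(state, join(state, body))
    if new_state == state:
        break
    state = new_state

print(state)                         # (0, inf): over-approximation at the loop head
print(meet_less_than(state, 100))    # (0, 99): refined by the guard inside the body

Without widening the loop-head interval would grow by one step per iteration; widening trades the exact bound for guaranteed termination of the analysis, and a subsequent narrowing pass could tighten the loop-head interval again.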
Type Systems
Type systems provide a static analysis mechanism in programming languages to assign types to variables, expressions, and functions, ensuring that operations are applied only to compatible data structures. This assignment is formally defined through typing rules, such as \Gamma \vdash e : \tau, where \Gamma is a typing environment mapping variables to types, e is an expression, and \tau is the inferred type of e. These rules specify how types propagate through program constructs, for example, ensuring that the argument type of an application matches the function's parameter type. Seminal work on type inference, particularly the Hindley-Milner algorithm, enables automatic derivation of principal types for polymorphic functions in languages like ML, providing decidable, complete inference for the let-polymorphic extension of the simply typed lambda calculus.[36][37]

Static type checking uses these systems to detect type errors before program execution, preventing mismatches such as adding a string to an integer. Polymorphism enhances flexibility: parametric polymorphism allows functions like map to work uniformly across types without explicit instantiation, as in the Hindley-Milner system, while ad-hoc polymorphism, often via type classes or overloading, provides type-specific implementations, such as numeric operations on integers versus floats. Subtyping further refines this by permitting a type \tau_1 to be safely substituted for \tau_2 if \tau_1 \leq \tau_2, enabling structural compatibility in object-oriented languages, as formalized in structural subtyping rules.[38][39][40]
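As a small illustration of checking the judgement \Gamma \vdash e : \tau, the following Python sketch type-checks a toy monomorphic expression language; it performs checking only (not Hindley-Milner inference), and the constructors and environment encoding are assumptions made for the example.

# A minimal sketch of the judgement Gamma |- e : tau for a toy expression
# language (integer and boolean literals, variables, addition, comparison,
# and a conditional). The environment Gamma is a plain dictionary.

def type_of(expr, env):
    kind = expr[0]
    if kind == "int":                       # Gamma |- n : int
        return "int"
    if kind == "bool":                      # Gamma |- b : bool
        return "bool"
    if kind == "var":                       # Gamma(x) = tau  =>  Gamma |- x : tau
        return env[expr[1]]
    if kind == "add":                       # both operands must be int
        if type_of(expr[1], env) == type_of(expr[2], env) == "int":
            return "int"
        raise TypeError("add expects int operands")
    if kind == "less":                      # int comparison yields bool
        if type_of(expr[1], env) == type_of(expr[2], env) == "int":
            return "bool"
        raise TypeError("less expects int operands")
    if kind == "if":                        # condition bool, branches must agree
        cond, then, other = expr[1:]
        if type_of(cond, env) != "bool":
            raise TypeError("condition must be bool")
        t1, t2 = type_of(then, env), type_of(other, env)
        if t1 != t2:
            raise TypeError("branches must have the same type")
        return t1
    raise TypeError(f"unknown expression {kind}")

env = {"x": "int"}
expr = ("if", ("less", ("var", "x"), ("int", 0)), ("int", 0), ("add", ("var", "x"), ("int", 1)))
print(type_of(expr, env))   # int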
Advanced type systems extend these foundations to express richer program properties. Dependent types allow types to depend on program values, enabling specifications like array access within bounds (e.g., indexing an array of length n only at positions from 0 to n-1), as demonstrated in practical extensions to ML-like languages. Flow-sensitive typing refines analysis by tracking type changes along control-flow paths, such as refining a variable's type after a conditional check, improving precision over flow-insensitive approximations. Type systems can be viewed as abstract interpretations over type lattices, specializing the general framework to safety properties.[41][42]
Despite their power, type systems face trade-offs between expressiveness and decidability; ensuring type checking remains computable limits their ability to verify arbitrary properties, such as catching all runtime errors like division by zero in general-purpose languages. For instance, full dependent types can encode undecidable propositions, necessitating restrictions for practical inference.[43][44]
Model Checking
Model checking is an automated verification technique that exhaustively determines whether a finite-state model of a system satisfies a given specification, typically expressed in temporal logic.[45] The models are commonly represented as Kripke structures, consisting of a set of states S, a transition relation R \subseteq S \times S, a set of initial states I \subseteq S, and a labeling function L: S \to 2^{AP} that assigns subsets of atomic propositions AP to each state.[46] Specifications are formulated in logics such as Linear Temporal Logic (LTL) for linear-time properties or Computation Tree Logic (CTL) for branching-time properties; for instance, the CTL formula AG \, p \rightarrow AF \, q asserts that if proposition p holds invariantly from the initial state, then proposition q will eventually hold along every path.[45] If the property fails, model checkers generate a counterexample trace—a finite path in the model violating the specification—to aid debugging.[47]

Core algorithms operate on automata-theoretic or symbolic representations to handle the verification. For LTL specifications, an automata-based approach constructs a Büchi automaton from the negation of the formula and checks the emptiness of the product automaton with the Kripke structure model.[48] In software contexts, models are often expressed as Labelled Transition Systems (LTS), where transitions are labeled with actions, enabling the analysis of concurrent behaviors.[49] Symbolic model checking, pioneered using Binary Decision Diagrams (BDDs), represents the state space implicitly to avoid explicit enumeration, as implemented in early tools like SMV for CTL formulas.[50] To combat state explosion—the exponential growth in the number of states—techniques such as model minimization reduce equivalent states, while partial order reduction prunes the exploration by ignoring independent concurrent actions that do not affect the property.[51]

Originally developed for hardware verification in the early 1980s, model checking has been extended to software systems, where finite-state abstractions of programs are checked against temporal properties.[52] Seminal work by Clarke and Emerson introduced CTL model checking algorithms for branching-time logic in 1981, demonstrating polynomial-time verification relative to the model size. Tools like SMV, first applied to circuit designs, influenced software extensions by providing counterexample-guided diagnostics.[50]

Scalability remains a primary challenge due to the state explosion in concurrent systems, often addressed through abstraction-refinement loops like Counterexample-Guided Abstraction Refinement (CEGAR), which iteratively refines over-approximate abstract models upon discovering spurious counterexamples.[53] This process ensures soundness by guaranteeing that valid counterexamples in the concrete model are preserved, though it requires careful predicate selection for effective refinement.
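A small explicit-state sketch in Python: checking the invariant AG safe on a hand-built Kripke structure by breadth-first reachability and returning a counterexample trace when a violating state is reachable; the structure, labels, and proposition name are illustrative assumptions.

from collections import deque

transitions = {                       # R as successor lists over S
    "s0": ["s1", "s2"],
    "s1": ["s0"],
    "s2": ["s3"],
    "s3": ["s3"],
}
labels = {                            # L : S -> subsets of AP
    "s0": {"safe"},
    "s1": {"safe"},
    "s2": {"safe"},
    "s3": set(),                      # 'safe' does not hold here
}
initial = ["s0"]

def check_invariant(prop):
    parent = {s: None for s in initial}
    queue = deque(initial)
    while queue:
        s = queue.popleft()
        if prop not in labels[s]:     # violation: reconstruct the trace
            trace = []
            while s is not None:
                trace.append(s)
                s = parent[s]
            return list(reversed(trace))
        for t in transitions[s]:
            if t not in parent:
                parent[t] = s
                queue.append(t)
    return None                       # AG prop holds in all reachable states

print(check_invariant("safe"))        # ['s0', 's2', 's3'] -- counterexample trace

Real model checkers avoid this explicit enumeration through symbolic encodings or reductions, but the returned trace plays the same diagnostic role as the counterexamples described above.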
Dynamic Program Analysis Techniques
Testing
Testing is a fundamental dynamic program analysis technique that involves executing a program with selected inputs to observe its behavior and verify whether it meets specified requirements. By running the software in a controlled environment, testing aims to uncover defects, ensure functionality, and assess reliability under various conditions. Unlike static analysis, which examines code without execution, testing relies on actual runtime behavior to detect issues such as crashes, incorrect outputs, or performance anomalies. This approach is essential in software development lifecycles, where it supports iterative validation from early stages to final deployment.[54]

Testing occurs at multiple levels to address different scopes of the system. Unit testing focuses on individual components or functions in isolation, verifying their internal logic against expected behaviors. Integration testing examines interactions between units, ensuring that combined modules function correctly without introducing new faults. System testing evaluates the entire integrated application against overall requirements, often simulating real-world usage scenarios. These levels build progressively, with unit tests providing foundational confidence before broader integration and system validation.

Testing strategies are broadly categorized as black-box or white-box, differing in their knowledge of internal structure. Black-box testing treats the program as an opaque entity, selecting inputs based on external specifications and checking outputs for correctness without examining code paths. In contrast, white-box testing leverages knowledge of the program's internals to design tests that exercise specific structures, guided by coverage criteria such as statement coverage (executing every line), branch coverage (traversing all decision outcomes), or path coverage (exploring feasible execution paths). For safety-critical systems like avionics software, modified condition/decision coverage (MC/DC) is mandated, requiring each condition in a decision to independently affect the outcome, ensuring thorough fault detection in boolean expressions.[55]

The testing process begins with test case generation, which can be manual, random, or model-based. Manual generation involves developers crafting inputs based on domain knowledge and requirements, allowing targeted exploration of edge cases but prone to human bias and incompleteness. Random testing automates input selection from input domains, often using feedback to refine sequences, as in tools like Randoop for Java, which generates unit tests by randomly composing method calls and pruning invalid ones. Model-based approaches derive tests from formal models of the system's behavior, such as state machines or UML diagrams, enabling systematic coverage of transitions and states. During execution, assertions embedded in code check invariants, while test oracles determine pass/fail by comparing actual outputs to expected results, often derived from specifications or historical data. Recent advancements include machine learning techniques for automated test generation, improving coverage and fault detection in complex systems as of 2024.[56][54][57][58]

Effectiveness of test suites is measured using metrics like code coverage and mutation testing.
Coverage metrics quantify how much of the code is exercised; for instance, branch coverage above 80% is often a threshold for adequate testing in industrial practices, though full path coverage is computationally infeasible for complex programs due to exponential paths. Mutation testing assesses suite quality by injecting small faults (mutants) into the code and verifying if tests detect them, with mutation scores indicating fault-detection capability; empirical studies have shown correlations between mutant detection and real fault revelation. Static path analysis can aid by identifying feasible paths for targeted test generation.[55][59]

Despite its strengths, testing has inherent limitations. It cannot prove the absence of bugs, as exhaustive testing is impractical for non-trivial programs, potentially missing defects in unexercised paths. The oracle problem exacerbates this for non-deterministic code, where outputs vary due to timing, concurrency, or external factors, making expected results hard to define without complete specifications. Thus, testing complements but does not replace other verification methods.[60][56]
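The following toy Python sketch illustrates mutation analysis: a few hand-written mutants of a small function are run against a test suite, and the mutation score is the fraction of mutants the suite detects; real tools generate mutants automatically, so the mutants and tests here are illustrative assumptions.

# A toy sketch of mutation testing: hand-written mutants of a small function
# are run against the test suite, and the mutation score is the fraction of
# mutants the suite "kills" (detects).

def absolute(x):
    return -x if x < 0 else x

# Mutants simulate small injected faults in the original.
mutants = {
    "negate_condition": lambda x: -x if x > 0 else x,   # '<' mutated to '>'
    "drop_negation":    lambda x: x if x < 0 else x,    # '-x' mutated to 'x'
    "off_by_one":       lambda x: -x if x < 1 else x,   # '0' mutated to '1'
}

tests = [(-3, 3), (0, 0), (5, 5)]      # (input, expected) pairs

def suite_passes(fn):
    return all(fn(inp) == expected for inp, expected in tests)

assert suite_passes(absolute)          # the suite passes on the original code

killed = [name for name, m in mutants.items() if not suite_passes(m)]
print(killed)                                      # detected faults
print(f"mutation score: {len(killed)}/{len(mutants)}")   # 2/3

The surviving off-by-one mutant signals a gap in the suite (no test input between 0 and 1), which is exactly the kind of weakness mutation testing is meant to expose.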
Runtime Monitoring
Runtime monitoring is a dynamic program analysis technique that observes and analyzes the execution of a program in real-time or post-execution to detect violations of specified properties, gather performance profiles, or identify anomalies during runtime.[61] Unlike static analysis, which examines code without execution, runtime monitoring relies on actual program runs to collect data such as variable states, control flows, and system events, enabling the verification of behaviors that may be infeasible to predict statically.[62] This approach is particularly valuable for systems where non-determinism or environmental interactions make exhaustive static prediction challenging.[63]

Key mechanisms for runtime monitoring include instrumentation, which inserts probes or code snippets into the program to capture execution events, and event logging for recording traces that can be analyzed later.[61] Probes can be added at compile-time, load-time, or runtime, often using techniques like binary instrumentation tools (e.g., Pin or Valgrind) to avoid source code modifications.[61] Aspect-oriented programming (AOP) provides a modular way to implement such instrumentation by defining aspects that weave monitoring logic around specific join points, such as method calls or exceptions, without altering the core program structure.[64] Event logging involves capturing sequences of events into traces, which are then processed for pattern matching or statistical analysis to infer system behavior.[62]

Applications of runtime monitoring span several areas, including invariant checking, where tools like Daikon dynamically infer and verify likely program invariants—such as range constraints or relational properties—by observing multiple executions and reporting those holding true across traces.[65] For performance profiling, gprof generates call graphs and execution time breakdowns by instrumenting functions to count calls and sample runtime, helping identify bottlenecks in large programs.[66] Anomaly detection uses monitoring to flag deviations from expected patterns, such as unusual resource usage or security violations, often in real-time for immediate response. Modern applications include monitoring in AI and distributed systems, leveraging machine learning for anomaly prediction as of 2024.[62][67]

Runtime monitoring techniques distinguish between online analysis, which processes events as they occur to enable immediate feedback or enforcement (e.g., halting execution on violation), and offline analysis, which examines complete traces after execution for comprehensive post-mortem insights.[68] Online monitoring suits time-sensitive applications like safety-critical systems, while offline allows deeper analysis without runtime constraints.[68] Handling concurrency requires thread-safe monitors that synchronize access to shared state, often using atomic operations or lock-free data structures to avoid race conditions in multi-threaded environments.[69]

Significant challenges in runtime monitoring include minimizing overhead, as instrumentation can increase execution time by 10-50% or more depending on probe density and analysis complexity, prompting optimizations like selective sampling or hardware-assisted tracing.[70] Non-determinism in traces, arising from concurrent scheduling, external inputs, or timing variations, complicates verification by producing variable execution paths, necessitating robust trace merging or probabilistic models to ensure reliable detection.[69]
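A minimal Python sketch of instrumentation-based monitoring: a decorator wraps a function with a probe that appends each call to an event log (usable for offline analysis) and performs an online check of a simple invariant on the result; the monitored function, the invariant, and the log format are illustrative assumptions.

import functools

event_log = []                                   # trace available for offline analysis

def monitor(invariant):
    def decorate(fn):
        @functools.wraps(fn)
        def probe(*args, **kwargs):
            result = fn(*args, **kwargs)
            event_log.append((fn.__name__, args, result))   # record the event
            if not invariant(result):            # online check: react immediately
                raise AssertionError(
                    f"invariant violated by {fn.__name__}{args} -> {result}")
            return result
        return probe
    return decorate

@monitor(invariant=lambda r: r >= 0)             # result must stay non-negative
def withdraw(balance, amount):
    return balance - amount

print(withdraw(100, 30))                         # 70, logged
try:
    withdraw(100, 150)                           # violates the invariant at runtime
except AssertionError as e:
    print(e)

print(event_log)                                 # recorded events for post-mortem use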
Program Slicing
Program slicing is a technique for reducing a program or its execution trace to a subset of statements that are relevant to a specific criterion, such as the value of a particular variable at a given program point.[71] Introduced by Mark Weiser in 1981, static program slicing operates on the source code without executing it, producing a slice that includes all statements that may potentially affect the criterion across all possible executions.[72] In contrast, dynamic program slicing, proposed by Bogdan Korel and Janusz Laski in 1988, focuses on a specific execution trace for given inputs, yielding a slice that captures only the statements that actually influence the criterion during that run.[73] These approaches leverage program dependencies to isolate relevant computations, aiding in tasks like fault localization by eliminating irrelevant code.[71]

The core algorithms for computing slices rely on dependence graphs that model control and data flows in the program. A key representation is the program dependence graph (PDG), developed by Ferrante, Ottenstein, and Warren in 1987, which combines data dependences (how values flow between statements) and control dependences (how execution paths influence statement reachability) into a single directed graph.[74] In backward slicing, the most common variant for debugging, traversal starts from the slicing criterion and proceeds upstream through the dependence graph to include all predecessor statements that could impact it.[71] Forward slicing, conversely, begins at a starting point and collects all downstream effects, useful for impact analysis.[71] These graph-based methods ensure the slice preserves the semantics relevant to the criterion while removing extraneous parts.[74]

A slicing criterion typically specifies a variable v and a program point (e.g., line l), aiming to extract "all statements affecting v at l".[71] For static slicing, algorithms like those using reaching definitions and live variables from data-flow analysis compute the slice by propagating dependencies across the entire control-flow graph.[71] In dynamic slicing, computation involves an execution history or trace, where statements are filtered based on actual data flows during runtime; for instance, only executed paths contributing to the variable's value are retained, often using dynamic dependence graphs or post-execution marking of relevant instructions.[73] This results in an executable subset that reproduces the criterion's behavior for the specific input.[71]

Program slicing finds primary applications in debugging, where slices help isolate faults by focusing on code paths leading to erroneous values, as originally envisioned by Weiser.[72] In software testing, dynamic slices reduce test suites by selecting only relevant execution traces, improving efficiency without sacrificing coverage of critical behaviors.[71] Additionally, slices enhance program comprehension by abstracting complex codebases to manageable views, facilitating maintenance and understanding of how changes propagate.[71]

Example
Consider a simple program in pseudocode:

1: input x
2: y = x + 1
3: if x > 0 then
4:   z = y * 2
5: else
6:   z = y
7: endif
8: output z

A static backward slice for criterion \langle z, 8 \rangle includes lines 1–8, as all paths may affect z.[71] For input x = 1, a dynamic slice might reduce to lines 1, 2, 3, 4, 7, 8, excluding the else branch.[73]
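The slice above can be computed mechanically by backward reachability over a dependence graph. The following Python sketch hand-encodes the data and control dependences of the pseudocode (the purely structural lines 5 and 7 are omitted from the graph, so the computed slice lists only the executable statements); restricting the edges to those exercised by a particular run would yield the corresponding dynamic slice.

# A small sketch of backward slicing over a hand-built dependence graph for
# the pseudocode above: edges point from a statement to the statements it
# depends on (data or control), and the slice is the set of lines backward-
# reachable from the criterion.

deps = {            # line -> lines it is data- or control-dependent on
    1: set(),
    2: {1},         # y = x + 1   uses the input x
    3: {1},         # if x > 0    uses the input x
    4: {2, 3},      # z = y * 2   uses y, control-dependent on line 3
    6: {2, 3},      # z = y       uses y, control-dependent on line 3
    8: {4, 6},      # output z    uses z defined at line 4 or line 6
}

def backward_slice(criterion):
    slice_set, worklist = set(), [criterion]
    while worklist:
        line = worklist.pop()
        if line not in slice_set:
            slice_set.add(line)
            worklist.extend(deps.get(line, ()))
    return sorted(slice_set)

print(backward_slice(8))    # [1, 2, 3, 4, 6, 8]: executable statements of the static slice for <z, 8>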