Directed acyclic graph
A directed acyclic graph (DAG) is a directed graph composed of a finite set of vertices and directed edges, with the defining property that it contains no directed cycles, meaning no path starts and ends at the same vertex following the edge directions.[1] This structure ensures a partial order among the vertices, distinguishing DAGs from general directed graphs that may include loops or cyclic paths.[2] Key properties of DAGs include the existence of a topological ordering, a linear arrangement of vertices such that for every directed edge from vertex u to v, u precedes v in the sequence; this ordering captures all dependencies without circularity.[1] Every nonempty finite DAG has at least one source vertex (indegree zero) and one sink vertex (outdegree zero), guaranteeing starting and ending points in any traversal.[3] Topological sorts can be computed efficiently using algorithms like depth-first search in linear time relative to the number of vertices and edges, enabling practical implementations.[1]

DAGs find extensive applications across disciplines due to their ability to model hierarchical and dependency-based relationships. In computer science, they represent precedence constraints, such as course prerequisites in academic scheduling or the order of operations in compiler dependency resolution.[1] In statistics and epidemiology, DAGs visualize causal structures, helping identify confounders, mediators, and biases to inform unbiased effect estimation in observational studies.[4] More recently, DAG-based architectures have enhanced blockchain systems by allowing parallel transaction processing, improving scalability and throughput compared to linear chain models.[5]

Definitions
Formal Definition
A directed graph, or digraph, is formally defined as an ordered pair G = (V, E), where V is a finite set of vertices (also called nodes) and E \subseteq V \times V is a set of ordered pairs called directed edges (or arcs), each representing a one-way connection from one vertex to another.[6] This structure allows edges to have direction, distinguishing it from undirected graphs where connections are bidirectional. A directed acyclic graph (DAG) is a directed graph with no self-loops that satisfies the acyclicity condition: it contains no directed cycles.[7] Formally, a directed cycle would be a sequence of distinct vertices v_1, v_2, \dots, v_k (with k \geq 2) such that (v_i, v_{i+1}) \in E for all i = 1, \dots, k-1 and (v_k, v_1) \in E; the absence of any such sequence ensures acyclicity. DAGs are typically denoted in the same manner as directed graphs, G = (V, E), with the acyclic property explicitly stated to emphasize the restriction.

For illustration, consider a simple DAG with vertex set V = \{A, B, C\} and edge set E = \{(A, B), (B, C), (A, C)\}. Here, edges point from A to B, B to C, and A to C, forming a transitive structure without any path that returns to a starting vertex, thus confirming acyclicity. The concept of directed acyclic graphs emerged within graph theory, with the term formalized in the literature around the 1960s. This acyclicity property implies the existence of a topological ordering of the vertices, a linear arrangement respecting edge directions.
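To make the formal definition concrete, the following Python sketch encodes the example DAG above as a vertex set and a set of ordered pairs and checks acyclicity by brute force, directly mirroring the "no path starts and ends at the same vertex" condition. The encoding and the function name has_directed_cycle are illustrative choices made here, not part of any cited formalism, and the check is only suitable for small examples.

def has_directed_cycle(vertices, edges):
    """Return True if some directed walk through distinct vertices returns to its start."""
    succ = {v: [w for (u, w) in edges if u == v] for v in vertices}

    def returns_to(start, current, visited):
        for w in succ[current]:
            if w == start:
                return True              # a path from start back to start: a directed cycle
            if w not in visited and returns_to(start, w, visited | {w}):
                return True
        return False

    return any(returns_to(v, v, {v}) for v in vertices)

V = {"A", "B", "C"}
E = {("A", "B"), ("B", "C"), ("A", "C")}
print(has_directed_cycle(V, E))                     # False: the example graph is a DAG
print(has_directed_cycle(V, E | {("C", "A")}))      # True: adding C -> A creates a cycle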
Basic Terminology
In directed acyclic graphs (DAGs), vertices with in-degree zero, meaning no incoming edges, are known as sources, as they represent starting points with no predecessors. Conversely, vertices with out-degree zero, having no outgoing edges, are called sinks, serving as endpoints without successors. These terms highlight the flow-like structure inherent in DAGs, where information or dependencies propagate unidirectionally from sources toward sinks.[8] Within the partial order framework of a DAG, a vertex u is an ancestor of another vertex v if there exists a directed path from u to v, establishing u as a predecessor in the ordering; symmetrically, v is a descendant of u. This ancestry relation captures the hierarchical dependencies encoded by the graph's edges, with reachability serving as the basis for defining the strict partial order among vertices.[9] DAGs thus model precedence relations, where the absence of cycles ensures a well-defined, irreflexive partial order via reachability.[10]

Simple examples illustrate these concepts intuitively. A linear chain, such as vertices A, B, and C connected by edges A \to B and B \to C, forms a DAG representing a total order, with A as the sole source and C as the sink; here, A is an ancestor of both B and C. A binary tree, in which each vertex has at most two outgoing edges to child vertices, exemplifies a DAG with multiple branches, such as a root connected to left and right subtrees, where ancestors are parents or higher nodes in the hierarchy.[11] The diamond graph, featuring vertices S, A, B, and T with edges S \to A, S \to B, A \to T, and B \to T, demonstrates path convergence, with S as source, T as sink, A and B as descendants of S, and both as ancestors of T.[12]

To distinguish DAGs visually from other graphs, consider text-based representations. A linear chain appears as:

A → B → C

Here, directions enforce order without loops. In contrast, a cyclic digraph like

A → B → C → A

contains a directed cycle, violating the acyclicity of DAGs.[11] Unlike undirected graphs, where edges lack direction (e.g., {A-B, B-C}), DAGs' oriented edges prevent bidirectional traversal and ensure the partial order's asymmetry.[13]
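The terminology can also be computed directly. The short Python sketch below (an illustrative encoding assumed here, not drawn from the cited sources) finds the sources, sinks, and descendant sets of the diamond graph described above.

edges = [("S", "A"), ("S", "B"), ("A", "T"), ("B", "T")]   # the diamond DAG
vertices = {v for e in edges for v in e}

in_deg = {v: 0 for v in vertices}
out_deg = {v: 0 for v in vertices}
succ = {v: [] for v in vertices}
for u, v in edges:
    out_deg[u] += 1
    in_deg[v] += 1
    succ[u].append(v)

sources = [v for v in vertices if in_deg[v] == 0]   # vertices with no incoming edges
sinks = [v for v in vertices if out_deg[v] == 0]    # vertices with no outgoing edges

def descendants(u):
    """Vertices reachable from u by a directed path; u is an ancestor of each of them."""
    seen, stack = set(), [u]
    while stack:
        for w in succ[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen

print(sources)            # ['S']
print(sinks)              # ['T']
print(descendants("S"))   # {'A', 'B', 'T'}: S is an ancestor of A, B, and T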
Properties
Reachability and Transitive Closure
In a directed acyclic graph (DAG), reachability between vertices is defined such that a vertex u reaches a vertex v if there exists a directed path from u to v. This relation captures the directional connectivity inherent in the graph's structure, excluding cycles due to the acyclicity property.[14] The transitive closure of a DAG G, denoted G^+, is the smallest transitive relation that contains the original edge relation of G; it includes a directed edge from u to v if and only if there is a directed path from u to v in G.[15] Equivalently, the transitive closure can be represented by a reachability matrix R, where R_{ij} = 1 if there is a path from vertex i to vertex j in G, and R_{ij} = 0 otherwise.[14]

In a DAG, the transitive closure defines a strict partial order on the vertices, as the absence of cycles ensures the relation is irreflexive and transitive, with antisymmetry following from acyclicity.[14] This partial order aligns with the reachability relation, where u < v if u reaches v.[16]

The transitive reduction of a DAG is the minimal subgraph that preserves the same reachability relation as the original graph, obtained by removing all redundant edges that are implied by longer paths.[15] For any finite acyclic directed graph G, a unique transitive reduction exists, consisting of edges that cannot be transitively derived from other paths.[15] For example, consider a simple chain DAG with vertices A, B, and C and edges A \to B and B \to C. The transitive closure adds the edge A \to C, reflecting the path from A to C via B, while the transitive reduction removes A \to C if it is present, retaining only the edges A \to B and B \to C and leaving reachability unchanged.[15]
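The chain example can be worked through in code. The following Python sketch (a simple illustration under assumed data structures, not an optimized algorithm) computes the reachability relation of the chain A \to B \to C, which is its transitive closure, and then discards every edge implied by another path, recovering the transitive reduction; the edge-by-edge test is valid here because a DAG's transitive reduction is unique.

vertices = ["A", "B", "C"]
chain = {("A", "B"), ("B", "C")}            # the direct edges of the chain DAG

def reachability(edge_set):
    """All pairs (u, v) with a directed path of length >= 1 from u to v."""
    succ = {v: [w for (x, w) in edge_set if x == v] for v in vertices}
    pairs = set()
    for s in vertices:
        stack = list(succ[s])
        while stack:
            w = stack.pop()
            if (s, w) not in pairs:
                pairs.add((s, w))
                stack.extend(succ[w])
    return pairs

closure = reachability(chain)               # the transitive closure adds A -> C
reduction = {(u, v) for (u, v) in closure
             if (u, v) not in reachability(closure - {(u, v)})}

print(sorted(closure))     # [('A', 'B'), ('A', 'C'), ('B', 'C')]
print(sorted(reduction))   # [('A', 'B'), ('B', 'C')]: the redundant edge is removed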
Topological Ordering
A topological ordering of a directed acyclic graph (DAG) is a linear ordering of its vertices such that, for every directed edge uv, vertex u appears before vertex v in the ordering. This ordering respects all the directional constraints imposed by the edges, ensuring that predecessors precede successors without forming cycles.

Every finite DAG admits at least one topological ordering. The existence follows from an inductive argument on the number of vertices n. For the base case n = 0, the empty graph is trivially ordered. For n > 0, a DAG always has at least one vertex of indegree zero (a source, guaranteed by acyclicity). Remove this source vertex s and its outgoing edges to obtain a smaller DAG on n - 1 vertices, which by the inductive hypothesis has a topological ordering. Prepend s to this ordering to yield a valid topological ordering for the original graph, as no edges point to s and all its successors appear after it.

A DAG has a unique topological ordering if and only if it contains a Hamiltonian path—a directed path that visits every vertex exactly once. In this case, the path itself dictates the sole valid ordering, as each consecutive pair of vertices on the path is joined by an edge that enforces strict precedence. Conversely, if no Hamiltonian path exists, multiple sources or parallel branches allow for interchangeable positions in the ordering, yielding at least two distinct topological orderings.

The reflexive transitive closure of the reachability relation—where u ≤ v if u = v or there is a directed path from u to v—induces a partial order (poset) on the vertices. A topological ordering is then a linear extension of this poset: a total order that preserves the partial order, meaning if u ≤ v in the poset, then u precedes or equals v in the linear extension. This connection bridges graph theory and order theory, highlighting how DAGs model precedence constraints in partially ordered sets.

For example, consider a DAG with vertices A, B, C, D and edges A → B, A → C, B → D, C → D, forming a diamond shape. Valid topological orderings include A, B, C, D and A, C, B, D, as B and C are incomparable (no path between them) and can be swapped while respecting the edges from A and to D. This illustrates how incomparable elements in the poset permit multiple extensions. The number of distinct topological orderings of a DAG equals the number of linear extensions of its associated poset. Enumerating these extensions is a fundamental problem in enumerative combinatorics, with applications in counting valid sequences under partial constraints; exact computation is #P-complete in general, but tractable for special posets like trees or series-parallel structures.
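The inductive argument above translates directly into a backtracking enumeration: repeatedly choose any current source, recurse on the remaining graph, and undo the choice. The Python sketch below (function name and encoding chosen here for illustration) lists all topological orderings of the diamond example, confirming that it has exactly two linear extensions.

def all_topological_orders(vertices, edges):
    """Enumerate every topological ordering (linear extension) by repeatedly picking a source."""
    in_deg = {v: 0 for v in vertices}
    succ = {v: [] for v in vertices}
    for u, v in edges:
        in_deg[v] += 1
        succ[u].append(v)

    def extend(prefix, placed):
        if len(prefix) == len(vertices):
            yield list(prefix)
            return
        for v in vertices:
            if v not in placed and in_deg[v] == 0:   # v is currently a source
                placed.add(v)
                prefix.append(v)
                for w in succ[v]:
                    in_deg[w] -= 1                   # remove v's outgoing edges
                yield from extend(prefix, placed)
                for w in succ[v]:
                    in_deg[w] += 1                   # undo the removal (backtrack)
                prefix.pop()
                placed.remove(v)

    yield from extend([], set())

diamond = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]
for order in all_topological_orders(["A", "B", "C", "D"], diamond):
    print(order)   # ['A', 'B', 'C', 'D'] then ['A', 'C', 'B', 'D']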
Enumeration and Counting
The enumeration of directed acyclic graphs (DAGs) on labeled vertices is a well-studied problem in combinatorial graph theory. The number of such DAGs with n labeled vertices is given by the sequence OEIS A003024, which begins 1, 1, 3, 25, 543, 29281 for n = 0 to 5.[17] This count was first systematically enumerated by Robinson using recursive decompositions based on sources and extensions of smaller acyclic digraphs. Asymptotically, the number grows as a(n) \sim n! \cdot 2^{n(n-1)/2} / (M p^n), where p \approx 1.488 and M \approx 0.574, reflecting the dominant contribution from orientations that are roughly half-complete in the dense regime.[17]

Enumerating unlabeled DAGs is more challenging due to the need to account for graph isomorphisms. The sequence for the number of acyclic digraphs with n unlabeled vertices is OEIS A003087, starting 1, 1, 2, 6, 31, 302 for n = 0 to 5.[18] Robinson addressed this by applying Burnside's lemma to count orbits under the action of the symmetric group on potential edge sets, combined with inclusion-exclusion over cycle structures.[19] This approach yields exact counts but lacks a simple closed form, and asymptotics remain less precise than for the labeled case, with growth driven by the exponential increase in possible transitive relations.

For a fixed DAG, the number of topological orders corresponds to the number of linear extensions of the associated partial order (poset). Counting linear extensions is generally #P-complete, but for specific poset families—such as d-complete posets or those corresponding to Young diagrams—exact formulas exist via generalizations of the hook-length formula.[20] For instance, in a tree poset (a special case of a DAG poset), the hook-length product provides the count directly, analogous to the Frame-Robinson-Thrall formula for standard Young tableaux.

The number of acyclic orientations of an undirected graph G with n vertices equals |\chi_G(-1)|, where \chi_G is the chromatic polynomial of G.[21] This equivalence, established by Stanley, follows from a more general result that counts pairs of acyclic orientations and compatible colorings.[21] For a tree on n vertices, the chromatic polynomial \chi_T(x) = x(x-1)^{n-1} yields exactly 2^{n-1} acyclic orientations; since a tree contains no cycles at all, every one of the 2^{n-1} ways of orienting its n - 1 edges is acyclic.[21] More generally, the formula connects to the Tutte polynomial and has been used to derive counts for complete graphs and other families via Stirling numbers.[21]

Early enumerative work on DAGs dates to the late 1960s and early 1970s, with Robinson's 1973 enumeration of labeled cases building on prior recursive techniques and providing foundational recurrences still used today. Subsequent refinements, including unlabeled counts and asymptotic analyses, extended these methods through the 1970s and beyond.[19]
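The labeled counts above can be reproduced with a short computation. The Python sketch below implements a standard recurrence for OEIS A003024, an inclusion-exclusion over the possible sets of sources in the spirit of Robinson's source-based decomposition; the code itself is an illustration written for this article, not taken from the cited references.

from math import comb

def labeled_dag_counts(n_max):
    """First terms of OEIS A003024 via the source-set inclusion-exclusion recurrence:
    a(n) = sum_{k=1..n} (-1)^(k+1) * C(n, k) * 2^(k*(n-k)) * a(n-k), with a(0) = 1."""
    a = [1]
    for n in range(1, n_max + 1):
        a.append(sum((-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * a[n - k]
                     for k in range(1, n + 1)))
    return a

print(labeled_dag_counts(5))   # [1, 1, 3, 25, 543, 29281]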
Related Graph Families
Directed acyclic graphs (DAGs) arise as acyclic orientations of undirected graphs, where each edge is assigned a direction such that no directed cycles form.[22] Every undirected graph admits at least one such orientation: fix any linear ordering of the vertices and orient each edge from its earlier endpoint to its later endpoint, so that every edge points forward in the ordering and no directed cycle can arise. Finite DAGs are closely related to finite partially ordered sets (posets), with an isomorphism established through the transitive closure: the reachability relation in a DAG induces a partial order on its vertices, and conversely, the Hasse diagram of a poset is a DAG whose transitive closure recovers the poset. This correspondence highlights DAGs as a graphical representation of partial orders, preserving the antisymmetric and transitive structure.

In the context of tournaments—complete directed graphs on n vertices—an acyclic tournament corresponds precisely to a transitive tournament, where the edge directions respect a total order on the vertices, maximizing the number of edges in an acyclic orientation. Unlike cyclic directed graphs, which permit directed cycles that enable feedback loops and potentially infinite traversals, DAGs prohibit such cycles, ensuring finite paths and well-defined topological orders that prevent circular dependencies.[11] When leveled according to a topological order, a DAG can be viewed as a multilayer graph where each layer forms a bipartite structure between consecutive levels, with edges only directed forward across layers. Series-parallel DAGs represent a structured subclass, built recursively from single edges via series and parallel compositions, and characterized by the absence of N-shaped forbidden subgraphs in their structure.
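The ordering-based construction of an acyclic orientation can be stated in a few lines of code. The Python sketch below (a helper written for this article, not from the cited literature) orients each undirected edge from its earlier endpoint to its later endpoint under a chosen vertex ordering; because every resulting edge points forward, the output contains no directed cycle.

def acyclic_orientation(order, undirected_edges):
    """Orient each edge {u, v} from the endpoint appearing earlier in `order`
    to the one appearing later; all edges point forward, so no directed cycle exists."""
    position = {v: i for i, v in enumerate(order)}
    return [(u, v) if position[u] < position[v] else (v, u)
            for u, v in undirected_edges]

# An undirected 4-cycle, oriented according to the ordering A, B, C, D.
print(acyclic_orientation(["A", "B", "C", "D"],
                          [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]))
# [('A', 'B'), ('B', 'C'), ('C', 'D'), ('A', 'D')]: an acyclic orientation of the cycle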
Algorithms
Recognition and Topological Sorting
A directed acyclic graph (DAG) can be recognized by verifying the absence of cycles in the directed graph, which is a fundamental step in confirming its acyclicity. One standard method employs depth-first search (DFS) augmented with a three-color scheme to track node states: white for unvisited nodes, gray for nodes currently being explored, and black for nodes fully explored. During the DFS traversal starting from each unvisited node, if an adjacent node is encountered that is gray, a back edge is detected, indicating a cycle. This approach classifies edges as tree edges, forward edges, cross edges, or back edges, with back edges signaling cycles. The algorithm runs in O(|V| + |E|) time, where |V| is the number of vertices and |E| is the number of edges, making it efficient for sparse and dense graphs alike when using adjacency lists.

Topological sorting provides both a recognition mechanism and a linear ordering of vertices such that for every directed edge from vertex u to v, u precedes v in the ordering; this ordering exists if and only if the graph is acyclic. Kahn's algorithm, introduced in 1962, achieves this by first computing the in-degree (number of incoming edges) for each vertex in O(|V| + |E|) time. It initializes a queue with all vertices of in-degree zero (sources), then iteratively removes a source from the queue, adds it to the ordering, and decreases the in-degree of its neighbors; if a neighbor's in-degree reaches zero, it is added to the queue. The process continues until the queue is empty. If all vertices are ordered, the graph is a DAG and the result is a valid topological order; otherwise, any remaining vertices with positive in-degree indicate a cycle. This method is particularly useful for large networks due to its straightforward implementation and avoidance of recursion, though it requires an initial in-degree computation. The following pseudocode illustrates Kahn's algorithm:

function KahnTopologicalSort(G):
    in_degree = array of size |V|, initialized to 0
    for each edge (u, v) in G:
        in_degree[v] += 1
    queue = empty queue
    for each vertex v in G:
        if in_degree[v] == 0:
            queue.enqueue(v)
    order = empty list
    while queue is not empty:
        u = queue.dequeue()
        order.append(u)
        for each neighbor v of u:
            in_degree[v] -= 1
            if in_degree[v] == 0:
                queue.enqueue(v)
    if length(order) == |V|:
        return order    // Graph is a DAG
    else:
        return null     // Cycle detected

An alternative approach uses DFS to compute a topological order via reverse postorder traversal. The algorithm performs DFS on the graph, recording vertices in the order they finish (postorder). The reverse of this finishing-time order yields a topological sorting. During traversal, the same coloring scheme detects cycles: a gray neighbor indicates a back edge and thus a cycle, halting the process. Like Kahn's method, it runs in O(|V| + |E|) time and succeeds precisely when the graph is acyclic, providing a recursive implementation that naturally handles the ordering through the call stack. For large graphs, iterative DFS variants can mitigate stack overflow risks in deep recursions.

Both algorithms' correctness stems from the property that in a DAG, sources can always be ordered before their dependents, and DFS finishing times respect this dependency.
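For comparison with the pseudocode above, the following Python sketch (an illustrative implementation written for this article, not taken from a specific source) realizes the DFS-based alternative: it colors vertices white, gray, and black, reports a cycle when a gray neighbor is encountered, and otherwise returns the reverse postorder as a topological order.

def dfs_topological_sort(vertices, edges):
    """Return a topological order of the DAG, or None if a directed cycle exists."""
    succ = {v: [] for v in vertices}
    for u, v in edges:
        succ[u].append(v)
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {v: WHITE for v in vertices}
    postorder = []

    def visit(u):
        color[u] = GRAY                    # u is on the current DFS path
        for w in succ[u]:
            if color[w] == GRAY:           # back edge: the graph contains a cycle
                return False
            if color[w] == WHITE and not visit(w):
                return False
        color[u] = BLACK                   # all of u's descendants are finished
        postorder.append(u)
        return True

    for v in vertices:
        if color[v] == WHITE and not visit(v):
            return None                    # cycle detected
    return postorder[::-1]                 # reverse postorder is a topological order

edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]
print(dfs_topological_sort(["A", "B", "C", "D"], edges))   # e.g. ['A', 'C', 'B', 'D']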