Interval graph

An interval graph is an undirected graph in which each vertex corresponds to a closed interval on the real line, and an edge connects two vertices if and only if their intervals intersect.^[1] This representation captures the intersection structure of the intervals, making interval graphs a fundamental class in intersection graph theory.^[1] Interval graphs were introduced by Hajós in 1957 and independently applied by Benzer in 1959 to represent genetic structures as linear arrangements of mutant gene segments, with formal characterizations provided by Fulkerson and Gross in 1965.^[1]^[2] A key property is that interval graphs are chordal graphs—meaning every cycle of length greater than three has a chord—and they lack asteroidal triples, which are sets of three vertices such that for every pair in the set, there is a path between them avoiding the neighborhood of the third vertex, as characterized by Lekkerkerker and Boland in 1962.^[3] They also admit a clique-vertex incidence matrix with the consecutive ones property, where the ones in each column form a consecutive block.^[1] Interval graphs can be recognized in linear time using algorithms based on PQ-trees, originally developed by Booth and Lueker in 1976, which test for the consecutive ones property and construct an interval representation if it exists.^[4] These graphs arise in applications such as genetic mapping, where intervals represent mutant gene segments; job scheduling, modeling non-overlapping tasks; and seriation problems in archaeology and psychology for ordering events.^[1]^[3] As perfect graphs, interval graphs exhibit efficient solvability for optimization problems like coloring and independent set finding, aligning with their role in algorithmic graph theory.^[3]

Fundamentals

Definition

An interval graph is an undirected graph that arises as the intersection graph of a collection of intervals on the real line. In this model, each vertex of the graph corresponds to a distinct interval, and an edge exists between two vertices if and only if their associated intervals have a nonempty intersection. This representation captures scheduling problems, temporal relationships, and other linear ordering scenarios in combinatorial optimization and computer science.^[1] Formally, given an undirected graph G = (V, E), it is an interval graph if there is a bijection from the vertices V to a set of closed intervals \{I_v = [l_v, r_v] \mid v \in V\} on the real line, where l_v \leq r_v for each v, such that \{u, v\} \in E if and only if I_u \cap I_v \neq \emptyset, or equivalently, \max(l_u, l_v) \leq \min(r_u, r_v). This interval representation uniquely defines the adjacency structure based on overlap, assuming familiarity with standard graph theory notions like vertices and edges.^[5] Examples of interval graphs include complete graphs, where all vertices can be represented by intervals sharing a common interior point, ensuring full pairwise overlap, and path graphs P_n (the graph consisting of n vertices connected in a linear chain), which admit representations via consecutively overlapping intervals along the line.^[5] Interval graphs form a subclass of chordal graphs.^[1]

Basic Properties

Interval graphs possess several fundamental structural properties that distinguish them as a well-behaved subclass of graphs. Notably, they are chordal graphs, meaning they contain no induced cycles of length four or greater.^[6] This chordality follows directly from their interval representation, where any potential induced cycle would require non-overlapping intervals in a manner inconsistent with the intersection model.^[6] As a subclass of chordal graphs, interval graphs are perfect, satisfying the condition that for every induced subgraph H, the chromatic number \chi(H) equals the clique number \omega(H).^[7] In particular, for an interval graph G, \chi(G) = \omega(G), where \omega(G) corresponds to the size of the largest clique, which in the interval model is the maximum number of intervals sharing a common point (and thus pairwise overlapping).^[8] This equality enables efficient coloring algorithms, as the graph can be colored with \omega(G) colors. Interval graphs are also the incomparability graphs of interval orders, implying that their complements are comparability graphs of interval orders—partial orders defined by non-overlapping intervals where one precedes another if its endpoint is before the other's start.^[9] Furthermore, interval graphs are asteroidal triple-free, containing no three vertices such that between any two there exists a path avoiding the neighborhood of the third.^[6] The complements of interval graphs, known as co-interval graphs, inherit perfection from the original graphs, as the class of perfect graphs is closed under taking complements.^[7]

Characterizations

Interval Model Characterizations

Interval graphs possess a characterization in terms of the arrangement of their maximal cliques. The maximal cliques can be linearly ordered such that, for every vertex, the subsequence of cliques containing that vertex forms a consecutive block in the ordering. This ordering corresponds to the left-to-right progression of intervals in an interval representation, where each maximal clique arises from the set of intervals containing a common point.^[8] This clique arrangement is equivalent to the vertex-maximal clique incidence matrix exhibiting the consecutive ones property: after permuting the columns (cliques), the 1's in each row (vertex) appear consecutively. Fulkerson and Gross proved that a graph is an interval graph if and only if the rows of its maximal clique incidence matrix can be permuted to satisfy the consecutive ones property.^[8] A variant of the Lekkerkerker–Boland theorem leverages this property, recognizing interval graphs among chordal graphs via the consecutive arrangement of maximal cliques in their clique-vertex matrix.^[10] To illustrate, consider an interval graph with vertices v_1, v_2, v_3, v_4 and maximal cliques K_1 = \{v_1, v_2\}, K_2 = \{v_2, v_3\}, K_3 = \{v_3, v_4\}. The vertex-clique incidence matrix is:

\begin{array}{c|ccc} & K_1 & K_2 & K_3 \\ \hline v_1 & 1 & 0 & 0 \\ v_2 & 1 & 1 & 0 \\ v_3 & 0 & 1 & 1 \\ v_4 & 0 & 0 & 1 \\ \end{array}

The 1's in each row are consecutive, confirming the property.^[8] Interval graphs also admit an order-theoretic characterization as the comparability graphs of interval orders. An interval order is a partial order on a set X that admits a representation by intervals I_x for each x \in X such that x < y if and only if I_x ends before I_y begins. Fishburn and Mirkin independently characterized interval orders as those partial orders containing no induced subposet isomorphic to $2+2, the poset on four elements a, b, c, d with relations a \prec b and c \prec d, where \{a, b\} is incomparable to \{c, d\}. The comparability graph of such an order—where vertices are elements and edges connect comparable pairs—yields an interval graph, and every interval graph arises this way.^[11] This bridges geometric interval models with poset representations, highlighting the structural equivalence.

Forbidden Subgraph Characterization

A graph is an interval graph if and only if it is chordal and asteroidal triple-free, by the Lekkerkerker–Boland theorem.^[12] Chordal graphs are those without induced cycles of length four or more, so any such cycle C_n for n \geq 4 is a forbidden induced subgraph for interval graphs.^[3] An asteroidal triple is an independent set of three vertices such that, for every pair among them, there exists a path connecting the pair that contains no vertex from the neighborhood of the third.^[13] The absence of asteroidal triples, combined with chordality, yields a forbidden induced subgraph characterization: interval graphs exclude all induced C_n (n \geq 4) and all chordal graphs containing an asteroidal triple.^[14] The minimal such chordal graphs with asteroidal triples form the core obstructions beyond cycles; Lekkerkerker and Boland enumerated five infinite families of these minimal forbidden induced subgraphs, often labeled G_I through G_V.^[12] These include configurations like fork and claw variants, umbrella, tent, and sun graphs such as S_3 (rising sun) and S_4.^[15] Key examples among these minimal forbidden subgraphs illustrate the asteroidal triple obstructions. The bipartite claw (or fork) consists of a path of length three (four vertices) with an additional pendant vertex attached to the second vertex of the path, forming an independent set of three leaves connected pairwise by paths avoiding the neighborhood of the third.^[16] The umbrella features a triangle with two pendant edges attached to one vertex and a third pendant path of length two from another vertex of the triangle, creating an asteroidal triple among the endpoints.^[16] The rising sun (S_3) is a wheel-like graph on six vertices: a central triangle with three additional vertices, each adjacent to exactly two consecutive vertices of the triangle, inducing an asteroidal triple among the outer vertices.^[14] Higher suns like S_4 and tents (bipyramids over cycles) extend this pattern, with similar asteroidal configurations in larger chordal structures.^[15] These examples, along with the full G_I to G_V families, ensure that any interval graph avoids such induced obstructions.^[12]

Algorithms

Recognition

The recognition of interval graphs involves determining whether a given undirected graph belongs to the class of interval graphs, which can be done efficiently using linear-time algorithms. The first such algorithm, developed by Booth and Lueker in 1976, achieves this in O(n + m) time, where n is the number of vertices and m is the number of edges, by testing the consecutive ones property of the graph's clique-vertex incidence matrix using a data structure called PQ-trees. This property holds if and only if the graph is an interval graph, as the maximal cliques can then be ordered linearly such that the cliques containing each vertex form a consecutive block.^[17] The Booth-Lueker algorithm proceeds by first verifying that the graph is chordal, which can be done in linear time using a lexicographic breadth-first search (Lex-BFS) variant to obtain a perfect elimination ordering, and then constructing a universal PQ-tree representing all possible linear arrangements of the maximal cliques. Vertices are processed in descending degree Lex-BFS order, and for each vertex, the REDUCE operation on the PQ-tree enforces consecutiveness of the cliques incident to it through a series of template-matching reductions (SCAN, SPLIT, etc.), ensuring no asteroidal triples exist implicitly via the structure. If the PQ-tree remains valid after all vertices, the graph is an interval graph; otherwise, it is not. This approach avoids explicit enumeration of forbidden induced subgraphs, such as the asteroidal triples or specific chordless cycles, by leveraging the tree's representational power. A more simplified linear-time algorithm was later proposed by Hsu and Ma in 1999, running in O(n + m) time and directly checking the Lekkerkerker–Boland conditions, which characterize interval graphs as chordal graphs without asteroidal triples.^[18] The algorithm begins with substitution decomposition to identify prime components in linear time, then applies a Lex-BFS ordering to produce a simplicial elimination scheme that confirms chordality via parenthesized ordering checks, where each vertex's neighbors earlier in the order form a clique.^[18] To verify the absence of asteroidal triples, it partitions the maximal cliques into sets based on the ordering (starting from the clique containing the last vertex) and iteratively refines these partitions using neighborhood inclusion tests: for each vertex, cliques are split to ensure those containing it are consecutive, with no path avoiding neighborhoods between non-adjacent vertices in the triple.^[18] Both algorithms exploit Lex-BFS to generate orderings that facilitate efficient verification of structural properties, with the Hsu-Ma method avoiding the complexity of PQ-trees by relying on direct partitioning and modular decomposition for broader applicability to related classes like chordal comparability graphs.^[18] These O(n + m) complexities make recognition practical for large graphs, and implementations are available in libraries such as NetworkX (using Lex-BFS-based checks) and igraph, enabling efficient computation in software applications.

Interval Representation Construction

Once an interval graph has been recognized, constructing an explicit interval representation involves assigning to each vertex an interval [l_v, r_v] on the real line such that two vertices are adjacent if and only if their intervals intersect. A key algorithm for this construction, running in linear time, relies on lexicographic breadth-first search (Lex-BFS) to derive a vertex ordering that facilitates the identification of a clique path—a linear arrangement of the maximal cliques where the cliques containing each vertex form a consecutive subpath.^[19] This approach leverages the consecutive-ones property of the clique-vertex incidence matrix to ensure the representation is valid. The algorithm proceeds as follows: First, perform a Lex-BFS on the graph to obtain an ordering σ of the vertices, which corresponds to a valid interval ordering where adjacent vertices have overlapping intervals in the representation. Using this ordering, compute the maximal cliques and arrange them into a clique path C_1, C_2, ..., C_k, where each C_i is a maximal clique and the cliques containing any vertex v appear consecutively. For each vertex v, identify the subpath of cliques it belongs to, say from position i_v to j_v in the path. Assign the left endpoint l_v to the position i_v (e.g., l_v = i_v) and the right endpoint r_v to j_v (e.g., r_v = j_v + ε for small ε > 0 to ensure proper intersection modeling). This yields an interval representation where intersections precisely match the adjacencies. To minimize the total length of intervals if desired (e.g., for compact representations), adjust the endpoints by scaling or shifting while preserving intersections, achievable in linear time via post-processing on the clique path.^[19] For example, consider a simple interval graph with vertices a, b, c and edges a-b, b-c (a path graph). The maximal cliques are {a,b} and {b,c}, forming the path C1={a,b}, C2={b,c}. Assign l_a=1, r_a=2; l_b=1, r_b=3; l_c=2, r_c=3. The intervals [1,2], [1,3], [2,3] intersect precisely when the vertices are adjacent.^[19] In the special case of unit interval graphs (where all intervals have equal length), the construction can be performed in linear time using PQ-tree methods to represent all possible orderings of the maximal cliques consistent with the consecutive-ones property, ensuring no proper containment of intervals. Unit intervals [p_v, p_v + 1] are assigned where p_v is derived from the leftmost position in a valid clique ordering produced by the PQ-tree.^[17]

Proper Interval Graphs

Proper interval graphs form a subclass of interval graphs that admit an interval representation in which no interval properly contains another.^[20] This containment-free condition distinguishes them from general interval graphs, where intervals may nest.^[21] Proper interval graphs are equivalent to unit interval graphs, in which all intervals have equal length, and to indifference graphs, which arise from assigning vertices real numbers such that edges connect vertices differing by at most one.^[20]^[22] They also coincide with graphs whose clique-vertex incidence matrix exhibits the consecutive ones property in both rows and columns, ensuring no containment in the associated interval model.^[3] A key characterization is that proper interval graphs are precisely the claw-free interval graphs, meaning they contain no induced subgraph isomorphic to K_{1,3} (the claw).^[20]^[21] As interval graphs, proper interval graphs possess a clique tree that is a path, a structure that qualifies as a caterpillar tree.^[3] They have at most n maximal cliques, where n is the number of vertices, due to the linear ordering of cliques in their path representation.^[3] Examples include paths, which admit proper interval representations with non-overlapping endpoints ordered sequentially, and certain caterpillar trees that satisfy the claw-free condition.^[20]

Unit Interval Graphs

A unit interval graph is an interval graph that admits an interval representation on the real line in which every interval has identical length, conventionally taken to be 1. This uniform length distinguishes unit interval graphs from general interval graphs, where intervals may vary in length, and enables interpretations in terms of fixed-distance adjacencies, such as points on a line connected if within unit distance.^[20] The class of unit interval graphs coincides precisely with the class of proper interval graphs, in which there exists an interval representation with no interval properly containing another. This equivalence was established by F. S. Roberts in 1969 as part of his study of indifference relations in graph theory. The proof involves constructing a unit-length representation from a proper one by appropriately scaling and shifting endpoints to equalize lengths while preserving intersections and non-intersections. Conversely, any unit interval representation can be adjusted to eliminate proper containments without altering the intersection graph. This identity allows results from one model to transfer to the other, though the unit model often aids in applications requiring metric uniformity.^[23]^[20] Unit interval graphs admit several equivalent characterizations. They are precisely the indifference graphs, where vertices can be assigned real values f(v) such that two vertices u and v are adjacent if and only if |f(u) - f(v)| \leq 1; the corresponding unit interval for v is then [f(v) - 0.5, f(v) + 0.5]. This model highlights their role in representing indifference relations in social choice theory. Additionally, unit interval graphs are the claw-free interval graphs, i.e., interval graphs containing no induced subgraph isomorphic to K_{1,3} (a star with three leaves). Roberts provided this forbidden subgraph characterization, which facilitates recognition algorithms by combining interval graph testing with claw-freeness checks.^[24]^[23]^[25]

Applications

Scheduling Problems

Interval graphs provide a natural framework for modeling scheduling problems involving time intervals, such as job assignments on machines where each job occupies a fixed time slot and cannot overlap with others on the same resource. In this representation, each vertex corresponds to a job interval, and an edge connects two vertices if their intervals overlap, indicating a conflict that requires separate resource allocation. The problem of assigning jobs to the minimum number of identical machines without overlaps reduces to finding the chromatic number χ(G) of the interval graph G, which determines the minimum resources needed.^[26] A fundamental property enabling efficient solutions is that interval graphs are perfect, meaning χ(G) equals the clique number ω(G), the size of the largest clique, which directly corresponds to the maximum number of intervals overlapping at any single point in time. This equality guarantees that the optimal number of machines matches the peak resource demand, allowing for schedules with minimal makespan without wasted capacity. For instance, in multiprocessor scheduling, coloring the graph partitions the jobs into χ(G) independent sets, each assignable to a single processor. The greedy coloring algorithm achieves this optimal partitioning by sorting the intervals by increasing left endpoints and assigning to each interval the smallest color not used by its earlier neighbors, resulting in exactly ω(G) colors and running in O(n log n) time due to sorting. This approach provides an exact solution and serves as a 1-approximation in broader contexts where the graph structure approximates interval conflicts. For more complex variants, such as those with additional constraints like processing costs, dynamic programming along the clique path—a linear arrangement of maximal cliques in interval graphs—enables exact optimization by building solutions sequentially across the path. In job shop scheduling, interval graphs model scenarios where tasks have predefined time windows and precedence constraints are captured by directed edges or additional graph layers, with coloring resolving resource conflicts among overlapping tasks on shared machines. Applications include crew rostering, where shifts are intervals, and the graph ensures no overlapping assignments to the same crew member.^[26]^[27] The application of interval graphs to scheduling traces back to operations research in the mid-20th century, with foundational work in the 1950s and 1960s, including Ford and Fulkerson's use of network flows for basic interval selection problems and Fulkerson and Gross's formalization of interval graphs in 1965, which facilitated conflict modeling in resource allocation.^[26]^[1]

Bioinformatics

In genome mapping, interval graphs model the relative positions of DNA fragments, such as contigs or genes along chromosomes, where vertices represent intervals corresponding to these fragments and edges indicate overlaps between them, facilitating the reconstruction of the overall genomic structure.^[28] This approach originated from early studies on bacterial gene structures and has been applied to physical mapping problems, where partial overlap information is used to assemble contigs into larger scaffolds.^[29] For instance, algorithms based on interval graph representations process sets of contigs with known pairwise overlaps to determine their linear ordering, enabling efficient assembly even with incomplete data.^[30] In protein structure prediction, interval graphs represent secondary structure elements, such as alpha-helices and beta-sheets, as intervals along the protein sequence, with edges capturing spatial interactions or contacts between these elements to infer overall folding patterns.^[31] This modeling aids in analyzing coevolution networks of protein fragments, where the graph structure highlights functional relationships derived from sequence alignments and structural data.^[32] For sequence alignment in shotgun sequencing, overlap graphs of DNA fragments are often formulated as interval graphs, where each read is an interval and overlaps define edges, supporting the assembly of contigs by identifying consistent linear arrangements.^[33] This representation simplifies the layout phase by leveraging the chordal properties of interval graphs for rapid overlap resolution. Tools employing the overlap-layout-consensus paradigm, such as Phrap, utilize overlap graphs that align with interval graph structures to cluster reads based on sequence similarities, accelerating contig formation through efficient recognition of overlapping fragments.^[34] Post-2010 developments have integrated interval graphs with next-generation sequencing data for enhanced overlap modeling in assembly pipelines, improving contig accuracy by incorporating graph coarsening techniques to handle large-scale read overlaps.^[35] The linear-time recognition algorithms for interval graphs further benefit the processing of these expansive biological datasets.

Advanced Topics

Interval Completions

An interval completion of a graph G is obtained by adding a set of edges to G such that the resulting supergraph is an interval graph, and the minimum interval completion problem seeks the smallest such set.^[36] This edge-addition problem is NP-complete in general, even when restricted to input graphs that are paths. Interval graphs form a proper subclass of chordal graphs, so an interval completion is a special case of chordal completion that additionally ensures the supergraph satisfies the linear ordering of maximal cliques required for interval graphs; the problem is solvable in polynomial time for certain subclasses of chordal graphs, such as primitive starlike graphs.^[37] Algorithms for minimum interval completion often leverage the structure of potential maximal cliques and clique trees via dynamic programming. One such approach enumerates candidate maximal cliques and aligns them into a path-like tree decomposition, achieving fixed-parameter tractability with runtime O(k^{2k} n^3 m) where k is the number of added edges and m is the number of edges in G.^[38] An exact algorithm for a minimal (though not necessarily minimum) interval completion runs in O(n^2) time using matrix-based methods to identify and fill necessary edges while preserving interval properties.^[36] In database theory, interval completion arises in the context of consecutive ones completion for binary matrices, where minimal entry flips (corresponding to edge additions in the associated intersection graph) are performed to achieve the consecutive ones property, enabling efficient storage and query optimization for relational data representations.^[39] For example, consider a graph that is chordal but contains an asteroidal triple—a forbidden induced structure for interval graphs—where paths connecting the triple's vertices avoid each other's neighborhoods; adding edges along these paths fills the triple, yielding an interval supergraph.^[40] Interval graphs are precisely the chordal graphs without asteroidal triples, so completions target the elimination of such structures alongside cycle fillings.^[13]

Pathwidth Relations

Pathwidth, denoted pw(G), is defined as the minimum width over all path decompositions of a graph G, where the width of a path decomposition is the size of the largest bag minus one. A path decomposition consists of a path graph whose nodes are bags (subsets of vertices) satisfying specific coverage conditions for vertices and edges. For interval graphs, pathwidth coincides with treewidth due to their inherent linear structure, allowing an optimal tree decomposition to be arranged along a path. A fundamental result establishes that for any interval graph G, pw(G) = ω(G) - 1, where ω(G) is the size of the maximum clique in G. This equality holds because the maximum clique in an interval graph corresponds to the maximum number of intervals overlapping at any point, directly determining the minimum bag size required in a path decomposition. Since ω(G) can be computed in O(n log n) time via a sweep line algorithm on the interval representation—sorting the 2n endpoints and tracking overlaps—this yields an efficient method to determine pw(G). Interval graphs with maximum clique size at most k+1 have pathwidth at most k, highlighting their role as the canonical examples of bounded pathwidth graphs. To construct an explicit path decomposition of width ω(G) - 1, one sorts the endpoints of the intervals in the representation and processes them in order: at each left endpoint, add the interval to the current bag; at each right endpoint, remove it after ensuring edge coverage. This process generates O(n) bags, each of size at most ω(G), and can be performed in O(n log n) time dominated by the initial sorting. This tight connection between pathwidth and interval structure has broader implications, particularly in modeling bounded pathwidth problems in VLSI design, where interval graphs represent linear layouts of components and wires, enabling efficient dynamic programming for routing and placement optimization.^[41]

Enumeration

The number of labeled interval graphs on n vertices, denoted i_n, can be enumerated through generating functions derived from arrangements of maximal cliques, which form a linear path in interval graphs. Hanlon (1982) established a recursive relation for the ordinary generating function I(x) = \sum i_n x^n and the exponential generating function L(y) = \sum i_n y^n / n!, enabling computation of exact values such as i_3 = 8 and i_{10} = 90{,}729{,}050{,}427.^[42] Asymptotic analysis reveals that \frac{1}{3} n \log n + O(n) \leq \log i_n \leq n \log n + O(n), implying i_n \approx 2^{\Theta(n \log n)}. Pippenger (2017) proved that the radius of convergence of I(x) is zero, resolving a question raised by Hanlon, while the exponential generating function has radius at least $1/2. These bounds stem from lower estimates via threshold graphs and upper estimates via double factorials.^[43] For unlabeled interval graphs, the sequence appears in OEIS A005975, with Hanlon computing values up to n=10, including 67{,}659 for n=10. The enumeration leverages the consecutive ones property of the clique-vertex incidence matrix, where rows correspond to vertices and columns to maximal cliques arranged linearly.^[44]^[42] Enumeration of proper interval graphs (equivalently, unit interval graphs) is structurally simpler, as their maximal cliques admit representations without nested intervals, relating to Catalan numbers C_n in the context of path-like arrangements and recursive decompositions into connected components. The number of unlabeled unit interval graphs on n vertices is given by OEIS A005217, with values 1, 2, 4, 9, 21, 55 for n=1 to 6; Hanlon computed 4,502 for n=10.^[45]^[42] Algorithms for generating interval graphs recursively exploit the forbidden induced subgraph characterization (chordal graphs without asteroidal triples) or test the consecutive ones property on adjacency matrices to ensure linear clique ordering. Bender et al. (2020) developed an algorithm enumerating all nonisomorphic interval graphs on n vertices with polynomial time delay, achieving O(n^4) per graph. For proper interval graphs, dedicated recursive methods based on unit interval representations enable efficient listing and random generation of connected instances.^[46]^[47]