Kruskal's algorithm
Kruskal's algorithm is a greedy algorithm that computes a minimum spanning tree (MST) for a connected, undirected graph with weighted edges by selecting a subset of edges that connects all vertices while minimizing the total edge weight.[1]
First described by mathematician Joseph B. Kruskal in his 1956 paper "On the Shortest Spanning Subtree of a Graph and the Traveling Salesman Problem," the algorithm provides an efficient solution to the MST problem, which has applications in network design, clustering, and approximation algorithms for problems like the traveling salesman problem.
The algorithm operates by sorting all edges in non-decreasing order of weight and initializing a forest where each vertex forms its own tree.[1] It then iterates through the sorted edges, adding an edge to the MST if it connects two distinct components (i.e., does not form a cycle), using a union-find (disjoint-set) data structure to track components and detect cycles efficiently.[2] This process continues until all vertices are connected, resulting in a tree with exactly V-1 edges for a graph with V vertices.[1]
Kruskal's algorithm relies on the cut property of MSTs, which states that the minimum-weight edge crossing any cut in the graph belongs to some MST, ensuring the greedy choice yields an optimal solution.[3] Its time complexity is O(E log E), where E is the number of edges, dominated by the sorting step; since E \leq V^2 for a simple graph with V vertices, this is equivalently O(E log V), and the union-find operations contribute nearly constant time per operation with path compression and union by rank.[1] Compared to Prim's algorithm, Kruskal's is particularly effective for graphs with fewer edges relative to vertices, as it processes edges globally rather than growing from a starting vertex.[4]
Background
Minimum spanning tree
In an undirected, connected graph with n vertices, a spanning tree is a subgraph that forms a tree and includes all the original vertices, thereby connecting them without any cycles.[5] Such a spanning tree consists of exactly n-1 edges.[6]
A minimum spanning tree (MST) of a connected, edge-weighted undirected graph is a spanning tree whose total edge weight is minimized among all possible spanning trees.[1] Every such graph has at least one MST, and any MST likewise contains exactly n-1 edges.[6] The MST problem was first addressed by Otakar Borůvka in 1926 for practical network applications, with the modern algorithmic formulations developed in the 1950s by Joseph Kruskal and Robert Prim, both building on Borůvka's earlier work.[7]
Minimum spanning trees are essential in optimization problems, particularly for designing efficient networks with minimal cost, such as connecting cities via roads or laying telecommunication cables while minimizing total length or expense.[8] For instance, in transportation network design, an MST determines the lowest-cost infrastructure to link all locations without redundant paths.[9]
Union–find data structure
The union–find data structure, also known as the disjoint-set data structure, maintains a collection of disjoint subsets that partition an underlying set of elements, enabling efficient tracking of connectivity among elements. It supports two fundamental operations: find, which returns the representative (typically the root) of the subset containing a specified element, and union, which merges two subsets into a single one. This structure models the components of a graph during processes like building a spanning tree, where elements represent vertices and subsets represent connected components.
The find operation traverses a tree from the given element to its root, and path compression enhances efficiency by updating the parent pointers of all nodes along this path to point directly to the root, effectively flattening the tree for subsequent operations. The union operation connects the roots of two trees; to keep trees shallow, union by rank attaches the root of the tree with the smaller rank to the one with the larger rank, where rank serves as an approximation of tree height (initially 0 for single-node trees, and incremented only when ranks are equal). Union by size is an alternative that merges the smaller tree under the root of the larger, tracking actual subtree sizes instead of ranks. These heuristics ensure balanced tree growth.[10]
A basic implementation uses an array parent of size n (for n elements), where parent[i] initially equals i (each element starts in its own singleton set), and nodes point to their parents to form forest trees. An auxiliary rank array (for union by rank) or size array (for union by size) stores balancing information, updated during merges. To perform find(x), iteratively follow parent pointers until reaching a self-pointing root, optionally applying path compression by setting the parent of each node along the path to the root. For union(x, y), first find the roots r_x and r_y; if they are distinct, link the one with smaller rank (or size) to the other, incrementing the surviving root's rank if the ranks were equal.[10]
With path compression combined with union by rank or size, the amortized time complexity for a sequence of m operations on n elements is O(m \alpha(n)), where \alpha(n) is the inverse Ackermann function, a very slowly growing function that remains below 5 for all feasible n (effectively constant time per operation). Without these optimizations, operations can degenerate to O(n) in the worst case due to skewed trees.[10]
In the context of minimum spanning tree computation, the union–find structure facilitates cycle detection: before adding an edge between vertices u and v, the find operation checks if u and v share the same root; if they do, the edge connects vertices already in the same component and would form a cycle, so it is skipped; otherwise, union merges the components. This ensures the selected edges form a tree without cycles.
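The operations described above can be sketched in Python as follows. This is a minimal illustrative implementation (the class and method names are not taken from any particular library), combining path compression with union by rank; union returns False when its arguments are already in the same component, which is exactly the cycle check Kruskal's algorithm needs:

```python
class UnionFind:
    """Disjoint-set forest with path compression and union by rank."""

    def __init__(self, n):
        self.parent = list(range(n))  # each element starts as its own root
        self.rank = [0] * n           # rank approximates tree height

    def find(self, x):
        # Locate the root, then compress: point every node on the path at it.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False  # same component: adding this edge would form a cycle
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx   # ensure rx is the higher-rank root
        self.parent[ry] = rx  # attach smaller-rank tree under larger-rank root
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        return True
```

For example, after `uf = UnionFind(4); uf.union(0, 1)`, a second call `uf.union(1, 0)` returns False, signalling that the edge would close a cycle.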
Algorithm
Steps
Kruskal's algorithm constructs a minimum spanning tree (MST) for an undirected, weighted graph by greedily selecting the smallest edges that do not form cycles. The procedure assumes the graph is connected; if disconnected, it produces a minimum spanning forest consisting of MSTs for each connected component.[11]
The first step is to sort all edges of the graph in non-decreasing order of their weights. This ordering ensures that the algorithm considers potential MST edges starting from the lightest.[11]
Next, initialize a union–find data structure where each vertex begins in its own separate component, representing n singleton sets for a graph with n vertices. This structure efficiently tracks connected components during edge additions.[11][12]
Then, iterate through the sorted list of edges. For each edge, check if its endpoints belong to different components using the union–find structure. If they do, add the edge to the MST and merge (union) the two components; if they are already in the same component, skip the edge to avoid creating a cycle.[11][12]
Continue this process until exactly n-1 edges have been added to the MST or all edges have been processed. At this point, the resulting structure is the MST (or forest if the graph is disconnected).[11][12]
The algorithm handles edge cases such as graphs with multiple edges of equal weight correctly, though it may produce one of several possible MSTs in such scenarios, as the choice among equal-weight edges is arbitrary.[11]
Pseudocode
Kruskal's algorithm is presented in pseudocode form below, utilizing a union-find data structure (also known as disjoint-set union or DSU) to efficiently manage connected components and detect cycles. This implementation follows the standard procedure introduced by Kruskal, adapted for modern computational representation.
The input graph G = (V, E) is undirected and connected, with V denoting the set of vertices and E the set of edges, where each edge is represented as a triple (u, v, w) with endpoints u, v \in V and weight w. The output is a minimum spanning tree T, a subset of |V| - 1 edges from E that connects all vertices without cycles and minimizes the total edge weight.
KRUSKAL(G, w)
1   A ← ∅
2   for each vertex u ∈ G.V
3       MAKE-SET(u)
4   sort the edges of G.E into nondecreasing order by weight w
5   for each edge (u, v) ∈ G.E, taken in sorted order
6       if FIND-SET(u) ≠ FIND-SET(v)
7           A ← A ∪ {(u, v)}
8           UNION(u, v)
9   return A
In this pseudocode, lines 1–3 initialize an empty set A to store the edges of the minimum spanning tree and create a separate set for each vertex using the union-find operations MAKE-SET, FIND-SET, and UNION, which track connected components. Line 4 sorts the edges by increasing weight, ensuring lighter edges are considered first to greedily build the tree. The loop in lines 5–8 iterates over the sorted edges; for each edge, it checks if adding it would form a cycle by verifying if its endpoints are in different components via FIND-SET—if not, the edge is added to A and the components are merged with UNION.
This pseudocode assumes the graph is provided as an explicit edge list for direct sorting; if the input is an adjacency list representation, the edges must first be collected into a list before sorting.[13]
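Assuming vertices are numbered 0 through n-1 and the graph is given as a list of (u, v, w) triples, the pseudocode translates to Python roughly as follows. This is a sketch rather than a library implementation: find uses path halving for compression, and UNION is realized by naively linking one root under the other.

```python
def kruskal(n, edges):
    """Return the MST of a connected graph as a list of (u, v, w) edges."""
    parent = list(range(n))          # MAKE-SET for each vertex

    def find(x):                     # FIND-SET with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []                         # A ← ∅
    for u, v, w in sorted(edges, key=lambda e: e[2]):  # nondecreasing weight
        ru, rv = find(u), find(v)
        if ru != rv:                 # endpoints lie in different components
            mst.append((u, v, w))    # A ← A ∪ {(u, v)}
            parent[ru] = rv          # UNION the two components
    return mst
```

On the four-vertex example graph used later in this article (vertices A, B, C, D mapped to 0–3), `kruskal(4, [(0, 1, 1), (0, 3, 2), (1, 2, 3), (0, 2, 4), (2, 3, 5)])` returns three edges with total weight 6.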
Illustration
Example
Consider a simple undirected graph with four vertices labeled A, B, C, and D, and five weighted edges: A–B with weight 1, A–D with weight 2, B–C with weight 3, A–C with weight 4, and C–D with weight 5. This graph can be visualized as a nearly complete network in which A connects to each of the other three vertices, B additionally connects to C, and C additionally connects to D, forming a connected structure suitable for demonstrating minimum spanning tree construction.
To apply Kruskal's algorithm, first sort the edges in non-decreasing order of weight: A–B (1), A–D (2), B–C (3), A–C (4), C–D (5). Initialize the union–find data structure with each vertex in its own component: {A}, {B}, {C}, {D}.
Process the edges sequentially:
- Add A–B (weight 1): Vertices A and B are in different components, so union them. New components: {A, B}, {C}, {D}.
- Add A–D (weight 2): A and D are in different components, so union them. New components: {A, B, D}, {C}.
- Add B–C (weight 3): B and C are in different components, so union them. New components: {A, B, C, D}.
- Skip A–C (weight 4): A and C are already in the same component, forming a cycle.
- Skip C–D (weight 5): C and D are already in the same component, forming a cycle.
The resulting minimum spanning tree consists of the edges A–B, A–D, and B–C, with a total weight of 6. This MST connects all vertices without cycles and can be depicted as a tree branching from A to B and D, with B further connected to C.
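The trace above can be replayed mechanically. The sketch below uses string-labeled vertices and an unoptimized dictionary-based union–find (no compression or balancing, for brevity), processing the edges in the already-sorted order from the example:

```python
# Each vertex starts as the root of its own singleton component.
parent = {v: v for v in "ABCD"}

def find(x):
    while parent[x] != x:  # walk up to the component's root
        x = parent[x]
    return x

mst, total = [], 0
# Edges already sorted in non-decreasing order of weight.
for u, v, w in [("A", "B", 1), ("A", "D", 2), ("B", "C", 3),
                ("A", "C", 4), ("C", "D", 5)]:
    ru, rv = find(u), find(v)
    if ru != rv:           # different components: take the edge
        parent[ru] = rv    # merge the two components
        mst.append((u, v))
        total += w
    # otherwise the edge would close a cycle and is skipped

# mst == [("A", "B"), ("A", "D"), ("B", "C")], total == 6
```

The accepted edges and the skipped edges match the step-by-step trace exactly, and the final weight is 6.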
Analysis
Time complexity
Kruskal's algorithm has an overall time complexity of O(E \log E), where E is the number of edges in the graph, or equivalently O(E \log V), since E \leq V^2 and hence \log E = O(\log V) for a graph with V vertices.[14]
The dominant operation is sorting the edges by weight, which takes O(E \log E) time using a comparison-based sorting algorithm such as merge sort or heap sort.[15] The subsequent processing involves E iterations, each performing a find operation to check for cycles and a potential union operation to merge components using a union–find data structure. With union by rank and path compression, the amortized time for E such operations is O(E \alpha(V)), where \alpha is the inverse Ackermann function, which grows so slowly that it is effectively constant for all practical graph sizes.[16] Thus, the union–find operations contribute negligibly to the total time, leaving sorting as the bottleneck.
For sparse graphs where E = O(V), the complexity simplifies to O(V \log V), as the sorting cost dominates and \log E \approx \log V. The best-case and worst-case complexities are the same asymptotically for connected graphs, as the algorithm always sorts all edges and processes each one. If edge weights are integers within a small range (up to W), a counting sort can reduce the sorting time to O(E + W), yielding a pseudopolynomial complexity, though the standard analysis assumes comparison-based sorting for arbitrary weights.[17]
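When weights are small non-negative integers bounded by W, the counting-sort idea mentioned above is straightforward to sketch: place each edge in a bucket indexed by its weight, then concatenate the buckets in order, for O(E + W) total time. The function name and edge representation here are illustrative assumptions, matching the (u, v, w) triples used earlier:

```python
def counting_sort_edges(edges, max_weight):
    """Sort (u, v, w) edges by weight in O(E + W) for integer w in [0, max_weight]."""
    buckets = [[] for _ in range(max_weight + 1)]  # one bucket per weight value
    for u, v, w in edges:
        buckets[w].append((u, v, w))
    # Concatenating buckets from weight 0 upward yields non-decreasing order.
    return [e for bucket in buckets for e in bucket]
```

Substituting this for the comparison sort in Kruskal's algorithm removes the O(E \log E) bottleneck whenever W = O(E).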
Space complexity
Kruskal's algorithm requires O(V + E) space overall, where V denotes the number of vertices and E the number of edges.[18]
This arises from the need to store the edge list in O(E) space, the union–find data structure using O(V) space for parent and rank arrays, and the output minimum spanning tree requiring O(V) space to hold its edges.[18][19]
The sorting of edges typically demands auxiliary space of O(E), though in-place sorting methods can avoid this additional allocation in some implementations.[18]
Relative to the input representation, the space usage remains linear in the graph's size, rendering the algorithm memory-efficient for large-scale graphs.[19]
Correctness
Spanning tree property
Kruskal's algorithm constructs a spanning tree for connected undirected graphs and a spanning forest for disconnected ones, ensuring the output connects all vertices without cycles.
The acyclicity of the output follows from the algorithm's use of a union–find data structure to add only edges that connect distinct components, preventing the formation of cycles. At every step, an edge is included in the growing structure only if its endpoints belong to different connected components in the current forest; otherwise, it is discarded. This guarantees that no cycle is introduced, as adding an edge within the same component would close a loop.[3]
A key property is that the selected edges form a forest at every iteration of the algorithm. This can be established by induction on the number of edges added. For the base case with zero edges, the structure consists of V isolated vertices, which is a forest with V components. Assume that after adding i edges, the current set S forms a forest with V - i components. The next edge e added connects two distinct components in this forest (verified by the find operation in union–find), resulting in a new set with V - i - 1 components and no cycles, preserving the forest property.[20]
Regarding connectivity, the algorithm begins with V components and reduces the number by exactly one with each added edge, as each union merges two components. For a connected graph, it continues adding edges until V-1 edges are selected or a single component remains, at which point no further connecting edges exist among the remaining candidates; since the graph is connected, exactly V-1 edges are added, spanning all vertices. In a disconnected graph with c components, the process yields a minimum spanning forest consisting of c trees, with a total of V - c edges.[21]
Minimality
The minimality of the spanning tree produced by Kruskal's algorithm follows from two fundamental properties of minimum spanning trees (MSTs): the cut property and the cycle property. The cut property states that, given any cut of the graph's vertices into two nonempty subsets S and V \setminus S, the minimum-weight edge crossing the cut belongs to some MST of the graph.[1] To see why, suppose T is an MST and e is the lightest crossing edge for the cut, but e \notin T. Adding e to T creates a cycle, which must contain another crossing edge f. Replacing f with e yields a spanning tree of weight at most that of T (since w(e) \leq w(f)), contradicting the minimality of T unless w(e) = w(f), in which case T \cup \{e\} \setminus \{f\} is also an MST containing e.[22]
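The weight change in this exchange step can be written out explicitly:

```latex
w\bigl(T \cup \{e\} \setminus \{f\}\bigr) \;=\; w(T) + w(e) - w(f) \;\le\; w(T),
\qquad \text{since } w(e) \le w(f),
```

so the tree obtained by swapping f for e is never heavier than T.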
Kruskal's algorithm satisfies the cut property at each step, ensuring the selected edges form part of some MST. The edges are considered in non-decreasing order of weight. When an edge e = (u, v) is added, u and v lie in different connected components of the current forest (say, components A and B), defining the cut (A, B). Any lighter edge must have been considered earlier; if such an edge crossed (A, B), it would have connected A and B already, contradicting the separation. Thus, e is the lightest edge crossing the cut, so by the cut property, e belongs to some MST extending the current forest. By induction on the number of edges added, the final tree is an MST.[1][23]
An alternative proof uses contradiction to show the total weight is minimal. Suppose T is the tree produced by Kruskal's algorithm and T' is another spanning tree with strictly smaller total weight. Order the edges by non-decreasing weight and let e be the lowest-weight edge in the symmetric difference T \Delta T' (i.e., the first point of divergence). Since all lighter edges are common to both trees, the endpoints of e lie in different components of the partial forest when e is added to T. In T', however, these endpoints are connected without e, so the unique path in T' between them consists of edges of weight at least w(e). Replacing the heaviest edge f on that path with e yields a spanning tree of weight at most that of T'. Repeating this exchange for each edge of the symmetric difference transforms T' into T without ever increasing the total weight, contradicting the assumption that T' weighs strictly less than T.[24][25]
The cycle property complements this by ensuring no suboptimal edges are included: for any cycle in the graph, the maximum-weight edge on the cycle cannot belong to any MST, as removing it and adding a lighter non-cycle edge would reduce the total weight. Kruskal's algorithm inherently respects this by rejecting edges that would form cycles, always selecting the lightest available edge that avoids them, thus maintaining minimality while building a tree structure.[1]
If edge weights are not distinct, multiple MSTs may exist with the same minimum total weight. Kruskal's algorithm still produces one such MST, as ties in weight do not violate the greedy choice property; the specific order among equal-weight edges may vary, but the total weight remains minimal.[1]
Extensions
Parallel implementation
Parallelizing Kruskal's algorithm for concurrent execution encounters significant challenges, primarily due to the sequential nature of sorting all edges by weight, which serves as a major bottleneck, and the requirement for thread-safe operations on the union-find data structure to handle simultaneous edge additions without conflicts. In theoretical models like the Parallel Random Access Machine (PRAM), the sorting step can be addressed using parallel comparison-based sorting algorithms that achieve O(log E) time with O(E) processors. For the union-find component, concurrent implementations are essential; the seminal work by Shiloach and Vishkin introduced a parallel connected-components algorithm in the CRCW PRAM model, running in O(log V) time through techniques like pointer jumping to propagate finds efficiently.
To mitigate these issues and enhance parallelism, adaptations often integrate elements of Borůvka's algorithm, which naturally decomposes into independent phases where each connected component simultaneously identifies its minimum-weight outgoing edge, reducing reliance on global sorting. This hybrid approach allows for efficient parallel execution, as demonstrated in the work by Chung and Condon, who developed a parallel Borůvka-based minimum spanning tree algorithm for shared-memory architectures, achieving near-linear speedup on multiprocessors by distributing edge selections across processors. In the PRAM model, such variants attain O(log V) time complexity using E/V processors per phase, with the overall algorithm completing in O(log V) phases, providing a theoretical parallel speedup over the sequential O(E log V) bound.[26]
Practical implementations leverage these ideas in software libraries and distributed frameworks. The Parallel Boost Graph Library includes distributed-memory minimum spanning tree algorithms based on Dehne and Götz's BSP-model variants of Borůvka and Kruskal hybrids, which scale to thousands of processors for graphs with millions of edges by partitioning the graph and synchronizing supersteps.[27] In large-scale distributed environments, MapReduce adaptations of Kruskal's algorithm process edge sorting and union-find operations across clusters, as shown in implementations on Hadoop that handle billion-edge graphs with linear scalability in the number of reducers. For accelerator hardware, GPU-based versions, such as ECL-MST, unify Kruskal's edge processing with Borůvka phases using atomic operations for union-find, achieving up to 10x speedup over CPU baselines on dense graphs with billions of edges.[28]
Recent advancements as of 2025 include concurrent and distributed adaptations of Kruskal's algorithm for dynamic graphs, where edges are added or deleted over time. For instance, ContraMST employs incremental batch processing for efficient MST maintenance under updates, while other works propose online concurrent versions using pipeline distribution for streaming inputs.[29][30]
Comparison with Prim's algorithm
Prim's algorithm constructs a minimum spanning tree (MST) for a connected, undirected graph by starting from an arbitrary vertex and iteratively adding the minimum-weight edge that connects a vertex in the current tree to one outside it, maintaining a priority queue of candidate edges adjacent to the growing subtree.[31] This local, vertex-centric expansion contrasts with Kruskal's global strategy, which sorts all edges by weight and greedily adds the smallest that does not form a cycle, using a union-find structure to track connected components.[32] While Kruskal processes the entire edge set independently of vertex selection, Prim builds the tree incrementally from a seed vertex, focusing on the frontier of unexplored adjacent edges.[33]
In terms of data structures, Kruskal relies on union-find for efficient cycle detection and component merging, whereas Prim uses a priority queue to extract the minimum edge from the set of fringe edges, enabling dynamic updates as the tree expands.[32] Both achieve O(E log V) time complexity with optimized implementations—Kruskal via sorting and nearly constant-time union-find operations, and Prim via binary heap-based priority queue—but Prim can improve to O(E + V log V) with Fibonacci heaps, favoring it in dense graphs where the number of edges E approaches V².[33] Conversely, Kruskal excels in sparse graphs with fewer edges, as the initial sorting cost O(E log E) remains low relative to Prim's per-vertex operations.
Kruskal is simpler to apply to disconnected graphs, naturally yielding a minimum spanning forest by including all components without modification, while Prim requires separate invocations on each connected component to achieve the same result and assumes connectivity from a single starting vertex. Their correctness stems from shared foundational properties: the greedy choice property, ensuring safe edge additions, and the cut property, which guarantees that the lightest edge across any cut in the graph belongs to some MST, allowing both algorithms to build an optimal solution step by step.[34]
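For contrast with Kruskal's edge-centric strategy, the vertex-centric expansion described above can be sketched with a binary heap of fringe edges. The adjacency-list representation (`adj` mapping each vertex to (weight, neighbor) pairs) and the function name are illustrative assumptions:

```python
import heapq

def prim(adj, start):
    """Return the total MST weight of a connected graph, grown from `start`."""
    visited = {start}
    heap = list(adj[start])          # fringe: edges leaving the current tree
    heapq.heapify(heap)
    total = 0
    while heap and len(visited) < len(adj):
        w, v = heapq.heappop(heap)   # lightest edge on the fringe
        if v in visited:
            continue                 # both endpoints already in the tree: skip
        visited.add(v)               # grow the tree by one vertex
        total += w
        for edge in adj[v]:
            if edge[1] not in visited:
                heapq.heappush(heap, edge)
    return total
```

Run on the four-vertex example from the Illustration section, this grows the tree from A and arrives at the same total weight of 6 as Kruskal's algorithm, though it discovers the edges in a different order.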
Kruskal's algorithm was first described in 1956 by Joseph B. Kruskal in the Proceedings of the American Mathematical Society, with Prim's formulation following in 1957 by Robert C. Prim in the Bell System Technical Journal.[35][31] In contemporary contexts like streaming or distributed graphs, where edges arrive incrementally, Kruskal's edge-focused processing offers advantages for adaptation, such as handling unsorted streams with approximate sorting, highlighting trade-offs beyond classical sparse-versus-dense distinctions. Recent works (2024–2025) have extended this to incremental MST algorithms that efficiently handle graph updates using variants of Kruskal's greedy selection.[36]