Fact-checked by Grok 2 weeks ago

Connected component

In , a connected component of an undirected is a maximal connected , consisting of a set of such that every pair of in the set is connected by a , and no additional outside the set can be added while maintaining this property. The connected components of a form a of its set, as the relation of being connected by a is an on the . Each connected component is itself a connected , and if the original has multiple components, it is disconnected. In , a connected component of a is a maximal connected (or ). Connected components are fundamental in analyzing graph structure, providing insights into the modularity and isolation of subnetworks within larger systems. They have wide applications in fields such as bioinformatics for identifying clusters in molecular networks, scientific computing for solving systems of equations, and for quantifying network fragmentation. In , determining connected components is a basic problem solvable in linear time relative to the number of vertices and edges. To compute connected components, graph traversal algorithms like (DFS) or (BFS) are commonly used: starting from an unvisited , the algorithm explores all reachable vertices to identify one component, then repeats for remaining unvisited vertices until all are covered. These methods efficiently partition the graph and are foundational in algorithms for more complex connectivity problems, such as finding strongly connected components in directed graphs.

Graph Theory

Definition

In , a connected component of an undirected is a maximal connected , consisting of a set of vertices such that every pair of vertices in the set is connected by a , and no additional vertex outside the set can be added while maintaining this property.) The connected components of a form a of its set, as the of being connected by a is an on the vertices: reflexive (a vertex connects to itself via a trivial path), symmetric (paths reverse direction in undirected graphs), and transitive (paths concatenate). The equivalence classes under this relation are the connected components.

Properties

In a connected , there is exactly one connected component, comprising all vertices and edges; conversely, a disconnected possesses multiple connected components, each forming a maximal connected . The number of connected components in a is an invariant under , meaning that isomorphic graphs share the same number of components since isomorphisms preserve paths and thus relations between vertices. Connected components form a of the set of a , as they are the equivalence classes under the of being connected by a : this is reflexive (every reaches itself via a trivial ), symmetric (if a exists from u to v, one exists from v to u), and transitive (paths can be concatenated). To sketch the proof, note that the components are disjoint by maximality—any overlapping vertices would merge into a single connected —and exhaustive, as every belongs to at least one component containing itself. The sizes of connected components vary, with the smallest typically consisting of isolated vertices (single-vertex components with no edges) and the largest potentially dominating the graph; in Erdős–Rényi random graphs G(n, p) where the average degree \lambda = np > 1, a emerges containing a positive \Theta(n) of vertices, while all other components remain sublinear in size. Articulation points (or cut vertices) are vertices whose removal, along with incident edges, increases the number of connected components, thereby disconnecting previously linked parts of the graph; similarly, bridges (or cut edges) are edges whose removal alone increases the component count by separating the graph into distinct subgraphs. In disconnected graphs, the is conventionally defined as infinite, since no finite path length bounds the distances between vertices in different components, contrasting with the finite of connected graphs where paths exist between all pairs.

Computation

The primary algorithms for computing connected components in an undirected are based on techniques, such as (DFS) and (BFS), which systematically explore the to identify reachable vertices from each starting point. These methods achieve a of O(V + E), where V is the number of vertices and E is the number of edges, as they visit each vertex and edge exactly once. In the DFS approach, the algorithm begins at an unvisited , marks it as visited, and recursively traverses all its unvisited neighbors, effectively labeling all vertices in the current component before to start a new component from another unvisited . This process builds a depth-first where each tree corresponds to a connected component. for DFS-based component labeling is as follows:
procedure DFS(v):
    mark v as visited
    assign component label to v
    for each neighbor u of v:
        if u is not visited:
            DFS(u)

main:
    initialize visited array to false
    component_id = 0
    for each [vertex](/page/Vertex) v in [graph](/page/Graph):
        if v is not visited:
            component_id += 1
            DFS(v)  // all vertices reached get component_id
This implementation requires an auxiliary visited array of size O(V) for tracking exploration. The BFS algorithm operates similarly but uses a to explore level by level, starting from an unvisited , marking it visited, and enqueueing its unvisited neighbors until the is empty, then repeating for the next unvisited . Like DFS, BFS has O(V + E) and O(V) for the and visited array, making it suitable for graphs where shortest paths within components are of interest, though not required for mere component identification. An alternative data structure for computing connected components is the Union-Find (also known as Disjoint Set Union or DSU), which initializes each as its own set and iteratively unions sets for adjacent vertices, allowing queries to find the root representative of each component. With optimizations like path compression (flattening paths during finds) and union by rank (merging smaller trees into larger ones to bound tree height), the amortized time per operation approaches O(\alpha(n)), where \alpha(n) is the slow-growing inverse , nearly constant for practical sizes, while the overall time for building components is O(E \alpha(V)). Space usage remains O(V) for parent and rank arrays. For directed graphs, connected components are typically computed as weakly connected components by treating the graph as undirected—ignoring edge directions—and applying the above algorithms to the underlying undirected graph.

Topology

Definition

A is a set X together with a collection \tau of subsets of X, called open sets, that includes the and X itself, and is closed under arbitrary unions and finite intersections. In a X, a A \subseteq X is connected if it cannot be expressed as the of two disjoint nonempty open subsets (relative to the on A). The connected component of a point x \in X is the of all connected s of X that contain x; it is the largest connected containing x. The relation \sim on X defined by x \sim y if there exists a connected subspace of X containing both x and y is an equivalence relation, and the equivalence classes under \sim are precisely the connected components of X. The path-connected component of a point x \in X is the set of all points y \in X such that there exists a continuous path \gamma: [0,1] \to X with \gamma(0) = x and \gamma(1) = y; it provides a stronger notion of connectivity than the connected component. Each connected component of a X is closed in X; moreover, if X is locally path-connected, then each connected component is also open in X, and coincides with the path-connected component of its points. In , connected components analogously partition vertices into maximal subgraphs where every pair is linked by a path, mirroring the discrete case of topological components.

Examples

In the standard topology on the real line \mathbb{R}, the space is connected, so its unique connected component is the entire space \mathbb{R} itself. This follows from the fact that \mathbb{R} cannot be expressed as the union of two nonempty disjoint open sets, as any such attempt would violate the for continuous functions on intervals. The S^1 = \{(x,y) \in \mathbb{R}^2 \mid x^2 + y^2 = 1\}, equipped with the induced from \mathbb{R}^2, is a connected space whose sole connected component is S^1. Moreover, S^1 is -connected, meaning any two points can be joined by a continuous path within the space, which strengthens the connectedness property. The topologist's sine curve provides a non-trivial example of a connected space that is not path-connected. Specifically, consider the subspace T \subset \mathbb{R}^2 defined as T = \{(x, \sin(1/x)) \mid 0 < x \leq 1\} \cup \{0\} \times [-1,1]. This space T is connected, so its connected component is the whole T, but it has two path components: the oscillating curve and the vertical line segment. The connectedness arises because the vertical segment consists of limit points of the curve, preventing separation into disjoint open sets. The rational numbers \mathbb{Q}, as a subspace of \mathbb{R} with the standard topology, form a . Each connected component of \mathbb{Q} is a \{q\} for q \in \mathbb{Q}, since for any distinct rationals p < q, the sets \mathbb{Q} \cap (-\infty, (p+q)/2) and \mathbb{Q} \cap ((p+q)/2, \infty) are disjoint nonempty open sets in the that separate p and q. Thus, \mathbb{Q} has countably many connected components despite being dense in \mathbb{R}. In any topological space equipped with the topology, where every subset is open, the connected components are precisely the . This holds because the only connected subsets are single points or the , as any larger subset can be separated by its singleton open sets. spaces serve as the prototypical totally disconnected spaces with this property. spaces can alter connected components depending on the identification. For instance, the line with doubled origin is obtained as the quotient of \mathbb{R} \sqcup \mathbb{R} by identifying (x,0) \sim (x,1) for all x \neq 0, resulting in two distinct origin points o_0 and o_1. This space is connected, with the entire quotient as its single connected component, even though it is not Hausdorff (neighborhoods of o_0 and o_1 cannot be separated). The connectedness persists because paths avoiding the origins connect points through the identified parts, and the origins are limits of such paths.

Applications

In Computer Science

In , connected component labeling is a fundamental technique in image processing for identifying and assigning unique labels to distinct regions of foreground pixels in images, where pixels are considered connected if they share an edge or corner based on 4- or 8-connectivity. This process typically employs algorithms such as (BFS) or (DFS) to explore and label each component, or the Union-Find to efficiently merge equivalent labels during a two-pass scan. The library implements this via its connectedComponents function, which outputs a labeled image and supports analysis of component statistics like area and , enabling applications in and segmentation. In practice, these algorithms achieve O(N) for an image with N pixels, making them efficient for large-scale processing. A representative example is the algorithm, which operates as a simple DFS-based method to label connected components by recursively filling adjacent pixels of the same color or value from a starting seed point, often used in paint programs or basic tools. This approach highlights the conceptual role of components in isolating bounded regions, such as filling enclosed areas in a digital . In , connected components serve to identify communities within friendship or interaction s, where each component represents a maximal set of users linked by paths of relationships, providing insights into isolated groups or network fragmentation. For instance, in undirected s modeling mutual connections, computing components reveals subgroups where or can propagate internally but not beyond boundaries. For solving, connected components model reachable areas in a grid-based , with walls as absent edges; the component containing the starting delineates all accessible paths, allowing algorithms to verify solvability or extract routes to the without exhaustive search. In systems, connected components facilitate resolving merge conflicts by analyzing dependency graphs of code changes, where strongly connected components in constraint graphs guide structured three-way merges to preserve semantic consistency across branches. This decomposes complex merges into independent subproblems, reducing errors in systems like during concurrent development.

In Other Fields

In , the zeroth homology group H_0(X) of a X is the generated by the path-connected components of X, providing a precise algebraic invariant that detects the number of such components. This connection arises from the chain complex in , where cycles in dimension zero correspond to points, and boundaries are trivial, yielding H_0(X) \cong \bigoplus_{\pi_0(X)} \mathbb{Z}, with the equal to the number of components. The concept of connected components traces its origins to the late 19th century, particularly in Henri Poincaré's foundational work Analysis Situs (1895), where he developed early tools of to classify manifolds and distinguish connected structures, laying the groundwork for modern notions of in continuous spaces. In physics, connected components play a central role in , which models the emergence of large-scale in disordered systems, such as or in random lattices. Introduced by Broadbent and Hammersley in , the theory describes a phase transition at a critical probability p_c, below which clusters remain finite, but above which a giant connected component spanning the system abruptly forms, characterizing phenomena like boiling or epidemic spread. In , connected components in regulatory networks (GRNs)—directed graphs where nodes are and edges represent regulatory interactions—identify functional modules as strongly connected components (SCCs), subgroups where mutually influence each other, enabling coordinated responses like developmental processes or stress . In , connected components appear in the topological analysis of syntax trees, which represent sentence structures as hierarchical graphs; applied to these trees reveals merging patterns of connected components across , distinguishing syntactic in languages from different families, such as Indo-European versus Uralic. This approach aids sentence parsing by quantifying how subtrees connect during incremental processing, with disjoint components indicating modular structures in ambiguous sentences. In , connected components in networks—modeled as graphs with countries as nodes and trade volumes as weighted edges—delineate economic blocs as densely interconnected communities, reflecting like the EU or ASEAN, where intra-bloc dominates and influences global value chains. Multiresolution analysis of these networks from 2010–2019 data shows hierarchical components emerging at different scales.

References

  1. [1]
    [PDF] Connectivity - Computer Science - JMU
    Page 6. Connected Components. Definition: A connected component of a graph G is a connected subgraph of G that is not a proper subgraph of another connected ...Missing: theory | Show results with:theory
  2. [2]
    [PDF] Lecture 2: Connected components 1 Equivalence relations
    2 Connected components. There's two ways to think of connected components of a graph: as sets of vertices, and as subgraphs. We don't know what subgraphs are ...
  3. [3]
    [PDF] Lecture 16 1 Overview 2 Connectivity - Duke Computer Science
    Mar 18, 2019 · A connected component of a graph is a subgraph consisting of a vertex u and all vertices connected to u, along with any edges incident on these ...
  4. [4]
    [PDF] Estimating the number of connected components in a graph via ...
    The paper estimates connected components in a graph using subgraph sampling, where a subset of vertices are sampled. Counting connected components is important ...
  5. [5]
    [PDF] A Linear-Algebraic Algorithm for Finding Connected Components in ...
    Finding all connected components in a graph is a well studied problem in graph theory with applications in bioinformatics [1] and scientific computing [2] ...
  6. [6]
    [PDF] 1 Connected components in undirected graphs 2 Connectivity in ...
    May 3, 2017 · In order to find a connected component of an undirected graph, we can just pick a vertex and start doing a search (BFS or DFS) from that vertex.
  7. [7]
    [PDF] CSE 421: Introduction to Algorithms - Washington
    Page 54. Properties of (undirected) DFS. Like BFS(s):. • DFS(s) visits x iff there is a path in G from s to x. So, we can use DFS to find connected components.
  8. [8]
    Topological Space -- from Wolfram MathWorld
    A topological space, also called an abstract topological space, is a set X together with a collection of open subsets T that satisfies the four conditions.
  9. [9]
    [PDF] Section 25. Components and Local Connectedness
    Jul 23, 2016 · Each connected component of a space X is closed. If X has only finitely many connected components, then each component of X is also open.
  10. [10]
    connected space in nLab
    ### Summary of Connected Space Content from nLab
  11. [11]
    Connected Component -- from Wolfram MathWorld
    See also. Connected Set, Pathwise-Connected, s-Cluster, Strongly Connected Component, Topological Space, Weakly Connected Component ... Topology · Point-Set ...
  12. [12]
    [PDF] CME 305: Discrete Mathematics and Algorithms Lecture 2 - Graph ...
    Jan 11, 2018 · The connected components of an undirected graph are the maximal sets of vertices that are all connected to each other and the same goes for ...
  13. [13]
    [PDF] Notes for 24th Jan (Thursday) - IISc Math
    A property that is invariant under graph isomorphism is the number of connected components. Here is an application of the handshaking lemma. Lemma 2.1. If a ...
  14. [14]
    [PDF] Graphs 1 Graph Definitions - DSpace@MIT
    That is, the connected components of a graph are the equivalence classes of the connectedness relation. In particular, every vertex of a simple graph belongs to ...
  15. [15]
    [PDF] Erdös-Rényi Random Graphs: The Giant Component
    Sep 14, 2010 · In this lecture, we will see (mostly) why Erdös-Rényi random graphs have this property. The large component is called the “Giant Component”.
  16. [16]
    CMSC-27100 — Lecture 25: Graph Theory: Connectivity
    If removing v and the edges incident to it increases the number of connected components, the v is a cut vertex or articulation point. Likewise, if e∈E e ∈ E is ...
  17. [17]
    [PDF] Paths and connectivity - Stat@Duke
    • For a connected graph, the diameter can range from 1 to n − 1. • For an unconnected graph. • by convention the diameter is taken to be infinity;.
  18. [18]
    [PDF] 1 Connected components in undirected graphs - CS 161
    In order to find a connected component of an undirected graph, we can just pick a vertex and start doing a search (BFS or DFS) from that vertex. All the ...
  19. [19]
    Search for connected components in a graph - CP-Algorithms
    Jun 9, 2024 · We are required to find in it all the connected components, ie, several groups of vertices such that within a group each vertex can be reached from another.
  20. [20]
    Disjoint Set Union - Algorithms for Competitive Programming
    Oct 12, 2024 · As mentioned before, if we combine both optimizations - path compression with union by size / rank - we will reach nearly constant time queries.Path compression optimization · Union by size / rank · Time complexity
  21. [21]
  22. [22]
    [PDF] Algebraic Topology - Cornell Mathematics
    This book was written to be a readable introduction to algebraic topology with rather broad coverage of the subject. The viewpoint is quite classical in ...
  23. [23]
    [PDF] Papers on Topology - School of Mathematics
    Jul 31, 2009 · ... Poincaré before topology. In the introduction to his first major topology paper, the Analysis situs, Poincaré. (1895) announced his goal of ...
  24. [24]
    Percolation processes | Mathematical Proceedings of the ...
    The paper studies, in a general way, how the random properties of a 'medium' influence the percolation of a 'fluid' through it.
  25. [25]
    [PDF] Modularity in Gene Regulatory Networks
    Recently, we introduced strongly connected components of a directed graph as good candidates for modules. This structure-based definition of modularity implies ...
  26. [26]
    Identification of key player genes in gene regulatory networks - PMC
    Sep 6, 2016 · A component is called a strongly connected component (SCC) in a directed graph if each of its nodes is reachable via directed edges from every ...
  27. [27]
    [1903.05181] Topological Analysis of Syntactic Structures - arXiv
    Mar 12, 2019 · We then analyze the trees describing the merging structure of persistent connected components for languages in different language families ...
  28. [28]
    A multiresolution framework for the analysis of community structure ...
    Apr 7, 2023 · International trade networks are complex systems that consist of overlapping multiple trade blocs of varying sizes.