Adjacency list
An adjacency list is a compact data structure for representing finite graphs, consisting of an array of lists in which the index corresponds to a vertex and each list contains the vertices adjacent to that vertex via edges.[1] This representation supports both directed and undirected graphs, with undirected edges typically stored in both endpoints' lists to reflect symmetry.[2]
Compared to the adjacency matrix, which uses a two-dimensional array to indicate edge presence between all pairs of vertices, the adjacency list offers superior space efficiency for sparse graphs where the number of edges |E| is much smaller than |V|^2, requiring only \Theta(|V| + |E|) storage space.[1] It facilitates efficient traversal of a vertex's neighbors in time proportional to that vertex's degree, making it ideal for algorithms like breadth-first search and depth-first search that iterate over adjacencies.[3] However, checking for the existence of a specific edge can take up to O(\min(|V|, |E|)) time due to potential list traversal.[3]
Adjacency lists can be extended to weighted graphs by storing pairs of (neighbor, weight) in each list, and they are commonly implemented using arrays of linked lists or dynamic arrays for flexibility in handling varying degrees.[4] This structure is foundational in computer science for modeling relationships in networks, such as social connections or web links, and is preferred in practice for its balance of simplicity and performance on real-world sparse datasets.[5]
Fundamentals
Definition
An adjacency list is a data structure used to represent a graph, consisting of an array (or list) of lists where each index in the outer array corresponds to a vertex in the graph, and the sublist at that index contains the vertices adjacent to it. This representation associates each vertex with a collection of its neighboring vertices, facilitating efficient storage and access to connectivity information.[6]
In undirected graphs, each edge between two vertices u and v is recorded in both endpoints' adjacency lists: v is listed in u's list, and u is listed in v's list. In directed graphs, however, edges are asymmetric; the adjacency list for a vertex u includes only the vertices v such that there is a directed edge from u to v, with no reciprocal entry unless a separate edge exists in the opposite direction.
Representation in Graphs
In graph theory, the adjacency list representation, which associates each vertex with a list of its adjacent vertices, can be adapted to accommodate various structural properties of graphs beyond simple unweighted undirected graphs.
For directed graphs, each adjacency list contains only the outgoing edges from the vertex, such that an edge from vertex u to vertex v appears solely in u's list; this contrasts with undirected graphs, where the same edge appears in both u's and v's lists to reflect bidirectionality. In weighted graphs, the lists store pairs consisting of the neighbor vertex and the corresponding edge weight, allowing efficient access to both connectivity and cost information. Unweighted edges are stored by listing the neighbor alone.
Multigraphs, which permit multiple edges between the same pair of vertices, are represented by including duplicate entries for the same neighbor in the relevant list(s), potentially with distinct weights for each instance. Self-loops, where an edge connects a vertex to itself, are handled by including the vertex itself as an entry in its own adjacency list, again optionally paired with a weight if applicable.[7]
To illustrate, consider a small directed graph with vertices labeled 1 through 4, featuring edges 1→2 (weight 3), 2→3, 2→4 (weight 5), a self-loop at 2 (weight 1), and a second, unweighted parallel edge 2→3. The adjacency list representation appears as follows:
1: [(2, 3)]
2: [3, (4, 5), (2, 1), 3] // Unweighted to 3, weighted to 4 and self-loop, duplicate to 3
3: []
4: []
This textual depiction highlights how the structure naturally extends to these variants while maintaining the core list-based organization.[7]
Implementation
Underlying Data Structures
The adjacency list representation of a graph G = (V, E) is commonly implemented using an array Adj of |V| lists, where the entry Adj[v] for a vertex v \in V stores the list of vertices adjacent to v.[8] This structure assumes vertices are labeled from 1 to |V| or 0 to |V|-1, enabling constant-time access to the adjacency list of any vertex via array indexing.[9]
In languages like C++, this array of lists is frequently realized as a vector of vectors, such as std::vector<std::vector<int>>, where each inner vector dynamically grows to accommodate the neighbors of a vertex. For Python, especially when dealing with sparse graphs or non-consecutive vertex labels, a dictionary of lists is preferred, with vertices as keys and lists of adjacent vertices as values, leveraging the hash table for efficient lookups.[10] In Java, hash-based structures like HashMap<Integer, ArrayList<Integer>> are often used for similar reasons, providing average O(1) access time to a vertex's adjacency list in sparse scenarios where direct array indexing is inefficient.[11]
To handle dynamic graphs where the number of vertices or edges changes, the lists for adjacent vertices can employ linked lists for efficient insertions and deletions without resizing overhead, or dynamic arrays (such as vectors in C++ or ArrayLists in Java) that automatically resize as needed.[12] This flexibility is crucial for graphs that evolve during algorithm execution, allowing the structure to adapt without rebuilding the entire representation.[13]
For a graph with n vertices, the in-memory representation typically allocates an array (or equivalent) of size n, with each slot pointing to a list that stores only the actual edges incident to that vertex, making it particularly efficient for sparse graphs where the number of edges m is much smaller than n^2.[8]
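As a concrete sketch of the dictionary-of-lists approach described above, the following Python fragment builds an undirected adjacency list keyed by arbitrary vertex labels (the vertex names and edges are illustrative):

```python
from collections import defaultdict

# Adjacency list as a dictionary of lists: suits sparse graphs or
# non-consecutive vertex labels, since keys are created on demand.
graph = defaultdict(list)

edges = [("a", "b"), ("a", "c"), ("b", "c")]  # example undirected edges
for u, v in edges:
    graph[u].append(v)  # record v as a neighbor of u
    graph[v].append(u)  # and u as a neighbor of v (undirected symmetry)

print(dict(graph))  # {'a': ['b', 'c'], 'b': ['a', 'c'], 'c': ['a', 'b']}
```

For dense, consecutively labeled graphs, a plain list of lists (`[[] for _ in range(n)]`) avoids the hashing overhead of the dictionary.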
Pseudocode Examples
The adjacency list for a graph with n vertices is initialized as an array of n empty lists, one for each vertex indexed from 0 to n-1. This structure allows efficient storage and access to neighboring vertices.
pseudocode
function InitializeAdjacencyList(n):
    adj ← array of n empty lists  // Indexed from 0 to n-1
    return adj
To construct an adjacency list from an edge list in an undirected graph, iterate through each edge (u, v) and append v to the list of u and u to the list of v, ensuring bidirectional representation without duplicates if edges are unique.[5]
pseudocode
function BuildAdjacencyListUndirected(edge_list, n):
    adj ← InitializeAdjacencyList(n)
    for each edge in edge_list:
        u, v ← edge
        append v to adj[u]
        append u to adj[v]
    return adj
For weighted graphs, the construction modifies the append operation to store tuples of the neighbor and its corresponding weight, typically as (v, w) where w is the edge weight from u to v; in undirected weighted graphs, the symmetric tuple is added to both directions.
pseudocode
function BuildAdjacencyListWeightedUndirected(edge_list, n):
    adj ← InitializeAdjacencyList(n)
    for each edge in edge_list:
        u, v, w ← edge
        append (v, w) to adj[u]
        append (u, w) to adj[v]
    return adj
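The weighted builder above translates directly to Python; the following is a minimal sketch assuming 0-indexed integer vertices (function names are illustrative):

```python
def initialize_adjacency_list(n):
    # One empty neighbor list per vertex, indexed 0..n-1.
    return [[] for _ in range(n)]

def build_weighted_undirected(edge_list, n):
    adj = initialize_adjacency_list(n)
    for u, v, w in edge_list:
        adj[u].append((v, w))  # store (neighbor, weight) pairs
        adj[v].append((u, w))  # symmetric entry for undirected edges
    return adj

adj = build_weighted_undirected([(0, 1, 3), (1, 2, 5)], 3)
# adj == [[(1, 3)], [(0, 3), (2, 5)], [(1, 5)]]
```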
Operations
Traversal and Search
Traversal of graphs represented by adjacency lists relies on systematically iterating over the neighbor lists associated with each vertex. These traversals enable exploration of the graph's structure without modifying it, facilitating tasks such as connectivity checks and path discovery. Two fundamental algorithms for this purpose are Depth-First Search (DFS) and Breadth-First Search (BFS), both of which leverage the adjacency list's efficient access to adjacent vertices.[14]
Depth-First Search (DFS) on an adjacency list representation proceeds by exploring as far as possible along each branch before backtracking. It can be implemented recursively or iteratively using a stack to mimic the recursion. In the recursive version, starting from a source vertex s, the algorithm marks s as visited and then recursively visits each unvisited neighbor in s's adjacency list. The iterative approach uses a stack to push vertices and their neighbors, processing them in last-in-first-out order. This iteration over neighbor lists ensures that each edge is considered a constant number of times, typically once per direction in directed graphs. For an undirected graph with n vertices and m edges, the total time is proportional to n + m, as the algorithm scans each adjacency list exactly once.[15][14]
Breadth-First Search (BFS), in contrast, explores the graph level by level, visiting all neighbors of a vertex before proceeding to the next layer. It employs a queue to manage the order of exploration: beginning with the source vertex enqueued and marked as visited, the algorithm dequeues a vertex, enqueues all its unvisited neighbors from its adjacency list, and repeats until the queue is empty. This queue-based iteration prioritizes shallower depths, making BFS suitable for finding shortest paths in unweighted graphs. Like DFS, the time complexity is O(n + m), achieved by traversing each adjacency list a constant number of times.[16][14]
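The queue-based BFS described above can be sketched in Python as follows (the example graph is illustrative):

```python
from collections import deque

def bfs(adj, source):
    # Level-order traversal; returns vertices in visit order.
    visited = {source}
    order = []
    queue = deque([source])
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in adj[u]:          # scan u's adjacency list exactly once
            if v not in visited:
                visited.add(v)
                queue.append(v)
    return order

adj = [[1, 2], [0, 3], [0, 3], [1, 2]]  # small example graph
print(bfs(adj, 0))  # [0, 1, 2, 3]
```

Since every adjacency list is scanned once and every vertex enqueued at most once, the run time is O(n + m) as stated above.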
A common query operation on adjacency lists is checking for the existence of a specific edge between two vertices u and v, which requires scanning the adjacency list of u to locate v. This neighbor search takes time proportional to the degree of u (the length of its list), in the worst case O(\deg(u)), as each entry must be examined until a match is found or the list is exhausted. For sparse graphs where degrees are low, this is efficient, but it contrasts with adjacency matrix representations that allow constant-time checks.[3][17]
The following pseudocode illustrates an iterative check for adjacency between vertices u and v in a directed graph's adjacency list:
function isAdjacent(adjList, u, v):
    for each neighbor w in adjList[u]:
        if w == v:
            return true
    return false
This function directly iterates over the neighbor list of u, returning true upon finding v and false otherwise, with runtime O(\deg(u)).[18]
Modification Operations
Modification operations on an adjacency list representation of a graph involve updating the structure to add or remove vertices and edges, typically using an array or vector of lists (or sets) where each index corresponds to a vertex and holds its adjacent vertices. These operations vary in efficiency depending on whether the graph is directed or undirected and the underlying data structure for the inner lists, such as linked lists or hash sets.
Adding an edge between vertices u and v requires appending v to the list at index u (and u to the list at index v for undirected graphs to maintain symmetry). This append operation is O(1) amortized time when using dynamic arrays or linked lists for the inner structures, as it involves direct insertion at the end without searching. For directed graphs, only the one-way append is needed, while undirected graphs ensure bidirectional linkage to reflect the mutual adjacency.
Deleting an edge from u to v necessitates scanning the list at index u to locate and remove v, which takes O(d) time where d is the degree (out-degree for directed graphs) of u, due to linear search in a list-based implementation. In undirected graphs, the symmetric removal from v's list is also required, potentially doubling the cost to O(d_u + d_v). To achieve faster removal, such as O(1) average case, the inner structures can use hash sets instead of lists, allowing direct lookup and deletion, though this increases space usage due to hashing overhead.
Adding a vertex involves resizing the outer array to accommodate the new index and initializing an empty list (or set) for its adjacencies, which is O(1) amortized time with dynamic arrays like vectors. No immediate changes to existing lists are needed, as the new vertex starts isolated.
Removing a vertex u requires first deleting its entry from the outer array and then scanning all other vertices' lists to remove any references to u, resulting in O(|E|) time in the worst case, as each edge incident to u must be located and excised across the graph. For efficiency in sparse graphs, this can be optimized by tracking degrees or using bidirectional links, but standard list implementations incur the full scan cost. Resizing the outer array after removal is typically O(1) but may involve shifting indices in fixed-size arrays.
Edge cases in modification operations include handling self-loops and parallel edges, particularly in multigraphs. A self-loop at vertex u is represented by including u in its own adjacency list, and adding or removing it follows the standard edge operations but requires checking for the vertex itself during scans to avoid duplicate or erroneous entries. In multigraphs allowing parallel edges between the same pair (u, v), the adjacency lists permit multiple instances of v in u's list (and vice versa for undirected), with additions simply appending duplicates and deletions specifying which instance to remove, often requiring positional or attribute-based identification to distinguish parallels.
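As a sketch of the hash-set variant discussed above, which trades extra space for O(1) average-case edge insertion, deletion, and lookup (the class and method names are illustrative; note that sets cannot hold duplicate entries, so this variant suits simple graphs rather than multigraphs):

```python
class SimpleGraph:
    # Undirected simple graph; inner sets give O(1) average edge
    # insertion, deletion, and existence checks, at the cost of
    # hashing overhead versus plain lists.
    def __init__(self, n):
        self.adj = [set() for _ in range(n)]

    def add_edge(self, u, v):
        self.adj[u].add(v)
        self.adj[v].add(u)  # symmetric entry for undirected graphs

    def remove_edge(self, u, v):
        self.adj[u].discard(v)  # discard: no error if edge absent
        self.adj[v].discard(u)

    def has_edge(self, u, v):
        return v in self.adj[u]

g = SimpleGraph(3)
g.add_edge(0, 1)
g.remove_edge(0, 1)
print(g.has_edge(0, 1))  # False
```

A self-loop (u, u) works unchanged here because adding u to its own set twice is idempotent; a multigraph would instead need lists or multisets with per-edge identifiers.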
Analysis
Time and Space Complexity
The space complexity of an adjacency list representation for a graph G = (V, E) is O(|V| + |E|), where |V| denotes the number of vertices and |E| the number of edges, making it particularly efficient for sparse graphs where |E| \ll |V|^2.[1][19] In an undirected graph, the total space required is equivalent to the sum of the sizes of all adjacency lists, which equals 2|E| since each edge appears in exactly two lists (one for each endpoint), plus O(|V|) for the vertex array.[1]
For time complexities, traversals such as depth-first search (DFS) or breadth-first search (BFS) on an adjacency list take O(|V| + |E|) time, as each vertex and edge is visited exactly once by scanning all adjacency lists.[1] Checking for the existence of an edge between vertices u and v requires scanning the adjacency list of u, yielding an average-case time of O(\deg(u)) where \deg(u) is the degree of u, and a worst-case time of O(|V|) if the graph is dense.[20]
Adding an edge to an adjacency list can be performed in O(1) time by appending the new neighbor to the relevant list (for directed graphs) or to both endpoints' lists (for undirected), assuming dynamic arrays or linked lists that support constant-time insertion at the end.[3] Deleting an edge, however, requires locating the neighbor in the adjacency list of the source vertex, which takes O(\deg(u)) time on average and up to O(|V|) in the worst case, followed by removal, which is O(1) for linked lists and amortized O(1) for dynamic arrays with resizing.[20]
Comparison to Adjacency Matrix
An adjacency matrix represents a graph using a two-dimensional boolean array of size |V| × |V|, where |V| is the number of vertices, with the entry at row i and column j indicating the presence of an edge from vertex i to j; this structure requires Θ(|V|^2) space regardless of the number of edges |E|.[21]
In comparison, adjacency lists offer superior space efficiency for sparse graphs, where |E| ≪ |V|^2, as they consume Θ(|V| + |E|) space by storing only the existing edges. However, adjacency matrices excel in dense graphs, providing constant-time O(1) checks for edge existence, while adjacency lists require scanning a vertex's neighbor list, taking O(deg(v)) time in the worst case. Matrices also facilitate rapid computation of graph properties like transitive closures via matrix multiplication, whereas lists are more efficient for traversals that iterate over neighbors. The following table summarizes key operational differences:
| Operation | Adjacency Matrix | Adjacency List |
|---|---|---|
| Space | O(\|V\|^2) | O(\|V\| + \|E\|) |
| Edge existence check | O(1) | O(deg(u)) |
| List neighbors of v | O(\|V\|) | O(deg(v)) |
| Add edge | O(1) | O(1) |
For simple undirected graphs, the maximum number of edges is |V|(|V|-1)/2 ≈ |V|^2 / 2; as |E| approaches this bound (or up to |V|^2 edges in directed graphs), the matrix becomes comparably space-efficient while retaining its advantage in query speed.[22][23]
Hybrid approaches sometimes combine both representations, using adjacency lists for compact storage of sparse graphs and an auxiliary matrix for frequent O(1) edge queries when memory permits.[3]
Applications
In Graph Algorithms
Adjacency lists play a central role in implementing key graph algorithms by enabling efficient access to a vertex's neighbors, which is essential for edge relaxation and traversal operations. In shortest path algorithms like Dijkstra's, the structure supports the core relaxation step where, upon extracting a vertex from the priority queue, the algorithm iterates over its adjacent vertices to update tentative distances if a shorter path is found. This neighbor iteration is performed directly from the list associated with the current vertex, ensuring that only relevant edges are considered without scanning irrelevant parts of the graph.[24]
For connectivity algorithms, adjacency lists are instrumental in depth-first search (DFS) and breadth-first search (BFS) to identify connected components in undirected graphs. These traversals begin from an unvisited vertex and explore its component by recursively or iteratively visiting neighbors listed in the adjacency structure, marking vertices as visited to avoid cycles. By performing such searches from each unvisited vertex until all are covered, the algorithm partitions the graph into its maximal connected subgraphs, with each edge and vertex processed exactly once to achieve an overall linear traversal.[25]
In minimum spanning tree (MST) algorithms, adjacency lists provide efficient neighbor access critical for edge selection. Prim's algorithm, starting from an arbitrary vertex, repeatedly selects the minimum-weight edge connecting the growing tree to an outside vertex by scanning the neighbor lists of tree vertices; this localized access avoids global edge examinations.[26]
Adjacency lists also enable efficient topological sorting in directed acyclic graphs (DAGs), where a linear ordering of vertices respects all edge directions. One common approach uses a variant of DFS on the adjacency lists to compute finishing times, then reverses the order to produce the topological sequence; alternatively, Kahn's algorithm processes vertices with zero indegree, iteratively removing them and updating neighbors' indegrees via the outgoing lists. This structure ensures that edge traversals align with dependency resolution, yielding a valid ordering in linear time relative to the graph size.[27]
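Kahn's algorithm as described above can be sketched in Python as follows (vertex labels and the example DAG are illustrative):

```python
from collections import deque

def kahn_toposort(adj, n):
    # adj[u] lists the outgoing neighbors of u; vertices are 0..n-1.
    indegree = [0] * n
    for u in range(n):
        for v in adj[u]:
            indegree[v] += 1
    queue = deque(u for u in range(n) if indegree[u] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in adj[u]:        # "remove" u's outgoing edges
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    return order if len(order) == n else None  # None signals a cycle

adj = [[1, 2], [3], [3], []]  # DAG: 0→1, 0→2, 1→3, 2→3
print(kahn_toposort(adj, 4))  # [0, 1, 2, 3]
```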
In Modern Systems
In social network analysis, adjacency lists serve as a foundational representation for modeling user connections in large-scale platforms, capitalizing on the inherent sparsity of these graphs where most users connect to only a small fraction of the total population. For example, Facebook's social graph, comprising billions of nodes (users and entities) and edges (relationships like friendships or follows), employs list-like structures to efficiently store and traverse these connections through its Graph API, enabling features such as friend recommendations and news feed generation. This approach minimizes memory usage compared to denser representations, supporting real-time queries on distributed systems handling petabytes of data.[28][29]
In distributed systems, particularly graph databases like Neo4j, adjacency lists underpin the concept of index-free adjacency, where each node directly references its neighbors via pointers, facilitating rapid traversal and query optimization in big data environments. This native graph storage model avoids the costly joins required in relational databases, allowing Neo4j to process complex pattern-matching queries on graphs with millions of nodes and relationships in sub-second times, as seen in applications for fraud detection and supply chain analysis. By organizing relationships as linked lists per node, the system achieves constant-time access to adjacent data, scaling horizontally across clusters for enterprise-level workloads.[30][31]
Graph neural networks (GNNs), a post-2010s advancement in machine learning, rely on adjacency lists to implement the message passing mechanism, where node embeddings are updated by aggregating information from neighboring nodes across multiple layers. This representation enables GNNs to learn on non-Euclidean data like citation networks or molecular structures, propagating features along edges to capture relational dependencies without assuming fixed grid topologies. Influential frameworks such as those reviewed in foundational GNN literature highlight how adjacency lists support scalable implementations, improving performance in tasks like node classification on datasets with irregular connectivity.[32]
A prominent recent trend in high-performance computing involves the compressed sparse row (CSR) format, an evolution of adjacency lists that stores row pointers, column indices, and non-zero values in contiguous arrays for efficient sparse graph operations. In TensorFlow's ecosystem for GNNs, CSR variants optimize memory and computation for large-scale graph convolutions on accelerators like GPUs, reducing overhead in sparse matrix multiplications essential for message passing. This format has been adopted in scientific computing pipelines, where it accelerates simulations on graphs with billions of edges by enabling cache-friendly access patterns.[33][34]
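To make the CSR layout concrete, the following pure-Python sketch flattens an unweighted adjacency list into the row-pointer and column-index arrays (the values array is omitted since the example graph is unweighted):

```python
def to_csr(adj):
    # Flatten an adjacency list into CSR arrays: row_ptr[u] marks
    # where u's neighbors begin in col_idx; row_ptr has n+1 entries.
    row_ptr = [0]
    col_idx = []
    for neighbors in adj:
        col_idx.extend(neighbors)
        row_ptr.append(len(col_idx))
    return row_ptr, col_idx

adj = [[1, 2], [2], [0]]
row_ptr, col_idx = to_csr(adj)
print(row_ptr)  # [0, 2, 3, 4]
print(col_idx)  # [1, 2, 2, 0]

# Neighbors of vertex u are the contiguous slice
# col_idx[row_ptr[u]:row_ptr[u+1]], giving cache-friendly access.
assert col_idx[row_ptr[0]:row_ptr[1]] == [1, 2]
```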