Average path length
In graph theory, the average path length (also known as the characteristic path length) is a key metric that quantifies the typical shortest-path distance between pairs of vertices in a connected graph, and thus measures how efficiently the graph connects its nodes.[1] Formally, for a graph with n vertices, it is the mean of the shortest-path lengths d(u,v) over all ordered pairs of distinct vertices u and v:

L(G) = \frac{1}{n(n-1)} \sum_{u \neq v} d(u,v),
where d(u,v) denotes the length of the shortest path from u to v. Equivalently, for undirected graphs it can be expressed over unordered pairs as L(G) = \frac{2}{n(n-1)} \sum_{u < v} d(u,v).[1][2] The value is undefined for disconnected graphs, though in real-world networks it can be approximated or computed within each connected component.[3]

The concept gained prominence in network science through its role in characterizing small-world networks, in which a low average path length, typically scaling logarithmically with the number of nodes as L \sim \log n, coexists with high local clustering, enabling efficient global information propagation despite predominantly local structure.[4] Introduced in the seminal work of Watts and Strogatz, this property explains phenomena such as the "six degrees of separation" in social networks, where individuals are connected through surprisingly short chains of acquaintances.[4] In random graph models such as the Erdős–Rényi model, the average path length likewise grows logarithmically for sparse graphs above the connectivity threshold, underscoring its relevance to network diameter and scalability.[5]

Beyond theoretical models, the average path length finds applications across diverse domains, including biological systems such as brain connectivity networks, where shorter paths correlate with efficient neural signaling and are studied in relation to genetic factors.[6] In technological infrastructures such as the internet or transportation graphs, it assesses routing efficiency and resilience, and such networks are often designed to keep it low in order to minimize latency.[3]

Computing it exactly requires an all-pairs shortest-paths algorithm such as Floyd–Warshall (with O(n^3) time complexity), though sampling and random-walk approximations are used to estimate the metric efficiently on large-scale networks.[7]
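As a minimal illustrative sketch (the function name and example graph below are hypothetical, not from the article), the definition L(G) = \frac{1}{n(n-1)} \sum_{u \neq v} d(u,v) can be evaluated for a small unweighted, connected graph by running breadth-first search from every vertex, which computes all shortest-path lengths in O(n(n+m)) time:

```python
from collections import deque

def average_path_length(adj):
    """Mean shortest-path distance over all ordered pairs of distinct
    vertices: L(G) = (1 / (n(n-1))) * sum over u != v of d(u, v).

    `adj` maps each vertex to an iterable of its neighbours; the graph
    is assumed connected, undirected, and unweighted.
    """
    n = len(adj)
    total = 0
    for source in adj:
        # BFS from `source` yields shortest-path lengths in an unweighted graph.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())  # d(source, source) = 0 adds nothing
    return total / (n * (n - 1))

# 4-cycle 0-1-2-3-0: from each vertex the distances are 1, 2, 1,
# so the total over ordered pairs is 16 and L = 16 / 12 = 4/3.
cycle4 = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
print(average_path_length(cycle4))  # → 1.3333333333333333
```

For graphs too large for the full all-pairs computation, the same loop can be run from a random sample of source vertices to approximate L(G), matching the sampling approach mentioned above.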