Graph embedding
Graph embedding is a concept in graph theory and computer science with two primary interpretations. In classical graph theory, it refers to a representation of a graph on a surface (such as a plane or sphere) or in higher-dimensional space in which vertices are mapped to distinct points and edges to non-intersecting curves, preserving adjacency relations without crossings; this is central to topics such as planarity and genus.[1] In the modern machine learning context, also known as graph representation learning, it involves mapping nodes, edges, or substructures of a graph G = (V, E) into a low-dimensional vector space of dimension d (where d \ll |V|) such that the structural properties and relational information of the original graph are preserved as much as possible. This process transforms high-dimensional, sparse graph data into dense, continuous embeddings that facilitate downstream machine learning tasks by enabling efficient computation on relational structures.[2][3]

The classical notion of graph embedding dates back to the 19th century with studies of planar graphs, evolving through key results such as Kuratowski's theorem in 1930; detailed historical context is provided in the Fundamentals section. In parallel, representation learning techniques trace back to the early 2000s, when initial efforts focused on dimensionality reduction for non-relational data using similarity graphs, such as Laplacian eigenmaps for manifold learning. A significant shift occurred around 2010 with the advent of direct graph embedding methods that leverage auxiliary information and deep learning to capture complex network proximities, addressing challenges in scalability and expressiveness for large-scale graphs.[2] By the mid-2010s, foundational algorithms such as DeepWalk (2014), which applies random walks and SkipGram models to learn node representations preserving second-order proximity, and node2vec (2016), which introduces biased random walks for flexible neighborhood exploration, established random walk-based approaches as a cornerstone; a minimal sketch of this paradigm appears below.[2]
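A minimal sketch of the random walk paradigm, assuming the Python libraries networkx and gensim and using illustrative parameter choices (this is not the reference DeepWalk implementation): uniform random walks over the graph are treated as sentences and passed to a SkipGram model, so that nodes sharing walk contexts receive nearby vectors.

import random

import networkx as nx
from gensim.models import Word2Vec

def random_walks(G, num_walks=10, walk_length=40):
    # Generate uniform random walks over G; each walk is a list of node tokens.
    walks = []
    for _ in range(num_walks):
        for start in G.nodes():
            walk = [start]
            while len(walk) < walk_length:
                walk.append(random.choice(list(G.neighbors(walk[-1]))))
            walks.append([str(v) for v in walk])  # Word2Vec expects string tokens
    return walks

G = nx.karate_club_graph()  # a small benchmark graph, chosen purely for illustration
model = Word2Vec(random_walks(G), vector_size=64, window=5, min_count=1, sg=1)  # sg=1 selects SkipGram
vector = model.wv["0"]  # the learned 64-dimensional embedding of node 0

Replacing the uniform random.choice step with a biased draw over neighbors yields node2vec-style walks that trade off breadth-first and depth-first exploration.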
Graph embedding methods in representation learning are broadly categorized into several paradigms: matrix factorization techniques that decompose graph adjacency or Laplacian matrices to capture global similarities; random walk-based methods that sample paths to model local neighborhood structure; deep learning approaches such as Graph Convolutional Networks (GCNs, 2017) for message passing on node features and Graph Attention Networks (GATs, 2017) for weighted neighbor aggregation; and edge reconstruction models such as LINE (2015), which preserves first- and second-order proximities, and TransE (2013), which embeds knowledge graphs via translational distances. More recent advancements incorporate Graph Isomorphism Networks (GINs, 2019) for distinguishing graph structures and graph transformers (e.g., Graphormer, 2021) for long-range dependencies.[2][3] Detailed discussions of topological and geometric embeddings appear in their respective sections.

These embeddings, particularly in the representation learning sense, enable a wide array of applications across domains, including node-level tasks such as classification and clustering in social networks, edge-level predictions such as link prediction and recommendation in e-commerce systems, and graph-level analyses such as molecular property prediction in bioinformatics or visualization in network analysis. In knowledge graphs, embeddings support entity resolution and relation inference, while in dynamic settings they aid anomaly detection in temporal networks.[2][3] Trends in representation learning since 2020 have emphasized self-supervised and contrastive learning paradigms that reduce reliance on labeled data, with methods like InfoGraph (2020) maximizing mutual information between graph views, alongside extensions to heterogeneous, multimodal, and temporal graphs for real-world scenarios involving diverse data modalities or evolving structures. As of 2025, further advances include the integration of large language models (LLMs) with knowledge graph embeddings to enhance semantic understanding and reasoning.[3][4]

Fundamentals
Definition and Terminology
In graph theory, particularly within topological graph theory, a graph embedding is a representation of a graph G = (V(G), E(G)) on a host structure, such as a surface or Euclidean space, in which vertices are mapped to distinct points and edges to curves or line segments connecting those points: under the mapping f, each edge (u, v) \in E(G) is represented by a curve joining f(u) and f(v). Edges may intersect only at shared endpoints (vertices), so no crossings occur in edge interiors. This preservation extends to non-adjacency conditions in certain contexts, where non-adjacent vertices map to non-adjacent points in the host.[5]

Key terminology includes the crossing number of a graph, denoted \operatorname{cr}(G), which is the minimum number of edge crossings over all drawings of G in the plane.[6] The genus of a graph, g(G), is the smallest integer g such that G admits a crossing-free embedding on an orientable surface of genus g (e.g., a sphere for g = 0, a torus for g = 1).[7] A straight-line embedding represents edges as straight line segments between vertex points, as guaranteed for every planar graph by Fáry's theorem. Minor-closed families are classes of graphs closed under taking minors (graphs obtained by deleting vertices or edges and contracting edges), such as the family of planar graphs, which excludes K_5 and K_{3,3} as minors.[8] Embeddings are typically required to be injective, meaning no two vertices map to the same point; this distinguishes them from immersions, in which the mapping is only locally injective and edges may cross (though vertices remain distinct).

The concept originates from topology and early graph theory studies in the 1930s, particularly investigations into planar graphs by Kazimierz Kuratowski, who characterized non-planar graphs via forbidden subdivisions, and Hassler Whitney, who introduced combinatorial approaches to planarity.[5] It is important to distinguish these topological and geometric embeddings from the notion of graph embedding in machine learning and representation learning, where the term refers to mapping nodes (and sometimes edges or subgraphs) into low-dimensional vector spaces to capture structural similarities for tasks like node classification or link prediction, often via methods such as random walks or graph neural networks.[9] In contrast, graph theory embeddings focus on spatial or surface placements preserving combinatorial or geometric properties, without dimensional reduction for computation.[6]
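As a brief illustration of these definitions, the following sketch (assuming the Python library networkx; the chosen graphs are examples only) tests planarity of the two forbidden Kuratowski graphs and of a planar graph, returning a combinatorial embedding when one exists.

import networkx as nx

# K_4 is planar; `embedding` is a combinatorial (planar) embedding of it.
is_planar, embedding = nx.check_planarity(nx.complete_graph(4))
print(is_planar)  # True

# The two Kuratowski graphs fail the test.
print(nx.check_planarity(nx.complete_graph(5))[0])               # False: K_5
print(nx.check_planarity(nx.complete_bipartite_graph(3, 3))[0])  # False: K_{3,3}

By Fáry's theorem, any planar embedding found this way can also be realized with straight-line edges.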
Historical Context
The concept of graph embedding originated in the field of topology during the early 20th century, when mathematicians sought to characterize graphs that could be drawn in the plane without edge crossings. In 1930, Kazimierz Kuratowski established a foundational result by proving that a graph is non-planar if and only if it contains a subdivision of K_5 or K_{3,3}, providing a forbidden subgraph criterion for planarity.[10] This theorem marked a pivotal milestone in understanding the embeddability of graphs in the plane. Two years later, in 1932, Hassler Whitney offered a dual characterization of planar graphs using cycle spaces and abstract duality, further solidifying the topological foundations of graph embeddings.[11] These early contributions by Kuratowski and Whitney shifted focus from geometric drawings to abstract combinatorial properties, influencing subsequent developments in graph theory.

Building on these topological insights, mid-20th-century research emphasized constructive and computational aspects of embeddings. In 1948, István Fáry demonstrated that every planar graph admits a straight-line embedding in the plane without crossings, bridging abstract planarity with explicit geometric realizations.[12] This result complemented earlier work and highlighted the feasibility of representations without curved edges. By the 1960s and 1970s, attention turned to embeddings on higher-genus surfaces, culminating in the 1968 solution of the Heawood conjecture by Gerhard Ringel and J. W. T. Youngs, who proved the map color theorem for orientable surfaces by constructing explicit genus embeddings of complete graphs.[13] Concurrently, the computational turn arrived with the 1974 linear-time planarity testing algorithm of John Hopcroft and Robert Tarjan, which efficiently determines embeddability and produces an embedding when one exists, enabling practical applications in algorithm design. Frank Harary played a key role in this era through his standardization of graph theory terminology and exploration of embeddings in topological contexts, as detailed in his influential 1969 textbook.

From the 2000s onward, graph embedding evolved into an interdisciplinary tool, particularly in machine learning for representation learning on networks. This shift was propelled by the need to encode graph structures into low-dimensional vector spaces for tasks like node classification and link prediction. A seminal advancement came in 2014 with DeepWalk by Bryan Perozzi et al., which adapted word2vec techniques to generate node embeddings via random walks on graphs, capturing structural similarities in social and biological networks.[14] This work marked the transition from pure mathematical topology and algorithms to data-driven AI applications, fostering a surge in scalable embedding methods for large-scale graphs.

Topological Embeddings
Combinatorial Embeddings
In combinatorial graph theory, an embedding of a graph is specified abstractly by a rotation system, which consists of a cyclic ordering of the edges incident to each vertex, independent of any geometric realization. This system defines the facial walks of the embedding by tracing cycles formed by successive edges in the rotations and their reverses, thereby determining the faces and the associated dual graph.[15] A key property of such embeddings on orientable surfaces is the generalized Euler characteristic, given by

V - E + F = 2 - 2g,
where V is the number of vertices, E the number of edges, F the number of faces, and g the genus of the surface. Additionally, the handshaking lemma for faces holds: the sum of the degrees of the faces equals 2E, since each edge is traversed exactly twice by the facial walks. An embedding is valid if the rotation system produces closed facial walks and satisfies the Euler characteristic for some genus g \geq 0.[15][16] Two combinatorial embeddings are equivalent if there exists a homeomorphism of the underlying surface that preserves the rotation system at each vertex. For planar graphs (embeddable on the sphere, where g = 0), Whitney's theorem establishes that every 3-connected planar graph admits a unique embedding up to reflection and homeomorphism of the sphere.[17] Representative examples illustrate these concepts: a cycle graph C_n (for n \geq 3) embeds combinatorially on the sphere with two faces (the interior and exterior), satisfying Euler's formula with V = n, E = n, and F = 2. In contrast, the complete graph K_5 cannot embed on the sphere but admits multiple combinatorial embeddings on the torus (g = 1), such as a rectangular embedding in which the faces are quadrilaterals.[15][18]
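The face-tracing rule can be made concrete in a short sketch (plain Python; the helper names and the dart-following convention are illustrative choices): from a directed edge (u, v), the facial walk continues along the neighbor that follows u in the cyclic rotation at v, and the genus is then recovered from Euler's formula.

def trace_faces(rotation):
    # rotation: dict mapping each vertex to the cyclic ordering of its neighbors.
    # Returns the facial walks as lists of directed edges.
    darts = {(u, v) for u in rotation for v in rotation[u]}
    faces = []
    while darts:
        start = darts.pop()
        face, (u, v) = [start], start
        while True:
            ring = rotation[v]
            w = ring[(ring.index(u) + 1) % len(ring)]  # successor of u in the rotation at v
            u, v = v, w
            if (u, v) == start:
                break
            darts.remove((u, v))
            face.append((u, v))
        faces.append(face)
    return faces

def genus(rotation):
    # Solve V - E + F = 2 - 2g for g.
    V = len(rotation)
    E = sum(len(nbrs) for nbrs in rotation.values()) // 2
    F = len(trace_faces(rotation))
    return (2 - V + E - F) // 2

triangle = {0: [1, 2], 1: [2, 0], 2: [0, 1]}  # the cycle C_3
print(genus(triangle))  # 0: two faces on the sphere, as in the example above

Applying the same routine to a rotation system for K_5 whose facial walks close up into five quadrilaterals yields genus 1, consistent with the toroidal embedding described above.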