Graham scan
The Graham scan is an algorithm in computational geometry for computing the convex hull of a finite set of two-dimensional points, which is the smallest convex polygon enclosing all the points.[1] Proposed by Ronald L. Graham in 1972, it provides an efficient method to identify the boundary points that form this hull, with a time complexity of O(n log n), where n is the number of input points, primarily due to an initial sorting step.[1] The algorithm is fundamental in the field, enabling applications in computer graphics, robotics, and geographic information systems by efficiently determining the outer envelope of point sets.[2]
The process begins by selecting the point with the lowest y-coordinate as the starting point P0, breaking ties by choosing the one with the lowest x-coordinate to ensure it lies on the convex hull.[2] The remaining points are then sorted in increasing order of the polar angle they subtend with respect to P0, measured counterclockwise, with ties resolved by increasing distance from P0 to handle collinear points.[2] This sorting step, which takes O(n log n) time, orders the points in a way that traces a potential boundary around the hull.[3]
Following sorting, the algorithm performs a linear scan through the ordered points while maintaining a stack to build the hull incrementally.[2] It starts by pushing the first three points onto the stack; for each subsequent point, it checks the orientation formed by the last three points on the stack (or the new point with the top two).[2] If the turn is clockwise (indicating a non-convex dent), the middle point is popped from the stack, and the check is repeated until a counterclockwise turn is achieved or the stack has fewer than two points; the new point is then pushed.[2] This backtracking ensures only convex vertices remain, completing the hull in O(n) time after sorting.[3] The final stack holds the convex hull vertices in counterclockwise order, starting and ending with P0.[2]
Overview
Definition and Purpose
The Graham scan is an algorithm in computational geometry that computes the convex hull of a finite set of points in the Euclidean plane by sorting the points by polar angle relative to a reference point and scanning to eliminate those not on the hull boundary. The convex hull itself is the smallest convex set containing all input points, often visualized as the tightest convex polygon enclosing them.[4]
The algorithm takes as input a set of n points in 2D space, typically represented by their Cartesian coordinates and assumed to be in general position (no three collinear).[3] Its output is the sequence of vertices forming the convex hull, listed in counterclockwise order around the boundary.[3]
The primary purpose of the Graham scan is to identify this minimal enclosing convex polygon efficiently, addressing the fundamental convex hull problem that underpins many advanced techniques in computational geometry.[5] It finds application in areas such as collision detection in computer graphics, robot motion planning, and defining boundaries in geographic information systems.[5]
History
The Graham scan algorithm was introduced by Ronald Lewis Graham, a prominent American mathematician known for his contributions to discrete mathematics and combinatorics. In 1972, Graham published the algorithm in his paper "An Efficient Algorithm for Determining the Convex Hull of a Finite Planar Set," appearing in Information Processing Letters. This work presented an innovative approach to computing the convex hull of a finite set of points in the plane, achieving O(n log n) time complexity through sorting by x-coordinate followed by a linear scan to eliminate interior points.[6][1] The original algorithm sorted points lexicographically and built the hull using a stack-based scan, assuming no vertical lines but handling collinear points via orientation tests.
The algorithm that is now commonly known as the Graham scan uses polar-angle sorting relative to the lowest y-coordinate point (with ties by x-coordinate), followed by a similar scanning procedure; this variant preserves the O(n log n) complexity and is the standard form described in modern textbooks.
The algorithm emerged amid the rapid growth of computational geometry as a formal discipline in the early 1970s, a period when researchers sought efficient solutions to geometric problems that had previously relied on quadratic-time brute-force methods. Graham's contribution was recognized as one of the inaugural algorithms in the field, highlighting the potential of sorting-based techniques for planar point sets. It coincided with parallel developments, such as Franco Preparata's early explorations of geometric data structures and Richard A. Jarvis's 1973 gift-wrapping algorithm (also known as Jarvis march), which offered output-sensitive performance but worse worst-case bounds.[5][7]
Over time, the Graham scan gained prominence as a standard method due to its balance of simplicity, implementability, and efficiency compared to earlier O(n²) approaches. Subsequent refinements addressed robustness issues, such as handling collinear points and degenerate cases, through modifications that preserved the core O(n log n) complexity while improving numerical stability in practical implementations. These evolutions solidified its role as a foundational technique in computational geometry textbooks and applications.[8]
Algorithm
Preprocessing: Finding the Lowest Point
The preprocessing step of the Graham scan algorithm begins by identifying a pivot point P_0 from the input set of n points in the plane, which serves as the origin for subsequent polar angle sorting. This pivot is selected as the point with the lowest y-coordinate; in case of ties among multiple points sharing this minimum y-coordinate, the one with the smallest x-coordinate (i.e., the leftmost) is chosen.[9]
This selection ensures that P_0 is guaranteed to be a vertex on the convex hull, as it represents an extreme point in the downward direction—no other point can lie below it, preventing it from being interior to the hull. By starting from this point, the algorithm avoids complications arising from collinear points along the bottom edge, as the leftmost choice among ties promotes a consistent counterclockwise ordering in the later phases. The pivot P_0 then acts as the reference for computing polar angles to all other points.[9]
For small input sizes, the algorithm handles trivial cases directly: if n = 1, the convex hull consists solely of that single point; if n = 2, the hull is the line segment connecting the two points. For n = 3, the points form a triangle if non-collinear (all three on the hull) or degenerate to the two endpoints if collinear.[10]
This preprocessing requires a single linear scan through the points to find the minimum y-coordinate and resolve any ties by comparing x-coordinates, achieving O(n) time complexity.[9]
Sorting by Polar Angle
After identifying the point P_0 with the lowest y-coordinate (and lowest x-coordinate in case of ties) as the pivot, the Graham scan proceeds by considering all other points relative to P_0, effectively translating the coordinate system so that P_0 becomes the origin for angular comparisons. The remaining n-1 points are then sorted in increasing order of the polar angle they form with P_0 and the positive x-axis, proceeding in a counterclockwise direction. This ordering prepares the points for sequential processing along the potential boundary of the convex hull.[11][12]
To avoid explicit computation of angles, which would require trigonometric functions and potential floating-point precision issues, the sorting uses the cross product of vectors to determine relative angular order. For two points A and B, form the vectors \overrightarrow{P_0A} = (A_x - P_{0.x}, A_y - P_{0.y}) and \overrightarrow{P_0B} = (B_x - P_{0.x}, B_y - P_{0.y}). The cross product is given by:
(A_x - P_{0.x})(B_y - P_{0.y}) - (A_y - P_{0.y})(B_x - P_{0.x}).
If this value is positive, A precedes B in counterclockwise order (i.e., A has a smaller polar angle than B); if negative, B precedes A; if zero, the points are collinear with P_0. This orientation test leverages the geometric property that the sign of the cross product indicates the turn direction.[11][12]
In the case of collinear points (cross product zero), the tie is broken by comparing Euclidean distances from P_0, sorting in increasing order of distance (closest points first). This ensures that, during the subsequent scanning phase, inner collinear points can be discarded, retaining only the farthest point on each ray from P_0 for the outer boundary. The sorting is performed using a comparison-based algorithm such as merge sort or quicksort, yielding an overall time complexity of O(n \log n) for this phase.[13][11]
By arranging the points in this angular order around P_0, the Graham scan guarantees that the sorted list traverses the boundary in a manner that facilitates linear-time identification of hull vertices in the next phase, contributing to the algorithm's optimal asymptotic efficiency.[6]
Scanning and Building the Hull
The scanning phase of the Graham scan algorithm constructs the convex hull by iteratively processing the points sorted in increasing order of polar angle around the lowest point P_0. This phase begins by initializing a stack with the first two points in the sorted list, P_0 and P_1.[13]
For each point P_i where i ranges from 2 to n-1, the algorithm examines the orientation formed by the last two points on the stack and P_i. While the stack has at least two points and these three points form a right turn (clockwise orientation) or are collinear, the last point on the stack is removed (popped). Once a left turn (counterclockwise orientation) is confirmed or the stack has fewer than two points, P_i is added (pushed) to the stack. This process ensures that concave portions are eliminated, maintaining only convex (left) turns on the boundary. The turn test is performed using the cross-product orientation formula: for points A = (A_x, A_y), B = (B_x, B_y), C = (C_x, C_y),
(B_x - A_x)(C_y - A_y) - (B_y - A_y)(C_x - A_x)
If this value is less than or equal to zero, it indicates a right turn or collinear points, prompting the removal of B.[13]
After processing all points, the stack contains the vertices of the convex hull in counterclockwise order, starting with P_0 and ending with the last hull vertex. The hull is closed by connecting the final vertex back to P_0. This single-pass scanning eliminates all internal and concave points efficiently.
Handling of collinear points during scanning depends on the specific variant and requirements: if the cross-product equals zero, the middle point may be discarded to include only extreme hull vertices (using ≤ 0 in the test), or retained if all boundary points are desired. The original formulation assumes no three points are collinear for simplicity, but practical implementations adjust the inequality (e.g., strict >0 for left turns) to manage degeneracies robustly.[2]
Analysis
Time and Space Complexity
The Graham scan algorithm achieves a time complexity of O(n log n) in both the worst and average cases for a set of n points in the plane, with the dominant cost arising from the sorting of points by polar angle relative to the lowest y-coordinate point. The initial preprocessing step to identify this lowest point requires O(n) time via a single linear pass over the input. The core scanning phase, which constructs the hull by maintaining a stack and performing orientation tests, also operates in O(n) time overall, since each point is pushed onto the stack at most once and popped at most once during the process, leading to an amortized O(1) cost per point.[14]
The space complexity of the standard Graham scan is O(n, accounting for the storage of the input points, a temporary array for the sorted order, and the stack used during hull construction, which holds at most O(n points in the worst case. Implementations employing in-place sorting for the polar angle ordering—such as adaptations of heapsort or quicksort with custom comparators—can minimize auxiliary space to O(1) beyond the input and output hull storage.[14][15]
To derive the time bound, note that sorting by polar angle demands Ω(n log n) comparisons in the standard algebraic decision tree model, providing a tight lower bound that the algorithm meets through efficient comparison-based sorting. The O(n) contributions from preprocessing and scanning do not increase this asymptotic cost. Even in the best case, where points might exhibit favorable ordering, the time remains Θ(n log n) due to the unavoidable sorting step; a linear O(n runtime is theoretically possible only if points are pre-sorted by polar angle, but this assumption is not part of the general analysis.[14]
This O(n log n) performance renders the Graham scan asymptotically optimal for 2D convex hull computation under comparison-based models, aligning precisely with the established Ω(n log n) lower bound for the problem, proven via reduction from sorting or element uniqueness in the plane.
Correctness Proof Sketch
The correctness of the Graham scan algorithm is established by verifying that the output consists precisely of the convex hull vertices in counterclockwise boundary order. A key invariant during the scanning phase states that the points on the current stack form a convex chain in counterclockwise order, where every consecutive triple of points p_{j-1}, p_j, p_{j+1} produces a left turn, determined by a non-negative cross product orientation test (≥ 0).[16]
This invariant is proven by induction on the number of points processed after sorting. For the base case, the stack is initialized with p_0 (lowest point) and p_1 (next in polar order), forming a trivial convex chain (degenerate segment). Processing p_2: while the stack has ≥2 points and the orientation p_0, p_1, p_2 indicates a non-left turn (cross product < 0 or =0 for strict hull excluding collinear intermediates), pop p_1; then push p_2. The resulting stack (either p_0, p_2 if popped, or p_0, p_1, p_2) satisfies the invariant as a convex chain.[16][17]
In the inductive step, assume the invariant holds for the first k points, forming the convex hull prefix CH(P_k). When processing p_{k+1}, the algorithm pops points from the stack while the turn p_{j-1}, p_j, p_{k+1} is not a left turn (cross product < 0), removing any concavity; appending p_{k+1} then maintains the left-turn property and convexity, as p_{k+1} lies outside or on the current hull due to the polar angle sorting.[18] Popped points are guaranteed to lie inside the final hull, as their position inside the triangle formed by neighboring hull points and p_0 contradicts their membership in the convex hull.[19]
A supporting lemma ensures completeness: the polar angle sorting from p_0 implies that no subsequent point p_i (with i > j) can lie inside the angular wedge defined by earlier points p_0 and p_j, preventing omission of hull vertices. The selection of the lowest p_0 and the farthest point among collinear candidates during sorting further guarantees inclusion of all boundary extremes, while the scanning phase detects and resolves all concavities.[18]
Degenerate cases with collinear points are handled by secondary sorting based on distance from p_0 when the cross product is zero, and using ≤ 0 in the popping condition to retain only the extreme points on the hull edge and discard intermediates inside.[16] The algorithm assumes general position or uses the orientation test to manage collinearities without altering the invariant.
Formal verifications decompose the proof into local geometric properties, such as convexity preservation and point inclusion, using induction and invariants in proof assistants like Isabelle, confirming the algorithm's reliability even under degeneracies.[18][19] The resulting hull implicitly forms a simple closed curve, separating the plane into interior and exterior regions as per the Jordan curve theorem, ensuring all points are correctly classified.
Implementation
Pseudocode
The Graham scan algorithm can be expressed in pseudocode as follows, assuming an input set of n \geq 3 points in the plane, each represented as a structure with real-valued coordinates (x, y), and the points are in general position (no three collinear). This formulation integrates the preprocessing steps of pivot selection and polar-angle sorting with the scanning phase to construct the convex hull.
A key helper function is the cross-product orientation test, which determines the turn direction between three points o, a, and b:
cross(o, a, b) = (a.x - o.x) * (b.y - o.y) - (a.y - o.y) * (b.x - o.x)
cross(o, a, b) = (a.x - o.x) * (b.y - o.y) - (a.y - o.y) * (b.x - o.x)
A positive value indicates a counterclockwise (left) turn, a negative value a clockwise (right) turn, and zero indicates collinearity. In the sorting comparator, points are ordered by increasing polar angle with respect to the pivot p_0; if two points have the same angle, the closer one (by Euclidean distance) precedes the farther one to handle potential collinearities without including interior points on the hull. During scanning, the condition \leq 0 ensures right turns and collinear points are removed to maintain convexity.[17]
The main procedure is:
GrahamScan(points):
n = length(points)
// Step 1: Find pivot [p0](/page/P0) (lowest y, then lowest x if tie)
p0 = point with minimum (y, x) among points
swap points[0] and [p0](/page/P0) // p0 now at index 0
// Step 2: Sort points[1..n-1] by polar angle with [p0](/page/P0), using cross-product comparator
sort points[1..n-1] using comparator polarCompare([p0](/page/P0), p, q)
// Where polarCompare(o, p, q):
// c = cross(o, p, q)
// if c != 0: return c
// else: return distSq(o, p) - distSq(o, q)
// (distSq(o, p) = (p.x - o.x)^2 + (p.y - o.y)^2)
// Step 3: Build the [convex hull](/page/Convex_hull) (scan the sorted points)
hull = empty [stack](/page/Stack)
push points[0] to hull
push points[1] to hull
for i = 2 to n-1:
while length(hull) >= 2 and cross(hull[-2], hull[-1], points[i]) <= 0:
pop hull
push points[i] to hull
return hull // Points in hull form the [convex hull](/page/Convex_hull) in counterclockwise order
GrahamScan(points):
n = length(points)
// Step 1: Find pivot [p0](/page/P0) (lowest y, then lowest x if tie)
p0 = point with minimum (y, x) among points
swap points[0] and [p0](/page/P0) // p0 now at index 0
// Step 2: Sort points[1..n-1] by polar angle with [p0](/page/P0), using cross-product comparator
sort points[1..n-1] using comparator polarCompare([p0](/page/P0), p, q)
// Where polarCompare(o, p, q):
// c = cross(o, p, q)
// if c != 0: return c
// else: return distSq(o, p) - distSq(o, q)
// (distSq(o, p) = (p.x - o.x)^2 + (p.y - o.y)^2)
// Step 3: Build the [convex hull](/page/Convex_hull) (scan the sorted points)
hull = empty [stack](/page/Stack)
push points[0] to hull
push points[1] to hull
for i = 2 to n-1:
while length(hull) >= 2 and cross(hull[-2], hull[-1], points[i]) <= 0:
pop hull
push points[i] to hull
return hull // Points in hull form the [convex hull](/page/Convex_hull) in counterclockwise order
This pseudocode produces the convex hull vertices in counterclockwise order starting from the pivot.
A related variant is Andrew's monotone chain algorithm, which achieves the same O(n log n) time complexity by sorting points by x-coordinate (then y-coordinate) instead of polar angle, and constructing upper and lower hulls separately.
The algorithm runs in O(n log n) time overall, dominated by the sorting step, and is straightforward to implement in programming languages such as Python or C++ using standard libraries for sorting and stack operations.[17]
Numerical Robustness
The primary numerical robustness challenge in implementing the Graham scan arises from the orientation test, which relies on the cross product to determine whether three points form a left turn, right turn, or collinear configuration. In finite-precision floating-point arithmetic, the cross product value can be computed as near zero due to round-off errors, leading to misclassification of turns or erroneous detection of collinearity, especially for nearly collinear points or when coordinates are large. This can result in incorrect inclusion or exclusion of points on the convex hull, potentially producing a non-convex polygon or omitting extreme vertices.[20]
To address these issues, one approach is to employ exact arithmetic, such as using integer coordinates with arbitrary-precision big integers to compute the cross product without rounding errors, ensuring precise orientation decisions. Alternatively, adaptive precision methods like Shewchuk's robust geometric predicates can be used for the orientation test; these predicates employ filtered exact arithmetic, starting with floating-point approximations and escalating to exact computations only when necessary, providing speed comparable to floating-point while guaranteeing correctness.[21]
Practical best practices include scaling input coordinates to a normalized range to minimize overflow risks during multiplication in the cross product computation, and introducing small perturbations to points to assume general position and avoid exact degeneracies. For collinear detection, an epsilon threshold—such as considering points collinear if the absolute cross product value is less than $1 \times 10^{-9}—can be applied during sorting and scanning to tolerate minor errors, though this must be tuned carefully to avoid over- or under-correction. Additionally, libraries like CGAL implement the Graham scan (via Andrew's variant) with built-in robust predicates, leveraging exact arithmetic kernels for reliable results in production environments.[22]
In practice, for input points with integer coordinates up to $10^9, 64-bit floating-point (double precision) arithmetic often suffices if the cross product is computed using 64-bit integer intermediates to avoid intermediate rounding, as the final sign determination remains accurate under these constraints. Handling degeneracies explicitly is crucial: during polar angle sorting, collinear points should be ordered by distance from the base point to preserve the hull boundary, and in the scanning phase, chains of collinear points must be resolved by retaining only the farthest one (or all, depending on requirements) to prevent erroneous popping that could lead to infinite loops or malformed hulls. Failure to do so can amplify small errors into topological inconsistencies, such as self-intersecting edges.[20]
Comparisons and Extensions
Comparison with Other Convex Hull Algorithms
The Graham scan, with its O(n log n) time complexity dominated by polar-angle sorting, contrasts with the Jarvis march (also known as gift wrapping), which achieves O(n h) time where h is the number of points on the convex hull.[23][24] The Jarvis march is conceptually simpler, starting from the leftmost point and iteratively selecting the next hull vertex by finding the point that forms the smallest polar angle with the current edge, akin to wrapping a gift around the points.[23] However, its performance degrades when h is large relative to n, such as in dense configurations where many points lie on the hull, making the Graham scan preferable for such cases due to its logarithmic factor independence from h.[24] In practical implementations, the Jarvis march outperforms the Graham scan only on inputs with small h, like clustered points forming a sparse hull, but remains slower overall for typical large-scale datasets.[24]
Andrew's monotone chain algorithm also runs in O(n log n) time, matching the Graham scan's complexity, but employs a different sorting strategy based on x-coordinate followed by y-coordinate, followed by linear scans to build upper and lower hulls separately.[2] This Cartesian sorting avoids the computational expense and numerical issues of polar-angle calculations, rendering it more robust to collinear points and degenerate cases where angles may be undefined or imprecise.[5] While the Graham scan's polar ordering provides an intuitive counterclockwise traversal around the hull, Andrew's method is often viewed as a refined variant that simplifies implementation without sacrificing efficiency.[25]
For output-sensitive scenarios, Chan's algorithm improves upon the Graham scan with an O(n log h) time bound, making it asymptotically optimal for sparse hulls where h << n.[26] It combines elements of the Graham scan—applying it to small subsets of points—and the Jarvis march, using a multi-phase approach that partitions the input, computes sub-hulls, and merges them via gift-wrapping steps on candidate hull points.[27] This hybrid design excels when the hull is small, as the logarithmic factor scales with h rather than n, but it introduces additional overhead from subset management and merging, which can make it more complex to implement than the Graham scan.[24] Empirical evaluations show Chan's algorithm underperforming the Graham scan in some practical settings due to these extra computations, despite its theoretical advantages for sparse distributions.[24]
The Quickhull algorithm, a randomized divide-and-conquer method, achieves expected O(n log n) time similar to the Graham scan but adapts to input geometry like quicksort, potentially running faster in practice for certain point distributions by recursively partitioning based on extreme points and discarding irrelevant regions.[28] Its worst-case time is O(n²), however, on adversarial inputs where partitions remain unbalanced, whereas the Graham scan's sorting ensures consistent performance.[28] Quickhull often proves quicker empirically for evenly distributed points, but the Graham scan's deterministic nature and angle-based intuition make it more reliable for general use without randomization risks.[24]
Unlike output-sensitive algorithms such as Chan's or Jarvis march, the Graham scan is non-adaptive to hull size h, as its sorting step processes all n points regardless of sparsity, which limits scalability in highly sparse cases but ensures predictable O(n log n) behavior.[23] Its straightforward polar sorting and stack-based scanning render it particularly favored in educational contexts for illustrating convex hull concepts, as the angular ordering visually aligns with the hull's boundary traversal.[29] The Graham scan is thus ideally chosen for datasets of moderate size n where balanced performance is needed, or when polar ordering facilitates visualization and debugging in geometric applications.[2]
Variants and Improvements
One notable variant of the Graham scan is Andrew's monotone chain algorithm, which sorts the points lexicographically by x-coordinate (and then by y-coordinate for ties) before constructing the convex hull by building the lower and upper hulls in separate linear passes.[30] This approach maintains the O(n log n) time complexity dominated by sorting while avoiding the polar angle computation, thereby improving numerical robustness against floating-point precision issues in degenerate cases like collinear points.[30]
The Graham scan generalizes poorly to three dimensions because its reliance on polar angle sorting lacks a straightforward equivalent in higher-dimensional spaces, leading to inefficient or undefined ordering.[5] For 3D convex hulls, alternative methods such as gift wrapping (Jarvis march) or incremental algorithms are preferred, as they achieve O(n log n) expected time without depending on angular coordinates.[5]
Parallel implementations of the Graham scan distribute the sorting and scanning phases across multiple threads or processors to handle large datasets efficiently. For instance, the gScan algorithm preprocesses points on the GPU through two filtering rounds to discard interior points and partition by angle, followed by CPU-based hull finalization, yielding speedups of 6x to 7x over traditional implementations for up to 20 million points.[31]
Improvements to the Graham scan often incorporate preprocessing to eliminate interior points early, such as filtering via a bounding quadrilateral or eight-vertex polygon constructed from extreme points, which reduces the input size in O(n time and accelerates the subsequent scan by up to 77 times for dense point sets.[32] Hybrid approaches combine the Graham scan with divide-and-conquer strategies, such as integrating quickhull's partitioning for initial hull approximation before applying the scan, to scale better for very large datasets by balancing recursive decomposition with linear scanning.[33]