Probabilistic method
The probabilistic method is a non-constructive proof technique in mathematics, primarily employed in combinatorics, that demonstrates the existence of combinatorial structures possessing specified properties by showing that a randomly chosen object from a suitable probability space satisfies those properties with positive probability, without the need to explicitly construct such an object.[1] Pioneered by Paul Erdős in his 1947 paper "Some remarks on the theory of graphs," where it was first applied to obtain lower bounds on Ramsey numbers, the method marked a significant departure from traditional constructive approaches in discrete mathematics.[2] Although an earlier precursor appeared in Tibor Szele's 1943 proof that there exist tournaments with many Hamiltonian paths, Erdős's contribution is widely regarded as the foundational moment that popularized and systematized the technique.[1]

At its core, the probabilistic method applies basic probability tools to random models such as the Erdős–Rényi random graph G(n,p), in which each possible edge among n vertices is included independently with probability p.[1] Key principles include the first moment method, which uses the linearity of expectation to show that the expected number of "bad" events is less than 1, so that with positive probability none of them occur; the second moment method, which employs variance to demonstrate concentration around the mean; and the Lovász Local Lemma, introduced by Erdős and László Lovász in 1975, which handles dependencies among rare events to prove joint avoidance with positive probability.[1] Concentration inequalities, such as Chernoff bounds and Azuma's inequality for martingales, further refine these arguments by providing tail probabilities for deviations of random variables.[1] These tools enable proofs of existence for diverse structures, including bipartite subgraphs with at least half the edges of any given graph and tournaments avoiding specific substructures.[1]

The method's influence extends beyond pure combinatorics to theoretical computer science, coding theory, statistical physics, and algorithm design, where it informs derandomization techniques that convert probabilistic constructions into deterministic ones.[1] Notable applications include lower bounds on the clique number of random graphs, thresholds for the emergence of subgraphs such as triangles in G(n,p), and existence results in circuit complexity, such as Håstad's switching lemma for proving lower bounds on constant-depth circuits.[1] In number theory, related arguments underlie results on the typical number of prime factors of an integer, as in the Hardy–Ramanujan theorem.[3] Over time, extensions such as the Janson inequalities for dependent events and the Poisson paradigm have broadened its scope, making it one of the most versatile tools in modern discrete mathematics.[1]

Overview
Definition and Basic Principle
The probabilistic method is a non-constructive proof technique in combinatorics that establishes the existence of mathematical objects or structures satisfying certain properties by employing arguments from probability theory. Rather than explicitly constructing such an object, the method defines a suitable probability space over a universe of potential objects and demonstrates that the probability of selecting an object with the desired properties is strictly positive. Since an event of positive probability in a finite probability space must contain at least one outcome, at least one such object exists. The approach was first introduced by Paul Erdős in 1947 to solve problems in extremal graph theory.[4]

Formally, consider a finite sample space \Omega representing all possible objects (e.g., graphs, colorings, or sequences), equipped with a probability measure \Pr under which each outcome has positive probability. Let X be a random element of \Omega drawn according to \Pr, and let S \subseteq \Omega be the event consisting of outcomes with the desired property. If \Pr[X \in S] > 0, then |S| > 0, proving the existence of at least one object in S. This basic principle relies on the fact that probability measures on finite spaces are normalized sums of point masses, so any positive probability guarantees a non-zero count of favorable outcomes. The non-constructive nature stems from the fact that the proof need not identify a specific element of S; it only bounds the probability from below, often using tools such as expectation (detailed in later sections).[4]

To apply the method, one must first define a probability space tailored to the combinatorial context. A probability space (\Omega, \Pr) consists of the sample space \Omega and a measure \Pr: 2^\Omega \to [0,1] satisfying \Pr[\Omega] = 1 and countable additivity. In combinatorial settings, \Omega is often the set of all graphs on n labeled vertices with edges included independently with probability p (the Erdős–Rényi model G(n,p)), or the set of all proper k-colorings of a graph's vertices chosen uniformly at random. Events are subsets of \Omega, such as the set of graphs containing no triangle, and random variables map outcomes to numerical values, such as the size of the largest clique in a random graph. These prerequisites from probability theory enable the analysis of \Pr[S] without enumerating \Omega exhaustively.[4]

A simple illustrative example is the existence of a sequence of n fair coin tosses with strictly more heads than tails. Let \Omega = \{H, T\}^n be the sample space of all possible sequences, with uniform probability \Pr[\omega] = 2^{-n} for each \omega \in \Omega. Define S \subseteq \Omega as the set of sequences with more than n/2 heads (assume n is even, so that ties are possible). The number of heads H follows a binomial distribution \mathrm{Bin}(n, 1/2), so \Pr[H > n/2] = \sum_{k = n/2 + 1}^n \binom{n}{k} (1/2)^n. This probability is positive, as it includes terms such as the all-heads sequence with probability 2^{-n} > 0, and symmetry gives \Pr[H > n/2] = (1 - \Pr[H = n/2])/2 > 0 for n > 0. Thus |S| > 0, proving such a sequence exists without constructing it explicitly.[4]
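The reasoning in this example can be checked numerically. The following Python sketch (illustrative only; the function name and the chosen values of n are not part of the source) computes \Pr[H > n/2] exactly and confirms that it is positive, which by the argument above already implies that a favorable sequence exists.

```python
from math import comb

def prob_more_heads(n: int) -> float:
    """Exact probability that n fair coin tosses show strictly more heads than tails."""
    favorable = sum(comb(n, k) for k in range(n // 2 + 1, n + 1))
    return favorable / 2 ** n

for n in (2, 4, 10, 100):
    p = prob_more_heads(n)
    # A positive probability means the set S of favorable sequences is non-empty.
    print(n, p, p > 0)
```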
Historical Development
The probabilistic method emerged as a transformative tool in combinatorics through its early applications in the mid-20th century, building on nascent probabilistic ideas in the field. Although the Hungarian mathematician Tibor Szele employed probabilistic arguments in 1943 to prove the existence of tournaments with at least n!/2^{n-1} Hamiltonian paths, this was an isolated use rather than a systematic approach.[1] Paul Erdős is widely regarded as the pioneer of the modern probabilistic method, introducing it in his 1947 paper published in the Bulletin of the American Mathematical Society, where he demonstrated the lower bound R(k,k) > 2^{k/2} for Ramsey numbers (for k \geq 3) by showing that a random 2-coloring of the edges of the complete graph K_n yields, with positive probability, a coloring without monochromatic cliques of size k.[5] This work marked the method's debut as a non-constructive technique for proving existential results in extremal graph theory.[6]

Erdős continued to refine and expand the method in subsequent decades, applying it to diverse problems that highlighted its versatility. In 1959, he used probabilistic arguments to establish the existence of graphs with arbitrarily large girth and chromatic number, resolving a key question in extremal graph theory by considering random graphs and bounding the probability of short cycles.[1] The 1970s saw significant growth in the method's scope, with Erdős and collaborators, including Joel Spencer, leveraging linearity of expectation to address problems in discrepancy theory and combinatorial constructions, such as bounding the discrepancy of set systems.[1] A pivotal advance came in 1975, when Erdős and László Lovász introduced the Local Lemma in their paper on 3-chromatic hypergraphs, providing a framework for handling dependencies among bad events and proving that certain configurations occur with positive probability even when the relevant events are correlated.[1]

The publication of The Probabilistic Method by Noga Alon and Joel H. Spencer in 1992 solidified the field's foundations, serving as a comprehensive reference that cataloged core techniques and applications up to that era, with subsequent editions incorporating further developments.[1] Post-2000, the method gained prominence in theoretical computer science, underpinning randomized algorithms, derandomization techniques, and approximation algorithms for NP-hard problems, while also influencing areas such as property testing and online algorithms.[1] This evolution shifted combinatorial mathematics from predominantly constructive proofs toward existential ones, fostering breakthroughs in fields ranging from discrepancy theory to the analysis of random graphs and algorithmic design.[6]
Core Techniques
Probabilistic Existence via Expectation
Probabilistic existence via expectation is a foundational technique in the probabilistic method, relying on the expected value of indicator random variables to demonstrate that a combinatorial object with desired properties exists with positive probability. An indicator random variable I_A for an event A is defined as I_A = 1 if A occurs and I_A = 0 otherwise, with expectation \mathbb{E}[I_A] = \Pr[A].[7] To prove existence, consider a probability space over potential objects (e.g., random graphs or orientations), and define "bad events" whose avoidance yields the desired structure. Let X = \sum I_{A_i} be the sum of indicator variables for these bad events A_i, so that X counts their occurrences and \mathbb{E}[X] = \sum \Pr[A_i]. If \mathbb{E}[X] < 1, then the probability that no bad event occurs is positive, implying the existence of an object avoiding all of them.[7]

This follows from the inequality \Pr[X \geq 1] \leq \mathbb{E}[X], a direct consequence of the non-negativity and integrality of X together with the definition of expectation: \mathbb{E}[X] = \sum_{k=0}^\infty k \Pr[X = k] \geq 1 \cdot \Pr[X \geq 1]. Thus \mathbb{E}[X] < 1 forces \Pr[X \geq 1] < 1, and hence \Pr[X = 0] > 0.[7]

A classic application is the existence of tournaments without large transitive subtournaments. Consider a random tournament on n vertices, where each edge is oriented independently with equal probability. For a fixed subset S of k vertices, let the indicator I_S be 1 if the subtournament on S is transitive (i.e., admits a total ordering consistent with all orientations). Since exactly k! of the 2^{\binom{k}{2}} possible orientations of the edges within S are transitive (one for each ordering of S), \Pr[I_S = 1] = k! \, 2^{-\binom{k}{2}}. Letting X = \sum_{|S|=k} I_S, we have \mathbb{E}[X] = \binom{n}{k} \, k! \, 2^{-\binom{k}{2}} \le n^k 2^{-\binom{k}{2}}. For k = \lceil 2 \log_2 n \rceil + 1, this expectation is less than 1, so \Pr[X = 0] > 0, proving that a tournament on n vertices exists without a transitive subtournament of size k.[8][7]

This approach excels when the bad events are independent or only weakly dependent, as the expectation bound directly controls the probability of avoidance. However, strong dependencies may make the probability \Pr[X=0] very small even when \mathbb{E}[X] < 1, limiting the method to proving mere existence rather than high-probability statements. For higher probabilities, or when \mathbb{E}[X] is large, alteration methods or concentration inequalities are needed (see the Alteration Methods and Concentration Inequalities sections).[7]
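As a quick numerical check (an illustrative sketch only; the function name and the sample values of n are assumptions, not part of the source), the following Python snippet evaluates \mathbb{E}[X] = \binom{n}{k} \, k! \, 2^{-\binom{k}{2}} for k = \lceil 2 \log_2 n \rceil + 1 and confirms that it stays below 1, which is all the argument requires.

```python
from math import comb, factorial, ceil, log2

def expected_transitive_subtournaments(n: int, k: int) -> float:
    """E[X] = C(n, k) * k! / 2^C(k, 2) for a uniformly random tournament on n vertices."""
    return comb(n, k) * factorial(k) / 2 ** comb(k, 2)

for n in (8, 16, 64, 256, 1024):
    k = ceil(2 * log2(n)) + 1
    ex = expected_transitive_subtournaments(n, k)
    # E[X] < 1 implies some tournament on n vertices has no transitive subtournament of size k.
    print(n, k, ex, ex < 1)
```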
Linearity of Expectation
The linearity of expectation is a key probabilistic tool asserting that for any finite collection of (possibly dependent) random variables X_1, \dots, X_n, \mathbb{E}\left[\sum_{i=1}^n X_i\right] = \sum_{i=1}^n \mathbb{E}[X_i]. This property extends to linear combinations, where constants can be factored out, and holds without requiring independence among the variables.[1]

In the probabilistic method, linearity facilitates existence proofs by bounding the expectation of a counting random variable X, often expressed as a sum of indicator variables X = \sum X_i for potential substructures. Here, \mathbb{E}[X] = \sum \mathbb{E}[X_i] = \sum \Pr(X_i = 1), and if \mathbb{E}[X] > 0, then \Pr(X \ge 1) > 0, guaranteeing the existence of a combinatorial object containing at least one such substructure.[1] This approach leverages average-case behavior to infer the existence of positive instances, even when direct constructions are elusive. A detailed example arises in bounding the maximum number of edges in a K_{s,t}-free bipartite graph via random edge selection and alteration.
Consider a complete bipartite graph K_{n,n} with parts A and B of size n, and include each possible edge independently with probability p.
Let X count the number of K_{s,t} subgraphs, so X = \sum I_{S,T} where the sum runs over all S \subseteq A with |S|=s, T \subseteq B with |T|=t, and I_{S,T} = 1 if all st edges between S and T are present.
By linearity of expectation, \mathbb{E}[X] = \binom{n}{s} \binom{n}{t} p^{st} \le n^{s+t} p^{st}, while the expected number of edges is n^2 p. Choosing p = c \, n^{-(s+t-2)/(st-1)} for a suitable constant c > 0 makes \mathbb{E}[X] \le \tfrac{1}{2} n^2 p, so the expected number of copies of K_{s,t} is at most half the expected number of edges. Deleting one edge from each copy of K_{s,t} destroys all of them while removing at most X edges, so \mathbb{E}[|E| - X] \ge \tfrac{1}{2} n^2 p = \Omega\!\left(n^{2-(s+t-2)/(st-1)}\right) > 0. Hence some K_{s,t}-free bipartite graph with \Omega\!\left(n^{2-(s+t-2)/(st-1)}\right) edges exists (see Alteration Methods for details); this lower bound falls short of the Kővári–Sós–Turán upper bound O(n^{2-1/s}) in general, although the exponents agree in the limit of large t.[1]

The method's primary advantage is its robustness to dependencies: unlike variance-based techniques that require bounding covariances, linearity computes expectations directly from marginal probabilities, simplifying analysis in intricate combinatorial models.[1]
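The parameter choice above can be sanity-checked numerically. The sketch below (an illustrative computation under the stated assumptions; the function name and sample values are not from the source) evaluates the expected edge count and the expected number of K_{s,t} copies for p = 2^{-1/(st-1)} n^{-(s+t-2)/(st-1)}, and reports the expected number of edges surviving the deletion step.

```python
from math import comb

def deletion_bound(n: int, s: int, t: int):
    """Expected edges, expected K_{s,t} copies, and expected surviving edges after
    deleting one edge per copy, in a random bipartite graph with parts of size n."""
    p = 0.5 ** (1 / (s * t - 1)) * n ** (-(s + t - 2) / (s * t - 1))
    expected_edges = n * n * p
    expected_copies = comb(n, s) * comb(n, t) * p ** (s * t)
    return expected_edges, expected_copies, expected_edges - expected_copies

for n in (100, 1000, 10000):
    edges, copies, surviving = deletion_bound(n, 2, 2)
    # For s = t = 2 the surviving-edge count grows like n^(4/3).
    print(n, round(edges), round(copies), round(surviving))
```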
Linearity also yields combinatorial lower bounds, such as guaranteeing a minimum number of desired substructures. For any graph with e edges, randomly bipartition the vertices (assigning each vertex to one side independently with probability 1/2); the expected number of crossing edges is e/2, so a bipartite subgraph with at least e/2 edges exists.[1] In random graphs G(n,p), the expected count of substructures such as triangles, \mathbb{E}[X] = \binom{n}{3} p^3, implies that some realizations contain at least this many, providing lower bounds on typical subgraph densities.[1]
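The bipartition argument also suggests a simple randomized procedure: sample partitions until one meets the expectation bound. The following Python sketch (illustrative only; the function name, the trial count, and the 5-cycle example are assumptions rather than source material) does exactly that.

```python
import random

def large_cut(vertices, edges, trials: int = 1000) -> int:
    """Repeatedly 2-color the vertices uniformly at random and keep the largest cut found.
    Since the expected cut size is |E|/2, a cut with at least ceil(|E|/2) edges exists
    and is typically found after a few trials."""
    best = 0
    for _ in range(trials):
        side = {v: random.random() < 0.5 for v in vertices}
        cut = sum(1 for u, v in edges if side[u] != side[v])
        best = max(best, cut)
    return best

# Example: a 5-cycle has 5 edges, so some bipartition cuts at least 3 of them.
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
print(large_cut(vertices=range(5), edges=cycle_edges))
```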