Peterson's algorithm
Peterson's algorithm is a classic software-based mutual exclusion protocol for two concurrent processes sharing a critical section in a multiprocessor or multithreaded environment. Using only atomic reads and writes to two shared boolean flags and an integer turn variable, it ensures that only one process can execute the critical section at a time.
Proposed by Gary L. Peterson in 1981,[1] the algorithm addresses longstanding challenges in concurrent programming by providing a simple, deadlock-free solution that satisfies mutual exclusion, progress, and bounded waiting properties without relying on hardware support like test-and-set instructions. In the protocol, each process signals its intent to enter the critical section by setting its flag to true and then yields priority to the other process by setting the turn variable to the opponent's identifier; it then spins (busy-waits) until either the opponent's flag is false or the turn favors itself, after which it enters the critical section and exits by clearing its flag. This mechanism guarantees mutual exclusion because at most one process can satisfy the waiting condition simultaneously, as the turn variable breaks symmetry when both flags are true.
The algorithm's elegance lies in its brevity—typically implementable in just a few lines of code—and its proof of correctness, which debunks common myths about the requirements for mutual exclusion solutions, such as the necessity of complex busy-waiting schemes. It builds on earlier work like Dekker's algorithm but simplifies the logic, making it a foundational example in operating systems and concurrent programming education.[2] While originally for two processes, extensions to n-processes exist, though they increase complexity; the two-process version remains influential for illustrating key concurrency principles like liveness and fairness.[2]
Introduction
Overview
Peterson's algorithm is a software-based solution for achieving mutual exclusion between two processes that share a single resource through shared memory, without relying on specialized hardware instructions. Formulated by Gary L. Peterson in 1981, it provides a simple locking mechanism, built from ordinary loads and stores, that ensures only one process can access the critical section at a time, preventing race conditions in concurrent environments.
At its core, the algorithm employs two boolean flags—one for each process—and a shared turn variable to coordinate entry into the critical section. Each process signals its intent to enter by setting its flag and then yields the turn to the other process, politely waiting until the other has finished or relinquishes the turn. This design elegantly balances courtesy and fairness in process synchronization.
The algorithm satisfies three key properties essential for correct mutual exclusion: mutual exclusion, which guarantees that at most one process is in the critical section; progress, ensuring that if no process is in the critical section and at least one wishes to enter, the system will select one to proceed without deadlock; and bounded waiting, which bounds the number of times the other process can enter the critical section while a waiting process is denied access, preventing starvation. These properties make it a foundational example in concurrent programming.
While originally designed for two processes, Peterson's algorithm can be extended to n processes using a filter-based generalization.
Historical context
Peterson's algorithm was developed by Gary L. Peterson in 1981 during his research on concurrent computing and the design of synchronization primitives for shared-memory systems.[1] The algorithm emerged as a response to ongoing challenges in ensuring mutual exclusion without hardware support, building on foundational work in process synchronization.
Its primary motivation was to overcome the complexities of prior software-based solutions, notably Dekker's algorithm from 1965, which achieved mutual exclusion for two processes but required intricate reasoning for its proof of correctness. Peterson sought a more straightforward approach using minimal shared variables, making verification accessible while satisfying key properties like mutual exclusion and progress. The algorithm first appeared in Peterson's seminal paper "Myths About the Mutual Exclusion Problem," where he debunked common misconceptions in concurrent programming and introduced this elegant two-process solution.[1]
Since its publication, Peterson's algorithm has profoundly influenced synchronization literature, establishing itself as a cornerstone for understanding software-based concurrency control. It is featured prominently in authoritative operating systems textbooks, such as Abraham Silberschatz, Peter B. Galvin, and Greg Gagne's Operating System Concepts, which presents it as a classic example of a correct mutual exclusion mechanism. Similarly, Andrew S. Tanenbaum's Modern Operating Systems highlights its simplicity and role in illustrating critical-section solutions. Scholars have praised it as one of the most elegant two-process algorithms, appearing in nearly every major text on concurrent programming due to its clarity and pedagogical value.
Two-process version
Algorithm description
Peterson's algorithm provides a software-based solution for mutual exclusion between two concurrently executing processes, denoted P_0 and P_1, using only shared memory variables without relying on hardware primitives beyond atomic reads and writes.[3] The algorithm structures each process into distinct phases: an entry protocol to request access to the critical section, the critical section itself where shared resources are accessed, and an exit protocol to relinquish access. Two shared variables facilitate coordination: a boolean array flag[2], where flag[i] indicates whether process P_i wishes to enter the critical section (initially false for both), and an integer turn, which records the process that is tentatively granted priority (initially arbitrary).[3]
In the entry protocol, process P_i (with j = 1 - i denoting the other process) first sets flag[i] = true to signal its intent, then assigns turn = j to defer to the other process if necessary, and busy-waits in a loop until either flag[j] is false (indicating P_j is not interested) or turn ≠ j (indicating P_i has priority). Upon exiting the critical section, P_i simply sets flag[i] = false to withdraw its interest, allowing the other process to potentially proceed. This design ensures that the variables collectively enforce coordination by combining a declaration of intent with a willingness to yield.
Consider a scenario where both processes attempt to enter the critical section simultaneously, with interleaving of their actions. Suppose P_0 first sets flag[0] = true and turn = 1. Before P_0 checks its waiting condition, P_1 sets flag[1] = true and then turn = 0. Now, P_0 evaluates its loop: flag[1] is true, but turn = 0 ≠ 1, so the condition is false and P_0 proceeds to its critical section. Meanwhile, P_1's loop sees flag[0] true and turn = 0, so it waits. After P_0 completes its critical section and sets flag[0] = false, P_1's condition becomes false, granting it access.[4] In alternative interleavings, such as both setting flags before updating turn, the final turn value still favors only one process, preventing both from entering concurrently.
The algorithm avoids starvation through the turn variable, which enforces alternation by requiring a process to yield if the other has claimed priority first; upon exit, clearing the flag enables the waiting process to advance without indefinite postponement, satisfying bounded waiting where each process waits at most for one full execution of the other.[3] This mechanism provides an intuitive balance of courtesy and persistence in resource contention.
Pseudocode implementation
Peterson's algorithm for two processes uses shared variables to coordinate access to a critical section, ensuring mutual exclusion through busy-waiting. The shared variables consist of a boolean array flag[2] initialized to false, indicating each process's interest in entering the critical section, and an integer turn initialized to either 0 or 1, which records the process to which the right to enter has been tentatively yielded.[1]
The algorithm executes within an infinite loop for each process, allowing repeated attempts to enter the critical section. The following pseudocode presents the implementation for process i (where i is 0 or 1, and j = 1 - i denotes the other process):
do {
flag[i] = true; // i expresses interest in entering
turn = j; // i yields priority to j
while (flag[j] && turn == j); // busy-wait if j is interested and has priority
// critical section
flag[i] = false; // i withdraws interest
// remainder section
} while (true);
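The pseudocode above can be exercised directly with Python threads. Under CPython's global interpreter lock, individual reads and writes to the shared variables execute atomically and in program order, which approximates the sequentially consistent memory model the algorithm assumes; a C or C++ port would additionally need sequentially consistent atomics or fences. A minimal sketch (variable names mirror the pseudocode; the iteration count is illustrative):

```python
import sys
import threading

sys.setswitchinterval(1e-4)  # switch threads often so busy-waits resolve quickly

flag = [False, False]  # flag[i]: process i wants to enter the critical section
turn = 0               # whose turn it is to wait when both are interested
counter = 0            # shared resource protected by the lock

def worker(i, iterations):
    global turn, counter
    j = 1 - i
    for _ in range(iterations):
        flag[i] = True            # entry protocol: express interest
        turn = j                  # yield priority to the other process
        while flag[j] and turn == j:
            pass                  # busy-wait while j is interested and has priority
        counter += 1              # critical section (read-modify-write, not atomic)
        flag[i] = False           # exit protocol: withdraw interest

threads = [threading.Thread(target=worker, args=(i, 1000)) for i in (0, 1)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 2000: no increments are lost under the lock
```

Removing the entry and exit protocols typically makes the final count fall short of 2000, since the unprotected counter += 1 compiles to separate load and store bytecodes that can interleave.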
This structure applies symmetrically to both processes, with indices swapped accordingly.[3]
The implementation relies on the atomicity of read and write operations to the shared variables flag and turn, assuming a sequentially consistent memory model where these operations appear to occur instantaneously and in program order. No specialized hardware instructions, such as test-and-set, are required, making the algorithm suitable for software-only synchronization in shared-memory systems.[1]
To illustrate the variable states during execution, consider a scenario where both processes attempt to enter the critical section simultaneously. The table below shows a representative sequence of states (assuming process 0 starts first, and turn initializes to 0):
| Step | Process 0 action | Process 1 action | flag[0] | flag[1] | turn | Description |
|---|---|---|---|---|---|---|
| 0 | Initialize | Initialize | false | false | 0 | Initial state |
| 1 | flag[0] = true; turn = 1 | – | true | false | 1 | Process 0 expresses interest and yields |
| 2 | – | flag[1] = true; turn = 0 | true | true | 0 | Process 1 expresses interest and yields |
| 3 | while (flag[1] && turn == 1) is false; enters CS | while (flag[0] && turn == 0) is true; busy-waits | true | true | 0 | Process 0 enters the critical section; process 1 waits |
| 4 | flag[0] = false | loop condition now false; enters CS | false | true | 0 | Process 0 exits; process 1 enters the critical section |
This example highlights how the variables evolve to enforce ordering without deadlock, though full traces vary based on scheduling.[3]
Correctness properties
Peterson's two-process algorithm satisfies three fundamental correctness properties essential for mutual exclusion in concurrent programming: mutual exclusion, progress, and bounded waiting. These properties ensure safe, deadlock-free, and fair access to the critical section under the assumption of atomic reads and writes to shared variables in a sequentially consistent memory model.
Mutual exclusion guarantees that at most one process can execute in its critical section at any time, preventing concurrent access to shared resources. Informally, suppose both processes P_i and P_j (where i ≠ j) are in their critical sections simultaneously. This would require both to have set their flags to true (indicating intent to enter) and exited their waiting loops. For P_i to exit its loop, the condition flag[j] == true && turn == j must be false; since flag[j] is true, this implies turn = i. Similarly, for P_j, turn = j. This leads to a contradiction, as turn cannot simultaneously equal both i and j. Thus, mutual exclusion holds via the interplay of the flags and the single shared turn variable.
Progress ensures that if no process is in its critical section and at least one process wishes to enter, then some process will eventually enter without indefinite postponement or deadlock. In the algorithm, a process signals its intent by setting its flag but yields the turn to the other. If only one process is interested (flag[j] = false), it enters immediately, as the waiting condition fails. If both are interested, the process that last sets turn to the other's index waits, allowing the other to enter first; upon its exit (setting its flag to false), the waiting process can proceed. This mechanism prevents scenarios where both processes perpetually block each other.
Bounded waiting provides fairness by ensuring that a process attempting to enter its critical section will do so after a bounded number of entries by the other process (in this case, at most one). When P_i is waiting (with flag[i] = true and the loop condition true), P_j may enter if it has priority via turn = j. However, once P_j exits and sets flag[j] = false, the condition for P_i becomes false, allowing immediate entry on its next attempt. No further delays occur beyond this single entry by P_j, eliminating starvation for two processes.
The algorithm assumes a memory model with sequential consistency, where all memory operations appear to occur in a total order consistent with each process's program order, and reads return the most recent write value. This relies on atomicity of assignments to flag and turn, which may not hold without additional barriers on modern hardware prone to reordering. Detailed formal proofs of these properties are provided in subsequent analyses.
Proofs of correctness
Mutual exclusion proof
The mutual exclusion property of Peterson's algorithm ensures that at most one of the two processes can be in its critical section at any time. This is established through a combination of invariants maintained by the shared variables and a proof by contradiction. The algorithm uses a two-element boolean array flag (entries flag[0] and flag[1], initially false) to indicate each process's interest in entering the critical section, and an integer turn (initially 0 or 1) to break symmetry when both processes contend.[1][5]
A key observation is that turn holds exactly one value at any time, so when both flag[0] and flag[1] are true, the waiting condition is satisfied for exactly one process, forcing it to wait while the other may proceed. This holds because when a process sets its flag to true, it immediately sets turn to the other process's index before checking the condition, ensuring that if both processes are contending, the last writer of turn yields while the other passes its test.[1][7]
To prove mutual exclusion, assume for contradiction that both processes P0 and P1 are simultaneously in their critical sections. For P0 to have entered, it must have exited its while loop, meaning the condition flag[1] && turn == 1 was false at that moment. Similarly, for P1 to have entered, flag[0] && turn == 0 was false. Since both are in the critical section, their flags remain true (flag[0] = true and flag[1] = true), so the conditions simplify to turn ≠ 1 for P0 and turn ≠ 0 for P1, implying turn = 0 and turn = 1 simultaneously, which is impossible. Thus, the assumption leads to a contradiction, proving mutual exclusion.[5][7]
A supporting lemma is that the while loop condition (flag[j] && turn == j) (where j is the other process's index) ensures at most one process can exit the loop when both flags are true, as turn can match only one index at a time. This lemma follows directly from the invariant and the atomic updates in the algorithm.[1][5]
Edge cases preserve mutual exclusion as well. If one process is not interested (e.g., flag[1] = false), P0's while loop condition is immediately false, allowing entry without contention, while P1 cannot enter until it sets its flag. Sequential entries, where one process completes its critical section before the other attempts entry, also hold trivially, as the exiting process sets its flag to false, satisfying the waiting condition for the entrant.[7][5]
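Because the two-process algorithm has only finitely many control states, the contradiction argument can also be checked exhaustively. The sketch below is a toy model checker (not from the original paper): each labeled statement is one atomic step, the loop test is modeled as a single atomic evaluation for brevity, and a breadth-first search explores every reachable interleaving to confirm that no state places both processes in the critical section:

```python
from collections import deque

# Program counters: 0 flag[i]=true, 1 turn=j, 2 wait test, 3 in CS, 4 flag[i]=false
def step(state, i):
    pcs, flags, turn = state
    pc, flag = list(pcs), list(flags)
    j = 1 - i
    if pc[i] == 0:
        flag[i] = True; pc[i] = 1
    elif pc[i] == 1:
        turn = j; pc[i] = 2
    elif pc[i] == 2:
        if not (flag[j] and turn == j):   # wait condition false: enter the CS
            pc[i] = 3
    elif pc[i] == 3:
        pc[i] = 4                          # leave the critical section
    else:  # pc[i] == 4
        flag[i] = False; pc[i] = 0         # exit protocol, then retry forever
    return (tuple(pc), tuple(flag), turn)

init = ((0, 0), (False, False), 0)
seen, frontier = {init}, deque([init])
violations = 0
while frontier:
    state = frontier.popleft()
    if state[0] == (3, 3):                 # both processes in the CS at once
        violations += 1
    for i in (0, 1):
        nxt = step(state, i)
        if nxt not in seen:
            seen.add(nxt)
            frontier.append(nxt)
print(violations)  # 0: mutual exclusion holds in every reachable interleaving
```

Modeling the loop test as atomic only removes interleavings; Peterson's algorithm is correct even when flag[j] and turn are read separately, so the coarser model is a sound illustration rather than a full proof.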
Progress proof
The progress property for Peterson's two-process mutual exclusion algorithm ensures that the system as a whole cannot become deadlocked: if the critical section is unoccupied and at least one process expresses interest in entering it (by setting its flag to true), then some process will eventually enter the critical section.[8] This property guarantees forward movement, preventing scenarios where interested processes are indefinitely delayed despite the resource being available.[9]
To prove progress, consider the entry protocol where each process i (0 or 1) sets flag[i] = true, then sets turn = 1 - i (yielding to the other process), and waits in the loop while (flag[1 - i] && turn == 1 - i). The proof proceeds by cases based on the number of interested processes, showing that the waiting loops terminate and the critical section is entered.[10]
If only one process, say process 0, is interested (flag[0] = true and flag[1] = false), process 0 sets turn = 1 and evaluates its loop condition flag[1] && turn == 1, which is false && true = false. Thus, the loop terminates immediately, and process 0 enters the critical section without delay. The symmetric case holds for process 1 alone.[8]
If both processes are interested (flag[0] = flag[1] = true), each sets turn to the other's index before waiting. Due to finite execution steps, one process updates turn last. Suppose process 0 updates turn = 1 after process 1 has set turn = 0. Then, process 1's loop condition flag[0] && turn == 0 becomes true && false = false (since turn = 1), so process 1 enters the critical section while process 0 waits (flag[1] && turn == 1 is true && true = true). The reverse holds if process 1 updates last. In either case, the turn variable forces exactly one process to exit the loop and enter.[10]
To confirm no indefinite blocking for the system, assume by contradiction that both processes are stuck in their waiting loops. This requires flag[1] && turn == 1 for process 0 and flag[0] && turn == 0 for process 1, implying turn == 1 and turn == 0 simultaneously—a contradiction, as turn holds a single value (0 or 1). Thus, at least one loop must terminate, ensuring entry. Formally, using leads-to relations (denoted →), if process i reaches the wait state (on the loop), then (wait state of i) → (acquired by i), established by invariants on flags and turn, and the fact that the other process either yields or eventually exits (clearing its flag).[8][10]
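The contradiction at the core of this argument can be spelled out mechanically: when both flags are true, no single value of turn satisfies both wait conditions at once, so at least one loop must terminate. A toy enumeration:

```python
flag = [True, True]  # both processes have declared interest
# P0 blocks while (flag[1] and turn == 1); P1 blocks while (flag[0] and turn == 0)
both_blocked = [turn for turn in (0, 1)
                if (flag[1] and turn == 1) and (flag[0] and turn == 0)]
print(both_blocked)  # []: no value of turn can block both processes at once
```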
Upon exiting the critical section, a process sets its flag to false, which falsifies the loop condition for any waiting process (now flag[1 - i] == false), allowing it to proceed immediately. This flag-clearing mechanism, combined with the turn dynamics, ensures that waiting loops always terminate after finite steps, maintaining system progress across entry-exit cycles.[9]
Bounded waiting proof
The bounded waiting property for Peterson's two-process algorithm ensures that a process requesting access to the critical section will enter it after at most one entry by the other process, thereby preventing indefinite starvation while maintaining fairness.[3] This distinguishes the algorithm from solutions like simple test-and-set mechanisms, which may allow unbounded preemptions without additional structures such as queues.[11]
To prove this property, assume without loss of generality that process P_i (where i \in \{0, 1\}) is attempting to enter the critical section, and let P_j be the other process (j = 1 - i). Process P_i executes flag[i] = true; turn = j; and then waits in the loop while (flag[j] && turn == j);. The proof proceeds by cases based on the state of flag[j] when P_i reaches the wait condition.[3]
Case 1: If flag[j] == false, then P_j is not interested in entering the critical section. The wait condition evaluates to false immediately, allowing P_i to enter without delay. In this scenario, P_j performs zero entries to the critical section while P_i waits.[12]
Case 2: If flag[j] == true, then P_j is also attempting to enter. Suppose P_i set turn = j last, so turn == j. Then P_j's wait condition while (flag[i] && turn == i); evaluates to false because turn != i, so P_j proceeds to the critical section first, while P_i remains in its wait loop. Upon completing its critical section, P_j executes flag[j] = false. This causes P_i's wait condition to become false (flag[j] is now false), granting P_i entry. During this interval, P_j enters the critical section exactly once. (If P_j instead set turn = i after P_i's assignment, then turn != j and P_i enters first, so P_j performs zero entries while P_i waits.)[3][12]
In neither case can P_j enter the critical section more than once while P_i waits from its initial request, as the turn variable enforces alternation after the first entry by P_j. If P_j attempts to re-enter before P_i does, it would set turn = i and then wait due to flag[i] == true and turn == i, but this does not affect the bound established from P_i's request point.[11]
Consequently, the algorithm satisfies bounded waiting with a strict bound of one: the waiting time for P_i is at most the execution time of one critical section by P_j.[3]
Generalization to multiple processes
Filter algorithm overview
The filter algorithm generalizes Peterson's two-process mutual exclusion solution to support an arbitrary number n of processes by introducing a hierarchical, layered structure that progressively narrows contention. Each process attempting to enter the critical section must pass through n-1 successive filter stages, with the final stage emulating the two-process Peterson lock using the same shared variables to resolve the last remaining contention. This design ensures that only one process ultimately gains exclusive access, building directly on the shared-memory primitives of flags and turn variables from the binary case.[5][7]
At each filter stage k (for k = 1 to n-1), the algorithm reduces the number of actively competing processes from n-k+1 to n-k, effectively eliminating one contender per level through a mechanism akin to a two-way exclusion but adapted for the current subgroup. Local variables, including level indicators and victim selectors per stage, track participation and yield decisions, allowing processes to advance asynchronously while blocking others as needed. This level-by-level filtering mimics a tournament elimination, where processes announce their intent at a given level and busy-wait if designated as the victim in a conflict.[5][7]
Developed by Gary L. Peterson in 1981 as part of his foundational work on mutual exclusion myths, the filter algorithm innovates by maintaining mutual exclusion and progress guarantees without preserving first-in-first-out (FIFO) ordering, permitting processes to be overtaken unboundedly across stages.[13] It achieves this with O(n) space complexity for the per-stage variables and O(n) worst-case time per acquisition, as each process traverses all n-1 filters plus the final lock.[14]
Extension mechanism
The extension mechanism of Peterson's algorithm to n processes employs a filter structure consisting of n-1 stages, where each stage progressively reduces the number of contending processes from n to 1 by eliminating one contender per stage. This staged approach ensures that contention is localized and managed systematically, with processes advancing only after yielding to higher-priority or earlier-arriving contenders at each level.[5]
The algorithm maintains two key shared arrays: level[0..n-1], which records the highest stage (or level) each process i has attempted to enter (initialized to 0 for all processes), and victim[1..n-1], which designates the process that must yield at each stage k (initialized to an invalid value such as -1). To acquire the lock, a process i (numbered from 0 to n-1) proceeds through the stages sequentially, starting from k=1 up to k=n-1. At stage k, process i first sets level[i] = k to declare its intent to advance, then sets victim[k] = i to indicate its willingness to yield if necessary. It then busy-waits with the condition while (there exists j ≠ i such that level[j] ≥ k and victim[k] == i). This wait ensures that if any other process has reached stage k or higher and process i is designated to yield, i defers until that process advances or the victim changes. Upon exiting the loop, process i has effectively filtered past stage k and proceeds to the next.[5][14]
After completing stage n-1, the process directly enters the critical section, as the final stage ensures that at most one process can proceed. To release the lock, the process resets level[i] = 0 after exiting the critical section, allowing blocked processes to progress through the stages.
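A direct transcription of this mechanism into Python threads is sketched below. As with the two-process version, it leans on CPython's effectively sequential execution of individual reads and writes; the process count and iteration count are illustrative, and a production lock would use hardware atomics instead:

```python
import sys
import threading

sys.setswitchinterval(1e-4)   # switch threads often so busy-waits resolve quickly

N = 3                         # number of processes (illustrative)
level = [0] * N               # level[i]: highest stage process i has entered
victim = [-1] * N             # victim[k]: process designated to yield at stage k
counter = 0                   # shared resource protected by the lock

def acquire(i):
    for k in range(1, N):     # traverse stages 1 .. N-1
        level[i] = k          # announce intent to advance to stage k
        victim[k] = i         # volunteer to yield at this stage
        # wait while some other process is at stage k or higher
        # and this process is still the stage's designated victim
        while victim[k] == i and any(level[j] >= k for j in range(N) if j != i):
            pass

def release(i):
    level[i] = 0              # drop out of every stage

def worker(i, iterations):
    global counter
    for _ in range(iterations):
        acquire(i)
        counter += 1          # critical section
        release(i)

threads = [threading.Thread(target=worker, args=(i, 200)) for i in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 600: every increment occurs under mutual exclusion
```

Setting N = 2 reduces this to a single stage whose level, victim, and wait condition play exactly the roles of flag, turn, and the busy-wait loop in the two-process algorithm.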
Properties for n processes
The filter algorithm, which generalizes Peterson's two-process mutual exclusion solution to n processes, preserves mutual exclusion by structuring contention resolution into n-1 sequential stages, where each stage emulates the two-process Peterson algorithm to ensure that at most one process advances from any group of contenders.[1] This layered approach guarantees that only a single process can traverse all stages and enter the critical section simultaneously, inheriting the safety property of the base algorithm without requiring hardware support.[1] The mechanism relies on shared variables to filter out all but one contender per level, maintaining the invariant that no two processes occupy the critical section at the same time.[15]
Progress is also guaranteed in the n-process filter algorithm, as each stage ensures that at least one process from the contending set advances to the next level, preventing indefinite postponement and ensuring that some process eventually enters the critical section if it seeks access.[1] This property holds because the two-process Peterson sub-algorithm at every filter level satisfies progress, allowing the overall structure to resolve contention without deadlock across all n processes.[15] Consequently, the algorithm achieves lockout-freedom, meaning no process is permanently excluded from entering the critical section, even under adversarial scheduling.[1]
In contrast to the two-process version, which provides strict bounded waiting (limited to one entry by another process), the filter algorithm relaxes this property for n processes, as a waiting process can be overtaken repeatedly at different stages, resulting in a worst-case wait bounded by O(n) critical section entries rather than a constant.[15] While the wait remains finite due to the absence of starvation, the lack of a fixed bound means it does not satisfy the traditional definition of bounded waiting for arbitrary n.[15] Additionally, the algorithm does not enforce first-in-first-out (FIFO) fairness, allowing later-arriving processes to bypass earlier ones through the filter stages.[15]
Applications and variants
Practical uses
Peterson's algorithm plays a central role in operating systems education, where it is routinely presented as a classic software-based solution to the critical section problem for two processes. It appears in seminal textbooks such as Operating System Concepts by Silberschatz, Galvin, and Gagne, which use it to teach fundamental concepts of process synchronization, mutual exclusion, progress, and bounded waiting without relying on hardware support. Implementations in languages like Java and C++ are frequently developed in academic settings to demonstrate these principles, allowing students to simulate concurrent execution and observe the algorithm's behavior in controlled environments.[16]
In practical deployments, particularly within embedded systems, Peterson's algorithm found application in low-level, hardware-agnostic scenarios such as multiprocessor kernels lacking robust atomic instructions. A notable example was its use in the Linux kernel (up to version 5.7, released in 2020) for NVIDIA's Tegra mobile processors, an ARM-based system-on-chip, where it implemented synchronization for sleep state management and low-level locks in files like sleep-tegra20.S[17]. This deployment highlighted its utility in resource-constrained environments like mobile and embedded devices, ensuring mutual exclusion through shared memory variables without depending on specialized hardware primitives.
Despite these uses, Peterson's algorithm's practicality is constrained by its reliance on a strong memory consistency model, assuming sequential instruction execution and atomic memory accesses.[11] On modern processors with out-of-order execution, pipelining, and cache coherency issues, it can fail to maintain mutual exclusion unless augmented with memory barriers or fences, which introduce overhead and complexity.[11] As a result, contemporary implementations favor hardware-supported mechanisms like load-link/store-conditional (LL/SC) instructions or standard libraries such as POSIX threads (pthreads) mutexes for efficiency and portability.
Regarding performance, the algorithm for two processes operates in constant time, O(1), making it suitable for simple dual-process synchronization.[18] Its generalization to n processes, often via a filter or tournament structure, incurs linear O(n) complexity in the worst case, as each process may traverse up to n-1 stages, limiting scalability in multi-process environments.[19]
Comparison with related algorithms
Peterson's algorithm builds upon earlier software-based mutual exclusion solutions, most notably Dekker's algorithm, the first known correct implementation using only atomic reads and writes for two processes. Published in 1965 by Edsger W. Dijkstra and attributed to Th. J. Dekker, Dekker's solution relies on a more intricate set of shared variables and conditional checks to ensure mutual exclusion, progress, and bounded waiting, making it significantly more complex than Peterson's streamlined approach with just two variables (flags and turn).[19] The primary contrast lies in simplicity: Peterson's algorithm reduces the number of synchronization points and eliminates some of the redundancy in Dekker's, achieving the same properties with fewer instructions while maintaining fairness through bounded waiting.[2]
Subsequent developments extended Peterson's ideas to multiple processes and improved scalability. Lamport's bakery algorithm, proposed in 1974, generalizes mutual exclusion to n processes using a ticket-taking mechanism that enforces first-in, first-out (FIFO) ordering, unlike Peterson's two-process focus.[20] This FIFO property provides stronger fairness than Peterson's bounded waiting, preventing indefinite postponement in high-contention scenarios, though it requires more memory for ticket counters.[21] Building on queue-like concepts similar to the filter extension of Peterson's algorithm, the bakery approach influenced later fair locking mechanisms but incurs higher overhead due to pairwise comparisons among processes.[7]
In terms of scalable hardware-assisted alternatives, Peterson's algorithm offers an advantage by relying solely on standard memory operations without special atomic instructions like test-and-set (TAS) or compare-and-swap (CAS). TAS-based locks, a hardware primitive available since the 1960s, implement mutual exclusion through a simple spinlock on a shared boolean, but they suffer from high contention as all processes busy-wait on the same location, leading to poor scalability on multiprocessors.[5] CAS, an extension allowing conditional updates, enables more efficient non-blocking algorithms but still requires hardware support that Peterson's avoids, making it preferable in environments without such primitives.[22]
Variants inspired by Peterson's queue-filter generalization include ticket locks and CLH locks, which enhance fairness and performance for multiprocessors. Ticket locks, akin to the bakery algorithm, assign sequential numbers to processes for FIFO access, reducing contention compared to TAS while inheriting Peterson's software-only ethos.[23] The CLH (Craig, Landin, Hagersten) queue lock, introduced in 1993, uses a linked-list structure for local spinning, improving cache locality and scalability over array-based queues; it evolves the filter idea with constant space per thread and bounded waiting, influenced by Peterson's emphasis on progress without hardware dependencies.[24] Overall, Peterson's algorithm has shaped non-blocking synchronization by demonstrating that mutual exclusion can be achieved portably, paving the way for these evolutions in concurrent systems.[25]