
Thundering herd problem

The thundering herd problem is a concurrency issue in operating systems where numerous processes or threads, blocked on a shared wait queue awaiting an event such as I/O completion or resource availability, are all awakened simultaneously when the event occurs. Because only one waiter can typically proceed while the others compete and then return to waiting, the mass wakeup causes intense contention for the resource and significant overhead from context switches, cache invalidations, and wasted CPU time. The term "thundering herd" originates from early Unix systems, evoking the image of a herd of animals stampeding toward a single resource.

The problem arises primarily in multiprocessor environments and shared-memory scenarios, such as when multiple tasks wait for a locked page during operations like page faults or I/O in the Linux kernel. In such cases, a single zone-wide wait queue could wake all processes at once, exacerbating inefficiency; to mitigate this, the kernel employs a hashed table of wait queues (e.g., zone->wait_table), sized according to the number of pages in the zone (typically up to thousands of entries), distributing waiters so as to avoid unnecessary wakeups and reduce collisions. In legacy high-memory configurations (e.g., 32-bit systems), similar issues occurred with wait queues like pkmap_map_wait for persistent kernel-mapping slots, leading to contention upon slot release.

The thundering herd manifests notably in network servers handling listener sockets, where multiple threads blocked on select() or poll() wake up en masse upon a connection arrival, overwhelming the system with redundant checks. In Linux, tools like epoll address this through edge-triggered notifications and the EPOLLEXCLUSIVE flag, which ensures that a single waiter, rather than all of them, is woken per event on a shared file descriptor, optimizing for scenarios with high concurrency. Kernel wakeup functions such as wake_up() can also be tuned to wake only a single task (e.g., via wake_up_process() or exclusive wait-queue entries), preventing the full herd from stampeding.

Beyond kernel-level operations, the problem extends to distributed systems, where a surge of simultaneous requests, often after a cache expiration or service recovery, can overload backends or databases, mimicking the OS herd effect at scale. Mitigations in this domain include exponential backoff with jitter, leader election, and circuit breakers. Overall, understanding and alleviating the thundering herd remains crucial for scalable, efficient concurrent programming in both single-machine and networked environments.

Definition and Background

Core Concept

The thundering herd problem refers to a concurrency issue in computing systems where a large number of processes or threads, previously blocked while waiting for a resource such as a lock or an event, are simultaneously awakened upon the resource's availability, resulting in intense and inefficient competition among them. This phenomenon leads to significant performance overhead, as only one contender can typically acquire the resource, forcing the others to requeue or block again.

At its core, the problem arises from the behavior of synchronization primitives like semaphores or condition variables in concurrent programs. When a resource is released, these primitives often employ a broadcast mechanism, such as broadcast() in monitor semantics or wake_up_all() in kernel wait queues, that notifies all waiting entities rather than selecting just one. This indiscriminate wakeup triggers a "stampede" of concurrent attempts to access the resource, where threads or processes repeatedly check conditions (e.g., after spurious wakeups in Mesa-style monitors) and fail, exacerbating system contention.

A basic illustration occurs in Unix-like operating systems during I/O event handling, such as when multiple processes use select() or poll() to monitor a shared file descriptor for incoming connections on a server socket. Upon an event like a new connection arriving, the kernel awakens all waiting processes, prompting each to reissue system calls and compete for the descriptor, even though only one can accept the connection at a time.

Key characteristics of the thundering herd include unnecessary context switches as the scheduler cycles through awakened entities, CPU cache thrashing from concurrent access patterns that invalidate shared data, and redundant system calls that consume resources without productive outcome. These effects highlight the inefficiency of broadcast-style arbitration, particularly in high-concurrency environments.
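
As a concrete sketch of the shared-listener pattern just described, the following C program (a minimal, hypothetical example; the port number and worker count are arbitrary) forks several workers that all block in accept() on one listening socket. On historical kernels each incoming connection woke every worker; modern Linux wakes only one, because the accept() path uses exclusive waits:
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
    int listen_fd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = { 0 };
    addr.sin_family = AF_INET;
    addr.sin_port = htons(8080);               /* arbitrary example port */
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    bind(listen_fd, (struct sockaddr *)&addr, sizeof addr);
    listen(listen_fd, 128);

    for (int i = 0; i < 8; i++) {              /* 8 workers share one socket */
        if (fork() == 0) {
            for (;;) {
                /* Historically, one connection woke all 8 sleepers here */
                int conn = accept(listen_fd, NULL, NULL);
                if (conn >= 0) {
                    /* ... handle the connection ... */
                    close(conn);
                }
            }
        }
    }
    for (;;)
        pause();                               /* parent just idles */
}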

Historical Origins

The thundering herd problem was first observed in Unix systems in the 1980s, particularly in multi-process network servers where the accept() path on listening sockets would awaken multiple blocked processes upon a single incoming connection, causing unnecessary context switches and resource contention. This behavior was prominent in AT&T's System V Release 3, introduced in 1986, which standardized certain networking interfaces that exacerbated the issue in high-concurrency environments. Early discussions of the problem appeared in analyses of BSD Unix kernels, where synchronization primitives in process scheduling highlighted inefficiencies in mass wakeups for shared events. A key publication exploring its implications is "Accept() Scalability on Linux" by Stephen P. Molloy and Chuck Lever, presented at the 2000 USENIX Annual Technical Conference, which examined the POSIX-compliant accept() implementation and its "thundering herd" effects in Linux, tracing the roots to traditional Unix designs.

The issue became a more prominent concern in the 1990s alongside the proliferation of multi-user systems and multiprocessor hardware, as servers scaled to handle thousands of concurrent users. Discussions in standards development, particularly around threads in POSIX.1c (1995), underscored the need for better wakeup mechanisms to avoid herd-like contention in multithreaded environments.

The terminology "thundering herd" originated as a metaphor in computing literature, evoking the chaotic rush of a large herd of bison or cattle charging across the prairie in response to a single stimulus, analogous to processes stampeding toward a resource. The term gained prominence in the late 1990s in discussions of scalable network servers.

Causes and Mechanisms

In Multithreaded Environments

In multithreaded environments, the thundering herd problem arises primarily from synchronization primitives that awaken all waiting threads simultaneously upon an event, leading to intense contention for shared resources such as locks or queues. A common trigger is the use of condition variable broadcast operations in threading libraries. For instance, the pthread_cond_broadcast() function unblocks all threads currently blocked on the specified condition variable, which is useful in scenarios like producer-consumer patterns with multiple consumers but results in all awakened threads immediately competing for the associated mutex. This contention occurs because, after unblocking, the threads attempt to reacquire the mutex in accordance with the system's scheduling policy, often causing a surge in CPU usage and cache invalidations as only one thread can proceed at a time.

At the kernel level, similar issues manifest in low-level mechanisms like Linux's futex (fast userspace mutex) system. The FUTEX_WAKE operation wakes up to a specified number of waiters blocked on a futex word, typically used after unlocking to notify waiting tasks; however, when multiple waiters are present and FUTEX_WAKE is invoked with a count greater than one, all awakened tasks may race to acquire the next contended futex or resource, exacerbating the thundering herd effect. This behavior stems from the design of futexes, which prioritize fast userspace paths but can fall back to kernel-mediated wakeups that flood the scheduler with runnable threads. To illustrate, in a contended lock scenario, invoking FUTEX_WAKE on a futex address after decrementing a lock counter wakes multiple processes or threads, which then contend for the next synchronization point, resulting in poor scalability on multiprocessor systems (see the futex sketch after the example below).

The thundering herd problem is amplified in multi-process setups compared to multithreaded ones due to the lack of a shared address space, forcing reliance on inter-process communication (IPC) mechanisms that incur higher overhead. For threads within a single process, synchronization via private futexes allows efficient userspace handling without kernel involvement in uncontended cases, but in multi-process environments, shared futexes (using shared memory or mapped files) or other IPC mechanisms like signals must be used to wake all relevant processes, leading to costly context switches and resource contention across process boundaries. For example, broadcasting a signal to a process group via kill(-pgid, SIGUSR1) can unblock multiple processes waiting in sigsuspend(), causing them to herd toward a shared resource such as a pipe or socket, where the inter-process wakeup latency further degrades performance. Pipes previously exacerbated this for multi-process readers (prior to Linux 5.6), as a write operation could awaken all blocked read() calls across processes, triggering a race for the available data bytes and associated metadata locks, but modern kernels use exclusive waits to wake only one reader.

The following snippet demonstrates a simple producer-consumer scenario using POSIX condition variables, where pthread_cond_broadcast() triggers the herd effect:
#include <pthread.h>
#include <queue>

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
std::queue<int> task_queue;  // Shared resource, guarded by mutex

void* consumer(void* arg) {
    while (true) {
        pthread_mutex_lock(&mutex);
        while (task_queue.empty()) {
            pthread_cond_wait(&cond, &mutex);  // Threads block here
        }
        int task = task_queue.front();
        task_queue.pop();  // After a broadcast, every awakened consumer races for the mutex and queue
        pthread_mutex_unlock(&mutex);
        // Process task
    }
    return NULL;
}

void producer() {
    // Produce a batch of tasks
    pthread_mutex_lock(&mutex);
    // ... add multiple tasks to queue ...
    pthread_cond_broadcast(&cond);  // Wakes all waiting consumers, causing contention
    pthread_mutex_unlock(&mutex);
}
In this example, when the producer broadcasts, all consumer threads awaken and contend for the mutex and queue access, illustrating the herd behavior.
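
The futex-level behavior described earlier can be sketched in the same spirit. The snippet below is a minimal illustration rather than a production lock: futex(2) has no glibc wrapper, so it is invoked through syscall(), and the wake count alone determines whether one waiter or the whole herd becomes runnable:
#include <limits.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <sys/syscall.h>
#include <unistd.h>

static atomic_int lock_word;  /* 0 = unlocked, 1 = locked */

static long futex_wake(atomic_int *uaddr, int nwake) {
    /* FUTEX_WAKE makes up to nwake tasks sleeping on uaddr runnable */
    return syscall(SYS_futex, uaddr, FUTEX_WAKE, nwake, NULL, NULL, 0);
}

void unlock_wake_one(void) {
    atomic_store(&lock_word, 0);
    futex_wake(&lock_word, 1);        /* wake-one: no herd */
}

void unlock_wake_all(void) {
    atomic_store(&lock_word, 0);
    futex_wake(&lock_word, INT_MAX);  /* wake-all: every waiter races to
                                         reacquire, all but one re-block */
}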

In Event-Driven Systems

In event-driven systems, the thundering herd problem arises when multiple threads or processes are blocked on an I/O multiplexing mechanism, such as select() or epoll in Linux, waiting for events on a shared file descriptor like a listening socket. When an event occurs, such as an incoming connection, the kernel notifies all waiting entities, leading to a race where only one can process the event while the others contend unsuccessfully, consuming CPU cycles in repeated system calls. This overhead is particularly pronounced in level-triggered modes without exclusive wakeups, where epoll_wait() can awaken multiple waiters for the same descriptor, exacerbating contention in high-concurrency scenarios.

In network servers like Apache and Nginx, the problem manifests during connection acceptance on a shared listening socket. Multiple worker processes monitor the socket via polling mechanisms; upon a new connection, all workers awaken and attempt accept() calls, but only one succeeds, with the others receiving EAGAIN errors and retrying, resulting in inefficient load distribution and reduced throughput. Apache's prefork model historically relied on semaphores to serialize accepts and mitigate this, while Nginx uses an accept_mutex to ensure only one worker accepts at a time, preventing the herd under load. In both cases, bursts of traffic amplify the issue, as the kernel's socket queue fills and triggers simultaneous wakeups across processes.

Asynchronous I/O frameworks introduce similar challenges through centralized event queues. In Windows, I/O completion ports (IOCP) queue completion notifications for multiple threads; a surge of events can lead to herd-like contention if threads dequeue simultaneously without proper balancing, though IOCP's design associates ports with threads to distribute load efficiently. On BSD systems, kqueue notifies waiters of events on monitored descriptors; without flags like EV_ONESHOT or EV_DISPATCH, multiple threads blocked on kevent() for the same queue awaken for a single event, causing redundant processing attempts and CPU waste. These mechanisms aim for scalability in reactive architectures but require careful tuning to avoid herd effects in multi-threaded event loops.

Consider a busy HTTP server handling a burst of requests using an event-driven model with, say, 20 worker processes and epoll. Each worker polls the shared listening socket for readability. When 100 requests arrive rapidly, the kernel signals the event, waking all 20 workers. They race to call accept(), with only one succeeding per connection; the rest loop back into epoll_wait(), generating thousands of failed calls per second. This spikes CPU usage to near 100% on polling alone, delays connection processing, and starves the event loop, reducing the server's capacity from thousands to hundreds of requests per second during the burst.
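
A minimal sketch of such a worker loop follows, assuming a non-blocking listening socket listen_fd created before the workers were forked; each worker wraps the shared descriptor in its own epoll instance in the default level-triggered mode, so a single connection can wake every worker while all but the accept() winner see EAGAIN:
#include <errno.h>
#include <sys/epoll.h>
#include <sys/socket.h>

void worker_loop(int listen_fd) {   /* run by each forked worker */
    int ep = epoll_create1(0);
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = listen_fd };
    epoll_ctl(ep, EPOLL_CTL_ADD, listen_fd, &ev);

    struct epoll_event ready;
    for (;;) {
        epoll_wait(ep, &ready, 1, -1);          /* all workers wake here */
        int conn = accept(listen_fd, NULL, NULL);
        if (conn < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
            continue;                           /* lost the race: wasted wakeup */
        /* ... handle conn, then close(conn) ... */
    }
}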

Impacts and Consequences

Performance Degradation

The thundering herd problem induces significant CPU overhead by triggering excessive wake-ups of waiting processes or threads, leading to a surge in context switches and increased load on the operating system scheduler. In high-contention scenarios, such as when multiple threads await a resource like a mutex or network connection, the kernel may awaken up to 10 times more threads than necessary, as all waiters are notified simultaneously rather than serially. This unnecessary scheduling activity consumes CPU cycles that could otherwise support productive work, with the overhead scaling linearly with the number of waiters in unmitigated systems.

Latency spikes are a direct consequence of this contention, particularly in request-handling environments where the herd effect causes multiple processes to compete for the resource immediately after its release. For instance, in database systems with caching layers like Memcached, a thundering herd can manifest as a sudden barrage of cache misses, overwhelming the backing database with concurrent queries and amplifying tail latency. Without intervention, this competition results in prolonged wait times for the resource, exacerbating delays in overall request processing.

Throughput suffers markedly under thundering herd conditions, as the system diverts resources to resolving contention rather than executing tasks. Benchmarks on contested synchronization primitives, such as the accept() call in Linux, demonstrate throughput reductions of approximately 50% in high-load network servers with hundreds of simultaneous connections.

To observe these inefficiencies, tools like perf can profile scheduler events and wakeup latencies, revealing elevated context-switch rates during herd occurrences, while strace traces system calls to identify patterns of mass wake-ups on futexes or semaphores. These measurement techniques provide quantitative insights into the overhead, such as spikes in voluntary context switches correlating with wakeup events.
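
As a complementary, in-process measurement (a minimal sketch using the standard getrusage() call, not a replacement for perf or strace), an application can sample its own context-switch counters before and after a wakeup-heavy phase; a large jump in voluntary switches between the two samples is a typical herd signature:
#include <stdio.h>
#include <sys/resource.h>

/* Print this process's cumulative context-switch counters */
void report_context_switches(const char *label) {
    struct rusage ru;
    getrusage(RUSAGE_SELF, &ru);
    printf("%s: voluntary=%ld involuntary=%ld\n",
           label, ru.ru_nvcsw, ru.ru_nivcsw);
}

/* Call once before and once after the contended phase and compare. */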

Resource Exhaustion

When numerous threads or processes are simultaneously awakened in response to an event, such as resource availability in a multi-threaded application, they often compete for limited resources, increasing overall system pressure. This phenomenon imposes scalability limits in multi-core environments, as the contention serializes access to shared kernel structures like wait queues or locks, preventing efficient parallelization across cores. Evaluations on multicore systems show that without mitigation, CPU utilization plateaus despite additional cores, as thundering herd wakeups waste cycles on contention rather than useful computation.

In resource-constrained embedded systems, the thundering herd amplifies kernel interference on shared multicore resources, causing excessive delays that violate real-time deadlines and potentially lead to system failures. For example, attacks exploiting herd effects on seL4 can delay high-priority threads by a substantial number of cycles for each malicious participant added, overwhelming limited CPU budgets in devices with minimal memory and few cores.

Mitigation Strategies

Operating System Solutions

Operating systems address the thundering herd problem through kernel-level mechanisms that limit the number of processes or threads awakened simultaneously when an event occurs, thereby reducing contention and improving efficiency. These solutions typically involve wake-one semantics, where only a single waiter is notified, and structural changes in scheduling to serialize or distribute wake-ups across CPU resources.

One key approach is wake-one semantics, implemented in Linux via the epoll interface. The EPOLLEXCLUSIVE flag in epoll_ctl() ensures that a single waiting task, rather than all of them, is awakened for a given event in level-triggered mode, preventing multiple threads from contending for the same resource such as an incoming connection on a listening socket. Similarly, the EPOLLONESHOT flag disables further event delivery for the file descriptor after delivering an event to a single waiter, requiring explicit rearming, which further serializes access and mitigates herd behavior in multithreaded applications. In futex operations, the FUTEX_WAKE call with a count set to 1 wakes exactly one thread from the wait queue, avoiding the overhead of broadcasting to all waiters and thus tackling the thundering herd in user-space locking primitives.

POSIX standards provide extensions that favor selective waking over broadcasting. The pthread_cond_signal function unblocks at least one thread blocked on a condition variable, in contrast to pthread_cond_broadcast, which awakens all waiters and risks a thundering herd by causing unnecessary contention for the associated mutex. Modern kernels enhance this with queue-based dispatching in wait queues, where wake-ups are managed through ordered lists that limit notifications to one or a few threads at a time, ensuring fair and efficient scheduling without global broadcasts.

Historically, early System V Unix implementations suffered from the thundering herd due to signal broadcasting that awakened all waiting processes indiscriminately, leading to high CPU overhead from contention. The evolution to Linux 2.6 introduced per-CPU runqueues in the O(1) scheduler, replacing a global runqueue lock with localized queues per processor core; this serializes wake-ups within each CPU's context, distributing load and reducing the stampede effect across multiprocessor systems.

Configuration options in kernels allow tuning to further curb potential herd issues. The parameter /proc/sys/kernel/threads-max sets a system-wide limit on the total number of threads, indirectly preventing excessive concurrent wake-ups by capping the overall waiter population and avoiding resource exhaustion from over-proliferation of waiters.
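
In code, the wake-one epoll setup amounts to a single flag at registration time. The sketch below assumes Linux 4.5 or later and a listening socket listen_fd shared by several workers; each worker registers the descriptor with EPOLLEXCLUSIVE so an incoming connection wakes one blocked worker rather than all of them:
#include <sys/epoll.h>

int setup_exclusive_epoll(int listen_fd) {
    int ep = epoll_create1(0);
    struct epoll_event ev;
    ev.events = EPOLLIN | EPOLLEXCLUSIVE;  /* wake-one instead of wake-all */
    ev.data.fd = listen_fd;
    epoll_ctl(ep, EPOLL_CTL_ADD, listen_fd, &ev);
    return ep;  /* each worker then blocks in epoll_wait() on its own ep */
}
The same wake-one principle applies to the earlier producer-consumer example: replacing pthread_cond_broadcast() with pthread_cond_signal() wakes a single consumer per produced task, provided each task needs only one consumer.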

Application Design Approaches

Developers can mitigate the thundering herd problem at the application level by implementing leader election patterns, where a single designated process or thread acts as the leader to handle resource access or event processing, thereby serializing operations and preventing multiple concurrent attempts that lead to contention. In distributed systems, this approach ensures that only the elected leader performs critical tasks, such as updating shared caches or coordinating writes, while followers are notified sequentially through mechanisms like work queues, reducing the risk of overwhelming the resource. For instance, Amazon's Elastic Block Store (EBS) employs leader election to assign primary responsibilities for volume shards, avoiding redundant processing and coordination overhead that could trigger herd-like behavior.

Another key strategy involves incorporating backoff mechanisms in retry logic for accessing contended resources, particularly exponential backoff with jitter to desynchronize attempts and distribute load over time. This technique progressively increases wait times between retries, starting with a base delay and multiplying it exponentially (e.g., delay = initial * 2^attempt), while adding random jitter to prevent synchronized retries that exacerbate the problem. In Java, this can be applied when using locks like ReentrantLock for contended sections, where threads attempt acquisition with timeouts and back off accordingly to avoid repeated immediate retries. The following code snippet illustrates a simple exponential backoff retry for acquiring a lock:
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;
import java.util.Random;

public class BackoffRetryExample {
    private static final ReentrantLock lock = new ReentrantLock();
    private static final Random random = new Random();
    private static final long INITIAL_DELAY = 100; // ms
    private static final double MULTIPLIER = 2.0;
    private static final long MAX_DELAY = 10000; // ms

    public boolean tryAcquireWithBackoff(int maxAttempts) {
        long delay = INITIAL_DELAY;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (lock.tryLock()) {
                try {
                    // Critical section
                    return true;
                } finally {
                    lock.unlock();
                }
            }
            // Exponential backoff with jitter
            long jitter = (long) (random.nextDouble() * delay);
            try {
                TimeUnit.MILLISECONDS.sleep(delay + jitter);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
            delay = (long) Math.min(delay * MULTIPLIER, MAX_DELAY);
        }
        return false;
    }
}
This pattern, recommended for avoiding retry storms in distributed services, helps stagger access and prevents cascading failures.

Asynchronous dispatching further addresses the issue by serializing access through models like the actor model or dedicated message queues, ensuring events are processed in a controlled, non-broadcast manner. In the actor model, as implemented in frameworks like Akka, each actor handles messages sequentially, encapsulating state and avoiding shared mutable data that leads to contention; this promotes efficient concurrency without locks, as multiple actors can run on a shared thread pool while processing independently. For example, Akka's persistence recovery mechanisms use staggered retries among actors to prevent a thundering herd during state reconstruction. Similarly, message queues such as RabbitMQ, in versions 3.12 and later (as of 2025), default to storing messages on disk immediately for classic queues, minimizing RAM usage and allowing consumers to process messages gradually without all threads awakening simultaneously due to memory pressure. This decouples producers and consumers, serializing access and smoothing bursts.

Best practices for application design emphasize avoiding global broadcasts in favor of targeted signaling, such as directing notifications only to relevant components via dedicated queues or pub-sub topics with filters, which minimizes unnecessary wake-ups and contention. While these approaches introduce some complexity, such as managing leader failover or queue backlogs, they enhance portability across platforms and improve overall system resilience by prioritizing serialized, desynchronized access over reactive, herd-inducing patterns. Trade-offs include potential latency from queuing, but these are offset by reduced resource exhaustion during peaks.
