Thundering herd problem
The thundering herd problem is a concurrency issue in operating systems in which numerous processes or threads, blocked on a shared wait queue while awaiting an event such as I/O completion or resource availability, are all awakened simultaneously when the event occurs. Because only one of them can typically proceed, the rest contend briefly for the resource and then return to waiting, causing significant performance overhead from context switches, cache invalidations, and wasted CPU time.[1][2]
The term "thundering herd" originates from early Unix systems, evoking the image of a herd of animals stampeding toward a resource. This problem arises primarily in multiprocessor environments and shared-resource scenarios, such as when multiple tasks wait for a locked page during memory management operations like page faults or I/O in the Linux kernel.[2] In such cases, a single zone-wide wait queue could wake all processes at once, exacerbating inefficiency; to mitigate this, the kernel employs a hashed wait table (e.g., zone->wait_table), sized based on the zone's memory (typically up to thousands of entries), distributing waiters to limit unnecessary wakeups and reduce collisions.[3] In legacy high-memory configurations (e.g., 32-bit systems), similar issues occurred with queues like pkmap_map_wait for persistent kernel map slots, leading to contention upon slot release.[2]
The thundering herd manifests notably in network servers handling listener sockets, where multiple threads blocked on select() or poll() wake up en masse upon a connection arrival, overwhelming the system with redundant checks.[1] In Linux, tools like epoll address this through edge-triggered notifications and the EPOLLEXCLUSIVE flag, which ensures only one waiter is woken per event on a shared file descriptor, optimizing for scenarios with high concurrency.[1][4] Kernel wait queues also support exclusive waiters, so that wakeup functions such as wake_up() rouse only a single exclusively waiting task rather than every waiter, preventing the full herd from stampeding.[5]
Beyond kernel-level operations, the problem extends to distributed systems, where a surge of simultaneous requests—often after a cache expiration or service recovery—can overload APIs or databases, mimicking the OS herd effect at scale.[6] Mitigations in this domain include rate limiting, exponential backoff, and circuit breakers.[7] Overall, understanding and alleviating the thundering herd remains crucial for scalable, efficient concurrent programming in both single-machine and networked environments.
Definition and Background
Core Concept
The thundering herd problem refers to a concurrency issue in computing systems where a large number of processes or threads, previously blocked while waiting for a shared resource such as a lock or event, are simultaneously awakened upon the resource's availability, resulting in intense and inefficient competition among them.[8] This phenomenon leads to significant performance overhead, as only one contender can typically acquire the resource, forcing the others to requeue or block again.[9]
At its core, the problem arises from the behavior of synchronization primitives like semaphores or condition variables in monitors. When a resource is released, these primitives often employ a broadcast mechanism—such as broadcast() in monitor semantics or wake_up_all() in kernel wait queues—that notifies all waiting entities rather than selecting just one.[8] This indiscriminate wakeup triggers a "stampede" of concurrent attempts to access the resource, where threads or processes repeatedly check conditions (e.g., via spurious wakeups in Mesa-style monitors) and fail, exacerbating system contention.[9]
A basic illustration occurs in Unix-like operating systems during I/O event handling, such as when multiple processes use select() or poll() to monitor a shared file descriptor for incoming connections on a server socket. Upon an event like a new connection arriving, the kernel awakens all waiting processes, prompting each to reissue system calls and compete for the descriptor, even though only one can accept the connection at a time.[10]
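This classic scenario can be sketched in C. The snippet below is an illustrative toy server rather than code from any particular system; the port number, worker count, and omitted error handling are arbitrary. Every forked worker blocks in select() on the same non-blocking listening socket, so a single incoming connection wakes all of them, but only one accept() succeeds.

/* Minimal illustrative sketch: several forked workers all block in select()
 * on the same non-blocking listening socket.  A single incoming connection
 * wakes every worker, but only one accept() succeeds. */
#include <sys/socket.h>
#include <sys/select.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <fcntl.h>
#include <unistd.h>
#include <errno.h>

#define WORKERS 8   /* arbitrary worker count for illustration */

static void worker_loop(int listen_fd) {
    for (;;) {
        fd_set rfds;
        FD_ZERO(&rfds);
        FD_SET(listen_fd, &rfds);
        /* All workers block here; one connection wakes all of them. */
        if (select(listen_fd + 1, &rfds, NULL, NULL, NULL) < 0)
            continue;
        int conn = accept(listen_fd, NULL, NULL);
        if (conn < 0)
            continue;            /* losers of the race see EAGAIN/EWOULDBLOCK */
        /* ... handle the connection ... */
        close(conn);
    }
}

int main(void) {
    int listen_fd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(8080);             /* arbitrary port */
    bind(listen_fd, (struct sockaddr *)&addr, sizeof addr);
    listen(listen_fd, 128);
    fcntl(listen_fd, F_SETFL, O_NONBLOCK);   /* losing workers get EAGAIN */

    for (int i = 0; i < WORKERS; i++)
        if (fork() == 0) {
            worker_loop(listen_fd);
            _exit(0);
        }
    for (;;)
        pause();                             /* parent just idles */
}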
Key characteristics of the thundering herd include unnecessary context switches as the scheduler cycles through awakened entities, CPU cache thrashing from concurrent access patterns that invalidate shared data, and redundant system calls that consume kernel resources without productive outcome.[9] These effects highlight the inefficiency in resource arbitration, particularly in high-concurrency environments.[10]
Historical Origins
The thundering herd problem was first observed in Unix systems during the 1980s, particularly in multi-process network servers where the accept() system call on listening sockets would awaken multiple blocked processes upon a single incoming connection, causing unnecessary context switches and resource contention. This behavior was prominent in AT&T's System V Release 3, introduced in 1986, which standardized certain networking interfaces that exacerbated the issue in high-concurrency environments.
Early discussions of the problem appeared in analyses of BSD Unix kernels, where synchronization primitives in process scheduling highlighted inefficiencies in mass wakeups for shared events. A key publication exploring its implications is "Accept() Scalability in Linux" by Stephen P. Molloy and Chuck Lever, presented at the 2000 USENIX Annual Technical Conference, which examined the POSIX-compliant accept() implementation and its "thundering herd" effects in Linux, tracing roots to traditional Unix designs.[11]
The issue evolved into a more prominent concern in the 1990s alongside the proliferation of multi-user systems and multiprocessing hardware, as servers scaled to handle thousands of concurrent users. Discussions in POSIX standards development, particularly around thread synchronization in POSIX.1c (1995), underscored the need for better wakeup mechanisms to avoid herd-like contention in pthreads environments.
The terminology "thundering herd" originated as a metaphor in computing literature, evoking the chaotic rush of a large herd of bison or wildebeest charging across the prairie in response to a single stimulus, analogous to processes stampeding toward a resource. The term gained prominence in the late 1990s in discussions of scalable network servers.[12]
Causes and Mechanisms
In Multithreaded Environments
In multithreaded environments, the thundering herd problem arises primarily from synchronization primitives that awaken all waiting threads simultaneously upon an event, leading to intense contention for shared resources such as locks or queues. A common trigger is the use of condition variable broadcast operations in threading libraries. For instance, the POSIX pthread_cond_broadcast() function unblocks all threads currently blocked on the specified condition variable, which is useful in scenarios like multi-consumer producer patterns but results in all awakened threads immediately competing for the associated mutex.[13] This contention occurs because, after unblocking, the threads attempt to reacquire the mutex in accordance with the system's scheduling policy, often causing a surge in CPU usage and cache invalidations as only one thread can proceed at a time.[13]
At the kernel level, similar issues manifest in low-level synchronization mechanisms like Linux's futex (fast userspace mutex) system. The FUTEX_WAKE operation wakes up to a specified number of waiters blocked on a futex word, typically used after unlocking to notify waiting tasks; however, when multiple waiters are present and FUTEX_WAKE is invoked with a count greater than one, all awakened tasks may race to acquire another contended futex or resource, exacerbating the thundering herd effect.[14] This behavior stems from the design of futexes, which prioritize fast userspace paths but can lead to kernel-mediated wakeups that flood the scheduler with runnable threads.[15] To illustrate, in a contended lock scenario, invoking FUTEX_WAKE on a futex address after decrementing a lock counter wakes multiple processes or threads, which then contend for the next synchronization point, resulting in poor scalability on multiprocessor systems.[15]
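As a rough sketch of this mechanism (not a complete lock implementation), the raw futex system call can be issued directly; glibc provides no futex() wrapper, so the call goes through syscall(2). The wake-count argument to FUTEX_WAKE bounds how many waiters become runnable at once, so passing INT_MAX reproduces the herd while passing 1 avoids it. The helper names below are illustrative.

/* Rough sketch of raw futex usage (not a complete lock implementation).
 * The wake-count argument to FUTEX_WAKE bounds how many waiters are woken:
 * INT_MAX invites a thundering herd, 1 wakes a single task. */
#define _GNU_SOURCE
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <limits.h>
#include <stdint.h>

long futex_wait(uint32_t *uaddr, uint32_t expected) {
    /* Sleep as long as *uaddr still equals expected. */
    return syscall(SYS_futex, uaddr, FUTEX_WAIT, expected, NULL, NULL, 0);
}

long futex_wake(uint32_t *uaddr, int how_many) {
    /* Wake up to how_many tasks blocked on uaddr. */
    return syscall(SYS_futex, uaddr, FUTEX_WAKE, how_many, NULL, NULL, 0);
}

/* Example: after releasing a resource guarded by `word`, wake one waiter
 * rather than every waiter. */
void release_resource(uint32_t *word) {
    __atomic_store_n(word, 0, __ATOMIC_RELEASE);  /* mark the resource free */
    futex_wake(word, 1);        /* herd-free: only one waiter becomes runnable */
    /* futex_wake(word, INT_MAX);  -- would wake every waiter at once */
}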
The thundering herd problem is amplified in multi-process setups compared to multithreaded ones due to the lack of shared address space, forcing reliance on inter-process communication (IPC) mechanisms that incur higher overhead. In threads within a single process, synchronization via private futexes allows efficient userspace handling without kernel involvement in uncontended cases, but in multi-process environments, shared futexes (using shared memory or mapped files) or other IPC like signals sent to process groups wake all relevant processes, leading to costly context switches and resource contention across address spaces.[15] For example, broadcasting a signal to a process group via kill(-pgid, SIGUSR1) can unblock multiple processes waiting in sigsuspend(), causing them to herd toward a shared resource like a semaphore or pipe, where the inter-process latency further degrades performance. Pipes previously exacerbated this in multi-process readers (prior to Linux 5.6), as a write operation could awaken all blocked read() calls across processes, triggering a race for the available data bytes and associated metadata locks, but modern kernels use exclusive waits to wake only one reader.[16][17]
The following C++ snippet, using POSIX threads, demonstrates a simple producer-consumer scenario in which pthread_cond_broadcast() triggers the herd effect:
#include <pthread.h>
#include <queue>

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
std::queue<int> task_queue;                 // shared resource guarded by the mutex

void* consumer(void* arg) {
    while (true) {
        pthread_mutex_lock(&mutex);
        while (task_queue.empty()) {
            pthread_cond_wait(&cond, &mutex);   // threads block here
        }
        int task = task_queue.front();
        task_queue.pop();                   // all consumers race here after a broadcast
        pthread_mutex_unlock(&mutex);
        // Process task
    }
    return NULL;
}

void producer() {
    // Produce a batch of tasks
    pthread_mutex_lock(&mutex);
    // ... add multiple tasks to queue ...
    pthread_cond_broadcast(&cond);          // wakes all waiting consumers, causing contention
    pthread_mutex_unlock(&mutex);
}
In this example, when the producer broadcasts, all consumer threads awaken and contend for the mutex and queue access, illustrating the herd behavior.[13]
In Event-Driven Systems
In event-driven systems, the thundering herd problem arises when multiple threads or processes are blocked on an I/O multiplexing mechanism, such as select() or epoll in Linux, waiting for events on a shared file descriptor like a listening socket. When an event occurs, such as an incoming connection, the kernel notifies all waiting entities, leading to a race where only one can process the event while the others contend unsuccessfully, consuming CPU cycles in repeated system calls. This overload is particularly pronounced in level-triggered modes without exclusive wakeups, where epoll_wait() can awaken multiple waiters for the same descriptor, exacerbating contention in high-concurrency scenarios.[18][19]
In network servers like Apache and Nginx, the problem manifests during connection acceptance on a shared listening socket. Multiple worker processes monitor the socket via polling mechanisms; upon a new connection, all workers awaken and attempt accept() calls, but only one succeeds, with others receiving EAGAIN errors and retrying, resulting in inefficient load distribution and reduced throughput. Apache's prefork model historically relied on semaphores to serialize accepts and mitigate this, while Nginx uses an accept_mutex to ensure only one worker accepts at a time, preventing herd behavior under load. In both cases, bursts of traffic amplify the issue, as the kernel's socket queue fills and triggers simultaneous wakeups across processes.[20][21][19]
Asynchronous I/O frameworks introduce similar challenges through centralized event queues. In Windows, I/O completion ports (IOCP) queue completion notifications for multiple threads; a surge of events can lead to herd-like contention if threads dequeue simultaneously without proper balancing, though IOCP's design associates ports with threads to distribute load efficiently. On BSD systems, kqueue notifies waiters of events on monitored descriptors; without flags like EV_ONESHOT or EV_DISPATCH, multiple threads blocked on kevent() for the same queue awaken for a single event, causing redundant processing attempts and CPU waste. These mechanisms aim for scalability in reactive architectures but require careful tuning to avoid herd effects in multi-threaded event loops.[22][23]
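On systems that provide kqueue, the mitigation mentioned above can be sketched as follows; this is a minimal illustration of the EV_DISPATCH and EV_ONESHOT flags rather than a complete server, and the function names are hypothetical. With EV_DISPATCH, a readiness event is delivered to one kevent() caller and the filter is then disabled until explicitly re-enabled, so other threads are not woken for the same event.

/* Sketch (BSD/macOS): register a listening socket with EV_DISPATCH so that a
 * readiness event is delivered to one kevent() caller and the filter is then
 * disabled until explicitly re-enabled, limiting redundant wakeups. */
#include <sys/event.h>

int register_listener(int kq, int listen_fd) {
    struct kevent change;
    /* EV_ONESHOT would instead delete the event after one delivery. */
    EV_SET(&change, listen_fd, EVFILT_READ, EV_ADD | EV_DISPATCH, 0, 0, NULL);
    return kevent(kq, &change, 1, NULL, 0, NULL);
}

/* After a worker finishes accepting, it re-enables the filter. */
int rearm_listener(int kq, int listen_fd) {
    struct kevent change;
    EV_SET(&change, listen_fd, EVFILT_READ, EV_ENABLE | EV_DISPATCH, 0, 0, NULL);
    return kevent(kq, &change, 1, NULL, 0, NULL);
}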
Consider a busy HTTP server handling a burst of requests using an event-driven model with multiple worker processes and epoll. Each worker polls the shared listening socket for readability. When 100 requests arrive rapidly, the kernel signals the event, waking all 20 workers. They race to call accept(), with only one succeeding per connection; the rest loop back into epoll_wait(), generating thousands of failed system calls per second. This cycle spikes CPU usage to near 100% on polling alone, delays connection processing, and starves the event loop, reducing the server's capacity from thousands to hundreds of requests per second during the burst.[19][18]
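A simplified sketch of such a worker loop is shown below; it is illustrative rather than drawn from any particular server, and it assumes each forked worker registers the shared, non-blocking listening socket in its own epoll instance using the default level-triggered mode, which is exactly the configuration that wakes every worker for one pending connection.

/* Simplified worker loop for the scenario above: each forked worker has its
 * own epoll instance watching the shared (non-blocking) listening socket in
 * the default level-triggered mode, so one pending connection wakes them all. */
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

void worker(int listen_fd) {
    int ep = epoll_create1(0);
    struct epoll_event ev;
    ev.events = EPOLLIN;                /* level-triggered by default */
    ev.data.fd = listen_fd;
    epoll_ctl(ep, EPOLL_CTL_ADD, listen_fd, &ev);

    struct epoll_event events[64];
    for (;;) {
        /* Every worker blocked here is woken for the same connection. */
        int n = epoll_wait(ep, events, 64, -1);
        for (int i = 0; i < n; i++) {
            int conn = accept(listen_fd, NULL, NULL);
            if (conn < 0)
                continue;       /* lost the race (EAGAIN); back to epoll_wait() */
            /* ... handle the request ... */
            close(conn);
        }
    }
}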
Impacts and Consequences
The thundering herd problem induces significant CPU overhead by triggering excessive wake-ups of waiting processes or threads, leading to a surge in context switches and increased load on the operating system scheduler. In high-contention scenarios, such as when multiple threads await a shared resource like a mutex or socket connection, the kernel may awaken up to 10 times more threads than necessary, as all waiters are notified simultaneously rather than serially.[24] This unnecessary scheduling activity consumes CPU cycles that could otherwise support productive work, with the overhead scaling linearly with the number of waiters in unmitigated systems.[24]
Latency spikes are a direct consequence of this contention, particularly in request-handling environments where the herd effect causes multiple processes to compete for the resource immediately after its release. For instance, in database systems with caching layers like Memcached, a thundering herd can manifest as a sudden barrage of cache misses, overwhelming the backend database with concurrent queries and amplifying tail latencies.[25] Without intervention, this competition results in prolonged wait times for the resource, exacerbating delays in overall request processing.[25]
Throughput suffers markedly under thundering herd conditions, as the system diverts resources to resolving contention rather than executing tasks. Benchmarks of contended kernel interfaces, such as the POSIX accept() call in Linux, demonstrate throughput reductions of approximately 50% in high-load network servers with hundreds of simultaneous connections.[24]
To observe these inefficiencies, tools like perf can profile context switch events and scheduler latencies, revealing elevated rates during herd occurrences, while strace traces system calls to identify patterns of mass wake-ups on futexes or semaphores.[26] These measurement techniques provide quantitative insights into the overhead, such as spikes in voluntary context switches correlating with resource contention events.
Resource Exhaustion
When numerous threads or processes are awakened simultaneously in response to a single event, such as a resource becoming available in a multithreaded application, each of them consumes scheduler time, memory, and kernel resources while competing for something only one of them can use, increasing overall system pressure without any corresponding gain in useful work.
This phenomenon imposes scalability limits in multi-core environments, as the contention serializes access to shared kernel structures like wait queues or locks, preventing efficient parallelization across cores. Evaluations on Linux show that without mitigation, CPU utilization plateaus despite additional cores, as thundering herd wakeups waste cycles on contention rather than useful computation.[27]
In resource-constrained embedded systems, the thundering herd amplifies kernel interference on shared multicore resources, causing excessive delays that violate real-time deadlines and potentially lead to system failures. For example, attacks exploiting herd effects on the seL4 microkernel can delay high-priority threads by over 100,000 cycles per malicious participant, overwhelming limited CPU budgets in devices with minimal RAM and cores.[28]
Mitigation Strategies
Operating System Solutions
Operating systems address the thundering herd problem through kernel-level mechanisms that limit the number of processes or threads awakened simultaneously when an event occurs, thereby reducing contention and improving efficiency. These solutions typically involve wake-one semantics, where only a single waiter is notified, and structural changes in scheduling to serialize or distribute wake-ups across CPU resources.[29]
One key approach is wake-one semantics, implemented in Linux via the epoll interface. The EPOLLEXCLUSIVE flag in epoll_ctl ensures that only one waiting task is awakened for a given file descriptor event in level-triggered mode, preventing multiple threads from contending for the same resource such as an incoming connection on a listening socket. Similarly, the EPOLLONESHOT flag disables the file descriptor after delivering an event to a single waiter, requiring explicit rearming, which further serializes access and mitigates herd behavior in multithreaded applications. In futex operations, the FUTEX_WAKE call with a count parameter set to 1 wakes exactly one thread from the wait queue, avoiding the overhead of broadcasting to all waiters and thus tackling thundering herd in user-space locking primitives.[30][14]
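A minimal sketch of the EPOLLEXCLUSIVE usage follows, assuming Linux 4.5 or later where the flag is available; the only change relative to a naive level-triggered registration is the event mask passed to epoll_ctl().

/* Sketch: registering a shared listening socket with EPOLLEXCLUSIVE
 * (available since Linux 4.5) so that an incoming connection wakes only
 * one of the tasks blocked in epoll_wait() on that descriptor. */
#include <sys/epoll.h>

int add_listener_exclusive(int epfd, int listen_fd) {
    struct epoll_event ev;
    ev.events = EPOLLIN | EPOLLEXCLUSIVE;   /* wake a single waiter per event */
    ev.data.fd = listen_fd;
    /* EPOLLEXCLUSIVE may only be specified at EPOLL_CTL_ADD time. */
    return epoll_ctl(epfd, EPOLL_CTL_ADD, listen_fd, &ev);
}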
POSIX standards provide extensions that favor selective waking over broadcasting. The pthread_cond_signal function unblocks at least one thread blocked on a condition variable, in contrast to pthread_cond_broadcast which awakens all waiters and risks thundering herd by causing unnecessary contention for associated mutexes. Modern Linux kernels enhance this with queue-based dispatching in wait queues, where wake-ups are managed through ordered lists that prioritize and limit notifications to one or a few threads at a time, ensuring fair and efficient resource allocation without global broadcasts.[13]
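Applied to the producer-consumer pattern shown earlier, the herd-avoiding variant signals once per item made available instead of broadcasting. The sketch below is a self-contained illustration that uses a plain counter in place of the task queue.

/* Herd-avoiding variant of the earlier producer-consumer example: wake at
 * most one consumer per item with pthread_cond_signal() instead of
 * broadcasting to all of them.  A counter stands in for the task queue. */
#include <pthread.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  c = PTHREAD_COND_INITIALIZER;
static int items = 0;                    /* number of available work items */

void produce_one(void) {
    pthread_mutex_lock(&m);
    items++;                             /* make one unit of work available */
    pthread_cond_signal(&c);             /* wake at most one waiting consumer */
    pthread_mutex_unlock(&m);
}

void *consume(void *arg) {
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&m);
        while (items == 0)               /* re-check guards against spurious wakeups */
            pthread_cond_wait(&c, &m);
        items--;                         /* claim one unit of work */
        pthread_mutex_unlock(&m);
        /* ... process the item ... */
    }
    return NULL;
}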
Historically, early System V Unix implementations suffered from the thundering herd problem because signal broadcasting awakened all waiting processes indiscriminately, leading to high CPU overhead from contention. Linux 2.6 later introduced per-CPU runqueues in the O(1) scheduler, replacing a global runqueue lock with localized queues per processor core; this serializes wake-ups within each CPU's context, distributing load and reducing the stampede effect across multiprocessor systems.[31]
Configuration options in Linux kernels allow tuning to further curb potential herd issues. The sysctl parameter /proc/sys/kernel/threads-max sets a system-wide limit on the total number of threads, indirectly preventing excessive concurrent wake-ups by capping the overall thread pool size and avoiding resource exhaustion from over-proliferation of waiters.[32]
Application Design Approaches
Developers can mitigate the thundering herd problem at the application level by implementing leader election patterns, where a single designated process or thread acts as the leader to handle resource access or event processing, thereby serializing operations and preventing multiple concurrent attempts that lead to contention. In distributed systems, this approach ensures that only the elected leader performs critical tasks, such as updating shared caches or coordinating writes, while followers are notified sequentially through mechanisms like work queues, reducing the risk of overwhelming the resource. For instance, Amazon's Elastic Block Store (EBS) employs leader election to assign primary responsibilities for volume shards, avoiding redundant processing and coordination overhead that could trigger herd-like behavior.[33]
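At the level of a single multithreaded process, the same idea can be sketched as a "single flight" guard: whichever thread finds no refresh in progress becomes the leader and recomputes the expensive value, while later arrivals wait for its result instead of recomputing it themselves. The structure and names below are illustrative, not taken from any particular library.

/* Illustrative "single flight" guard: the first thread to find no refresh in
 * progress becomes the leader and recomputes the expensive value; later
 * arrivals wait for the leader's result instead of recomputing it themselves. */
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  refreshed  = PTHREAD_COND_INITIALIZER;
static bool refresh_in_progress = false;
static int  cached_value = 0;            /* stand-in for an expensive shared resource */

static int expensive_recompute(void) {
    /* Stand-in for slow work such as rebuilding a cache entry. */
    return 42;
}

int get_value(void) {
    pthread_mutex_lock(&state_lock);
    if (!refresh_in_progress) {
        /* This thread is the leader for the refresh.  A real cache would
         * also check here whether the cached value is still fresh. */
        refresh_in_progress = true;
        pthread_mutex_unlock(&state_lock);

        int v = expensive_recompute();   /* done outside the lock */

        pthread_mutex_lock(&state_lock);
        cached_value = v;
        refresh_in_progress = false;
        /* Broadcasting is safe here: waiters only read the result,
         * they do not race to acquire a resource. */
        pthread_cond_broadcast(&refreshed);
    } else {
        /* Followers wait for the leader instead of recomputing. */
        while (refresh_in_progress)
            pthread_cond_wait(&refreshed, &state_lock);
    }
    int result = cached_value;
    pthread_mutex_unlock(&state_lock);
    return result;
}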
Another key strategy involves incorporating backoff mechanisms in retry logic for accessing contended resources, particularly exponential backoff with jitter to desynchronize attempts and distribute load over time. This technique progressively increases wait times between retries—starting with a base delay and multiplying it exponentially (e.g., delay = initial * 2^attempt)—while adding random jitter to prevent synchronized retries that exacerbate the problem. In Java, this can be applied when using locks like ReentrantLock for contended sections, where threads attempt acquisition with timeouts and back off accordingly to avoid repeated immediate retries. The following code snippet illustrates a simple exponential backoff retry for acquiring a lock:
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;
import java.util.Random;

public class BackoffRetryExample {
    private static final ReentrantLock lock = new ReentrantLock();
    private static final Random random = new Random();
    private static final long INITIAL_DELAY = 100;   // ms
    private static final double MULTIPLIER = 2.0;
    private static final long MAX_DELAY = 10000;     // ms

    public boolean tryAcquireWithBackoff(int maxAttempts) {
        long delay = INITIAL_DELAY;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (lock.tryLock()) {
                try {
                    // Critical section
                    return true;
                } finally {
                    lock.unlock();
                }
            }
            // Exponential backoff with jitter before the next attempt
            long jitter = (long) (random.nextDouble() * delay);
            try {
                TimeUnit.MILLISECONDS.sleep(delay + jitter);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
            delay = Math.min((long) (delay * MULTIPLIER), MAX_DELAY);
        }
        return false;
    }
}
This pattern, widely recommended for mitigating retry storms in distributed services, staggers access over time and helps prevent cascading failures.[34][35]
Asynchronous dispatching further addresses the issue by serializing access through models like the actor model or dedicated message queues, ensuring events are processed in a controlled, non-broadcast manner. In the actor model, as implemented in frameworks like Akka, each actor handles messages sequentially on its own thread, encapsulating state and avoiding shared mutable data that leads to contention; this promotes efficient concurrency without locks, as multiple actors can run on a shared thread pool while processing independently. For example, Akka's persistence recovery mechanisms use staggered retries among actors to prevent thundering herd during state reconstruction. Similarly, message queues such as RabbitMQ, in versions 3.12 and later (as of 2025), default to storing messages on disk immediately for classic queues, minimizing RAM usage and allowing consumers to process messages gradually without all threads awakening simultaneously due to memory pressure. This decouples producers and consumers, serializing access and smoothing bursts.[36][37][38]
Best practices for application design emphasize avoiding global broadcasts in favor of targeted signaling, such as directing notifications only to relevant components via queues or pub-sub with filters, which minimizes unnecessary wake-ups and contention. While these approaches introduce some complexity—such as managing leader failover or queue backlogs—they enhance portability across platforms and improve overall system resilience by prioritizing serialized, desynchronized access over reactive, herd-inducing patterns. Trade-offs include potential latency from queuing but are offset by reduced resource exhaustion during peaks.[33][39]