Network load balancing
Network Load Balancing (NLB) is a clustering technology that allows multiple servers to be managed as a single virtual cluster, distributing incoming TCP/IP traffic across the nodes to improve availability and scalability for applications such as web servers, FTP, and VPNs.[1] Primarily implemented as a software feature in Microsoft Windows Server, NLB operates by having cluster hosts respond to client requests through a shared virtual IP address, functioning at the network and transport layers of the OSI model. Traffic distribution is coordinated in a fully distributed fashion: cluster nodes exchange heartbeats to monitor each other's status and balance load across the available hosts according to configured port rules. NLB supports session affinity based on client IP address for consistent routing and enables dynamic scaling, allowing hosts to be added or removed without downtime.[1] It uses virtual IP addresses to present the cluster as a unified entity and operates in unicast or multicast mode to handle network traffic efficiently in enterprise and data center environments. NLB enhances reliability through automatic failover, redistributing traffic from failed hosts within about 10 seconds, and supports high availability for handling variable loads in networked applications.[1]

Fundamentals
Definition and Purpose
Network load balancing (NLB) is a technique used to distribute incoming network traffic across multiple servers or resources in a cluster, ensuring that no single server becomes overwhelmed and acts as a bottleneck.[2] The cluster is treated as a single virtual entity, and client requests are spread across its members to optimize performance and prevent failures due to overload.[1] At its core, NLB operates within the client-server architecture, where clients, such as web browsers or applications, send requests for services or data to servers that process and respond to them.[3] In this model, traffic flows from clients to a central point (the load balancer), which directs each request to an available backend server based on predefined criteria, maintaining smooth communication and access to resources.

The primary purposes of NLB include enhancing scalability to accommodate growing traffic volumes, providing high availability through server redundancy to minimize downtime, and improving resource utilization in environments such as data centers and web applications.[4] By distributing workloads, NLB keeps applications responsive under high demand, reducing latency and supporting fault tolerance if one server fails.[5]

NLB typically focuses on Layer 4 (transport layer) balancing, where decisions are made from IP addresses and ports without inspecting application data, distinguishing it from Layer 7 (application layer) proxies that analyze request content for more granular routing.[6]
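The distinction can be illustrated with a short sketch. In the hypothetical Python example below, a Layer 4 decision uses only the transport-layer 4-tuple, while a Layer 7 decision parses the request itself; the backend addresses and routing rule are illustrative assumptions, not part of any particular product.

    import hashlib

    BACKENDS = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]  # hypothetical server pool

    def l4_select(src_ip, src_port, dst_ip, dst_port):
        # Layer 4: choose a backend from the connection 4-tuple alone;
        # the application payload is never inspected.
        key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
        digest = int(hashlib.sha256(key).hexdigest(), 16)
        return BACKENDS[digest % len(BACKENDS)]

    def l7_select(request_line):
        # Layer 7: inspect application data (here, the HTTP request path)
        # and route on its content.
        path = request_line.split()[1]           # e.g. "GET /api/users HTTP/1.1"
        return "10.0.0.13" if path.startswith("/api/") else "10.0.0.11"

    print(l4_select("203.0.113.7", 52311, "198.51.100.1", 443))
    print(l7_select("GET /api/users HTTP/1.1"))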
Historical Development

Network load balancing emerged in the mid-1990s amid the rapid growth of the internet, web servers, and early e-commerce platforms, which generated unprecedented traffic spikes that overwhelmed single-server architectures.[7] Early approaches relied on simple techniques such as DNS round-robin, in which multiple IP addresses were assigned to a single domain name and rotated sequentially to distribute requests across servers, a basic precursor to more sophisticated balancing methods.[8] The driver was the need to scale websites during the dot-com boom of the late 1990s, when surging online demand required affordable ways to handle large numbers of concurrent users without hardware failures.[9]

A pivotal milestone came with the introduction of Microsoft's Network Load Balancing (NLB) in Windows 2000, a software-based clustering solution that enabled TCP/IP traffic distribution across multiple hosts for high availability without dedicated hardware.[1] In the early 2000s, hardware appliances from vendors such as F5 and Cisco gained prominence, providing robust Layer 4 traffic management with health checks and NAT to route requests away from overloaded or failed servers and improving performance by up to 25% over DNS-based methods.[8] These developments were aided by Moore's law, which steadily reduced server hardware costs and made clustered deployments economically viable for enterprises scaling beyond individual machines.[10]

The mid-2000s saw virtualization, led by VMware's advancements since 1999, integrate load balancing into virtual environments, allowing dynamic resource allocation across virtual machines and paving the way for software-defined solutions.[11] After 2010, the rise of cloud computing shifted the focus to elastic, software-based balancing; Amazon Web Services had launched Elastic Load Balancing in 2009 to automatically distribute traffic across EC2 instances in scalable clusters. This transition from on-premises hardware to cloud-native services enabled seamless handling of variable loads, reflecting broader adoption in distributed architectures.[12]

Core Mechanisms
Traffic Distribution Techniques
Network load balancing employs various techniques to distribute incoming traffic across multiple servers, ensuring efficient resource utilization and high availability. One foundational method is IP-based distribution, in which traffic is routed by hashing attributes such as the client's source IP address (and often the destination port) to deterministically select a backend server from the pool. This approach, known as IP hashing, generates a key from the client and server IP addresses and maps the request to a specific server, maintaining consistency without requiring session-state tracking at the load balancer.[13]

Session persistence, also referred to as sticky sessions, complements IP hashing by ensuring that subsequent requests from the same client are directed to the same server, preserving application state for stateful protocols such as HTTP sessions. This is achieved through affinity rules based on client IP, source port, or higher-layer identifiers such as cookies, preventing disruptions to user sessions while still allowing load to be distributed across the cluster. In practice, inactivity timeouts release affinity after a period of idleness, balancing persistence with even load spreading.[14]

Health checks are integral to traffic distribution, enabling the load balancer to continuously probe server availability and remove unhealthy nodes from the rotation. Probes typically operate at different layers: Layer 3 using ICMP pings for basic connectivity, Layer 4 via TCP or UDP connections to check port responsiveness, and Layer 7 through HTTP requests to validate application-level functionality. Failed probes trigger immediate rerouting of traffic to available servers, maintaining cluster reliability.[14][15]

At Layer 4, traffic distribution focuses on transport-layer protocols such as TCP and UDP, enabling port-based routing in which connections are balanced on the 4-tuple of source IP, source port, destination IP, and destination port. This allows connection multiplexing, in which multiple client connections are aggregated and shared over fewer server links, optimizing bandwidth usage in high-throughput environments. Such techniques permit stateless operation while supporting protocols that require low-latency forwarding.[14][15]

Cluster synchronization facilitates dynamic load redistribution through mechanisms such as heartbeat protocols, in which nodes periodically exchange status messages to detect failures and share load information. Upon detecting a node failure via missed heartbeats, the cluster updates its membership view, and the surviving nodes absorb the redistributed traffic according to predefined rules. Accrual-based detection estimates the probability of failure from heartbeat arrival times, enabling proactive adjustments without centralized coordination.[16]

A representative workflow begins with the load balancer inspecting an incoming packet's header for source details. An affinity rule or hash function then selects a target server; if the server's health check passes, the packet is forwarded, potentially multiplexed with others. In case of failure, detected via heartbeat or probe, traffic is rerouted to an alternative server, ensuring seamless continuity (as a flowchart: client packet → inspection/hash → health check → forward/reroute → server response). Load balancing algorithms, such as those optimizing for least connections, inform these decisions and are detailed in the next section.[14][13]
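A minimal sketch of this workflow in Python, assuming a static backend pool and a health table maintained by separate probes (all names are illustrative), shows how hash-based selection and health checks combine: the client's source address is hashed to pick a backend, and traffic is rerouted to the next healthy host if the chosen one has failed a probe.

    import hashlib

    backends = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]   # hypothetical pool
    healthy = {b: True for b in backends}                # updated by health probes elsewhere

    def pick_backend(client_ip, client_port):
        # Hash the client's source address and port to a starting position in the pool.
        key = f"{client_ip}:{client_port}".encode()
        start = int(hashlib.sha256(key).hexdigest(), 16) % len(backends)
        # Walk the pool from that position until a healthy host is found (failover).
        for offset in range(len(backends)):
            candidate = backends[(start + offset) % len(backends)]
            if healthy[candidate]:
                return candidate
        raise RuntimeError("no healthy backends available")

    print(pick_backend("203.0.113.7", 52311))   # the same client maps consistently
    healthy["10.0.0.12"] = False                # simulate a failed Layer 4 probe
    print(pick_backend("203.0.113.7", 52311))   # rerouted only if its host failed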
Load Balancing Algorithms

Load balancing algorithms determine how incoming network traffic is distributed across multiple servers to optimize resource utilization, minimize response times, and prevent overload on any single node. They can be broadly classified into static methods, which make decisions from predefined configurations without considering real-time server state, and dynamic methods, which adapt to current load conditions for more efficient distribution. A survey of load balancing techniques in cloud computing environments notes that static algorithms such as round-robin suit homogeneous server clusters, while dynamic ones such as least connections excel in heterogeneous setups with varying workloads.[17]

Among the most common algorithms is round-robin, which assigns incoming requests to servers sequentially in a cyclic order, ensuring an even distribution over time. The method is effective when servers have identical processing capabilities and request handling times are uniform, as it promotes fairness without ongoing monitoring. However, round-robin does not account for current server load, which can lead to inefficiencies if some servers become temporarily overloaded.[18]

The least connections algorithm, a dynamic approach, routes each new request to the server with the fewest active connections at the moment of arrival, balancing the workload more precisely in scenarios with persistent or long-lived sessions. The method assumes that connection count indicates processing load and is well suited to applications such as web servers, where connection counts correlate with resource usage. Its main advantage is improved fairness under uneven loads, though it incurs overhead from continuously tracking connection state across the cluster.[19]

Weighted round-robin extends basic round-robin by assigning each server a weight proportional to its capacity, such as CPU power or memory, so that higher-capacity servers receive more traffic. For instance, a server with twice the capacity of another might be assigned a weight of 2 and receive roughly double the requests in the rotation. This static variant improves distribution in heterogeneous environments but does not adapt to runtime changes in server performance.[20]

Advanced methods include IP hash, which computes a hash over the client and server IP addresses (and optionally ports) to deterministically map requests from the same client to the same server, preserving session affinity without storing state. This ensures consistent routing for sticky sessions in applications such as e-commerce carts, but it can produce uneven loads if the distribution of client IPs is skewed, as in NAT environments.[21]

Least response time builds on dynamic balancing by selecting the server with the lowest measured response time for recent requests, often combined with connection counts to avoid overburdening slow servers. It directly targets end-user performance by prioritizing speed, making it suitable for latency-sensitive applications such as video streaming, though it requires active health checks and can introduce slight decision-making delays due to latency measurements.[22]
For highly variable traffic patterns, dynamic algorithms incorporating predictive analytics and machine learning forecast future loads from historical data and real-time metrics in order to allocate resources proactively. These approaches, such as those employing temporal graph neural networks for state prediction and reinforcement learning for task scheduling, make it possible to anticipate spikes and reduce reactive adjustments. They handle bursty workloads well but demand significant computational resources for model training and inference.[23]

The mathematical foundation of the least connections algorithm can be expressed as selecting the server i that minimizes the current number of active connections:

    i = \arg\min_{j \in \text{servers}} \text{connections}_j

Pseudocode for its implementation upon the arrival of a new request is as follows:

    function selectServer(request):
        min_conn = infinity
        selected_server = None
        for server in cluster_servers:
            if connections[server] < min_conn:
                min_conn = connections[server]
                selected_server = server
        route(request, selected_server)
        connections[selected_server] += 1
        return selected_server

This logic ensures balanced distribution by favoring underutilized servers, promoting fairness in connection-heavy scenarios at the cost of monitoring overhead.[18]

In handling uneven loads, such as during traffic spikes, dynamic algorithms like least connections and machine-learning-based predictors outperform static ones like round-robin by adapting to real-time conditions, achieving throughput improvements of 20-24% and response-time reductions of up to 40% in simulated cloud environments with heterogeneous workloads. For example, in a study of SIP server clusters, least connections yielded up to 24% higher throughput than non-adaptive methods under imbalanced conditions. Machine-learning variants further enhance this by forecasting loads, demonstrating 20% throughput gains and 35% makespan reductions over traditional heuristics in dynamic graph-based models.[24][23]
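For comparison with the least-connections pseudocode above, the following Python sketch shows weighted round-robin with hypothetical weights; each server is expanded into the rotation in proportion to its configured capacity. Production balancers often interleave the rotation more smoothly, but the proportions are the same.

    from itertools import cycle

    # Hypothetical capacities: "big" should receive twice the traffic of the others.
    weights = {"big": 2, "medium": 1, "small": 1}

    # Expand each server into the rotation as many times as its weight.
    rotation = cycle([srv for srv, w in weights.items() for _ in range(w)])

    def next_server():
        # Return the next backend in the weighted cyclic order.
        return next(rotation)

    # Eight requests yield big, big, medium, small, big, big, medium, small.
    print([next_server() for _ in range(8)])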
Operational Modes
Microsoft Network Load Balancing (NLB) and its operational modes, including unicast and multicast, are deprecated as of Windows Server 2025 and are no longer actively developed; alternatives such as software load balancers are recommended.[25]

Unicast Mode
In unicast mode, all cluster nodes share a single virtual IP address and answer ARP requests for that IP with the same virtual cluster MAC address, a behavior sometimes likened to ARP spoofing. Because the switch cannot associate this shared MAC with a single port, incoming traffic directed to the virtual IP is flooded to all connected cluster nodes, where an internal load balancing mechanism selects one node to process the packets while the others discard them. This emulation makes the cluster appear as a single network entity to upstream devices.[26][27]

Configuration in environments like Microsoft Windows involves selecting unicast mode during cluster creation via the Network Load Balancing Manager, which binds the NLB driver to the designated network adapters and overrides their original hardware MAC addresses with the cluster MAC. Switches connected to the cluster must tolerate the same MAC appearing on multiple ports, which often requires disabling port security features that enforce unique MAC learning per port; unlike multicast mode, IGMP snooping is irrelevant and should not be enabled for unicast operation. Nodes are typically connected to a dedicated switch or VLAN to contain the flooding.[28][29]

A primary advantage of unicast mode is its straightforward integration with standard network infrastructure, as the cluster presents itself as one logical device without requiring multicast-capable hardware or protocols, making it suitable for legacy or non-multicast environments.[1] However, unicast mode can cause network inefficiencies, including traffic duplication in which inbound packets reach every node, roughly doubling the load on the local network segment as non-selected nodes receive and drop unnecessary copies. Without proper switch configuration, such as isolating the cluster on a dedicated segment, this flooding risks broadcast storms, excessive bandwidth consumption, or even spanning tree loops if redundant paths exist.[26][30]
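The per-node filtering that follows the switch flooding can be sketched as below; this is a simplified illustration of the distributed-filtering idea in Python, not Microsoft's actual hashing rule. Every node applies the same deterministic function to each flooded packet, so exactly one node accepts it and the rest silently discard it.

    import hashlib

    def accepts(packet_src_ip, my_host_id, active_hosts):
        # Every node runs the same deterministic hash over the packet's source IP,
        # so all nodes agree on a single "owner" without exchanging messages.
        digest = int(hashlib.sha256(packet_src_ip.encode()).hexdigest(), 16)
        owner = sorted(active_hosts)[digest % len(active_hosts)]
        return owner == my_host_id

    # Hosts 1-3 all receive the flooded frame; only the computed owner processes it.
    active = [1, 2, 3]
    print([host for host in active if accepts("203.0.113.7", host, active)])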
Multicast Mode

In multicast mode, Network Load Balancing (NLB) assigns a shared multicast MAC address (of the form 03-BF-XX-XX-XX-XX, derived from the virtual IP address octets in hexadecimal) to the cluster's virtual IP address, while each node retains its original unicast MAC address. Incoming traffic destined for the virtual IP is resolved via ARP to this multicast MAC, causing network switches to flood the packets to all ports in the VLAN unless IGMP snooping is enabled. Each node in the cluster joins the corresponding multicast group and receives the flooded traffic, after which the NLB driver filters it according to predefined port rules to determine which node processes the request.[26]

To implement multicast mode, administrators enable it through the NLB Manager console during cluster configuration, which modifies the network adapter settings to support multicast operation. Network interfaces must have multicast enabled, and for best performance IGMP multicast mode is recommended, in which nodes send IGMP membership reports to join the group (typically mapped to a multicast IP such as 239.255.x.y, with x.y derived from the virtual IP's last two octets). Switches capable of IGMP snooping are required to build MAC address tables dynamically from these reports; in environments without an IGMP querier (often provided by a router or designated switch), manual configuration or enabling a querier may be necessary to maintain group membership and prevent traffic flooding.[26][31]

This mode offers benefits such as efficient bandwidth utilization by avoiding the traffic duplication common in unicast mode, where all nodes share a single MAC address, leading to switch port blocking or replication overhead. It supports high-throughput scenarios by leveraging native multicast delivery, reduces the performance impact on interconnected switches, and permits direct node-to-node communication within the cluster because individual MAC addresses are preserved.[28][26]

However, multicast mode has drawbacks, including incompatibility with switches that block or poorly handle multicast traffic, potentially causing packet drops or excessive flooding. It also adds complexity to routing configuration, as the multicast MAC requires static ARP entries on routers and switches without IGMP support, and some network devices may not forward multicast packets correctly without additional configuration.[26][27]
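Based on the address formats described above, the multicast-mode addresses can be derived from the virtual IP as in the Python sketch below; the mapping is taken from the formats stated here and should be confirmed against the documentation for a specific NLB version.

    def nlb_multicast_addresses(virtual_ip):
        # Cluster multicast MAC: 03-BF followed by the virtual IP octets in hex;
        # IGMP group address: 239.255.x.y, where x.y are the last two IP octets.
        octets = [int(part) for part in virtual_ip.split(".")]
        mac = "03-BF-" + "-".join(f"{octet:02X}" for octet in octets)
        igmp_group = f"239.255.{octets[2]}.{octets[3]}"
        return mac, igmp_group

    # Example: 192.168.1.10 yields MAC 03-BF-C0-A8-01-0A and group 239.255.1.10.
    print(nlb_multicast_addresses("192.168.1.10"))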
Implementations

Microsoft NLB
Microsoft Network Load Balancing (NLB) is a clustering technology introduced as the Windows Load Balancing Service (WLBS) with Windows NT Server 4.0 Enterprise Edition in 1997, functioning as a kernel-mode driver that enables up to 32 nodes to operate as a single virtual cluster for distributing TCP/IP traffic.[32][28] It primarily supports stateless TCP/UDP-based services such as HTTP for web servers and FTP, allowing seamless load distribution across cluster hosts without requiring shared storage.[1][28]

Key features of NLB include automatic failover, in which the cluster detects a failed host and redistributes traffic to the remaining nodes within 10 seconds, ensuring minimal disruption in high-availability scenarios.[1][28] It supports port-specific rules that define load balancing behavior for individual TCP/IP ports or port ranges, such as directing all HTTP traffic (port 80) to multiple hosts while restricting other ports to a single host for affinity-based handling.[1] NLB is compatible with Hyper-V, enabling virtualized clusters in which multiple virtual machines on Hyper-V hosts form an NLB cluster without the need for multihomed physical servers, supporting scalable deployments in virtual environments.[1]

Configuration of an NLB cluster begins with installing the feature through Server Manager via the Add Roles and Features Wizard or with the PowerShell cmdlet Install-WindowsFeature NLB -IncludeManagementTools, followed by creating the cluster with tools such as NLB Manager (nlbmgr.exe) or the New-NlbCluster cmdlet, specifying parameters such as the cluster IP address and virtual name.[1][33] Port rules and host priorities are then defined in the NLB Manager interface, with affinity settings configurable as none (for stateless distribution), single (routing all requests from a client IP to one host), or class C (network-address-based affinity for broader client grouping).[28] Once configured, the cluster can operate in unicast or multicast mode to handle traffic routing.[28]
NLB integrates natively with Windows Server editions from 2000 through 2022, providing built-in support for on-premises clustering in enterprise environments.[1] However, as of Windows Server 2025, NLB is deprecated and no longer under active development, with Microsoft recommending migration to cloud-native alternatives like Azure Load Balancer for modern, scalable deployments.[25]