Bidirectional Forwarding Detection
Bidirectional Forwarding Detection (BFD) is a network protocol designed to rapidly detect faults in the bidirectional path between two forwarding engines, enabling quick convergence in routing and forwarding systems.[1] Developed by the Internet Engineering Task Force (IETF), BFD operates independently of underlying media types, data protocols, and routing protocols, which makes it usable across physical links, virtual circuits, tunnels, and multi-hop routed paths.[1] The protocol achieves low-latency failure detection, potentially within milliseconds, through the exchange of lightweight BFD control packets that let systems monitor path integrity without significant overhead.[1] BFD supports two primary operational modes: asynchronous mode, where systems periodically transmit control packets to each other, and demand mode, which reduces packet overhead by relying on an independent connectivity verification mechanism.[1] An optional echo function further enhances testing by looping packets back through the forwarding path to verify its operational status.[1] Over IP, single-hop sessions use UDP port 3784 for control packets and port 3785 for echo packets, while multi-hop control packets use port 4784; detection times are configurable via transmit and receive intervals and a detection multiplier to balance speed and reliability.[1] Standardized in RFC 5880 as a Proposed Standard in June 2010, BFD has been extended in companion RFCs to cover specific applications, including IPv4/IPv6 single-hop paths (RFC 5881), multi-hop paths (RFC 5883), and encapsulation in various tunneling protocols.[1]

Overview
Definition and Purpose
Bidirectional Forwarding Detection (BFD) is a protocol, carried in UDP when run over IP, that detects faults in the bidirectional forwarding path between two network devices, such as routers or switches.[2] For single-hop sessions, it uses UDP port 3784 for control packets and port 3785 for echo packets; multi-hop control packets use port 4784.[3]

The primary purpose of BFD is to provide low-overhead, short-duration failure detection, typically achieving sub-second detection times to facilitate fast convergence in routing protocols.[2] Because it operates independently of specific media types, encapsulations, and network topologies, BFD applies across diverse environments without relying on the underlying data or routing protocols.[4] This independence lets BFD improve network reliability by decoupling fault detection from slower control-plane mechanisms.[2] Unlike the hello mechanisms built into routing protocols, BFD can run in or close to the forwarding plane (data plane), enabling quicker and more efficient path validation.[4]

BFD sessions are lightweight, requiring minimal bandwidth and processing resources, which makes them suitable for high-scale deployments.[2] They can operate over various Layer 2 and Layer 3 transports, including IP, MPLS, and Ethernet.[2] For flexible fault monitoring, BFD provides asynchronous and demand modes, supplemented by an optional echo function.[2]

Key Features
BFD operates independently of any specific media types, data protocols, or routing protocols, enabling fault detection across diverse environments such as Ethernet and SONET/SDH and across topologies including point-to-point and multi-hop paths.[2] This protocol-agnostic design allows BFD to run atop network layer, link layer, or tunnel encapsulations without modification to the underlying forwarding engines.[2]

The protocol scales well, supporting a very large number of sessions per device (potentially hundreds or thousands) with minimal CPU and bandwidth overhead, achieved through simple periodic hello messages.[2][5] For instance, achieving a 50-millisecond detection time requires only about 60 packets per second, consuming roughly 48 kbps of bandwidth, which underscores its efficiency in resource-constrained environments.[2]

Unlike unidirectional probing mechanisms, BFD confirms two-way communication between systems before declaring a path operational, so a fault in either direction of the forwarding path is detected.[2]

Detection times in BFD are configurable to balance speed and resource utilization, with transmit and receive intervals tunable to sub-50-millisecond latencies while keeping overhead low.[2] The detection time is the negotiated interval (the larger of the local Required Min RX Interval and the peer's Desired Min TX Interval) multiplied by the peer's Detect Mult value; for example, a multiplier of 3 triggers a down state after three consecutive missed packets.[2]

Protocol Mechanics
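The detection-time and overhead arithmetic described under Key Features can be sketched as follows. This is a minimal illustration, not a reference implementation: the function names are hypothetical, and the 100-octet on-wire packet size is an assumption chosen because it reproduces the article's 48 kbps figure; the real size depends on the encapsulation.

```python
def detection_time_us(remote_detect_mult: int,
                      local_required_min_rx_us: int,
                      remote_desired_min_tx_us: int) -> int:
    """Asynchronous-mode detection time, in microseconds.

    The agreed interval is the larger of what the local system can receive
    and what the remote system wants to send; the remote system's Detect
    Mult scales it into a detection time.
    """
    agreed_interval_us = max(local_required_min_rx_us, remote_desired_min_tx_us)
    return remote_detect_mult * agreed_interval_us


def tx_interval_for_target_us(target_detection_us: int, detect_mult: int) -> int:
    """Transmit interval needed to hit a target detection time."""
    return target_detection_us // detect_mult


def packets_per_second(tx_interval_us: int) -> float:
    """Packet rate in one direction for a given transmit interval."""
    return 1_000_000 / tx_interval_us


def bandwidth_bps(tx_interval_us: int, packet_octets: int) -> float:
    """Bandwidth in one direction; packet_octets is an assumed on-wire size."""
    return packets_per_second(tx_interval_us) * packet_octets * 8


# A 50 ms detection time with a multiplier of 3 needs a ~16.7 ms interval,
# which works out to roughly 60 packets per second.
interval_us = tx_interval_for_target_us(50_000, 3)
rate_pps = packets_per_second(interval_us)
rate_bps = bandwidth_bps(interval_us, 100)  # 100 octets is an assumption
```

Taking the maximum of the two advertised intervals means the slower side always sets the pace, which is why each direction of a session can run at a different rate.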
Session Establishment and States
Bidirectional Forwarding Detection (BFD) sessions are established through the exchange of BFD Control packets between two peering systems, with at least one system required to take the Active role to initiate the process.[6] BFD has no discovery mechanism of its own; it relies on application-specific methods to identify peers, and it supports both single-hop configurations for adjacent systems and multi-hop configurations across intermediate hops or tunnels.[6] During establishment, peers negotiate transmit and receive intervals by carrying the Desired Min TX Interval and Required Min RX Interval in Control packets. Each direction is negotiated independently: the sender transmits at the larger of its own Desired Min TX Interval and the receiver's Required Min RX Interval.[6]

The BFD protocol uses a finite state machine with four states to manage the session lifecycle: AdminDown, Down, Init, and Up.[6] The AdminDown state indicates that the session is being held administratively down; received Control packets are discarded, and the peer is driven to the Down state.[6] The Down state represents the initial or non-operational condition, where no valid packets have been received from the peer.[6] The Init state occurs when packets are received from the peer but bidirectional communication is not yet confirmed, marking the middle of a three-way handshake.[6] Finally, the Up state denotes a fully operational session with ongoing bidirectional exchange of Control packets.[6]

State transitions are driven by the receipt or absence of Control packets, following a deterministic three-way handshake.[6] For instance, a session moves from Down to Init upon receiving a packet indicating the peer's Down state; from Init to Up when the peer reports Init or Up, showing that it has seen the local system's packets; and from Up to Down if the Detection Time expires due to missed packets, such as three consecutive failures when the Detect Multiplier is set to three.[6] A session in Init also reverts to Down if no further packets arrive before the Detection Time expires, ensuring rapid detection of issues.[6] These changes are governed by timers such as the Detection Time, which is the product of the negotiated receive interval and the Detect Multiplier.[6]

Sessions are uniquely identified using 32-bit discriminators: the My Discriminator field, a locally generated unique nonzero value, and the Your Discriminator field, which echoes the peer's My Discriminator.[6] This pairing allows demultiplexing of multiple concurrent sessions, giving each system a space of nearly 2^32 session identifiers.[6] Before discriminators have been exchanged, sessions rely on application-layer or transport-specific identifiers.[6]

Upon detecting a fault, such as a transition to the Down state, BFD signals its client protocols (for example, routing protocols) through an application programming interface (API), including a diagnostic code that indicates the failure reason, such as Control Detection Time Expired.[6] This notification enables rapid reconvergence by triggering actions such as route withdrawal or failover in the dependent protocols.[6]

| Current State | Event | Next State |
|---|---|---|
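| Down | Receive packet with peer's state Down | Init |
| Down | Receive packet with peer's state Init | Up |
| Init | Receive packet with peer's state Init or Up | Up |
| Init or Up | Detection Time expires (e.g., 3 missed packets) | Down |
| Up | Receive packet with peer's state Down | Down |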
| Any | Administrative disable | AdminDown |
| AdminDown | Administrative enable | Down |
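
The handshake can be sketched as a pair of pure transition functions. This is an illustrative sketch following the reception rules in RFC 5880 section 6.8.6, so it also covers transitions beyond the rows listed in the table; the names `State`, `on_packet`, and `on_detection_timeout` are hypothetical, and timers, discriminators, and diagnostics are deliberately left out.

```python
from enum import IntEnum


class State(IntEnum):
    """Session states, using the values carried in the Sta field."""
    ADMIN_DOWN = 0
    DOWN = 1
    INIT = 2
    UP = 3


def on_packet(local: State, remote: State) -> State:
    """Return the new local state after a valid Control packet arrives."""
    if local is State.ADMIN_DOWN:
        return local                      # packets are discarded while disabled
    if remote is State.ADMIN_DOWN:
        return State.DOWN                 # peer was administratively disabled
    if local is State.DOWN:
        if remote is State.DOWN:
            return State.INIT             # first leg of the handshake
        if remote is State.INIT:
            return State.UP               # peer has already seen our packets
        return local                      # stale Up from the peer: stay Down
    if local is State.INIT:
        return State.UP if remote in (State.INIT, State.UP) else local
    # local is UP: only an explicit Down from the peer tears the session down
    return State.DOWN if remote is State.DOWN else local


def on_detection_timeout(local: State) -> State:
    """Return the new local state when the Detection Time expires."""
    return State.DOWN if local in (State.INIT, State.UP) else local


# Three-way handshake between two peers that both start in Down.
a = b = State.DOWN
b = on_packet(b, a)   # B hears A's Down  -> B moves to Init
a = on_packet(a, b)   # A hears B's Init  -> A moves to Up
b = on_packet(b, a)   # B hears A's Up    -> B moves to Up
```

Keeping the transitions free of side effects makes the state machine straightforward to test in isolation from packet I/O and timer handling.
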
Detection Modes
Bidirectional Forwarding Detection (BFD) verifies path integrity between forwarding engines using two operating modes, asynchronous and demand, together with an adjunct echo function that can accompany either mode. These determine how control and echo packets are exchanged after session establishment, with the session typically required to be in the Up state for active detection.[2]

In asynchronous mode, the default operational mode, BFD peers continuously exchange BFD Control packets at negotiated intervals to monitor bidirectional connectivity. Each peer declares a failure if it misses a configured number of consecutive packets from the remote peer, which triggers a session down event. This mode provides proactive fault detection without external triggers, making it suitable for most environments requiring ongoing path verification.[7]

Demand mode relies on asynchronous operation for session establishment and then moves to a maintenance phase in which periodic Control packets are suppressed to reduce overhead. Connectivity is verified only on demand, such as during periodic polling or on specific events, by sending a sequence of Control packets with the Poll bit set; failure to receive a response within the detection window declares the session down. The Demand bit in Control packets signals entry into this mode, though it is less commonly used because it needs explicit polling mechanisms to maintain reliability.[8]

The echo function (often called echo mode) is a supplementary mechanism in which one peer sends BFD Echo packets that the remote peer loops back in the forwarding plane without control-plane processing, testing the data path directly. Echo packets are encapsulated in UDP with destination port 3785 and require underlying IP connectivity between peers. The echo function enables faster failure detection, often under 50 ms, by taking advantage of hardware forwarding, but it is not a standalone detection mode; instead, it complements asynchronous or demand mode by allowing reduced Control packet rates. A peer signals that it is willing to loop Echo packets by setting the Required Min Echo RX Interval field in its Control packets to a nonzero value.[9][10]

Mode selection occurs independently in each direction during session negotiation through flags and fields in BFD Control packets, such as the Demand bit for demand mode and the Required Min Echo RX Interval for echo support. Asynchronous mode is preferred for general-purpose, continuous detection due to its simplicity and reliability, while the echo function is selected for scenarios demanding sub-50 ms fault isolation where hardware looping is supported.[11]

Packet Format and Timers
The BFD Control packet consists of a fixed 24-octet header, optionally followed by an Authentication Section, and is transmitted in an encapsulation appropriate to the environment, such as UDP over IPv4 or IPv6.[12] The header fields are bit-packed as follows, with the first two octets containing version, diagnostic, state, and flag bits; subsequent fields specify session identifiers, intervals, and detection parameters.[12]

| Field | Size (bits) | Description |
|---|---|---|
| Version (Vers) | 3 | Protocol version number, set to 1.[12] |
| Diagnostic (Diag) | 5 | Indicates the local diagnostic code for the last state change (e.g., 0 for no diagnostic, 1 for control detection time expired).[12] |
| State (Sta) | 2 | Current session state (0 = AdminDown, 1 = Down, 2 = Init, 3 = Up).[12] |
| Poll (P) | 1 | Set to 1 to request verification of connectivity and parameters.[12] |
| Final (F) | 1 | Set to 1 in response to a Poll bit, acknowledging the update.[12] |
| Control Plane Independent (C) | 1 | Set to 1 if the session operates independently of the control plane.[12] |
| Authentication Present (A) | 1 | Set to 1 if an Authentication Section follows the header.[12] |
| Demand (D) | 1 | Set to 1 if operating in Demand mode (no periodic transmission).[12] |
| Multipoint (M) | 1 | Reserved; must be 0 (for future multipoint use).[12] |
| Detect Mult | 8 | Detection multiplier (non-zero integer, typically 3-5).[12] |
| Length | 8 | Total length of the packet in octets, including Authentication Section if present (minimum 24).[12] |
| My Discriminator | 32 | Unique local session identifier (non-zero when session is active).[12] |
| Your Discriminator | 32 | Remote system's session discriminator (0 during initialization).[12] |
| Desired Min TX Interval | 32 | Minimum transmission interval desired by sender, in microseconds (the value 0 is reserved).[12] |
| Required Min RX Interval | 32 | Minimum receive interval supported by sender, in microseconds (0 means no periodic packets expected).[12] |
| Required Min Echo RX Interval | 32 | Minimum echo packet receive interval supported by sender, in microseconds (0 means no echo support).[12] |
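
The field layout in the table maps directly onto a fixed binary structure. The following is a minimal sketch of encoding and decoding the 24-octet mandatory section with Python's standard `struct` module; the `BfdControl` class and its attribute names are hypothetical, the six flag bits are kept as a single integer for brevity, and the optional Authentication Section is not handled.

```python
import struct
from dataclasses import dataclass

# Two bit-packed octets, Detect Mult, Length, then five 32-bit fields,
# all in network byte order: 24 octets in total.
HEADER = struct.Struct("!BBBBIIIII")


@dataclass
class BfdControl:
    state: int                      # 0 AdminDown, 1 Down, 2 Init, 3 Up
    detect_mult: int
    my_discriminator: int
    your_discriminator: int
    desired_min_tx_us: int
    required_min_rx_us: int
    required_min_echo_rx_us: int = 0
    diag: int = 0
    flags: int = 0                  # six bits in wire order: P F C A D M
    version: int = 1

    def pack(self) -> bytes:
        """Encode the mandatory section (no Authentication Section)."""
        return HEADER.pack(
            ((self.version & 0x07) << 5) | (self.diag & 0x1F),
            ((self.state & 0x03) << 6) | (self.flags & 0x3F),
            self.detect_mult,
            HEADER.size,            # Length field: 24 without authentication
            self.my_discriminator,
            self.your_discriminator,
            self.desired_min_tx_us,
            self.required_min_rx_us,
            self.required_min_echo_rx_us,
        )

    @classmethod
    def unpack(cls, data: bytes) -> "BfdControl":
        """Decode the mandatory section, rejecting truncated packets."""
        if len(data) < HEADER.size:
            raise ValueError("packet shorter than 24 octets")
        b0, b1, mult, length, my, your, tx, rx, echo = HEADER.unpack_from(data)
        if length < HEADER.size or length > len(data):
            raise ValueError("Length field inconsistent with payload")
        return cls(
            state=b1 >> 6, detect_mult=mult,
            my_discriminator=my, your_discriminator=your,
            desired_min_tx_us=tx, required_min_rx_us=rx,
            required_min_echo_rx_us=echo,
            diag=b0 & 0x1F, flags=b1 & 0x3F, version=b0 >> 5,
        )


# Round trip: an Up session advertising 50 ms intervals and a multiplier of 3.
packet = BfdControl(state=3, detect_mult=3,
                    my_discriminator=1, your_discriminator=2,
                    desired_min_tx_us=50_000, required_min_rx_us=50_000)
raw = packet.pack()
decoded = BfdControl.unpack(raw)
```

A production decoder would apply further reception checks (for example, rejecting a zero Detect Mult or a zero My Discriminator); they are omitted here to keep the layout itself in focus.
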