iSCSI
iSCSI (Internet Small Computer Systems Interface) is a transport protocol for the Small Computer System Interface (SCSI) that enables the encapsulation and transmission of SCSI commands, data, and status over TCP/IP networks, allowing block-level access to storage devices using standard IP infrastructure such as Ethernet. Developed to provide an interoperable solution for storage area networks (SANs), iSCSI maps the SCSI architecture model onto TCP, supporting reliable delivery of I/O operations between initiators (client systems) and targets (storage servers or devices). This protocol facilitates cost-effective, scalable storage connectivity over existing networks without requiring specialized hardware like Fibre Channel.[1]
Standardized by the Internet Engineering Task Force (IETF), iSCSI was initially defined in RFC 3720 in April 2004[2] and later consolidated and updated in RFC 7143 in April 2014[3] to incorporate errata, clarifications, and enhancements while maintaining backward compatibility. Key features include a login phase for parameter negotiation, support for multiple connections per session for performance and redundancy, error detection and recovery mechanisms aligned with SCSI standards, and optional security extensions like CHAP authentication and IPsec. Widely adopted in enterprise and data center environments, iSCSI enables virtualization, cloud storage, and remote replication by leveraging high-speed Ethernet advancements up to 100 Gbps and beyond.[4]
Fundamentals
Definition and Purpose
iSCSI, or Internet Small Computer Systems Interface, is a transport protocol for the Small Computer System Interface (SCSI) that operates on top of TCP/IP networks.[5] It was originally defined in RFC 3720 as a proposed standard by the Internet Engineering Task Force (IETF) in April 2004 and later consolidated and updated in RFC 7143 in April 2014.[6][5] The protocol encapsulates SCSI commands, data, and status within iSCSI protocol data units (PDUs) for transmission over standard IP networks, ensuring compatibility with the SCSI Architecture Model.[5]
The primary purpose of iSCSI is to enable initiators, such as servers, to access remote storage targets as if they were locally attached block devices, thereby facilitating block-level storage networking over Ethernet without requiring dedicated storage area network (SAN) hardware.[7] This approach contrasts with traditional SCSI, which relies on direct physical connections or specialized transports like Fibre Channel, by leveraging ubiquitous TCP/IP for extending storage access across local area networks (LANs) or wide area networks (WANs).[7]
Key benefits of iSCSI include its cost-effectiveness, as it utilizes existing Ethernet infrastructure and avoids the expense of Fibre Channel switches and host bus adapters, making it suitable for small to medium-sized enterprises.[7] It also offers scalability for large data centers by supporting high-speed Ethernet links and integration with virtualization environments, where multiple virtual machines can share remote storage resources efficiently.[8]
Historically, iSCSI originated from a proof-of-concept developed by IBM in 1998, with the initial draft submitted to the IETF in 2000 and approved as a proposed standard in February 2003.[7][9]
Protocol Architecture
The iSCSI protocol employs a layered architecture that maps SCSI operations onto TCP/IP networks, enabling block-level storage access over IP. At the core is the SCSI layer, which generates and processes Command Descriptor Blocks (CDBs) for commands and responses in compliance with the SCSI architecture model as defined in SAM-2.[10] The iSCSI layer then encapsulates these SCSI elements into Protocol Data Units (PDUs) suitable for transmission, handling tasks such as session management, command sequencing, and error recovery.[11] Underlying this is the TCP/IP transport layer, which provides reliable, connection-oriented delivery of the PDUs without awareness of their SCSI or iSCSI semantics.[11] This layering ensures that iSCSI maintains SCSI semantics while leveraging the ubiquity of TCP/IP for network transport.[12]
Central to the iSCSI layer are PDUs, which structure all communications between initiators and targets. Each PDU begins with a Basic Header Segment (BHS) of 48 bytes, including key fields such as the opcode (specifying the PDU type, e.g., 0x01 for SCSI Command or 0x21 for SCSI Response) and the Initiator Task Tag (a unique identifier for tracking individual tasks across the session).[13] Following the BHS are optional Additional Header Segments (AHS) for extended information and an optional Data Segment, which carries SCSI payloads, text parameters, or other data and is always padded to a 4-byte boundary.[14] For login-related PDUs, the structure incorporates specific formats during phases like security negotiation (for authentication parameters) and operational negotiation (for session settings such as maximum connections or error recovery levels).[15]
Session establishment in iSCSI occurs through a multi-phase login process on TCP connections, distinguishing between normal sessions for full SCSI operations and discovery sessions limited to target enumeration.[16] Normal sessions begin with a leading login connection using a Target Session Identifying Handle (TSIH) of 0, progressing through three phases: security negotiation to authenticate parties and establish security parameters, login operational negotiation to agree on session-wide settings, and the full feature phase to enable SCSI command execution.[17] Discovery sessions, by contrast, restrict operations to SendTargets commands and similar discovery functions, omitting full data transfer capabilities.[18] Initiators and targets collaboratively negotiate these phases to form a session, which may span multiple TCP connections for enhanced reliability.[19]
Error handling in iSCSI emphasizes integrity and recovery at the protocol level, with support for optional digests to verify PDU components. Header and data digests use CRC32C (or none) to detect corruption during transit, applied independently to the BHS/AHS and to the data segment.[20] Recovery mechanisms address connection failures and data loss through task reassignment (transferring active tasks to a new connection), Selective Negative Acknowledgment (SNACK) requests for retransmitting lost PDUs, and hierarchical levels including within-command recovery for partial errors and session-wide recovery via logout.[21] These features ensure robust operation over potentially unreliable networks while preserving SCSI task integrity.[22]
Core Components
Initiators
iSCSI initiators serve as client-side agents on host servers that originate SCSI commands to remote targets, encapsulating them within TCP/IP packets to access storage over IP networks.[23] These components map iSCSI logical units (LUNs) presented by targets as local block devices, such as /dev/sdX in Linux systems, enabling applications to treat remote storage as if it were directly attached.[23] By establishing sessions via a login phase, initiators facilitate reliable data transfer while handling identification through unique iSCSI names and initiator session identifiers (ISIDs).[23] Initiators are available in software and hardware forms, with software variants integrated into operating systems to leverage the host CPU for protocol processing, making them the most common deployment method.[24] Hardware initiators, typically implemented as host bus adapters (HBAs) or TCP offload engines (TOEs), offload iSCSI and TCP/IP tasks from the CPU to dedicated silicon for reduced latency.[24] A prominent example of a software initiator is the Microsoft iSCSI Initiator service, a built-in Windows component that manages connections to iSCSI targets without requiring additional hardware. 
In operation, initiators issue SCSI read and write commands via Command Descriptor Blocks (CDBs) embedded in SCSI-Command Protocol Data Units (PDUs), using initiator task tags and command sequence numbers (CmdSN) to ensure ordered delivery.[23] They manage sessions by negotiating parameters during login, such as maximum burst length (default 262144 bytes), and support multiple TCP connections per session for enhanced throughput and redundancy through multi-path I/O (MPIO).[23] Error recovery is negotiated as one of three levels: session recovery only (level 0), digest failure recovery (level 1), and connection recovery (level 2). The higher levels add mechanisms like selective negative acknowledgments (SNACKs) for retransmissions, task reassignment, and failover across portals to maintain data integrity during network disruptions.[23]
Performance of initiators varies by type: software implementations introduce CPU overhead for encapsulation and error handling, potentially consuming 10-20% of cycles at high throughput, whereas hardware offloads minimize this to under 10% while providing lower latency for latency-sensitive workloads.[25] To optimize availability and performance, initiators integrate with multipathing frameworks, such as Device Mapper Multipath in Linux, which aggregates multiple paths into a single logical device for load balancing and failover.[26]
Targets and Logical Units
In iSCSI, a target serves as the server-side endpoint that exposes storage resources to initiators over IP networks using TCP connections. It receives SCSI commands encapsulated within iSCSI Protocol Data Units (PDUs), executes the associated I/O operations on underlying storage, and returns responses or status information to the initiator.[27] Targets operate primarily in the Full Feature Phase following successful login negotiation, managing tasks such as command ordering via Command Sequence Numbers (CmdSN) and ensuring connection allegiance where related PDUs stay on the same TCP connection.[10] Each target is uniquely identified by an iSCSI Qualified Name (IQN), a globally unique string formatted according to RFC 3721, such as iqn.2001-04.com.example:storage:diskarrays-sn-a8675309, which combines a date, a naming authority, and a vendor-specific identifier.[28] Targets support multiple network portals (combinations of IP addresses and TCP ports), grouped into portal groups to enable load balancing, failover, and multi-connection sessions for improved performance and reliability.[29]
iSCSI targets can be implemented in hardware or software configurations. Hardware targets are typically integrated into enterprise storage area network (SAN) arrays, where dedicated controllers handle protocol processing and storage exposure at high speeds. Software targets, in contrast, run on general-purpose servers using operating system tools to emulate storage providers; for example, the targetcli administration shell in Linux environments allows configuration of iSCSI targets backed by local block devices or file I/O on commodity hardware.[30] These implementations process incoming SCSI-Command PDUs (opcode 0x01), which include details like the Expected Data Transfer Length for I/O size, and respond with Data-In PDUs for reads or Ready to Transfer (R2T) PDUs to solicit data for writes.[31]
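As a rough illustration of the SCSI Command PDU mentioned above, the following Python sketch packs and unpacks the 48-byte Basic Header Segment. The field offsets follow the SCSI Command layout in RFC 7143; the function names and sample values are illustrative assumptions, and AHS, digests, and the data segment are left out.

```python
import struct

BHS_LENGTH = 48
OPCODE_SCSI_COMMAND = 0x01
FLAG_FINAL, FLAG_READ, FLAG_WRITE = 0x80, 0x40, 0x20

# opcode, flags, reserved, TotalAHSLength, DataSegmentLength (3 bytes),
# LUN (bytes 8-15), Initiator Task Tag, Expected Data Transfer Length,
# CmdSN, ExpStatSN, CDB (16 bytes). Big-endian, no padding.
_BHS_FORMAT = ">BBHB3sQIIII16s"


def build_scsi_command_bhs(lun, task_tag, expected_length, cmd_sn,
                           exp_stat_sn, cdb, write=False):
    """Pack a SCSI Command BHS with no AHS and no immediate data."""
    if len(cdb) > 16:
        raise ValueError("CDBs longer than 16 bytes need an AHS")
    flags = FLAG_FINAL | (FLAG_WRITE if write else FLAG_READ)
    return struct.pack(_BHS_FORMAT, OPCODE_SCSI_COMMAND, flags, 0, 0,
                       b"\x00\x00\x00", lun, task_tag, expected_length,
                       cmd_sn, exp_stat_sn, cdb.ljust(16, b"\x00"))


def parse_scsi_command_bhs(bhs):
    """Unpack the fields of a SCSI Command BHS into a dictionary."""
    if len(bhs) != BHS_LENGTH:
        raise ValueError("a BHS is always 48 bytes")
    (opcode_byte, flags, _reserved, ahs_length, data_length, lun, task_tag,
     expected_length, cmd_sn, exp_stat_sn, cdb) = struct.unpack(_BHS_FORMAT, bhs)
    return {
        "opcode": opcode_byte & 0x3F,          # low six bits carry the opcode
        "immediate": bool(opcode_byte & 0x40),
        "read": bool(flags & FLAG_READ),
        "write": bool(flags & FLAG_WRITE),
        "total_ahs_length": ahs_length,
        "data_segment_length": int.from_bytes(data_length, "big"),
        "lun": lun,
        "initiator_task_tag": task_tag,
        "expected_data_transfer_length": expected_length,
        "cmd_sn": cmd_sn,
        "exp_stat_sn": exp_stat_sn,
        "cdb": cdb,
    }


# Hypothetical READ(10) of 8 blocks starting at LBA 2048 (512-byte blocks).
read10 = struct.pack(">BBIBHB", 0x28, 0, 2048, 0, 8, 0)
header = build_scsi_command_bhs(lun=0, task_tag=0x1234, expected_length=4096,
                                cmd_sn=7, exp_stat_sn=3, cdb=read10)
fields = parse_scsi_command_bhs(header)
```

For a read like this one, a target would answer with Data-In PDUs carrying the same Initiator Task Tag; for a write it would typically solicit the data with R2T PDUs.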
Logical units (LUs) represent the fundamental addressable storage entities within an iSCSI target, corresponding to SCSI logical units that appear as block devices to initiators. Each LU is identified by a 64-bit Logical Unit Number (LUN), formatted per the SCSI Architecture Model (SAM) and included in PDUs such as the SCSI Command PDU (bytes 8-15) to specify the target LU for operations.[32] LUNs are mapped to physical or virtual storage volumes on the target, enabling abstraction of underlying hardware like disks or RAID arrays, and access is scoped to the target's IQN combined with the LUN, as in iqn.1993-08.org.debian:01:abc123/lun/0. LUN masking restricts visibility and access to authorized initiators based on their IQNs, while mapping associates LUNs with specific backend storage resources to control data placement and availability.[33]
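The distinction between mapping and masking can be sketched with two small lookup tables. This is a minimal Python illustration rather than any particular target's implementation; the class name, the backing-store paths, and the initiator names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TargetModel:
    """Toy model of one target: LUN mapping plus per-initiator masking."""
    iqn: str
    lun_map: dict = field(default_factory=dict)  # LUN -> backing store
    masks: dict = field(default_factory=dict)    # initiator IQN -> set of LUNs

    def map_lun(self, lun, backing_store):
        """Mapping: associate a LUN with a backend volume."""
        self.lun_map[lun] = backing_store

    def allow(self, initiator_iqn, lun):
        """Masking: make a mapped LUN visible to one initiator."""
        if lun not in self.lun_map:
            raise KeyError(f"LUN {lun} is not mapped")
        self.masks.setdefault(initiator_iqn, set()).add(lun)

    def visible_luns(self, initiator_iqn):
        return sorted(self.masks.get(initiator_iqn, set()))

    def resolve(self, initiator_iqn, lun):
        """Return the backing store, or refuse if the LUN is masked."""
        if lun not in self.masks.get(initiator_iqn, set()):
            raise PermissionError(f"LUN {lun} is masked for {initiator_iqn}")
        return self.lun_map[lun]


target = TargetModel("iqn.2001-04.com.example:storage:diskarrays-sn-a8675309")
target.map_lun(0, "/dev/vg0/db")      # hypothetical backing volumes
target.map_lun(1, "/dev/vg0/backup")
target.allow("iqn.1993-08.org.debian:01:abc123", 0)
```

With this setup the Debian initiator sees only LUN 0, even though the target maps two volumes.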
Target operations center on handling I/O workflows initiated by commands from iSCSI initiators. Upon receiving a SCSI command, the target processes it in CmdSN order, transferring data bidirectionally—sending output via Data-In PDUs for reads or requesting input via R2T PDUs for writes—before concluding with a SCSI Response PDU containing status, such as GOOD or CHECK CONDITION.[34] Task management functions, like ABORT TASK or CLEAR TASK SET, allow termination of specific LUN operations, with the LUN field specifying the affected unit.[35] Many iSCSI targets support advanced features at the LUN level, including thin provisioning to allocate storage on demand for efficient capacity utilization and snapshots to create point-in-time copies for backup or recovery. These capabilities enhance scalability in environments like virtualized data centers, where LUNs may represent thinly provisioned volumes over-allocated relative to physical backing store.
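The CmdSN ordering rule can be illustrated with a small reordering buffer. The sketch below is a simplification under stated assumptions: it ignores the MaxCmdSN window, immediate commands, and duplicate or stale sequence numbers, and the class name is hypothetical.

```python
class CommandSequencer:
    """Hands commands to the SCSI layer strictly in CmdSN order."""

    def __init__(self, exp_cmd_sn):
        self.exp_cmd_sn = exp_cmd_sn  # next CmdSN the target expects
        self._pending = {}            # CmdSN -> command held out of order

    def receive(self, cmd_sn, command):
        """Queue a command and return those that are now deliverable."""
        self._pending[cmd_sn] = command
        ready = []
        while self.exp_cmd_sn in self._pending:
            ready.append(self._pending.pop(self.exp_cmd_sn))
            # CmdSN is a 32-bit counter that wraps around.
            self.exp_cmd_sn = (self.exp_cmd_sn + 1) % 2**32
        return ready


sequencer = CommandSequencer(exp_cmd_sn=5)
first = sequencer.receive(6, "WRITE block 20")   # arrives early, is held
second = sequencer.receive(5, "READ block 10")   # fills the gap
```

Commands that arrive on different connections of the same session can overtake each other on the wire, which is why the target holds command 6 until command 5 has been delivered.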
Discovery and Connectivity
Addressing Mechanisms
iSCSI employs standardized naming conventions to uniquely identify initiators and targets across IP networks, ensuring persistent and location-independent identification. The primary format is the iSCSI Qualified Name (IQN), structured as iqn.yyyy-mm.reversed-domain:unique-id, where yyyy-mm is a year and month during which the naming authority owned the domain, the reversed domain follows standard DNS conventions (e.g., com.example), and the unique identifier is vendor-specific (e.g., iqn.2001-04.com.example:storage:diskarrays-sn-a8675309).[6] Alternatively, the EUI-64 format uses eui. followed by a 16-hex-digit IEEE EUI-64 identifier (e.g., eui.02004567A425678D), providing a compact name for nodes based on hardware or software identifiers.[6] These names are globally unique, permanent, and not tied to specific hardware, with optional aliases for human-readable reference.[5]
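A loose validator for the two name formats might look like the following Python sketch. The regular expressions are a simplified approximation of the naming rules (they do not implement the full character-set normalization), and the function name is hypothetical; the 223-byte limit is the maximum iSCSI name length given in RFC 3720.

```python
import re

# iqn.<yyyy-mm>.<reversed domain>[:<unique id>]
_IQN = re.compile(
    r"^iqn\.(\d{4}-(?:0[1-9]|1[0-2]))\.([a-z0-9][a-z0-9.-]*)(?::(.+))?$")
# eui.<16 hexadecimal digits>
_EUI = re.compile(r"^eui\.([0-9A-Fa-f]{16})$")


def classify_iscsi_name(name):
    """Return the parsed parts of an IQN or EUI-64 name, or raise ValueError."""
    if len(name.encode("utf-8")) > 223:
        raise ValueError("iSCSI names are limited to 223 bytes")
    match = _IQN.match(name)
    if match:
        date, authority, unique_id = match.groups()
        return {"type": "iqn", "date": date,
                "authority": authority, "unique_id": unique_id}
    match = _EUI.match(name)
    if match:
        return {"type": "eui", "identifier": match.group(1).upper()}
    raise ValueError(f"not a recognized iSCSI name: {name!r}")


iqn = classify_iscsi_name("iqn.2001-04.com.example:storage:diskarrays-sn-a8675309")
eui = classify_iscsi_name("eui.02004567A425678D")
```

Only the first colon separates the naming authority from the unique identifier, so identifiers such as storage:diskarrays-sn-a8675309 may contain further colons.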
Portal addressing facilitates endpoint connectivity by specifying targets via IP address and TCP port, with the default port being 3260 for iSCSI sessions.[6] The TargetAddress parameter in login operations supports formats such as domain name (e.g., example.com:3260,1), IPv4 (e.g., 10.0.1.1:3260,1), or IPv6 (e.g., [2001:db8::1]:3260,1), optionally including a comma-separated portal group tag for session coordination.[5] This addressing scheme enables initiators to establish TCP connections to targets over standard IP networks, abstracting SCSI commands into iSCSI Protocol Data Units (PDUs).[6]
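The three TargetAddress forms above can be split into host, port, and portal group tag with a few string operations. This Python sketch assumes well-formed input and does not validate the host itself; the function name is hypothetical.

```python
DEFAULT_ISCSI_PORT = 3260


def parse_target_address(value):
    """Split 'host[:port][,tag]' into (host, port, portal_group_tag)."""
    address, tag = value, None
    if "," in value:
        address, tag_text = value.rsplit(",", 1)
        tag = int(tag_text)
        if not 0 <= tag <= 65535:
            raise ValueError("portal group tag must fit in 16 bits")
    if address.startswith("["):
        # Bracketed IPv6 literal, optionally followed by ':port'.
        host, _, rest = address[1:].partition("]")
        port = int(rest[1:]) if rest.startswith(":") else DEFAULT_ISCSI_PORT
    elif ":" in address:
        host, port_text = address.rsplit(":", 1)
        port = int(port_text)
    else:
        host, port = address, DEFAULT_ISCSI_PORT
    return host, port, tag


examples = [parse_target_address(text) for text in
            ("example.com:3260,1", "10.0.1.1:3260,1", "[2001:db8::1]:3260,1")]
```

The brackets around IPv6 literals are what make the trailing port unambiguous, since the address itself contains colons.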
Connection management in iSCSI organizes multiple IP/port combinations into portal groups, identified by a 16-bit portal group tag (0-65535), allowing sessions to span several network portals while maintaining consistent SCSI logical unit access.[5] During login, initiators select routes based on discovered or configured target addresses, with targets returning the servicing portal group tag in the initial response to ensure session affinity.[6] Redirection occurs if a target issues a login response with status class 01h (Redirection), for example status 0101h (target moved temporarily), providing an alternative TargetAddress (with the portal group tag omitted in redirects) to guide the initiator to another portal for load balancing or failover.[6] This mechanism supports multiple connections per session within the same portal group, enhancing reliability without requiring hardware-specific adaptations.[5]
Initial addressing security integrates with authentication protocols during login, where CHAP provides in-band verification of initiator and target identities using directional secrets, ensuring secure name resolution and connection establishment.[6] For broader protection, IKEv2 enables IPsec encapsulation, supporting IPv6 identification types like ID_IPV6_ADDR to secure addressing in dual-stack environments.[5]
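For reference, the CHAP exchange boils down to one hash computation. The Python sketch below follows the MD5 response calculation from RFC 1994 (identifier, then secret, then challenge); in iSCSI these values travel as the CHAP_I, CHAP_C, and CHAP_R login text keys. The function names and the sample secret are illustrative assumptions.

```python
import hashlib
import hmac


def chap_response(identifier, secret, challenge):
    """MD5 over the one-byte identifier, the shared secret, and the challenge."""
    if not 0 <= identifier <= 255:
        raise ValueError("the CHAP identifier is a single byte")
    return hashlib.md5(bytes([identifier]) + secret + challenge).digest()


def verify_chap(identifier, secret, challenge, response):
    """Check a peer's response using a constant-time comparison."""
    expected = chap_response(identifier, secret, challenge)
    return hmac.compare_digest(expected, response)


# Hypothetical values: a 128-bit secret and a fixed 16-byte challenge.
secret = b"0123456789abcdef"
challenge = bytes(range(16))
response = chap_response(1, secret, challenge)
```

A real authenticator would generate the challenge with a cryptographically secure random source such as os.urandom. For mutual authentication the same computation runs in the opposite direction with a second, different secret, which is what "directional secrets" refers to.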
iSCSI supports both IPv4 and IPv6 natively over TCP, with dual-stack compatibility allowing seamless transitions in addressing formats from early specifications.[6] RFC 3720 laid the foundation for IP-agnostic transport, while RFC 7143 refined IPv6 integration, mandating bracketed notation in TargetAddress and IKE identification for modern networks, evolving from IPv4-centric examples to full IPv6 interoperability without protocol changes.[5]
iSNS Protocol
The Internet Storage Name Service (iSNS) is an IETF standard defined in RFC 4171 that serves as a directory service for iSCSI and related storage devices on IP networks, enabling automated discovery and management akin to the Domain Name System (DNS) but tailored for storage resources.[36] It allows initiators to locate available targets dynamically without prior manual configuration of all device details, facilitating integration of iSCSI initiators, targets, and management nodes into a centralized database.[36] As a client-server protocol, iSNS operates over TCP (mandatory) or UDP (optional), using the default port 3205 for communications between iSNS servers and clients.[36]
Key functions of iSNS include registration, where targets register their iSCSI Qualified Names (IQNs) and portal addresses (IP and port combinations) with the iSNS server using Device Attributes Registration (DevAttrReg) messages; discovery, where initiators query the server via Device Attributes Query (DevAttrQry) or Device Get Next (DevGetNext) messages to retrieve lists of available targets and their attributes; and state change notifications (SCNs), which alert registered clients to dynamic events such as target availability changes or failover scenarios through SCN messages.[36] These notifications support real-time updates, with message types encoded in a Type-Length-Value (TLV) format for attributes like entity identifiers and portal details.[36] For example, an SCN might notify of an object addition or removal, enabling seamless session management in storage networks.[36]
iSNS offers benefits such as reduced manual configuration in large-scale environments by centralizing device information and automating target discovery, which simplifies deployment compared to static setups.[36] However, it is optional for iSCSI implementations, with alternatives including the Service Location Protocol (SLP) per RFC 2608, static configuration of target addresses, or the SendTargets method for direct queries.[37]
Security considerations in iSNS, as outlined in RFC 4171, address threats like unauthorized access and message replay: the RFC recommends (SHOULD) IPsec ESP for authentication and integrity, with optional confidentiality, and also provides timestamps in messages and support for digital signatures or X.509 certificates in multicast scenarios.[36]
As of 2025, while iSNS remains in use in various storage systems such as NetApp ONTAP and Broadcom fabrics, Microsoft has deprecated support for iSNS in Windows Server 2025, recommending the Server Message Block (SMB) feature as an alternative for similar functionality.[38]
Deployment Features
Network Booting
Network booting with iSCSI allows diskless clients to load and execute operating systems from remote storage devices over an IP network, treating the remote iSCSI logical unit number (LUN) as a local block device during the boot sequence. This process integrates with standard network boot mechanisms like the Preboot Execution Environment (PXE), where the client's firmware (either BIOS or UEFI) initiates the connection to an iSCSI target. The initiator, embedded in the firmware or loaded via PXE, establishes an iSCSI session to access the bootable LUN containing the operating system image.[39]
The boot process begins with the client broadcasting a DHCP request to obtain network configuration, including the IP address of the boot server and details for locating the iSCSI target. In PXE-enabled setups, the DHCP server responds with option 67 specifying the boot file name, which may chainload an enhanced firmware like iPXE to handle iSCSI-specific operations. Once network parameters are acquired, the client uses additional DHCP options, such as vendor-specific option 43 or the iSCSI root path in option 17 (formatted as iscsi:<servername>:<protocol>:<port>:<LUN>:<targetname>), to identify the iSCSI target. If details are incomplete, the client queries a discovery service like iSNS or SLP to resolve the target name to an IP address and port.[39][40][41]
Following discovery, the iSCSI initiator in the firmware logs into the target using the obtained credentials, establishing a session over TCP. The boot firmware then reads the LUN as a block device, loading the master boot record and mounting the root filesystem to continue the operating system boot. For UEFI systems, the process aligns with EFI boot services, while BIOS uses INT13h extensions to present the remote disk; multiple boot paths can be handled by prioritizing interfaces or targets based on firmware configuration, allowing failover if the primary path fails.
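The root path string carried in DHCP option 17 can be taken apart as in the Python sketch below. It assumes the field order given in RFC 4173 and that standard's defaults for empty fields (protocol 6 for TCP, port 3260, LUN 0); the function name and the sample server address are hypothetical.

```python
def parse_iscsi_root_path(root_path):
    """Parse 'iscsi:<servername>:<protocol>:<port>:<LUN>:<targetname>'."""
    prefix = "iscsi:"
    if not root_path.startswith(prefix):
        raise ValueError("root path must start with 'iscsi:'")
    rest = root_path[len(prefix):]
    if rest.startswith("["):
        # Bracketed IPv6 server address.
        server, _, rest = rest[1:].partition("]")
        if not rest.startswith(":"):
            raise ValueError("missing fields after the server name")
        rest = rest[1:]
    else:
        server, _, rest = rest.partition(":")
    # The target name is last because it may itself contain colons.
    fields = rest.split(":", 3)
    if len(fields) != 4 or not server or not fields[3]:
        raise ValueError("expected server, protocol, port, LUN and target name")
    protocol, port, lun, target = fields
    return {
        "server": server,
        "protocol": int(protocol) if protocol else 6,  # 6 = TCP
        "port": int(port) if port else 3260,
        "lun": lun or "0",                             # kept as text
        "target": target,
    }


boot = parse_iscsi_root_path(
    "iscsi:192.168.1.10::::iqn.2001-04.com.example:storage:diskarrays-sn-a8675309")
```

Leaving the middle fields empty, as in this example, is a common way to rely on the defaults and supply only the server and the target name.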
In advanced setups, chainloading via iPXE enables scripting for dynamic target selection or authentication, such as CHAP, before passing control to the OS loader.[39][42][40]
Key requirements include firmware support for iSCSI, such as Intel iSCSI Remote Boot integrated into the NIC option ROM or BIOS, and an iSCSI target exporting a bootable LUN formatted with a compatible partition scheme (e.g., GPT for UEFI). The network must provide reliable TCP connectivity, with the target configured to allow initiator access; no local storage is needed on the client, though fallback options may be provisioned.[42][41]
Common use cases include stateless computing environments, where multiple clients boot identical OS images from a central repository for simplified management and rapid deployment, and diskless workstations in educational or lab settings to reduce hardware costs and enable centralized updates. For instance, a single 40 GB master image can boot hundreds of clients using differencing virtual hard disks, saving over 90% on storage compared to local duplicates.[43][44]
Challenges arise in wide area network (WAN) booting due to increased latency from geographic distance and potential packet loss, which can prolong the initial session establishment and OS loading. This is mitigated by enabling jumbo frames (MTU up to 9000 bytes) end-to-end to reduce overhead and improve throughput, though all network components must support consistent MTU sizes to avoid fragmentation. Local area network (LAN) deployments with 10 GbE or higher typically avoid these issues, emphasizing dedicated iSCSI VLANs for optimal performance.[45][43]
Configuration Basics
Configuring an iSCSI initiator and target begins with assigning unique iSCSI Qualified Names (IQNs) to each node, defining network portals as IP address and TCP port combinations (typically port 3260), and setting up authentication secrets using the Challenge Handshake Authentication Protocol (CHAP). On the target side, administrators create IQNs and associate them with portal groups, which manage access to logical unit numbers (LUNs), while enabling CHAP by specifying usernames and secrets for one-way or mutual authentication. Secrets shorter than 96 bits should not be used without IPsec, and implementations must support secrets of up to 128 bits and may support longer ones.[6][46] For the initiator, the IQN is defined in a configuration file such as /etc/iscsi/initiatorname.iscsi, and CHAP credentials are entered in /etc/iscsi/iscsid.conf to match the target's settings.[47]
Practical setup on Linux systems utilizes the iscsiadm utility for discovery and login operations. Discovery employs the SendTargets method via commands like iscsiadm --mode discoverydb --type sendtargets --portal <target-ip>:3260 --discover, which queries the target for available IQNs and portals over a discovery session rather than a normal session. Subsequent login is performed with iscsiadm -m node -T <target-iqn> -p <target-ip>:3260 --login, establishing a session whose node record can be set to log in automatically at boot by changing node.startup to automatic in the open-iSCSI database.[47][48]
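When these steps are scripted, it helps to build the argument lists separately from running them. The Python sketch below only constructs iscsiadm command lines using the long forms of the options; the helper names and the sample portal and target are hypothetical, and nothing is executed. The --discover flag asks iscsiadm to perform the discovery rather than only display the stored discovery record.

```python
import shlex


def discovery_command(portal):
    """SendTargets discovery against one portal ('ip:port')."""
    return ["iscsiadm", "--mode", "discoverydb", "--type", "sendtargets",
            "--portal", portal, "--discover"]


def login_command(target_iqn, portal):
    """Log in to a previously discovered target."""
    return ["iscsiadm", "--mode", "node", "--targetname", target_iqn,
            "--portal", portal, "--login"]


def automatic_startup_command(target_iqn, portal):
    """Mark the node record so the session is restored at boot."""
    return ["iscsiadm", "--mode", "node", "--targetname", target_iqn,
            "--portal", portal, "--op", "update",
            "--name", "node.startup", "--value", "automatic"]


portal = "192.0.2.10:3260"  # documentation address, not a real target
iqn = "iqn.2001-04.com.example:storage:diskarrays-sn-a8675309"
commands = [discovery_command(portal), login_command(iqn, portal),
            automatic_startup_command(iqn, portal)]
for argv in commands:
    print(shlex.join(argv))
```

Each list could be passed to subprocess.run(argv, check=True) on a host with open-iscsi installed and sufficient privileges; keeping the arguments as a list avoids shell quoting problems with unusual target names.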
During the login phase, iSCSI negotiates session parameters using text key-value pairs to ensure compatibility and optimal performance. Key parameters include MaxConnections, which specifies the maximum number of concurrent TCP connections per session (default 1, range 1-65535, negotiated to the minimum value); HeaderDigest and DataDigest, which enable optional CRC32C checksumming for integrity (default None, per-connection negotiation); and ErrorRecoveryLevel, which defines recovery capabilities (0 for session recovery only, 1 for digest failure recovery such as retransmission within a connection, and 2 for connection recovery with task reassignment across connections).[6]
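The outcome of this negotiation can be approximated in a few lines. The Python sketch below assumes the result rules from RFC 7143: the minimum of the two values for numeric keys such as MaxConnections and ErrorRecoveryLevel, and, for list-valued keys such as the digests, the first value in the offering party's list that the responder also supports. The dictionaries and function names are illustrative and do not model the text-key wire format.

```python
def negotiate_list(offered, supported):
    """Responder picks the first offered value that it supports."""
    for value in offered:
        if value in supported:
            return value
    raise ValueError("no common value; the key would be rejected")


def negotiate_login(initiator, target):
    """Combine an initiator's offer with a target's capabilities."""
    return {
        "MaxConnections": min(initiator["MaxConnections"],
                              target["MaxConnections"]),
        "ErrorRecoveryLevel": min(initiator["ErrorRecoveryLevel"],
                                  target["ErrorRecoveryLevel"]),
        "HeaderDigest": negotiate_list(initiator["HeaderDigest"],
                                       target["HeaderDigest"]),
        "DataDigest": negotiate_list(initiator["DataDigest"],
                                     target["DataDigest"]),
    }


initiator_offer = {"MaxConnections": 4, "ErrorRecoveryLevel": 2,
                   "HeaderDigest": ["CRC32C", "None"], "DataDigest": ["None"]}
target_limits = {"MaxConnections": 1, "ErrorRecoveryLevel": 0,
                 "HeaderDigest": ["None", "CRC32C"],
                 "DataDigest": ["CRC32C", "None"]}
agreed = negotiate_login(initiator_offer, target_limits)
```

Because the initiator lists CRC32C first for HeaderDigest, header checksums end up enabled here even though the target would have preferred None, while the session still falls back to a single connection and recovery level 0.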
iSCSI supports multipathing through extensions like Multi-Path I/O (MPIO) to enhance redundancy and performance across multiple network paths. In Linux environments, the Device Mapper Multipath (DM-Multipath) subsystem aggregates paths to an iSCSI LUN into a single device, using policies such as round-robin for load balancing I/O across active paths. Persistent bindings ensure consistent LUN mapping by configuring multipath.conf with device-specific aliases, WWIDs, and path priorities, preventing device name changes on reboot and enabling failover without data disruption.
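The round-robin policy with failover can be sketched as a simple selector over path names. This Python illustration is not how DM-Multipath is implemented; the class name and the device names are hypothetical, and real path checkers, priorities, and path groups are omitted.

```python
class RoundRobinPaths:
    """Rotates I/O across active paths and skips failed ones."""

    def __init__(self, paths):
        if not paths:
            raise ValueError("at least one path is required")
        self._paths = list(paths)
        self._failed = set()
        self._next = 0

    def mark_failed(self, path):
        self._failed.add(path)

    def mark_restored(self, path):
        self._failed.discard(path)

    def select(self):
        """Return the next active path, or raise if none is left."""
        for _ in range(len(self._paths)):
            path = self._paths[self._next]
            self._next = (self._next + 1) % len(self._paths)
            if path not in self._failed:
                return path
        raise RuntimeError("all paths to the LUN have failed")


# Two hypothetical block devices that lead to the same iSCSI LUN.
paths = RoundRobinPaths(["sdb", "sdc"])
balanced = [paths.select() for _ in range(3)]
paths.mark_failed("sdc")
degraded = [paths.select() for _ in range(2)]
```

While both paths are healthy, I/O alternates between them; once sdc is marked failed, every request goes through sdb until the path is restored.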
Monitoring iSCSI sessions involves tools to track performance metrics and diagnose issues. The iscsiadm command provides session statistics with --stats, reporting cumulative counters such as bytes transferred, PDU counts, and digest and timeout errors, from which throughput and I/O rates can be derived. For troubleshooting portal failover, administrators use iscsiadm -m session to verify connection states and manually trigger failovers with --logout and --login on alternate portals, ensuring quick recovery in multipathed setups.[47][49]