Berkeley sockets
Berkeley sockets, also known as the BSD sockets API, is an application programming interface (API) for developing networked applications in Unix-like operating systems by providing a uniform mechanism to create and manage communication endpoints called sockets.[1] These sockets enable inter-process communication (IPC) over networks using protocols like TCP and UDP, or locally via Unix domain sockets, abstracting the complexities of the underlying transport layer to focus on data delivery between endpoints.[2][3]
The API originated in the 4.2BSD release of the Berkeley Software Distribution (BSD) Unix operating system in August 1983, developed by Bill Joy and colleagues at the University of California, Berkeley, as part of integrating TCP/IP networking support into Unix.[1] Prior to this, early BSD versions like 4.1BSD in 1981 had limited networking, but 4.2BSD introduced the sockets interface to provide a portable and extensible framework for Internet protocols, drawing from ARPANET research and BBN's TCP/IP implementation.[1] Over time, Berkeley sockets evolved through subsequent BSD releases, such as 4.3BSD in 1986, which refined the API for broader protocol support, and became a de facto standard before formal standardization.[3]
The API was later standardized by the IEEE, first in POSIX.1g (IEEE Std 1003.1g-2000) and then as part of the base POSIX.1-2001 specification, ensuring its portability across Unix variants, Linux, macOS, and even non-Unix systems like Windows via adaptations such as Winsock.[2][3] This standardization defines core functions like socket() to create an unbound socket in a specified domain and type, bind() to associate a local address, listen() and accept() for server-side connection handling, connect() for clients, and send()/recv() (or read()/write()) for data transmission, supporting socket types including SOCK_STREAM for reliable TCP connections and SOCK_DGRAM for unreliable UDP datagrams.[2][3] The interface's design emphasizes a client-server model, protocol independence through address families (e.g., AF_INET for IPv4), and integration with file descriptors for seamless use with standard I/O operations, making it foundational for scalable network programming.[1][4]
Overview and Fundamentals
Definition and Core Concepts
Berkeley sockets constitute the original application programming interface (API) for inter-process communication over networks in Unix systems, providing a mechanism for processes to exchange data across local and wide-area networks. Introduced in the 4.2BSD release in August 1983, this interface was developed by the Computer Systems Research Group at the University of California, Berkeley, to facilitate the integration of ARPANET protocols into the Berkeley Software Distribution.[5][6]
Fundamentally, a socket acts as an endpoint for communication, representing one side of a bidirectional path between processes and abstracting the intricacies of underlying network protocols, such as the TCP/IP stack. This design allows applications to perform network operations through a consistent set of system calls, insulating developers from the details of protocol layers, addressing, and transmission mechanics.[6]
Key principles of Berkeley sockets include support for both local inter-process communication via the Unix domain (AF_UNIX), which uses filesystem pathnames for addressing, and remote communication through the Internet domain (AF_INET), relying on IP addresses for host identification. Sockets further differentiate between stream sockets (SOCK_STREAM), which deliver reliable, sequenced, and connection-oriented byte streams, and datagram sockets (SOCK_DGRAM), which enable unreliable, connectionless transfer of variable-length messages without delivery guarantees.[6]
The Berkeley sockets API was later formalized and adopted as part of the POSIX.1 standard, ensuring portability across compliant Unix-like systems.[2]
Socket Types and Domains
Berkeley sockets operate within specific domains, also known as address families, which define the protocol suite and addressing scheme used for communication. The address family is specified during socket creation to indicate the type of network or communication domain. Common address families include AF_INET for IPv4 Internet protocols, which supports addressing using 32-bit IP addresses and 16-bit port numbers.[3] AF_INET6 extends this to IPv6 protocols, utilizing 128-bit addresses to accommodate the larger address space required for modern networks, while maintaining compatibility with IPv4 through mapped addresses.[7] Additionally, AF_UNIX provides a mechanism for interprocess communication on the same host using filesystem paths as addresses, enabling efficient local data exchange without network overhead.[3]
Socket types determine the semantics of data transmission and reception, such as whether the communication is connection-oriented or connectionless. The SOCK_STREAM type establishes a reliable, sequenced, two-way byte stream, typically implemented over TCP in Internet domains, ensuring data delivery without loss or duplication.[3] In contrast, SOCK_DGRAM supports unreliable, connectionless datagram delivery, akin to UDP, where messages are sent without establishing a connection and may be lost or arrive out of order.[8] SOCK_RAW allows direct access to lower-level protocols, bypassing standard transport layers for custom packet construction and inspection, though it requires elevated privileges.[3]
Protocol families, denoted by constants like PF_INET, generally align with the corresponding address families (e.g., PF_INET for AF_INET) and specify the protocol suite to be used. The protocol parameter further refines this by selecting a specific protocol within the family, such as IPPROTO_TCP for stream-oriented transmission or IPPROTO_UDP for datagram services; a value of 0 often defaults to the standard protocol for the given type and family.[3] These protocols ensure that the socket interfaces with the appropriate network stack layer.
For IPv4 communications in the AF_INET domain, addresses are represented by the sockaddr_in structure, which encapsulates the necessary fields for binding or connecting. This structure includes sin_family, set to AF_INET to denote the address family; sin_port, a 16-bit port number in network byte order; and sin_addr, containing the 32-bit IPv4 address in network byte order via its s_addr member.[9] Proper initialization of these fields is essential for accurate address resolution and communication setup.
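As an illustration of the field layout described above, the following minimal sketch fills a sockaddr_in from a dotted-quad string and a host-order port. The helper name make_ipv4_addr is hypothetical and not part of the sockets API; inet_pton() and htons() are the standard conversion routines.

```c
#include <arpa/inet.h>   /* inet_pton, htons */
#include <netinet/in.h>  /* struct sockaddr_in */
#include <stdint.h>      /* uint16_t */
#include <string.h>      /* memset */
#include <sys/socket.h>  /* AF_INET */

/* Hypothetical helper: fill *out with an IPv4 socket address built from the
 * dotted-quad string `ip` and the host-order `port`.
 * Returns 0 on success, -1 if `ip` is not a valid IPv4 address. */
int make_ipv4_addr(const char *ip, uint16_t port, struct sockaddr_in *out)
{
    memset(out, 0, sizeof(*out));   /* also clears the sin_zero padding */
    out->sin_family = AF_INET;
    out->sin_port = htons(port);    /* host -> network byte order */

    /* inet_pton stores the address in network byte order and
     * returns 1 only when the string parsed successfully. */
    return inet_pton(AF_INET, ip, &out->sin_addr) == 1 ? 0 : -1;
}
```

A caller would typically pass the result to bind() or connect() after casting its address to struct sockaddr *.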
Historical Development
Origins in BSD
Berkeley sockets were introduced as part of the networking facilities in the 4.2BSD release of the Berkeley Software Distribution (BSD) Unix operating system in August 1983.[10][11] This implementation was led by a team at the University of California, Berkeley, including key contributors William N. Joy, Samuel J. Leffler, and Robert S. Fabry, who developed the system under sponsorship from the Defense Advanced Research Projects Agency (DARPA).[10][11] The work built on an initial TCP/IP prototype provided by Rob Gurwitz in fall 1981, which Joy integrated and refined starting with the 4.1a release in April 1982, culminating in the robust networking support of 4.2BSD.[11]
The primary motivations for developing Berkeley sockets stemmed from the need to provide a portable and uniform interface for network programming within Unix, amid the rapid growth of internetworking technologies in the early 1980s.[10] This effort was driven by DARPA's requirements to enable participation in the ARPANET for distributed systems research among contractors, replacing the outdated Network Control Protocol (NCP) with the more scalable TCP/IP protocol suite.[10][11] By standardizing access to TCP/IP, the sockets interface addressed the limitations of prior ad-hoc networking approaches, facilitating easier integration of multiple protocol families and hardware interfaces while supporting the emerging demands of multi-gigabyte processes and remote resource sharing.[10][11]
At its inception, Berkeley sockets provided core support for both connection-oriented (TCP) and connectionless (UDP) protocols, allowing applications to communicate over IP networks.[10] A key feature was the tight integration with Unix file descriptors, treating sockets as file-like objects for input/output operations, which enabled seamless multiplexing of network I/O with other system resources.[10] This design allowed the select() system call, itself introduced in 4.2BSD, to monitor multiple sockets at once, promoting efficient event-driven programming without blocking on individual connections.[10] Overall, these elements established sockets as a foundational abstraction for Unix networking, influencing subsequent standardizations like POSIX.[11]
Standardization and POSIX Adoption
The Berkeley sockets API, originating from the BSD implementations, achieved formal standardization through the POSIX process to promote portability across Unix-like operating systems. The core functions of the API, including socket(), bind(), and related primitives, were initially specified in IEEE Std 1003.1g-2000, a dedicated standard for networking services that built upon earlier drafts dating back to the 1990s.[12] This ratification marked the first comprehensive POSIX definition of the sockets interface, ensuring normative requirements for compliant systems.[2]
Subsequent evolutions integrated these networking features into the main POSIX.1 standard. In IEEE Std 1003.1-2001, the content of POSIX.1g was merged with the base system services, expanding the API to include advanced networking capabilities while maintaining backward compatibility with BSD-derived implementations.[13] This revision also introduced extensions for IPv6 support, such as the AF_INET6 address family and associated structures like sockaddr_in6, enabling dual-stack IPv4/IPv6 operations in POSIX-conformant environments.[14] Later updates, including POSIX.1-2008, refined these specifications with technical corrigenda for clarity and consistency.[2]
Adoption extended the sockets API beyond BSD lineages to diverse systems. AT&T incorporated Berkeley sockets into System V Release 4 (SVR4) in 1989, unifying networking features across commercial Unix variants through a dedicated kernel implementation. In Linux, the GNU C Library (glibc) delivers fully POSIX-compliant sockets, leveraging the kernel's native support for domains like PF_INET and PF_UNIX since early distributions.[3] Microsoft Windows provides a partial analog via the Windows Sockets API (Winsock2), introduced in 1996 and based on BSD sockets, though it requires explicit initialization with WSAStartup() and reports errors via WSAGetLastError() rather than errno.[15]
While the POSIX standardization preserved the essential BSD API for broad compatibility, minor differences persist in implementations, such as variations in error codes (e.g., EADDRINUSE in POSIX versus WSAEADDRINUSE in Winsock) and optional extensions like non-blocking I/O behaviors, ensuring core portability without mandating every BSD-specific detail.[2][15]
API Components
Header Files and Data Structures
The Berkeley sockets API relies on several standard header files to provide the necessary definitions, constants, and data structures for socket programming. The primary header, <sys/socket.h>, contains core definitions for socket operations, including address families, socket types, and the generic address structure. This header is required for most socket-related functions and types, such as the creation of sockets and manipulation of options. For IPv4-specific functionality, <netinet/in.h> supplies protocol-dependent structures and types, including those for Internet addresses and ports. Additionally, <arpa/inet.h> offers utility functions and types for address conversion, complementing the core socket definitions without directly defining socket creation primitives.[16][17]
Key data structures in the API include the opaque struct socket, which represents a socket endpoint but is not directly accessible to user applications; instead, it is referenced via a file descriptor returned by the socket() function.[3] The generic address structure, struct sockaddr, serves as a common format for passing socket addresses to API functions, defined in <sys/socket.h> as follows:
struct sockaddr {
    sa_family_t sa_family;
    char sa_data[14];
};
Here, sa_family_t is an unsigned integer type specifying the address family (e.g., AF_INET for IPv4), and sa_data holds protocol-specific address information.[16] For IPv4, the specialized struct sockaddr_in in <netinet/in.h> extends this with explicit fields for family, port, and address:
struct sockaddr_in {
    sa_family_t sin_family;
    in_port_t sin_port;
    struct in_addr sin_addr;
    unsigned char sin_zero[8];
};
The sin_family field must be set to AF_INET, sin_port uses in_port_t (a 16-bit unsigned integer in network byte order) for the port number, and sin_addr is a struct in_addr containing the IPv4 address as in_addr_t (a 32-bit unsigned integer, also in network byte order).[16][9] These structures ensure compatibility across functions by casting pointers to struct sockaddr * as needed.[18]
Related types such as sa_family_t, in_port_t, and in_addr_t are typedefs defined in the respective headers to promote portability; sa_family_t supports various address domains like AF_INET or AF_UNIX, while in_port_t and in_addr_t are tailored for Internet protocols.[16]
Socket Creation and Configuration Functions
The Berkeley sockets API provides functions for creating sockets and configuring their behavior prior to use in network communication. Sockets are treated as file descriptors in Unix-like systems, allowing them to be integrated with standard I/O operations such as read() and write(), or socket-specific functions like send() and recv().[2]
The primary function for socket creation is socket(), which initializes an unbound endpoint for communication. Its syntax is:
#include <sys/socket.h>
int socket(int domain, int type, int protocol);
The domain parameter specifies the address family, such as AF_INET for IPv4 or AF_INET6 for IPv6, as defined in <sys/socket.h>. The type indicates the semantics of communication, for example SOCK_STREAM for reliable, connection-oriented TCP sockets or SOCK_DGRAM for unreliable, connectionless UDP. The protocol parameter selects the specific protocol within the domain; it is typically set to 0 to use the default for the given domain and type (e.g., TCP for SOCK_STREAM in AF_INET), while a nonzero value such as IPPROTO_TCP names a protocol explicitly. On success, socket() returns a non-negative integer representing the new file descriptor; on failure, it returns -1 and sets errno. Common combinations include socket(AF_INET, SOCK_STREAM, 0) for a TCP socket or socket(AF_INET, SOCK_DGRAM, 0) for UDP.[2]
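The call and its error convention can be sketched as follows. The wrapper name open_socket is hypothetical; it simply requests the default protocol and reports a failure through perror().

```c
#include <stdio.h>       /* perror */
#include <sys/socket.h>  /* socket, AF_*, SOCK_* */

/* Hypothetical wrapper: create a socket using the default protocol for the
 * given domain and type. Returns the new descriptor, or -1 after printing
 * the errno-based reason to stderr. */
int open_socket(int domain, int type)
{
    int fd = socket(domain, type, 0);   /* 0 selects the default protocol */
    if (fd == -1)
        perror("socket");
    return fd;
}
```

For example, open_socket(AF_INET, SOCK_STREAM) would yield a TCP socket and open_socket(AF_INET, SOCK_DGRAM) a UDP socket; the caller remains responsible for closing the descriptor.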
Socket options can be configured using setsockopt() and queried with getsockopt() to control behavior at various protocol levels. The syntax for setsockopt() is:
#include <sys/socket.h>
int setsockopt(int socket, int level, int option_name,
               const void *restrict option_value,
               socklen_t option_len);
Here, socket is the file descriptor from socket(), level specifies the protocol layer (e.g., SOL_SOCKET for socket-level options or IPPROTO_TCP for TCP-specific), option_name identifies the option, option_value points to the value to set, and option_len gives its size. On success, it returns 0; on failure, -1 with errno set. Examples include setting SO_REUSEADDR to 1 (an integer value) to allow reuse of local addresses for quick server restarts, or SO_KEEPALIVE to 1 to enable periodic TCP probes for detecting dead connections. The corresponding getsockopt() has a similar syntax but retrieves values:
int getsockopt(int socket, int level, int option_name,
               void *restrict option_value,
               socklen_t *restrict option_len);
It updates option_value with the current setting and adjusts option_len to the actual size, returning 0 on success or -1 on failure. These functions enable fine-tuned control, such as enabling address reuse via setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &optval, sizeof(optval)) where fd is the socket descriptor and optval is 1.[19][20]
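A minimal sketch of setting and reading back an option is shown below. The helper names enable_reuseaddr and reuseaddr_enabled are hypothetical; the option constants and calls are those described above.

```c
#include <sys/socket.h>  /* setsockopt, getsockopt, SOL_SOCKET, SO_REUSEADDR */

/* Hypothetical helper: turn on SO_REUSEADDR. Returns 0 on success, -1 on error. */
int enable_reuseaddr(int fd)
{
    int optval = 1;   /* boolean options are passed as an int */
    return setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &optval, sizeof(optval));
}

/* Hypothetical helper: returns 1 if SO_REUSEADDR is on, 0 if off, -1 on error. */
int reuseaddr_enabled(int fd)
{
    int optval = 0;
    socklen_t len = sizeof(optval);   /* in: buffer size, out: bytes written */
    if (getsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &optval, &len) == -1)
        return -1;
    return optval != 0;
}
```

Servers usually call such a helper before bind(), since the option affects how the address is claimed.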
Error handling is essential, as these functions set errno on failure. For socket(), common errors include EAFNOSUPPORT if the address family is unsupported, EMFILE if the process file descriptor limit is reached, ENFILE for system-wide limits, and EPROTONOSUPPORT for invalid protocols. For setsockopt() and getsockopt(), typical issues are EBADF for an invalid descriptor, ENOPROTOOPT for unsupported options, and EINVAL for invalid arguments or a shut-down socket. Applications should check return values and consult errno (via <errno.h>) to diagnose issues, often using perror() or strerror() for reporting.[2][19]
Connection-Oriented Operations
Binding and Listening
In server applications employing connection-oriented protocols such as TCP, after creating a socket with the socket() function, the binding process associates the socket with a specific local network address and port number to enable communication.[21]
The bind() function accomplishes this association, with the prototype:
int bind(int sockfd, const struct sockaddr *addr, socklen_t addrlen);
It assigns the local socket address specified by the addr pointer—whose length is given by addrlen—to the socket referenced by the file descriptor sockfd, provided the socket has not previously been bound.[21] The addr argument is typically a structure like struct sockaddr_in for IPv4, containing the IP address in its sin_addr field and the port number in sin_port, both in network byte order.[21]
To allow the socket to accept connections on any available network interface, the IP address field can be set to INADDR_ANY, a constant defined as (in_addr_t)0x00000000 that acts as the IPv4 wildcard address.[22][9] Port numbers span 0 to 65535, but ports 1 through 1023 are reserved as well-known (or system) ports, which require elevated privileges (e.g., root access on Unix-like systems) to bind due to their assignment for standard services by the Internet Assigned Numbers Authority (IANA); specifying port 0 requests the kernel to assign an available ephemeral port.[23] Servers commonly bind to these well-known ports, while higher-numbered ports (1024–65535) include registered ports for specific applications and ephemeral (dynamic) ports allocated temporarily for client-side use.[23]
If the requested address and port combination is already bound by another socket, bind() fails with a return value of -1 and sets errno to EADDRINUSE.[21][24]
Following a successful bind, the listen() function configures the socket for incoming connections, with the prototype:
int listen(int sockfd, int backlog);
This marks the connection-mode socket identified by sockfd as passive, indicating it will accept connection requests, and the backlog parameter serves as a hint to limit the size of the queue for pending connections.[25] Implementations use this hint to manage outstanding connections, often capping the queue length; for example, in Linux, values exceeding the limit in /proc/sys/net/core/somaxconn (128 by default before Linux 5.4, 4096 since) are silently capped.[25][26]
The queuing semantics typically involve separate handling for incomplete (SYN) and complete (established) connections, though the precise division and enforcement are implementation-defined, ensuring the system avoids resource exhaustion from rapid connection attempts.[25][26]
These steps—binding and listening—are performed exclusively on the server side for TCP stream sockets to prepare for client-initiated connections.[25]
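The server-side sequence above can be sketched as follows. The helper name listen_loopback is hypothetical, and the sketch binds to the loopback address rather than INADDR_ANY so that it stays local to the host; passing port 0 lets the kernel choose an ephemeral port, which getsockname() then reveals.

```c
#include <arpa/inet.h>   /* htons, htonl, ntohs */
#include <netinet/in.h>  /* struct sockaddr_in, INADDR_LOOPBACK */
#include <stdint.h>      /* uint16_t */
#include <string.h>      /* memset */
#include <sys/socket.h>  /* socket, bind, listen, getsockname */
#include <unistd.h>      /* close */

/* Hypothetical helper: create a TCP socket bound to 127.0.0.1:`port`
 * (0 = let the kernel pick) and mark it as listening.
 * On success returns the descriptor and stores the actual port in
 * *bound_port; on failure returns -1. */
int listen_loopback(uint16_t port, int backlog, uint16_t *bound_port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == -1 ||
        listen(fd, backlog) == -1 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) == -1) {
        close(fd);   /* e.g. errno was EADDRINUSE from bind() */
        return -1;
    }
    *bound_port = ntohs(addr.sin_port);
    return fd;
}
```

A second call with the port number returned by the first is expected to fail while the first socket is still open, which is the EADDRINUSE case described earlier.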
Accepting and Connecting
In the client-server model of Berkeley sockets, the accept() function enables a server to establish a connection with an incoming client request. This function extracts the first pending connection from the queue associated with a listening socket and creates a new socket descriptor specifically for that client connection. The signature of the function is int accept(int sockfd, struct sockaddr *addr, socklen_t *addrlen);, where sockfd is the listening socket file descriptor, addr is a pointer to a buffer that receives the client's address structure upon successful connection, and addrlen points to the length of that buffer, which is updated to reflect the actual size of the address returned.[27] Upon success, accept() returns a non-negative integer representing the new connected socket descriptor, which the server can use for further communication with that specific client, while the original listening socket remains open to handle additional incoming connections.[27]
The accept() function is typically used with connection-oriented socket types such as SOCK_STREAM (TCP), where it dequeues the earliest connection request established via the three-way handshake. If no connections are pending and the socket is in blocking mode (the default), accept() will block until a connection arrives; however, if the socket is set to non-blocking mode using fcntl() with O_NONBLOCK, it returns immediately with -1 and sets errno to EAGAIN or EWOULDBLOCK if the queue is empty.[27] This non-blocking behavior allows servers to integrate accept() into event-driven loops, such as those using select() or poll(), to efficiently manage multiple potential connections without dedicated threads per client. The extracted client address in addr provides the server with details like the client's IP and port, facilitating logging, access control, or routing decisions, though the server must allocate and initialize appropriate storage for struct sockaddr based on the address family (e.g., sockaddr_in for IPv4).[27] Errors such as EBADF (invalid descriptor), EINVAL (socket not listening), or ENOBUFS (insufficient buffer space) may occur, requiring robust error handling to maintain server reliability.[27]
On the client side, the connect() function initiates the connection process by associating the socket with a remote endpoint. Its signature is int connect(int sockfd, const struct sockaddr *addr, socklen_t addrlen);, where sockfd is the client's socket descriptor, addr specifies the server's address structure (including family, IP, and port), and addrlen is the size of that structure.[28] For TCP sockets, connect() triggers the connection establishment, blocking until the handshake completes successfully (returning 0) or fails (returning -1 with errno set); common failures include ECONNREFUSED if no server is listening on the target port, ETIMEDOUT after exceeding the system's connection timeout (typically around 75 seconds for initial SYN, varying by implementation), or EHOSTUNREACH if the remote host is inaccessible.[28] If the socket is non-blocking, connect() returns -1 with EINPROGRESS immediately if the connection cannot complete synchronously, allowing the client to monitor progress using select() on the socket for writability or check the socket error via getsockopt() with SO_ERROR.[28]
Address handling in connect() requires the client to explicitly provide the full server details, often obtained via name resolution functions like getaddrinfo(), ensuring compatibility across address families such as AF_INET or AF_INET6.[28] Asynchronous non-blocking usage of connect() is particularly valuable in high-performance applications, where it avoids blocking the main thread during potentially long connection attempts.[28]
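To show how connect() and accept() pair up, the following sketch performs both ends of a TCP handshake inside one process over the loopback interface. The helper name loopback_connect is hypothetical, and running both sides in one thread works only because the kernel completes the handshake for a listening socket before accept() is called.

```c
#include <arpa/inet.h>   /* htons, htonl, ntohs */
#include <netinet/in.h>  /* struct sockaddr_in, INADDR_LOOPBACK */
#include <stdint.h>      /* uint16_t */
#include <string.h>      /* memset */
#include <sys/socket.h>  /* socket, bind, listen, connect, accept */
#include <unistd.h>      /* close */

/* Hypothetical helper: open a listener on an ephemeral loopback port,
 * connect a client to it, and accept the connection.
 * On success stores both connected descriptors and the client's port as
 * reported by accept(), and returns 0; returns -1 on any failure. */
int loopback_connect(int *client_fd, int *server_fd, uint16_t *client_port)
{
    struct sockaddr_in addr, peer;
    socklen_t len = sizeof(addr), peer_len = sizeof(peer);
    int lfd = socket(AF_INET, SOCK_STREAM, 0);   /* listening socket */
    int cfd = socket(AF_INET, SOCK_STREAM, 0);   /* client socket */
    int sfd = -1;                                /* accepted socket */

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(0);                    /* kernel picks the port */
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    if (lfd == -1 || cfd == -1 ||
        bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) == -1 ||
        listen(lfd, 1) == -1 ||
        getsockname(lfd, (struct sockaddr *)&addr, &len) == -1 ||
        connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) == -1 ||
        (sfd = accept(lfd, (struct sockaddr *)&peer, &peer_len)) == -1) {
        if (lfd != -1) close(lfd);
        if (cfd != -1) close(cfd);
        return -1;
    }
    close(lfd);   /* the accepted connection outlives the listener */
    *client_fd = cfd;
    *server_fd = sfd;
    *client_port = ntohs(peer.sin_port);
    return 0;
}
```

The port stored in *client_port is the ephemeral port the kernel assigned to the client, which matches what getsockname() reports on the client descriptor.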
Connectionless and Utility Operations
Sending and Receiving Data
In Berkeley sockets, data transfer occurs through functions that handle both connection-oriented (e.g., TCP) and connectionless (e.g., UDP) protocols, enabling reliable or unreliable message exchange over network sockets.[29] The core functions for sending and receiving data on connected sockets are send() and recv(), which operate on established connections from prior connect() or accept() calls. These functions support flags to modify behavior, such as MSG_OOB for out-of-band data transmission, which allows protocols like TCP to send expedited data ahead of the normal stream.[29][30]
The send() function transmits data from a buffer to the peer on a connected socket, with the prototype:
#include <sys/socket.h>
ssize_t send(int socket, const void *buffer, size_t length, int flags);
It returns the number of bytes sent upon success, or -1 on error with errno set; for stream protocols, it may send fewer bytes than requested if buffer space is limited, requiring applications to loop until the full length is transmitted.[29] Similarly, recv() receives data into a buffer from a connected socket:
#include <sys/socket.h>
ssize_t recv(int socket, void *buffer, size_t length, int flags);
This function returns the bytes received (possibly partial for streams), 0 if the peer has performed an orderly shutdown, or -1 on error; errors include EAGAIN or EWOULDBLOCK in non-blocking mode when no data is available.[30] The length parameter specifies the maximum number of bytes to send or receive. When a request exceeds the space or data currently available in the socket's send or receive buffer (whose sizes can be adjusted with the SO_SNDBUF and SO_RCVBUF options of setsockopt()), a blocking socket may wait or transfer only part of the request, while a socket made non-blocking with fcntl() returns immediately with a partial count or an error.[29][30]
For connectionless protocols like UDP, where sockets are not explicitly connected, sendto() and recvfrom() are used to include destination and source addresses, respectively. The sendto() function sends a datagram to a specified address:
#include <sys/socket.h>
ssize_t sendto(int socket, const void *message, size_t length, int flags,
               const struct sockaddr *dest_addr, socklen_t dest_len);
It requires the dest_addr parameter for unconnected sockets, failing with EDESTADDRREQ if omitted, and returns the bytes sent or -1 on error, such as EMSGSIZE for oversized datagrams.[31] Conversely, recvfrom() receives a datagram and populates the source address:
#include <sys/socket.h>
ssize_t recvfrom(int socket, void *buffer, size_t length, int flags,
                 struct sockaddr *address, socklen_t *address_len);
This returns the bytes received or -1 on error, storing the sender's address in address if provided; for datagram sockets, it delivers complete messages or discards excess data if the buffer is too small, unlike streams.[32]
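The datagram calls can be sketched with two UDP sockets on the loopback interface. The helper name udp_loopback_echo is hypothetical, and the sketch assumes loopback delivery does not drop the datagram, which UDP itself does not guarantee.

```c
#include <arpa/inet.h>   /* htonl */
#include <netinet/in.h>  /* struct sockaddr_in, INADDR_LOOPBACK */
#include <string.h>      /* memset */
#include <sys/socket.h>  /* socket, bind, sendto, recvfrom */
#include <sys/types.h>   /* ssize_t */
#include <unistd.h>      /* close */

/* Hypothetical helper: send one datagram from an unbound UDP socket to a
 * bound one on 127.0.0.1 and receive it into buf.
 * Returns the number of bytes stored in buf, or -1 on error. */
ssize_t udp_loopback_echo(const void *msg, size_t len, void *buf, size_t buflen)
{
    struct sockaddr_in addr, from;
    socklen_t addr_len = sizeof(addr), from_len = sizeof(from);
    ssize_t n = -1;
    int rx = socket(AF_INET, SOCK_DGRAM, 0);   /* receiver */
    int tx = socket(AF_INET, SOCK_DGRAM, 0);   /* sender, never bound */

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;                 /* port 0: kernel picks */
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    if (rx != -1 && tx != -1 &&
        bind(rx, (struct sockaddr *)&addr, sizeof(addr)) == 0 &&
        getsockname(rx, (struct sockaddr *)&addr, &addr_len) == 0 &&
        sendto(tx, msg, len, 0,
               (struct sockaddr *)&addr, sizeof(addr)) == (ssize_t)len) {
        /* `from` receives the sender's address and ephemeral port. */
        n = recvfrom(rx, buf, buflen, 0, (struct sockaddr *)&from, &from_len);
    }
    if (rx != -1) close(rx);
    if (tx != -1) close(tx);
    return n;
}
```

When buflen is smaller than the datagram, the return value reflects only the bytes that fit, illustrating the discard behavior described above.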
In TCP, a stream-oriented protocol, data transfer treats the connection as a continuous byte stream without message boundaries, so send() and recv() may result in partial transfers—applications must track and retry to ensure complete data movement, often using loops that accumulate bytes until the desired length is met or an error occurs.[29][30] To manage flow control and partial operations in non-blocking mode, applications handle EWOULDBLOCK by retrying later, typically with select() or poll() to wait for writability or readability.[29] Additionally, TCP supports half-close via shutdown(), allowing one direction of the connection to be terminated while the other remains open—for instance, shutdown(socket, SHUT_WR) disables further sends but permits receives, enabling graceful protocol shutdowns where one peer signals completion of transmission.[33] This feature ensures orderly data exchange without abrupt connection termination, though both directions must eventually close for full cleanup.[33]
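The retry loops and half-close described above can be sketched as follows. The helper names send_all and recv_until_eof are hypothetical; both work on any blocking stream socket and additionally retry when a call is interrupted by a signal (EINTR), which is an assumption beyond what the text covers.

```c
#include <errno.h>       /* errno, EINTR */
#include <sys/socket.h>  /* send, recv */
#include <sys/types.h>   /* ssize_t */

/* Hypothetical helper: send exactly `len` bytes, looping over partial
 * sends. Returns 0 on success, -1 on error (errno set by send()). */
int send_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = send(fd, p, len, 0);
        if (n == -1) {
            if (errno == EINTR)
                continue;        /* interrupted before any data moved */
            return -1;
        }
        p += n;                  /* skip what was accepted */
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: receive until `cap` bytes arrive or the peer shuts
 * down its sending side. Returns the byte count, or -1 on error. */
ssize_t recv_until_eof(int fd, void *buf, size_t cap)
{
    char *p = buf;
    size_t got = 0;
    while (got < cap) {
        ssize_t n = recv(fd, p + got, cap - got, 0);
        if (n == 0)
            break;               /* orderly shutdown by the peer */
        if (n == -1) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        got += (size_t)n;
    }
    return (ssize_t)got;
}
```

A sender would call send_all() followed by shutdown(fd, SHUT_WR) to signal completion, after which it can still receive a reply on the same descriptor.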
Name Resolution Functions
Name resolution functions in the Berkeley sockets API enable applications to translate human-readable hostnames into network addresses and vice versa, facilitating dynamic addressing without hardcoding IP addresses. These functions are essential for establishing connections in both IPv4 and IPv6 environments, bridging the gap between symbolic names and binary address representations used by socket operations. Early implementations focused on IPv4, but subsequent standards introduced protocol-independent alternatives to support modern networks.[34]
The traditional functions gethostbyname() and gethostbyaddr() provide IPv4-specific name resolution. The gethostbyname() function takes a hostname as input and returns a pointer to a struct hostent, which contains the canonical hostname (h_name), an array of aliases (h_aliases), the address type (h_addrtype, typically AF_INET), address length (h_length), and a list of binary addresses (h_addr_list). Its prototype is struct hostent *gethostbyname(const char *name);. Similarly, gethostbyaddr() reverses this process, accepting a binary address, length, and type to return the corresponding struct hostent via struct hostent *gethostbyaddr(const void *addr, socklen_t len, int type);. These functions, originating from early BSD implementations, are limited to IPv4 and lack support for multiple address families.[34]
Due to these limitations, gethostbyname() and gethostbyaddr() are deprecated in favor of more versatile alternatives introduced in RFC 3493, which obsoletes earlier specifications like RFC 2553. The modern getaddrinfo() function performs protocol-independent resolution, accepting a nodename (hostname or address string), servicename (port or service name), hints (via struct addrinfo to specify preferences like address family, socket type, and protocol), and returns a linked list of struct addrinfo entries through a double pointer. The struct addrinfo includes fields such as ai_flags (resolution hints), ai_family (e.g., AF_INET or AF_INET6), ai_socktype (e.g., SOCK_STREAM), ai_protocol, ai_addrlen, ai_canonname (canonical name), ai_addr (a sockaddr structure populated with the resolved address), and ai_next (for chaining multiple results). Its prototype is int getaddrinfo(const char *nodename, const char *servname, const struct addrinfo *hints, struct addrinfo **res);, returning 0 on success or an error code (e.g., EAI_NONAME) that can be converted to a string via gai_strerror(). This function supports IPv6 by including AF_INET6 in the ai_family field and handling IPv4-mapped IPv6 addresses when the AI_V4MAPPED flag is set in hints. Applications must free the returned list using freeaddrinfo(res) to avoid memory leaks.[34]
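A minimal sketch of a getaddrinfo() call is given below. The helper name resolve_first is hypothetical; it copies only the first result into caller-provided storage and frees the list, whereas real clients often iterate over ai_next and try each address in turn.

```c
#include <netdb.h>       /* getaddrinfo, freeaddrinfo, struct addrinfo */
#include <string.h>      /* memset, memcpy */
#include <sys/socket.h>  /* struct sockaddr_storage, AF_UNSPEC */

/* Hypothetical helper: resolve host/service for a stream socket and copy
 * the first result into *out. `flags` is passed through as ai_flags
 * (for example AI_NUMERICHOST to forbid DNS lookups).
 * Returns 0 on success or the getaddrinfo() error code. */
int resolve_first(const char *host, const char *service, int flags,
                  struct sockaddr_storage *out, socklen_t *out_len)
{
    struct addrinfo hints, *res = NULL;
    int rc;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;       /* accept IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = flags;

    rc = getaddrinfo(host, service, &hints, &res);
    if (rc != 0)
        return rc;                     /* printable via gai_strerror(rc) */

    /* sockaddr_storage is large enough for any supported address type. */
    memcpy(out, res->ai_addr, res->ai_addrlen);
    *out_len = res->ai_addrlen;
    freeaddrinfo(res);                 /* release the whole result list */
    return 0;
}
```

Passing 0 for flags permits ordinary name lookups; the resulting address and length can be handed directly to connect().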
Complementing getaddrinfo(), the getnameinfo() function provides the inverse operation, converting a socket address into a hostname and service name in a protocol-independent manner. It takes a sockaddr pointer, its length, buffers for node (hostname) and service (port/service) strings, their sizes, and flags (e.g., NI_NUMERICHOST to force numeric output), with prototype int getnameinfo(const struct sockaddr *sa, socklen_t salen, char *host, socklen_t hostlen, char *serv, socklen_t servlen, int flags);. Like getaddrinfo(), it returns 0 on success or an error code, and supports IPv6 addresses directly. Both functions are designed to be thread-safe, unlike their predecessors.[34]
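The inverse conversion can be sketched as below. The helper name format_endpoint and its "host:port" output format are hypothetical choices for illustration, and the fixed buffer sizes are assumed to be large enough for numeric addresses.

```c
#include <netdb.h>       /* getnameinfo, NI_NUMERICHOST, NI_NUMERICSERV */
#include <stdio.h>       /* snprintf */
#include <string.h>      /* strchr */
#include <sys/socket.h>  /* struct sockaddr, socklen_t */

/* Hypothetical helper: write a numeric "host:port" string for `sa` into
 * `out`, bracketing IPv6 literals as "[host]:port".
 * Returns 0 on success or the getnameinfo() error code. */
int format_endpoint(const struct sockaddr *sa, socklen_t salen,
                    char *out, size_t outlen)
{
    char host[64];   /* assumed sufficient for a numeric IPv6 address */
    char serv[16];   /* assumed sufficient for a numeric port */
    int rc = getnameinfo(sa, salen, host, sizeof(host), serv, sizeof(serv),
                         NI_NUMERICHOST | NI_NUMERICSERV);   /* no lookups */
    if (rc != 0)
        return rc;

    if (strchr(host, ':') != NULL)     /* a colon implies an IPv6 literal */
        snprintf(out, outlen, "[%s]:%s", host, serv);
    else
        snprintf(out, outlen, "%s:%s", host, serv);
    return 0;
}
```

Dropping the NI_NUMERICHOST flag would let getnameinfo() attempt a reverse lookup instead, at the cost of a possible delay.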
While effective, these functions have notable limitations, particularly in legacy contexts. The older gethostbyname() and gethostbyaddr() are not thread-safe, as they rely on static internal buffers that can lead to race conditions in multithreaded applications, and they do not support IPv6 natively. In contrast, getaddrinfo() and getnameinfo() address these issues but may still exhibit implementation-specific behaviors regarding scope IDs in IPv6 link-local addresses. For new development, RFC 3493 explicitly recommends using getaddrinfo() and getnameinfo() over the deprecated functions to ensure portability and future-proofing across IPv4 and IPv6 networks.[34]
Advanced Features
Raw Sockets
Raw sockets in the Berkeley sockets API provide applications with direct access to the transport layer and below, enabling the construction and dissection of network packets at a low level without the kernel's standard protocol processing. They are created using the socket() function with the SOCK_RAW type and a specific protocol number, such as socket(AF_INET, SOCK_RAW, IPPROTO_ICMP) for Internet Control Message Protocol (ICMP) packets or IPPROTO_IGMP for Internet Group Management Protocol.[35][36] This creation requires elevated privileges, typically the effective user ID of 0 (root) or the CAP_NET_RAW Linux capability, due to the potential for crafting arbitrary packets that could disrupt network operations.[35][37]
Once created, raw sockets are used with functions like sendto() and recvfrom() to transmit and receive datagrams. The application supplies the packet payload, including transport-layer headers, and optionally the IP header when the IP_HDRINCL socket option is enabled.[35] Common applications include network diagnostic tools such as ping, which sends ICMP echo requests, and traceroute, which uses ICMP or UDP packets to map routes; these tools leverage raw sockets to inject custom packets into the network stack.[38][37] Without IP_HDRINCL, the kernel constructs the IPv4 header itself. With IP_HDRINCL enabled on Linux, the application supplies the header, but the kernel still fills in the checksum and total length, and fills in the source address and packet ID when those fields are zero in the user-provided buffer.[35]
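Because the application builds the transport-layer header itself, a tool such as ping has to compute the ICMP checksum before sending an echo request over an IPv4 raw socket. The following sketch shows the standard Internet checksum (RFC 1071) that ICMP uses; the function name inet_checksum is an assumption for illustration, and the accumulator is sized for ordinary packet lengths only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Internet checksum (RFC 1071): one's-complement sum of the data taken as
 * big-endian 16-bit words, then complemented. The result is returned in
 * host byte order; store it in the header high byte first. */
uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    assert(data != NULL || len == 0);

    /* Add up complete 16-bit words. */
    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];

    /* An odd trailing byte is padded with a zero low byte. */
    if (len & 1)
        sum += (uint32_t)data[len - 1] << 8;

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);

    return (uint16_t)~sum;
}
```

The checksum field must be zero while the sum is computed. A receiver can verify a packet by running the same function over the whole message, checksum included, and expecting a result of 0.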
Raw sockets impose several limitations compared to higher-level socket types. Applications must handle all protocol details manually, with no automatic error checking, fragmentation, or reassembly provided by the kernel, and received packets exclude link-level headers.[35][36] For IPv4, using protocol IPPROTO_RAW allows sending custom headers but prohibits receiving data, and the kernel may still process or forward packets to other modules, potentially leading to inconsistencies.[35] The IPv4 identification (ID) field, used for fragmentation, is typically assigned by the kernel unless overridden, which can cause issues in high-volume scenarios where unique IDs are needed for reassembly tracking.[35] Implementations are not fully portable across BSD variants, as behaviors like header inclusion vary.[35]
Security restrictions on raw sockets are stringent to mitigate risks like denial-of-service attacks from malformed packets. On modern Linux systems, unprivileged ICMP echo is available through a separate mechanism: ICMP datagram sockets, created with socket(AF_INET, SOCK_DGRAM, IPPROTO_ICMP) and governed by the /proc/sys/net/ipv4/ping_group_range parameter (since kernel version 2.6.39). Its default value ("1 0") permits no group to create such sockets, though it can be configured to admit a range of group IDs; raw sockets themselves still require root or CAP_NET_RAW.[37] IPv6 raw sockets differ from their IPv4 counterparts. There is no IP_HDRINCL socket option; on Linux, socket(AF_INET6, SOCK_RAW, IPPROTO_RAW) allows sending complete packets, including the IPv6 header and extension headers supplied by the application. On receive, the advanced sockets API for IPv6 (RFC 3542) delivers data without the IPv6 header or extension headers, which are made available through ancillary data instead.[39][35][40] These restrictions ensure that raw socket usage remains limited to trusted, privileged applications.[36]
Blocking and Non-Blocking Modes
Berkeley sockets operate in two primary modes: blocking and non-blocking, which determine how socket functions behave when an operation cannot complete immediately.
In the default blocking mode, functions such as connect(), accept(), send(), and recv() suspend the calling process until the operation completes successfully or encounters an unrecoverable error. This behavior simplifies programming for single-connection applications, as the process waits indefinitely for events like connection establishment or data availability, but it risks indefinite hangs if the peer is unresponsive or network conditions delay completion. For instance, a blocking connect() will not return until the TCP three-way handshake finishes or fails with a timeout or error.[28][41]
To enable non-blocking mode, the fcntl() function is used to set the O_NONBLOCK flag on the socket file descriptor with the F_SETFL command, as shown in the following example:
```c
#include <fcntl.h>

/* Read the current file status flags, then add O_NONBLOCK to them. */
int flags = fcntl(socket_fd, F_GETFL, 0);
if (flags == -1 || fcntl(socket_fd, F_SETFL, flags | O_NONBLOCK) == -1)
    perror("fcntl");   /* needs <stdio.h>; handle the failure as appropriate */
```
This configuration causes socket operations to return immediately if they cannot complete without waiting, typically failing with -1 and setting errno to EAGAIN or EWOULDBLOCK to indicate that the operation would block. In many POSIX systems, EWOULDBLOCK is defined equivalently to EAGAIN. A non-blocking connect() returns -1 with errno set to EINPROGRESS if the connection initiation begins but is not yet complete, allowing the process to continue and later check status via I/O readiness mechanisms. Similarly, send() and recv() may transfer partial data, that is, fewer bytes than requested, and require application-level loops to retry until the full amount is handled or an error occurs.[28][41]
Non-blocking mode supports asynchronous patterns by integrating with I/O multiplexing interfaces like select(), poll(), or Linux's epoll to monitor multiple sockets for readiness, enabling efficient handling of concurrent connections without dedicated threads per socket. This is particularly valuable for scalable server applications, where blocking mode could tie up resources on slow clients, but it demands careful error handling and retry logic to manage incomplete operations and avoid busy-waiting. While blocking mode prioritizes simplicity for straightforward, low-concurrency scenarios, non-blocking mode enhances responsiveness and throughput in high-load environments at the cost of increased code complexity.[42][43]
Lifecycle Management
Socket Termination
Socket termination in Berkeley sockets involves properly closing socket descriptors to release system resources, ensure graceful disconnection, and prevent resource leaks. The primary mechanism for terminating a socket is the close() function, which releases the file descriptor associated with the socket and, for connection-oriented protocols like TCP, initiates the disconnection process by sending a FIN segment to the peer. The function is declared as int close(int sockfd);, where sockfd is the socket file descriptor; it returns 0 on success or -1 on failure, with errno set accordingly (e.g., EBADF for an invalid descriptor).[44] Calling close() again on a descriptor that has already been closed fails with EBADF; if the descriptor number has meanwhile been reused (for example by another thread), the second call closes that unrelated descriptor instead, so each descriptor should be closed exactly once.[44]
In Unix-like systems, the kernel keeps a reference count on the object behind each file descriptor. The close() function removes one reference, and the socket is released only when the last descriptor referring to it is closed. In multi-process programs using fork(), each process must therefore call close() on its own copy of the descriptor for the connection to be torn down.[45]
For more controlled termination, especially in scenarios requiring partial closure, the shutdown() function allows disabling send and/or receive operations without releasing the descriptor. Declared as int shutdown(int sockfd, int how);, it takes a how parameter specifying the shutdown mode: SHUT_RD to disable further receives, SHUT_WR to disable further sends (useful after all data transmission is complete), or SHUT_RDWR to disable both. This enables half-closed connections, where one direction remains active: shutting down writes, for instance, signals end-of-data to the peer while still allowing the caller to read whatever the peer sends afterwards. The function returns 0 on success or -1 on failure (e.g., ENOTCONN if the socket is not connected).[46]
In TCP sockets, full termination follows a four-way handshake to ensure reliable closure of the bidirectional connection. The process begins when one peer calls close() or shutdown(SHUT_WR), sending a FIN segment and entering the FIN-WAIT-1 state; the remote peer acknowledges with an ACK, transitioning the initiator to FIN-WAIT-2 and itself to CLOSE-WAIT. The remote peer then sends its own FIN (after its application closes), prompting an ACK from the initiator, which enters TIME-WAIT for twice the Maximum Segment Lifetime (MSL) to handle delayed packets before fully closing. RFC 793 suggests an MSL of 2 minutes, but implementations often use shorter values; Linux, for example, holds TIME-WAIT for 60 seconds. This sequence accommodates asynchronous closure and prevents data loss.[47]
To manage lingering sockets during TCP closure, the SO_LINGER option can be set via setsockopt() using a struct linger with fields l_onoff (non-zero to enable) and l_linger (timeout in seconds). If enabled with a non-zero timeout, close() blocks until untransmitted data is sent or the timeout expires; a zero timeout discards data and aborts the connection with a RST. This option is crucial for applications needing to guarantee data delivery before termination.[19]
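Setting the option can be sketched as follows. The wrapper name set_linger is an assumption for illustration; the struct linger fields and the setsockopt() call are as described above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical wrapper: configure SO_LINGER on fd.
 *   enabled != 0, seconds > 0  -> close() waits up to `seconds` for
 *                                 unsent data to be transmitted
 *   enabled != 0, seconds == 0 -> close() discards unsent data and
 *                                 resets the connection
 *   enabled == 0               -> default behavior
 * Returns 0 on success or -1 with errno set. */
int set_linger(int fd, int enabled, int seconds)
{
    struct linger lg;

    assert(seconds >= 0);
    lg.l_onoff = enabled ? 1 : 0;
    lg.l_linger = seconds;
    return setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg));
}
```

For example, set_linger(fd, 1, 5) asks close() to wait up to five seconds for pending data, whereas set_linger(fd, 1, 0) requests an abortive close.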
Beyond descriptor closure, proper cleanup includes freeing dynamically allocated resources, such as address information chains returned by getaddrinfo(). The freeaddrinfo() function releases these addrinfo structures and associated memory, traversing the linked list via the ai_next pointer to prevent leaks; it supports freeing sublists if needed. Additionally, close() may fail with EINTR if interrupted by a signal. POSIX leaves the state of the descriptor unspecified in that case, and on Linux the descriptor is released regardless, so retrying close() there risks closing a descriptor that another thread has since opened.[48][44]
Error Handling and Best Practices
In Berkeley sockets, error reporting relies on the global integer variable errno, which is set by system calls and library functions upon failure, indicated by a return value of -1 or NULL. For instance, socket functions such as connect(2) or send(2) populate errno with codes like ETIMEDOUT (connection timed out) or ENOTCONN (socket not connected) when operations fail due to network issues or invalid states.[49] Developers must check and preserve errno immediately after a failing call, as subsequent operations may overwrite it, and in multithreaded programs, errno is thread-local to avoid interference.[49]
Name resolution functions, such as gethostbyname(3), use a separate variable h_errno to report errors like HOST_NOT_FOUND (unknown host) or TRY_AGAIN (temporary server failure), as these are distinct from general system errors.[50] For diagnostics, functions like perror(3) print a human-readable message to standard error based on the current errno, prefixed by a user-supplied string (e.g., "socket error:"), while strerror(3) returns the message as a string for custom logging.[51] These tools facilitate debugging without relying on numeric codes alone.
Best practices emphasize rigorous checking of return values from all socket functions, because failures are reported only through the return value and errno; for example, always distinguish whether recv(2) returns a positive byte count, zero (indicating that the peer has shut down its sending side), or -1 (error). In non-blocking mode, use getsockopt(2) with the SO_ERROR option at the SOL_SOCKET level to retrieve and clear pending errors from asynchronous operations like connect(2), rather than relying on errno alone, which may not capture deferred errors.[52] To prevent inefficient resource usage, avoid busy-waiting loops when polling for data or connections; instead, employ select(2), poll(2), or the Linux-specific epoll(7) interface for event-driven handling that scales better under load.
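The SO_ERROR technique can be sketched as a connect with a timeout. The helper names connect_with_timeout and connect_loopback are hypothetical, and reporting a timeout as ETIMEDOUT is a choice made for this example; the underlying calls (fcntl(), connect(), poll(), getsockopt()) are used as described above.

```c
#define _POSIX_C_SOURCE 200809L
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Start a non-blocking connect on fd and wait up to timeout_ms for it to
 * finish. Returns 0 on success or -1 with errno set. The socket is left
 * in non-blocking mode. */
int connect_with_timeout(int fd, const struct sockaddr *addr,
                         socklen_t addrlen, int timeout_ms)
{
    struct pollfd pfd;
    int flags, rc, err = 0;
    socklen_t errlen = sizeof(err);

    assert(addr != NULL);
    flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;

    if (connect(fd, addr, addrlen) == 0)
        return 0;                        /* completed immediately */
    if (errno != EINPROGRESS)
        return -1;                       /* immediate failure */

    /* The socket becomes writable when the attempt succeeds or fails. */
    pfd.fd = fd;
    pfd.events = POLLOUT;
    pfd.revents = 0;
    rc = poll(&pfd, 1, timeout_ms);
    if (rc == 0) {
        errno = ETIMEDOUT;               /* convention of this sketch */
        return -1;
    }
    if (rc < 0)
        return -1;

    /* errno does not carry the deferred result; SO_ERROR does. */
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &errlen) == -1)
        return -1;
    if (err != 0) {
        errno = err;
        return -1;
    }
    return 0;
}

/* Usage sketch: connect to 127.0.0.1:port; returns a connected fd or -1. */
int connect_loopback(unsigned short port, int timeout_ms)
{
    struct sockaddr_in sa;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd == -1)
        return -1;
    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_port = htons(port);
    sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (connect_with_timeout(fd, (struct sockaddr *)&sa, sizeof(sa),
                             timeout_ms) == -1) {
        int saved = errno;               /* close() may overwrite errno */
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}
```

Saving errno before close() in the failure path follows the advice above about preserving the error code before any further call can overwrite it.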
Security considerations include validating the source of received packets before processing them, for example by checking that the address family reported in struct sockaddr_in (sin_family) or struct sockaddr_in6 (sin6_family) matches the expected one (AF_INET or AF_INET6) and that the peer address is one the application expects. This matters most for UDP sockets, where any host can send a datagram to a bound port; because source addresses can be forged, such checks reduce but do not eliminate injection risks.[36] For TCP servers, set the backlog parameter in listen(2) to a reasonable value like 128 or SOMAXCONN (typically 128-4096 depending on the system) so that the queue of pending connections stays bounded. This limits the resources an attacker can tie up, although SYN flood attacks, in which excessive half-open connections exhaust the queue and deny service to legitimate clients, are countered mainly by kernel-level defenses such as SYN cookies.[53] In IPv6 dual-stack environments, a single socket can serve both protocols by binding to :: (the IPv6 unspecified, or wildcard, address) with the IPV6_V6ONLY option set to zero, so that IPv4 clients appear as IPv4-mapped IPv6 addresses and no separate IPv4 socket is needed.[34]
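A dual-stack listener along these lines can be sketched as follows. The function name make_dual_stack_listener is an assumption for illustration, and the sketch presumes a system that supports IPv6 and permits clearing IPV6_V6ONLY, which not every platform does.

```c
#define _POSIX_C_SOURCE 200809L
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a TCP listener on the IPv6 wildcard
 * address that also accepts IPv4 clients (seen as ::ffff:a.b.c.d).
 * Returns the listening descriptor or -1 with errno set. */
int make_dual_stack_listener(unsigned short port, int backlog)
{
    struct sockaddr_in6 sa;
    int off = 0, saved;
    int fd;

    assert(backlog > 0);
    fd = socket(AF_INET6, SOCK_STREAM, 0);
    if (fd == -1)
        return -1;

    /* Clear IPV6_V6ONLY before bind() so IPv4-mapped peers are accepted. */
    if (setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &off, sizeof(off)) == -1)
        goto fail;

    memset(&sa, 0, sizeof(sa));
    sa.sin6_family = AF_INET6;
    sa.sin6_addr = in6addr_any;          /* the :: wildcard address */
    sa.sin6_port = htons(port);
    if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) == -1)
        goto fail;
    if (listen(fd, backlog) == -1)
        goto fail;
    return fd;

fail:
    saved = errno;                       /* keep the original error */
    close(fd);
    errno = saved;
    return -1;
}
```

Passing port 0 lets the kernel choose a free port, which getsockname() can then report.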
Portability between BSD and POSIX implementations requires awareness of subtle differences, particularly in IPv6 support; while POSIX.1g standardizes the core API, older BSD variants (e.g., 4.3BSD) lack full IPv6 extensions like struct sockaddr_in6 and scoped address handling via sin6_scope_id, necessitating use of protocol-independent functions like getaddrinfo(3) for resolution and inet_pton(3) for address conversion to ensure compatibility across systems.[34] POSIX-compliant systems generally align closely with BSD sockets but may vary in default backlog limits or IPv6 multicast behavior, so testing on target platforms (e.g., FreeBSD vs. Linux) is essential for robust code.[36]
Practical Examples
TCP Client-Server Implementation
A simple TCP client-server implementation using Berkeley sockets demonstrates the connection-oriented nature of TCP, where the server listens for incoming connections and the client establishes a reliable, bidirectional communication channel. This example assumes IPv4 addressing (AF_INET), stream sockets (SOCK_STREAM), and blocking mode for all operations, ensuring synchronous behavior without additional threading or non-blocking configurations. The server takes its port from the command line; the sample run below uses 8080, a common non-privileged port above 1024 that does not require root privileges. The server echoes messages back to the client to illustrate data exchange. Error handling uses perror() to report system errors based on errno.[54]
The server code follows the standard sequence: resolve the local address with getaddrinfo(), create a socket, bind to it, listen for connections, accept a client, exchange data in a loop, and close the sockets. getaddrinfo() is called with the AI_PASSIVE flag so that the returned address is suitable for accepting connections on all interfaces. SO_REUSEADDR is set via setsockopt() to allow the port to be reused immediately after server termination, preventing "address already in use" errors. The recv()/send() loop echoes data until the client closes the connection, an error occurs, or, as a simple demo termination rule, a chunk ending in a newline has been echoed.[55][2][21][56][57][58][29][59][60]
```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <errno.h>

int main(int argc, char *argv[]) {
    if (argc != 2) {
        fprintf(stderr, "Usage: %s <port>\n", argv[0]);
        exit(EXIT_FAILURE);
    }
    char *port = argv[1];

    // Resolve a local IPv4 stream address suitable for bind()
    struct addrinfo hints = {0}, *res;
    hints.ai_family = AF_INET;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_PASSIVE; // For binding to any address
    int status = getaddrinfo(NULL, port, &hints, &res);
    if (status != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(status));
        exit(EXIT_FAILURE);
    }

    int sockfd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
    if (sockfd < 0) {
        perror("socket");
        freeaddrinfo(res);
        exit(EXIT_FAILURE);
    }

    // Allow the port to be rebound right after a restart
    int yes = 1;
    if (setsockopt(sockfd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof(yes)) < 0) {
        perror("setsockopt"); // Not fatal for this demo
    }

    if (bind(sockfd, res->ai_addr, res->ai_addrlen) < 0) {
        perror("bind");
        close(sockfd);
        freeaddrinfo(res);
        exit(EXIT_FAILURE);
    }
    if (listen(sockfd, 10) < 0) { // Backlog of 10 pending connections
        perror("listen");
        close(sockfd);
        freeaddrinfo(res);
        exit(EXIT_FAILURE);
    }
    printf("Server listening on port %s\n", port);

    // Block until one client connects
    struct sockaddr_storage client_addr;
    socklen_t addr_size = sizeof(client_addr);
    int client_sock = accept(sockfd, (struct sockaddr*)&client_addr, &addr_size);
    if (client_sock < 0) {
        perror("accept");
        close(sockfd);
        freeaddrinfo(res);
        exit(EXIT_FAILURE);
    }

    // Echo loop: recv() returns 0 when the client closes, -1 on error
    char buf[1024];
    ssize_t bytes_received;
    while ((bytes_received = recv(client_sock, buf, sizeof(buf) - 1, 0)) > 0) {
        buf[bytes_received] = '\0';
        printf("Received: %s", buf);
        if (send(client_sock, buf, (size_t)bytes_received, 0) < 0) {
            perror("send");
            break;
        }
        if ((size_t)bytes_received < sizeof(buf) - 1 && buf[bytes_received - 1] == '\n') {
            break; // Simple termination on newline for demo
        }
    }
    if (bytes_received < 0) {
        perror("recv");
    }

    close(client_sock);
    close(sockfd);
    freeaddrinfo(res);
    return 0;
}
```
The client code mirrors the server's setup but uses getaddrinfo() to resolve the server's hostname (e.g., "localhost") and port, creates a socket, connects to the server, sends a message, receives the echo, and closes the connection. connect() establishes the TCP three-way handshake implicitly. This handles one exchange but can be extended for multiple by wrapping send()/recv() in a loop. Like the server, errors are checked after each call.[55][2][61][58][29][60]
```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <errno.h>

int main(int argc, char *argv[]) {
    if (argc != 3) {
        fprintf(stderr, "Usage: %s <hostname> <port>\n", argv[0]);
        exit(EXIT_FAILURE);
    }
    char *hostname = argv[1];
    char *port = argv[2];

    // Resolve the server's IPv4 stream address
    struct addrinfo hints = {0}, *res;
    hints.ai_family = AF_INET;
    hints.ai_socktype = SOCK_STREAM;
    int status = getaddrinfo(hostname, port, &hints, &res);
    if (status != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(status));
        exit(EXIT_FAILURE);
    }

    int sockfd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
    if (sockfd < 0) {
        perror("socket");
        freeaddrinfo(res);
        exit(EXIT_FAILURE);
    }

    // Blocks until the TCP handshake completes or fails
    if (connect(sockfd, res->ai_addr, res->ai_addrlen) < 0) {
        perror("connect");
        close(sockfd);
        freeaddrinfo(res);
        exit(EXIT_FAILURE);
    }

    const char *message = "Hello, server!\n";
    ssize_t bytes_sent = send(sockfd, message, strlen(message), 0);
    if (bytes_sent < 0) {
        perror("send");
        close(sockfd);
        freeaddrinfo(res);
        exit(EXIT_FAILURE);
    }

    // A single recv() is enough for this short echo; 0 means the server closed
    char buf[1024];
    ssize_t bytes_received = recv(sockfd, buf, sizeof(buf) - 1, 0);
    if (bytes_received > 0) {
        buf[bytes_received] = '\0';
        printf("Echo from server: %s", buf);
    } else if (bytes_received < 0) {
        perror("recv");
    }

    close(sockfd);
    freeaddrinfo(res);
    return 0;
}
```
To compile and run, use gcc -o server server.c for the server and gcc -o client client.c for the client on a POSIX-compliant system. Start the server with ./server 8080, then run the client with ./client localhost 8080 in another terminal; the client sends "Hello, server!" and prints the echoed response. This setup exercises the full TCP socket lifecycle in blocking mode.
UDP Client-Server Implementation
UDP sockets provide a connectionless communication model using datagrams, where each message is sent independently without establishing a persistent connection, unlike TCP's stream-oriented approach. This allows for simpler implementation but requires explicit handling of source and destination addresses in every transaction, as there is no underlying connection state to track peers. Implementations must account for possible datagram loss, duplication, or out-of-order delivery, which are not guaranteed to be reliable by the protocol. A common example is an echo service, where the server reflects received messages back to the client.[62]
The server begins by creating a datagram socket with the socket() function, specifying the IPv4 address family (AF_INET) and datagram type (SOCK_DGRAM). It then binds the socket to a local address and port using bind(), making it available for incoming datagrams on that endpoint. Unlike TCP servers, no listen() or accept() is required, as UDP operates without connections. The server enters a loop using recvfrom() to receive datagrams along with the client's address, processes the data (e.g., echoing it), and sends the response via sendto() to the client's address. Error checking is essential; for instance, socket() and bind() return -1 on failure, and their errno should be inspected for issues like address already in use (EADDRINUSE).[62]
Here is an annotated C implementation of a UDP echo server, adapted from standard Berkeley sockets usage:
```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>
#include <errno.h>

#define MAXLINE 1024
#define SERV_PORT 9877 // Example port

int main(void) {
    int sockfd;
    ssize_t n;
    socklen_t len;
    char mesg[MAXLINE + 1]; // +1 leaves room for the terminating null
    struct sockaddr_in servaddr, cliaddr;

    // Create datagram socket
    if ((sockfd = socket(AF_INET, SOCK_DGRAM, 0)) < 0) {
        perror("socket error");
        return 1;
    } // socket() failure check

    // Bind to local address (any IP, specific port)
    memset(&servaddr, 0, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_addr.s_addr = htonl(INADDR_ANY); // Bind to all interfaces
    servaddr.sin_port = htons(SERV_PORT);
    if (bind(sockfd, (struct sockaddr *) &servaddr, sizeof(servaddr)) < 0) {
        perror("bind error");
        close(sockfd);
        return 1;
    } // bind() failure check, e.g., EADDRINUSE
    printf("UDP Echo Server started on port %d\n", SERV_PORT);

    for (;;) {
        len = sizeof(cliaddr);
        // Receive datagram with sender's address
        n = recvfrom(sockfd, mesg, MAXLINE, 0, (struct sockaddr *) &cliaddr, &len);
        if (n < 0) {
            perror("recvfrom error");
            continue;
        } // recvfrom() retrieves data and client addr; returns bytes or -1
        mesg[n] = 0; // Null-terminate for printing
        printf("Received %zd bytes: %s\n", n, mesg);

        // Echo back to client (address included in every send)
        if (sendto(sockfd, mesg, (size_t)n, 0, (struct sockaddr *) &cliaddr, len) != n) {
            perror("sendto error");
        } // sendto() needs an explicit destination; success does not imply delivery
    }

    close(sockfd); // Not reached: the loop runs until the process is terminated
    return 0;
}
```
This code highlights the stateless nature: each recvfrom() captures the client's address anew, and sendto() must supply it for responses, accommodating multiple clients without dedicated state. Datagrams may arrive unordered or be lost, so applications must implement reliability if needed, such as acknowledgments.[62]
The client creates a datagram socket similarly but does not bind to a specific local port, allowing the kernel to assign an ephemeral one on the first send. It resolves the server's address using getaddrinfo(), which returns a list of candidate addresses; the example restricts the lookup to IPv4 and uses the first result. The client then enters a loop: reading a line of input, sending it via sendto() with the server's address, and receiving the response with recvfrom(), ignoring the sender's address since the reply is expected to come from the server. Error checks mirror the server's, with sendto() and recvfrom() potentially returning -1 for network issues like unreachable hosts (EHOSTUNREACH).[62]
Here is an annotated C implementation of a UDP echo client, adapted from standard Berkeley sockets usage:
```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>
#include <errno.h>

#define MAXLINE 1024
#define SERV_PORT "9877" // Matching server port, as a string for getaddrinfo()
#define MAXDATASIZE 100 // Max input line

int main(int argc, char **argv) {
    int sockfd, status;
    ssize_t n;
    size_t len;
    char mesg[MAXLINE + 1]; // +1 leaves room for the terminating null
    struct sockaddr_in servaddr;
    struct addrinfo hints, *res;

    if (argc != 2) {
        fprintf(stderr, "usage: %s <host>\n", argv[0]);
        return 1;
    }

    // Create datagram socket (no bind needed for client)
    if ((sockfd = socket(AF_INET, SOCK_DGRAM, 0)) < 0) {
        perror("socket error");
        return 1;
    } // socket() as in server

    // Resolve server address (numeric IP or hostname)
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET;
    hints.ai_socktype = SOCK_DGRAM;
    if ((status = getaddrinfo(argv[1], SERV_PORT, &hints, &res)) != 0) {
        fprintf(stderr, "getaddrinfo error for %s: %s\n", argv[1], gai_strerror(status));
        close(sockfd);
        return 1;
    }
    memcpy(&servaddr, res->ai_addr, res->ai_addrlen); // AF_INET hint yields a sockaddr_in
    freeaddrinfo(res);
    printf("UDP Echo Client sending to %s:%s\n", argv[1], SERV_PORT);

    // fgets() returns NULL at end of input or on error, which ends the loop
    while (fgets(mesg, MAXDATASIZE, stdin) != NULL) {
        len = strlen(mesg);
        if (len > 0 && mesg[len - 1] == '\n') {
            mesg[--len] = 0; // Remove newline
        }
        if (len == 0 || strcmp(mesg, "q") == 0) {
            break; // Quit on 'q' or empty
        }

        // Send to server with explicit address
        if (sendto(sockfd, mesg, len, 0, (struct sockaddr *) &servaddr, sizeof(servaddr)) != (ssize_t)len) {
            perror("sendto error");
            continue;
        } // sendto() includes dest addr in every datagram

        // Receive response (sender addr not needed for echo)
        n = recvfrom(sockfd, mesg, MAXLINE, 0, NULL, NULL);
        if (n < 0) {
            perror("recvfrom error");
            continue;
        } // Blocks indefinitely if the reply is lost, unless a timeout is set
        mesg[n] = 0;
        printf("Response: %s\n", mesg);
    }

    close(sockfd);
    return 0;
}
```
In contrast to TCP, UDP exchanges are stateless, with no connection setup or teardown, enabling fire-and-forget messaging but exposing applications to unreliability—datagrams can be duplicated, lost, or reordered without notification. This example demonstrates iterative handling suitable for low-volume services; for high concurrency, non-blocking modes or multiplexing (e.g., select()) may be added. Best practices include validating received lengths and using timeouts on recvfrom() to detect losses.[62]
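One way to apply the timeout recommendation is the SO_RCVTIMEO socket option, sketched below. The helper names set_recv_timeout and recv_timed_out are assumptions for illustration; with the option set, a blocking recvfrom() that receives nothing within the interval fails with errno set to EAGAIN or EWOULDBLOCK.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/* Hypothetical helper: make blocking receives on fd give up after ms
 * milliseconds. A value of 0 restores the default of waiting forever.
 * Returns 0 on success or -1 with errno set. */
int set_recv_timeout(int fd, long ms)
{
    struct timeval tv;

    assert(ms >= 0);
    tv.tv_sec = ms / 1000;
    tv.tv_usec = (ms % 1000) * 1000;
    return setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
}

/* Call right after a receive returned -1: nonzero means the call timed
 * out rather than failing for another reason. */
int recv_timed_out(void)
{
    return errno == EAGAIN || errno == EWOULDBLOCK;
}
```

On a timeout, a client can resend the request a bounded number of times before reporting the server as unreachable, which gives a simple form of loss detection on top of UDP.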