User identifier
A user identifier, often abbreviated as UID, is a unique symbol or character string employed by information systems to distinguish and recognize a specific user within the system.[1] This identifier facilitates essential functions such as assigning access privileges, tracking user activities, and managing system operations across various computing environments, including operating systems, networks, and applications.[2] In practice, UIDs can take forms like numeric integers in Unix-like systems or alphanumeric security identifiers (SIDs) in Windows environments, ensuring unambiguous user recognition without relying solely on human-readable names like usernames.[3]
In Unix-like operating systems, which adhere to POSIX standards, the UID is typically a 32-bit integer ranging from 0 to 4294967295, uniquely assigned to each user account to enforce file permissions and resource access controls in conjunction with the group identifier (GID).[4] For instance, system users like root often have predefined UIDs (e.g., 0 for root), while regular users receive higher values starting from 1000 to avoid conflicts with system accounts.[5] This numeric approach enhances security by allowing the kernel to perform efficient, privilege-based decisions without repeated name lookups.[6]
Beyond operating systems, user identifiers play a critical role in broader contexts like database management, web authentication, and identity management protocols (e.g., OAuth or SAML), where they enable secure, scalable user tracking while complying with privacy regulations such as GDPR by minimizing the exposure of personal data.[7] Effective management of UIDs, including preventing reuse and ensuring uniqueness across distributed systems, is vital to mitigate risks like privilege escalation or unauthorized access.
Overview and fundamentals
Definition and purpose
In Unix-like operating systems, a user identifier (UID) is a unique integer value that identifies a user account, enabling the distinction between multiple users in shared computing environments.[8] The UID is stored in configuration files such as /etc/passwd and serves as the fundamental mechanism by which the kernel associates processes, files, and resources with specific users.[9] This numeric label, typically represented as an unsigned integer, replaces textual usernames internally to optimize system operations and enforce security boundaries.[10]
The primary purposes of a UID revolve around core security functions: authentication, which verifies a user's identity during login by mapping the provided credentials to the corresponding UID; authorization, which governs access to files, processes, and system resources based on the UID's associated privileges; and auditing, which attributes actions and events to individual users for logging and accountability.[10] For instance, when a file is created, it is stamped with the UID of its owner, allowing the operating system to apply permission checks against that identifier during subsequent access attempts.[11]
The concept of user identifiers first emerged in early multi-user operating systems like Multics during the 1960s, where identification combined elements such as person and project IDs to enable secure, shared access among users.[12] It was formalized and simplified into a single numeric UID in Unix during the 1970s, supporting efficient resource sharing in time-sharing environments.[13] A basic example in Unix-like systems is UID 0, which denotes the superuser account (root) and grants unrestricted privileges, exempting it from standard access constraints.[8]
Role in multi-user operating systems
In multi-user operating systems, such as Unix-like systems, user identifiers (UIDs) play a critical role in enabling secure resource sharing among multiple concurrent users in time-sharing environments, servers, and mainframes. By assigning a unique numeric UID to each user account, the operating system kernel can associate processes, files, and other resources with specific users, preventing unauthorized access and ensuring that one user's activities do not inadvertently or maliciously affect others. This mechanism is foundational to multi-user setups where resources like CPU time, memory, and storage are allocated dynamically among users logging in locally or remotely.[14][15]
UIDs integrate with the permission model to enforce access controls, particularly through file and directory permissions categorized for the owner (user), group, and others. In Unix systems, permissions specify read, write, and execute rights (e.g., represented as rwxr-xr-x), where the owner's UID determines full control over resources they create, while group and other categories allow controlled collaboration. This model extends to advanced mechanisms like access control lists (ACLs) in POSIX-compliant systems, which refine permissions beyond basic categories by referencing specific UIDs. For runtime checks, the effective user ID may temporarily alter permissions for processes, but the core association remains tied to the user's UID.[16][17]
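The owner/group/other decision described above can be sketched as a small pure function. This is a simplified illustration, not kernel code: the names may_access and access_want are hypothetical, and supplementary groups, ACLs, and capabilities are deliberately left out.

```c
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Requested access; the values match the low-order "other" permission bits. */
enum access_want { WANT_EXEC = 1, WANT_WRITE = 2, WANT_READ = 4 };

/*
 * Simplified owner/group/other check (hypothetical helper).
 * Exactly one permission class applies: owner if the UIDs match,
 * otherwise group if the GIDs match, otherwise other.
 */
bool may_access(uid_t euid, gid_t egid,
                uid_t file_uid, gid_t file_gid,
                mode_t mode, int want)
{
    if (euid == 0) {
        /* The superuser bypasses read/write checks, but execution
         * still requires at least one execute bit to be set. */
        if (want & WANT_EXEC)
            return (mode & (S_IXUSR | S_IXGRP | S_IXOTH)) != 0;
        return true;
    }

    mode_t bits;
    if (euid == file_uid)
        bits = (mode >> 6) & 7;   /* owner class */
    else if (egid == file_gid)
        bits = (mode >> 3) & 7;   /* group class */
    else
        bits = mode & 7;          /* other class */

    return (bits & (mode_t)want) == (mode_t)want;
}
```

With mode 0755 (rwxr-xr-x), the owner may write while group members and others may only read and execute. Because only one class is consulted, an owner can be denied access that others are granted, as with mode 0077.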
The security benefits of UIDs stem from their enforcement of the least privilege principle, isolating users and their processes to minimize interference and limit potential damage from compromises. Each process inherits its parent's UID, ensuring that operations are confined to authorized resources; for instance, a user's files cannot be modified by another without explicit permission, protecting data privacy and integrity in shared environments. This isolation is vital in multi-user scenarios, where virtual memory separation and kernel enforcement prevent processes from accessing unauthorized memory or hardware.[15][18]
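The inheritance of the UID from parent to child can be observed with fork(). The following is a minimal sketch; the function name child_inherits_uid is hypothetical, and the child reports its comparison through its exit status.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if a forked child sees the same real UID as its parent,
 * 0 if it does not, and -1 if fork() or waitpid() fails.
 */
int child_inherits_uid(void)
{
    uid_t parent_uid = getuid();

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: exit status 0 means the UID matched the parent's. */
        _exit(getuid() == parent_uid ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;

    return WEXITSTATUS(status) == 0 ? 1 : 0;
}
```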
In practice, on a Unix server, web server processes like Apache are typically run under a non-privileged UID, such as that of the www-data user (often UID 33 in Debian-based systems), to restrict their access to only necessary files and directories. This limits the scope of damage if the server is exploited, as the process cannot escalate to system-wide privileges without additional mechanisms.[19][18]
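A service that starts as root and then switches to an unprivileged account might drop its privileges along the following lines. This is a hedged sketch rather than how any particular server does it: the function name is hypothetical, the target UID and GID are supplied by the caller, and setgroups() is a common extension rather than part of POSIX.

```c
#define _DEFAULT_SOURCE   /* exposes setgroups() on glibc */
#include <grp.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Permanently switches the process to the given UID and GID.
 * Returns 0 on success and -1 on any failure; a real daemon
 * would normally abort rather than continue after a failure.
 */
int drop_privileges_permanently(uid_t uid, gid_t gid)
{
    /* Only a privileged process may replace its supplementary groups. */
    if (geteuid() == 0 && setgroups(1, &gid) != 0)
        return -1;

    /* Change the group first: once the UID is no longer 0,
     * the process may not be allowed to change its GID. */
    if (setgid(gid) != 0)
        return -1;
    if (setuid(uid) != 0)
        return -1;

    /* Sanity check: regaining root must now be impossible. */
    if (uid != 0 && setuid(0) == 0)
        return -1;

    return 0;
}
```

On a Debian-based system this might be called with the UID and GID of www-data after the listening socket has been opened; looking those values up by name (for example with getpwnam()) is preferable to hard-coding 33.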
User ID attributes in Unix-like systems
Real user ID
The real user ID (RUID), also known as the real UID, is the user identifier assigned to a process upon its creation in Unix-like operating systems, matching the UID of the user who originally invoked the process. This value represents the true owner of the process and is inherited from the parent process during a fork operation, establishing the persistent identity of the initiating user.[20] The RUID remains unchanged throughout the process lifecycle unless explicitly modified by a privileged system call, such as setuid() or setreuid() executed with superuser privileges, ensuring it serves as a stable reference to the original invoker even amid runtime privilege adjustments.[21][20]
The RUID plays a key role in system accounting, where it is used to attribute resource consumption (such as CPU time, memory usage, and I/O operations) to the user who started the process, rather than any temporary privileged persona. In process accounting mechanisms, the recorded user identifier in the accounting file structure (e.g., the acct structure in Linux) corresponds specifically to the real UID, enabling accurate tracking and potential billing independent of effective privilege changes.[22] It also acts as a fallback for determining the process's core identity in scenarios where the effective user ID has been altered, such as during privilege escalation via setuid executables, allowing applications to query the original user via the getuid() system call.[20]
In POSIX-compliant systems, the real user ID is retrieved using the getuid() function, which returns the RUID of the calling process and is always successful without setting errno. This call provides a reliable way to access the invariant user identity, supporting auditing and ownership verification without interference from dynamic permission shifts.[23]
For example, if a user with UID 1001 launches a shell script, the process's RUID is set to 1001 at creation and persists regardless of subsequent actions, such as executing a setuid program that temporarily elevates privileges; resource usage in accounting logs will thus be charged to UID 1001.[20]
Effective user ID
The effective user ID (EUID), also known as the effective UID, is the user identifier that determines a process's access permissions and privileges in Unix-like operating systems during runtime.[24] It represents the identity under which the process operates for resource access and security checks, potentially differing from the real user ID after privilege adjustments.[25] The EUID can be retrieved using the geteuid() system call.[24]
The EUID is modified via the setuid() system call, which sets it to a specified UID value if the process holds appropriate privileges, such as the CAP_SETUID capability in Linux or execution of a setuid binary.[26] This allows dynamic privilege escalation or de-escalation, enabling non-privileged users to perform authorized elevated tasks without granting full root access.[21] For instance, when a setuid program executes, the kernel sets the EUID to the file owner's UID, facilitating controlled privilege changes.[26]
In standard Unix-like systems, the EUID is used by the kernel to enforce access control for operations like file opens via open() or program execution via exec(), where it is compared against file ownership and permissions.[25] A practical example is the passwd utility, a setuid-root program that temporarily sets its EUID to 0 (root) to update the protected /etc/shadow file with the user's new password, then reverts the EUID to the caller's real UID to minimize security risks.[27]
Linux introduces a filesystem user ID (FSUID) as a specialized variant of the EUID, dedicated exclusively to permission checks for filesystem accesses like path resolution and inode operations.[25] The FSUID typically mirrors the EUID but can be independently adjusted using setfsuid(), providing finer-grained separation to support features like NFS server implementations without altering general process privileges.[28] This allows processes to maintain distinct identities for file system interactions while using the EUID for other resources, such as semaphores or shared memory.[25]
Saved set-user-ID
The saved set-user-ID (SUID) is a process credential in Unix-like systems that stores a copy of the effective user ID (EUID) as it was set during the last successful exec() call or by a privileged setuid() invocation, enabling the process to later restore that EUID without requiring superuser privileges.[21] This mechanism preserves the original privilege level associated with a set-user-ID executable, distinguishing it from the real user ID (RUID) and allowing dynamic privilege management within the process.[21]
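On Linux, the real, effective, saved, and filesystem UIDs of a process can be read together from the Uid: line of /proc/self/status. The sketch below assumes a Linux system with /proc mounted; the struct and function names are hypothetical, and other systems need a different mechanism (for example the non-POSIX getresuid() call).

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* The four per-process user IDs that Linux reports. */
struct uid_set {
    uid_t real, effective, saved, filesystem;
};

/* Reads the "Uid:" line of /proc/self/status; returns 0 on success, -1 on failure. */
int read_uid_set(struct uid_set *out)
{
    FILE *f = fopen("/proc/self/status", "r");
    if (f == NULL)
        return -1;

    char line[256];
    int rc = -1;
    while (fgets(line, sizeof line, f) != NULL) {
        unsigned long r, e, s, fs;
        /* Field order: real, effective, saved set, filesystem. */
        if (sscanf(line, "Uid: %lu %lu %lu %lu", &r, &e, &s, &fs) == 4) {
            out->real = (uid_t)r;
            out->effective = (uid_t)e;
            out->saved = (uid_t)s;
            out->filesystem = (uid_t)fs;
            rc = 0;
            break;
        }
    }
    fclose(f);
    return rc;
}

/* Cross-checks the parsed values against the POSIX getuid()/geteuid() calls. */
int uid_set_matches_syscalls(const struct uid_set *s)
{
    return s->real == getuid() && s->effective == geteuid();
}
```

In an ordinary process all four values are equal. Inside a set-user-ID program, the real UID differs from the effective and saved values.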
The primary purpose of the saved set-user-ID is to facilitate privilege bracketing, a security practice where a privileged process temporarily relinquishes elevated permissions by setting its EUID to the less-privileged RUID for routine operations, then restores the original EUID from the saved value for sensitive tasks that require higher access.[21] This approach minimizes the exposure of elevated privileges, reducing the risk of exploitation if a vulnerability is triggered during unprivileged execution, and is essential for secure implementation of set-user-ID applications.[21]
In POSIX-compliant systems where the _POSIX_SAVED_IDS feature is defined, the saved set-user-ID is automatically set by the setuid() system call when the process has appropriate privileges (e.g., effective UID of 0), or explicitly managed using the setresuid() function, which allows setting the RUID, EUID, and SUID independently—provided the process possesses the CAP_SETUID capability or meets unprivileged constraints (e.g., setting to current RUID, EUID, or SUID values).[21][29] Restoration occurs via calls like seteuid() or setreuid(), which can switch the EUID back to the saved value. However, not all Unix-like systems fully implement this POSIX extension; historical BSD variants, such as 4.3BSD, lacked dedicated saved set-user-ID support and instead used setreuid() to swap between RUID and EUID directly for similar privilege-switching effects.[30]
A representative example is a network daemon process, which often starts with an EUID of 0 to bind to privileged ports below 1024. The kernel keeps this EUID in the saved set-user-ID; the daemon drops privileges by setting only the EUID to a non-root value (e.g., via seteuid(), or via setresuid() with the saved ID left at 0) for accepting connections and handling non-sensitive I/O, and later restores the root EUID from the saved value for operations like user authentication or file access that require elevated permissions.[31][21] A plain setuid() call is unsuitable for this temporary drop: when invoked with an EUID of 0 it overwrites the real, effective, and saved IDs together, so root cannot be regained afterwards. This bracketing ensures the daemon operates with minimal privileges most of the time while retaining the ability to escalate securely when needed.[21]
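Privilege bracketing of this kind can be sketched with seteuid(), which changes only the effective UID and leaves the saved set-user-ID available for the later restore. The function names below are hypothetical, and the sketch assumes the privileged EUID is also the saved set-user-ID, as it is immediately after a set-user-ID program is executed.

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Runs fn(arg) with the effective UID temporarily lowered to the real UID,
 * then restores the previous effective UID.
 * Returns fn's result, or -1 if either credential switch fails.
 */
int run_unprivileged(int (*fn)(void *), void *arg)
{
    uid_t privileged = geteuid();

    /* Drop: only the effective UID changes; the saved ID is untouched. */
    if (seteuid(getuid()) != 0)
        return -1;

    int result = fn(arg);

    /* Restore: allowed because the target equals the saved set-user-ID. */
    if (seteuid(privileged) != 0)
        return -1;

    return result;
}

/* Example task: reports whether it ran with EUID equal to the real UID. */
int euid_equals_ruid(void *unused)
{
    (void)unused;
    return geteuid() == getuid();
}
```

Calling run_unprivileged(euid_equals_ruid, NULL) returns 1 when the drop took effect. A production daemon would treat a failed restore as fatal, since continuing in an unknown credential state is unsafe.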
Conventions and implementation details
Data types and storage
In Unix-like operating systems, user identifiers (UIDs) are represented using the uid_t data type defined in the <sys/types.h> header. According to the POSIX standard, uid_t is an arithmetic type suitable for holding user IDs, which may be signed or unsigned depending on the implementation, but without a specified minimum range.[32] Historical Unix systems, running on 16-bit architectures like the PDP-11, limited UIDs to 16 bits, supporting values from 0 to 65535.[33]
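Because the signedness and width of uid_t vary between implementations, portable code usually widens the value explicitly before printing it. The following is one way to do that; format_uid is a hypothetical helper, not a standard function.

```c
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

/*
 * Writes uid as decimal text into buf.
 * Returns the value returned by snprintf().
 */
int format_uid(char *buf, size_t size, uid_t uid)
{
    /* (uid_t)-1 is negative only when uid_t is a signed type. */
    if ((uid_t)-1 < (uid_t)0)
        return snprintf(buf, size, "%jd", (intmax_t)uid);

    /* Unsigned case, as on Linux: widen to uintmax_t. */
    return snprintf(buf, size, "%ju", (uintmax_t)uid);
}
```

Passing a uid_t directly to a %d or %u conversion would silently assume a particular width and signedness, which is the assumption this helper avoids.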
Modern implementations, such as Linux, define uid_t as an unsigned 32-bit integer, allowing up to approximately 4.3 billion unique users and addressing the limitations of earlier 16-bit designs.[34] This change was formalized in the Linux kernel starting with version 2.4, which introduced full 32-bit UID support through new system calls like setfsuid32() to prevent overflows and enable scalability for larger environments.[28] Although uid_t remains 32-bit even on 64-bit systems, some extensions in filesystems and network protocols (e.g., NFSv4) accommodate mappings for effectively larger identifier spaces in distributed setups.[35]
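The hazard of squeezing a 32-bit UID through a 16-bit field can be illustrated with plain integer arithmetic. This is only an illustration of naive truncation and does not describe what any particular kernel does with out-of-range values; both function names are hypothetical.

```c
#include <stdint.h>

/* Naively truncates a 32-bit UID to 16 bits by discarding the high bits. */
uint16_t truncate_uid16(uint32_t uid)
{
    return (uint16_t)(uid & 0xFFFFu);
}

/* Returns 1 if the UID is representable in 16 bits, 0 if it would alias another UID. */
int fits_in_uid16(uint32_t uid)
{
    return uid <= UINT16_MAX;
}
```

Under this naive scheme, UID 65536 would alias UID 0 and UID 66536 would alias UID 1000, which is why interfaces that handle both widths have to check the range rather than truncate.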
UIDs are stored in kernel structures managing processes and users. In the Linux kernel, they reside within the struct cred pointed to by the cred field of struct task_struct, the process control block that tracks per-process credentials including real, effective, and saved UIDs.[36] For persistent user account information, UIDs are maintained in databases such as the /etc/passwd file, where each line's third colon-separated field holds the UID as a decimal integer, or in directory services like LDAP, using the uidNumber attribute as an integer value.[8][37] This storage ensures portability across POSIX-compliant systems, where uid_t facilitates consistent handling despite varying underlying integer sizes.[32]
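Extracting the UID from the third colon-separated field of a passwd-style line can be sketched as follows. The helper name passwd_line_uid is hypothetical, and the sample lines in the usage note are made up for illustration.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parses the UID from one passwd-style line of the form
 * name:password:UID:GID:comment:home:shell.
 * Returns 0 and stores the UID on success, -1 on malformed input.
 */
int passwd_line_uid(const char *line, unsigned long *uid_out)
{
    const char *p = line;

    /* Skip the name and password fields. */
    for (int field = 0; field < 2; field++) {
        p = strchr(p, ':');
        if (p == NULL)
            return -1;
        p++;
    }

    /* Reject empty, signed, or non-numeric UID fields. */
    if (*p < '0' || *p > '9')
        return -1;

    char *end;
    errno = 0;
    unsigned long value = strtoul(p, &end, 10);

    /* The UID must be in range and followed by the GID field. */
    if (errno != 0 || *end != ':')
        return -1;

    *uid_out = value;
    return 0;
}
```

For the line alice:x:1001:1001:Alice:/home/alice:/bin/sh the function stores 1001. Application code should normally prefer getpwuid() or getpwnam(), which also consult LDAP and other configured backends instead of reading the file directly.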
UID allocation ranges
In Unix-like systems, user identifiers (UIDs) are typically assigned within numerical ranges defined by standards and implementations to ensure compatibility and prevent conflicts between system and user accounts. Historical Unix systems limited UIDs to a 16-bit unsigned integer range of 0 to 65,535. Modern implementations like Linux extend support using the 32-bit uid_t type, allowing UIDs up to 4,294,967,295 (2^32 - 1), though practical allocation often adheres to established conventions for interoperability. According to the Linux Standard Base (LSB), UIDs 0-99 are statically allocated for system use, 100-499 for dynamic system accounts, and 500+ for regular users, though distributions may vary these ranges.[38]
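The LSB ranges quoted above translate directly into a small classifier. The enum and function names are hypothetical, and the boundaries encode only the LSB convention; a given distribution may use different cut-offs.

```c
#include <stdint.h>

/* Classes of UID under the LSB convention. */
enum uid_class {
    UID_STATIC_SYSTEM,    /* 0-99: statically allocated system accounts */
    UID_DYNAMIC_SYSTEM,   /* 100-499: dynamically allocated system accounts */
    UID_REGULAR           /* 500 and above: regular users */
};

/* Classifies a UID according to the LSB ranges. */
enum uid_class classify_uid_lsb(uint32_t uid)
{
    if (uid <= 99)
        return UID_STATIC_SYSTEM;
    if (uid <= 499)
        return UID_DYNAMIC_SYSTEM;
    return UID_REGULAR;
}
```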
To distinguish system accounts from regular users and avoid overlap that could compromise security, UIDs are divided into reserved and allocatable ranges. UIDs from 0 to 99 are conventionally reserved for system accounts, such as root (UID 0) and other privileged services, while some distributions extend this to 0-999 for additional system use. Regular user accounts are typically assigned UIDs starting from 1000 onward, so that a newly created user never collides with a system account and thereby gains ownership of its files and processes. This separation is configurable via system files like /etc/login.defs in Linux, where parameters such as UID_MIN and UID_MAX define the boundaries for dynamic allocation.
UID allocation is managed through tools like useradd or adduser, which draw from predefined pools to assign unique values automatically, often querying the Name Service Switch (NSS) for configuration and availability checks across local files, LDAP, or other backends. The process ensures sequential or lowest-available assignment within the user range, with NSS modules like files or sss handling lookups to maintain uniqueness without manual intervention. In containerized environments, such as those using Docker, UID mapping techniques remap container-internal UIDs to distinct host ranges (e.g., 100000+ for isolated namespaces) to prevent conflicts between containerized applications and the host system.
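Two of the mechanisms mentioned above reduce to simple arithmetic: picking the lowest unused UID in a range, and translating a container UID to a host UID through a single contiguous mapping. The sketch below uses hypothetical function names and is not how useradd or any container runtime is implemented; real tools may choose a different allocation strategy and support several mapping extents.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Returns the lowest UID in [min, max] that does not appear in used[],
 * or -1 if every UID in the range is taken. Runs in O(n * range) time,
 * which is adequate for an illustration but not for large directories.
 */
int64_t lowest_free_uid(const uint32_t *used, size_t n,
                        uint32_t min, uint32_t max)
{
    /* A 64-bit counter avoids wrap-around when max is UINT32_MAX. */
    for (uint64_t candidate = min; candidate <= max; candidate++) {
        int taken = 0;
        for (size_t i = 0; i < n; i++) {
            if (used[i] == candidate) {
                taken = 1;
                break;
            }
        }
        if (!taken)
            return (int64_t)candidate;
    }
    return -1;
}

/*
 * Maps a container-internal UID to a host UID using one extent:
 * container UIDs [inside_start, inside_start + count) correspond to
 * host UIDs starting at outside_start. Returns -1 if uid is unmapped.
 */
int64_t map_container_uid(uint32_t inside_start, uint32_t outside_start,
                          uint32_t count, uint32_t uid)
{
    if (uid < inside_start || uid - inside_start >= count)
        return -1;
    return (int64_t)outside_start + (uid - inside_start);
}
```

With UIDs 1000, 1001, and 1003 in use and a range of 1000 to 60000, the first function yields 1002. With an extent that maps container UID 0 to host UID 100000 for 65536 IDs, container UID 33 corresponds to host UID 100033.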