
Unix file types

In Unix-like operating systems, files form the fundamental abstraction for data storage, organization, and interaction with system resources, embodying the philosophy that "everything is a file" to provide a uniform interface for diverse entities such as documents, devices, and communication channels. This approach simplifies programming and system management by allowing the same operations—like reading, writing, and permission control—to apply across file types. The POSIX standard defines seven primary file types, encoded in the st_mode field of the stat structure via constants such as S_IFREG for regular files, enabling consistent identification and handling across compliant systems.^[1]

Historical Development

The Unix file type system originated in the early 1970s at Bell Labs with the development of the first Unix versions. The initial file system, introduced in the First Edition of Unix in 1971, supported regular files, directories, and special device files (block and character special files). Named pipes (FIFOs) arrived later, in AT&T's System III and System V releases of the early 1980s, to facilitate inter-process communication between unrelated processes. Symbolic links and sockets were introduced in Berkeley Software Distribution (BSD) Unix, specifically in 4.2BSD, released in 1983. The POSIX.1 standard, published by the IEEE in 1988, formalized regular files, directories, character and block special files, and FIFOs; symbolic links and sockets joined the standard in POSIX.1-2001, completing the seven types and promoting portability across Unix-like systems.^[2]^[1]

Introduction

Concept and Purpose

In Unix-like operating systems, file types represent categories assigned to filesystem objects—such as files, directories, and devices—based on their intended use and operational behavior. These types are primarily encoded in the file mode bits within the inode data structure, which stores metadata for each object on disk.^[3] The primary purpose of Unix file types is to allow the kernel to enforce suitable operations, access permissions, and behavioral semantics for diverse objects. For instance, this enables straightforward data read/write access for certain types while restricting others to navigation or hardware I/O interactions, thereby maintaining filesystem integrity and security.^[4] Fundamentally, Unix treats all filesystem objects as uniform byte streams at the kernel interface, providing a consistent abstraction for system calls. However, the designated file type introduces specialized behaviors, such as executability for program launch or direct device communication, optimizing resource management and application interactions. The core Unix file types, including regular files, directories, and special files, were standardized in the POSIX.1 specification (IEEE Std 1003.1-1988) during the late 1980s to promote portability across Unix variants.^[5]

Historical Development

The Unix file system originated in the late 1960s at Bell Labs, where Ken Thompson, Dennis Ritchie, and Rudd Canaday designed it as a hierarchical structure starting with a prototype on a PDP-7 in 1969. This early implementation introduced the inode (index node) as a core data structure, a fixed-size entry containing metadata such as file size, protection mode, and pointers to data blocks, which distinguished between regular files holding user data, directory files organizing the namespace through name-to-inode mappings, and special files representing devices.^[6] The design drew significant influence from Multics, a 1960s time-sharing system developed jointly by MIT, Bell Labs, and General Electric, particularly in its uniform treatment of files for secondary storage and hierarchical organization, where files were accessed via a consistent interface without distinguishing between data and devices at the user level.^[7] By the release of Version 7 Unix in 1979, the file type system had been formalized through the inode's mode bits, explicitly supporting regular files for data storage, directories for namespace management, and special files divided into character devices for stream-based I/O (e.g., terminals) and block devices for buffered, fixed-size block access (e.g., disks).^[8] This structure ensured a unified interface for all I/O operations, hiding device specifics behind ordinary file semantics. The POSIX.1 standard (IEEE Std 1003.1-1988), ratified in 1988 by the IEEE, further standardized these core types—regular, directory, character special, block special—along with emerging ones like FIFOs, promoting portability across Unix variants by defining behaviors through system calls such as stat() and macros in <sys/stat.h>.^[5] Subsequent evolution in the 1980s introduced additional types to enhance interprocess communication and flexibility. 
Symbolic links, which store pathnames to other files for indirect referencing, were added in 4.2BSD in 1983, allowing pathname indirection similar to Multics mechanisms and supporting cross-filesystem references without inode linkage. Symbolic links and sockets were formally incorporated into the POSIX standard as file types in POSIX.1-2001 (IEEE Std 1003.1-2001).^[9] Concurrently, AT&T's System V Unix in 1983 provided FIFOs (named pipes) as special files in the filesystem for unidirectional interprocess data streams using standard I/O calls.^[10] Berkeley Software Distribution variants in the 1980s extended this with socket files via the sockets API in 4.2BSD, enabling local domain communication akin to network sockets but within the filesystem namespace.^[11] In the 1990s, minor extensions appeared in advanced filesystems, such as whiteout files in union mounts from 4.4BSD-Lite (1994), which act as opaque markers to hide underlying files in overlaid namespaces, influenced by Plan 9's union directory concepts for resource unification.^[12] Standards bodies like IEEE and The Open Group have maintained consistency through ongoing POSIX revisions, with POSIX.1-2024 (published 2024 by IEEE, with ISO/IEC approval in 2025) ensuring these file types remain foundational for portability in modern systems including Linux, BSD, and macOS, without altering core semantics but enhancing related APIs for real-time and networking integration.^[13]

Determining File Types

Mode Bits Structure

In Unix-like systems, the mode bits are stored in a 16-bit field within the inode structure, which serves as the core metadata for filesystems such as those defined by POSIX. This field, often represented as mode_t or i_mode in kernel structures, is divided into distinct components: the highest 4 bits (bits 12–15, masked by S_IFMT or 0xF000) encode the file type; the next 3 bits (bits 9–11) hold special flags such as setuid (S_ISUID, 0x0800), setgid (S_ISGID, 0x0400), and sticky (S_ISVTX, 0x0200); and the lowest 9 bits (bits 0–8, masked by 0x01FF) specify the access permissions for owner, group, and others (each with read, write, and execute bits).^[3] The file type is determined by specific values in the high 4 bits, standardized across POSIX-compliant systems including Linux and BSD variants. These values are defined as follows:

Macro	Octal Value	Hex Value	Description
S_IFREG	0100000	0x8000	Regular file
S_IFDIR	0040000	0x4000	Directory
S_IFBLK	0060000	0x6000	Block device
S_IFCHR	0020000	0x2000	Character device
S_IFLNK	0120000	0xA000	Symbolic link
S_IFIFO	0010000	0x1000	Named pipe (FIFO)
S_IFSOCK	0140000	0xC000	Socket

These constants are used by macros like S_ISREG(mode), which test the type by applying the S_IFMT mask and comparing the result.^[3] The kernel relies on these mode bits to enforce type-specific behaviors during system calls such as open() and stat(). For instance, in the virtual file system (VFS) layer, the file type influences the selection of inode operations (i_op) and permission checks; attempting to open a directory for writing (e.g., via O_WRONLY) results in an EISDIR error, as the kernel dispatches to directory-specific handlers that prohibit direct writes to prevent filesystem corruption. Similarly, stat() populates the st_mode field with the full inode mode, allowing user-space applications to inspect the type. This bit-level representation keeps type validation cheap in kernel paths.^[4]

While the core structure adheres closely to POSIX in Unix and Linux implementations, non-POSIX systems like Plan 9 introduce variations: Plan 9 employs a 32-bit mode field where high bits (e.g., 0x80000000 for directories, 0x40000000 for append-only files) encode types and additional attributes, diverging from the 4-bit type encoding while retaining Unix-compatible permission bits in the low 9 bits.
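The bit layout described above can be sketched with Python's standard stat module, which mirrors the <sys/stat.h> constants. The helper names split_mode and type_name are illustrative only and not part of any standard API.

```python
import stat

# Map each type constant from the table above to a readable name.
TYPE_NAMES = {
    stat.S_IFREG: "regular file",
    stat.S_IFDIR: "directory",
    stat.S_IFBLK: "block device",
    stat.S_IFCHR: "character device",
    stat.S_IFLNK: "symbolic link",
    stat.S_IFIFO: "named pipe (FIFO)",
    stat.S_IFSOCK: "socket",
}

def split_mode(mode):
    """Split a st_mode value into (type bits, special bits, permission bits)."""
    file_type = stat.S_IFMT(mode)   # applies the 0o170000 mask (bits 12-15)
    special = mode & 0o7000         # setuid, setgid, sticky (bits 9-11)
    perms = mode & 0o777            # rwx for owner, group, others (bits 0-8)
    return file_type, special, perms

def type_name(mode):
    """Return a readable name for the file type encoded in mode."""
    return TYPE_NAMES.get(stat.S_IFMT(mode), "unknown")

if __name__ == "__main__":
    t, s, p = split_mode(0o104755)
    print(oct(t), oct(s), oct(p), type_name(0o104755))
```

For the mode 0104755, the three parts come out as 0100000 (regular file), 04000 (setuid), and 0755.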

Inspection Methods

In Unix systems, file types can be inspected using various command-line tools that query the file's metadata or content. The ls utility, when invoked with the -l option, displays a long listing format where the first character of each entry indicates the file type: for instance, a hyphen (-) denotes a regular file, while d signifies a directory.^[14] This representation derives from the file's mode bits, providing a quick visual summary without additional computation.^[14] The file command offers a more comprehensive analysis by examining both the file's metadata and its contents, employing a database of magic numbers—specific byte sequences at the file's beginning—to identify formats like ELF executables, supplemented by heuristics for cases without distinctive signatures.^[15] For example, it might classify a binary as "ELF 64-bit LSB executable" based on the header, extending beyond simple mode-based typing.^[15] Meanwhile, the stat command retrieves detailed inode information, including the numeric mode value (e.g., in octal), which encodes the file type within its higher bits.^[16] At the programmatic level, the POSIX stat() and lstat() system calls populate a struct stat with file status details, where the st_mode field holds the mode bits; the S_IFMT mask extracts the file type portion for further testing.^[17] Developers commonly use convenience macros defined in <sys/stat.h> to check types directly, such as S_ISREG(st_mode) for regular files, S_ISDIR(st_mode) for directories, or S_ISLNK(st_mode) for symbolic links, enabling conditional logic in applications.^[17] A key limitation arises with symbolic links: stat() follows the link to the target file, potentially masking the link's own type, whereas lstat() examines the link itself to reveal its S_IFLNK type. 
Additionally, while mode bits provide structural typing, tools like file address content-based identification, such as distinguishing compressed archives from plain binaries, which mode alone cannot discern.^[15]
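The difference between stat() and lstat() on a symbolic link can be illustrated with a short sketch. It uses Python's os.stat and os.lstat, which wrap the POSIX calls; the file names and the function name stat_vs_lstat_demo are hypothetical.

```python
import os
import stat
import tempfile

def stat_vs_lstat_demo():
    """Create a file and a symlink to it, then compare what stat and lstat report."""
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "target.txt")
        link = os.path.join(d, "link")
        with open(target, "w") as f:
            f.write("hello\n")
        os.symlink(target, link)

        followed = os.stat(link).st_mode    # follows the link to the target
        itself = os.lstat(link).st_mode     # examines the link itself

        return {
            "stat sees regular file": stat.S_ISREG(followed),
            "stat sees symlink": stat.S_ISLNK(followed),
            "lstat sees symlink": stat.S_ISLNK(itself),
        }

if __name__ == "__main__":
    for label, value in stat_vs_lstat_demo().items():
        print(f"{label}: {value}")
```

Only the lstat result carries the S_IFLNK type, so code that needs to detect links has to call lstat rather than stat.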

Representations

Numeric (Octal) Notation

In Unix systems, the numeric notation for file types and modes employs octal (base-8) encoding to represent the 16-bit st_mode field from the stat structure, with the file type encoded in bits 12-15 and permissions plus special bits in bits 0-11. This full mode is typically expressed as a multi-digit octal number, where the leading digits encode the file type and special bits, followed by three digits for owner, group, and other permissions (each ranging from 0 to 7, summing read=4, write=2, execute=1). For instance, 0644 denotes a regular file with owner read/write (6) and group/other read-only (4 each), while the complete mode including type might appear as 0100644 for such a regular file.^[4] The file type is determined by specific octal values within the S_IFMT mask (0170000), which isolates the type bits. These values are as follows:

File Type	Octal Value	Description
FIFO	0010000	Named pipe
Character Device	0020000	Character special file
Directory	0040000	Directory
Block Device	0060000	Block special file
Regular File	0100000	Ordinary file
Symbolic Link	0120000	Symbolic link
Socket	0140000	Socket

When a full mode is written out, the type contributes the higher digits: 0140000, for example, is a socket with no permission bits set, while 0120777 is a symbolic link with all permission bits set. This enables low-level distinction in system calls or scripting.^[4] The notation is commonly used with the chmod command to set permissions precisely: an octal argument like 4755 marks a file as setuid (4), readable, writable, and executable by its owner (7), and readable and executable by group and others (5 each), giving a full mode of 0104755 when the file is a regular file. chmod changes only the permission and special bits; it never changes the file type. Similarly, umask accepts an octal mask (e.g., 022) that restricts default permissions during file creation: the bits set in the mask are cleared from the requested mode (requested AND NOT mask), so a request for 0666 under a umask of 022 yields 0644. In scripting, octal literals (prefixed with 0, like 0644) allow direct manipulation via bitwise operations in languages like C or shell arithmetic.^[18] The octal format's advantages lie in its compactness for automated tools and scripts, where a single integer succinctly captures the entire mode, and its historical roots in early Unix implementations, such as Version 6, where chmod processed octal arguments to update inode modes via system calls. This precision facilitates low-level operations like compiling mode checks in kernel code or portable scripting across POSIX systems.^[19]
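The octal arithmetic can be worked through in a few lines. This is a minimal sketch using Python's stat constants; full_mode and apply_umask are illustrative helper names, and apply_umask only models the masking step the kernel performs at file creation.

```python
import stat

def full_mode(file_type, perms):
    """Combine a type constant with permission and special bits."""
    return file_type | perms

def apply_umask(requested, mask):
    """Clear from the requested mode every bit that is set in the mask."""
    return requested & ~mask & 0o7777

if __name__ == "__main__":
    # Regular file, setuid, rwxr-xr-x
    print(oct(full_mode(stat.S_IFREG, 0o4755)))
    # Socket with no permission bits set
    print(oct(full_mode(stat.S_IFSOCK, 0)))
    # Typical defaults under umask 022: files from 0666, directories from 0777
    print(oct(apply_umask(0o666, 0o022)), oct(apply_umask(0o777, 0o022)))
```

The masking is an AND with the complement of the umask, which is why 022 removes group and other write permission without touching the remaining bits.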

Symbolic Notation

Symbolic notation provides a human-readable representation of Unix file types and permissions, commonly displayed in tools like the ls command with the long format option. This notation consists of a 10-character string where the first character indicates the file type, followed by three groups of three characters each representing read (r), write (w), and execute (x) permissions for the owner, group, and others, respectively, with hyphens (-) denoting absent permissions.^[20] Special permission bits modify the execute positions: uppercase 'S' or 'T' for setuid/setgid or sticky bit without execute permission, and lowercase 's' or 't' when execute is also set.^[20] An optional eleventh character, such as '+', may indicate additional access control attributes like ACLs, depending on the implementation.^[20]

The file type indicators in this notation are standardized as follows: '-' for regular files, 'd' for directories, 'l' for symbolic links, 'b' for block devices, 'c' for character devices, 'p' for named pipes (FIFOs), and 's' for sockets, though some systems extend this list.^[20] For example, a regular executable file might appear as -rwxr-xr-x, indicating owner read/write/execute, group and others read/execute, while a directory with sticky bit could show drwxrwxrwt.^[20] This format derives from the file mode structure obtained via system calls like stat(), where tools parse the mode bits to construct the symbolic string for display.^[1]

Symbolic notation extends to modifying permissions via the chmod command's symbolic mode, using a syntax of [who] [operator] [permissions], where 'who' specifies u (user/owner), g (group), o (others), or a (all); operators are + (add), - (remove), or = (set exactly); and permissions include r (read), w (write), x (execute), s (setuid/setgid), or t (sticky bit).^[21] For instance, u+x,g+rx adds execute to the owner and read/execute to the group, while the 'X' permission conditionally adds execute only to directories or existing executables.^[21] This mode applies to permissions on existing files but does not alter file types, which are established during creation using specialized mechanisms for non-regular types like pipes or devices.^[21]

In GNU implementations, such as those in Linux coreutils, the ls tool enhances symbolic notation with color coding to visually distinguish file types when output is to a terminal, controlled by the --color option and the LS_COLORS environment variable, where directories might appear in blue and executables in green.^[22] BSD systems, like FreeBSD, adhere closely to POSIX but include extensions such as 'w' for whiteout files in the type indicator and display of MAC labels in long output, which are absent in standard GNU versions. These variations ensure compatibility while accommodating system-specific features, with GNU favoring user-friendly enhancements like colors and BSD emphasizing security-oriented additions.^[22] Unlike numeric octal notation used for programmatic settings, symbolic notation prioritizes readability for interactive use.^[21]
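The mapping from mode bits to the 10-character string can be reproduced with stat.filemode from Python's standard library, which builds the same kind of string that ls -l prints. The sample modes below are chosen for illustration, and symbolic_examples is a hypothetical helper name.

```python
import stat

def symbolic_examples():
    """Return the symbolic strings for a few representative full modes."""
    samples = [
        0o100755,  # regular file, rwxr-xr-x
        0o104755,  # regular file, setuid with owner execute
        0o104644,  # regular file, setuid without owner execute
        0o041777,  # directory with sticky bit, world-writable
        0o120777,  # symbolic link
        0o010644,  # FIFO
    ]
    return [stat.filemode(m) for m in samples]

if __name__ == "__main__":
    for text in symbolic_examples():
        print(text)
```

The third sample shows the uppercase 'S' that appears when setuid is set but the owner execute bit is not.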

Core File Types

Regular Files

Regular files in Unix systems serve as the primary mechanism for storing arbitrary data, functioning as ordinary containers of unstructured byte sequences without any system-imposed organization beyond their size and access offsets. According to the POSIX standard, a regular file is defined as a randomly accessible sequence of bytes, distinguishing it from other file types that may have specialized structures or behaviors.^[23] This design principle allows regular files to act as the default type for most user-generated content, accommodating everything from simple text to complex binary data in a uniform manner. Regular files are created using standard utilities that initialize them as empty or populated byte streams. The touch utility creates an empty regular file if the target does not exist, typically to establish a new file or update its timestamps. The cp utility duplicates an existing file's contents into a new regular file, preserving the source's data while assigning a distinct inode.^[24] Shell redirection, such as echo "data" > filename, also generates a regular file by writing output to it, overwriting any prior contents if present. These files are identified in the file system by a mode value of 0100000 in octal, corresponding to the S_IFREG constant, and by the '-' indicator in the first position of the permission string output by ls -l. Access to regular files occurs through sequential read and write operations, managed via file descriptors and offsets that enable both linear traversal and random positioning within the byte stream. If the execute bit is set in the file permissions, a regular file can be invoked as an executable, supporting use as shell scripts or machine code binaries loaded into memory for execution.^[23] Regular files further support hard links, where multiple directory entries reference the same inode, allowing shared access to identical data blocks without duplication. 
Common applications of regular files include text documents for configuration or logging, image and media files for storage, and program binaries for software distribution; they represent the predominant file type in Unix file systems, handling the bulk of persistent data storage.^[25]
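The hard-link behavior described above, several directory entries sharing one inode, can be observed directly. The following sketch assumes a filesystem that supports hard links in the temporary directory; the file names and the function name hard_link_demo are illustrative.

```python
import os
import stat
import tempfile

def hard_link_demo():
    """Create a regular file, hard-link it, and compare the two names."""
    with tempfile.TemporaryDirectory() as d:
        first = os.path.join(d, "a.txt")
        second = os.path.join(d, "b.txt")
        with open(first, "w") as f:
            f.write("shared data\n")
        os.link(first, second)              # second name for the same inode

        st_a, st_b = os.stat(first), os.stat(second)
        with open(second) as f:
            content = f.read()

        return {
            "is_regular": stat.S_ISREG(st_a.st_mode),
            "same_inode": st_a.st_ino == st_b.st_ino,
            "link_count": st_a.st_nlink,
            "content_via_second_name": content,
        }

if __name__ == "__main__":
    print(hard_link_demo())
```

Both names report the same inode number and a link count of 2, and data written through one name is readable through the other.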

Directory Files

Directory files in Unix function as specialized containers that organize filesystem entries by mapping filenames to corresponding inode numbers, forming the hierarchical structure of the file system. These files are distinguished by a specific mode bit pattern of 0040000 in octal representation, which identifies them as directories within the inode's mode field. In the output of the ls -l command, directories are prefixed with a 'd' to indicate their type, differentiating them from other file types.^[26] The internal structure of a directory consists of a sequence of directory entries, each comprising a filename paired with an inode number that points to the associated file's metadata. In the ext4 filesystem, these entries follow a variable-length format defined by the struct ext4_dir_entry, where filenames are limited to 255 bytes, and the total entry size reaches up to 263 bytes to accommodate the name, inode reference, and metadata like file type flags. To support the filesystem hierarchy, every directory automatically includes two reserved entries: "." (a self-reference to the current directory's inode) and ".." (a reference to the parent directory's inode), ensuring navigability without explicit user management.^[27]^[28] Unlike regular files, directories cannot be treated as sources of arbitrary user data. Early Unix versions let read() return the raw directory entries, but modern systems such as Linux reject read() on a directory file descriptor with EISDIR, and programs enumerate entries through readdir() instead. Conversely, using a non-directory where a path component must be a directory fails with ENOTDIR. Key operations on directories include listing contents via ls (requiring read permission), traversing into them with cd (requiring execute permission), and creating subentries with mkdir (requiring write and execute permission on the parent directory).
Directory permissions are uniquely interpreted: the read bit (r) controls visibility of entries (e.g., via ls), the execute bit (x) allows path traversal and access to contained items (e.g., via cd), and the write bit (w), combined with execute, permits adding, removing, or renaming entries within the directory.^[29] Directories impose specific limitations to maintain filesystem integrity: opening a directory for writing is not permitted and fails with an EISDIR error, as modifications must go through dedicated system calls like mkdir or unlink that update the entry mappings safely. Capacity limits vary across filesystems. UFS-family filesystems are the usual example: the inode link count caps the number of subdirectories in one directory, at about 32K in traditional implementations and around 64K in updated versions such as FreeBSD's UFS2.^[30]
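The name-to-inode mapping and the EISDIR restriction can both be seen from user space. This is a minimal sketch: os.scandir wraps the readdir interface (and omits the "." and ".." entries), and directory_demo is a hypothetical helper name.

```python
import errno
import os
import tempfile

def directory_demo():
    """List a directory's entries with inode numbers, then try to open it for writing."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "a.txt")
        open(path, "w").close()

        # Each entry pairs a name with an inode number.
        with os.scandir(d) as entries:
            listing = {e.name: e.inode() for e in entries}
        inode_matches = listing["a.txt"] == os.stat(path).st_ino

        # Opening a directory for writing is rejected by the kernel.
        try:
            open(d, "w").close()
            error = None
        except OSError as exc:
            error = errno.errorcode[exc.errno]

        return sorted(listing), inode_matches, error

if __name__ == "__main__":
    print(directory_demo())
```

The write attempt fails at open time, before any data could reach the directory's blocks.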

Symbolic Links

Symbolic links, also known as soft links, are special files in Unix-like systems that contain a string representing the pathname of another filesystem object, such as a file or directory, allowing indirect referencing without duplicating data.^[31]^[32] These links are identified by a file mode of 0120000 in octal notation or by the 'l' character in the output of the ls command.^[32] They are created using the ln -s command, which specifies the target pathname and the name for the new link, for example: ln -s /path/to/target linkname.^[33]^[32] In terms of behavior, symbolic links remain valid files even if their target becomes inaccessible, resulting in what are known as dangling links.^[34] During path resolution, the kernel automatically follows symbolic links to reach the target unless the lstat() system call is used, which retrieves information about the link itself rather than the target; in contrast, stat() dereferences the link. The stored pathname can be relative, resolved from the directory containing the link, or absolute, starting from the root filesystem; relative paths maintain portability when the link and target are moved together within the same directory structure.^[32] Unlike the target, a symbolic link has its own distinct inode and does not share storage or attributes with the referenced object.^[32] Symbolic links are commonly used as shortcuts to simplify access to frequently referenced files or directories and in system configuration, such as the /etc/alternatives mechanism, which manages multiple versions of commands like Java by dynamically updating symlinks to point to the selected implementation. 
They can form cycles, where a link points back to itself or creates a loop, but this is risky as it may trigger an ELOOP error during path resolution after encountering too many links, potentially leading to infinite loops in tools that traverse the filesystem.^[31] In contrast to hard links, symbolic links store only the target pathname as their content rather than referencing the actual data, enabling them to span different filesystems and link to directories, though this makes them prone to breaking if the target moves or is deleted.^[32]
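Dangling links and link cycles are easy to reproduce. The sketch below uses os.symlink, os.readlink, and os.stat from Python's standard library; the link names and the function name symlink_demo are illustrative, and the reported error name assumes a Linux-like system.

```python
import errno
import os
import tempfile

def symlink_demo():
    """Create a dangling link and a self-referencing link and inspect both."""
    with tempfile.TemporaryDirectory() as d:
        dangling = os.path.join(d, "dangling")
        os.symlink("no-such-target", dangling)   # relative target that never exists
        stored_path = os.readlink(dangling)      # the link's own content
        target_exists = os.path.exists(dangling) # follows the link
        link_exists = os.path.lexists(dangling)  # does not follow the link

        loop = os.path.join(d, "loop")
        os.symlink("loop", loop)                 # resolves to itself, forever
        try:
            os.stat(loop)
            loop_error = None
        except OSError as exc:
            loop_error = errno.errorcode[exc.errno]

        return stored_path, target_exists, link_exists, loop_error

if __name__ == "__main__":
    print(symlink_demo())
```

The dangling link still exists as a filesystem object and still stores its pathname; only resolution through it fails.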

Device Files

Device files, also known as special files, serve as interfaces to hardware devices in Unix-like operating systems, allowing processes to interact with peripherals through the standard file I/O operations.^[35] There are two primary types: block special files, identified by the mode bit S_IFBLK (octal 0060000) and denoted by 'b' in listings, which support buffered, random-access I/O suitable for storage devices like disks; and character special files, identified by S_IFCHR (octal 0020000) and denoted by 'c', which provide unbuffered, stream-oriented I/O for devices such as terminals or printers.^[35]^[3] Device files are created using the mknod system call or command, which requires specifying the file type, major number (identifying the device driver), and minor number (distinguishing specific instances of the device).^[36] For example, the block device file /dev/sda for the first SCSI disk is typically created as mknod /dev/sda b 8 0, where 8 is the major number for SCSI disks and 0 is the minor number.^[37] In traditional Unix systems, these files are statically defined in the /dev directory, but modern implementations employ dynamic mechanisms like udev on Linux to automatically generate and manage device files upon hardware detection. Operations on device files translate directly to hardware interactions: reads and writes invoke the associated device driver rather than manipulating file contents, and seeking may or may not be supported depending on the device type. Unlike regular files, device files have no inherent size; the st_size field in the stat structure is typically 0, as they do not store data but represent infinite or device-specific streams. 
Permissions on device files function similarly to those on other files, controlling read and write access to the underlying hardware (the execute bits have no useful meaning for a device), with the owner often set to root for security.^[35] Common examples include /dev/null, a character special file that on Linux has major 1 and minor 3; it discards everything written to it and returns EOF on reads, which makes it useful for suppressing output.^[37] Legacy IDE disks on Linux used files like /dev/hda, a block special file with major 3 and minor 0 that gives buffered access to the whole first drive, with its partitions exposed through higher minor numbers such as /dev/hda1.^[38]
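The major and minor numbers of a device file can be read from the st_rdev field of its stat structure. This sketch uses os.major and os.minor from Python's standard library; device_info is an illustrative name, and the numbers printed depend on the operating system.

```python
import os
import stat

def device_info(path):
    """Return (kind, major, minor) for a device file, or None for anything else."""
    st = os.stat(path)
    if stat.S_ISCHR(st.st_mode):
        kind = "character"
    elif stat.S_ISBLK(st.st_mode):
        kind = "block"
    else:
        return None
    # st_rdev packs the driver (major) and instance (minor) numbers together.
    return kind, os.major(st.st_rdev), os.minor(st.st_rdev)

if __name__ == "__main__":
    print(device_info("/dev/null"))   # major/minor values are OS-specific
    print(device_info("/"))           # a directory, so None
```

On Linux the first call is expected to report the major 1, minor 3 pair mentioned above for /dev/null; other systems assign different numbers.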

IPC Files

IPC files in Unix systems are special file types designed to facilitate interprocess communication (IPC) between processes on the same machine, appearing as entries in the filesystem but serving as channels for data exchange rather than storing persistent data. These include named pipes, known as FIFOs (first-in, first-out), and Unix domain sockets. Unlike regular files, IPC files do not retain data after transmission; instead, they provide mechanisms for processes to send and receive byte streams or datagrams in real time. They are identified in the ls -l output by the file type indicator 'p' for FIFOs and 's' for sockets, and their mode bits are set to 0010000 for FIFOs (S_IFIFO) and 0140000 for sockets (S_IFSOCK).^[4]^[22]

Named pipes, or FIFOs, provide a simple, unidirectional byte-stream channel between unrelated processes. They are created using the mkfifo command or the mkfifo(3) library function, which establishes a filesystem entry that processes can open for reading (O_RDONLY) or writing (O_WRONLY) via open(2). By default the open itself blocks until the other end is opened: a writer waits for a reader and vice versa, which synchronizes the two sides unless non-blocking mode (O_NONBLOCK) is specified. FIFOs are particularly useful for scenarios like extending shell pipelines to unrelated processes, where one process writes output to the FIFO and another reads it as input, mimicking the behavior of anonymous pipes but with a persistent name. Permissions on FIFOs function similarly to regular files, controlling access based on user, group, and other bits, though the primary operations revolve around opening the ends rather than direct file manipulation.^[39]^[40]^[41]

Unix domain sockets, created via the socket(2) system call with the AF_UNIX address family, offer more versatile bidirectional communication, supporting stream (SOCK_STREAM), datagram (SOCK_DGRAM), and sequenced packet (SOCK_SEQPACKET) types.
A server process binds a socket to a filesystem pathname using bind(2), allowing clients to connect via connect(2), after which data can be exchanged efficiently without network overhead. These sockets support advanced features such as passing process credentials (PID, UID, GID) using ancillary data like SCM_CREDENTIALS or socket options such as SO_PASSCRED, enabling secure authentication between processes. They are commonly used in local IPC protocols, for example, the X Window System communicates with the X server over a Unix domain socket at /tmp/.X11-unix/X0, and D-Bus employs them for inter-application messaging on Linux desktops. Permissions apply to the bound pathname, similar to files, but emphasize bind and connect operations for access control.^[42]^[43]^[44]^[45] Both IPC file types exhibit no data persistence, as transmitted bytes are not stored on disk and are discarded upon reading, with the channel closing when the last process reference is released via close(2). However, the filesystem entry persists until explicitly removed with unlink(2) or rm, preventing reuse without cleanup. FIFOs are simpler, lacking addressing or multiple connection types, making them suitable for basic producer-consumer patterns, whereas Unix domain sockets provide greater flexibility with connection-oriented or connectionless modes, credential passing, and support for multiple concurrent connections, though at the cost of more complex setup involving bind and connect. Unlike device files, which interface with hardware, IPC files focus exclusively on software-mediated process communication.^[40]^[42]
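Both IPC file types can be created and inspected from a short program. The following is a sketch under simple assumptions: a writer thread stands in for a second process, the paths live in a temporary directory, and the function names fifo_roundtrip and socket_file_type are illustrative.

```python
import os
import socket
import stat
import tempfile
import threading

def fifo_roundtrip(message=b"data\n"):
    """Send bytes through a named pipe and report its type and the bytes received."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "pipe")
        os.mkfifo(path)
        is_fifo = stat.S_ISFIFO(os.stat(path).st_mode)

        def writer():
            # open() blocks here until the reader has opened the other end
            with open(path, "wb") as w:
                w.write(message)

        t = threading.Thread(target=writer)
        t.start()
        with open(path, "rb") as r:
            received = r.read()     # returns at EOF, once the writer closes
        t.join()
        return is_fifo, received

def socket_file_type():
    """Bind a Unix domain socket to a pathname and check the resulting file type."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "sock")
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
            s.bind(path)            # bind() creates the filesystem entry
            return stat.S_ISSOCK(os.stat(path).st_mode)

if __name__ == "__main__":
    print(fifo_roundtrip())
    print(socket_file_type())
```

Nothing written to the FIFO is stored on disk; in this sketch the temporary directory cleanup plays the role of the explicit unlink that a long-running server would need.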

Practical Usage

Command Examples

To demonstrate the creation and inspection of regular files in Unix systems, the touch command can be used to create an empty file or update timestamps on an existing one. For instance, executing touch file.txt creates a new regular file named file.txt if it does not exist.^[46] Viewing the file with ls -l file.txt displays output such as -rw-r--r-- 1 user group 0 Nov 11 12:00 file.txt, where the leading - indicates a regular file, followed by permissions and other metadata.^[22]

For directories, the mkdir command creates a new directory entry. The command mkdir dir establishes a directory named dir with default permissions.^[47] Inspecting it via ls -ld dir yields output like drwxr-xr-x 2 user group 4096 Nov 11 12:00 dir, with the leading d denoting a directory file type.^[22]

Symbolic links are created using the ln command with the -s option. Running ln -s target link produces a symbolic link named link pointing to the target file or directory.^[33] The ls -l link command shows details such as lrwxrwxrwx 1 user group 6 Nov 11 12:00 link -> target, where l signifies the symbolic link type and the arrow indicates the referenced path.^[22]

Device files, such as character special files, are generated with the mknod command, typically requiring superuser privileges. The invocation sudo mknod /dev/mydev c 100 0 creates a character device file /dev/mydev with major number 100 and minor number 0.^[48] Listing it with ls -l /dev/mydev results in output like crw-r--r-- 1 root root 100, 0 Nov 11 12:00 /dev/mydev, prefixed by c to denote a character device.^[22]

FIFO (named pipe) files are made using mkfifo, which creates a special file for inter-process communication.
Executing mkfifo pipe establishes a FIFO named pipe.^[49] The ls -l pipe command displays prw-r--r-- 1 user group 0 Nov 11 12:00 pipe, with p indicating the FIFO type.^[22] For usage, one process can write data with echo "data" > pipe & while another reads it via cat pipe, blocking until both ends are open to facilitate unidirectional data flow.^[49] Unix domain sockets, used for local inter-process communication, are not directly created via a simple command like the others but can be viewed and managed with tools such as ss. The command ss -lx lists listening Unix sockets, showing entries like u_str LISTEN 0 128 /tmp/mysocket * 0, where the u_str prefix identifies the Unix stream socket type.^[50] These sockets appear as file types with a leading s in ls -l output, such as srw-rw-rw- 1 user group 0 Nov 11 12:00 /tmp/socket.^[22]

System Implications

The Unix file type system enforces strict separation of behaviors, enhancing security by preventing operations inappropriate to a file's nature. For instance, the kernel prohibits executing non-regular files, such as device files, to avoid unintended code execution on hardware interfaces; attempting to run a device file via execve(2) results in EACCES, as only regular files with execute permissions are permitted. This type-based enforcement mitigates risks like privilege escalation through misconfigured or malicious devices. Additionally, symbolic links introduce vulnerabilities such as time-of-check-to-time-of-use (TOCTOU) races, where an attacker swaps a link between a safety check (e.g., lstat) and use (e.g., open), potentially leading to unauthorized access or data corruption; over 1,300 CVEs stem from such symlink issues.^[51] Mitigations include the O_NOFOLLOW flag in open(2), which fails if the final path component is a symlink, and openat(2) for relative operations on directory file descriptors to eliminate races.^[52] Performance benefits arise from type-specific optimizations tailored to each file's role. Directories in the ext4 filesystem employ hashed B-trees (enabled via the dir_index feature) to accelerate name lookups in large directories, reducing traversal time from linear to logarithmic complexity and supporting directories with millions of entries efficiently.^[53] Device files, particularly block devices, support direct I/O via the O_DIRECT flag, bypassing the kernel's page cache to minimize memory overhead and latency for high-throughput applications like databases, where buffered I/O could introduce unnecessary copying. 
The file type model also shapes filesystem design, particularly through abstractions like the Linux Virtual File System (VFS) layer, which standardizes operations across types via inode and file operation structures; to integrate with it, a filesystem supplies type-specific inode_operations methods (e.g., mknod for device nodes, symlink for symbolic links).^[54] The same model leaves room for extensions such as Solaris doors, a special file type that uses file descriptors for lightweight RPC: clients invoke server procedures via door_call(3), which provides efficient inter-process communication without sockets while reusing the Unix security model through descriptor inheritance.^[55]
In modern environments, containerization technologies like Docker rely on kernel namespaces rather than changing file type semantics. Each container receives an isolated mount namespace that presents its own filesystem view, so host files, including device nodes and sockets, are visible only if explicitly bind-mounted into the container.^[56]
Auditing tools such as auditd further support security by logging file attribute changes and node creation, for example permission changes made with chmod and special files created with mknod. The file type itself is fixed when the file is created and cannot be altered by chmod. Events are recorded in /var/log/audit/audit.log with details such as the syscall, path, and user ID for forensic analysis.^[57]

References

[1]
<sys/stat.h>
Summary of file types from the st_mode field in <sys/stat.h>
[2]
File System
UNIX philosophy: Everything is a file. Plain file; Directory; Block device ... UNIX file systems are organized as directed acyclic graphs (DAGs). Think of ...
[3]
inode(7) - Linux manual page - man7.org
The file type and mode The stat.st_mode field (for statx(2), the statx.stx_mode field) contains the file type and mode. POSIX refers to the stat.st_mode ...
[4]
stat(2) - Linux manual page - man7.org
These functions return information about a file, in the buffer pointed to by statbuf. No permissions are required on the file itself.
[5]
[PDF] IEEE standard portable operating system interface for computer ...
It defines the applications interface to basic system services for input-output, file system access, and process management. It also defines a format for data ...
[6]
Evolution of the Unix Time-sharing System - Nokia
This paper presents a technical and social history of the evolution of the system. Origins. For computer science at Bell Laboratories, the period 1968-1969 was ...
[7]
Unix and Multics
Jul 10, 2025 · There was some influence in the other direction in the 70s and 80s. For example, Multics "master directories" work very much like Unix mount ...
[8]
[PDF] The UNIX Programmer's Manual for the UNIX Time-Sharing System
Jan 16, 1979 · There are a few differences between this printing of the UNIX Programmer's Manual for the Seventh Edition of the UNIX time-sharing system and ...
[9]
[PDF] Bug fixes and changes in 4.2BSD July 28, 1983 - RogueLife.org
Jul 28, 1983 · Symbolic links provide a ''symbolic referencing'' mechanism similar to that found in Multics. They are interpolated during pathname expansion ...
[10]
Interprocess Communication in the Ninth Edition Unix System - Nokia
System V also provides named pipes (FIFOs). Their names reside in the file system, and ordinary I/O operations apply to them. They can provide a convenient ...
[11]
Chapter 7. Sockets | FreeBSD Documentation Portal
BSD sockets take interprocess communications to a new level. It is no longer necessary for the communicating processes to run on the same machine.
[12]
Union mounts in 4.4BSD-Lite - FreeBSD Presentations and Papers
This paper describes the design and rationale behind union mounts, a new filesystem-namespace management tool available in 4.4BSD-Lite.
[13]
POSIX™ 1003.1 Frequently Asked Questions (FAQ Version 1.18)
May 25, 2025 · This is the Frequently Asked Questions file for the POSIX 1003.1 standard (IEEE Std 1003.1). Its maintainer is Andrew Josey (ajosey at The Open Group).
[14]
ls - The Open Group Publications Catalog
The ls utility shall detect infinite loops; that is, entering a previously visited directory that is an ancestor of the last file encountered. When it detects ...
[15]
file(1) - Linux manual page - man7.org
This manual page documents version 5.46 of the file command. file tests each argument in an attempt to classify it. There are three sets of tests, performed in ...
[16]
stat(1) - Linux manual page - man7.org
See MODE below -c --format=FORMAT use the specified FORMAT instead of the default; output a newline after each use of FORMAT --printf=FORMAT like --format, but ...
[17]
<sys/stat.h>
The <sys/stat.h> header shall define the structure of the data returned by the functions fstat(), lstat(), and stat(). The stat structure shall contain at ...
[18]
chmod(1) - Linux manual page - man7.org
This manual page documents the GNU version of chmod. chmod changes the file mode bits of each given file according to mode, which can be either a symbolic ...
[19]
[PDF] UNIX OPERATING SYSTEM SOURCE CODE LEVEL SIX
It contains a specially edited selection of the UNIX Operating System source code, such as might be used on a typical PDP11/40 computer installation. The ...
[20]
ls
Summary of symbolic notation for file types and permissions in ls command output
[21]
chmod
Summary of symbolic mode for chmod
[22]
ls(1) - Linux manual page - man7.org
With --color=auto, ls emits color codes only when standard output is connected to a terminal. The LS_COLORS environment variable can change the settings. Use ...
[23]
Definitions - The Open Group Publications Catalog
File types include regular file, character special file, block special file, FIFO special file, symbolic link, socket, and directory. Other types of files ...
[24]
cp - The Open Group Publications Catalog
The `cp` command copies files. It can copy a single file to a target, or multiple files to a directory, and can copy file hierarchies with the -R option.
[25]
https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_403
[26]
stat.h source code [glibc/sysdeps/unix/sysv/linux/bits/stat.h]
#define __S_IFMT 0170000 /* These bits determine file type. */ · /* File types. */ · #define __S_IFDIR 0040000 /* Directory. */.
[27]
4.3. Directory Entries — The Linux Kernel documentation
The original directory entry format is struct ext4_dir_entry , which is at most 263 bytes long, though on disk you'll need to reference dirent.rec_len to know ...
[28]
Directories and Links - Stanford University
Directories are stored on disk just like regular files (i.e. inode with 14 pointers, etc.) · Each directory contains <name, i-number> pairs in no particular ...
[29]
Filesystem permissions on Unix - Server Fault
Oct 20, 2009 · Directories have two different read permissions. You have the standard read permission, like you do with files. This stops you from doing an ...
[30]
13.3 UFS subdirectory (link) maximum increased to 65530
May 29, 2024 · And a couple of the files (dirnode.h and inode.h) have this comment: Increase UFS/FFS maximum link count from 32767 to 65530. Tested on ...
[31]
symlink
Summary of symlink from https://pubs.opengroup.org/onlinepubs/009695399/functions/symlink.html
[32]
symlink(7) - Linux manual page - man7.org
A symbolic link is a special type of file whose contents are a string that is the pathname of another file, the file to which the link refers. (The contents ...
[33]
ln(1) - Linux manual page - man7.org
Create hard links by default, symbolic links with --symbolic. By default, each destination (name of new link) should not already exist. When creating hard links ...
[34]
symlink(2) - Linux manual page - man7.org
Symbolic links are interpreted at run time as if the contents of the link had been substituted into the path being followed to find a file or directory. ...
[35]
<sys/stat.h>
Test for a pipe or FIFO special file. S_ISREG(m): Test for a regular file. S_ISLNK(m): Test for a symbolic link. The implementation may implement ...
[36]
mknod
The `mknod()` function creates a new file, including directories, special files, or regular files. The only portable use is to create a FIFO-special file.
[37]
mknod(2) - Linux manual page - man7.org
The mknod() system call creates a filesystem node (file, device special file, or named pipe) named path, with attributes specified by mode and dev.
[38]
Major and Minor Numbers - Linux Device Drivers, Second Edition ...
The major number identifies the driver associated with the device. For example, /dev/null and /dev/zero are both managed by driver 1.
[39]
mkfifo(3) - Linux manual page - man7.org
Linux manual page. NAME | LIBRARY | SYNOPSIS | DESCRIPTION ... mkfifo() POSIX.1-2001. mkfifoat() glibc 2.4. POSIX.1-2008. SEE ALSO top.
[40]
pipe(7) - Linux manual page - man7.org
A FIFO (short for First In First Out) has a name within the filesystem (created using mkfifo(3)), and is opened using open(2). Any process may open a FIFO, ...
[41]
open(2) - Linux manual page - man7.org
ENOTDIR (openat()) path is a relative pathname and dirfd is a file descriptor referring to a file other than a directory. ENXIO O_NONBLOCK | O_WRONLY is set, ...
[42]
unix(7) - Linux manual page - man7.org
Traditionally, UNIX domain sockets can be either unnamed, or bound to a filesystem pathname (marked as being of type socket).
[43]
socket(2) - Linux manual page - man7.org
The formats currently understood by the Linux kernel include: Name Purpose Man page AF_UNIX ... The operation of sockets is controlled by socket level options.
[44]
https://man7.org/linux/man-pages/man2/bind.2.html
[45]
D-Bus Specification - Freedesktop.org
D-Bus is a low-overhead, easy-to-use interprocess communication (IPC) system using a binary protocol, designed for same-machine IPC.
[46]
touch(1) - Linux manual page
Synopsis
[47]
mkdir(1) - Linux manual page
Synopsis
[48]
mknod(1) - Linux manual page
Summary of mknod(1) from https://man7.org/linux/man-pages/man1/mknod.1.html
[49]
mkfifo(1) - Linux manual page - man7.org
Create named pipes (FIFOs) with the given NAMEs. Mandatory arguments to long options are mandatory for short options too.
[50]
ss(8) - Linux manual page
Summary: viewing Unix sockets with ss -l
[51]
The trouble with symbolic links - LWN.net
Jul 7, 2022 · The solutions to the problems posed by symlinks led to substantial increases in the complexity of the APIs involved in working with pathnames.
[52]
Portably Solving File TOCTTOU Races with Hardness Amplification
Summary of TOCTTOU vulnerabilities with symlinks in Unix and portable solutions
[53]
ext4(5) - Linux manual page - man7.org
For this feature to be useful the inode size must be 256 bytes in size or larger. filetype This feature enables the storage of file type information in ...
[54]
Overview of the Linux Virtual File System
The Virtual File System (also known as the Virtual Filesystem Switch) is the software layer in the kernel that provides the filesystem interface to userspace ...
[55]
"Doors" in Solaris TM : Lightweight RPC using File Descriptors
A door is a "file" descriptor used to describe a procedure in a process and optionally some additional state associated with the procedure.
[56]
Docker Engine security - Docker Docs
Namespaces provide the first and most straightforward form of isolation. Processes running within a container cannot see, and even less affect, processes ...
[57]
7.6. Understanding Audit Log Files - Red Hat Documentation
The Audit system stores log entries in the /var/log/audit/audit.log file; if log rotation is enabled, rotated audit.log files are stored in the same directory.