
System call

A system call is the primary mechanism by which a computer program running in user space requests privileged services from the operating system's kernel, such as accessing hardware resources, managing processes, or performing input/output operations.^[1] This interface ensures that user applications cannot directly manipulate sensitive system components, thereby maintaining security and stability.^[2] System calls operate by invoking a special processor instruction—such as ecall in RISC-V or syscall in x86-64 architectures—that triggers a controlled transition from user mode to kernel mode.^[2]^[3] The kernel then validates the request, executes the operation using arguments passed through CPU registers, and returns control to the user program with the results, restoring the saved state.^[1] This process abstracts complex kernel functionalities into a standardized API, with each operating system defining its own set of available calls, such as POSIX-compliant ones in Unix-like systems.^[4] Common examples include fork() for creating child processes, exec() for loading new programs, read() and write() for file I/O, and waitpid() for process synchronization, all of which are essential for application development and system resource management.^[1] By encapsulating hardware interactions, system calls enable portability across architectures while enforcing isolation between user code and the kernel, preventing unauthorized access or crashes from propagating.^[2]

Fundamentals

Definition and Role

A system call is a programmatic mechanism by which a computer program running in user space requests a service from the operating system's kernel, such as hardware abstraction, resource allocation, or process management.^[5] These requests allow applications to access kernel-managed resources without needing to understand underlying hardware details, providing a standardized interface for system interactions.^[6] The core role of system calls is to enable the safe and controlled execution of privileged operations—like input/output (I/O) handling and memory management—while preventing user programs from gaining direct access to the kernel or hardware, which could compromise system stability.^[7] This mediation contrasts with non-operating system environments, where programs interact directly with hardware, often at the risk of errors or conflicts. System calls originated to enforce security boundaries and abstraction layers in operating systems, ensuring that only authorized operations can affect critical resources.^[8] In modern multitasking operating systems such as Unix, Linux, and Windows, system calls are essential for maintaining the distinction between user space (where applications run with limited privileges) and kernel space (a protected mode for system-wide operations), thereby supporting secure resource sharing among multiple processes.^[9] For instance, the Linux kernel 6.17 supports 356 system calls on the x86_64 architecture as of October 2025, illustrating the scale of services available while varying across operating systems and versions.^[10]
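As a small illustration of the user-space side of this interface, the following C sketch asks the kernel for the current process ID twice: once through the ordinary library wrapper and once through the generic syscall() entry point. It assumes a Linux system with glibc or a compatible C library; the function names pid_via_wrapper and pid_via_raw_syscall are hypothetical names chosen for this example.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* exposes syscall() in <unistd.h> on glibc */
#endif
#include <sys/syscall.h>       /* SYS_getpid */
#include <sys/types.h>
#include <unistd.h>

/* Usual route: the C library's wrapper function. */
pid_t pid_via_wrapper(void) {
    return getpid();
}

/* Same request, naming the system call by number through syscall(). */
pid_t pid_via_raw_syscall(void) {
    return (pid_t)syscall(SYS_getpid);
}
```

Both functions should report the same process ID, since each ends in the same kernel service; only the user-space path to the kernel differs.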

Historical Evolution

The concept of system calls emerged in the 1960s with early mainframe operating systems, where they served as a mechanism to transition from user mode to privileged supervisor mode for accessing kernel services. IBM's OS/360, released in 1966, introduced the Supervisor Call (SVC) instruction as a dedicated hardware mechanism to invoke these privileged operations, enabling controlled access to system resources like I/O and memory management while maintaining protection boundaries.^[11] This design was influenced by contemporary efforts in protected procedure calls, notably in the Multics operating system developed from 1965 onward, which used ring-based protection to enforce secure transitions during procedure invocations, laying groundwork for modular and secure system interfaces.^[12] In the 1970s, Unix on Digital Equipment Corporation's PDP-11 minicomputers standardized system call interfaces through trap instructions, simplifying the invocation of kernel services from user programs. The PDP-11's TRAP instruction triggered a software interrupt to enter the kernel, handling operations such as file access and process creation via a numbered table of services, which promoted a clean separation between user and kernel code.^[13] This approach carried over to Unix ports on VAX systems, where similar trap mechanisms ensured compatibility and efficiency in multi-user environments, influencing the evolution toward portable and standardized OS designs. The transition from low-level assembly macros—used initially to wrap these traps—to support in high-level languages like C further democratized system call usage, allowing developers to abstract hardware details while retaining direct kernel access. 
The 1980s saw the formalization of system call portability through POSIX standards, driven by the IEEE's P1003.1 effort starting in 1984 and culminating in the 1988 standard, which defined a core set of system calls for process, file, and signal management to enable application compatibility across Unix-like systems.^[14] Concurrently, Microsoft's MS-DOS relied on software interrupts like INT 21h to dispatch a wide range of services, from disk I/O to console output, providing a simple but non-portable interface for early personal computing. By the 1990s, Windows NT shifted to a more robust Native API, with functions prefixed Nt or Zw in ntdll.dll serving as the entry points for system calls into the kernel, supporting protected subsystems and enhancing stability over DOS's interrupt-based model.^[15]

Post-2000 developments in Windows emphasized security mitigations around system calls, including Kernel Patch Protection, first shipped in the x64 editions of Windows in 2005, to prevent unauthorized modification of kernel structures, and address space layout randomization in Windows Vista (2007) to make memory layouts harder for exploits to predict. In parallel, Linux expanded its syscall table to address performance and security needs; for instance, io_uring was introduced in kernel 5.1 (May 2019) as an asynchronous I/O interface using shared ring buffers to reduce syscall overhead for high-throughput applications.^[16] eBPF-related enhancements, building on the bpf() syscall available since kernel 3.18 but significantly extended in the 2020s, enabled programmable kernel hooks for networking and tracing without module loading.^[17] Post-2020, confidential computing integrations added kernel support, including ioctls and modules, for trusted execution environments, such as those supporting attestation and memory encryption in x86 virtualization under technologies like Intel TDX and AMD SEV-SNP.^[18]^[19]

Virtualization added a further layer over the same period: hypervisors like KVM and Xen introduced hypercalls, analogous to system calls but for guest-to-hypervisor communication, to manage virtual resources efficiently, with innovations like hyperupcalls reversing the flow for event notifications and improving scalability in cloud environments.^[20] These milestones collectively shifted system calls from hardware-specific traps to abstracted, secure, and performant interfaces supporting modern distributed systems.

Mechanism of Operation

Privilege Levels and Modes

System calls depend on hardware-enforced privilege levels to isolate user-mode processes from kernel operations, preventing direct manipulation of critical system resources and ensuring kernel integrity. In the x86 architecture, privilege levels are organized into four rings, with ring 0 reserved for kernel-mode execution—granting full access to hardware instructions and memory—and ring 3 for user-mode applications, which are restricted from privileged operations to avoid unauthorized access or system compromise.^[21] This ring-based model enforces protection by checking the current privilege level before allowing access to sensitive instructions, such as those modifying page tables or I/O ports.^[21] Comparable structures appear in other instruction set architectures to achieve similar isolation. ARM processors utilize exception levels, where EL0 executes unprivileged user-mode code for applications, while EL1 runs the kernel with elevated privileges to manage system resources securely.^[22] In RISC-V, user mode (U-mode) confines application execution to limited capabilities, supervisor mode (S-mode) supports operating system tasks like virtual memory management, and machine mode (M-mode) provides the highest privilege for low-level hardware control, all designed to block unauthorized resource access through mode-specific register and instruction restrictions.^[23] The transition to kernel mode during a system call temporarily elevates privileges, enabling controlled access to hardware while user mode remains barred from direct interactions, such as memory-mapped I/O or interrupt handling. 
This elevation occurs via architecture-specific mechanisms, like the SYSCALL instruction in x86 or exception raising in ARM, which switch the processor state without exposing kernel internals to user code.^[21]^[22] Privilege levels integrate with broader security models, including capability-based and access control list (ACL) approaches, where hardware modes underpin enforcement. Capability-based systems, pioneered by Dennis and Van Horn, rely on tamper-proof tokens that confer specific rights, with privilege rings ensuring only kernel-mode code can validate or revoke them.^[24] In contrast, ACL systems, as detailed in Saltzer's analysis of Multics, attach permission lists to objects, enforced by hardware privileges that restrict user-mode alterations to these lists.^[12] Core enforcement tools include page tables, which validate memory accesses against the current privilege level to prevent user-mode violations, and segment descriptors in x86, which specify access rights and boundaries per memory segment.^[21] Historical precedents include supervisor mode in IBM's System/360 architecture, which separated privileged kernel operations from unprivileged problem-state user programs, establishing early hardware-based protection for mainframe environments.^[25] Modern enhancements bolster these modes against evolving threats. Intel's Control-flow Enforcement Technology (CET), deployed in processors from the early 2020s, introduces shadow stacks to safeguard return addresses during privilege transitions in system calls, countering control-flow hijacking attacks without altering core ring semantics.^[26] Complementing this, Linux's seccomp framework filters system calls using BPF programs, allowing kernel-mode enforcement of user-defined restrictions to limit privilege elevations and reduce attack surfaces in contemporary deployments.^[27]

Context Switching and Kernel Entry

When a user program issues a system call, it executes a special instruction that triggers a trap or software interrupt. The processor switches to kernel mode and records the minimal state needed to resume the program, such as the program counter and processor flags; the remaining user-mode registers (including the stack pointer and general-purpose registers) are then saved on the kernel stack, either by the hardware or by the kernel's entry code depending on the architecture.^[28] The kernel's entry point routine reads the system call number and parameters from the user registers, validates them to prevent invalid or malicious requests, and dispatches the appropriate kernel handler, which executes on the per-process kernel stack.^[29] This process ensures that the kernel operates with elevated privileges while maintaining isolation from user space.

Unlike a full context switch during thread or process scheduling, which involves saving the entire process state and loading another, a system call entry reuses the invoking process's address space, page tables, and thread context, avoiding the higher overhead of scheduler involvement.^[30] Upon completion of the kernel work, control returns to user mode by restoring the saved registers and resuming execution at the point following the system call instruction, typically using specialized return mechanisms that efficiently handle the mode transition.^[31] The latency of this kernel entry and exit is a key performance factor, typically ranging from 50 to 700 CPU cycles on modern x86 processors for simple calls, influenced by factors like register save/restore operations and branch predictions.^[32]^[33] Optimizations, such as maintaining separate user and kernel stacks per process to minimize stack pointer adjustments and caching frequent validation paths, help reduce this overhead in production kernels.
System calls handle errors by returning a negative value representing the negated errno code (e.g., -13 for EACCES) directly to the kernel's caller, which user-space libraries interpret to set the global errno variable and return -1 to the application; in cases of interruption or unrecoverable issues, the kernel may instead deliver a signal to the process. In contemporary containerized cloud environments of the 2020s, such as those employing gVisor for enhanced isolation, system calls from containerized workloads are often intercepted by a user-space runtime and proxied to the host kernel, introducing an additional mediation layer that modifies the entry flow for security without altering the core process.^[34]
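The error convention described above can be observed from user space. The following C sketch, assuming a POSIX system, attempts to open a path and reports the errno value that the C library set after the kernel rejected the request; errno_from_open is a hypothetical helper name used only for this illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Try to open `path` read-only.
 * Returns 0 on success, otherwise the errno value that the C library
 * derived from the kernel's negative return code. */
int errno_from_open(const char *path) {
    int fd = open(path, O_RDONLY);
    if (fd == -1) {
        return errno;          /* e.g. ENOENT when the path does not exist */
    }
    close(fd);
    return 0;
}
```

Calling the helper with a path that does not exist should yield ENOENT, while a readable path such as the root directory should yield 0. The application never sees the kernel's raw negative value; it sees only -1 from open() and the positive code in errno.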

Implementation Approaches

Interrupt-Based Methods

Interrupt-based methods for system calls rely on software-generated exceptions to transition from user mode to kernel mode, allowing user programs to request operating system services securely. In these approaches, a dedicated instruction triggers an interrupt, which the processor handles by saving the current context, switching privilege levels, and invoking a kernel-resident handler. The handler then examines a system call number passed via a register or immediate value to dispatch the appropriate kernel routine, restoring user context upon completion. This mechanism ensures isolation while providing a uniform entry point for kernel services across various architectures.^[35]

On x86 architectures, the INT n instruction serves this purpose, where n specifies the interrupt vector; Linux historically used INT 0x80 for system calls, with the call number loaded into the EAX register for dispatch by the handler registered in the kernel's interrupt descriptor table (IDT). Similarly, ARM processors employ the Supervisor Call (SVC) instruction, often encoded as SVC #0 with the system call number in a general-purpose register like R7 (in AArch32) or X8 (in AArch64), prompting the exception vector table to route control to the SVC handler for parameter validation and execution. These instructions behave like hardware interrupts but are explicitly invoked by software, incurring the full overhead of exception processing, including stack frame setup and privilege level changes.^[35]^[36]

Historically, interrupt-based methods trace back to mainframe systems, such as the IBM System/360, where the SVC instruction (introduced in 1964) allowed problem-state programs to request supervisor services by specifying a function code in the instruction's 8-bit immediate field, with the operating system using it for tasks like I/O initiation and memory allocation.
In the PDP-11 minicomputer family, introduced by Digital Equipment Corporation in 1970, the EMT (emulator trap) instruction was used by DEC's own operating systems to request system services, while UNIX used the TRAP instruction for its system calls; both vector through fixed memory locations to kernel routines for service dispatch. By the 1980s, MS-DOS utilized INT 21h as its primary interface, where subfunction codes in the AH register enabled over 100 services, from file operations to console I/O, making it a cornerstone of PC software development. In Linux's initial x86 ports during the 1990s, INT 0x80 served as the standard entry point for 32-bit binaries; it was largely superseded by faster alternatives from the mid-2000s onward due to performance limitations, although the kernel continues to support it for compatibility, with the ia32_syscall handler serving 32-bit binaries on 64-bit kernels.^[37]^[38]^[39]

The simplicity of interrupt-based methods offers advantages in portability and ease of implementation, as they leverage existing hardware exception infrastructure without requiring specialized instructions, making them suitable for early or resource-constrained systems. However, they introduce performance overhead from full interrupt handling, including mandatory context saves, IDT lookups, and mode switches; benchmarks show INT 0x80 can be several times costlier than fast syscall alternatives on contemporary hardware. This latency stems from the general-purpose nature of interrupts, designed for asynchronous events rather than frequent, synchronous kernel requests.^[35]^[40]

Despite their decline in general-purpose computing, interrupt-based methods persist in legacy systems, real-time operating systems (RTOS), and embedded environments as of 2025, particularly in IoT devices using microcontrollers like ARM Cortex-M.
For instance, FreeRTOS employs the SVC instruction to initiate privileged operations and start the scheduler, ensuring deterministic behavior in resource-limited settings where simplicity outweighs raw speed, while avoiding the complexity of dedicated syscall instructions. These uses maintain compatibility with older hardware and prioritize reliability in safety-critical applications over throughput.

Fast Syscall Instructions

Fast syscall instructions represent hardware-optimized mechanisms for invoking system calls, designed to minimize the overhead associated with transitioning from user mode to kernel mode compared to traditional interrupt-based approaches. These instructions enable direct, low-latency entry into the operating system kernel by leveraging specialized processor features, such as model-specific registers (MSRs), to preconfigure the kernel entry point and avoid the full interrupt processing pipeline. Introduced primarily in the late 1990s and early 2000s, they address the performance bottlenecks in high-frequency system call scenarios, such as in server workloads or real-time applications, by reducing mode-transition times.^[41]^[42]^[43]

In the x86 architecture, the SYSENTER and SYSCALL instructions provide the core of fast syscall implementations. SYSENTER, introduced by Intel in the Pentium II processor in 1997, pairs with SYSEXIT for rapid kernel entry and return, using MSRs like IA32_SYSENTER_CS, IA32_SYSENTER_ESP, and IA32_SYSENTER_EIP to store segment, stack pointer, and instruction pointer values for immediate setup. SYSCALL, originally specified by AMD in the late 1990s and carried into the x86-64 extensions around 2003, operates similarly but loads the kernel entry point directly from the IA32_LSTAR MSR, saving the return address (RIP) in RCX and the RFLAGS register in R11 for efficient exit via SYSRET. These instructions bypass the general interrupt descriptor table (IDT) lookup and vectoring overhead, enabling faster privilege level changes without the cost of a full trap.^[41]^[44]^[42] Across other architectures, equivalent instructions follow a similar model of direct exception generation.
On ARM processors, the Supervisor Call (SVC) instruction triggers a supervisor exception for system calls, with the Vector Base Address Register (VBAR) configuring the base address of the exception vector table to route execution to the kernel handler; this setup allows immediate mode switching to EL1 (formerly SVC mode) without interrupt latency. In RISC-V, the ECALL (Environment Call) instruction generates a trap from user mode (U-mode) to supervisor mode (S-mode), invoking a handler whose base address is configured in the stvec CSR as per the privileged architecture specification, with arguments passed via general-purpose registers a0–a7 (where a7 holds the syscall number). These mechanisms ensure architecture-specific optimizations while maintaining a consistent low-overhead profile.^[45]^[46]^[47] Adoption of fast syscall instructions became widespread in major operating systems during the 2000s, driven by their performance advantages. In Linux, kernels from version 2.6 (released in 2003) onward defaulted to SYSENTER or SYSCALL on supported x86 hardware via the vsyscall page mechanism, with full integration in 64-bit environments; by kernel 6.x series (2022 onward), these instructions are standard across x86, ARM, and RISC-V ports, with the kernel entry assembly (e.g., entry_SYSCALL_64 in arch/x86/entry/entry_64.S) handling the transition and argument validation. Microsoft Windows adopted SYSCALL for x64 systems starting with Windows Vista in 2007, replacing older interrupt methods in the Native API for improved efficiency. Benchmarks on modern hardware show that fast syscalls significantly reduce latency compared to interrupt-based methods, though exact figures vary by processor generation, workload, and mitigations.^[48]^[49]^[50] Recent enhancements have focused on security without sacrificing speed. 
Post-2020 AMD processors, such as those in the Zen 3 and Zen 4 families, incorporate mitigations like enhanced SYSRET handling to address speculative execution vulnerabilities (e.g., improved branch prediction for return paths), ensuring secure indirect branches during syscall exits. Apple Silicon (M-series chips, introduced in 2020) leverages ARM's SVC instruction with customized vector table setups in the XNU kernel, optimized for the integrated SoC architecture. These developments, including cross-architecture support in Linux 6.x for RISC-V ECALL and ARM SVC, highlight ongoing refinements for performance-critical and secure computing environments.^[51]^[52]
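To make the register conventions concrete, the following C sketch issues a one-argument system call directly with the x86-64 SYSCALL instruction using GCC/Clang inline assembly. It is a minimal sketch assuming Linux and its x86-64 calling convention (call number in RAX, first argument in RDI, result in RAX); on other architectures it falls back to the C library's syscall() and re-creates the kernel-style negative return. The name raw_syscall1 is hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* exposes syscall() in <unistd.h> on glibc */
#endif
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Issue system call `nr` with one argument and return the kernel's raw
 * result: a non-negative value on success, or -errno on failure. */
long raw_syscall1(long nr, long arg0) {
#if defined(__x86_64__) && defined(__linux__)
    long ret;
    __asm__ volatile ("syscall"
                      : "=a"(ret)               /* result comes back in RAX */
                      : "a"(nr), "D"(arg0)      /* number in RAX, arg in RDI */
                      : "rcx", "r11", "memory"  /* SYSCALL overwrites RCX and R11 */
    );
    return ret;
#else
    /* Fallback (assumption: any Linux target with a libc syscall()). */
    long ret = syscall(nr, arg0);
    return (ret == -1) ? -errno : ret;
#endif
}
```

For example, raw_syscall1(SYS_close, -1) should return -EBADF rather than -1, which shows the raw value that library wrappers later translate into errno.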

Abstraction Layers

System Libraries

System libraries act as essential intermediaries in the interaction between user-space applications and the operating system kernel, encapsulating low-level system calls within higher-level wrapper functions to simplify development and conceal kernel-specific intricacies. Prominent examples include glibc on Linux systems, which provides a comprehensive set of wrappers for nearly all system calls; libc on BSD variants, offering similar POSIX-compliant interfaces; and ntdll.dll on Windows, which houses native API stubs that bridge user-mode code to kernel services.^[53]^[54] These wrappers deliver key benefits such as automated error handling, where kernel return values are mapped to portable error codes like errno for consistent application feedback; rigorous parameter validation to prevent invalid inputs from reaching the kernel; and preservation of binary compatibility, allowing programs to remain functional across kernel updates without modification.^[55]^[56] By standardizing these aspects, libraries reduce the risk of kernel-induced crashes and enhance overall system reliability. At their core, wrapper functions operate by preparing system call arguments in appropriate registers or stack locations and then triggering the kernel transition via specialized assembly instructions or macros. In glibc, for example, this involves three main approaches: auto-generated assembly stubs derived from syscall lists for straightforward invocations; C-based macros like INLINE_SYSCALL_CALL for inline execution with cancellation support; and bespoke custom code for cases requiring unique semantics, such as legacy compatibility.^[53] These mechanisms ensure efficient, low-overhead transitions while integrating features like thread cancellation. 
The development of such libraries traces back to the early 1970s, when C libraries emerged alongside Unix to support the kernel's rewrite in C, with foundational components like portable I/O routines enabling cross-platform deployment on hardware such as the PDP-11. This foundation evolved into more specialized forms, including language-specific bindings like the Java Native Interface (JNI), which permits Java code to invoke native C or C++ routines that execute system calls, thereby fostering modularity in environments like microkernels where kernel services are minimal and extended via user-space components.^[57]^[58] A critical aspect is that not every library function equates to a direct system call; for instance, buffered I/O routines in C libraries may batch multiple underlying invocations for performance optimization, managing the complete interaction lifecycle internally without kernel exposure.^[56] This selective layering underscores the libraries' role in balancing efficiency and abstraction.

API Wrappers and Portability

API wrappers abstract low-level system calls into higher-level interfaces, facilitating easier development and cross-operating system compatibility. In Unix-like systems, the POSIX API defines standardized functions such as read() and write(), which map directly to underlying system calls provided by the kernel, ensuring consistent behavior for file I/O operations across compliant implementations.^[59] Similarly, on Windows, the Win32 API acts as an abstraction layer over the NT Native API, where functions like ReadFile() invoke kernel-mode system services that ultimately execute NT system calls for resource access.^[15]

Portability is enhanced through standards and emulation tools that unify syscall-like interfaces. The Single UNIX Specification (SUS), aligned with IEEE Std 1003.1, mandates a common programming environment across Unix variants, promoting source code portability by specifying interfaces that mimic system call semantics for processes, files, and signals.^[60] For non-Unix platforms, Cygwin provides a POSIX-compliant layer on Windows via its core DLL (cygwin1.dll), which translates API calls to native Windows system calls, allowing Unix applications to run after recompilation without full OS emulation.^[61]

Despite these abstractions, challenges persist due to inconsistencies in system call implementations, such as differing syscall numbers (for example, the open() syscall is number 2 on x86_64 Linux but 5 on x86_64 FreeBSD), which complicate direct binary portability.^[62] Dynamic linking mitigates this for ordinary applications: programs call wrapper functions in shared libraries like glibc, which encode the host platform's syscall numbers and calling conventions, so application code does not hard-code them.^[63]

In contemporary computing, container runtimes extend portability by managing syscall access in isolated environments; for instance, Docker integrates with gVisor, a user-space kernel that intercepts and proxies syscalls via mechanisms like seccomp-BPF, forwarding only necessary ones to the host for security and compatibility.^[34] The Windows Subsystem for Linux (WSL) further bridges ecosystems, with WSL2 employing a lightweight virtual machine hosting a genuine Linux kernel for direct syscall execution, and 2025 updates including its open-sourcing to enhance interop and translation layers for hybrid workloads.^[64] Optimizations like the virtual dynamic shared object (vDSO) refine wrapper efficiency by mapping kernel-provided code into user space, allowing time-sensitive calls such as gettimeofday() to execute without full kernel transitions, thus reducing overhead in portable applications.^[65]
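The difference between a vDSO-assisted call and a real kernel entry can be sketched in C as follows. This is an illustration assuming 64-bit Linux with glibc, where clock_gettime() is typically served from the vDSO while syscall(SYS_clock_gettime, ...) always enters the kernel; whether the wrapper actually avoids the transition depends on the kernel, the clock source, and the C library. The two function names are hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* exposes syscall() in <unistd.h> on glibc */
#endif
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Library wrapper: may be answered from the vDSO without a mode switch. */
int monotonic_via_wrapper(struct timespec *ts) {
    return clock_gettime(CLOCK_MONOTONIC, ts);
}

/* Explicit system call: always performs a kernel transition. */
int monotonic_via_syscall(struct timespec *ts) {
    return (int)syscall(SYS_clock_gettime, CLOCK_MONOTONIC, ts);
}
```

Both paths read the same monotonic clock, so a value obtained second is never earlier than one obtained first; timing many iterations of each function is a simple way to observe the overhead that the vDSO path avoids.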

Categories of System Calls

Process Management

System calls for process management enable the creation, termination, monitoring, and control of processes and threads, forming the foundation of multitasking operating systems. In Unix-like systems, process creation typically involves the fork() system call, which duplicates the calling process to produce a child process sharing the same code, data, and open files, but with a distinct process ID (PID) assigned by the kernel for unique identification and resource allocation.^[66] The child process then often uses one of the exec() family of calls, such as execve(), to replace its program image with a new one while retaining the PID and open resources. This two-step model separates duplication from image replacement, allowing flexible initialization before execution. In contrast, Windows employs the CreateProcess() API, which combines process creation and image loading in a single call, launching a new process and its primary thread in the caller's security context, with the kernel allocating a PID and handling resource inheritance like environment variables.^[67] For process termination, the exit() system call in Unix-like systems terminates the calling process, releasing its resources and returning an exit status to the parent, while the wait() or waitpid() calls allow the parent to suspend execution until a child terminates, retrieving its status to manage cleanup and avoid zombie processes.^[68] These calls ensure orderly lifecycle management, with the kernel reclaiming memory, file descriptors, and other resources upon exit. 
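The fork, exec, and wait sequence described above can be sketched in C as follows, assuming a POSIX system. The helper name spawn_and_wait is hypothetical, execv() stands in for the wider exec() family, and the exit code 127 for a failed exec follows a common shell convention rather than anything required by the kernel.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run the program at `path` with arguments `argv` in a child process.
 * Returns the child's exit status, or -1 if fork/waitpid fails or the
 * child was terminated by a signal. */
int spawn_and_wait(const char *path, char *const argv[]) {
    pid_t pid = fork();                 /* duplicate the calling process */
    if (pid == -1) {
        return -1;
    }
    if (pid == 0) {
        execv(path, argv);              /* replace the child's program image */
        _exit(127);                     /* reached only if exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) == -1) {   /* reap the child, no zombie left */
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For instance, running /bin/sh with the arguments "sh", "-c", "exit 7" should make the helper return 7, and a path that cannot be executed should produce 127.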
System calls like getpid() provide process identification by returning the current PID, essential for logging, synchronization, and resource naming, while setuid() alters the effective user ID to enforce security controls, such as privilege escalation or dropping, typically restricted to privileged processes.^[69] Thread management builds on process creation primitives, particularly in Linux, where the clone() system call creates threads or processes with fine-grained control over shared resources like memory and file descriptors via flags (e.g., CLONE_VM for shared address space).^[70] The POSIX pthread_create() function serves as a portable wrapper, invoking clone() under the Native POSIX Thread Library (NPTL) to implement one-to-one threading, where each user thread maps to a kernel thread for efficient multi-core utilization.^[71] Scheduling cooperation is facilitated by sched_yield(), which voluntarily relinquishes the CPU to peers at the same priority, moving the caller to the end of the run queue without blocking, thus preventing starvation in contended scenarios.^[72] Process and thread lifecycle models influence how these calls operate, with the many-to-one model mapping multiple user-level threads to a single kernel thread for lightweight creation but limiting parallelism since blocking user threads stall the entire process on multi-core systems.^[73] The one-to-one model, adopted in modern Unix-like kernels like Linux's NPTL, assigns a kernel thread per user thread, enabling true concurrency across cores but increasing overhead from kernel involvement in creation and context switches.^[71] Variations persist across systems; for instance, Windows CreateProcess() inherently supports threaded execution via its primary thread, differing from Unix's fork-exec separation. 
Recent enhancements, such as Linux's clone3() introduced in kernel 5.3 (2019), extend clone() with a struct-based interface for advanced features like namespace isolation and flexible stack allocation, improving efficiency in containerized and multi-threaded environments on multi-core hardware.^[70] Post-2020 developments, including scheduler refinements in Linux 6.6's Earliest Eligible Virtual Deadline First (EEVDF), further optimize thread placement for better multi-core responsiveness without altering core system call interfaces.

File and Device Operations

System calls for file and device operations form the core mechanism by which user programs interact with persistent storage and peripherals in operating systems, providing a uniform abstraction over diverse hardware through kernel-mediated access. In Unix-like systems adhering to POSIX standards, the primary file operations revolve around the open(), read(), write(), and close() system calls, which manage file descriptors: small integers serving as handles to kernel resources. The open() call initializes access to a file or device by path, specifying mode (e.g., read-only, write-append) and performing initial permission verification against the process's effective user ID and group ID; it returns a non-negative file descriptor on success or -1 on failure. Subsequent read() and write() calls transfer data between a user-provided buffer and the associated resource, with the kernel handling any necessary device-specific mapping. These calls involve no user-space buffering, which gives the caller direct control, though standard C libraries implement buffering atop them to amortize overhead. Finally, close() releases the file descriptor and its associated kernel resources; it does not by itself guarantee that written data has reached stable storage, for which fsync() is used.

Additional operations enhance file manipulation: lseek() repositions the file offset for non-sequential access, allowing seeks relative to the start, current position, or end of the file, which is essential for random I/O patterns on both files and devices.^[74] Permission management occurs via chmod(), which alters the file mode bits (e.g., read, write, execute for owner, group, others) in the inode, enforcing discretionary access control at the kernel level during subsequent operations. The buffering distinction matters in practice: the kernel caches file data in its page cache for efficiency, while user-space libraries add stream buffering to batch small reads and writes, reducing syscall frequency without changing the system call interface itself.
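A short C sketch of the descriptor lifecycle described above, assuming a POSIX system with a writable /tmp directory: it creates a scratch file with open(), writes a string, repositions the offset with lseek(), reads back the tail, and closes the descriptor. The function name write_seek_read, the file name pattern, and the message are hypothetical choices for this example.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Writes "hello, kernel" to a scratch file, seeks to byte offset 7, and
 * reads the remainder into `out` (NUL-terminated, at most cap-1 bytes).
 * Returns the number of bytes read, or -1 on any failure. */
long write_seek_read(char *out, size_t cap) {
    static const char msg[] = "hello, kernel";
    char path[64];

    if (cap == 0) return -1;
    snprintf(path, sizeof path, "/tmp/syscall-demo-%ld", (long)getpid());

    int fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
    if (fd == -1) return -1;
    unlink(path);              /* name removed; file lives until close() */

    long result = -1;
    ssize_t len = (ssize_t)strlen(msg);
    if (write(fd, msg, (size_t)len) == len &&
        lseek(fd, 7, SEEK_SET) == 7) {         /* offset 7 is the 'k' */
        ssize_t n = read(fd, out, cap - 1);
        if (n >= 0) {
            out[n] = '\0';
            result = (long)n;
        }
    }
    close(fd);                 /* release the descriptor */
    return result;
}
```

With a buffer of 32 bytes, the call should return 6 and leave the string "kernel" in the buffer, since reading resumes at the offset set by lseek().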
For devices, Unix-like kernels distinguish between character devices, which provide sequential byte-stream access (e.g., terminals via /dev/tty), and block devices, which operate on fixed-size blocks (typically 512 bytes or more) and support random access (e.g., disks via /dev/sda). The mount() system call attaches the filesystem on a block device to the directory hierarchy, enabling transparent access through the VFS layer. Device-specific control uses ioctl(), a versatile syscall for non-standard operations such as setting baud rates on serial ports or querying hardware status; it passes command codes and arguments tailored to the device driver.^[75] Memory mapping via mmap() gives user space direct access to device memory or file contents, bypassing read() and write() in high-performance scenarios such as graphics buffers or shared-memory devices.

The Virtual File System (VFS) in Unix-like kernels such as Linux is the abstraction layer that receives these syscalls and routes them to the appropriate filesystem or device driver while keeping a consistent interface across types like ext4 or NFS; it handles path resolution, caching, and permission enforcement uniformly.^[76] For modern asynchronous I/O, Linux introduced io_uring in kernel version 5.1 (May 2019). It uses ring buffers shared between user space and the kernel as submission and completion queues, which batches operations and reduces the number of kernel entries, significantly improving throughput for high-IOPS workloads on files and block devices compared to the older AIO interfaces.

In Windows, analogous functionality is provided by Win32 API functions that operate on handles rather than file descriptors: CreateFile() opens or creates files and devices with specified access rights (e.g., GENERIC_READ), security attributes, and sharing modes, and returns a handle once access checks succeed. 
ReadFile() and WriteFile() transfer data synchronously or asynchronously to or from the handle; overlapped (asynchronous) I/O can be combined with I/O completion ports, and the kernel manages caching.^[77] CloseHandle() releases the handle, akin to close().

Security in these operations centers on file descriptor (or handle) ownership. Unix-like kernels perform the main permission check when the file is opened; later read() and write() calls verify that the descriptor is valid and was opened in a compatible mode, and mandatory access control modules, if enabled, may add per-call checks. An inherited descriptor therefore carries only the access it was opened with.

Recent work addresses storage performance gaps: in Linux kernel 6.17 (released September 2025), fallocate() gained the ability to use device Write Zeroes commands, for example on NVMe SSDs, to allocate zeroed ranges efficiently, reducing latency for large-scale file operations.^[78]

Communication and Protection

System calls for interprocess communication (IPC) let processes exchange data and coordinate their actions. In Unix-like systems, the pipe() system call creates a unidirectional byte stream, typically shared between a parent and its child: data written to the write end by one process is read from the read end by the other, which allows simple data transfer without shared memory. System V IPC provides more advanced mechanisms, such as message queues controlled by msgctl(), whose IPC_RMID command removes a queue and frees its resources once communication is finished. Shared memory segments are allocated with shmget(), whose shmflg argument combines flags such as IPC_CREAT (create the segment if it does not exist) with permission bits that control access.

In Windows, named pipes support bidirectional IPC between related or unrelated processes through the CreateNamedPipe API, which establishes a server endpoint for client connections and is often used for client-server architectures within the system.^[79] Mailslots offer a lightweight, one-way broadcast mechanism, created with CreateMailslot; messages can be sent to multiple recipients via a hierarchical name structure, which suits simple notifications.^[80]

Network-based IPC is handled by socket system calls. In Unix, socket() creates a socket descriptor, bind() associates it with a local address, and connect() establishes a connection to a remote endpoint, enabling communication over protocols like TCP. Windows implements similar functionality through the Winsock API, where socket() allocates a descriptor bound to a transport provider for subsequent bind and connect operations.^[81]

Synchronization system calls ensure orderly access to shared resources in IPC scenarios. The semop() call performs atomic operations on System V semaphores, such as wait (decrement) or signal (increment), to coordinate multiple processes accessing shared memory or other IPC objects. 
Mutexes in POSIX threads libraries run mostly in user space but rely on kernel synchronization primitives when a thread must block; on Linux, the futex() (fast user-space mutex) system call underlies efficient implementations. Signals for interprocess notification are sent using kill(), which delivers a specified signal to a target process or process group and serves as a lightweight notification mechanism for events such as termination or resource availability.

Protection mechanisms in system calls enforce security and resource isolation during communication. The setrlimit() call sets resource limits for a process, such as maximum file size or CPU time, and getrlimit() retrieves them; capping resource usage in this way limits abuse in IPC contexts. Identity-based protection builds on calls such as getuid(), which returns the real user ID of the calling process and lets programs check it against the access permissions of IPC objects like pipes or shared memory. For auditing, the Linux audit subsystem, introduced in the 2.6 kernel series around 2004, lets suitably privileged processes submit records to the audit log over a netlink socket rather than through a dedicated system call, supporting security monitoring of IPC and protection events.

Modern enhancements address evolving protection needs in IPC. Extended Berkeley Packet Filter (eBPF) programs are loaded through the bpf() system call, added in Linux 3.18 (2014); they could be attached to kernel probes from version 4.1 (2015) and gained significant capabilities in the 2020s. They enable dynamic hooking of system calls for runtime monitoring and enforcement, such as filtering unauthorized IPC access, without modifying kernel code. Gaps in traditional protection, such as vulnerability to attacks from privileged software, are mitigated by confidential computing technologies; for instance, Intel Software Guard Extensions (SGX) is exposed on Linux through ioctls on device nodes (e.g., /dev/sgx_enclave) that create and manage hardware-isolated enclaves for secure communication within trusted execution environments. 
Key concepts in these system calls include ownership models for IPC resources: System V objects such as shared memory segments carry owner and group IDs with permission bits (read and write, or read and alter for semaphores) that are set at creation and checked when a process accesses the object, much like file permissions. POSIX standards emphasize access control through these ownership attributes, while extensions such as POSIX Access Control Lists (ACLs) allow finer-grained permissions on shared memory objects where the implementation supports them, managed via library calls that invoke underlying system interfaces for secure multi-user access.

Examples and Tools

Unix-like Systems

In Unix-like systems, system calls provide the primary interface for user-space programs to request services from the kernel, such as process creation and file operations. A canonical example is the fork() system call, which creates a new child process by duplicating the calling process; on success it returns the child's process ID to the parent and zero to the child. Another fundamental call is open(), which opens a file and returns a file descriptor; for instance, open("/path/to/file", O_RDONLY) attempts to open the file in read-only mode, returning a non-negative integer descriptor on success or -1 on failure.

System call numbers uniquely identify each call within the kernel's interface table. In x86_64 Linux, these are defined in the table arch/x86/entry/syscalls/syscall_64.tbl, with examples including __NR_read as 0 for reading from a file descriptor, __NR_open as 2 for opening files, __NR_clone as 56 for creating processes or threads depending on flags like CLONE_THREAD, and __NR_fork as 57 for process creation.^[82] The clone() call, in particular, enables thread creation by sharing the parent's memory space when the CLONE_VM flag is set, forming the basis for POSIX threads in Linux.^[83] In FreeBSD, syscall numbers differ but follow a similar structure, with over 600 calls available across architectures as of FreeBSD 14.2 (December 2024).^[84]^[85]

Invocation typically occurs through library wrappers, but direct kernel entry uses architecture-specific instructions. In x86_64 assembly on Linux, the syscall number is loaded into the %rax register and up to six arguments into %rdi, %rsi, %rdx, %r10, %r8, and %r9; the syscall instruction then triggers the kernel transition. The return value appears in %rax, where values from -4095 to -1 indicate an error as a negated errno code.^[86] For example, to invoke write(1, "hello", 5):

    .section .rodata
msg:    .ascii "hello"          # 5 bytes; write() takes an explicit length

    .text
    mov $1, %rax                # syscall number for write
    mov $1, %rdi                # file descriptor 1 (stdout)
    lea msg(%rip), %rsi         # buffer address (position-independent)
    mov $5, %rdx                # length in bytes
    syscall                     # result is returned in %rax

This writes "hello" to standard output. The raw syscall returns the number of bytes written in %rax, or a negated error code on failure.^[87] C library wrappers translate that convention: they return -1 and store the specific cause in the errno variable; for instance, open() fails with EACCES (errno 13) if the process lacks search permission on a component of the path or read permission on the file. Linux supports on the order of 340 to 360 syscalls in recent kernels as of November 2025, depending on architecture and version, covering process management, file I/O, and more, while FreeBSD exceeds 600 as of version 14.2.^[88]^[10]^[85] Modern additions illustrate evolving capabilities; for example, Linux introduced fanotify_init() in kernel 2.6.36 (2010) and enabled it in 2.6.37 for filesystem event monitoring. It has syscall number 300 on x86_64 and lets applications receive notifications on file access, enabling tools such as antivirus scanners.^[89]^[90] In C, a simple fork() usage appears as:

#include <sys/types.h>
#include <unistd.h>
#include <stdio.h>

int main(void) {
    pid_t pid = fork();
    if (pid == -1) {
        // fork() typically fails with EAGAIN (process limit) or ENOMEM
        perror("fork failed");
        return 1;
    } else if (pid == 0) {
        printf("Child process\n");
    } else {
        printf("Parent: child PID %d\n", (int)pid);
    }
    return 0;
}

This demonstrates process creation and basic error handling. The fork() library function is a wrapper that resolves to an underlying system call (modern glibc implements it with clone()). Fuchsia, a modern capability-based OS with some Unix-like elements, uses Rust for user-space components that interface with its Zircon kernel syscalls, such as zx_process_create for task management, which improves memory safety.^[91]^[92]

Windows Systems

In Windows NT-based operating systems, system calls are implemented through the Native API, a set of low-level functions exported primarily by ntdll.dll that form the interface between user-mode applications and the kernel. Unlike the publicly specified system calls of Unix-like systems, most Windows system calls are proprietary and only partly documented, and Microsoft's official support centers on the higher-level Win32 APIs. This opacity stems from the NT design philosophy, which favors stability and abstraction over direct exposure of kernel interfaces: a "thick" Win32 subsystem layer translates application requests into Native API calls before they reach the kernel.

The Native API consists of functions prefixed with Nt or Zw, such as NtCreateFile and ZwCreateFile. When called from user mode, the two versions behave identically. They differ for kernel-mode callers: the Zw version marks the call as coming from kernel mode, so its parameters are treated as trusted, while the Nt version keeps the caller's previous mode and validates parameters accordingly. These functions enter the kernel through a numbered system service dispatch table (SSDT), in which each call is identified by a system service number (SSN), such as 0x00B3 for NtCreateUserProcess in some recent Windows versions; SSNs are not stable and can change between builds.

System calls in Windows are typically initiated with fast system-call instructions, SYSENTER on 32-bit x86 or SYSCALL on x86-64, which switch from user to kernel mode without the software interrupt (INT 0x2E) used by older implementations. For example, NtCreateFile, a Native API function that creates or opens files and devices and that Microsoft does document, performs the core file system operations; it is one of approximately 500 such calls available as of Windows 11 (2021–2025), of which only a subset is officially documented. 
A related example is ZwReadFile (NtReadFile), which reads from a file handle asynchronously or synchronously, passing parameters such as the I/O buffer and an optional event object to the kernel for processing. These calls return an NTSTATUS code, a 32-bit value structured with severity, facility, and code fields, to indicate success (e.g., STATUS_SUCCESS, 0x00000000) or failure (e.g., STATUS_ACCESS_DENIED, 0xC0000022), which gives more structured error reporting than the errno mechanism in Unix. Asynchronous operations, such as overlapped I/O, often rely on user-mode Asynchronous Procedure Calls (APCs): the kernel queues a callback routine to the target thread's APC queue, and the routine runs when the thread enters an alertable state, for example an alertable wait.

In modern Windows versions, particularly Windows 11 as of November 2025, new system calls have been added to support features such as Virtualization-based Security (VBS), which uses hardware virtualization to isolate sensitive components such as those behind Credential Guard and Device Guard, with syscalls for hypervisor management and secure memory allocation. Because many Native APIs remain undocumented, community efforts have filled the gaps by reverse-engineering and publishing syscall tables for Windows 11, cataloging SSNs for functions such as NtCreatePartition so that developers and researchers can track these interfaces across builds. This contrasts sharply with Unix-like systems, where syscall interfaces are openly specified in standards like POSIX, and it makes portability and direct kernel interaction more challenging on Windows.

Tracing Tools

Tracing tools enable developers and system administrators to monitor and debug the system calls made by running processes, capturing details such as entry and exit points, arguments, return values, and timings without modifying the program itself. These tools are essential for diagnosing issues like unexpected resource access or inefficient kernel interactions, and they operate by intercepting calls at the user-kernel boundary or within the kernel itself.^[93]^[94]

On Linux systems, strace is a widely used command-line utility that traces system calls and signals for a specified process or command, logging each call's name, arguments, and return status to standard error or a file. For instance, executing strace -e trace=openat ls limits tracing to openat(), the call that current C libraries use to open files (older binaries may call open() instead), revealing paths and error codes for debugging file access issues.^[95] Similarly, Solaris provides truss, which traces system calls and signals with timing information to identify hangs or performance anomalies in processes.^[96] For Windows, Event Tracing for Windows (ETW) provides kernel-level tracing of system calls through providers like the NT Kernel Logger, capturing events such as process creation and I/O operations; tools like logman can start sessions that log these to ETL files for later analysis using the Microsoft.Windows.EventTracing.Syscalls library.^[97]^[98]

Kernel-integrated tracing in Linux, such as ftrace and kprobes, offers lower-level observation by hooking into kernel functions, including system call entry points, to record execution flows without a user-space tracer in the path. 
Ftrace, part of the kernel's tracing infrastructure, supports function graphing and event filtering via the tracefs filesystem, while kprobes allow dynamic breakpoints on almost any kernel instruction for custom data collection.^[94]^[99] These mechanisms support analysis of performance bottlenecks, such as excessive system call frequency, and of security problems, such as unauthorized file accesses, by generating statistics on call counts and durations.^[100]

In the 2020s, extended Berkeley Packet Filter (eBPF)-based tools have advanced system call tracing with programmable, efficient kernel probes that reduce overhead compared to traditional methods. bpftrace, a high-level scripting language built on eBPF, enables concise scripts that trace specific system calls system-wide, such as monitoring execve for process spawning, and aggregates metrics like latency histograms.^[101] For containerized environments, sysdig addresses visibility gaps by capturing system calls and events directly from the kernel and providing container-aware filtering, which helps troubleshoot microservices without host-level noise.^[102]

Despite their utility, tracing tools introduce performance overhead, typically a 5% to 20% slowdown for moderate workloads but potentially exceeding 100x in syscall-intensive scenarios because of interception and logging costs. Most are used purely for observation, although some, such as strace with its fault-injection options, can also tamper with calls.^[103]^[104]

References

[1]
System Calls, Signals, & Interrupts - CS 3410
Each OS defines a set of system calls that it offers to user space. This set of system calls constitutes the abstraction layer between the kernel and user code.
[2]
Lecture 6: System calls & Interrupts & Exceptions - PDOS-MIT
Like system calls, except: devices generate them at any time, there are no arguments in CPU registers, nothing to return to, usually can't ignore them. There is ...
[3]
CS360 Lecture notes -- Introduction to System Calls (I/O System Calls)
A system call looks like a procedure call (see below), but it's different -- it is a request to the operating system to perform some activity.
[4]
System Calls in Operating System Explained | phoenixNAP KB
Aug 31, 2023 · System calls are interfaces between user programs and the OS, acting as intermediaries between applications and the kernel.
[5]
System Call - GeeksforGeeks
Sep 22, 2025 · A system call is a programmatic way in which a computer program requests a service from the kernel of the operating system on which it is executed.
[6]
https://www.geeksforgeeks.org/operating-systems/introduction-of-system-call/
[7]
What are system calls and why are they necessary? - IONOS
May 6, 2020 · A system call is a method for programs to communicate with the system core, necessary for user mode programs to access kernel mode functions.
[8]
System Calls in OS: Functions, Types & How They Work - upGrad
Jun 6, 2025 · Process control system calls manage the creation, execution, and termination of processes. They are fundamental for multitasking and process ...
[9]
Linux kernel system calls for all architectures - Marcin Juszkiewicz
Oct 28, 2025 · Linux kernel system calls for all architectures. The Linux kernel provides many system calls for userspace. However, the numbers used for ...
[10]
[PDF] IBM OS/360: An Overview of the First General Purpose Mainframe
The system can enter supervisor state by use of an SVC, the supervisor call instruction. ... The IBM OS/360 was a pioneering operating system that bridged the two.
[11]
[PDF] Protection and the Control of Information Sharing in Multics
To simplify its support of protected subsystems, Multics imposes a nesting constraint on all subsystems which operate within a single process: each subsystem is ...
[12]
Homework 7: Unix v6 on the PDP-11 -- Part III - PDOS-MIT
Sep 27, 2004 · ... system call and traps into the kernel. Tracing a PDP-11 Trap. Boot up the unmodified unix kernel in the PDP ... TRAP affects the PDP-11 processor.
[13]
POSIX.1 Backgrounder - The Open Group
The basic goal was to promote portability of application programs across UNIX system environments by developing a clear, consistent, and unambiguous standard ...
[14]
Using Nt and Zw Versions of the Native System Services Routines
Apr 30, 2025 · For system calls from user mode, the Nt and Zw versions of a routine behave identically. For calls from a kernel-mode driver, the Nt and Zw ...
[15]
The rapid growth of io_uring - LWN.net
Jan 24, 2020 · io_uring is a mechanism for asynchronous I/O using a shared ring buffer, and it has rapidly grown since its introduction in the 5.1 kernel.
[16]
eBPF Syscall - The Linux Kernel documentation
The operation to be performed by the bpf() system call is determined by the cmd argument. Each operation takes an accompanying argument, provided via attr.
[17]
Confidential Computing in Linux for x86 virtualization
This document focuses on a subclass of CoCo technologies that are targeting virtualized environments and allow running Virtual Machines (VM) inside TEE.
[18]
[PDF] The Design and Implementation of Hyperupcalls - USENIX
Jul 11, 2018 · Hypercalls require that the guest make a request to be executed in the hypervisor, much like a system call, and upcalls require that the ...
[19]
https://docs.kernel.org/virt/kvm/amd-memory-encryption.html
[20]
Exception levels - Learn the architecture - AArch64 Exception Model
For example, the lowest level of privilege is referred to as EL0. As shown in Exception levels, there are four Exception levels: EL0, EL1, EL2 and EL3. Figure 1 ...
[21]
https://cdrdv2.intel.com/v1/dl/getContent/671200
[22]
Capability-based addressing | Communications of the ACM
Implementation of capability-based addressing is discussed. It is predicted that the use of tags to identify capabilities will dominate. A hardware address ...
[23]
[PDF] Systems Reference Library IBM System/360 Principles of Operation
The manual defines System/360 operating principles, central processing unit, instructions, system control panel, branching, status switching, interruption.
[24]
https://dl.acm.org/doi/10.1145/361011.361070
[25]
Seccomp BPF (SECure COMPuting with filters) — The Linux Kernel documentation
[26]
[PDF] Chapter 3 System calls, exceptions, and interrupts - Columbia CS
An operating system must handle system calls, exceptions, and interrupts. With a system call a user program can ask for an operating system service, as we saw ...
[27]
How the Linux kernel handles a system call · Linux Inside - 0xax
When the Linux kernel gets the control to handle an interrupt, it had to do some preparations like save user space registers, switch to a new stack and many ...
[28]
Does a system call involve a context switch or not? - Stack Overflow
Jun 18, 2022 · A system call does not generally require a context switch to another process; instead, it is executed in the context of whichever process invoked it.
[29]
CS 537 Notes, Section #3B: Entering and Exiting the Kernel
Trap instructions are most often used to implement system calls and to be inserted into a process by a debugger to stop the process at a breakpoint. The flow of ...
[30]
Fastest Linux system call - Stack Overflow
Feb 21, 2018 · It's about 50 cycles versus 70 on my system with KPTI enabled. Some system calls don't even go thru any user->kernel transition, read vdso(7).
[31]
Syscall latency… and some uses of speculative execution | linux
Sep 12, 2023 · As mentioned in CPU-parameters, L1d-loads take 4-6 cycles on Skylake-X. We also know that in the good case (UEK5), this loop is capable of an ...
[32]
What is gVisor?
gVisor provides a strong layer of isolation between running applications and the host operating system. It is an application kernel that implements a Linux- ...
[33]
[PDF] Chapter 3 Traps, interrupts, and drivers - cs.wisc.edu
Traps (or interrupts) occur when a program does an illegal action (exception) or a device signals for attention. In xv6, traps are caused by the current ...
[34]
SVC exception handling - Arm Developer
The SVC #0 instruction makes the ARM core take the SVC exception, the mechanism to access a kernel function. Register R7 defines which system call you want (in ...
[35]
[PDF] IBM System/360 Model 44 Programming System Supervisor Call ...
This publication supplies detailed information for writing the assembler language sequences required to use the Supervisor Call (SVC) functions provided by ...
[36]
Mark Smotherman - System Call Support
Early history - including memory protection, multiple modes of execution, and privileged instructions. Many of the seminal ideas, however, were widely spread by ...
[37]
Anatomy of a system call, part 2 - LWN.net
Jul 16, 2014 · Ancient 32-bit programs use the INT 0x80 instruction to trigger a software interrupt handler, but this is much slower than SYSENTER on modern ...
[38]
Measurements of system call performance and overhead - Arkanis
Jan 5, 2017 · The benchmarks execute the function call or syscall 10 million times in a loop. The benchmark is run 10 times. The average of those runs is then ...
[39]
SYSENTER — Fast System Call
The SYSENTER and SYSEXIT instructions were introduced into the IA-32 architecture in the Pentium II processor. The availability of these instructions on a ...
[40]
SYSCALL — Fast System Call
SYSCALL invokes an OS system-call handler at privilege level 0. It does so by loading RIP from the IA32_LSTAR MSR (after saving the address of the instruction ...
[41]
SYSENTER - OSDev Wiki
Introduction. The SYSENTER/SYSEXIT instructions (and equivalent SYSCALL/SYSRET on AMD) enable fast entry to the kernel, avoiding interrupt overhead.
[42]
[PDF] SYSCALL and SYSRET Instruction Specification
Sept 1997. B Initial published release. May 1998. C Revised code sample on testing for SYSCALL/SYSRET support on page 2. Page 6 ...
[43]
Supervisor calls - Arm Developer
The SVC instruction generates an SVC. A typical use for SVCs is to request privileged operations or access to system resources from an operating system.
[44]
ARM64 System calls - ElseWhere
Jun 4, 2023 · In Armv8-A, a user-mode process invokes a system call to request a service provided by the kernel using the supervisor call (SVC) instruction.
[45]
(Mis)understanding RISC-V ecalls and syscalls - Juraj's Blog
Apr 22, 2021 · RISC-V offers an ecall (Environment Call) instruction to implement system calls. These are basically requests made by a lower privileged code (user mode) to ...
[46]
The Definitive Guide to Linux System Calls | Packagecloud Blog
This blog post explains how Linux programs call functions in the Linux kernel. It will outline several different methods of making systems calls.
[47]
System Calls — The Linux Kernel documentation
There can be a maximum of 6 system call parameters. Both the system call number and the parameters are stored in certain registers.
[48]
Moving to Windows Vista x64 - CodeProject
The old ones we already knew are easy to recognize in their 64-bit form: rax, rbx, rcx, rdx, rsi, rdi, rbp, rsp (and rip if we want to count the instruction ...
[49]
AMD Product Security
AMD seeks more efficient ways to make our products more secure, including working closely with partners, academics, researchers, and end users in the ecosystem.
[50]
Kernel Syscalls - The Apple Wiki
Sep 15, 2023 · As in all ARM (i.e. also on Android) the kernel entry is accomplished by the SVC command (SWI in some debuggers and ARM dialects). On the ...
[51]
System Call Wrappers - Sourceware
Aug 24, 2017 · There are three types of OS kernel system call wrappers that are used by glibc: assembly, macro, and bespoke. First we'll talk about the assembly ones.
[52]
A Deep Dive Into Malicious Direct Syscall Detection
Feb 13, 2024 · In a conventional flow, the system call is implemented inside system call stubs located inside ntdll.dll or win32u.dll, Windows DLLs, and it ...
[53]
[PDF] glibc and system call wrappers - Linux Plumbers
Aug 28, 2020 · ▷ Why do we have system call wrappers? ▷ How can we add them to glibc? ▷ Do we actually want to do that? ▷ What can the kernel do ...
[54]
C library system-call wrappers, or the lack thereof - LWN.net
Nov 12, 2018 · Calling into the kernel is not like calling a normal function; a special trap into the kernel must be triggered with the system-call arguments ...
[55]
The Development of the C Language - Nokia
C was devised in the early 1970s for Unix, derived from BCPL and B. Dennis Ritchie turned B into C, and by 1973, the essentials were complete.
[56]
Guide to JNI (Java Native Interface) - Baeldung
Jan 8, 2024 · The JDK introduces a bridge between the bytecode running in our JVM and the native code (usually written in C or C++). The tool is called Java Native Interface.
[57]
POSIX.1 FAQ
[58]
Single UNIX® Specification, Version 4, 2018 Edition
[59]
Cygwin
The Cygwin DLL currently works with all recent, commercially released x86_64 versions of Windows, starting with Windows 8.1. For more information see the FAQ.
[60]
Linux/FreeBSD System Call Concordance
Name, Description, Linux, FreeBSD. Number, Arguments, Definition, Number, Arguments, Definition. __acl_aclcheck_fd, 354, int filedes, acl_type_t type,
[61]
A look at dynamic linking - LWN.net
Feb 13, 2024 · The dynamic linker is a critical component of modern Linux systems, being responsible for setting up the address space of most processes.
[62]
The Windows Subsystem for Linux is now open source
May 19, 2025 · The code that powers WSL is now available on GitHub at Microsoft/WSL and open sourced to the community! You can download WSL and build it from source.
[63]
vdso(7) - Linux manual page - man7.org
The vDSO (virtual dynamic shared object) is a small shared library that the kernel automatically maps into the address space of all user-space applications.
[64]
fork(2) - Linux manual page - man7.org
fork() creates a new process by duplicating the calling process. The new process is referred to as the child process. The calling process is referred to as the ...
[65]
CreateProcessA function (processthreadsapi.h) - Win32 apps
Feb 9, 2023 · Creates a new process and its primary thread. The new process runs in the security context of the calling process.
[66]
wait(2) - Linux manual page - man7.org
wait() and waitpid() The wait() system call suspends execution of the calling thread until one of its children terminates. The call wait(&wstatus) is equivalent ...
[67]
getpid(2) - Linux manual page - man7.org
getpid() returns the process ID (PID) of the calling process. (This is often used by routines that generate unique temporary filenames.)
[68]
clone(2) - Linux manual page - man7.org
When a clone call is made without specifying CLONE_THREAD, then the resulting thread is placed in a new thread group whose TGID is the same as the thread's TID.
[69]
pthreads(7) - Linux manual page - man7.org
Both threading implementations employ the Linux clone(2) system call. In NPTL, thread synchronization primitives (mutexes, thread joining, and so on) are ...
[70]
sched_yield(2) - Linux manual page - man7.org
sched_yield() causes the calling thread to relinquish the CPU. The thread is moved to the end of the queue for its static priority and a new thread gets to run.
[71]
Operating Systems: Threads
Because a single kernel thread can operate only on a single CPU, the many-to-one model does not allow individual processes to be split across multiple CPUs.
[72]
lseek(2) - Linux manual page - man7.org
lseek() repositions the file offset of the open file description associated with the file descriptor fd to the argument offset according to the directive ...
[73]
ioctl(2) - Linux manual page - man7.org
The ioctl() system call manipulates the underlying device parameters of special files. In particular, many operating characteristics of character special files ...
[74]
Overview of the Linux Virtual File System
The Virtual File System (also known as the Virtual Filesystem Switch) is the software layer in the kernel that provides the filesystem interface to userspace ...
[75]
ReadFile function (fileapi.h) - Win32 apps - Microsoft Learn
Jul 22, 2025 · Reads data from the specified file or input/output (I/O) device. Reads occur at the position specified by the file pointer if supported by the device.
[76]
Named Pipes - Win32 apps - Microsoft Learn
Jan 7, 2021 · A named pipe is a named, one-way or duplex pipe for communication between the pipe server and one or more pipe clients.
[77]
Mailslots - Win32 apps - Microsoft Learn
Jan 7, 2021 · A mailslot is a mechanism for one-way interprocess communications (IPC). Applications can store messages in a mailslot.
[78]
socket function (winsock2.h) - Win32 apps | Microsoft Learn
Oct 13, 2021 · The socket function causes a socket descriptor and any related resources to be allocated and bound to a specific transport-service provider.
[79]
Searchable Linux Syscall Table for x86_64 - Filippo Valsorda
A searchable Linux system call table for the x86-64 architecture, with arguments and links to manual and implementation.
[80]
clone(2) - Linux manual page - man7.org
CLONE_THREAD (since Linux 2.4.0) If CLONE_THREAD is set, the child is placed in the same thread group as the calling process. To make the remainder of the ...
[81]
AddingSyscalls - FreeBSD Wiki
Oct 1, 2025 · The process of adding syscalls to FreeBSD has slowly grown more complex over the years. New features such as audit have added extra fields to the syscalls. ...
[82]
Linux System Call Table for x86 64 - Ryan A. Chapman
Nov 29, 2012 · 64-bit x86 uses syscall instead of interrupt 0x80. The result value will be in %rax. To find the implementation of a system call, grep the kernel tree for ...
[83]
Direct Operating System Access via Syscalls - UAF CS
You identify which system call you'd like to make by loading a syscall number into register rax. A full list of syscall numbers is here, or on /usr/include/asm/ ...
[84]
syscalls.master « kern « sys - src - FreeBSD source tree
... syscall( int number, ... ); } 1 AUE_EXIT STD|CAPENABLED|NORETURN { void ...
[85]
fanotify(7) - Linux manual page - man7.org
An fanotify notification group is a kernel-internal object that holds a list of files, directories, filesystems, and mounts for which events shall be created.
[86]
Linux_2_6_36 - Linux Kernel Newbies
Summary: Linux 2.6.36 includes support for the Tilera architecture, a new filesystem notification interface called fanotify, a redesign of workqueues optimized ...
[87]
Zircon System Calls | Fuchsia
Summary of Fuchsia System Call Interface
[88]
Rust - Fuchsia
Fuchsia uses four GN target templates for Rust projects: library, binary, test, and macro. Fuchsia Rust targets are not built with cargo.
[89]
strace
It is used to monitor and tamper with interactions between processes and the Linux kernel, which include system calls, signal deliveries, and changes of process ...
[90]
ftrace - Function Tracer - The Linux Kernel documentation
Ftrace is an internal tracer designed to help out developers and designers of systems to find what is going on inside the kernel.
[91]
strace(1) - Linux manual page - man7.org
It intercepts and records the system calls made by a process and the signals a process receives. The name of each system call, its arguments, and its return ...
[92]
truss Command - Oracle Solaris
The `truss` command checks if a process is hung, and can be used to find out about timings of each system call executed by the process.
[93]
Instrumenting Your Code with ETW | Microsoft Learn
May 16, 2022 · Event Tracing for Windows (ETW) is a high speed tracing facility built into Windows. Using a buffering and logging mechanism implemented in the operating ...
[94]
Microsoft.Windows.EventTracing.Syscalls 1.12.10 - NuGet
Provides a set of APIs to process syscall data in Event Tracing for Windows (ETW) traces (.etl files) in .NET. Consider using Microsoft.Windows.EventTracing.
[95]
Kernel Probes (Kprobes) - The Linux Kernel documentation
Kprobes enables you to dynamically break into any kernel routine and collect debugging and performance information non-disruptively.
[96]
Linux eBPF Tracing Tools - Brendan Gregg
Dec 28, 2016 · Linux eBPF tracing tools, showing static and dynamic tracing with extended BPF and the open source BCC collection of tools.
[97]
bpftrace/bpftrace: High-level tracing language for Linux - GitHub
bpftrace is a high-level tracing language for Linux. bpftrace uses LLVM as a backend to compile scripts to eBPF-bytecode and makes use of libbpf and bcc.
[98]
draios/sysdig: Linux system exploration and troubleshooting tool ...
Sysdig is a simple tool for deep system visibility, with native support for containers. The best way to understand sysdig is to try it - its super easy!
[99]
strace Wow Much Syscall - Brendan Gregg
May 11, 2014 · WARNING: Can cause significant and sometimes massive performance overhead, in the worst case, slowing the target application by over 100x. · Can' ...
[100]
Trace Linux System Calls with Least Impact on Performance - TiDB
Dec 24, 2020 · As the benchmark shows, strace caused the biggest decrease in application performance. perf-trace caused a smaller decrease, and traceloop ...