Calling convention

In computing, a calling convention is a standardized set of rules that governs how one piece of code transfers control to a subroutine or function, passes arguments to it, handles the return of values, and manages the program's stack and registers to ensure correct execution. These conventions specify the order and mechanism for passing parameters, typically placing the first few arguments in registers (e.g., up to six or eight, depending on the architecture) before spilling the rest to the stack, and define where return values are placed, often in specific registers such as %rax on x86-64 or x10 (a0) on RISC-V. They also set register preservation rules, distinguishing caller-saved registers (which the calling code must save if it needs them after the call) from callee-saved registers (which the called function must restore before returning), so that values are not clobbered unexpectedly across call boundaries. Stack management is another core aspect, including alignment requirements (e.g., 16-byte boundaries), the direction of stack growth (usually downward), and responsibility for cleanup, that is, whether the caller or the callee adjusts the stack pointer after the call.
Calling conventions are architecture-specific and are usually defined by operating systems and application binary interfaces (ABIs). Prominent examples include the System V AMD64 ABI for Unix-like systems on x86-64, the Microsoft x64 convention for Windows, and the ARM Procedure Call Standard for ARM processors. Their primary purpose is interoperability: code compiled by different compilers or written in different languages can link and execute correctly, while register passing keeps unnecessary memory accesses to a minimum. Violations can cause runtime errors such as stack corruption or incorrect parameter values, which makes adherence essential in low-level work such as assembly programming and systems development.

Introduction

Definition and Purpose

A calling convention is a standardized set of rules that specifies how subroutines or functions receive parameters from their callers, manage the stack and registers, and return values, so that separately compiled modules of code can interoperate. These conventions define the interface for function invocation at the machine level, governing aspects such as argument passing locations and the preservation of caller state. Their primary purpose is to provide a consistent linkage mechanism across different compilers, assemblers, and tools, thereby preventing errors such as stack corruption or incorrect register access. They are essential for generating correct machine code that operates reliably in mixed-language environments or with external libraries. By standardizing these low-level interactions, calling conventions enable binary compatibility, allowing object files from disparate sources to link successfully without recompilation.
Calling conventions emerged in the 1970s alongside early compilers, particularly with architectures like the PDP-11, where initial subprogram calling rules were formalized to support efficient subroutine invocation and reentrancy. Over time, they evolved to accommodate optimizations, portability across platforms, and advanced features like dynamic linking. Key benefits include predictable stack frame layouts, support for debugging and disassembly tools, and the integration of inline assembly within high-level languages.
The application binary interface (ABI) defines the low-level conventions for how compiled programs interact with the operating system, libraries, and other binaries on a specific platform, encompassing aspects such as calling conventions, data type representations, and memory alignments.
Calling conventions form a key subset of the ABI, dictating how function arguments are passed (e.g., via registers or stack) and results are returned, while the broader ABI also specifies the sizes, alignments, and layouts of fundamental data types like integers and structures to ensure binary compatibility across modules. For instance, in the System V ABI for x86-64, integers are typically 32 or 64 bits with natural alignment, whereas floating-point types may require specific padding to match hardware expectations.
In contrast, the application programming interface (API) operates at a higher level, providing a source-code interface for developers to interact with software modules, libraries, or systems without concern for underlying binary details. While an API might define function signatures and behaviors in a language such as C, it does not specify binary-level details such as register usage or object layouts, which are instead governed by the ABI to enable linking of independently compiled code. Source-code portability via an API therefore does not imply binary compatibility: a change in the ABI (e.g., due to a compiler update) can break existing executables even if the API remains unchanged.
The Procedure Linkage Table (PLT) and Global Offset Table (GOT) are critical components in dynamic linking for formats like ELF, enabling runtime resolution of external function and data addresses in shared libraries. The PLT serves as a trampoline for indirect calls: an entry initially redirects to a stub that fills in the GOT with the actual address on first invocation, which supports lazy binding. These mechanisms depend on the platform's calling conventions to pass control and parameters correctly during resolution; for example, in the System V ABI, calls to PLT entries use the standard register-based parameter passing, and the dynamic linker's resolver runs without disturbing the caller's arguments or stack frame.
Inline assembly and Foreign Function Interfaces (FFI) rely on calling conventions to integrate low-level code with high-level languages, ensuring that function calls across language boundaries adhere to the expected ABI. In FFI, high-level languages like Rust or Haskell declare foreign functions against the C ABI (for example with extern "C" in Rust), which specifies parameter passing and return mechanisms; in Rust, such calls are wrapped in unsafe blocks because the compiler cannot check memory or threading guarantees on the foreign side. Similarly, inline assembly in compilers like GCC or LLVM requires explicit adherence to the calling convention (e.g., preserving callee-saved registers and aligning the stack) to avoid corrupting the caller's state when embedding assembly snippets directly in C code.
Debugging formats such as DWARF and PDB incorporate calling convention details to facilitate stack unwinding and symbol resolution during runtime or post-mortem analysis. DWARF's Call Frame Information (CFI) encodes rules derived from the ABI, such as register mappings and stack adjustments at each instruction, so that debuggers like GDB can reconstruct call stacks by virtually restoring registers and stack pointers across frames. On Windows x64, table-based unwind data is stored in the executable image itself (the PE exception directory), while PDB files supply the symbol and type information debuggers need to locate variables, so accurate backtraces remain possible even in optimized binaries.

Types of Calling Conventions

Calling conventions can be classified by their approach to parameter handling, which determines how arguments are passed between functions. Stack-based conventions push all parameters onto the call stack, offering simplicity and compatibility across varying numbers of arguments but incurring overhead from memory accesses. Register-based conventions pass the first few arguments in CPU registers to minimize memory operations and improve speed, particularly for small argument counts, while spilling excess parameters to the stack. Hybrid conventions combine these methods, using registers for initial parameters and the stack for overflow, as seen in modern 64-bit systems such as Windows x64, where up to four arguments are register-passed before the stack is used.
Another key classification distinguishes conventions by the responsibility for stack cleanup after parameter passing. In caller-cleanup conventions, such as __cdecl, the calling function adjusts the stack pointer to remove arguments after the call, enabling variable-argument functions like printf but adding code size due to repeated cleanup instructions. Conversely, callee-cleanup conventions, exemplified by __stdcall, require the called function to clean the stack using a fixed argument count known at compile time, which reduces caller overhead and executable size for APIs with consistent signatures, such as the Windows API.
Performance-optimized variants like fastcall and vectorcall extend standard conventions by prioritizing register usage for specific argument types. Fastcall passes the first two integer parameters in registers (ECX and EDX on x86) to accelerate short calls, with remaining arguments on the stack, though its benefits diminish in 64-bit environments where registers are more abundant. Vectorcall builds on fastcall by using SSE/AVX registers (XMM0-XMM5 or YMM0-YMM5) for up to six vector arguments, reducing stack traffic for SIMD-heavy code and allowing certain aggregates larger than eight bytes to be passed by value in registers, which improves throughput in vectorized computations.
Conventions also vary in support for position-independent code (PIC), which allows relocatable binaries without address fixes at load time. Position-dependent conventions assume fixed load addresses, simplifying direct jumps but limiting shared libraries. Position-independent variants, as in the System V ABI, employ IP-relative addressing or a Global Offset Table (GOT) accessed via a dedicated register such as %r15 for indirect calls, enabling dynamic linking while maintaining calling efficiency across small (IP-relative only), medium (with large data offsets), or large (full 64-bit GOT) code models.
The evolution of calling conventions reflects architectural shifts from memory-constrained early systems to performance-oriented designs. Early conventions, prevalent in 16-bit x86 environments, relied heavily on stack-based models with limited registers, prioritizing simplicity for segmented memory. Over time, as RISC architectures emphasized register abundance, conventions moved to register-heavy approaches, such as RISC-V's use of eight argument registers (a0-a7) for integers and fa0-fa7 for floats, reducing latency and aligning with compiler optimizations in 64-bit and embedded systems.

Core Elements

Parameter Passing Methods

Parameter passing methods in calling conventions define how arguments are transferred from the caller to the callee, ensuring interoperability between functions compiled by different tools. These methods balance efficiency and flexibility, with choices influencing performance and memory usage. Common techniques include passing data by value, by reference, or specialized variants for output parameters, each implemented via registers or the stack as per the convention's rules.
In pass-by-value, the caller copies the argument's value into a location accessible to the callee, such as a register or stack slot, preventing modifications to the original data. This approach is straightforward and isolates the function's effects but can be inefficient for large data types due to copying overhead. For instance, in languages like C, primitive types like integers are typically passed by value.
Pass-by-reference, also known as pass-by-pointer, involves the caller providing the address of the argument, allowing the callee to access and potentially modify the original data without copying. This method is more efficient for large or complex structures, as only the address (often fitting in a register) is transferred, though it introduces risks such as unintended aliasing, where multiple parameters refer to the same memory. It is prevalent in C-like languages for modifiable parameters.
For output parameters, pass-by-result and pass-by-copy-restore provide mechanisms to return modified values to the caller. In pass-by-result, the callee writes to a caller-allocated location without initial copying, which suits pure outputs but requires care to avoid reading the location before it is initialized. Pass-by-copy-restore, or value-result, copies the argument into the callee's scope at entry, allows modifications, and copies the result back upon exit; this handles inputs that may be overwritten but incurs double-copy overhead and can lead to order-dependent behavior in multi-parameter calls. These variants appear in languages like Ada for in-out parameters.
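The three behaviours described above can be illustrated with a minimal C sketch. The function names are hypothetical, and copy-restore is emulated by hand here because C itself offers only pass-by-value (pointers included).

```c
#include <assert.h>
#include <stddef.h>

/* Pass-by-value: the callee works on a copy, so the caller's variable
   is left untouched. */
int increment_by_value(int x) {
    x += 1;              /* modifies only the local copy */
    return x;
}

/* Pass-by-reference, spelled in C as pass-by-pointer: the callee receives
   the address and can modify the caller's object directly. */
void increment_by_pointer(int *x) {
    assert(x != NULL);
    *x += 1;
}

/* Copy-restore (value-result), emulated manually: copy in at entry,
   operate on the copy, copy back out at exit. */
void increment_copy_restore(int *x) {
    assert(x != NULL);
    int local = *x;      /* copy in */
    local += 1;
    *x = local;          /* copy out */
}
```

With `int a = 5;`, calling `increment_by_value(a)` returns 6 and leaves `a` at 5, while either of the other two functions called with `&a` changes `a` itself.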
When passing structures or aggregates, conventions distinguish between small and large types to optimize transfer. Small structs, often those fitting within one or two registers (e.g., up to 16 bytes in some standards), are passed by value directly in registers for speed, while larger ones are passed by reference or on the stack to avoid excessive copying. Alignment rules typically require padding to natural boundaries (e.g., 8 bytes), adding minor memory overhead but ensuring efficient access. For example, on a 64-bit ABI, a 4-byte struct might occupy a full 8-byte slot.
Variadic arguments, as in C's ellipsis notation (...), follow hybrid rules. Fixed parameters use the standard method, and in stack-based conventions the additional arguments are placed on the stack; register-based ABIs such as System V x86-64 still pass variadic arguments in registers, and the callee saves them to memory so they can be walked uniformly. The callee accesses them via macros like va_start and va_arg, which rely on the ABI-defined layout for traversal, often with alignment to the word size. This accommodates variable counts but complicates optimization because the argument types are unknown to the callee at compile time.
Trade-offs in these methods center on speed versus flexibility: register-based passing (for values or pointers) minimizes latency compared to stack pushes, which involve memory operations, but registers are limited, forcing spills for many arguments. By-value suits small, immutable data and avoids aliasing risks, while by-reference or copy variants enable modifications at the cost of potential overhead; copying a large struct by value, for example, might double memory traffic. Alignment padding, though small (typically 0-7 bytes per argument), accumulates in calls with aggregates. Overall, conventions prioritize register use for the first few parameters and fall back to the stack only when necessary.
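As a small illustration of variadic access, the following C sketch sums a caller-specified number of int arguments with the standard <stdarg.h> macros. The function name sum_ints is hypothetical; the explicit count parameter is needed because the calling convention itself carries no information about how many variadic arguments were passed.

```c
#include <assert.h>
#include <stdarg.h>

/* Sums `count` int arguments. Where each argument physically lives
   (register save area or stack) is decided by the platform ABI and is
   hidden behind va_arg. */
long sum_ints(int count, ...) {
    assert(count >= 0);
    va_list ap;
    long total = 0;
    va_start(ap, count);             /* start after the last fixed parameter */
    for (int i = 0; i < count; i++) {
        total += va_arg(ap, int);    /* the type must match what the caller passed */
    }
    va_end(ap);
    return total;
}
```

A call such as `sum_ints(8, 1, 2, 3, 4, 5, 6, 7, 8)` passes more arguments than fit in the integer argument registers of common 64-bit ABIs, so some of them travel on the stack, yet the callee's code is the same either way.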

Stack and Register Usage

In calling conventions, the stack frame represents the memory allocation on the call stack for a function invocation, typically including the return address, saved registers, local variables, and sometimes parameters if not fully passed in registers. The return address stores the location at which to resume execution in the caller after the function completes, while saved registers preserve the caller's values for callee-saved registers. Local variables occupy space allocated according to the function's needs, and parameters beyond the initial register set may spill onto the stack. This layout isolates function invocations from one another and facilitates unwinding for debugging or exception handling. A typical stack frame layout, growing downward from higher to lower addresses, can be represented as follows:
Higher addresses
+-------------------+
| Caller parameters |  (including spills if exceeding registers)
+-------------------+
| Return address    |  <- Pushed by CALL instruction
+-------------------+
| Saved registers   |  <- e.g., frame pointer (optional)
+-------------------+
| Local variables   |  <- Allocated in prologue
+-------------------+
| Spill area / Temp |  <- For additional data if needed
+-------------------+  <- %rsp (stack pointer)
Lower addresses
This structure varies slightly by platform but maintains the core components for runtime management. For instance, in the PowerPC ABI, the stack frame header includes a back chain to the previous frame, a condition register save area, a link register (return address) save area, and additional metadata.
The function prologue establishes the stack frame at entry, typically by saving the previous frame pointer, setting the current frame pointer, and allocating space for local variables, while the epilogue reverses these steps at exit to restore the stack and return control. A common sequence in x86 conventions involves instructions like push ebp to save the caller's base pointer, mov ebp, esp to establish the new frame pointer, and sub esp, N to allocate local space in the prologue; the epilogue then uses mov esp, ebp, pop ebp, and ret to deallocate and return. These operations ensure the stack pointer is adjusted correctly and registers are preserved, with multiple epilogues possible for functions with early returns.
Register preservation rules divide registers into caller-saved (volatile) and callee-saved (non-volatile) categories to balance efficiency and reliability across calls. Caller-saved registers, used for temporaries like intermediate computations, may be freely modified by the callee without restoration, placing the burden on the caller to save them if needed before the call. In contrast, callee-saved registers hold longer-lived values and must be preserved by the callee, which saves them in the prologue (often on the stack) and restores them in the epilogue. This minimizes overhead: callers avoid saving registers they will not use after the call, while callees save only those they modify. For example, in the x86-64 conventions, registers like RAX and RCX are caller-saved, while RBX, RBP, and R12-R15 are callee-saved.
Some conventions incorporate a red zone, an area immediately below the stack pointer that is reserved for fast temporary storage without explicit allocation, enabling optimizations by avoiding prologue adjustments for small locals.
In the System V x86-64 ABI, the red zone spans 128 bytes and is left untouched by asynchronous events like signals, allowing leaf functions to use it directly for efficiency. However, its use is platform-specific and requires awareness to prevent conflicts with interrupt handlers.
Stack pointer alignment requirements ensure good performance for vector operations and SIMD instructions, typically mandating that the stack pointer be aligned to a 16-byte boundary at the point of each call and maintained throughout, except within prologues and epilogues or in leaf routines. Misalignment can incur penalties, such as additional cycles for unaligned accesses, so conventions enforce this via prologue adjustments (e.g., subtracting a suitably rounded frame size from the stack pointer). In 64-bit systems like x86-64, this 16-byte rule supports efficient use of 128-bit XMM registers, with higher alignments (e.g., 32 bytes) for 256-bit operations in some extensions.
To mitigate buffer overflows that could corrupt frames, many toolchains add protection mechanisms like stack canaries (or security cookies), which insert a random value between locals and sensitive components such as the return address. The prologue places this value on the stack, and the epilogue verifies it before returning; a mismatch triggers termination to prevent exploitation. Enabled via flags like Microsoft's /GS or GCC's -fstack-protector, this adds minimal runtime overhead while protecting against buffer overruns in vulnerable functions.
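The alignment arithmetic a prologue relies on can be sketched in a few lines of C. The helper names align_up and is_aligned are hypothetical; they only model the rounding, not an actual prologue.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rounds `size` up to the next multiple of `alignment`, which must be a
   power of two (16 for the stack rule discussed above). */
size_t align_up(size_t size, size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (size + alignment - 1) & ~(alignment - 1);
}

/* Returns nonzero if `addr` is a multiple of `alignment` (a power of two). */
int is_aligned(uintptr_t addr, size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (addr & (alignment - 1)) == 0;
}
```

For example, a function needing 40 bytes of locals would reserve `align_up(40, 16)`, i.e. 48 bytes, so that the stack pointer stays on a 16-byte boundary for any calls it makes.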

Function Return Mechanisms

In calling conventions, the return address is typically recorded on the caller's side immediately before control transfers to the callee: on x86 the CALL instruction pushes it onto the stack, while architectures such as ARM place it in a link register. Upon completion, the callee restores control to the caller by retrieving this address (popping it from the stack where applicable) and branching to it, so that execution resumes at the instruction following the original call. This mechanism maintains the integrity of the program's control flow across function boundaries.
Scalar return values, such as integers or small floating-point numbers that fit within a single register or a pair of registers, are conventionally returned in designated registers to minimize overhead and enable efficient access by the caller. For instance, integer scalars are placed in general-purpose registers, while floating-point scalars use dedicated floating-point or vector registers.
For larger or composite return values, such as structures exceeding a certain size threshold (often 128 bits or more), calling conventions require the caller to allocate memory in advance and pass a hidden pointer to this location as an implicit argument. The callee then writes the result directly to this caller-provided buffer, and may return the pointer itself in a register to inform the caller of the location. This "pass-by-reference" approach to returns avoids the inefficiency of copying large data on the stack or in registers. Void-returning functions, by contrast, transfer control back to the caller without producing any value, simply executing the return mechanism to retrieve the return address and branch.
Exception handling integrates with calling conventions through defined stack unwinding procedures, where runtime systems use frame information (such as unwind tables or equivalent ABI-specified metadata) to traverse the call stack, invoke destructors for local objects, and propagate exceptions to appropriate handlers. This ensures that resources are properly cleaned up during error conditions without violating the convention's stack discipline.
Tail call optimization further enhances efficiency by allowing a function's final call to another function to reuse the current stack frame, effectively replacing the call with a direct jump, provided the callee adheres to the same convention and no conflicting operations (like large returns or exception setup) are required; this is particularly useful for recursive algorithms, where it prevents stack overflow.
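The hidden-pointer mechanism for large return values can be pictured at the source level. In the C sketch below, the type Matrix2x4 and both function names are hypothetical; the second function is a hand-written approximation of what a compiler typically generates for the first, not the output of any particular compiler.

```c
#include <assert.h>
#include <stddef.h>

/* A struct too large for the return registers of common ABIs. */
struct Matrix2x4 { double m[8]; };

/* Source-level form: the struct is returned by value. */
struct Matrix2x4 make_filled(double v) {
    struct Matrix2x4 r;
    for (size_t i = 0; i < 8; i++) r.m[i] = v;
    return r;
}

/* Approximate lowered form: the caller supplies the storage through an
   extra pointer argument, and the callee writes the result there and
   returns the same pointer. */
struct Matrix2x4 *make_filled_lowered(struct Matrix2x4 *out, double v) {
    assert(out != NULL);
    for (size_t i = 0; i < 8; i++) out->m[i] = v;
    return out;
}
```

Both forms leave the same bytes in the caller's object; the difference is only whether the extra pointer is visible in the source or inserted by the compiler.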

Platform Variations

Multiple Conventions per Platform

Platforms support multiple calling conventions to accommodate legacy software, optimize for specific use cases such as operating system kernels versus user-space applications, and facilitate interoperability between different programming languages and libraries. Legacy support is particularly crucial on Windows, where older applications developed under 16-bit and early 32-bit environments rely on conventions like __cdecl for compatibility with existing binaries and dynamic link libraries (DLLs). Optimization differences arise because conventions vary in stack management and register usage; for instance, __stdcall reduces overhead in frequent calls by having the callee clean the stack, making it suitable for API functions, while __fastcall prioritizes speed by passing initial arguments in registers like ECX and EDX on x86. Language interoperability is enhanced by allowing conventions tailored to language runtimes, such as those bridging C++ and COM components on Windows.
On Windows x86 (32-bit), the platform supports several conventions including __cdecl (default for C/C++), __stdcall (used for Win32 API calls), and __fastcall (for performance-critical functions with few arguments). In contrast, Linux primarily adheres to the System V ABI, which defines a single dominant convention for user-space applications, passing up to six integer or pointer arguments in registers (RDI, RSI, RDX, RCX, R8, R9 on x86-64), but includes alternatives for system calls and legacy 32-bit code using cdecl-like behavior. These alternatives on Linux often emerge in cross-compilation scenarios or when interfacing with Windows binaries via tools like Wine, where compatibility layers handle convention translations.
Developers select conventions explicitly in code using compiler-specific keywords or attributes to ensure compatibility. In Microsoft Visual C++ (MSVC), keywords like __stdcall or __fastcall are placed in function declarations, such as int __stdcall MyFunction(int a);, directing the compiler to generate code adhering to the specified stack and register rules.
GCC and Clang provide similar functionality through the stdcall or fastcall attributes, or via command-line flags like -mregparm=3 for register-based passing, allowing fine-grained control within the same binary. This selection mechanism enables mixing conventions in a single program, such as using __stdcall for API-facing functions while leaving internal functions on the platform default.
Interoperability challenges arise when conventions mismatch between caller and callee, often leading to stack corruption, incorrect parameter values, or application crashes due to improper stack pointer adjustments. For example, when code that assumes __cdecl calls a __stdcall function, the callee removes the arguments on return and the caller then removes them a second time, leaving the stack pointer misadjusted and corrupting subsequent operations. Reverse engineering tools like disassemblers (e.g., IDA Pro) aid in detection by analyzing prologue/epilogue code patterns, such as the presence of RET n instructions indicating fixed stack cleanup sizes in __stdcall versus caller-side cleanup in __cdecl.
The evolution of calling conventions on platforms reflects a shift from rigid, architecture-fixed standards in early systems to selectable, flexible models for modern embedded and desktop environments. Early 16-bit x86 systems enforced conventions tied to segment registers for simplicity, but 32-bit transitions introduced multiple conventions to support diverse software and reduce migration friction. By the 64-bit era, platforms like Windows consolidated to a single fastcall-like convention for efficiency, while Linux's System V ABI emphasized register usage to minimize stack pressure, yet retained options for legacy code via compiler flags. This progression prioritizes compatibility in desktop OSes alongside optimization in resource-constrained embedded systems, where selectable conventions allow tailoring to hardware constraints like limited registers.
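The effect of a cleanup mismatch can be modelled with simple stack-pointer bookkeeping. The following C sketch is a toy model under stated assumptions (32-bit x86, a 4-byte return address, a hypothetical helper named esp_drift); it executes no real calls and only tracks how far ESP ends up from its starting value.

```c
#include <assert.h>
#include <stdint.h>

enum cleanup { CALLER_CLEANS /* cdecl-style */, CALLEE_CLEANS /* stdcall-style */ };

/* `callee` is the rule the callee was compiled with; `assumed` is the rule
   the caller believes applies. Returns final ESP minus initial ESP:
   0 means balanced, nonzero means the stack pointer has drifted. */
int32_t esp_drift(uint32_t arg_bytes, enum cleanup callee, enum cleanup assumed) {
    assert(arg_bytes % 4 == 0);
    int32_t esp = 0;
    esp -= (int32_t)arg_bytes;                                /* caller pushes arguments */
    esp -= 4;                                                 /* CALL pushes return address */
    esp += 4;                                                 /* RET pops return address */
    if (callee == CALLEE_CLEANS) esp += (int32_t)arg_bytes;   /* RET n in the callee */
    if (assumed == CALLER_CLEANS) esp += (int32_t)arg_bytes;  /* add esp, n in the caller */
    return esp;
}
```

With 8 bytes of arguments, matching rules give a drift of 0, a cdecl caller invoking a stdcall callee gives +8 (the arguments are removed twice), and the opposite mismatch gives -8 (they are never removed).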

ABI Integration

Calling conventions are integral components of Application Binary Interfaces (ABIs), which define the low-level interface between applications and the operating system, including how data is laid out in memory, how system calls are invoked, and how dynamic linking resolves symbols at runtime. In ABIs, calling conventions specify the mechanics of function invocation, such as argument passing and frame construction, ensuring that compiled code from different compilers or languages can interoperate within the same ecosystem. For instance, data layout rules in an ABI dictate padding and alignment for structures passed via calling conventions, preventing misalignment faults on hardware, while system calls follow a closely related register convention to keep user-space and kernel interactions consistent. Dynamic linking relies on these conventions to marshal arguments correctly when calling into shared objects, as mismatches can lead to errors like stack corruption.
Prominent platform-specific ABIs incorporate calling conventions with tailored rules for alignment and padding to optimize performance and hardware compatibility. The System V ABI, widely used in Unix-like systems such as Linux and BSD, mandates 16-byte stack alignment before function calls and classifies parameters into integer, floating-point, or memory categories, with padding added to structures for natural alignment (e.g., up to 16 bytes for vector types). In contrast, the Microsoft x64 ABI for Windows employs a similar 16-byte alignment but introduces "shadow space" (32 bytes reserved on the stack for the callee) and passes the first four integer parameters in RCX, RDX, R8, and R9, with floating-point arguments in XMM0-XMM3; padding for structures follows an 8-byte maximum in most cases, differing from System V's vector-aware rules. These differences arise from historical OS choices: System V emphasizes Unix portability, while Microsoft's design prioritizes integration with its libraries, affecting how padding is inserted for aggregate types in parameter passing.
ABI versioning ensures long-term compatibility by providing stability guarantees, particularly for critical interfaces like system calls, to prevent disruptions from evolving calling conventions. In Linux, the kernel commits to ABI stability for the syscall interface, meaning existing syscall numbers, argument layouts, and return conventions remain unchanged across kernel versions unless deprecated with advance notice, allowing binaries compiled years ago to execute without modification. This stability extends to user-space ABIs such as that of the C library, where calling conventions for public symbols are frozen to avoid breaking dependencies. Such guarantees contrast with less stable systems, where kernel updates might alter low-level conventions, but Linux's policy has supported decades of binary compatibility since the early 1990s.
Cross-ABI issues emerge when mixing code from libraries adhering to different conventions, such as System V and Microsoft x64 in multi-OS environments or hybrid binaries, often requiring wrappers or thunks to bridge mismatches in register usage and stack management. Thunks are small stubs that adapt argument passing, for example relocating parameters from Windows' RCX/RDX registers to System V's RDI/RSI, enabling libraries to interoperate without recompilation. In Windows, thunks facilitate calls between ARM64EC code and x64 code, handling register volatility differences and alignment, while GCC's ABI attributes (ms_abi and sysv_abi) let the compiler generate the adapting code for cross-convention calls. These mechanisms are essential in shared libraries, where symbols resolved at runtime might refer to functions under varying conventions, and they prevent crashes from misaligned stacks or lost arguments.
Post-2010 ABI extensions have incorporated support for advanced types like SIMD vectors and 128-bit integers to leverage modern hardware features without breaking compatibility.
The System V ABI supplements, updated around 2013-2015, define passing of 128-bit integers (__int128 in GCC) by splitting them across two consecutive general-purpose argument registers (e.g., RDI and RSI for a first argument) or placing them on the stack if registers are exhausted; such values are returned in the RAX and RDX pair. SIMD types such as 128-bit or 256-bit vectors (e.g., AVX) are passed in vector registers when available and in memory otherwise, with alignment of 16 or 32 bytes to match vector unit requirements. Microsoft's vectorcall extension to the x64 ABI additionally passes homogeneous aggregates of up to four floating-point or vector elements in XMM/YMM registers. These extensions, driven by the prevalence of SIMD in compute-intensive applications, keep data transfer efficient while maintaining backward compatibility through optional feature detection.
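The register shuffling performed by a cross-ABI thunk can be modelled in C. This is a toy sketch: struct regs and ms_to_sysv are hypothetical names, only the integer argument registers named in this section are represented, and a real thunk would be a few machine instructions rather than a C function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of the integer argument registers used by the two x86-64 conventions. */
struct regs { uint64_t rdi, rsi, rdx, rcx, r8, r9; };

/* Moves up to four integer arguments from their Microsoft x64 positions
   (RCX, RDX, R8, R9) to their System V positions (RDI, RSI, RDX, RCX).
   All sources are read first because RDX and RCX appear in both lists,
   so a naive in-place move would overwrite values still needed. */
void ms_to_sysv(struct regs *r, int nargs) {
    assert(r != NULL && nargs >= 0 && nargs <= 4);
    uint64_t a0 = r->rcx, a1 = r->rdx, a2 = r->r8, a3 = r->r9;
    if (nargs > 0) r->rdi = a0;
    if (nargs > 1) r->rsi = a1;
    if (nargs > 2) r->rdx = a2;
    if (nargs > 3) r->rcx = a3;
}
```

Starting from RCX=1, RDX=2, R8=3, R9=4, a four-argument translation leaves RDI=1, RSI=2, RDX=3, RCX=4, which is where a System V callee expects its first four integer arguments.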

Architectures

x86 (32-bit)

In 32-bit x86 architectures, calling conventions primarily rely on the stack for parameter passing, with the stack growing downward from high to low addresses and maintaining 4-byte alignment for efficiency. These conventions originated in the 1980s with early x86 compilers, evolving from 16-bit segmented memory models to flat 32-bit models in operating systems like Windows and Unix variants to ensure compatibility and performance. Function return values are typically placed in the EAX register for scalar types up to 32 bits, while larger or complex returns use a caller-provided pointer.
The cdecl convention, the default for C language functions on both Unix and Windows platforms, passes parameters on the stack from right to left, allowing support for variable arguments. The caller is responsible for cleaning the stack after the function returns, which can lead to slightly larger executables due to repeated cleanup code but provides flexibility for variadic functions. No registers are reserved for parameters in cdecl, and the callee preserves general-purpose registers such as EDI, ESI, EBP, and EBX across calls.
The stdcall convention, similar to cdecl in parameter ordering and stack-based passing, differs by having the callee clean the stack, which is beneficial for functions with a fixed number of arguments as it avoids redundant cleanup instructions. It is the standard for the Win32 API, enabling efficient calls to DLL exports by standardizing stack management and reducing code size in callers. Like cdecl, it does not use dedicated registers for parameters and maintains 4-byte stack alignment.
The fastcall convention optimizes for speed by passing the first two 32-bit parameters in the ECX and EDX registers, with any additional parameters pushed onto the stack from right to left; the callee handles stack cleanup. Microsoft and GCC variants align closely, though minor differences exist in naming conventions, such as Microsoft's use of leading "@" symbols in symbol names (e.g., @function@8).
This approach suits small, frequently called functions but does not support variable arguments.
The thiscall convention, specific to C++ member functions, passes the "this" pointer in the ECX register, with subsequent parameters on the stack from right to left, and the callee performing cleanup. It builds on fastcall principles for the implicit parameter and is the default for non-variadic C++ methods in Microsoft compilers, ensuring compatibility with object-oriented calling patterns. Return values follow the standard EAX usage, and the convention preserves key registers like EBX and EBP.
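The symbol-name decoration mentioned above (for example @function@8) can be sketched as a small C helper. The function decorate and the enum are hypothetical names; the patterns shown are the commonly documented 32-bit Microsoft C decorations (leading underscore for cdecl, _name@bytes for stdcall, @name@bytes for fastcall), and other toolchains may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

enum x86_conv { CONV_CDECL, CONV_STDCALL, CONV_FASTCALL };

/* Writes the decorated symbol for a C function `name` taking `arg_bytes`
   bytes of arguments into `buf`. Returns what snprintf returns, i.e. the
   length of the full decorated name. */
int decorate(char *buf, size_t len, const char *name,
             enum x86_conv conv, unsigned arg_bytes) {
    assert(buf != NULL && name != NULL);
    switch (conv) {
    case CONV_STDCALL:  return snprintf(buf, len, "_%s@%u", name, arg_bytes);
    case CONV_FASTCALL: return snprintf(buf, len, "@%s@%u", name, arg_bytes);
    case CONV_CDECL:
    default:            return snprintf(buf, len, "_%s", name);
    }
}
```

The byte count in the suffix is what lets a linker catch some mismatches early: a caller built against a stdcall prototype with two int parameters looks for a symbol ending in @8, and will not resolve against a definition that takes a different number of argument bytes.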

x86-64

The x86-64 architecture, also known as AMD64, introduced calling conventions optimized for 64-bit processing, leveraging expanded register sets to reduce stack pressure compared to 32-bit x86 designs. These conventions vary by platform, with the System V ABI predominant on Unix-like systems (Linux, macOS, BSD) and the Microsoft x64 ABI on Windows. Both prioritize register-based parameter passing for efficiency in 64-bit addressing, supporting larger memory spaces and improved performance through direct register utilization.
In the System V AMD64 ABI, the first six integer or pointer arguments are passed in registers RDI, RSI, RDX, RCX, R8, and R9, in left-to-right order, while the first eight floating-point arguments use XMM0 through XMM7. Additional arguments beyond these limits are passed on the stack in right-to-left order, with each occupying 8 bytes and the stack maintaining 16-byte alignment. Integer return values are placed in RAX (with RDX for multi-word results), and floating-point returns use XMM0 (or XMM0 and XMM1 for larger types).
The Windows x64 calling convention passes the first four integer or pointer arguments in RCX, RDX, R8, and R9, with the first four floating-point arguments in the lower 64 bits of XMM0 through XMM3. The caller must allocate 32 bytes of "shadow space" on the stack immediately before the call for these registers, allowing the callee to spill values there without further adjustment. Subsequent arguments are pushed onto the stack in right-to-left order, 8-byte aligned. Returns follow similar patterns to System V, with scalars in RAX or XMM0, though user-defined types that do not fit in 64 bits are returned through caller-allocated memory whose address is passed as a hidden first argument (in RCX) and returned in RAX.
The vector calling convention extends these ABIs to handle Advanced Vector Extensions (AVX) and beyond, passing up to six vector arguments (e.g., __m128 or __m256 types) in XMM0-XMM5 or YMM0-YMM5 on Windows, and up to eight in System V using XMM0-XMM7 or YMM0-YMM7.
For , it utilizes ZMM0ZMM7 (System V) or ZMM0ZMM3 (Windows), with stack alignment increased to 32 bytes for 256-bit vectors and 64 bytes for 512-bit. Homogeneous vector aggregates with four or fewer elements are passed in consecutive vector registers, while larger ones use references to avoid excessive register pressure. This convention, introduced in 2013 for compilers, enhances performance in vector-intensive applications like by minimizing stack spills. Both conventions enforce 16-byte stack alignment at the point of function calls, with the stack pointer (RSP) adjusted to maintain this invariant outside of function prologs and epilogs. The System V ABI includes a 128-byte "" below RSP, usable by leaf functions for temporary storage without explicit stack allocation, further optimizing for 64-bit code execution. These features support seamless 64-bit addressing, enabling and efficient handling of large address spaces. The calling conventions emerged with the AMD64 architecture's release in 2003, initially specified in AMD's programmer manuals and refined through platform-specific ABIs to capitalize on the extended (16 general-purpose registers) and 64-bit operations for reduced overhead in calls.
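The difference between the two x86-64 register assignment schemes can be sketched with a small model. This is an illustrative toy, not an ABI implementation: it handles only scalar `int` and `float` arguments, the function names `assign_sysv` and `assign_win64` are hypothetical, and the position-based selection on Windows follows Microsoft's published x64 convention rather than anything stated explicitly in the text above.

```python
"""Toy model of x86-64 argument placement for scalar arguments only."""

SYSV_INT = ["RDI", "RSI", "RDX", "RCX", "R8", "R9"]
SYSV_FP = [f"XMM{i}" for i in range(8)]
WIN_INT = ["RCX", "RDX", "R8", "R9"]
WIN_FP = [f"XMM{i}" for i in range(4)]


def assign_sysv(kinds):
    """Place each argument ('int' or 'float') under the System V scheme.

    Integer and floating-point registers are counted independently.
    """
    next_int = 0
    next_fp = 0
    placement = []
    for kind in kinds:
        if kind == "int" and next_int < len(SYSV_INT):
            placement.append(SYSV_INT[next_int])
            next_int += 1
        elif kind == "float" and next_fp < len(SYSV_FP):
            placement.append(SYSV_FP[next_fp])
            next_fp += 1
        else:
            placement.append("stack")
    return placement


def assign_win64(kinds):
    """Place each argument under the Windows x64 scheme.

    The argument's position (0-3) selects the register; later ones spill.
    """
    placement = []
    for position, kind in enumerate(kinds):
        if position >= 4:
            placement.append("stack")
        elif kind == "int":
            placement.append(WIN_INT[position])
        else:
            placement.append(WIN_FP[position])
    return placement


if __name__ == "__main__":
    signature = ["int", "float", "int", "float", "int"]
    print("System V:", assign_sysv(signature))
    print("Windows :", assign_win64(signature))
```

For a signature mixing integers and floats, the model shows System V filling both register classes densely, while Windows leaves gaps and sends the fifth argument to the stack.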

ARM (A32 and A64)

The ARM architecture employs distinct calling conventions for its 32-bit (A32, including Thumb) and 64-bit (A64) instruction sets, as defined by the Procedure Call Standard (PCS) family, which ensures interoperability between separately compiled subroutines. These conventions prioritize register-based parameter passing to leverage the reduced instruction set computing (RISC) design, minimizing stack usage for efficiency. The standards support both integer and floating-point operations, with variants for coprocessor extensions like VFP.

In the 32-bit A32 mode, the AAPCS specifies that the first four integer or pointer arguments are passed in registers R0 through R3, with any additional arguments placed on the stack in a full-descending manner, maintaining word alignment (SP mod 4 equals 0) and double-word alignment at public interfaces (SP mod 8 equals 0). Return values up to word size are placed in R0, while double-word values use R0 and R1, and larger composite types are returned via a memory location whose address is passed in R0. The return address is stored in the link register (LR, R14), which the callee must preserve or restore if needed, and the stack pointer (SP, R13) remains unchanged across the call except for local frame allocation.

For the 64-bit A64 mode, the AAPCS64 extends this model to accommodate wider registers, passing the first eight integer or pointer arguments in X0 through X7, with excess arguments on a 16-byte aligned stack. Vector and floating-point arguments use V0 through V7, supporting homogeneous aggregates of up to four floating-point members in consecutive V registers. Returns follow a similar pattern, with integer or pointer values in X0 and floating-point or vector results in V0, while larger types (e.g., those exceeding the available return registers) are returned indirectly via a hidden pointer in X8 to a caller-allocated memory block. The stack must maintain 16-byte alignment throughout public interfaces, and the frame pointer (FP, X29) and link register (LR, X30) aid in unwinding.

A variant of the AAPCS incorporates the Vector Floating-Point (VFP) extension, where floating-point arguments are passed in S0 through S15 (or D0 through D7 for doubles), with subsequent values on the stack; results are returned in the same register bank, starting at S0 or D0. The soft-float alternative instead uses integer registers for all parameters, ensuring compatibility without hardware floating-point support.

In Thumb mode, which uses 16-bit instructions for code density, the calling convention aligns closely with A32 but imposes restrictions on high register access (e.g., limited to R0–R7, SP, LR, and PC without extensions), requiring interworking via the BX instruction for state switches and careful stack limit checks using simplified instructions like ADD/CMP for frames under 256 bytes. These adjustments ensure symmetric support between Thumb and A32 states without altering core parameter passing rules.

The Procedure Call Standard for ARM originated as the ARM Procedure Call Standard (APCS) and Thumb Procedure Call Standard (TPCS) to facilitate subroutine linkage, with the modern AAPCS representing its fifth major revision, first publicly released in 2003 and updated through 2009 for 32-bit enhancements like VFPv3 support. For 64-bit, the AAPCS64 was introduced in the early 2010s to align with AArch64's expanded register set and alignment needs.
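The stack alignment rules for ARM (8 bytes at public interfaces under AAPCS, 16 bytes under AAPCS64) come down to simple arithmetic on a descending stack pointer. The following is a minimal sketch; the helper names `align_down`, `allocate_frame`, and `is_aligned` are hypothetical, and addresses are treated as plain non-negative integers.

```python
def align_down(sp, alignment):
    """Round a stack pointer down to a power-of-two alignment."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return sp & ~(alignment - 1)


def allocate_frame(sp, frame_bytes, alignment):
    """Return the new SP after reserving frame_bytes on a descending stack.

    The result is rounded down so the alignment invariant still holds.
    """
    return align_down(sp - frame_bytes, alignment)


def is_aligned(sp, alignment):
    """Check the invariant 'SP mod alignment equals 0'."""
    return sp % alignment == 0


if __name__ == "__main__":
    entry_sp = 0x8000
    # Reserve 20 bytes of locals under each alignment rule.
    print(hex(allocate_frame(entry_sp, 20, 8)))   # AAPCS public interface
    print(hex(allocate_frame(entry_sp, 20, 16)))  # AAPCS64
```

Rounding down rather than up is what keeps the reserved area large enough on a stack that grows toward lower addresses.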

RISC-V

The RISC-V calling convention, specified in the procedure call standard of the RISC-V processor-specific application binary interface (psABI), leverages the instruction set architecture's (ISA) modular nature to support base integer operations alongside optional extensions for floating-point and vector processing. This design enables flexible implementations across embedded devices and high-performance systems, prioritizing simplicity and extensibility without mandating features beyond the core ISA. The convention distinguishes between caller-saved (temporary) and callee-saved registers to minimize overhead in function calls, while ensuring compatibility with standard toolchains like GCC and LLVM.

In the integer-focused ABI for RV32I and RV64I, the first eight arguments are passed in general-purpose registers a0–a7 (corresponding to x10–x17), with any excess arguments placed on the stack. Temporary registers t0–t6 (x5–x7 and x28–x31) are caller-saved: a callee may overwrite them freely, so a caller that needs their values after a call must save and restore them itself. Callee-saved registers s0–s11 (x8–x9 and x18–x27) must be preserved by the callee. Function return values are placed in a0 (and a1 for composite types up to two words in size). When the floating-point extensions (F for single-precision or D for double-precision) are enabled, up to eight floating-point arguments are passed in fa0–fa7, with returns in fa0 (and fa1 if necessary); in soft-float mode without hardware support, these fall back to integer registers. The stack grows downward and maintains 16-byte alignment at the point of each call, with no red zone defined below the stack pointer; the ABI also accommodates the compressed instructions extension (RVC) to reduce code size in resource-constrained environments.

The psABI version 1.0 was ratified in November 2022, building on earlier drafts to standardize conventions for operating systems including Linux (with an appendix detailing ELF-specific rules) and Windows, ensuring binary portability across compliant implementations. RISC-V's emphasis on embeddability is reflected in the convention's lightweight design and avoidance of complex shadow spaces, making it suitable for low-power applications.

For the ratified vector extension (RVV 1.0), an optional variant calling convention covers the 32 vector registers v0–v31 for passing SIMD arguments and returns: v0 handles masks, v8–v23 carry vector arguments and return values in register groups scaled by LMUL (register group multiplier) and NFIELDS (tuple fields), and v1–v7 plus v24–v31 are callee-saved. Functions using this variant must be explicitly marked, with large vectors passed by reference if register capacity is exceeded.
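The mapping between architectural register numbers and ABI names described above can be written out as a small lookup. This is a sketch for illustration: the helper names `abi_name` and `saved_by` are hypothetical, and the entries for x0–x4 (zero, ra, sp, gp, tp) and their save classes come from the psABI register table rather than from the text above.

```python
def abi_name(x):
    """Return the ABI mnemonic for integer register x0..x31."""
    if not 0 <= x <= 31:
        raise ValueError("RISC-V has integer registers x0..x31")
    fixed = {0: "zero", 1: "ra", 2: "sp", 3: "gp", 4: "tp"}
    if x in fixed:
        return fixed[x]
    if 5 <= x <= 7:
        return f"t{x - 5}"    # t0-t2
    if 8 <= x <= 9:
        return f"s{x - 8}"    # s0-s1
    if 10 <= x <= 17:
        return f"a{x - 10}"   # a0-a7
    if 18 <= x <= 27:
        return f"s{x - 16}"   # s2-s11
    return f"t{x - 25}"       # x28-x31 are t3-t6


def saved_by(x):
    """Classify a register as 'caller'-saved, 'callee'-saved, or 'fixed'."""
    name = abi_name(x)
    if name in ("zero", "gp", "tp"):
        return "fixed"        # not available for general allocation
    if name == "sp" or name.startswith("s"):
        return "callee"
    return "caller"           # ra, t0-t6, a0-a7


if __name__ == "__main__":
    for reg in range(32):
        print(f"x{reg:<2} {abi_name(reg):<5} {saved_by(reg)}")
```

Running the script prints the full table, which makes the split of the temporaries and saved registers across two non-contiguous ranges easy to see.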

PowerPC and Power ISA

The PowerPC architecture, introduced in 1991 through the AIM alliance of Apple, IBM, and Motorola, established a reduced instruction set computing (RISC) design optimized for embedded systems, personal computers, and servers, with its calling conventions defined in the Embedded Application Binary Interface (EABI) supplement to the System V ABI. These conventions specify that integer and pointer parameters are passed in general-purpose registers (GPRs) r3 through r10, accommodating up to eight 32-bit arguments before spilling to the stack; floating-point parameters use floating-point registers (FPRs) f1 through f8 for up to eight single- or double-precision values. Return values for scalar integers and pointers are placed in r3, while floating-point results occupy f1. The link register (LR) holds the return address, which a non-leaf function must save before making calls of its own. This register-based approach minimizes stack usage for small functions, aligning with the architecture's emphasis on efficient branch and load/store operations.

In 64-bit mode, as extended in the 64-bit PowerPC ELF ABI, the conventions build on the 32-bit foundation but accommodate larger data sizes, with the stack pointer (r1) maintaining 16-byte (quadword) alignment to support atomic operations and SIMD extensions. Parameters beyond the eight GPR slots (r3–r10 for 64-bit integers and pointers) or the thirteen FPR slots (f1–f13 for floating-point values) are stored in a parameter save area on the stack starting at offset 48 from the current stack pointer in the original (ELFv1) frame layout, with doubleword alignment for subsequent arguments. Return mechanisms remain similar, with 64-bit scalars in r3 or f1, though aggregates larger than a doubleword are returned via a caller-allocated buffer whose address is passed in r3. The link register continues to manage returns and is saved at offset 16 in the caller's frame.

Variants of these conventions exist between operating systems, notably AIX and Linux distributions on PowerPC hardware, primarily differing in table of contents (TOC) handling for accessing global variables and functions. On AIX, GPR r2 serves as a dedicated TOC pointer, anchoring global data access within a 64 KB window, whereas Linux implementations (under the ELF ABI) use r2 as the TOC pointer similarly for 64-bit but employ a global offset table (GOT) approach in 32-bit mode and allow ELFv2 optimizations that reduce cross-module TOC saves and restores. Parameter passing aligns closely across both, using r3–r10 and f1–f8 (or f1–f13 on AIX and in 64-bit mode), but AIX handles structures and unions by passing them in GPRs or memory without FPR allocation, while Linux may pass addresses of memory copies for complex types. The architecture defaults to big-endian byte ordering for data storage and transmission, though 64-bit Power ISA implementations support bi-endian modes configurable at runtime.

For vector and SIMD processing via AltiVec (also known as VMX), up to 12 vector parameters are passed in vector registers v2 through v13, with returns in v2, integrating with the scalar conventions by reserving these 128-bit registers only when the function signature indicates vector usage. The Power ISA, evolving from the original PowerPC specification, has seen iterative updates through the 2020s (reaching version 3.1c in 2024) to enhance support for servers and AI accelerators, such as improved matrix math units in Power10 processors, while preserving core calling convention stability for backward compatibility.

MIPS

The MIPS architecture, originating in the early 1980s, employs several application binary interfaces (ABIs) that define its calling conventions, primarily tailored for reduced instruction set computing (RISC) principles in embedded and high-performance systems. The most foundational is the O32 ABI, which supports 32-bit operations and has been the standard since the architecture's inception with the MIPS R2000 processor in 1985. Subsequent extensions like N32 and N64 introduced 64-bit capabilities while maintaining backward compatibility where possible. These conventions emphasize register-based parameter passing for efficiency, with the stack used for overflow, and are characterized by big-endian byte ordering as the default.

In the O32 ABI, the first four integer or pointer arguments are passed in registers $a0 to $a3 (corresponding to $4 to $7), while leading floating-point arguments use $f12 and $f14 for single-precision values or the pairs $f12–$f13 and $f14–$f15 for double-precision values; additional arguments beyond these are placed on the stack in right-justified, 4-byte-aligned positions.[63] Return values follow a similar pattern: 32-bit integers or pointers in $v0 ($2), with $v1 ($3) used for a second word if needed (e.g., for 64-bit integers or pairs); single-precision floats return in $f0, and doubles in $f0–$f1. Structures and unions smaller than or equal to one word are returned in $v0, while larger ones require the caller to allocate space and pass its address in $a0, with the callee storing the result there and returning the address in $v0 (source: https://refspecs.linuxfoundation.org/elf/mipsabi.pdf). The stack grows downward from higher addresses and must be doubleword (8-byte) aligned, with each frame reserving space for at least 16 bytes of arguments plus saved registers such as $16–$23 ($s0–$s7, callee-saved), $30 ($fp, frame pointer), and $31 ($ra, return address). Doubles and long doubles require 8-byte alignment in memory, ensuring proper access without byte swapping in the big-endian environment.

The N32 and N64 ABIs extend O32 for 64-bit architectures (MIPS III and later), with N32 providing an ILP32 model (32-bit integers, longs, and pointers, with 64-bit long long values held in full 64-bit registers) and N64 using LP64 (32-bit integers, 64-bit longs and pointers). These support up to eight integer arguments in $4–$11 and eight floating-point arguments in $f12–$f19, regardless of mixing integer and FP types, surpassing O32's limit of four registers. Returns use $v0–$v1 for 64-bit integers or pairs, and $f0 and $f2 for floating-point values up to 128 bits; larger aggregates may use both general-purpose and FP registers or stack space. A key distinction is the treatment of the global pointer $gp ($28), which is callee-saved in N32/N64 (unlike caller-saved in O32) and points to the global offset table for position-independent code access to globals and dynamic symbols. Stack slots are 64-bit (8-byte) wide, with frames aligned to quadwords (16 bytes) for improved performance on 64-bit datapaths.

MicroMIPS, a compact extension to the MIPS32/64 ISAs, preserves the core register usage and ABI compatibility of O32, N32, and N64 but employs 16-bit and 32-bit instruction encodings to reduce code size, which is particularly beneficial in memory-constrained environments. This affects function prologues and epilogues by allowing compressed instructions like LWM16 and SWM16 for loading and storing multiple registers to and from the stack, ADDIUSP for immediate stack pointer adjustments, and JALRC16 for compact jump-and-link operations, which can shrink prologue and epilogue sequences by up to 50% in size compared to standard 32-bit MIPS encodings. Entry points remain 32-bit aligned for compatibility, though future MicroMIPS-specific ABIs could relax this to 16-bit alignment; relocation types are extended to handle smaller offset fields (e.g., 7-bit or 10-bit PC-relative). These optimizations do not alter parameter passing or return mechanisms but make saving and restoring $ra, $fp, and callee-saved registers in prologues and epilogues more compact.

MIPS calling conventions remain relevant in the 2020s for embedded applications, including networking devices like routers from vendors such as Cisco, where legacy MIPS-based systems continue to operate in resource-limited scenarios.

SPARC

The SPARC (Scalable Processor ARChitecture) calling convention, developed by Sun Microsystems in 1987 as part of its RISC design, leverages register windowing to enable efficient procedure calls by minimizing stack accesses for parameter passing and local variables. At any moment a program sees 32 general-purpose registers: eight global registers (%g) shared across all windows, plus a 24-register window consisting of eight input (%i), eight local (%l), and eight output (%o) registers. Adjacent windows overlap. The current window pointer (CWP) manages window shifts via SAVE and RESTORE instructions, allowing the caller's %o registers to become the callee's %i registers seamlessly, which supports faster procedure calls than architectures with static register allocation like MIPS. SPARC systems are big-endian, with stack frames aligned to 8 bytes in V8 and 16 bytes in V9, and the convention was integral to Sun's Solaris operating system until its decline in the 2010s amid shifts to x86 and ARM platforms.

In the SPARC V8 ABI (32-bit), the first six integer or pointer arguments are passed in %o0 through %o5, with subsequent arguments placed on the stack starting at offset 92 from the caller's stack pointer (%sp + 92, which the callee sees as %fp + 92). The lower part of the frame holds the 64-byte register window save area, a hidden structure-return word at offset 64, and a six-word home area for the register arguments at offset 68. Floating-point arguments use dedicated %f registers if applicable, but integers dominate register usage. The return value is placed in %i0 (the callee's view of the caller's %o0 after SAVE), while structures or larger types may use %i0 and %i1 or spill to the stack. Caller-saved registers include %o0–%o7 and %g1–%g5, requiring the caller to preserve them if needed, whereas the caller's %i0–%i7 and %l0–%l7 are preserved across the call, normally by the window mechanism itself rather than by explicit stores. Register windows enhance performance by allocating a new set on entry (via SAVE, which decrements CWP in V8), avoiding immediate stack spills for locals and parameters unless the window count (typically 8–32) is exhausted.

The V9 ABI (64-bit) extends V8 with full 64-bit registers and instructions, passing arguments identically in %o0–%o5 but treating them as 64-bit slots. The stack pointer is biased by 2047 bytes (the true frame address is %sp + 2047), frames are aligned to 16 bytes, and the minimum frame size is 176 bytes. The global registers (%g0–%g7) provide shared state, while %g6 and %g7 are reserved for operating system use. Return values follow V8 patterns but leverage 64-bit width, with integers in %i0; floating-point returns use %f0 or %d0 (a double-precision register pair). Window management remains central, with 3–32 windows (implementation-dependent via NWINDOWS), but now supports up to 160 physical registers for deeper call chains without traps. Unlike V8's 32-bit cells, V9 eliminates hidden parameter words and passes small structures in registers, improving efficiency.

Window overflow and underflow are handled via dedicated traps to maintain performance: when CANSAVE reaches zero during SAVE (no free windows), a spill trap (vectors 0x80–0xBF) saves a window's locals and inputs to the stack, updating window counters (CANSAVE, CANRESTORE, CLEANWIN, OTHERWIN) before retrying. Conversely, underflow on RESTORE (CANRESTORE = 0) triggers a fill trap (vectors 0xC0–0xFF), reloading from the stack into the prior window. These traps, managed by privileged software, keep the mechanism transparent to user code, with the FLUSHW instruction optionally flushing all but the current window for debugging or context switches. This dynamic spilling contrasts with static register schemes, enabling SPARC's high-performance procedure linkage in environments like early Solaris servers.
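The overlap between a caller's %o registers and its callee's %i registers can be illustrated with a small simulation. This is a deliberately simplified sketch: the class `RegisterWindows` and its methods are hypothetical names, globals are omitted, CWP follows the V8 direction (SAVE decrements it), and no overflow or underflow traps are modeled, so nesting deeper than the window count would silently wrap.

```python
class RegisterWindows:
    """Simplified SPARC-style windowed register file (no trap handling)."""

    def __init__(self, nwindows=8):
        self.nwindows = nwindows
        self.cwp = 0  # current window pointer
        # Each window owns eight ins and eight locals. Its outs are not
        # separate storage: they are the ins of the window SAVE moves to.
        self._ins = [[0] * 8 for _ in range(nwindows)]
        self._locals = [[0] * 8 for _ in range(nwindows)]

    def _bank(self, bank):
        if bank == "i":
            return self._ins[self.cwp]
        if bank == "l":
            return self._locals[self.cwp]
        if bank == "o":
            return self._ins[(self.cwp - 1) % self.nwindows]
        raise ValueError("bank must be 'i', 'l', or 'o'")

    def get(self, bank, index):
        return self._bank(bank)[index]

    def set(self, bank, index, value):
        self._bank(bank)[index] = value

    def save(self):
        """Enter a procedure: rotate to a fresh window."""
        self.cwp = (self.cwp - 1) % self.nwindows

    def restore(self):
        """Leave a procedure: rotate back to the caller's window."""
        self.cwp = (self.cwp + 1) % self.nwindows


if __name__ == "__main__":
    regs = RegisterWindows()
    regs.set("l", 0, 99)       # caller's local, untouched by the callee
    regs.set("o", 0, 42)       # caller places the first argument in %o0
    regs.save()                # callee prologue
    arg = regs.get("i", 0)     # callee reads the same value from %i0
    regs.set("i", 0, arg + 1)  # callee writes its return value to %i0
    regs.restore()             # callee epilogue
    print(regs.get("o", 0), regs.get("l", 0))
```

No data is copied on SAVE or RESTORE in this model; only the window pointer moves, which is the point of the design.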

Other Architectures

The calling convention for the IBM System/360 and its successors, used in mainframes since the architecture's introduction in 1964, employs a save-area-based approach augmented by base registers for addressing. In the standard OS linkage convention, parameters are passed via a parameter list pointed to by register R1, while R13 holds the address of the caller's save area for preserving registers across calls, R14 contains the return address, and R15 points to the callee's entry point. This mainframe-oriented design emphasizes reentrancy and multitasking, with save areas reached through base and displacement addressing to handle large address spaces.

For the SuperH (SH) architecture, particularly the Renesas SH-4 variant prevalent in embedded systems from the 1990s through the 2010s, the calling convention passes the first four 32-bit integer parameters in registers R4 through R7, with additional arguments placed on the stack following a 16-byte reserved home space. Return values are placed in R0, and the stack pointer is restored to its entry value upon function exit, supporting efficient leaf routines in resource-constrained environments. Floating-point parameters follow a similar scheme in non-variadic calls, using FR4–FR7.

The Motorola 68000 (68k) series, a cornerstone of legacy systems such as the early Macintosh from the late 1970s onward, primarily passes all parameters on the stack in right-to-left order, with the stack pointer A7 managing growth from high to low addresses. However, certain conventions and compilers utilize address registers A0 and A1 as caller-saved scratch registers for the first few pointer parameters, while A2–A5 serve as callee-saved locals and A6 as a frame pointer. Return values, such as pointers, are typically placed in A0.

In the IBM 1130, a 16-bit system from the 1960s designed for punch-card-based setups, subroutines are invoked via CALL statements followed by DC (define constant) directives to specify parameters, which are passed by reference using the system's limited registers: an accumulator and three index registers (XR1–XR3). This early convention reflects the era's focus on scientific computing with minimal resources.

These architectures share common traits in their calling conventions, including heavy reliance on memory for parameter passing due to relatively limited general-purpose registers compared to modern designs, which necessitates careful management of frames and save areas. Big-endian byte ordering predominates in many historical examples, such as System/360, 68k, and 1130, facilitating consistent multi-byte data handling in early mainframes and workstations, though exceptions such as SuperH's little-endian configurations exist.

Non-Architectural Conventions

Threaded Code

Threaded code represents a specialized execution model in interpreters where the program consists primarily of addresses pointing to executable fragments, rather than traditional machine instructions or bytecode. In this paradigm, "calls" are implemented as indirect jumps, eliminating the overhead of conventional subroutine invocation and return mechanisms. Calling conventions in threaded code thus revolve around maintaining interpreter state, particularly the instruction pointer (IP), across these jumps, typically by keeping the IP in a dedicated register rather than on the stack or in parameters. This approach contrasts with standard architectures by treating code as data and leveraging an inner interpreter to dispatch execution.

A core element of indirect threaded code, prevalent in Forth implementations, is the NEXT routine or macro, which orchestrates the dispatch. NEXT fetches the next code-field address (CFA) from the cell the IP points to, advances the IP to the subsequent cell, loads the execution address from that CFA, and performs the indirect jump to the routine (which may be a primitive or the entry code of another word). The IP is usually held in a register to minimize access latency during this loop, ensuring seamless progression through the thread without explicit parameter passing or stack-based returns for primitives. This setup enables highly efficient interpretation, as the overhead per instruction is limited to a single indirect dispatch.

In Forth-like systems, the calling convention aligns with a stack-based architecture, utilizing a data stack for operands and a separate control (return) stack for nesting levels and temporary storage during word execution. Operations do not receive traditional parameters; instead, they pop required values from the data stack, perform computations, and push results back, while the control stack manages nesting via saved IPs. This eliminates the need for fixed argument registers or stack frames, making the model lightweight for embedded and real-time applications.

Just-in-time (JIT) compilers extend threaded code principles dynamically, generating threaded sequences at runtime for optimized interpretation. For instance, some virtual machines employ a direct threaded interpreter, where opcodes map directly to addresses of assembly code snippets in a dispatch table, allowing the IP to jump straight to handlers without additional layers of indirection. The IP advances explicitly after each handler (e.g., by adding the instruction size to a register-held pointer), supporting rapid execution before JIT compilation kicks in for hot paths. This dynamic convention facilitates seamless transitions between interpreted and compiled modes.

The advantages of threaded code include significantly more compact representations due to storing only addresses rather than full instructions, which is crucial for memory-constrained environments like virtual machines. It also yields faster dispatch on modern hardware with branch predictors, achieving up to 5.6x speedups in benchmarks through optimizations like superinstructions that fuse common sequences. Threaded code originated in the 1970s with Charles H. Moore's development of Forth at the National Radio Astronomy Observatory, where indirect threading enabled efficient interpretation on limited hardware; it persists today in embedded scripting and VMs for its balance of density and performance.
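The dispatch loop of an indirect threaded interpreter can be sketched in a few dozen lines. This is an illustrative model only, not a real Forth: Python object references stand in for machine addresses, and the names `Word`, `VM`, `docol`, and the sample words `SQUARE` and `MAIN` are hypothetical. The `while` loop in `run` plays the role of NEXT.

```python
class Word:
    """A dictionary entry: a code field plus an optional thread (body)."""

    def __init__(self, name, code, body=None):
        self.name = name
        self.code = code        # routine that NEXT jumps to
        self.body = body or []  # thread of Words, for colon definitions


class VM:
    def __init__(self):
        self.data = []       # data stack: operands and results
        self.ret = []        # return stack: saved (thread, IP) pairs
        self.thread = []     # thread currently being interpreted
        self.ip = 0          # instruction pointer into the thread
        self.running = False

    def run(self, word):
        self.thread, self.ip, self.running = [word, HALT], 0, True
        while self.running:            # NEXT:
            w = self.thread[self.ip]   #   fetch the word the IP points at
            self.ip += 1               #   advance the IP
            w.code(self, w)            #   jump through the code field


def docol(vm, word):
    """Enter a colon definition: save the IP, switch to the word's thread."""
    vm.ret.append((vm.thread, vm.ip))
    vm.thread, vm.ip = word.body, 0


def do_exit(vm, word):
    """Leave a colon definition: restore the saved IP."""
    vm.thread, vm.ip = vm.ret.pop()


def do_lit(vm, word):
    """Push the inline literal that follows in the thread."""
    vm.data.append(vm.thread[vm.ip])
    vm.ip += 1


def do_add(vm, word):
    b, a = vm.data.pop(), vm.data.pop()
    vm.data.append(a + b)


def do_mul(vm, word):
    b, a = vm.data.pop(), vm.data.pop()
    vm.data.append(a * b)


def do_dup(vm, word):
    vm.data.append(vm.data[-1])


def do_halt(vm, word):
    vm.running = False


EXIT, LIT, HALT = Word("EXIT", do_exit), Word("LIT", do_lit), Word("HALT", do_halt)
ADD, MUL, DUP = Word("+", do_add), Word("*", do_mul), Word("DUP", do_dup)
SQUARE = Word("SQUARE", docol, [DUP, MUL, EXIT])
MAIN = Word("MAIN", docol, [LIT, 3, LIT, 4, ADD, SQUARE, EXIT])

if __name__ == "__main__":
    vm = VM()
    vm.run(MAIN)
    print(vm.data)
```

Note that no word receives arguments in the conventional sense: operands move through the data stack, and the only state saved on a nested call is the interpreter's own position.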

Language-Specific Examples

In PL/I, calling conventions are designed to support multitasking environments through descriptor-based parameter passing, particularly for variable-length data types like strings and arrays. Strings declared as CHARACTER(*) are passed using descriptors that include length information, allowing the callee to handle dynamic extents without fixed-size assumptions; this approach adds an extra descriptor argument to the call for each such parameter. Structures are passed by reference unless specified with the BYVALUE attribute, ensuring efficient handling of complex business data in multitasking scenarios where shared data is common. Separate linkage conventions govern entry points, using an 8-byte structure comprising a 4-byte descriptor pointer and a 4-byte pointer to manage nesting and external calls, compatible with systems like IBM z/OS.

Pascal and Ada employ stack frame-based calling conventions that incorporate static links to handle nested scopes, enabling inner procedures to access variables from enclosing scopes without global visibility. In Pascal, parameters are passed by value by default for simple types but by reference using the VAR keyword for modifiable aggregates, with static links (pointers to the enclosing procedure's stack frame) passed implicitly to support lexical scoping in nested functions. Ada extends this with explicit parameter modes: IN (default, by copy for scalars and by reference for large composites to avoid copying), OUT, and IN OUT, while nested subprograms use static chains or display registers to resolve enclosing references, which remain correct in concurrent or recursive calls. These mechanisms ensure correct and efficient resolution of non-local references in block-structured programs.

COBOL's calling conventions emphasize record-oriented data passing for business applications, particularly on mainframe systems, where the LINKAGE SECTION defines receiving areas for parameters from calling programs. Parameters are typically passed BY REFERENCE (default, for efficiency with large records), BY CONTENT (passing a copy so the caller's data cannot be modified), or BY VALUE, with records like 01-level groups in the LINKAGE SECTION aligned to match the caller's USING clause, supporting fixed-length fields for decimal arithmetic and packed data. z/OS variants use the GENERAL linkage convention for inter-program calls, including stored procedures, where COBOL records are mapped directly to parameter lists without additional descriptors, facilitating integration with assembler or other modules in enterprise environments.

In modern languages like Rust and Go, calling conventions adapt to runtime constraints and custom environments. Rust's no_std mode, used in bare-metal or embedded systems, allows developers to bypass the standard library, and attributes like #[unsafe(naked)] give direct control over register usage and stack frames without OS dependencies, while foreign interfaces still default to platform conventions like System V for interoperability. Go implements an internal ABI with runtime-managed stack growth, where the first few arguments are passed in registers (a convention introduced in Go 1.17) and the excess on the stack, to support goroutine preemption and garbage collection; the runtime dynamically resizes stacks during calls, copying frames as needed.

For inter-language foreign function interfaces (FFI), the C ABI serves as the lingua franca, standardizing parameter passing, symbol naming, and data layout across languages to enable integration. Rust exposes C-compatible interfaces via extern "C" declarations and Go via cgo, passing pointers and scalars by value or reference per the C convention and avoiding language-specific features like exceptions or generics during cross-boundary calls. This approach ensures portability, as seen in Rust's cbindgen tool generating C headers for FFI, or Go's cgo bridging to C libraries, prioritizing stability over performance optimizations unique to each language.
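The role of the C ABI as a common boundary can be seen from Python as well, using the standard library's `ctypes` module. The sketch below wraps a Python function in a C function pointer with the signature `int (*)(int, int)` and then calls it through that pointer, so both the call and the callback cross the platform's default C calling convention. The names `BinaryOp`, `py_add`, and `call_through_c_abi` are hypothetical; only `ctypes.CFUNCTYPE` and `ctypes.c_int` are real library names.

```python
import ctypes

# Prototype for `int (*)(int, int)` using the platform's default C
# calling convention (cdecl-style on most targets).
BinaryOp = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int, ctypes.c_int)


def py_add(a, b):
    """Plain Python function; ctypes converts arguments to Python ints."""
    return a + b


# Wrap the Python function in a C-callable function pointer. The object
# must stay referenced for as long as native code might call it.
c_add = BinaryOp(py_add)


def call_through_c_abi(x, y):
    """Invoke py_add via the C function pointer rather than directly."""
    return c_add(x, y)


if __name__ == "__main__":
    print(call_through_c_abi(2, 3))
```

Only C-representable types appear in the prototype, which mirrors the restriction noted above: language-specific features have to be flattened into scalars and pointers at the boundary.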

References

  1. [1]
    Assembly 2: Calling convention – CS 61 2018
    A calling convention governs how functions on a particular architecture and operating system interact. This includes rules about includes how function arguments ...
  2. [2]
    [PDF] Calling Conventions - CS@Cornell
    Assume a function uses two callee-save registers. How do we allocate a stack frame? How large is the stack frame? What should be stored in the stack.
  3. [3]
    Guide to x86 Assembly - Computer Science
    Mar 8, 2022 · The calling convention is a protocol about how to call and return from routines. For example, given a set of calling convention rules, a ...
  4. [4]
    Calling Convention
  5. [5]
    Calling Conventions | Microsoft Learn
    Aug 3, 2021 · The Visual C/C++ compiler provides several different conventions for calling internal and external functions.
  6. [6]
    [PDF] Lecture Notes on Calling Conventions
    Feb 21, 2023 · Strict adherence to the calling conventions is crucial so that your code can interoperate with library routines, and the environment can call.
  7. [7]
    [PDF] Calling conventions - Agner Fog
    Feb 1, 2023 · does not have to know what this code means in order to fulfill purpose 1 and 2. It only needs to check if strings are identical. Different ...
  8. [8]
    Calling Conventions - CS [45]12[01] Spring 2022 - Cornell University
    A calling convention is a standardized contract about how to invoke functions. Having a calling convention allows code generated by different compilers and ...
  9. [9]
    [PDF] INTEROFFICE MEMORANDUM
    PDP-11 Subprogram. Calling Conventions. DATE: November 10, 1970. TO: PDP-11 List C. PDP-11 Master List. FROM. Hank Spencer. DEPARTMENT: Programming.
  10. [10]
  11. [11]
    ABI and ISA - GNU MP 4.1
    ABI (Application Binary Interface) refers to the calling conventions between functions, meaning what registers are used and what sizes the various C data types ...
  12. [12]
    Chapter 1 Introduction to the API (System Interface Guide)
    The terms Application Binary Interface (ABI) and System Binary Interface (SBI) indicate the binary interfaces corresponding to the respective source level ...
  13. [13]
    Native interoperability ABI support - .NET - Microsoft Learn
    May 27, 2025 · The Application Binary Interface (ABI) is the interface that runtimes and operating systems use to express low-level binary details.
  14. [14]
    19.2 Controlling the Exported Symbols of Shared Libraries - GNU.org
    Within a shared library, a call to a function that is a global symbol costs a “call” instruction to a code location in the so-called PLT (procedure linkage ...
  15. [15]
    FFI - The Rustonomicon - Rust Documentation
    Foreign calling conventions. Most foreign code exposes a C ABI, and Rust uses the platform's C calling convention by default when calling foreign functions.
  16. [16]
    Exception Handling in LLVM — LLVM 22.0.0git documentation
    An exception handling frame eh_frame is very similar to the unwind frame used by DWARF debug info. The frame contains all the information necessary to tear ...
  17. [17]
    x64 Calling Convention | Microsoft Learn
    Jul 25, 2025 · This article describes the standard processes and conventions that one function (the caller) uses to make calls into another function (the callee) in x64 code.
  18. [18]
    __cdecl | Microsoft Learn
    Aug 3, 2021 · The __cdecl calling convention creates larger executables than __stdcall, because it requires each function call to include stack cleanup code.
  19. [19]
    __fastcall | Microsoft Learn
    Sep 15, 2023 · The __fastcall calling convention specifies that arguments to functions are to be passed in registers, when possible. This calling convention ...
  20. [20]
    __vectorcall | Microsoft Learn
    Oct 17, 2022 · The __vectorcall calling convention specifies that arguments to functions are to be passed in registers, when possible. __vectorcall uses more ...
  21. [21]
    [PDF] System V Application Binary Interface - AMD64 Architecture ...
    Jul 2, 2012 · Medium position independent code model (PIC) This model is like the previous model, but similarly to the medium static model adds large ...
  22. [22]
  23. [23]
    [PDF] Calling Convention - RISC-V International
    The RISC-V calling convention passes arguments in registers when possible. Up to eight integer registers, a0–a7, and up to eight floating-point registers, fa0– ...
  24. [24]
    6.1. Parameter-Passing Mechanisms — Programming Languages
    The remaining three parameter-passing mechanisms use lazy evaluation: The arguments of a function call are passed without being evaluated to the function.
  25. [25]
    [PDF] Compiler construction - UiO
    As said, call-by-value and call-by-result are the two main alternative for classic procedural, ... copy-in-copy-out or copy-restore ... The last parameter-passing ...
  26. [26]
    Calling Variadics (The GNU C Library)
    You don't have to do anything special to call a variadic function. Just put the arguments (required arguments, followed by optional ones) inside parentheses, ...
  27. [27]
    [PDF] System V Application Binary Interface - x86-64
    Sep 13, 2002 · is given in Figure 3.6, the stack frame offset given shows the frame before calling ... calling conventions as user-level applications (see ...
  28. [28]
    64-bit PowerPC ELF Application Binary Interface Supplement 1.9
    The 64-bit PowerPC ELF ABI is intended to use the same structure layout and calling convention rules as the 64-bit PowerOpen ABI.
  29. [29]
    x64 prolog and epilog - Microsoft Learn
    Oct 3, 2025 · Epilog code exists at each exit to a function. Whereas there is normally only one prolog, there can be many epilogs. Epilog code trims the stack ...
  30. [30]
    /GS (Buffer Security Check)
  31. [31]
    Itanium C++ ABI: Exception Handling ($Revision: 1.22 $)
    For example, the first phase allows an exception-handling mechanism to dismiss an exception before stack unwinding begins, which allows resumptive exception ...
  32. [32]
    Exceptions and stack unwinding in C++ - Microsoft Learn
    Nov 14, 2022 · Stack unwinding example. The following example demonstrates how the stack is unwound when an exception is thrown. Execution on the thread ...
  33. [33]
    LLVM Language Reference Manual — LLVM 22.0.0git documentation
    ... ABI (Application Binary Interface). Tail calls can only be optimized when this, the tailcc, the GHC or the HiPE convention is used. This calling convention ...
  34. [34]
    The history of calling conventions, part 1 - The Old New Thing
    Jan 2, 2004 · In the 16-bit world, part of the calling convention was fixed by the instruction set: The BP register defaults to the SS selector, whereas the ...
  35. [35]
    Unmanaged calling conventions - .NET | Microsoft Learn
    Aug 22, 2023 · Mismatches in unmanaged calling conventions lead to data corruptions and fatal crashes that require low-level debugging skills to diagnose.
  36. [36]
    System V ABI - OSDev Wiki
    This is a 64-bit platform. The stack grows downwards. Parameters to functions are passed in via the registers rdi, rsi, rdx, rcx, r8, and r9. Floating-point ...
  37. [37]
    __stdcall | Microsoft Learn
    Feb 10, 2025 · The __stdcall calling convention is used to call Win32 API functions. The callee cleans the stack, so the compiler makes vararg functions __cdecl.
  38. [38]
    x86 Function Attributes (Using the GNU Compiler Collection (GCC))
    On 32-bit and 64-bit x86 targets, you can use an ABI attribute to indicate which calling convention should be used for a function. The ms_abi attribute tells ...
  39. [39]
    The cost of forgetting to specify a calling convention
    Sep 2, 2021 · This led to customers consuming the header file incorrectly, and passing callback function pointers that used the __cdecl calling convention ...
  40. [40]
    Overview of x64 ABI conventions - Microsoft Learn
    Jun 25, 2025 · For details on the x64 calling convention, including register usage, stack parameters, return values, and stack unwinding, see x64 calling ...
  41. [41]
    ABI stable symbols - The Linux Kernel documentation
    Most interfaces (like syscalls) are expected to never change and always be available.
  42. [42]
    Documentation-ABI-README - The Linux Kernel Archives
    Most interfaces (like syscalls) are expected to never change and always be available. testing/ This directory documents interfaces that are felt to be stable, ...
  43. [43]
    Overview of ARM64EC ABI conventions | Microsoft Learn
    Oct 14, 2022 · The ARM64EC ABI follows x64 software conventions including calling convention, stack usage, and data alignment, making ARM64EC and x64 code interoperable.
  44. [44]
    [PDF] SIMD Types: ABI Considerations [N4395] - Open Standards
    Apr 10, 2015 · An ABI describes machine-, operating system-, and compiler-specific choices that are not covered by a programming language standard.
  45. [45]
    The history of calling conventions, part 3 - The Old New Thing
    Jan 8, 2004 · The 32-bit x86 calling conventions all preserve the EDI, ESI, EBP, and EBX registers, using the EDX:EAX pair for return values. C (__cdecl).
  46. [46]
    C/C++/SYCL Calling Conventions - Intel
    Oct 31, 2024 · Calling Conventions on Windows ; __thiscall. None. Default calling convention used by C++ member functions that do not use variable arguments.
  47. [47]
    [PDF] System V Application Binary Interface - CS 61
    Sep 28, 2021 · Operating systems conforming to the AMD64 ABI may provide support for executing programs that are designed to execute in these compatibility ...
  48. [48]
    Introducing 'Vector Calling Convention' - C++ Team Blog
    Jul 11, 2013 · Please note, the vector calling convention is only supported for native amd64/x86 targets and further it does not apply to MSIL (/clr) target.
  49. [49]
    The history of calling conventions, part 5: amd64 - The Old New Thing
    Jan 14, 2004 · The last architecture I'm going to cover in this series is the AMD64 architecture (also known as x86-64). The AMD64 takes the traditional ...
  50. [50]
    [PDF] Procedure Call Standard for the ARM Architecture
    Oct 16, 2009 · The AAPCS embodies the fifth major revision of the APCS and third major revision of the TPCS. It forms part of the complete ABI specification ...
  51. [51]
    [PDF] Procedure Call Standard for the ARM 64-bit Architecture - c9x.me
    May 22, 2013 · This ABI supports two views of memory implemented by the underlying hardware. In a little-endian view of memory the least significant byte of ...
  52. [52]
    Procedure Call Standard - Arm Developer
    It can be used as the static base register (SB) to point to position-independent data, or as the thread register (TR) where thread-local storage is used. In ...
  53. [53]
    Overview of ARM ABI Conventions | Microsoft Learn
    Integer type values are returned in r0, optionally extended to r1 for 64-bit return values. VFP/NEON floating-point or SIMD type values are returned in s0, d0, ...
  54. [54]
    [PDF] The ARM-THUMB Procedure Call Standard
    Nov 5, 1998 · This document defines a family of procedure call standards for the ARM and THUMB instruction sets. Keywords procedure call, function call, ...
  55. [55]
    [PDF] Procedure Call Standard for the ARM Architecture - 0x04.net
    May 4, 2006 · The AAPCS embodies the fifth major revision of the APCS and third major revision of the TPCS. It forms part of the complete ABI specification ...
  56. [56]
    [PDF] RISC-V ABIs Specification
    This specification is written in collaboration with the development communities of the major open- source toolchain and operating system communities, ...
  57. [57]
  58. [58]
    Ratified Specifications - RISC-V International
    The RISC-V open-standard instruction set architecture (ISA) defines the fundamental guidelines for designing and implementing RISC-V processors.
  59. [59]
    Instruction Set Architecture - OpenPOWER Foundation
    May 26, 2024 · The Power ISA is a specification describing the architecture used by POWER processors, defining the instructions the processor executes.
  60. [60]
    [PDF] Power Architecture™ 32-bit Application Binary Interface Supplement ...
    Implementations of this Power Architecture 32-bit Application Binary Interface Supplement should indicate which ABI software features (see Appendix A) and ...
  61. [61]
    64-bit PowerPC ELF Application Binary Interface Supplement 1.7
    The stack pointer (stored in r1) shall maintain quadword alignment. It shall always point to the lowest allocated valid stack frame, and grow toward low ...
  62. [62]
    Register usage and conventions - IBM
    In Linux on PPC the address of a copy in memory is passed in the next available gpr (or in memory). The varargs parameters are specifically handled and ...
  63. [63]
    [PDF] SYSTEM V APPLICATION BINARY INTERFACE - Linux Foundation
    Frames are allocated dynamically on the program stack, depending on program execution. The architecture, standard calling sequence, and stack frame support.
  64. [64]
    [PDF] MIPSproTM N32 ABI Handbook
    Calling Convention Implementations. This chapter describes the differences between o32, n32, and n64 ABIs with respect to calling convention implementations.
  65. [65]
    [PDF] MIPS32® Architecture Reference Manual Volume II-b: microMIPS
    Jun 6, 2016 · ... ABI Compatibility ... Volume IV-c describes the MIPS-3D® Application-Specific Extension to the MIPS® Architecture.
  66. [66]
    A Brief History of the MIPS Architecture - SemiWiki
    Dec 7, 2012 · MIPS is one of the most prolific, longest-living industry-standard processor architectures, existing in numerous incarnations over nearly ...
  67. [67]
    Everything You Need to Know About SPARC Architecture - Stromasys
    SPARC (Scalable Processor Architecture) was introduced by Sun Microsystems in 1987. It is still powering NASA's 2020 Solar Orbiter mission and is an open, ...
  68. [68]
    Milestones:SPARC RISC Architecture, 1987
    Mar 18, 2024 · Sun Microsystems first introduced SPARC (Scalable Processor Architecture) RISC (Reduced Instruction-Set Computing) in 1987. Over the course of ...
  69. [69]
    [PDF] Oracle Solaris and Sun SPARC Systems—Integrated and Optimized ...
    SPARC (Scalable Processor ARChitecture) is a RISC instruction set architecture developed by Sun Microsystems (now Oracle). The “Scalable” in SPARC comes from ...
  70. [70]
    [PDF] The SPARC Architecture Manual, Version 9 - Texas Computer Science
    SPARC-V9, like its predecessor SPARC-V8, is a microprocessor specification created by the SPARC Architecture Committee of SPARC International. SPARC-V9 is ...
  71. [71]
    [PDF] SPARC Assembly Language Reference Manual - Oracle Help Center
    Compiler Calling Convention. The calling convention differs for each architecture. You can see this by examining the assembler code generated by the compiler ...
  72. [72]
    SPARC V9 ABI Features - Oracle Help Center
    The basic calling convention is the same. The first six arguments of the caller are placed in the out registers %o0-%o5. The SPARC V9 ABI still uses a ...
  73. [73]
    [PDF] Program Linkage A Visible/Z Lesson - The Punctilious Programmer
    Each type of program has it own conventions. The Calling Program's Conventions. 1) Register 13 should contain the address of a “Save Area”. The save area is a ...
  74. [74]
    Standard CALL linkage conventions - IBM
    This topic describes the standard Language Environment protocols for passing arguments to external routines.
  75. [75]
    The SuperH-3, part 12: Calling convention and function prologues ...
    Aug 20, 2019 · To make things easier, variadic parameters are always passed in integer registers, so that the callee can just spill them into the home space ...
  76. [76]
    M68k Application Binary Interface (ABI)
    In this section, we're going to talk about the standard calling convention used by M68k. It is splitted into three sub-sections: Stack frame layout, passing ...
  77. [77]
    m68k - Free Pascal wiki
    Nov 19, 2023 · The Motorola 68k CPU target supports several different calling conventions. stdcall: this calling convention is entirely stack based. It ...
  78. [78]
    [PDF] IBM 1130 Subroutine Library - Bitsavers.org
    Each calling sequence used with subroutines in the 1130 system consists of a CALL or LIBF statement (whichever is required to call the specific subroutine), ...
  79. [79]
    [PDF] IBM 1130 Subroutine Library
    The appropriate subroutine calls are generated by the FORTRAN compiler whenever a read, write, arithmetic, or CALL statement is encountered. This ...
  80. [80]
    Threaded Code - Compilers and Languages
    Threaded code is a technique for implementing virtual machine interpreters. There are various ways to implement interpreters.
  81. [81]
    [PDF] Threaded Code Variations and Optimizations
    Forth has been traditionally implemented as indirect threaded code, where the code for non-primitives is the code-field address of the word. To ...
  82. [82]
    Re: Is Lua direct-threaded?
    Jul 30, 2008 · Subject: Re: Is Lua direct-threaded? From: Mike Pall <mikelu-0807@...> Date: Wed, 30 Jul 2008 21:13:48 +0200 ...
  83. [83]
    The evolution of Forth | History of programming languages---II
    Forth is unique among programming languages in that its development and proliferation has been a grass-roots effort unsupported by any major corporate or ...
  84. [84]
    Chapter 2: Language Concepts
    ... descriptors for S and T). Open PL/I Calling Conventions. The following sections describe the calling conventions for HP, Intel, RS/6000, and Sun Sparc systems.
  85. [85]
    Syntax and Linkage Conventions for the Callable Services - IBM
    Syntax and Linkage Conventions for the Callable Services. All APPC/MVS callable services have a general calling syntax as follows ...
  86. [86]
    [PDF] Chapter 8 :: Subroutines and Control Abstraction
    • Pascal, Ada, list, Scheme. • static chain used to locate objects. • static links points to frame of surrounding subroutine. • guaranteed surrounding ...
  87. [87]
    Functions and Procedures - learn.adacore.com
    Parameters can be passed in three distinct modes: in , which is the default, is for input parameters, whose value is provided by the caller and cannot be ...
  88. [88]
    Passing data - IBM
    You can choose among three ways of passing data between programs: BY REFERENCE, BY CONTENT, or BY VALUE.
  89. [89]
    Example of GENERAL linkage convention - IBM
    The following examples demonstrate how an assembler, C, COBOL, or PL/I stored procedure uses the GENERAL linkage convention to receive parameters.
  90. [90]
    Coding the LINKAGE SECTION - IBM
    Coding the LINKAGE SECTION ... Code the same number of data-names in the identifier list of the called program as the number of arguments in the calling program.
  91. [91]
    2972-constrained-naked - The Rust RFC Book
    A naked function has a defined calling convention and a body which contains only assembly code which can rely upon the defined calling convention. A naked ...
  92. [92]
    Go internal ABI specification - The Go Programming Language
    Go's ABI defines the layout of data in memory and the conventions for calling between Go functions. This ABI is unstable and will change between Go versions. If ...
  93. [93]
    The Go low-level calling convention on x86-64 (updated)
    Dec 1, 2020 · This article reviewed the low-level code generation of the Go compiler, as of version 1.10. A few things have changed since, and so an update is in order.
  94. [94]
    rustfoundation/interop-initiative - GitHub
    Even for non-systems languages, C is the lingua franca for FFI generally and accessing OS-level resources in particular. As such, C ↔︎ Rust ...
  95. [95]
    Safety attributes for C - Open Standards
    Jan 5, 2021 · This is useful not just to carry over cross-language information, but also enhances C's ability as lingua franca for defining interfaces (i.e., ...