Application binary interface

An Application Binary Interface (ABI) is a standardized set of conventions that defines how compiled binary code from different sources interacts with the operating system, runtime libraries, and other software modules at the machine code level, ensuring interoperability without requiring source code access.^[1] Unlike an Application Programming Interface (API), which operates at the source code level to specify function calls and data structures for developers, an ABI focuses on low-level details such as binary file formats, so that object files can link and execute across compatible systems.^[1] ABIs are essential for binary compatibility, allowing software components compiled separately, often by different compilers or vendors, to work together on the same platform.^[2]

ABIs typically specify calling conventions (how functions pass arguments and return values), data type layouts and alignments in memory, register usage, and name mangling so that symbol references resolve correctly at link time.^[3] For instance, in C++ environments, the ABI extends to handling object-oriented features like virtual function tables, exception propagation, and runtime type information (RTTI), as outlined in standards like the Itanium C++ ABI adopted by many compilers.^[2] Common binary formats standardized by ABIs include ELF (Executable and Linkable Format) on Unix-like systems and PE (Portable Executable) on Windows, which dictate how executables and shared libraries are structured for loading and execution.^[1] These specifications are platform-specific, varying between architectures like x86, ARM, or PowerPC, and are often defined by industry consortia or operating system vendors to promote portability.^[3]

The evolution of ABIs has been driven by the need for stable software ecosystems, where breaking changes in an ABI, such as altering calling conventions, can render existing binaries incompatible, necessitating recompilation.^[2] In embedded systems, ABIs like the Embedded Application Binary Interface (EABI) emphasize efficiency by standardizing minimal overhead for resource-constrained environments.^[3] Overall, ABIs underpin modern software distribution, dynamic linking, and plugin architectures, facilitating the reuse of precompiled libraries across diverse applications.^[1]

Fundamentals

Definition

An application binary interface (ABI) is the low-level interface between two binary program modules, such as an application and a library or the operating system kernel, that specifies the runtime conventions for how machine code from different modules interacts.^[4]^[5] It covers in-process communication for compiled code, including rules for function calls, data passing, and control flow.^[6]^[7] The term ABI refers to the specification of these conventions, whereas its implementation manifests in object files, executables, and the output of tools like compilers, assemblers, and linkers that conform to the specification.^[4]^[6] Unlike the higher-level application programming interface (API), which defines interactions at the source code level, an ABI operates at the binary level to enable interoperability among pre-compiled components.^[6]

Purpose

The application binary interface (ABI) serves as the binary-level counterpart to source-level application programming interfaces (APIs), defining the conventions for how compiled code interacts at the machine level.^[8] ABIs promote software modularity by enabling the separate compilation of program modules, such as libraries and executables, which can then be linked and executed together without requiring access to the original source code. This separation allows developers to build and distribute reusable binary components that remain compatible across different compilation units, as long as they adhere to the same ABI standards for object layouts, name mangling, and linking processes. For instance, in C++ environments, the ABI standardizes how classes and functions are represented in binaries, so third-party libraries can be integrated into larger applications without being recompiled.^[8]^[2]

In terms of runtime execution, ABIs ensure predictable and consistent behavior during binary interactions, such as the passing of function arguments, return value handling, and exception propagation across module boundaries. By specifying the exact formats for data exchange and control flow, including stack usage and register conventions, ABIs prevent mismatches that could lead to crashes or undefined behavior when binaries from different compilers or versions are combined. This reliability is essential for dynamic loading, where a module is mapped into a running process and called immediately, with no opportunity to recompile either side.^[8]

ABIs also provide a stable interface for user programs to communicate with operating system services, particularly through system calls, which avoids recompilation after OS updates as long as the ABI remains stable. In Linux, for example, the kernel's ABI documentation describes the conventions for invoking system calls, so that binaries can reliably request kernel services such as file operations or process management, and it guarantees backward compatibility for at least two years on interfaces marked stable. This stability allows applications to keep working across kernel versions, supporting long-term binary portability and reducing maintenance overhead.^[9]^[10]

ABI versus API

Key Differences

The Application Binary Interface (ABI) and Application Programming Interface (API) differ fundamentally in their levels of abstraction and operational scope. An API operates at the source code level, defining how developers interact with software components through high-level constructs such as function signatures, class definitions, and constants exposed in header files or documentation.^[11]^[12] In contrast, an ABI functions at the compiled machine code level, specifying low-level details like register usage, stack frame organization, and calling conventions that enable binaries to interoperate without source access.^[11]^[12] This distinction means APIs are portable across compilers and languages as long as the source adheres to the interface, whereas ABIs are tightly coupled to specific hardware architectures and compiler implementations, such as the Itanium C++ ABI for certain platforms.^[12]

A key implication of these differences lies in stability and the impact of changes. Modifying an API, such as altering a function's parameter list, typically requires updating and recompiling the source code of dependent applications, but binaries already built against the old interface are themselves unchanged.^[11] Breaking an ABI, however, demands recompilation of all dependents even when their source code is untouched, because it disrupts binary compatibility; for instance, changes in data structure layouts or symbol versioning can render precompiled binaries unusable against the updated library.^[13]^[12] ABI stability is thus a critical concern in shared library design, often managed through versioning schemes like those in ELF binaries to avoid widespread recompilation.^[13]

These interfaces can diverge in practice, highlighting their independent natures. For example, an API might declare a function accepting an int parameter without specifying its exact representation, allowing flexibility in source code.^[11] The corresponding ABI, however, mandates the precise bit width (e.g., 32 bits for int in the System V ABI), alignment rules, and byte order (e.g., little-endian on x86-64 architectures) to ensure correct binary interpretation across modules.^[10] Such details prevent runtime errors but tie the ABI to platform specifics, unlike the more abstract API.^[10]
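The divergence described above can be made concrete with a small Python sketch. It is only an illustration: the "readers" are hypothetical modules that disagree about the width or byte order of an `int`, and the `struct` format codes stand in for what a compiler would fix through the ABI.

```python
import struct

# API-level view: "the function takes an int".
# ABI-level view (x86-64 System V, per the text): 32 bits, little-endian.
value = 70000
wire = struct.pack("<i", value)  # 4 bytes, least significant byte first

# A reader that agrees on the ABI recovers the value.
same_abi = struct.unpack("<i", wire)[0]

# A hypothetical reader that assumes big-endian byte order misreads it.
wrong_order = struct.unpack(">i", wire)[0]

# A hypothetical reader that assumes a 16-bit int sees only the low half.
wrong_width = struct.unpack("<h", wire[:2])[0]

print(wire.hex(), same_abi, wrong_order, wrong_width)
```

All three readers compile against the same source-level declaration; only the one that shares the writer's binary conventions gets the right number.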

Interdependence

APIs define the source-level interfaces that developers use to write software, specifying function signatures, data types, and behaviors in human-readable code. Compilers translate these API definitions into binary representations governed by the ABI, which dictates the low-level details such as memory layouts, calling conventions, and name mangling to ensure that the compiled binary accurately reflects the intended source interactions.^[8]^[11] This mapping process is crucial because the resulting library ABI emerges from the combination of the library's API and the compiler's ABI implementation, forming a binary contract that enables interoperability between compiled components.^[8] Changes to an API can directly influence the ABI, particularly when modifications alter binary structures or function interfaces. For instance, adding a parameter to a function signature in the API requires recompilation, which may shift the ABI's calling convention or stack layout, potentially breaking compatibility with existing binaries that expect the original interface.^[14] Similarly, changing a parameter type from a primitive like int to a reference type like Integer maintains source-level compatibility in some languages but disrupts the ABI due to differing binary signatures.^[15] Conversely, maintaining a stable API—by avoiding such alterations—facilitates ABI stability across versions, allowing libraries to evolve internally without forcing widespread recompilation of dependent software.^[11] In practice, developers leverage APIs to achieve source code portability across diverse compilers and platforms, writing once and compiling as needed for different environments. 
However, once deployed, applications depend on the ABI for binary-level execution and linkage, enabling dynamic loading of libraries without access to the original source code.^[11] This interdependence underscores the need for careful API design to minimize ABI disruptions, as binary incompatibility can lead to runtime failures in production systems where recompilation is impractical.^[8]

Components of an ABI

Calling Conventions

Calling conventions form a critical component of an application binary interface (ABI), dictating the precise mechanisms by which one function invokes another at the binary level. They define the order and location of parameter passing (typically CPU registers for efficiency, with the call stack for larger or excess arguments), the handling of return values in designated registers or memory, and the division of responsibilities between the caller and callee for stack cleanup and register preservation. These rules ensure interoperability between separately compiled modules, preventing runtime errors from mismatched expectations during function calls.^[10]

Parameter passing in modern calling conventions prioritizes registers to minimize latency, with the stack serving as overflow for additional arguments. Integer and pointer parameters are assigned to 64-bit general-purpose registers, while floating-point values use vector registers such as the SSE XMM registers. Return values follow similar patterns: scalar integers up to 64 bits return in the accumulator register (%rax or RAX), while larger or structured returns may involve a hidden pointer to caller-allocated memory. Cleanup responsibilities vary; in many conventions the caller removes its own stack arguments after the call, while the callee must preserve any non-volatile registers it uses, restoring them before returning to maintain program state. This delineation supports exception handling and debugging by standardizing prologue and epilogue code sequences.^[16]^[10]^[17]

Prominent examples include the System V ABI, prevalent in Unix-like environments, which assigns the first six integer or pointer parameters to registers %rdi, %rsi, %rdx, %rcx, %r8, and %r9, and the first eight floating-point parameters to %xmm0 through %xmm7, pushing excess arguments right-to-left onto a 16-byte-aligned stack. Return values use %rax/%rdx for integers and %xmm0/%xmm1 for floating-point types, with the callee responsible for preserving registers like %rbx, %rbp, and %r12–%r15. In contrast, the Microsoft x64 calling convention, standard for Windows, passes only the first four arguments in registers, and the register is chosen by argument position: RCX, RDX, R8, and R9 for integer or pointer arguments, and XMM0–XMM3 for floating-point arguments in the same slots. The caller must also reserve 32 bytes of "shadow space" on the stack, which the callee may use to spill those register arguments. Returns mirror System V with RAX or XMM0, and the callee handles non-volatile preservation including XMM6–XMM15. These register-heavy approaches enhance performance by avoiding stack memory accesses, which can introduce cache misses in high-call-frequency scenarios compared to stack-only conventions.^[10]^[17]^[16]

Stack management under these conventions enforces strict alignment, typically 16 bytes before a call, to optimize SIMD instructions and hardware prefetching, with 32-byte alignment required in some cases for wider vectors like __m256. Frame pointers, such as %rbp or RBP, are optional but commonly employed in the prologue to establish a reliable reference for local variables and arguments, especially in non-optimized code; omitting them frees a register and shortens the prologue and epilogue. The prologue typically saves callee-saved registers and adjusts the stack pointer (RSP), and in System V it may rely on a 128-byte "red zone" below RSP for temporary storage without explicit allocation, while the epilogue reverses these operations to restore RSP and the saved registers before returning. These mechanisms collectively minimize overhead in binary execution while enabling function interoperation across ABI-compliant code.^[10]^[17]^[16]
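The register assignment rules can be sketched as a toy model in Python. This is a simplification under stated assumptions: it handles scalar integer/pointer and floating-point arguments only (no structures, no variadic functions), and the function names `assign_sysv` and `assign_win64` are hypothetical. The Microsoft model treats the four register slots as positional, as Microsoft's documentation describes.

```python
# Register pools named in the text (scalar arguments only).
SYSV_INT = ["%rdi", "%rsi", "%rdx", "%rcx", "%r8", "%r9"]
SYSV_FP = [f"%xmm{i}" for i in range(8)]
WIN_INT = ["RCX", "RDX", "R8", "R9"]
WIN_FP = ["XMM0", "XMM1", "XMM2", "XMM3"]


def assign_sysv(kinds):
    """Map each argument kind ('int' or 'fp') to a System V AMD64 location."""
    int_regs, fp_regs = iter(SYSV_INT), iter(SYSV_FP)
    locations = []
    for kind in kinds:
        # Each class draws from its own pool; an exhausted pool spills to the stack.
        reg = next(int_regs if kind == "int" else fp_regs, None)
        locations.append(reg if reg is not None else "stack")
    return locations


def assign_win64(kinds):
    """Map each argument kind to a Microsoft x64 location (slots are positional)."""
    locations = []
    for position, kind in enumerate(kinds):
        if position < 4:
            pool = WIN_INT if kind == "int" else WIN_FP
            locations.append(pool[position])
        else:
            locations.append("stack")
    return locations


if __name__ == "__main__":
    # Models a call such as f(int, double, int, double, int).
    signature = ["int", "fp", "int", "fp", "int"]
    print(assign_sysv(signature))
    print(assign_win64(signature))
```

For the same five-argument signature, the System V model keeps every argument in a register, while the Microsoft model places the fifth on the stack and skips XMM0 and XMM2 because those positions were taken by integer arguments.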

Data Types and Memory Layout

In an application binary interface (ABI), the representation of fundamental data types is strictly defined to ensure consistent interpretation across compiled binaries. For instance, in the System V ABI for AMD64 processors, a char occupies 1 byte with 1-byte alignment, a short is 2 bytes with 2-byte alignment, an int is 4 bytes with 4-byte alignment, a long is 8 bytes with 8-byte alignment, a float is 4 bytes with 4-byte alignment, a double is 8 bytes with 8-byte alignment, and a long double is 16 bytes (10 bytes of extended-precision data plus 6 bytes of padding) with 16-byte alignment.^[10] Similarly, the Windows x64 ABI defines char and unsigned char as 1 byte (1-byte alignment), short and unsigned short as 2 bytes (2-byte alignment), int, long, unsigned int, and unsigned long as 4 bytes (4-byte alignment), __int64 and unsigned __int64 as 8 bytes (8-byte alignment), float as 4 bytes (4-byte alignment), and double as 8 bytes (8-byte alignment).^[5] These specifications prevent mismatches in type interpretation, such as one module treating a long as 8 bytes while another treats it as 4.

Data Type	System V AMD64 Size (bytes) / Alignment (bytes)	Windows x64 Size (bytes) / Alignment (bytes)
`char`	1 / 1	1 / 1
`short`	2 / 2	2 / 2
`int`	4 / 4	4 / 4
`long`	8 / 8	4 / 4
`float`	4 / 4	4 / 4
`double`	8 / 8	8 / 8
Pointer	8 / 8	8 / 8
`long double`	16 / 16	8 / 8 (same representation as `double` in MSVC)

Endianness further governs multi-byte data representation within the ABI, with both System V AMD64 and Windows x64 employing little-endian byte order, where the least significant byte is stored at the lowest memory address.^[10]^[5] This convention ensures that, for example, the 32-bit value 0x01020304 is laid out in memory as bytes 04 03 02 01.

For aggregate types like structures, ABIs mandate member ordering by declaration sequence, with padding inserted to satisfy individual member alignments and ensure the overall structure size is a multiple of its strictest alignment requirement. In the System V AMD64 ABI, structures are padded such that each member starts at an offset aligned to its natural boundary, and the total size aligns to the maximum member alignment; for example, a structure with a 1-byte char followed by an 8-byte long includes 7 bytes of padding after the char to align the long.^[10] The Windows x64 ABI follows similar rules, adding padding between or after members as needed, such as 4 bytes after an int that precedes an 8-byte-aligned member.^[5]

In C++, the Itanium ABI extends this to class layouts, where non-virtual bases and data members appear in declaration order, virtual bases follow the inheritance graph order, and empty base classes may share offsets to optimize space without violating alignment.^[2] Virtual tables (vtables) in C++ under the Itanium ABI represent a specialized structure layout for polymorphism, consisting of an offset-to-top field, a type information pointer, and an array of virtual function pointers, with primary vtables followed by secondary ones for non-primary virtual bases in depth-first, left-to-right traversal order.^[2] Packing directives that override the default alignment change these layouts, so every module that shares a structure must apply the same directive explicitly to avoid portability issues.

Pointers in ABIs are treated as unsigned integers of the platform's word size, with null conventionally represented as the all-zero bit pattern (address 0), and arrays as contiguous blocks of elements starting at the base address, sized as a multiple of the element's size and aligned to the element's alignment (or 16 bytes for large arrays in System V AMD64).^[10]^[5] This ensures pointers can be safely passed and interpreted without ambiguity, as their 8-byte size and 8-byte alignment on 64-bit systems allow direct memory addressing.
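The padding rules can be worked through with a short Python sketch that computes member offsets from the sizes and alignments in the table above. It is a model of the declaration-order layout rule only (no bit-fields, packing directives, or C++ class features), and the helper names `align_up` and `layout` are illustrative.

```python
# (size, alignment) in bytes, copied from the table above.
SYSV_AMD64 = {
    "char": (1, 1), "short": (2, 2), "int": (4, 4), "long": (8, 8),
    "float": (4, 4), "double": (8, 8), "pointer": (8, 8),
}
WIN_X64 = {**SYSV_AMD64, "long": (4, 4)}  # only `long` differs in this subset


def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def layout(members, types):
    """Return (offsets, total_size, struct_alignment) for members in declaration order."""
    offset, max_align, offsets = 0, 1, {}
    for name, type_name in members:
        size, alignment = types[type_name]
        offset = align_up(offset, alignment)  # padding before the member
        offsets[name] = offset
        offset += size
        max_align = max(max_align, alignment)
    # Tail padding makes the size a multiple of the strictest alignment.
    return offsets, align_up(offset, max_align), max_align


if __name__ == "__main__":
    record = [("tag", "char"), ("count", "long")]  # struct { char tag; long count; }
    print(layout(record, SYSV_AMD64))
    print(layout(record, WIN_X64))
    # Little-endian byte order of the 32-bit value used in the text.
    print((0x01020304).to_bytes(4, "little").hex())
```

Under the System V numbers the example structure occupies 16 bytes with `count` at offset 8; under the Windows numbers it occupies 8 bytes with `count` at offset 4, which is why a structure containing `long` cannot be exchanged between the two ABIs as raw bytes.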

Procedure Linkage

Procedure linkage in an application binary interface (ABI) governs how procedures, or functions, are referenced and connected between binary modules during linking and execution. This involves establishing rules for symbol naming to ensure unique identification of external procedures, particularly in languages supporting overloading and namespaces. In C, external symbols retain their plain source names without decoration, allowing straightforward resolution in symbol tables of object files like ELF.^[10] In contrast, C++ employs name mangling to encode additional information such as parameter types, enabling the linker to distinguish overloaded functions and resolve external references unambiguously; under the Itanium C++ ABI, mangled names begin with an underscore followed by 'Z' and an encoding of the function's name and parameter types.^[2]

Relocation and symbol resolution handle the adjustment of procedure addresses in binaries, supporting both static linking at build time and dynamic loading at runtime. During static linking, the linker resolves symbols by matching references to definitions across object files and applies relocations to update addresses: absolute relocations write final addresses, while relative ones store offsets, which suits position-independent code. In dynamic scenarios, the runtime loader, such as ld.so on Unix-like systems, employs the Procedure Linkage Table (PLT) and Global Offset Table (GOT) to defer resolution; initial calls to external procedures redirect through PLT stubs to the loader, which then performs lazy binding by searching symbol tables in loaded shared objects and patching the GOT with resolved absolute addresses.^[18]^[19]

Exception handling across module boundaries requires coordinated propagation and unwinding to maintain program integrity when errors occur in external procedures. The ABI specifies mechanisms for exceptions to traverse stack frames in different binaries, relying on unwind tables to guide the process; in ELF-based systems, the .eh_frame section contains Call Frame Information (CFI) with Common Information Entries (CIEs) for default rules and Frame Description Entries (FDEs) for function-specific instructions on register restoration and stack pointer adjustments. This enables the unwinder to propagate exceptions by iteratively restoring prior stack states, ensuring cleanup actions like destructor calls are executed consistently across modules.^[20]
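To illustrate the symbol-naming rules, the following Python sketch mangles a deliberately small subset of C++ declarations in the Itanium style: non-member, non-template functions in the global namespace whose parameters are built-in types, pointers, and `const` qualifiers. The function `mangle` and its string-based type syntax are hypothetical, and the sketch refuses inputs that would need the ABI's substitution compression rather than guess at them.

```python
# Itanium type codes for a few built-in types.
BUILTIN_CODES = {
    "void": "v", "bool": "b", "char": "c", "short": "s", "int": "i",
    "long": "l", "float": "f", "double": "d",
    "unsigned int": "j", "unsigned long": "m",
}


def _encode_type(type_name, seen):
    """Encode a type written as e.g. 'int', 'char*', or 'const char*'."""
    type_name = type_name.strip()
    if type_name.endswith("*"):
        encoded = "P" + _encode_type(type_name[:-1], seen)
    elif type_name.startswith("const "):
        encoded = "K" + _encode_type(type_name[len("const "):], seen)
    else:
        return BUILTIN_CODES[type_name]
    if encoded in seen:
        # A real mangler would emit a substitution (S_, S0_, ...) here.
        raise NotImplementedError("repeated compound type needs substitution")
    seen.add(encoded)
    return encoded


def mangle(name, params):
    """Mangle a global, non-template function name with the given parameter types."""
    seen, codes = set(), []
    for param in params or ["void"]:  # an empty parameter list is encoded as 'v'
        param = param.strip()
        if param.startswith("const ") and not param.endswith("*"):
            param = param[len("const "):]  # top-level const is not part of the signature
        codes.append(_encode_type(param, seen))
    return f"_Z{len(name)}{name}{''.join(codes)}"


if __name__ == "__main__":
    print(mangle("foo", ["int", "double"]))
    print(mangle("foo", []))
    print(mangle("log_line", ["const char*", "unsigned int"]))
```

The two `foo` overloads receive different symbols, which is what lets the linker tell them apart. Functions declared `extern "C"` bypass this scheme and keep their plain names.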

Standards and Implementations

POSIX and Unix-like Systems

In POSIX-compliant and Unix-like systems, the System V Application Binary Interface (ABI) serves as a foundational standard for ensuring binary compatibility across diverse architectures and implementations. This ABI, originally developed as part of System V Release 4 (SVR4), specifies the interface between applications and the operating system, including object file formats, linking mechanisms, and runtime behaviors.^[21] It promotes portability by defining consistent rules for how binaries interact with shared libraries and the kernel, enabling executables compiled on one Unix variant to run on another with compatible hardware.^[22]

A core component of the System V ABI is the Executable and Linking Format (ELF), which standardizes the structure of object files, executables, and shared libraries. ELF files include headers for identification (e.g., architecture via e_machine), sections for code and data (e.g., .text, .data), and program headers for loading segments into memory. Processor-specific conventions are defined in per-architecture supplements (psABIs): the x86-64 architecture follows the System V AMD64 psABI, which uses 64-bit ELF (ELFCLASS64) with little-endian byte order and defines details such as a downward-growing stack with 16-byte alignment.^[23] Similarly, for ARM architectures, the ELF supplement outlines conventions like 32-bit or 64-bit (AArch32/AArch64) support, with specific relocation types (e.g., R_ARM_PC24) and section flags for efficient loading on resource-constrained devices.^[24] Thread-local storage (TLS) is handled uniformly across these, using dedicated sections (.tdata for initialized data, .tbss for uninitialized) flagged with SHF_TLS, and a PT_TLS program header for runtime allocation per thread, ensuring thread-safe access without global synchronization.^[21]

In Linux distributions, which adhere to the System V ABI, the GNU C Library (glibc) implements user-space components, providing stable interfaces for functions and data structures while maintaining backward compatibility across releases. The kernel-user boundary is crossed through system calls (syscalls), which user programs invoke with instructions like syscall on x86-64 or svc on ARM, passing arguments in registers according to a per-architecture convention.^[25] Linux further supports multi-ABI environments, such as running 32-bit binaries on 64-bit kernels through compatibility layers (e.g., ia32 emulation), allowing legacy applications to run without recompilation.^[9]

The IEEE POSIX standards, particularly POSIX.1, establish a baseline for portable binaries by specifying source-level interfaces that map to underlying ABIs, ensuring that Unix-like systems (e.g., Linux, BSD variants) produce interoperable executables across vendors. This standardization facilitates binary distribution without architecture-specific adjustments, as long as the target system complies with the System V ELF extensions.^[26]
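The ELF identification fields mentioned above sit at fixed offsets at the start of the file, so they can be read with a few lines of Python. The sketch below parses a synthetic 20-byte header built in memory rather than a real file; `describe_elf` is a hypothetical helper, and the machine table lists only a handful of well-known `e_machine` values.

```python
import struct

EI_CLASS = {1: "ELFCLASS32", 2: "ELFCLASS64"}
EI_DATA = {1: "little-endian", 2: "big-endian"}
E_MACHINE = {3: "x86", 40: "ARM", 62: "x86-64", 183: "AArch64", 243: "RISC-V"}


def describe_elf(header: bytes):
    """Return (class, byte order, machine) from the first 20 bytes of an ELF file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    elf_class = EI_CLASS[header[4]]   # e_ident[EI_CLASS]
    byte_order = EI_DATA[header[5]]   # e_ident[EI_DATA]
    # e_type and e_machine follow the 16-byte e_ident, in the file's own byte order.
    fmt = "<HH" if header[5] == 1 else ">HH"
    _e_type, e_machine = struct.unpack_from(fmt, header, 16)
    return elf_class, byte_order, E_MACHINE.get(e_machine, f"unknown ({e_machine})")


# Synthetic header: magic, 64-bit, little-endian, version 1, then padding,
# followed by e_type = 3 (shared object) and e_machine = 62 (x86-64).
sample = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8) + struct.pack("<HH", 3, 62)

if __name__ == "__main__":
    print(describe_elf(sample))
```

On a system with ELF binaries, passing the first 20 bytes of an executable read in binary mode should work the same way, though that path is not exercised here.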

Windows and Microsoft Ecosystems

The Portable Executable (PE) format, an extension of the Common Object File Format (COFF), defines the structure for executable files, object files, and dynamic-link libraries (DLLs) in the Windows operating system. It includes a DOS header for compatibility, followed by a PE signature, a COFF file header specifying machine type and sections, and an optional header with data directories pointing to key structures like imports and exports.^[27]^[28]

The import table in PE files lists external functions and data required by the executable, organized into import descriptor directories that reference DLLs and contain thunks for address resolution at load time, enabling dynamic linking to shared libraries. Export tables, conversely, define the public symbols a DLL exposes to other modules, including function names, ordinals, and addresses, facilitating procedure linkage for reusable components. Delay-loading extends this by deferring DLL and function loading until first use, reducing startup time and memory footprint through a separate delay import descriptor that triggers runtime resolution via a helper function such as __delayLoadHelper2.^[27]^[29]

Microsoft's calling conventions standardize argument passing and stack management for x86 and x64 architectures to ensure interoperability. On x86, conventions like __stdcall pass parameters right-to-left on the stack with the callee cleaning up, commonly used for Win32 API calls to minimize executable size, while __fastcall optimizes by passing the first two integer or pointer arguments in ECX and EDX registers before stack usage. For x64, Microsoft adopts a single fastcall-like convention in which the first four arguments travel in registers selected by position (RCX, RDX, R8, R9 for integer or pointer arguments; XMM0–XMM3 for floating-point arguments in the same slots), with the caller allocating 32 bytes of shadow space and cleaning the stack, promoting efficiency across binaries.^[30]^[31]^[17]

The Component Object Model (COM) provides binary stability through interface-based contracts, where objects expose versioned interfaces via GUIDs and vtables, ensuring that binary layouts remain unchanged across implementations and compiler versions. This design allows clients to bind to interfaces without recompilation, as long as new versions maintain backward compatibility by not altering existing vtable entries.^[32]^[33]

The Visual C++ (MSVC) toolchain enforces an ABI for binary compatibility across Visual Studio versions starting from 2015 (toolset v140), guaranteeing that object files, libraries, and executables built with later versions (v141 through v145) can link and run together without recompilation as of November 2025, provided the linker version matches or exceeds the newest toolset used.^[34]^[35] This includes the Standard Template Library (STL), whose containers and algorithms maintain stable binary layouts, and the runtime libraries, which are deployed through a single Visual C++ Redistributable package. Exceptions arise with optimizations like whole-program optimization (/GL) or link-time code generation (/LTCG), which require identical toolset versions for compatibility.^[34]
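The PE header chain described at the start of this section (DOS header, PE signature, COFF file header) can be followed in a few lines of Python. The sketch reads a synthetic image built in memory, not a real executable; `pe_machine` and `build_sample` are hypothetical helpers, and only three common machine codes are listed.

```python
import struct

MACHINE = {0x014C: "x86", 0x8664: "x64", 0xAA64: "ARM64"}


def pe_machine(image: bytes):
    """Return (machine name, number of sections) from the start of a PE image."""
    if image[:2] != b"MZ":
        raise ValueError("missing DOS header")
    # The 4-byte field at offset 0x3C of the DOS header locates the PE signature.
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # The COFF file header starts with Machine and NumberOfSections (2 bytes each).
    machine, section_count = struct.unpack_from("<HH", image, pe_offset + 4)
    return MACHINE.get(machine, hex(machine)), section_count


def build_sample():
    """Build a minimal synthetic image: DOS header, signature, first COFF fields."""
    dos_header = bytearray(64)
    dos_header[0:2] = b"MZ"
    struct.pack_into("<I", dos_header, 0x3C, len(dos_header))
    return bytes(dos_header) + b"PE\0\0" + struct.pack("<HH", 0x8664, 3)


if __name__ == "__main__":
    print(pe_machine(build_sample()))
```

The machine field plays the same role as `e_machine` in ELF: a loader checks it before trusting any architecture-specific convention in the rest of the file.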

Embedded and Specialized ABIs

In embedded systems, ABIs are optimized for resource-constrained environments, such as microcontrollers with limited memory and processing power, where minimizing overhead is critical. These ABIs often incorporate reduced stack usage by limiting the number of registers passed as arguments and employing compact calling conventions to preserve interrupt latency and stack space. For instance, the Embedded Application Binary Interface (EABI) for processors like the MSP430 specifies ELF-based formats tailored for low-memory devices, ensuring efficient object file layouts without unnecessary padding or metadata that could inflate binary sizes. Similarly, in real-time operating systems (RTOS) like FreeRTOS, which target bare-metal or minimal-kernel setups, ABIs favor fixed-point arithmetic over floating-point to avoid hardware dependencies and reduce computational overhead; fixed-point operations use integer instructions, conserving cycles and energy in systems without floating-point units.^[36]^[37]^[38]

The ARM Architecture Procedure Call Standard (AAPCS), part of the broader ARM ABI, defines rules for parameter passing, register usage, and stack alignment in embedded contexts, including the EABI variant for bare-metal applications. In AAPCS, up to four integer or pointer arguments are passed in registers (r0-r3) to minimize stack pushes, with the stack growing downwards and maintaining 8-byte alignment at function boundaries to support atomic operations common in interrupt-driven systems. For bare-metal ARM environments, the EABI extends this by omitting dynamic linking support and focusing on static executables, which is essential for resource-limited devices without an OS loader; interrupt handling follows AAPCS conventions, where handlers save only necessary context (e.g., link register and arguments) to enable low-latency responses in RTOS tasks. The RISC-V ABI, documented in the ELF psABI specification, similarly prioritizes efficiency for embedded use, with the proposed Embedded ABI (EABI) reducing argument registers from eight to four to cut context-save costs during interrupts, thereby improving real-time performance on microcontrollers. In bare-metal RISC-V setups, interrupt handling adheres to the machine-mode trap mechanism, where the ABI ensures handlers access callee-saved registers (e.g., s0-s11) minimally, often integrating with the Supervisor Binary Interface (SBI) for standardized exception vectors without OS mediation.^[39]^[40]^[41]^[38]^[42]

Domain-specific ABIs address niche requirements beyond general-purpose computing, such as parallel processing and blockchain execution. The Message Passing Interface (MPI) ABI, standardized in MPI-5.0, enables binary compatibility across implementations for high-performance computing clusters by defining consistent handle types (e.g., opaque pointers for communicators and datatypes), status objects (arrays of eight integers), and integer types like MPI_Aint as intptr_t, allowing compiled parallel applications to link against different MPI libraries without recompilation. This ABI supports efficient point-to-point and collective operations in distributed-memory systems, with functions like MPI_Abi_get_version providing runtime version checks for interoperability. In blockchain contexts, the Ethereum smart contract ABI provides a JSON-described interface for encoding function calls, events, and data between the Ethereum Virtual Machine (EVM) and external applications or contracts, specifying static types (e.g., uint256) encoded in place and dynamic types (e.g., bytes arrays) via offset pointers in 32-byte words. It uses a 4-byte function selector taken from the Keccak-256 hash of the function signature to dispatch calls, enabling deterministic interaction in decentralized environments without native OS support.^[43]^[44]
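The head/tail layout of the Ethereum contract ABI can be sketched in Python for one hypothetical function taking a `uint256` and a `bytes` argument. The 4-byte selector is passed in as a parameter rather than computed, since Python's `hashlib.sha3_256` implements NIST SHA-3, which is not the same function as Keccak-256; the selector value used below is a placeholder, not a real one.

```python
def encode_uint256(number):
    """Static type: one 32-byte big-endian word."""
    return number.to_bytes(32, "big")


def encode_bytes(data):
    """Dynamic type: a length word, then the data right-padded to a 32-byte multiple."""
    padded_length = (len(data) + 31) // 32 * 32
    return encode_uint256(len(data)) + data.ljust(padded_length, b"\0")


def encode_call(selector, amount, payload):
    """Encode a call to a hypothetical f(uint256 amount, bytes payload)."""
    # Head: the static value in place, then the offset of the dynamic value.
    # Offsets count from the start of the argument block; the head is 2 words.
    head = encode_uint256(amount) + encode_uint256(2 * 32)
    tail = encode_bytes(payload)
    return selector + head + tail


if __name__ == "__main__":
    placeholder_selector = bytes.fromhex("deadbeef")  # assumption, not a real selector
    call_data = encode_call(placeholder_selector, 7, b"hi")
    for start in range(4, len(call_data), 32):
        print(call_data[start:start + 32].hex())
```

The printed words are, in order, the amount, the offset 0x40, the payload length, and the padded payload, matching the in-place versus offset-pointer distinction in the text.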

Historical Development

Origins in Early Computing

The concepts underlying the application binary interface (ABI) first emerged in the 1960s and 1970s through the development of linkers and loaders in pioneering operating systems, which standardized how binary modules interacted at the machine level. In Multics, initiated in 1965 as a collaborative project by MIT, Bell Labs, and General Electric, dynamic linking was a core feature: segments of code were loaded and bound at runtime, with segment tables and linkage sections resolving references across modules.^[45] This approach established early rules for binary modularity, in which procedures in separate segments referenced each other by symbolic name, and it influenced subsequent systems by prioritizing relocatability and shared code execution.^[46]

By contrast, early Unix, developed at Bell Labs starting in 1969 on the PDP-7 and later the PDP-11, initially had no linker: programs were written in assembly as self-contained binaries, and the assembler wrote its output to a file with the fixed name a.out.^[47] After Unix moved to the PDP-11 and gained a linker (ld), a.out became a simple executable format whose header described the text, data, and symbol-table portions of the file, enabling basic relocation and loading.^[47] The format laid foundational ABI principles by defining how object files could be combined into runnable binaries without recompilation.^[47]

Parallel advancements in assembler and loader technology, particularly in IBM's OS/360 released in 1966, further solidified binary relocatability as a cornerstone of early ABIs. OS/360's linkage editor processed relocatable object modules generated by assemblers such as the Basic Assembler, using external symbol dictionaries (ESD) and relocation dictionaries (RLD) to resolve inter-module references and adjust addresses relative to a base origin.^[48] Control sections served as the minimal relocatable units, with A-type constants handling intra-segment addresses and V-type constants handling external ones, so that load modules could be positioned dynamically in memory during execution.^[48] The loader then performed final address modifications and overlay management, reducing storage needs (for instance, an overlay structure could shrink a 32K-byte program to 18K bytes) while enforcing standardized formats for text, constants, and entry points.^[48] These mechanisms established basic interface rules, such as consistent symbol resolution and error handling (e.g., IEW0012 for invalid constants), ensuring binary compatibility across programs and libraries in multiprogramming environments.^[48]

Early calling conventions, which govern subroutine interactions and are therefore essential to any ABI, were formalized in systems like the PDP-11, where hardware capabilities directly shaped software interfaces. A 1970 DEC memorandum outlined PDP-11 subprogram conventions, recommending the Jump to Subroutine (JSR) instruction with a branch register (BR) to pass argument counts and addresses, while using the stack for reentrant calls via push operations like MOV #A, -(SP).^[49] This prioritized execution speed and simplicity, supporting variable-length arguments and fail-soft recovery, with the RS stack tracking call locations for debugging.^[49] The PDP-11's instruction set architecture (ISA), featuring eight general-purpose registers (R0-R5 for data, R6 as stack pointer, R7 as program counter), facilitated these conventions by enabling autoincrement/decrement addressing for efficient parameter passing and nesting without manual linkage saves.^[50] Similarly, the x86 ISA, introduced with the Intel 8086 in 1978, imposed initial ABI assumptions through its segmented memory model and limited registers (e.g., AX, BX for parameters), dictating stack-based conventions for function calls that echoed PDP-11 influences but adapted to 16-bit addressing constraints.^[51] These hardware-driven designs ensured that binaries could interoperate reliably, setting precedents for parameter marshaling and return value handling in subsequent architectures.^[50]
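The symbol-resolution and relocation step performed by early linkage editors can be illustrated with a deliberately tiny model. The sketch below is not the OS/360 or a.out format: the `ObjectModule` class, the `link` function, the word values, and the base address are all hypothetical, chosen only to show how a symbol table and a relocation list let separately built modules be combined without recompilation.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectModule:
    """A toy relocatable module: contents plus symbol and relocation tables."""
    name: str
    words: list                                   # module contents, addressed from offset 0
    defines: dict = field(default_factory=dict)   # symbol -> offset inside this module
    relocs: list = field(default_factory=list)    # (offset, symbol) slots to patch


def link(modules, base=0x1000):
    """Lay modules out back to back from `base`, then patch every relocation."""
    # Pass 1: give each module a load address and build the global symbol table.
    symbols, addr = {}, base
    for m in modules:
        for sym, off in m.defines.items():
            if sym in symbols:
                raise ValueError(f"duplicate symbol: {sym}")
            symbols[sym] = addr + off
        addr += len(m.words)
    # Pass 2: copy the contents and overwrite each relocation slot
    # with the final address of the symbol it refers to.
    image = []
    for m in modules:
        words = list(m.words)
        for off, sym in m.relocs:
            if sym not in symbols:
                raise ValueError(f"undefined symbol: {sym}")
            words[off] = symbols[sym]
        image.extend(words)
    return image, symbols


# Hypothetical modules: `main` refers to `helper`, which `lib` defines at offset 1.
main = ObjectModule("main", words=[0xA0, 0, 0xFF],
                    defines={"start": 0}, relocs=[(1, "helper")])
lib = ObjectModule("lib", words=[0xB0, 0xB1], defines={"helper": 1})
image, symbols = link([main, lib])
```

Here `main` is three words long, so `lib` is placed at 0x1003 and `helper` resolves to 0x1004; that address is written into the placeholder slot of `main`. Real formats add section types, relative relocations, and alignment rules that this model leaves out.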

Evolution with Modern Languages

The 1980s marked a pivotal transition in ABI development as C became the dominant systems language, replacing assembly for most development and creating a need for standardized binary interfaces that were portable across Unix variants. AT&T's UNIX System V, first released in 1983, introduced the Common Object File Format (COFF) as a more sophisticated replacement for the a.out format, supporting relocatable object modules with sections for code, data, and debugging symbols, along with standardized relocation and symbol tables to enable consistent linking and loading.^[52] Concurrently, the X/Open Company (formed 1984) and the IEEE POSIX standards (first draft 1986, standardized 1988) standardized the source-level C interfaces shared by Unix vendors. The binary details were pinned down in per-platform ABI documents, which covered calling conventions (e.g., right-to-left stack parameter passing on x86), data type sizes and alignments (e.g., a 32-bit int), and system call interfaces, promoting binary compatibility in a fragmented Unix ecosystem.^[53]

ABI development in the late 20th century was heavily influenced by the growing complexity of C and C++, which demanded standardized binary interfaces for features beyond simple procedural code. In the late 1990s, the Itanium C++ ABI emerged from an industry collaboration led by HP and Intel. It was written as a comprehensive specification for C++ binaries on the Itanium architecture, but it went on to influence much broader ecosystems. This ABI introduced detailed name mangling rules that encode overloaded functions, operators, and constructors into unique symbols starting with "_Z", ensuring unambiguous linkage across object files and libraries. It also defined exception handling protocols, including routines like __cxa_throw and __cxa_begin_catch, which integrate with platform unwind mechanisms to propagate exceptions across call stacks while preserving type safety.^[2]^[54]

To accommodate C++'s evolving language features, the Itanium ABI extended support for templates through specialized mangling of template arguments (e.g., "I" introduces a template argument list, followed by the encoded parameters), allowing distinct representations for different instantiations without name collisions. Run-time type information (RTTI) was formalized via std::type_info structures, emitted with vague linkage in COMDAT groups for polymorphic classes, enabling operations like dynamic_cast and exception type matching. Namespaces received nested mangling support (e.g., "N" opens a nested name and "E" closes it), facilitating hierarchical organization in binaries while maintaining compatibility with global scope. These extensions were crucial as C++ standardized features in the 1990s and early 2000s, but implementations diverged. GCC has maintained Itanium ABI stability since version 3.4 in 2004. MSVC historically introduced ABI breaks with each major release from Visual Studio .NET 2003 through 2013, such as changes in structure padding and virtual table layouts, before achieving intra-family compatibility from Visual Studio 2015 onward via toolset versioning (e.g., v140 to v143).^[2]^[55]^[34]

In the 2010s, WebAssembly (Wasm) marked a shift toward platform-agnostic ABIs, standardizing a binary instruction format for a stack-based virtual machine that supports portable code execution across browsers and standalone runtimes. Its core ABI defines fixed-width numeric types (e.g., i32 for 32-bit integers, f64 for doubles) and reference types, with calling conventions relying on operand stacks and linear memory addressing to pass arguments and results between functions and host environments. This design ensures that binaries compiled from languages like C++, Rust, or AssemblyScript remain interoperable without architecture-specific adjustments, addressing portability challenges in distributed systems. The subsequent Canonical ABI, part of the WebAssembly Component Model, builds on this by specifying canonical representations for higher-level constructs (for example, records as their fields laid out in order, variants as a discriminant followed by a payload, and strings as a pointer and length into linear memory), enabling data exchange across language boundaries in multi-module applications.^[56]^[57]

Contemporary languages continue to drive ABI evolution, with Rust exemplifying efforts to balance stability and performance. Rust's project goals include developing a modular, stable ABI to support dynamic loading of crates as plugins and interoperability with languages like C or Swift, motivated by systems programming needs such as those of the Fuchsia operating system. Proposals envision versioned ABIs using attributes like #[repr(RustABI)] for explicit contracts, drawing lessons from C++'s fragmentation to avoid optimization constraints while enabling runtime linking. As of 2025, however, Rust guarantees no stable ABI across compiler versions, preserving flexibility in code generation and memory layouts.^[58]
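The "_Z", "N ... E", and "I ... E" markers of the Itanium mangling scheme described above can be made concrete with a small encoder. The sketch below covers only a narrow subset, and the function names (`mangle`, `component`, `source_name`) are hypothetical helpers, not part of any compiler. By assumption it ignores substitutions, the std:: abbreviations, pointers, references, cv-qualifiers, and function templates (whose manglings also encode the return type).

```python
# Itanium ABI codes for a few builtin types (a subset of the builtin-type table).
BUILTIN = {"void": "v", "bool": "b", "char": "c", "int": "i",
           "unsigned int": "j", "long": "l", "float": "f", "double": "d"}


def source_name(identifier):
    """Encode an identifier as its decimal length followed by its text."""
    return f"{len(identifier)}{identifier}"


def component(part):
    """Encode one name component, optionally a (name, [template args]) pair."""
    if isinstance(part, tuple):
        name, template_args = part
        encoded_args = "".join(BUILTIN[t] for t in template_args)
        return source_name(name) + "I" + encoded_args + "E"
    return source_name(part)


def mangle(path, params):
    """Mangle a function from its qualified-name path and builtin parameter types."""
    if isinstance(path[-1], tuple):
        raise ValueError("function templates are outside this sketch")
    if path[0] == "std":
        raise ValueError("std:: uses special abbreviations not modelled here")
    name = "".join(component(p) for p in path)
    if len(path) > 1:
        name = "N" + name + "E"                          # nested (qualified) name
    args = "".join(BUILTIN[t] for t in params) or "v"    # empty list encodes as void
    return "_Z" + name + args


plain = mangle(["foo"], ["int"])                         # foo(int)
nested = mangle(["ns", "bar"], [])                       # ns::bar()
templated = mangle(["ns", ("Box", ["int"]), "get"], [])  # ns::Box<int>::get()
```

With these inputs the encoder yields `_Z3fooi`, `_ZN2ns3barEv`, and `_ZN2ns3BoxIiE3getEv`. Because overloads such as foo(int) and foo(double) map to different symbols, a linker can tell them apart without any knowledge of C++.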

Challenges and Considerations

Binary Compatibility

Binary compatibility in application binary interfaces (ABIs) refers to the ability of software components, such as libraries and executables, to interoperate at the binary level across different versions without recompilation. Changes to an ABI can disrupt this compatibility, particularly when they alter core elements like calling conventions or data type representations. For instance, modifying a calling convention, such as changing which parameters travel in registers and which on the stack, makes callers and callees disagree about where arguments live, resulting in runtime crashes or erroneous computations in dependent binaries. Similarly, increasing the size of a fundamental type, like changing int from 32 bits to 64 bits, shifts field offsets and structure sizes, so older binaries that expect the original layout read or write the wrong bytes, leading to buffer overflows, segmentation faults, or data corruption. These effects stem from the ABI's role in defining precise memory layouts and procedure linkages, where even subtle shifts can invalidate assumptions made during compilation.^[59]

To mitigate such breaking changes, developers employ versioning techniques that preserve the interface for existing binaries while enabling evolution. In ELF-based systems, symbol versioning allows libraries to expose multiple versions of the same symbol, ensuring that older applications link to the original implementation while newer ones access updated functionality. For example, the GNU C++ standard library uses version tags like GLIBCXX_3.4 in its shared objects, permitting the addition of new symbols in minor releases without invalidating prior ABIs, as the linker resolves symbols based on the required version recorded at link time. On Windows, side-by-side assemblies enable multiple versions of dynamic-link libraries (DLLs) to coexist within the same process, reducing "DLL hell" by isolating version-specific manifests and preventing conflicts during loading.

ABI wrappers provide another strategy, often involving a stable intermediary layer, such as a C interface wrapping a C++ library, that shields downstream code from internal ABI alterations so that the exposed interface remains unchanged across library updates. Together, these techniques allow libraries to evolve while maintaining backward compatibility for deployed software.

Compiler flags further aid in preserving ABI stability by controlling symbol exposure. The GCC option -fvisibility=hidden sets the default visibility of symbols in shared objects to hidden, preventing unintended exports that could lead to linkage issues or ABI bloat in future versions. Developers then explicitly mark public symbols with attributes like __attribute__((visibility("default"))), limiting the ABI surface to essential interfaces and reducing the risk that internal modifications become breaking changes. This approach not only safeguards compatibility but also improves load times and library sizes by minimizing dynamic symbol resolution overhead.^[60]
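The layout break described earlier in this section, where widening a field invalidates offsets that existing binaries have baked in, can be observed with Python's ctypes module, which lays out structures according to the platform's C ABI. The record types below are hypothetical; they are a sketch of the failure mode under that assumption, not taken from any real library.

```python
import ctypes


class RecordV1(ctypes.Structure):
    """Layout that an already-shipped binary was compiled against."""
    _fields_ = [("id", ctypes.c_int32), ("flags", ctypes.c_int32)]


class RecordV2(ctypes.Structure):
    """Hypothetical later release that widens `id` to 64 bits."""
    _fields_ = [("id", ctypes.c_int64), ("flags", ctypes.c_int32)]


def layout(struct_type):
    """Return ({field: offset}, total size) as the platform C ABI lays it out."""
    offsets = {name: getattr(struct_type, name).offset
               for name, _ in struct_type._fields_}
    return offsets, ctypes.sizeof(struct_type)


v1_offsets, v1_size = layout(RecordV1)
v2_offsets, v2_size = layout(RecordV2)

# New code fills in a V2 record; old code reinterprets the same bytes as V1.
new_record = RecordV2(id=1, flags=0x7F)
old_view = RecordV1.from_buffer_copy(bytes(new_record))
# old_view.flags now holds part of the widened `id`, not the real flags value.
```

The `flags` field moves from offset 4 to offset 8, so code compiled against the first layout reads four bytes of `id` where it expects `flags`. The total size of the second layout also depends on the platform's alignment rule for 64-bit integers, which is itself part of the ABI.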

Portability and Vendor Lock-in

Application binary interfaces (ABIs) are inherently tied to specific instruction set architectures (ISAs), which makes binaries difficult to move between hardware platforms. For instance, 32-bit x86 code on Unix-like systems follows the System V ABI, which passes parameters on the stack, returns results in registers such as EAX and EDX, and targets a variable-length instruction encoding. In contrast, the ARM ABI, defined by the Procedure Call Standard (PCS), passes the first four integer arguments in registers r0-r3 and, in its hardware floating-point variant, floating-point arguments in VFP registers (s0-s15 or d0-d7), with excess parameters on the stack. Windows on ARM additionally mandates the Thumb-2 instruction set, in which code pointers must have the low bit set. These differences in calling conventions, register usage, and instruction encoding mean that binaries compiled for x86 cannot be executed or linked on ARM processors without recompilation, emulation, or translation layers, which limits reuse and requires substantial porting effort.^[61]^[62]

Vendor-specific variations in ABI implementations further exacerbate portability issues and contribute to vendor lock-in, where software becomes dependent on a particular compiler or ecosystem. Compilers like GCC and Clang adhere to the Itanium C++ ABI, an industry-standard specification that defines consistent rules for name mangling, virtual function calls, and data layout, so that objects compiled with one link with those from the other on supported platforms. Microsoft's Visual C++ (MSVC), however, employs a distinct ABI that diverges from the Itanium model, particularly in vtable layouts, exception handling, and runtime library integration. That ABI has remained stable across major versions of Visual Studio since 2015 (e.g., toolsets v140 to v145 in Visual Studio 2026 as of November 2025), but it prevents direct interoperability with compilers that follow the Itanium model.^[34] This proprietary approach ties developers to the Windows ecosystem and Microsoft tools, as mixing binaries from MSVC with those from GCC or Clang often requires recompilation or wrappers, increasing costs and reducing flexibility. In addition, compiler-specific extensions, such as non-standard attributes or optimizations, can embed vendor-unique behaviors into binaries, further entrenching lock-in by making migration to alternative compilers error-prone or impossible without code changes.^[8]^[63]

To mitigate these portability and lock-in challenges, standardized ABIs have been developed to abstract platform and vendor dependencies. The Itanium C++ ABI serves as a cross-compiler foundation for Unix-like systems and multiple architectures, enabling relocatable objects and dynamic shared objects (DSOs) that work across GCC, Clang, and other compliant tools without recompilation, thus promoting ecosystem openness on Linux and similar environments. Similarly, WebAssembly (Wasm) provides a portable ABI through its stack-based virtual machine and binary format, which assumes little-endian byte ordering, two's complement integers, IEEE 754 floating-point, and support for unaligned memory access. Modules compiled from languages like C++, Rust, or Go can therefore execute efficiently on diverse ISAs, including x86 and ARM, across web browsers, servers, and embedded devices via host-defined interfaces like WASI, without being tied to specific hardware or vendor runtimes. These standards lower the barriers to binary reuse and vendor switching by enforcing uniform interfaces that prioritize interoperability over proprietary optimizations.^[63]^[64]
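The register-versus-stack differences between architectures discussed at the start of this section can be summarized in a simplified model. The sketch below handles only word-sized integer arguments and ignores floating-point values, structures, 64-bit arguments on 32-bit targets, and alignment rules. The ABI labels and the `assign_arguments` helper are hypothetical names, and the x86-64 entry is included for comparison even though the text above discusses only 32-bit x86 and ARM.

```python
# Integer argument registers, in order, for three calling conventions.
INT_ARG_REGS = {
    "i386-sysv": [],                                   # all arguments go on the stack
    "arm-aapcs": ["r0", "r1", "r2", "r3"],
    "x86_64-sysv": ["rdi", "rsi", "rdx", "rcx", "r8", "r9"],
}
WORD_BYTES = {"i386-sysv": 4, "arm-aapcs": 4, "x86_64-sysv": 8}


def assign_arguments(abi, count):
    """Return where each of `count` word-sized integer arguments is passed.

    Stack locations are byte offsets from the start of the stack argument
    area, not from the stack pointer itself.
    """
    regs = INT_ARG_REGS[abi]
    word = WORD_BYTES[abi]
    places = []
    for index in range(count):
        if index < len(regs):
            places.append(regs[index])
        else:
            # Arguments that do not fit in registers occupy consecutive slots.
            places.append(f"stack+{(index - len(regs)) * word}")
    return places


arm_call = assign_arguments("arm-aapcs", 5)
x86_call = assign_arguments("i386-sysv", 2)
```

For a five-argument call the ARM model places four values in r0-r3 and the fifth on the stack, while the 32-bit x86 model places every value on the stack. A caller built for one convention and a callee built for the other would therefore look for the same argument in different places, which is why such binaries cannot simply be linked together.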

References

[1]
What Is Application Binary Interface (ABI) - ITU Online IT Training
An Application Binary Interface (ABI) is a standardized interface between two binary program modules, often between an application and the operating system.
[2]
Itanium C++ ABI
In this document, we specify the Application Binary Interface (ABI) for C++ programs: that is, the object code interfaces between different user-provided C++ ...
[3]
5.10 Application Binary Interface
An Application Binary Interface (ABI) defines how functions that are written separately, and compiled separately can work together. This involves standardizing ...
[4]
Binary Compatibility - Using the GNU Compiler Collection (GCC)
Binary compatibility encompasses several related concepts: application binary interface (ABI): The set of runtime conventions followed by all of the tools that ...
[5]
Overview of x64 ABI conventions - Microsoft Learn
Jun 25, 2025 · This topic describes the basic application binary interface (ABI) for x64, the 64-bit extension to the x86 architecture.
[6]
Frequently Asked Questions
Definition and Explanation of ABI
[7]
[PDF] SYSTEM V APPLICATION BINARY INTERFACE - SCO
Mar 18, 1997 · The System V Application Binary Interface, or ABI, defines a system interface for ...
[8]
ABI Policy and Guidelines
Summary of ABI in Software Development
[9]
Understanding Application binary interface (ABI) [closed]
Aug 1, 2011 · An ABI is a set of conventions that allows a linker to combine separately compiled modules into one unit without recompilation, such as calling ...
[10]
Linux ABI description - The Linux Kernel documentation
This part of the documentation inside Documentation/ABI directory attempts to document the ABI between the Linux kernel and userspace, and the relative ...
[11]
[PDF] System V Application Binary Interface - AMD64 Architecture ...
Jul 2, 2012 · header understood by the unwind interface, defined as follows: ... This ABI does not define the passing of optional arguments. They ...
[12]
Differences Between APIs and ABIs | Baeldung on Computer Science
Apr 24, 2024 · An ABI, or “Application Binary Interface” is analogous to an API but expressed in compiled code instead of source code. In a Java application, ...
[13]
[PDF] ABI Compatibility Through a Customizable Language
There are two types of programming interfaces to a library: the Application Programming Interface (API) and the Application Binary Interface (ABI). The API ...
[14]
[PDF] How To Write Shared Libraries - Dartmouth Computer Science
Dec 10, 2011 · In addition, it introduces the concept of ABI (Application Binary Interface) stability and shows how to manage it.
[15]
20 ABI (Application Binary Interface) breaking changes every C++ ...
Apr 3, 2019 · If ABI compatibility is broken between the calling binary (exe or another dll) and your dll, it can result in unintended crashes.
[16]
Understanding the Full Impact of Breaking Changes - InfoQ
Jan 18, 2024 · For example, in Java, a library might maintain API compatibility even if method signature changes occur, but this can result in a loss of ABI ...
[17]
[PDF] Calling conventions - Agner Fog
Feb 1, 2023 · The System V ABI for 64-bit Unix systems requires alignment by 32. The System V ABI for 32-bit Unix does not mention __m256, but tests show ...
[18]
x64 Calling Convention | Microsoft Learn
Jul 25, 2025 · This article describes the standard processes and conventions that one function (the caller) uses to make calls into another function (the callee) in x64 code.
[19]
Relocation
Summary of Relocation in ELF
[20]
Dynamic Linking
Summary: Procedure Linkage Table (PLT), Global Offset Table (GOT), and Dynamic Symbol Resolution
[21]
Call Frame Information and .eh_frame unwinding in exception handling (DWARF5 standard, Section 6.4 and related sections)
[22]
[PDF] elf.pdf - ELF Object File Format
Sep 4, 2025 · The System V Release 4 (SVR4) Application Binary Interface (ABI) is composed of several components, ranging from a high-level specification of ...
[23]
ELF and ABI Standards - Linux Foundation
This version breaks ELF into 3 separate books, ELF, x86 psABI, and the Operating System Specific Specification for SVR4. This appears to be the most current TIS ...
[24]
[PDF] System V Application Binary Interface - x86-64
Sep 13, 2002 · The most fundamental differences from the Intel386 ABI document are as follows: • Sizes of fundamental data types. The architecture ...
[25]
ARM-software/abi-aa: Application Binary Interface for the ... - GitHub
This is the official place for the latest documents of the Application Binary Interface for the Arm® Architecture, both for source files and officially released ...
[26]
ABI List - glibc wiki - Sourceware
May 4, 2024 · glibc supports the following (architecture, ABI) combinations, with dynamic linker names as indicated. There may well be other cases of ...
[27]
The Open Group Base Specifications Issue 7, 2018 edition
POSIX.1-2017 is intended to be used by both application developers and system implementors and comprises four major components (each in an associated volume).
[28]
PE Format - Win32 apps - Microsoft Learn
Jul 14, 2025 · This document specifies the structure of executable (image) files and object files under the Microsoft Windows family of operating systems.
[29]
[PDF] Microsoft Portable Executable and Common Object File Format ...
The PE file header consists of an MS-DOS stub, the PE signature, the COFF File Header, and an Optional Header. A COFF object file header consists of a COFF File ...
[30]
An In-Depth Look into the Win32 Portable Executable File Format ...
The discussion includes the exports section, export forwarding, binding, and delayloading. The debug directory, thread local storage, and the resources sections ...
[31]
__stdcall | Microsoft Learn
Feb 10, 2025 · The __stdcall calling convention is used to call Win32 API functions. The callee cleans the stack, so the compiler makes vararg functions __cdecl.
[32]
__fastcall | Microsoft Learn
Sep 15, 2023 · The __fastcall calling convention specifies that arguments to functions are to be passed in registers, when possible.
[33]
COM Technical Overview - Win32 apps - Microsoft Learn
Jan 6, 2021 · The Microsoft Component Object Model (COM) defines a binary interoperability standard for creating reusable software libraries that interact at run time.
[34]
Native interoperability ABI support - .NET - Microsoft Learn
May 27, 2025 · The Application Binary Interface (ABI) is the interface that runtimes and operating systems use to express low-level binary details.
[35]
C++ binary compatibility 2015-2026
Summary of Visual C++ ABI and STL Binary Compatibility Guarantees
[36]
[PDF] MSP430 Embedded Application Binary Interface - Texas Instruments
This document is a specification for the ELF-based Embedded Application Binary Interface (EABI) for the MSP430 family of processors from Texas Instruments.
[37]
[PDF] PPCEABI: PowerPC Embedded Application Binary Interface (EABI)
Jan 10, 1995 · As in the SVR4 ABI, Fixed-Point Load and Store Multiple instructions and the Fixed Point Move Assist instructions shall not be allowed in ...
[38]
[PDF] RISC-V for Real-time MCUs - Software Optimization and ...
Compared to the regular ABI, the EABI reduces the number of argument registers from 8 to 4, consequently reducing the number of registers needed to be saved ...
[39]
[PDF] Procedure Call Standard for the Arm Architecture
Jun 12, 2020 · ... Procedure Call Standard use by the Application Binary Interface (ABI) ... AAPCS Procedure Call Standard for the Arm Architecture (this standard).
[40]
1.4. Bare Metal Compiler - Intel
It targets the ARM processor, it assumes bare metal operation, and it uses the standard ARM embedded application binary interface (EABI) conventions. The ...
[41]
https://github.com/riscv-non-isa/riscv-eabi-spec
[42]
https://github.com/riscv-non-isa/riscv-sbi-doc
[43]
MPI ABI standardization and ABI details for parallel computing (MPI-5.0 report)
[44]
Contract ABI Specification — Solidity 0.8.31-develop documentation
The Contract ABI is the standard way to interact with contracts in Ethereum, encoding data by type, requiring a schema to decode.
[45]
History - Multics
Jul 31, 2025 · Multics design was started in 1965 as a joint project by MIT's Project MAC, Bell Telephone Laboratories, and General Electric Company's Large ...
[46]
[PDF] Dynamic Linking Multics
Two ways to refer to data: (segment number, offset) and (file name, offset); segment is stored on disk or memory. Kind of like “mmap” for all data.
[47]
Evolution of the Unix Time-sharing System - Nokia
This paper presents a technical and social history of the evolution of the system. Origins. For computer science at Bell Laboratories, the period 1968-1969 was ...
[48]
[PDF] IBM OS - Bitsavers.org
This publication provides the information necessary to use the linkage editor or loader program of the IBM. System/360 Operating System to prepare the ...
[49]
[PDF] Interoffice Memorandum: PDP-11 Subprogram Calling Conventions
PDP-11 Subprogram Calling Conventions. Date: November 10, 1970. To: PDP-11 List C, PDP-11 Master List. From: Hank Spencer. Department: Programming.
[50]
None
[51]
Intel “x86” Family and the Microprocessor Wars - CHM Revolution
Shown below are generations of Intel microprocessors derived from the original 8086 architecture. As the number of bits in the CPU increased from 16 to 32 to 64 ...
[52]
Itanium C++ ABI: Exception Handling ($Revision: 1.22 $)
The standard ABI exception handling / unwind process begins with the raising of an exception, in one of the forms mentioned above. This call specifies an ...
[53]
Microsoft C/C++ change history 2003 - 2015
Oct 3, 2025 · This article describes all the breaking changes from Visual Studio 2015 going back to Visual Studio 2003.
[54]
WebAssembly Specification — WebAssembly 3.0 (2025-11-02)
[55]
Canonical ABI - The WebAssembly Component Model
An ABI is an application binary interface - an agreement on how to pass data around in a binary format. ABIs are specifically concerned with data layout at the ...
[56]
A Stable Modular ABI for Rust - compiler - Rust Internals
May 14, 2020 · A stable ABI would allow Rust libraries to be loaded by other languages (such as Swift), and would allow Rust to interop with libraries defined in other ...
[57]
Policies/Binary Compatibility Issues With C++ - KDE Community Wiki
Summary of Binary Compatibility Issues in C++ (KDE Community Wiki)
[58]
Visibility - GCC Wiki
Dec 17, 2021 · GCC's visibility feature hides unnecessary ELF symbols, improving load times, enabling better code optimization, and reducing DSO size by 5-20%.
[59]
Overview of ARM ABI Conventions | Microsoft Learn
Endianness. Windows on ARM executes in little-endian mode. Both the MSVC compiler and the Windows runtime always expect little-endian data. The SETEND ...
[60]
https://gcc.gnu.org/wiki/Visibility
[61]
Itanium C++ ABI
Mar 14, 2017 · Introduction. The Itanium C++ ABI is an ABI for C++. As an ABI, it gives precise rules for implementing the language, ensuring that separately- ...
[62]
Overview of potential upgrade issues (Visual C++) - Microsoft Learn
Oct 24, 2021 · This overview summarizes the most common classes of issues you're likely to see, and provides links to more detailed information.
[63]
Portability - WebAssembly
Summary: WebAssembly Portability Across ISAs via ABI