Application binary interface

An application binary interface (ABI) is a set of conventions that defines how compiled binary code from different sources interacts with the operating system, runtime libraries, and other software modules at the machine-code level, enabling interoperability without access to source code. Unlike an application programming interface (API), which specifies functions and data structures at the source-code level, an ABI covers low-level details such as binary file formats, so that object files can be linked and executed across compatible systems. ABIs are essential for binary compatibility: they allow components compiled separately, often by different compilers or vendors, to work together on the same platform.

An ABI typically specifies calling conventions (how functions pass arguments and return values), the layout and alignment of data in memory, register usage, and name mangling, so that symbols resolve correctly at link time. In C++ environments the ABI also covers object-oriented features such as virtual tables, exception propagation, and run-time type information (RTTI), as set out in standards like the Itanium C++ ABI adopted by many compilers. Common binary formats associated with ABIs include ELF (Executable and Linkable Format) on Unix-like systems and PE (Portable Executable) on Windows, which dictate how executables and shared libraries are structured for loading and execution. These specifications are platform-specific, varying between architectures such as x86, ARM, and PowerPC, and are usually defined by industry consortia or operating system vendors to promote portability.

The evolution of ABIs has been driven by the need for stable software ecosystems: a breaking change, such as an altered calling convention, can render existing binaries incompatible and force recompilation. In embedded systems, ABIs such as the Embedded Application Binary Interface (EABI) standardize conventions with minimal overhead for resource-constrained environments. Overall, ABIs underpin dynamic linking and plugin architectures and make it possible to reuse precompiled libraries across diverse applications.

Fundamentals

Definition

An application binary interface (ABI) is the low-level interface between two binary program modules, such as an application and a library or the operating system kernel, that specifies the runtime conventions by which compiled code from different modules interacts. It covers in-process communication between compiled code, including the rules for function calls and data passing. The term ABI refers to the specification of these conventions; its implementation appears in object files, executables, and the output of conforming tools such as compilers, assemblers, and linkers. Unlike the higher-level application programming interface (API), which defines interactions at the source-code level, an ABI operates at the binary level to enable interoperability among precompiled components.

Purpose

The ABI is the binary-level counterpart to the source-level API: it defines the conventions by which compiled code interacts at the machine level.

ABIs promote software modularity by enabling separate compilation of program modules, such as libraries and executables, which can then be linked and executed together without access to the original source code. Developers can build and distribute reusable binary components that remain compatible across compilation units as long as they follow the same ABI rules for object layout, name mangling, and linking. In C++ environments, for instance, the ABI standardizes how classes and functions are represented in binaries, so third-party libraries can be integrated into larger applications without recompilation.

At run time, ABIs ensure predictable behavior when binaries interact, covering argument passing, return value handling, and exception propagation across module boundaries. By fixing the exact formats for data exchange and control transfer, including register usage and stack conventions, an ABI prevents mismatches that could cause crashes or undefined behavior when binaries from different compilers or compiler versions are combined. This reliability is essential for dynamic loading, where code is brought into a running process on the fly and must not corrupt the program's execution environment.

ABIs also provide a stable interface through which user programs reach operating system services, particularly system calls, so that programs need not be recompiled after an OS update as long as the ABI is unchanged. In Linux, for example, the kernel's ABI documentation covers the conventions for invoking system calls, so binaries can reliably request kernel services such as file operations or process management, and interfaces classed as stable are guaranteed backward compatibility for at least two years. This stability lets applications run across kernel versions, supporting long-term binary portability and reducing maintenance overhead.

ABI versus API

Key Differences

The application binary interface (ABI) and the application programming interface (API) differ fundamentally in level of abstraction and operational scope. An API operates at the source-code level, defining how developers interact with software components through high-level constructs such as function signatures, type definitions, and constants exposed in header files or documentation. An ABI operates at the compiled binary level, specifying low-level details such as register usage, stack frame organization, and calling conventions that let binaries interoperate without source access. As a result, APIs are portable across compilers and languages as long as the source adheres to the interface, whereas ABIs are tightly coupled to specific hardware architectures and compiler implementations, as with the Itanium C++ ABI on the platforms that use it.

A key implication concerns stability and the impact of changes. Modifying an API, such as altering a function's parameter list, typically requires dependent applications to update and recompile their source code, but binaries already built against the old version are not themselves altered. Breaking an ABI, however, demands recompilation of all dependents because it disrupts binary compatibility; changes to data structure layouts or symbol versions, for instance, can make precompiled binaries unusable against the updated library. ABI stability is therefore a central concern in shared library design and is often managed through versioning schemes, such as symbol versioning in ELF binaries, to avoid widespread recompilation.

The two interfaces can also diverge in what they specify. An API might declare a function that accepts an int parameter without fixing its exact representation, leaving flexibility in implementation. The corresponding ABI mandates the precise bit width (e.g., 32 bits for int in the System V AMD64 ABI), the alignment rules, and the byte order (e.g., little-endian on x86 architectures) so that every module interprets the binary data identically. Such details prevent runtime errors but tie the ABI to platform specifics, unlike the more abstract API.
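
To make the last point concrete, here is a minimal sketch in Python of how one source-level value can have several binary encodings. The three "ABIs" in the dictionary are illustrative labels invented for this example, not named standards; only the `struct` format codes are real.

```python
import struct

# The same API-level argument, encoded under three hypothetical ABI choices.
value = 0x01020304

encodings = {
    "32-bit little-endian": struct.pack("<i", value),
    "32-bit big-endian": struct.pack(">i", value),
    "64-bit little-endian": struct.pack("<q", value),
}

for name, raw in encodings.items():
    print(f"{name:22} {raw.hex(' ')}")

# A module that assumes the wrong byte order decodes a different number
# from exactly the same bytes.
misread = struct.unpack(">i", encodings["32-bit little-endian"])[0]
print(hex(misread))
```

Source code that only says "pass an int" is satisfied by all three encodings; two binaries interoperate only if both sides chose the same one.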

Interdependence

APIs define the source-level interfaces that developers write against, specifying function signatures, types, and behaviors in human-readable source code. Compilers translate these definitions into binary representations governed by the ABI, which dictates low-level details such as memory layouts, calling conventions, and symbol naming, so that the compiled code accurately reflects the intended source-level interactions. A library's ABI therefore emerges from the combination of the library's API and the compiler's ABI implementation, and together they form a binary contract that enables interoperability between compiled components.

Changes to an API can directly affect the ABI, particularly when they alter binary structures or function interfaces. Adding a parameter to a function, for instance, requires recompilation and may shift the register or stack layout of the call, breaking compatibility with existing binaries that expect the original interface. Similarly, changing a parameter from a value type such as int to a reference type can preserve source-level compatibility in some languages while still breaking the ABI, because the binary signatures differ. Conversely, keeping the API stable by avoiding such alterations helps keep the ABI stable across versions, so a library can evolve internally without forcing dependent software to be recompiled.

In practice, developers rely on APIs for portability across compilers and platforms, writing source once and compiling it for each environment. Once deployed, applications depend on the ABI for binary-level execution and linkage, which allows libraries to be reused without access to their source. This interdependence calls for careful API design that minimizes ABI disruption, since binary incompatibility can cause runtime failures in production systems where recompilation is impractical.
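
As a rough illustration of a source-compatible change that breaks binary compatibility, the sketch below uses Python's `ctypes` to model two versions of a structure. `RecordV1`, `RecordV2`, and their fields are hypothetical names made up for this example, and the offsets noted in the comments assume a mainstream platform where `uint32` is 4-byte aligned.

```python
import ctypes

class RecordV1(ctypes.Structure):
    # Layout that existing binaries were compiled against.
    _fields_ = [("tag", ctypes.c_uint8), ("value", ctypes.c_uint32)]

class RecordV2(ctypes.Structure):
    # A later release inserts a field before `value`.
    _fields_ = [
        ("tag", ctypes.c_uint8),
        ("extra", ctypes.c_uint32),
        ("value", ctypes.c_uint32),
    ]

def offsets(struct_type):
    """Map each field name to its byte offset inside the structure."""
    return {name: getattr(struct_type, name).offset
            for name, _ in struct_type._fields_}

print(offsets(RecordV1), ctypes.sizeof(RecordV1))  # value at offset 4
print(offsets(RecordV2), ctypes.sizeof(RecordV2))  # value moved to offset 8

# An old binary still reads `value` at the V1 offset, so it sees `extra`.
new_obj = RecordV2(tag=1, extra=0xAAAA, value=42)
stale_view = RecordV1.from_buffer_copy(bytes(new_obj)[:ctypes.sizeof(RecordV1)])
print(stale_view.value)
```

Source code that refers to `record.value` by name compiles against either version, which is why the break only shows up at the binary level.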

Components of an ABI

Calling Conventions

Calling conventions are a central component of an ABI: they dictate exactly how one function invokes another at the binary level. They specify the order and location of parameter passing (typically CPU registers for the first few arguments and the call stack for larger or excess arguments), the handling of return values in designated registers or memory, and the division of responsibility between caller and callee for stack cleanup and register preservation. These rules ensure interoperability between separately compiled modules and prevent runtime errors caused by mismatched expectations during function calls.

Modern calling conventions pass parameters in registers where possible to minimize latency, with the stack serving as overflow for additional arguments. Integer and pointer parameters are assigned to 64-bit general-purpose registers, while floating-point values use vector registers such as the SSE XMM registers. Return values follow similar patterns: scalar integers of up to 64 bits are returned in the accumulator register (%rax, or RAX in Intel syntax), while larger or structured results may be returned through a hidden pointer to caller-allocated memory. Cleanup responsibilities vary between conventions; in many, the caller removes its own arguments from the stack, but the callee must preserve any non-volatile registers it uses and restore them before returning. This division supports stack unwinding and debugging by standardizing prologue and epilogue code sequences.

The System V AMD64 ABI, prevalent in Unix-like environments, assigns the first six integer or pointer parameters to %rdi, %rsi, %rdx, %rcx, %r8, and %r9 and the first eight floating-point parameters to %xmm0 through %xmm7; excess arguments are pushed right to left onto a 16-byte-aligned stack. Return values use %rax and %rdx for integers and %xmm0 and %xmm1 for floating-point types, and the callee must preserve %rbx, %rbp, and %r12 through %r15.

In contrast, the Microsoft x64 calling convention, standard on Windows, passes only the first four arguments in registers, chosen by argument position: RCX, RDX, R8, and R9 for integers and pointers, or XMM0 through XMM3 for floating-point values. The caller must also reserve 32 bytes of "shadow space" on the stack, which the callee may use to spill those register arguments. Return values use RAX or XMM0, as in System V, and the callee preserves the non-volatile registers, which include XMM6 through XMM15. Register-based conventions improve performance over stack-only conventions by avoiding memory accesses on frequently executed call paths.

Both conventions enforce stack alignment, typically 16 bytes at the point of a call, to suit SIMD instructions, with 32-byte alignment needed in some cases for wider vector types such as __m256. A frame pointer (%rbp, or RBP) is optional but commonly established in the prologue as a stable reference for local variables and arguments, especially in unoptimized code; omitting it frees a register and shortens the prologue and epilogue. The prologue typically saves callee-saved registers and adjusts the stack pointer (RSP); in System V, leaf functions may also use the 128-byte "red zone" below RSP for temporary storage without explicit allocation. The epilogue reverses these operations so that RSP and the preserved registers are restored on return.
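
The difference between the two register-assignment schemes can be sketched as a small Python model. It is a deliberate simplification that handles only scalar arguments tagged `"int"` (integer or pointer) or `"fp"` (floating-point); the function names are invented for this example, and aggregates, varargs, and return values are ignored.

```python
SYSV_INT = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]
SYSV_FP = [f"xmm{i}" for i in range(8)]
WIN64_INT = ["rcx", "rdx", "r8", "r9"]
WIN64_FP = ["xmm0", "xmm1", "xmm2", "xmm3"]

def assign_sysv(kinds):
    """System V AMD64: integer and floating-point arguments draw from
    separate register pools, each consumed in order."""
    pools = {"int": iter(SYSV_INT), "fp": iter(SYSV_FP)}
    return [next(pools[kind], "stack") for kind in kinds]

def assign_win64(kinds):
    """Microsoft x64: the register is chosen by argument position, so
    only the first four arguments can be passed in registers."""
    locations = []
    for position, kind in enumerate(kinds):
        if position < 4:
            bank = WIN64_INT if kind == "int" else WIN64_FP
            locations.append(bank[position])
        else:
            locations.append("stack")
    return locations

# Hypothetical signature: f(int a, double b, int c, double d, int e)
signature = ["int", "fp", "int", "fp", "int"]
print(assign_sysv(signature))   # rdi, xmm0, rsi, xmm1, rdx
print(assign_win64(signature))  # rcx, xmm1, r8, xmm3, stack
```

The same five-argument call therefore places its values in entirely different locations on the two platforms, which is why object code built for one convention cannot be called directly from the other.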

Data Types and Memory Layout

An ABI strictly defines the representation of fundamental data types so that compiled binaries interpret them consistently. In the System V ABI for AMD64, char occupies 1 byte with 1-byte alignment, short 2 bytes with 2-byte alignment, int 4 bytes with 4-byte alignment, long 8 bytes with 8-byte alignment, float 4 bytes with 4-byte alignment, double 8 bytes with 8-byte alignment, and long double 16 bytes with 16-byte alignment (10 bytes of data in the x87 extended format plus 6 bytes of padding). The Windows x64 ABI defines char and unsigned char as 1 byte (1-byte alignment), short and unsigned short as 2 bytes (2-byte alignment), int, long, unsigned int, and unsigned long as 4 bytes (4-byte alignment), __int64 and unsigned __int64 as 8 bytes (8-byte alignment), float as 4 bytes (4-byte alignment), and double as 8 bytes (8-byte alignment). These specifications prevent mismatches in type interpretation, such as one module treating long as 8 bytes while another treats it as 4.
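
| Data type | System V AMD64 size / alignment (bytes) | Windows x64 size / alignment (bytes) |
| --- | --- | --- |
| char | 1 / 1 | 1 / 1 |
| short | 2 / 2 | 2 / 2 |
| int | 4 / 4 | 4 / 4 |
| long | 8 / 8 | 4 / 4 |
| float | 4 / 4 | 4 / 4 |
| double | 8 / 8 | 8 / 8 |
| Pointer | 8 / 8 | 8 / 8 |
| long double | 16 / 16 | 8 / 8 (same representation as double) |
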
Endianness governs the representation of multi-byte data. Both System V AMD64 and Windows x64 are little-endian: the least significant byte is stored at the lowest memory address, so the 32-bit value 0x01020304 is laid out in memory as the bytes 04 03 02 01.

For aggregate types such as structures, ABIs require members to be laid out in declaration order, with padding inserted so that each member meets its alignment and the overall size is a multiple of the strictest member alignment. In the System V AMD64 ABI, each member starts at an offset aligned to its natural boundary; for example, a structure containing a 1-byte char followed by an 8-byte long has 7 bytes of padding after the char. The Windows x64 ABI follows similar rules, for example inserting 4 bytes of padding after an int that precedes an 8-byte-aligned member. Packing directives that override the default alignment change the binary layout and must therefore be applied identically in every module that shares the type.

In C++, the Itanium ABI extends these rules to class layout: non-virtual bases and data members appear in declaration order, virtual bases follow in inheritance-graph order, and empty base classes may share an offset with other subobjects to save space without violating alignment. Virtual tables (vtables) under the Itanium ABI have a specified layout of their own, including an offset-to-top field, a type information pointer, and an array of virtual function pointers; the primary vtable is followed by secondary vtables for non-primary bases, ordered by a depth-first, left-to-right traversal.

Pointers are treated as unsigned integers of the platform's word size, 8 bytes with 8-byte alignment on the 64-bit ABIs above, with null conventionally represented by the all-zero bit pattern. Arrays are contiguous blocks of elements starting at the base address; their size is a multiple of the element size and they take the element's alignment (System V AMD64 additionally aligns arrays of 16 bytes or more to at least 16 bytes).
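
The padding rule for structures can be expressed as a short calculation. The following Python sketch assumes natural alignment with no packing directives; the `layout` function and the member names are illustrative, and the sizes and alignments passed in are taken from the System V AMD64 figures given above.

```python
def layout(members):
    """Compute member offsets for a C-style struct.

    members: list of (name, size, alignment) tuples in declaration order.
    Returns (offsets, total_size, struct_alignment).
    """
    offset = 0
    struct_align = 1
    offsets = {}
    for name, size, align in members:
        # Round the running offset up to this member's alignment.
        offset = (offset + align - 1) // align * align
        offsets[name] = offset
        offset += size
        struct_align = max(struct_align, align)
    # Tail padding: the total size is a multiple of the strictest alignment.
    total = (offset + struct_align - 1) // struct_align * struct_align
    return offsets, total, struct_align

# struct { char c; long l; } with an 8-byte long
print(layout([("c", 1, 1), ("l", 8, 8)]))

# Member order changes the size: 12 bytes versus 8 bytes.
print(layout([("a", 1, 1), ("b", 4, 4), ("c", 1, 1)]))
print(layout([("b", 4, 4), ("a", 1, 1), ("c", 1, 1)]))
```

Because the ABI fixes declaration order, a compiler is not free to reorder members to save space; any reordering has to be done in the source, and it changes the binary layout for every module that uses the type.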

Procedure Linkage

Procedure linkage governs how procedures (functions) are referenced and connected between binary modules during linking and execution. It begins with rules for symbol naming, which ensure that each external procedure is uniquely identified, particularly in languages that support overloading and namespaces. In C, external symbols on ELF platforms keep their plain source names without decoration, so they resolve directly against the symbol tables of object files. C++ instead uses name mangling to encode additional information such as namespaces and parameter types, which lets the linker distinguish overloaded functions; under the Itanium C++ ABI, mangled names begin with an underscore followed by 'Z' and an encoding of the function signature.

Relocation and symbol resolution adjust procedure addresses in binaries, for both static linking at build time and dynamic loading at run time. During static linking, the linker resolves symbols by matching references to definitions across object files and applies relocations to update addresses: absolute relocations fill in fixed memory addresses, while relative relocations use offsets, which suits position-independent code. In the dynamic case, the loader (ld.so on Unix-like systems) uses the Procedure Linkage Table (PLT) and Global Offset Table (GOT) to defer resolution. The first call to an external procedure goes through a PLT stub to the loader, which performs lazy binding by searching the symbol tables of the loaded shared objects and patching the GOT with the resolved address; later calls jump straight to the target.

Exception handling across module boundaries requires coordinated propagation and unwinding. The ABI specifies how stack unwinding traverses frames belonging to different binaries, guided by unwind tables. On ELF-based systems, the .eh_frame section holds Call Frame Information (CFI), with Common Information Entries (CIEs) supplying default rules and Frame Description Entries (FDEs) supplying per-function instructions for register restoration and stack pointer adjustment. The unwinder uses these to restore each caller's state in turn, so that cleanup actions such as destructor calls run consistently across modules.
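
The name-mangling scheme mentioned above can be sketched for the simplest cases. This Python fragment covers only free functions with built-in parameter types, optionally inside user-defined namespaces; it omits substitutions, the `std` abbreviation, templates, pointers, qualifiers, and everything else in the real Itanium rules. The `mangle` helper is an illustrative name, not part of any toolchain.

```python
# Itanium C++ ABI codes for a few built-in types.
TYPE_CODES = {
    "void": "v", "bool": "b", "char": "c", "short": "s",
    "int": "i", "long": "l", "float": "f", "double": "d",
}

def mangle(qualified_name, param_types):
    """Mangle a free function such as 'ns::foo' taking built-in types."""
    parts = qualified_name.split("::")
    # Each identifier is emitted as <length><name>.
    encoded = "".join(f"{len(part)}{part}" for part in parts)
    if len(parts) > 1:
        encoded = f"N{encoded}E"  # nested names are wrapped in N ... E
    # An empty parameter list is encoded as a single 'v'.
    params = "".join(TYPE_CODES[t] for t in param_types) or "v"
    return "_Z" + encoded + params

print(mangle("foo", ["int"]))            # _Z3fooi
print(mangle("foo", ["double", "int"]))  # _Z3foodi
print(mangle("foo", []))                 # _Z3foov
print(mangle("ns::foo", ["int"]))        # _ZN2ns3fooEi
```

The two overloads of `foo` produce distinct symbols, which is how the linker tells them apart; a C function named `foo` would instead appear in the symbol table simply as `foo`.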

Standards and Implementations

POSIX and Unix-like Systems

In POSIX-compliant and Unix-like systems, the System V Application Binary Interface serves as the foundational standard for binary compatibility across architectures and implementations. Originally developed as part of System V Release 4 (SVR4), it specifies the interface between applications and the operating system, including object file formats, linking mechanisms, and runtime behavior. It defines consistent rules for how binaries interact with shared libraries and the kernel, so that an executable built on one Unix variant can run on another that implements the same ABI.

A core component of the System V ABI is the Executable and Linking Format (ELF), which standardizes the structure of object files, executables, and shared libraries. ELF files include a header for identification (e.g., the architecture in e_machine), sections for code and data (e.g., .text and .data), and program headers that describe the segments to load into memory. For processor-specific conventions, x86-64 follows the System V AMD64 psABI, which uses 64-bit ELF (ELFCLASS64) with little-endian byte order and defines details such as a downward-growing stack with 16-byte alignment. The ARM ELF supplements likewise cover the 32-bit and 64-bit states (AArch32 and AArch64), with architecture-specific relocation types (e.g., R_ARM_PC24) and section flags. Thread-local storage (TLS) is handled uniformly across these architectures, using dedicated sections (.tdata for initialized data, .tbss for uninitialized data) flagged with SHF_TLS and a PT_TLS program header from which the runtime allocates a separate copy for each thread.

In Linux distributions, which follow the System V ABI, the GNU C Library (glibc) implements the user-space side, providing stable interfaces for functions and data structures and maintaining backward compatibility across releases. The kernel-user boundary is crossed through system calls: user programs invoke kernel services with instructions such as syscall on x86-64 or svc on ARM, passing arguments according to the architecture's conventions. Linux also supports multi-ABI environments, such as running 32-bit binaries on 64-bit kernels through compatibility layers (e.g., IA-32 emulation), so legacy applications run without recompilation.

The IEEE POSIX standards, particularly POSIX.1, define source-level interfaces that each system maps onto its underlying ABI, giving Unix-like systems (e.g., Linux and the BSD variants) a common baseline for portable software. Binaries themselves remain portable only between systems that share the same ABI, such as the System V ABI with its ELF conventions.
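
As a small illustration of the ELF identification fields described above, the following Python sketch decodes the first 20 bytes of an ELF header. The `parse_elf_header` name and the hand-built `sample` header are assumptions made for this example; the field offsets and numeric codes come from the ELF specification, and only a handful of machine values are listed.

```python
import struct

EI_CLASS = {1: "ELF32", 2: "ELF64"}
EI_DATA = {1: "little-endian", 2: "big-endian"}
E_MACHINE = {3: "x86", 40: "ARM", 62: "x86-64", 183: "AArch64", 243: "RISC-V"}

def parse_elf_header(header: bytes) -> dict:
    """Decode e_ident plus e_type and e_machine from an ELF header."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF header")
    ei_class, ei_data = header[4], header[5]
    if ei_data not in EI_DATA:
        raise ValueError("unknown data encoding")
    # Multi-byte header fields use the byte order named in e_ident.
    order = "<" if ei_data == 1 else ">"
    e_type, e_machine = struct.unpack_from(order + "HH", header, 16)
    return {
        "class": EI_CLASS.get(ei_class, "unknown"),
        "data": EI_DATA[ei_data],
        "type": e_type,
        "machine": E_MACHINE.get(e_machine, f"unknown ({e_machine})"),
    }

# Synthetic header: ELF64, little-endian, e_type 3 (ET_DYN), e_machine 62.
sample = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8) + struct.pack("<HH", 3, 62)
info = parse_elf_header(sample)
print(info)
```

On a Linux system the same function could be given the first 20 bytes of a real binary, for example read from `/bin/ls`, although the result then depends on the machine it runs on.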

Windows and Microsoft Ecosystems

The Portable Executable (PE) format, an extension of the Common Object File Format (COFF), defines the structure of executable files, object files, and dynamic-link libraries (DLLs) on Windows. A PE file begins with an MS-DOS header kept for compatibility, followed by the PE signature, a COFF file header specifying the machine type and number of sections, and an optional header whose data directories point to key structures such as the import and export tables. The import table lists the external functions and data the module requires; it is organized into import descriptors that name each DLL and contain thunks through which addresses are resolved at load time, enabling dynamic linking. The export table, conversely, defines the public symbols a DLL exposes to other modules, including function names, ordinals, and addresses. Delay loading extends this scheme by deferring the loading of a DLL and the resolution of its functions until first use, which reduces startup time; it relies on a separate delay-import descriptor and a helper function such as __delayLoadHelper2.

Microsoft's calling conventions standardize argument passing and stack management on x86 and x64 to ensure interoperability. On x86, __stdcall passes parameters right to left on the stack with the callee cleaning up; it is used for Win32 API calls and reduces code size because cleanup code is not repeated at every call site. __fastcall passes the first two integer or pointer arguments in ECX and EDX before falling back to the stack. On x64, Microsoft adopts a single fastcall-like convention in which the first four arguments are passed in registers according to position (RCX, RDX, R8, and R9 for integers and pointers; XMM0 through XMM3 for floating-point values), and the caller allocates 32 bytes of shadow space and cleans up the stack.

The Component Object Model (COM) provides binary stability through interface-based contracts: objects expose interfaces identified by GUIDs and implemented as vtables, whose binary layout stays fixed across implementations and compiler versions. Clients can bind to an interface without recompilation as long as new versions maintain backward compatibility by leaving existing vtable entries unchanged.

The Microsoft Visual C++ (MSVC) toolchain maintains ABI compatibility across Visual Studio versions from 2015 (toolset v140) onward: object files, libraries, and executables built with later toolsets (v141 through v145, as of November 2025) can be linked and run together without recompilation, provided the linker is at least as new as the newest toolset used. This covers the C++ Standard Library (STL), whose containers and algorithms keep stable binary layouts, and the C++ runtime, which is deployed through a single Visual C++ Redistributable package. The exceptions are builds using whole-program optimization (/GL) or link-time code generation (/LTCG), which require all inputs to come from the same toolset version.
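
The header chain at the start of a PE file can be followed with a few lines of Python. This is a minimal sketch that reads only the machine type and section count; `parse_pe_machine` and the synthetic `image` bytes are made up for the example, while the offsets (the `e_lfanew` field at 0x3C, the `PE\0\0` signature, and the leading COFF fields) follow the published PE format.

```python
import struct

PE_MACHINE = {0x014C: "x86", 0x8664: "x64", 0xAA64: "ARM64"}

def parse_pe_machine(image: bytes):
    """Return (machine, number_of_sections) from a PE image's headers."""
    if image[:2] != b"MZ":
        raise ValueError("missing MS-DOS header")
    # e_lfanew at offset 0x3C gives the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # The COFF file header follows the signature: Machine, NumberOfSections.
    machine, section_count = struct.unpack_from("<HH", image, pe_offset + 4)
    return PE_MACHINE.get(machine, hex(machine)), section_count

# Synthetic image: a 64-byte DOS header pointing at a PE header for x64.
dos_header = bytearray(0x40)
dos_header[0:2] = b"MZ"
struct.pack_into("<I", dos_header, 0x3C, 0x40)
image = bytes(dos_header) + b"PE\0\0" + struct.pack("<HH", 0x8664, 3)

result = parse_pe_machine(image)
print(result)
```

A loader performs the same walk before going on to the optional header and its data directories, which is where the import and export tables are located.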

Embedded and Specialized ABIs

In embedded systems, ABIs are optimized for resource-constrained environments such as microcontrollers with limited memory and power, where minimizing overhead is critical. These ABIs often limit the number of registers used for argument passing and employ compact calling conventions to conserve stack and code space. For instance, the Embedded Application Binary Interface (EABI) for processors such as the MSP430 specifies ELF-based formats tailored to low-power devices, with data layouts that avoid unnecessary padding or metadata that would inflate binary size. Similarly, on targets that run a real-time operating system (RTOS) or bare-metal code without a floating-point unit, software often favors fixed-point arithmetic over floating point: fixed-point operations use integer instructions, saving cycles and energy.

The ARM Architecture Procedure Call Standard (AAPCS), part of the broader ARM ABI, defines parameter passing, register usage, and stack alignment, and the EABI variant applies it to bare-metal applications. Under AAPCS, up to four integer or pointer arguments are passed in registers r0 through r3 to minimize stack traffic; the stack grows downward and is 8-byte aligned at public function boundaries. Bare-metal ARM targets typically use statically linked executables with no dynamic linking, since there is no OS loader, and interrupt handlers save only the context they need, which keeps latency low for RTOS tasks.

The RISC-V ABI, documented in the ELF psABI specification, similarly prioritizes efficiency; a proposed embedded ABI (EABI) reduces the argument registers from eight to four to cut context-save costs during interrupts, improving real-time performance on microcontrollers. In bare-metal RISC-V systems, interrupts are taken through the machine-mode trap mechanism, and handlers must save and restore any registers they use, including callee-saved registers such as s0 through s11. The Supervisor Binary Interface (SBI) separately standardizes calls from supervisor-mode software into machine-mode firmware.

Domain-specific ABIs address requirements beyond general-purpose computing, such as high-performance computing and blockchain execution. The Message Passing Interface (MPI) ABI, standardized in MPI-5.0, enables binary compatibility across MPI implementations on clusters by fixing handle types (e.g., opaque pointers for communicators and datatypes), status objects (arrays of eight integers), and integer types such as MPI_Aint (defined as intptr_t), so compiled parallel applications can link against different MPI libraries without recompilation. Functions such as MPI_Abi_get_version allow the ABI version to be checked at run time.

In blockchain contexts, the Ethereum contract ABI pairs a JSON description of a contract's interface with a binary encoding for function calls, events, and data exchanged between the Ethereum Virtual Machine (EVM) and external applications or other contracts. Static types (e.g., uint256) are encoded in place and dynamic types (e.g., bytes) through offset pointers, all in 32-byte words. A call is dispatched by a 4-byte function selector, the first four bytes of the Keccak-256 hash of the function signature, which gives a deterministic encoding in an environment with no native OS support.
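
The head-and-tail scheme used by the Ethereum encoding can be sketched for one simple case, a call to a hypothetical function taking `(uint256, bytes)`. The helper names below are invented for this example. The selector is passed in as an opaque placeholder rather than computed, because Keccak-256 is not the same function as the `sha3_256` in Python's standard library.

```python
def encode_uint(value: int) -> bytes:
    """Static type: one 32-byte big-endian word."""
    return value.to_bytes(32, "big")

def encode_bytes(data: bytes) -> bytes:
    """Dynamic type: a length word, then the data padded to 32 bytes."""
    padded_length = (len(data) + 31) // 32 * 32
    return encode_uint(len(data)) + data.ljust(padded_length, b"\0")

def encode_call(selector: bytes, number: int, blob: bytes) -> bytes:
    """Encode a call to a hypothetical f(uint256, bytes)."""
    head_size = 2 * 32  # one head slot per argument
    # The dynamic argument's head slot holds the offset of its tail data,
    # measured from the start of the argument block.
    head = encode_uint(number) + encode_uint(head_size)
    tail = encode_bytes(blob)
    return selector + head + tail

placeholder_selector = bytes.fromhex("12345678")  # not a real selector
call = encode_call(placeholder_selector, 69, b"abc")

# Print the selector, then each 32-byte word of the argument block.
print(call[:4].hex())
for start in range(4, len(call), 32):
    print(call[start:start + 32].hex())
```

The printed words are, in order, the integer 69, the offset 0x40 pointing past the two head slots, the length 3, and the three data bytes padded with zeros.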

Historical Development

Origins in Early Computing

The concepts underlying the application binary interface first emerged in the 1960s and 1970s through the development of linkers and loaders in pioneering operating systems, which standardized how binary modules interacted at the machine level. In Multics, initiated in 1965 as a collaborative project by MIT, Bell Labs, and General Electric, dynamic linking was a core feature: segments of code were loaded and bound at run time, with segment tables and linkage sections used to resolve references across modules. This approach established early rules for binary modularity, in which procedures in separate segments could reference each other by symbolic name, and it influenced later systems by prioritizing relocatability and shared code execution.

Early Unix, by contrast, developed at Bell Labs starting in 1969 on the PDP-7 and later the PDP-11, initially lacked a dedicated linker, and programs were written in assembly as self-contained binaries. The introduction of the B compiler in 1970 on the PDP-11 brought the a.out format, a simple executable structure produced by the assembler (as) and linker (ld), with headers describing text, data, and symbol tables to enable basic relocation and loading. Named after the default output file "a.out", the format laid down foundational ABI principles by defining how object files could be combined into runnable binaries without recompilation.

Parallel advances in assembler and loader technology, particularly in IBM's OS/360 released in 1966, further established binary relocatability as a cornerstone of early ABIs. The OS/360 linkage editor processed relocatable object modules, generated by assemblers such as the Basic Assembler, using external symbol dictionaries (ESD) and relocation dictionaries (RLD) to resolve inter-module references and adjust addresses relative to a base origin. Control sections served as the smallest relocatable units, with A-type constants holding addresses within a section and V-type constants holding external ones, which allowed load modules to be positioned in memory at execution time. The loader then performed final address modifications and overlay management, reducing storage needs (overlay structures could, for instance, shrink a 32K-byte program to 18K bytes) while enforcing standardized formats for text, constants, and entry points. These mechanisms established basic interface rules, such as consistent symbol resolution and error handling (e.g., IEW0012 for invalid constants), ensuring binary compatibility across programs and libraries in multiprogramming environments.

Early calling conventions, the part of an ABI that governs subroutine interaction, were formalized on systems such as the PDP-11, where hardware capabilities directly shaped software interfaces. A 1970 DEC memorandum outlined PDP-11 subprogram conventions, recommending the Jump to Subroutine (JSR) instruction with a branch register (BR) to pass argument counts and addresses, and using the stack for reentrant calls through push operations such as MOV #A, -(SP). The design prioritized execution speed and simplicity, supporting variable-length argument lists and fail-soft recovery, with RS tracking call locations. The PDP-11's instruction set architecture (ISA), with eight general-purpose registers (R0 through R5 for data, R6 as the stack pointer, R7 as the program counter), supported these conventions through autoincrement and autodecrement addressing, which made parameter passing and nested calls efficient without manual saving of linkage. Similarly, the x86 ISA, introduced with the Intel 8086 in 1978, imposed its own initial ABI assumptions through its segmented memory model and limited registers (e.g., AX and BX for parameters), dictating stack-based calling conventions that echoed the PDP-11 but were adapted to 16-bit addressing constraints. These hardware-driven designs ensured that binaries could interoperate reliably and set precedents for parameter marshaling and return value handling in later architectures.

Evolution with Modern Languages

The 1980s marked a pivotal transition in ABI development as became the dominant systems language, replacing for most development and necessitating standardized binary interfaces for portability across Unix variants. AT&T's , first released in 1983, introduced the Common Object File Format (COFF) as a more sophisticated replacement for the a.out format, supporting relocatable object modules with sections for code, data, and debugging symbols, along with standardized relocation and symbol tables to enable consistent linking and loading. Concurrently, industry efforts through the X/Open Company (formed 1984) and the IEEE standards (first draft 1986, standardized 1988) began defining C-specific ABI elements, including calling conventions (e.g., right-to-left stack parameter passing on x86), data type sizes and alignments (e.g., int as 32-bit), and interfaces, promoting binary compatibility in a fragmented Unix ecosystem. The development of ABIs in the late 20th century was heavily influenced by the growing complexity of C and C++, which demanded standardized binary interfaces to support features beyond simple procedural code. In the 1990s, the Itanium C++ ABI emerged from an industry collaboration led by HP and Intel, providing a comprehensive specification for C++ binaries on the Itanium architecture but influencing broader ecosystems. This ABI introduced detailed name mangling rules to encode overloaded functions, operators, and constructors into unique symbols starting with "_Z", ensuring unambiguous linkage across object files and libraries. It also defined exception handling protocols, including routines like __cxa_throw and __cxa_begin_catch, which integrate with platform unwind mechanisms to propagate exceptions across call stacks while preserving type safety. 
To accommodate C++'s evolving language features, the ABI extended support for templates through specialized mangling of template arguments (e.g., using "I" for template instantiation followed by encoded parameters), allowing distinct representations for different instantiations without name collisions. Run-time type information (RTTI) was formalized via std::type_info structures, emitted with vague linkage in COMDAT groups for polymorphic classes, enabling operations like dynamic_cast and exception type matching. Namespaces received nested mangling support (e.g., "N" for nested names ending with "E"), keeping identically named entities in different scopes distinct in binaries while maintaining compatibility with global scope. These extensions were crucial as C++ standardized features in the 1990s and early 2000s, but implementation variances arose: GCC has maintained Itanium ABI stability since version 3.4 in 2004, while MSVC historically introduced ABI breaks with each major release from .NET 2003 through 2013 (such as changes in structure padding and virtual table layouts) before achieving intra-family compatibility from 2015 onward via toolset versioning (e.g., v140 to v143).

In the 2010s, WebAssembly (Wasm) marked a shift toward platform-agnostic ABIs, standardizing a binary instruction format for a stack-based virtual machine that supports portable code execution across browsers and standalone runtimes. Its core ABI defines fixed-width numeric types (e.g., i32 for 32-bit integers, f64 for doubles) and reference types, with calling conventions relying on operand stacks and linear memory addressing to pass arguments and results between functions and host environments. This design ensures binaries compiled from languages like C++, Rust, or Go remain interoperable without architecture-specific adjustments, addressing portability challenges in distributed systems.
The subsequent Canonical ABI, part of the WebAssembly Component Model, builds on this by specifying canonical representations for higher-level constructs (such as variants as a discriminant followed by a payload, and strings as a pointer and length pair referring to bytes in linear memory), enabling seamless data exchange across language boundaries in multi-module applications.

Contemporary languages continue to drive ABI evolution, with Rust exemplifying efforts to balance stability and performance. Rust's project goals include developing a modular, stable ABI to support dynamic loading of crates as plugins and interoperability with languages like C or Swift, motivated by needs in systems programming such as the Fuchsia OS kernel. Proposals envision versioned ABIs using attributes like #[repr(RustABI)] for explicit contracts, drawing lessons from C++'s fragmentation to avoid optimization constraints while enabling runtime linking. As of 2025, however, Rust guarantees no stable ABI across compiler versions, which preserves flexibility in code generation and memory layouts.
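Returning briefly to the Canonical ABI's treatment of strings, the sketch below models the idea of passing a string across a component boundary as a pointer and length into linear memory. LinearMemory, lower_string, and lift_string are hypothetical names invented for this example, the bump allocator stands in for the real allocation callback, and only UTF-8 is handled, so it should be read as a simplified model rather than the specification's algorithm.

```python
class LinearMemory:
    """Toy stand-in for a WebAssembly linear memory with a bump allocator."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.next_free = 0

    def alloc(self, length):
        """Reserve `length` bytes and return the offset of the first one."""
        ptr = self.next_free
        if ptr + length > len(self.data):
            raise MemoryError("linear memory exhausted")
        self.next_free += length
        return ptr


def lower_string(memory, text):
    """Copy `text` into linear memory as UTF-8 and return (pointer, length)."""
    encoded = text.encode("utf-8")
    ptr = memory.alloc(len(encoded))
    memory.data[ptr:ptr + len(encoded)] = encoded
    return ptr, len(encoded)


def lift_string(memory, ptr, length):
    """Read `length` bytes starting at `ptr` and decode them as UTF-8."""
    return bytes(memory.data[ptr:ptr + length]).decode("utf-8")


if __name__ == "__main__":
    memory = LinearMemory(64)
    ptr, length = lower_string(memory, "héllo")
    print(ptr, length)                       # 0 6
    print(lift_string(memory, ptr, length))  # héllo
```

Only the two integers cross the boundary. The receiving side needs nothing beyond the agreed encoding and access to the same memory, which is why neither language's native string type has to be understood by the other.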

Challenges and Considerations

Binary Compatibility

Binary compatibility in application binary interfaces (ABIs) refers to the ability of software components, such as libraries and executables, to interoperate at the binary level across different versions without requiring recompilation. Changes to an ABI can disrupt this compatibility, particularly when they alter core elements like calling conventions or data type representations. For instance, modifying a calling convention, such as shifting the order of parameter passing between registers and the stack, can lead to incorrect function invocations, resulting in crashes or erroneous computations in dependent binaries. Similarly, increasing the size of a fundamental type, like changing int from 32 bits to 64 bits, may cause memory misalignment or buffer overflows when older binaries expect the original layout, leading to segmentation faults or data corruption. These effects stem from the ABI's role in defining precise layouts and linkages, where even subtle shifts can invalidate assumptions made during compilation.

To mitigate such breaking changes, developers employ versioning techniques that preserve the interface for existing binaries while enabling evolution. In ELF-based systems, symbol versioning allows libraries to expose multiple versions of the same symbol, ensuring that older applications link to the original while newer ones access updated functionality. For example, the GNU libstdc++ uses version tags like GLIBCXX_3.4 in its shared objects, permitting the addition of new symbols in minor releases without invalidating prior ABIs, because the dynamic linker resolves each symbol against the version recorded at link time. On Windows, side-by-side assemblies enable multiple versions of dynamic-link libraries (DLLs) to coexist within the same process, reducing "DLL hell" by isolating version-specific manifests and preventing conflicts during loading.
ABI wrappers provide another strategy, often involving a stable intermediary layer (such as a C interface wrapping a C++ library) to shield downstream code from internal ABI alterations, ensuring that the exposed interface remains unchanged across library updates. These techniques collectively allow libraries to evolve while maintaining backward compatibility for deployed software.

Compiler flags further aid in preserving ABI stability by controlling symbol exposure. The GCC option -fvisibility=hidden sets the default visibility of symbols in shared objects to hidden, preventing unintended exports that could lead to linkage issues or ABI bloat in future versions. Developers then explicitly mark public symbols with attributes like __attribute__((visibility("default"))), limiting the ABI surface to only essential interfaces and reducing the risk of breaking changes from internal modifications. This approach not only safeguards compatibility but also improves load times and library sizes by minimizing dynamic symbol resolution overhead.
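The layout breakage described earlier in this section, where widening a field invalidates offsets baked into older binaries, can be made concrete with a small model of natural alignment. The sketch below is written in Python and uses a hypothetical record with fields id, count, and flag. The rule it applies (each field placed at the next multiple of its alignment, total size rounded up to the largest alignment) is the common C layout rule on mainstream platforms, but actual sizes and alignments are fixed by each platform's ABI, so the numbers are assumptions for illustration.

```python
def align_up(offset, alignment):
    """Round `offset` up to the next multiple of `alignment`."""
    return (offset + alignment - 1) // alignment * alignment


def layout(fields):
    """Compute field offsets and total size for (name, size, alignment) fields."""
    offsets = {}
    offset = 0
    max_alignment = 1
    for name, size, alignment in fields:
        offset = align_up(offset, alignment)   # insert padding if needed
        offsets[name] = offset
        offset += size
        max_alignment = max(max_alignment, alignment)
    # Tail padding keeps every element of an array of records aligned.
    return offsets, align_up(offset, max_alignment)


# Hypothetical record: version 2 widens `count` from 4 to 8 bytes.
RECORD_V1 = [("id", 4, 4), ("count", 4, 4), ("flag", 1, 1)]
RECORD_V2 = [("id", 4, 4), ("count", 8, 8), ("flag", 1, 1)]

if __name__ == "__main__":
    print(layout(RECORD_V1))  # ({'id': 0, 'count': 4, 'flag': 8}, 12)
    print(layout(RECORD_V2))  # ({'id': 0, 'count': 8, 'flag': 16}, 24)
```

A binary compiled against the first layout would still read count at offset 4 and flag at offset 8, and would allocate 12 bytes per record, while a library built against the second layout writes to different offsets and expects 24 bytes. The source code looks unchanged apart from one type, yet every compiled-in offset is now wrong.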

Portability and Vendor Lock-in

Application binary interfaces (ABIs) are inherently tied to specific instruction set architectures (ISAs), which introduces significant challenges to binary portability across different hardware platforms. For instance, the x86 ABI, commonly used on Intel and AMD processors, relies on conventions such as the System V ABI for Unix-like systems, where parameters are passed on the stack in the 32-bit convention or in a limited set of registers in the 64-bit one, and where instructions are variable-length. In contrast, the ARM ABI, defined by the Procedure Call Standard (AAPCS), passes the first four integer arguments in registers r0-r3 and, in its hardware floating-point variant, floating-point arguments in VFP registers (s0-s15 or d0-d7), with excess parameters on the stack. Windows on ARM additionally mandates the Thumb-2 instruction set, in which code pointers must have the low bit set. These differences in calling conventions, register usage, and instruction encoding mean that binaries compiled for x86 cannot be directly executed or linked on ARM processors without recompilation, emulation, or binary translation, severely limiting reuse and requiring substantial porting efforts.

Vendor-specific variations in ABI implementations further exacerbate portability issues and contribute to vendor lock-in, where software becomes dependent on a particular compiler or ecosystem. Compilers like GCC and Clang adhere to the Itanium C++ ABI, an industry-standard specification that ensures compatibility for C++ binaries across these tools by defining consistent rules for name mangling, virtual function calls, and data layout, allowing objects compiled with one to link seamlessly with those from the other on supported platforms.
However, Microsoft's Visual C++ (MSVC) employs a distinct ABI that diverges from the Itanium model, particularly in vtable layouts, exception handling, and runtime library integrations. That ABI has remained stable across major versions of Visual Studio since 2015 (e.g., toolsets v140 to v145 in Visual Studio 2026 as of November 2025), but it prevents direct interoperability with non-Microsoft compilers. This proprietary approach ties developers to the Windows ecosystem and Microsoft tools, as mixing binaries from MSVC with those from GCC or Clang often requires recompilation or wrappers, increasing costs and reducing flexibility. Compiler-specific extensions, such as non-standard attributes or optimizations, can also embed vendor-unique behaviors into binaries, further entrenching lock-in by making migration to alternative compilers error-prone or impossible without code changes.

To mitigate these portability and lock-in challenges, standardized ABIs have been developed to abstract platform and vendor dependencies. The Itanium C++ ABI serves as a cross-compiler foundation for Unix-like systems and multiple architectures, enabling relocatable objects and dynamic shared objects (DSOs) that work across GCC, Clang, and other compliant tools without recompilation, thus promoting ecosystem openness on Linux and similar environments. Similarly, WebAssembly (Wasm) provides a portable ABI through its stack-based virtual machine and binary format, which assumes little-endian byte ordering, two's complement integers, IEEE 754 floating-point, and unaligned memory access support. This allows modules compiled from languages like C++, Rust, or Go to execute efficiently on diverse ISAs including x86 and ARM, across web browsers, servers, and embedded devices via host-defined interfaces like WASI, without tying them to specific hardware or vendor runtimes. These standards reduce the barriers to binary reuse and vendor switching by enforcing uniform interfaces that prioritize interoperability over proprietary optimizations.
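The architecture-specific calling conventions discussed at the start of this section can be compared with a small model. The Python sketch below assigns word-sized integer arguments to locations under three conventions: 32-bit System V on x86 (everything on the stack), 64-bit System V on x86-64, and the 32-bit ARM procedure call standard. The x86-64 register list (rdi, rsi, rdx, rcx, r8, r9) is not stated in the text above and is added here from the System V AMD64 specification. The table and function names are hypothetical, stack[n] is just a label for the n-th stack-passed argument rather than a byte offset, and floating-point values, structs, and 64-bit arguments on 32-bit targets are ignored.

```python
# Integer argument registers, in order, for three calling conventions.
# An empty list means every argument is passed on the stack.
INTEGER_ARG_REGISTERS = {
    "sysv-i386": [],
    "sysv-x86-64": ["rdi", "rsi", "rdx", "rcx", "r8", "r9"],
    "aapcs32": ["r0", "r1", "r2", "r3"],
}


def assign_integer_args(abi, arg_count):
    """Return where each of `arg_count` word-sized integer arguments goes."""
    registers = INTEGER_ARG_REGISTERS[abi]
    locations = []
    for index in range(arg_count):
        if index < len(registers):
            locations.append(registers[index])
        else:
            # Arguments beyond the register set spill to the stack.
            locations.append(f"stack[{index - len(registers)}]")
    return locations


if __name__ == "__main__":
    for abi in INTEGER_ARG_REGISTERS:
        print(abi, assign_integer_args(abi, 5))
```

The same five-argument call ends up in three different sets of locations, which is one reason an object file built for one convention cannot simply be linked into a program built for another, even before instruction encoding is considered.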
