
Executable

In computing, an executable is a file that contains a program, consisting of machine-readable instructions and data that can be loaded into memory and directly executed by a computer's operating system or processor. These files instruct the computer to perform specific tasks, such as running applications or processes, and are typically inert until invoked. Executables take the form of compiled binary files, which hold machine code optimized for a particular instruction set architecture. Executable files are essential components of software distribution and execution across operating systems, with standardized formats ensuring compatibility and efficient loading. For Microsoft Windows, the Portable Executable (PE) format structures these files to include headers, code sections, and resources, allowing the system to map them into process address space. On Unix-like systems such as Linux, the Executable and Linkable Format (ELF) serves a similar role, organizing object code, symbols, and relocation data for dynamic linking and execution. Apple's macOS uses the Mach-O format, which supports both executables and shared libraries with provisions for fat binaries that run on multiple architectures. The PE format evolved from the earlier Common Object File Format (COFF), while ELF and Mach-O have distinct historical developments. Beyond technical structure, executables play a critical role in security and portability, as they must be verified for integrity before execution to prevent malware infection. Operating systems employ mechanisms like digital signatures and certificate validation to authenticate executables, reducing risks from unauthorized or tampered files. As computing has advanced, executables have adapted to support managed runtimes, sandboxing, and cross-platform execution, enabling software to run seamlessly across diverse hardware and environments.

Definition and Fundamentals

Core Concept

An executable is a file or program segment containing machine code or bytecode that a central processing unit (CPU) or virtual machine can directly execute to perform specified tasks, in contrast to source code, which must be processed further, or scripts, which require an interpreter at runtime. This form encodes instructions in a format native to the hardware or a managed environment, allowing the computer to carry out operations without additional translation steps during execution. Executables differ from non-executable files, such as documents or data files, by being pre-processed into a ready-to-run state that includes structural elements like headers for metadata, entry points, and relocation information, enabling direct loading into memory for execution. Unlike human-readable source code, which is written in high-level languages and requires compilation or interpretation, or plain-text scripts that are executed line-by-line by an interpreter, executables represent a compiled or assembled output optimized for efficient hardware-level processing. In the software lifecycle, an executable serves as the final output of the build process, transforming developer-written source code into a standalone artifact that can be distributed and run independently on compatible systems. This role enables programs to operate without needing the original source code or build tools present, facilitating deployment across environments. For example, a basic "Hello World" program assembled from low-level instructions produces a compact executable that outputs the message upon running, whereas an equivalent Python script remains as interpreted text requiring a runtime environment like the Python interpreter to execute.
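The build-then-run model described above can be made concrete with a minimal C sketch. The file name hello.c and the gcc invocation in the comments are illustrative assumptions; once compiled, the resulting binary runs on its own, whereas the equivalent Python script would need an interpreter present at runtime.

    /* hello.c -- a minimal program that becomes a self-contained executable.
     * Build (typical GCC invocation): gcc -o hello hello.c
     * Run:                            ./hello
     */
    #include <stdio.h>

    int main(void) {
        puts("Hello, World!");   /* written to standard output when the binary runs */
        return 0;                /* exit code 0 signals success to the OS */
    }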

Key Characteristics

Executables feature a modular internal structure designed to facilitate loading and execution by the operating system. At the core is a header that provides essential metadata, including a magic number to identify the format—such as 0x7F 'E' 'L' 'F' for ELF files or the "PE\0\0" signature for Portable Executable (PE) files—along with details on the file's architecture, entry point, and layout of subsequent sections. Following the header, the file is divided into sections, each serving a distinct purpose: the .text section contains the machine code instructions, marked as read-only to prevent modification; the .data section holds initialized global and static variables; the .bss section reserves space for uninitialized variables, which are zeroed at runtime; and a symbol table section stores references to functions and variables for linking and debugging. This segmented organization allows tools like linkers and loaders to efficiently parse and map the file into memory.

Portability of executables is inherently limited by dependencies on the target CPU architecture and operating system. For instance, binaries compiled for x86 architectures use a different instruction set than those for ARM, rendering them incompatible without recompilation or emulation. Additionally, operating system variations introduce challenges such as endianness—where x86 systems typically employ little-endian byte ordering while some others use big-endian—and differing calling conventions that dictate how function parameters are passed between caller and callee. These factors necessitate architecture-specific and OS-specific builds to ensure correct execution, as mismatches can lead to crashes or undefined behavior.

Key attributes of executables include memory protection mechanisms that enhance security and stability during execution. The code segment (.text) is configured with read-only and executable permissions, preventing accidental or malicious writes to instructions while allowing the CPU to fetch and execute them. In contrast, data segments (.data and .bss) are granted read-write permissions for variable modifications but are non-executable to mitigate code injection risks. Process memory is further segregated into stack and heap regions: the stack, used for local variables and function calls, operates on a last-in-first-out basis with automatic allocation and deallocation; the heap, for dynamic allocations via functions like malloc, grows as needed and requires explicit management to avoid leaks or overflows. This separation ensures efficient resource use and isolation of execution contexts.

The size of an executable binary is influenced by optimization techniques applied during compilation and linking, which balance performance, functionality, and efficiency. Dead code elimination, a common optimization, removes unused functions, variables, and instructions that are never reached, directly reducing the final binary size and improving load times—for example, interprocedural optimization can significantly reduce code size in large programs by identifying unreferenced sections. Other factors include the inclusion of debug symbols (which can be stripped post-build), alignment padding for hardware requirements, and the embedding of runtime libraries, all of which contribute to variability in footprint across builds. These optimizations prioritize minimalism without sacrificing correctness, making executables more suitable for distribution and deployment.
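A small C file makes the section layout above concrete by showing where each kind of symbol typically ends up. The file name sections.c is hypothetical, and the placement noted in the comments reflects the usual behavior of common toolchains (inspectable with tools such as objdump or size) rather than a guarantee of the C standard.

    /* sections.c -- illustrates where common toolchains typically place symbols.
     * Inspect after compiling: gcc -c sections.c && objdump -t sections.o
     */
    #include <stdlib.h>

    int initialized_global = 42;   /* .data: initialized global variable          */
    int uninitialized_global;      /* .bss:  reserved space, zero-filled at load  */
    const char banner[] = "demo";  /* .rodata on most targets: read-only constant */

    int add(int a, int b) {        /* .text: machine code, mapped read-only and   */
        return a + b;              /*        executable at runtime                */
    }

    int main(void) {
        int local = add(1, 2);                    /* stack: automatic storage     */
        int *dynamic = malloc(sizeof *dynamic);   /* heap: explicit allocation    */
        if (dynamic) {
            *dynamic = local;
            free(dynamic);                        /* heap memory must be released */
        }
        return 0;
    }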

Creation Process

Compilation and Linking

The compilation phase of creating an executable begins with translating high-level source code, such as C or C++, into machine-readable object files using a compiler like the GNU Compiler Collection (GCC). This process involves multiple sub-phases in the compiler's frontend: lexical analysis, where the source code is scanned to identify tokens such as keywords, identifiers, and operators while ignoring whitespace and comments; syntax analysis or parsing, which checks the token sequence against the language's grammar to build a parse tree representing the program's structure; and semantic analysis, which verifies type compatibility, scope rules, and other meaning-related aspects to ensure the code is valid beyond syntax. Following these, the compiler generates intermediate code, applies optimizations to improve efficiency (such as constant folding or dead code elimination), and produces target-specific assembly code through the backend's code generation phase.

The assembly step converts the generated assembly code into relocatable object files, typically using the GNU Assembler (as), which translates low-level instructions into binary machine code while preserving relocation information for unresolved addresses and symbols. These object files contain the program's code segments, data, and symbol tables but are not yet executable, as external references (like function calls to libraries) remain unresolved. In the linking phase, a linker such as GNU ld combines multiple object files and libraries into a single executable image by resolving symbols—mapping references to their definitions—and assigning final memory addresses. Static linking embeds the entire contents of required libraries directly into the executable, resulting in a self-contained file that includes all necessary code at build time, which increases file size but eliminates runtime dependencies. In contrast, dynamic linking incorporates only stubs or references to external libraries, deferring full resolution to runtime via a dynamic linker, which allows shared libraries to be loaded once and reused across programs but requires the libraries to be present on the target system. The linker also handles relocation, adjusting addresses in the object code to fit the final layout, and produces formats like ELF for Unix-like systems.
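The symbol-resolution step can be sketched with two hypothetical translation units: main.c records greet as an undefined symbol in its object file, and the linker binds that reference to the definition compiled from greet.c (or to a stub when linking against a shared library). The file names and commands in the comments are illustrative assumptions.

    /* greet.c -- provides the definition the linker will resolve.              */
    #include <stdio.h>

    void greet(const char *name) {
        printf("Hello, %s\n", name);
    }

    /* main.c -- compiled separately; its object file lists `greet` as an
     * undefined symbol until link time.
     *
     *   gcc -c greet.c main.c        # produces greet.o and main.o
     *   gcc -o hello main.o greet.o  # linker resolves the reference
     */
    extern void greet(const char *name);

    int main(void) {
        greet("linker");
        return 0;
    }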

Source to Executable Conversion

The transformation from high-level source code, such as C or C++ files, to a runnable executable follows a structured pipeline that ensures the code is processed into machine-readable instructions compatible with the target system. This end-to-end pipeline begins with preprocessing and progresses through compilation, assembly, and linking, automating the conversion while resolving dependencies and optimizing for execution. Preprocessing is the initial stage, where the compiler's preprocessor expands macros, resolves include directives to incorporate header files, and handles conditional compilation based on directives like #ifdef. This step modifies the source code to produce an intermediate form ready for further processing, often expanding files like .c or .cpp without altering the core logic. The output is then fed into compilation proper, where the compiler translates the preprocessed code into assembly language, generating human-readable instructions specific to the target architecture. Assembly follows immediately, converting this assembly code into object files (typically .o or .obj) that contain relocatable machine code segments. Finally, linking combines these object files with required libraries, resolving external references to form a cohesive executable file, such as a.out on Unix-like systems or an .exe file on Windows.

To automate and scale this pipeline across complex projects involving multiple source files, build systems play a crucial role in managing dependencies, incremental builds, and platform variations. Makefiles, part of the GNU Make tool, define rules specifying targets (e.g., the executable), prerequisites (e.g., object files), and shell commands (recipes) to execute the stages, using file timestamps to recompile only modified components. CMake, a cross-platform meta-build system, generates native build files (e.g., Makefiles or Visual Studio projects) from a high-level CMakeLists.txt script, using commands like add_executable() to define the output and target_link_libraries() to handle linking dependencies. Integrated development environments (IDEs), such as Visual Studio or Eclipse, often integrate these tools or provide built-in builders to streamline the workflow within a graphical interface.

Cross-compilation extends this pipeline to produce executables for architectures different from the host machine, enabling development on powerful desktops for embedded or remote targets. For instance, using GCC, developers specify the target triple (e.g., arm-linux-gnueabi-gcc) to configure the toolchain, ensuring preprocessing, compilation, and assembly generate code for the desired architecture, such as building Windows executables on a Linux host. This requires matching libraries and headers for the target, often managed by build systems like CMake through toolchain files that override default settings.

Throughout the conversion, error handling is essential to identify issues early and maintain integrity. During compilation, type mismatches—such as incompatible pointer assignments or implicit conversions that alter values—trigger warnings or errors, configurable via flags like -Wconversion or -Wincompatible-pointer-types to enforce strict type checking. In the linking phase, unresolved symbols occur when references to functions or variables lack corresponding definitions in the object files or libraries, leading to linker errors that halt the build unless suppressed with options like --unresolved-symbols=ignore-all. These issues, often stemming from missing includes, incorrect library paths, or mismatched declarations across files, demand iterative debugging to ensure a successful executable output.
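Each stage of the pipeline can be observed in isolation with GCC's standard stop flags, and the conditional block below is resolved entirely during preprocessing, before compilation proper ever sees it. The file name pipeline.c and the FEATURE_X macro are illustrative assumptions.

    /* pipeline.c -- each stage of the pipeline can be inspected in isolation:
     *   gcc -E pipeline.c -o pipeline.i   # preprocess only (macros, #include, #ifdef)
     *   gcc -S pipeline.i -o pipeline.s   # compile to assembly
     *   gcc -c pipeline.s -o pipeline.o   # assemble to a relocatable object file
     *   gcc pipeline.o -o pipeline        # link into the final executable
     */
    #include <stdio.h>

    #define GREETING "Hello from the build pipeline"

    int main(void) {
    #ifdef FEATURE_X                       /* resolved by the preprocessor; pass  */
        puts("FEATURE_X build");           /* -DFEATURE_X to include this branch  */
    #else
        puts(GREETING);
    #endif
        return 0;
    }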

Types and Formats

Native vs. Managed Executables

Native executables are programs compiled directly into machine code tailored to a specific CPU architecture, allowing the operating system to execute them without additional interpretation or translation layers. This direct compilation, often from languages like C or C++, results in binaries such as ELF files on Linux or PE files on Windows, with no runtime overhead during execution beyond the OS loader. In contrast, they require recompilation for different platforms, limiting portability, and place the burden of memory management and error handling on the developer, which can lead to issues like buffer overflows if not implemented carefully.

Managed executables, on the other hand, are compiled into an intermediate representation, such as Common Intermediate Language (CIL) in .NET or bytecode in Java, which is not directly executable by the hardware. These executables rely on a runtime environment—like the Common Language Runtime (CLR) for .NET or the Java Virtual Machine (JVM)—to perform just-in-time (JIT) compilation at runtime, converting the intermediate code to native machine instructions as needed. Examples include .NET assemblies (.dll or .exe files containing CIL, structured in the PE format on Windows) and Java class files (.class files containing bytecode, typically packaged in JAR archives based on the ZIP format).

The primary advantages of native executables lie in their performance and efficiency: they execute at full hardware speed with minimal startup latency and no ongoing interpretation costs, making them ideal for resource-constrained or high-performance applications such as embedded systems. However, their platform specificity reduces cross-architecture portability, requiring separate builds for each target environment, such as x86 versus ARM. Managed executables offer enhanced portability, as the same intermediate code can run on any platform with the appropriate runtime, facilitating "write once, run anywhere" development. They also provide built-in security features, such as automatic memory management via garbage collection and type safety enforced by the runtime, reducing common vulnerabilities like memory leaks. Drawbacks include dependency on the runtime environment, which adds installation requirements and potential performance overhead from JIT compilation, though optimizations mitigate this in modern implementations.

Hybrid approaches bridge these paradigms by applying ahead-of-time (AOT) compilation to managed code, producing native executables from intermediate representations without JIT compilation at runtime. In .NET, Native AOT compiles CIL directly to native machine code during the build process, yielding self-contained binaries with faster startup times and smaller memory footprints compared to traditional JIT-managed executables, while retaining managed benefits like garbage collection. This method enhances deployment scenarios, such as cloud-native applications or mobile apps, by reducing dependencies, though it may limit dynamic features like runtime code generation.

Common File Formats

Executable file formats standardize the structure of binaries across operating systems, enabling loaders to map code, data, and metadata into memory for execution. Major formats include the Portable Executable (PE) format for Windows, the Executable and Linkable Format (ELF) for Unix-like systems, and Mach-O for Apple platforms, each defining headers, sections, and linking information tailored to their ecosystems. Additional formats like the legacy Common Object File Format (COFF) and the WebAssembly binary format (WASM) address specialized or emerging use cases, such as object files and web-native execution.

The Portable Executable (PE) format serves as the standard for executable files on Windows and Win32/Win64 systems, encompassing applications (.exe files) and dynamic-link libraries (.dll files). It begins with a DOS header for compatibility with MS-DOS, followed by a PE signature, COFF file header, optional header with subsystem information and data directories (such as imports and exports), and an array of section headers that define the layout of segments like .text for executable code, .data for initialized data, .rdata for read-only data, and .bss for uninitialized data. This structure allows the Windows loader to relocate the image, resolve imports, and initialize the process environment, supporting features like address space layout randomization (ASLR) for security. PE files are extensible, accommodating debug information, resources, and certificates in dedicated sections.

The Executable and Linkable Format (ELF) is the predominant format for executables, object files, shared libraries, and core dumps on Unix-like operating systems, including Linux and Solaris. Defined by the Tool Interface Standard, an ELF file starts with an ELF header specifying the file class (32-bit or 64-bit), endianness, ABI version, and target architecture, followed by optional program header tables that describe loadable segments (e.g., PT_LOAD for code and data) and section header tables that organize content into sections like .text for code, .data for initialized variables, .rodata for constants, and .symtab for symbols. Program headers guide the dynamic loader in mapping segments into memory, while sections facilitate linking and debugging; shared objects (.so files) use ELF to enable dynamic linking at runtime. ELF's flexibility supports multiple architectures and processor-specific features, such as note sections for auxiliary information.

Mach-O, short for Mach Object, is the executable format used in macOS, iOS, watchOS, and tvOS, organizing binaries into a header, load commands, and segments containing sections for efficient loading by the dyld dynamic linker. The header identifies the CPU type, file type (e.g., MH_EXECUTE for executables or MH_DYLIB for libraries), and number of load commands, which specify details like segment permissions, symbol tables, and dynamic library paths. Segments such as __TEXT (for code and read-only data) and __DATA (for writable data) group related sections, with __LINKEDIT holding linking information; Mach-O supports "fat" binaries that embed multiple architectures (e.g., x86_64 and arm64) in one file, allowing universal execution across devices like Intel-based Macs and Apple silicon Macs. This format integrates with Apple's code-signing system, embedding entitlements and signatures directly in the binary.

Other notable formats include the Common Object File Format (COFF), a legacy predecessor to PE used primarily for object files (.obj) in Windows toolchains and older Unix systems, featuring a file header with machine type and section count, followed by optional headers, section tables, and raw section data for relocatable code and symbols. COFF lacks the full executable portability of PE but remains relevant in build processes for its simplicity in handling intermediate compilation outputs. In contrast, WebAssembly (WASM) provides a platform-independent binary format for high-performance execution in web browsers and standalone runtimes, encoding modules as a sequence of typed instructions in a compact, linear structure with sections for code, data, types, functions, and imports/exports, compiled from languages like C++ or Rust to run sandboxed at near-native speeds without traditional OS dependencies.
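A format's magic number can be checked with a few bytes of I/O. The sketch below, a hypothetical elfcheck.c, reads the first four bytes of a file and tests them against the ELF signature described above (0x7F 'E' 'L' 'F'); a PE check would instead look for the "MZ" DOS header and follow its offset to the "PE\0\0" signature.

    /* elfcheck.c -- report whether a file begins with the ELF magic number.
     * Usage after building: ./elfcheck /bin/ls
     */
    #include <stdio.h>

    int main(int argc, char *argv[]) {
        if (argc != 2) {
            fprintf(stderr, "usage: %s <file>\n", argv[0]);
            return 2;
        }
        FILE *f = fopen(argv[1], "rb");
        if (!f) {
            perror("fopen");
            return 2;
        }
        unsigned char ident[4] = {0};
        size_t n = fread(ident, 1, sizeof ident, f);
        fclose(f);

        /* ELF files start with 0x7F followed by the ASCII letters 'E', 'L', 'F'. */
        if (n == 4 && ident[0] == 0x7F && ident[1] == 'E' &&
            ident[2] == 'L' && ident[3] == 'F') {
            printf("%s: ELF binary\n", argv[1]);
        } else {
            printf("%s: not an ELF binary\n", argv[1]);
        }
        return 0;
    }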

Execution Mechanism

Loading and Running

The loading of an executable into memory begins when the operating system receives a request to execute a program file, typically through system calls that initiate process creation. The loader first reads the executable's header to verify its format and extract metadata about memory layout, such as segment sizes and permissions. For instance, in Linux systems using the ELF format, the kernel's load_elf_binary() function parses the ELF header and program header table to identify loadable segments like code, data, and the BSS. Similarly, in Windows with the PE format, the loader examines the DOS header, PE headers, and optional header to determine the image base and section alignments.

Once headers are parsed, the kernel maps the executable's segments into the process's virtual address space, allocating memory pages as needed without immediately loading all physical pages to support demand paging. Read-only segments like code are mapped with execute permissions, while data segments receive read-write access; the .bss segment, representing uninitialized data, is zero-filled by allocating fresh pages. The loader also establishes the stack and heap regions: the stack grows downward from a high virtual address, often randomized with address space layout randomization (ASLR) for security, while the heap starts just after the loaded data segments and expands via system calls like brk() or mmap(). In Linux, setup_arg_pages() configures the initial stack size and adjusts memory accounting for argument pages. Windows performs analogous mappings through the Ntdll.dll loader, reserving virtual address space for sections and committing pages on demand.

Process creation integrates loading in operating system-specific models. In Unix-like systems such as Linux, the common approach uses the fork-exec paradigm: the fork() system call duplicates the parent process to create a child, sharing the address space initially via copy-on-write, after which the child invokes execve() to replace its image with the new executable. The execve() call triggers the kernel to load the binary, clear the old address space via flush_old_exec(), and set up the new one, returning control to the child only on success. In contrast, Windows employs the CreateProcess() API, which atomically creates a new process object, allocates its virtual address space, loads the specified executable, and starts its primary thread in a single operation, inheriting the parent's environment unless overridden.

After loading, execution begins at the designated entry point, with the kernel performing final initializations. In Linux ELF executables, the kernel jumps to the entry address from the ELF header (or the dynamic linker's entry point if present) via start_thread(), having populated the stack with the argument count argc, an array of argument pointers argv (with argv[0] typically the program name), environment pointers envp, and an auxiliary vector containing metadata like the entry point and page size. The actual entry symbol _start, provided by the C runtime (e.g., in glibc's crt1.o), receives these via the stack or registers, initializes the runtime environment (such as constructors and global variables), and invokes __libc_start_main() to call the user's main(int argc, char *argv[]) function. For Windows PE executables, the loader computes the entry point by adding the AddressOfEntryPoint RVA from the optional header to the image base, then starts the primary thread there; the C runtime entry (e.g., mainCRTStartup) similarly sets up argc and argv from the command line before calling main.

Process termination occurs when the program calls an exit function, such as exit(), which sets an exit code and triggers cleanup.
The exit code, an integer typically 0 for success and non-zero for failure, is returned to the parent process; in Unix-like systems, the least significant byte of the status is passed to the parent via wait() or waitpid(), while the kernel reaps the process, freeing its memory mappings, closing file descriptors, and releasing other resources to prevent leaks. If the parent ignores SIGCHLD or has set SA_NOCLDWAIT, the child is immediately reaped without becoming a zombie. In Windows, ExitProcess() sets the exit code (queryable via GetExitCodeProcess()) and notifies loaded DLLs, terminates all threads, unmaps the image from memory, and closes handles, though persistent objects like files may remain if referenced elsewhere. Forced termination via signals (e.g., SIGKILL in Unix) or TerminateProcess() in Windows bypasses cleanup but still reclaims system resources.
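The fork-exec-wait cycle described above can be sketched in a few lines of POSIX C. Error handling is kept minimal, and the program being launched (/bin/ls) is only an example; the file name spawn.c is an assumption for illustration.

    /* spawn.c -- minimal fork/exec/wait sketch of Unix process creation.
     * Build: gcc -o spawn spawn.c
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        pid_t pid = fork();                     /* duplicate the calling process     */
        if (pid < 0) {
            perror("fork");
            return 1;
        }
        if (pid == 0) {                         /* child: replace image with /bin/ls */
            char *const argv[] = { "ls", "-l", NULL };
            execv("/bin/ls", argv);
            perror("execv");                    /* only reached if exec fails        */
            _exit(127);
        }
        int status = 0;
        waitpid(pid, &status, 0);               /* parent: reap child, read status   */
        if (WIFEXITED(status))
            printf("child exited with code %d\n", WEXITSTATUS(status));
        return 0;
    }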

Dynamic Linking and Libraries

Dynamic linking allows executables to reference external shared libraries at runtime rather than embedding all code during compilation, enabling modular program design where libraries like .dll files on Windows or .so files on Unix-like systems are loaded on demand. This process relies on symbol tables within the executable and library files, which contain unresolved references to functions and variables; the runtime system resolves these symbols by searching for matching exports in loaded libraries, often using a dynamic symbol table for efficient lookups. Lazy loading defers the actual loading of a library until the first reference to one of its symbols is encountered, optimizing memory usage by avoiding unnecessary loads for unused components.

The runtime loader, such as dyld on macOS or ld.so on Linux, manages this linking process by handling symbol resolution, applying relocations to adjust addresses based on the library's load address, and enforcing versioning to ensure compatibility between executable and library versions. For instance, ld.so on Linux uses a dependency tree to load prerequisite libraries recursively and performs global symbol resolution to bind imports across modules. Versioning mechanisms, like sonames in ELF files, prevent conflicts by specifying minimum required library versions, allowing multiple variants to coexist on the system.

One key advantage of dynamic linking is the reduction in executable file size, as shared code is stored once in libraries and reused across multiple programs, which also facilitates easier updates to libraries without recompiling dependent executables. However, it introduces challenges such as dependency conflicts, colloquially known as "DLL hell" on Windows, where mismatched library versions can cause failures if the system loads an incompatible variant. To support dynamic linking effectively, executables and shared libraries often employ position-independent code (PIC), which compiles instructions to be relocatable without fixed addresses, using techniques like relative addressing and GOT/PLT tables to defer address resolution until runtime. This enables libraries to be loaded at arbitrary memory locations and shared among processes, enhancing system efficiency, though it may incur a slight performance overhead due to indirect jumps. In contrast to static linking, where all library code is incorporated at build time, dynamic linking promotes resource sharing but requires careful management of dependencies.
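Loading a shared library by hand and resolving one of its symbols at runtime can be demonstrated with the POSIX dlopen/dlsym interface. The library name libm.so.6 and the looked-up symbol cos are common on Linux systems but are assumptions here; the file name dlload.c is likewise illustrative.

    /* dlload.c -- load a shared library at runtime and resolve a symbol by name.
     * Build: gcc -o dlload dlload.c -ldl
     */
    #include <dlfcn.h>
    #include <stdio.h>

    int main(void) {
        /* RTLD_LAZY defers binding of remaining symbols until first use. */
        void *handle = dlopen("libm.so.6", RTLD_LAZY);
        if (!handle) {
            fprintf(stderr, "dlopen failed: %s\n", dlerror());
            return 1;
        }
        /* Look up the cos() symbol in the library's dynamic symbol table. */
        double (*cosine)(double) = (double (*)(double))dlsym(handle, "cos");
        if (!cosine) {
            fprintf(stderr, "dlsym failed: %s\n", dlerror());
            dlclose(handle);
            return 1;
        }
        printf("cos(0.0) = %f\n", cosine(0.0));
        dlclose(handle);                        /* unload when no longer needed */
        return 0;
    }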

Security Considerations

Vulnerabilities and Protections

Executables are susceptible to buffer overflow vulnerabilities, where programs write more data to a fixed-length buffer than it can hold, potentially overwriting adjacent regions such as return addresses on the stack. This occurs due to the intermixing of data storage areas and control data in memory, allowing malformed inputs to alter program control flow and enable arbitrary code execution. Stack smashing attacks exemplify this risk, exploiting stack-based buffer overflows in C programs by using functions like strcpy() to copy excessive data, overwriting the return address to redirect execution to injected shellcode. Code injection vulnerabilities further compound these threats, arising when executables fail to neutralize special elements in externally influenced inputs, permitting attackers to insert and execute malicious code. For instance, unvalidated user inputs can be interpreted as executable commands in languages like PHP or Python, leading to unauthorized actions such as system calls.

To mitigate these exploits, operating systems implement protections like Address Space Layout Randomization (ASLR), which randomly relocates key areas of a process's virtual address space—including stacks, heaps, and loaded modules—at runtime to thwart address prediction by attackers. Complementing ASLR, Data Execution Prevention (DEP) uses the processor's NX (No eXecute) bit to mark certain memory pages as non-executable, preventing buffer overflow payloads from running code in data regions like the stack or heap. If execution is attempted on non-executable memory, DEP triggers an access violation, terminating the process to block exploitation.

Executables also serve as primary vectors for malware, including viruses that attach to legitimate files and activate upon execution, spreading via shared disks or networks. Trojans similarly masquerade as benign executables, such as email attachments or downloads, tricking users into running them to grant attackers backdoor access or remote-control capabilities. Malware detection often relies on heuristic methods, which analyze runtime behaviors in simulated environments to identify suspicious actions like self-replication or unauthorized system modification, even for unknown variants without matching signatures.

Best practices for securing executables emphasize input validation, where data is checked early against allowlists for format, length, and semantics to block malformed inputs that could trigger overflows or injections. Additionally, least privilege execution restricts processes to minimal necessary permissions, confining potential damage from compromised executables by elevating privileges only when required and dropping them immediately afterward.
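The classic stack-smashing pattern and its straightforward mitigation look like this in C. The buffer size and input are illustrative only, and the unsafe routine is shown for contrast rather than invoked.

    /* overflow.c -- contrasting an overflow-prone copy with a bounded one. */
    #include <stdio.h>
    #include <string.h>

    void unsafe_copy(const char *input) {
        char buf[16];
        strcpy(buf, input);        /* no length check: input longer than 15 bytes
                                      overruns buf and can overwrite the saved
                                      return address on the stack               */
        printf("unsafe: %s\n", buf);
    }

    void safe_copy(const char *input) {
        char buf[16];
        /* snprintf always NUL-terminates and never writes past sizeof buf. */
        snprintf(buf, sizeof buf, "%s", input);
        printf("safe:   %s\n", buf);
    }

    int main(int argc, char *argv[]) {
        const char *input = (argc > 1) ? argv[1] : "short";
        safe_copy(input);          /* bounded handling of external input          */
        /* unsafe_copy(input);        deliberately left uncalled; shown for contrast */
        return 0;
    }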

Signing and Verification

Code signing is a cryptographic process that attaches a digital signature to an executable file, ensuring its integrity and authenticity by verifying that it has not been altered since signing and originates from a trusted publisher. This is achieved using digital certificates, typically in the X.509 format, issued by trusted certificate authorities (CAs). The signature is generated by hashing the executable—commonly with SHA-256—and signing the hash with the developer's private key; the corresponding public key is embedded in the certificate for later verification.

Developers obtain a code-signing certificate from a CA after undergoing identity validation, then use platform-specific tools to apply the signature. On Windows, the SignTool utility (signtool.exe) from the Windows SDK signs executables or catalog files by computing a SHA-256 hash of the file contents, signing it with the private key, and embedding the result in a signature structure within the Portable Executable (PE) file format. Similarly, on macOS, the codesign command-line tool signs executables and bundles, creating a CodeResources file that includes SHA-256 hashes of resources and the signature, stored in the bundle's _CodeSignature directory. For distribution outside app stores, Microsoft employs Authenticode as its standard, which supports dual-signing with both SHA-1 and SHA-256 for broader compatibility, while Apple uses Developer ID certificates to enable verification for non-App Store software.

During loading or execution, the operating system verifies the signature to enforce trust. The process involves recomputing the SHA-256 hash of the executable and decrypting the embedded signature with the public key from the certificate to obtain the original hash; if they match, the file is deemed untampered. The certificate chain is then validated against trusted root certificates to confirm the publisher's identity, often requiring online checks for revocation status via Certificate Revocation Lists (CRLs) published by the CA. On Windows, Authenticode verification occurs via the WinVerifyTrust API, which chains to roots in the system's trusted certificate store and blocks execution if the signature is invalid or revoked. macOS uses the Security framework for similar checks during Gatekeeper assessment, ensuring the Developer ID signature aligns with Apple's notarization ticket if applicable.

The primary purposes of signing and verification are to prevent tampering by detecting unauthorized modifications and to establish a chain of trust from the developer to the end user through CA-anchored certificates, thereby reducing risks from malware masquerading as legitimate software. Revocation mechanisms, such as CRLs, allow CAs to invalidate compromised certificates before expiration by listing their serial numbers, prompting systems to deny verification and halt execution of affected executables; this is critical for code signing, where revoked certificates remain on CRLs indefinitely to maintain long-term protection. Standards like Microsoft's Authenticode and Apple's Developer ID ensure interoperability and enforce these practices across ecosystems.
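The integrity half of verification—recomputing the file's digest so it can be compared with the hash recovered from the signature—can be sketched with OpenSSL's one-shot SHA-256 helper. This is only a sketch assuming OpenSSL is installed; real signature verification additionally decrypts the embedded signature with the certificate's public key and validates the certificate chain, which this example omits.

    /* digest.c -- recompute a file's SHA-256 digest, the first step a verifier
     * performs before comparing it with the hash recovered from the signature.
     * Build: gcc -o digest digest.c -lcrypto     (assumes OpenSSL is available)
     */
    #include <openssl/sha.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char *argv[]) {
        if (argc != 2) {
            fprintf(stderr, "usage: %s <executable>\n", argv[0]);
            return 2;
        }
        FILE *f = fopen(argv[1], "rb");
        if (!f) { perror("fopen"); return 2; }

        /* Read the whole file into memory (adequate for a small demonstration). */
        fseek(f, 0, SEEK_END);
        long size = ftell(f);
        rewind(f);
        unsigned char *data = malloc(size > 0 ? (size_t)size : 1);
        if (!data || fread(data, 1, (size_t)size, f) != (size_t)size) {
            fprintf(stderr, "read error\n");
            fclose(f);
            free(data);
            return 2;
        }
        fclose(f);

        unsigned char digest[SHA256_DIGEST_LENGTH];
        SHA256(data, (size_t)size, digest);     /* one-shot SHA-256 over the image */
        free(data);

        for (int i = 0; i < SHA256_DIGEST_LENGTH; i++)
            printf("%02x", digest[i]);
        putchar('\n');                          /* compare against the signed hash */
        return 0;
    }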

Historical Development

Early Executables

The earliest forms of executables emerged in the pre-1950s era through physical media like punch cards and magnetic tapes, which enabled direct machine execution on pioneering computers. The ENIAC, completed in 1945 as the first general-purpose electronic digital computer, relied on punch cards for input via an integrated card reader, allowing programs to be loaded and executed by configuring the machine's wiring and switches based on the card data. These punch cards, perforated with holes representing instructions, served as the primary medium for storing and inputting both data and rudimentary programs, marking the transition from manual calculations to automated execution. Magnetic tapes began supplementing punch cards in the late 1940s, offering higher capacity for sequential program storage and execution on systems like the UNIVAC I (1951), where tapes could be read directly to initiate computations without intermediate transcription.

In the 1950s and 1960s, executables evolved with the advent of assemblers on mainframe computers, facilitating more systematic program preparation. The IBM 701, introduced in 1952 as one of the first commercial scientific computers, used assembly language, where programmers encoded instructions in symbolic form, translated into machine code stored on punch cards or tape for loading into memory. This period also saw the development of the first loaders for relocatable code, enabling programs to be assembled independently and then positioned in memory at runtime; Grace Hopper's A-0 system for the UNIVAC in 1952 implemented an early linking loader that combined subroutines from separate modules into a single executable. These loaders addressed the rigidity of absolute addressing in earlier systems, allowing code to be moved without full reprogramming, though execution still required manual intervention to set addresses.

A key milestone in executable management came with the Multics operating system in 1967, which introduced file permissions specifically for executables to enhance security in a multi-user environment. Multics segmented files with access controls, including an "execute" permission bit that restricted direct execution of data segments and enforced protection rings to isolate user programs from system resources. This innovation, part of Multics' hierarchical file system, was the first to systematically apply permissions to executable files, preventing unauthorized access or modification during shared computing sessions.

Early executables were hampered by significant limitations, including the absence of automated linking mechanisms and reliance on manual addressing. Programmers had to explicitly calculate and adjust addresses for each load, with no dynamic resolution of external references, leading to error-prone setups on mainframes like the IBM 701. Memory allocation was entirely manual, requiring operators to track available space and avoid overlaps, which constrained program size and portability across sessions.

Evolution in Modern Systems

In the 1980s and 1990s, executable formats evolved significantly alongside the growth of personal computing and Unix systems. The MS-DOS operating system, released by Microsoft in 1981, introduced the .COM format for simple, memory-resident programs limited to 64 KB, followed by the more advanced .EXE format in MS-DOS 1.0, which supported relocatable code and larger programs through a header-based structure known as MZ after its designer, Mark Zbikowski. Concurrently, Unix systems saw the rise of dynamic linking, first implemented in SunOS 4.0 in late 1988, enabling shared libraries to be loaded at runtime for efficient memory use and easier updates, building on virtual memory advancements. By the 1990s, the Executable and Linkable Format (ELF) emerged as a standardized alternative to older formats like a.out, with initial specifications published by Unix System Laboratories in 1990 and the Tool Interface Standard (TIS) version 1.2 released in May 1995, facilitating portable executables across Unix variants such as Solaris and Linux.

The 2000s marked a shift toward managed code environments, prioritizing portability and security over native binaries. Sun Microsystems released Java in May 1995, introducing bytecode as an intermediate representation executed by the Java Virtual Machine (JVM), which handled memory management and security checks automatically. Microsoft followed with the .NET Framework in February 2002, featuring the Common Language Runtime (CLR) for executing Common Intermediate Language (CIL) code, enabling cross-language interoperability and just-in-time (JIT) compilation for platform independence. Early previews of containerization also appeared, with FreeBSD introducing Jails in 2000 to isolate processes, filesystems, and networks within a single kernel, laying groundwork for resource partitioning in shared environments.

From the 2010s onward, executables adapted to diverse architectures, web integration, and heightened security demands in cloud and mobile ecosystems. Apple announced Universal 2 binaries in June 2020 to support the transition to Apple silicon (ARM64), allowing single files to contain both x86-64 and ARM code for seamless execution across hardware. WebAssembly (Wasm), first released in browsers in March 2017 and later standardized by the W3C, emerged as a compact, binary instruction format for high-performance, cross-platform code execution in browsers and beyond, compiling from languages like C++ and Rust without traditional plugins. Security enhancements, such as OS-level sandboxing, gained prominence; for instance, Windows introduced AppContainer in Windows 8 (2012) to restrict executable access to resources via mandatory integrity control, while macOS expanded sandboxing in 2012 to limit app privileges by default.

Looking ahead, hybrid just-in-time (JIT) and ahead-of-time (AOT) compilation strategies are gaining traction to balance startup speed and runtime optimization, as seen in tools like GraalVM for Java, which combines AOT for initial execution with JIT for adaptive improvements. Executable compression techniques, such as those in UPX, continue to evolve for efficient distribution, reducing file sizes by 50-70% through algorithms like LZMA while preserving fast decompression, aiding bandwidth-constrained mobile and cloud deployments.
