Binary file
A binary file is a type of computer file that stores data in a binary format, consisting of a sequence of bytes where each byte comprises eight bits, enabling efficient representation of information as zeros and ones directly interpretable by computer hardware and software.[1] Unlike text files, which use human-readable characters encoded in formats like ASCII or UTF-8 and can be viewed in a basic editor, binary files contain raw, unstructured or structured data that appears as gibberish to humans without specialized tools, as they are optimized for machine processing rather than direct readability.[2][3] Binary files serve as the foundation for numerous applications in computing, including executable programs (often with extensions like .exe or .bin), multimedia content such as JPEG images, MP3 audio, and MP4 videos, as well as document formats like PDF files, all of which rely on predefined structures to encode complex data compactly and portably across systems.[1][4] Their efficiency stems from utilizing the full range of byte values without the constraints of printable characters, allowing for smaller file sizes and faster processing compared to equivalent text-based representations.[5] Accessing or manipulating binary files typically requires programming languages or utilities that handle byte-level operations, such as reading and writing in modes that preserve the exact bit patterns, to avoid corruption or misinterpretation.[6][7] In operating systems, binary files are distinguished from text files during input/output operations, where special handling ensures no automatic translation of line endings or character encodings occurs, maintaining data integrity for applications like compiled software binaries or database dumps.[8] This distinction is crucial for cross-platform compatibility, as byte order (endianness) and architecture-specific details can affect how binary data is loaded and executed, often necessitating tools like hex editors for inspection or 
conversion utilities for portability.[9]
Fundamentals
Definition and Characteristics
A binary file is a computer file that stores data as a sequence of bytes, where each byte consists of eight bits and can represent any integer value from 0 to 255.[1][10] This structure allows for the direct encoding of information in a machine-readable format, rather than for human readability as plain text.[11] Unlike text files, which use character encodings to form legible sequences, binary files require specific software or hardware to interpret their contents correctly.[12] Key characteristics of binary files include their non-textual nature, which permits arbitrary byte values, including non-printable characters, in contrast to the printable-byte restrictions typical of text files.[13] They often contain compressed or encoded data to optimize storage efficiency and processing speed, making them suitable for representing complex information such as machine code instructions or pixel color values in images.[14] While structured formats like XML are inherently textual, binary files can embody similar structures through byte-level encoding, provided they are processed as such by applications.[2] In fundamental terms, a file serves as a named stream of data on storage media, and the byte represents the primary unit of organization in binary files, with each byte's bit-level composition (eight binary digits, or bits) forming the basis for all data representation.[15][16] This bit composition allows binary files to capture precise, low-level data patterns essential for computational tasks.[17]
Historical Development
The concept of binary files emerged in the early days of electronic computing during the 1950s, as mainframe systems began storing programs and data in non-human-readable formats to enable efficient processing. Early mainframes like the IBM 701, introduced in 1953, utilized magnetic tapes to store large volumes of binary data, including executables and databases, marking a shift from mechanical punched cards to electronic storage media that could handle fixed-length records of binary instructions and operands. Punched cards and paper tapes, used in systems such as the ERA 1101 (1950), also encoded binary data for input, with the stored-program architecture exemplified by the IAS machine (1952) allowing instructions and data to be held interchangeably in binary form on drums or tapes.[18] In the 1970s, binary files became integral to operating systems for microcomputers and minicomputers, with significant advancements in executables driven by assembly languages and early compilers. The UNIX operating system, developed at Bell Labs starting in 1969 on the PDP-7 and ported to the PDP-11 in 1971, included an assembler that generated binary executables for system tools like the file system and text editor, enabling compact storage of machine code.[19] By 1973, UNIX's rewrite in the C programming language introduced compilers that produced portable binary outputs, facilitating the distribution of Version 6 UNIX in 1975 with precompiled binaries for broader adoption.[19] Concurrently, CP/M (Control Program for Microcomputers), demonstrated in 1974 by Gary Kildall at Digital Research, standardized binary file handling on 8-bit microcomputers, using a disk-based file system to store executables as .COM files in fixed 128-byte records, which became widespread for software like word processors and utilities.[20] This era also saw the first prominent use of floppy disks for binary software distribution, as CP/M systems relied on 8-inch floppies to exchange programs across 
incompatible hardware.[21] The 1980s personal computing boom expanded binary files beyond executables to multimedia formats, influenced by the rise of MS-DOS and graphical applications. MS-DOS 1.0, released in 1981 alongside the IBM PC, supported binary executables (.EXE and .COM files) distributed primarily on 5.25-inch floppy disks, enabling software like games and productivity tools to proliferate on compatible hardware.[22] A key milestone was the development of binary image formats for efficient storage and transmission; CompuServe introduced the Graphics Interchange Format (GIF) in 1987, using Lempel-Ziv-Welch compression to encode color images in a compact binary structure, replacing earlier run-length encoded formats and supporting the nascent online services era.[23] Binary file structures evolved from rigid fixed-length records—suited to sequential tape access in early systems—to flexible variable-length byte streams, accommodating the random-access nature of hard disks and diverse data types by the late 1970s and 1980s. The advent of networking in the 1980s and 1990s further drove portability, as protocols like TCP/IP on Ethernet-enabled PCs required binary formats to handle byte-order differences across architectures, exemplified by Sun Microsystems' External Data Representation (XDR) standard in 1987 for cross-platform data exchange.[24] In the 2000s, mobile computing introduced compressed binary formats for apps, such as Android's APK bundles using ZIP compression since 2008 to package executables and resources efficiently for limited storage, while container technologies like FreeBSD Jails (2000) and later Docker (2013) layered binaries in portable, isolated environments for deployment.[25]
Structure and Data Representation
Byte-Level Organization
Binary files are composed of unstructured or semi-structured sequences of bytes, where each byte is an 8-bit unit capable of representing integer values from 0 to 255 in decimal (or 0x00 to 0xFF in hexadecimal). This byte-level foundation allows binary files to store arbitrary data without the constraints of human-readable characters, enabling compact representation of complex information such as images, executables, or databases. Unlike text files, binary files lack inherent line breaks or delimiters, treating the entire content as a continuous stream of bytes that applications interpret based on predefined formats.[26][27] To impose structure within this byte stream, binary files often incorporate internal headers, footers, or metadata sections that define the file's layout and content type. A common mechanism is the use of "magic numbers," which are specific byte sequences at the file's beginning (or elsewhere) to identify the format, such as 0x7F 'E' 'L' 'F' for ELF executables on Unix-like systems. These elements provide essential cues for parsing without relying on file extensions, ensuring reliable identification even if the file is renamed or transmitted. Padding with null bytes (0x00) is frequently employed to fill gaps, maintaining fixed sizes or boundaries in the structure, while data alignment ensures that multi-byte elements like integers or floats start at addresses that are multiples of their size (e.g., 4-byte alignment for 32-bit integers) to optimize processing efficiency on hardware. Misaligned data can lead to performance penalties or errors in some architectures, so padding bytes are inserted as needed to achieve this.[28][27] A critical aspect of byte-level organization is endianness, which dictates the order in which bytes of multi-byte values are stored and read. 
In big-endian (network byte order), the most significant byte is stored first, mirroring human reading conventions; for example, the 16-bit value 0x1234 would be written as bytes 0x12 followed by 0x34. Conversely, little-endian stores the least significant byte first, so 0x1234 becomes 0x34 then 0x12, as used in x86 architectures. This convention affects interoperability between systems, with network protocols like TCP/IP standardizing big-endian to avoid ambiguity. The terms "big-endian" and "little-endian" were coined in Danny Cohen's 1980 essay "On Holy Wars and a Plea for Peace," which borrowed the egg-breaking dispute from Jonathan Swift's Gulliver's Travels to satirize debates over byte-ordering consistency in computing protocols.[29][30] Such byte arrangements ensure efficient low-level handling but require applications to account for the host system's endianness when reading or writing files.[26]

```hex
Example: storing the 16-bit integer 0x1234 (4660 decimal)
Big-endian:    12 34
Little-endian: 34 12
```
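The byte orders described above can be demonstrated with Python's struct module, whose format prefixes '>' and '<' select big- and little-endian layouts; a minimal sketch:

```python
import struct

value = 0x1234  # 4660 decimal

# '>H' = big-endian unsigned 16-bit, '<H' = little-endian unsigned 16-bit.
big = struct.pack('>H', value)     # most significant byte first
little = struct.pack('<H', value)  # least significant byte first

print(big.hex())     # 1234
print(little.hex())  # 3412

# Reading bytes back with the wrong byte order silently swaps the value.
assert struct.unpack('<H', big)[0] == 0x3412
```

Network code typically normalizes to big-endian ("network byte order") before transmission, which struct exposes via the '!' prefix.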
Common Encoding Schemes
Binary files often employ binary serialization schemes to efficiently encode structured data, such as Protocol Buffers developed by Google, which uses a compact binary format for serializing structured data in a language-neutral manner.[31] This approach tags fields with wire types and varints for lengths, enabling smaller payloads compared to text-based alternatives like JSON, while supporting forward and backward compatibility through optional fields.[32] Compression algorithms are integral to binary file encoding, reducing storage and transmission sizes through lossless methods. The ZIP format, introduced in 1989, utilizes the DEFLATE algorithm, which combines LZ77 dictionary-based compression with Huffman coding to achieve efficient data reduction without loss of information.[33] Similarly, the gzip format, defined in RFC 1952, also relies on DEFLATE for its core compression mechanism, wrapping the compressed data in a simple header and trailer for integrity checks. Binary-specific encoding schemes address the representation of diverse data types within files. 
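The DEFLATE core shared by ZIP and gzip can be exercised with Python's standard zlib and gzip modules; a minimal lossless round trip over repetitive sample data (illustrative only):

```python
import gzip
import zlib

data = b'binary files often contain repetitive byte patterns ' * 50

# zlib wraps a raw DEFLATE stream (RFC 1951) in the zlib format (RFC 1950).
compressed = zlib.compress(data, level=9)
assert zlib.decompress(compressed) == data   # lossless round trip
assert len(compressed) < len(data)           # repetition compresses well

# gzip wraps the same DEFLATE stream in the gzip format (RFC 1952),
# whose trailer carries a CRC32 integrity check over the uncompressed data.
wrapped = gzip.compress(data)
assert gzip.decompress(wrapped) == data
```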
Base64 serves as a binary-to-text encoding that represents binary data using 64 ASCII characters, effectively bridging binary content for transmission over text-only channels like email, though the underlying data remains binary after decoding.[34] For numerical values, fixed-point representation allocates a fixed number of bits for the integer and fractional parts, offering predictable precision suitable for embedded systems and avoiding the overhead of dynamic scaling.[35] In contrast, floating-point representation, standardized by IEEE 754, uses a sign bit, biased exponent, and normalized mantissa to handle a wide dynamic range; for example, the single-precision format (32 bits) encodes the value 1.0 as the byte sequence 0x3F800000 in big-endian order, with bits [0 | 01111111 | 00000000000000000000000] for sign, exponent (127), and mantissa (1.0 implied).[36][37] Metadata integration in binary files often involves embedding encoding information directly in headers to facilitate identification and processing. Magic bytes, such as the 0xFFD8 sequence marking the Start of Image (SOI) in JPEG files, serve as file signatures in the header to denote the encoding scheme without relying on extensions.[38] While MIME types primarily classify content in protocols like HTTP, binary files may incorporate similar identifiers in their headers for self-description. The evolution of encoding schemes includes ASN.1 (Abstract Syntax Notation One), standardized by the ITU-T (formerly CCITT) in 1984, which provides a formal notation for defining data structures in networking protocols, enabling platform-independent binary encoding through rules like BER (Basic Encoding Rules).[39] This standard has been pivotal for structured binary data in telecommunications since the 1980s, supporting extensible and interoperable formats.[40]
Comparison to Text Files
Key Differences in Composition
Binary files differ fundamentally from text files in their composition, as they utilize the complete range of byte values from 0 to 255, encompassing all possible 8-bit combinations including non-printable control characters, null bytes, and arbitrary data sequences.[41] This unrestricted byte usage enables binary files to store complex, machine-interpretable data such as executable code, images, and multimedia without encoding constraints.[42] In contrast, text files are limited to a subset of byte values that correspond to human-readable characters in standardized encodings, typically 7-bit ASCII (values 0-127) or 8-bit extensions like ISO-8859-1, where bytes generally represent printable symbols (e.g., letters, digits, and punctuation from 32-126 in ASCII) and specific control characters like line feeds, while avoiding others to maintain readability across systems.[43] These compositional differences lead to significant practical impacts on file handling and usability. Binary files can achieve greater storage efficiency through data packing techniques, such as bit-level compression or direct representation of numerical values, potentially resulting in smaller sizes compared to equivalent text-based encodings (e.g., a binary integer uses 4 bytes for 32-bit values, while its decimal text form might require more).[44] However, text files prioritize human accessibility, allowing direct editing in basic notepad applications without specialized software, whereas binary files demand hex editors or binary-aware tools to prevent corruption from misinterpreting or altering byte sequences.[45] For instance, casual modifications to a binary executable in a text editor could insert invalid bytes, rendering the file unusable, while text files remain intact under similar operations due to their constrained character set.[41] The lack of interchangeability between the two formats underscores their distinct natures. 
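The storage-efficiency point can be made concrete: a 32-bit integer occupies a fixed 4 bytes in binary form, while its decimal text rendering grows with the digit count. A small sketch using Python's struct module:

```python
import struct

n = 1234567890

binary_form = struct.pack('>i', n)   # fixed width: 4 bytes for any 32-bit value
text_form = str(n).encode('ascii')   # variable width: one byte per decimal digit

print(len(binary_form))  # 4
print(len(text_form))    # 10
```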
Attempting to view a binary file in a text editor often produces gibberish or mojibake—garbled characters arising from mismatched encoding interpretations of non-text bytes—making the content incomprehensible without proper decoding tools.[42] Conversely, processing a text file as binary data risks corruption if the application expects full byte ranges but encounters encoding-specific restrictions, such as absent null bytes, leading to parsing errors or data loss during operations like concatenation or transmission.[45] Historically, early computer files in the 1960s were predominantly text-like, designed for compatibility with punch-card readers, teletypes, and line printers that handled character streams efficiently.[46] This shifted in the 1970s as minicomputers and early personal systems, such as UNIX on the PDP-11 and the Altair 8800, necessitated binary formats for executable programs and graphics to optimize storage and processing speed on limited hardware resources.[47] For example, binary executables like the UNIX a.out format allowed direct loading into memory without runtime interpretation, while emerging graphics applications required packed pixel data that text representations could not support without excessive inefficiency.
Detection and Identification Methods
Detecting whether a file is binary or text involves both manual inspection and automated techniques, primarily relying on heuristics that examine the presence of non-printable characters, control codes, or patterns indicative of structured data rather than human-readable content. A common heuristic scans the initial portion of the file—often the first 512 bytes—for non-printable ASCII characters (those below 32 or above 126, excluding common whitespace like tabs and newlines); if these exceed a threshold, such as 40% in tools like GNU a2ps, the file is classified as binary.[48] Similarly, the presence of null bytes (0x00) is a strong indicator of binary content, as text files rarely contain them except in specific encodings, leading tools like GNU grep to treat files with null bytes as binary unless overridden.[49] The Unix "file" utility, originating in Research Version 4 Unix in November 1973, exemplifies an automated approach using a database of magic numbers—unique byte sequences at specific offsets that identify file types.[50] For binary files, it matches signatures like the ELF header (0x7F 'E' 'L' 'F') for executables or JPEG markers (0xFF 0xD8), classifying them accordingly without relying solely on extensions. This method has evolved into a standard for file identification in Unix-like systems and is maintained through community-updated magic databases. Standards for MIME type detection further aid identification, often combining file extensions with content sniffing to determine if a file is binary. 
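The null-byte, printable-ratio, and magic-number checks described above can be combined into a small classifier; the 512-byte window and 40% threshold follow the heuristics cited in the text, and the signature table is a tiny illustrative subset of a real magic database:

```python
def sniff(data: bytes) -> str:
    """Classify bytes as a known binary format, generic 'binary', or 'text'."""
    # Magic-number check: signatures at offset 0, as used by the file(1) utility.
    signatures = {
        b'\x7fELF': 'ELF executable',
        b'\xff\xd8\xff': 'JPEG image',
        b'\x89PNG\r\n\x1a\n': 'PNG image',
        b'PK\x03\x04': 'ZIP archive',
    }
    for magic, name in signatures.items():
        if data.startswith(magic):
            return name

    window = data[:512]
    if b'\x00' in window:   # null bytes almost never appear in text files
        return 'binary'

    # Printable-ratio heuristic: count printable ASCII plus tab/newline/CR.
    printable = sum(1 for b in window if 32 <= b <= 126 or b in (9, 10, 13))
    if window and printable / len(window) < 0.6:   # over 40% non-printable
        return 'binary'
    return 'text'

print(sniff(b'\x7fELF\x02\x01\x01'))  # ELF executable
print(sniff(b'hello, world\n'))       # text
```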
The MIME Sniffing Standard outlines algorithms for inspecting byte streams to infer types, such as recognizing binary formats through initial bytes while treating ambiguous cases conservatively to avoid misinterpretation.[51] Entropy analysis provides another quantitative method, calculating Shannon entropy (typically 1-4 bits per byte for compressible text like English prose, versus 7-8 bits per byte for random or compressed binary data); high entropy suggests binary or encrypted content, as seen in security tools analyzing file randomness.[52] Edge cases complicate detection, such as hybrid files like PDFs, which are fundamentally binary formats per the ISO 32000 specification but include readable text streams alongside compressed binary objects like images.[53] Unicode text files may contain bytes that resemble binary data, such as the null bytes present in UTF-16 encodings (where each ASCII character occupies a 16-bit code unit with a zero high byte), triggering false positives in heuristic checks, while legitimate text with diacritics (bytes above 127) requires careful handling to avoid misclassification. In modern antivirus scanners, these methods—heuristics, magic numbers, and entropy—are integral for prioritizing binary executables and archives for deeper malware signature scanning, enhancing threat detection efficiency.[54]
Manipulation and Tools
Viewing and Editing Techniques
Viewing binary files requires tools that can interpret and display raw byte data without assuming text encoding, as standard text editors may corrupt or misrender non-printable characters. Hex editors, such as HxD, provide a dual-pane interface showing bytes in hexadecimal notation alongside their ASCII equivalents, allowing users to navigate large files efficiently up to exabyte sizes with flicker-free rendering.[55] Command-line tools like xxd on Linux systems generate hexadecimal dumps of binary files, converting input to a formatted output with offsets, hex values, and ASCII representations for quick inspection.[56] For terminal-based viewing, the less pager with the -X option (disabling terminal initialization sequences) enables safe pagination of binary content, preventing display artifacts from non-printable bytes.[57] Editing binary files involves direct manipulation of byte values in hex editors, where users can overwrite or insert data while maintaining file integrity through features like unlimited undo support in tools such as HxD.[55] However, misalignment during edits—such as inserting bytes that shift subsequent data structures—can disrupt offsets, leading to program crashes or corrupted functionality in executables.[58] Modern hex editors mitigate these risks with visual highlighting of modifications and redo capabilities, facilitating precise adjustments without permanent data loss. 
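The dual hex-and-ASCII view these tools share can be approximated in a few lines of Python; an illustrative xxd-style dump, not a substitute for the editors named above:

```python
def hexdump(data: bytes, width: int = 16) -> str:
    """Render bytes as offset, hexadecimal columns, and a printable-ASCII gutter."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = ' '.join(f'{b:02x}' for b in chunk)
        ascii_part = ''.join(chr(b) if 32 <= b <= 126 else '.' for b in chunk)
        lines.append(f'{offset:08x}  {hex_part:<{width * 3}} {ascii_part}')
    return '\n'.join(lines)

print(hexdump(b'\x7fELF\x02\x01\x01\x00 binary data'))
```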
Common techniques include search and replace operations by byte value, where users specify hex patterns (e.g., two characters per byte) to locate and substitute sequences across the file, as supported in editors like 010 Editor for automated batch replacements.[59] For executables, brief disassembly views in integrated tools like Hex Editor Neo translate byte sequences into assembly instructions starting from a selected offset, aiding initial code analysis without full decompilation.[60] Hex editors emerged in the 1980s as essential debugging aids for microprocessor development, where programmers used hex dumps from tools like PROM burners to inspect and patch firmware directly.[61] In reverse engineering, these editors play a key role by enabling analysts to decode proprietary formats, extract embedded data, and modify binaries for compatibility or vulnerability assessment.[62]
Programmatic Handling
Programmatic handling of binary files requires language-specific APIs that enable byte-level read and write operations without text encoding or decoding. In the C standard library, the fopen function opens files in binary mode by appending 'b' to the mode string, such as "rb" for read-only access or "wb" for write access, ensuring no platform-specific text transformations like newline conversions occur.[63] Data is then read or written using fread and fwrite, which transfer bytes into or from user-provided buffers, allowing precise control over data chunks for efficiency.[63]
Python provides the built-in open function for binary file access, where specifying mode 'rb' returns a file object that yields bytes objects—immutable sequences representing raw binary data—upon methods like read() or readinto().[64] This approach facilitates direct manipulation of binary content, such as processing image or executable files, without implicit string conversions. In Java, the abstract InputStream class and subclasses like FileInputStream handle binary data by providing methods such as read(byte[] b) to fill byte arrays from the stream, supporting sequential byte access suitable for network or file sources.[65]
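A minimal round trip through Python's binary mode ties the 'wb'/'rb' modes and bytes objects together; the temporary-file path is incidental to the technique:

```python
import os
import tempfile

payload = bytes(range(256))  # every possible byte value, 0x00 through 0xFF

fd, path = tempfile.mkstemp()
os.close(fd)
try:
    # 'wb' and 'rb' suppress any newline or encoding translation.
    with open(path, 'wb') as f:
        f.write(payload)
    with open(path, 'rb') as f:
        data = f.read()
    assert data == payload   # byte-for-byte identical after the round trip
finally:
    os.remove(path)
```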
Best practices emphasize explicit management of byte order (endianness) to ensure cross-platform compatibility when interpreting multi-byte values in binary files. Python's struct module addresses this through pack and unpack functions, where format strings use prefixes like '>' for big-endian or '<' for little-endian ordering; for example, struct.pack('>i', value) serializes an integer in network byte order.[30] Developers must also incorporate robust error checking, including validating return values from I/O functions (e.g., non-zero bytes read) and computing checksums like CRC32 or SHA-256 to detect corruption from transmission errors or storage faults. Such verification involves recalculating the checksum on read data and comparing it against a stored value appended to the file.
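The checksum practice can be sketched as a framing helper that appends a CRC32 on write and verifies it on read; the 4-byte big-endian trailer layout is an assumption for illustration, built on zlib.crc32 and struct:

```python
import struct
import zlib

def frame(payload: bytes) -> bytes:
    """Append a big-endian CRC32 of the payload as a 4-byte trailer (illustrative layout)."""
    return payload + struct.pack('>I', zlib.crc32(payload))

def unframe(blob: bytes) -> bytes:
    """Split payload from trailer; raise if the stored CRC32 does not match."""
    payload, (stored,) = blob[:-4], struct.unpack('>I', blob[-4:])
    if zlib.crc32(payload) != stored:
        raise ValueError('checksum mismatch: data corrupted')
    return payload

blob = frame(b'important bytes')
assert unframe(blob) == b'important bytes'

corrupted = bytes([blob[0] ^ 0x01]) + blob[1:]   # flip one bit in the payload
try:
    unframe(corrupted)
except ValueError as exc:
    print(exc)   # checksum mismatch: data corrupted
```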
For handling large binary files, POSIX systems offer memory mapping via the mmap function, which associates a file descriptor with a portion of the process's virtual address space, allowing direct pointer-based access as if the file were in memory.[66] This technique avoids repeated system calls for reads, improving performance for random access patterns on files exceeding available RAM, though it requires handling signals like SIGBUS for out-of-bounds errors. To maintain efficiency with massive datasets, streaming via buffered I/O is recommended, processing files in fixed-size chunks (e.g., 4KB buffers) to minimize memory overhead and enable pipelined operations without loading the entire file.[67]
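Both techniques can be sketched briefly: mmap exposes the file through the virtual address space for random access, while a fixed-size read loop streams it sequentially (the file contents here are arbitrary test data):

```python
import mmap
import os
import tempfile

fd, path = tempfile.mkstemp()
os.write(fd, b'\x00' * 4096 + b'MAGIC' + b'\x00' * 4096)
os.close(fd)

# Random access via memory mapping: pages are faulted in only when touched.
with open(path, 'rb') as f:
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        assert mm[4096:4101] == b'MAGIC'   # slice directly, no seek/read calls

# Sequential streaming with a 4 KB buffer keeps memory usage bounded.
total = 0
with open(path, 'rb') as f:
    while chunk := f.read(4096):
        total += len(chunk)
assert total == os.path.getsize(path)

os.remove(path)
```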
Interpretation and Usage
Operating System Role
Operating systems manage binary files primarily through their kernels, which handle the loading, verification, and initial execution of executables to ensure secure and efficient process creation. When execution is requested, the kernel first checks file system attributes, such as the execute permission bit, which must be set for the requesting user, group, or others to allow the binary to run; without this, the kernel denies access to prevent unauthorized execution of potentially malicious code.[68] For instance, in UNIX-like systems, the execute bit (represented as 'x' in permission strings like rwxr-xr-x) is a fundamental security control enforced during the open and exec phases.[69] Upon verification, the kernel reads the binary file into memory by parsing its header to determine the format and layout. In Linux, for Executable and Linkable Format (ELF) binaries, the kernel examines the initial four bytes—known as the magic number 0x7F followed by the ASCII characters 'E', 'L', 'F'—to confirm it is an ELF object file and to decode details like the target architecture, byte order, and program headers.[70] The kernel then uses these program headers to map segments (such as code, data, and dynamic linking information) into the new process's virtual memory space, often via the mmap system call, allocating virtual addresses while deferring physical memory assignment until pages are accessed; this enables efficient sharing of read-only segments like libraries across processes.[71] Execution proceeds through system calls like execve in UNIX-like systems, which overlays the current process image with the binary: it clears the existing address space, loads the new binary's segments, initializes the stack with arguments and environment variables, and transfers control to the entry point specified in the header, effectively creating the illusion of a new process if preceded by fork.[72] Dynamic linking, where shared libraries are loaded at runtime rather than statically 
included at compile time, is facilitated during this phase; early UNIX systems from the 1970s used static linking exclusively, but dynamic linking emerged in UNIX System V Release 4 in 1988, allowing reusable libraries to be mapped into memory on demand for better modularity and reduced disk usage.[73] Similarly, Microsoft's Portable Executable (PE) format, introduced with Windows NT 3.1 in 1993, supports dynamic linking via DLLs, with the kernel parsing the PE header's DOS stub, NT headers, and section table to map image sections into virtual memory and resolve imports.[74]
Application-Level Processing
At the application level, software utilizes specialized libraries to parse binary files by decoding structured elements like headers and subsequent data streams, enabling higher-level interpretation beyond mere file access. These libraries handle format-specific logic to extract meaningful content, ensuring efficient and correct processing of the encoded data. For instance, the libpng library, the official reference implementation for PNG images, begins parsing by verifying the 8-byte file signature using the png_sig_cmp() function, followed by reading the IHDR header chunk and metadata via png_read_info(), which populates an info structure with details like width, height, and color type.[75] It then applies transformations such as de-interlacing or bit-depth adjustment with png_read_update_info(), before decoding the image data stream row-by-row using png_read_rows() to produce pixel arrays.[75]
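The header-first parsing pattern that libpng follows can be illustrated by hand: the sketch below validates the fixed 8-byte PNG signature and unpacks the image dimensions from the IHDR chunk with the standard struct module (not a substitute for libpng, and the sample header is handcrafted):

```python
import struct

def png_dimensions(data: bytes):
    """Validate the PNG signature and read width/height from the IHDR chunk."""
    if not data.startswith(b'\x89PNG\r\n\x1a\n'):
        raise ValueError('not a PNG file')
    # The first chunk must be IHDR: 4-byte length, 4-byte type, 13-byte body.
    length, ctype = struct.unpack('>I4s', data[8:16])
    if ctype != b'IHDR' or length != 13:
        raise ValueError('missing or malformed IHDR chunk')
    width, height = struct.unpack('>II', data[16:24])
    return width, height

# Handcrafted signature plus IHDR body for a 640x480, 8-bit truecolor image.
header = (b'\x89PNG\r\n\x1a\n'
          + struct.pack('>I4s', 13, b'IHDR')
          + struct.pack('>IIBBBBB', 640, 480, 8, 2, 0, 0, 0))
print(png_dimensions(header))  # (640, 480)
```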
Parsed binary data is then leveraged for domain-specific tasks, such as rendering in multimedia applications where image binary streams are converted into visual output. In graphics software, the decoded pixel data from formats like PNG—represented as row pointers in libpng—is mapped to display buffers or canvas elements to generate on-screen images, supporting features like alpha blending and color space conversion.[75] Similarly, in machine learning frameworks, binary files storing model parameters are loaded for computational operations; PyTorch's torch.load() function, for example, deserializes these files via Python's unpickling process, reconstructing tensors and moving them to the target device (e.g., CPU or GPU) for tasks like inference or fine-tuning.
Error handling during application-level processing often involves validating embedded checksums to confirm data integrity after parsing. A common method uses CRC32, a 32-bit cyclic redundancy check computed as the remainder from dividing the data (padded with zeros) by a fixed generator polynomial using bitwise XOR operations in modulo-2 arithmetic, which detects errors like bit flips introduced during storage or transfer.[76] Applications recompute this value over the parsed sections and compare it to the stored checksum; mismatches trigger exceptions or retries, as seen in libraries handling ZIP archives or network protocols.[76]
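The polynomial division described above reduces to shifts and XORs in modulo-2 arithmetic; a bit-at-a-time sketch of the reflected CRC-32 variant used by ZIP and PNG (generator polynomial 0xEDB88320), cross-checked against zlib:

```python
import zlib

def crc32_bitwise(data: bytes) -> int:
    """Compute CRC-32 one bit at a time (reflected form, poly 0xEDB88320)."""
    crc = 0xFFFFFFFF                  # initial value
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # XOR in the generator polynomial whenever the low bit is set:
            # modulo-2 division, one bit per step.
            crc = (crc >> 1) ^ (0xEDB88320 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF           # final complement

msg = b'123456789'
print(hex(crc32_bitwise(msg)))        # 0xcbf43926, the standard check value
assert crc32_bitwise(msg) == zlib.crc32(msg)
```

Production code uses table-driven or hardware-accelerated variants, but the result is identical.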
To manage resource constraints with large binary files, applications employ streaming parsers that process data incrementally in chunks, avoiding full in-memory loading. These parsers read from input streams sequentially, decoding headers first and then handling data payloads in buffers; for example, Python's struct module facilitates this by unpacking fixed-size binary patterns from buffers read incrementally from file-like objects, so the entire content never needs to reside in memory, which suits gigabyte-scale files in data processing pipelines. This approach maintains a low memory footprint while enabling real-time analysis, such as in video decoding or log aggregation tools.
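An incremental parser of this kind can be sketched with struct and any file-like stream: a count header is decoded first, then fixed-size records are unpacked one at a time (the record layout here is invented for illustration):

```python
import io
import struct

RECORD = struct.Struct('>IH')   # hypothetical record: 4-byte id, 2-byte flags

def parse_records(stream):
    """Yield (id, flags) tuples without loading the whole stream into memory."""
    (count,) = struct.unpack('>I', stream.read(4))   # header: record count
    for _ in range(count):
        yield RECORD.unpack(stream.read(RECORD.size))

blob = struct.pack('>I', 2) + RECORD.pack(7, 1) + RECORD.pack(9, 0)
print(list(parse_records(io.BytesIO(blob))))   # [(7, 1), (9, 0)]
```

Because parse_records is a generator, a caller iterating over a multi-gigabyte file holds only one record's bytes at a time.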
In scenarios involving executable binary code, applications may use just-in-time (JIT) compilation to dynamically optimize and execute snippets at runtime. JIT compilers monitor execution hotspots, baseline-compile binary code into initial machine instructions, and then apply optimizations like type specialization for frequently run sections, producing faster native code tailored to the current environment; this is prominent in JavaScript engines or Java virtual machines where binary bytecode is translated on-the-fly.[77]