COM file

A COM file is a binary executable file format originally developed for the CP/M operating system and later adopted by MS-DOS. It is a flat, unstructured memory image of machine code and data, with no header, metadata, or relocation information.^[1] It is the simplest form of DOS-compatible program and is limited to just under 64 KB (65,280 bytes, or 0xFF00, before any allowance for stack space) because of the single-segment memory model it uses.^[2]^[3] Introduced in the 1970s with CP/M and carried over to MS-DOS in 1981, the format allowed small utility programs to load and run quickly on resource-constrained 8086/8088-based systems, and it preceded more complex formats such as EXE.^[1] In MS-DOS, a COM file took priority over an EXE file of the same name during command execution, a legacy behavior that persisted into early Windows versions such as 95, 98, and Me, where the shell itself was still named COMMAND.COM.^[2]

Upon execution, the DOS loader allocates a memory segment, builds the Program Segment Prefix (PSP) at offset 0x00, loads the entire COM file starting at offset 0x100, initializes the stack pointer near the top of the available memory, and transfers control to the program's entry point at offset 0x100.^[1]^[3] The format's lack of structure imposed significant constraints: a program could not exceed the size limit, had to manage its code, data, and stack by hand within a single 64 KB segment, and relied on BIOS or DOS interrupts (e.g., INT 21h) for system services, with no loader support for relocation or dynamic linking.^[1]^[3]

The format was ideal for compact commands like DEBUG.COM or FORMAT.COM, whereas larger applications needed the MZ/EXE format, introduced in MS-DOS 1.0, with its header and relocatable segments.^[1]^[4] In modern Windows, COM files are largely obsolete. They can still run under the NTVDM subsystem on 32-bit systems or in DOS emulators on 64-bit versions, though they pose security risks because of their simplicity and their historical use in malware.^[2]

Overview and History

Definition and Purpose

A COM file is a flat binary executable format utilized in MS-DOS, consisting of pure machine code and data without any headers or metadata structures.^[5] This simplicity allows the entire file contents to be treated as a single contiguous block of code and initialized data, designed specifically for direct loading into memory at offset 0x100 in the program's segment.^[5] Unlike more elaborate formats, COM files require no parsing of headers or relocation of addresses, making them ideal for environments with limited resources.^[6] The primary purpose of COM files is to enable the rapid execution of small utility programs in resource-constrained systems like MS-DOS, where the emphasis on straightforward operation takes precedence over advanced features such as relocatable code or dynamic linking.^[5] They offer key advantages including minimal runtime overhead, instantaneous loading without the need for format interpretation, and compatibility with memory models that support fixed-address execution, rendering them suitable for bootloaders and memory-resident utilities.^[5] Historically, the COM format originated as the native executable type in CP/M and was directly inherited by early versions of MS-DOS, serving as the default for executables before the introduction of the more versatile EXE format for handling larger applications.^[6] This inheritance ensured continuity in the DOS ecosystem, allowing simple binaries to remain viable even as the operating system evolved.^[6]

Development in MS-DOS

The COM file format originated in 86-DOS, an operating system developed by Tim Paterson at Seattle Computer Products starting in April 1980 as a CP/M-compatible environment for Intel 8086-based systems, providing a simple mechanism for executing assembly-language programs directly as binary images.^[7]^[8] Early versions of 86-DOS, such as 0.33 released in December 1980, included utilities like ASM.COM and HEX2BIN.COM, demonstrating the format's use for compact, load-and-run executables in resource-constrained environments.^[7]^[9] Microsoft first licensed 86-DOS in December 1980 and acquired full rights in July 1981, adapting it into MS-DOS 1.0, released alongside the IBM PC in August 1981, where the COM format became the primary executable type for small programs due to its simplicity and direct compatibility with the system's 64 KB memory limit.^[10]^[11] This adoption integrated COM files into the IBM PC ecosystem, enabling key command-line utilities such as DEBUG.COM for program debugging and FORMAT.COM for disk preparation, which exemplified the format's role in essential system operations.^[11]^[12] The COM format remained largely unchanged through subsequent MS-DOS releases, persisting as a core component up to version 6.22 in 1994, while the more flexible EXE format, introduced in MS-DOS 1.0 (August 1981), supported overlays and programs exceeding 64 KB, shifting preference toward EXE for larger applications.^[13]^[11] Despite this evolution, COM files continued to underpin lightweight utilities in the command-line environment throughout the MS-DOS era.^[14] By the mid-1990s, the rise of graphical interfaces marked the decline of COM files as a primary format; Windows 95, released in August 1995, phased out direct reliance on DOS executables in favor of native Windows applications, though COM support was retained in the MS-DOS compatibility mode for legacy software.^[10]^[13]

Technical Format

Binary Structure

The COM file format consists of a flat binary image containing solely the program's machine code and data, without any file header, segments, or metadata structures. This simplicity stems from its origins in early operating systems like CP/M, where the entire file—limited to a maximum size of 64 KB—is treated as a direct memory loadable entity. Upon execution in MS-DOS, the operating system allocates a single 64 KB memory segment and loads the COM file starting at offset 0x0100 within that segment, setting the code segment (CS) and instruction pointer (IP) to point to this location (CS:IP = segment:0x0100), while data segment (DS) and extra segment (ES) registers are also initialized to the segment base.^[5]^[1] Preceding the loaded code at offsets 0x0000 to 0x00FF within the same segment, MS-DOS constructs a Program Segment Prefix (PSP), a 256-byte data structure that provides essential runtime information such as the program's termination vector, memory allocation details, and command-line arguments, but this PSP is not part of the COM file itself. The COM file's contents thus occupy a contiguous linear block from 0x0100 onward, encompassing the program's code, stack, heap, and any initialized data variables, all managed within the single segment without relocation or segmentation support. This unified layout requires programmers to use absolute addressing relative to the 0x0100 origin, as there are no mechanisms for dynamic relocation during loading.^[15]^[16] In contrast to the more complex EXE format, which begins with an MZ header containing details like the program's entry point, relocation table, and segment information to enable loading into non-contiguous memory and support for larger programs, the COM format lacks all such elements, enforcing a simpler but more restrictive model suitable only for small, self-contained applications. 
COM files are typically produced with an assembler and linker configured for flat binary output. With the Microsoft Macro Assembler (MASM), for example, the directives .MODEL TINY and ORG 100h place the code at the expected origin so that the final output is a pure binary image without object-file or linker artifacts. The standard file extension is .COM, following the 8.3 naming convention of the MS-DOS file system, which allows up to eight characters for the name and up to three for the extension.^[1]^[17]
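To make the headerless layout concrete, the following Python sketch assembles a tiny COM image by hand. It is an illustration only, not the output of any particular assembler: the function name `build_hello_com` and the message text are hypothetical, and the program uses the DOS print-string service (INT 21h, AH=09h), which expects a `$`-terminated string. Note how the string's address has to be computed as an absolute offset from the 0x100 origin.

```python
import struct

ORG = 0x100  # offset at which DOS loads a COM image within its segment


def build_hello_com(message: bytes = b"Hello, DOS!\r\n") -> bytes:
    """Return the raw bytes of a tiny COM program that prints `message`."""
    if b"$" in message:
        raise ValueError("INT 21h AH=09h treats '$' as the string terminator")
    code_len = 2 + 3 + 2 + 2      # sizes of the four instructions below
    msg_offset = ORG + code_len   # absolute offset of the string once loaded
    code = b"".join([
        b"\xB4\x09",                              # mov ah, 09h  (print string)
        b"\xBA" + struct.pack("<H", msg_offset),  # mov dx, msg_offset
        b"\xCD\x21",                              # int 21h
        b"\xCD\x20",                              # int 20h      (terminate)
    ])
    assert len(code) == code_len
    # The file is just code followed by data: no header, no relocations.
    return code + message + b"$"


image = build_hello_com()
print(len(image), "bytes:", image.hex(" "))
```

Writing `image` to a file with a .COM extension would give a program that a DOS system or emulator could load as described above. The first byte of the file is the first instruction executed.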

Memory Loading Process

The MS-DOS command interpreter, COMMAND.COM, initiates the loading of a .COM file by invoking DOS Interrupt 21h with AH=4Bh (the EXEC function), passing the program's filename and an execution parameter block that specifies details such as the command tail and file control blocks (FCBs).^[18] The DOS loader allocates a contiguous block of conventional memory for the program, creating a 256-byte Program Segment Prefix (PSP) at the base of this block to manage the program's environment, including interrupt vectors and default FCBs. The entire contents of the .COM file—treated as raw machine code without any header or relocation information—are then read into memory starting at offset 0x0100 within the allocated segment, immediately following the PSP, using DOS file services like INT 21h AH=3Fh for reading.^[18] This process assumes the file size does not exceed 64 KB (minus the 256 bytes for the PSP), as .COM files operate within a single 64 KB segment.^[19] Upon successful loading, the DOS loader configures the CPU registers to prepare for execution: the code segment (CS), data segment (DS), extra segment (ES), and stack segment (SS) registers are all set to the segment address of the PSP, ensuring the program runs in a flat memory model with unified addressing; the instruction pointer (IP) is set to 0x0100 to begin execution at the start of the loaded code; and the stack pointer (SP) is initialized to 0xFFFE, pointing to the last available word in the 64 KB segment to provide maximum stack space.^[18] No relocation or segment binding occurs, as the .COM format lacks relocation tables, allowing the program to run directly in this single-segment environment without further adjustment by the loader.^[20] The loader then transfers control to the program at the effective address formed by the CS:IP pair. 
The program executes within the allocated memory until it terminates, typically by issuing INT 20h (a direct terminate call that releases the program's memory and returns control to DOS through the terminate address held in the PSP, the INT 22h vector) or INT 21h with AH=4Ch (terminate with return code, which flushes file buffers, closes handles, and releases memory before returning to the caller with an exit code in AL).^[18] A COM program can also end with a plain near RET: the loader pushes a zero word onto the stack before starting the program, so the return lands at offset 0x0000 of the PSP, whose first two bytes contain the INT 20h instruction (CD 20h) and act as a safety net that invokes termination.^[21] Error conditions during loading, such as insufficient memory (error code 08h) or a file larger than 64 KB, cause the EXEC call to return with the carry flag set and the specific error code in AX, prompting COMMAND.COM to display an error message and return to the DOS prompt without executing the program.^[18]

In certain MS-DOS configurations, particularly from version 5.0 onward with memory managers such as HIMEM.SYS and EMM386 loaded and CONFIG.SYS directives such as DOS=HIGH,UMB, core DOS components are relocated to the high memory area (HMA) or upper memory blocks (UMBs). This maximizes the available conventional memory and indirectly leaves more of the lower 640 KB free for loading .COM files.^[22] For terminate-and-stay-resident (TSR) .COM programs, the LH (load high) command in AUTOEXEC.BAT, enabled by UMB support in CONFIG.SYS, can explicitly place them into UMBs above 640 KB, though transient programs are still loaded into conventional memory by default.^[23]
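The loading steps can be modeled in a few lines of Python. This is a simplified sketch under stated assumptions, not a description of the actual DOS loader code: `load_com` and the example segment value 0x1000 are hypothetical, the PSP is reduced to its first two bytes, and the program is assumed to receive a full 64 KB segment (which is when SP starts at 0xFFFE). The zero word placed on the stack is the 2-byte return address that the size limit accounts for.

```python
SEGMENT_SIZE = 0x10000  # one 64 KB real-mode segment
PSP_SIZE = 0x100        # Program Segment Prefix occupies offsets 0x0000-0x00FF


def load_com(image: bytes, segment: int = 0x1000):
    """Model the memory layout and initial registers of a loaded COM program.

    Returns (memory, registers). `segment` is an arbitrary example value.
    """
    if len(image) > SEGMENT_SIZE - PSP_SIZE - 2:
        raise MemoryError("image does not fit in a single segment")
    memory = bytearray(SEGMENT_SIZE)
    memory[0:2] = b"\xCD\x20"                       # PSP begins with INT 20h
    memory[PSP_SIZE:PSP_SIZE + len(image)] = image  # file contents, unmodified
    sp = 0xFFFE
    memory[sp:sp + 2] = b"\x00\x00"                 # return address 0x0000
    registers = {
        "CS": segment, "DS": segment, "ES": segment, "SS": segment,
        "IP": PSP_SIZE, "SP": sp,
    }
    return memory, registers


memory, regs = load_com(b"\xC3")  # a one-byte program consisting of RET
print({name: hex(value) for name, value in regs.items()})
```

In this model the one-byte program's RET would pop the zero word and continue at offset 0x0000, where the INT 20h bytes sit. No byte of the image is changed on the way into memory, which is what "no relocation" means in practice.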

Limitations and Workarounds

Size Restrictions

The COM file format imposes a hard upper bound of 65,280 bytes (0xFF00 in hexadecimal): the 64 KB address space of a single 8086 segment minus the 256-byte Program Segment Prefix (PSP) that MS-DOS places at the start of the segment for essential system data. The practical limit is commonly given as 65,278 bytes (0xFEFE), because a further 2 bytes are reserved on the stack for the return address, and any additional stack the program needs must come out of the same space.^[6]^[24]^[25] COM files have no support for multiple memory segments or for dynamic allocation mechanisms beyond the contiguous RAM available in that single segment, so all code, data, and stack must reside linearly within the allocated space starting at offset 0x0100, immediately after the PSP.^[24]^[26] These constraints shaped program design by promoting highly compact coding practices, such as preferring CPU registers to memory-based variables and generally avoiding external libraries that would inflate the binary.^[27]

If a COM file exceeds the limit, MS-DOS typically rejects it during loading with an error such as "Program too big to fit in memory", or the program crashes because it was loaded incompletely or incorrectly.^[28]^[29] Developers could check a COM file's size with the DIR command in MS-DOS, which displays the exact byte count of the file on disk, though the space actually available for the image must also allow for overhead such as the PSP and the bytes reserved at the end of the segment.^[30]^[6]
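The two figures quoted for the limit, 65,280 and 65,278 bytes, follow from simple subtraction, which the short Python sketch below spells out. The helper `fits_in_com` and its `stack_reserve` parameter are hypothetical and only illustrate that whatever stack a program needs comes out of the same segment.

```python
SEGMENT_BYTES = 0x10000  # 64 KB addressable through one 16-bit offset
PSP_BYTES = 0x100        # Program Segment Prefix built by DOS
RETURN_WORD_BYTES = 2    # return address reserved on the stack

address_space_limit = SEGMENT_BYTES - PSP_BYTES                # 0xFF00
practical_limit = address_space_limit - RETURN_WORD_BYTES      # 0xFEFE


def fits_in_com(file_size: int, stack_reserve: int = 0) -> bool:
    """Check a file size against the limit, leaving room for extra stack."""
    return 0 < file_size <= practical_limit - stack_reserve


print(address_space_limit, hex(address_space_limit))
print(practical_limit, hex(practical_limit))
print(fits_in_com(60_000), fits_in_com(65_279))
```

A real program needs far more than two bytes of stack, so a file close to the nominal maximum leaves almost no working room even though it would load.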

Techniques for Larger Programs

To overcome the 64 KB size restriction inherent to COM files, developers employed overlay techniques, loading a compact core program as a COM file and dynamically fetching additional code or data from disk files during execution. This was achieved using MS-DOS interrupt 21h functions, such as AH=3Dh to open a file and AH=3Fh to read its contents into allocated memory, allowing the program to incorporate larger modules on demand.^[31] Alternatively, interrupt 21h with AH=4Bh and AL=03h provided a dedicated "load overlay" capability, transferring code from a specified file into a target memory location without immediate execution, enabling segmented program structures despite the flat memory model of COM files.^[32] Self-modifying code offered another workaround, where the running program altered its own instructions in memory to emulate segmentation or adapt behavior, leveraging the fact that COM files treat code and data within the same writable segment. This technique reduced the need for static inclusion of all logic within the initial 64 KB load, though it required careful management to avoid corruption. For instance, a program could overwrite portions of its code to branch to newly loaded routines, simulating a multi-segment EXE-like architecture. Tools for COM-to-EXE conversion, such as com0exe, facilitated creating hybrid setups by wrapping a small COM stub around larger EXE overlays, effectively reverse-engineering the process of tools like EXE2BIN to produce COM-compatible entry points for extended functionality. 
In TSR mode, small COM-based stubs remained in memory after initial loading, hooking interrupts to chain-load or invoke larger modules as needed; the Microsoft Mouse driver (MOUSE.COM) exemplifies this, installing a minimal resident handler that extended input capabilities without exceeding COM limits.^[33] Early games adopted similar extensions, starting with compact COM loaders that dynamically incorporated graphics or level data to fit within memory constraints. These methods, while innovative, introduced significant limitations: they heightened development complexity due to manual memory management, risked instability from improper loading or overwrites, and exhibited incompatibility with certain DOS versions or hardware configurations lacking sufficient free memory above the COM segment.^[34]
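As a small illustration of the "load overlay" call mentioned above, the Python sketch below packs the parameter block that INT 21h AH=4Bh, AL=03h is commonly documented as expecting: two 16-bit little-endian words giving the segment at which to load the overlay and a relocation factor. The layout is stated here as an assumption based on widely circulated DOS interrupt references, and `overlay_parameter_block` is a hypothetical helper name.

```python
import struct


def overlay_parameter_block(load_segment: int, relocation_factor: int = 0) -> bytes:
    """Pack the two-word block passed to the DOS 'load overlay' call.

    Assumed layout: offset 0 = segment to load at, offset 2 = relocation
    factor (only meaningful when the overlay file is in EXE format).
    """
    for value in (load_segment, relocation_factor):
        if not 0 <= value <= 0xFFFF:
            raise ValueError("both fields are 16-bit words")
    return struct.pack("<HH", load_segment, relocation_factor)


block = overlay_parameter_block(0x2000)  # 0x2000 is an arbitrary example segment
print(block.hex(" "))
```

The calling program is responsible for owning the memory at the target segment before making the call, which is part of the manual memory management burden described above.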

Platform Compatibility

Support in DOS and Early Windows

COM files enjoyed full native support in MS-DOS versions 1.0 through 7.0, from their introduction in 1981 to the late 1990s, as simple binary executables loaded directly by the command interpreter COMMAND.COM. This interpreter, residing in memory as both a resident and transient portion, handled execution by searching for the file in the current directory or along the PATH environment variable and loading its contents into memory starting at offset 0x100, preserving the DOS environment for the program.^[14] Key operational features in MS-DOS emphasized COM files' efficiency and priority. The system searched for executables by prioritizing the .COM extension over .EXE and .BAT in the current directory and PATH directories, enabling quick access without specifying extensions. Additionally, COM files could be automatically executed during system startup via the AUTOEXEC.BAT batch file, which ran commands sequentially after CONFIG.SYS processing, allowing utilities or drivers to load seamlessly at boot.^[35]^[36] In early Windows versions 1.0 to 3.1 (1985–1992), COM files executed within a DOS box, a virtualized DOS environment that inherited the native MS-DOS loader behavior for compatibility with the underlying DOS host. This setup allowed DOS-based programs, including COM files, to run windowed or full-screen under Windows' graphical shell, with the DOS box providing emulation for graphics modes and hardware access.^[37] From Windows NT in 1993 onward, the NT Virtual DOS Machine (NTVDM) provided emulated support for COM files on 32-bit x86 systems, replicating the DOS loading process while enforcing the format's inherent 64 KB size limit through memory segmentation. 
NTVDM isolated 16-bit DOS applications in a virtualized subsystem, enabling execution without interfering with the 32-bit kernel.^[38] Support for COM files was gradually deprecated as legacy technology starting with Windows 95, though it was retained through virtual DOS mechanisms such as NTVDM in the NT family for backward compatibility. Microsoft placed NTVDM in maintenance mode because of its age and security vulnerabilities, and recommended migration to modern 32-bit or 64-bit applications. This support persisted as an optional feature in 32-bit editions of Windows 10 until the end of that system's support in October 2025. Windows 11, released in 2021 as a 64-bit-only OS, includes neither NTVDM nor any other built-in means of running 16-bit DOS programs, in line with contemporary hardware and security standards.^[38]^[39]

Implementation on Other Systems

The MS-DOS COM format for 8086 processors is closely related to the executable formats of Digital Research's CP/M family, including CP/M-86, the version of CP/M for Intel 8086 systems that reached the market in the early 1980s. CP/M-86 primarily used the .CMD extension for its memory image files, but its design emphasized simple binary loading akin to the flat, non-relocatable structure of MS-DOS .COM files. The underlying approach, which treats an executable as a raw memory image starting at offset 0x100, suited resource-constrained environments, and MS-DOS adopted the convention for its own 8086 binaries.^[40]^[41]

DR-DOS, released by Digital Research in 1988 as a compatible alternative to MS-DOS, retained the core COM file format while introducing variations such as extended file attributes and additional interrupt 21h functions for enhanced system calls. These modifications left the direct loading process for .COM files unchanged: the binary is still mapped into memory at offset 0x0100 of the program's segment, so MS-DOS-compatible programs run without alteration. However, certain system files such as COMMAND.COM in DR-DOS 6.0 used the more advanced DOS executable (EXE) format internally to accommodate larger code, while standard application .COM files kept the plain format.^[42]^[43]

FreeDOS, an open-source DOS-compatible operating system initiated in 1994, provides full support for .COM files through its kernel loader, which emulates the MS-DOS loading behavior by reading the file as a raw binary image and executing it in real mode at the conventional memory offset.
The FreeDOS kernel (KERNEL.SYS) handles .COM execution identically to MS-DOS, loading the entire file into memory below 640 KB and transferring control to the entry point, thereby maintaining compatibility for legacy DOS software on modern hardware. This design choice ensures that .COM programs run without modification, leveraging the kernel's CONFIG.SYS and FDCONFIG.SYS directives for environment setup.^[44]^[45] Emulators like DOSBox, first released in 2002, enable .COM file execution by simulating an IBM PC-compatible environment, including the DOS command interpreter and memory management necessary for loading and running these flat binaries. DOSBox mounts host directories as virtual drives and invokes .COM files via the emulated command line, replicating the original loading process with cycle-accurate timing for authentic behavior in games and utilities. Similarly, PCem (and its successor 86Box) supports .COM execution through full hardware emulation of x86 systems from the 1980s and 1990s, allowing users to boot DOS variants and run .COM programs as on genuine period hardware, complete with accurate BIOS interactions and peripheral simulation.^[46]^[47] On Unix-like systems, .COM files can be executed using DOSemu, a Linux-based DOS emulation layer that provides a user-space environment for running DOS applications, including direct loading of .COM binaries via an emulated MS-DOS kernel. DOSemu integrates with the host filesystem, allowing seamless access to .COM files while handling real-mode execution through dynamic recompilation or interpretation. Wine does not support DOS .COM files natively. For executing DOS programs on Unix-like systems, dedicated emulators like DOSBox or DOSemu are recommended.^[48] In embedded systems, .COM files find use in certain BIOS and UEFI-compatible tools for x86 architectures, particularly in legacy real-mode utilities embedded within firmware for diagnostic or boot-time operations that require DOS compatibility. 
These tools leverage the simple loading mechanism of .COM files to execute in the pre-OS environment, ensuring portability across x86-based embedded platforms without relying on complex loaders.^[49]

Modern Applications

Compatibility in Contemporary OS

In contemporary 64-bit Windows 11 editions, COM files cannot be executed natively due to the lack of the NTVDM (NT Virtual DOS Machine) subsystem, which was limited to 32-bit Windows versions and placed in maintenance mode without further development.^[38] The WOW64 subsystem supports 32-bit applications but does not handle 16-bit DOS executables like COM files, requiring third-party emulators such as DOSBox-X or NTVDMx64 to provide compatibility through simulated DOS environments.^[38]^[50]^[51] Following the launch of the 64-bit-only Windows 11 in 2021 and updates including version 25H2 released in September 2025, these emulators have become essential for any DOS legacy support, as no built-in mechanisms exist for direct loading.^[52]^[53]

Linux and macOS offer no native execution for COM files, as these systems do not include DOS-compatible loaders, instead relying on user-space emulators like DOSBox-X for lightweight simulation or QEMU for full-system virtualization paired with a DOS kernel.^[50]^[54] This emulation approach ensures isolation but demands manual configuration to mount file systems and replicate hardware interfaces.^[55]

Support for COM files persists in modern operating systems primarily to accommodate legacy business software in enterprises, where outdated DOS applications continue to operate critical workflows; retro computing communities preserve historical programs; and cybersecurity professionals analyze malware samples that exploit the format to evade detection.^[45]^[56]^[57] Contemporary development tools, such as the Netwide Assembler (NASM), enable the generation of COM-compatible flat binary outputs using the -f bin format, allowing developers to assemble and test DOS code across platforms like Windows, Linux, and macOS without platform-specific dependencies.^[58]

Running COM files on 64-bit systems presents challenges, including the complete absence of direct execution paths, which blocks legacy loaders and requires virtualization layers like QEMU or VirtualBox to achieve hardware-accurate emulation and prevent compatibility gaps in timing or interrupts.^[59]^[54] As of 2025, COM file compatibility has become increasingly niche, with viability maintained through open-source initiatives like FreeDOS 1.4, released in April 2025, which provides an updated DOS-compatible kernel for running and developing such executables in emulated or bare-metal environments.^[45]^[60]

Execution Order in DOS Environments

In MS-DOS environments, when a user invokes a command without specifying a file extension, the COMMAND.COM shell initiates a search process that prioritizes .COM files over other executable formats. It first checks for matching internal commands embedded within COMMAND.COM itself. If no internal command matches, it scans the current directory, followed by each directory listed in the PATH environment variable, appending the extensions .COM, .EXE, and .BAT sequentially until a matching file is found or the search exhausts all options.^[61] If a .COM or .EXE file is located, COMMAND.COM invokes the MS-DOS EXEC function to load and execute it; otherwise, it falls back to interpreting a .BAT file if present.^[62] This execution order favors .COM files due to their straightforward structure as raw memory images, inherited from CP/M conventions, which enables quicker loading without the need to parse complex headers or perform relocation adjustments required for .EXE files—thereby minimizing overhead in resource-constrained command-line operations typical of DOS systems.^[1] The .BAT extension is checked last because batch files involve sequential interpretation by COMMAND.COM, introducing additional processing latency compared to the direct execution of binary formats.^[63] The behavior of COMMAND.COM can be influenced through configuration files like AUTOEXEC.BAT, which executes at system startup and allows setting or modifying the PATH variable to reorder directory priorities, potentially favoring locations with .COM files. For instance, placing utility directories early in PATH ensures .COM executables are discovered before equivalents in later paths. Internal commands, such as DIR for directory listing or ECHO for output display, inherently take precedence as they are handled directly by COMMAND.COM without any file search, and this priority cannot be altered but can be supplemented via batch scripts in AUTOEXEC.BAT. 
The PROMPT command further customizes the interactive shell by defining prompt strings that incorporate variables or conditional elements, indirectly aiding command prioritization in scripted environments.^[63] Modern DOS emulators, such as DOSBox, faithfully replicate this .COM-.EXE-.BAT search order to preserve authentic behavior for legacy software, with options in configuration files like dosbox.conf to adjust PATH emulation or mount directories that mimic original disk structures.^[62] Exceptions to the standard order arise with certain utilities; for example, APPEND.COM, a terminate-and-stay-resident program introduced in MS-DOS 3.3, modifies the file search mechanism by appending extra directories for data file access via FCB (File Control Block) calls, which some older applications use for locating executables and can thus alter effective PATH resolution in non-standard scenarios.^[64]
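The search order described in this section can be sketched in Python as follows. This is a simplified model, not COMMAND.COM's actual implementation: `resolve_command` is a hypothetical name, the set of internal commands is a small illustrative subset, and a dictionary of directory contents stands in for the real file system.

```python
INTERNAL_COMMANDS = {"DIR", "ECHO", "COPY", "TYPE"}  # illustrative subset only
EXTENSION_ORDER = (".COM", ".EXE", ".BAT")


def resolve_command(name, current_dir, path_dirs, files):
    """Return what would run for `name` typed without an extension.

    `files` maps a directory name to the set of file names it contains.
    """
    name = name.upper()
    if name in INTERNAL_COMMANDS:
        return ("internal", name)            # no file search takes place
    for directory in [current_dir, *path_dirs]:
        present = {f.upper() for f in files.get(directory, ())}
        for ext in EXTENSION_ORDER:          # .COM wins within one directory
            if name + ext in present:
                return ("file", directory + "\\" + name + ext)
    return ("not found", name)


files = {
    "C:\\WORK": {"EDIT.BAT"},
    "C:\\DOS": {"EDIT.COM", "EDIT.EXE", "FORMAT.COM"},
}
print(resolve_command("edit", "C:\\WORK", ["C:\\DOS"], files))
print(resolve_command("edit", "C:\\TEMP", ["C:\\DOS"], files))
print(resolve_command("dir", "C:\\WORK", ["C:\\DOS"], files))
```

The first call shows that directory order outranks extension order: a .BAT file in the current directory is found before a .COM file further along the PATH. The extension priority only decides between files in the same directory.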

Security Implications

Vulnerabilities in Format

The COM file format's absence of headers or metadata precludes any built-in integrity verification, such as checksums or digital signatures, rendering files susceptible to undetected tampering by appending, prepending, or overwriting malicious code. This raw binary structure, consisting solely of executable instructions and data, enables attackers to modify files without altering their apparent size or extension in a way that triggers loader warnings, facilitating stealthy alterations. Furthermore, the direct memory loading process in DOS—mapping the entire file into a single 64 KB segment starting at offset 0x100—bypasses content validation, allowing potentially malicious or malformed code within the size limit to execute directly in memory.^[26] COM files employ absolute addressing, assuming execution from the fixed memory location of 0x100, which lacks relocation information and prevents position-independent code execution; this rigidity allows code injection exploits where attackers craft payloads tailored to this exact layout, as the format offers no mechanisms to enforce or verify address integrity during loading. In DOS environments, COM programs operate without modern privilege rings, granting direct access to hardware interrupts for system calls like file I/O or memory manipulation, which can enable escalation from application-level operations to full system control without authentication barriers.^[26] Early viruses exemplified these flaws, with the Jerusalem virus (detected in 1987) leveraging the format's simplicity to infect COM files by appending its code to the file's end—expanding the size while preserving functionality—and overwriting the initial three bytes with a jump instruction to the viral payload, enabling self-replication upon execution without detection by the loader. 
The format provides no inherent mitigations like code signing or embedded checksums, leaving protection dependent on external antivirus tools that scan for known signatures or behavioral anomalies. Compared to modern formats such as the Portable Executable (PE), which includes optional checksum fields, section headers for validation, and support for relocations and signatures, or the Executable and Linkable Format (ELF) with program headers enabling integrity verification and dynamic linking, the COM design's minimalism inherently amplifies vulnerability to such manipulations.^[65]^[66]
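Because the format carries no metadata, inspecting a COM file for the appending pattern described above comes down to reading its first bytes. The Python sketch below decodes an initial near JMP (opcode E9 followed by a signed 16-bit displacement) and reports where it lands once the image is loaded at offset 0x100. The function names and the `tail` threshold are hypothetical, and the check is only a heuristic: many legitimate COM programs also begin with a jump over their data.

```python
import struct

ORG = 0x100  # load offset of a COM image within its segment


def entry_jump_target(image: bytes):
    """Return the in-memory target of a leading near JMP, or None."""
    if len(image) < 3 or image[0] != 0xE9:
        return None
    (rel16,) = struct.unpack_from("<h", image, 1)  # signed displacement
    # The displacement is relative to the address after the 3-byte JMP.
    return (ORG + 3 + rel16) & 0xFFFF


def entry_jumps_into_tail(image: bytes, tail: int = 0x800) -> bool:
    """True if execution starts by jumping into the last `tail` bytes."""
    target = entry_jump_target(image)
    if target is None:
        return False
    end = ORG + len(image)
    return end - tail <= target < end


# Synthetic 16 KB image whose first instruction jumps 0x100 bytes before the end.
sample = b"\xE9" + struct.pack("<h", 0x3EFD) + bytes(0x4000 - 3)
print(hex(entry_jump_target(sample)), entry_jumps_into_tail(sample))
```

A scanner would combine a signal like this with others, since the loader itself performs no validation that could help.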

Malicious Exploitation of Extension

Attackers frequently exploit the .COM extension through spoofing, renaming files such as scripts or documents (e.g., .txt or .bat) to .COM so that users run them in the belief that they are legacy DOS programs.^[67] This social engineering tactic relies on the assumption that .COM files are harmless or outdated, and it works wherever the file can still be launched directly, such as from a command prompt or inside a virtual machine. Double extensions are another common abuse: a file named "report.com.exe" hides its true type because Windows hides known file extensions by default and shows the name as "report.com".^[68] Relying on Windows Explorer's extension display settings in this way leads users to misjudge the file and run the embedded malware.^[67]

Historically, malware has targeted the .COM format for infection and propagation. The Cascade virus from 1987 appended its code to .COM files on MS-DOS systems, causing widespread disruption by corrupting executables and making the text on screen cascade downward.^[69] Similarly, the Jerusalem virus (1987) infected .COM and .EXE files and activated on Friday the 13th to delete files, an early example of parasitic malware taking advantage of the format's simplicity in DOS environments.^[69]

In phishing campaigns, .COM attachments are used to evade email filters that primarily flag common executables like .exe. Security gateways may overlook .COM on the assumption that it refers to a domain name rather than a file type, allowing malicious payloads to reach inboxes disguised as invoices or updates.^[70] This tactic has seen increased adoption since 2018, with attackers embedding droppers or scripts in .COM files to initiate infections upon user interaction.^[70]

Detection is further complicated by obfuscation methods like the right-to-left override (RTLO) Unicode character (U+202E), which reverses the display order of the characters that follow it. For instance, a file named "invoice[U+202E]txt.exe" is displayed as "invoiceexe.txt" but still executes as an .exe, which defeats checks that rely on the visible extension.^[71] While .COM-specific RTLO uses are less documented, the technique can similarly disguise .COM files as innocuous types (e.g., appearing as .txt), evading pattern-based detection in legacy-compatible scanners.^[72]

As of 2025, .COM exploitation remains rare but persistent in targeted attacks against legacy systems, such as industrial control environments running DOS-compatible software, where ransomware groups deploy .COM droppers to bypass modern protections that lack full backward compatibility.^[73] These threats are increasingly mitigated by endpoint detection tools that verify file signatures rather than extensions and enforce execution policies in virtualized legacy environments.^[74]
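The filename tricks discussed in this section lend themselves to simple automated checks. The Python sketch below flags names containing the right-to-left override character and names with a decoy extension followed by an executable one. It is a heuristic illustration only: `suspicious_name_reasons` and the two extension sets are hypothetical choices, not the rules of any particular security product.

```python
RLO = "\u202e"  # right-to-left override
EXECUTABLE_EXTS = {"com", "exe", "bat", "scr", "pif"}
DECOY_EXTS = {"com", "txt", "doc", "pdf", "jpg"}


def suspicious_name_reasons(filename: str) -> list[str]:
    """Return human-readable reasons why a file name looks deceptive."""
    reasons = []
    if RLO in filename:
        reasons.append("contains right-to-left override (U+202E)")
    parts = filename.lower().split(".")
    # Look at the last two dot-separated extensions, if there are two.
    if len(parts) >= 3 and parts[-1] in EXECUTABLE_EXTS and parts[-2] in DECOY_EXTS:
        reasons.append(f"double extension .{parts[-2]}.{parts[-1]}")
    return reasons


for name in ["report.com.exe", "invoice\u202etxt.exe", "notes.txt", "archive.tar.gz"]:
    print(repr(name), suspicious_name_reasons(name))
```

Checks like these look only at names, so they complement rather than replace inspection of the file contents, which is what ultimately determines how the file executes.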

References

[1]
What's the difference between the COM and EXE extensions?
Mar 24, 2008 · The format of a COM file is… um, none. There is no format. A COM file is just a memory image. This “format” was inherited from CP/M. To load ...
[2]
COM File - What is a .com file and how do I open it? - FileInfo.com
Nov 1, 2021 · A COM file is an executable program capable of being run by MS-DOS and Windows. It is saved in a binary format and is similar to an .EXE file.
[3]
The MS-DOS COM Executable File Format - FileFormat.Info
Original Documentation. The COM files are raw binary executables and are a leftover from the old CP/M machines with 64K RAM. A COM program can only have ...
[4]
COM - OSDev Wiki
COM files are simple, raw binary executables used by MS-DOS, limited to 64kb, and loaded at 0x100. They are useful for simple loading.
[5]
DOS executable (.com) - Just Solve the File Format Problem
Aug 8, 2025 · Since the executable is limited to a single segment, the maximum size of a COM file is 65280 (0xff00) bytes. Some files carrying the .com ...
[6]
Oldest known version of DOS demoed — recently unearthed 86 ...
Jan 7, 2024 · HEX2BIN.COM is a loader for assembly code, changing the hex assembly code to binary. ASM.COM is an assembly language program from SCP. You may ...
[7]
86-DOS Revisited | OS/2 Museum
Jan 15, 2024 · The release notes for 86-DOS 1.0 mention that with the new 32-byte directory entry format, file sizes are no longer limited to 16 megabytes.
[8]
86-DOS 0.33 - BetaWiki
Mar 16, 2025 · 86-DOS 0.33 is the second release of 86-DOS. It was shipped with 2 manuals, the 86-DOS Version 0.3 User's Manual and the 86-DOS Version 0.3 Programmer's Manual.
[9]
Microsoft MS-DOS early source code - Computer History Museum
Mar 25, 2014 · MS-DOS was basically a file manager and a simple program loader. The user interface was text commands typed on a keyboard, followed by text ...
[10]
DOS 1.0 and 1.1 | OS/2 Museum
DOS 1.0 could read/write 160KB floppies, start .COM/.EXE, and process batch files, but lacked hard disk support. DOS 1.1 doubled disk capacity to 320KB.
[11]
86-DOS 1.10 - BetaWiki
Feb 20, 2024 · This version of 86-DOS is the first known version to support Microsoft's EXE executable format. The DOS kernel is now stored as a file ...
[12]
MS-DOS 1.25 - BetaWiki
Jul 16, 2025 · MS-DOS 1.25 was released in April 1982 as the first general release to OEM customers other than IBM, so it was used by all the first clone manufacturers.
[13]
The MS-DOS Encyclopedia: Section III: User Commands
This section of The MS-DOS Encyclopedia describes the standard internal and external MS-DOS commands available to the user who is running MS-DOS (versions 1.0 ...
[14]
Appendix H: Program Segment Prefix (PSP) Structure - PCjs Machines
The PSP structure includes an INT 20H instruction, address of last segment, reserved space, long call to MS-DOS, and interrupt vectors.
[15]
Format of Program Segment Prefix (PSP)
Format of Program Segment Prefix (PSP) ; 2Ch. WORD. DOS 2+ segment of environment for process (see #01379) ; 2Eh. DWORD. DOS 2+ process's SS:SP on entry to last ...
[16]
how to create .com files using masm 5.10? - Stack Overflow
Apr 23, 2011 · It can be done in MASM 5.1 (or older). From the MASM 5.0 docs, here is the basic shell with your test program.
[17]
The MS-DOS Encyclopedia: Section V: System Calls - PCjs Machines
int 20H ; Transfer to MS-DOS. Interrupt 21H (33) Function 00H (0) Terminate Process 1.0 and later Function 00H flushes all file buffers to disk, terminates ...
[18]
How does DOS load a program into memory? - Stack Overflow
Sep 15, 2010 · What steps does MS-DOS take to load a COM or EXE file into memory? Are there still references online as to how this happens? The best I can ...
[19]
MSDOS loading .com / .exe process
Mar 27, 2018 · What i am looking for is the loading process of .com and .exe files before execution. I already know .com files are loaded at 100h, 0..99h data ...
[20]
Why does MS-DOS put an int 20h at byte 0 of the COM file program ...
Mar 9, 2020 · The int 20h is the “exit program” system call. One theory is that it is placed at offset 0000h so that if execution runs off the end of the code ...
[21]
How Do I Get MS-DOS to Run in the High Memory Area? (96710)
If you are using MS-DOS 5.x or earlier, you need to create an MS-DOS startup disk. · Copy your CONFIG.SYS file to the startup disk by typing the following: · Use ...
[22]
List of DOS CONFIG.SYS commands
Specifies where to load DOS. DOS=HIGH|LOW[,UMB|,NOUMB] HIGH Load DOS into the high memory area (HMA) if available. LOW Load DOS into conventional memory.
[23]
MS-DOS DEBUG Program - The Starman's Realm
DEBUG gained the ability to assemble instructions directly into machine code (the A command). This is one of the most important commands for many of its users.
[24]
ms dos - How did large .COM files work?
Apr 25, 2020 · An MS-DOS .com file is just raw code/data without header, thus no linking information, and was limited to be loaded into just one segment (64kB).
[25]
COM - DOS Command File Format
The COM file format is a binary executable format used in Microsoft Windows or MS-DOS. Its structure consists of just a set of instructions.
[26]
Advanced MS-DOS Programming - PCjs Machines
Advanced MS-DOS Programming is written for the experienced C or assembly-language programmer. It provides all the information you need to write robust, high- ...
[27]
Why does a corrupted binary sometimes result in "Program too big to ...
Jan 30, 2006 · But where does “Program too big to fit in memory” come from? If the program header is corrupted, then various fields in the header such as those ...
[28]
16-bit assembly incompatibility with 64-bit windows 7 - Stack Overflow
Sep 27, 2013 · The COM case is easy: 65280-byte-max. 16-bit MS-DOS® program, out. EXE files, on the other hand, have certain file headers: one for the 16-bit ...
[29]
dir | Microsoft Learn
Feb 3, 2023 · For files, this command displays the name extension and the size in bytes. This command also displays the total number of files and directories ...
[30]
Int 21h Function 3Dh - Assembly Language Help - github
The function opens the specified file in the designated or default directory on the designated or default disk drive.
[31]
Int 21H, AH=4BH - osFree project
Sep 7, 2018 · AH = 4Bh AL = type of load 00h load and execute 01h load but do not execute 03h load overlay (see #01591) 04h load and execute in background ...
[32]
pts/com0exe: DOS .com program to .exe converter - GitHub
com0exe is a command-line tool to convert DOS .com executable programs to DOS .exe. It is compatible with any .com program (unlike other similar tools)
[33]
[PDF] Microsoft® DEBUG
The Microsoft DEBUG Utility (DEBUG) is a debugging program that provides a controlled testing environment for binary and executable object files.
[34]
MS DOS Command: PATH - output.To
MS-DOS searches for a file by using default filename extensions in the following order of precedence: .COM, .EXE, and .BAT. To run ACCNT.BAT when ACCNT.COM ...
[35]
http://www.output.to/sideway/default.aspx?qno=110700234
[36]
[PDF] Microsoft Windows 3.1 Resource Kit 0030-31645 1992 - vtda.org
This is a Microsoft Windows 3.1 Resource Kit, providing complete technical information for support professionals for the Microsoft Windows Operating System ...
[37]
NTVDM and 16-bit app support - Compatibility Cookbook
Nov 17, 2021 · NTVDM is a Feature on Demand and only supported on the x86 version of Windows. It is not supported on x64 and ARM versions of Windows, which do ...
[38]
Windows 11 Specs and System Requirements - Microsoft
Minimum Windows 11 requirements: 1 GHz 2+ core 64-bit processor, 4 GB RAM, 64 GB storage, UEFI, TPM 2.0, DirectX 12 graphics, 720p display >9".
[39]
[PDF] CP/M-86® - Operating System
This means that if the disk formats are the same, as in standard single density format, CP/M-86 can read the same data files as CP/M.
[40]
Was DOS copied from CP/M? - Embedded
Aug 6, 2016 · The answer is no. Further research showed that very early versions of DOS were designed to read and write CP/M files. The code I found confirms ...
[41]
The AARD Code and DR DOS - Geoff Chappell, Software Analyst
Aug 13, 2021 · The updated appendix now lists int 21h functions that DR DOS adds to the MS-DOS interface, along with some variations from MS-DOS, especially ...
[42]
https://www.geoffchappell.com/notes/windows/archive/aard/drdos/index.htm
[43]
FreeDOS command: kernel
The KERNEL is necessary to load drivers via CONFIG.SYS / FDCONFIG.SYS and to make COMMAND.COM or other shells run in FreeDOS. Whereas the KERNEL loads CONFIG.
[44]
The FreeDOS Project
FreeDOS is an open source DOS-compatible operating system that you can use to play classic DOS games, run legacy business software, or write new DOS programs.
[45]
DOSBox v0.74-3 Manual
Add the commands you want to execute to the [autoexec] section of the DOSBox configuration file. Open the DOSBox configuration file and change the usescancodes ...
[46]
PCem Emulator
[47]
DOSEMU Main Page
Nov 2, 2012 · DOSEMU stands for DOS Emulation, and allows you to run DOS and many DOS programs, including many DPMI applications such as DOOM and Windows 3.1, ...
[48]
How do I run a .com file through wine? - WineHQ Forums
May 17, 2008 · .com is a dos program not a windows one. Wine does have dos layour just not a good one. Dosemu and dosbox are better at running dos applications.
[49]
[PDF] Advanced UEFI Development Environment for Embedded Platforms
Unified Extensible Firmware Interface (UEFI) specifies how firmware boots OS loader. • UEFI's Platform Initialization Architecture (PI).
[50]
DOSBox-X - Accurate DOS emulation for Windows, Linux, macOS ...
DOSBox-X is an open-source DOS emulator for running DOS applications and games. DOS-based Windows such as Windows 3.x and Windows 9x are officially supported.
[51]
NTVDMx64 by Leecher1337 - Edward Mendelson
NTVDMx64 is a 64-bit version of NTVDM that allows running old DOS applications under 64-bit Windows, fully integrated with the Windows file system.
[52]
How to keep running DOS 16 bit applications when Windows 11 ...
Nov 8, 2022 · Windows 11 does not support NTVDM, which eliminates support for 16-bit application supportability. Windows 11 is 64-bit only and will likely not run DOS ...
[53]
How to run DOS programs in Linux - Opensource.com
Oct 19, 2017 · Run DOS programs in Linux using QEMU, a PC emulator, and FreeDOS, a DOS-compatible OS, installed in a virtual machine.
[54]
How to run DOS apps on Linux - Both.org
Jun 13, 2024 · Run DOS apps on Linux using a PC emulator (QEMU) and FreeDOS. Steps include creating a virtual disk, installing FreeDOS, and booting from it.
[55]
What is a .COM File? Not Just Another Dotcom Bubble - Huntress
Sep 7, 2025 · COM file is a simple executable file that runs code on Windows or DOS systems. It tells your computer to perform specific actions when opened.
[56]
Relive your worst MS-DOS file-deletion memories at the Malware ...
Feb 9, 2016 · A website collection of 78 viruses from the MS-DOS era of the late '80s and early '90s, all ready to either launch on a DOSBOX web browser emulator or be ...
[57]
Chapter 9: Output Formats - NASM - The Netwide Assembler
NASM is a portable assembler, designed to be able to compile on any ANSI C-supporting platform and produce output to run on a variety of Intel x86 operating ...
[58]
Execution of COM files in windows - Stack Overflow
Feb 9, 2011 · 32-bit windows will execute them inside ntvdm.exe (which emulates DOS / 16-bit windows) 64-bit windows does not support 16-bit applications.
[59]
FreeDOS 1.4: Still DOS, still FOSS, more modern than ever
Apr 9, 2025 · The FreeDOS Project has released version 1.4 of its fully open source DOS-compatible OS – but you'll need a BIOS for bare metal.
[60]
Order of Precedence in Locating Executable Files (35284) - XS4ALL
COMMAND.COM looks in the following order for an executable file that has this name: .COM .EXE .BAT If COMMAND.COM cannot find this file in the current ...
[61]
Incorrect EXECUTABLE priority - DOSBox - VOGONS
Aug 31, 2003 · If a .COM file or a .EXE file is found, COMMAND.COM uses the MS-DOS EXEC function to load and execute it. The EXEC function builds a special ...
[62]
[PDF] Chapter 3: Using DOS Commands - Higher Education | Pearson
Sep 12, 2001 · When you type a command that COMMAND.COM does not recognize as one of its internal commands, COMMAND.COM responds by looking on disk for a ...
[63]
DOS APPEND | OS/2 Museum
Dec 20, 2024 · It is a TSR which allows applications to find files in a directory other than the current one. The first appearance in APPEND was in the IBM PC ...
[64]
Jerusalem | F-Secure
The Jerusalem virus is one of the oldest and most common viruses around. As a result there are numerous variants of it. It will infect both .EXE and .COM files.
[65]
PE Format - Win32 apps - Microsoft Learn
Jul 14, 2025 · After the MS-DOS stub, at the file offset specified at offset 0x3c, is a 4-byte signature that identifies the file as a PE format image file.
[66]
Lesser known tricks of spoofing extensions | Malwarebytes Labs
Lesser known tricks of spoofing extensions. Posted: September 30, 2016 by Malwarebytes Labs. It is a well-known fact that malware using social engineering ...
[67]
Masquerading: Double File Extension, Sub-technique T1036.007
The user may then view it as a benign text file and open it, inadvertently executing the hidden malware. Common file types, such as text files (.txt, .doc, etc.) ...
[68]
13 Scariest Computer Viruses of All Time - Etactics
Oct 5, 2023 · This DOS file infector was first launched in 1988 to celebrate the 40th anniversary of the creation of the Jewish state. The virus was scheduled ...
[69]
Report Shows Increase in Email Attacks Using .com File Extensions
Nov 15, 2018 · Anti-phishing firm Cofense has discovered an uptick in the use of .com file extensions in phishing email attacks.
[70]
Masquerading: Right-to-Left Override, Sub-technique T1036.002
Feb 10, 2020 · Adversaries may abuse the right-to-left override (RTLO or RLO) character (U+202E) to disguise a string and/or file name to make it appear benign.
[71]
The RTLO method | Malwarebytes Labs
Jan 9, 2014 · RTLO can be used to spoof fake extensions. Learn how malware writers are using this old trick to spread malware.
[72]
From Legacy Systems to 5G: Enterprise Security Threats in 2025
Feb 28, 2025 · From Legacy Systems to 5G: Enterprise Security Threats in 2025 ... Safeguarding Critical Supply Chain Data Through Effective Risk Assessment.
[73]
https://www.infosecurity-magazine.com/opinions/legacy-systems-5g-enterprise/