The Program Segment Prefix (PSP) is a 256-byte data structure that MS-DOS creates automatically at the beginning of a program's memory block when the program is loaded. It stores essential runtime information about the program, such as its allocated memory limit, saved interrupt vectors, a pointer to its environment variables, command-line arguments, and default file control blocks.[1][2] The structure gives the operating system hooks for termination, error handling, and resource management, and it preserves compatibility with earlier systems such as CP/M through elements like the INT 20h exit instruction at its start.[3][1]

Introduced with MS-DOS version 1.0 in 1981, the PSP evolved across DOS versions to support more complex process and memory management, with DOS 2.0 adding an environment block pointer and later versions such as DOS 3.0 incorporating file handle arrays for better process isolation.[2] Its segment address is passed to .COM programs in the DS register on entry, and both .COM and .EXE programs can retrieve it at run time with DOS interrupt 21h function AH=51h, which lets programs, especially those written in assembly language, inspect and modify their own state.[3] The PSP's design reflects DOS's segmented memory model: it occupies the first 256 bytes (16 paragraphs) of the program's block, aligned to a paragraph boundary, immediately preceding the program's code or data.[1]

Although largely obsolete in modern operating systems, the PSP remains a foundational concept for understanding early PC software architecture, and it influenced how subsequent operating systems handle process environments and command parsing.[3] Programs could modify certain PSP fields, such as updating the termination vector before spawning child processes, to chain execution in a task-switching environment.[2]
Overview
Definition and Purpose
The Program Segment Prefix (PSP) is a 256-byte data structure in MS-DOS and compatible systems, positioned at the base of a program's allocated memory segment to store essential metadata required for program execution and interaction with the operating system.[4] It serves as a control block that encapsulates program-specific information, including the program's memory allocation size, command-line arguments, default file control blocks (FCBs), and the disk transfer area (DTA).[5] This structure enables DOS to manage individual programs effectively within its memory model.

The primary purposes of the PSP include storing the operational state of a loaded program, such as a pointer to the environment block and unformatted parameter data, to support runtime access and modification.[4] It facilitates orderly program termination through dedicated exit vectors and interrupt handlers, allowing DOS to reclaim resources via mechanisms like Interrupt 20H or Function 4CH of Interrupt 21H.[5] Additionally, the PSP stores addresses for critical interrupts, such as those for termination (Interrupt 22H), control-C handling (Interrupt 23H), and fatal error recovery (Interrupt 24H), ensuring proper system responses during execution.[6] These functions collectively provide compatibility with CP/M-style operations, where the PSP resembles the Zero Page used for similar program control in the earlier system.[4]

DOS creates the PSP automatically prior to program execution, typically through the EXEC system call (Function 4BH of Interrupt 21H), which initializes the structure in the lowest part of the program's segment by copying relevant data from system tables.[5] In the single-tasking memory model of DOS, the PSP plays a key role in isolating each program's environment, maintaining separate state and resources to prevent interference despite the absence of hardware-enforced multitasking.[4] This isolation is vital for stability, as it allows DOS to track and restore program-specific configurations upon loading or termination.
Relation to Operating System Processes
In MS-DOS, each executed program, whether a .COM or .EXE file, is allocated its own Program Segment Prefix (PSP), which serves as a process descriptor defining the program's memory boundaries and managing its file handles.[7] This structure lets the operating system isolate and track individual program instances within the limited real-mode memory environment.[3]

The PSP plays a central role in DOS's resource management by recording the program's allocated memory, its open files through the Job File Table (JFT), and a pointer to its environment block, which helps avoid conflicts between a program and the processes it launches.[8] DOS does not preempt running programs: control changes hands only when a program issues an interrupt or a termination call, and the PSP supports resource sharing and cleanup at those points.[3]

Upon program loading, the DOS loader sets the DS (data segment) register to the PSP's segment address, enabling the program to access its own process state directly for operations such as memory allocation queries or file handle management.[8] This setup is essential for self-referential process handling in DOS's non-preemptive environment.

A key feature of the PSP is its support for chain-loading programs through the EXEC function (INT 21h AH=4Bh): DOS initializes a new PSP for the loaded program and selectively passes the parent's details to it, including the parent's segment address.[8] The child's PSP stores this reference to its parent so that control can return and resources can be released when the child terminates, which allows hierarchical process execution.[3] For instance, command-line parameters stored in the PSP can be passed during such loads to configure the new process.[3]
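The parent reference described above can be followed like a linked list. The sketch below is an illustration in Python rather than DOS code: it models conventional memory as a byte array and assumes the parent's segment is stored as a word at PSP offset 16h, the offset given in this article's description of program loading. The segment values, the `parent_chain` helper, and the convention that the top-level shell records itself as its own parent are assumptions made for the example.

```python
import struct

PARENT_PSP_OFFSET = 0x16  # word holding the parent's PSP segment (assumed offset)


def parent_chain(memory: bytes, psp_segment: int, limit: int = 32) -> list:
    """Follow parent links from one PSP up to the top-level shell."""
    chain = [psp_segment]
    while len(chain) < limit:
        linear = (chain[-1] << 4) + PARENT_PSP_OFFSET
        (parent,) = struct.unpack_from("<H", memory, linear)
        if parent in (0, chain[-1]):  # no parent recorded, or a self-parented shell
            break
        chain.append(parent)
    return chain


# Hypothetical 1 MiB memory image holding three PSPs: shell -> parent -> child.
memory = bytearray(0x100000)
for segment, parent in ((0x0800, 0x0800), (0x1000, 0x0800), (0x2000, 0x1000)):
    struct.pack_into("<H", memory, (segment << 4) + PARENT_PSP_OFFSET, parent)

chain = parent_chain(memory, 0x2000)
print([f"{seg:04X}h" for seg in chain])
```

The `limit` argument is only a guard against a corrupted image whose links form a cycle.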
History
Origins in CP/M
The Program Segment Prefix (PSP) in MS-DOS traces its conceptual origins to the Zero Page of CP/M-80, a 256-byte fixed memory area at the base of the system's address space designed to store essential program metadata and facilitate communication with the operating system. Developed by Gary Kildall in 1973–1974 as part of the first prototype of CP/M (Control Program for Microcomputers), this structure provided a standardized interface for 8-bit microcomputers, enabling efficient system calls and program execution without requiring complex memory management.[9][10][11]

In CP/M, the Zero Page included key fields for program parameters, such as the command tail buffer starting at offset 80h, which held the length of the command line (one byte) followed by the actual characters, allowing programs to receive arguments from the console command processor. It also reserved space for default File Control Blocks (FCBs) at offsets 5Ch and 6Ch, which stored file names, extents, and allocation information for disk operations, supporting up to two default files without additional setup. These elements formed a compact repository for transient program data, ensuring that applications could access operating system services through simple register-based calls, such as using register C for the function number and DE for parameter addresses like FCB pointers.[11]

The Zero Page's design emphasized simplicity and portability, particularly through its support for program relocation: transient programs loaded into the Transient Program Area (TPA) starting at offset 100h could reference the fixed Zero Page locations for parameters, enabling easy movement of code without address adjustments. This mechanism for parameter passing and metadata storage directly influenced later systems seeking similar functionality on new hardware, as it minimized overhead in resource-constrained 8-bit environments.[11][12]

CP/M's dominance on microcomputers from the mid-1970s through the early 1980s, powering thousands of 8080- and Z80-based systems in business and hobbyist applications, underscored the need for software portability when the IBM PC arrived in 1981. Its widespread adoption created a large ecosystem of applications, prompting early versions of PC-DOS (MS-DOS 1.0) to incorporate a CP/M-compatible interface, including Zero Page-like structures, to facilitate the migration of existing CP/M software to the 8086 architecture.[10][13][14]
Evolution in MS-DOS Versions
The Program Segment Prefix (PSP) in early MS-DOS versions, such as 1.x released in 1981, served as a basic 256-byte structure primarily designed for .COM file execution, incorporating an INT 20h instruction at offset 00h for program termination and simple memory allocation tracking via the segment address at offset 02h.[15] This foundational layout emphasized compatibility with CP/M's zero page, limiting features to essential process state management without support for advanced file systems or environment handling.[15]

With MS-DOS 2.x in 1983, the PSP evolved to accommodate hierarchical directories and environment variables, adding a pointer to the environment block at offset 2Ch.[15] These changes supported subdirectory navigation and variable inheritance for child processes, and the rudimentary INT 20h termination was supplemented by the more robust INT 21h function 4Ch, which ensured proper memory release and vector restoration. The critical error handler vector (INT 24h) at offsets 12h–15h had been introduced earlier, in version 1.10.[15][16] This update marked a shift from CP/M-like simplicity toward enhanced operating system capabilities, including preliminary support for multitasking extensions such as those in DESQview, which relied on the expanded PSP for process coordination.[15]

In MS-DOS 3.x through 6.x (1984–1993), the PSP further expanded to handle increased complexity: the file handle table could grow from its default of 20 entries to a maximum of 255, enabled by INT 21h function 67h (available since DOS 3.3) for adjusting the handle count.[15] DOS 3.0 also introduced INT 21h function 62h to retrieve the current PSP segment address (the earlier function 51h had been available since version 2.0), facilitating programmatic access for memory and process management.[17] By DOS 5.0 in 1991, reserved areas in the PSP supported upper memory block (UMB) integration via improved allocation strategies, allowing programs to utilize high RAM more efficiently without altering the core structure.[18] These developments reflected a broader transition to supporting larger file systems and limited multitasking, while maintaining backward compatibility for legacy .COM programs.[15]
Structure
Overall Layout
The Program Segment Prefix (PSP) is a fixed 256-byte (offsets 00h to FFh) data structure that forms the initial portion of a program's memory allocation in MS-DOS.[1] It serves as a header for the program's memory block, providing essential control and state information to the operating system and the executing program.[3] This consistent size ensures compatibility across DOS versions, allowing programs to reliably access standardized locations within the structure.[19]

In terms of memory positioning, the PSP resides at offset 00h within the program's allocated segment, immediately preceding the program's code and data. For .COM files, which load directly into memory without relocation, the executable code begins at offset 100h, leaving the preceding 256 bytes dedicated to the PSP.[3] Upon program loading, MS-DOS places the segment address of the PSP in both the DS and ES registers, for .COM and .EXE programs alike, so the program can reference this block directly without additional queries.[1]

The PSP's general organization divides it into fixed regions that separate distinct categories of data. Offsets 00h–0Fh contain termination codes and related instructions for program exit. Offsets 10h–2Fh are reserved for DOS-internal use and system pointers. Offsets 50h–7Fh handle file control blocks for initial program parameters. Finally, offsets 80h–FFh store the command tail, including the length and arguments passed to the program at invocation.[1] This layout facilitates efficient access to critical runtime information while maintaining a compact footprint.[19]
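The positioning described above follows from real-mode address arithmetic, in which a linear address is the segment value times 16 plus the offset. The short Python sketch below works through that arithmetic; the segment value 1234h and the helper name `linear_address` are chosen purely for illustration.

```python
PSP_SIZE = 0x100  # 256 bytes, i.e. 16 paragraphs of 16 bytes each


def linear_address(segment: int, offset: int) -> int:
    """Real-mode 8086 address: segment * 16 + offset, wrapped to 20 bits."""
    return ((segment << 4) + offset) & 0xFFFFF


psp_segment = 0x1234                                # hypothetical load segment
psp_base = linear_address(psp_segment, 0x00)        # first byte of the PSP
tail_start = linear_address(psp_segment, 0x80)      # command tail length byte
com_entry = linear_address(psp_segment, 0x100)      # first byte of .COM code

# The same entry point expressed with a segment 10h paragraphs higher.
same_entry = linear_address(psp_segment + 0x10, 0x00)

print(f"{psp_base:05X} {tail_start:05X} {com_entry:05X} {same_entry:05X}")
```

Because 256 bytes equal 10h paragraphs, the byte at offset 100h in the PSP segment is the same byte as offset 00h in the segment 10h higher, which is why loaders can speak of code starting "one PSP after" the load segment.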
Key Fields and Offsets
The Program Segment Prefix (PSP) consists of a fixed 256-byte layout in memory, with key fields positioned at specific hexadecimal offsets to store essential program metadata, interrupt vectors, and compatibility structures. These fields facilitate program termination, memory management, error handling, and parameter passing, while reserved areas ensure compatibility and internal DOS operations. The structure's design draws from CP/M conventions, particularly in its file handling components.[20][1]

At offset 00h, the first two bytes contain the INT 20h instruction (opcode CDh followed by 20h), which serves as the program's termination entry point; executing it returns control to DOS. Immediately following, at offset 02h, a 2-byte word holds the segment address marking the top of the memory allocated to the program, the boundary beyond which the program's heap or stack must not extend.[20][1]

Offsets 0Ah through 15h hold 12 bytes for three double-word interrupt vectors: the termination handler at 0Ah (for INT 22h), the Ctrl-C (Ctrl-Break) handler at 0Eh (for INT 23h), and the critical error handler at 12h (for INT 24h). These vectors point to routines that manage program exit, user interrupt responses, and device error recovery, respectively, allowing DOS to chain handlers during process execution.[20][21]
Further into the structure, offset 2Ch holds a 2-byte word specifying the segment address of the associated environment block, which contains variables like PATH and COMSPEC for the process. The fields at offsets 5Ch through 7Fh (totaling 36 bytes) house the default File Control Blocks (FCBs), with the 16 bytes at 5Ch storing parsed details from the first command-line parameter (such as drive, filename, and extension) and the 20 bytes at 6Ch handling the second parameter. Both start out as unopened FCBs; because a fully opened FCB is larger than the space reserved at 5Ch, opening the first one in place overwrites the second. These FCBs emulate CP/M-style file handling by providing a fixed-format block for filenames, attributes, and access modes, supporting up to two default files without requiring full FCB construction; this compatibility allows legacy CP/M programs to interface with DOS file operations.[1]

At offset 80h, a single byte indicates the length of the command-line tail (up to 127 bytes), followed from 81h by the command-line string itself, terminated with 0Dh. The same 128 bytes at 80h–FFh also serve as the default Disk Transfer Area (DTA), so file searches via INT 21h functions overwrite the command tail unless the program copies the tail or moves the DTA first. Several areas within the PSP are set aside for DOS internal use, such as offsets 16h through 2Bh (22 bytes, which DOS uses for the parent PSP segment and the default job file table) and 2Eh through 4Fh (34 bytes, version-specific and typically unused by applications); applications are expected to leave these regions to the system.[20][1][21]
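The offsets described in this section can be exercised on a captured 256-byte PSP image. The following Python sketch is a minimal illustration, not a complete decoder: `parse_psp` and `make_sample_psp` are hypothetical helper names, the sample values are invented, and only the fields discussed above are read.

```python
import struct


def parse_psp(psp: bytes) -> dict:
    """Decode a few documented fields from a 256-byte PSP image."""
    if len(psp) != 256:
        raise ValueError("a PSP image is exactly 256 bytes")
    int20, memory_top = struct.unpack_from("<2sH", psp, 0x00)
    # Three far pointers at 0Ah-15h, each stored as offset word then segment word.
    t_off, t_seg, c_off, c_seg, e_off, e_seg = struct.unpack_from("<6H", psp, 0x0A)
    (env_segment,) = struct.unpack_from("<H", psp, 0x2C)
    tail_length = psp[0x80]
    return {
        "int20": int20,
        "memory_top_segment": memory_top,
        "terminate_vector": (t_seg, t_off),        # INT 22h
        "ctrl_c_vector": (c_seg, c_off),           # INT 23h
        "critical_error_vector": (e_seg, e_off),   # INT 24h
        "environment_segment": env_segment,
        "fcb1": psp[0x5C:0x6C],                    # 16 bytes
        "fcb2": psp[0x6C:0x80],                    # 20 bytes
        "command_tail": psp[0x81:0x81 + tail_length],
    }


def make_sample_psp() -> bytes:
    """Build an invented PSP image for demonstration purposes."""
    psp = bytearray(256)
    psp[0x00:0x02] = b"\xCD\x20"                         # INT 20h
    struct.pack_into("<H", psp, 0x02, 0x9FFF)            # top of memory
    struct.pack_into("<HH", psp, 0x0A, 0x0123, 0x0F00)   # INT 22h offset, segment
    struct.pack_into("<H", psp, 0x2C, 0x0ABC)            # environment segment
    tail = b" /V FILE.TXT"
    psp[0x80] = len(tail)
    psp[0x81:0x81 + len(tail)] = tail
    psp[0x81 + len(tail)] = 0x0D                         # carriage return
    return bytes(psp)


fields = parse_psp(make_sample_psp())
print(fields["command_tail"], hex(fields["environment_segment"]))
```

All multi-byte values are little-endian, matching the 8086, which is why the format strings begin with `<`.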
Usage in DOS
Program Loading and Initialization
When MS-DOS loads a program via the EXEC function (INT 21h, AH=4Bh), it follows a structured sequence to allocate memory and establish the Program Segment Prefix (PSP) at the base of the allocated block. The caller first ensures adequate free memory, releasing unused blocks with INT 21h AH=4Ah if necessary, and then provides a parameter block specifying the environment segment, command-line tail, and file control blocks (FCBs). DOS allocates a paragraph-aligned memory block large enough for the program image plus the PSP, typically starting from the lowest available address. It then invokes INT 21h AH=26h to create the new PSP at segment DX, initializing core fields by copying select data from the parent's PSP: the INT 20h termination instruction (CDh 20h) at offset 00h, the top-of-memory segment at offset 02h based on the allocated size, the parent PSP segment at offset 16h set to the caller's PSP, and the interrupt vectors for 22h (terminate), 23h (Ctrl-Break), and 24h (critical error) from the current interrupt vector table to offsets 0Ah, 0Eh, and 12h respectively.[22][20][23]

Following PSP creation, DOS populates additional fields from the EXEC parameter block to handle parameter passing and environment setup. If an environment segment is specified (non-zero at offset 00h of the parameter block), the child receives a copy of that environment and the copy's segment is stored in the child's PSP at offset 2Ch; otherwise, the child receives a copy of the parent's environment. The command line is stored starting at offset 80h: a length byte (up to 127 characters) at 80h, followed by the argument string at 81h, terminated by a carriage return (0Dh). The first and second FCBs, supplied through pointers in the parameter block, are copied to offsets 5Ch and 6Ch, with unparsed filename fields left filled with spaces. During this EXEC process, DOS thus passes selected state, such as the environment pointer, command tail, and FCBs, from parent to child.[22][23][21]

The loading differs for .COM and .EXE formats after PSP initialization. For .COM files, DOS reads the entire file into memory starting at offset 100h within the PSP segment and sets the registers for execution: CS=DS=ES=SS equal to the PSP segment, IP=100h, and SP=FFFEh, which places the stack at the top of the 64 KB segment. For .EXE files, DOS first reads the executable header into a temporary buffer to verify the MZ/ZM signature and locate the relocation table; it then loads the image at the paragraph immediately after the PSP (PSP segment plus 10h, accounting for the 256-byte PSP), adds this start segment to each segment reference listed in the relocation table, and sets CS:IP and SS:SP from the header values relocated in the same way before transferring control. When a program loads an overlay (EXEC with AL=03h), DOS builds no new PSP: the overlay is read into memory supplied by the caller and runs under the caller's PSP, sharing its environment and handles.[22][23][21]
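The .COM case above is simple enough to model directly. The Python sketch below is a simplified simulation, assuming a 1 MiB byte array stands in for real-mode memory; `load_com` and `exe_load_segment` are hypothetical names, and the sketch fills in only the INT 20h bytes of the PSP rather than every field DOS initializes.

```python
PSP_PARAGRAPHS = 0x10  # 256 bytes / 16 bytes per paragraph


def load_com(memory: bytearray, psp_segment: int, image: bytes) -> dict:
    """Copy a .COM image to offset 100h and return the entry register values."""
    if len(image) > 0x10000 - 0x100:
        raise ValueError("a .COM image must fit in one 64 KB segment after the PSP")
    base = psp_segment << 4
    memory[base:base + 2] = b"\xCD\x20"  # INT 20h at PSP offset 00h
    memory[base + 0x100:base + 0x100 + len(image)] = image
    return {"CS": psp_segment, "DS": psp_segment, "ES": psp_segment,
            "SS": psp_segment, "IP": 0x100, "SP": 0xFFFE}


def exe_load_segment(psp_segment: int) -> int:
    """Segment at which an .EXE image starts: the paragraph right after the PSP."""
    return psp_segment + PSP_PARAGRAPHS


memory = bytearray(0x100000)       # 1 MiB of simulated real-mode memory
image = b"\xB4\x4C\xCD\x21"        # MOV AH,4Ch / INT 21h (terminate process)
regs = load_com(memory, 0x2000, image)
print(regs, hex(exe_load_segment(0x2000)))
```

The size check here is deliberately loose; a real loader must also leave room for the stack at the top of the segment.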
Accessing PSP Data Programmatically
In MS-DOS, programs can directly access the Program Segment Prefix (PSP) data by using the DS segment register, which DOS initializes to point to the PSP segment upon program loading.[3] This allows reading fields via offset addressing without additional setup; for instance, the assembly instruction MOV AL, DS:[80h] retrieves the byte at offset 80h, which holds the length of the command tail (limited to a maximum of 127 characters).[21] The command tail itself, starting at offset 81h and extending no further than FFh (terminated by a carriage return, 0Dh), stores the program's arguments for parsing during execution.[24]

If the PSP segment address is needed later in execution, or in environments where DS may have changed, DOS provides interrupt services via INT 21h. For DOS 2.0 and later, function AH=51h returns the current PSP segment address in the BX register.[25] In DOS 3.0 and later, function AH=62h offers the same retrieval, also placing the PSP segment in BX, and is recommended for compatibility in higher versions.[26] Once obtained, the segment can be loaded into DS (e.g., MOV DS, BX) to access offsets as described.

Programs may also modify certain PSP fields during runtime to manage resources. For example, in DOS 3.0 and later, the file handle table is reached through the DWORD pointer at offset 34h, which by default points to offset 18h within the PSP, where 20 one-byte entries span offsets 18h to 2Bh (a value of FFh marks a closed handle); a program can relocate the table to a larger buffer by copying the entries and updating the pointer and the associated handle count.[21] Similarly, the WORD at offset 2Ch holds the segment of the program's environment, so environment variables can be read by dereferencing this value and scanning the block for null-terminated strings.[24]

To spawn a child process, a program normally calls INT 21h AH=4Bh, which builds the child's PSP itself from the command tail and FCB pointers supplied in the EXEC parameter block; a program that loads code by hand can instead create a PSP with the undocumented INT 21h AH=55h (DOS 2.0+), passing the new PSP segment in DX.[27] Either way, the child inherits relevant state, such as open files from the JFT, while the parent regains control upon child termination.[28]
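The three reads discussed above (command tail, default handle table, environment block) can be sketched in Python over raw byte images. This is an illustration under stated assumptions rather than DOS code: the helper names and sample data are hypothetical, the handle entries are invented, and the environment parser assumes the usual layout of NAME=value strings, each ending in a zero byte, with an additional zero byte closing the list.

```python
def command_tail(psp: bytes) -> bytes:
    """Return the argument text stored after the length byte at offset 80h."""
    length = psp[0x80]
    return psp[0x81:0x81 + length]


def environment_segment(psp: bytes) -> int:
    """Read the little-endian word at offset 2Ch."""
    return int.from_bytes(psp[0x2C:0x2E], "little")


def open_handles(psp: bytes) -> list:
    """List handle numbers whose default table entry (18h-2Bh) is not FFh."""
    return [n for n, entry in enumerate(psp[0x18:0x2C]) if entry != 0xFF]


def parse_environment(block: bytes) -> dict:
    """Split an environment block into NAME -> value pairs."""
    variables = {}
    pos = 0
    while pos < len(block) and block[pos] != 0:   # an empty string ends the list
        end = block.index(0, pos)
        name, _, value = block[pos:end].decode("ascii").partition("=")
        variables[name] = value
        pos = end + 1
    return variables


# Invented sample data standing in for memory captured from a DOS session.
psp = bytearray(256)
psp[0x18:0x2C] = bytes([1, 1, 1, 0, 2]) + b"\xFF" * 15   # five open handles
psp[0x2C:0x2E] = (0x0ABC).to_bytes(2, "little")
tail = b" REPORT.TXT"
psp[0x80] = len(tail)
psp[0x81:0x81 + len(tail)] = tail
psp[0x81 + len(tail)] = 0x0D

env_block = b"\x00".join([b"PATH=C:\\DOS", b"COMSPEC=C:\\COMMAND.COM"]) + b"\x00\x00"

print(command_tail(psp), open_handles(psp), parse_environment(env_block))
```

In a real program the environment block would be read from the segment returned by `environment_segment`, not from a separate variable as in this sample.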
Legacy and Modern Context
Changes in Later DOS Versions
In MS-DOS 7.0, released in 1995 and bundled with Windows 95, the file handle table associated with the PSP could be dynamically expanded using Interrupt 21h function 67h (available since DOS 3.3), allowing applications to increase the maximum number of open files from the default of 20 up to 65,535 handles per process, improving resource management for larger programs. Additionally, support for long filenames (up to 255 characters) was introduced through the Installable File System (IFS) framework, specifically via the VFAT driver, which extended the traditional 8.3 naming convention while maintaining backward compatibility for DOS applications.[29][30]

During the Windows 9x era (1995–2000), the PSP was retained primarily for 16-bit applications executed in virtual 8086 (V86) mode, ensuring DOS compatibility within the hybrid environment. Environment block handling, traditionally reached through the segment stored at offset 2Ch in the PSP, was increasingly redirected to Win32 APIs like GetEnvironmentStrings for 32-bit processes while being preserved for legacy 16-bit code.

In MS-DOS 8.0, included with Windows Millennium Edition in 2000, the PSP became largely obsolete for native applications, as the operating system emphasized Win32 subsystems and restricted real-mode DOS access to recovery modes. Despite this deprecation, the PSP structure was preserved in DOS compatibility environments to support legacy 16-bit and command-line operations, maintaining essential fields like file handles and termination vectors.

A notable evolution across these versions was the incorporation of internationalization features, managed via Interrupt 21h functions like 38h for country information, providing support for multiple code pages, such as 865 (Nordic), 912 (Central European), and 915 (Cyrillic), to facilitate National Language Support (NLS).[29][30]
Role in Emulation and Virtualization
Emulators such as DOSBox and PCem faithfully recreate the Program Segment Prefix (PSP) to enable the execution of legacy .COM and .EXE files on modern hardware. In DOSBox, a highly accurate x86 and DOS kernel emulator, the PSP is implemented through functions like DOS_NewPSP, which allocates the structure at a specified low memory segment and populates it with essential fields such as the command tail, file control blocks (FCBs), and interrupt vectors for handling calls like INT 20h (program termination) and INT 21h (DOS services). This ensures that programs receive the correct PSP offsets via the DS and ES segment registers upon loading, mimicking real DOS behavior for interrupt-driven operations.[31]

PCem, a cycle-accurate hardware emulator for IBM PC compatibles, handles the PSP indirectly through its emulation of the 8086/8088 processor and memory system, allowing an authentic instance of MS-DOS to manage PSP creation and access during program loading. This approach preserves the original 8086 segmented memory model, where the PSP occupies the first 256 bytes of a program's allocated block, facilitating precise mapping of offsets for environment variables, parent process links, and device information without software-level intervention from the emulator itself.[32]

In virtualization platforms like VMware and VirtualBox, 16-bit DOS guests rely on the PSP for memory isolation and process management, as the guest OS allocates and populates the structure to separate program environments within the virtualized 8086 address space. Similarly, Microsoft's NT Virtual DOS Machine (NTVDM) in Windows NT and later 32-bit systems emulates the PSP to support 16-bit DOS and Windows 3.x applications, integrating it with the host's protected mode via compatibility interfaces like the CALL 5 mechanism at PSP offset 5 for DOS API interception. This emulation translates legacy DOS calls, including those accessing PSP fields, to Win32 equivalents while maintaining isolation for multiple DOS sessions.[33][34]

The PSP's design, rooted in the 8086 memory model, allows these tools to virtualize CP/M-derived software on x86-64 architectures by simulating segmented addressing and providing FCBs in the PSP for file operations, with emulators often translating these to modern argc/argv-style arguments during command tail setup for host integration. For instance, DOSBox adjusts the PSP command tail (offset 80h) to parse inputs from the host operating system, ensuring compatibility without altering the program's expectations.[31]
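The host-to-guest direction of that translation can be illustrated with a short Python sketch that packs a host-style argument list into the 128-byte area corresponding to PSP offsets 80h–FFh. It is not DOSBox's implementation: `build_command_tail` is a hypothetical helper, and the 126-character cap is a conservative assumption made here so that the 0Dh terminator always fits after the text.

```python
def build_command_tail(args: list) -> bytes:
    """Return 128 bytes laid out like PSP offsets 80h-FFh."""
    # Each argument is preceded by a space, as typed after the program name.
    text = "".join(" " + arg for arg in args).encode("ascii")
    if len(text) > 126:                    # assumed cap: leave room for 0Dh
        raise ValueError("command tail too long for the PSP")
    area = bytearray(128)
    area[0] = len(text)                    # length byte at offset 80h
    area[1:1 + len(text)] = text           # argument text from offset 81h
    area[1 + len(text)] = 0x0D             # carriage-return terminator
    return bytes(area)


area = build_command_tail(["/V", "FILE.TXT"])
print(area[0], area[1:1 + area[0]])
```

The length byte counts only the argument text, not the terminating carriage return, which matches the layout described for offset 80h earlier in this article.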