
Program Segment Prefix

The Program Segment Prefix (PSP) is a 256-byte data structure that MS-DOS creates at the beginning of a program's memory block when the program is loaded. It stores essential runtime information about the program: the limit of its allocated memory, saved interrupt vectors, the segment of its environment block, the command-line arguments, and the default file control blocks. The structure gives the operating system hooks for termination, error handling, and resource management, and it preserves compatibility with earlier systems like CP/M through elements such as the INT 20h instruction at offset 00h, which exits the program.

Introduced with version 1.0 in 1981, the PSP grew across versions: 2.0 added the environment block pointer, and 3.0 added fields that let a program relocate and enlarge its file handle table. On entry, DOS passes the PSP segment address in the DS and ES registers for both .COM and .EXE programs. It can also be retrieved at run time with interrupt 21h function AH=51h, so that programs, especially assembly-language ones, can inspect and modify their own state. The design reflects the 8086's segmented memory model: the PSP occupies the first 256 bytes (16 paragraphs) of the program's block, aligned to a paragraph boundary, immediately preceding the program's code or data.

Programs may modify certain PSP fields, for example updating the saved termination address before spawning a child process so that execution can be chained. Although largely obsolete in modern operating systems, the PSP remains a foundational concept for understanding early PC software, and it influenced how later systems present environments and command lines to programs.

Overview

Definition and Purpose

The Program Segment Prefix (PSP) is a 256-byte data structure in MS-DOS and compatible systems, positioned at the base of a program's allocated segment to store essential information required for program execution and interaction with the operating system. It serves as a control block that encapsulates program-specific information, including the program's memory allocation size, command-line arguments, default file control blocks (FCBs), and the default disk transfer area (DTA). This structure enables DOS to manage individual programs effectively within its memory model.

The primary purposes of the PSP include storing the operational state of a loaded program, such as a pointer to the environment block and unformatted parameter data, to support run-time access and modification. It facilitates orderly termination through a dedicated exit instruction and saved handler addresses, allowing DOS to reclaim resources when the program ends via Interrupt 20h or Function 4Ch of Interrupt 21h. Additionally, the PSP stores the previous contents of critical interrupt vectors, namely those for termination (Interrupt 22h), Control-C handling (Interrupt 23h), and critical error recovery (Interrupt 24h), so that DOS can restore them when the program exits. These features also provide compatibility with CP/M-style operations: the PSP resembles the Zero Page used for similar control in the earlier CP/M.

DOS creates the PSP automatically prior to program execution, typically through the EXEC system call (Function 4Bh of Interrupt 21h), which initializes the structure in the lowest part of the program's segment by copying relevant data from system tables and from the caller. In the single-tasking memory model of DOS, the PSP keeps each program's state and resources separate, although nothing in real mode enforces this separation in hardware. This bookkeeping is vital for stability, as it allows DOS to track program-specific configuration on loading and to restore the parent's state on termination.

Relation to Operating System Processes

In MS-DOS, each executed program, whether a .COM or .EXE file, is allocated its own Program Segment Prefix (PSP), which serves as a process descriptor defining the program's memory boundaries and managing its associated handles. This structure lets the operating system identify and track individual program instances within the limited real-mode environment.

The PSP plays a central role in DOS's resource management by recording the extent of allocated memory, open files through the Job File Table (JFT), and a pointer to the environment block. DOS itself runs one program at a time: a parent is suspended while its child runs and resumes when the child terminates. The PSP gives DOS what it needs to release the child's resources and return control without any preemptive scheduler.

Upon program loading, the DOS loader sets the DS (data segment) and ES (extra segment) registers to the PSP's segment address, enabling the program to read its own PSP directly for operations such as checking its memory limit or parsing its command line.

A key feature of the PSP is its support for chain-loading programs through the EXEC function (INT 21h AH=4Bh). DOS builds a new PSP for the loaded program and records the parent's PSP segment in it, so that the child's termination returns control to the parent and the child's resources are deallocated properly. This allows hierarchical process execution. For instance, the command-line parameters passed during such a load are stored in the child's PSP to configure the new process.
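The parent link described above can be followed like a linked list. The sketch below is a minimal illustration in Python over a made-up 1 MiB memory image, not code that runs under DOS. It assumes the parent PSP segment is the word at offset 16h, and it assumes the outermost command interpreter names itself as its own parent, which is a commonly observed convention rather than something this article states. The segment values are hypothetical.

```python
import struct

PARENT_PSP_OFFSET = 0x16  # WORD: segment of the parent's PSP


def linear(segment, offset=0):
    """Real-mode linear address: segment * 16 + offset."""
    return (segment << 4) + offset


def parent_chain(memory, psp_segment, limit=32):
    """Follow parent links until a PSP names itself (assumed root marker)."""
    chain = [psp_segment]
    while len(chain) < limit:  # guard against corrupt, cyclic links
        (parent,) = struct.unpack_from(
            "<H", memory, linear(chain[-1], PARENT_PSP_OFFSET)
        )
        if parent == chain[-1] or parent == 0:
            break
        chain.append(parent)
    return chain


# Toy memory image with three hypothetical PSPs:
# shell at 0A00h (its own parent), child at 1000h, grandchild at 2000h.
memory = bytearray(1 << 20)
for seg, parent in ((0x0A00, 0x0A00), (0x1000, 0x0A00), (0x2000, 0x1000)):
    struct.pack_into("<H", memory, linear(seg, PARENT_PSP_OFFSET), parent)

chain = parent_chain(memory, 0x2000)
print([f"{seg:04X}" for seg in chain])
```

The `limit` argument is a defensive choice for the sketch: a damaged image could contain a cycle longer than one element.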

History

Origins in CP/M

The Program Segment Prefix (PSP) in MS-DOS traces its conceptual origins to the Zero Page of CP/M-80, a 256-byte fixed memory area at the base of the system's address space designed to store essential program metadata and facilitate communication with the operating system. Developed by Gary Kildall in 1973–1974 as part of the first prototype of CP/M (Control Program for Microcomputers), this structure provided a standardized interface for 8-bit microcomputers, enabling simple system calls and program execution.

In CP/M, the Zero Page included key fields for program parameters, such as the command tail buffer starting at offset 80h, which held the length of the command line (one byte) followed by the actual characters, allowing programs to receive arguments from the console command processor. It also reserved space for default File Control Blocks (FCBs) at offsets 5Ch and 6Ch, which stored file names, extents, and allocation information for disk operations, supporting up to two default files without additional setup. These elements formed a compact repository for transient program data, and applications reached operating system services through simple register-based calls, using register C for the function number and DE for parameter addresses such as FCB pointers.

The Zero Page's design emphasized simplicity and portability. Because every transient program was loaded into the Transient Program Area (TPA) starting at offset 100h and found its parameters at the same fixed Zero Page locations, code needed no address adjustments to run on different machines. This mechanism for parameter passing and storage directly influenced later systems seeking similar functionality on new hardware, as it minimized overhead in resource-constrained 8-bit environments.
CP/M's dominance on microcomputers from the mid-1970s through the early 1980s, powering thousands of 8080- and Z80-based systems in business and hobbyist applications, underscored the need for software portability when the IBM PC arrived in 1981. Its widespread adoption created a large ecosystem of applications, prompting early versions of PC-DOS (MS-DOS 1.0) to incorporate a CP/M-compatible interface, including Zero Page-like structures, to facilitate the migration of existing CP/M software to the 8086 architecture.

Evolution in MS-DOS Versions

The Program Segment Prefix (PSP) in early MS-DOS versions, such as 1.x released in 1981, served as a basic 256-byte structure primarily designed for .COM file execution, incorporating an INT 20h instruction at offset 00h for program termination and simple memory allocation tracking via the segment address at offset 02h. This foundational layout emphasized compatibility with CP/M's Zero Page, limiting features to essential process state management without support for advanced file systems or environment handling. The saved critical error handler vector (INT 24h) at offsets 12h–15h was added in version 1.10.

With MS-DOS 2.x in 1983, the PSP evolved to accommodate the introduction of hierarchical directories and environment variables, adding a pointer to the environment block at offset 2Ch. These changes supported subdirectory navigation and variable inheritance by child processes. Version 2.x also supplemented the rudimentary INT 20h termination with the more robust INT 21h function 4Ch, which returns an exit code and does not depend on CS pointing at the PSP. This update marked a shift from CP/M-like simplicity toward enhanced operating system capabilities, and later multitasking extensions such as DESQview relied on the expanded PSP for process coordination.

In MS-DOS 3.x through 6.x (1984–1993), the PSP further expanded to handle increased complexity. Version 3.0 added a handle count at offset 32h and a table pointer at offset 34h, so that the file handle table could grow beyond its default of 20 entries; INT 21h AH=67h, added in version 3.3, adjusts the count dynamically. Version 3.0 also introduced INT 21h AH=62h to retrieve the current PSP segment (the earlier AH=51h had been available since DOS 2.0), giving programs a documented way to locate their PSP for memory and handle management. By MS-DOS 5.0 in 1991, upper memory block (UMB) support arrived through improved memory allocation strategies, allowing programs to use high memory more efficiently without altering the core PSP structure. These developments reflected a broader transition to larger file systems and limited multitasking, while maintaining backward compatibility for legacy .COM programs.

Structure

Overall Layout

The Program Segment Prefix (PSP) is a fixed 256-byte (offsets 00h to FFh) data structure that forms the initial portion of a program's memory allocation in MS-DOS. It serves as a header for the program's memory block, providing essential control and state information to the operating system and the executing program. This consistent size ensures compatibility across DOS versions, allowing programs to reliably access standardized locations within the structure.

In terms of memory positioning, the PSP resides at offset 00h within the program's allocated block, immediately preceding the program's code and data. For .COM files, which load directly into memory without relocation, the code begins at offset 100h, leaving the preceding 256 bytes dedicated to the PSP. Upon program loading, DOS places the segment address of the PSP in both the DS and ES registers, for .COM and .EXE files alike, enabling the program to reference this block directly without additional queries.

The PSP's general organization divides it into fixed sections. Offsets 00h–15h hold the exit instruction, the memory limit, the CP/M-style call entry, and the three saved interrupt vectors. Offsets 16h–5Bh hold DOS-managed fields such as the parent PSP segment, the file handle table, and the environment pointer, along with reserved space. Offsets 5Ch–7Fh hold the two default file control blocks built from the initial parameters. Finally, offsets 80h–FFh store the command tail, including the length and arguments passed to the program at invocation. This layout gives efficient access to critical runtime information while keeping the structure compact.
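The positioning rules above come down to real-mode address arithmetic: a segment value counts 16-byte paragraphs, so a 256-byte PSP spans exactly 10h paragraphs. The following Python sketch works through the numbers for a hypothetical PSP segment of 1234h. The segment value and helper names are illustrative assumptions, and the 20-bit wrap models a plain 8086.

```python
PARAGRAPH = 16      # bytes per real-mode paragraph
PSP_SIZE = 0x100    # 256 bytes, i.e. 16 paragraphs


def linear(segment, offset=0):
    """Linear address of segment:offset, wrapped to 20 bits as on an 8086."""
    return ((segment << 4) + offset) & 0xFFFFF


psp_segment = 0x1234                    # hypothetical load segment
psp_base = linear(psp_segment)          # first byte of the PSP
command_tail = linear(psp_segment, 0x80)
com_entry = linear(psp_segment, 0x100)  # first byte of a .COM image

# The first paragraph after the PSP, expressed as a segment value.
first_segment_after_psp = psp_segment + PSP_SIZE // PARAGRAPH

print(f"PSP base        {psp_base:05X}")
print(f"command tail    {command_tail:05X}")
print(f".COM entry      {com_entry:05X}")
print(f"segment after   {first_segment_after_psp:04X}")
```

Note that `PSP:0100h` and `(PSP+10h):0000h` name the same byte, which is why code placed directly after the PSP can be addressed either way.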

Key Fields and Offsets

The Program Segment Prefix (PSP) consists of a fixed 256-byte layout in MS-DOS, with key fields positioned at specific offsets to store essential program metadata, interrupt vectors, and compatibility structures. These fields facilitate program termination, memory management, error handling, and parameter passing, while reserved areas ensure compatibility and support internal operations. The structure's design draws from CP/M conventions, particularly in its file handling components.

At offset 00h, the first two bytes contain the INT 20h instruction (opcode CDh followed by 20h), which serves as the termination entry point; executing it returns control to DOS. Immediately following at offset 02h, a 2-byte word holds the segment address marking the top of the memory allocated to the program, that is, the boundary beyond which the program's code or data cannot extend without a further allocation. Offsets 0Ah through 15h allocate 12 bytes for three double-word vectors: the termination handler at 0Ah (for INT 22h), the Ctrl-C (control-break) handler at 0Eh (for INT 23h), and the critical error handler at 12h (for INT 24h). These vectors point to the routines that manage program exit, Ctrl-C responses, and device error recovery, respectively, and DOS restores them from the PSP when the process terminates.
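| Offset (hex) | Size (bytes) | Field description |
|---|---|---|
| 00h | 2 | INT 20h termination instruction (CDh 20h) |
| 02h | 2 | Top of allocated memory (segment address) |
| 0Ah | 4 | INT 22h termination vector |
| 0Eh | 4 | INT 23h Ctrl-C handler vector |
| 12h | 4 | INT 24h critical error handler vector |
| 2Ch | 2 | Environment block segment address |
| 5Ch | 16 | First default File Control Block (FCB) |
| 6Ch | 20 | Second default File Control Block (FCB); 16 bytes used, 4 unused |
| 80h | 1 | Command-line tail length |
| 81h–FFh | 127 | Command-line string; the default Disk Transfer Area (DTA) overlaps this region, starting at 80h |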
Further into the structure, offset 2Ch holds a 2-byte word specifying the segment address of the associated environment block, which contains variables like PATH and COMSPEC for the process.

The fields at offsets 5Ch and 6Ch (36 bytes in total, including 4 unused bytes at 7Ch) house the default File Control Blocks (FCBs). The 16 bytes at 5Ch store parsed details from the first command-line parameter (drive, filename, and extension), and the area at 6Ch does the same for the second parameter. Both are unopened FCBs. Because the second begins only 16 bytes after the first, opening the first FCB in place overwrites the second, so a program that needs both must copy the second elsewhere beforehand. These FCBs emulate CP/M-style file handling by providing a fixed-format block for the drive and file name, supporting up to two default files without requiring full FCB construction. This allows legacy programs to interface seamlessly with file operations.

At offset 80h, a single byte gives the length of the command-line tail (up to 127 bytes), followed from 81h by the command-line string itself, terminated with 0Dh. The same 128 bytes starting at 80h also serve as the default Disk Transfer Area (DTA) for file searches via INT 21h functions, so a program should parse or copy its command line before performing any operation that uses the default DTA.

Several areas within the PSP are maintained by DOS rather than by applications. Offsets 16h through 2Bh (22 bytes) hold the parent PSP segment and the default job file table, and offsets 2Eh through 4Fh (34 bytes) hold version-specific data that applications should normally leave alone.
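To make the offsets concrete, here is a small Python sketch that decodes the fields discussed above from a 256-byte PSP image, such as one captured from an emulator's memory. The function names and the sample values are hypothetical. The sketch assumes far pointers are stored as an offset word followed by a segment word, and that an unopened FCB holds a drive byte followed by an 8-character name and a 3-character extension, both blank-padded.

```python
import struct


def far_pointer(buf, offset):
    """Read a DWORD far pointer (offset word, then segment word)."""
    off, seg = struct.unpack_from("<HH", buf, offset)
    return seg, off


def fcb_name(buf, offset):
    """Decode the 8.3 name from an unopened FCB (name at +1, extension at +9)."""
    name = bytes(buf[offset + 1:offset + 9]).decode("ascii").rstrip()
    ext = bytes(buf[offset + 9:offset + 12]).decode("ascii").rstrip()
    return f"{name}.{ext}" if ext else name


def decode_psp(psp):
    """Return the commonly used PSP fields as a dictionary."""
    if len(psp) != 256:
        raise ValueError("a PSP is exactly 256 bytes")
    tail_len = min(psp[0x80], 127)  # never read past offset FFh
    return {
        "exit_opcode": bytes(psp[0x00:0x02]),
        "top_of_memory": struct.unpack_from("<H", psp, 0x02)[0],
        "int22": far_pointer(psp, 0x0A),
        "int23": far_pointer(psp, 0x0E),
        "int24": far_pointer(psp, 0x12),
        "env_segment": struct.unpack_from("<H", psp, 0x2C)[0],
        "fcb1_name": fcb_name(psp, 0x5C),
        "command_tail": bytes(psp[0x81:0x81 + tail_len]).decode("ascii", "replace"),
    }


# Build a made-up PSP image for the command line "PROG README.TXT".
sample = bytearray(256)
sample[0x00:0x02] = b"\xCD\x20"                         # INT 20h
struct.pack_into("<H", sample, 0x02, 0x9FFF)            # top of memory
struct.pack_into("<HH", sample, 0x0A, 0x0123, 0x0A00)   # INT 22h -> 0A00:0123
struct.pack_into("<H", sample, 0x2C, 0x0B00)            # environment segment
sample[0x5D:0x68] = b"README  TXT"                      # FCB 1 name and extension
tail = b" README.TXT"
sample[0x80] = len(tail)
sample[0x81:0x81 + len(tail) + 1] = tail + b"\r"

fields = decode_psp(sample)
print(fields)
```

Clamping the tail length to 127 is a defensive choice of this sketch, so that a corrupt length byte cannot cause a read beyond the structure.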

Usage in DOS

Program Loading and Initialization

When MS-DOS loads a program via the EXEC function (INT 21h, AH=4Bh), it follows a structured sequence to allocate memory and establish the Program Segment Prefix (PSP) at the base of the allocated block. The caller first ensures adequate free memory, shrinking its own block with INT 21h AH=4Ah if necessary, then provides a parameter block specifying the environment segment, the command tail, and the file control blocks (FCBs). DOS allocates a paragraph-aligned memory block large enough for the program image plus the PSP, typically starting from the lowest available address. It then builds the new PSP at the base of that block. This step is comparable to the undocumented create-child-PSP call, INT 21h AH=55h, rather than to the older AH=26h, which merely copies the current PSP. DOS initializes the core fields as follows: the INT 20h termination instruction (CDh 20h) at offset 00h, the top-of-memory segment at offset 02h based on the allocated size, the parent PSP segment at offset 16h set to the caller's PSP, and the interrupt vectors for 22h (terminate), 23h (Ctrl-C), and 24h (critical error), copied from the current interrupt vector table to offsets 0Ah, 0Eh, and 12h respectively.

Following creation, DOS populates additional fields from the EXEC parameter block to pass parameters to the child. If an environment segment is specified (a non-zero word at offset 00h of the parameter block), DOS copies that environment into a new block and stores the new block's segment at offset 2Ch of the child's PSP; otherwise the child receives a copy of the parent's environment. The command tail is copied to offset 80h: a length byte (up to 127 characters) at 80h, then the argument string at 81h, terminated by a carriage return (0Dh). The first and second FCBs are copied from the addresses given in the parameter block to offsets 5Ch and 6Ch. The command interpreter normally fills them by parsing the first two command-line arguments, leaving blank-padded names where an argument is absent. In this way the environment, command tail, and FCBs all travel from parent to child.
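As a rough illustration of the parameter block mentioned above, the Python sketch below packs the 14-byte block used for a load-and-execute request: an environment segment word followed by three far pointers (command tail, first FCB, second FCB). The helper names and the addresses are hypothetical, and the layout is the commonly documented one for subfunction AL=00h rather than something taken from this article.

```python
import struct


def far_pointer(segment, offset):
    """Encode a far pointer as stored in memory: offset word, then segment word."""
    return struct.pack("<HH", offset, segment)


def exec_parameter_block(env_segment, tail_ptr, fcb1_ptr, fcb2_ptr):
    """Pack the EXEC (AL=00h) parameter block.

    env_segment: segment of the environment to copy, or 0 to inherit the caller's.
    *_ptr:       (segment, offset) pairs locating the caller's buffers.
    """
    return (
        struct.pack("<H", env_segment)
        + far_pointer(*tail_ptr)
        + far_pointer(*fcb1_ptr)
        + far_pointer(*fcb2_ptr)
    )


# Hypothetical caller whose own PSP is at segment 1000h and who simply
# passes along the command tail and FCBs already present in that PSP.
caller_psp = 0x1000
block = exec_parameter_block(
    env_segment=0,                   # inherit the caller's environment
    tail_ptr=(caller_psp, 0x0080),
    fcb1_ptr=(caller_psp, 0x005C),
    fcb2_ptr=(caller_psp, 0x006C),
)
print(block.hex(" "))
```

Pointing the block at the caller's own PSP, as here, is only one option; a caller may equally build a fresh command tail and FCBs in its own data area and point at those.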
The loading differs for .COM and .EXE formats after PSP initialization. For .COM files, DOS reads the entire file into memory starting at offset 100h within the PSP segment and sets the registers for execution: CS=DS=ES=SS equal to the PSP segment, IP=100h, and SP=FFFEh when a full 64 KB segment is available. A zero word is pushed on the stack, so that a near RET from the program jumps to the INT 20h instruction at offset 00h. For .EXE files, DOS first reads the executable header to verify the MZ (or ZM) signature and locate the relocation table. It then loads the program image at the paragraph immediately after the PSP, that is, at segment PSP+10h, and applies each relocation entry by adding that start segment to the word the entry addresses. DS and ES are set to the PSP segment, while CS:IP and SS:SP come from the header values with the start segment added to CS and SS, before control is transferred. When EXEC is called with AL=03h to load an overlay, DOS loads and relocates the image at a segment supplied by the caller but creates no new PSP; the overlay runs in the context of the caller's PSP and shares its environment and handles.
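The difference between the two formats can be sketched in a few lines of Python. The code below is a simplified model, not a loader: it assumes the program image starts at segment PSP+10h, that each relocation entry is an (offset, segment) pair relative to the start of the image, and that the header fields `e_cs`, `e_ip`, `e_ss`, and `e_sp` have already been read. All names and values are hypothetical.

```python
import struct


def com_entry_registers(psp_segment):
    """Initial registers for a .COM program (assuming a full 64 KB segment)."""
    return {"CS": psp_segment, "IP": 0x0100, "SS": psp_segment,
            "SP": 0xFFFE, "DS": psp_segment, "ES": psp_segment}


def apply_relocations(image, relocations, start_segment):
    """Add the start segment to every word named by a relocation entry."""
    for offset, segment in relocations:
        where = segment * 16 + offset  # position within the loaded image
        (value,) = struct.unpack_from("<H", image, where)
        struct.pack_into("<H", image, where, (value + start_segment) & 0xFFFF)


def exe_entry_registers(psp_segment, e_cs, e_ip, e_ss, e_sp):
    """Initial registers for an .EXE program loaded right after its PSP."""
    start = psp_segment + 0x10  # image begins 256 bytes past the PSP base
    return {"CS": (e_cs + start) & 0xFFFF, "IP": e_ip,
            "SS": (e_ss + start) & 0xFFFF, "SP": e_sp,
            "DS": psp_segment, "ES": psp_segment}


psp = 0x1234  # hypothetical PSP segment

# A 32-byte toy image containing one segment reference (value 0001h) at 0001:0002.
image = bytearray(32)
struct.pack_into("<H", image, 0x12, 0x0001)
apply_relocations(image, [(0x0002, 0x0001)], psp + 0x10)
patched = struct.unpack_from("<H", image, 0x12)[0]

com_regs = com_entry_registers(psp)
exe_regs = exe_entry_registers(psp, e_cs=0x0002, e_ip=0x0010, e_ss=0x0005, e_sp=0x0200)
print(f"patched word: {patched:04X}")
print(com_regs)
print(exe_regs)
```

The key contrast is that a .COM program sees one segment for everything, while an .EXE program keeps DS and ES on the PSP but runs its code and stack from segments computed relative to PSP+10h.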

Accessing PSP Data Programmatically

In MS-DOS, programs can directly access the Program Segment Prefix (PSP) data by using the DS segment register, which DOS initializes to point to the PSP segment upon program loading. This allows reading fields by offset without additional setup. For instance, the assembly instruction MOV AL, DS:[80h] retrieves the byte at 80h, which holds the length of the command tail (limited to a maximum of 127 characters); a byte-sized register is needed here, since loading AX would also pick up the first character of the tail. The command tail itself, starting at 81h and extending at most to FFh (terminated by a carriage return, 0Dh), stores the program's arguments for parsing during execution.

If the PSP segment address is needed later in execution, or in environments where DS may have changed, DOS provides interrupt services via INT 21h. For DOS 2.0 and later, function AH=51h returns the current PSP segment address in the BX register. In DOS 3.0 and later, function AH=62h does the same, also placing the PSP segment in BX, and is the recommended call on those versions. Once obtained, the segment can be loaded into DS (e.g., MOV DS, BX) to access the offsets described above.

Programs may also modify certain PSP fields during run time to manage resources. For example, in DOS 3.0 and later, the file handle table is located through the DWORD pointer at offset 34h, which defaults to offset 18h within the PSP. By default the table spans offsets 18h to 2Bh for the initial 20 handles, with a value of FFh indicating a closed handle. A program can relocate it to a larger buffer by copying the existing entries and updating the pointer at 34h and the count at 32h. Similarly, the WORD at offset 2Ch holds the segment of the program's environment block, so a program can reach its environment variables by loading that segment and scanning the block for null-terminated strings.

To spawn a child process, a program invokes INT 21h AH=4Bh, which creates the child's PSP itself. The program supplies the command tail and the two file control blocks (FCBs) through far pointers in the EXEC parameter block, and DOS copies them to offsets 80h, 5Ch, and 6Ch of the child's PSP. The undocumented INT 21h AH=55h (DOS 2.0+), which builds a child PSP at the segment passed in DX, exists for programs that need to construct a PSP without loading a file.
This setup ensures the child inherits relevant state, such as open files from the JFT, while the parent regains control upon child termination.
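Two of the structures mentioned above, the environment block and the default handle table, are simple enough to decode by hand. The Python sketch below does so over made-up byte strings. It assumes the environment block is a sequence of NAME=value strings, each ending in a zero byte, with an extra zero byte closing the list, and that the default table is the 20 bytes at offset 18h of the PSP. The sample handle values are illustrative, not taken from a real system.

```python
def parse_environment(block):
    """Split an environment block into a dict of NAME -> value."""
    env = {}
    pos = 0
    while pos < len(block) and block[pos] != 0:  # an empty string ends the list
        end = block.index(0, pos)                # terminator of this string
        name, _, value = block[pos:end].decode("ascii").partition("=")
        env[name] = value
        pos = end + 1
    return env


def open_handles(psp):
    """Map handle number -> table entry for the default 20-entry table at 18h."""
    table = psp[0x18:0x18 + 20]
    return {handle: entry for handle, entry in enumerate(table) if entry != 0xFF}


# Made-up environment block, as it might sit at the segment stored at offset 2Ch.
env_block = b"COMSPEC=C:\\COMMAND.COM\x00PATH=C:\\DOS\x00\x00"
env = parse_environment(env_block)

# Made-up PSP with five standard handles open and the other fifteen closed.
psp = bytearray(256)
psp[0x18:0x2C] = bytes([1, 1, 1, 0, 2]) + b"\xFF" * 15
handles = open_handles(psp)

print(env)
print(handles)
```

A program that has relocated its handle table would need to follow the pointer at offset 34h and use the count at offset 32h instead of the fixed location read here.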

Legacy and Modern Context

Changes in Later DOS Versions

In MS-DOS 7.0, released in 1995 and bundled with Windows 95, the file handle table associated with the PSP could be dynamically expanded using Interrupt 21h function 67h (available since DOS 3.3), allowing applications to increase the maximum number of open files from the default of 20 up to 65,535 handles per process, improving resource management for larger programs. Additionally, support for long filenames of up to 255 characters was introduced through the Installable File System (IFS) framework, specifically via the VFAT driver, which extended the traditional 8.3 naming convention while maintaining backward compatibility for DOS applications.

During the Windows 9x era (1995–2000), the PSP was retained primarily for 16-bit applications executed in virtual 8086 (V86) mode, ensuring compatibility within the hybrid environment. Environment block handling, traditionally reached through the segment stored at offset 2Ch in the PSP, was increasingly redirected to Win32 APIs like GetEnvironmentStrings for 32-bit processes while being preserved for legacy 16-bit code.

In MS-DOS 8.0, included with Windows Me in 2000, the PSP became largely obsolete for native applications, as the operating system emphasized Win32 subsystems and restricted real-mode access to recovery modes. Despite this deprecation, the structure was preserved in compatibility environments to support legacy 16-bit programs and command-line operations, maintaining essential fields like file handles and termination vectors.

A notable evolution across these versions was the incorporation of internationalization features, managed via INT 21h functions like 38h for country information, providing support for multiple code pages, such as 865 (Nordic), 912 (Latin-2), and 915 (Cyrillic), to facilitate National Language Support (NLS).

Role in Emulation and Virtualization

Emulators such as DOSBox and PCem faithfully recreate the Program Segment Prefix (PSP) to enable the execution of .COM and .EXE files on modern hardware. In DOSBox, an x86 emulator with a built-in DOS kernel, the PSP is implemented through functions like DOS_NewPSP, which sets up the structure at a specified low-memory segment and populates it with essential fields such as the command tail, file control blocks (FCBs), and interrupt vectors for handling calls like INT 20h (program termination) and INT 21h (DOS services). This ensures that programs receive the correct PSP segment in the DS and ES registers upon loading, mimicking real DOS behavior for interrupt-driven operations.

PCem, a cycle-accurate emulator for PC compatibles, handles the PSP indirectly through its emulation of the 8086/8088 processor and memory system, allowing an authentic instance of DOS to manage PSP creation and access during program loading. This approach preserves the original 8086 segmented memory model, where the PSP occupies the first 256 bytes of a program's allocated block, so that offsets for environment variables, parent-process links, and device information map exactly as on real hardware without software-level intervention from the emulator itself.

In full-system virtualization platforms, 16-bit DOS guests likewise rely on the PSP for process management, as the guest OS allocates and populates the structure to separate program environments within the virtualized 8086 environment. Similarly, Microsoft's NT Virtual DOS Machine (NTVDM) in Windows NT and later 32-bit systems emulates the PSP to support 16-bit DOS and Windows 3.x applications, integrating it with the host through compatibility interfaces like the CALL 5 mechanism at PSP offset 5 for DOS API interception. This emulation translates legacy DOS calls, including those accessing PSP fields, to Win32 equivalents while maintaining isolation between multiple DOS sessions.
The PSP's design, rooted in the 8086 memory model, allows these tools to run CP/M-derived software on modern architectures by simulating segmented addressing and providing FCBs in the PSP for file operations. In the other direction, emulators often translate argc/argv-style arguments from the host into a DOS command tail during setup. For instance, DOSBox fills the PSP command tail (offset 80h) from input supplied by the host operating system, ensuring compatibility without altering the program's expectations.
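The translation from host-style arguments to a DOS command tail can be sketched as follows. This is an illustrative Python model and not DOSBox's actual code. It assumes each argument is preceded by a single space, that the text is capped at 126 characters so the terminating carriage return still fits inside the 128-byte area at offset 80h, and that unused bytes are zero-filled. The function name and defaults are hypothetical.

```python
def build_command_tail(argv, max_chars=126):
    """Return the 128 bytes to place at PSP offset 80h for the given arguments."""
    text = "".join(" " + arg for arg in argv)        # " ARG1 ARG2 ..."
    raw = text.encode("ascii", "replace")[:max_chars]
    tail = bytes([len(raw)]) + raw + b"\r"           # length, text, CR (0Dh)
    return tail.ljust(128, b"\x00")                  # pad to the full area


short_tail = build_command_tail(["/V", "README.TXT"])
long_tail = build_command_tail(["A" * 200])          # forces truncation

print(short_tail[0], short_tail[1:1 + short_tail[0]])
print(long_tail[0], len(long_tail))
```

Characters outside ASCII are replaced rather than rejected in this sketch; a real emulator would have to decide how to map host text into the guest's code page.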