Protected mode
Protected mode is an operational mode of x86-compatible central processing units (CPUs) introduced by Intel with the 80286 microprocessor in 1982. Intel documents it as the native state of the processor, and it enables advanced memory management, protection, and multitasking through segmentation and, from the 80386 onward, paging.[1]
In protected mode, the processor supports a segmented memory model where memory is divided into segments defined by descriptors in the Global Descriptor Table (GDT) or Local Descriptor Table (LDT), each specifying base addresses, limits, and access rights to enforce isolation between tasks.[1] Paging provides an additional layer of virtual-to-physical address translation, allowing for efficient virtual memory implementation and further protection against unauthorized access.[1] This mode expanded significantly with the Intel 80386 in 1985, introducing 32-bit addressing for up to 4 gigabytes of linear address space, compared to the 1-megabyte limit of real mode.[1]
A core feature of protected mode is its hierarchical protection system, utilizing four privilege levels—known as rings 0 through 3—to control access to resources, with ring 0 reserved for the most privileged operations (typically the operating system kernel) and ring 3 for user applications.[1] Multitasking is facilitated through Task-State Segments (TSS) and task gates, which manage context switches and maintain separate address spaces per task, ensuring hardware-enforced isolation.[1] Entry into protected mode occurs by setting the Protection Enable (PE) bit in the machine status word (MSW) on the 80286 or in the CR0 control register on the 80386 and later processors, a process that requires careful initialization of segment descriptors and often a far jump to flush the processor pipeline.[1]
Unlike real mode, which uses 16-bit segment:offset addressing within a 1-megabyte address space, has no protection mechanisms, and offers no hardware support for multitasking, protected mode provides hardware-enforced isolation against faulty and malicious code, forming the foundation for modern operating systems like Windows, Linux, and macOS.[1] Later enhancements, such as Physical Address Extension (PAE), introduced with the Pentium Pro, further increased physical memory addressing to 64 gigabytes while maintaining compatibility.[1]
Background
Real Mode Basics
Real mode, also known as real-address mode, is the default operational state of x86 processors upon power-up or reset, providing backward compatibility with the original Intel 8086 and 8088 architectures.[2] In this mode, the processor employs a 20-bit physical address space, restricting accessible memory to a maximum of 1 MB (2^20 bytes).[2] Addressing operates through a segment:offset scheme, where memory locations are specified by combining a 16-bit segment value from a segment register with a 16-bit offset.[2]
The x86 architecture includes four primary 16-bit segment registers in real mode: CS (code segment), which points to the current code segment; DS (data segment), used for data access; SS (stack segment), for stack operations; and ES (extra segment), for additional data.[2] To form a physical address, the segment value is shifted left by 4 bits (effectively multiplied by 16) to establish the segment base, and the offset is then added to this base.[2] For example, the physical address is calculated as physical address = (segment × 16) + offset.[2]
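The segment:offset calculation above can be sketched in a few lines of Python. This is an illustrative model rather than processor code; the function name `real_mode_physical` and the `wrap_20bit` switch are assumptions made for the example. The wrap models the 8086's 20 address lines, where sums past 1 MB fold back to low memory.

```python
def real_mode_physical(segment: int, offset: int, wrap_20bit: bool = True) -> int:
    """Return the physical address for a real-mode segment:offset pair."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    # Shift the segment left by 4 bits (multiply by 16), then add the offset.
    address = (segment << 4) + offset
    # With only 20 address lines, results past 1 MB wrap around to low memory.
    return address & 0xFFFFF if wrap_20bit else address


print(hex(real_mode_physical(0xB800, 0x0000)))  # 0xb8000
print(hex(real_mode_physical(0xF000, 0xFFF0)))  # 0xffff0
# Different segment:offset pairs can name the same byte.
print(real_mode_physical(0x07C0, 0x0000) == real_mode_physical(0x0000, 0x7C00))  # True
```

Because the segment base has 16-byte resolution, many segment:offset pairs alias the same physical location, as the last line shows.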
This segmented approach limits each individual segment to a maximum size of 64 KB (2^16 bytes), as the offset is only 16 bits wide.[2] Furthermore, real mode provides no hardware-enforced memory protection, allowing programs unrestricted access to the entire addressable memory space without checks for bounds or privileges.[2] These characteristics establish the foundational model from which protected mode evolved to address growing demands for larger memory and security.[2]
Limitations and Motivations
Real mode, the default operating mode of early x86 processors like the Intel 8086, imposed significant constraints that hindered the evolution of personal computing systems. Primarily, it provided no inherent memory protection mechanisms, allowing errant or malicious code to access and corrupt any part of the system's 1 MB address space, which frequently led to crashes or security vulnerabilities in multi-program environments. This 1 MB ceiling stemmed from the 20-bit physical addressing scheme, where segmented addressing—combining a 16-bit segment register shifted left by 4 bits with a 16-bit offset—effectively capped usable memory at 2^20 bytes, restricting scalability as RAM capacities grew beyond this limit in the early 1980s. Furthermore, real mode lacked support for native multitasking, forcing operating systems to rely on cumbersome software techniques like time-slicing or bank switching to simulate concurrent program execution, which were inefficient and prone to errors.[3]
These limitations motivated the development of protected mode as a foundational shift toward more robust, secure, and scalable computing architectures. The primary drivers included the need to support multitasking operating systems such as Unix variants, which required process isolation to prevent one application from interfering with others, thereby enabling safer multiuser environments on emerging PCs. Protection against malicious software was another key impetus, as growing software complexity in the 1980s amplified risks from faulty or intentionally harmful code, necessitating hardware-enforced boundaries to enhance system reliability. Additionally, the demand for larger memory addressing arose from hardware advancements allowing RAM beyond 1 MB, compelling designs that could leverage up to 16 MB of physical memory while maintaining compatibility with existing 8086 software.[3][4]
Historically, protected mode drew inspiration from minicomputer operating systems that emphasized isolation and resource sharing, influencing Intel's response to pressures from developers seeking to port advanced OSes to x86 platforms. Intel's development of the 80286 was informed by six months of field research into customer requirements for enhanced memory addressing and protection mechanisms.[4] Developers, including those at Microsoft who ported XENIX (a Unix derivative) to the 80286, sought features for advanced OS implementations, as real mode's constraints made reliable Unix-like systems impractical on PCs. The 80286, designed starting in 1978 and released in 1982, specifically aimed to bridge 16-bit real-mode compatibility with advanced computing paradigms, incorporating protected mode to address these demands while supporting high-level languages and modular programming for scientific, engineering, and business applications.[4][5][3]
Historical Development
Intel 80286 Implementation
The Intel 80286, released in 1982, introduced protected mode as the first implementation of this feature in the x86 architecture, extending the 16-bit iAPX 86/88 family with advanced memory management capabilities.[6] This processor supported 24-bit physical addressing, enabling access to up to 16 MB of physical memory, a significant expansion from the 1 MB limit of prior real-mode systems that relied on 20-bit addressing.[6] In protected mode, the 80286 provided a virtual address space of up to 1 GB per task through segmentation, allowing for more efficient multitasking and isolation without hardware paging support.[6] This mode built on real mode's segment:offset addressing model for compatibility but enforced stricter boundaries to prevent unauthorized access.[6]
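The 1 GB and 16 MB figures follow from simple arithmetic, sketched below. The sketch assumes the standard selector layout (a 13-bit descriptor index plus a table-indicator bit choosing between the GDT and the task's LDT), which is not spelled out in the paragraph above; the constant names are illustrative. The result is an upper bound, since the first GDT entry is the null descriptor.

```python
SELECTOR_INDEX_BITS = 13      # descriptor index field of a 16-bit selector
TABLES_PER_TASK = 2           # GDT plus the task's LDT (chosen by the TI bit)
MAX_SEGMENT_BYTES = 1 << 16   # 16-bit offsets limit each segment to 64 KB
PHYSICAL_ADDRESS_BITS = 24    # width of the 80286 physical address

segments_per_task = TABLES_PER_TASK * (1 << SELECTOR_INDEX_BITS)
virtual_bytes = segments_per_task * MAX_SEGMENT_BYTES
physical_bytes = 1 << PHYSICAL_ADDRESS_BITS

print(segments_per_task)        # 16384
print(virtual_bytes // 2**30)   # 1  (GB of virtual address space per task)
print(physical_bytes // 2**20)  # 16 (MB of physical memory)
```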
Key innovations in the 80286's protected mode included the Global Descriptor Table (GDT) and Local Descriptor Table (LDT), which defined segment boundaries and attributes for all processes.[6] The GDT, a system-wide table accessed via the GDTR register, held shared segment descriptors, while the LDT, loaded via the LDTR register, provided task-specific segments for private address spaces.[6] Protection was further enhanced by a four-level privilege hierarchy, known as rings 0 through 3, where ring 0 represented the highest privilege for kernel operations and ring 3 the lowest for user applications.[6] These levels were enforced through the Descriptor Privilege Level (DPL) in segment descriptors and the Current Privilege Level (CPL) in the code segment register, preventing lower-privilege code from accessing sensitive resources.[6]
Segment descriptors in the GDT and LDT followed an 8-byte format, specifying the base address (starting location of the segment), limit (up to 64 KB in size), and access rights.[6] Access rights included the present bit (P) to indicate segment validity, type fields for code, data, or system segments, and the DPL for privilege checking.[6] This structure allowed the processor to validate memory references dynamically, generating exceptions for violations such as invalid descriptors or privilege breaches.[6]
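As a rough illustration of the access rights fields named above, the following sketch decodes an access rights byte into its present bit, DPL, descriptor class, and type field. The bit positions used (P in bit 7, DPL in bits 5-6, the code/data flag in bit 4, type in bits 0-3) are the conventional layout; `AccessRights` and `decode_access_byte` are hypothetical names for this example.

```python
from typing import NamedTuple


class AccessRights(NamedTuple):
    present: bool          # P: segment is valid and in memory
    dpl: int               # descriptor privilege level, 0 (most) to 3 (least)
    is_code_or_data: bool  # True for code/data, False for system descriptors
    type_bits: int         # 4-bit type field


def decode_access_byte(access: int) -> AccessRights:
    """Split a descriptor access rights byte into its fields."""
    if not 0 <= access <= 0xFF:
        raise ValueError("access rights must fit in one byte")
    return AccessRights(
        present=bool(access & 0x80),
        dpl=(access >> 5) & 0x3,
        is_code_or_data=bool(access & 0x10),
        type_bits=access & 0x0F,
    )


print(decode_access_byte(0x9A))  # present ring-0 code segment, type 0xA
print(decode_access_byte(0xF2))  # present ring-3 data segment, type 0x2
```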
Despite these advances, the 80286's protected mode had notable limitations: it was restricted to 16-bit operations without 32-bit extensions, it lacked a virtual 8086 mode for running real-mode applications under protection, and it had no paging hardware for demand-paged virtual memory.[6] It also saw limited adoption in practice, because real-mode applications could not be run without resetting the processor and the mode was incompatible with the dominant MS-DOS environment, so most software continued to run in real mode.[7] Transitioning to protected mode required software initialization: loading the GDT into the GDTR register using the LGDT instruction, followed by setting the Protection Enable (PE) bit in the Machine Status Word (MSW) via the LMSW instruction at privilege level 0, and finally executing an intra-segment jump to flush the prefetch queue.[6] This process ensured a clean switch but demanded careful setup to avoid faults during the mode change.[6]
Intel 80386 Enhancements
The Intel 80386 microprocessor, released in October 1985, introduced 32-bit protected mode, featuring 32-bit general-purpose registers such as EAX, EBX, ECX, and EDX, along with a 32-bit address bus that supported a 4 GB physical address space.[8][9] This marked a significant evolution from the 16-bit protected mode of the Intel 80286, extending the latter's four privilege levels while enabling full 32-bit operations for enhanced performance and scalability in multitasking environments.[9]
Key additions included Virtual 8086 (V86) mode, which allowed real-mode 8086 applications to execute within protected mode under multitasking supervision, and a built-in paging unit for virtual memory management.[9] The segment descriptor format was enhanced to support 32-bit addressing, incorporating a granularity bit that permitted segment limits up to 4 GB when set.[9] A side effect of the 80386's cached segment descriptors, informally called Big Real Mode (or unreal mode), let real-mode code address the full 4 GB space without protection; it is a programming technique rather than a documented operating mode. The 80386 also enhanced task state segments (TSS), which store the complete 32-bit processor state to support context switching for multitasking.[9]
New in the 80386 were structures like the Page Directory and Page Tables, which formed the basis for demand-paged virtual memory by mapping 4 KB pages in a hierarchical manner.[9] Additionally, the I/O Privilege Level (IOPL) bits in the EFLAGS register (bits 12 and 13), carried over from the 80286 flags word, were complemented by an I/O permission bitmap in the TSS, giving per-port control over I/O instructions for less privileged code.[9] These features collectively enabled the development of full 32-bit operating systems, such as Microsoft's Windows NT and OS/2 2.0, which leveraged the 80386's capabilities for robust, protected multitasking on personal computers.[10][11]
Mode Switching
Entering Protected Mode
To enter protected mode on an x86 processor, specific prerequisites must be met to ensure a stable transition from real mode. The global descriptor table (GDT) must be loaded into memory with at least a null descriptor, a code segment descriptor, and a data segment descriptor; these descriptors define the initial memory segments for code execution and data access in protected mode. A stack must also be established within a data segment to handle any immediate subroutine calls or interrupts post-switch. Interrupts, including non-maskable interrupts (NMIs), should be disabled prior to the switch to prevent interference during the transition.[12]
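To make the GDT prerequisite concrete, here is a minimal sketch that builds the three descriptors (null, flat code, flat data) and the 6-byte pseudo-descriptor consumed by LGDT. It assumes the 80386 descriptor layout and the commonly used access bytes 0x9A (code) and 0x92 (data) with G=1 and D/B=1; `make_descriptor` and the GDT base of 0x1000 are illustrative choices, not values required by the architecture.

```python
import struct


def make_descriptor(base: int, limit: int, access: int, flags: int) -> bytes:
    """Pack an 8-byte 80386 segment descriptor.

    limit is the 20-bit limit field; flags is the upper nibble of byte 6
    (G, D/B, reserved, AVL from high bit to low bit).
    """
    return struct.pack(
        "<HHBBBB",
        limit & 0xFFFF,                               # limit bits 0-15
        base & 0xFFFF,                                # base bits 0-15
        (base >> 16) & 0xFF,                          # base bits 16-23
        access & 0xFF,                                # P, DPL, S, type
        ((flags & 0xF) << 4) | ((limit >> 16) & 0xF), # flags + limit bits 16-19
        (base >> 24) & 0xFF,                          # base bits 24-31
    )


null_descriptor = bytes(8)
flat_code = make_descriptor(0, 0xFFFFF, 0x9A, 0xC)  # selector 0x08
flat_data = make_descriptor(0, 0xFFFFF, 0x92, 0xC)  # selector 0x10
gdt = null_descriptor + flat_code + flat_data

GDT_BASE = 0x1000  # assumed load address of the table, for illustration only
pseudo_descriptor = struct.pack("<HI", len(gdt) - 1, GDT_BASE)

print(flat_code.hex())          # ffff0000009acf00
print(flat_data.hex())          # ffff00000092cf00
print(pseudo_descriptor.hex())  # 170000100000
```

Each descriptor occupies 8 bytes, so the code and data entries sit at byte offsets 0x08 and 0x10 of the table, which is why those values serve as their selectors.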
The step-by-step procedure to switch to protected mode involves loading the GDT, enabling the protection enable (PE) bit, reloading the code segment register (CS), and then initializing the data segment registers for a flat memory model. First, load the GDTR with the base address and size of the GDT using the LGDT instruction, which accepts a 6-byte pseudo-descriptor containing the 16-bit limit followed by the 32-bit base address (24-bit for the 80286). Next, set the PE bit (bit 0) in the CR0 register to 1 using a MOV to CR0 instruction; this activates protected mode semantics. Then, execute a far jump (JMP FAR) to a protected-mode code segment selector (e.g., 0x08 for a 32-bit code descriptor) and offset, which flushes the prefetch queue, reloads CS with the new selector, and begins execution in protected mode. Finally, load the data segment registers (DS, ES, SS, FS, GS) with a selector for a flat data segment, typically 0x10 for a 32-bit flat descriptor in the GDT, set up the stack pointer, and re-enable interrupts once a protected-mode interrupt descriptor table (IDT) is in place.[12]
The Intel 80286 implementation differs from the 80386 in several key aspects during entry. On the 80286, segments are limited to 16-bit operations, and the GDT base is 24 bits wide, requiring a hardware reset to exit protected mode once entered. In contrast, the 80386 supports 32-bit code and data segments (via the D/B flag in descriptors), a full 32-bit GDT base address, and software-based mode switching without reset, allowing larger address spaces and enabling paging if desired post-entry. These enhancements make the 80386 procedure more flexible for modern operating systems.[12]
Invalid configurations during entry can trigger exceptions, primarily general protection faults (#GP). Setting the PE bit does not itself validate the GDT, but the first segment load afterwards does: a far jump whose selector lies beyond the GDT limit or refers to an invalid descriptor raises #GP. Similarly, loading a null selector into CS or SS results in a #GP fault. If no protected-mode IDT has been set up yet, such a fault cannot be delivered and escalates to a triple fault and processor reset. Proper validation of descriptors and selectors is essential to avoid these faults.[12]
A minimal assembly code example for entering protected mode on an 80386, assuming a pre-built GDT at physical address 0x1000 with flat 32-bit code (selector 0x08) and data (0x10) segments, illustrates the sequence:
cli ; Disable maskable interrupts
lgdt [gdt_descriptor] ; Load GDTR from the 6-byte pseudo-descriptor (limit, then base 0x1000)
mov eax, cr0
or eax, 1 ; Set PE bit (bit 0)
mov cr0, eax
jmp 0x08:protected_mode ; Far jump to reload CS and flush the prefetch queue
protected_mode: ; Code from here on must be assembled as 32-bit
mov ax, 0x10 ; Data segment selector
mov ds, ax
mov es, ax
mov fs, ax
mov gs, ax
mov ss, ax ; Set ESP to a valid stack top immediately after this
sti ; Re-enable interrupts (only safe once a protected-mode IDT is loaded)
This code performs the essential switch, transitioning to a flat 4 GB address space model.[12]
Exiting Protected Mode
Exiting protected mode on x86 processors, particularly the Intel 80386 and later, involves a careful sequence of operations to return the CPU to real mode while ensuring compatibility with legacy BIOS services and avoiding system instability. This transition disables the protection enable (PE) bit in the CR0 register, flushes the instruction pipeline, and reinitializes segment registers and interrupt handling to match the 20-bit segmented addressing of real mode. The procedure requires prior setup of the global descriptor table (GDT) with real-mode-compatible descriptors (64 KB limits, byte granularity), and the segment registers must be loaded from them before PE is cleared so that the cached segment attributes match real-mode expectations.
The step-by-step procedure for exiting protected mode is as follows: First, disable all interrupts to prevent asynchronous disruptions, including maskable interrupts via the CLI instruction and non-maskable interrupts (NMIs) through external masking or by ensuring a valid real-mode vector for NMI (IVT offset 0x08). Second, if paging is enabled, disable it by clearing the PG bit (bit 31) in CR0, zeroing CR3 to flush the cached translations, and ensuring the CPU is executing from an identity-mapped physical address to avoid translation errors. Third, clear the PE bit (bit 0) in CR0 using a MOV instruction to disable protected mode addressing and protection checks. Fourth, perform a far jump to the real-mode continuation code, using a real-mode segment value rather than a selector, to flush the prefetch queue and reload the code segment register (CS) with real-mode values. (Jumping to offset 0xFFF0 in segment 0xF000, the conventional BIOS reset entry point, would restart the firmware rather than resume the program.) Finally, reload the data segment (DS), extra segment (ES), stack segment (SS), FS, and GS registers with real-mode segment values (often 0), load the interrupt descriptor table register (IDTR) via LIDT to point to the real-mode interrupt vector table (IVT) at physical address 0, and re-enable interrupts with STI. The order of segment register reloads is critical: SS should be updated early to ensure a valid stack for any subsequent operations, followed by DS and ES to avoid data access faults.
On the Intel 80386, additional considerations apply for safe exit. Paging must be disabled prior to clearing the PE bit, as active paging with protected mode disabled can lead to unpredictable address translations. In multitasking environments using task state segments (TSS), no task switch should be in progress and no task gate in the IDT should be able to fire during the transition, to prevent corruption of the task state.[13]
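The ordering constraint on the CR0 bits (PG cleared before PE) can be modelled with a small sketch. It is a toy model of the register, not processor behaviour in full: the rejection of PG=1 with PE=0 follows what later IA-32 documentation specifies as a #GP condition, and the names `write_cr0` and `leave_protected_mode` are hypothetical.

```python
from typing import List

CR0_PE = 1 << 0    # Protection Enable
CR0_PG = 1 << 31   # Paging


def write_cr0(value: int) -> int:
    """Validate a CR0 write; paging without protection is rejected."""
    value &= 0xFFFFFFFF
    if value & CR0_PG and not value & CR0_PE:
        raise ValueError("#GP: PG cannot be set while PE is clear")
    return value


def leave_protected_mode(cr0: int) -> List[int]:
    """Return the successive CR0 values written while exiting protected mode."""
    writes = []
    if cr0 & CR0_PG:
        cr0 = write_cr0(cr0 & ~CR0_PG)   # step 1: turn paging off
        writes.append(cr0)
    cr0 = write_cr0(cr0 & ~CR0_PE)       # step 2: turn protection off
    writes.append(cr0)
    return writes


print([hex(v) for v in leave_protected_mode(0x80000011)])  # ['0x11', '0x10']
```

Clearing both bits in a single write would also pass this model's check, but the two-step order mirrors the procedure described above, where the code must already be running from identity-mapped memory before paging is switched off.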
Risks during the exit process include potential triple faults if the GDT is not properly configured with valid real-mode descriptors before clearing PE, as invalid segment references post-transition can trigger unhandled exceptions leading to a double fault and subsequent triple fault, causing a CPU reset. Post-exit, reliance on BIOS calls requires the IVT to be correctly initialized, as improper interrupt handling can hang the system or corrupt memory.
This mode-switching capability has been historically utilized in bootloaders like GRUB for hybrid operations, where protected mode is entered for efficient kernel loading but exited to real mode for accessing BIOS disk and video services.[14][15]
A representative assembly example for a safe exit on the 80386, assuming paging is disabled and GDT is preset, emphasizes the segment reload sequence:
cli ; Disable maskable interrupts
; Assume NMI masked externally, paging off, and all segment registers
; already loaded with 16-bit, 64 KB-limit descriptors
mov eax, cr0
and eax, 0xFFFFFFFE ; Clear PE bit (bit 0) only
mov cr0, eax
jmp 0x0000:real_mode_cs ; Far jump flushes the prefetch queue; CS=0 assumes this code lies below 64 KB
real_mode_cs:
mov ax, 0 ; Real-mode segment value
mov ds, ax ; Reload DS
mov es, ax ; Reload ES
mov ss, ax ; Reload SS (critical for stack)
mov fs, ax ; Reload FS
mov gs, ax ; Reload GS
lidt [real_idt] ; Load IVT pointer (base 0, limit 0x3FF)
sti ; Re-enable interrupts
This sequence ensures a clean transition, with the far jump establishing the real-mode CS before other registers are adjusted to prevent faults.[14]
Protection Mechanisms
Privilege Levels
Protected mode in the x86 architecture employs a hierarchical privilege system consisting of four rings, numbered 0 through 3, to enforce security boundaries between different software components. Ring 0 represents the highest privilege level, typically reserved for operating system kernel code with unrestricted access to hardware and system resources. Rings 1 and 2 serve as intermediate levels for less trusted system services, such as device drivers or executive modules, while Ring 3 is the lowest privilege level, designated for user applications with restricted access to prevent interference with critical system operations.[12]
The privilege of executing code or accessing data is determined by two key fields: the Current Privilege Level (CPL), which indicates the privilege level of the currently running task and is stored in bits 0-1 of the code segment (CS) register, and the Descriptor Privilege Level (DPL), a 2-bit field (bits 5-6 of the access rights byte) in segment descriptors that specifies the minimum privilege required to access the associated segment or gate. The DPL in segment descriptors acts as the primary enforcement mechanism for privilege checks. For nonconforming code segments, code executing at a given CPL can only load segments whose DPL equals the CPL, ensuring same-privilege direct execution. To invoke more privileged code (lower ring number), transitions must occur through controlled mechanisms like call gates, which validate the caller's CPL against the gate's DPL before allowing the switch.[12][3]
Data access follows similar rules, where a task at CPL can read or write data segments only if the CPL is less than or equal to the segment's DPL, and the requested privilege level (RPL) of the segment selector is also less than or equal to the DPL; violations result in a general protection fault (#GP). For inter-ring calls to more privileged code using nonconforming segments, the processor performs a stack switch to a new stack segment at the target privilege level, loaded from the task state segment (TSS), and pushes the caller's stack segment and stack pointer, any copied parameters, and the return code segment and instruction pointer onto it to maintain isolation and enable proper returns. Conforming code segments, marked in their descriptor, allow execution from any CPL greater than or equal to the DPL without a stack switch or a change of CPL, facilitating shared library code across privilege levels.[12][3]
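The data-segment and code-segment rules above reduce to a few numeric comparisons, sketched here. This is a simplified model: it ignores gates and the RPL check on direct code transfers, and the function names are hypothetical.

```python
def _check_level(level: int) -> None:
    if level not in (0, 1, 2, 3):
        raise ValueError("privilege levels range from 0 to 3")


def can_load_data_segment(cpl: int, rpl: int, dpl: int) -> bool:
    """Data access is allowed only if both CPL and RPL are <= DPL numerically."""
    for level in (cpl, rpl, dpl):
        _check_level(level)
    return max(cpl, rpl) <= dpl


def can_transfer_directly(cpl: int, dpl: int, conforming: bool) -> bool:
    """Direct jump or call without a gate (RPL check omitted for brevity)."""
    _check_level(cpl)
    _check_level(dpl)
    # Conforming code may be entered from equal or less privileged callers;
    # nonconforming code requires an exact privilege match.
    return cpl >= dpl if conforming else cpl == dpl


print(can_load_data_segment(cpl=3, rpl=3, dpl=0))  # False: user code, kernel data
print(can_load_data_segment(cpl=0, rpl=3, dpl=0))  # False: RPL lowers the effective privilege
print(can_transfer_directly(cpl=3, dpl=0, conforming=True))   # True
print(can_transfer_directly(cpl=3, dpl=0, conforming=False))  # False: needs a call gate
```

The second call shows why RPL matters: even ring-0 code is refused when it uses a selector whose RPL marks the request as originating from ring 3.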
Input/output (I/O) operations and interrupt handling further enforce privilege separation. I/O instructions are permitted when the current privilege level is less than or equal to the I/O privilege level (IOPL) held in the flags register. In the Intel 80286 implementation, IOPL is the only control: code running with CPL greater than IOPL receives a #GP fault on any I/O instruction. The Intel 80386 keeps IOPL and adds an I/O permission bitmap in the TSS, which is consulted when CPL exceeds IOPL and lets the operating system grant less privileged code access to individual ports; Ring 3 code can also perform I/O directly if IOPL equals 3. Interrupts and exceptions vector through entries in the interrupt descriptor table (IDT), typically to Ring 0 handlers, using interrupt or trap gates; interrupt gates clear the interrupt flag on entry to prevent nesting issues. For instance, a Ring 3 application attempting to access a Ring 0 data segment without proper authorization triggers a #GP fault, protecting kernel memory from user-mode corruption.[12][3]
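The IOPL comparison can be illustrated with a short sketch that extracts the two IOPL bits (bits 12 and 13) from a flags value and compares them with the CPL. The helper names are hypothetical, and the sketch covers only the IOPL test; on the 80386 and later, a failed test leads to a lookup in the TSS I/O permission bitmap, which is not modelled here.

```python
IOPL_SHIFT = 12
IOPL_MASK = 0x3


def iopl(eflags: int) -> int:
    """Extract the I/O privilege level from bits 12-13 of the flags register."""
    return (eflags >> IOPL_SHIFT) & IOPL_MASK


def io_allowed_by_iopl(cpl: int, eflags: int) -> bool:
    """I/O instructions pass the IOPL test when CPL <= IOPL numerically."""
    if cpl not in (0, 1, 2, 3):
        raise ValueError("CPL ranges from 0 to 3")
    return cpl <= iopl(eflags)


print(iopl(0x3202))                   # 3
print(io_allowed_by_iopl(3, 0x3202))  # True: IOPL=3 opens I/O to ring 3
print(io_allowed_by_iopl(3, 0x0202))  # False: IOPL=0 restricts I/O to ring 0
print(io_allowed_by_iopl(0, 0x0202))  # True
```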
Memory Protection Fundamentals
In protected mode, the x86 architecture employs hardware-enforced checks to prevent unauthorized memory access, ensuring isolation between code, data, and tasks. These checks occur automatically during memory operations and include verification of address bounds, access types, and segment or page presence, as defined in memory descriptors. Bounds checking confirms that the effective address falls within the defined limits of the memory region, triggering a fault if exceeded. Type checking validates whether the operation (read, write, or execute) aligns with the region's permissions, such as restricting writes to executable code segments. The present/not-present bit in descriptors further ensures that only loaded or mapped memory regions are accessible, with absent regions causing an immediate fault.[16][3]
Violations of these protection rules generate specific exceptions to allow the operating system to handle errors gracefully. The general protection fault (#GP) arises from bounds violations, invalid types, or privilege mismatches, providing an error code identifying the offending selector or operation. The not-present fault (#NP) specifically occurs when a descriptor's present bit is clear, indicating the memory region is not loaded into physical memory. On processors supporting paging, such as the Intel 80386 and later, the page fault (#PF) handles similar issues at the page level, including not-present pages or permission violations, with an error code detailing the cause like writability or user/supervisor access. These faults are fully restartable, enabling precise recovery without data corruption.[16][3]
These mechanisms deliver key isolation benefits by enforcing per-task address spaces, where each process operates within its allocated regions without interfering with others. This prevents common exploits like buffer overflows from propagating to adjacent memory areas, enhancing system stability and security. Protection integrates with privilege levels (rings 0-3) to add layered enforcement, where access requires matching current privilege level (CPL) against descriptor privilege level (DPL), further restricting sensitive operations to higher-privilege code. In the Intel 80286, the original implementation of protected mode, these checks relied solely on segmentation for coarse-grained protection without paging support, limiting finer-grained isolation until the 80386's enhancements.[16][3]
Memory Management
Segmentation
In protected mode, segmentation provides a mechanism for dividing the linear address space into variable-sized segments, each defined by a segment descriptor stored in either the Global Descriptor Table (GDT) or a Local Descriptor Table (LDT). A segment selector, a 16-bit value loaded into one of the segment registers (CS, DS, ES, FS, GS, or SS), serves as an index into the GDT or LDT to retrieve the descriptor, which specifies the segment's base address, size limit, and access rights such as readability, writability, and executability. The logical address, consisting of the selector and an offset, is translated to a linear address by adding the offset to the segment base; the processor hardware automatically performs bounds checking to ensure the offset does not exceed the segment limit, generating a general-protection exception if it does.[12]
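As an illustration of the translation just described, the sketch below splits a selector into its index, table indicator, and RPL fields, and converts a logical address to a linear address with a limit check. It models only expand-up segments and 32-bit linear addresses; `decode_selector`, `to_linear`, and `SegmentLimitError` are hypothetical names.

```python
from typing import Tuple


class SegmentLimitError(Exception):
    """Stands in for the general-protection exception raised on a limit violation."""


def decode_selector(selector: int) -> Tuple[int, int, int]:
    """Return (descriptor index, table indicator, RPL) for a 16-bit selector."""
    if not 0 <= selector <= 0xFFFF:
        raise ValueError("selector must be a 16-bit value")
    return selector >> 3, (selector >> 2) & 1, selector & 3


def to_linear(base: int, limit: int, offset: int, size: int = 1) -> int:
    """Translate an offset in an expand-up segment to a linear address."""
    # Every byte of the access must lie at or below the segment limit.
    if offset < 0 or offset + size - 1 > limit:
        raise SegmentLimitError(f"offset {offset:#x} exceeds limit {limit:#x}")
    return (base + offset) & 0xFFFFFFFF


print(decode_selector(0x08))  # (1, 0, 0): GDT entry 1, RPL 0
print(decode_selector(0x1B))  # (3, 0, 3): GDT entry 3, RPL 3
print(hex(to_linear(base=0x10000, limit=0xFFFF, offset=0x1234)))  # 0x11234
```

A two-byte access at offset 0xFFFF in the same segment would raise `SegmentLimitError`, because its second byte falls past the limit.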
This approach contrasts sharply with real mode segmentation, where segments are fixed at 64 KB and addressed via a simple left-shift of the segment value by 4 bits to form a base address, lacking hardware-enforced bounds or access controls. In protected mode, segments can vary in size without the 64 KB restriction, enabling more efficient memory utilization, and stack segments support expand-up (growing from low to high addresses) or expand-down (growing from high to low addresses) configurations to accommodate stack operations while respecting the limit. These features introduce robust protection against buffer overflows and unauthorized access, fundamental to the security model of protected mode.[12]
The Intel 80286 implemented protected mode segmentation with 16-bit offsets, restricting each segment to a maximum size of 64 KB, which aligned offsets and addresses within this limit for compatibility with earlier designs but limited scalability for larger programs. The Intel 80386 enhanced this by introducing 32-bit offsets, allowing segments up to 4 GB in size, and added a granularity bit in the descriptor that, when set, scales the limit to units of 4 KB, permitting segment sizes from 4 KB to 4 GB in 4 KB increments for finer control over memory allocation. These improvements made segmentation more practical for multitasking environments and larger address spaces.[12]
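The effect of the granularity bit on the 20-bit limit field can be sketched as follows. The scaling rule used here (shift left by 12 and fill the low 12 bits with ones) is the usual interpretation of a page-granular limit; `effective_limit` is a hypothetical helper name.

```python
def effective_limit(limit20: int, granularity: bool) -> int:
    """Return the highest valid offset for a 20-bit limit field."""
    if not 0 <= limit20 <= 0xFFFFF:
        raise ValueError("limit field is 20 bits wide")
    if granularity:
        # Units of 4 KB: the low 12 bits of the offset are never checked.
        return (limit20 << 12) | 0xFFF
    return limit20  # units of one byte


print(hex(effective_limit(0xFFFFF, granularity=False)))  # 0xfffff     (1 MB segment)
print(hex(effective_limit(0x00000, granularity=True)))   # 0xfff       (4 KB segment)
print(hex(effective_limit(0xFFFFF, granularity=True)))   # 0xffffffff  (4 GB segment)
```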
Segmentation in protected mode supports various segment types essential for program execution and system operation: code segments hold executable instructions with attributes controlling conformity and readability; data segments manage read-write data areas; and stack segments handle push and pop operations with directionality flags. System segments, such as the Task State Segment (TSS), facilitate task management by storing processor state for context switching in multitasking scenarios. This segmented model, while flexible, is often combined with other mechanisms in modern operating systems to address memory needs comprehensively.[12]
Paging
The Intel 80386 introduced paging as a key enhancement to its protected mode memory management, providing a mechanism for virtual-to-physical address translation that was absent in the 80286, thereby enabling true virtual memory support for multitasking operating systems.[13] This paging unit operates on 32-bit linear addresses produced by segmentation, dividing the address space into fixed-size pages of 4 KB each; larger 4 MB pages, selected through a page directory entry flag, arrived later with the Pentium's Page Size Extension.[13]
Paging employs a two-level hierarchical structure: a Page Directory, which holds up to 1024 entries and serves as the top-level table, points to Page Tables; each Page Table also contains up to 1024 entries that map to individual 4 KB physical pages.[13] The base physical address of the Page Directory is stored in the CR3 control register, allowing dynamic switching of address spaces during task changes.[13] To enable paging, the PG bit (bit 31) in the CR0 register must be set after protected mode is activated (PE bit in CR0), which initiates address translation for all subsequent memory accesses.[13]
A 32-bit linear address is split into three fields for translation: the upper 10 bits (31–22) index the Page Directory entry, the middle 10 bits (21–12) index the corresponding Page Table entry, and the lower 12 bits (11–0) provide the offset within the 4 KB page.[13] The translation process begins by using CR3 to locate the Page Directory, then reads the Page Table base address from the upper 20 bits of the indexed directory entry (the low 12 bits hold flags, so the base is always aligned to a 4 KB boundary), reads the physical page frame base from the upper 20 bits of the indexed table entry in the same way, and finally adds the offset to yield the physical address.[13] For example, given a linear address of 0x402003, the directory index is 0x001 (bits 31–22), the table index is 0x002 (bits 21–12), and the offset is 0x003 (bits 11–0); the physical address is then computed as ((page frame number from the table entry) << 12) + offset.[13]
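The two-level walk can be sketched with a dictionary standing in for physical memory. The table addresses and entry values below (directory at 0x1000, table at 0x2000, frame at 0x5000) are made up for illustration, and `split_linear`, `translate`, and `PageFault` are hypothetical names; only the Present bit is checked.

```python
from typing import Dict, Tuple

PRESENT = 0x1
FRAME_MASK = 0xFFFFF000


class PageFault(Exception):
    """Stands in for the #PF exception raised on a not-present entry."""


def split_linear(linear: int) -> Tuple[int, int, int]:
    """Return (directory index, table index, page offset)."""
    return (linear >> 22) & 0x3FF, (linear >> 12) & 0x3FF, linear & 0xFFF


def translate(linear: int, cr3: int, memory: Dict[int, int]) -> int:
    """Walk the page directory and page table held in `memory`."""
    dir_index, table_index, offset = split_linear(linear)
    # Each entry is 4 bytes wide, so the index is scaled by 4.
    pde = memory.get((cr3 & FRAME_MASK) + 4 * dir_index, 0)
    if not pde & PRESENT:
        raise PageFault(f"directory entry {dir_index} not present")
    pte = memory.get((pde & FRAME_MASK) + 4 * table_index, 0)
    if not pte & PRESENT:
        raise PageFault(f"table entry {table_index} not present")
    return (pte & FRAME_MASK) | offset


memory = {
    0x1004: 0x2003,  # directory entry 1 -> page table at 0x2000, present, writable
    0x2008: 0x5003,  # table entry 2 -> page frame at 0x5000, present, writable
}
print(split_linear(0x402003))                             # (1, 2, 3)
print(hex(translate(0x402003, cr3=0x1000, memory=memory)))  # 0x5003
```

Any linear address whose directory or table entry is missing from `memory` raises `PageFault`, mirroring how the processor hands not-present pages to the operating system.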
Key features include demand paging, where pages are loaded into physical memory only upon access; this is controlled by the Present bit (P, bit 0) in page directory and table entries—if unset, a page fault exception (#PF) is generated to allow the operating system to handle paging in or out.[13] Page-level protections are enforced via bits in the entries: the Read/Write bit (R/W, bit 1) distinguishes read-only from read-write access, and the User/Supervisor bit (U/S, bit 2) restricts access based on the current privilege level (CPL), with supervisor-mode (U/S=0) pages inaccessible from user mode (CPL=3); code at CPL 0, 1, or 2 counts as supervisor for paging purposes.[13] Effective permissions combine attributes from both the directory and table entries, providing granular control that complements segmentation.[13] Performance is optimized by a Translation Lookaside Buffer (TLB), an on-chip four-way set-associative cache holding 32 recent translations (8 sets of 4 entries), which is flushed on CR3 reloads or task switches to ensure coherence.[13]
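A simplified model of the page-level checks is sketched below. It assumes that user-mode access requires the relevant bit to be set in both the directory entry and the table entry, and that supervisor code on the 80386 may read and write any present page (write protection for supervisor accesses arrived with later processors). `page_access_allowed` is a hypothetical helper, and user mode is taken to be CPL 3.

```python
P_BIT = 0x1    # Present
RW_BIT = 0x2   # Read/Write
US_BIT = 0x4   # User/Supervisor


def page_access_allowed(pde: int, pte: int, cpl: int, write: bool) -> bool:
    """Return True if the access would pass the page-level checks."""
    if cpl not in (0, 1, 2, 3):
        raise ValueError("CPL ranges from 0 to 3")
    both = pde & pte  # a bit survives only if set at both levels
    if not both & P_BIT:
        return False  # not present: the processor would raise #PF
    if cpl < 3:
        return True   # supervisor access (assumption: no WP bit, as on the 80386)
    if not both & US_BIT:
        return False  # supervisor-only page touched from user mode
    return bool(both & RW_BIT) or not write


user_readonly = (0x7, 0x5)    # directory: P|RW|US, table: P|US
supervisor_page = (0x7, 0x3)  # directory: P|RW|US, table: P|RW
print(page_access_allowed(*user_readonly, cpl=3, write=False))    # True
print(page_access_allowed(*user_readonly, cpl=3, write=True))     # False
print(page_access_allowed(*supervisor_page, cpl=3, write=False))  # False
print(page_access_allowed(*supervisor_page, cpl=0, write=True))   # True
```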
While the 80386 paging supports up to 4 GB of physical memory with 32-bit addresses, later processors introduced Physical Address Extension (PAE) starting with the Pentium Pro to handle more than 4 GB by expanding to 36-bit physical addresses and adding a third level (page directory pointer table) to the hierarchy.[17]
Segment Descriptors
Segment descriptors in the Intel 80386 protected mode are 8-byte entries stored in the global descriptor table (GDT) or local descriptor tables (LDT), defining the base address, size, and access attributes of memory segments.[13] Each descriptor consists of a 32-bit base address, a 20-bit limit field, and various control bits for protection and usage.[13] The structure is as follows:
| Byte | Bits | Field Name | Description |
|---|---|---|---|
| 0 | 0-7 | Limit (bits 0-7) | Lower 8 bits of the 20-bit segment limit; when the granularity bit (G) is 0, units are bytes (maximum 1 MB - 1); when G=1, units are 4 KB pages (maximum 4 GB - 1). |
| 1 | 0-7 | Limit (bits 8-15) | Upper 8 bits of the lower 16 bits of the segment limit. |
| 2 | 0-7 | Base (bits 0-7) | Lower 8 bits of the 32-bit base address, specifying the starting linear address of the segment. |
| 3 | 0-7 | Base (bits 8-15) | Next 8 bits of the 32-bit base address. |
| 4 | 0-7 | Base (bits 16-23) | Middle 8 bits of the 32-bit base address. |
| 5 | 0-3 | Type | 4-bit type field defining the segment category and access rights (bits 40-43); combined with S bit for effective 5-bit type interpretation (detailed below). |
| 5 | 4 | S (System) | System flag: 0 for system segments, 1 for code/data segments (bit 44). |
| 5 | 5-6 | DPL (Descriptor Privilege Level) | Privilege level (0-3) for the descriptor, used in protection checks; 0 is the most privileged (bits 45-46). |
| 5 | 7 | P (Present) | Present bit: 1 indicates the segment is present in memory; 0 indicates it is not (bit 47). |
| 6 | 0-3 | Limit (bits 16-19) | Upper 4 bits of the 20-bit segment limit. |
| 6 | 4 | AVL (Available) | Available bit for use by system software (bit 52). |
| 6 | 5 | Reserved | Must be 0 (bit 53). |
| 6 | 6 | D/B (Default operation size) | Default size: 0 for 16-bit mode, 1 for 32-bit mode (bit 54). |
| 6 | 7 | G (Granularity) | Granularity bit: 0 for byte granularity, 1 for 4 KB granularity (bit 55). |
| 7 | 0-7 | Base (bits 24-31) | Upper 8 bits of the 32-bit base address. |
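The byte layout in the table can be made concrete by packing a descriptor by hand. The following is a minimal sketch; `pack_descriptor` and its parameter names are illustrative assumptions, with `access` standing for the whole of byte 5 (P, DPL, S, Type) and `flags` for the upper four bits of byte 6 (G, D/B, reserved, AVL).

```python
import struct


def pack_descriptor(base, limit, access, flags):
    """Return the 8 descriptor bytes in memory order (little-endian)."""
    if not 0 <= base <= 0xFFFFFFFF:
        raise ValueError("base must fit in 32 bits")
    if not 0 <= limit <= 0xFFFFF:
        raise ValueError("limit must fit in 20 bits")
    return struct.pack(
        "<HHBBBB",
        limit & 0xFFFF,                                # bytes 0-1: limit bits 0-15
        base & 0xFFFF,                                 # bytes 2-3: base bits 0-15
        (base >> 16) & 0xFF,                           # byte 4: base bits 16-23
        access & 0xFF,                                 # byte 5: P, DPL, S, Type
        ((flags & 0xF) << 4) | ((limit >> 16) & 0xF),  # byte 6: G, D/B, 0, AVL + limit 16-19
        (base >> 24) & 0xFF,                           # byte 7: base bits 24-31
    )


# Arbitrary example values chosen so each field is easy to spot in the output.
raw = pack_descriptor(base=0x12345678, limit=0xABCDE, access=0x92, flags=0xC)
```

Here `raw.hex()` is `debc78563492ca12`, which shows how the base and limit are scattered across non-adjacent bytes.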
The type field (combined with S for interpretation) distinguishes between code segments, data segments, and system segments.[13] Code segments (type values 8h-Fh when S=1) support execution, with bit 0 recording whether the segment has been accessed, bit 1 indicating readability, and bit 2 marking the segment as conforming (e.g., type 8h for execute-only non-conforming, 0Ah for execute/read non-conforming).[13] Data segments (type values 0h-7h when S=1) allow data access, with bit 0 as the accessed flag, bit 1 for writability, and bit 2 for expansion direction (e.g., type 0h for read-only expand-up, 2h for read/write expand-up, or 6h for read/write expand-down for stacks).[13] System segments (S=0) include types for task state segments (TSS, 9h available or 0Bh busy), local descriptor tables (LDT, 2h), and gates.[13]
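A short decoder shows how the S bit and the four type bits are read together. It is a sketch covering only the system types named in this section; the function name and the wording of the returned strings are assumptions.

```python
SYSTEM_TYPES = {
    0x2: "LDT",
    0x9: "available TSS",
    0xB: "busy TSS",
    0xC: "call gate",
}


def describe_segment(access):
    """Interpret the S bit (bit 4) and type field (bits 0-3) of descriptor byte 5."""
    seg_type = access & 0xF
    if not (access >> 4) & 1:  # S=0: system segment or gate
        return "system: " + SYSTEM_TYPES.get(seg_type, "other type %Xh" % seg_type)
    if seg_type & 0x8:  # bit 3 set: code segment
        parts = ["code: execute/read" if seg_type & 0x2 else "code: execute-only",
                 "conforming" if seg_type & 0x4 else "non-conforming"]
    else:  # bit 3 clear: data segment
        parts = ["data: read/write" if seg_type & 0x2 else "data: read-only",
                 "expand-down" if seg_type & 0x4 else "expand-up"]
    if seg_type & 0x1:
        parts.append("accessed")
    return ", ".join(parts)
```

For instance, the access byte 0x9A decodes as "code: execute/read, non-conforming" and 0x96 as "data: read/write, expand-down".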
Gate descriptors enable controlled inter-privilege-level transfers, such as call gates (type Ch for the 32-bit 80386 form).[13] Their format replaces the base and limit with a 32-bit offset to the target entry point (split across bytes 0-1 and 6-7), a 16-bit target segment selector (bytes 2-3), and a parameter count in bits 0-4 of byte 4 (0-31 doublewords) giving the number of stack parameters to copy from the caller's stack to the target's stack during privilege transitions.[13] The P, DPL, and type fields in byte 5 ensure that the gate is present and enforce the privilege required to invoke it.[13]
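A call gate can be packed in the same manner as a segment descriptor. The sketch below follows the 32-bit call gate layout in Intel's documentation, where the offset occupies bytes 0-1 and 6-7, the selector bytes 2-3, and the parameter count the low five bits of byte 4; the function and parameter names are illustrative assumptions.

```python
import struct

CALL_GATE_32 = 0x0C  # system type for a 32-bit call gate


def pack_call_gate(selector, offset, dpl, param_count, present=True):
    """Return the 8 bytes of a 32-bit call gate in memory order."""
    access = (0x80 if present else 0) | ((dpl & 0x3) << 5) | CALL_GATE_32
    return struct.pack(
        "<HHBBH",
        offset & 0xFFFF,          # bytes 0-1: entry point offset bits 0-15
        selector & 0xFFFF,        # bytes 2-3: target code segment selector
        param_count & 0x1F,       # byte 4: doublewords copied between stacks
        access,                   # byte 5: P, DPL, S=0, type
        (offset >> 16) & 0xFFFF,  # bytes 6-7: entry point offset bits 16-31
    )


# Hypothetical gate: callable from ring 3, targets selector 0x08, copies 2 doublewords.
gate = pack_call_gate(selector=0x0008, offset=0x00123456, dpl=3, param_count=2)
```

With these example values `gate.hex()` is `5634080002ec1200`; the access byte 0xEC encodes P=1, DPL=3, and type Ch.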
Descriptor tables are made known to the processor with LGDT, which loads the base and limit of the GDT into the GDTR, and LLDT, which loads the LDTR with a selector for an LDT descriptor held in the GDT.[13] Individual descriptors are referenced via 16-bit selectors loaded into segment registers: MOV or POP for the data and stack segment registers (DS, ES, FS, GS, SS), and far jumps, calls, or returns for CS; each load caches the descriptor details for address translation.[13]
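A selector's 16 bits break down into a 13-bit table index, a table indicator (TI) bit, and a 2-bit requested privilege level (RPL). The sketch below decodes those fields; the function names are illustrative.

```python
def decode_selector(selector):
    """Split a 16-bit selector into index, table indicator, and RPL."""
    return {
        "index": (selector >> 3) & 0x1FFF,             # bits 3-15: descriptor index
        "table": "LDT" if selector & 0x4 else "GDT",   # bit 2: table indicator
        "rpl": selector & 0x3,                         # bits 0-1: requested privilege level
    }


def descriptor_offset(selector):
    """Byte offset of the referenced 8-byte descriptor within its table."""
    return ((selector >> 3) & 0x1FFF) * 8
```

For example, selector 0x23 refers to GDT entry 4 (byte offset 0x20) with RPL 3.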
Compared to the Intel 80286, the 80386 segment descriptors expand the base address to 32 bits (from 24 bits, enabling 4 GB addressing versus 16 MB) and the limit to 20 bits with granularity support (segments of up to 4 GB, versus 64 KB under the 80286's 16-bit limit).[13] The 80386 also introduces 32-bit offsets in gates and additional flags like D/B and G, absent in the 80286's 16-bit-oriented format.[13]
Compatibility Features
Real Mode Application Support
Protected mode environments, designed primarily for advanced memory management and security, initially posed challenges for running legacy real-mode software, which operates under the simpler 8086 addressing model limited to 1 MB of memory. To bridge this gap, developers introduced DOS extenders, software mechanisms that enable 16-bit real-mode applications to leverage protected-mode features without full rewriting. These extenders temporarily switch the processor into protected mode for the application while maintaining compatibility with the underlying MS-DOS environment.[18]
A key standardization effort was the DOS Protected Mode Interface (DPMI), a specification released in 1989 that defines an API for 16-bit DOS programs to request protected-mode services, such as allocating extended memory beyond 1 MB and managing descriptors for larger address spaces. DPMI allows applications to execute in protected mode via function calls (interrupt 31h), providing access to up to 4 GB of memory on 80386 processors while intercepting and emulating real-mode DOS and BIOS services as needed. This interface became widely adopted for memory-intensive DOS applications, like games and utilities, by encapsulating protected-mode operations behind a real-mode-compatible facade.[19][18]
Another technique, known as big real mode or unreal mode, emerged as an unofficial but practical workaround on 80386 and later processors. By briefly entering protected mode to load segment descriptors with 32-bit base addresses and limits set to the maximum (0xFFFFFFFF), the system can then return to real mode while retaining these extended descriptors, effectively allowing 32-bit addressing and access to the full 4 GB address space without formal protection mechanisms. This mode lacks memory protection, making it vulnerable to crashes from invalid accesses, but it enabled early 32-bit DOS programs to run with minimal modifications under real-mode DOS. Intel's 1986 programmer's reference manual implicitly supported this behavior through descriptor manipulation, though it was not an intended operating mode.[20]
Historically, these techniques saw significant use in Microsoft Windows 3.x, particularly in 386 Enhanced mode, where DPMI served as the interface for running DOS applications alongside Windows sessions. Windows 3.0 formalized DPMI support, allowing DOS programs to request protected-mode execution for memory expansion, while the system managed mode switches to integrate them into the multitasking environment. However, limitations persisted: DOS sessions were time-sliced as isolated virtual machines, whereas Windows applications themselves were scheduled cooperatively, so a single misbehaving Windows program could still stall the rest of the system.[21]
Key challenges in supporting real-mode applications within protected environments include handling interrupts and BIOS calls, which are inherently real-mode constructs incompatible with protected-mode segmentation and privilege checks. DOS extenders address this by implementing mode-switching routines, such as the int86 function family, to temporarily revert to real mode for BIOS interrupts (e.g., int 10h for video services) before returning, incurring performance overhead from context switches and descriptor reloads. Failure to manage these transitions properly could lead to system instability or inaccessible hardware resources.[22]
With the shift to 64-bit operating systems like Windows XP Professional x64 Edition and later, native support for real-mode applications has declined sharply, as long mode (x86-64) omits hardware features like virtual 8086 mode that are essential for running 16-bit real-mode code directly. 64-bit Windows versions explicitly do not support 16-bit processes or components, rendering traditional DOS extenders obsolete and requiring software emulation tools for legacy real-mode software. On 32-bit protected-mode systems, virtual 8086 mode remains the hardware mechanism for running such software.[23]
Virtual 8086 Mode
Virtual 8086 mode, introduced with the Intel 80386 processor, enables the execution of real-mode 8086/8088/80186/80188 software within a protected-mode environment, providing a mechanism for backward compatibility while maintaining memory protection and multitasking capabilities.[13] This mode is activated by setting the VM bit (bit 17) in the EFLAGS register while the processor operates in protected mode (PE bit set in CR0), typically through an IRET instruction executed at ring 0 that restores an EFLAGS image with VM=1, or through a task switch to a Task State Segment (TSS) whose saved EFLAGS has VM=1.[13] Once activated, each V86 task functions as a virtual machine emulating the 8086 processor state, including 16-bit registers, segment-based addressing, and real-mode interrupt handling. The emulation is not limited to the original 8086 instruction set: the additional segment registers (FS and GS) and later instructions such as BOUND, introduced with the 80186, remain available.[13]
Key features of Virtual 8086 mode include the virtualization of sensitive operations: IOPL-sensitive instructions (CLI, STI, PUSHF, POPF, INT n, IRET) raise a general protection fault when IOPL is below 3, I/O instructions (IN, OUT, INS, OUTS) are checked against the I/O permission bitmap in the TSS, and privileged instructions such as HLT always fault; in each case control traps to a ring 0 Virtual 8086 monitor for emulation or direct hardware access, ensuring protected-mode supervision.[13] Interrupts and exceptions in V86 mode vector through the protected-mode Interrupt Descriptor Table (IDT) to privilege level 0 handlers, which can reflect them back to the 8086 code or handle them natively, supporting seamless integration with the host operating system.[13] Multiple V86 instances are supported on 80386 and later processors by enabling paging (PG bit in CR0), which allows distinct linear address spaces for each task, preventing interference while emulating the 8086's 20-bit addressing model (up to 1 MB).[13] Paging maps V86 linear addresses, which are formed by shifting a 16-bit segment value left by 4 bits and adding a 16-bit offset, onto physical memory, and it also lets the monitor emulate the 8086's address wraparound at 1 MB when software depends on it.[13]
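The V86 address calculation is simple enough to state directly. In the sketch below, the `wrap_at_1mb` flag models the 8086's 20-bit wraparound, which on real hardware a monitor would arrange through page mappings rather than arithmetic; the function name and the flag are illustrative assumptions.

```python
def v86_linear(segment, offset, wrap_at_1mb=False):
    """Form a V86 linear address from a 16-bit segment value and a 16-bit offset."""
    linear = ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)
    if wrap_at_1mb:
        linear &= 0xFFFFF  # keep only 20 bits, as an 8086 would
    return linear
```

For example, segment 0xB800 with offset 0x0010 yields 0xB8010, while 0xFFFF:0x0010 yields 0x100000 without wraparound and 0x00000 with it. The highest reachable address, 0xFFFF:0xFFFF, is 0x10FFEF, which is the source of the nearly 64 KB region above 1 MB.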
In practice, Virtual 8086 mode facilitated the integration of legacy real-mode applications into protected-mode operating systems, such as in Microsoft Windows 3.1, where it powered "DOS boxes" by running MS-DOS programs in isolated virtual machines under 386 Enhanced mode.[24][25] However, the mode imposes limitations, including a per-task address space capped at 1 MB (plus nearly 64 KB above that boundary when address wraparound is not emulated), reliance on paging for isolation, which adds complexity, and performance overhead from frequent supervisor traps for I/O and interrupts, as all V86 tasks execute at current privilege level (CPL) 3 without unrestricted access to hardware.[13] Compatibility differences from true 8086 execution, such as altered instruction timings and exception behaviors (e.g., added page faults), may require OS-level adjustments.[13]
For instance, setting the I/O privilege level (IOPL) to 3 in the EFLAGS register permits V86 tasks running at ring 3 to execute IOPL-sensitive instructions such as CLI, STI, and INT n without trapping to the monitor, while lower IOPL values enforce stricter supervision; direct port access is granted separately, port by port, through the I/O permission bitmap in the TSS.[13]
Multitasking Support
Task Structures
In protected mode, multitasking is supported through hardware data structures that manage task states and facilitate switches between them. The primary structure is the Task State Segment (TSS), a system segment that holds the complete context of a task, including processor registers and stack information, allowing the CPU to save and restore states efficiently during switches.[26]
Introduced with the Intel 80286, the TSS is a fixed-format segment with a minimum size of 44 bytes, containing fields such as the back link to the previous TSS, segment registers (CS, DS, ES, SS), general-purpose registers (AX, BX, CX, DX, SI, DI, BP, SP), the instruction pointer (IP), flags register, and initial stack pointers and selectors for each privilege level (0 through 2). In the 80286, hardware task switching is initiated by a CALL or JMP instruction targeting a task selector, which references a TSS descriptor in the Global Descriptor Table (GDT); this loads the new task's state from its TSS, saves the current task's state, and updates the processor's registers accordingly, enabling basic multitasking without software intervention for context management.[3]
The Intel 80386 enhanced the TSS to a minimum size of 104 bytes to accommodate 32-bit operations and additional features, incorporating expanded fields like 32-bit versions of the general-purpose registers (EAX, EBX, etc.), the extended instruction pointer (EIP), extended flags (EFLAGS), additional segment registers (FS, GS), the page directory base register (CR3) for per-task paging, the Local Descriptor Table (LDT) selector, and stack pointers and selectors for privilege levels 0 through 2 (ESP0-ESP2, SS0-SS2) to support ring transitions. An optional I/O permission bitmap at the end of the TSS allows fine-grained control over I/O port access for the task, with the bitmap's offset specified in the TSS itself. The Task Register (TR) holds the 16-bit selector for the current task's TSS descriptor, which must reside in the GDT; it is loaded using the LTR instruction (restricted to privilege level 0) and automatically updated during task switches to point to the active TSS.[26]
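The 104-byte layout can be checked mechanically by describing it as a C-style structure. The sketch below uses Python's `ctypes` for that purpose; the field names are illustrative, each 16-bit selector is followed by a 16-bit reserved word as in Intel's diagram, and `port_allowed` is a simplified, single-port model of the I/O permission bitmap in which a clear bit grants access.

```python
import ctypes

u16, u32 = ctypes.c_uint16, ctypes.c_uint32


def _sel(name):
    """A 16-bit selector followed by its 16-bit reserved padding."""
    return [(name, u16), ("_rsv_" + name, u16)]


class Tss32(ctypes.LittleEndianStructure):
    _fields_ = (
        _sel("back_link")
        + [("esp0", u32)] + _sel("ss0")
        + [("esp1", u32)] + _sel("ss1")
        + [("esp2", u32)] + _sel("ss2")
        + [(n, u32) for n in ("cr3", "eip", "eflags", "eax", "ecx", "edx",
                              "ebx", "esp", "ebp", "esi", "edi")]
        + [f for n in ("es", "cs", "ss", "ds", "fs", "gs", "ldt") for f in _sel(n)]
        + [("trap", u16), ("io_map_base", u16)]
    )


def port_allowed(io_bitmap, port):
    """True if the bitmap bit for this port is clear; ports past the end are denied."""
    byte_index, bit = divmod(port, 8)
    return byte_index < len(io_bitmap) and not (io_bitmap[byte_index] >> bit) & 1


tss_size = ctypes.sizeof(Tss32)
```

Under these assumptions `tss_size` is 104, with CR3 at offset 0x1C, the LDT selector at 0x60, and the I/O map base at 0x66.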
Task nesting is tracked by the Nested Task (NT) flag (bit 14 of FLAGS/EFLAGS), which, like the TSS itself, dates from the 80286: it is set during CALL-initiated switches to indicate nesting and enable return to the parent task via IRET, while JMP switches leave the new task unlinked from its predecessor. In addition, a busy bit in the TSS descriptor (changing the 80386 type from 9 for available to B for busy) prevents reentrancy by triggering a general protection fault if a JMP or CALL references a busy TSS. For security, the Descriptor Privilege Level (DPL) in the TSS descriptor (bits 5-6 of the access byte) specifies the least privileged level allowed to reference the TSS: a task switch is permitted only if the current privilege level (CPL) is numerically less than or equal to the TSS DPL, thus enforcing inter-task access control via privilege levels.[26]
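The busy and privilege checks applied to a JMP or CALL that names a TSS descriptor can be summarized as follows. This is a simplified sketch with hypothetical function names; it also includes the selector's requested privilege level (RPL), which the processor checks against the DPL alongside the CPL, and it ignores the presence and limit checks performed on real hardware.

```python
TSS_AVAILABLE = 0x9  # 80386 available TSS
TSS_BUSY = 0xB       # 80386 busy TSS


def can_switch(cpl, rpl, tss_type, tss_dpl):
    """Model the checks for a JMP or CALL to a TSS descriptor."""
    if tss_type == TSS_BUSY:
        return False  # would raise a general protection fault
    return max(cpl, rpl) <= tss_dpl  # both CPL and RPL must be at least as privileged


def mark_busy(tss_type):
    """Set the busy bit (bit 1 of the type field)."""
    return tss_type | 0x2
```

In this model, ring 0 code can switch to an available TSS with DPL 0, ring 3 code cannot, and `mark_busy(0x9)` gives 0xB.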
Context Switching
In protected mode, context switching enables multitasking by saving the processor state of the current task and restoring that of another, leveraging task state segments (TSS) to hold the complete execution context including registers, segment selectors, and stack pointers. This mechanism supports both inter-task and nested execution flows on x86 processors.[27]
Hardware context switching is performed using JMP, CALL, or IRET instructions that target a TSS selector in the global descriptor table (GDT). Upon execution, the processor automatically saves the current task's state—such as general-purpose registers, EFLAGS, EIP, and segment registers—into the old TSS, then loads the new task's state from its TSS into the processor registers. The task register (TR) is updated with the new TSS selector, and the busy flag in the TSS descriptor is atomically set using LOCK semantics to prevent concurrent access. This process ensures isolation between tasks without software intervention.[27]
Software context switching, in contrast, is managed entirely by the operating system kernel without hardware TSS involvement, offering greater control and efficiency. The kernel explicitly saves the current task's registers and state to a per-task data structure using instructions like PUSH, POP, or MOV, then restores the next task's state similarly; floating-point and vector unit states are handled via FXSAVE/FXRSTOR or the XSAVE family of instructions. This method is typically invoked during scheduler decisions, such as timer interrupts for preemptive multitasking.[27]
Interrupt-driven context switching is triggered when an interrupt or exception vectors through a task gate in the interrupt descriptor table (IDT), initiating a hardware task switch to a dedicated handler task. The processor saves the interrupted task's state to its TSS, loads the handler's state, and provides a fresh stack for execution, which enhances isolation for event handling; interrupts must often be disabled during the switch to avoid reentrancy. This approach integrates seamlessly with the IDT for asynchronous operations.[27]
Task switching supports nesting via the back link field in the TSS, which stores the selector of the previous task's TSS during switches initiated by CALL, interrupts, or exceptions. When the nested task flag (NT) is set in EFLAGS, an IRET instruction uses this back link to return control to the prior task, automatically reloading its saved state and clearing the NT flag in the state saved for the task being left. This linkage facilitates chained task execution without manual stack management.[13]
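The back link and NT flag behave like a singly linked chain of callers. The toy model below captures only that linkage; the `Task` class and both functions are hypothetical and omit all register state, busy-bit handling, and privilege checks.

```python
class Task:
    def __init__(self, name):
        self.name = name
        self.back_link = None  # stands in for the TSS back link selector
        self.nt = False        # stands in for the NT flag in the task's EFLAGS


def call_task(current, target):
    """CALL-style switch: link the new task to its caller and set NT."""
    target.back_link = current
    target.nt = True
    return target


def iret_task(current):
    """IRET with NT set: return to the task named by the back link."""
    if not current.nt:
        raise RuntimeError("NT clear: IRET performs an ordinary interrupt return")
    previous = current.back_link
    current.nt = False  # NT is cleared in the state saved for the outgoing task
    return previous


a, b, c = Task("a"), Task("b"), Task("c")
running = call_task(call_task(a, b), c)  # a calls b, b calls c
running = iret_task(running)             # back to b
running = iret_task(running)             # back to a
```

After the two returns, `running` is task `a` again and none of the three tasks has NT set.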
Hardware task switching incurs higher latency compared to software methods due to TSS memory accesses, which often cause cache misses and TLB invalidations, making it less suitable for frequent switches in performance-critical systems. Consequently, modern operating systems favor software approaches for their speed and adaptability, reserving hardware mechanisms for specific isolation needs.[27]
Operating Systems
Early Adopters
Among the earliest commercial operating systems to adopt protected mode were Microsoft Xenix, around 1984, and Coherent, in the mid-1980s, both of which used the Intel 80286's protected mode to provide Unix-like environments on PC-compatible hardware.
One of the first widely distributed commercial operating systems built around protected mode was OS/2 1.x, jointly developed by IBM and Microsoft and released in December 1987. It used the Intel 80286's protected mode for multitasking, enabling pre-emptive scheduling, multi-threading, and memory protection through a segmented address space, and it required at least a 286 processor. This allowed OS/2 to support up to 16 MB of RAM and virtual memory, a marked departure from the limitations of real-mode MS-DOS.[28]
In 1990, Microsoft released Windows 3.0, which incorporated protected mode in its standard mode for 80286-based systems, utilizing segment descriptors to enforce memory protection attributes such as read-only access and prevent errant pointers from corrupting other programs. This segmented approach addressed the 1 MB real-mode barrier, supporting up to 16 MB of physical memory, though it retained 16-bit limitations compared to the enhanced mode on 80386 processors. Windows 3.0's protected mode features facilitated better multitasking and application isolation, contributing to its commercial success with millions of installations.[24]
Among Unix variants, 386BSD emerged as an early adopter, with its project established in the summer of 1989 to port Berkeley Software Distribution (BSD) to the Intel 80386 platform. It leveraged the 80386's protected mode for 32-bit addressing in a flat memory model, incorporating two-level paging with 4 KB pages to manage virtual memory efficiently and support a 4 GB address space per process. While initial releases minimized segmentation for BSD compatibility, future plans included Virtual 8086 mode support to enable real-mode application execution within protected environments.[29]
Early protected mode adoption encountered significant challenges, including the limited availability and high cost of 80286 and 80386 hardware in the late 1980s, exacerbated by a 1988 RAM price crisis that made even 3 MB systems expensive for typical users. To bridge the gap for DOS-based applications, tools like Phar Lap's 286/DOS Extender emerged around 1989, allowing protected mode execution on 286 systems by proxying DOS calls and providing access to extended memory, though mode-switching overhead limited its use in performance-sensitive software.[30][31]
A pivotal development came in 1993 with Microsoft's release of Windows NT 3.1, which mandated an 80386 processor to deliver full 32-bit protected mode functionality, including advanced paging and portability across architectures, as part of a strategic shift from 16-bit systems to enterprise-grade computing. This requirement reflected Microsoft's focus on leveraging the 80386's capabilities for robust multitasking and security, distinguishing NT from consumer-oriented Windows versions.[32]
Modern Implementations
In contemporary x86 architectures, all x86-64 CPUs initialize in real mode upon power-on for backward compatibility with legacy BIOS firmware, requiring the bootloader to transition the processor to protected mode before enabling long mode for 64-bit operation.[33][34] This sequence ensures that protected mode serves as the foundational execution environment, even in 64-bit systems, where long mode extends its capabilities with a larger address space while retaining core protection mechanisms like privilege rings and memory isolation.
Modern operating systems predominantly adopt a flat memory model in protected mode, configuring segment registers to span the entire address space with minimal segmentation, thereby simplifying memory management and relying primarily on paging for isolation and protection. This approach persists in 64-bit long mode, where protected mode features remain essential for legacy support, such as running 32-bit applications in compatibility submode, underscoring the enduring role of protected mode as the baseline for x86 execution beyond real mode.[35]
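A flat model comes down to a handful of descriptors with base 0 and the maximum page-granular limit. The sketch below builds such descriptors as 64-bit values; the function name is illustrative, and the access bytes shown are the values conventionally used for ring 0 code (0x9A) and ring 0 data (0x92), which are assumptions of the example rather than requirements of the architecture.

```python
def flat_descriptor(access):
    """Build a 64-bit descriptor with base 0, limit 0xFFFFF, G=1, and D/B=1."""
    base, limit, flags = 0x00000000, 0xFFFFF, 0xC  # flags: G=1, D/B=1, AVL=0
    value = limit & 0xFFFF            # bits 0-15: limit bits 0-15
    value |= (base & 0xFFFFFF) << 16  # bits 16-39: base bits 0-23
    value |= (access & 0xFF) << 40    # bits 40-47: P, DPL, S, Type
    value |= (limit >> 16) << 48      # bits 48-51: limit bits 16-19
    value |= flags << 52              # bits 52-55: AVL, reserved, D/B, G
    value |= (base >> 24) << 56       # bits 56-63: base bits 24-31
    return value


kernel_code = flat_descriptor(0x9A)  # present, DPL 0, code, execute/read
kernel_data = flat_descriptor(0x92)  # present, DPL 0, data, read/write
span_bytes = (0xFFFFF + 1) * 4096    # limit is counted in 4 KB units when G=1
```

These evaluate to 0x00CF9A000000FFFF and 0x00CF92000000FFFF, and `span_bytes` equals 4 GB, so every offset maps straight onto the same linear address and protection is left to paging.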
The Windows NT kernel, powering modern Windows versions, relies on the protection mechanisms of protected mode, using a flat segmentation model augmented by paging to enforce strict separation between user-mode (ring 3) and kernel-mode (ring 0) address spaces, preventing user applications from directly accessing kernel memory or hardware. For legacy 16-bit application compatibility, 32-bit editions of Windows employ the NT Virtual DOS Machine (NTVDM) subsystem, which leverages Virtual 8086 mode within protected mode to emulate a DOS environment without compromising overall system isolation; 64-bit editions omit it because Virtual 8086 mode is unavailable in long mode.[36]
In Linux, the kernel runs 32-bit processes on 64-bit systems via long mode's compatibility submode, in which a code segment descriptor with its L (long) bit clear and D bit set designates 32-bit execution while protected mode's privilege levels and paging continue to provide process isolation.[37] This setup allows seamless execution of legacy IA-32 binaries under the 64-bit kernel, building directly on protected mode foundations extended by long mode's 64-bit registers and addressing.
macOS, based on the XNU kernel, minimizes reliance on protected mode segmentation in favor of paging for virtual memory management, implementing a flat model where page tables handle user-kernel separation and protection across its hybrid Unix foundation.[38] To enhance kernel security, macOS incorporates Intel's Supervisor Mode Execution Prevention (SMEP, introduced in 2012) and Supervisor Mode Access Prevention (SMAP, introduced in 2013), which prevent the kernel from executing or accessing user-mode memory pages, thereby mitigating common exploit techniques like code injection and data leaks.[39]