Doom engine
The Doom engine, also known as id Tech 1, is a pioneering game engine developed primarily by programmer John Carmack at id Software for the 1993 first-person shooter video game Doom.[1] It introduced efficient 2.5D rendering techniques, including binary space partitioning (BSP) trees to organize level geometry and determine visibility, enabling fast, texture-mapped visuals on contemporary PC hardware without full 3D polygons.[2] The engine separates game logic from assets stored in WAD files, facilitating modding and ports, and includes components for multiplayer networking over IPX and a sound system with positional audio.[1] Originally compiled for DOS using Watcom C, it powered Doom (1993), Doom II: Hell on Earth (1994), and licensed titles like Heretic (1994), Hexen: Beyond Heretic (1995), and Strife (1996), establishing foundational standards for the FPS genre with its speed, scalability, and shareware distribution model.[1] Its source code was publicly released on December 23, 1997, under a non-commercial license by Carmack, who noted its outdated aspects like fixed-point math and polar clipping while encouraging community enhancements for features such as slopes and transparency; it was later relicensed under the GNU GPL in 1999,[3] spawning hundreds of source ports for modern platforms.[4] Despite limitations like no true vertical aiming or overhanging geometry due to its 2D map structure, the engine's innovations in real-time rendering and level design influenced subsequent id Tech iterations and the broader industry.[5]History and Development
Origins and Creation
The Doom engine originated in late 1992 at id Software, when lead programmer John Carmack conceived it as a significant advancement over the raycasting technology used in the company's previous title, Wolfenstein 3D.[6] Carmack's vision was driven by the need to address Wolfenstein 3D's restrictive flat-world design, which limited environments to uniform floor and ceiling heights without variations or multi-level structures.[6] This evolution aimed to enable pseudo-3D spaces with dynamic height differences, textured ceilings, and more immersive level geometries, transforming first-person shooter gameplay into a more architecturally complex experience.[6] Initial prototype development began shortly after Wolfenstein 3D's release, with Carmack focusing on optimizing performance for the era's hardware.[7] The engine was written primarily in C, with performance-critical sections in assembly language to leverage the speed of Intel 386 processors, and compiled using the Watcom C compiler, targeting MS-DOS as the primary platform to ensure broad accessibility on consumer PCs.[7] This low-level approach allowed for efficient rendering of the ambitious features, building on techniques Carmack had refined during porting efforts for Wolfenstein 3D to other systems like the SNES.[6] Key milestones marked steady progress through 1993, culminating in the core engine's completion by late that year, just ahead of Doom's release on December 10.[6] During this phase, the engine integrated with early level editing tools—precursors to modern Doom builders—enabling designers to construct and iterate on multi-height sectors efficiently.[7] To enhance rendering efficiency amid growing level complexity, Carmack later introduced binary space partitioning during development.[6]Key Developers and Innovations
The Doom engine was primarily architected by lead programmer John Carmack at id Software, who handled the core technical implementation, including the rendering and spatial partitioning systems.[7] Collaborating closely were project leader and designer John Romero, who influenced level integration and gameplay flow, and graphic artist Adrian Carmack (no relation to John), who focused on integrating visual assets into the engine's rendering pipeline.[7] This division of labor allowed for rapid iteration, with Carmack's programming expertise complemented by the artistic and design contributions that ensured seamless asset loading and visual coherence. A major innovation was the first widespread adoption of binary space partitioning (BSP) trees in a video game, pioneered by John Carmack to efficiently partition static 3D worlds and resolve visibility during rendering.[2] This offline preprocessing technique divided levels into convex subspaces, enabling fast traversal and front-to-back drawing without a full z-buffer, which dramatically improved performance on 1993 hardware.[2] Complementing this, Carmack implemented column-based texture mapping for walls and sprites, storing assets vertically to match the engine's per-column rendering approach, which minimized memory access and maximized cache efficiency on era-specific CPUs lacking floating-point units.[8] Similarly, artist Kevin Cloud worked on asset pipelines alongside Adrian Carmack, creating and optimizing textures, sprites, and environmental elements to fit the engine's constraints, ensuring high-quality visuals without compromising speed.[9] Another breakthrough was the introduction of variable height sectors, allowing floors and ceilings to differ in elevation within the same level, which enabled multi-story environments like overhanging balconies and staircases—a leap beyond the uniform-height raycasting of predecessors like Wolfenstein 3D.[10] This feature, implemented by Carmack, supported vertical gameplay tactics while maintaining the engine's 2.5D architecture of horizontal flats and vertical walls.[10]Source Code Release and Licensing
On December 23, 1997, id Software released the source code for the Linux version of the Doom engine (version 1.10) under the DOOM Source License, a restrictive non-commercial agreement that permitted non-profit use to encourage community-driven ports and modifications while protecting commercial interests.[4] This release, spearheaded by John Carmack, provided developers with access to the engine's core without the proprietary game data, explicitly requiring users to possess legitimate Doom binaries for compatibility.[4] The initial impact was immediate and widespread, enabling the rapid development of source ports that extended the engine to new platforms while maintaining binary compatibility with original Doom WAD files. For instance, DOSDoom emerged within a day of the release as the first DOS-targeted port, followed by WinDoom, which adapted the code for Windows environments with features like TCP/IP networking.[11] These early ports preserved the engine's performance on legacy hardware and facilitated cross-platform play, ensuring Doom's longevity beyond its original DOS ecosystem.[12] Subsequent licensing evolutions broadened the code's accessibility; on October 3, 1999, John Carmack granted permission for re-release under the GNU General Public License version 2 (GPLv2) or later, transitioning from the initial non-profit restrictions to full open-source terms that allowed derivative works and commercial redistribution of modifications.[13] This change spurred advanced derivatives, such as ZDoom, initially released on March 6, 1998, which introduced enhancements like dynamic lighting and improved scripting under the GPLv2.[14] Legally, id Software retained copyright over the proprietary assets, including WAD files containing levels, textures, sounds, and sprites, which remained separate from the open-sourced engine code and required separate licensing for use.[15] This distinction prevented unauthorized distribution of complete games while empowering the community to innovate on the engine itself, with modern ownership under ZeniMax Media (via Bethesda Softworks) upholding these separations.[16]Core Architecture
Game World Fundamentals
The Doom engine utilizes a pseudo-3D architecture, often described as 2.5D, which constructs immersive environments from fundamentally 2D maps augmented with height attributes rather than employing full 3D geometry. This approach allows for efficient rendering on 1990s hardware by projecting flat sectors into a perspective view, simulating depth through variable heights for floors and ceilings while maintaining all structural elements in a horizontal plane. Unlike contemporary true 3D engines, it eschews complex polygonal models or arbitrary surface orientations, prioritizing speed and simplicity in level design and traversal.[8] The coordinate system operates on a fixed grid in the X-Y plane, where vertices defining the map's layout are placed at integer positions ranging from -32,768 to 32,767 units. These coordinates form the backbone of the 2D floor plan, with Z-axis values introduced solely for vertical offsets in sectors and movable entities, enabling stacked rooms and elevation changes without intersecting geometry. Player and object positions are computed using 32-bit fixed-point arithmetic (16.16 format), providing sub-unit precision for smooth movement and collision detection within this grid-based framework.[8] World scale in the map editor aligns directly with integer map units, which the engine interprets with fixed-point scaling for enhanced precision, effectively treating each editor unit as 65,536 internal subunits to handle fractional movements accurately. This scaling ensures consistent proportions, where typical gameplay elements like doorways span 128 units wide, establishing a human-scale environment without requiring floating-point operations. The automap grid, when enabled, visually represents 64 units per square for mapper reference, reinforcing the engine's grid-aligned design philosophy.[8] Fundamental constraints shape the engine's world model, notably the absence of dynamic lighting, which relies instead on static light levels assigned per sector to modulate brightness uniformly across enclosed areas. Occlusion culling is limited to the binary space partitioning (BSP) tree, which determines rendering order but does not incorporate advanced techniques like portal visibility or runtime shadowing. These limitations, including the prohibition of sloped surfaces—all floors and ceilings remain perfectly horizontal—stem from the engine's 2D foundational structure, preventing overlaps or non-planar geometry while optimizing for real-time performance on limited processors. Sectors serve as the primary building blocks, encapsulating these attributes into discrete, enclosed volumes.[8]Level Data Structure
The Doom engine stores level data in WAD (Where's All the Data) files, which use a lump-based archive format consisting of a 12-byte header identifying the file type (IWAD for internal game data or PWAD for patches), the number of lumps, and an offset to the directory; followed by a directory of 16-byte entries for each lump specifying its offset, size, and 8-character name; and the raw data lumps themselves in arbitrary order.[17] Levels are encapsulated within map-specific lumps prefixed by a marker like E1M1, containing the core geometry and entity data required for the game world.[18] Core level components begin with vertices, defined as 4-byte records storing fixed-point X and Y coordinates (16.16 format) to represent 2D points in the map grid, serving as endpoints for structural lines.[18] Linedefs connect pairs of vertices in 14-byte records, including indices to start and end vertices, flags for properties like impassability or two-sided rendering, a special type for effects (e.g., teleporters), a tag for sector triggers, and indices to one or two sidedefs; these define the boundaries and behaviors of walls and portals.[18] Sidedefs, in 30-byte records, attach to linedefs and specify texture mappings with X and Y offsets for alignment, names for upper, lower, and middle textures (padded to 8 bytes, with "-" indicating transparency), and a reference to the adjacent sector for spatial context.[18] Sectors enclose areas bounded by linedefs in 26-byte records, detailing floor and ceiling heights (in map units), texture names for flats, light level (0-255), special types (e.g., damage floors or secrets), and a tag linking to linedef triggers for dynamic effects.[18] Object placement occurs via the THINGS lump, where each 10-byte record positions entities as 2D sprites with X and Y coordinates, facing angle (in 0-359 degrees, multiples of 45), type identifier (e.g., 1 for player start, 7 for spider demon, 5 for blue key), and flags for skill levels, multiplayer spawning, or ambush behavior (deaf monsters).[18] These define interactive elements like enemies, items, and lights without inherent 3D orientation beyond angle and height inheritance from sectors. During loading, the engine reads these raw lumps and preprocesses the geometry by generating segments (SEGS) from linedefs, subdividing sectors into convex subsectors (SSECTORS) for efficient partitioning, and building a binary space partitioning (BSP) node tree (NODES) to organize the map for traversal and rendering, enabling hidden surface removal without per-frame calculations.[8] This BSP integration allows rapid occlusion queries during gameplay, with subsector segmentation ensuring convex polygons for texture projection.[8]| Lump | Size per Entry | Key Fields | Purpose |
|---|---|---|---|
| VERTEXES | 4 bytes | X, Y coordinates | 2D map points for geometry. |
| LINEDEFS | 14 bytes | Vertex indices, flags, type, tag, sidedef indices | Wall connections and properties. |
| SIDEDEFS | 30 bytes | Texture offsets, upper/lower/middle textures, sector | Wall texturing and adjacency. |
| SECTORS | 26 bytes | Floor/ceiling heights, flat textures, light, type, tag | Enclosed areas and effects. |
| THINGS | 10 bytes | X/Y position, angle, type, flags | Entity placement and attributes. |
Binary Space Partitioning
The Binary Space Partitioning (BSP) tree in the Doom engine organizes the 2D level geometry into a hierarchical structure of convex subspaces, enabling efficient visibility determination and rendering without a full z-buffer. This approach, adapted by John Carmack from earlier computer graphics research, partitions the map using lines extended from linedefs as hyperplanes, ensuring all resulting regions (subsectors) are convex polygons that simplify ray casting and occlusion checks. The tree is precomputed offline during level compilation, transforming the raw map data into a binary tree where each node represents a partitioning line, and leaf nodes correspond to visible subsectors.[8] BSP tree construction begins with the entire map as the initial region and proceeds recursively: a suitable linedef is selected as the splitter, extended infinitely in both directions to form a partitioning hyperplane, and the map is divided into two half-spaces containing the remaining geometry. Any linedefs intersected by the splitter are split at the intersection point to avoid crossing partitions, increasing the total number of segments while ensuring convexity. The selection of the splitter aims to balance the tree by minimizing the difference in subtree sizes, often using heuristics to evaluate potential lines based on the number of affected segments and overall balance, as outlined in methods for generating efficient partitioning trees. This recursive process continues until all subregions are small convex subsectors, typically bounded by 4 to 20 segments, resulting in a binary tree with depth proportional to the map's complexity. The original Doom node builder, iBSP, implemented this in Objective-C on NeXTSTEP, taking seconds to minutes per level depending on size.[8] During rendering, the BSP tree is traversed from the root node using the player's viewpoint to classify and prioritize subsectors for drawing. The function R_RenderBSPNode recursively visits nodes, computing the viewpoint's position relative to each partitioning plane to determine if it lies in the front (positive side), back (negative side), or on the plane. Subtrees on the opposite side of the viewpoint are culled early, while visible portions are sorted front-to-back to allow natural occlusion—closer walls obscure farther ones without depth testing. Bounding boxes around subtrees provide additional coarse culling, discarding entire branches if they fall outside the view frustum. This traversal identifies active subsectors, which are then rendered by drawing their bounding segments (segs) as walls, ensuring only visible geometry is processed and maintaining consistent performance on 1993 hardware.[19][8] Central to both construction and traversal is the side-classification mathematics, which uses vector cross products to evaluate point positions relative to lines. For a point P = (x_0, y_0) and a line defined by points A = (x_1, y_1) and B = (x_2, y_2), the signed value d = (x_2 - x_1)(y_1 - y_0) - (x_1 - x_0)(y_2 - y_1) indicates the side: d > 0 for one half-space (front), d < 0 for the other (back), and d = 0 for collinearity, used in splitting to find intersection points along the line. During tree building, this formula identifies intersections for segment splits; in rendering, it drives the recursive decisions for viewpoint navigation. This efficient 2D computation, requiring only multiplications and subtractions, aligns with the engine's fixed-point arithmetic for speed.[19]Rendering Pipeline
Wall and Surface Drawing
The Doom engine renders vertical walls and surfaces through a column-based projection technique that transforms 2D map geometry into a 3D perspective view on the screen. Wall segments, known as segs, are projected by converting their endpoint vertices into angular positions using Binary Angular Measurement (BAM) and a lookup table (viewangletox) to determine corresponding screen x-coordinates. The vertical extent of each column is calculated based on the segment's distance from the player and the sector's ceiling and floor heights, ensuring perspective scaling where closer walls appear taller. This process avoids full ray tracing by relying on pre-partitioned geometry, drawing one column at a time across the screen width.[8]
Texture application occurs via vertical strips extracted from wall textures, which are stored rotated 90 degrees counterclockwise to optimize cache access during rendering. Each column's texture coordinate is offset to align with the wall's position in the map, and the strip is scaled vertically by the projected height; upper and lower textures fill from ceiling to mid-point and mid-point to floor, respectively. Mid-textures, applied to the middle portion of segs, handle features like doors and windows, with support for transparency in specific cases such as switch animations. Shading is applied using fixed-point light level calculations per column, diminishing with distance.[8]
To manage overlaps and occlusion, segs are processed in front-to-back order as dictated by the binary space partitioning (BSP) tree traversal, clipping each new segment against previously drawn ones using a span occlusion array (solidsegs). Backface culling eliminates invisible sides by checking if the angle between segment endpoints exceeds 180 degrees from the view. This painter's algorithm approach ensures distant walls are obscured without a depth buffer, limiting the engine to convex rooms but enabling efficient rendering on 1993 hardware.[8]
Special effects for walls include animation through frame cycling, where textures switch based on elapsed time and predefined sequences defined in the level data. Transparent or masked mid-textures, such as those for fences or switches, are rendered in a separate pass (R_DrawMasked) after solid walls to composite semi-transparent elements correctly. These effects are constrained to avoid performance overhead, with no support for arbitrary transparency on full walls.[8]