DirectX Raytracing
DirectX Raytracing (DXR) is a hardware-accelerated ray tracing API extension to Microsoft's DirectX 12 graphics platform, enabling developers to simulate realistic light interactions in real-time for effects such as global illumination, reflections, shadows, and refractions in video games and applications.[1] It treats ray tracing as a peer to traditional rasterization and compute shaders, allowing flexible integration into existing rendering pipelines without requiring a complete overhaul.[1] Announced by Microsoft on March 19, 2018, DXR was initially developed in collaboration with NVIDIA to leverage specialized ray tracing cores in GPUs, with subsequent support from vendors like AMD, marking a significant advancement in accessible real-time ray tracing for Windows-based development.[2] The core of DXR revolves around acceleration structures—hierarchical data representations of 3D scenes that optimize ray-scene intersections for performance—and specialized HLSL shader types, including ray generation, closest hit, any hit, and miss shaders, which handle ray launching and intersection responses.[1] Developers dispatch ray tracing workloads using theDispatchRays command, which supports both CPU-initiated and GPU-initiated execution via indirect calls, enabling scalable parallelism across rays.[1] Key optimizations include watertight triangle intersections to prevent artifacts, procedural geometry support, and ray flags for culling non-essential tests, all contributing to efficient traversal of complex scenes.[1]
DXR's hardware support is tiered to accommodate varying GPU capabilities, starting with Tier 1.0 for basic ray tracing on compatible hardware like NVIDIA RTX series or AMD RDNA 2 architectures, progressing to Tier 1.1 with features like inline ray querying (via RayQuery) for hybrid rasterization-ray tracing, and Tier 1.2 introducing shader execution reordering (SER) and opacity micromaps for enhanced transparency handling and performance.[1] Integration with DirectX 12 occurs through state objects that bundle shaders, root signatures, and pipeline configurations, allowing dynamic shader binding via shader tables during execution.[1] A fallback layer was initially provided for non-hardware-accelerated systems but has been deprecated as native support proliferated.[1]
Since its debut, DXR has powered notable implementations in engines like Unreal Engine and Unity, fostering advancements in photorealistic rendering while maintaining compatibility with broader DirectX ecosystems.[2] Ongoing updates, as detailed in the DXR functional specification (version 1.34, updated June 30, 2025), continue to refine features like recursion limits (up to 31 levels) and build optimizations for acceleration structures, ensuring DXR remains a cornerstone for future graphics innovations.[1]
History and Development
Announcement and Initial Release
DirectX Raytracing (DXR) was announced on March 19, 2018, at the Game Developers Conference (GDC) in San Francisco by Microsoft and NVIDIA, as a new extension to the DirectX 12 API designed to provide hardware-accelerated ray tracing capabilities for real-time rendering in games and applications.[2][3] The announcement positioned DXR as an open standard that integrates seamlessly with existing DirectX 12 workflows, allowing developers to incorporate ray-traced effects without requiring major engine redesigns.[2] The initial development of DXR involved close collaboration between Microsoft and NVIDIA, who demonstrated early hardware support through NVIDIA's RTX technology on Volta and Turing architectures.[3] At the GDC announcement, NVIDIA showcased real-time ray tracing demos built with DXR, including photorealistic scenes in Unreal Engine 4 that highlighted hybrid rendering techniques combining rasterization and ray tracing.[3] These demonstrations emphasized DXR's potential to deliver advanced visual fidelity on consumer GPUs, marking the first public reveal of ray tracing acceleration in a major graphics API.[2] DXR officially launched to the public on October 2, 2018, as part of the Windows 10 October 2018 Update (version 1809), enabling out-of-the-box support on compatible hardware without additional installations.[4] The feature requires a DirectX 12-compatible graphics card with hardware ray tracing support, such as NVIDIA's GeForce RTX 20-series GPUs paired with the corresponding Game Ready Driver (version 416.16 or later).[4][5] Key motivations for DXR's introduction included facilitating real-time ray tracing for effects like global illumination, realistic reflections, soft shadows, and refractions, thereby enhancing graphical realism in games while leveraging existing DirectX ecosystems.[2][3]Early Adoption and Milestones
Following the initial release of DirectX Raytracing (DXR), the technology saw its first major game integration with Battlefield V in November 2018, marking it as the inaugural title to leverage DXR for real-time ray-traced reflections.[6] This implementation demonstrated DXR's potential for enhancing visual fidelity in large-scale multiplayer environments, though it required NVIDIA's GeForce RTX 20-series GPUs for optimal performance. Building on this momentum, Metro Exodus followed in February 2019, utilizing DXR to achieve advanced global illumination effects in its post-apocalyptic settings, further showcasing the API's versatility in single-player narratives.[7] Control arrived in August 2019 as another early adopter, employing DXR for intricate ray-traced reflections, shadows, and indirect lighting that integrated seamlessly with its surreal architectural designs.[8] Hardware accessibility expanded significantly in 2020 with AMD's RDNA 2 architecture in the Radeon RX 6000 series, which introduced hardware-accelerated ray tracing support for DXR, allowing broader adoption beyond NVIDIA's ecosystem.[9] This shift democratized DXR implementation for developers targeting cross-vendor compatibility. Further broadening support, Intel's Arc GPUs launched in 2022 with native DXR acceleration via their Xe-HPG architecture, enabling ray tracing on Intel hardware and reducing dependency on specific vendors.[10] Key milestones in DXR's early years included its integration into major game engines, accelerating developer adoption. Unreal Engine 4.22, released in April 2019, introduced beta support for DXR, enabling real-time ray tracing and path tracing features that simplified implementation for creators.[11] Unity followed suit with ray tracing in its High Definition Render Pipeline (HDRP) starting in late 2019 and stabilizing in 2020, providing tools for ray-traced reflections, shadows, and global illumination in Unity projects. In November 2020, Microsoft introduced DXR 1.1 with features like inline ray querying, enhancing hybrid rendering efficiency and adopted in subsequent engine updates. By 2020, dozens of games supported DXR, reflecting rapid industry uptake as titles across genres incorporated the technology for enhanced realism. Early adoption faced notable challenges, including the high cost of compatible hardware—such as NVIDIA's initial RTX GPUs priced over $1,000—which limited accessibility to enthusiasts and professionals. Additionally, DXR's performance overhead, often reducing frame rates by 30-50% in initial implementations like Battlefield V, prompted developers to adopt hybrid approaches combining rasterization with selective ray tracing to balance visual gains and playability.[12] These hurdles underscored the need for ongoing optimizations in both software and hardware to make DXR viable for mainstream gaming.Overview and Core Concepts
Integration with DirectX 12
DirectX Raytracing (DXR) integrates seamlessly into the DirectX 12 architecture as a first-class peer to the existing rasterization and compute pipelines, preserving the low-level, low-overhead model that defines DX12. This design allows DXR workloads to leverage the same resource management system, including heaps, descriptors, and barriers, without requiring developers to overhaul their existing DX12 applications. By treating ray tracing as an extension rather than a separate paradigm, DXR enables hybrid rendering scenarios where ray-traced effects can be combined with traditional rasterization and compute passes in a single frame.[1][2] Command list compatibility is a cornerstone of this integration, permitting ray tracing commands to execute on graphics or compute queues alongside conventional draw and dispatch calls. The primary entry point for ray tracing is theDispatchRays() method, which submits ray generation shaders and initiates ray traversal using a D3D12_DISPATCH_RAYS_DESC structure to specify dispatch dimensions such as width, height, and depth. This approach ensures that DXR operations can be recorded into the same command lists and bundles as other DX12 workloads, facilitating efficient GPU utilization and minimal synchronization overhead. For instance, developers can interleave DispatchRays() with DrawInstanced() calls to blend ray-traced global illumination with rasterized geometry in real-time applications.[1][2]
State objects provide a flexible mechanism for configuring the ray tracing pipeline, created via the CreateStateObject() API with a D3D12_STATE_OBJECT_DESC that bundles pipeline states, shaders, and root signatures. These objects support incremental construction through subobjects, allowing shaders compiled into DXIL libraries to be associated at runtime, which promotes reusability and adaptability across different rendering scenarios. Once created, the state object is bound using SetPipelineState1(), ensuring compatibility with DX12's state management while enabling the definition of ray-specific configurations like shader tables for indexing ray generation, miss, hit, and closest hit shaders. This bundled approach reduces API overhead and aligns with DX12's emphasis on explicit control.[1]
DXR maintains backward compatibility with DirectX 12 feature level 12_0, building upon it through ray tracing tier support queries via CheckFeatureSupport() to determine hardware capabilities. Tiers such as 1.0 provide core functionality, while higher tiers unlock enhancements like additional vertex formats, ensuring that DXR can be progressively adopted without breaking existing DX12 baselines. This tiered system allows applications to query and fallback gracefully, supporting experimentation on a wide range of hardware.[1]
Fundamental Ray Tracing Principles in DXR
Ray tracing is a rendering technique that simulates the physical paths of light rays to produce highly realistic images by modeling interactions such as reflections, refractions, shadows, and global illumination. In its fundamental form, rays are cast from the virtual camera through each pixel of the image plane into the 3D scene, where they are tested for intersections with geometric objects; the closest intersection determines the visible surface for that pixel, and subsequent shading calculations account for material properties and lighting. This backward ray tracing approach, originating from early work on improved illumination models, contrasts with forward light simulation by efficiently focusing computations on viewer-dependent paths.[13][14] In DirectX Raytracing (DXR), primary rays initiate the process by originating from the camera position, traversing the scene to establish initial visibility and base color contributions for each pixel. These rays intersect with scene geometry, triggering shading decisions that may spawn secondary rays from the hit points to simulate indirect lighting effects, such as specular reflections or diffuse bounces. Secondary rays extend the simulation through recursive calls, limited by a configurable maximum depth to prevent excessive computation, thereby capturing multi-bounce light transport while maintaining control over performance. This hierarchical ray generation enables DXR to approximate physically based rendering without fully tracing all possible light paths.[2][15] DXR is designed for hybrid rendering pipelines, where ray tracing augments traditional rasterization to leverage the strengths of both methods: rasterization efficiently handles primary visibility and large-scale geometry culling, while ray tracing adds accurate secondary effects like soft shadows and realistic reflections on demand. This integration allows developers to selectively apply ray tracing to specific scene elements or passes, optimizing for real-time constraints in games and applications. By dispatching rays alongside rasterized draws in the same command list, DXR facilitates seamless blending of techniques for enhanced visual fidelity without overhauling existing rendering systems.[2][16] A core advantage of DXR lies in its hardware acceleration, which dramatically lowers the computational overhead of ray-scene intersection tests compared to pure software implementations, making real-time ray tracing viable on modern GPUs. Specialized fixed-function hardware traverses acceleration structures to quickly identify potential hits, exploiting spatial coherence and reducing the branching costs inherent in software ray tracing. This enables frame rates suitable for interactive applications, with performance gains scaling to complex scenes where traditional methods falter. DXR briefly references acceleration structures for these efficient intersections, though their construction is handled separately.[2][17]Technical Implementation
Acceleration Structures
In DirectX Raytracing (DXR), acceleration structures are hierarchical bounding volume hierarchies (BVHs) that optimize ray-geometry intersection tests by organizing scene data for efficient traversal during ray tracing. These structures are essential for achieving real-time performance, as they reduce the computational cost of checking ray intersections against potentially millions of primitives. DXR supports two primary types: bottom-level acceleration structures (BLAS) for individual geometry and top-level acceleration structures (TLAS) for instancing multiple BLAS in a scene.[1] A bottom-level acceleration structure (BLAS) encapsulates the raw geometry of a scene, such as triangle meshes or procedural axis-aligned bounding boxes (AABBs), enabling fast intersection queries at the primitive level. Each BLAS is constructed using theD3D12_BUILD_RAYTRACING_ACCELERATION_STRUCTURE_DESC structure, which specifies an array of D3D12_RAYTRACING_GEOMETRY_DESC entries detailing the geometry—either triangles via vertex and index buffers or AABBs as an array of bounding boxes. The build process is invoked via the BuildRaytracingAccelerationStructure API call on a command list, requiring temporary scratch space for construction. Build flags influence the trade-offs: D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAG_PREFER_FAST_TRACE prioritizes tracing performance, typically increasing build time by 2-3x compared to default settings, while D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAG_ALLOW_UPDATE enables incremental updates to geometry like vertex positions or AABBs without a full rebuild, though this may slightly degrade tracing efficiency.[18][19]
The top-level acceleration structure (TLAS) references multiple BLAS instances to represent the full scene, supporting transformations and visibility culling for dynamic environments. It is built similarly using BuildRaytracingAccelerationStructure, but with an array of D3D12_RAYTRACING_INSTANCE_DESC structures, each a 16-byte aligned descriptor containing a pointer to a BLAS, a 3x4 transform matrix (also 16-byte aligned), an instance ID for shader identification, and an instance mask for ray filtering. This design allows up to 2^24 instances per TLAS, facilitating efficient instancing of repeated geometry like duplicated objects in a game world. Updates to the TLAS, such as redefining instance transforms or masks, can be performed incrementally if built with the ALLOW_UPDATE flag, but changes to the instance count require a complete rebuild.[20][21]
The construction of both BLAS and TLAS involves allocating output buffers in the default heap with D3D12_RESOURCE_FLAG_ALLOW_UNORDERED_ACCESS and a resource state of D3D12_RESOURCE_STATE_RAYTRACING_ACCELERATION_STRUCTURE. Scratch space, sized via a pre-build query, handles temporary memory needs during construction. To optimize memory usage post-build, the D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAG_ALLOW_COMPACTION flag can be set, followed by a compaction step using CopyRaytracingAccelerationStructure in COMPACTED_SIZE mode; the resulting size is queried beforehand with EmitRaytracingAccelerationStructurePostbuildInfo. This process can significantly reduce the structure's footprint, though it adds an extra command list operation. Synchronization between builds and ray tracing dispatches requires UAV barriers to manage read-write access.[22][23]
DXR imposes strict limits on acceleration structures to ensure hardware compatibility and performance. A single BLAS supports a maximum of 2^24 geometries and 2^29 primitives (including inactive or degenerate ones), while a TLAS is limited to 2^24 instances. All structures must be 256-byte aligned, with buffers using GPU virtual addresses for geometry data. These constraints, combined with implementation-dependent variations queried via device capabilities, guide developers in partitioning complex scenes to avoid exceeding hardware bounds.[24]
Ray Tracing Pipeline and Dispatch
The ray tracing pipeline in DirectX Raytracing (DXR) begins with the DispatchRays method, which launches a three-dimensional grid of ray generation shader invocations to initiate ray tracing execution. This dispatch operates on graphics or compute command lists and requires a ray tracing pipeline state to be set beforehand, ensuring the shaders and resources are properly configured for the operation. The grid dimensions are defined by width, height, and depth parameters, each supporting values up to 2^16, allowing for a scalable launch of up to billions of rays depending on the specified extents. The dispatch is controlled through a D3D12_DISPATCH_RAYS_DESC structure, which specifies GPU virtual addresses for shader tables that bind ray generation shaders, miss shaders, hit group shaders, and callable shaders to corresponding regions of the dispatch grid, along with associated resources such as constant buffers or textures. These shader tables must be in the D3D12_RESOURCE_STATE_NON_PIXEL_SHADER_RESOURCE state, with alignments of 64 bytes for starting addresses and 32 bytes for strides, enabling flexible mapping of shaders and data to different ray types or spatial locations. In DXR 1.1, DispatchRays supports indirect invocation via the ExecuteIndirect method, permitting GPU-driven dispatches where grid dimensions and shader table parameters can be dynamically determined on the GPU without CPU intervention. Within the ray generation shaders invoked by DispatchRays, individual rays are defined using the RayDesc structure, which encapsulates the ray's origin as a float3 vector, direction as a normalized float3 vector, minimum parametric extent TMin (typically a small positive value like 0.001 to avoid self-intersections), and maximum extent TMax (often set to a large value like 1e30 for unbounded rays). The structure also includes a Flags field for controlling traversal behavior, such as RAY_FLAG_FORCE_OPAQUE to treat all primitive intersections as opaque and skip any hit shaders, thereby optimizing for scenarios like shadow rays where transparency is irrelevant, or RAY_FLAG_ACCEPT_FIRST_HIT_AND_END_SEARCH to terminate traversal after the first valid hit and invoke the closest hit shader only if not skipped. Once defined, tracing a ray via the TraceRay intrinsic initiates the traversal process, where the ray is hardware-accelerated through the acceleration structure to perform intersection tests against geometry instances, efficiently culling non-intersecting nodes and primitives to identify the closest hit along the ray path or determine a complete miss if no intersection occurs within the TMin to TMax bounds. This traversal leverages dedicated ray tracing hardware units, such as RT cores in compatible GPUs, to accelerate bounding volume hierarchy (BVH) navigation and primitive intersection computations, returning control to the calling shader with updated payload data reflecting the hit details or miss outcome. The process supports recursive tracing, limited by the pipeline's configured maximum trace recursion depth of 0 to 31 levels, as defined by the constant D3D12_RAYTRACING_MAX_DECLARABLE_TRACE_RECURSION_DEPTH during pipeline state object creation to balance effect complexity with performance overhead. To accommodate recursion, the per-thread call stack size is configured using the SetPipelineStackSize method on the command list prior to dispatch, specifying the memory allocation in bytes for maintaining shader state and local variables across recursive TraceRay calls, with typical values ranging from 1024 to 4096 bytes depending on shader complexity and desired recursion depth.Shaders and API Calls
DirectX Raytracing (DXR) employs a set of programmable shaders written in High-Level Shading Language (HLSL) to implement ray tracing logic, compiled to the library target model lib_6_3. These shaders are organized into the raytracing pipeline state object and invoked based on ray interactions with acceleration structures. Developers use shader attributes, such as [shader("raygeneration")], to designate shader types within the HLSL code.[25] The primary shader types include the ray generation shader, which acts as the entry point for dispatching rays and typically calls the TraceRay intrinsic to initiate ray traversal. The closest hit shader processes the nearest intersection with geometry, enabling computations like material shading or reflection calculations using intersection data. The any hit shader handles per-primitive logic for transparent or alpha-tested surfaces, where developers can use functions like IgnoreHit to accept or reject hits. The miss shader executes when a ray does not intersect any geometry, often used for environment mapping or sky rendering. Additionally, the intersection shader supports procedural geometry by defining custom intersection tests against bounding volumes, while the callable shader functions as a subroutine, invoked via the CallShader intrinsic for reusable code across the pipeline.[25] Key API intrinsics facilitate ray tracing within these shaders. The TraceRay function, callable from ray generation, closest hit, and miss shaders, performs recursive ray casting into an acceleration structure, passing a user-defined payload for data exchange between shaders. In DXR 1.1, the RayQuery object enables inline, non-recursive intersection queries, allowing developers to test rays against multiple structures without recursion. Shaders access ray and hit information through intrinsics such as RayTCurrent, which returns the current distance along the ray; WorldRayDirection, providing the ray's direction in world space; and GeometryIndex (available in DXR 1.1 and later), which identifies the hit geometry instance.[26][27] Hit groups bundle shaders relevant to specific geometry types, combining a closest hit shader with optional any hit and intersection shaders into a single unit defined via the D3D12_HIT_GROUP_DESC structure. This allows tailored handling for different materials or object classes during ray traversal. The D3D12_RAYTRACING_SHADER_CONFIG structure configures the pipeline by specifying maximum sizes for ray payloads and hit attributes, limited to 32 bytes each to balance flexibility and performance. Shader tables organize these shaders and hit groups for execution, serving as GPU-accessible buffers that map ray indices to appropriate shader entry points during dispatch.[26][25][28]Hardware and System Requirements
Compatible Graphics Hardware
DirectX Raytracing (DXR) requires graphics hardware compatible with DirectX 12 Ultimate, featuring dedicated ray tracing acceleration units or equivalent capabilities to handle ray intersection and traversal efficiently. These include specialized RT cores in modern GPUs, enabling real-time ray tracing without prohibitive performance costs. Compatibility is determined by querying the device's feature support levels, ensuring developers can adapt applications to available hardware tiers. DXR hardware support is categorized into tiers, assessed via theID3D12Device::CheckFeatureSupport method with D3D12_FEATURE_D3D12_OPTIONS5 and the RaytracingTier member. Tier 1.0 offers foundational ray tracing, supporting ray generation shaders, closest hit shaders, any hit shaders, and miss shaders for basic scene intersection. Tier 1.1 extends this with inline ray querying using RayQuery objects, indirect ray dispatch via DispatchRays in compute shaders, additional ray flags like RAY_FLAG_SKIP_TRIANGLES, and expanded vertex formats such as DXGI_FORMAT_R16G16_UNORM. Tier 1.2 builds on prior tiers by adding Opacity Micromaps (OMM) for efficient alpha-tested geometry handling and Shader Execution Reordering (SER) for dynamic shader optimization during ray traversal, requiring Shader Model 6.9.
| Vendor | Supported Architectures and Series | Tiers Supported |
|---|---|---|
| NVIDIA | Turing (RTX 20-series, 2018); Ampere (RTX 30-series, 2020); Ada Lovelace (RTX 40-series, 2022); Blackwell (RTX 50-series, 2024+) | Tier 1.0 (Turing+); Tier 1.1 (Turing+); Tier 1.2 (all RTX series, with SER acceleration on 40/50-series and OMM across all) |
| AMD | RDNA 2 (RX 6000-series, 2020+); RDNA 3 (RX 7000-series, 2022+); RDNA 4 (RX 9000-series, 2024+) | Tier 1.1 (RDNA 2+); Tier 1.2 (RDNA 4 with OMM/SER support) |
| Intel | Alchemist (Arc A-series, 2022+); Xe2/Battlemage (Arc B-series, 2024+) | Tier 1.1 (Alchemist+); Tier 1.2 (Battlemage with SER support and OMM evaluation) |
Software and Driver Prerequisites
DirectX Raytracing (DXR) requires Windows 10 version 1809 (October 2018 Update) or later to access the necessary runtime components.[4] For comprehensive support of DirectX 12 Ultimate, including advanced DXR features like Tier 1.1, Windows 10 version 2004 or Windows 11 is recommended, as these versions provide the full feature set through built-in updates.[31] The development headers and libraries for DXR are integrated into the Windows SDK starting from version 10.0.17763.0, which aligns with the October 2018 Update.[32] The DXR runtime itself is delivered via operating system updates, eliminating the need for separate installations beyond the base OS requirement. Compatible graphics drivers are essential for hardware-accelerated DXR. NVIDIA provides initial support through GeForce drivers version 416.16 WHQL and later, released in October 2018 to coincide with the Windows 10 update. AMD introduced hardware-accelerated DXR support with Radeon Software Adrenalin Edition 20.11.2 and subsequent releases in November 2020, enabling Tier 1.1 on RX 6000 series GPUs (RDNA 2 architecture).[33] Intel added DXR support for Arc graphics starting with driver version 31.0.101.3445 in October 2022, targeting discrete and integrated Arc-based hardware.[34] DXR mandates DirectX 12 with shader model 6.3 or higher, as this version introduces the intrinsics and library targets (e.g., lib_6_3) required for raytracing shaders.[35] Applications must query raytracing tier support at runtime using D3D12_FEATURE_D3D12_OPTIONS6 to confirm hardware capabilities, such as Tier 1.0 for basic functionality or higher tiers for enhanced features.[1] For development and debugging, PIX on Windows offers comprehensive tools for capturing, analyzing, and debugging DXR frames, including shader inspection and performance profiling.[36] Visual Studio provides seamless integration for compiling HLSL shaders targeting model 6.3+ via the DirectX Shader Compiler (DXC), supporting DXR-specific extensions directly within the IDE.[25]Versions and Updates
DXR 1.1 Enhancements
DirectX Raytracing (DXR) 1.1, introduced in preview in November 2019 with Windows 10 Insider Preview builds for version 20H1 and fully released with Windows 10 version 20H1 on May 27, 2020, introduced several enhancements focused on improving developer usability and pipeline flexibility without requiring new hardware beyond existing DXR Tier 1.0 support, provided driver updates were available.[37][38] These updates enable more efficient integration of ray tracing into broader rendering workflows, supporting Shader Model 6.5 and queried viaD3D12_FEATURE_D3D12_OPTIONS5 for Tier 1.1 compatibility.[37][38]
A key addition is inline ray tracing, which allows ray queries to be performed directly within any shader stage, such as pixel or compute shaders, using the RayQuery object as a local state machine.[37][39] This non-recursive approach, invoked via TraceRayInline(), eliminates the need for separate ray generation shaders or shader tables, simplifying scenarios like hybrid rasterization-ray tracing or compute-based denoising.[37] Developers instantiate a RayQuery with flags (e.g., RAY_FLAG_CULL_NON_OPAQUE) and use methods like Proceed() to advance the query, querying hits or misses as needed, which contrasts with the recursive TraceRay() call from the core pipeline.[37][39]
GPU-driven dispatch further enhances flexibility by allowing the GPU to initiate ray tracing calls independently of the CPU, using ExecuteIndirect() with a command signature for DispatchRays().[37][40] This enables dynamic scenarios, such as adaptive ray generation based on culling or refinement passes, where thread counts and shader tables are computed on the GPU, reducing CPU-GPU synchronization overhead.[37]
Support for expanded vertex formats in acceleration structures broadens data compatibility, including DXGI_FORMAT_R16G16_UNORM for 16-bit per component positions (with the third component defaulting to 0) and similar formats like R16G16B16A16_UNORM (stride 6 bytes, ignoring alpha).[37][41] These additions accommodate more compact or legacy vertex buffers during bottom-level structure builds, improving usability for diverse asset pipelines without format conversion.[37]
New ray flags and intrinsics provide finer control over intersection behavior; for instance, RAY_FLAG_SKIP_TRIANGLES skips triangle tests entirely (taking precedence over culling flags), while RAY_FLAG_SKIP_PROCEDURAL_PRIMITIVES targets box or AABB primitives.[37][42] The GeometryIndex() intrinsic, available in hit shaders, returns the index of the geometry within a bottom-level acceleration structure, enabling shared shaders to differentiate geometries without duplicating shader table entries.[37][43]
Incremental state objects address pipeline management efficiency through AddToStateObject(), which appends shaders or components to an existing ray tracing pipeline state object (PSO) flagged with D3D12_STATE_OBJECT_FLAG_ALLOW_STATE_OBJECT_ADDITIONS.[37][44] This avoids full PSO recreation—incurring proportional overhead based on additions, such as one shader—facilitating dynamic updates in scenarios with evolving shader libraries.[37]
DXR 1.2 and Recent Advancements
DirectX Raytracing (DXR) 1.2 was announced by Microsoft at the Game Developers Conference (GDC) on March 20, 2025, introducing key hardware-accelerated features aimed at addressing longstanding performance bottlenecks in real-time ray tracing.[45] These advancements, including Opacity Micromaps (OMM) and Shader Execution Reordering (SER), deliver significant efficiency gains across NVIDIA, AMD, and Intel GPUs, with reported speedups of up to 2.3x for OMM in path-traced scenarios and up to 2x for SER in select workloads.[45] When combined with neural rendering integrations, such as AI-accelerated denoising and upscaling via cooperative vectors in Shader Model 6.9, these features enable up to 10x overall performance improvements in targeted ray tracing applications.[46] Opacity Micromaps (OMM) provide a hierarchical structure for encoding per-triangle opacity at the sub-triangle level, using a 4^N micro-triangle grid organized along a space-filling curve to represent fine-grained alpha-tested geometry like foliage or fences.[47] Supported in 2-state (opaque/transparent, 1-bit) or 4-state (adding known/unknown variants, 2-bit) formats, OMMs are defined via theD3D12_RAYTRACING_OPACITY_MICROMAP_ARRAY_DESC structure, which specifies input buffers, histogram entries, and linkages to bottom-level acceleration structures (BLAS) through D3D12_RAYTRACING_GEOMETRY_OMM_LINKAGE_DESC.[1] This approach reduces any-hit shader invocations by culling rays in fully opaque or transparent regions, bypassing shaders entirely in 2-state mode and minimizing computational overhead for complex alpha geometry.[47] OMM arrays are stored as separate resources in the D3D12_RESOURCE_STATE_RAYTRACING_ACCELERATION_STRUCTURE state and can be reused across multiple BLAS instances, with 4-state OMMs operable in 2-state mode via the RAY_FLAG_FORCE_OMM_2_STATE ray flag or instance flags like D3D12_RAYTRACING_INSTANCE_FLAG_DISABLE_OMMS.[1]
Shader Execution Reordering (SER) enhances ray tracing efficiency by decoupling traversal phases (any-hit and intersection shaders) from shading (closest-hit and miss shaders), allowing dynamic reordering of GPU threads to improve coherence.[48] Introduced in Shader Model 6.9, SER utilizes the HitObject API to process hit information from sources like RayQuery objects and enables RayGeneration shaders to dispatch closest-hit or miss shaders post-traversal.[48] Developers invoke reordering via the MaybeReorderThread() HLSL intrinsic, providing a sort key (e.g., a bit indicating workload type) to guide hardware in grouping similar rays, which boosts cache coherence and reduces divergent execution in incoherent ray patterns.[1] This feature yields up to 2x faster rendering in scenarios with high ray divergence, as demonstrated on NVIDIA RTX 40/50-series GPUs with 40% framerate gains and Intel Arc B-series with up to 90% improvements in complex scenes.[45]
Additional capabilities in DXR 1.2 include ray query support for OMM via the RAYQUERY_FLAG_ALLOW_OPACITY_MICROMAPS flag, enabling inline ray tracing to leverage these optimizations.[1] As of specification version 1.34 (dated June 30, 2025), a proposed ALLOW_DATA_ACCESS flag is under consideration to enable geometry fetch intrinsics, such as TriangleObjectPosition, in shaders when paired with build flags like RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAG_ALLOW_DATA_ACCESS.[1] DXR 1.2 maintains backward compatibility with prior tiers, supporting inline ray tracing from Tier 1.1 and legacy triangle geometry alongside OMM in BLAS builds, though full feature access requires Shader Model 6.9 and hardware/driver support for Tier 1.2.[1]