
Ray-tracing hardware

Ray-tracing hardware consists of specialized circuits integrated into graphics processing units (GPUs) to accelerate the computation-intensive process of ray tracing, a rendering technique that simulates the physical paths of light rays to produce highly realistic images with accurate reflections, refractions, shadows, and global illumination effects. This hardware enables real-time performance in applications such as video games, film production, and architectural visualization, where traditional rasterization methods fall short in mimicking complex light interactions. Unlike software-based ray tracing, which relies on general-purpose GPU shaders and can be computationally prohibitive for interactive use, dedicated ray-tracing units—such as NVIDIA's RT Cores introduced in 2018 with the Turing architecture—optimize tasks like ray-triangle intersection testing and bounding volume hierarchy traversal through parallel processing and fixed-function pipelines. AMD followed with hardware ray accelerators in its RDNA 2 architecture in 2020, enhancing throughput for ray-triangle intersections and supporting DirectX Raytracing (DXR) APIs, while Intel integrated ray-tracing units and AI acceleration via Xe Matrix Extensions (XMX) into its Arc GPUs starting in 2022 to boost path-tracing workloads. The evolution of ray-tracing hardware traces back to the 1980s, when ray tracing emerged as an offline rendering method in research, initially implemented in software on general-purpose CPUs due to its high computational demands. By the early 2000s, programmable GPUs began supporting software ray tracing via shader programs, but performance limitations confined it to non-real-time applications like scientific visualization and movie effects. Academic and industry efforts in the 2000s explored custom ASIC and FPGA prototypes for acceleration, such as early hardware engines for ray-scene intersection, paving the way for commercial integration.
The breakthrough came with NVIDIA's 2018 launch of RT Cores, which delivered up to 10 giga-rays per second in initial implementations, dramatically reducing the overhead of ray generation and traversal compared to software approaches. Today, ray-tracing hardware is a cornerstone of modern GPU architectures, with ongoing advancements like AMD's third-generation ray accelerators in RDNA 4 (announced in 2025) doubling throughput over prior generations, NVIDIA's fourth-generation RT Cores in the Blackwell-based RTX 50 series (released 2025), and Intel's optimizations for hybrid rasterization-ray tracing pipelines. These developments, supported by APIs such as DirectX Raytracing (DXR) and the Vulkan ray tracing extensions, have democratized real-time ray tracing, though challenges remain in balancing performance with power efficiency and integrating with AI-driven denoising techniques to handle noise in traced scenes. As of 2025, adoption spans consumer gaming cards, professional workstations, and mobile SoCs like Samsung's Exynos 2200, underscoring ray-tracing hardware's role in pushing the boundaries of visual fidelity across industries.

Fundamentals

Ray tracing principles

Ray tracing is a rendering technique in computer graphics that simulates the physical propagation of light by tracing rays from the virtual camera through each pixel of the image plane into the scene, determining interactions with objects to compute color and intensity. This backward ray tracing approach models how light rays bounce off surfaces toward light sources, enabling the realistic depiction of effects such as specular reflections, refractions through transparent materials, soft shadows, and global illumination where light indirectly influences distant surfaces. Introduced in seminal work on improved illumination models, ray tracing contrasts with earlier scanline methods by explicitly handling light transport for photorealistic results. The core process begins with casting a primary ray from the camera origin through the scene point corresponding to each pixel. At the first intersection with a scene object, the renderer performs ray-object intersection tests to identify the closest surface, computing surface properties like the normal and material. Secondary rays are then recursively generated from this point: reflection rays follow the law of reflection, refraction rays apply Snell's law for transmission, and shadow rays check visibility to light sources. This recursive bouncing continues until rays escape the scene, hit a light source, or reach a predefined maximum depth, accumulating radiance contributions weighted by the bidirectional scattering distribution function (BSDF) and geometric attenuation. Mathematically, a ray is represented by the parametric equation \mathbf{P}(t) = \mathbf{O} + t \mathbf{D}, where \mathbf{O} is the ray's origin vector, \mathbf{D} is the direction vector (typically normalized), and t \geq 0 is the scalar parameter denoting distance from the origin.
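To make the parametric form concrete, the following minimal Python sketch solves the ray equation against a sphere, the classic example of finding the intersection parameter t (the function names and scene values are illustrative, not from any hardware API):

```python
import math

def ray_sphere_t(origin, direction, center, radius):
    """Solve |O + t*D - C|^2 = r^2 for the nearest t >= 0, or return None.

    Expanding gives the quadratic (D.D)t^2 + 2 D.(O-C) t + |O-C|^2 - r^2 = 0.
    """
    oc = [o - c for o, c in zip(origin, center)]
    a = sum(d * d for d in direction)
    b = 2.0 * sum(d * e for d, e in zip(direction, oc))
    c = sum(e * e for e in oc) - radius * radius
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None  # ray misses the sphere entirely
    sqrt_disc = math.sqrt(disc)
    for t in ((-b - sqrt_disc) / (2.0 * a), (-b + sqrt_disc) / (2.0 * a)):
        if t >= 0.0:
            return t  # nearest hit in front of the origin
    return None  # both roots behind the ray origin

def point_at(origin, direction, t):
    """Evaluate the parametric ray P(t) = O + t*D."""
    return tuple(o + t * d for o, d in zip(origin, direction))
```

For a ray at the origin pointing down +z toward a unit sphere centered at (0, 0, 5), the solver returns t = 4, and `point_at` recovers the hit point (0, 0, 4) on the near surface.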
Intersection algorithms solve for t values where the ray meets object geometry; for example, the slab method efficiently tests axis-aligned bounding boxes (AABBs) by computing the ray's entry and exit parameters for each pair of parallel planes (slabs) along the x, y, and z axes, taking the maximum entry t and minimum exit t across axes to determine overlap. Brute-force implementations test each ray against all n scene objects, yielding O(n) complexity per ray, but acceleration structures like bounding volume hierarchies (BVHs) enable hierarchical traversal, pruning non-intersecting subtrees to achieve average O(\log n) complexity. Path tracing extends classical ray tracing into an unbiased Monte Carlo method for solving the rendering equation, which integrates incoming radiance over all directions weighted by the BSDF and cosine term. By randomly sampling directions at each bounce rather than spawning deterministic reflection or refraction rays, path tracing captures all light transport paths stochastically, converging to physically accurate results with sufficient samples, though at higher computational cost than biased variants.
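The slab method described above can be sketched in a few lines of Python; this is a schematic software version of the test that ray-tracing hardware implements in fixed-function logic (names are illustrative):

```python
def slab_hit(origin, direction, box_min, box_max):
    """Slab test: intersect a ray O + t*D with an axis-aligned box.

    For each axis, compute the parametric entry/exit of the ray against
    the pair of parallel planes (the 'slab'); the ray overlaps the box
    iff the maximum entry t is <= the minimum exit t across all axes.
    Returns the entry t (0 if the origin is inside), or None on a miss.
    """
    t_enter, t_exit = 0.0, float("inf")
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if d == 0.0:
            if o < lo or o > hi:
                return None  # parallel to the slab and outside it
            continue
        t0, t1 = (lo - o) / d, (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0  # order entry before exit
        t_enter = max(t_enter, t0)
        t_exit = min(t_exit, t1)
        if t_enter > t_exit:
            return None  # slab intervals do not overlap: miss
    return t_enter
```

A ray from the origin along +z enters a box spanning z = 2..3 at t = 2; a box offset entirely to the side returns None, which is exactly the pruning decision a BVH traversal unit makes per node.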

Need for hardware acceleration

Ray tracing algorithms demand immense computational resources due to the need to simulate light paths by casting millions of rays per frame, each undergoing numerous intersection tests with scene geometry. For a typical real-time rendering scenario at 1080p resolution aiming for 60 frames per second (FPS), this translates to processing billions of rays and associated operations per second, rendering pure software implementations on general-purpose CPUs or even early GPUs insufficient for interactive rates below 30 FPS. The core bottleneck lies in the ray-triangle intersection computations, which dominate runtime and scale poorly with scene complexity without specialized optimizations. This computational intensity traces back to ray tracing's origins in offline rendering, where it excels in producing photorealistic images for film production but at the cost of extended render times. For instance, in Pixar's 2006 film Cars, a test scene involving 678 million triangles required 1.2 billion ray-triangle intersection tests per frame, taking over 100 minutes on a dual-processor system even with advanced caching techniques. Such durations—often hours per frame—were acceptable for non-interactive media like movies and animations, but the advent of real-time applications in video games and virtual reality (VR) has shifted priorities toward sub-16-millisecond frame budgets to maintain immersion and prevent motion sickness in VR. Pre-2010, the industry thus depended heavily on rasterization pipelines for interactive 3D graphics, relegating ray tracing to pre-computed lighting maps, low-resolution previews, or static effects to approximate global illumination without real-time overhead. Compounding these challenges are the irregular memory access patterns inherent in ray traversal, which hinder efficient execution on conventional architectures despite ray tracing's embarrassingly parallel nature at the ray level.
This parallelism aligns well with SIMD (Single Instruction, Multiple Data) instructions in GPUs, enabling concurrent processing of independent rays, yet general-purpose shaders struggle with the divergent execution and bandwidth demands of incoherent ray bundles. For a moderately complex scene with approximately 10^6 triangles, unaccelerated ray tracing at 60 FPS would necessitate around 10^9 intersection operations per frame—equivalent to tens of billions per second—overwhelming the throughput of pre-acceleration hardware like multi-core CPUs or early programmable GPUs. Dedicated hardware acceleration thus becomes essential to parallelize and streamline these operations, bridging the gap between offline feasibility and real-time viability.
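A quick back-of-envelope calculation makes the scale of these budgets tangible; the per-pixel ray count and per-ray test count below are illustrative assumptions, not measured figures:

```python
# Back-of-envelope ray budget for real-time ray tracing at 1080p/60 FPS.
# rays_per_pixel and tests_per_ray are assumed round numbers for illustration.
width, height, fps = 1920, 1080, 60
rays_per_pixel = 10    # primary + shadow + secondary bounces (assumed)
tests_per_ray = 100    # BVH node/triangle tests per ray (assumed)

rays_per_second = width * height * fps * rays_per_pixel
tests_per_second = rays_per_second * tests_per_ray

print(f"{rays_per_second / 1e9:.2f} billion rays/s")        # ~1.24 billion
print(f"{tests_per_second / 1e9:.0f} billion tests/s")      # ~124 billion
```

Even with these modest assumptions the workload lands at roughly 1.2 billion rays and over 100 billion intersection tests per second, which is why fixed-function traversal and intersection units, rather than general-purpose shaders, are needed to sustain interactive rates.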

Historical Development

Early prototypes and research

The foundations of ray-tracing hardware trace back to the late 1970s and 1980s, when academic research primarily focused on algorithmic developments rather than dedicated hardware, as computational demands made real-time implementation infeasible on contemporary systems. Turner Whitted's seminal 1980 paper introduced a recursive ray-tracing model that simulated reflections, refractions, and shadows by tracing rays from the viewer through a scene, establishing the core principles for future efforts. Early hardware concepts in this era explored specialized pixel processors, but these were limited to software simulations on general-purpose computers, highlighting the need for specialized architectures to handle intersection tests and ray propagation efficiently. In the 1990s, the first dedicated prototypes emerged, leveraging digital signal processors (DSPs) to accelerate ray-casting operations and basic ray tracing. The ART AR250 RenderDrive, developed by Advanced Rendering Technology in 1998, represented an early commercial prototype using a network of DSP-based rendering processors to perform ray-traced rendering, achieving speeds suitable for offline production by parallelizing ray generation and intersection computations across multiple units. Similarly, in 1996, researchers proposed the TigerSHARK system, a hardware-accelerated ray-tracing engine built around off-the-shelf DSPs to handle ray-triangle intersections and shading, demonstrating that low-cost hardware could rival more expensive rasterization systems in output quality while targeting interactive rates for complex scenes. The early 2000s saw advancements in reconfigurable hardware, particularly field-programmable gate arrays (FPGAs), enabling more flexible prototypes for ray-tracing pipelines.
At Saarland University, Sven Woop, Jörg Schmittler, and Philipp Slusallek developed the Ray Processing Unit (RPU) in 2005, a programmable single-chip architecture prototyped on FPGAs that integrated ray traversal, intersection testing, and shading into a stream-processing model, achieving interactive performance for moderately complex scenes by optimizing data flow for coherent ray packets. Concurrently, Ingo Wald and colleagues advanced coherent ray tracing techniques, grouping similar rays to exploit cache locality and SIMD parallelism in software implementations, which informed hardware designs by reducing traversal overhead in dynamic scenes. Experimental ASIC efforts during this period focused on dedicated intersection units but remained largely academic, paving the way for scalable commercial hardware. A key milestone came in 2005 with NVIDIA's research demonstrations of software ray tracing on GPUs, which, despite performance limitations on general-purpose shaders, proved the feasibility of interactive ray tracing on commodity hardware for simple scenes.

Transition to commercial products

In the mid-2000s, the advent of general-purpose GPU computing marked a pivotal catalyst for ray tracing, as platforms like NVIDIA's CUDA, introduced in 2007, allowed developers to implement software-based ray tracing on graphics hardware. These advancements enabled experimental demos that demonstrated ray tracing's potential for realistic lighting and shadows, yet achieving real-time performance remained elusive for complex scenes due to computational demands exceeding the era's GPU capabilities. The transition to dedicated commercial hardware began in 2009 with Caustic Graphics' release of the CausticOne, an add-in board featuring a field-programmable gate array (FPGA)-based Ray Tracing Processing Unit (RTPU) designed to accelerate ray setup, traversal, and intersection calculations as a co-processor alongside CPUs and GPUs. Claiming up to a 20-fold speedup over CPU-based methods, the CausticOne targeted professional rendering workflows and was made available to developers. Caustic Graphics was acquired by Imagination Technologies in 2010 for $27 million, paving the way for further integration of its ray tracing innovations into broader graphics ecosystems. Throughout the 2010s, additional commercial entries emerged, particularly embedded and IP-based solutions. SiliconArts introduced its RayCore IP, the first semiconductor intellectual property dedicated to ray tracing, enabling integration into system-on-chips for applications like automotive and consumer devices. This evolved into the RayChip in 2014, a multi-core design combining six RayCore units with an additional accelerator to deliver up to 60 frames per second at HD resolution. Concurrently, Imagination Technologies incorporated Caustic's technology into its PowerVR GPU line, providing early hardware ray tracing support starting with the Caustic Series2 R2500 and R2100 accelerator boards in 2013, which processed up to 100 million incoherent rays per second for professional use.
A major breakthrough arrived in 2018 with NVIDIA's Turing architecture, which integrated the first consumer-grade RT Cores into GeForce RTX GPUs, accelerating BVH traversal and ray-triangle intersections by orders of magnitude over software approaches. This hardware enabled real-time ray tracing via Microsoft's DirectX Raytracing (DXR) API, debuting in games such as Battlefield V, where it powered ray-traced reflections and shadows at playable frame rates. The shift to commercial products initially found traction in professional visualization sectors before permeating consumer gaming. Professional design and visualization tools, enhanced by ray tracing integrations such as the technology acquired from Numenus in 2011, adopted these accelerators for faster, more accurate surface rendering in design and architectural workflows. However, early ray tracing hardware often grappled with challenges such as elevated power consumption, which constrained deployment in power-sensitive environments and necessitated robust cooling in high-end systems.

Core Architectures

Bounding volume hierarchies and traversal

Bounding volume hierarchies (BVHs) are tree-based acceleration structures that organize scene geometry into a hierarchy of bounding volumes, typically axis-aligned bounding boxes (AABBs), to efficiently prune ray intersection tests during ray tracing. Each internal node represents a bounding volume enclosing the union of its child nodes, while leaf nodes contain one or more primitives such as triangles. By traversing the hierarchy from the root, rays can quickly eliminate entire subtrees that do not intersect their path, reducing the number of expensive intersection tests. This approach was first introduced for ray tracing complex scenes by Kay and Kajiya in 1986, using slab-based bounding volumes to handle procedural and detailed environments. BVH construction typically employs top-down methods, starting from the root and recursively partitioning primitives to minimize expected traversal cost. The surface area heuristic (SAH) is the most widely adopted cost model, estimating the cost of a node split as C = C_{\text{trav}} + p_A \cdot C_A + p_B \cdot C_B, where C_{\text{trav}} is the cost of traversing the node, p_A and p_B are the probabilities of a ray intersecting the left and right child volumes (proportional to their surface areas relative to the parent), and C_A and C_B are the costs of testing those children (accounting for the number of primitives they contain). This heuristic, originally proposed by MacDonald and Booth in 1990 for space subdivision in ray tracing, favors splits that balance volume overlap and primitive distribution to optimize ray throughput. Bottom-up construction, by contrast, builds the hierarchy from leaf clusters upward, often using clustering techniques for initial grouping, though it is less common due to higher preprocessing overhead. For animated scenes requiring dynamic updates, incremental refitting or partial rebuilds are used to maintain tree quality without full reconstruction each frame, as demonstrated in fast SAH-based rebuilding algorithms that enable per-frame updates in interactive applications.
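The SAH cost formula above can be evaluated directly; the following Python sketch computes the split cost for candidate children, with the traversal and intersection cost constants chosen as illustrative defaults (real builders tune them empirically):

```python
def aabb_area(lo, hi):
    """Surface area of an axis-aligned box, used for the SAH probabilities."""
    dx, dy, dz = (h - l for h, l in zip(hi, lo))
    return 2.0 * (dx * dy + dy * dz + dz * dx)

def sah_cost(parent_area, area_a, n_a, area_b, n_b,
             c_trav=1.0, c_isect=1.0):
    """SAH cost of splitting a node into children A and B:

        C = C_trav + p_A * n_A * C_isect + p_B * n_B * C_isect

    where p_A = area_A / area_parent approximates the probability that a
    ray traversing the parent also enters child A. Constants are
    illustrative; builders pick the split with the lowest cost.
    """
    p_a = area_a / parent_area
    p_b = area_b / parent_area
    return c_trav + p_a * n_a * c_isect + p_b * n_b * c_isect
```

For example, splitting 8 triangles into two children of equal half-area (p = 0.5) with 4 triangles each costs 1 + 0.5·4 + 0.5·4 = 5.0 expected units, versus 8.0 for testing all primitives in an unsplit leaf, so the split is kept.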
Traversal of a BVH involves a ray "walking" down the tree, testing intersection with each node's bounding volume and recursing into intersecting children until reaching leaves, where primitive intersection tests occur. Traditional stack-based traversal maintains a stack of pending nodes to handle branching, enabling depth-first exploration but incurring memory overhead for the stack. Stackless variants, such as those using restart trails or state-based logic, eliminate the stack by encoding traversal state implicitly in node pointers and ray parameters, reducing per-ray memory to a few registers and improving efficiency—particularly beneficial in GPUs with limited register files. These methods, refined in works like Hapala et al.'s algorithm, achieve performance near-equivalent to stack-based approaches while minimizing divergence in execution. In ray-tracing hardware, BVH traversal is optimized for coherent batches—groups of rays sharing similar directions or origins—to exploit spatial and temporal locality, allowing SIMD or SIMT processing to traverse multiple rays simultaneously with reduced branch divergence. Node compression techniques further enhance cache utilization; for instance, quantized representations and shortened node references (e.g., 2-byte indices for child pointers in large scenes) shrink node footprints, enabling larger hierarchies to reside in on-chip caches and accelerating traversal rates. These hardware-oriented adaptations build on foundational BVH designs, prioritizing low-latency intersection tests for AABBs to sustain high ray throughput in rendering pipelines.
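A schematic version of the stack-based traversal loop can be written in a few lines; the node layout and test callables below are illustrative stand-ins for the hardware's AABB and primitive intersection units, not any driver API:

```python
def traverse(root, ray_hits_box, ray_hits_prim):
    """Depth-first, stack-based BVH traversal (schematic).

    `ray_hits_box(node)` stands in for the slab test against a node's
    AABB, and `ray_hits_prim(prim)` for the primitive intersection test;
    both are assumed callables supplied by the caller. Nodes are dicts:
    internal nodes carry 'children', leaves carry 'prims'. Returns all
    primitives the ray actually hits.
    """
    hits, stack = [], [root]
    while stack:
        node = stack.pop()
        if not ray_hits_box(node):
            continue  # prune this entire subtree
        if "prims" in node:  # leaf: run the expensive primitive tests
            hits.extend(p for p in node["prims"] if ray_hits_prim(p))
        else:  # internal node: defer children on the stack
            stack.extend(node["children"])
    return hits
```

The stack here is exactly the per-ray memory overhead that stackless hardware variants eliminate by encoding traversal state in a few registers instead.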

Dedicated processing units

Dedicated processing units in ray-tracing hardware refer to specialized blocks within GPUs that offload ray-tracing workloads from general-purpose cores, focusing on accelerating core operations like spatial queries and geometric computations. These units emerged to address the irregular and memory-intensive nature of ray tracing, which challenges traditional SIMD architectures used in rasterization. By dedicating silicon to ray-specific tasks, they enable real-time performance in complex scenes, marking a shift from software-only implementations to hardware-accelerated pipelines. Key types of dedicated units include traversal units for navigating bounding volume hierarchies (BVHs) and intersection units for testing rays against primitives, often tightly integrated to minimize latency in hit finding. Any-hit shaders serve as specialized processing stages for evaluating material properties during traversal of non-opaque geometry, allowing efficient handling of effects like alpha-tested transparency without full visibility shading. These units typically operate in a massively parallel manner, processing thousands of rays simultaneously to exploit the inherent independence of ray paths. The ray-tracing pipeline in these units encompasses several stages: ray setup generates initial rays from the camera or light sources; BVH traversal identifies candidate primitives by descending the acceleration structure; hit detection performs precise intersection tests; shading computes color contributions; and compaction reorganizes active rays and hits for memory-efficient processing in subsequent passes. This staged design reduces global memory accesses and supports incoherent ray bundles, common in secondary rays for global illumination. Integration models vary between fixed-function application-specific integrated circuits (ASICs), which hardwire operations for peak efficiency in standard algorithms, and programmable units that allow runtime reconfiguration via shaders or microcode for adaptability to new techniques.
Hybrid models combine fixed traversal hardware with programmable shading to balance specialization and flexibility, sometimes incorporating auxiliary units like tensor processors for post-processing tasks such as denoising. Efficiency gains stem from tailored arithmetic logic units (ALUs) optimized for ray-box tests, enabling parallel evaluation of multiple axes in a single cycle, alongside power optimizations like dynamic clock gating and thread-level scheduling to idle underutilized resources. These features can yield significant throughput improvements, with architectures achieving sustained ray processing rates suitable for interactive rendering while consuming less power than emulated software paths. The general evolution traces from early standalone accelerators, exemplified by the CausticOne—an add-in board launched in 2009 that processed ray intersections via dedicated hardware—to modern co-processors embedded in GPUs and SoCs, facilitating seamless integration with existing graphics pipelines. Seminal works, such as the programmable ray processing unit concept, laid foundational designs for these evolutions by demonstrating scalable hardware for ray traversal.
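The staged pipeline described above (setup, traversal, hit detection, shading, compaction) can be sketched as a software loop over a wavefront of rays; the `traverse`, `intersect`, and `shade` callables are assumed placeholders for the fixed-function units, and this is a schematic illustration of the staging, not any vendor's actual pipeline:

```python
def trace_wavefront(rays, traverse, intersect, shade):
    """Schematic wavefront pipeline: traversal -> hit detection ->
    shading -> compaction, iterated once per bounce.

    `shade` returns a dict with 'terminated', 'radiance', and
    'next_ray' fields (an assumed convention for this sketch).
    Compaction keeps only rays that spawned a secondary bounce, which
    is what keeps memory access dense across passes.
    """
    results = []
    while rays:
        candidates = [(r, traverse(r)) for r in rays]          # BVH traversal
        hits = [(r, intersect(r, c)) for r, c in candidates]   # hit detection
        shaded = [shade(r, h) for r, h in hits]                # shading
        # Compaction: separate finished rays from continuing bounces.
        results.extend(s["radiance"] for s in shaded if s["terminated"])
        rays = [s["next_ray"] for s in shaded if not s["terminated"]]
    return results
```

Running each stage over the whole wavefront before advancing, rather than tracing one ray to completion at a time, is what lets hardware batch memory accesses and keep SIMD lanes occupied even for incoherent secondary rays.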

Vendor-Specific Implementations

NVIDIA's RT Cores

NVIDIA's RT Cores represent a dedicated hardware unit integrated into the company's GPU streaming multiprocessors (SMs), designed specifically to handle the computationally intensive aspects of ray tracing, such as bounding volume hierarchy (BVH) traversal and ray-triangle intersection testing. Introduced with the Turing architecture in 2018, these cores offload ray-tracing workloads from general-purpose CUDA cores, enabling real-time performance in interactive applications. Each RT Core operates autonomously within an SM, processing rays in queues via micro-scheduling to optimize throughput and minimize latency, while employing BVH compression techniques to reduce memory bandwidth demands. This architecture forms the foundation for NVIDIA's RTX platform, which combines ray tracing with AI-driven features like DLSS for hybrid rendering that balances performance and visual fidelity. The first-generation RT Cores debuted in the GeForce RTX 20 series GPUs based on the Turing microarchitecture, marking NVIDIA's entry into hardware-accelerated real-time ray tracing. These cores focused primarily on accelerating BVH traversal—efficiently navigating hierarchical scene representations—and performing ray-triangle intersection tests to determine visibility and shading. Integrated into each SM of GPUs like the TU102 (RTX 2080 Ti), they achieved a peak throughput of 10 GigaRays per second in the flagship RTX 2080 Ti, a significant leap from software-based ray tracing on prior architectures like Pascal, which managed only about 1.1 GigaRays per second. This generation laid the groundwork for APIs such as DirectX Raytracing (DXR) and OptiX, enabling effects like realistic shadows, reflections, and global illumination in games and professional visualization tools. Building on Turing, the second-generation RT Cores arrived with the Ampere architecture in 2020, powering the GeForce RTX 30 series and delivering approximately double the ray-triangle intersection throughput compared to their predecessors.
This improvement stemmed from enhanced fixed-function hardware for intersection testing and the addition of an Interpolate Triangle unit, which supported accurate handling of motion blur in dynamic scenes by interpolating vertex positions over time. In the RTX 3090 flagship, these cores provided up to 69 RT-TFLOPS of ray-tracing performance, allowing for up to 8x faster rendering of ray-traced motion blur relative to Turing. Deeper integration within the SM further enabled concurrent execution of ray tracing alongside denoising and shading tasks, reducing overall pipeline stalls and improving efficiency in complex workloads. The third-generation RT Cores, introduced in the Ada Lovelace architecture of 2022 for the GeForce RTX 40 series, emphasized optimizations for alpha-tested geometry and traversal in intricate scenes. Achieving 2x the ray-triangle intersection speed over Ampere (and 4x over Turing), these cores incorporated the Opacity Micromap Engine to accelerate alpha-tested traversal by 2x through direct opacity mask evaluation, bypassing unnecessary any-hit shader invocations. Additionally, the Displaced Micro-Mesh Engine facilitated 10x faster BVH construction and 20x reduced BVH memory usage by generating micro-triangles on demand for displaced surfaces. The RTX 4090 flagship delivered 191 RT-TFLOPS, enabling robust handling of high-fidelity ray-traced effects in professional rendering and gaming, with reduced overhead for scenes featuring dense foliage or transparent materials. NVIDIA's fourth-generation RT Cores, featured in the Blackwell architecture of the RTX 50 series launched in 2025, extend capabilities toward large-scale geometry and deeper AI integration for advanced rendering. Doubling the ray-triangle intersection rate over Ada, these cores include a Triangle Cluster Intersection Engine for accelerating ray tracing of mega-geometry clusters, such as those in virtualized asset pipelines, and support for Linear Swept Spheres (LSS) to trace fine details like hair and fur with 2x faster performance and 5x less VRAM than traditional triangle-based methods.
Enhanced BVH structures, including Cluster-Level Acceleration Structures (CLAS) and Partitioned Top-Level Acceleration Structures (PTLAS), optimize level-of-detail updates for scenes with thousands of objects. The RTX 5090 achieves over 317 RT-TFLOPS, paired with Shader Execution Reordering (SER) 2.0 for 2x improved efficiency and neural rendering techniques like neural radiance caching for AI-accelerated denoising, pushing boundaries in photorealistic gaming and professional visualization.

AMD's Ray Accelerators

AMD's Ray Accelerators represent a unified approach to hardware-accelerated ray tracing, integrated directly into the compute units (CUs) of its RDNA architectures rather than as separate dedicated cores. This design allows ray-tracing operations to leverage the existing shader infrastructure for hybrid rasterization and ray-tracing workloads, emphasizing efficiency in real-time rendering. Each enhanced CU in RDNA architectures includes one Ray Accelerator, optimized for bounding volume hierarchy (BVH) traversal, intersection testing, and culling to reduce computational overhead. The first-generation Ray Accelerators debuted in the RDNA 2 architecture with the Radeon RX 6000 series in 2020, powering GPUs like the RX 6800 and RX 6900 XT. These accelerators focus on accelerating ray intersection and culling operations, enabling support for DirectX Raytracing (DXR) and Vulkan ray tracing extensions. In professional applications, they deliver up to 194% faster performance compared to the prior GCN architecture, as demonstrated in tools like SOLIDWORKS Visualize. The integration within CUs allows seamless handling of ray-tracing kernels as compute shaders, with performance scaling based on the number of CUs—up to 80 in flagship models. Building on this foundation, the second-generation Ray Accelerators in the RDNA 3 architecture, introduced with the Radeon RX 7000 series in 2022, enhance ray-tracing efficiency through architectural tweaks like improved two-stage ray scheduling to discard empty ray quads. While the core intersection throughput per accelerator remains similar to RDNA 2 (4 box tests or 1 triangle test per cycle), overall ray-tracing performance improves by up to 80% in select games due to better BVH hierarchies and variable rate shading (VRS) integration, which ties ray density to shading rates for optimized rendering. Flagship models like the RX 7900 XTX feature up to 96 CUs, further boosting aggregate throughput.
The third-generation Ray Accelerators in the RDNA 4 architecture, launched with the Radeon RX 9000 series in early 2025, double the intersection throughput over RDNA 3 by incorporating two intersection engines per accelerator, enabling 8 box tests or 2 triangle tests per cycle. This redesign supports advanced features like 8-wide BVH traversal and oriented bounding boxes (OBBs) for tighter geometry fits, improving coherence in animated BVHs and enhancing capabilities for more realistic lighting in games. Performance metrics show mid-range GPUs like the RX 9070 achieving approximately 112 billion box tests per second, establishing strong scalability in hybrid workloads. Compared to NVIDIA's RT Cores, AMD's approach excels in raster-ray hybrids by minimizing context switching overhead. A key unique aspect of AMD's Ray Accelerators is their tight integration with compute units, avoiding the need for separate hardware blocks and allowing flexible allocation of resources across raster, compute, and ray-tracing tasks. Technologies like Smart Access Memory further accelerate BVH construction by enabling faster CPU-GPU data sharing, reducing build times in dynamic scenes. Since 2020, these accelerators have been pivotal in console adoption, forming the basis for ray tracing in the PlayStation 5 and Xbox Series X, where they enable effects like reflections and shadows in titles such as Spider-Man: Miles Morales.

Intel's XMX and Arc RT units

Intel's ray-tracing hardware primarily revolves around the Xe architecture family, which integrates XMX (Xe Matrix Extensions) engines for AI-accelerated computations and dedicated Ray Tracing Units (RTUs) for efficient BVH traversal and intersection testing. The XMX engines, consisting of systolic arrays optimized for matrix multiplications, support data types like FP16, BF16, and INT8, enabling up to 275.2 TOPS of INT8 performance in high-end configurations, which aids ray-tracing workloads through AI upscaling and denoising. These units are paired with RTUs that handle 12 ray-box intersections and 1 ray-triangle intersection per clock, incorporating a dedicated BVH cache and Thread Sorting Unit (TSU) to improve SIMD coherence and reduce divergence in ray queries. The foundational implementation appears in the Xe-HPG microarchitecture, introduced in 2021 and powering the discrete Arc Alchemist GPUs launched in 2022. Each Xe-core in Xe-HPG includes 16 XMX engines and 1 RTU, scalable up to 32 Xe-cores with 32 RTUs in flagship models like the Arc A770, supporting real-time ray tracing for effects such as shadows and reflections via a hybrid rasterization model. This first-generation design achieves up to 384 ray-box and 32 ray-triangle intersections per clock across the GPU, with full compliance to DirectX Raytracing (DXR) 1.0 and 1.1, including RayQuery for shader-based acceleration. Subsequent advancements in the Arc Battlemage series, based on the Xe2 architecture and released in 2024, introduce second-generation RTUs with enhanced throughput and efficiency, offering approximately 70% better performance per Xe-core compared to Alchemist. These improvements include beefed-up ray-tracing accelerators (RTAs) for higher intersection rates and better caching, alongside integration with the Xe Media Engine's encoding capabilities to support ray-traced content streaming at high quality.
The Xe2 cores retain the XMX structure but with refined matrix operations, targeting budget discrete GPUs like the Arc B580 for improved ray tracing in modern engines. For integrated solutions, Intel incorporates ray-tracing hardware in the Xe-LPG microarchitecture within Core Ultra processors, debuting in Meteor Lake (2023) and advancing to Xe2 in Lunar Lake (2024). Xe-LPG features up to 8 Xe-cores with corresponding RTUs, emphasizing low-power traversal via fixed-function RTAs and large L1 caches with bandwidth exceeding 1 TB/s, suitable for ray tracing in DXR-compliant titles. Lunar Lake's Xe2 iteration adds 8 next-generation RTUs per configuration, delivering over 50% gaming performance uplift including ray tracing, while maintaining power efficiency for mobile use. Key features bolstering ray-tracing performance include DP4a instructions, which perform four INT8 dot products per 32-bit operation to accelerate ray math on compatible hardware, and Xe Super Sampling (XeSS), an AI upscaling technology leveraging XMX for up to 145% frame rate gains in ray-traced scenes at high resolutions. XeSS operates in modes like Performance and Quality, providing improved image quality and temporal stability, with DP4a enabling cross-vendor support albeit at reduced efficiency compared to native XMX. Post-launch, Intel focused on driver maturity for Arc GPUs, achieving full DXR compliance by 2023 and iterative optimizations that addressed initial stability issues in ray-tracing workloads.
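To clarify what a DP4a instruction computes, the following Python sketch models its semantics—a dot product of four signed 8-bit lanes packed into 32-bit words, accumulated into a 32-bit integer. This is a software model of the arithmetic only, not vendor microcode:

```python
def pack_int8(vals):
    """Pack four signed 8-bit values into one 32-bit word, lane 0 lowest."""
    word = 0
    for i, v in enumerate(vals):
        word |= (v & 0xFF) << (8 * i)
    return word

def dp4a(a_packed, b_packed, acc=0):
    """Model of DP4a semantics: unpack four signed 8-bit lanes from each
    32-bit operand, multiply lane-wise, and add the sum to a 32-bit
    accumulator. Hardware does this in a single instruction.
    """
    def lanes(word):
        out = []
        for i in range(4):
            byte = (word >> (8 * i)) & 0xFF
            out.append(byte - 256 if byte >= 128 else byte)  # sign-extend
        return out
    return acc + sum(x * y for x, y in zip(lanes(a_packed), lanes(b_packed)))
```

For instance, `dp4a(pack_int8([1, 2, 3, 4]), pack_int8([5, 6, 7, 8]))` evaluates 1·5 + 2·6 + 3·7 + 4·8 = 70 in one step, which is why quantized INT8 math in upscalers like XeSS maps efficiently onto such instructions.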

Mobile and embedded solutions

Ray-tracing hardware for mobile and embedded systems prioritizes low-power consumption and efficient integration within ARM-based processors, enabling real-time ray-traced effects in smartphones, tablets, and specialized devices like automotive systems. These implementations often feature dedicated acceleration units tailored for bounding volume hierarchy (BVH) traversal and intersection testing, but with optimizations to mitigate thermal and energy constraints compared to desktop counterparts. Apple introduced hardware-accelerated ray tracing in its A17 Pro chip, powering the iPhone 15 Pro launched in 2023, exposed through the Metal graphics framework, which supports ray-traced effects for enhanced visual fidelity in augmented reality (AR) and virtual reality (VR) applications. The A17 Pro's six-core GPU includes dedicated ray-tracing hardware capable of processing approximately 10-15 gigaflops of ray-tracing operations, facilitating features like dynamic lighting and reflections in mobile games. Similarly, the M3 series chips, debuted in 2023 for Macs, incorporate a hardware BVH builder and ray-tracing acceleration in their next-generation GPUs, enabling up to 20% performance gains over prior generations for mesh shading and ray-traced rendering. ARM's Immortalis-G715 GPU, announced in 2022 and integrated into MediaTek's Dimensity 9200 chipset, brings hardware ray tracing to flagship Android devices with support for Vulkan ray-tracing extensions, delivering 15% improved performance and energy efficiency for mobile gaming. This GPU, configurable up to 16 cores, accelerates ray-traced shadows and global illumination while incorporating machine learning enhancements for upscaling, as seen in high-end smartphones from MediaTek partners. Imagination Technologies' PowerVR RT cores, evolved for ARM ecosystems, further enable Vulkan ray tracing in Samsung's earlier Exynos processors, providing low-latency traversal for embedded graphics.
Qualcomm's Adreno 7xx series GPUs, starting with the Adreno 740 in Snapdragon 8 Gen 2 (2023), introduced hardware ray-tracing extensions for mobile platforms, supporting ray queries and pipelines to simulate realistic lighting with reduced overhead. Samsung's Exynos 2400, featured in the 2024 Galaxy S24 series, builds on this with an RDNA 3-based GPU that handles ray-tracing workloads, achieving leading ray-tracing efficiency among mobile chips at up to 7.4 megapixels per second in benchmarks. These advancements allow for console-like effects, such as full-scene ray tracing in select titles, while maintaining thermal stability. In embedded applications, NVIDIA's DRIVE Orin platform (2022) integrates Ampere-based GPUs with ray-tracing cores for automotive use, enabling high-fidelity sensor simulation and visualization in safety-critical environments with up to 254 TOPS of compute. For IoT and low-power embedded systems, SiliconArts' RayCore MC IP provides a dedicated ray-tracing core, delivering over 300 million path-traced rays per second per mm² at under 5 million rays per mW, suitable for AR/VR headsets and edge devices. Power and thermal limitations in mobile and embedded ray-tracing hardware necessitate simplifications, such as non-recursive BVH traversal and hybrid rasterization-ray tracing pipelines, to avoid excessive computational intensity and bandwidth demands. These constraints lead to implementations focused on primary rays and basic effects, with full path tracing often deferred to offline rendering to preserve battery life and prevent throttling.
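The BVH traversal these units accelerate can be sketched in a few lines. The following illustrative Python model (names and node layout are assumptions, not any vendor's format) shows the iterative, stack-based pattern that hardware and mobile drivers favor over recursion: a slab test prunes subtrees whose bounding boxes a ray misses, and only surviving leaves are reported for intersection testing.

```python
def ray_aabb_hit(origin, inv_dir, lo, hi):
    """Slab test: does the ray origin + t*dir (t >= 0) cross box [lo, hi]?"""
    tmin, tmax = 0.0, float("inf")
    for o, inv, l, h in zip(origin, inv_dir, lo, hi):
        t0, t1 = (l - o) * inv, (h - o) * inv
        if t0 > t1:
            t0, t1 = t1, t0
        tmin, tmax = max(tmin, t0), min(tmax, t1)
    return tmin <= tmax

def traverse(nodes, origin, direction):
    """Iterative BVH traversal with an explicit stack (no recursion).
    nodes maps id -> (lo, hi, payload); payload is a list of child ids
    for interior nodes or a primitive label for leaves."""
    inv = tuple(1.0 / d if d != 0 else float("inf") for d in direction)
    hits, stack = [], [0]          # start at the root, node id 0
    while stack:
        lo, hi, payload = nodes[stack.pop()]
        if not ray_aabb_hit(origin, inv, lo, hi):
            continue               # prune the whole subtree
        if isinstance(payload, list):
            stack.extend(payload)  # interior node: push children
        else:
            hits.append(payload)   # leaf: candidate primitive
    return hits
```

A ray entering only the lower-left child of a two-leaf tree reports just that leaf, illustrating the logarithmic pruning that makes BVHs viable in power-constrained designs.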

Supporting Software and APIs

DirectX Raytracing

DirectX Raytracing (DXR), introduced by Microsoft in March 2018 as an extension to DirectX 12, enables developers to integrate ray tracing into graphics pipelines for enhanced rendering effects such as realistic reflections, shadows, and global illumination. The initial version, DXR 1.0 (Tier 1.0), provided core functionality including ray generation shaders, closest-hit and any-hit shaders for material interactions, miss shaders for non-intersections, and dispatch calls like DispatchRays() to initiate ray tracing execution. It relies on acceleration structures—typically bounding volume hierarchies (BVHs)—built via APIs such as BuildRaytracingAccelerationStructure to optimize ray-geometry intersections, with TraceRay() and inline RayQuery operations allowing flexible ray traversal within shaders. Hardware support for Tier 1.0 and above requires DirectX 12-compatible GPUs with dedicated ray-tracing units, such as NVIDIA's RTX 20 series and later, ensuring efficient hardware-accelerated traversal. DXR is structured around progressive tiers to support incremental adoption and hardware evolution. Tier 1.1, released in November 2019 and enabled in the Windows 10 May 2020 Update, expanded capabilities with GPU-initiated DispatchRays() via ExecuteIndirect(), inline ray tracing for better integration with rasterization, support for additional ray flags (e.g., skipping certain geometry), and enhanced vertex formats. Tier 1.2, announced at GDC 2025, introduces advanced optimizations including Shader Execution Reordering (SER) for dynamic shader invocation efficiency and Opacity Micromaps (OMM) for accelerated handling of transparent or alpha-tested geometry, achieving up to 2.3x performance gains in path-traced scenes. These tiers integrate seamlessly with DirectX 12's hybrid rasterization and ray tracing pipelines, allowing developers to blend traditional rendering with ray-traced effects while maintaining compatibility through feature queries like CheckFeatureSupport.
Key components of DXR include programmable shaders compiled into state objects for pipeline management—ray generation shaders serve as entry points, closest-hit shaders compute material shading on intersections, any-hit shaders evaluate early termination (e.g., for opacity), and miss shaders handle background rendering. Acceleration structure APIs enable building bottom-level structures for geometry and top-level instances for scene composition, with update flags like ALLOW_UPDATE for dynamic scenes. Execution is driven by DispatchRays or inline queries, supporting recursion depths up to 31 and stack sizes queried via GetShaderStackSize, all bound through shader resource views for GPU access. For non-hardware scenarios, DXR can fall back to software implementations using compute shaders, though performance is significantly lower without dedicated units. DXR has seen broad adoption in game engines and titles, powering real-time ray tracing in DirectX-based applications on Windows. Unreal Engine 5, for instance, leverages DXR for features like Lumen global illumination and Nanite virtualized geometry, enabling hybrid rendering in shipping titles and technology demos. Other DirectX games utilize DXR for path-traced effects combined with upscaling. As of 2025, DXR 1.2 previews emphasize enhanced support through SER and OMM integration, alongside neural rendering extensions for denoising and upscaling, further boosting efficiency in complex scenes.
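The control flow among these shader roles can be illustrated with a deliberately tiny Python analogue (not the real HLSL/DXR API): candidate intersections stand in for acceleration-structure traversal, the nearest one is routed to a closest-hit "shader", and an empty list falls through to the miss "shader".

```python
def trace_ray(hits, closest_hit, miss):
    """Analogue of DXR's TraceRay(): 'hits' is a list of (t, payload)
    candidate intersections produced by traversal; the nearest is routed
    to the closest-hit shader, and no hits fall through to miss."""
    if not hits:
        return miss()
    t, payload = min(hits)          # nearest intersection wins
    return closest_hit(t, payload)

# Illustrative "shader table": each role is just a Python function here.
def closest_hit(t, material):
    return f"shade {material} at t={t}"

def miss():
    return "sky color"
```

Calling `trace_ray([(2.0, "metal"), (1.0, "glass")], closest_hit, miss)` shades the closer glass hit, while an empty candidate list returns the miss shader's background color—mirroring how DXR dispatches shader records per ray outcome.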

Vulkan and OptiX

Vulkan's ray tracing extensions provide a cross-platform API for leveraging hardware-accelerated ray tracing on GPUs from multiple vendors, including NVIDIA, AMD, and Intel. The core extension, VK_KHR_ray_tracing_pipeline, grew out of the vendor-specific VK_NV_ray_tracing extension introduced in 2018 and was promoted to Khronos-ratified status in 2020, enabling developers to build ray tracing pipelines that integrate with existing applications. This extension relies on acceleration structures, such as bottom-level acceleration structures (BLAS) for geometry representation and top-level acceleration structures (TLAS) for instance hierarchies, which are built and managed via the companion VK_KHR_acceleration_structure extension to optimize ray traversal efficiency. Shader binding tables (SBTs) serve as a key mechanism, allowing dynamic binding of ray generation, hit, miss, and callable shaders to specific ray types and outcomes during execution. These components support hardware ray tracing on compatible GPUs, promoting portability across Windows, Linux, and other platforms without vendor lock-in. Key features of Vulkan ray tracing include dynamic rendering via VK_KHR_dynamic_rendering, which eliminates the need for fixed render passes and allows flexible hybrid rasterization and ray tracing workflows. Variable rate shading (VRS), extended through VK_KHR_fragment_shading_rate, can be combined with ray tracing to apply coarser shading rates in less critical areas, reducing computational overhead while maintaining visual fidelity. In 2024, the Vulkan roadmap introduced enhancements such as improved integration with mesh shaders (VK_EXT_mesh_shader) for handling meshlets—compact primitives that streamline draw calls—and better multi-threading support through expanded multi-draw indirect commands (VK_KHR_draw_indirect_count), enabling more efficient parallel command buffer recording for complex scenes.
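A sense of how SBTs route rays to shaders comes from the record-index arithmetic the pipeline performs on a hit: the instance's SBT offset (stored in the TLAS), a per-geometry stride, and a per-ray-type offset (both passed to the trace call) combine into one table index. The Python below is a simplified sketch of that addressing scheme, not the exact Vulkan specification formula.

```python
def hit_group_record_index(instance_sbt_offset, sbt_record_stride,
                           geometry_index, sbt_record_offset):
    """Simplified hit-group SBT record selection: the TLAS instance offset,
    the per-geometry stride, and the ray-type offset (both trace-call
    parameters) sum into a single shader-record index."""
    return (instance_sbt_offset
            + sbt_record_stride * geometry_index
            + sbt_record_offset)

# e.g. instance offset 4, stride 2 records per geometry, geometry #3,
# ray type 1 (a shadow ray) -> record 4 + 2*3 + 1 = 11
```

This layout is what lets one TLAS serve several ray types (primary, shadow, reflection) by striding through hit-group records rather than rebuilding pipelines per ray.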
In November 2025, the Khronos Group released VK_EXT_ray_tracing_invocation_reorder, enabling Shader Execution Reordering (SER) in Vulkan ray tracing pipelines, which can improve performance by up to 47% in divergent workloads such as path tracing. These updates facilitate scalable ray tracing implementations, particularly for applications requiring high throughput. NVIDIA's OptiX, first released in 2009 as an SDK for programmable ray tracing, evolved into a CUDA-based framework optimized for NVIDIA GPUs, with OptiX 7 (launched in 2019) marking a shift to a low-level, CUDA-centric API that grants developers fine-grained control over memory, traversal stacks, and custom intersection shaders. Later versions, starting from 7.1, integrated advanced features like curve primitives for hair and fur rendering, tiled denoising for progressive rendering, and AI-accelerated denoising using NVIDIA's OptiX Denoiser, which leverages Tensor Cores to reduce noise in low-sample path-traced images. OptiX 7.2 further enhanced the denoiser with support for layered arbitrary output variables (AOVs), allowing separate denoising passes for elements like albedo and normals to improve compositing control in production pipelines. The framework is prominently used in NVIDIA Omniverse, a collaborative platform for 3D design and rendering, where it powers ray tracing for virtual collaboration and digital twins. Vulkan's cross-platform nature enables unified development for consoles and PCs, allowing developers to share codebases across PC (Windows/Linux) and console ecosystems for consistent ray-traced effects like reflections and shadows, though platforms like PlayStation use proprietary APIs. In contrast, OptiX excels in offline and professional rendering workflows, offering recursive ray tracing pipelines that integrate seamlessly with CUDA for compute-intensive tasks such as light-transport simulation in film and visual effects. OptiX 9.0, released in February 2025, includes optimizations for NVIDIA's Blackwell architecture, leveraging features like Shader Execution Reordering (SER) for improved efficiency.
On Blackwell GPUs, ray tracing benefits from up to 2x faster RT Core throughput compared to the prior generation, enabling full-scene path tracing at interactive frame rates. These advancements, combined with Vulkan's ongoing extensions, underscore a trend toward APIs that blend portability with hardware-specific accelerations for broader adoption in real-time graphics.

Performance and Optimization

Key metrics and benchmarks

Evaluating ray-tracing hardware requires standardized metrics that capture the efficiency of core operations such as ray traversal through acceleration structures and intersection testing with scene geometry. One primary metric is RT-TFLOPS, which quantifies the teraflops of computational throughput dedicated to ray-triangle intersection testing, a core operation in ray-tracing pipelines accelerated by specialized units like RT Cores. Another key measure is GigaRays/s, representing the number of rays processed per second, which reflects overall ray-casting throughput in real-time scenarios. BVH build time, typically expressed in milliseconds per frame, assesses the cost of constructing or refitting the bounding volume hierarchy (BVH), an essential acceleration structure for efficient ray traversal in dynamic scenes. Benchmark suites provide frameworks for consistent evaluation, incorporating scene complexity factors like triangle count and ray depth to simulate real-world demands. LumiBench, an open-source suite for GPU ray-tracing analysis, tests workloads with scenes ranging from procedural geometries (zero triangles) to highly detailed models exceeding 20 million triangles, while varying ray depths for effects like path tracing recursion. For professional applications, SPECviewperf 15 includes ray-tracing-specific subtests such as Enscape-01, which leverages GPU ray tracing for architectural visualization, evaluating performance across updated application traces. These suites emphasize factors like primary ray depth for shadows and secondary bounces for global illumination, ensuring benchmarks reflect computational intensity. Testing protocols in 2025 increasingly focus on hybrid workloads that integrate ray tracing with rasterization for balanced rendering pipelines, measuring end-to-end frame times in combined RT + shading scenarios. Power efficiency, often reported as RT-TFLOPS per watt, accounts for thermal and energy constraints in sustained workloads.
Emerging standards incorporate path-tracing workloads, which simulate full light transport with multiple bounces, as seen in updated SPECviewperf subtests and Vulkan-based simulations that track traversal ratios and shader overhead. Metrics have evolved significantly since 2018, with hardware ray-tracing throughput improving by over an order of magnitude through generational advancements in dedicated units; for instance, second-generation accelerators doubled intersection testing rates compared to first-generation designs, reaching up to 58 RT-TFLOPS from prior baselines around 34 RT-TFLOPS. Factors like denoising overhead are now routinely factored into benchmarks to evaluate AI-accelerated cleanup for interactive rates. General trends indicate that achieving 60 frames per second with ray tracing in complex scenes—featuring millions of triangles and deep ray bounces—demands substantial RT-TFLOPS to maintain playable performance without excessive latency.
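The relationship between these metrics and frame-rate targets is simple arithmetic. The sketch below (illustrative figures, not vendor specifications) estimates the GigaRays/s a workload demands from resolution, samples per pixel, bounce depth, and target frame rate:

```python
def required_giga_rays(width, height, samples_per_pixel, bounces, fps):
    """Back-of-envelope throughput estimate: rays per second needed for a
    target resolution, sample count, bounce depth, and frame rate,
    expressed in GigaRays/s (ignores shading and denoising cost)."""
    # One primary ray per sample plus one ray per secondary bounce.
    rays_per_frame = width * height * samples_per_pixel * (1 + bounces)
    return rays_per_frame * fps / 1e9

# 4K at 60 fps with 1 sample per pixel and 2 secondary bounces:
# 3840 * 2160 * 1 * 3 * 60 / 1e9 ≈ 1.49 GigaRays/s before shading
```

Real workloads add shadow and ancillary rays per bounce, which is why hardware throughput well above such raw estimates is needed for playable path tracing.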

Techniques for efficiency

To achieve real-time performance in ray-tracing hardware, ray sorting and binning techniques group rays with similar directions or origins to minimize divergence during traversal and improve SIMD utilization. These methods reorganize rays into coherent packets using hash-based binning or space-filling curves, reducing cache misses by ensuring sequential processing of rays that intersect nearby geometry. For instance, compressing ray origins to 16-bit floating-point formats further enhances efficiency by lowering bandwidth demands during traversal on GPUs. Such optimizations can yield up to 2x speedups in traversal-heavy workloads by exploiting spatial locality in bounding volume hierarchies (BVHs). Denoising plays a critical role in enabling low-sample ray tracing for real-time applications, where AI-based methods reconstruct clean images from noisy renders accumulated over multiple frames. Temporal accumulation integrates data from prior frames to stabilize outputs, with neural networks predicting and filtering variance to handle motion and reduce artifacts in indirect lighting. Hardware support, such as NVIDIA's use of Tensor Cores for accelerating these neural denoisers, allows for real-time processing at 1-4 samples per pixel by performing operations on low-precision data. This approach significantly lowers the ray count needed for acceptable quality, making path-traced effects feasible in dynamic scenes. Hybrid rendering combines ray tracing for secondary effects like shadows and reflections with rasterization for primary visibility, balancing computational cost and visual fidelity. Rasterization handles the base geometry pass efficiently on traditional pipelines, while ray tracing augments it only where needed, such as in specular highlights. To optimize BVH traversal in this setup, level-of-detail (LOD) techniques adapt hierarchy resolution based on distance or importance, using fused BVHs that merge multiple LODs into a single structure for reduced build times and memory footprint. This selective application sustains frame rates by limiting full ray tracing to 10-20% of pixels in typical scenes.
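The space-filling-curve idea behind coherence sorting can be shown with a minimal Morton-code sketch in Python (the quantization and key choice here are illustrative assumptions): interleaving the bits of quantized origin coordinates gives nearby rays nearby sort keys, so a sorted batch tends to traverse the same BVH nodes.

```python
def morton2d(x, y, bits=8):
    """Interleave the bits of quantized x and y into a Morton code, so
    spatially close coordinates map to numerically close codes."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)        # x bit -> even position
        code |= ((y >> i) & 1) << (2 * i + 1)    # y bit -> odd position
    return code

def sort_rays_by_origin(rays, cell=1.0):
    """Coherence-sorting sketch: order rays (here 2D origin tuples) by the
    Morton code of their quantized origin, grouping spatial neighbors so a
    SIMD batch visits similar BVH subtrees."""
    return sorted(rays, key=lambda r: morton2d(int(r[0] // cell),
                                               int(r[1] // cell)))
```

Real implementations typically extend this to 3D origins plus quantized directions and sort on the GPU, but the locality property is the same.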
Effective memory management is essential for ray-tracing accelerators, where on-chip memory stores traversal stacks to avoid frequent off-chip accesses during BVH navigation. Prefetching mechanisms anticipate ray paths by loading adjacent nodes into local caches, mitigating stalls in SIMD architectures. For animated scenes, dynamic BVH refits update hierarchies asynchronously on background threads, rebuilding only affected subtrees to maintain accuracy without full recomputation each frame. These strategies reduce memory traffic in hardware implementations, enabling higher throughput on resource-constrained GPUs. Advanced techniques like ReSTIR (Reservoir-based Spatiotemporal Importance Resampling) enhance efficiency by resampling light reservoirs across space and time, reusing valid samples from neighboring pixels and frames to converge indirect illumination with fewer rays. This method supports fully dynamic scenes at interactive rates, reducing variance in multi-bounce effects without complex data structures. By 2025, hardware acceleration for neural radiance fields (NeRFs) has emerged as a focus, with co-designed architectures optimizing rendering through Gaussian splatting and radiance caching on dedicated coprocessors. These integrate ray-tracing units for volume traversal, achieving real-time neural rendering speeds up to 10x faster than software baselines via hardware acceleration.
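The reservoir at the heart of ReSTIR is ordinary weighted reservoir sampling, sketched below in Python (a single-sample reservoir with an injectable random source for testing; the surrounding spatiotemporal reuse and unbiased-weight bookkeeping of full ReSTIR are omitted):

```python
import random

class Reservoir:
    """Single-sample weighted reservoir, the building block ReSTIR uses to
    pick one light sample from a candidate stream (and to merge reservoirs
    from neighboring pixels or prior frames)."""
    def __init__(self, rng=random.random):
        self.sample, self.w_sum, self.count, self.rng = None, 0.0, 0, rng

    def update(self, candidate, weight):
        # Keep each candidate with probability weight / w_sum, which leaves
        # every candidate selected in proportion to its weight overall.
        self.w_sum += weight
        self.count += 1
        if self.w_sum > 0 and self.rng() < weight / self.w_sum:
            self.sample = candidate
```

Because a reservoir summarizes an entire candidate stream in constant memory, merging a neighbor's reservoir costs one `update` call, which is what makes spatiotemporal reuse across pixels and frames so cheap.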

Real-time rendering in gaming

Real-time ray tracing first gained prominence in video games with the launch of NVIDIA's RTX platform in 2018, enabling hardware-accelerated effects in titles such as Battlefield V and Shadow of the Tomb Raider, which utilized ray-traced reflections and shadows to enhance visual fidelity. By 2020, adoption expanded to consoles with the release of PlayStation 5 and Xbox Series X, both powered by AMD's RDNA 2 architecture featuring dedicated ray accelerators for real-time lighting and reflections. Entering 2025, ray tracing has become common in AAA titles, with over 250 games supporting it as of June 2025, including advanced implementations like path tracing in Cyberpunk 2077's Overdrive Mode, which simulates fully dynamic lighting across complex urban environments. The technology primarily enables realistic lighting effects, such as ray-traced global illumination (RTGI) for accurate light bounces and indirect lighting, ray-traced reflections on surfaces like water and metal, and enhanced shadows that respond dynamically to light sources. On consoles, RDNA 2's ray accelerators support these features at playable resolutions, as seen in games where RTGI illuminates detailed environments without pre-baked lighting. These effects create more immersive atmospheres, with light interacting naturally across scenes, though they demand significant computational resources. Development tools like Unreal Engine 5 (UE5) have facilitated broader integration through features such as Lumen, a hybrid system that employs software and hardware ray tracing for dynamic global illumination and reflections, and Nanite, which handles high-fidelity geometry to complement ray-traced rendering. However, real-time ray tracing often leads to frame rate drops—sometimes halving performance at high settings—necessitating upscaling technologies like DLSS or FSR to maintain playable rates above 60 fps.
Notable case studies include Metro Exodus Enhanced Edition (2021), which pioneered full ray tracing by replacing traditional rasterized lighting with RTGI and ray-traced reflections, achieving Hollywood-level realism on RTX GPUs with DLSS boosting performance up to 2x at 4K. In 2025, Cyberpunk 2077 exemplifies advanced adoption with its path-traced Overdrive Mode, supporting 8K rendering on high-end hardware like the GeForce RTX 50 Series with DLSS, where multiple light bounces create photorealistic neon-lit streets, though stable frame rates still require upscaling. By 2025, ray tracing has solidified as a common feature in games, driving industry standards for visual quality and influencing hardware demands, with developers prioritizing it to reduce reliance on time-intensive baked lighting workflows. However, some 2025 titles have opted out of hardware ray tracing to prioritize broader accessibility and performance. This shift has elevated player immersion across platforms, making realistic rendering accessible even on mid-range systems via optimized APIs and upscaling.

Professional and scientific uses

In the visual effects (VFX) and animation industries, ray-tracing hardware accelerates offline rendering workflows in production renderers by leveraging GPU-accelerated ray tracing via NVIDIA's OptiX engine, enabling high-fidelity simulations of light transport for complex scenes. NVIDIA Omniverse facilitates real-time collaboration among distributed VFX teams, allowing iterative refinements to ray-traced assets without file transfers, which streamlines production pipelines for feature films. By 2025, major productions have integrated RTX hardware for rapid previews, reducing iteration cycles from days to hours during previsualization and lighting setup. Architectural visualization benefits from ray-tracing hardware through real-time walkthroughs in software like Twinmotion, which supports hardware ray tracing on GPUs to deliver photorealistic lighting and reflections directly from CAD models. Integration with CAD tools via plugins enables seamless import of BIM data, allowing architects to simulate accurate lighting and material interactions for client presentations and design validation. In automotive design, RTX-enabled ray tracing enhances material accuracy, such as rendering precise metallic and glass properties on vehicle prototypes, aiding engineers in evaluating finishes and styling without physical mockups. Scientific applications utilize ray-tracing hardware for volume rendering in medical imaging, where GPU acceleration processes CT and MRI datasets to visualize internal structures with high precision and reduced noise. In physics simulations, ray tracing models radiative transfer in climate studies, tracing photon paths through atmospheric volumes to compute energy balances with greater fidelity than traditional approximations. These hardware-accelerated methods support hybrid CPU/GPU workflows, where initial setup occurs on CPUs and intensive ray computations shift to GPUs for efficiency.
Overall, ray-tracing hardware in professional contexts yields significant benefits, including up to 100x speedups in denoising noisy ray-traced images via AI-accelerated techniques, transforming rendering times from days to hours in demanding VFX and simulation tasks. This efficiency enables more iterations in design reviews and scientific analyses, fostering innovation while maintaining physically accurate results.

Advancements beyond 2025

Following the advancements in dedicated ray-tracing hardware through 2025, next-generation architectures from major vendors are poised to enhance path tracing capabilities and integrate more deeply with AI acceleration. NVIDIA's Rubin architecture, anticipated for release in 2026, will feature enhanced GPU capabilities including upgraded RT cores for handling complex light interactions in dynamic scenes. Similarly, AMD's UDNA (or RDNA 5) GPUs, targeted for 2026-2027, incorporate patents detailing a redesigned ray accelerator with support for dense geometry formats and co-compute units, aiming to achieve performance parity with NVIDIA's Blackwell in ray-tracing workloads while enabling full path tracing at higher frame rates. Intel's Xe3 Celestial architecture for discrete GPUs, entering pre-silicon validation in 2025 with a projected launch in 2027, promises refined ray-tracing units integrated with AI upscaling, focusing on competitive efficiency for both discrete and integrated graphics. These developments build toward hardware-accelerated full path tracing, where multiple light bounces are computed in real time without relying heavily on approximations. AI synergies are central to post-2025 ray-tracing hardware, with on-chip neural accelerators increasingly used for techniques like radiance caching to optimize light transport. NVIDIA's Neural Radiance Caching (NRC), originally detailed in 2021 research, employs Tensor Cores to approximate radiance fields, reducing the computational load for path-traced scenes by caching neural network predictions that cut evaluation times by 2-25 times compared to traditional methods, enabling real-time performance in dynamic environments. This approach is set to deepen integration in future GPUs, where AI-driven denoising and caching will further minimize required samples for noise-free rendering.
AMD's FSR Redstone, extending into future architectures, incorporates neural radiance caching alongside machine learning-based ray regeneration, leveraging on-chip AI hardware to regenerate rays efficiently and reduce sample needs in path-traced pipelines. Such synergies not only accelerate rendering but also adapt to scene complexity, potentially halving noise in path-traced output without additional bounces. Emerging form factors will embed ray-tracing hardware more pervasively, particularly in AR and VR devices. Successors to devices like Apple's Vision Pro are shifting toward lightweight glasses, with Apple prioritizing development of AI-enhanced smart glasses by 2026 that could incorporate mobile GPUs supporting ray tracing for realistic overlays in mixed-reality environments. Meta's AR glasses prototype, evolving into consumer glasses around 2027, hints at integrated graphics capable of ray-traced effects for immersive holographics, driven by edge-optimized SoCs. Research into quantum-inspired algorithms offers additional potential; a 2022 study demonstrated up to 190% performance improvement (nearly 3x speedup) in ray-tracing workloads using hybrid quantum-classical methods, with ongoing advancements in 2025 exploring scalable implementations for complex scenes on specialized hardware. These integrations prioritize low-power operation, aligning with sustainability goals through efficient, evolving API extensions that unify ray-tracing access across devices while minimizing energy use in battery-constrained platforms. By 2030, market forecasts indicate ray-tracing technology will achieve near-universal adoption in consumer and professional graphics, with the global RT cores market expanding significantly due to gaming and AI-driven demand. Innovations like neuromorphic-inspired designs, though nascent, could enable adaptive rendering by mimicking neural efficiency for light transport simulations, addressing power challenges in edge devices.
Overall, these advancements emphasize scalable, AI-augmented hardware that balances performance with ethical considerations, such as reducing computational waste in rendering pipelines.

    Feb 19, 2023 · Internally, RDNA 2 treats raytracing kernels as compute shaders. This particular call launched 32,400 compute wavefronts. The 6900 XT's 40 WGPs ...
  43. [43]
    AMD Radeon RDNA 3 Architecture Overview: Efficiency Is King
    Rating 4.0 · Review by Zak KillianNov 14, 2022 · Along with a new two-stage scheduling algorithm to discard empty ray quads, AMD says RT performance is up by 80% from RDNA 2. The World's First ...
  44. [44]
    AMD RDNA 3 GPU Architecture Deep Dive - Tom's Hardware
    Jun 5, 2023 · AMD's RDNA 3 GPU architecture promises improved performance, more than a 50% boost in efficiency, better ray tracing hardware, and numerous ...Specifications · GPU Chiplets · Architecture · Compute Units
  45. [45]
    Raytracing on AMD's RDNA 2/3, and Nvidia's Turing and Pascal
    Mar 22, 2023 · RDNA 3 ends up averaging 45.2 billion box tests and 5.22 billion triangle tests per second in that shader.
  46. [46]
    RDNA 4's Raytracing Improvements - Chips and Cheese
    Apr 14, 2025 · BVH4x2 almost certainly takes two of those paths in parallel, with two paths popped from the LDS and tested in the Ray Accelerator with each ...Missing: third 8000 performance
  47. [47]
  48. [48]
    Introduction to the Xe-HPG Architecture - Intel
    Nov 4, 2022 · This white paper describes the components of Xe-HPG, a new high-performance graphics architecture designed for discrete GPUs.
  49. [49]
    Intel Xe HPG Graphics Architecture and Arc "Alchemist" GPU Detailed
    Aug 19, 2021 · The Arc "Alchemist" discrete GPU implements the Xe HPG (high performance gaming) graphics architecture, and offers full DirectX 12 Ultimate compatibility.
  50. [50]
    Intel Launches Arc B-Series Graphics Cards
    Dec 3, 2024 · New Xe-cores are supported by more capable ray tracing units, better mesh shading performance and improved support of key graphics functions ...
  51. [51]
    Raytracing on Intel's Arc B580 - by Chester Lam - Chips and Cheese
    Mar 14, 2025 · Intel's Battlemage architecture is stronger than its predecessor at raytracing, thanks to beefed up RTAs with more throughput and better caching ...
  52. [52]
    Raytracing on Meteor Lake's iGPU - Chips and Cheese
    Apr 15, 2024 · Meteor Lake's Xe-LPG architecture supports its raytracing acceleration units with fast dispatch hardware and large, high bandwidth L1 caches.
  53. [53]
    Intel details Arc "Xe-LPG" Meteor Lake GPU architecture, up to 8 Xe ...
    Sep 19, 2023 · Intel Xe-LPG architecture delivers higher frequency and larger GPU configurations than Xe-LP. A major upgrade to Intel's integrated graphics.
  54. [54]
    Intel Lunar Lake Technical Deep Dive - Battlemage Xe2 Graphics
    Rating 5.0 · Review by W1zzard (TPU)Jun 3, 2024 · Intel's iGPU for Lunar Lake is the first from the company based on the new Xe2 Battlemage graphics architecture. It offers a 50% gain in gaming performance.
  55. [55]
    Intel Details Inner Workings of XeSS - Tom's Hardware
    Aug 27, 2022 · XeSS operates on Intel's Arc XMX cores, but it can also run on other GPUs in a slightly different mode. DP4a instructions are basically four ...
  56. [56]
    Developers: Hardware-accelerated ray tracing improves lighting ...
    May 9, 2023 · Ray tracing is an advanced technique for closely approximating real-world lighting effects in a rendered environment.Missing: 7xx | Show results with:7xx
  57. [57]
    Arm Immortalis-G715
    Arm Immortalis-G715 is a flagship GPU for ultimate gaming, with 15% performance, 2x ML improvements, hardware ray tracing, and 15% energy efficiency.Missing: PowerVR Dimensity
  58. [58]
  59. [59]
    Exynos 2400 | Mobile Processor | Samsung Semiconductor Global
    The Samsung Exynos 2400 enhances the mobile experience with hardware ray tracing and powerful AI featuresMissing: path | Show results with:path
  60. [60]
    Samsung Exynos 2400: specs and benchmarks - NanoReview
    Image detection, 159.2 images/sec ; HDR, 211.5 Mpixels/sec ; Background blur, 21.8 images/sec ; Photo processing, 65.3 images/sec ; Ray tracing, 7.4 Mpixels/sec ...Samsung Galaxy S24 (Exynos) · Samsung Galaxy S24 Plus... · VS
  61. [61]
    Ray Tracing GPU IP | Siliconarts, Inc.
    RayCore MC is the world's best real-time path & ray-tracing GPU IP that can quickly proceed with the rendering process with low power.
  62. [62]
    RT Engine: An Efficient Hardware Architecture for Ray Tracing - MDPI
    This article presents a novel architecture, a RT engine (Ray Tracing engine), that accelerates ray tracing.
  63. [63]
    Real-time Ray Tracing Technology: A Game-Changer for Mobile ...
    Nov 17, 2021 · Today it is extremely challenging to implement real-time ray tracing in power-constrained devices, and thus the technology has yet to grace the ...Missing: simplified | Show results with:simplified
  64. [64]
    Announcing Microsoft DirectX Raytracing!
    Today, we are introducing a feature to DirectX 12 that will bridge the gap between the rasterization techniques employed by games today, and the ...
  65. [65]
    DirectX Raytracing (DXR) Functional Spec - Microsoft Open Source
    Jun 30, 2025 · This document describes raytracing support in D3D12 as a first class peer to compute and graphics (rasterization).Initiating raytracing · Inline raytracing · Shader Execution Reordering · State objects
  66. [66]
    DirectX Raytracing (DXR) Tier 1.1 - Microsoft Developer Blogs
    Nov 6, 2019 · This post discusses each new raytracing feature individually. The DXR spec has the full definitions, starting with its Tier 1.1 summary.Missing: documentation | Show results with:documentation
  67. [67]
    Announcing DirectX Raytracing 1.2, PIX, Neural Rendering and ...
    Mar 20, 2025 · Real-time path tracing can be enhanced by neural supersampling and denoising, combining two of the most cutting-edge graphics innovations to ...Missing: 1.3 1.4
  68. [68]
    Direct3D 12 Raytracing - Win32 apps - Microsoft Learn
    Dec 30, 2021 · This article provides a listing of the documentation that is available for Direct3D raytracing. C++ raytracing reference
  69. [69]
    Hardware Ray Tracing in Unreal Engine - Epic Games Developers
    Ray tracing features are supported on Vulkan, including feature parity with DirectX 12 and Shader Model 6. This allows the full suite of ray tracing ...Ray Traced Ambient Occlusion · Using Ray Tracing Features · Additional Notes
  70. [70]
    Ray Tracing In Vulkan - The Khronos Group
    Mar 17, 2020 · This blog summarizes how the Vulkan Ray Tracing extensions were developed, and illustrates how they can be used by developers to bring ray tracing ...Overview · Introduction to the Vulkan Ray... · Ray Traversal · Ray Tracing Pipelines
  71. [71]
    Vulkan® ray tracing extension support in our latest AMD Radeon ...
    Dec 1, 2020 · VK_KHR_acceleration_structure. This extension provides acceleration structures for representing geometry that is spatially sorted.
  72. [72]
    Vulkan: Continuing to Forge Ahead - The Khronos Group
    Aug 5, 2025 · VRED's Vulkan implementation delivers a hybrid rendering pipeline that combines traditional rasterization with GPU-accelerated ray tracing, ...A Spotlight On Deprecation · Revamped Vulkan Docs &... · Top Tier Applications Go...
  73. [73]
    NVIDIA OptiX SDK Tackles Hair and Fur with Release 7.1
    Jul 1, 2020 · OptiX SDK 7.1 is the latest update to the new OptiX 7 API. Version 7.1 introduces a new curve primitive for hair and fur, tiled denoising, and ...Missing: history | Show results with:history
  74. [74]
    The NVIDIA OptiX SDK Release 7.2 Adds AI Denoiser Layer ...
    Oct 7, 2020 · OptiX SDK 7.2 is the latest update to the new OptiX 7 API. This version introduces API changes to the OptiX denoiser to support layered AOV ...Missing: history Omniverse
  75. [75]
    HDRP Raytracing Effects on Vulkan - Roadmap Detail - Unity
    The Raytracing API is currently supported when targeting Windows/DirectX12, Xbox Series and PS5 - in order to enable a range of HDRP effects such as ...Missing: console PC
  76. [76]
    Flexible and Powerful Ray Tracing with NVIDIA OptiX 8
    Aug 7, 2023 · NVIDIA OptiX is a powerful and flexible ray-tracing framework, enabling you to harness the potential of ray tracing.Key Benefits · Key Features · Next StepsMissing: Blackwell | Show results with:Blackwell
  77. [77]
    [PDF] NVIDIA RTX BLACKWELL GPU ARCHITECTURE
    For some background, the RT Cores in Turing, Ampere, and Ada GPUs include dedicated hardware units for accelerating Bounding Volume Hierarchy (BVH) data ...
  78. [78]
    New AI SDKs and Tools Released for NVIDIA Blackwell GeForce ...
    Feb 11, 2025 · Optimize ray tracing with NVIDIA OptiX. NVIDIA OptiX SDK is an application framework for achieving optimal ray tracing performance on the GPU.Ai-Powered Gaming · Accelerated Content Creation · Optimize Ray Tracing With...
  79. [79]
    [PDF] NVIDIA AMPERE GA102 GPU ARCHITECTURE
    The GA10x RT Core doubles the ray/triangle intersection testing rate over Turing architecture RT Cores, and also adds a new Interpolate Triangle Position ...
  80. [80]
    [PDF] LumiBench: A Benchmark Suite for Hardware Ray Tracing - People
    LumiBench is a benchmark suite for evaluating ray tracing hardware performance on modern GPUs, designed to study the performance of hardware ray tracing ...
  81. [81]
    SPECviewperf 15 - SPEC GWPG
    The SPECviewperf graphics card benchmark is the worldwide standard for measuring graphics performance representing professional applications.Missing: RT scene complexity
  82. [82]
    Ray Tracing Essentials Part 7: Denoising for Ray Tracing
    May 4, 2020 · A critical element in making realistic, high-quality images with ray tracing at interactive rates is the process of denoising.
  83. [83]
    On Ray Reordering Techniques for Faster GPU Ray Tracing - arXiv
    Jun 12, 2025 · They proposed the idea of sorting rays to reduce divergence in computation using a hash-based method for sorting the rays into coherent packets.
  84. [84]
    On Ray Reordering Techniques for Faster GPU Ray Tracing
    May 5, 2020 · Ray reordering increases GPU ray tracing performance, yielding 1.3-2.0x higher speed, but recovering reordering overhead is problematic.
  85. [85]
    AQB8: Energy-Efficient Ray Tracing Accelerator through Multi-Level ...
    Jun 20, 2025 · Using Equation 1 as the building block, we can find where the ray intersects a BB based on the Slab method [22] as shown in Figure 3(a).Missing: AABB | Show results with:AABB
  86. [86]
    [PDF] Neural Temporal Adaptive Sampling and Denoising
    We present a novel adaptive rendering method that increases temporal stability and image fidelity of low sample count path tracing by distributing samples via.
  87. [87]
    [PDF] Neural Temporal Denoising for Indirect Illumination
    We propose a novel neural temporal denoising method for indirect illumination of Monte Carlo (MC) ray tracing animations at 1 sample per pixel. Based on the end ...
  88. [88]
    NVIDIA Turing Architecture In-Depth | NVIDIA Technical Blog
    Sep 14, 2018 · Each SM contains 64 CUDA Cores, eight Tensor Cores, a 256 KB register file, four texture units, and 96 KB of L1/shared memory which can be ...
  89. [89]
    [PDF] Hybrid-Rendering Techniques in GPU - arXiv
    Hybrid rendering combines ray tracing with rasterization and denoising to achieve photorealistic quality and real-time performance.<|separator|>
  90. [90]
    [PDF] Locally-Adaptive Level-of-Detail for Hardware-Accelerated Ray ...
    Oct 9, 2023 · In this paper, we target hardware-accelerated ray-tracing with custom hardware logic to perform ray traversal with our LOD operations, which ...
  91. [91]
    [PDF] Fused BVH to Ray Trace Level of Detail Meshes - GPUOpen
    Oct 4, 2022 · ABSTRACT. The paper presents different strategies for fusing Bounding Volume. Hierarchies (BVHs) of varying level of detail (LODs).
  92. [92]
    [PDF] Interleaved Prefetching Ray Traversal on CPUs - DiVA portal
    Nov 18, 2021 · This project focuses on the Bounding Volume Hierarchy (BVH) structure and improving its performance when querying large batches of first hit ray ...Missing: chip refits
  93. [93]
    [PDF] Asynchronous BVH Construction for Ray Tracing Dynamic Scenes ...
    Abstract. Recent developments have produced several techniques for interactive ray tracing of dynamic scenes. In par- ticular, bounding volume hierarchies ...
  94. [94]
    [PDF] Spatiotemporal reservoir resampling for real-time ray tracing with ...
    We introduce a method to sample one-bounce direct lighting from many lights that is suited to real-time ray tracing with fully dynamic scenes (see Fig. 1). Our ...
  95. [95]
    RayGaussX: Accelerating Gaussian-Based Ray Marching for Real ...
    Sep 9, 2025 · Efficient space skipping and adaptive sampling of unstructured volumes using hardware accelerated ray tracing. In 2019 IEEE Visualization ...3 Raygaussx Approach · 3.3 Optimizing Ray Coherence... · 4 Experiments And Results
  96. [96]
    Lumina: Real-Time Neural Rendering by Exploiting Computational ...
    Jun 20, 2025 · We propose Lumina, a hardware-algorithm co-designed system, which integrates two principal optimizations: a novel algorithm, S 2 , and a radiance caching ...2.1 3dgs Pipeline · 3.2 Radiance Caching · 6.1 Rendering Quality
  97. [97]
    A Neural Rendering Coprocessor With Optimized Ray ... - IEEE Xplore
    Sep 29, 2025 · Abstract—Neural rendering, a transformative approach for. 3-D scene reconstruction and rendering, has advanced rapidly in recent years.
  98. [98]
    NVIDIA RTX Platform Brings Real-Time Ray Tracing and AI to ...
    Aug 20, 2018 · The Turing architecture's new RT Cores enable real-time ray tracing of objects and environments with physically accurate shadows ...
  99. [99]
    Here's How Battlefield V Performs with Ray Tracing | Tom's Hardware
    Nov 16, 2018 · Battlefield V is the first of 11 games announced to support ray tracing. EA and DICE have used it for real-time reflections.
  100. [100]
    A Closer Look at How Xbox Series X|S Integrates Full AMD RDNA 2 ...
    Oct 28, 2020 · Xbox Series X|S are the only next-generation consoles with full hardware support for all the RDNA 2 capabilities AMD showcased today.
  101. [101]
  102. [102]
    NVIDIA GeForce RTX
    ### Summary of NVIDIA GeForce RTX Content
  103. [103]
    Path Tracing & Overdrive Mode - Requirements & How-To
    With Patch 1.62 we've introduced Ray Tracing: Overdrive Mode, the next preset above Ray Tracing: Ultra that includes path-traced rendering.
  104. [104]
    Types of Ray Tracing, Performance On GeForce GPUs, and More
    Real-Time Ray Tracing is the biggest leap in computer graphics in years, bringing realistic lighting, shadows and effects to games, enhancing image quality, ...<|separator|>
  105. [105]
    Lumen Technical Details in Unreal Engine - Epic Games Developers
    Lumen uses multiple ray-tracing methods to solve Global Illumination and Reflections. Screen Traces are done first, followed by a more reliable method.
  106. [106]
    Metro Exodus PC Enhanced Edition Out Now With Improved Ray ...
    May 6, 2021 · Accelerate performance by up to 2X with NVIDIA DLSS 2.0, giving you the headroom to max out the game's stunning ray-traced effects on ...
  107. [107]
    Technology Preview Of New Ray Tracing Overdrive Mode Out Now
    Apr 11, 2023 · For the technology preview of Cyberpunk 2077's Ray Tracing: Overdrive Mode, we currently recommend a GeForce RTX 40 Series GPU and NVIDIA DLSS 3 ...<|separator|>
  108. [108]
    Ray Tracing Hasn't Yet Lived Up To Its Potential - GameSpot
    Sep 9, 2025 · Budgets, game design, and timing haven't quite lined up yet to make ray tracing a primary feature in games.
  109. [109]
    Accelerating Cycles using NVIDIA RTX - Blender Developers Blog
    Jul 29, 2019 · Cycles uses NVIDIA OptiX for hardware-accelerated ray tracing, managing acceleration and ray intersection, and most features work with the new ...
  110. [110]
    GPU Ray Tracing for Film and Design- Bringing the Arnold Renderer ...
    Aug 31, 2019 · GPU-accelerated ray tracing is delivering new levels of interactivity and performance to rendering for film and design. Ray tracing leaders from NVIDIA, ...
  111. [111]
    How NVIDIA Omniverse is accelerating the VFX pipeline - Televisual
    May 20, 2022 · The NVIDIA Omniverse™ open platform directly addresses the bottlenecks within the VFX pipeline to enable real-time collaboration and real ...
  112. [112]
    Creative Studio Enhances Visual Effects and Virtual Production with ...
    SOKRISPY uses NVIDIA RTX, Unreal Engine, and a Dell workstation for real-time rendering, enabling faster, more efficient visual effects and virtual production.
  113. [113]
    Path Tracer Requirements for Twinmotion - Epic Games Developers
    To use path tracing, your computer must be equipped with one of the following: Any NVIDIA RTX graphics card. NVIDIA T1000 cards are not supported. An ...
  114. [114]
    Twinmotion Plugins Supports Files From All Major Modeling Solutions
    Twinmotion supports files from all major CAD, BIM, and modeling solutions, and offers direct one-click synchronization with many of them.Datasmith Exporter for Revit · Download options · Datasmith Exporter for Archicad
  115. [115]
    Real-Time Ray Tracing Changes the Way We See Games and CGI
    Oct 26, 2025 · Discover how real-time ray tracing is revolutionizing gaming, CGI, and design. We explain how it works, the hardware behind it, and more!Cyberpunk 2077 · Nvidia Rtx Gpus · Amd Radeon Rx Series<|separator|>
  116. [116]
    Autodesk VRED 2026: How Autodesk Revolutionized Visualization ...
    Oct 14, 2025 · Ray-traced reflections that provide accurate mirror and glossy surface visualization crucial for automotive design evaluation. Ray-traced ...
  117. [117]
    GPU-Based Volume Rendering for Medical Imagery - ResearchGate
    We present a method for fast volume rendering using graphics hardware (GPU). To our knowledge, it is the first imple-mentation on the GPU.
  118. [118]
    A Path‐Tracing Monte Carlo Library for 3‐D Radiative Transfer in ...
    Jul 15, 2019 · This library is devoted to the acceleration of ray tracing in complex data, typically high-resolution large-domain grounds and clouds.
  119. [119]
    GPU Rendering Solutions for 3D Designers - NVIDIA
    Turn RTX On and get dedicated GPU-acceleration for ray tracing with RT Cores and AI-accelerated denoising with Tensor Cores. Experience real-time ...Real-Time Ray-Traced... · Rtx Pro Platforms For... · Nvidia Graphics And Neural...
  120. [120]
    Ray Tracing Essentials Part 7: Denoising for Ray Tracing
    May 11, 2020 · Denoising is typically applied in interactive environments where you don't get a static result, you get a new denoised result that includes more ...
  121. [121]
    Accelerating ray tracing engine of BLENDER on the new Sunway ...
    Oct 31, 2023 · The rendering time for one image is greatly reduced from 2260 s using the single-core serial version to 71 s using 48 processes, which is a ...
  122. [122]
    NVIDIA's next-gen GeForce RTX 60 rumors: Rubin GPU, more ...
    Feb 2, 2025 · NVIDIA's upcoming GeForce RTX 60 series "Rubin" GPUs are anticipated to feature a new architecture with a 10%+ boost in rasterization and 20%+ in ray tracing ...
  123. [123]
    AMD Patents Provide Early UDNA Insights - "Blackwell-esque" Ray ...
    May 5, 2025 · The patents indicate a strong possibility of almost feature level parity with NVIDIA Blackwell in AMD's future GPU architectures likely as soon as 'RDNA 5/UDNA ...Missing: advancements | Show results with:advancements
  124. [124]
    AMD Details How Next-Gen UDNA/RDNA 5 GPUs Could Boost Ray ...
    Sep 30, 2025 · AMD's Next-Gen UDNA 5 Gaming GPUs Could Potentially Bridge The Ray-Tracing Performance Gap With NVIDIA, Indicates Extensive Patent Filings.
  125. [125]
    Intel Arc Xe3 Celestial GPU enters pre-validation stage
    May 3, 2025 · Intel's next-generation Xe3 Celestial GPU reportedly enters the pre-silicon validation stage, when the GPU design and architecture are being tested.
  126. [126]
    Next-Gen Intel GPUs: Battle Mage and Celestial Aim to Redefine ...
    Oct 13, 2025 · The Celestial series is expected to make major gains in these areas, including as better AI-driven upscaling, faster ray tracing, and more ...
  127. [127]
    Real-time Neural Radiance Caching for Path Tracing | Research
    Jun 21, 2021 · We introduce a real-time neural radiance caching technique for path-traced global illumination. Our system is designed to handle fully dynamic scenes.
  128. [128]
    Neural Two‐Level Monte Carlo Real‐Time Rendering - Dereviannykh
    Apr 18, 2025 · The cache is designed to provide a fast and reasonable approximation of the incident radiance: an evaluation takes 2–25 × less compute time than ...4.1. Radiance Cache... · 6. Results · 6.3. Cache Analysis<|separator|>
  129. [129]
    AMD FSR Redstone: Radiance Cache, Ray Regeneration & ML ...
    May 21, 2025 · It includes three features already available on most GeForce RTX GPUs: Radiance Caching, Ray Regeneration, and Frame Generation.
  130. [130]
    Apple Enters the AI Glasses Race After Vision Pro Pause
    Oct 13, 2025 · Apple pivots from Vision Pro to AI-powered smart glasses as Meta dominates the XR space, redefining how humans connect with digital and ...
  131. [131]
    Hands-on with Orion, Meta's first pair of AR glasses | The Verge
    Sep 25, 2024 · Meta's first pair of AR glasses, Orion, are a demo of what CEO Mark Zuckerberg thinks will one day replace the smartphone.
  132. [132]
    Quantum Computing May Make Ray Tracing Easier | Tom's Hardware
    May 20, 2022 · Ray tracing workloads aided by quantum computing can offer an up to 190% performance improvement by slashing the number of calculations required by each ray.<|control11|><|separator|>
  133. [133]
    [PDF] MEC application developer guidelines for universal access to ... - ETSI
    The Multi-access Edge Computing (MEC) APIs are designed to seamlessly integrate edge computing resources with application services, facilitating efficient ...
  134. [134]
    Global RT Cores Market 2024-2030
    Apr 25, 2025 · The Global RT cores market accounted for $XX Billion in 2023 and is anticipated to reach $XX Billion by 2030, registering a CAGR of XX% from ...
  135. [135]
    Redefining GPU Architectures for AI and Computational Neuroscience
    Oct 31, 2024 · Unlike GPUs, initially developed for graphics processing and then adapted for AI, neuromorphic chips are specifically built to mirror the human ...