Tiled rendering

Tiled rendering is a technique used in graphics processing units (GPUs) to divide the rendering target, such as a screen or render texture, into small rectangular regions called tiles, which are processed sequentially to minimize memory bandwidth usage and improve efficiency. This approach involves a two-stage process: first, a binning pass sorts primitives (like triangles) into the tiles they overlap, and second, each tile is rasterized and shaded independently using on-chip memory before being written to the final framebuffer. By confining rendering operations to local tile memory, tiled rendering avoids frequent accesses to slower off-chip DRAM, making it particularly suitable for power-constrained devices like mobile GPUs.

The technique originated in the 1990s with early implementations by companies like PowerVR and Gigapixel, which developed tile-based architectures to address bandwidth limitations in embedded systems. PowerVR's tile-based deferred rendering (TBDR), for instance, deferred shading until visibility was determined per tile, a method that gained prominence in mobile GPUs from vendors such as Arm, Imagination Technologies, and Qualcomm. Over time, variants emerged, including tile-based immediate rendering (TBIR) in desktop GPUs like NVIDIA's Maxwell and Pascal architectures, which rasterize and shade tiles without full deferral but still buffer outputs on-die for efficiency. Apple GPUs, starting with the A11 chip, enhanced TBDR with features like imageblocks for per-pixel data and tile shaders that integrate compute operations, further optimizing for high-performance mobile rendering.

Key benefits of tiled rendering include reduced power consumption (critical for battery-powered devices, where mobile GPUs operate at roughly 3-6 watts compared to hundreds of watts for desktop parts) and higher performance through overdraw reduction, as shaders execute only on visible fragments within a tile. It contrasts with immediate-mode rendering (IMR) architectures, which process primitives across the entire frame without tiling, leading to higher bandwidth demands and less efficiency in memory-limited environments. Today, tiled rendering dominates mobile and XR platforms, such as Meta Quest devices and Samsung Galaxy hardware, enabling complex 3D scenes despite constrained resources like 1-5 MB of on-chip memory.

Fundamentals

Definition and Principles

Tiled rendering, also known as tile-based rendering, is a graphics processing technique that divides the screen space into a grid of small rectangular tiles, typically measuring 16x16 or 32x32 pixels, and renders each tile independently to optimize memory bandwidth usage and efficiency. This approach processes the entire scene geometry once to determine which primitives overlap each tile, avoiding the need for full-framebuffer reads and writes during rasterization. By confining rendering operations to local on-chip memory for each tile, tiled rendering reduces external memory traffic, which is particularly beneficial in power-constrained environments.

The core principles revolve around a two-pass pipeline: first, a binning stage where vertex-shaded primitives are sorted and assigned to the relevant tiles based on their screen-space coverage, creating compact per-tile lists of contributing primitives. In the second stage, each tile is rasterized and shaded in isolation, performing hidden surface removal, such as depth testing, entirely within on-chip buffers to eliminate occluded fragments early and prevent unnecessary shading computations. This deferred aspect ensures that shading only occurs for visible surfaces, further minimizing redundant work and bandwidth demands, as intermediate data like depth and color values remain local until the tile is complete.

Tile size selection balances several factors, including the degree of parallelism across shader cores, the on-chip storage required for tile buffers, and overhead from handling tile boundaries, such as redundant primitive tests for triangles that straddle multiple tiles. Smaller tiles enhance locality and reduce memory per tile but increase binning overhead and boundary computations, while larger tiles improve coherence for complex scenes at the cost of higher local storage needs.
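The binning principle described above can be illustrated with a short sketch. The following Python example assigns triangles to every tile their screen-space bounding box overlaps; the tile size, screen dimensions, and the Triangle type are illustrative assumptions rather than a description of any particular GPU.

```python
# A minimal sketch of the binning stage: triangles are assigned to every tile
# their screen-space axis-aligned bounding box overlaps. All constants and
# types here are illustrative assumptions.
from dataclasses import dataclass

TILE_SIZE = 32                 # assumed tile edge length in pixels
SCREEN_W, SCREEN_H = 1920, 1080

@dataclass
class Triangle:
    # Screen-space vertex positions as (x, y) pairs.
    v0: tuple
    v1: tuple
    v2: tuple

def bin_triangles(triangles):
    """Return a dict mapping (tile_x, tile_y) -> list of overlapping triangles."""
    tiles_x = (SCREEN_W + TILE_SIZE - 1) // TILE_SIZE
    tiles_y = (SCREEN_H + TILE_SIZE - 1) // TILE_SIZE
    bins = {}
    for tri in triangles:
        xs = [tri.v0[0], tri.v1[0], tri.v2[0]]
        ys = [tri.v0[1], tri.v1[1], tri.v2[1]]
        # Conservative bounding-box overlap test, clamped to the screen.
        tx0 = max(0, int(min(xs)) // TILE_SIZE)
        tx1 = min(tiles_x - 1, int(max(xs)) // TILE_SIZE)
        ty0 = max(0, int(min(ys)) // TILE_SIZE)
        ty1 = min(tiles_y - 1, int(max(ys)) // TILE_SIZE)
        for ty in range(ty0, ty1 + 1):
            for tx in range(tx0, tx1 + 1):
                # A triangle straddling several tiles is replicated in each bin.
                bins.setdefault((tx, ty), []).append(tri)
    return bins

if __name__ == "__main__":
    tris = [Triangle((100, 100), (200, 120), (150, 300))]
    per_tile = bin_triangles(tris)
    print(f"{len(per_tile)} tiles touched by {len(tris)} triangle(s)")
```

A conservative bounding-box test like this over-assigns slightly compared to exact coverage tests, trading a few extra per-tile entries for a much cheaper binning pass.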

Comparison to Immediate Mode Rendering

Immediate mode rendering, also known as immediate-mode rendering or IMR, processes graphics primitives in the order they are submitted by the application, immediately transforming vertices, rasterizing triangles, and writing fragment data directly to an off-chip framebuffer in main memory. This approach results in high memory bandwidth demands, particularly due to overdraw, where multiple fragments are processed and written for the same pixel, and frequent fetches that traverse the memory bus for each fragment across the entire screen. In contrast, tiled rendering divides the screen into small rectangular tiles (typically 16x16 or 32x32 pixels) and processes each tile independently using on-chip tile buffers for local storage of color, depth, and other fragment data, minimizing external memory accesses until the tile is complete. This enables early depth and stencil testing confined to the tile, rejecting occluded fragments before shading computations, unlike immediate mode's scene-wide processing that applies tests after full rasterization. Tiled rendering thus achieves lower memory bandwidth by localizing operations, while immediate mode relies on global memory for framebuffer updates, exacerbating latency in bandwidth-constrained environments like mobile GPUs.

Bandwidth savings in tiled rendering arise because only tile-local data is kept on-chip before a single write-back to main memory. In immediate mode, framebuffer traffic scales with the full screen resolution times an overdraw factor (often 2-4x in complex scenes), leading to repeated off-chip reads and writes for depth, textures, and colors. Tiled rendering reduces this traffic through on-chip buffering, with the exact savings depending on scene content.

A key trade-off lies in parallelism: tiled rendering supports fine-grained parallelism at the tile level, allowing multiple tiles to be processed concurrently on GPU cores with reduced contention, but it introduces binning overhead to assign primitives to tiles. Immediate mode, conversely, enables coarser-grained parallelism across entire primitives or draw calls without this preprocessing, facilitating simpler driver implementations but at the cost of inefficient resource utilization in overdraw-heavy scenarios.
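A back-of-envelope comparison of framebuffer traffic makes the bandwidth argument concrete. The numbers below (a 1080p target, 8 bytes of color plus depth per fragment, and an assumed overdraw factor of 3) are illustrative only; real traffic depends heavily on caching, framebuffer compression, and scene content.

```python
# Rough, assumption-laden estimate of per-frame framebuffer traffic for an
# immediate-mode pipeline versus a tile-based one. Not measured data.
WIDTH, HEIGHT = 1920, 1080
BYTES_PER_FRAGMENT = 4 + 4         # color + depth, assumed
OVERDRAW = 3                       # assumed average fragments per pixel

pixels = WIDTH * HEIGHT

# Immediate-mode: every shaded fragment may read and write depth/color off-chip.
imr_traffic = pixels * OVERDRAW * BYTES_PER_FRAGMENT * 2   # read + write

# Tile-based: depth/color stay on-chip; one color write-back per pixel at tile flush.
tbr_traffic = pixels * 4

print(f"IMR ~= {imr_traffic / 1e6:.0f} MB per frame")
print(f"TBR ~= {tbr_traffic / 1e6:.0f} MB per frame")
```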

Historical Development

Early Concepts and Research

The Pixel-Planes project, initiated in 1981 at the University of North Carolina at Chapel Hill by Henry Fuchs and John Poulton, marked a foundational effort in developing efficient graphics rendering hardware. This VLSI-oriented design introduced pixel-parallel processing, where computations such as shading and visibility tests occur directly at the pixel level using specialized memory chips, aiming to overcome bandwidth limitations in traditional frame buffers. By distributing processing across pixels, the approach enabled interactive display of three-dimensional images, laying early groundwork for localized rendering strategies that would influence tiled methods.

Building on this, Fuchs and Poulton's 1985 work further advanced deferred techniques within the Pixel-Planes framework, demonstrating algorithms for fast rendering of spheres, shadows, textures, transparencies, and image enhancements. These methods deferred complex shading operations until after visibility resolution, reducing redundant computations and memory accesses in hardware prototypes. This deferred approach highlighted the potential for separating geometric processing from pixel filling, a core principle that would later combine with tiling to optimize bandwidth in resource-constrained systems.

Tiling concepts emerged prominently in the Pixel-Planes 5 architecture, detailed in a 1989 publication by Fuchs, Poulton, and collaborators, which subdivided the screen into 128×128-pixel patches processed by multiple SIMD renderers. This tile-based subdivision allowed independent handling of primitives per patch, with simulations validating high performance, up to 150,000 Phong-shaded triangles per second per renderer, while minimizing global memory bandwidth through on-chip processing and local VRAM operations. Academic prototypes and simulations demonstrated substantial reductions in frame buffer traffic by localizing pixel updates, achieving efficient rendering of complex scenes with up to 1 million triangles per second across multiple renderers.

Earlier subdivision algorithms, such as those developed by Kevin Weiler in the late 1970s and extended through the 1980s, contributed to the evolution toward tiled rendering by employing recursive image subdivision for hidden surface removal. Weiler's area sorting method divided the viewport into smaller windows to resolve visibility, shifting from one-dimensional scan-line traversal to two-dimensional regions that better accommodated complex polygon interactions. This progression from linear scan lines to 2D tiles improved coherence exploitation and reduced overdraw in simulations, paving the way for hardware-efficient tiled pipelines.

Commercial Milestones

The first commercial implementation of tiled rendering in consumer graphics hardware arrived with the PowerVR PCX1 chipset, released in 1996 by VideoLogic (later Imagination Technologies), which introduced full tile-based deferred rendering (TBDR) for personal computers. This architecture divided the screen into tiles to reduce memory bandwidth, enabling efficient 3D rendering on the limited hardware of the era. The PCX1 and its immediate successors powered add-in cards such as VideoLogic's Apocalypse series and the Matrox M3D, marking an early shift toward bandwidth-optimized rendering in desktop GPUs. Concurrently, in the late 1990s, Gigapixel developed tile-based rendering technology, including the GigaMan engine announced in 1999, though it was not released commercially before the company's acquisition by 3dfx in 2000.

In the late 1990s, tiled rendering entered the console market through the Sega Dreamcast, launched in 1998, which utilized the PowerVR2 (CLX2) GPU, a second-generation TBDR design capable of rendering up to 7 million polygons per second at the console's 640×480 output resolution. This console's adoption highlighted tiled rendering's advantages in power-constrained embedded systems, influencing future handheld and mobile designs. The 2000s saw further expansion into mobile devices, with ARM's acquisition of Falanx Microsystems in 2006 leading to the Mali GPU family, which integrated tile-based rendering for low-power embedded applications starting with the Mali-55 and Mali-200 series. Console milestones continued with the PlayStation Vita in 2011, featuring a quad-core PowerVR SGX543MP4+ GPU that advanced TBDR with support for OpenGL ES 2.0 and improved texture handling, delivering up to 28 GFLOPS of performance while maintaining efficiency for portable gaming.

Post-2010, tiled rendering's adoption surged in mobile GPUs driven by stringent power and bandwidth constraints in smartphones, becoming the preferred architecture for optimizing overdraw and memory access in battery-limited environments. By the late 2010s, it dominated mobile GPUs from vendors like Arm (Mali series), Imagination Technologies (PowerVR), and Qualcomm (Adreno), which together captured the majority of the mobile market.

Technical Implementation

Tile-Based Rendering Pipeline

The tile-based rendering pipeline structures graphics processing into sequential stages that partition the screen into small rectangular tiles, typically 16x16 or 32x32 pixels, to enable localized computations and minimize memory traffic. This approach processes input primitives through geometry preparation, spatial organization, and tile-specific rendering, culminating in framebuffer updates. By confining fragment operations to on-chip memory during tile processing, the pipeline enhances efficiency in bandwidth-limited systems.

In the first stage, geometry processing transforms input primitives, such as triangles, by executing vertex shaders to compute screen-space positions and attributes. Primitives are then culled based on view frustum, back-face orientation, or other early rejection criteria to discard irrelevant geometry. The binning substage follows, where each surviving primitive undergoes overlap tests, often using axis-aligned bounding boxes or precise coverage computations, against the grid of screen tiles; primitives overlapping multiple tiles are assigned to all relevant bins, resulting in replication across those tile lists.

The second stage focuses on per-tile rasterization, where the GPU iterates over each tile in parallel or sequentially. For a specific tile, only the primitives from its bin list are loaded and rasterized using techniques like edge equations or hierarchical traversal to generate fragments representing covered pixels within the tile boundaries. This step computes fragment coverage masks and interpolates attributes, ensuring that geometry outside the tile is ignored to avoid unnecessary computations.

In the third stage, shading and blending occur entirely within the tile's on-chip buffer. Generated fragments are shaded via fragment shaders to determine final colors and material properties, followed by depth and stencil tests to resolve visibility among overlapping fragments. Surviving fragments are then blended according to the active rendering state, such as alpha blending, before the tile buffer is resolved (through operations like multisample resolve, if enabled) and merged into the main framebuffer via a single write-back pass.

The binning process incurs overhead from managing bin lists and replicating straddling primitives, which can increase memory usage in scenes with high primitive counts or large tiles; efficient overlap tests and hierarchical bin structures help balance this cost against the benefits of localized processing. Pipeline variants include immediate tiling, which skips off-chip binning by processing tiles directly in a single pass to reduce latency, and fully deferred tiling, which delays fragment shading until after visibility determination to shade only visible surfaces.
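The second and third stages can be sketched as follows. This minimal Python model rasterizes the triangles binned to a tile into local color and depth buffers using edge functions, depth-tests them in those "on-chip" buffers, and performs a single write-back per tile. The flat per-triangle depth, constant-color shading, per-pixel loop, and the assumption that binning has already produced per-tile lists are all simplifications for illustration.

```python
# Minimal sketch of per-tile rasterization, depth testing, and write-back.
# Not a hardware-accurate model; all constants and types are illustrative.
TILE = 16
W, H = 64, 64

def edge(ax, ay, bx, by, px, py):
    # Signed edge function; all three edges give the same sign when the point
    # lies inside a consistently wound triangle.
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)

def render_tile(tx, ty, tri_list, framebuffer):
    # "On-chip" tile buffers: color and depth stay local until the tile is done.
    color = [[(0, 0, 0)] * TILE for _ in range(TILE)]
    depth = [[float("inf")] * TILE for _ in range(TILE)]
    for (v0, v1, v2, z, rgb) in tri_list:
        for y in range(TILE):
            for x in range(TILE):
                px, py = tx * TILE + x + 0.5, ty * TILE + y + 0.5
                inside = (edge(*v0, *v1, px, py) >= 0 and
                          edge(*v1, *v2, px, py) >= 0 and
                          edge(*v2, *v0, px, py) >= 0)
                if inside and z < depth[y][x]:   # depth test before "shading"
                    depth[y][x] = z
                    color[y][x] = rgb            # shading reduced to a flat color
    # Single write-back of the resolved tile to the external framebuffer.
    for y in range(TILE):
        for x in range(TILE):
            framebuffer[ty * TILE + y][tx * TILE + x] = color[y][x]

if __name__ == "__main__":
    fb = [[(0, 0, 0)] * W for _ in range(H)]
    tri = ((4, 4), (60, 8), (30, 58), 0.5, (255, 0, 0))  # vertices, depth, color
    # Assume binning already ran; here every tile simply receives the one triangle.
    bins = {(tx, ty): [tri] for ty in range(H // TILE) for tx in range(W // TILE)}
    for (tx, ty), tris in bins.items():
        render_tile(tx, ty, tris, fb)
    covered = sum(px != (0, 0, 0) for row in fb for px in row)
    print(f"{covered} pixels covered")
```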

Deferred Shading Techniques

Deferred shading techniques in tiled rendering separate the determination of visible geometry from the computationally intensive shading process, enabling significant efficiency gains, particularly in bandwidth-constrained environments. In tile-based deferred rendering (TBDR), visibility is resolved per tile through hidden surface removal (HSR) using an on-chip depth buffer after rasterizing the binned primitives; fragment shading is then performed only on visible fragments, writing results to an on-chip color buffer. This approach avoids shading hidden surfaces and confines all intermediate operations to fast local memory, reducing external memory accesses.

Tile memory can also support developer-driven deferred shading passes, where on-chip geometry buffers (G-buffers) store attributes like depth, surface normals, and albedo for visible pixels within a tile. Visibility is determined during the HSR stage, and subsequent lighting passes shade only visible fragments using this data, further minimizing bandwidth. Implementations vary by vendor; for example, PowerVR hardware performs fragment shading only after HSR, while Apple GPUs use features like imageblocks to enable flexible G-buffer storage in tile memory for multi-pass deferred techniques. Additionally, hierarchical depth testing is employed during the geometry pass to perform early rejection of occluded fragments at multiple levels, discarding unnecessary data before it reaches the tile buffer. This hierarchical approach builds a pyramid of depth information, enabling rapid visibility tests that reject entire groups of primitives or fragments per tile.

The fundamental algorithm for TBDR can be expressed conceptually as Shade(f) = Material(L, V) × Visibility(f), where f represents a fragment, L denotes lighting parameters, V includes view-dependent factors, and visibility is resolved post-HSR using the tile's depth buffer. Shading computations are deferred until after this visibility determination, ensuring that material evaluations, such as diffuse, specular, or physically based models, are applied solely to fragments that contribute to the final image. This separation allows for flexible lighting integration, where multiple light sources can be processed efficiently per tile without re-rasterizing geometry.

Advanced variants of TBDR extend these principles to anti-aliasing and data efficiency. Multi-sample anti-aliasing (MSAA) is integrated at the tile level by storing multiple samples per pixel in the on-chip tile buffer, resolving coverage masks during the visibility pass to shade only unique visible samples and reduce aliasing artifacts without excessive memory overhead. Compression techniques further optimize tile buffers by exploiting spatial coherence, such as delta encoding of depth values. These enhancements maintain image quality while preserving the bandwidth savings inherent to TBDR. By deferring shading until visibility is fully resolved per tile, TBDR effectively addresses overdraw in scenes with high fragment density, such as those featuring complex geometry or dense foliage, through early rejection and on-chip processing. This makes it particularly suitable for resource-limited hardware, where traditional rendering might incur prohibitive bandwidth costs.
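The per-tile hidden surface removal step can be sketched conceptually as below: depth is resolved for every binned fragment first, and the expensive shading function runs at most once per covered pixel. The fragment tuples and the shade() stand-in are illustrative assumptions, not any vendor's interface.

```python
# Minimal sketch of per-tile HSR followed by deferred shading: pass 1 keeps only
# the nearest fragment per pixel, pass 2 shades only those survivors.
TILE = 16

def shade(material_id):
    # Stand-in for an expensive material/lighting evaluation.
    palette = {0: (200, 30, 30), 1: (30, 200, 30)}
    return palette.get(material_id, (255, 255, 255))

def render_tile_deferred(fragments):
    """fragments: iterable of (x, y, depth, material_id) within one tile."""
    # Pass 1: hidden surface removal using only the on-chip depth buffer.
    depth = [[float("inf")] * TILE for _ in range(TILE)]
    visible = [[None] * TILE for _ in range(TILE)]
    for x, y, z, mat in fragments:
        if z < depth[y][x]:
            depth[y][x] = z
            visible[y][x] = mat           # remember the winning surface, defer shading
    # Pass 2: shade only the surviving (visible) fragment of each pixel.
    color = [[(0, 0, 0)] * TILE for _ in range(TILE)]
    shaded = 0
    for y in range(TILE):
        for x in range(TILE):
            if visible[y][x] is not None:
                color[y][x] = shade(visible[y][x])
                shaded += 1
    return color, shaded

if __name__ == "__main__":
    # Two full-tile layers that overlap: only the nearer one should be shaded.
    frags = [(x, y, 0.8, 0) for y in range(TILE) for x in range(TILE)]
    frags += [(x, y, 0.3, 1) for y in range(TILE) for x in range(TILE)]
    _, shaded = render_tile_deferred(frags)
    print(f"shaded {shaded} pixels for {len(frags)} fragments")  # 256 vs 512
```

In this toy example the shading cost is halved relative to shading every fragment, which is exactly the overdraw elimination that motivates deferring shading until after per-tile visibility is resolved.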

Applications

Desktop and Console GPUs

In desktop and console GPUs, tiled rendering has evolved into hybrid architectures that balance high-throughput rendering with bandwidth efficiency, particularly in power-rich environments where memory access costs remain a bottleneck despite ample compute resources. NVIDIA introduced tiled rasterization in its Maxwell architecture starting in 2014, buffering geometry data on-chip within small screen-space tiles (typically 16x16 pixels) to minimize external memory accesses during the rasterization stage. This approach, carried forward in subsequent architectures like Pascal and beyond, reduces the need for multiple round-trips to DRAM by keeping rasterizer outputs local until tile completion, yielding significant bandwidth savings in geometry-heavy workloads. Prior to full desktop adoption, NVIDIA's Tegra series (pre-2015 models like Tegra 4) employed tiled rendering in mobile-oriented SoCs, combining tile-based processing with immediate-mode elements to handle variable geometry loads while maintaining compatibility with its desktop-class GPU cores.

AMD Radeon GPUs, beginning with the Graphics Core Next (GCN) architecture around 2011, support partial tiling through compute shaders for targeted optimizations, enabling software-based techniques rather than full-pipeline tile-based deferred rendering. In RDNA architectures (introduced 2019 and refined through RDNA 4 in 2025), developers leverage compute shaders to implement tiled light culling and shading passes, dividing the screen into tiles to cull irrelevant lights or shadows per region, which is especially effective for compute-intensive effects like volumetric rendering. This software-driven partial tiling allows flexibility in large scenes, avoiding the overhead of hardware-mandated full tiling while still achieving localized bandwidth reductions by processing tiles independently in compute programs.

Console GPUs, built on custom AMD variants, integrate tiled techniques for high-fidelity rendering under fixed hardware constraints. The Xbox Series X and S (launched 2020) utilize DirectX 12's tiled resource management to enable sparse virtual texturing, where textures are divided into tiles loaded on demand, reducing memory usage and bandwidth for massive open-world environments without compromising visual fidelity. This feature, combined with the GPU's native tiled rasterization, supports efficient handling of high-detail assets in titles emphasizing dynamic lighting. Such binning reduces draw calls and overdraw, as demonstrated in multi-platform engines like those used in Call of Duty, where z-binning against tile boundaries improves volumetric culling performance in tiled setups.

Hybrid models predominate in these platforms, merging tiled rasterization or compute with traditional immediate-mode rendering to accommodate expansive scenes that exceed pure tile-based limits. For instance, tile-based compute shaders handle light culling in isolated passes, while the main raster pipeline processes the full frame immediately, allowing seamless scaling for open-world scenes with millions of primitives. This combination mitigates the geometry sorting overhead of full tiling, enabling higher throughput in desktop and console titles. Performance benefits include notable bandwidth efficiency, with NVIDIA's tiled rasterization delivering reductions in memory traffic for rasterization-bound workloads, as seen in compute-heavy scenarios akin to those in Cyberpunk 2077's ray-traced passes. AMD's compute-based tiling similarly yields bandwidth savings in deferred lighting, enhancing frame rates in bandwidth-limited configurations without altering the core architecture.
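The tiled light culling mentioned above, normally implemented in a compute shader, can be sketched in simplified form: the screen is divided into tiles, and each tile keeps only the lights whose screen-space extents overlap it. The 2D circle-versus-rectangle test and all constants below are illustrative assumptions; real implementations typically test lights against a depth-bounded per-tile frustum on the GPU.

```python
# Simplified sketch of tiled light culling: build a per-tile list of the lights
# that can affect that tile, so later shading only loops over relevant lights.
TILE = 16
SCREEN_W, SCREEN_H = 1280, 720

def cull_lights(lights):
    """lights: list of (center_x, center_y, radius) in screen space.
    Returns dict (tile_x, tile_y) -> list of light indices affecting that tile."""
    tiles_x = (SCREEN_W + TILE - 1) // TILE
    tiles_y = (SCREEN_H + TILE - 1) // TILE
    per_tile = {}
    for ty in range(tiles_y):
        for tx in range(tiles_x):
            x0, y0 = tx * TILE, ty * TILE
            x1, y1 = x0 + TILE, y0 + TILE
            kept = []
            for i, (cx, cy, r) in enumerate(lights):
                # Closest point on the tile rectangle to the light center.
                nx = min(max(cx, x0), x1)
                ny = min(max(cy, y0), y1)
                if (cx - nx) ** 2 + (cy - ny) ** 2 <= r * r:
                    kept.append(i)
            if kept:
                per_tile[(tx, ty)] = kept
    return per_tile

if __name__ == "__main__":
    lights = [(100, 100, 50), (640, 360, 120), (1200, 700, 30)]
    total_tiles = (SCREEN_W // TILE) * (SCREEN_H // TILE)
    touched = cull_lights(lights)
    print(f"{len(touched)} of {total_tiles} tiles touched by lights")
```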
From 2020 to 2025, tiled rendering has increasingly integrated with ray tracing through optimized BVH traversal in hybrid pipelines, where screen-space tiles guide acceleration structure culling to focus ray queries on visible regions. NVIDIA's Ampere and Ada architectures (2020 onward) use tiled raster outputs to inform BVH builds, reducing traversal costs by 20-40% in dynamic scenes via on-chip tile data reuse. This continues in the Blackwell architecture (2024). AMD's RDNA 3 (2022) and RDNA 4 (2025) extend this with compute shaders for tiled BVH refits, enabling real-time updates in ray-traced titles while maintaining compatibility with standard ray tracing APIs. These advancements, highlighted in research such as treelet-based BVH traversal, underscore tiled rendering's role in scaling ray tracing for desktop and console interactivity.

Mobile and Embedded Devices

Tiled rendering has become the dominant architecture in mobile GPUs due to its efficiency in bandwidth-constrained and power-limited environments. Arm's Mali GPUs, introduced in 2008, employ tile-based rendering, dividing the screen into small tiles, typically 16x16 or 32x32 pixels, to process color and depth data locally, minimizing external memory accesses and reducing power draw compared to immediate-mode alternatives. Similarly, Qualcomm's Adreno GPUs, integrated into Snapdragon SoCs since the early 2010s, utilize a tile-based approach with FlexRender technology, which dynamically adjusts tile sizes and switches between binned and direct rendering modes to optimize for varying workloads, enhancing efficiency in devices like smartphones and tablets. Apple's A-series processors have used TBDR GPUs since the A4 in 2010, initially licensed PowerVR designs and later Apple-designed cores tailored to the Metal API, enabling seamless integration of advanced shading techniques while maintaining low latency and power efficiency; this architecture processes tiles on-chip, supporting features like efficient multisample anti-aliasing (MSAA) and contributing to sustained performance in graphics-intensive apps without excessive battery drain. By 2025, tiled rendering dominates smartphone GPUs, facilitating smooth 60 fps gameplay at resolutions up to 4K on external displays while consuming under 5 W, as seen in flagship SoCs like the Snapdragon 8 series and Apple A18.

In embedded systems, tiled rendering supports power-sensitive applications such as automotive displays in NVIDIA's Drive PX platforms, which incorporate tiled rasterization from Pascal-era GPUs to handle instrument and sensor visualizations with minimal overhead. Imagination Technologies' PowerVR Rogue architecture, used in IoT and embedded devices, applies TBDR to deliver scalable graphics performance in constrained environments like sensors and wearables, where on-chip tile buffers reduce data movement. Optimizations like dynamic tile sizing adapt to varying display resolutions, while power-gating mechanisms deactivate idle tile processing units, further lowering energy use in these integrated SoCs.

Advantages and Challenges

Performance and Efficiency Gains

Tiled rendering significantly reduces memory bandwidth usage by eliminating redundant framebuffer fetches due to overdraw, as fragments are processed on-chip within each tile before writing to external memory. In fill-rate limited scenes with high overdraw, this approach can achieve peak bandwidth reductions of up to 90% and average bandwidth reductions of 48% through techniques like early discard of redundant tiles, compared to immediate-mode rendering that requires multiple off-chip accesses per pixel. Measurements across various workloads show an average reduction in total external data traffic by a factor of approximately 2, with write-back traffic (from rasterizer to framebuffer) decreasing by up to 2.71 times in scenes prone to overdraw.

Power efficiency gains stem from minimizing DRAM accesses, which are energy-intensive; on-chip tile processing consumes roughly 10 times less power per access than external memory operations. Tile-based architectures in mobile GPUs demonstrate higher energy efficiency compared to desktop immediate-mode GPUs, enabling longer battery life in graphics-intensive applications. For instance, optimizations built on top of tile-based rendering have been shown to reduce overall energy consumption by 37% in rendering scenarios on mobile GPUs.

Latency improvements arise from parallel tile rendering, which allows independent processing of screen regions and reduces pipeline stalls caused by overdraw in immediate-mode systems; the effective throughput scales with the number of tiles divided by the overdraw ratio, as hidden surfaces are discarded early without external memory intervention. This parallelism is particularly beneficial in complex scenes, where it can lower average frame time by 13.5% and yield up to 1.15x overall speedup in commercial gaming applications. Empirical comparisons highlight power savings in tile-based GPUs, contributing to extended battery life in demanding games under similar workloads.

These benefits scale with increasing resolution and scene complexity, as higher pixel counts amplify overdraw and bandwidth demands; in virtual reality (VR) scenarios, tiled rendering supports up to 4x effective bandwidth gains by efficiently handling the dual high-resolution eye buffers and foveated rendering techniques without proportional memory overhead increases. As of 2024, advancements in APIs like Vulkan have enhanced tiled rendering efficiency on mobile GPUs through better support for render passes and dynamic rendering, reducing overhead in multi-subpass scenarios and improving memory access patterns.
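As a rough illustration of how the roughly tenfold on-chip versus off-chip energy gap translates into per-frame savings, consider N pixels, an overdraw factor d, B bytes of depth and color traffic per fragment, and per-byte access energies e_DRAM ≈ 10 × e_SRAM; these specific values are assumptions for illustration only, not measurements. Framebuffer-related energy is then approximately E_IMR ≈ N × d × B × e_DRAM for an immediate-mode pipeline, versus E_TBR ≈ N × d × B × e_SRAM + N × B_out × e_DRAM for a tiled pipeline, where B_out is the number of bytes written back per pixel when the tile is flushed. With d = 3, B = 8, and B_out = 4, this works out to roughly a 3.75x reduction in framebuffer-related energy under these assumptions, consistent in spirit with the measured savings cited above while being far simpler than any real workload.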

Limitations and Optimizations

One key limitation of tiled rendering is the binning overhead incurred when sorting primitives into tile lists, which can become significant in complex scenes containing many large or overlapping triangles that span multiple tiles, leading to repeated processing and increased geometry throughput demands. This overhead grows with scene complexity, potentially consuming a notable portion of the rendering budget in mobile GPUs. Another challenge arises in handling transparent objects, as alpha blending disrupts the deferred nature of tiled rendering by requiring back-to-front sorting across the entire scene to ensure correct compositing, rather than per-tile processing, which can eliminate bandwidth savings and force full-frame buffer reads and writes. Bandwidth spikes occur during tile buffer flushes to main memory, particularly at tile boundaries or when mid-render access to the framebuffer is needed for effects like post-processing, resulting in sudden high memory traffic that undermines the architecture's efficiency goals. Alpha blending further exacerbates inefficiencies by necessitating frequent framebuffer accesses, which prevent the use of on-chip tile memory and revert to higher-bandwidth, immediate-mode-like behavior in tile-based deferred architectures.

To mitigate binning overhead, adaptive binning techniques employ hierarchical structures, where coarser levels of the tile grid are used for initial assignment before refining to finer tiles, reducing redundant work for large primitives and improving efficiency in complex scenes. Compression algorithms, such as delta encoding applied to tile-local data like depth or color values, enable efficient storage in on-chip buffers; for instance, lightweight integer schemes that encode differences between neighboring values can achieve substantial reductions in data footprint for sorted or semi-sorted tile contents, though exact ratios depend on workload characteristics. Software mitigations include extensions like Vulkan's VK_EXT_shader_tile_image, which grant fragment shaders rasterization-order access to on-chip tile image data, allowing developers to implement custom blending or similar effects without full tile flushes. Hybrid rendering modes address edge cases, such as high-overdraw passes, by dynamically switching between tile-based deferred rendering and direct, immediate-mode-like rendering paths to balance bandwidth savings with flexibility in scenarios like compute-heavy post-effects.
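The delta-compression idea mentioned above can be illustrated with a minimal sketch: tile-local values with strong spatial coherence, such as a row of quantized depth values, are stored as one base value plus small differences. The 16-bit base and 8-bit delta layout is an assumption for illustration, not any vendor's actual scheme.

```python
# Minimal sketch of delta compression for coherent tile-local data.
# Assumed layout: one 16-bit base value followed by signed 8-bit deltas.
def compress_row(values):
    """values: list of non-negative ints (e.g. quantized depth). Returns (base, deltas)
    or None if any delta falls outside the assumed signed 8-bit range."""
    base = values[0]
    deltas = []
    for prev, cur in zip(values, values[1:]):
        d = cur - prev
        if not -128 <= d <= 127:
            return None          # fall back to storing the row uncompressed
        deltas.append(d)
    return base, deltas

def decompress_row(base, deltas):
    out = [base]
    for d in deltas:
        out.append(out[-1] + d)
    return out

if __name__ == "__main__":
    row = [1000 + i * 3 for i in range(16)]      # smoothly varying depth values
    packed = compress_row(row)
    assert packed is not None and decompress_row(*packed) == row
    raw_bytes = len(row) * 2                     # 16 bits per raw value
    packed_bytes = 2 + len(packed[1])            # 16-bit base + 8-bit deltas
    print(f"{raw_bytes} bytes raw vs {packed_bytes} bytes compressed")
```

The fallback path matters in practice: rows that cross depth discontinuities compress poorly, so real schemes keep an uncompressed mode and pick per-block whichever representation is smaller.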
