Retained mode
Retained mode is a design pattern in computer graphics APIs where the application declaratively defines and maintains a persistent data structure, such as a scene graph, representing the graphical scene, while the graphics library automatically manages rendering, updates, and display in response to modifications.[1][2] In contrast to immediate mode, where the application issues explicit procedural drawing commands for each frame and must resend all geometric data even for minor changes, retained mode stores the scene model in memory on the client or server side, minimizing redundant data transfers between the CPU and GPU.[1][3] This approach allows the library to handle optimizations like incremental updates and automatic redrawing, reducing the developer's burden for low-level rendering details.[2]

Retained mode offers advantages in simplicity and efficiency for complex, dynamic scenes, as it abstracts away frame-by-frame command issuance and supports features like event handling for user interactions without manual re-rendering.[1] However, it can consume more memory due to the persistent scene representation and may limit fine-grained control over the rendering pipeline compared to immediate mode.[3]

Common examples include Scalable Vector Graphics (SVG), where the browser retains and renders the scene description without per-frame commands, and Windows Presentation Foundation (WPF), which uses a retained-mode model for building and updating visual elements.[2][1] Retained mode has been influential in graphical user interfaces (GUIs) and vector-based rendering systems, enabling scalable displays across platforms.[3]

Fundamentals
Definition
Retained mode is a declarative paradigm in graphics application programming interfaces (APIs) where the application constructs and maintains a persistent scene model, typically comprising data structures representing objects, transformations, and properties, which the graphics library utilizes for rendering.[1][4] In this approach, the application specifies an entire scene that is built within the graphics system, allowing the library to store and manage this model across multiple frames rather than executing commands immediately.[4][5] The core concept of retained mode involves the graphics library retaining control over the scene state, thereby automatically managing updates, view frustum culling, and redrawing operations in response to modifications in the model.[1][5] This retention enables optimizations such as scene graph traversal for hierarchical relationships and efficient resource allocation, as the library handles the rendering pipeline independently of per-frame application input.[5]

A representative example of retained mode involves constructing a scene using primitives like shapes, lines, or hierarchical nodes that persist beyond a single frame, as seen in APIs such as Windows Presentation Foundation (WPF), where the application adds, modifies, or removes elements from the maintained scene model.[1] In contrast to immediate mode, retained mode emphasizes model persistence over procedural command issuance per frame.[1]

Key Characteristics
Retained mode graphics systems maintain a persistent state by storing and managing a hierarchical model of the scene, such as a directed acyclic graph (DAG), within the graphics subsystem itself. This model persists across rendering cycles and is updated in response to application modifications, decoupling scene description from the act of rendering and enabling the system to handle display updates autonomously.[6][7]

Automatic optimization is a core feature, achieved through mechanisms like dirty flagging, which marks modified scene elements to facilitate partial updates rather than full re-renders. The system traverses the hierarchical structure to identify and process only affected regions, incorporating techniques such as culling and level-of-detail adjustments during rendering. Event-driven modifications further enhance efficiency by propagating changes through the structure and triggering optimizations only as needed.[6][8]

At its abstraction level, retained mode employs high-level declarative syntax, allowing developers to define the desired scene composition—such as adding or removing structural nodes—without specifying per-frame rendering commands. This paradigm shifts the burden of low-level details, like transformation matrices or primitive drawing orders, to the graphics system, promoting modularity and ease of scene manipulation.[6][9]

Comparison to Immediate Mode
Core Differences
Retained mode and immediate mode represent two fundamental paradigms in graphics APIs, differing primarily in their approach to scene description and rendering execution. Retained mode operates in a declarative manner, where the application constructs and maintains a persistent model of the scene—often structured as a scene graph—that encapsulates the geometry, transformations, and other elements. This model is built once and subsequently updated incrementally as needed, allowing the graphics library to manage rendering based on the current state of the model.[5] In contrast, immediate mode is imperative, requiring the application to issue a complete sequence of drawing commands for every frame, effectively re-describing the entire scene from scratch without any retained structure.[4]

A core distinction lies in state handling between the two modes. In retained mode, the graphics library retains a full, persistent representation of the scene model, enabling efficient incremental modifications such as adding, removing, or altering elements without respecifying unaffected parts. This persistence facilitates optimizations like culling or batching that the library can perform automatically on the stored data.[5] Immediate mode, however, maintains no such persistent state; each command stream is stateless and self-contained, with the application bearing full responsibility for re-establishing all relevant states (e.g., transformations, materials) in every rendering pass, which can lead to redundant computations but ensures reproducibility across frames.[4]

Control flow also diverges significantly, affecting the level of abstraction and developer oversight. Retained mode delegates key rendering decisions—such as traversal order, optimization strategies, and even parallelization opportunities—to the graphics library, which analyzes the scene model to determine the most efficient execution path. This abstraction reduces application-side complexity but limits fine-grained control over the rendering pipeline.[5] Conversely, immediate mode grants the application complete authority over the sequence and timing of commands, allowing direct manipulation of the rendering process for custom effects or algorithms, though at the cost of increased code verbosity and potential performance overhead from repeated state setups.[4]

Rendering Processes
In retained mode rendering, the process begins with traversing the scene graph, a hierarchical data structure representing the graphical elements, to determine the visible components for output.[9] During traversal, transformations such as rotations, scales, and translations are applied hierarchically from parent to child nodes, inheriting coordinate systems to position elements accurately.[9] Invisible elements are culled through optimizations like view-frustum culling or occlusion culling to exclude them from further processing, followed by compositing the remaining primitives into the final framebuffer image.[9] Updates to the scene propagate via change notifications, where modifications to the graph trigger selective re-traversals and re-renders rather than full rebuilds.[10]

In contrast, immediate mode rendering involves issuing a sequence of draw calls directly to the graphics pipeline for each frame, specifying vertices, attributes, and primitives on-the-fly without maintaining any persistent scene representation.[11] Commands such as glBegin and glEnd in legacy OpenGL enclose vertex data, which is processed immediately through the pipeline stages including transformation, clipping, and rasterization, resulting in output to the framebuffer.[11] Since no state or geometry is retained between frames, the entire visual content must be recreated from application logic in every rendering cycle, leading to repeated command issuance.[11]
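The retained-mode traversal described above can be sketched with a toy hierarchy. Everything here is illustrative: the Node class, its offset field (a 2D translation standing in for a full transformation matrix), and the traverse() function are hypothetical names, not the API of any particular library.

```python
class Node:
    def __init__(self, name, offset=(0, 0), visible=True):
        self.name = name
        self.offset = offset      # local translation relative to the parent
        self.visible = visible
        self.children = []

    def add(self, child):
        self.children.append(child)
        return child

def traverse(node, parent_pos=(0, 0), out=None):
    """Depth-first traversal: accumulate parent-to-child transforms,
    cull invisible subtrees, and emit (name, world position) draw records."""
    if out is None:
        out = []
    if not node.visible:          # culling: skip the entire subtree
        return out
    pos = (parent_pos[0] + node.offset[0], parent_pos[1] + node.offset[1])
    out.append((node.name, pos))
    for child in node.children:
        traverse(child, pos, out)
    return out

root = Node("root")
body = root.add(Node("body", offset=(10, 0)))
body.add(Node("arm", offset=(2, 3)))
body.add(Node("hidden", offset=(5, 5), visible=False))

# "arm" inherits its parent's position: (10 + 2, 0 + 3) = (12, 3);
# "hidden" and anything beneath it never reach the draw list.
draw_list = traverse(root)
```

Because the hierarchy persists, the same traverse() call can be repeated every frame by the library without the application re-describing the scene.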
These procedural differences yield distinct performance implications: retained mode facilitates batched updates by leveraging the scene graph for efficient change detection and partial re-renders, optimizing for complex, static scenes.[1] Immediate mode, however, provides low-overhead execution suitable for dynamic scenarios, as it avoids the cost of graph maintenance and allows direct control over per-frame drawing without intermediate storage.[1]
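The partial re-rendering that change detection enables can be illustrated with a minimal dirty-flag sketch. The SceneNode class and render_pass() function are hypothetical, and a real system would track far richer state than a single boolean per node.

```python
class SceneNode:
    def __init__(self, name):
        self.name = name
        self.dirty = True         # new nodes must be drawn at least once
        self.children = []

    def mark_dirty(self):
        self.dirty = True

def render_pass(node, rendered):
    """Re-render only nodes flagged as dirty; clean nodes are reused."""
    if node.dirty:
        rendered.append(node.name)
        node.dirty = False
    for child in node.children:
        render_pass(child, rendered)
    return rendered

root = SceneNode("root")
child = SceneNode("child")
root.children.append(child)

first = render_pass(root, [])    # initial frame: everything is dirty
second = render_pass(root, [])   # nothing changed: no work performed
child.mark_dirty()               # application modifies one element
third = render_pass(root, [])    # only the modified node is redrawn
```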
Historical Development
Origins in Early Graphics
The concept of retained mode graphics originated in the early 1960s with pioneering interactive systems designed to overcome the constraints of contemporary hardware, such as limited memory and processing power on machines like the TX-2 computer. Ivan Sutherland's Sketchpad system, developed in 1963 as part of his MIT doctoral thesis, introduced a foundational retained structure through its use of a "display file" to store spot coordinates for vector-based drawings. This file, occupying up to 32,000 words in memory, held graphical primitives like lines and circles as sequences of display spots, enabling the system to regenerate the image incrementally without requiring full recomputation from the source data each time.[12] The retained approach was essential for interactivity, as it allowed real-time manipulation via a light pen—users could select and modify elements, with the ring-structured topological data (linking points, lines, and constraints) automatically updating connected parts of the scene.[12]

During the 1970s, retained mode concepts evolved in response to growing demands for handling more complex scenes in resource-constrained environments, where constant redrawing would overwhelm slow CPUs and vector displays. Systems built on Sketchpad's ideas, such as those in military and engineering applications, emphasized retaining scene descriptions to support editing and reuse, minimizing redraw cycles on hardware with display rates limited to around 100,000 spots per second. This motivation stemmed from the need to manage interactive vector graphics efficiently, as immediate redrawing of entire scenes was impractical given memory sizes under 64K words and the absence of dedicated graphics accelerators.[13]

By the 1980s, these early innovations influenced standardized libraries like the Graphical Kernel System (GKS), formalized as ISO 7942 in 1985, which incorporated retained structures through workstation-independent segment storage. GKS allowed graphical output primitives to be grouped into reusable segments, stored separately from immediate output commands, facilitating device-independent portability and efficient updates for complex 2D vector graphics across varying hardware.[14] This was followed by the Programmer's Hierarchical Interactive Graphics System (PHIGS), published as ISO/IEC 9592 in 1989, which introduced retained hierarchies for 3D scenes, enabling structured storage and traversal for interactive 3D graphics.[15]

Evolution in Modern APIs
In the 1990s, retained mode gained prominence in 3D graphics through its integration into APIs like Open Inventor, developed by Silicon Graphics Inc. (SGI) and released in 1991, which employed scene graphs to model and retain complex 3D scenes for efficient rendering and interaction.[16] Similarly, Java 3D, announced in 1996 and first released in 1998 by Sun Microsystems, adopted a retained mode approach using hierarchical scene graphs composed of nodes such as BranchGroup, TransformGroup, and Shape3D to manage persistent 3D objects and automate rendering updates.[17] These APIs built on earlier concepts like those in GKS but emphasized object-oriented structures for higher-level abstraction in professional 3D applications.

Parallel to advancements in 3D graphics, retained mode influenced user interface paradigms during the same decade, as seen in Java Swing, introduced in 1997 as part of the Java Foundation Classes and fully integrated into Java 1.2 in 1998, where a hierarchy of retained component objects (e.g., panels, buttons) is maintained by the framework to handle layout and repainting dynamically. Qt, conceived in 1990 by Haavard Nord and Eirik Chambe-Eng, with Trolltech founded in 1994, first publicly released in 1995, and reaching version 1.0 in 1996, similarly utilized a retained widget-based model to construct and persist UI elements across cross-platform applications.[18] By the late 1990s, this approach extended to web technologies with the HTML Document Object Model (DOM), standardized by the W3C in 1998, serving as a retained tree structure for document representation that enabled dynamic updates via scripting for interactive web UIs.
In recent trends, retained mode has evolved toward hybrid models in game engines, exemplified by Unity, launched in 2005, which employs a retained scene graph of GameObjects and components to maintain persistent world states while incorporating immediate-mode elements, such as its legacy IMGUI system, to optimize performance in real-time rendering scenarios. This balance allows developers to leverage retained structures for complex scene management alongside immediate commands for lightweight, frame-specific overlays, addressing performance demands in interactive environments.

Implementation Approaches
Scene Graph Structures
In retained mode graphics systems, the primary data structure is a hierarchical node-based model that organizes scene elements into a tree-like representation, where nodes encapsulate individual objects such as geometry, lights, and cameras. Each node maintains parent-child relationships that propagate transformations, like translations, rotations, and scalings, from parent to child, enabling efficient management of complex spatial hierarchies.[19][20]

Nodes in this model serve as containers for various components, including properties such as position, material attributes, and behavioral scripts, which define how objects interact within the scene. Traversal algorithms, typically depth-first, navigate the hierarchy to apply these components during rendering, accumulating state changes like transformation matrices as they descend the tree.[19][20]

A common variant of the basic tree structure is the directed acyclic graph (DAG), which permits nodes to have multiple parents, facilitating shared subgraphs for elements like repeated geometries in a scene. This DAG approach maintains acyclicity to prevent infinite loops while allowing modifications to shared nodes to propagate efficiently across instances.[19][20]

API Design Patterns
Retained mode APIs leverage established software design patterns to enable developers to construct, modify, and interact with scene representations efficiently, often building upon an underlying scene graph structure. These patterns promote modularity, automation of updates, and abstraction from low-level rendering details, allowing applications to focus on high-level scene description rather than explicit rendering commands.[9]

The builder pattern is frequently used in retained mode APIs for constructing complex scene hierarchies through sequential operations, such as method chaining or factory methods that create and assemble nodes. For instance, developers might invoke functions like createNode() to instantiate a graphical element and addChild() to attach it to a parent, progressively building the retained model without directly managing memory or rendering states. This approach separates the construction logic from the representation, facilitating reusable and readable code for scene assembly.[21][9]
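A minimal sketch of builder-style construction, assuming hypothetical create_node()/add_child() methods analogous to the createNode()/addChild() calls mentioned above; the SceneBuilder class is illustrative, not a real library interface.

```python
class SceneBuilder:
    """Assembles a retained hierarchy via method chaining, keeping the
    construction logic separate from rendering state."""
    def __init__(self):
        self.nodes = {}       # node name -> properties
        self.children = {}    # node name -> ordered child names

    def create_node(self, name, **props):
        self.nodes[name] = props
        self.children[name] = []
        return self           # return self so calls can be chained

    def add_child(self, parent, child):
        self.children[parent].append(child)
        return self

# Chained calls progressively build the retained model; the application
# never issues a drawing command.
scene = (SceneBuilder()
         .create_node("root")
         .create_node("cube", color="red")
         .create_node("light", intensity=0.8)
         .add_child("root", "cube")
         .add_child("root", "light"))
```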
The observer pattern supports change propagation in retained mode systems by establishing dependencies where modifications to the scene model automatically notify and update dependent components, eliminating the need for manual redraws. When an application alters a node—such as updating its position or properties—the API's observer mechanism detects the change and triggers library-managed refreshes, ensuring the displayed output remains synchronized with the retained data. This pattern is integral to event-driven updates, enhancing responsiveness without burdening the developer with explicit synchronization.[9][21]
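A small illustration of observer-based change propagation; ObservableNode, subscribe(), and set() are hypothetical names, and the "redraw" here is just a log append standing in for a library-managed refresh.

```python
class ObservableNode:
    """Scene node that notifies registered observers whenever a property
    changes, letting the library schedule refreshes automatically."""
    def __init__(self):
        self._props = {}
        self._observers = []

    def subscribe(self, callback):
        self._observers.append(callback)

    def set(self, key, value):
        self._props[key] = value
        for cb in self._observers:   # change propagation to dependents
            cb(key, value)

redraw_log = []
node = ObservableNode()
# The library registers a refresh handler; the application never calls it.
node.subscribe(lambda key, value: redraw_log.append((key, value)))

node.set("x", 10)   # each modification triggers a notification,
node.set("x", 20)   # keeping the display synchronized with the model
```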
Encapsulation in retained mode API design hides implementation details of the scene model, exposing only high-level operations that abstract away complexities like traversal algorithms or optimization routines. Nodes encapsulate properties such as transformations or attributes with default values and provide methods like attachEvent() for adding interactivity, allowing developers to manipulate the scene declaratively while the API internally manages rendering and state consistency. This principle ensures that changes to the underlying graphics pipeline do not affect application code, promoting maintainability and portability across hardware.[9][22]
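Encapsulation of this kind can be sketched as follows; the Shape class, move_to(), and attach_event() (echoing the attachEvent() style mentioned above) are illustrative names, not a real library API.

```python
class Shape:
    """Encapsulated scene element: position and event wiring are private
    attributes; callers see only high-level declarative operations."""
    def __init__(self):
        self._x, self._y = 0.0, 0.0      # default values, hidden internals
        self._handlers = {}

    def move_to(self, x, y):
        # Internal state change only; a real library would also mark the
        # node dirty here so the renderer refreshes it.
        self._x, self._y = x, y

    def attach_event(self, event, handler):
        self._handlers.setdefault(event, []).append(handler)

    def fire(self, event):
        # Dispatch would normally be driven by the library's input system.
        for handler in self._handlers.get(event, []):
            handler(self)

clicks = []
s = Shape()
s.move_to(3, 4)
s.attach_event("click", lambda shape: clicks.append("clicked"))
s.fire("click")
```

Because traversal, dirty tracking, and dispatch stay behind these methods, the underlying pipeline can change without touching application code.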