
In-memory database

An in-memory database (IMDB), also referred to as a main memory database system (MMDB or IMDBMS), is a database management system that primarily stores and manages data in the computer's main memory (RAM) rather than on disk, enabling significantly faster data access and processing by eliminating the latency of disk I/O operations. This approach contrasts with traditional disk-based databases, where data retrieval involves mechanical seek times and slower read/write speeds, allowing IMDBs to deliver microsecond-level latencies for reads and single-digit millisecond latencies for writes. The architecture of in-memory databases leverages RAM for speed but incorporates mechanisms like data replication, periodic snapshots to non-volatile storage, and transaction logging to ensure durability and recoverability, addressing concerns over data loss in case of power failures. Key features include support for various data models such as relational, key-value, or document-oriented structures, often with compression techniques and optimized indexing to maximize memory efficiency.

Advancements in hardware, including larger RAM capacities, multi-core processors, and emerging non-volatile memory (NVM) technologies like Intel Optane and phase-change RAM, have made IMDBs scalable for large workloads, supporting high throughput in clustered environments without proportional performance degradation. In-memory databases excel in real-time applications requiring low-latency responses, such as financial trading systems for instant order execution, e-commerce platforms handling high-traffic shopping carts, gaming leaderboards for millions of concurrent users, and analytics tools for rapid data visualization.

Notable examples include Amazon MemoryDB and ElastiCache for scalable caching and durable storage with Redis compatibility; Hazelcast for in-memory data grids; Oracle Database In-Memory (released in 2014) and Oracle TimesTen (under Oracle since 2005) for analytics and OLTP; SAP HANA (launched in 2010) for in-memory columnar processing; and Microsoft SQL Server's In-Memory OLTP (introduced in 2014). Adoption has accelerated since the early 2010s due to plummeting RAM costs and the need for processing massive datasets in memory, though challenges like ensuring durability at scale and integration with non-volatile storage persist.

Overview

Definition and Characteristics

An in-memory database, also referred to as a main memory database, is a type of database management system that primarily stores and processes data in the computer's main memory (RAM) rather than on persistent disk storage. This design eliminates the need for frequent disk input/output operations, enabling sub-millisecond query latencies and supporting real-time processing applications.

Core characteristics of in-memory databases include the inherent volatility of data stored in RAM, which requires additional mechanisms such as transaction logging or periodic snapshots to provide persistence and durability against power failures or crashes. These systems leverage high-speed memory hierarchies and specialized structures, such as hash tables or tree-based indexes, to optimize access patterns. They support diverse data models, including key-value stores, relational tables, document-oriented formats, and graph structures, with a focus on maximizing throughput and minimizing latency rather than optimizing for large-scale, long-term archival storage.

In comparison to disk-based databases, in-memory databases achieve dramatically lower access times: typical RAM access is around 100 nanoseconds, versus approximately 10 milliseconds for disk seek operations in traditional systems. This fundamental difference in storage medium shifts the bottleneck from I/O to computation. The terminology of in-memory databases distinguishes them from broader in-memory computing paradigms or simple caching layers, as they provide full database capabilities—including complex querying, transaction support with ACID guarantees, and constraint enforcement—while treating main memory as the primary data residence rather than a temporary cache.

Historical Development

The concept of leveraging main memory for faster data access in database systems traces its roots to the 1960s and 1970s, when early database management systems like IBM's Information Management System (IMS), developed in 1966 for the Apollo program, incorporated memory buffering to accelerate access to hierarchical data structures, marking an initial shift from purely disk-based operations. In the 1970s, the advent of relational databases further emphasized buffer management techniques to cache frequently accessed data in RAM, as seen in pioneering systems like System R at IBM, which optimized query performance by minimizing disk I/O through in-memory caching mechanisms. These early approaches laid the groundwork for in-memory concepts, though full in-memory storage remained limited by memory cost and capacity constraints.

The 1980s and 1990s saw the emergence of dedicated in-memory systems, driven by object-oriented database research and declining memory costs. Seminal work at institutions like the University of Wisconsin's MM-DBMS project explored main-memory architectures, influencing commercial products such as ObjectStore, released in 1988 by Object Design, Inc., which provided persistent object storage primarily in RAM for engineering applications. Similarly, GemStone/S, developed from 1982 and commercially available by 1987, offered an in-memory object database for complex data models in Smalltalk environments. By the mid-1990s, fully relational in-memory databases proliferated, including Lucent's DataBlitz (prototyped 1993–1995) for high-throughput telecom applications and Oracle TimesTen (spun out of Hewlett-Packard Labs in 1996), which delivered microsecond response times for OLTP workloads. Altibase followed in 1999 as a hybrid in-memory RDBMS with South Korean research origins dating to 1991.

The 2000s marked a boom in in-memory databases, fueled by the NoSQL movement's emphasis on scalability and the plummeting cost of RAM—from approximately $700 per GB in 2000 to around $10 per GB by 2010—enabling larger datasets to fit in memory. Redis, prototyped in 2009 by Salvatore Sanfilippo to address real-time analytics bottlenecks, became a cornerstone in-memory store prized for its simplicity and speed in caching and messaging. SAP HANA, announced in 2010 and generally available in 2011, revolutionized enterprise data management by combining in-memory columnar storage with OLAP/OLTP capabilities, processing terabytes in seconds. VoltDB, commercialized from MIT's H-Store project (forked after its 2008 VLDB demo), exemplified NewSQL's fusion of relational ACID compliance with in-memory performance for distributed OLTP.

In the 2010s and 2020s, in-memory databases integrated with cloud computing and big data paradigms, supporting real-time analytics in distributed environments, while non-volatile memory advancements like Intel Optane (announced in 2015) enhanced persistence by bridging DRAM and SSD latencies without full volatility risks. This era's growth was propelled by surging data velocity in big data ecosystems, with trends like microservices accelerating adoption for high-throughput caching in cloud-native architectures. By late 2025, RAM costs had declined to around $3 per GB, despite recent price surges driven by high demand.

Core Technologies

Memory Management and Data Structures

In-memory databases employ specialized memory allocation strategies to ensure efficient utilization of RAM, minimizing overhead and fragmentation in high-throughput environments. Dynamic allocators such as jemalloc and tcmalloc are commonly integrated to handle frequent allocations and deallocations, providing scalable performance for large-scale deployments. Jemalloc, for instance, uses size-class bucketing and arena-based allocation to reduce contention and limit metadata overhead to under 2% of total memory usage, making it suitable for long-running processes like databases where fragmentation can accumulate over time. Similarly, tcmalloc employs thread-local caches to accelerate small object allocations, optimizing transactional throughput by avoiding global locks and reducing resource starvation in multi-threaded scenarios. These allocators address fragmentation by implementing techniques like low address reusage for large objects, which scans for free slots in a manner that prevents external fragmentation in query-intensive workloads.

Data structures in in-memory databases are selected for their low-latency access patterns, leveraging RAM's speed to achieve sub-millisecond operations. Hash tables are prevalent for key-value stores, enabling average O(1) lookup, insertion, and deletion times through direct addressing via hash functions. For example, in Redis, hashes and sets are implemented using hash tables with incremental rehashing to handle resizing without blocking operations. Ordered data, such as in sorted sets, often utilizes skip lists, which provide probabilistic O(log n) search complexity with simpler implementation than balanced trees, as seen in Redis's ZSET structure, where a skip list overlays a hash table for efficient range queries. B-trees or their variants, like B+-trees, are used in relational in-memory systems for indexing ordered data, maintaining balance to support range scans with O(log n) access while minimizing memory footprint through node sharing. To support concurrency without locks, lock-free data structures such as non-blocking hash tables and skip lists employ atomic operations like compare-and-swap (CAS) for thread-safe updates, ensuring progress in multi-core environments. A basic sketch of lock-free insertion into a hash bucket, rendered here as runnable Java using CAS on the bucket head, illustrates this:
import java.util.concurrent.atomic.AtomicReferenceArray;

record Pair(Object key, Object value, Pair next) {}

class LockFreeTable {
    static final AtomicReferenceArray<Pair> buckets = new AtomicReferenceArray<>(1024);

    static void insert(Object key, Object value) {
        int slot = (key.hashCode() & 0x7fffffff) % buckets.length();
        while (true) {
            Pair head = buckets.get(slot);                       // snapshot the current bucket head
            Pair node = new Pair(key, value, head);              // link the new node in front of it
            if (buckets.compareAndSet(slot, head, node)) return; // CAS succeeded; otherwise retry
        }
    }
}
This approach retries on contention, avoiding mutual exclusion overhead.

Garbage collection and compaction mechanisms are critical in in-memory databases to reclaim unused memory without introducing unacceptable pauses, particularly in systems using multi-version concurrency control (MVCC). Generational garbage collection, where short-lived objects are separated from long-lived ones, is applied in Java-based in-memory databases like VoltDB to focus minor collections on young generations and reduce full heap scans. In SAP HANA, hybrid garbage collection combines background threads for table-level reclamation with version chain traversal to remove obsolete MVCC snapshots, ensuring efficient memory reuse in mixed OLTP/OLAP workloads. Memory pooling serves as an alternative in latency-sensitive systems, pre-allocating fixed-size buffers to avoid allocation stalls and fragmentation; for instance, object pools recycle data structures like query result objects, minimizing garbage generation during peak loads (see the sketch below). Compaction relocates live objects to eliminate holes left by deallocations, often triggered periodically in generational schemes to maintain contiguous memory and improve locality. These techniques balance reclamation speed with predictability, as excessive pauses can degrade performance.

Exploiting memory hierarchies enhances performance by aligning data access with hardware locality, from CPU caches to non-uniform memory access (NUMA) topologies. In-memory databases optimize for CPU caches (L1, L2, L3) by structuring data to fit cache lines—typically 64 bytes—and using techniques like cache-aware partitioning to reduce misses; for example, column-oriented storage in systems like SAP HANA groups attribute values together so that scans load only relevant data into the L1/L2 caches, improving CPU efficiency by up to 2-3x over row stores. NUMA awareness is essential in multi-socket servers, where memory access cost varies by node; databases configure allocation strategies to minimize remote memory fetches, which can significantly increase latency compared to local access. Locality-optimized structures, such as B-skiplists, further enhance cache utilization by sequentializing traversals to prefetch adjacent data, reducing L3 cache evictions in concurrent workloads. These optimizations ensure that the bulk of operations remain within fast cache layers, relying on DRAM only for larger working sets.
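
As referenced above, a minimal object-pool sketch in Java (with hypothetical names, not any specific product's API) shows how pre-allocated fixed-size buffers are recycled instead of being left to the garbage collector:

import java.util.concurrent.ConcurrentLinkedQueue;

final class BufferPool {
    private final ConcurrentLinkedQueue<byte[]> free = new ConcurrentLinkedQueue<>();
    private final int bufferSize;

    BufferPool(int bufferSize, int preallocated) {
        this.bufferSize = bufferSize;
        for (int i = 0; i < preallocated; i++)
            free.offer(new byte[bufferSize]);          // pre-allocate fixed-size buffers up front
    }

    byte[] acquire() {
        byte[] buf = free.poll();                      // reuse a pooled buffer when available
        return (buf != null) ? buf : new byte[bufferSize];
    }

    void release(byte[] buf) {
        if (buf.length == bufferSize)
            free.offer(buf);                           // recycle rather than discard to the GC
    }
}

Because buffers are returned to the queue instead of being dropped, a steady-state workload allocates almost nothing, which keeps generational collections short and predictable.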

Persistence and Durability

In-memory databases face significant durability challenges due to the inherent volatility of RAM, where data can be lost entirely in the event of power failures, crashes, or hardware faults, potentially leading to substantial data loss without mitigation strategies. This volatility necessitates trade-offs between the high-speed access that defines in-memory systems and the reliability required for production use, often introducing overhead from persistence operations that can reduce throughput by orders of magnitude compared to pure in-memory writes.

To address these issues, common persistence techniques include write-ahead logging (WAL), where modifications are appended to a durable log on disk or SSD before being applied in memory, ensuring that committed transactions can be recovered even if memory contents are lost. Snapshotting complements WAL by periodically dumping the entire in-memory state to non-volatile storage, such as through mechanisms that create compressed backups at intervals like every 300 seconds or after a threshold number of writes, minimizing ongoing overhead while providing a baseline for recovery. Replication to secondary nodes further enhances durability by synchronously or asynchronously mirroring data across distributed systems, allowing failover to redundant copies in case of failure and reducing single points of vulnerability.

Emerging hardware solutions like non-volatile RAM (NVRAM), exemplified by Intel's Optane technology (introduced in 2017 but discontinued in 2022), enable byte-addressable persistence directly from the CPU, storing data in a form that survives power loss without the full I/O overhead of traditional disk writes. Ongoing advancements include technologies such as MRAM and ReRAM. Remote Direct Memory Access (RDMA) extends this capability across nodes, permitting direct writes to remote NVRAM for replication with low latency—achieving up to 2.4 times higher throughput in distributed in-memory systems by bypassing CPU involvement and leveraging one-sided operations.

Recovery processes in these systems typically involve checkpointing algorithms that combine snapshots with log replay, where the latest checkpoint is loaded into memory followed by sequential application of logged transactions to reconstruct the state. Log replay operates with linear time complexity O(n), where n represents the number of transactions in the log, enabling efficient restoration but scaling directly with log volume—for instance, recovering a 137 GB log may take around 109 seconds on multi-core systems with parallel processing. These mechanisms ensure data integrity post-failure while balancing recovery speed against the persistence costs inherent to volatile memory environments.
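
To make write-ahead logging concrete, here is a minimal Java sketch, assuming a trivial text record format and a single-writer store (not any production engine's layout): each change is forced to the durable log before the volatile in-memory table is updated, and recovery replays the log sequentially in O(n):

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.*;
import java.util.HashMap;
import java.util.Map;

final class WalStore {
    private final Map<String, String> table = new HashMap<>(); // volatile in-memory state
    private final FileChannel log;

    WalStore(Path logFile) throws IOException {
        log = FileChannel.open(logFile, StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        for (String line : Files.readAllLines(logFile)) {       // recovery: replay the log, O(n)
            String[] p = line.split(" ", 3);
            if (p.length == 3 && p[0].equals("SET")) table.put(p[1], p[2]);
        }
    }

    synchronized void put(String key, String value) throws IOException {
        byte[] record = ("SET " + key + " " + value + "\n").getBytes();
        log.write(ByteBuffer.wrap(record));
        log.force(false);            // fsync: the record is durable before the commit is acknowledged
        table.put(key, value);       // only then mutate volatile memory
    }

    String get(String key) { return table.get(key); }
}

The forced flush on every commit is what makes writes durable, and also what makes WAL the dominant cost of persistence; real systems batch (group-commit) multiple transactions per flush to amortize it.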

Performance and Use Cases

Advantages Over Disk-Based Databases

In-memory databases provide significant performance advantages over traditional disk-based systems, primarily through drastically reduced latency and increased throughput. For instance, by storing all data in DRAM, these systems achieve end-to-end access latencies of 5–10 microseconds, compared to 0.5–10 milliseconds for disk-based alternatives, representing a 100x to 1,000x improvement (for example, 5 milliseconds ÷ 5 microseconds = 1,000). This latency reduction enables query execution in microseconds rather than milliseconds, which is particularly beneficial for online transaction processing (OLTP) workloads where rapid response is critical. Throughput can similarly increase by 100x to 1,000x, supporting up to 1 million small read requests per second per server versus 1,000–10,000 for disk systems. Input/output operations per second (IOPS) are also enhanced, as memory access avoids the mechanical delays inherent in disk operations.

A key advantage stems from the elimination of I/O bottlenecks associated with disk storage. Disk-based databases suffer from seek times (typically 5–10 ms) and rotational latency (around 4–8 ms for 7,200 RPM drives), which introduce substantial delays in data retrieval. In contrast, in-memory databases enable direct data access via pointers, bypassing these overheads entirely and allowing seamless data processing without persistent storage interruptions. This facilitates higher IOPS—often in the millions—without the physical constraints of disk hardware, leading to more predictable and consistent performance.

These capabilities extend to real-time processing of high-velocity data streams, enabling high-throughput analytics on large datasets, even when the active working set is only a fraction of the total database. In read-heavy workloads, this translates to cost-efficiency despite higher initial memory costs, as reduced reliance on disk hardware lowers operational expenses through smaller storage footprints and faster processing that minimizes compute time. Overall, these advantages make in-memory databases particularly suited for scenarios demanding high-speed, low-latency operations.

Common Applications

In-memory databases excel in real-time analytics applications, where their ability to process vast amounts of data with minimal latency is crucial. In fraud detection systems within the financial sector, they enable the rapid analysis of transaction streams, often handling millions of events per second to flag suspicious activities in milliseconds. Similarly, for Internet of Things (IoT) ingestion, these databases manage high-velocity telemetry from connected devices, supporting immediate querying and aggregation for applications like monitoring and predictive maintenance.

Caching and session management represent another key domain, particularly in financial trading platforms, where in-memory databases store market data and order books for sub-millisecond access, facilitating algorithmic trading decisions during volatile market conditions. In e-commerce applications, they efficiently handle session state, such as shopping carts and user sessions, ensuring seamless experiences across distributed servers without disk I/O bottlenecks.

In the gaming and entertainment sectors, in-memory databases power leaderboards and multiplayer state by delivering real-time updates to player scores and game states, which is essential for maintaining responsiveness in competitive environments with thousands of concurrent users. Their low-latency retrieval supports dynamic ranking systems that refresh instantly, enhancing user satisfaction in mobile games and online multiplayer titles.

Emerging applications leverage in-memory databases for machine learning feature serving and inference caching, where frequently accessed model outputs are stored to accelerate predictions in real-time systems like recommendation engines. In edge computing for 5G networks, they process localized data streams from base stations and devices, reducing latency for applications such as autonomous vehicles and smart cities. The in-memory database market, driven by real-time analytics demands, is projected to grow from USD 7.08 billion in 2025 at a CAGR of 13.98% through 2030.

Design Considerations

ACID Properties

In-memory databases adapt the traditional ACID properties—atomicity, consistency, isolation, and durability—to the constraints of volatile main memory, where data resides primarily in RAM for high-speed access but requires mechanisms to ensure reliability despite potential power failures or crashes. These adaptations often involve lightweight logging, versioning, and replication strategies that minimize disk I/O overhead while preserving transaction integrity. Unlike disk-based systems, in-memory implementations prioritize low-latency operations, sometimes trading strict durability for performance in certain variants.

Atomicity ensures that transactions execute as indivisible units, either fully succeeding or rolling back completely without partial updates. In in-memory databases, this is commonly achieved through in-memory undo logs, which record the original data states before modifications, allowing rapid reversal during rollbacks if a transaction fails due to errors or conflicts (see the sketch below). Alternatively, shadow paging provides atomicity by creating copies of modified pages in new memory locations during a transaction; upon commit, a pointer is updated atomically to reference the new versions, while rollbacks simply discard the shadow copies without affecting the original data. For example, systems like VoltDB enforce atomicity via stored procedures that bundle SQL operations, automatically rolling back on exceptions to prevent inconsistent states.

Consistency maintains database invariants, such as constraints and triggers, ensuring that each transaction transitions the database from one valid state to another. In-memory databases enforce this through validation and deterministic execution rules, often leveraging multi-version concurrency control (MVCC) to provide snapshot isolation, where transactions read from a consistent point-in-time view without blocking writers. MVCC stores multiple versions of data rows in memory, tagged with timestamps, allowing readers to access committed historical versions while writers append new ones, thus upholding constraints like primary keys and foreign key relationships without traditional locks.

Isolation prevents concurrent transactions from interfering, typically supporting levels up to serializable to avoid anomalies like dirty reads or lost updates. Many in-memory systems achieve serializable isolation without locks by using deterministic scheduling, where transactions are queued and executed in a fixed order per partition, ensuring equivalence to sequential execution. For instance, VoltDB employs single-threaded processing per data partition with deterministic command ordering, providing strict serializability while minimizing contention in high-throughput environments.

Durability guarantees that committed transactions survive failures, but in-memory systems adapt this by balancing durability with performance, often using synchronous or asynchronous commits to non-volatile storage. Synchronous commits flush transaction logs to disk immediately upon completion, ensuring immediate durability but adding latency due to I/O waits, typically in the millisecond range for solid-state drives; asynchronous commits defer this flushing to background processes, avoiding added latency at the cost of potential minor data loss on crashes. Some in-memory databases, such as Redis in its default configuration, forgo full ACID durability in favor of the BASE model (Basically Available, Soft state, Eventually consistent), prioritizing availability and partition tolerance over immediate consistency for scalable, high-speed applications.
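
A minimal in-memory undo-log sketch in Java (class and method names are illustrative assumptions, not a specific engine's API) shows the atomicity mechanism described above: the prior value of each key is recorded before a write, so rollback can restore state in reverse order:

import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;

final class UndoTransaction {
    private record Undo(String key, String priorValue) {}   // priorValue == null means the key was absent
    private final Map<String, String> store;
    private final Deque<Undo> undoLog = new ArrayDeque<>();

    UndoTransaction(Map<String, String> store) { this.store = store; }

    void write(String key, String value) {
        undoLog.push(new Undo(key, store.get(key)));        // record the prior state first
        store.put(key, value);                              // then apply the change
    }

    void rollback() {                                       // undo in reverse (LIFO) order
        while (!undoLog.isEmpty()) {
            Undo u = undoLog.pop();
            if (u.priorValue() == null) store.remove(u.key());
            else store.put(u.key(), u.priorValue());
        }
    }

    void commit() { undoLog.clear(); }                      // discard undo records on success
}

On failure, rollback() leaves the store exactly as it was before the transaction's first write; on success, commit() simply discards the undo records, making the changes permanent in memory.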

Scalability and Distribution

In-memory databases achieve horizontal scalability by partitioning data across multiple nodes, often using sharding techniques that divide datasets into subsets stored on different servers. Consistent hashing is a widely adopted technique for this partitioning, mapping keys to positions on a logical ring where each node handles a range of hash values, minimizing data movement during node additions or failures (see the sketch below). Rebalancing algorithms, such as those that adjust virtual node assignments or token ranges, ensure even load distribution and handle hotspots by dynamically redistributing partitions, typically affecting only adjacent ranges to maintain efficiency.

Replication strategies enhance availability and fault tolerance in distributed in-memory setups. Master-slave asynchronous replication propagates updates from a primary node to replicas, providing read scalability while allowing temporary inconsistencies during failures, whereas multi-master synchronous replication enables concurrent writes across nodes for higher throughput at the cost of coordination overhead. Quorum-based writes, where a majority of replicas must acknowledge updates before completion, balance consistency and availability, often configured such that read and write quorums overlap to guarantee up-to-date reads. These approaches are coordinated through consensus protocols like Raft, which uses leader election and log replication to manage state across nodes, or Paxos, which achieves agreement via multi-phase proposals and acceptances among a quorum of participants. In distributed query processing, these protocols handle coordination for transactions spanning nodes, mitigating network latency—typically tens to hundreds of microseconds per hop in data centers—through techniques like batching requests and pipelined replication to minimize round-trip times.

Scalability in in-memory databases faces challenges from hardware constraints, such as memory limits per node ranging from 1 TB to 64 TB, which cap single-node capacity and necessitate careful data tiering or offloading non-critical computations to avoid bottlenecks. Vertical scaling addresses intra-node limits via NUMA (Non-Uniform Memory Access) optimizations, including affinity-aware partitioning that localizes data to specific memory nodes and threads to reduce remote access latencies, which can be 2-3 times higher than local ones. For example, NUMA-aware radix partitioning divides large relations into subsets aligned with node topology, enabling parallel aggregation with reduced cache misses and up to 2x performance gains on multi-socket systems. These solutions prioritize in-memory efficiency while extending ACID isolation to distributed contexts through coordinated commits.
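
A compact consistent-hashing sketch in Java (a generic illustration with virtual nodes; the hash function and names are assumptions rather than any product's implementation) maps each key to the first node at or after its hash on the ring, so adding or removing a node remaps only adjacent key ranges:

import java.util.SortedMap;
import java.util.TreeMap;

final class HashRing {
    private final SortedMap<Integer, String> ring = new TreeMap<>();
    private final int virtualNodes;

    HashRing(int virtualNodes) { this.virtualNodes = virtualNodes; }

    void addNode(String node) {
        for (int i = 0; i < virtualNodes; i++)            // several ring positions per node
            ring.put((node + "#" + i).hashCode(), node);  // smooths the load distribution
    }

    void removeNode(String node) {
        for (int i = 0; i < virtualNodes; i++)
            ring.remove((node + "#" + i).hashCode());
    }

    String nodeFor(String key) {
        if (ring.isEmpty()) throw new IllegalStateException("no nodes");
        SortedMap<Integer, String> tail = ring.tailMap(key.hashCode());
        int slot = tail.isEmpty() ? ring.firstKey() : tail.firstKey(); // wrap around the ring
        return ring.get(slot);
    }
}

With, say, 100 virtual nodes per server, removing one server remaps only the keys that fell into its segments, while the rest of the cluster's assignments are untouched.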

Hybrid Systems

Integration with On-Disk Storage

Hybrid in-memory database architectures often integrate on-disk storage to handle scenarios where memory capacity is exceeded or data durability is required beyond volatile RAM. This integration enables tiered storage systems where active datasets reside in memory for rapid access, while less frequently used or archival data is persisted to disk-based storage like SSDs or HDDs. Such designs balance the speed of RAM with the cost-effectiveness and persistence of disk, and commonly support overflow management and recovery mechanisms.

Overflow handling in these systems relies on eviction policies to manage memory constraints by moving data from RAM to disk when capacity limits are reached. For instance, the Least Recently Used (LRU) policy identifies and evicts the least recently accessed items, tiering them to secondary storage such as SSDs to maintain performance for hot data (see the sketch below). Other policies, like clock or adaptive replacement, may also be employed to optimize eviction decisions based on access patterns, ensuring minimal disruption to ongoing operations. This tiering approach prevents out-of-memory errors while preserving query efficiency for in-memory subsets.

Backup and recovery processes in hybrid in-memory databases involve periodic full or incremental dumps to disk to ensure data durability against crashes or failures. Full dumps capture the entire in-memory state at checkpoints, while incremental methods log only changes since the last backup, reducing overhead. Tools in hybrid in-memory database systems facilitate these operations by supporting snapshot isolation for consistent disk writes without blocking in-memory transactions. Recovery then reconstructs the database by loading dumps and replaying logs, minimizing downtime in durable configurations.

Workload partitioning separates hot data—frequently accessed for real-time queries—into memory, while cold data, used for infrequent analysis or archival, remains on disk. Query routers or optimizers analyze incoming requests and direct them appropriately: high-velocity OLTP operations route to in-memory layers, whereas batch OLAP jobs access disk tiers. This partitioning enhances overall throughput by localizing latency-sensitive workloads to fast storage and leveraging cheaper disk for bulk operations.

Performance in hybrid systems incurs added latency from disk synchronization, as SSD reads are typically 100 to 10,000 times slower than RAM reads and HDDs are slower still by several orders of magnitude (10,000 to 1,000,000 times), due to seek and access delays. To mitigate this, asynchronous flushing decouples write acknowledgments from physical disk commits, allowing transactions to complete in memory while background processes handle persistence. This technique reduces effective latency for user-facing operations but requires careful tuning to balance durability guarantees with throughput.
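
The LRU-based overflow handling described above can be sketched in a few lines of Java by exploiting LinkedHashMap's access ordering; spillToDisk is a hypothetical hook standing in for a write to the SSD/HDD tier:

import java.util.LinkedHashMap;
import java.util.Map;

final class TieredStore<K, V> extends LinkedHashMap<K, V> {
    private final int memoryCapacity;

    TieredStore(int memoryCapacity) {
        super(16, 0.75f, true);          // accessOrder = true keeps entries in LRU order
        this.memoryCapacity = memoryCapacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        if (size() > memoryCapacity) {
            spillToDisk(eldest.getKey(), eldest.getValue()); // demote the coldest entry
            return true;                                     // and evict it from RAM
        }
        return false;
    }

    private void spillToDisk(K key, V value) {
        // hypothetical hook: persist the evicted entry to the disk tier
    }
}

A read that misses in memory would consult the disk tier and re-promote the entry, completing the two-level hierarchy.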

Caching Mechanisms

In-memory databases often serve as front-end caches in larger systems, acting as a high-speed layer for transient data that accelerates access while deferring persistence to backing stores. A prominent example is Memcached, a distributed in-memory key-value store designed for simple caching scenarios, where applications implement lazy-loading patterns by fetching data from the database only on cache misses and write-through patterns by updating both the cache and the underlying store synchronously during writes. These architectures prioritize low-latency reads for read-heavy workloads, such as session storage or query result caching, by keeping frequently accessed data in RAM without built-in persistence mechanisms.

Cache coherence is maintained through invalidation techniques and expiration strategies to ensure data consistency between the in-memory layer and the backing store. Time-to-live (TTL) expiration automatically removes stale entries after a predefined duration, typically set in seconds or minutes, preventing indefinite storage of outdated data and balancing freshness with performance. Cache-aside strategies, where the application explicitly manages loading and updating the cache independently of the database, further support coherence by allowing manual invalidation on writes to the primary store, thus avoiding propagation delays in distributed environments (see the sketch below).

Advanced caching features in in-memory databases include write-back mechanisms, which defer updates to the backing store by first recording changes in memory and batching them for periodic flushes, thereby reducing immediate backend load through coalesced I/O operations. This approach can significantly alleviate pressure on disk-based systems by amortizing writes, with benchmarks showing reductions of up to 90% in high-throughput scenarios. Eviction policies manage memory constraints by selectively removing the least valuable entries, minimizing overhead from frequent replacements.

Key performance metrics for these caching mechanisms emphasize high hit rates, ideally exceeding 95%, to maximize the benefits of in-memory access speeds over disk I/O, as lower rates indicate inefficient utilization and increased backend queries. Eviction overhead is mitigated by algorithms like the Adaptive Replacement Cache (ARC), which self-tunes by balancing recency and frequency of access to achieve superior hit ratios compared to traditional LRU policies, with low implementation overhead suitable for dynamic workloads.
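
The cache-aside pattern with TTL expiration can be illustrated with a short Java sketch, where loadFromDatabase and writeToDatabase are hypothetical backing-store calls: reads populate the cache on a miss, writes update the primary store and invalidate the cached copy, and entries lapse after the TTL:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class CacheAside {
    private record Entry(String value, long expiresAt) {}
    private final Map<String, Entry> cache = new ConcurrentHashMap<>();
    private final long ttlMillis;

    CacheAside(long ttlMillis) { this.ttlMillis = ttlMillis; }

    String get(String key) {
        Entry e = cache.get(key);
        if (e != null && e.expiresAt() > System.currentTimeMillis())
            return e.value();                          // cache hit: serve from memory
        String v = loadFromDatabase(key);              // miss or expired: read backing store
        cache.put(key, new Entry(v, System.currentTimeMillis() + ttlMillis));
        return v;
    }

    void put(String key, String value) {
        writeToDatabase(key, value);                   // update the primary store first
        cache.remove(key);                             // then invalidate the stale copy
    }

    private String loadFromDatabase(String key) { return "..."; } // stub backing store
    private void writeToDatabase(String key, String value) {}     // stub backing store
}

Invalidating on write (rather than updating the cache in place) is the simpler coherence choice: the next read repopulates the entry, at the cost of one extra miss.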

Notable Examples

Commercial Products

SAP HANA, introduced in 2010, is a columnar in-memory database designed primarily for advanced analytics and high-speed transaction processing within enterprise resource planning (ERP) environments. It supports standard SQL queries alongside integrated machine learning capabilities, enabling real-time data analysis and predictive modeling directly on operational data. As a market leader in enterprise systems, SAP HANA powers SAP S/4HANA, facilitating seamless integration of business processes with in-memory computing for faster decision-making in large-scale deployments.

Oracle TimesTen, launched in 1996, is a memory-optimized relational database tailored for applications requiring microsecond response times and high throughput. It offers full SQL support and ACID compliance, with key features including asynchronous replication for high availability and low-latency data access. A standout feature is the IMDB Cache option, which allows TimesTen to cache data from backend Oracle Databases, accelerating performance for hybrid workloads without full migration.

Amazon ElastiCache, which added Redis compatibility in 2013, and Amazon MemoryDB, launched in 2021 with durable storage, provide managed in-memory data stores compatible with Redis for low-latency caching and persistent storage. MemoryDB combines the speed of in-memory operation with a multi-AZ transactional log for up to 99.999999% durability, supporting use cases like real-time analytics and session management, while ElastiCache focuses on scalable caching for high-throughput applications without the same persistence guarantees.

VoltDB, first released in 2010, represents a NewSQL approach to in-memory databases, optimized for high-velocity online transaction processing (OLTP) with a focus on distributed, scalable architectures. Its design partitions data across single-threaded execution engines, eliminating locks and latches to achieve deterministic performance for simple, high-frequency transactions. VoltDB supports SQL standards and is particularly suited for telecommunications and financial applications requiring real-time analytics.

Microsoft SQL Server In-Memory OLTP, introduced in 2014 under the Hekaton engine, provides lock-free transaction processing for memory-optimized tables integrated within the standard on-disk SQL Server environment. It leverages compiled stored procedures and hash and range indexes to deliver up to 30x faster performance for OLTP workloads compared to traditional disk-based operations, while maintaining full ACID properties. This feature enables hybrid use cases where in-memory tables coexist with disk-based ones for seamless application development.

The commercial in-memory database market has seen significant growth since 2020, driven by demands for real-time analytics and cloud-native deployments, with global revenue expanding from approximately USD 4.16 billion in 2019 to USD 7.1 billion in 2025. Post-2020 updates in leading products, such as enhanced partitioning in SAP HANA Cloud and Kubernetes support in TimesTen, reflect adaptations to cloud and distributed environments, contributing to a projected compound annual growth rate (CAGR) of around 14% through 2030. Adoption among large enterprises has accelerated.

Open-Source Solutions

Open-source in-memory databases provide flexible, community-driven alternatives for high-performance data management, enabling developers to customize and extend systems without licensing costs. These solutions emphasize performance, scalability, and compatibility with modern infrastructure, fostering innovation through collaborative development.

Redis, first released in 2009, is an open-source in-memory key-value store that supports advanced data structures such as strings, hashes, lists, sets, and sorted sets. It includes pub/sub messaging for inter-process communication and Lua scripting for executing custom server-side logic, enhancing its utility as both a cache and a database. Redis also features clustering for horizontal scaling across nodes and persistence options like RDB snapshots for point-in-time backups and the Append-Only File (AOF) for durable operation logging, balancing speed with data reliability. Its lightweight architecture, written in C, makes it suitable for caching, session management, and real-time analytics in web applications.

Apache Ignite, introduced in 2014 as an Apache Software Foundation project, is a distributed database and in-memory data grid with SQL support via ANSI-99 compliant queries. Its multi-tier architecture allows seamless scaling across memory and disk storage, enabling pure in-memory operation or persistence with a single configuration change. Ignite supports joins, aggregations, and indexing, alongside near real-time processing through continuous queries that detect and react to data changes using languages like Java and C#. This design facilitates high-throughput applications in finance and telecommunications, where low-latency queries and transactions are essential.

Hazelcast, launched in 2008, operates as an open-source in-memory data grid (IMDG) that pools RAM across clustered nodes for shared data access and processing. Its architecture supports distributed caching with sub-millisecond latencies and in-memory computing for parallel execution of tasks such as distributed queries and aggregations. Key features include WAN replication for cross-site data synchronization and disaster recovery, ensuring availability in geographically distributed environments. Hazelcast's cloud-native design integrates with Kubernetes for elastic scaling, making it ideal for caching and real-time event processing in sectors like banking and payments.

DragonflyDB, released in 2022, serves as an open-source, Redis-compatible in-memory data store optimized for multi-core CPUs and modern hardware. Its multi-threaded, shared-nothing architecture enables up to 25 times higher throughput than traditional single-threaded designs, achieving over 3.9 million queries per second on standard hardware. Fully compatible with Redis and Memcached APIs, it requires no application changes while improving efficiency through asynchronous operations and reduced memory overhead. DragonflyDB targets high-scale caching and queuing workloads, offering cost savings of up to 80% in resource usage for AI and other data-intensive applications.

These open-source projects demonstrate strong community engagement, with Redis ranking as the most popular database for AI agent data storage in the 2025 Stack Overflow Developer Survey, showing an 8% year-over-year usage increase. Apache Ignite and Hazelcast maintain active ecosystems for distributed computing, while DragonflyDB had amassed 26,000 GitHub stars by late 2025, reflecting rapid adoption among developers seeking performant Redis alternatives. Collectively, their GitHub repositories attract tens of thousands of stars and forks, underscoring widespread use in production environments for scalable, low-latency data handling.
