Lists of databases
Lists of databases are directories or catalogs that systematically compile and organize information about various databases, particularly those used in academic, scientific, and professional research, enabling users to identify and access structured collections of data, articles, and resources across diverse fields.[1] These compilations typically include hundreds of entries, such as those maintained by university libraries, and serve as gateways to scholarly materials like peer-reviewed journals, datasets, and multimedia content.[2]
The primary purpose of lists of databases is to streamline the research process by guiding users toward credible and relevant sources, reducing the time spent navigating vast information landscapes.[3] In rigorous applications like systematic reviews, selecting databases from these lists is crucial for achieving comprehensive coverage; for example, combining databases such as MEDLINE, Embase, Web of Science, and Google Scholar can retrieve up to 98.3% of relevant references, highlighting how incomplete selections may miss over 5% of key literature in 60% of published reviews.[4] By prioritizing high-quality, peer-reviewed, and updated resources, these lists ensure the reliability and efficiency of information retrieval in multidisciplinary contexts.[1]
Such lists are often organized by categories including subject areas (e.g., humanities, sciences, medicine), content types (e.g., article indexes, statistical datasets, multimedia archives), and access methods (e.g., subscription-based or open-access).[2] Institutions like universities provide filterable A-Z directories to accommodate both broad exploratory searches and targeted inquiries, often restricting remote access to affiliates while promoting on-site or proxy usage.[1] This structured approach underscores their role in fostering informed decision-making and advancing knowledge discovery in an era of exponential data growth.[4]
Overview
Definition and Scope
A database is an organized collection of structured or semi-structured data, typically stored electronically in a computer system for efficient retrieval and manipulation.[5] This encompasses everything from tabular records in business applications to complex datasets in research environments. In contrast, a database management system (DBMS) is the software that enables users to define, create, maintain, and control access to the database, providing mechanisms for data security, integrity, and querying.[6] The key distinction lies in their roles: the database is the data repository itself, while the DBMS is the intermediary tool for interaction and administration.
Lists of databases are curated compilations of notable database systems or repositories, organized to aid selection and comparison across diverse criteria. These lists typically categorize entries by underlying model, distinguishing relational databases (which store data in tables with predefined schemas and support ACID transactions) from non-relational (NoSQL) ones that prioritize flexibility, scalability, and handling of unstructured data through models like document, key-value, or graph stores.[7] Entries are further segmented by application domain, such as scientific databases for genomic or bibliographic data versus commercial ones designed for enterprise transactions and financial records, and by accessibility, contrasting open-source options backed by community-driven development with proprietary systems offering vendor-supported features.[8] Such categorizations reflect the broad applicability of databases, from research to business, without attempting comprehensive catalogs that would include every minor implementation. The lists favor conceptual overviews of prominent examples, highlighting influential DBMS such as those powering major platforms rather than obscure variants.
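As a rough illustration of how a list might apply these categories, the following Python sketch stores a handful of entries with a model and a licensing attribute, then filters and alphabetizes them. The `CatalogEntry` class, its field names, and the `filter_entries` helper are hypothetical and not taken from any particular directory; the five entries are a small sample, not a complete catalog.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """One row of a hypothetical database list."""
    name: str
    model: str      # e.g. "relational", "wide-column"
    licensing: str  # "open-source" or "proprietary"


# Small illustrative sample; a real list would hold many more entries.
CATALOG = [
    CatalogEntry("PostgreSQL", "relational", "open-source"),
    CatalogEntry("Oracle Database", "relational", "proprietary"),
    CatalogEntry("MySQL", "relational", "open-source"),
    CatalogEntry("Microsoft SQL Server", "relational", "proprietary"),
    CatalogEntry("Apache Cassandra", "wide-column", "open-source"),
]


def filter_entries(catalog, model=None, licensing=None):
    """Return entries matching the given criteria, sorted alphabetically by name.

    A criterion left as None is not applied.
    """
    matches = [
        entry for entry in catalog
        if (model is None or entry.model == model)
        and (licensing is None or entry.licensing == licensing)
    ]
    return sorted(matches, key=lambda entry: entry.name.casefold())


if __name__ == "__main__":
    for entry in filter_entries(CATALOG, model="relational", licensing="open-source"):
        print(entry.name)
```

Keeping each category as a plain attribute means a new criterion, such as application domain, could be added as another field without changing the filtering logic much.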
Database models have evolved from the 1960s onward, progressing through hierarchical and network structures to modern relational and NoSQL paradigms that underpin current list organizations.[9] Common presentation formats enhance usability: alphabetical indexes facilitate rapid reference to systems by name, while comparison tables assess attributes such as scalability for high-volume workloads, supported query languages (e.g., SQL for relational systems), and integration capabilities.[10] This approach ensures lists remain practical resources for developers, researchers, and organizations evaluating database solutions.[11]
Historical Development
The development of databases began in the 1960s with hierarchical and network models designed to manage complex data structures for large-scale applications. IBM's Information Management System (IMS), developed in 1966 and first released in 1967, was one of the first hierarchical database management systems, initially created to support NASA's Apollo space program by handling vast inventories and bill-of-materials data.[12] Concurrently, the Conference on Data Systems Languages (CODASYL) introduced the network model in 1969 through its Database Task Group report, which specified a schema for interconnected record types to enable more flexible data relationships beyond strict hierarchies.[13]
The 1970s marked a pivotal shift toward the relational model, fundamentally altering database design and necessitating new ways to catalog systems. In 1970, Edgar F. Codd published his seminal paper "A Relational Model of Data for Large Shared Data Banks," proposing data organization into tables with rows and columns, relational algebra for querying, and normalization to reduce redundancy, concepts that became foundational for modern databases.[14] This innovation addressed limitations in hierarchical and network models, spurring the creation of relational database management systems (RDBMS) and highlighting the need for comparative lists as options proliferated.
By the 1980s and 1990s, the rise of Structured Query Language (SQL) and commercial RDBMS solidified relational databases as the dominant paradigm, further expanding the ecosystem. IBM developed the first SQL prototype in 1974 as part of its System R project to provide a declarative interface for relational data manipulation.[15] SQL was standardized by the American National Standards Institute (ANSI) in 1986 as SQL-86, enabling interoperability across systems.[15] Commercial implementations followed, including Oracle Version 2 in 1979, the first SQL-based RDBMS for production use, and MySQL's initial release in May 1995, which popularized open-source relational databases.[16][17]
From the 2000s onward, the emergence of NoSQL databases addressed big data challenges, leading to diverse models and domain-specific catalogs amid the internet and open data movements. Google's Bigtable, detailed in a 2006 OSDI paper, introduced a distributed, column-oriented storage system for handling petabyte-scale structured data, influencing subsequent NoSQL implementations.[18] Apache HBase, released in 2008 as part of the Hadoop ecosystem, was directly modeled on Bigtable to provide scalable, real-time read/write access on commodity hardware.[19] Similarly, Apache Cassandra, initially developed at Facebook in 2008, drew inspiration from Bigtable's data model for its wide-column storage while incorporating Amazon Dynamo's distribution for high availability.[20] The explosion of DBMS options, spurred by XML databases, key-value stores, and cloud-native systems in the early 2000s, drove the creation of structured lists, evolving from pre-internet print surveys like CODASYL reports to online directories and rankings in the 2000s and 2010s, such as DB-Engines, launched in 2012 to track popularity and trends.[21][22]
Lists by Database Model
Relational Database Management Systems
Lists of relational database management systems (RDBMS) are available through various directories and rankings, such as the DB-Engines Ranking, which measures popularity based on mentions in technical discussions, search engine queries, and job postings, updated monthly as of 2025.[23] Other catalogs include Wikipedia's comprehensive list of RDBMS software and vendor comparisons from sources like Gartner.[24] These lists organize RDBMS by criteria like market share, features, and deployment type, with relational systems enforcing ACID properties and using SQL for queries. RDBMS lists highlight systems suitable for applications requiring data integrity, such as enterprise transactions. Notable examples from these lists include:
- Oracle Database: First commercial SQL implementation released in 1979, used for high-volume enterprise deployments.[25]
- MySQL: Open-source RDBMS first released in 1995, popular for web applications due to its lightweight architecture.[26]
- PostgreSQL: Evolved from the POSTGRES project, renamed in 1996, supports advanced features like JSON for semi-structured data.[27]
- Microsoft SQL Server: Launched in 1989, integrates with Windows for business intelligence and analytics.[28]
- IBM Db2: A long-standing RDBMS for mainframe and cloud environments, emphasizing scalability.
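The ACID guarantees mentioned above can be seen in a small transaction. The sketch below uses SQLite through Python's standard `sqlite3` module purely as a convenient stand-in; SQLite is not one of the systems listed here, and the `accounts` table and the `open_demo_db`, `transfer`, and `balances` helpers are hypothetical names chosen for illustration. It shows atomicity: when the second statement of a transfer violates a constraint, the first statement is rolled back as well.

```python
import sqlite3


def open_demo_db():
    """Create an in-memory database with two accounts (illustrative data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        " id INTEGER PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO accounts (id, balance) VALUES (?, ?)",
        [(1, 100), (2, 50)],
    )
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one transaction; return True if committed."""
    try:
        # Used as a context manager, the connection commits on success
        # and rolls back if an exception escapes the block.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            # Fails the CHECK constraint if src would go negative.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return a {account_id: balance} mapping."""
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id").fetchall())


if __name__ == "__main__":
    conn = open_demo_db()
    print(transfer(conn, 1, 2, 30), balances(conn))
    print(transfer(conn, 1, 2, 500), balances(conn))
```

The credit is deliberately issued before the debit so that a failed transfer has already modified one row; the unchanged balances afterwards are what show the rollback.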
NoSQL Database Management Systems
Directories of NoSQL database management systems (DBMS) emphasize flexibility for unstructured data and scalability, categorized by types like key-value, document, wide-column, and graph. Prominent lists include the DB-Engines Ranking for NoSQL systems and AWS comparisons of NoSQL types.[30][31] These compilations evaluate based on performance, adoption, and suitability for distributed environments, contrasting BASE consistency with relational ACID models. Notable examples from NoSQL lists include:
- MongoDB: Document-oriented, released in 2009, supports dynamic schemas for content management.[32]
- Redis: In-memory key-value store introduced in 2009, used for caching and real-time processing with sub-millisecond latency.[33]
- Apache Cassandra: Wide-column store launched in 2008, handles high-write throughput for time-series data, as used by Netflix for billions of daily events.[34][35]
- Neo4j: Graph database from 2007, employs Cypher for querying relationships in social networks and recommendations.[36]
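To make the difference between these categories concrete, the sketch below models three of them (key-value, document, and graph) with plain Python data structures. It does not use the client API of any system listed above; the variable names, sample records, and the `find_documents` and `neighbors` helpers are assumptions made for illustration, and the wide-column model is left out for brevity.

```python
import json

# Key-value: the store sees only an opaque value addressed by one key.
kv_store = {"user:1": json.dumps({"name": "Ada"})}

# Document: each record is a nested structure, and records in the same
# collection need not share the same fields (dynamic schema).
documents = [
    {"_id": 1, "name": "Ada", "tags": ["math"]},
    {"_id": 2, "name": "Lin", "email": "lin@example.com"},
]

# Graph: relationships are first-class, stored as (source, type, target).
edges = [
    ("Ada", "FOLLOWS", "Lin"),
    ("Lin", "FOLLOWS", "Grace"),
    ("Ada", "FOLLOWS", "Grace"),
]


def find_documents(collection, **criteria):
    """Return documents whose top-level fields equal every given criterion."""
    return [
        doc for doc in collection
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


def neighbors(edge_list, node, relation):
    """Return the sorted targets reachable from `node` over `relation` edges."""
    return sorted(
        target for source, rel, target in edge_list
        if source == node and rel == relation
    )


if __name__ == "__main__":
    print(json.loads(kv_store["user:1"]))
    print(find_documents(documents, name="Lin"))
    print(neighbors(edges, "Ada", "FOLLOWS"))
```

The key-value lookup must deserialize the whole value before reading any field, whereas the document and graph forms can be queried by content or by relationship, which is the trade-off these categories are meant to capture.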
Other Database Models
Lists of other database models cover hierarchical, network, object-oriented, and specialized variants like time-series and spatial, often found in academic catalogs, DB-Engines specialized rankings, and extension lists for relational systems.[30] These directories highlight niche applications beyond relational and NoSQL, including legacy systems in finance and emerging multimodel engines. Notable examples from these lists include:
- IBM IMS: Hierarchical DBMS first deployed in 1968, used for mainframe transaction processing in banking.[38]
- IDMS: Network model implemented in 1973 per CODASYL standards, for complex linkages in inventory systems.
- ObjectDB: Object-oriented DBMS from 2007, stores Java objects for CAD and multimedia.[39]
- InfluxDB: Time-series database launched in 2013, optimized for metrics and sensor data ingestion.[40]
- PostGIS: Spatial extension for PostgreSQL since 2001, supports geospatial queries for maps and routes.[41]
- ArangoDB: Multimodel database introduced in 2012, combines document, graph, and key-value for unified queries.[42]
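The hierarchical model at the head of this list can be sketched as a tree in which every record has exactly one parent and is reached by navigating down from the root. The Python example below is a minimal illustration only: the `Segment` class, the `find_path` helper, and the bill-of-materials data are hypothetical and do not reflect how IMS or any other listed system is implemented.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """A record in a hierarchy; each segment belongs to exactly one parent."""
    name: str
    quantity: int = 1
    children: list["Segment"] = field(default_factory=list)


def find_path(root, target, path=()):
    """Depth-first search from the root; return the path of names to `target`.

    Returns None when no segment with that name exists.
    """
    path = path + (root.name,)
    if root.name == target:
        return path
    for child in root.children:
        found = find_path(child, target, path)
        if found is not None:
            return found
    return None


# Illustrative bill-of-materials hierarchy.
bill_of_materials = Segment("vehicle", children=[
    Segment("engine", children=[Segment("piston", quantity=4)]),
    Segment("chassis"),
])

if __name__ == "__main__":
    print(find_path(bill_of_materials, "piston"))
```

Because a segment can only be reached through its parent, a part shared by two assemblies would have to be duplicated, which is the kind of limitation the network and relational models were designed to address.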