
Query language

A query language is a specialized language designed to make requests (queries) into databases and information systems for the purpose of retrieving, manipulating, and managing data. These languages enable users to interact with structured or semi-structured data stores by specifying selection criteria, often in a declarative manner that describes what data is needed rather than how to retrieve it.

The development of query languages traces back to the 1970s, emerging from foundational work in relational database theory. In 1970, IBM researcher Edgar F. Codd published a seminal paper introducing the relational model, which laid the groundwork for systematic data querying. SQL (Structured Query Language), the most widely adopted query language, was initially developed by IBM in the early 1970s as SEQUEL (Structured English QUEry Language) to support relational databases like System R. In 1979, Oracle (then Relational Software, Inc.) released the first commercial SQL-based relational database management system, helping to establish SQL as the dominant language for data operations. Over the decades, SQL evolved through ANSI and ISO standards (e.g., SQL-86, SQL-92), incorporating features for data definition, manipulation, and control, while alternatives like QUEL appeared in the 1970s but were eventually overshadowed by SQL's dominance.

Query languages encompass various types tailored to different data models and use cases, broadly categorized as declarative (specifying desired results) or imperative (detailing retrieval steps). Within SQL, the primary sublanguages are the Data Query Language (DQL) for retrieving data, the Data Manipulation Language (DML) for modifying it, and the Data Definition Language (DDL) for schema management. Beyond relational systems, notable examples include query languages for graph databases (e.g., Cypher), GraphQL for API-driven flexible queries, SPARQL for RDF data, and domain-specific ones like SPL for machine data analysis. Today, query languages remain essential across data-intensive systems, including AI applications.

Definition and Purpose

Core Definition

A query language is a specialized formal language used to retrieve, manipulate, and manage data stored in databases or information systems, abstracting away the precise algorithmic steps required for execution. This formalism enables users to define queries as functions that take a database or set of facts as input and output a relevant subset or derived facts, focusing on the logical specification of data needs rather than implementation details.

Central to query languages is their declarative nature, which allows users to specify what is desired (such as particular records meeting certain criteria) while the underlying system determines how to efficiently compute and deliver it. This contrasts with procedural approaches, promoting higher-level abstractions that enhance usability and enable optimization by the query engine.

Query languages typically encompass both retrieval and manipulation operations; for example, in SQL, the Data Query Language (DQL) subset handles read-centric activities like extraction and analysis via SELECT statements, while the Data Manipulation Language (DML) subset supports modifications such as insertions and updates via INSERT, UPDATE, and DELETE. This integrated focus facilitates efficient data exploration and management in large-scale systems.

At their core, query languages comprise query expressions that articulate the intended output, operators for tasks like selection (filtering records) and projection (specifying attributes), and result sets that encapsulate the processed data in a structured format. These elements collectively form a syntax and semantics tailored for precise data interaction.
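The split between DML and DQL, and between selection and projection, can be made concrete with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for any SQL engine; the employees table and its rows are hypothetical and exist only for illustration.

```python
import sqlite3

# Hypothetical in-memory table, created only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER)")

# DML: statements that modify stored data.
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Ada", "Sales", 5200), ("Grace", "Engineering", 6100), ("Linus", "Sales", 4800)],
)
conn.execute("UPDATE employees SET salary = salary + 100 WHERE name = 'Linus'")
conn.execute("DELETE FROM employees WHERE name = 'Grace'")

# DQL: WHERE performs selection (which rows), SELECT performs projection (which columns).
result_set = conn.execute(
    "SELECT name, salary FROM employees WHERE department = 'Sales' ORDER BY name"
).fetchall()
print(result_set)  # [('Ada', 5200), ('Linus', 4900)]
```

The query states only which rows and columns are wanted; how the engine scans or indexes the table is left to the system.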

Applications in Data Systems

Query languages serve as the foundational interface for interacting with data in relational database management systems (RDBMS), where languages like SQL enable users to retrieve, manipulate, and manage structured data stored in tables. In NoSQL databases, query languages such as Cypher for graph databases or MongoDB's query API support flexible data models, including document, key-value, and column-family stores, facilitating operations on unstructured or semi-structured data. Search engines employ query languages based on keyword, Boolean, and natural language constructs to perform information retrieval from vast textual corpora, powering ranked result delivery in systems like web search platforms. Knowledge graphs utilize specialized query languages like SPARQL for RDF-based structures or Cypher for property graphs, allowing traversal and pattern matching across interconnected entities to support semantic querying.

In business intelligence tools, query languages play a pivotal role in data retrieval for analytics, reporting, and decision-making by extracting insights from operational databases and data warehouses. For instance, SQL-based queries integrate with platforms like Tableau or Power BI to aggregate metrics, generate dashboards, and feed predictive analytics that inform strategic choices in organizations. This capability streamlines the transformation of raw data into actionable reports, enhancing efficiency in sectors such as healthcare.

Query languages also integrate with APIs for web services, allowing SQL extensions to combine data from multiple relational sources and external endpoints in a unified query environment. In big data platforms, they extend to distributed systems like Hadoop via HiveQL for SQL-like querying on HDFS-stored data, and to cloud services such as AWS Athena, which uses standard SQL to analyze petabyte-scale datasets in S3 without infrastructure management.

These languages offer benefits including efficient processing of large datasets through optimized execution plans and declarative paradigms that abstract low-level details, focusing instead on what data to retrieve. Additionally, they support ad-hoc querying, enabling on-the-fly analysis without predefined queries or reports, which is essential for exploratory analysis and rapid prototyping in dynamic environments.

Historical Development

Origins in Relational Databases

The origins of query languages are deeply rooted in the relational model of data, proposed by Edgar F. Codd in his seminal 1970 paper, which formalized databases as collections of relations (tables) composed of tuples (rows) and attributes (columns), emphasizing data independence and logical structure over physical storage. This model laid the theoretical groundwork for querying by introducing relational algebra as a procedural foundation for data manipulation, but it was the non-procedural relational calculus that served as the key precursor to declarative query languages. Codd's 1972 work on relational completeness defined the tuple relational calculus (which selects tuples satisfying predicates); the domain relational calculus (which ranges variables over attribute domains) followed later in the decade. These calculi provided a formal, logic-based means to express queries without specifying retrieval steps, matching the expressive power of relational algebra and influencing the design of practical sublanguages for database interaction.

Building on this foundation, early practical query languages emerged within IBM's research efforts to implement the relational model. In the early 1970s, Donald D. Chamberlin and Raymond F. Boyce introduced SQUARE (Specifying Queries as Relational Expressions), a sublanguage designed for querying relational databases, which directly translated relational operations into a textual form but relied heavily on mathematical notation, subscripts, and complex expressions that proved cumbersome for non-experts. To address these usability challenges, the same researchers reworked SQUARE into SEQUEL (Structured English Query Language) in 1974, adopting a more readable, English-like syntax while retaining declarative semantics inspired by the relational calculi, and integrating it as the query interface for IBM's System R prototype, a pioneering relational DBMS developed to demonstrate Codd's concepts in a working environment. By the late 1970s, SEQUEL had been renamed SQL (Structured Query Language) because of a trademark conflict over the SEQUEL name, which was held by an unrelated company; the shortened name preserved the language's core features.

This evolution marked the shift from research prototypes to commercial viability, with Relational Software, Inc. (later Oracle Corporation) releasing the first commercial implementation of SQL in Oracle Version 2 in 1979, enabling structured queries on relational data in a multi-user setting and setting the stage for widespread adoption.

Evolution and Standardization

The evolution of query languages, building on early relational concepts, accelerated in the 1980s with the formal standardization of SQL as a core query mechanism for relational databases. In 1986, the American National Standards Institute (ANSI) approved the first SQL standard, designated ANSI X3.135-1986, which defined essential syntax for data definition, manipulation, and control operations, including SELECT, INSERT, UPDATE, and DELETE statements. This standard was adopted internationally by the International Organization for Standardization (ISO) in 1987 as ISO/IEC 9075:1987, promoting portability and consistency across database systems.

The 1990s marked significant expansions to the SQL standard, enhancing its expressiveness and applicability. The SQL-92 standard (ISO/IEC 9075:1992), also known as SQL2, introduced features such as outer joins for handling unmatched rows in queries, improved support for views and schemas, and new data types like DATE, TIME, and TIMESTAMP, while defining conformance levels (Entry, Intermediate, Full) to guide implementations. Building on this, SQL:1999 (ISO/IEC 9075:1999), or SQL3, incorporated object-relational extensions including user-defined types, triggers, and recursive queries via common table expressions (CTEs), allowing complex hierarchical data retrieval without procedural code.

Subsequent revisions continued to evolve SQL for modern data needs. SQL:2003 added support for XML data querying and manipulation and standardized window functions. Later versions, including SQL:2008 and SQL:2011, enhanced analytical processing with improved window functions and temporal data handling. SQL:2016 introduced functions for working with JSON data. The most recent, SQL:2023 (ISO/IEC 9075:2023), further expanded JSON capabilities and added property graph queries (SQL/PGQ).

As query languages matured, domain-specific extensions emerged to address limitations in handling non-relational data and procedural logic, alongside alternatives to SQL. For instance, QUEL (Query Language), developed in the 1970s for the Ingres database system at UC Berkeley and based on relational calculus, offered a more mathematical syntax and was used commercially in the 1980s but was eventually supplanted by SQL's growing dominance and English-like readability. For XML data, the W3C standardized XQuery 1.0 in 2007 as a functional query language for retrieving and transforming XML documents, complementing SQL by supporting path expressions and FLWOR (For-Let-Where-Order-Return) constructs. Concurrently, integration with procedural elements gained traction; for instance, Oracle introduced PL/SQL with Oracle 6 in 1988 and added stored procedures and triggers in Oracle7 (1992), extending SQL with blocks, variables, loops, and exception handling for server-side programming.

Database vendors further influenced standardization through proprietary evolutions that extended core SQL while aiming for partial compliance. Microsoft's Transact-SQL (T-SQL), originating from the Sybase-Microsoft partnership that produced SQL Server in 1989 and developed independently by Microsoft after 1993, added procedural constructs like cursors and error handling, alongside analytical extensions such as window functions in later versions. Similarly, Oracle's PL/SQL evolved as a robust procedural layer, enabling stored procedures and triggers that influenced subsequent ISO standards on persistent stored modules. These developments balanced innovation with compatibility, shaping query languages into versatile tools for enterprise data management.

Recent Advancements

The 2010s marked a significant shift in query languages with the rise of graph databases, addressing the limitations of relational models in handling interconnected data. Cypher, developed by Neo4j engineers in 2011, emerged as a declarative query language specifically designed for property graph databases, enabling pattern matching and traversal operations that are intuitive for graph structures. This innovation laid the groundwork for broader adoption of graph querying, culminating in the standardization of GQL as ISO/IEC 39075 in April 2024, which defines operations for creating, querying, and maintaining property graphs in a vendor-neutral manner. GQL draws heavily from Cypher's syntax while incorporating elements from other graph languages, promoting interoperability across graph database systems.

Parallel to graph advancements, NoSQL databases prompted adaptations in query paradigms to support flexible, schema-less data models. The MongoDB Query Language (MQL), integral to MongoDB since its initial release in 2009, uses JSON-like documents for querying, allowing operations like aggregation pipelines without rigid schemas. Similarly, the Cassandra Query Language (CQL), introduced in 2011 for Apache Cassandra, mimics SQL syntax to query wide-column stores, facilitating distributed data manipulation across clusters with commands for keyspace management and conditional updates. These adaptations enabled scalable querying in non-relational environments, influencing hybrid systems that blend NoSQL flexibility with familiar SQL-like interfaces.

API-centric query languages further evolved data access in web architectures. GraphQL, open-sourced by Facebook in 2015, introduced a flexible querying mechanism where clients specify exact data requirements via a single endpoint, reducing the over-fetching and under-fetching common in REST APIs. This approach, now widely adopted by platforms like GitHub and Shopify, supports introspection and type checking through schema definitions, streamlining client-server interactions in distributed applications.

Integrations with artificial intelligence have transformed query generation by bridging natural language and structured queries. From 2023 onward, large language model (LLM)-based tools have enabled natural language interfaces for automatic SQL generation, with examples like Uber's QueryGPT (2024) using LLMs and vector search to convert English questions into executable database queries, improving accessibility for non-experts. Complementary innovations include PRQL, a pipelined relational query language developed in the early 2020s, which compiles to SQL and emphasizes readable, chainable expressions over nested subqueries to enhance maintainability in analytical workflows.

Cloud-native systems have advanced distributed query capabilities through SQL extensions tailored for massive scale. Snowflake, a cloud data platform launched in 2014, has iteratively extended SQL in the 2020s with features like dynamic table functions and vector search support, optimizing queries across distributed warehouses for analytics on petabyte-scale data without traditional indexing overhead. These enhancements facilitate federated querying across environments, underscoring the trend toward unified, elastic query processing.

Key Characteristics

Declarative vs. Procedural Paradigms

Query languages predominantly adopt the declarative paradigm, where users specify the desired results (what data to retrieve or manipulate) without dictating the method of execution. The underlying database management system (DBMS) optimizer then determines the execution plan, including choices like join order, index usage, and parallelization, based on system statistics and constraints. This paradigm is exemplified by set-based operations inspired by relational algebra, such as selections, projections, and unions, which treat data as mathematical sets rather than sequential records, enabling concise expressions of complex queries.

In contrast, the procedural paradigm requires explicit step-by-step instructions for accessing and processing data, akin to imperative programming, where control flow and operations are fully prescribed by the user. Although less prevalent in pure query languages due to their complexity and reduced flexibility, procedural elements persist in extensions like SQL cursors, which facilitate iterative, row-by-row traversal of result sets for tasks requiring ordered processing or dynamic decision-making. These mechanisms allow fine-grained control but often lead to less efficient, harder-to-optimize code compared to set-based alternatives.

The dominance of the declarative paradigm stems from its key advantages: enhanced portability, as standard-conforming queries remain largely valid across DBMS implementations regardless of underlying storage or hardware; better performance optimization, where the engine automatically generates plans that in most scenarios match or outperform manually tuned procedural equivalents; and clear separation of concerns, isolating logical query intent from physical execution details to improve maintainability and reduce developer burden.
Theoretically, declarative query languages are grounded in relational calculus, a non-procedural formalism that defines queries through logical predicates on relations, offering equivalent expressive power to the procedural relational algebra without specifying operational sequences. Relational algebra, introduced by E.F. Codd, serves as the procedural foundation with its explicit operators for data manipulation, mirroring the step-wise control of imperative loops in general programming languages like C or Java. This duality, formalized in Codd's work on relational completeness, underscores why declarative approaches prevail in modern database systems for their balance of power and abstraction.
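The contrast between the two paradigms can be illustrated with a minimal sketch, using Python's sqlite3 module as a stand-in engine and a hypothetical orders table. The first computation states the desired result in SQL; the second spells out the row-by-row steps, much as a cursor loop would.

```python
import sqlite3

# Hypothetical data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("alice", 30), ("bob", 20), ("alice", 50), ("bob", 5)],
)

# Declarative: describe the result; the engine decides how to compute it.
declarative = dict(
    conn.execute("SELECT customer, SUM(amount) FROM orders GROUP BY customer")
)

# Procedural: fetch every row and accumulate the totals step by step.
procedural = {}
for customer, amount in conn.execute("SELECT customer, amount FROM orders"):
    procedural[customer] = procedural.get(customer, 0) + amount

print(declarative == procedural)  # True
```

Both produce the same totals, but only the declarative form leaves the engine free to choose an access path, use an index, or parallelize the aggregation.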

Syntax and Semantic Elements

Query languages are constructed using a formal syntax that includes predefined keywords, operators, and clauses to articulate data selection, filtering, and manipulation instructions. Keywords such as SELECT and FROM delineate the projection of desired attributes and the specification of data sources, respectively, forming the foundational structure of most queries. Logical operators like AND and OR enable the combination of conditions, while comparison operators including = and > facilitate precise filtering based on relational predicates. Clauses such as WHERE for conditional filtering and GROUP BY for aggregation organize the query logic, ensuring systematic processing of input data.

Semantically, query languages define mappings from underlying data models, such as relations or graphs, to output result sets, where the meaning of a query determines the exact transformation applied. In the relational model, these semantics embody closure properties, whereby algebraic operations on relations yield relations, so the output of one operation can serve as the input to another. Expressiveness is a key semantic attribute, exemplified by the relational completeness of relational algebra, which captures all queries formulable in relational calculus, ensuring no loss of representational power.

Common patterns in query languages include pattern matching for identifying structural similarities in data, joins for integrating data across multiple relations or entities, and aggregation functions such as SUM and AVG for condensing datasets into summary metrics. Pattern matching employs symbolic representations, often using wildcards or regular expressions, to locate conforming elements within records or nodes. Joins, typically categorized as inner, outer, or equi-joins, merge datasets based on shared attributes, enabling relational composition without data duplication. Aggregation functions apply over grouped data to compute scalar values, supporting analytical operations like totals or averages in result sets.
Challenges in query language design encompass ambiguity in natural language interfaces, where polysemous terms or contextual nuances can yield multiple valid interpretations, thus hindering precise query translation. In structured queries, type safety poses another hurdle, as mismatches between operand types may lead to runtime failures unless enforced by static checks or schema-aware compilation.
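The closure property and the GROUP BY aggregation pattern described in this section can be sketched together. The example below again assumes Python's sqlite3 module and a hypothetical sales table: the inner query yields a relation of per-region totals, and because that result is itself a relation, the outer query can filter it like any stored table.

```python
import sqlite3

# Hypothetical data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("north", "widget", 10),
        ("north", "gadget", 30),
        ("south", "widget", 25),
        ("south", "gadget", 5),
        ("east", "widget", 7),
    ],
)

# Inner query: aggregate per region (relation in, relation out).
# Outer query: apply a selection to the derived relation.
rows = conn.execute(
    """
    SELECT region, total
    FROM (SELECT region, SUM(amount) AS total
          FROM sales
          GROUP BY region) AS per_region
    WHERE total > 20
    ORDER BY total DESC
    """
).fetchall()
print(rows)  # [('north', 40), ('south', 30)]
```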

Classification by Type

Database Query Languages

Database query languages enable the retrieval, manipulation, and management of structured data within database systems, primarily focusing on relational models where data is organized into tables with predefined schemas. The cornerstone of these languages is SQL (Structured Query Language), a standardized language developed for relational databases to perform create, read, update, and delete (CRUD) operations, with Data Query Language (DQL) components emphasizing efficient read operations such as selecting and filtering data from tables. SQL and its variants, including those in systems like MySQL, PostgreSQL, and Oracle Database, follow the ANSI/ISO standards to varying degrees, allowing developers to express queries declaratively for broadly consistent data interaction across RDBMS platforms.

In non-relational or NoSQL environments, query languages adapt to diverse models while retaining core principles of structured retrieval. Key-value stores, exemplified by Redis, utilize command-based queries like GET, SET, and MGET to access values stored as simple pairs, prioritizing speed for caching and session management. Document-oriented databases, such as MongoDB, employ a JavaScript Object Notation (JSON)-like query syntax to match and aggregate semi-structured documents, supporting operations akin to CRUD through methods like find() and updateOne(). Column-family stores like Cassandra use the Cassandra Query Language (CQL), a SQL-inspired syntax tailored for distributed wide-column data, enabling inserts, selects, and updates across partitioned tables.

Essential features of these query languages include support for ACID (Atomicity, Consistency, Isolation, Durability) compliance to guarantee transaction reliability, particularly in relational systems where SQL enforces integrity during multi-statement operations. Indexing structures, such as B-tree or hash indexes in SQL and secondary indexes in NoSQL variants, accelerate query execution by facilitating rapid lookups and reducing full-table scans. Transactional capabilities allow queries to bundle operations atomically, with transaction control statements in SQL and multi-document transactions in MongoDB ensuring consistency in concurrent environments.

These languages power enterprise applications by underpinning online transaction processing (OLTP) for real-time, high-throughput tasks like order processing and inventory updates, while also supporting online analytical processing (OLAP) for aggregating and analyzing large datasets in business intelligence applications.
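Atomicity, the first of the ACID properties mentioned above, can be sketched with a small funds-transfer example. It assumes Python's sqlite3 module, a hypothetical accounts table, and a hypothetical transfer helper; the connection's context manager commits when the block succeeds and rolls back when it raises.

```python
import sqlite3

# Hypothetical schema: the CHECK constraint forbids negative balances.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

def transfer(src, dst, amount):
    """Move funds atomically; returns False if the transfer was rolled back."""
    try:
        with conn:  # commit on success, roll back on any exception
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances():
    return conn.execute("SELECT id, balance FROM accounts ORDER BY id").fetchall()

failed = transfer(1, 2, 500)   # debit would go negative, so the credit is undone too
after_failed = balances()
succeeded = transfer(1, 2, 30)
after_ok = balances()
print(failed, after_failed, succeeded, after_ok)
```

In the failing call, the credit to account 2 has already been applied when the debit violates the constraint; the rollback discards both statements, so no partial update is visible.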

Information Retrieval Query Languages

Information retrieval (IR) query languages are designed to search and rank documents in large collections of unstructured or semi-structured text, emphasizing probabilistic relevance over exact matches. These languages enable users to express information needs through terms, operators, and modifiers that facilitate retrieval from corpora such as web pages, digital libraries, or enterprise archives. Unlike precise data extraction in structured databases, IR queries prioritize ranking documents by estimated relevance, often using statistical models to handle ambiguity and scale to billions of items.

Boolean queries form the foundational logic in early IR systems, employing operators like AND, OR, and NOT to combine terms for exact set-based retrieval. For instance, a query such as "cat AND dog NOT bird" retrieves documents containing both "cat" and "dog" but excluding "bird," processed efficiently via inverted indexes that map terms to document lists. This model, prominent in early bibliographic retrieval systems of the 1960s and 1970s, provides binary yes/no results without inherent ranking, making it suitable for precise filtering in controlled vocabularies but limited for vague user intents in full-text scenarios.

Full-text and ranked retrieval extend Boolean capabilities by incorporating term weighting and proximity operators to score document relevance. In term-based approaches, queries use free-text keywords weighted by models like TF-IDF (Term Frequency-Inverse Document Frequency), where term frequency measures local importance within a document and inverse document frequency downweights terms that are common across the corpus, enabling ranked lists ordered by cosine similarity or similar metrics. Proximity operators, such as "cat NEAR/5 dog," refine searches by requiring terms within a specified distance, improving precision in phrase-like queries. These elements, central to vector space models pioneered by the SMART retrieval system in the 1960s, power modern search engines by addressing vocabulary mismatches.
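Both retrieval styles can be sketched in a few lines of plain Python. The four-document corpus is hypothetical, tokenization is a naive whitespace split, and the weighting uses raw term frequency times log(N/df), which is only one of several common TF-IDF variants.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus, for illustration only.
docs = {
    1: "the cat chased the dog",
    2: "the dog barked at the bird",
    3: "a cat and a dog watched a bird",
    4: "the cat sat on the mat",
}
tokens = {doc_id: text.split() for doc_id, text in docs.items()}

# Inverted index: term -> set of documents containing it.
index = defaultdict(set)
for doc_id, words in tokens.items():
    for word in words:
        index[word].add(doc_id)

# Boolean retrieval: "cat AND dog NOT bird" as set operations.
boolean_hits = (index["cat"] & index["dog"]) - index["bird"]

def idf(term):
    # Inverse document frequency; terms absent from the corpus get weight 0.
    return math.log(len(docs) / len(index[term])) if term in index else 0.0

def tfidf_vector(words):
    return {term: tf * idf(term) for term, tf in Counter(words).items()}

def cosine(u, v):
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Ranked retrieval: order all documents by similarity to the free-text query.
query = tfidf_vector("cat mat".split())
ranked = sorted(docs, key=lambda d: cosine(query, tfidf_vector(tokens[d])), reverse=True)

print(boolean_hits)  # {1}
print(ranked[0])     # 4
```

The Boolean query returns an unordered set of exact matches, whereas the ranked query orders every document, placing the one that contains the rare term "mat" first.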
Structured elements in query languages allow field-specific searches to target metadata or document sections, enhancing precision in semi-structured collections. For example, queries like title:"quantum physics" restrict matching to titles, while "author:Einstein date:>1900" combines fields for temporal filtering, common in tools like search engines or digital libraries. This approach leverages document schemas without full relational structure, bridging free-text and metadata-driven retrieval.

The evolution of IR query languages has incorporated faceted search and query expansion to better capture user intent and support exploratory navigation. Faceted search presents results with navigable categories (facets) such as date, allowing progressive refinement of queries through selections that intersect with initial terms. Query expansion automatically augments user queries with related terms (via thesauri, co-occurrence analysis, or relevance feedback) to mitigate issues like synonymy or polysemy, as demonstrated in techniques from Rocchio's 1971 method and later surveys showing 7-14% improvements in benchmark tests. These advancements shift IR from rigid logic to interactive, intent-aware paradigms.

Emerging and Specialized Languages

In recent years, query languages for graph data have advanced to handle complex relational structures beyond traditional tabular models. Property graph query languages, such as Cypher and Gremlin, enable traversals that navigate nodes and relationships to uncover patterns in interconnected data, supporting applications like recommendation systems. For semantic web applications, RDF-based languages like SPARQL facilitate querying distributed knowledge graphs by matching triples (subject-predicate-object) across heterogeneous sources, with the SPARQL 1.2 Working Draft (as of November 2025) enhancing federation and update capabilities for large-scale RDF datasets.

The integration of large language models (LLMs) has given rise to natural language-driven query interfaces, allowing users to pose conversational questions that are automatically translated into executable code. Tools like Uber's QueryGPT, launched in 2024, leverage generative AI to convert natural language prompts into SQL queries, improving accessibility for non-technical users. Recent advancements in text-to-SQL, as surveyed in 2025, demonstrate LLMs achieving up to 80% accuracy on benchmark datasets such as Spider by incorporating retrieval-augmented generation (RAG) to refine schema understanding and query synthesis.

Domain-specific query languages address niche data paradigms, optimizing for performance in specialized environments. PromQL, the query language for Prometheus, supports real-time aggregation of time-series metrics using functions like rate() and histogram_quantile() to monitor infrastructure and applications at scale. For AI embeddings in vector databases, query mechanisms often extend SQL with similarity operators (e.g., cosine distance in pgvector) or use dedicated query interfaces for approximate nearest neighbor searches over high-dimensional data. The Graph Query Language (GQL), standardized as ISO/IEC 39075 in 2024, provides a unified declarative syntax for property graphs, enabling path traversals and pattern matching in knowledge graphs while promoting interoperability across vendors.

Emerging trends emphasize hybrid query languages that blend paradigms for polyglot persistence, where systems manage diverse data types within a single query interface. For instance, SQL/PGQ, the property graph extension introduced in SQL:2023, integrates graph traversals with relational joins, allowing unified queries over SQL tables and property graphs to support complex analytics in mixed workloads. This approach reduces data silos, as seen in 2025 hybrid models that combine vector embeddings with graph structures for enhanced retrieval-augmented generation in AI applications.
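The similarity search that vector query operators perform can be illustrated with an exact, brute-force version in plain Python. The three-dimensional embeddings and document names below are hypothetical; production systems use much higher dimensions and approximate indexes rather than a full scan.

```python
import math

# Hypothetical embeddings, for illustration only.
embeddings = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.6, 0.8, 0.0],
    "doc_c": [0.0, 0.0, 1.0],
}

def cosine_distance(u, v):
    """1 - cosine similarity; smaller means more similar."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def nearest(query_vector, k):
    # Exact k-nearest-neighbour search by scanning every stored vector.
    return sorted(embeddings, key=lambda name: cosine_distance(query_vector, embeddings[name]))[:k]

top_two = nearest([0.9, 0.1, 0.0], k=2)
print(top_two)  # ['doc_a', 'doc_b']
```

A SQL extension such as pgvector expresses the same idea declaratively as an ORDER BY on a distance operator with a LIMIT, leaving the choice between exact and approximate search to the engine and its indexes.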

Notable Examples

Structured Query Language (SQL)

Structured Query Language (SQL) is a standardized domain-specific language designed for managing and querying data held in relational database management systems (RDBMS). Originally developed by IBM in the 1970s, it became an ANSI standard in 1986 and an international ISO standard in 1987, enabling declarative expressions for data definition, manipulation, and control. SQL's widespread adoption stems from its simplicity and power in handling structured data through relational models, where data is organized into tables with rows and columns related via keys. As the de facto standard for relational databases, SQL underpins systems like Oracle Database, MySQL, PostgreSQL, and SQL Server, facilitating operations from simple lookups to complex analytical queries.

At its core, SQL syntax revolves around the SELECT-FROM-WHERE structure for querying data. The SELECT clause specifies the columns or expressions to retrieve, the FROM clause identifies the source tables, and the WHERE clause applies filtering conditions to rows. For example, to retrieve the names of employees in a given department, one might use:
SELECT name FROM employees WHERE department = 'Sales';
This basic form supports aggregation with GROUP BY and HAVING for conditional summaries. SQL also includes Data Manipulation Language (DML) statements like INSERT, UPDATE, and DELETE for modifying data, and Data Definition Language (DDL) commands like CREATE TABLE for schema management.

To combine data from multiple tables, SQL employs JOIN operations, which link rows based on related columns. Common types include INNER JOIN, which returns only matching rows from both tables, and LEFT JOIN, which includes all rows from the left table and matching rows from the right, with NULLs for non-matches. An example INNER JOIN on customers and orders:
SELECT customers.name, orders.date 
FROM customers 
INNER JOIN orders ON customers.id = orders.customer_id;
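The difference between INNER JOIN and LEFT JOIN shows up as soon as one customer has no orders. The sketch below runs both forms with Python's sqlite3 module; the two tables and their rows are hypothetical.

```python
import sqlite3

# Hypothetical data: Grace has no orders.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, date TEXT);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES (10, 1, '2024-01-05'), (11, 1, '2024-02-09');
    """
)

# INNER JOIN keeps only customers that have at least one matching order.
inner = conn.execute(
    """
    SELECT customers.name, orders.date
    FROM customers
    INNER JOIN orders ON customers.id = orders.customer_id
    ORDER BY orders.date
    """
).fetchall()

# LEFT JOIN keeps every customer; missing order columns come back as NULL (None).
left = conn.execute(
    """
    SELECT customers.name, orders.date
    FROM customers
    LEFT JOIN orders ON customers.id = orders.customer_id
    ORDER BY customers.name, orders.date
    """
).fetchall()

print(inner)  # [('Ada', '2024-01-05'), ('Ada', '2024-02-09')]
print(left)   # [('Ada', '2024-01-05'), ('Ada', '2024-02-09'), ('Grace', None)]
```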
Subqueries enhance expressiveness by nesting one query within another, often in the WHERE clause for comparisons or in FROM for derived tables. For instance, a subquery might filter employees earning above the departmental average. Window functions, standardized in SQL:2003, perform calculations across sets of rows without collapsing them into groups, using an OVER clause to define the window. The ROW_NUMBER() function assigns sequential numbers to rows within a partition, which is useful for ranking:
SELECT name, salary, ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS salary_rank
FROM employees;
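Both the window-function query and the subquery case mentioned earlier (employees earning above their departmental average) can be run as a short sketch. It assumes Python's sqlite3 module built against SQLite 3.25 or later, which is when SQLite gained window functions, and a hypothetical employees table. The alias salary_rank is used because RANK is a reserved word in some dialects.

```python
import sqlite3

# Hypothetical data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("Ada", "Sales", 5200),
        ("Linus", "Sales", 4800),
        ("Grace", "Engineering", 6100),
        ("Ken", "Engineering", 7000),
    ],
)

# Window function: number rows within each department, highest salary first.
ranked = conn.execute(
    """
    SELECT name, department,
           ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS salary_rank
    FROM employees
    ORDER BY department, salary_rank
    """
).fetchall()

# Correlated subquery: employees paid above their own department's average.
above_average = conn.execute(
    """
    SELECT name
    FROM employees AS e
    WHERE salary > (SELECT AVG(salary) FROM employees WHERE department = e.department)
    ORDER BY name
    """
).fetchall()

print(ranked)
print(above_average)  # [('Ada',), ('Ken',)]
```

Unlike GROUP BY, the window query keeps one output row per employee; the partition only determines which rows each ranking is computed over.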
These features allow SQL to handle analytical tasks efficiently in relational contexts.

SQL's evolution is tracked through successive ISO/IEC 9075 revisions, balancing core stability with new capabilities. The progression began with ANSI X3.135-1986 (SQL-86), focusing on basic relational operations, followed by enhancements in SQL-89 for integrity constraints and SQL-92 for fuller syntax including outer joins. Later versions added object-relational and analytical features: SQL:1999 introduced recursive queries; SQL:2003 added window functions and XML support; SQL:2006 extended XML support and SQL:2008 refined window and other features; SQL:2011 added temporal tables. SQL:2016 (ISO/IEC 9075:2016) notably incorporated JSON support through functions like JSON_VALUE for extracting values from JSON documents stored in columns, enabling hybrid relational-NoSQL workloads. The latest, SQL:2023 (ISO/IEC 9075:2023), introduces property graph queries (SQL/PGQ) for traversing graph structures directly in SQL, extending its reach to graph data without abandoning relational foundations.

Database vendors extend the SQL standard to address domain-specific needs, often through proprietary functions while maintaining core compliance. PostgreSQL, for instance, provides robust full-text search via the tsvector and tsquery types, integrated into SQL queries using operators like @@ for matching parsed text against search terms. This allows efficient indexing and ranking of textual data, as in:
SELECT title FROM articles WHERE to_tsvector('english', content) @@ to_tsquery('english', 'database & query');
Such extensions leverage PostgreSQL's GIN indexes for performance on large corpora. MySQL offers spatial query extensions compliant with Open Geospatial Consortium (OGC) standards, supporting types like POINT, LINESTRING, and POLYGON for storing and querying geospatial data. Functions such as ST_Distance and ST_Distance_Sphere compute distances between features, enabling location-based queries like finding points within 10 km of a location (ST_Distance_Sphere expects longitude first, then latitude, and returns meters):
SELECT name FROM locations WHERE ST_Distance_Sphere(geom, POINT(-74.0060, 40.7128)) < 10000;
These build on MySQL's spatial indexes for efficient analysis in GIS applications. Despite its strengths, traditional SQL implementations in monolithic RDBMS face scalability limitations when handling big data volumes, such as petabyte-scale datasets or high-velocity streams, due to challenges in distributed processing, locking, and index maintenance that can lead to performance bottlenecks. These issues are mitigated in modern dialects like Google BigQuery's SQL, which leverages a serverless, columnar storage architecture with automatic sharding and massively parallel processing to query terabytes in seconds without managing infrastructure. BigQuery's extensions, such as scripting and machine learning integrations, further adapt SQL for cloud-scale analytics while preserving standard syntax.

Graph and NoSQL Query Languages

Graph query languages are designed to operate on graph data models, which represent entities as nodes and relationships as edges, enabling efficient traversal and pattern matching over interconnected data. Unlike relational approaches, these languages emphasize declarative specifications of graph patterns and traversals, facilitating queries over complex networks such as social graphs or recommendation systems. NoSQL query languages extend this paradigm to non-relational stores, supporting diverse data models such as documents, key-value pairs, and Semantic Web data, while providing schema flexibility for big data environments.

Cypher is a declarative query language developed for Neo4j, a leading property graph database, allowing users to express graph patterns and traversals in a readable, ASCII-art-inspired syntax. It focuses on pattern matching to retrieve connected data, such as identifying relationships between nodes, and is optimized for real-time queries in graph databases. For instance, the query MATCH (a:Person)-[:KNOWS]->(b:Person) RETURN a, b finds all pairs of Person nodes connected by a "KNOWS" relationship, enabling efficient traversals without explicit joins. Cypher's design draws on SQL-like readability but prioritizes graph semantics, making it suitable for applications requiring deep relationship analysis.

Gremlin serves as the graph traversal language for the Apache TinkerPop framework, supporting a wide range of graph databases through a functional, data-flow approach composed of sequential steps. It enables both imperative traversals for procedural control and declarative patterns for high-level queries, with steps like addV('person').property('name', 'Alice') to create vertices and outE('knows') to follow outgoing edges labeled "knows." This step-based model allows for complex path computations, such as shortest paths or community detection, and is embeddable in host languages such as Java or Python for versatile graph processing.
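The difference between describing a pattern and chaining traversal steps can be illustrated with a toy in-memory graph. This is only a sketch in plain Python: the TinyGraph class, its match and out methods, and the sample data are hypothetical and are not part of Neo4j, Cypher, or TinkerPop.

```python
class TinyGraph:
    """Tiny in-memory property graph; a hypothetical stand-in for a graph database."""

    def __init__(self):
        self.labels = {}   # node id -> node label
        self.edges = []    # (source id, edge label, target id)

    def add_node(self, node_id, label):
        self.labels[node_id] = label

    def add_edge(self, source, edge_label, target):
        self.edges.append((source, edge_label, target))

    def match(self, source_label, edge_label, target_label):
        """Return (a, b) pairs fitting (a:source_label)-[:edge_label]->(b:target_label)."""
        return [
            (s, t) for s, e, t in self.edges
            if e == edge_label
            and self.labels.get(s) == source_label
            and self.labels.get(t) == target_label
        ]

    def out(self, frontier, edge_label):
        """One traversal step: follow outgoing edges with the given label."""
        return [
            t for node in frontier
            for s, e, t in self.edges
            if s == node and e == edge_label
        ]


graph = TinyGraph()
for name in ("alice", "bob", "carol"):
    graph.add_node(name, "Person")
graph.add_node("acme", "Company")
graph.add_edge("alice", "KNOWS", "bob")
graph.add_edge("bob", "KNOWS", "carol")
graph.add_edge("alice", "WORKS_AT", "acme")

# Pattern style: describe the shape and receive every matching pair.
pairs = graph.match("Person", "KNOWS", "Person")

# Step style: start from one node and chain two outgoing "KNOWS" steps.
friends_of_friends = graph.out(graph.out(["alice"], "KNOWS"), "KNOWS")

print(pairs)
print(friends_of_friends)
```

The match call loosely mirrors the Cypher pattern quoted above, while the nested out calls mimic a step pipeline. Real engines use indexes and adjacency structures instead of scanning an edge list.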
Gremlin's Turing-complete nature supports both online transaction processing (OLTP) and online analytical processing (OLAP) workloads across TinkerPop-compatible systems.

The Graph Query Language (GQL), standardized as ISO/IEC 39075:2024, is a declarative language for querying property graph databases, serving as the graph analogue of SQL for relational data. Inspired by Cypher, it uses pattern-matching syntax for traversals, such as MATCH (n:Person)-[r:KNOWS]->(m:Person) RETURN n.name, m.name to retrieve connected persons, supporting efficient querying of complex relationships in graph stores. GQL enables vendor-neutral graph operations, including path finding and subgraph extraction, and is implemented by vendors such as Neo4j and AWS as of 2025.

In the NoSQL domain, languages like AQL (ArangoDB Query Language) provide unified querying for multi-model databases that combine graphs, documents, and key-value stores. AQL is declarative and SQL-inspired, supporting operations across heterogeneous data with features like traversals and aggregations in a single query, such as FOR v IN 1..3 INBOUND STARTVERTEX GRAPH 'social' OPTIONS {bfs: true} RETURN v.name for graph navigation, where STARTVERTEX is a placeholder for the starting vertex. Similarly, SPARQL is the W3C-standardized query language for RDF (Resource Description Framework) data, treating it as a directed labeled graph for Semantic Web applications. It uses triple patterns for matching, as in SELECT ?subject WHERE { ?subject rdf:type :Resource }, to retrieve resources of a specific type, with support for federated queries, filters, and CONSTRUCT queries that build new RDF graphs. These languages enable flexible, scalable data access in distributed environments.

Graph and NoSQL query languages offer distinct advantages over rigid relational systems, particularly in handling complex relationships through native traversals that avoid costly multi-table joins, which can yield order-of-magnitude performance gains on highly interconnected datasets. For example, graph databases like Neo4j demonstrate superior efficiency in relationship-heavy queries compared to MySQL, as joins in SQL scale poorly with the degree of connectivity.
Additionally, their schema-less or flexible designs accommodate evolving data structures without migrations, supporting agile development in scenarios where relational schemas impose constraints. This flexibility is crucial for applications like fraud detection or knowledge graphs, where ad hoc patterns prevail.
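The triple-pattern matching that this section attributes to SPARQL can be sketched in a few lines of plain Python. The helper names (match_pattern, substitute, match_all) and the ex: sample data are hypothetical. The sketch mimics the idea of binding variables against triples and is not how a real SPARQL engine is implemented.

```python
def match_pattern(triples, pattern):
    """Match one triple pattern; terms starting with '?' are variables."""
    results = []
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable repeated within one pattern must bind consistently.
                if binding.get(term, value) != value:
                    break
                binding[term] = value
            elif term != value:
                break
        else:
            results.append(binding)
    return results


def substitute(pattern, binding):
    """Replace already-bound variables in a pattern with their values."""
    return tuple(binding.get(term, term) for term in pattern)


def match_all(triples, patterns):
    """Join several triple patterns by threading bindings through each one."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            {**known, **new}
            for known in bindings
            for new in match_pattern(triples, substitute(pattern, known))
        ]
    return bindings


# Illustrative data only; the ex: names are made up.
triples = [
    ("ex:alice", "rdf:type", "ex:Person"),
    ("ex:bob", "rdf:type", "ex:Person"),
    ("ex:acme", "rdf:type", "ex:Company"),
    ("ex:alice", "ex:knows", "ex:bob"),
]

# Roughly: SELECT ?subject WHERE { ?subject rdf:type ex:Person }
people = [b["?subject"] for b in match_pattern(triples, ("?subject", "rdf:type", "ex:Person"))]

# Roughly: SELECT ?s ?o WHERE { ?s rdf:type ex:Person . ?s ex:knows ?o }
acquaintances = match_all(triples, [("?s", "rdf:type", "ex:Person"), ("?s", "ex:knows", "?o")])

print(people)
print(acquaintances)
```

Because every fact is just a triple, new predicates can be added to the data without a schema change, which is the flexibility the paragraph above describes.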
