
Vector database

A vector database is a specialized database system that stores, indexes, and queries high-dimensional vectors—numerical arrays representing complex data such as text, images, audio, or multimodal content—to enable efficient similarity searches and retrieval. These vectors, often generated by machine learning embedding models, capture semantic relationships and patterns in unstructured data, allowing for operations like nearest-neighbor searches that traditional relational or NoSQL databases handle less effectively. Unlike conventional databases focused on exact matches, vector databases prioritize approximate nearest neighbor (ANN) algorithms to manage large-scale vector data with low latency, making them essential for AI-driven applications. Vector databases have surged in prominence since the early 2020s, driven by advancements in generative AI and the need to process vast amounts of unstructured data, with projections indicating that over 30% of enterprises will adopt them by 2026.

Their core components include storage for embeddings and associated metadata; indexing techniques such as Hierarchical Navigable Small World (HNSW), locality-sensitive hashing (LSH), or Product Quantization (PQ) to accelerate queries; and similarity metrics such as cosine similarity or Euclidean distance for ranking results. These systems support CRUD operations (create, read, update, delete), horizontal scalability, real-time updates, and hybrid search combining vectors with keyword-based methods, often integrated with frameworks like LangChain for retrieval workflows.

Key use cases span recommendation engines, where vectors match user preferences to products; retrieval-augmented generation (RAG) for enhancing large language models with external knowledge bases; image and video similarity detection in computer vision; and conversational chatbots that retrieve contextually relevant information. In cybersecurity and fraud prevention, they analyze behavioral patterns via vector clustering, while in healthcare, they facilitate drug discovery by comparing molecular structures. Benefits include superior performance on high-dimensional data, cost efficiency through serverless architectures, and flexibility for multimodal data, though challenges such as embedding quality and index maintenance remain areas of ongoing innovation.

Fundamentals

Definition and Purpose

A vector database is a specialized type of database management system designed to store, index, and query high-dimensional vectors, often referred to as embeddings, using techniques like approximate nearest neighbor (ANN) search to enable efficient similarity-based retrieval. Unlike traditional databases that primarily handle structured data with exact-match queries, vector databases are optimized for unstructured or semi-structured data represented in vector form, addressing the limitations of conventional systems in managing high-dimensional information. These vectors are numerical arrays that capture semantic or feature-based representations of diverse data types, such as text, images, or audio, generated through machine learning models like Word2Vec for word embeddings or BERT for contextual text representations. The primary purpose of vector databases is to facilitate rapid similarity searches, which are essential for applications requiring the identification of data points closest to a query in a high-dimensional space, such as recommendation engines that suggest similar items or retrieval-augmented generation (RAG) systems that retrieve contextually relevant documents. This contrasts sharply with the exact or range-based queries in relational databases, as vector databases prioritize probabilistic approximations to balance speed and accuracy in vast datasets. By leveraging ANN methods, they enable sub-linear query times, often achieving retrieval in milliseconds even for billions of vectors, which is infeasible with exhaustive searches in traditional setups. Key benefits include robust handling of vectors with thousands of dimensions—common in modern embeddings—while mitigating the effects of the curse of dimensionality through specialized indexing techniques, allowing scalable operations on complex AI-generated data. Vector databases thus serve as a foundational infrastructure for AI-driven systems, where similarity metrics, such as cosine similarity or Euclidean distance, underpin the core retrieval logic.
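The core operation these systems accelerate can be sketched in a few lines. The following Python example (with illustrative sizes and random data) performs the brute-force cosine-similarity search that ANN indexes approximate; a production database replaces the exhaustive scan with an index.

```python
# Minimal sketch of the operation a vector database optimizes:
# exact (brute-force) nearest-neighbor search over stored embeddings.
import numpy as np

rng = np.random.default_rng(42)
corpus = rng.normal(size=(100_000, 768)).astype(np.float32)  # stored embeddings
query = rng.normal(size=768).astype(np.float32)              # query embedding

# Normalize to unit length so the dot product equals cosine similarity.
corpus /= np.linalg.norm(corpus, axis=1, keepdims=True)
query /= np.linalg.norm(query)

# Exhaustive scan: O(n * d) per query -- the cost ANN indexes avoid.
scores = corpus @ query
top_k = np.argsort(-scores)[:5]
print("top-5 ids:", top_k, "scores:", scores[top_k])
```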

Historical Development

The foundations of vector databases trace back to the 1990s and early 2000s, when techniques like locality-sensitive hashing (LSH) and k-d trees were developed to handle high-dimensional data in information retrieval and computer vision. LSH, introduced for hashing and searching high-dimensional vectors, enabled efficient similarity searches in applications such as image recognition. Meanwhile, k-d trees, originally proposed in 1975 but widely adopted in the 1990s, provided a spatial partitioning method for nearest neighbor queries in multidimensional spaces, influencing early indexing strategies for vector data. These methods addressed the challenges of the "curse of dimensionality" in traditional databases, laying groundwork for scalable vector handling in fields like genetic research at institutions such as NIH and Stanford.

The rise of vector databases accelerated in the 2010s, propelled by deep learning advancements that generated dense vector embeddings from unstructured data, demanding scalable storage beyond conventional relational systems. The 2013 introduction of Word2Vec by Google researchers exemplified this shift, producing low-dimensional embeddings for words that captured semantic relationships but required efficient storage and retrieval at scale for real-world applications like semantic search. This proliferation of embedding models, including subsequent ones like GloVe and BERT, overwhelmed traditional databases, spurring the development of specialized vector stores to support approximate nearest neighbor (ANN) searches on billions of vectors.

Key milestones marked the maturation of vector databases, beginning with Facebook's release of the FAISS library in 2017, which optimized ANN search for dense vectors using techniques like inverted file indexing and product quantization, enabling billion-scale similarity searches on commodity hardware. In 2019, Zilliz launched Milvus as the first open-source vector database, providing distributed storage and hybrid indexing for massive datasets, while Pinecone emerged as a managed cloud service founded that year to simplify vector operations for AI developers. The 2020s saw further proliferation, including AWS's general availability of vector search capabilities in Amazon OpenSearch Service in 2022 via the k-NN plugin, integrating seamlessly with cloud ecosystems for enterprise-scale AI workflows.

The evolution was influenced by big data frameworks like Hadoop and Spark, which from the late 2000s provided distributed processing foundations that vector databases adapted for parallel indexing and querying of embeddings. The AI boom post-2018, fueled by transformer models and generative AI, intensified demand, leading to a dedicated vector database market valued at $2.2 billion in 2024 and projected to reach approximately $3.3 billion by 2026. Recent developments include the introduction of hybrid search in Weaviate (2022), combining vector and keyword matching for more robust retrieval. As of 2025, ongoing advancements include enhanced integrations with multimodal AI frameworks and research into scalable indexing methods for even larger datasets.

Core Concepts

Vector Embeddings

Vector embeddings are dense, fixed-length arrays of real numbers that represent complex data—such as text, images, or audio—in a continuous vector space, capturing underlying semantic or structural meaning through the activation patterns of neural networks. Unlike sparse representations like bag-of-words, these embeddings distribute information across all dimensions, enabling nuanced relationships; for instance, the BERT base model generates 768-dimensional vectors where proximity in the space reflects contextual similarity in language. This dense format allows neural networks to encode high-level abstractions, transforming raw inputs into compact, machine-readable forms suitable for downstream tasks in vector databases.

Embeddings are generated via diverse neural network techniques tailored to data types and objectives. Unsupervised methods, such as autoencoders, learn representations by training a network to compress input data into a lower-dimensional latent space and reconstruct it, thereby extracting essential features without labeled examples; this approach, pioneered in early deep learning research, remains foundational for discovering intrinsic data patterns. Supervised techniques, including contrastive learning, refine embeddings by maximizing similarity between related pairs (e.g., augmented versions of the same image) while minimizing it for unrelated ones, as demonstrated in SimCLR, which achieved state-of-the-art visual representations in 2020 using a simple framework without memory banks or specialized architectures. Multimodal generation extends this to align representations across domains, such as text and images, through joint training on paired data; the CLIP model from 2021, for example, produces unified embeddings that enable zero-shot transfer by leveraging natural language supervision on vast image-caption datasets.

A key property of vector embeddings is their high dimensionality, often ranging from hundreds to thousands of dimensions, which introduces the curse of dimensionality: as dimensions increase, data volumes grow exponentially, leading to sparse sampling and diminished discriminatory power of distance metrics in the space. To mitigate computational and storage inefficiencies, embeddings are commonly normalized using the Euclidean (L2) norm, scaling vectors to unit length so that similarity computations focus on angular differences rather than magnitudes, a practice that enhances stability in models like CLIP. Storage considerations for embeddings emphasize their dense nature, where most elements are non-zero, contrasting with sparse formats and requiring efficient handling of large-scale, high-dimensional arrays to avoid memory overhead. While primarily dense, certain advanced embeddings may exhibit effective sparsity through techniques like sparse autoencoders, which enforce zero activations to promote interpretability. Dimensionality reduction methods, such as principal component analysis (PCA), address these challenges by projecting data onto principal axes that capture maximum variance, preserving essential information while reducing dimensions—for example, compressing 768-dimensional embeddings to 100 or fewer without significant loss of semantic fidelity. This preprocessing step is crucial in vector databases to balance retrieval speed and accuracy.
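The normalization and PCA steps described above can be sketched as follows; the random corpus, its 768 dimensions, and the 100-component target mirror the examples in the text, with scikit-learn assumed for PCA.

```python
# Sketch of two common embedding-preparation steps:
# L2 normalization and PCA-based dimensionality reduction.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(10_000, 768)).astype(np.float32)

# L2-normalize so similarity depends on direction, not magnitude.
embeddings /= np.linalg.norm(embeddings, axis=1, keepdims=True)

# Project onto the top principal components, keeping most variance.
pca = PCA(n_components=100)
reduced = pca.fit_transform(embeddings)
print(reduced.shape)                        # (10000, 100)
print(pca.explained_variance_ratio_.sum())  # fraction of variance retained
```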

Similarity Metrics

Similarity metrics in vector databases quantify the degree of resemblance between high-dimensional vector embeddings, enabling efficient nearest neighbor searches by defining "closeness" in embedding spaces. These metrics are essential for tasks like semantic search and recommendation systems, where direct comparison of raw vectors would be computationally prohibitive without indexing optimizations. Common metrics balance accuracy, interpretability, and efficiency, with choices influenced by the nature of the data and the downstream application.

The Euclidean distance, also known as the L2 norm, measures the straight-line distance between two vectors \mathbf{x} and \mathbf{y} in Euclidean space, given by the formula:

d(\mathbf{x}, \mathbf{y}) = \sqrt{\sum_{i=1}^{d} (x_i - y_i)^2}

where d is the dimensionality of the vectors. This metric is widely used in vector databases for its geometric intuition, particularly in scenarios involving spatial or continuous data distributions.

Cosine similarity assesses the angular orientation between two vectors, focusing on their directional alignment rather than magnitude, computed as:

\cos(\theta) = \frac{\mathbf{x} \cdot \mathbf{y}}{\|\mathbf{x}\| \cdot \|\mathbf{y}\|}

where \mathbf{x} \cdot \mathbf{y} is the dot product and \|\cdot\| denotes the Euclidean norm. It is particularly effective for text embeddings, where vector lengths may vary with document size but similarity depends on the direction of term overlap.

The Manhattan distance, or L1 norm, calculates the sum of absolute differences along each dimension:

d(\mathbf{x}, \mathbf{y}) = \sum_{i=1}^{d} |x_i - y_i|

This metric is robust to outliers and suits grid-like or sparse structures, such as taxicab-geometry analogs or certain feature spaces in machine learning.

Selection of a metric depends on the embedding characteristics and task requirements; for instance, cosine similarity is preferred for angular comparisons in normalized text embeddings to emphasize directional similarity, while Euclidean distance excels in spatial data where magnitude differences matter. The inner product (dot product) serves as an efficient proxy for cosine similarity when vectors are pre-normalized to unit length, reducing computational overhead in large-scale searches. Advanced variants include the Minkowski distance, a generalization of p-norms:

d(\mathbf{x}, \mathbf{y}) = \left( \sum_{i=1}^{d} |x_i - y_i|^p \right)^{1/p}

where p=1 recovers the Manhattan distance and p=2 yields the Euclidean distance; higher p values emphasize larger deviations but increase sensitivity to noise.

Tradeoffs in metric selection involve computational cost, typically O(d) per pairwise comparison regardless of the metric, and domain suitability; cosine similarity ignores vector magnitudes, making it ideal for direction-focused tasks but less appropriate for magnitude-sensitive applications. Euclidean distance, while intuitive, can suffer from the curse of dimensionality in high d, amplifying distances uniformly and potentially masking subtle similarities.
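As a concrete reference, the following Python sketch implements each metric directly from the formulas above.

```python
# The four metrics above, implemented directly from their formulas.
import numpy as np

def euclidean(x, y):          # L2 norm
    return np.sqrt(np.sum((x - y) ** 2))

def manhattan(x, y):          # L1 norm
    return np.sum(np.abs(x - y))

def minkowski(x, y, p):       # generalization; p=1 -> L1, p=2 -> L2
    return np.sum(np.abs(x - y) ** p) ** (1.0 / p)

def cosine_similarity(x, y):  # angular alignment, magnitude-invariant
    return np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))

x, y = np.array([1.0, 2.0, 3.0]), np.array([2.0, 4.0, 6.0])
print(euclidean(x, y), manhattan(x, y), minkowski(x, y, 3))
print(cosine_similarity(x, y))  # 1.0: same direction despite different lengths
```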

Techniques

Indexing Methods

Vector databases employ various indexing methods to enable efficient storage and retrieval of high-dimensional vectors, particularly for approximate nearest neighbor (ANN) searches. These methods address the challenges of the curse of dimensionality by organizing vectors into structures that approximate similarity computations, trading exactness for speed and scalability. Common approaches include tree-based partitioning, graph-based navigation, hashing techniques, quantization for compression, and hybrid strategies that combine elements for improved performance.

Tree-based indexing methods, such as KD-trees and ball trees, partition the vector space hierarchically to facilitate rapid nearest neighbor queries. A KD-tree recursively splits the space along alternating coordinate axes at medians, creating hyperrectangular regions that allow pruning of irrelevant subtrees during search. Ball trees, in contrast, divide the space using hyperspheres centered at cluster centroids, which can be more effective for non-uniform distributions by minimizing the overlap of bounding regions. Both structures achieve query times of O(\log n) in low dimensions by traversing the tree and bounding distance computations, but they suffer in high-dimensional spaces (e.g., 100+ dimensions common in embeddings) due to increased overlap and the need to visit most nodes, leading to near-linear scan times.

Graph-based indexing, exemplified by Hierarchical Navigable Small World (HNSW) graphs, constructs a multi-layer graph where each node connects to nearby neighbors, enabling greedy navigation from coarse to fine layers for ANN search. During indexing, vectors are inserted by starting at the top layer and descending while adding bidirectional edges based on proximity, with layer connections forming a navigable small-world structure. This yields logarithmic query complexity O(\log n) with high recall, as searches traverse fewer than O(\log n) edges per layer, outperforming trees in high dimensions by avoiding exhaustive partitioning. HNSW's robustness stems from its parameterizable trade-off between graph density (affecting memory) and accuracy, making it widely adopted for billion-scale datasets.

Hashing-based methods, such as Locality-Sensitive Hashing (LSH), provide probabilistic ANN by mapping similar vectors to the same hash buckets with high probability using families of hash functions sensitive to distance metrics like Euclidean distance or cosine similarity. Vectors are hashed into multiple tables, and queries probe buckets containing potential neighbors, followed by exact distance verification within those sets. LSH approximates the c-approximate nearest neighbor problem with query time sublinear in the dataset size, depending on the similarity threshold c > 1, though it requires tuning the number of hash tables and functions to balance false positives and memory usage.

Quantization techniques, notably Product Quantization (PQ), compress vectors to reduce storage and accelerate distance approximations without building explicit trees or graphs. In PQ, a high-dimensional vector is split into m low-dimensional subvectors, each quantized to the nearest centroid from a learned codebook via k-means, resulting in a compact code that approximates the original vector's distances via asymmetric distance computation. This achieves compression ratios up to 256x for 128-dimensional vectors while maintaining 90-95% recall in ANN searches, as the additive approximation error is bounded for Euclidean distances. PQ is particularly effective for memory-constrained environments, enabling scans over billions of compressed vectors in seconds on GPUs.
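The PQ scheme above can be illustrated with a toy Python sketch; the parameters (m = 8 subvectors, 256 centroids per codebook) and the use of scikit-learn's k-means are illustrative choices, not a production encoder.

```python
# Toy product-quantization sketch: split d-dim vectors into m subvectors,
# learn a k-means codebook per subspace, store each vector as m byte-sized
# centroid ids, and answer queries via asymmetric distance computation.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
d, m, k = 128, 8, 256              # dims, subvectors, centroids per subspace
sub = d // m
data = rng.normal(size=(5_000, d)).astype(np.float32)

codebooks = []
codes = np.empty((len(data), m), dtype=np.uint8)
for j in range(m):
    block = data[:, j * sub:(j + 1) * sub]
    km = KMeans(n_clusters=k, n_init=1, random_state=0).fit(block)
    codebooks.append(km.cluster_centers_)
    codes[:, j] = km.labels_       # 128 float32 values -> 8 bytes per vector

def adc(query):
    # Query stays uncompressed; per-subspace distances to every centroid
    # form lookup tables, summed via the stored codes.
    tables = [((codebooks[j] - query[j * sub:(j + 1) * sub]) ** 2).sum(axis=1)
              for j in range(m)]
    return sum(tables[j][codes[:, j]] for j in range(m))

query = rng.normal(size=d).astype(np.float32)
print(np.argsort(adc(query))[:5])  # approximate top-5 neighbors
```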
Hybrid approaches integrate multiple techniques for enhanced efficiency, such as the Inverted File (IVF) index combined with coarse quantizers. IVF first applies a coarse quantizer (e.g., k-means) to partition vectors into inverted lists associated with centroids, then stores fine-grained vectors or codes within each list. During a query, only the lists near the query's assigned centroids are probed, reducing candidates from the full dataset to a fraction (e.g., 1-5%), after which PQ or exact search refines results. This hybrid yields sublinear query times with tunable accuracy, scaling to billion-scale indexes by leveraging clustering for filtering and quantization for scanning.
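A hedged example of this IVF+PQ combination using the FAISS library, whose IndexIVFPQ class implements exactly this pipeline, might look as follows; the parameter values are illustrative.

```python
# IVF+PQ hybrid index with FAISS: a coarse quantizer routes queries to a
# few inverted lists, and PQ codes are scanned within the probed lists.
import numpy as np
import faiss

d = 128
xb = np.random.random((100_000, d)).astype('float32')  # database vectors
xq = np.random.random((5, d)).astype('float32')        # query vectors

nlist, m, nbits = 1024, 8, 8        # coarse cells, PQ subvectors, bits/code
quantizer = faiss.IndexFlatL2(d)    # coarse quantizer over cell centroids
index = faiss.IndexIVFPQ(quantizer, d, nlist, m, nbits)

index.train(xb)                     # learn coarse centroids and PQ codebooks
index.add(xb)
index.nprobe = 16                   # probe ~1.6% of cells per query
distances, ids = index.search(xq, 10)
print(ids[0])                       # approximate top-10 for the first query
```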

Query Processing

Query processing in vector databases revolves around efficient retrieval of similar vectors using approximate nearest neighbor (ANN) techniques, which form the backbone of search operations. The typical ANN search begins with candidate generation, where indexing structures such as inverted file (IVF) indexes partition the vector space into clusters to quickly identify a subset of potential matches without exhaustive scanning. This step reduces computational overhead by probing only the most relevant clusters based on the query vector's proximity to cluster centroids. Following candidate generation, refinement occurs by performing exact distance computations on the top-k candidates to rank and return the most similar vectors, balancing approximation errors with high accuracy.

Vector databases support various query types to accommodate diverse retrieval needs. The k-nearest neighbors (k-NN) query retrieves the top-k vectors most similar to the query based on a predefined similarity metric, enabling precise similarity-based ranking. Range queries extend this by returning all vectors within a specified distance threshold from the query, useful for applications requiring bounded similarity results. Hybrid queries combine vector similarity search with metadata filters, such as attribute-based conditions, to narrow results pre- or post-similarity computation, enhancing relevance in multifaceted datasets.

Update mechanisms ensure vector databases remain responsive to evolving data. Dynamic indexing allows incremental inserts and deletes by adjusting the index structure on-the-fly, as in hierarchical navigable small world (HNSW) graphs, where new vectors are added to appropriate layers with logarithmic complexity and deletions involve removing nodes while repairing connections to maintain search integrity. For scalability, batching groups multiple updates into periodic rebuilds or optimizations, minimizing overhead in high-velocity environments while preserving query performance.

Performance tuning in query processing involves optimizing the recall-latency tradeoff: higher recall (e.g., targeting 95%) demands more candidates during generation, increasing latency, while lower recall prioritizes speed for real-time applications. Parallelism via GPUs accelerates both candidate generation and refinement by distributing computations across threads, enabling sub-millisecond queries on billion-scale datasets without sacrificing substantial accuracy.
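The candidate-generation, filtering, and refinement stages can be sketched end to end in plain numpy; the clustering scheme, the `years` metadata field, and all parameter values below are hypothetical simplifications of what production engines implement.

```python
# Illustrative two-stage query flow: coarse candidate generation by probing
# the nearest clusters (IVF-style), a metadata filter, then exact re-ranking.
import numpy as np

rng = np.random.default_rng(7)
vecs = rng.normal(size=(50_000, 64)).astype(np.float32)
years = rng.integers(2015, 2026, size=len(vecs))   # toy metadata

# Offline: assign every vector to the nearest of nlist sampled "centroids".
nlist = 100
centroids = vecs[rng.choice(len(vecs), nlist, replace=False)]
d2 = ((vecs ** 2).sum(1, keepdims=True)
      - 2 * vecs @ centroids.T + (centroids ** 2).sum(1))
assign = d2.argmin(axis=1)

def search(query, k=10, nprobe=5, min_year=2020):
    # 1. Candidate generation: probe only the nprobe nearest clusters.
    qd = ((centroids - query) ** 2).sum(1)
    probed = np.argsort(qd)[:nprobe]
    cand = np.flatnonzero(np.isin(assign, probed))
    # 2. Hybrid filtering: drop candidates failing the metadata predicate.
    cand = cand[years[cand] >= min_year]
    # 3. Refinement: exact distances on the surviving candidates only.
    dist = ((vecs[cand] - query) ** 2).sum(1)
    order = np.argsort(dist)[:k]
    return cand[order], dist[order]

ids, dists = search(rng.normal(size=64).astype(np.float32))
print(ids)
```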

Implementations

Open-Source Databases

Milvus, released in 2019, is an open-source distributed vector database designed for high-performance similarity search at scale. It features a modular, cloud-native architecture with disaggregated storage and compute layers, including access, coordinator, worker node, and storage components that enable horizontal scalability and data sharding. Milvus supports advanced indexing methods such as Hierarchical Navigable Small World (HNSW) for approximate nearest neighbor search and Inverted File (IVF) variants like IVF_FLAT and IVF_PQ for efficient querying of dense vectors. Its integration with Kubernetes via the Milvus Operator facilitates automated deployment and management of clusters, making it suitable for billion-scale vector workloads in production environments. In version 2.3, released in 2023, Milvus enhanced its core engine Knowhere by incorporating Faiss for GPU-accelerated indexing, improving search speed for large datasets.

Weaviate, also launched in 2019, is an open-source vector database that combines vector search with structured data handling through a GraphQL-based API for querying and schema management. It excels in hybrid search, fusing vector similarity results with keyword-based (BM25F) retrieval to balance semantic and lexical matching, configurable via fusion methods and weights. Weaviate's modular architecture includes built-in modules for generating embeddings from providers like Hugging Face and OpenAI, allowing seamless integration for automatic vectorization during import without external preprocessing. This design supports both vector-only and hybrid queries on objects with associated properties, enabling flexible applications in graph-like structures.

Qdrant, introduced in 2021, is a Rust-based open-source vector database emphasizing performance and memory safety for similarity search. It provides robust payload storage, where JSON-like metadata can be attached to vectors for efficient filtering during queries, supporting nested conditions and array-based operations on payloads. Qdrant's filtering capabilities allow complex predicates on fields, such as range, geo, or match conditions, integrated directly into vector search to reduce result sets without post-processing. The system handles real-time updates seamlessly, incorporating insertions, modifications, and deletions into indexes with minimal latency, ensuring consistency in dynamic datasets. In 2024, Qdrant added WebAssembly (WASM) support through its Summer of Code initiative, enabling dimension-reduction algorithms like t-SNE for client-side visualization in web interfaces.

Chroma, released in 2022, is a lightweight, open-source vector database optimized for Python-native development and local deployment. It offers a simple API for storing and querying embeddings alongside metadata, full-text, and regex searches, with persistent storage options like SQLite for easy setup without external dependencies. Designed for prototyping in AI applications, Chroma provides zero-configuration setup, making it ideal for developers building retrieval-augmented generation pipelines or local experimentation with minimal overhead. Its embedded nature supports in-process execution, facilitating quick iterations in Jupyter notebooks or scripts while scaling to distributed modes via server deployment.

Vespa, originally developed by Yahoo and open-sourced in 2017, functions as a full-featured search and recommendation engine with native tensor support for advanced computations over vectors and structured data. It enables machine-learned ranking and inferences using models in ONNX or TensorFlow formats, facilitating complex queries on tensors.
Vespa excels in large-scale search scenarios, scaling linearly to billions of data items across distributed clusters while maintaining high throughput. Its tensor framework outperforms standard vector approaches in production by integrating dense and sparse representations for personalized results.
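As an example of the developer experience these systems aim for, a minimal Chroma session might look like the following; the client calls reflect recent Python releases of the library, though the exact API may vary across versions.

```python
# Minimal Chroma session: create a collection, add documents, and query.
import chromadb

client = chromadb.Client()  # in-memory; PersistentClient(path=...) persists
collection = client.create_collection(name="articles")

# Chroma embeds documents with its default embedding function unless
# precomputed vectors are supplied via the `embeddings` argument.
collection.add(
    ids=["a1", "a2"],
    documents=["Vector databases index embeddings.",
               "Relational databases index rows with B-trees."],
    metadatas=[{"topic": "vectors"}, {"topic": "relational"}],
)

results = collection.query(query_texts=["semantic similarity search"],
                           n_results=1)
print(results["ids"], results["distances"])
```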

Commercial Databases

Commercial vector databases provide managed, proprietary solutions tailored for enterprise-scale AI applications, offering features like automated scaling, high availability, and seamless integration with cloud ecosystems. These platforms prioritize ease of deployment and operational reliability over self-management, distinguishing them from open-source alternatives by including dedicated support, service-level agreements (SLAs), and optimized performance for production workloads.

Pinecone, launched in 2021, is a fully managed, serverless vector database designed for building scalable AI applications. It supports pod-based and serverless indexing modes, enabling automatic scaling to handle millions of vectors without infrastructure management. Pinecone employs Hierarchical Navigable Small World (HNSW) for auto-indexing, ensuring efficient approximate nearest neighbor searches with low latency. The platform guarantees 99.95% uptime through its SLA, making it suitable for mission-critical deployments. In 2025, Pinecone introduced enhanced multimodal support, allowing assistants to process and query embeddings from images embedded in PDFs alongside text, expanding its utility for diverse contexts.

Amazon OpenSearch Service extended its capabilities with vector search in 2022, integrating seamlessly with Amazon Bedrock for generating and managing embeddings in AI workflows. As a pay-per-use managed service, it allows hybrid queries that combine SQL-based filtering with vector similarity searches using k-NN or approximate k-NN via HNSW indexing. This setup supports semantic search alongside traditional text queries, enabling Retrieval Augmented Generation (RAG) applications with up to 16,000-dimensional vectors and metrics like cosine similarity.

Redis, enhanced with vector support through the RediSearch module in 2023, leverages its in-memory architecture for ultra-low-latency vector operations, positioning it as a high-performance option for AI-driven caching and real-time recommendation use cases. It stores vectors alongside key-value data, supporting similarity searches with flat or HNSW indexes for sub-millisecond queries. By 2025, Redis enhanced its hybrid search to merge text and vector rankings into a single, more relevant result set for complex pipelines.
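A typical Pinecone serverless workflow might resemble the sketch below; the index name, region, and credentials are placeholders, and the class and method names follow the v3+ Python client, which may change between releases.

```python
# Hedged sketch of a Pinecone serverless workflow: create an index,
# upsert vectors with metadata, and run a filtered similarity query.
from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key="YOUR_API_KEY")   # placeholder credential
pc.create_index(
    name="demo-index",
    dimension=768,                      # must match the embedding model
    metric="cosine",
    spec=ServerlessSpec(cloud="aws", region="us-east-1"),
)

index = pc.Index("demo-index")
index.upsert(vectors=[
    {"id": "doc-1", "values": [0.1] * 768, "metadata": {"source": "faq"}},
])

# Hybrid-style query: similarity search constrained by a metadata filter.
matches = index.query(vector=[0.1] * 768, top_k=3,
                      filter={"source": {"$eq": "faq"}},
                      include_metadata=True)
print(matches)
```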

Applications and Comparisons

Vector databases play a pivotal role in Retrieval-Augmented Generation (RAG) systems for large language models (LLMs), where they store vector embeddings of external knowledge sources to enhance LLM responses with relevant, up-to-date information without requiring model retraining. In RAG pipelines, user queries are embedded and matched against the database to retrieve contextually similar documents, which are then incorporated into the LLM prompt, improving factual accuracy and reducing hallucinations in applications like chatbots and question-answering systems since their integration with models like GPT in 2023. For instance, this approach enables LLMs to synthesize vast datasets efficiently, as demonstrated in surveys on LLM-vector database synergies.

In computer vision, vector databases facilitate image similarity tasks by indexing embeddings generated from convolutional neural networks, allowing rapid nearest-neighbor searches for applications such as duplicate detection in large image repositories. By representing images as high-dimensional vectors capturing visual features like textures and shapes, these databases enable efficient identification of near-duplicates or similar visuals, which is essential for catalog management and forensic analysis.

For search applications, vector databases power semantic search engines that transcend keyword matching by retrieving results based on embedding similarities, capturing contextual meaning and user intent more effectively. This embedding-based approach allows search systems to handle synonyms, paraphrases, and nuanced queries, as seen in platforms that index documents or queries into vector spaces for ranking via approximate nearest-neighbor algorithms. Recommendation systems, such as those resembling Netflix's content matching, leverage vector databases to embed user preferences and item features, enabling personalized suggestions through similarity searches across vast catalogs. By storing interaction vectors alongside content embeddings, these systems compute matches to recommend content or products, scaling to millions of users while prioritizing relevance and recency in results.

Beyond core AI and search domains, vector databases support anomaly detection in cybersecurity by vectorizing log data or network traffic into embeddings, where deviations from normal patterns are flagged via distance metrics in the embedding space. This method enhances threat hunting by identifying outliers in high-volume security datasets, such as unusual behaviors or intrusion signatures, outperforming traditional rule-based systems in dynamic environments. In drug discovery, molecular embeddings stored in vector databases accelerate screening by enabling similarity searches across chemical libraries to identify potential compounds with desired properties. Techniques like graph neural networks generate these embeddings from molecular structures, allowing researchers to retrieve analogs for lead optimization or predict bioactivities, as evidenced in recent advances integrating molecular representations with AI-driven pipelines.

Real-world deployments highlight these capabilities; for example, OpenAI's 2023 ChatGPT plugins utilized Pinecone as a vector database backend for the retrieval plugin, enabling users to connect custom knowledge bases for augmented conversations. Similarly, Google's Vertex AI Vector Search, enhanced in 2024 and with Vector Search 2.0 announced in August 2025 for fully managed vector database capabilities, provides scalable semantic retrieval in enterprise AI workflows.
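The RAG retrieval step described at the start of this section reduces to embed, search, and prompt assembly; the sketch below uses a placeholder embedding function and an in-memory numpy store in place of a real model and database, so every name in it is a stand-in.

```python
# Schematic RAG retrieval: embed the query, fetch the top-k passages
# from a vector store, and assemble the augmented prompt.
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder embedding function (hash-seeded random projection),
    standing in for a real embedding model."""
    vec = np.random.default_rng(abs(hash(text)) % (2 ** 32)).normal(size=384)
    return vec / np.linalg.norm(vec)

passages = ["Vector DBs store embeddings.",
            "HNSW enables fast approximate search.",
            "PostgreSQL gained vectors via pgvector."]
store = np.stack([embed(p) for p in passages])  # offline indexing

def retrieve(question: str, k: int = 2):
    scores = store @ embed(question)            # cosine: vectors are unit-norm
    return [passages[i] for i in np.argsort(-scores)[:k]]

question = "How do vector databases search quickly?"
context = "\n".join(retrieve(question))
prompt = f"Answer using the context below.\n\nContext:\n{context}\n\nQ: {question}"
print(prompt)                                   # would be sent to the LLM
```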

Differences from Traditional Databases

Vector databases differ fundamentally from relational databases, such as those using SQL, in their core design priorities and data handling. While relational systems enforce strict schemas, ACID compliance for transactions, and support for complex joins on structured tabular data, vector databases focus on storing and querying high-dimensional vector embeddings with probabilistic similarity searches, often sacrificing rigid schema enforcement and full ACID guarantees for scalability in AI-driven scenarios. This shift enables vector databases to manage embeddings derived from machine learning models, like those from large language models, without the overhead of predefined relationships or exact matches typical in relational setups.

In comparison to NoSQL databases like MongoDB, which excel at flexible document or key-value storage with exact indexing and retrieval, vector databases natively accommodate dense representations of unstructured content, such as images or text embeddings, using approximate nearest neighbor (ANN) algorithms for efficient similarity-based queries rather than precise lookups. NoSQL systems typically require extensions or integrations to handle vector operations, whereas vector databases are optimized from the ground up for high-dimensional data, prioritizing semantic relevance over the schema-less but exact-match paradigms of document stores.

Vector databases also diverge from full-text search engines like Elasticsearch, which rely on lexical matching and inverted indexes for keyword-based retrieval of textual content. In contrast, vector databases leverage embedding-based semantic search to capture contextual meaning beyond exact terms, enabling more intuitive searches like finding conceptually related documents. However, many modern systems hybridize these approaches, combining vector similarity with traditional full-text capabilities to enhance precision in AI-driven applications.

Architecturally, vector databases employ specialized structures like approximate inverted indexes or graph-based hierarchies (e.g., HNSW) tailored for nearest-neighbor searches in high-dimensional spaces, unlike the B-trees or hash tables in traditional databases that optimize for ordered, exact-range queries on scalar values. Scalability in vector systems often involves sharding data into similarity-aware partitions, allowing distributed ANN computations across clusters, which contrasts with the join-heavy partitioning in relational or NoSQL architectures focused on consistency and throughput for transactional workloads.

Recent trends as of 2025, such as the pgvector 0.7 extension for PostgreSQL (released April 2024), are blurring these distinctions by integrating vector similarity search directly into relational databases, enabling hybrid setups that combine ACID transactions with embedding storage without fully migrating to dedicated vector systems. This evolution supports applications in RAG and semantic search by allowing traditional databases to handle semantic queries alongside structured data, reducing the need for separate vector infrastructure in many cases.
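A hybrid relational-plus-vector setup of the kind pgvector enables might be driven from Python as below; the connection details and table are hypothetical, while the vector column type and distance operators are part of the pgvector extension.

```python
# Sketch of a pgvector-style hybrid query from Python via psycopg2:
# a relational predicate and vector similarity in one SQL statement.
import psycopg2

conn = psycopg2.connect("dbname=app user=app host=localhost")  # placeholder DSN
cur = conn.cursor()

cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
cur.execute("""
    CREATE TABLE IF NOT EXISTS items (
        id bigserial PRIMARY KEY,
        category text,
        embedding vector(3)      -- tiny dimension for illustration
    );
""")
cur.execute(
    "INSERT INTO items (category, embedding) VALUES (%s, %s::vector);",
    ("docs", "[0.1, 0.2, 0.3]"),
)

# <-> is pgvector's L2-distance operator (<=> is cosine distance);
# the WHERE clause filters relationally before ordering by similarity.
cur.execute("""
    SELECT id, embedding <-> %s::vector AS distance
    FROM items
    WHERE category = %s
    ORDER BY distance
    LIMIT 5;
""", ("[0.1, 0.2, 0.25]", "docs"))
print(cur.fetchall())
conn.commit()
```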
