
TiDB

TiDB is an open-source distributed SQL database that supports Hybrid Transactional and Analytical Processing (HTAP) workloads, offering MySQL protocol compatibility for seamless migration and application development. Designed for cloud-native environments, it provides horizontal scalability by separating compute and storage layers, enabling elastic expansion across hundreds of nodes without downtime. TiDB ensures high availability and financial-grade data consistency through the Multi-Raft protocol and multiple data replicas distributed across availability zones. Developed by PingCAP since 2015, TiDB addresses limitations of traditional relational databases by combining row-based storage in TiKV for transactional processing with columnar storage in TiFlash for real-time analytics, allowing unified handling of OLTP and OLAP queries on the same data. Its architecture includes a stateless SQL layer (TiDB server) that parses and optimizes queries, a distributed key-value storage engine (TiKV), and components like the Placement Driver (PD) for cluster coordination. TiDB supports petabyte-scale data volumes, up to 512 nodes per cluster, and up to 1,000 concurrent connections per node, making it suitable for high-traffic, data-intensive applications. In April 2025, PingCAP announced major enhancements to TiDB for improved global scale and AI-driven applications. Key features of TiDB include automatic sharding for load balancing, built-in tools like TiCDC for change data capture, and integration with Kubernetes via TiDB Operator for simplified deployment and operations. It also offers advanced capabilities such as JSON support, vector search, and time-series data handling, while maintaining ACID compliance for transactions. As a self-managed solution, TiDB allows full control over infrastructure, with options for on-premises, cloud, or hybrid setups, and is licensed under the Apache 2.0 license for broad community adoption.

Overview

Description

TiDB is a cloud-native, distributed SQL database designed to support hybrid transactional and analytical processing (HTAP) workloads, providing horizontal scalability to as many as 512 nodes while maintaining strong consistency and high availability. Developed by PingCAP, it draws inspiration from Google's Spanner and F1 databases to enable seamless handling of both online transactional processing (OLTP) and online analytical processing (OLAP) in a single system. Founded in 2015 in Beijing, China, by a team of infrastructure engineers frustrated with the challenges of managing traditional databases at scale, PingCAP launched TiDB as an open-source project under the Apache 2.0 license. The initiative aimed to address the limitations of standalone relational database management systems (RDBMS) like MySQL, which struggle with massive data growth and high concurrency in internet-scale applications. At its core, TiDB seeks to combine the familiarity and ease of use of traditional RDBMS such as MySQL—with full SQL compatibility and ACID transactions—with the horizontal scalability and resilience typically associated with NoSQL systems. This hybrid approach allows organizations to build scalable, real-time applications without the need for complex sharding or separate OLTP and OLAP databases.

Design principles

TiDB's architecture is fundamentally shaped by the principle of separating compute from storage, allowing the SQL layer (TiDB server) to operate independently of the distributed key-value storage layer (TiKV). This separation enables horizontal scaling of each layer without impacting the other; for instance, additional TiDB servers can be added to handle increased query loads while TiKV clusters expand storage capacity separately. By design, TiDB servers remain stateless, facilitating easy deployment and replacement in dynamic environments. In October 2025, PingCAP introduced TiDB X, an evolved architecture that further decouples compute and storage by using object storage as the backbone to support adaptive scaling and native AI workloads. A core tenet of TiDB's design is achieving ACID compliance for distributed transactions, drawing inspiration from Google's Percolator model and the lineage of systems such as Spanner and F1. TiKV employs this model to manage multi-version concurrency control (MVCC) and two-phase commit protocols with optimizations for performance in large-scale environments. Replication is ensured through the Raft consensus algorithm, which provides fault-tolerant data duplication across multiple nodes—typically three replicas per data shard—to guarantee consistency and availability even in the face of node failures. TiDB incorporates the hybrid transactional and analytical processing (HTAP) principle to unify online transactional processing (OLTP) and online analytical processing (OLAP) workloads on the same dataset, eliminating the need for extract, transform, and load (ETL) pipelines. This is realized through TiKV handling transactional writes with row-oriented storage, complemented by TiFlash's column-oriented copies that accelerate real-time analytics queries without the overhead of a separate analytics pipeline. Such integration supports low-latency decision-making by allowing analytics to run directly on up-to-date transactional data. Compatibility with the MySQL protocol forms a foundational goal, enabling TiDB to serve as a drop-in replacement for MySQL in most applications with minimal or no code modifications. This wire-protocol adherence covers SQL syntax, transaction semantics, and ecosystem tools, reducing migration barriers for enterprises reliant on existing MySQL-based stacks. The design prioritizes retaining familiar behaviors, such as indexing and query optimization, while extending capabilities for distributed scenarios. Embracing cloud-native paradigms, TiDB is engineered with stateless components, automated failover mechanisms, and elastic scaling to thrive in multi-tenant infrastructures. The Placement Driver (PD) cluster coordinates data placement and leader elections for high availability, ensuring seamless recovery from failures without manual intervention. This elasticity allows clusters to dynamically adjust resources based on demand, supporting both on-premises and managed cloud deployments.

History

Development origins

PingCAP was founded in 2015 by Liu Qi, Huang Dongxu, and Cui Qiu, three experienced infrastructure engineers from leading Chinese Internet companies, with the goal of addressing the challenges of scaling traditional relational databases like MySQL for high-growth internet applications. The founders, frustrated by the limitations of existing database management, scaling, and maintenance practices, sought to create a database that maintained MySQL compatibility while enabling seamless horizontal scaling without the complexities of manual sharding. Inspired by Google's Spanner, they aimed to build an open-source solution that could handle massive data volumes and real-time analytics for modern cloud-native workloads. The initial prototype of TiDB was developed in the Go programming language, building directly on the TiKV key-value store project, which originated in early 2015 as a foundational storage layer built around the Raft consensus algorithm and capable of supporting distributed transactions. Early development emphasized integrating TiKV's storage capabilities with SQL processing, starting with basic SQL parsing to ensure MySQL protocol compatibility from the outset. The project was hosted on GitHub from its inception, fostering immediate community involvement; initial commits focused on core parser implementation and rudimentary storage engine integration to prototype a stateless SQL layer atop the distributed key-value backend. This approach allowed rapid iteration while leveraging Go's concurrency features for handling distributed query execution. A primary challenge during these origins was achieving full MySQL compatibility in a distributed architecture without imposing sharding or partitioning burdens on users, which the team addressed by designing TiDB as a "one-stop" solution where the database automatically managed data distribution and consistency via Raft and multi-version concurrency control (MVCC). This enabled elastic scaling across nodes without sacrificing the familiarity of MySQL syntax and semantics. In 2016, PingCAP secured its first major funding round, a $5 million Series A led by Yunqi Partners, which supported the shift toward full-time focus on database development and accelerated refinement into a production-ready system.

Release milestones

TiDB's initial stable release, version 1.0, arrived on October 16, 2017, marking the production-ready debut of the distributed database with core support for distributed SQL capabilities, including horizontal scalability and high availability via the Raft consensus protocol integrated into TiKV. In April 2018, TiDB 2.0 was launched, introducing significant enhancements to horizontal scaling through improved region splitting and merge mechanisms in TiKV, alongside optimizations for storage engine performance to handle larger workloads more efficiently. TiDB 4.0, released on June 17, 2020, advanced hybrid transactional and analytical processing (HTAP) by integrating TiFlash, a columnar storage engine that enables analytics directly on transactional data without separate data pipelines. The April 7, 2021, release of TiDB 5.0 focused on query execution improvements, including vectorized execution for faster analytical processing, and expanded compatibility with MySQL 8.0 features such as window functions and common table expressions. TiDB 6.0, unveiled on April 7, 2022, incorporated enterprise-grade resource control via quota management and auto-scaling capabilities to dynamically adjust cluster resources based on workload demands. On March 30, 2023, TiDB 7.0 debuted with bolstered security features like enhanced role-based access control and integrated monitoring tools for better observability and compliance in production environments. TiDB 8.0, released March 29, 2024, introduced bulk DML support for large transactions to mitigate out-of-memory issues, enhanced optimizer support for multi-valued indexes on JSON data, and accelerated cluster snapshot restore speeds by 1.5-3x. TiDB 8.5, released as a long-term support (LTS) version on December 19, 2024, brought several previously experimental features to general availability, added client-side encryption for backup data and experimental vector search capabilities, and improved performance through features such as the TiKV MVCC in-memory engine. In 2025, TiDB Cloud expanded with a public preview of its Dedicated service on Microsoft Azure on June 4, enabling managed deployments in that ecosystem for broader cloud portability. On October 8, 2025, at the SCaiLE Summit, PingCAP announced TiDB X, a rearchitected version introducing context-aware scaling, zero-friction elasticity, and native AI integrations for adaptive resource allocation in intelligent applications.

Architecture

Core components

TiDB's architecture is composed of several key components that work together to provide a distributed, scalable database system. These include the TiDB Server for SQL processing, the Placement Driver (PD) for cluster management, TiKV for transactional storage, TiFlash for analytical workloads, and integrated monitoring tools for observability. This modular design enables horizontal scaling and fault tolerance across nodes. In 2025, PingCAP introduced TiDB X, a new cloud-native architecture variant that uses object storage as the backbone for enhanced decoupling of compute and storage, supporting context-aware scaling and native AI integrations, available in TiDB Cloud tiers. The TiDB Server serves as a stateless SQL layer that acts as the primary interface for client connections. It handles query parsing by analyzing MySQL protocol packets for syntactic and semantic validity, followed by optimization to generate efficient distributed execution plans, such as pushing down predicates and aggregations to storage layers. Execution involves coordinating data retrieval from underlying stores and assembling results, all while maintaining full compatibility with the MySQL protocol to allow seamless integration with existing MySQL tools and applications. As a stateless component, TiDB Servers can be scaled independently without data locality concerns. The Placement Driver (PD) functions as the central coordinator of the cluster, maintaining metadata for data distribution, cluster topology, and transaction timestamps. It performs scheduling to balance load across nodes by allocating data regions and handling leader transfers, ensuring even distribution and resource utilization. For high availability, PD is deployed in clusters of at least three nodes, using an odd number to achieve quorum and fault tolerance. TiKV is the distributed transactional key-value store that forms the foundational storage layer of TiDB. It organizes data into ordered key ranges called Regions, each replicated across multiple nodes for durability. Locally, TiKV relies on RocksDB as its embedded storage engine to manage persistent key-value data on disk. Replication is achieved through the Raft consensus algorithm, supporting multi-region deployments by ensuring data consistency and automatic recovery from failures. By default, TiKV maintains three replicas per Region to provide high availability. TiFlash, introduced in TiDB version 4.0, is a columnar storage engine optimized for analytical processing within the HTAP architecture. It asynchronously replicates data from TiKV using Raft Learner roles, enabling real-time synchronization while co-locating instances with TiKV nodes to minimize latency for hybrid transactional and analytical workloads. TiFlash leverages coprocessors built on ClickHouse for efficient columnar query execution, such as aggregations and scans, without disrupting OLTP operations. TiDB incorporates robust monitoring through integrations with Prometheus for collecting and storing time-series metrics from all components, and Grafana for visualizing dashboards across categories like cluster overview, TiDB performance, and TiKV storage. Additionally, the built-in TiDB Dashboard, managed by PD since version 4.0, provides a web-based interface for real-time inspection of data distribution and query performance. These tools enable proactive troubleshooting in multi-tenant environments. In terms of overall topology, TiDB operates as a multi-tenant system where PD acts as the coordinating "brain" for metadata and scheduling, TiKV provides the core row-based storage backbone, and TiDB Servers function as scalable gateways for SQL access, with TiFlash optionally extending capabilities for analytics. This separation allows independent scaling of the compute, storage, and scheduling layers.
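For illustration, a minimal SQL sketch—assuming a recent TiDB version in which the INFORMATION_SCHEMA cluster tables are available—shows how the components described above can be inspected from any TiDB Server; column selections are indicative rather than exhaustive:

  -- List every component instance (TiDB, PD, TiKV, TiFlash) with its version and start time.
  SELECT TYPE, INSTANCE, VERSION, START_TIME
  FROM INFORMATION_SCHEMA.CLUSTER_INFO;

  -- Show per-store capacity and Region/leader counts that PD reports for each TiKV node.
  SELECT STORE_ID, ADDRESS, CAPACITY, AVAILABLE, LEADER_COUNT, REGION_COUNT
  FROM INFORMATION_SCHEMA.TIKV_STORE_STATUS;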

Data flow and storage

In TiDB, the query lifecycle begins when a client connects to the TiDB Server using the MySQL protocol, sending commands and statement strings after authentication. The server maintains session state, such as SQL mode and transaction context, while handling synchronous queries like non-prepared statements via the mysql.ComQuery packet. The statement string is then parsed into an abstract syntax tree (AST) using a MySQL-compatible parser, enabling structured representation of clauses like WHERE conditions as nested expressions. Following parsing, the optimizer compiles the AST into a logical plan and then a physical execution plan using cost-based optimization, incorporating name resolution and privilege checks; simple queries may use fast planning paths like PointGet for efficiency. During execution, the plan is run via the executor, which pushes down tasks—such as scans, selections, or aggregations—to TiKV or TiFlash storage nodes to process data locally and minimize network transfer, with results filtered and returned to the client. This pushdown model ensures that computations occur close to the data, enhancing performance in distributed environments. TiDB employs a hybrid storage model, with TiKV providing row-oriented storage as an ordered key-value map implemented via a log-structured merge-tree (LSM-tree) in RocksDB for transactional workloads. Complementing this, TiFlash offers columnar storage optimized for analytical queries, enabling efficient aggregation and scan operations on large datasets. The Placement Driver (PD) automatically manages data placement by distributing regions across TiKV and TiFlash nodes, ensuring balanced load and high availability without manual intervention. Data sharding in TiDB is implicit and range-based, partitioning tables into consecutive key segments called regions—typically around 256 MiB each (the default since v8.4.0)—stored in TiKV. PD monitors region sizes and loads, triggering automatic splitting when regions exceed thresholds to prevent hotspots, and performs balancing by migrating regions across nodes for even distribution. Replication in TiDB defaults to three replicas per region, leveraging the Raft consensus algorithm to maintain strong consistency; data modifications are logged and propagated to followers, requiring majority acknowledgment for commits. This setup supports multi-data-center (multi-DC) deployments for geo-redundancy, where regions can be placed across DCs to enable disaster recovery while preserving consistency. Transactions in TiDB follow a two-phase commit (2PC) protocol coordinated by the TiDB Server, drawing from the Percolator model to ensure ACID properties across distributed nodes. The process starts with obtaining a start timestamp from PD, followed by read operations using multi-version concurrency control (MVCC) and buffered writes; in the prewrite phase, locks are acquired on keys in parallel, and the commit phase assigns a commit timestamp before finalizing updates asynchronously. TiDB supports both optimistic and pessimistic modes: optimistic transactions defer conflict detection to the prewrite phase, retrying on failures for low-contention scenarios, while pessimistic mode acquires locks during execution using a for-update timestamp to handle high contention, supporting isolation levels like repeatable read. For data ingestion, TiDB Lightning facilitates bulk loading of large datasets, such as TB-scale imports from files like SQL dumps or CSV, in either physical mode—for high-speed key-value ingestion directly into TiKV—or logical mode—for ACID-compliant SQL execution on live clusters.
Complementing this, TiDB Data Migration (DM) enables streaming ingestion by parsing and replicating incremental binlog events from upstream MySQL or MariaDB sources, handling DML and DDL changes with filtering and shard-merge rules to maintain consistency during ongoing operations.
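The pushdown behavior described in the query lifecycle above can be observed with EXPLAIN; a minimal hedged sketch using a hypothetical orders table (exact operator names in the output vary by version):

  -- Hypothetical table used only for illustration.
  CREATE TABLE orders (
    id BIGINT PRIMARY KEY,
    customer_id BIGINT,
    amount DECIMAL(10, 2),
    KEY idx_customer (customer_id)
  );

  -- In the EXPLAIN output, operators whose task column reads cop[tikv] (or cop[tiflash])
  -- are coprocessor tasks pushed down to the storage nodes, while root tasks run on the TiDB Server.
  EXPLAIN SELECT customer_id, SUM(amount)
  FROM orders
  WHERE customer_id = 42
  GROUP BY customer_id;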

Features

Horizontal scalability

TiDB achieves linear horizontal scalability by decoupling its compute and storage layers, allowing independent expansion of resources to handle growing workloads. Additional TiDB Server nodes can be added to increase read and write throughput, as the stateless SQL layer processes queries in parallel behind a load balancer like TiProxy. Similarly, scaling out TiKV nodes expands storage capacity and I/O performance, with data automatically redistributed across the cluster. The Placement Driver (PD) enables automatic rebalancing by monitoring cluster health and orchestrating region migration without manual intervention. Central to this scalability is TiKV's region management, where data is partitioned into discrete units called Regions, each approximately 256 MiB in size by default. These Regions split automatically when exceeding 384 MiB to prevent overload, dividing into two or more smaller units, and merge when falling below 54 MiB to optimize space and efficiency. PD detects hotspots—Regions experiencing disproportionate load—through metrics like read/write traffic and CPU usage, then migrates or rebalances them across nodes to maintain even distribution and avoid bottlenecks. In practice, TiDB clusters can sustain over 1 million queries per second (QPS) while scaling to petabyte-level data volumes, all without downtime, as demonstrated in production environments like Flipkart's e-commerce platform. This elasticity supports seamless growth for high-traffic applications, with automatic rebalancing and sharding ensuring efficient resource use at massive scales. Introduced in 2025, TiDB X enhances this capability with context-aware scaling tailored for AI workloads, using real-time signals such as QPS and query patterns to predict and provision resources proactively. This architecture leverages object storage for zero-friction elasticity, adjusting compute and storage in minutes to accommodate dynamic AI-driven demands like vector search and operational analytics on a unified dataset. Despite these strengths, horizontal scalability in multi-region deployments can introduce network latency challenges, with cross-region round-trip times potentially reaching hundreds of milliseconds. TiDB mitigates this through follower reads, enabling follower replicas in local regions to serve read queries and reduce cross-region traffic by up to 50%, thus minimizing global synchronization overhead.
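The Region mechanics described above can be inspected and pre-shaped from SQL; a hedged sketch using a hypothetical table t, where the key range and Region count are purely illustrative:

  CREATE TABLE t (id BIGINT PRIMARY KEY, payload VARCHAR(255));

  -- Pre-split the table into 16 Regions across an expected key range to avoid write hotspots during bulk load.
  SPLIT TABLE t BETWEEN (0) AND (1000000) REGIONS 16;

  -- Inspect Region boundaries, leader stores, and approximate sizes for the table.
  SHOW TABLE t REGIONS;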

MySQL compatibility

TiDB provides extensive compatibility with MySQL at both the protocol and functional levels, allowing applications built for MySQL to connect and operate with minimal modifications. This compatibility is a core design goal, enabling seamless adoption in MySQL-centric ecosystems without requiring code changes for most workloads. TiDB supports the MySQL 5.7 and 8.0 wire protocol, permitting direct connections using standard MySQL clients, drivers, and connection strings. For instance, applications can substitute a TiDB endpoint for a MySQL one without altering client configurations, as the server handles authentication, query execution, and result sets identically. This wire-level parity extends to ecosystem tools such as mysqldump and phpMyAdmin, which connect natively to TiDB clusters. In terms of SQL dialect, TiDB achieves high compatibility with MySQL 5.7 and 8.0 syntax for DDL and DML operations, covering most common use cases including index creation, table partitioning (Range, List, Hash, and Key types), and basic query constructs. It supports advanced analytical features such as window functions and common table expressions (CTEs), aligning with MySQL 8.0 standards—window functions operate similarly to MySQL 8.0's implementation, enabling ranking, aggregation over partitions, and ordered computations. However, gaps exist in specialized areas, such as limited support for geographic information system (GIS) functions, spatial data types, and spatial indexes, as well as the absence of stored procedures, triggers, events, FULLTEXT indexes, and XML functions. DDL operations like online schema changes are supported, but multi-object ALTER TABLE statements and certain column type conversions are restricted. TiDB integrates effectively with the MySQL ecosystem, including object-relational mapping (ORM) frameworks like Hibernate (via the TiDB Dialect in version 6.0 and later) and administration tools such as MySQL Workbench and DBeaver for management and querying. It also supports binlog-style replication through TiCDC, which captures change data in formats consumable by downstream systems such as Kafka and Debezium-compatible consumers, facilitating downstream synchronization. These integrations ensure that monitoring, backup, and development workflows from the MySQL landscape transfer directly to TiDB environments. Migration to TiDB is straightforward for most applications, as no schema alterations are required for standard schemas, and tools like TiDB Data Migration (DM) handle both full and incremental data synchronization from MySQL or MariaDB sources without downtime. DM replicates DDL and DML changes while preserving compatibility with MySQL's protocol and binlog formats, making it suitable for hybrid or phased transitions. This ease of migration underscores TiDB's role as a drop-in replacement for scaling MySQL deployments. With the release of TiDB 8.0 in March 2024, compatibility with MySQL 8.0 was further solidified, incorporating features like window functions, CTEs, and system variables such as div_precision_increment for precise division handling. The earlier 7.4 release marked official MySQL 8.0 compatibility, and TiDB 8.0 extends this with enhanced DML optimizations and ecosystem tooling, ensuring parity for modern MySQL applications. Default settings, including the utf8mb4 character set, follow modern MySQL conventions, although TiDB's default collation is utf8mb4_bin rather than MySQL 8.0's utf8mb4_0900_ai_ci, which can produce behavioral differences in case-sensitive comparisons.
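Because TiDB accepts standard MySQL 8.0 syntax, analytical constructs such as CTEs and window functions can run unchanged; a minimal hedged sketch with a hypothetical sales table:

  -- Hypothetical sales table for illustration.
  CREATE TABLE sales (
    id BIGINT PRIMARY KEY AUTO_INCREMENT,
    region VARCHAR(32),
    amount DECIMAL(10, 2),
    sold_at DATETIME
  );

  -- A common table expression feeding a window function, written exactly as it would be for MySQL 8.0.
  WITH regional AS (
    SELECT region, SUM(amount) AS total
    FROM sales
    GROUP BY region
  )
  SELECT region, total,
         RANK() OVER (ORDER BY total DESC) AS region_rank
  FROM regional;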

Distributed transactions

TiDB supports distributed transactions across its nodes, ensuring atomicity, consistency, isolation, and durability in a horizontally scaled environment. This is achieved through a hybrid transaction model inspired by Google's Percolator, which combines multi-version concurrency control (MVCC) for snapshot isolation and a two-phase commit (2PC) protocol for atomicity. In the 2PC implementation, the TiDB server acts as the transaction coordinator, while TiKV nodes serve as participants managing locking, storage, and replication. The process begins with a prewrite phase, where TiDB assigns a start timestamp from the Placement Driver (PD) and sends parallel prewrite requests to relevant TiKV Regions; locks are acquired on keys if no conflicts exist, validated against existing MVCC versions, with the primary lock coordinating secondary keys. If prewrites succeed, the commit phase follows: TiDB commits the primary key first via Raft consensus on TiKV, then the secondaries, releasing locks and externalizing committed versions to ensure all-or-nothing atomicity and durability through replication. TiDB employs MVCC to enable snapshot isolation, allowing transactions to read consistent snapshots of data committed before their start timestamp, thus avoiding dirty reads and non-repeatable reads without blocking writers. The default isolation level is Repeatable Read, compatible with MySQL's semantics, while Read Committed (introduced in v4.0) and Read Uncommitted are also supported for less stringent scenarios; Serializable isolation, as defined by the SQL standard, is not natively provided, though pessimistic locking can approximate stricter controls in high-conflict cases. To handle varying workloads, TiDB offers both optimistic and pessimistic transaction modes. Optimistic mode assumes low conflict rates and defers conflict detection until the commit phase, minimizing locking overhead and improving throughput in read-heavy or low-contention environments. Pessimistic mode, enabled via configuration or statements like BEGIN PESSIMISTIC and the default for new clusters since v3.0.8, acquires locks early during reads (e.g., SELECT FOR UPDATE) and writes, suiting high-contention scenarios by preventing commit-time aborts but potentially increasing latency due to blocking. Both modes build on the same 2PC foundation, though pessimistic mode adds pipelined lock acquisition to reduce latency. Deadlock detection occurs automatically in pessimistic mode at the TiKV layer, where circular wait dependencies among lock requests are identified; upon detection, one transaction is aborted with error code 1213, and wait timeouts (default 50 seconds) trigger error 1205 if unresolved. Lock information can be queried via the INFORMATION_SCHEMA.DEADLOCKS or CLUSTER_DEADLOCKS tables for diagnostics. TiDB provides linearizability as the default consistency guarantee for reads and writes, ensuring operations appear atomic and in a strict order as if executed sequentially, enforced by Raft consensus and timestamp ordering from PD. For multi-region deployments, causal consistency is supported when features like async commit are enabled, preserving operation dependencies across regions while reducing latency at the cost of weaker global ordering guarantees. Transaction performance in TiDB emphasizes low latency, with timestamp oracle (TSO) allocation from PD typically under 1 ms for local operations, enabling sub-millisecond point reads in optimistic mode under low load. Commit latency averages around 12-13 ms for typical workloads, influenced by prewrite and replication durations; async commit, introduced in v5.0, accelerates cross-region transactions by decoupling secondary commits, reducing end-to-end latency while maintaining correctness when paired with one-phase commit options.
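A hedged sketch of the two transaction modes, using a hypothetical accounts table (client-side retry logic for optimistic commit conflicts is omitted):

  CREATE TABLE accounts (id BIGINT PRIMARY KEY, balance DECIMAL(12, 2));

  -- Pessimistic mode: locks are taken as statements execute, similar to MySQL/InnoDB behavior.
  BEGIN PESSIMISTIC;
  SELECT balance FROM accounts WHERE id = 1 FOR UPDATE;   -- acquires a row lock
  UPDATE accounts SET balance = balance - 100 WHERE id = 1;
  UPDATE accounts SET balance = balance + 100 WHERE id = 2;
  COMMIT;   -- two-phase commit across the involved TiKV Regions

  -- Optimistic mode: conflicts surface at COMMIT and may require the client to retry.
  BEGIN OPTIMISTIC;
  UPDATE accounts SET balance = balance - 50 WHERE id = 1;
  COMMIT;

  -- Diagnose recent deadlocks detected in pessimistic mode.
  SELECT * FROM INFORMATION_SCHEMA.DEADLOCKS;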

Cloud-native design

TiDB is engineered with a cloud-native architecture that leverages container orchestration platforms like Kubernetes to ensure seamless deployment, management, and scaling in dynamic cloud environments. The TiDB Operator, an extension for Kubernetes, automates the full lifecycle of TiDB clusters, including provisioning, upgrading, scaling, and failover operations, allowing operators to manage distributed databases declaratively through custom resources. This approach enables TiDB to run portably across various infrastructures, abstracting underlying hardware complexities and facilitating rapid iteration in microservices-based applications. A core aspect of TiDB's cloud-native design is its elasticity, achieved through the separation of compute and storage layers, which permits independent scaling of resources without downtime. In Kubernetes deployments, this manifests as dynamic resource allocation via Horizontal Pod Autoscaler (HPA) integrations and auto-healing mechanisms that restart or reschedule failed pods automatically, maintaining cluster resilience against node failures or traffic spikes. TiDB's storage engine, TiKV, further enhances this by utilizing cloud block storage such as Amazon EBS or compatible services, ensuring data durability and scalability decoupled from compute instances. TiDB supports multi-cloud deployments across major providers including Amazon Web Services (AWS), Google Cloud Platform (GCP), Microsoft Azure, and Alibaba Cloud, enabling organizations to avoid vendor lock-in while leveraging region-specific advantages. The TiDB Cloud Serverless offering, launched in July 2023, introduces a pay-per-use model that automatically scales compute resources from zero to handle variable workloads, optimizing costs for bursty applications. In 2025, the introduction of the TiDB X architecture further advanced serverless scaling, providing enhanced elasticity for unpredictable AI-driven workloads through optimized resource orchestration and faster auto-scaling responses. For observability, TiDB integrates natively with Prometheus for metrics collection and Grafana for visualization, exposing key performance indicators such as query latency, throughput, and resource utilization to enable proactive monitoring in cloud setups. Tracing capabilities support distributed request tracking, aligning with standards like OpenTelemetry for end-to-end visibility in microservices architectures. Security in TiDB's cloud-native design incorporates role-based access control (RBAC) for fine-grained permissions on database operations, TLS for network communications to protect data in transit, and integration with cloud provider identity and access management (IAM) systems for centralized authentication. These features support compliance with standards like GDPR and HIPAA while simplifying secure operations across hybrid and multi-cloud environments.
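The RBAC model mentioned above follows MySQL 8.0-style roles; a minimal hedged sketch in which the role, user, schema, and password are hypothetical placeholders:

  -- Create a role with read-only access to a hypothetical application schema.
  CREATE ROLE analyst;
  GRANT SELECT ON app_db.* TO analyst;

  -- Create a user who must connect over TLS, then grant and default the role.
  CREATE USER 'jane'@'%' IDENTIFIED BY 'changeme' REQUIRE SSL;
  GRANT analyst TO 'jane'@'%';
  SET DEFAULT ROLE analyst TO 'jane'@'%';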

HTAP capabilities

TiDB supports hybrid transactional and analytical processing (HTAP) through a unified engine that serves online transactional processing (OLTP) workloads from its row-oriented storage layer, TiKV, and online analytical processing (OLAP) workloads from the column-oriented TiFlash layer, while maintaining shared data access to enable zero-ETL pipelines. This architecture ensures consistency across both engines without requiring batch exports, allowing transactional updates in TiKV to propagate in real time to TiFlash for immediate analytical use. TiFlash operates as a distributed columnar storage engine integrated into the TiDB cluster, featuring Raft-replicated replicas that provide consistent, isolated snapshots for analytical queries. These replicas are created selectively per table via manual configuration, such as using DDL commands to replicate specific tables from TiKV, enabling asynchronous synchronization of regions without impacting OLTP performance on the primary store. By leveraging Multi-Raft protocols with learner roles, TiFlash maintains logical consistency and snapshot isolation, supporting efficient columnar scans and aggregations through integrated coprocessors based on ClickHouse technology. The TiDB query optimizer employs cost-based analysis to direct analytical queries, such as those involving heavy aggregations or joins, to TiFlash replicas, achieving speedups of 10x or more compared to row-store execution on TiKV. For instance, complex aggregation queries can execute up to 8x faster in benchmarks like TPC-H at scale 100, due to columnar storage optimizations and massively parallel processing (MPP) distribution across TiFlash nodes. This routing is automatic based on query patterns, with optional SQL hints available for explicit control, ensuring OLTP queries remain on TiKV for low-latency point reads and writes. TiDB's HTAP design delivers analytics with sub-second latency for ad-hoc queries on up-to-date transactional data, integrating seamlessly with tools like Tableau for interactive dashboards and reporting. Since version 5.0, enhancements including a vectorized execution engine in TiFlash have further accelerated scan-intensive operations by enabling MPP mode for distributed joins and aggregations, reducing query times for large datasets exceeding 10 million rows. These capabilities are particularly valuable in use cases such as e-commerce platforms performing real-time inventory analysis to optimize stock levels during peak sales, or financial systems detecting fraud patterns through instant analytical scans on transaction streams.
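A hedged sketch of enabling TiFlash for a hypothetical orders table (like the one sketched earlier) and steering an analytical query to it:

  -- Create a columnar replica of the table on TiFlash; it is populated asynchronously from TiKV.
  ALTER TABLE orders SET TIFLASH REPLICA 1;

  -- Check replication progress for the new columnar replica.
  SELECT TABLE_NAME, REPLICA_COUNT, AVAILABLE, PROGRESS
  FROM INFORMATION_SCHEMA.TIFLASH_REPLICA
  WHERE TABLE_NAME = 'orders';

  -- Optional optimizer hint forcing the analytical scan onto TiFlash rather than TiKV.
  SELECT /*+ READ_FROM_STORAGE(TIFLASH[orders]) */ customer_id, SUM(amount)
  FROM orders
  GROUP BY customer_id;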

High availability

TiDB achieves high availability through a distributed architecture that emphasizes redundancy and automatic recovery mechanisms. At the core of its data replication strategy is the TiKV storage layer, which employs multi-replica Raft consensus groups for each data region. Typically, three replicas are maintained per region, ensuring that data is durably stored across multiple nodes. The Placement Driver (PD) component schedules these replicas with awareness of topology labels, such as zones and regions, to promote replica diversity and prevent single points of failure, thereby enhancing fault-tolerance capabilities. Failover in TiDB is handled seamlessly via Raft's automatic leader election process, which detects and resolves node failures by electing a new leader among replicas, typically completing within seconds to minimize downtime. The TiDB server layer is stateless, allowing for instantaneous scaling in or out without data loss or reconfiguration, as compute nodes do not persist state and can be replaced dynamically. This design ensures that client connections remain uninterrupted during failures, with the system retrying operations in milliseconds as needed. For backup strategies supporting business continuity, TiDB utilizes the Backup & Restore (BR) tool to enable point-in-time recovery (PITR), allowing clusters to be restored to any specific timestamp within the retention window using full snapshot and incremental log backups. Additionally, asynchronous replication modes facilitate disaster recovery (DR) by switching to non-synchronous replication when primary replicas fail, maintaining availability without strict synchronization guarantees during outages. Monitoring and self-healing further bolster uptime, with integrated alerting systems that notify on node failures and critical errors through Prometheus and Grafana dashboards. TiDB Operator, when deployed on Kubernetes, automates self-healing by managing pod restarts, scaling, and recovery, ensuring the cluster responds to faults without manual intervention. In TiDB Cloud, these features contribute to a 99.99% service level agreement (SLA), guaranteeing resilience against node or zone failures.
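Replica placement across zones can also be expressed declaratively with Placement Rules in SQL; a hedged sketch in which the region labels are hypothetical and must match the topology labels configured on the TiKV stores:

  -- Define a policy that keeps the Raft leader in one region and spreads followers across others.
  CREATE PLACEMENT POLICY multi_zone PRIMARY_REGION="us-east-1" REGIONS="us-east-1,us-east-2,us-west-1" FOLLOWERS=4;

  -- Apply the policy to a hypothetical table; PD then schedules the Region replicas accordingly.
  CREATE TABLE payments (id BIGINT PRIMARY KEY, amount DECIMAL(12, 2)) PLACEMENT POLICY=multi_zone;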

Vector search and AI integration

TiDB provides native vector database functionality, allowing users to store and query vector embeddings directly within its SQL framework. This support includes the creation of vector columns using a dedicated VECTOR data type to hold embeddings generated by models like those from OpenAI or Hugging Face. The system integrates approximate nearest neighbor (ANN) search capabilities, enabling efficient similarity searches for high-dimensional data in AI and machine learning applications. A key component is the Hierarchical Navigable Small World (HNSW) index, which TiDB uses for vector indexing to accelerate k-nearest neighbors (k-NN) queries. Users can create an HNSW index on vector columns via SQL DDL statements, which build a graph-based structure for fast approximate searches with high recall rates, often up to 98% accuracy in benchmarks. This integration allows vector operations to be performed seamlessly alongside traditional SQL queries, without requiring a separate vector database. TiDB enhances AI workloads through features like hybrid search, which combines vector similarity with keyword-based retrieval to improve relevance; for instance, queries can fuse similarity scores on embeddings with keyword matching in a single SQL statement. Additionally, it supports retrieval-augmented generation (RAG) pipelines by providing access to fresh transactional data, enabling large language models (LLMs) to ground responses in up-to-date embeddings and structured information. In 2025, PingCAP introduced TiDB X, a context-aware architecture designed for zero-friction scaling and native integration with LLM-driven applications. TiDB X leverages object storage as its backbone to handle dynamic AI workloads, allowing seamless expansion of vector datasets while maintaining query consistency and integrating directly with frameworks like LangChain and LlamaIndex for agentic applications. This advancement builds on TiDB's HTAP capabilities to support real-time analytics on vector data. Vector embeddings are stored in TiKV for transactional workloads and replicated to TiFlash for analytical queries, utilizing columnar storage to optimize ANN algorithms like HNSW for billion-scale datasets. This distributed storage ensures fault-tolerant persistence and horizontal scaling of vectors across clusters. Common use cases include recommendation systems, where vector search powers personalized suggestions based on embeddings; chatbots, enabling semantic understanding of queries through similarity matching; and anomaly detection, identifying outliers in time-series data via distance metrics on embedded features. Performance-wise, TiDB's vector search delivers low latency for k-NN queries on large datasets, thanks to optimized HNSW indexing and in-memory caching. This efficiency supports real-time inference at scale, with recall rates balancing speed and accuracy for production environments.
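A hedged sketch of the basic vector workflow, assuming a TiDB deployment with vector search support (for example, TiDB Cloud or v8.4 and later); the table, the toy 3-dimensional embeddings, and the literal query vector are illustrative, and creating an HNSW vector index additionally requires a TiFlash replica with syntax that varies by version:

  -- Table with a fixed-dimension vector column holding embeddings.
  CREATE TABLE docs (
    id BIGINT PRIMARY KEY,
    content TEXT,
    embedding VECTOR(3)
  );

  INSERT INTO docs VALUES
    (1, 'first document',  '[0.1, 0.2, 0.3]'),
    (2, 'second document', '[0.9, 0.8, 0.7]');

  -- k-NN search: return the rows whose embeddings are closest to the query vector by cosine distance.
  SELECT id, content, VEC_COSINE_DISTANCE(embedding, '[0.1, 0.2, 0.25]') AS dist
  FROM docs
  ORDER BY dist
  LIMIT 5;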

Deployment options

On-premises methods

TiDB supports several self-managed deployment options for on-premises environments, such as bare metal servers or virtual machines, enabling organizations to operate clusters without relying on cloud infrastructure. These methods leverage command-line tools and automation scripts to provision, configure, and maintain TiDB components including the Placement Driver (PD), TiDB servers, and TiKV nodes. The primary tool for on-premises deployments is TiUP, a CLI-based cluster management solution that facilitates single-command operations for deploying, upgrading, and scaling TiDB clusters. TiUP operates from a control machine, using a YAML-formatted topology file to define the cluster layout, including host specifications, node roles (e.g., PD, TiDB, TiKV), and resource allocations. This allows for straightforward setup on bare metal or virtual machines, with built-in support for rolling upgrades and scaling without downtime; for instance, adding TiKV nodes involves updating the topology file and executing a scale-out command. TiUP also deploys monitoring components like Prometheus and Grafana alongside the cluster, providing metrics for observability. For multi-node automation, TiDB Ansible offers a playbook-based approach using Ansible to orchestrate cluster provisioning across physical or virtual hosts. This method initializes the cluster, deploys components, and handles tasks like rolling restarts, making it suitable for scripted, repeatable setups. Hardware prerequisites include SSD storage for TiKV nodes to ensure optimal I/O performance, with recommendations for at least 8 CPU cores and 16 GB of memory per node to support production workloads. Although TiUP has largely superseded Ansible for new deployments, Ansible remains viable for managing legacy clusters or environments requiring fine-grained playbook customization. Local development and single-node testing can be achieved using Docker Compose, which provisions a lightweight TiDB cluster via predefined Docker images for PD, TiDB, and TiKV. Users clone the official repository, pull images from Docker Hub, and start the stack with a simple docker-compose up command, accessing the database via a MySQL client on port 4000. This setup is ideal for prototyping and isolated testing but is not recommended for production due to its single-node limitations. Configurations for PD and TiKV, such as replication settings or storage paths, are managed through files like docker-compose.yml and component-specific configs. On-premises best practices emphasize high availability and performance. Deploy at least three PD nodes across distinct hosts to maintain quorum and fault tolerance, with NVMe SSDs recommended for TiKV in production to handle high-throughput workloads, keeping the data volume per TiKV node to roughly 2 TB or less. Overall sizing should include at least three TiKV nodes and two TiDB servers, with monitoring enabled via Prometheus for proactive issue detection.

Cloud and containerized deployments

TiDB offers robust options for cloud and containerized deployments, enabling seamless integration with major cloud providers and Kubernetes environments. TiDB Cloud is a fully managed Database-as-a-Service (DBaaS) platform that automates the deployment, scaling, monitoring, and maintenance of TiDB clusters across Amazon Web Services (AWS), Google Cloud Platform (GCP), and Microsoft Azure. It provides two primary tiers: Dedicated, which offers isolated resources for high-performance, predictable workloads with fine-tuned configurations; and Serverless (renamed to Starter in 2025), which supports instant autoscaling and pay-per-use pricing for variable or development workloads. In June 2025, TiDB Cloud Dedicated entered public preview on Azure, expanding multi-cloud availability and allowing enterprises to deploy databases natively within Azure's ecosystem. For containerized deployments, TiDB Operator serves as the core automation tool on Kubernetes, utilizing Custom Resource Definitions (CRDs) to declaratively manage TiDB clusters. It handles full life-cycle operations, including initial deployment, horizontal scaling of components like TiDB servers and TiKV nodes, rolling upgrades without downtime, backups, and failover. Day 2 operations, such as resource reconfiguration and monitoring integration, are automated through Kubernetes-native controllers, ensuring high availability and elasticity in dynamic environments. Users can deploy TiDB Operator via Helm charts from the official PingCAP repository, which simplifies installation on Kubernetes clusters by packaging CRDs, controllers, and dependencies into reusable templates. These charts integrate with Container Storage Interface (CSI) drivers for persistent volumes, enabling quick setups with commands like helm install tidb-operator pingcap/tidb-operator. TiDB emphasizes multi-cloud portability, supporting deployment on managed Kubernetes services such as AWS Elastic Kubernetes Service (EKS), Google Kubernetes Engine (GKE), and Azure Kubernetes Service (AKS). On EKS, for instance, users provision node groups with optimized instance types (e.g., c7g.4xlarge for TiDB) and gp3 EBS volumes, then apply TiDBCluster manifests for auto-provisioning. Similar workflows apply to GKE with pd-ssd storage classes and n2-standard machine types, and AKS with Ultra SSD disks for high-IOPS TiKV nodes, allowing consistent operations across providers without vendor lock-in. In Serverless mode within TiDB Cloud, clusters automatically suspend during idle periods and resume on demand, optimizing costs for bursty or unpredictable workloads by scaling compute and storage to zero when inactive. This mode supports AI-driven scaling for vector search and analytical applications across TiDB Cloud tiers including Starter, where resources dynamically adjust based on query complexity and data volume.

Ecosystem tools

Data migration and ingestion

TiDB Data Migration (DM) is an integrated data migration platform that enables full data migration and incremental replication from MySQL-compatible sources, primarily MySQL (versions 5.6 to 8.0), MariaDB (version 10.1.2 and later, on an experimental basis), and Amazon Aurora MySQL, to TiDB clusters. It supports online DDL synchronization, including compatibility with tools like gh-ost and pt-osc for ghost-table-based online schema changes, ensuring minimal disruption during schema alterations. Additionally, DM provides sharding support, allowing it to merge data from multiple upstream shards into a single TiDB database while automatically detecting and applying DDL changes across shards. For initial bulk data imports into TiDB, TiDB Lightning serves as a high-speed loader capable of handling terabyte-scale datasets. It accepts input from local files or S3-compatible storage in formats such as SQL dumps, CSV, or Parquet, and operates in two modes: physical import, which encodes data into key-value pairs for direct ingestion into TiKV storage (achieving speeds of 100 to 500 GiB per hour), and logical import, which generates and executes SQL statements (at 10 to 50 GiB per hour). The physical mode is optimized for empty tables and empty clusters, making it ideal for initializing new deployments or large-scale initial loads. Dumpling complements these tools by providing a logical export mechanism from MySQL-compatible sources, generating SQL dumps or CSV files that can be directly imported via TiDB Lightning. It exports schemas and data into structured files, including metadata and per-table chunk splits (e.g., {schema}.{table}.{index}.sql), supporting parallel dumping for efficiency and output to local storage or S3-compatible endpoints. This makes Dumpling a key component for preparing data for TiDB ingestion, particularly in scenarios requiring portable, human-readable formats. Within DM, the loader stage handles full data migration by dumping upstream data and loading it into TiDB, while the syncer stage enables incremental replication by reading and applying binlog events. Coordination between stages ensures seamless transitions, with the syncer resuming from the loader's completion point using binlog positions as checkpoints, updated every 30 seconds to track replication progress across workers. Conflict resolution during replication is managed through features such as safe mode, which rewrites INSERT and UPDATE operations to REPLACE on task restarts, preventing data inconsistencies around the checkpointed binlog coordinates.

Backup and recovery

TiDB provides robust backup and recovery mechanisms through the Backup & Restore (BR) tool, a command-line utility designed for distributed operations on data stored in TiKV nodes. BR enables full snapshot backups of the entire cluster at a specific point in time, capturing raw key-value data for physical consistency. It also supports incremental log backups that record changes to TiKV data, allowing point-in-time recovery (PITR) with a Recovery Point Objective (RPO) as low as 5 minutes. These backups are stored in S3-compatible external storage, such as Amazon S3, Google Cloud Storage, or Azure Blob Storage, ensuring scalability and durability. For logical backups, TiDB uses Dumpling, a data export tool that generates SQL or CSV files compatible with MySQL ecosystems, facilitating portable restores across different systems. Unlike BR's physical approach, which backs up underlying key-value data for faster intra-TiDB restores, Dumpling produces human-readable exports suitable for migrations or archiving but with higher overhead due to SQL parsing. BR remains the preferred method for production environments in TiDB due to its efficiency in handling distributed physical data. The recovery process in TiDB leverages BR to restore data to an empty or non-conflicting cluster, supporting full cluster recovery or specific databases and tables. PITR combines the most recent full snapshot with log backups up to a user-specified timestamp, such as 2022-05-15 18:00:00+0800, enabling precise rollbacks. During recovery, BR handles partial failures by pausing tasks and reporting details like pause times and store-specific errors, allowing operators to address issues such as unavailable TiKV nodes before resuming. Restores require an empty target cluster to avoid conflicts, and the process applies changes sequentially from logs to achieve the desired state. Backups and restores can be scheduled and managed using TiUP for on-premises deployments or the TiDB Operator for Kubernetes environments, integrating seamlessly with operational workflows. From TiDB v7.0.0, SQL-based commands are available directly within the database. Data security is enhanced through encryption: BR supports server-side encryption (SSE) for S3 storage using AWS KMS keys, and similar mechanisms for Azure Blob Storage with encryption scopes or AES-256 keys, protecting data at rest and in transit via storage provider credentials. Performance of BR operations scales horizontally with cluster size, utilizing parallel processing across TiKV nodes for distributed I/O. Snapshot backups achieve speeds of 50-100 MB/s per TiKV node with minimal impact (under 20% of cluster throughput), while restores reach up to 2 TiB/hour for snapshots and 30 GiB/hour for logs in tested configurations with 6-21 nodes. For terabyte-scale clusters, such as 10 TB datasets, BR enables recovery times under 1 hour by fully utilizing hardware resources, as demonstrated in benchmarks with over 1 GB/s throughput. This results in low Recovery Time Objectives (RTO) for large-scale recoveries, particularly with optimizations in TiDB 8.1 that improve region scattering and communication efficiency.
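A hedged sketch of the SQL-level interface mentioned above, in which the bucket name, path, and database name are placeholders; the BACKUP and RESTORE statements delegate to BR internally and carry version-specific limitations, so the BR command line remains the usual path in production:

  -- Full backup of every database to S3-compatible storage.
  BACKUP DATABASE * TO 's3://example-bucket/tidb-backup-2025-01-01';

  -- Restore a single database from the same location into an empty or non-conflicting cluster.
  RESTORE DATABASE app_db FROM 's3://example-bucket/tidb-backup-2025-01-01';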

Change data capture

TiDB provides change data capture (CDC) capabilities through dedicated tools that enable real-time synchronization of incremental data changes from the database to downstream systems, ensuring low-latency replication for distributed environments. The primary tool, TiCDC, captures row-level changes by pulling and processing change logs from TiKV storage nodes, which are derived from Raft consensus logs, and exports sorted row-based incremental data in formats compatible with various sinks. TiCDC supports streaming to targets such as Apache Kafka and MySQL-compatible databases, with output formats consumable by Debezium-style connectors, facilitating integration with event-driven architectures and real-time processing pipelines. Additionally, TiDB Binlog offers a MySQL-compatible log protocol for replicating changes to downstream systems like MySQL databases and Kafka, though it was deprecated starting from TiDB version 7.5.0, fully deprecated in v8.3.0, and removed in v8.4.0 in favor of more scalable alternatives. Common use cases for TiDB's CDC tools include populating data warehouses with real-time streams from TiDB to object storage or analytics platforms, invalidating caches by propagating updates to in-memory systems, and enabling cross-data-center replication for high-availability setups across regions. Key features of TiCDC include exactly-once delivery semantics when configured with idempotent sinks like Kafka, handling of schema evolution through replication of DDL statements alongside DML changes to maintain downstream consistency, and high throughput via multi-threaded processors and event queues in its updated architecture. Despite these strengths, TiDB's CDC tools exhibit limitations, such as increased replication lag in high-throughput workloads due to transaction splitting and partial ordering constraints, particularly when network latency exceeds 100 ms between clusters.

References

  1. [1]
    What is TiDB Self-Managed
    TiDB is a distributed database designed for the cloud, providing flexible scalability, reliability, and security on the cloud platform. Users can elastically ...
  2. [2]
    About PingCAP | TiDB
    Our History. PingCAP started in 2015 when three seasoned infrastructure engineers were sick and tired of the way databases were managed, scaled, and ...
  3. [3]
    TiDB Architecture
    Provides a rich series of data migration tools for migrating, replicating, or backing up data. As a distributed database, TiDB is designed to consist of ...
  4. [4]
    The Unified Database for Modern Workloads - TiDB
    TiDB delivers strong consistency, built-in horizontal scalability, and cloud-native resilience for the most demanding workloads.
  5. [5]
    pingcap/tidb: TiDB - the open-source, cloud-native, distributed SQL ...
    An open-source, cloud-native, distributed SQL database designed for high availability, horizontal and vertical scalability, strong consistency, and high ...
  6. [6]
    How we build TiDB
    Oct 17, 2016 · Inspired by Spanner and F1, we are making a NewSQL database. Of course, it's open source. What to build? So we are building a NewSQL database ...
  7. [7]
    PingCAP - Crunchbase Company Profile & Funding
    $$10.4M This year, PingCAP is projected to spend $10.4M on IT ... PingCAP is located in Sunnyvale, California, United States . Who invested in ...
  8. [8]
    TiDB Self-Managed | TiDB Docs
    TiDB is an open-source distributed SQL database that supports Hybrid Transactional and Analytical Processing (HTAP) workloads.
  9. [9]
    TiDB Architecture
    As a distributed database, TiDB is designed to consist of multiple components. These components communicate with each other and form a complete TiDB system.Tidb Server · Storage Servers · Tikv Server
  10. [10]
    How we build TiDB | PingCAP株式会社
    We need TiDB to be easy to maintain so we chose the loose coupling approach. We design the database to be highly layered with a SQL layer and a Key-Value layer.
  11. [11]
    TiDB Storage | TiDB Docs
    Transaction of TiKV adopts the model used by Google in BigTable: Percolator. TiKV's implementation is inspired by this paper, with a lot of optimizations ...Key-Value pairs · Local storage (RocksDB) · Raft protocol · Region
  12. [12]
  13. [13]
    China's Biggest Startups Ditch Oracle and IBM for Home-Made Tech
    Jun 25, 2019 · PingCAP founders Huang Dongxu, right, Liu Qi, center, and Cui Qiu. Source: PingCAP. “A lot of firms that used to resort to Oracle or IBM ...
  14. [14]
    Top 100 Influential Figures in the Domestic Database Industry
    Jun 21, 2024 · In 2015, Liu Qi co-founded the enterprise-level open-source distributed database company PingCAP with Huang Dongxu and Cui Qiu, serving as CEO.
  15. [15]
    China's biggest startups ditch Oracle, IBM for home-made technology
    Jun 25, 2019 · Inspired by Google's Cloud Spanner, which pioneered the distributed database model, the trio - Mr Huang, Liu Qi and Cui Qiu - began creating an ...
  16. [16]
    Building a Large-scale Distributed Storage System Based on Raft
    May 21, 2020 · Since April 2015, we PingCAP have been building TiKV, a large-scale open-source distributed database based on Raft. It's the core storage ...<|control11|><|separator|>
  17. [17]
    Why did we choose Rust over Golang or C/C++ to develop TiKV?
    Sep 26, 2017 · TiKV originates from the end of 2015. Our team was struggling among different language choices such as Pure Go, Go + Cgo, C++11, or Rust. Pure ...
  18. [18]
    PingCAP Raises $50 Million in Series C Round, and Yunqi Partners ...
    Oct 1, 2018 · This is the third investment by Yunqi Partners after it led PingCAP's A round financing in September 2016. PingCAP completed its $15 million B ...
  19. [19]
    How Much Did PingCAP Raise? Funding & Key Investors - Clay
    Apr 7, 2025 · How Much Funding Has PingCAP Raised? · Amount Raised: $5M · Date: August 2016 · Lead Investors: Yunqi Partners · Valuation at Round: Not publicly ...
  20. [20]
    PingCAP Launches TiDB 1.0 | TiDB
    October 16, 2017 – PingCAP Inc., a cutting-edge distributed database technology company, officially announces the release of TiDB 1.0.
  21. [21]
    TiDB 2.0 is Ready - Faster, Smarter, and Battle-Tested
    Apr 29, 2018 · TiDB 2.0 is ready! Experience faster, smarter, and battle-tested HTAP database technology with TiDB 2.0 today.
  22. [22]
    TiDB 4.0 GA, Gearing You Up for an Unpredictable World with a ...
    Jun 17, 2020 · TiDB 4.0 is a real-time HTAP, truly elastic, cloud-native database, which meets your application requirements in various scenarios.Serverless Tidb · Tidb 4.0 Achieves Faster... · Tidb 4.0 Is Smarter And More...
  23. [23]
    What's New in TiDB 5.0
    Apr 7, 2021 · What's New in TiDB 5.0. Release date: April 7, 2021. TiDB version: 5.0.0. In v5.0, PingCAP is dedicated to helping enterprises quickly build ...Compatibility changes · New features · Performance optimization · Improve stability
  24. [24]
    TiDB 6.0: A Leap Towards an Enterprise-Grade Cloud Database
    Apr 7, 2022 · TiDB 6.0 significantly enhances the manageability as an enterprise product and incorporates many of the essential features.
  25. [25]
    TiDB 7.0.0 Release Notes
    Release date: March 30, 2023. TiDB version: 7.0.0-DMR. Quick access: Quick start. In v7.0.0-DMR, the key new features and improvements are as follows:Feature details · Performance · Reliability · Compatibility changes
  26. [26]
    TiDB 8.0.0 Release Notes
    Release date: March 29, 2024. TiDB version: 8.0.0. Quick access: Quick start. 8.0.0 introduces the following key features and improvements.Feature details · SQL · Data migration · Compatibility changes
  27. [27]
    TiDB Cloud Release Notes in 2025
    Oct 21, 2025 · July 15, 2025. Upgrade the default TiDB version of new TiDB Cloud Dedicated clusters from v8. 1.2 to v8. 5.2.
  28. [28]
    PingCAP Launches TiDB X and New AI Capabilities
    Oct 8, 2025 · Today at SCaiLE Summit 2025, PingCAP unveiled TiDB X, a new architecture for context-aware, zero-friction scaling and native AI support.
  29. [29]
    TiDB Computing
    ### Summary of TiDB Server: Stateless SQL Layer
  30. [30]
    TiDB Storage
    ### Summary of TiKV Details
  31. [31]
    TiFlash Overview
    - **TiFlash Overview**:
  32. [32]
    TiDB Monitoring Framework Overview
    The TiDB monitoring framework adopts two open source projects: Prometheus and Grafana. TiDB uses Prometheus to store the monitoring and performance metrics and ...Missing: core | Show results with:core
  33. [33]
    The Lifecycle of a Statement - TiDB Development Guide
    The Life cycle of a Statement. MySQL protocol package with command and statement string. After connecting and getting authenticated, the server is in a ...
  34. [34]
    TiDB Query Execution Plan Overview
    The process of considering query execution plans is known as SQL optimization. The EXPLAIN statement shows the selected execution plan for a given statement.
  35. [35]
    [PDF] A Raft-based HTAP Database - TiDB - VLDB Endowment
    The distributed storage layer consists of a row store (TiKV) and a columnar store (TiFlash). Logically, the data stored in TiKV is an ordered key-value map.
  36. [36]
    TiDB Lightning Overview
    TiDB Lightning is a tool used for importing data at TB scale to TiDB clusters. It is often used for initial data import to TiDB clusters.
  37. [37]
    TiDB Data Migration Overview
    Compatibility with MySQL. DM is compatible with the MySQL protocol and most of the features and syntax of MySQL 5.7 and MySQL 8.0. Replicating DML and DDL ...
  38. [38]
    Horizontal Scaling vs. Vertical Scaling: Choosing the Right Strategy
    Several core mechanisms underpin TiDB's horizontal scalability: Automatic Sharding (Regions): TiDB automatically partitions table data into Regions ...
  39. [39]
    TiKV Overview - TiDB Docs
    The Region size is currently 256 MiB by default. This mechanism helps the PD component to balance Regions among nodes in a TiKV cluster.Architecture Overview · Region and RocksDB · Region and Raft Consensus...
  40. [40]
    How Flipkart Scales Over 1M QPS with Zero Downtime Maintenance
    May 29, 2025 · These tests demonstrated that TiDB could handle over 1 million QPS with 7.4 ms P99 latency and 120K writes per second at 13 ms. These benchmarks ...
  41. [41]
    Business Growth: How TiDB Scales Petabyte-Level Data Volumes
    Dec 18, 2024 · Data Volume Explosion: Rapid user activity generates terabytes (TBs) to petabytes (PBs) of data that must be stored and managed efficiently. ...
  42. [42]
    How We Reduced Multi-region Read Latency and Network Traffic by ...
    Feb 19, 2020 · TiDB reduced multi-region read latency and network traffic by 50% using Follower Read, Follower Replication, and a proxy service to reduce ...
  43. [43]
    MySQL Compatibility - TiDB Docs
    TiDB is highly compatible with the MySQL protocol and the common features and syntax of MySQL 5.7 and MySQL 8.0. The ecosystem tools for MySQL (PHPMyAdmin, ...).
  44. [44]
    Connect to TiDB - TiDB Docs
    TiDB supports the MySQL Client/Server Protocol, which allows most client drivers and ORM frameworks to connect to TiDB just as they connect to MySQL.
  45. [45]
    Window Functions - TiDB Docs
    Window Functions. The usage of window functions in TiDB is similar to that in MySQL 8.0. For details, see MySQL Window Functions.
  46. [46]
    TiDB 7.4 Release: Officially Compatible with MySQL 8.0
    Jun 21, 2024 · After supporting the complete functions of MySQL 5.7, TiDB continues to add support for new features released in MySQL 8.0. The recent version ...
  47. [47]
    Choose Driver or ORM - TiDB Docs
    TiDB is highly compatible with the MySQL protocol but some features are incompatible with MySQL. For a full list of compatibility differences, see MySQL ...
  48. [48]
    Third-Party Tools Supported by TiDB - TiDB Docs
    TiDB is highly compatible with the MySQL protocol, so most of the MySQL drivers, ORM frameworks, and other tools that adapt to MySQL are compatible with TiDB.
  49. [49]
  50. [50]
    System Variables - TiDB Docs
    Additionally, TiDB presents several MySQL variables as both readable and settable. This is required for compatibility, because it is common for both ...
  51. [51]
    TiDB vs Traditional Databases: Scalability and Performance
    Sep 5, 2024 · TiDB excels in horizontal scaling, where adding more nodes to the cluster can increase capacity and performance without downtime. The separation ...
  52. [52]
    ACID Transactions in Distributed Databases - TiDB
    Oct 31, 2025 · Percolator uses the snapshot isolation model. This allows transactions to take snapshots of the database at initiation, promote parallel updates ...
  53. [53]
  54. [54]
    TiDB Transaction Isolation Levels
    Transaction retries in TiDB's optimistic concurrency control might fail, leading to a final failure of the transaction, while in TiDB's pessimistic concurrency ...
  55. [55]
    Ensuring Data Consistency in Distributed Databases - TiDB
    Sep 26, 2024 · A diagram comparing the ACID, BASE, and CAP Theorem consistency models. ACID: Atomicity: Ensures that transactions are all-or-nothing.
  56. [56]
    TiDB Optimistic Transaction Model
    In the case that concurrent transactions frequently modify the same rows (a conflict), optimistic transactions may perform worse than pessimistic transactions.
  57. [57]
    TiDB Pessimistic Transaction Mode
    TiDB supports the pessimistic transaction mode on top of the optimistic transaction model. This document describes the features of the TiDB pessimistic ...
  58. [58]
    Optimistic Transactions and Pessimistic Transactions - TiDB Docs
    The optimistic transaction model commits the transaction directly, and rolls back when there is a conflict. By contrast, the pessimistic transaction model ...
  59. [59]
    Troubleshoot Lock Conflicts - TiDB Docs
    This document describes how to use Lock View to troubleshoot lock issues and how to deal with common lock conflict issues in optimistic and pessimistic ...
  60. [60]
    Ensuring Data Consistency in Distributed Systems - TiDB
    Nov 30, 2024 · By default, TiDB provides linearizability (strong consistency), ensuring each transaction appears instantaneously from any client's perspective.
  61. [61]
    Transactions - TiDB Docs
    TiDB supports explicit transactions (use [BEGIN|START TRANSACTION] and COMMIT to define the start and end of the transaction) and implicit transactions (SET ...).
  62. [62]
    Latency Breakdown - TiDB Docs
    Commit. The commit duration can be broken down into four metrics: Get_latest_ts_time records the duration of getting the latest TSO in async-commit or single-phase ...
  63. [63]
    Performance Analysis and Tuning - TiDB Docs
    Learn how to optimize the database system based on database time and how to utilize the TiDB Performance Overview dashboard for performance analysis and tuning.
  64. [64]
  65. [65]
    TiDB Operator Source Code Reading (I): Overview
    Mar 23, 2021 · TiDB Operator is an automatic operation system for TiDB in Kubernetes. As a Kubernetes Operator tailored for TiDB, it is widely used by TiDB users to manage ...
  66. [66]
    TiDB: Scalable Cloud-Native SQL Database Solution
    Sep 15, 2025 · Drawing inspiration from Google's Spanner and F1, TiDB emerged as an innovative open-source project that sought to reconcile the demands of ...
  67. [67]
    Optimizing Cloud-Native Apps with TiDB's Scalable Architecture | TiDB
    TiDB Operator simplifies these processes by managing TiDB clusters on Kubernetes, enabling self-healing and adapting quickly to workload changes. Solutions ...
  68. [68]
    Best Practices for TiDB on AWS Cloud | PingCAP株式会社
    Nov 24, 2020 · The storage layer is decoupled from the Elastic Compute Cloud (EC2) instance, and therefore is resilient to host failure and is easy to scale.
  69. [69]
    How PingCAP transformed TiDB into a serverless DBaaS using ...
    Nov 14, 2023 · In July 2023, PingCAP released TiDB Serverless, a fully managed, autonomous DBaaS offering of TiDB. However, based on TiDB's existing ...
  70. [70]
    TiDB Cloud FAQs
    For new TiDB Cloud Dedicated clusters, the default TiDB version is v8.5.2 starting from July 15, 2025. For TiDB Cloud Starter and TiDB Cloud Essential clusters, ...
  71. [71]
    Introducing A New Foundation for Distributed SQL - TiDB X
    Oct 8, 2025 · Today, at SCaiLE Summit 2025, we're announcing TiDB X – a breakthrough new architecture for TiDB that redefines how distributed SQL databases ...
  72. [72]
    How We Trace a KV Database with Less than 5% Performance Impact
    Jun 30, 2021 · This article describes how we achieved tracing all requests' time consumption in TiKV with less than 5% performance impact.
  73. [73]
    Security - TiDB Docs
    TiDB Cloud provides a robust and flexible security framework designed to protect data, enforce access control, and meet modern compliance standards.
  74. [74]
    Enhancing Data Security and Privacy in Distributed SQL Database
    Dec 2, 2024 · TiDB deploys role-based access control (RBAC), enabling administrators to assign precise permissions to users based on predefined roles. This ...
  75. [75]
    Enable TLS Between TiDB Clients and Servers
    To use connections secured with TLS, you first need to configure the TiDB server to enable TLS. Then you need to configure the client application to use TLS.
  76. [76]
    TiDB Cloud Security: Protecting Data Without Added Complexity
    Mar 27, 2025 · Explore the security features behind TiDB Cloud and learn when a self-managed TiDB deployment might be the best fit.
  77. [77]
    Explore HTAP - TiDB Docs
    TiDB HTAP can handle the massive data that increases rapidly, reduce the cost of DevOps, and be deployed in either self-hosted or cloud environments easily.
  78. [78]
    Transforming AI with Scalable Data Management | TiDB
    TiDB's storage engine, in conjunction with TiFlash, ... speedups of 10x or more. This is crucial for ...
  79. [79]
    HTAP Queries - TiDB Docs
    HTAP stands for Hybrid Transactional and Analytical Processing. ... TiDB databases can perform both transactional and analytical tasks, which greatly simplifies ...
  80. [80]
    Best Practices for PD Scheduling - TiDB Docs
    Because you cannot actually distribute a single hotspot, you need to manually add a split-region operator to split such a region. The load of some nodes is ...
  81. [81]
    Schedule Replicas by Topology Labels - TiDB Docs
    To improve the high availability and disaster recovery capability of TiDB clusters, it is recommended that TiKV nodes are physically scattered as much as possible.
  82. [82]
    High Availability in TiDB Cloud Serverless
    TiDB ensures high availability and data durability using the Raft consensus algorithm. This algorithm consistently replicates data changes across multiple nodes ...
  83. [83]
    A TiKV Source Code Walkthrough - Raft in TiKV - TiDB
    Jul 28, 2017 · election_tick : When a Follower hasn't received the message sent by its Leader after the election_tick time, then there will be a new election ...
  84. [84]
    TiDB Log Backup and PITR Guide
    To restore the cluster to any point in time within the backup retention period, you can use tiup br restore point. When you run this command, you need to ...
  85. [85]
    TiDB Backup & Restore Overview
    By running the br restore point command, you can restore the latest snapshot backup data before the recovery time point and log backup data to a specified time.
  86. [86]
    Two Availability Zones in One Region Deployment - TiDB Docs
    If the disaster recovery AZ fails and a few Voter replicas are lost, the cluster automatically switches to the asynchronous replication mode.
  87. [87]
    TiDB Cluster Alert Rules
    This document describes the alert rules for different components in a TiDB cluster, including the rule descriptions and solutions of the alert items.
  88. [88]
    TiDB Operator Architecture
    Starting from TiDB Operator v1.1, the TiDB cluster, monitoring, initialization, backup, and other components are deployed and managed using CR.
  89. [89]
    Transforming TiDB with AI: HTAP, Scalability & Real-World Cases
    Aug 12, 2024 · By leveraging AI for predictive maintenance, TiDB ensures high availability and reduces the risk of unplanned outages.
  90. [90]
    Vector Search Overview - TiDB Docs
    You can store vector embeddings in TiDB and perform vector search queries to find the most relevant data using these data types.
  91. [91]
    Vector Search Index - TiDB Docs
    HNSW is one of the most popular vector indexing algorithms. The HNSW index provides good performance with relatively high accuracy, up to 98% in specific cases.
  92. [92]
    TiDB Vector Search Public Beta
    Jun 25, 2024 · With built-in vector search in TiDB, you can develop AI applications directly, eliminating the need for additional databases or tech stacks.
  93. [93]
    Hybrid Search with TiDB: Combining Full-Text and Vector Search for ...
    TiDB's unique architecture facilitates seamless Hybrid Search integration. It streamlines RAG pipelines and sets a new standard for AI applications' efficacy ...
  94. [94]
    Build Gen-AI Applications with TiDB
    TiDB supports vector search, full-text search, and SQL-native hybrid queries so your LLMs always get the most relevant, grounded information. Graph-based ...
  95. [95]
    TiFlash Overview - TiDB Docs
    TiFlash is the key component that makes TiDB essentially a Hybrid Transactional/Analytical Processing (HTAP) database. As a columnar storage extension of TiKV, ...
  96. [96]
    Storing Billions of Vectors with TiDB Serverless
    May 21, 2024 · Learn how TiDB Serverless incorporates cutting-edge vector storage mechanisms designed specifically to handle storing billions of vectors.
  97. [97]
    Introduce Vector Search Indexes in TiDB
    Jun 3, 2024 · Explore how TiDB implements vector search indexes using HNSW, and how they can be utilized for efficient nearest neighbor searches.
  98. [98]
    Deploy a TiDB Cluster Using TiUP
    Deploy a TiDB Cluster Using TiUP · Step 1. Prerequisites and prechecks · Step 2. Deploy TiUP on the control machine · Step 3. Initialize the cluster topology file.
  99. [99]
    pingcap/tidb-ansible - GitHub
    Jun 24, 2021 · TiDB-Ansible is a TiDB cluster deployment tool developed by PingCAP, based on Ansible playbook. TiDB-Ansible enables you to quickly deploy a new TiDB cluster.
  100. [100]
  101. [101]
    Ansible Deployment - TiKV
    Use TiDB-Ansible to deploy a TiKV cluster on multiple nodes. This guide describes how to install and deploy TiKV using Ansible.
  102. [102]
    pingcap/tidb-docker-compose - GitHub
    Jul 26, 2025 · You can customize TiDB cluster configuration by editing docker-compose.yml and the above config files if you know what you're doing.
  103. [103]
    Minimal Deployment Topology - TiDB Docs
    This document describes the minimal deployment topology of TiDB clusters.
  104. [104]
    TiDB Software and Hardware Requirements
    This document describes the software and hardware requirements for deploying and running the TiDB database.
  105. [105]
    Select Your Cluster Plan - TiDB Docs
    TiDB Cloud Serverless (now Starter) is a fully managed, multi-tenant TiDB offering. It delivers an instant, autoscaling MySQL-compatible database and offers a ...
  106. [106]
    TiDB Cloud Starter: Our Renamed Auto-Scaling Plan
    Jul 22, 2025 · TiDB Cloud Serverless will now become TiDB Cloud Starter, providing a clear path for both existing users and the next wave of builders.
  107. [107]
    TiDB Operator Overview
    TiDB Operator is an automatic operation system for TiDB clusters on Kubernetes. It provides a full management life-cycle for TiDB including deployment, upgrades ...
  108. [108]
    Deploy TiDB Operator on Kubernetes
    These two components are stateless and deployed via Deployment. You can customize resource limit, request, and replicas in the values.yaml file.
  109. [109]
    Get Started with TiDB on Kubernetes
    This document introduces how to create a simple Kubernetes cluster and use it to deploy a basic test TiDB cluster using TiDB Operator.
  110. [110]
    Deploy TiDB on AWS EKS
    This document describes how to deploy a TiDB cluster on AWS Elastic Kubernetes Service (EKS). To deploy TiDB Operator and the TiDB cluster in a self-managed ...
  111. [111]
    Deploy TiDB on Google Cloud GKE
    This document describes how to deploy a Google Kubernetes Engine (GKE) cluster and deploy a TiDB cluster on GKE.
  112. [112]
    Deploy TiDB on Azure AKS
    Before deploying a TiDB cluster on Azure AKS, perform the following operations: Install Helm 3 for deploying TiDB Operator. Deploy a Kubernetes (AKS) cluster ...
  113. [113]
    Transforming App Development with Serverless Computing - TiDB
    Mar 14, 2025 · Discover how serverless computing and TiDB enhance app development with scalability, cost-efficiency, and simplified management.
  114. [114]
    TiDB Lightning Data Sources
    TiDB Lightning supports importing data from CSV, SQL, and Parquet files. It also supports schema files and compressed files.
  115. [115]
    Dumpling Overview | TiDB Docs
    Jul 2, 2020 · Dumpling exports data stored in TiDB/MySQL as SQL or CSV data files and can be used to make a logical full backup or export.
  116. [116]
    DML Replication Mechanism in Data Migration - TiDB Docs
    This document introduces the complete processing flow of DML events in DM, including the logic of binlog reading, filtering, routing, transformation, ...
  117. [117]
    Usage Overview of TiDB Backup and Restore
    If you have started log backup and regularly performed a full backup, you can run the tiup br restore point command to restore data to any time point within the backup retention period.
  118. [118]
    TiDB Snapshot Backup and Restore Command Manual
    TiDB Snapshot Backup and Restore Command Manual describes commands for backing up and restoring cluster snapshots, databases, and tables.
  119. [119]
    Encryption at Rest - TiDB Docs
    TiKV supports KMS encryption for three platforms: AWS, Google Cloud, and Azure. Depending on the platform where your service is deployed, you can choose one of ...
  120. [120]
    Overview of TiDB Backup & Restore Architecture - TiDB Docs
    You can use Backup & Restore (BR) and TiDB Operator to access these features, and create tasks to back up data from TiKV nodes or restore data to TiKV nodes.
  121. [121]
    How to Back Up and Restore a 10-TB Cluster at 1+ GB/s - TiDB
    Apr 20, 2020 · BR enables backup and restore to horizontally scale; that is, you can increase BR's backup and restore speeds by adding new TiKV instances.
  122. [122]
    Cluster Recovery: How TiDB Redefines Large-Scale Data Restores
    Jan 31, 2025 · Dive into the performance improvements, challenges overcome, and innovations that make TiDB 8.1 a leader in large-scale cluster recovery.
  123. [123]
    TiCDC Overview - TiDB Docs
    High availability with no single point of failure, supporting dynamically adding and deleting TiCDC nodes. Cluster management through Open API, including ...
  124. [124]
    pingcap/tidb-binlog - GitHub
    The best way to install TiDB-Binlog is via TiDB-Binlog-Ansible.
  125. [125]
    Deploy TiDB Binlog
    Install Helm and configure it with the official PingCAP chart. Deploy TiDB Binlog in a TiDB cluster. TiDB Binlog is disabled in the TiDB cluster by default.
  126. [126]
    I Like To Move IT, Move IT - Replication in TiDB & MySQL - Fosdem
    Use cases include easily reversible TiDB version upgrades, cross-region high availability with standby TiDB clusters, and real-time Change Data Capture into ...
  127. [127]
    TiCDC New Architecture - TiDB Docs
    Downstream Adapter, as the stateless component, uses a lightweight scheduling mechanism that allows quick migration of replication tasks between instances.
  128. [128]
    Enhancing Real-Time Analytics with TiDB and AI Integration
    Mar 23, 2025 · TiDB's capability to manage both OLTP and OLAP workloads enables seamless integration of AI models, resulting in enhanced real-time analytics.
  129. [129]
    How to Resolve High Latency in CDC - Translated - TiDB Forum
    Jun 21, 2024 · Splitting transactions can significantly reduce the latency and memory consumption of the MySQL sink when synchronizing large transactions.
  130. [130]
    TiDB Adoption at Pinterest. Authors - Medium
    Jul 19, 2024 · Change Data Capture (CDC). CDC is an essential requirement for many near-real-time use cases to stream database changes. It is also needed ...