diff --git a/key-features.md b/key-features.md
deleted file mode 100644
index 825279ba8bab9..0000000000000
--- a/key-features.md
+++ /dev/null
@@ -1,99 +0,0 @@
----
-title: Key Features
-summary: Key features of the TiDB database platform.
-aliases: ['/docs/stable/key-features/','/docs/v4.0/key-features/']
----
-
-# Key Features
-
-## Horizontal scalability
-
-TiDB expands both SQL processing and storage capacity by simply adding new nodes. This makes infrastructure capacity planning easier and more cost-effective than with traditional relational databases, which only scale vertically.
-
-## MySQL compatible syntax
-
-To your applications, TiDB acts like a MySQL 5.7 server. You can continue to use all of the existing MySQL client libraries, and in many cases, you will not need to change a single line of application code.
-
-TiDB does not have 100% MySQL compatibility because this layer was built from scratch to maximize the performance advantages inherent to a distributed system. We believe in being transparent about the level of MySQL compatibility that TiDB provides. Please check out the list of [known compatibility differences](/mysql-compatibility.md).
-
-## Replicate from and to MySQL
-
-TiDB supports replicating from a MySQL or MariaDB installation using its Data Migration (DM) toolchain. Replicating in the other direction, from TiDB to MySQL, is possible using TiDB Binlog.
-
-We believe that being able to replicate in both directions lowers the risk of evaluating or migrating to TiDB from MySQL.
-
-## Distributed transactions with strong consistency
-
-TiDB internally shards tables into small range-based chunks that we refer to as "Regions". Each Region defaults to approximately 100 MiB in size, and TiDB uses two-phase commit internally to ensure that Regions are maintained in a transactionally consistent way.
-
-Transactions in TiDB are strongly consistent, with snapshot isolation level consistency. For more information, see transaction [behavior and performance differences](/transaction-isolation-levels.md). This makes TiDB closer in semantics to traditional relational databases than to the newer NoSQL systems that use eventual consistency.
-
-These behaviors are transparent to your application(s), which only need to connect to TiDB using a MySQL 5.7 compatible client library.
-
-## Cloud native architecture
-
-TiDB is designed to work in the cloud -- public, private, or hybrid -- making deployment, provisioning, operations, and maintenance simple.
-
-The storage layer of TiDB, called TiKV, [became](https://www.cncf.io/blog/2018/08/28/cncf-to-host-tikv-in-the-sandbox/) a [Cloud Native Computing Foundation](https://www.cncf.io/) member project in 2018. The architecture of the TiDB platform also allows SQL processing and storage to be scaled independently of each other in a very cloud-friendly manner.
-
-## Minimize ETL with HTAP
-
-TiDB is designed to support both transaction processing (OLTP) and analytical processing (OLAP) workloads. This means that while you may have traditionally transacted on MySQL and then Extracted, Transformed, and Loaded (ETL) data into a column store for analytical processing, this step is no longer required.
-
-With business trends such as the move from two-day delivery to instant delivery, it is important to be able to run analytics with minimal delay. The future is in HTAP databases, which can perform a _hybrid_ of transactional and analytical processing.
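-
-As a minimal, hedged sketch of what this means in practice (the `orders` table, its columns, and the values are hypothetical, used only for illustration), both sides of an HTAP workload can run against the same TiDB endpoint:
-
-```sql
--- Hypothetical schema, for illustration only.
--- OLTP: a transactional write, exactly as it would run on MySQL.
-BEGIN;
-INSERT INTO orders (customer_id, amount, created_at) VALUES (42, 99.90, NOW());
-COMMIT;
-
--- OLAP: an analytical aggregate over the same live data,
--- with no ETL step into a separate column store.
-SELECT customer_id, SUM(amount) AS total_spent
-FROM orders
-WHERE created_at >= NOW() - INTERVAL 30 DAY
-GROUP BY customer_id
-ORDER BY total_spent DESC
-LIMIT 10;
-```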
-
-## Fault tolerance & recovery with Raft
-
-TiDB uses the Raft consensus algorithm to ensure that data is safely replicated throughout storage in Raft groups. In the event of a failure, a Raft group automatically elects a new leader to replace the failed member, and the TiDB cluster self-heals without any manual intervention.
-
-Failure and self-healing operations are also transparent to applications. TiDB servers retry accessing the data after the leadership change, with the only impact being slightly higher latency for queries that touch the affected data between the time the failure is detected and the time it is repaired.
-
-## Automatic rebalancing
-
-The storage in TiKV is automatically rebalanced to match changes in your workload. For example, if part of your data is accessed more frequently, this hotspot is detected and the data may be rebalanced among other TiKV servers. Chunks of data ("Regions" in TiDB terminology) are automatically split or merged as needed.
-
-This removes some of the headaches associated with maintaining a large database cluster, and also leads to better utilization than the traditional master-slave read-write splitting commonly used with MySQL deployments.
-
-## Deployment and orchestration with Ansible, Kubernetes, Docker
-
-TiDB supports several deployment and orchestration methods, such as Ansible, Kubernetes, and Docker. Whether your environment is bare metal, virtualized, or containerized, TiDB can be deployed, upgraded, operated, and maintained using the toolset best suited to your needs.
-
-## JSON support
-
-TiDB supports a built-in JSON data type and a set of built-in functions to search, manipulate, and create JSON data. This enables you to build your application without enforcing a strict schema up front.
-
-## Spark integration
-
-TiDB natively supports an Apache Spark plug-in, called TiSpark, with a SparkSQL interface that enables users to run analytical workloads using Spark directly on TiKV, where the data is stored. This plug-in does not interfere with transactional processing in the TiDB server. This integration takes advantage of TiDB's modular architecture to support HTAP workloads.
-
-## Read historical data without restoring from backup
-
-Many restore-from-backup events are the result of accidental deletion or modification of the wrong data. With TiDB, you can access older row versions retained by MVCC by specifying a past timestamp at which you would like to read the data.
-
-Your session is placed in read-only mode while reading earlier versions of rows, but you can use this to export the data and reload it at the current time if required.
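-
-A minimal sketch of this feature (the table name, its contents, and the timestamp are hypothetical; `tidb_snapshot` is the session variable TiDB uses for historical reads):
-
-```sql
--- Hypothetical table and timestamp, for illustration only.
--- Read rows as they existed at the given point in time;
--- the session becomes read-only while the snapshot is set.
-SET @@tidb_snapshot = "2020-05-01 10:00:00";
-SELECT * FROM accounts WHERE id = 12345;
-
--- Clear the snapshot to return the session to the current data.
-SET @@tidb_snapshot = "";
-```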
-
-## Fast import and restore of data
-
-TiDB supports fast import of both Mydumper- and CSV-formatted data using an optimized insert mode that disables redo logging and applies a number of other optimizations.
-
-With TiDB Lightning, you can import data into TiDB at over 100 GiB/hour using production-grade hardware.
-
-## Hybrid of column and row storage
-
-TiDB supports storing data in both row-oriented and (coming soon) column-oriented storage. This enables a wide spectrum of both transactional and analytical queries to be executed efficiently in TiDB and TiSpark. The TiDB optimizer is also able to determine which queries are best served by column storage and route them appropriately.
-
-## SQL plan management
-
-In both MySQL and TiDB, optimizer hints are available to override the default query execution plan with a better-known plan. The problem with this approach is that it requires an application developer to modify the query text to inject the hint. This can also be undesirable when an ORM is used to generate the query.
-
-Starting with TiDB 3.0, you can bind queries to a specific execution plan directly within the TiDB server. This method is entirely transparent to application code.
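-
-A hedged sketch of what such a binding looks like (the table `t`, column `a`, and index `idx_a` are hypothetical; `CREATE GLOBAL BINDING ... USING ...` is the TiDB statement for this feature):
-
-```sql
--- Hypothetical table and index, for illustration only.
--- Every statement matching the first SELECT is executed with the
--- hinted plan of the second SELECT, without changing application code.
-CREATE GLOBAL BINDING FOR
-    SELECT * FROM t WHERE a = 1
-USING
-    SELECT /*+ USE_INDEX(t, idx_a) */ * FROM t WHERE a = 1;
-```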
-
-## Open source
-
-TiDB has been released under the Apache 2.0 license since its initial launch in 2015. The TiDB server has (to our knowledge) the highest contributor count on GitHub of any relational database project.
-
-## Online schema changes
-
-TiDB implements the _Online, Asynchronous Schema Change_ algorithm first described in [Google's F1 paper](https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/41376.pdf).
-
-In simplified terms, this means that TiDB is able to make changes to the schema across its distributed architecture without blocking either read or write operations. There is no need to use an external schema change tool or to flip between masters and slaves, as is common in large MySQL deployments.
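-
-As a small, hedged illustration (the table and column names are hypothetical), a schema change such as the following runs online while the table continues to serve reads and writes:
-
-```sql
--- Hypothetical table and column, for illustration only.
--- Executes as an online, asynchronous schema change;
--- concurrent SELECT/INSERT/UPDATE statements are not blocked.
-ALTER TABLE orders ADD COLUMN delivery_note VARCHAR(255);
-```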
diff --git a/tidb-lightning/tidb-lightning-table-filter.md b/tidb-lightning/tidb-lightning-table-filter.md
deleted file mode 100644
index cbc54f98d5627..0000000000000
--- a/tidb-lightning/tidb-lightning-table-filter.md
+++ /dev/null
@@ -1,129 +0,0 @@
----
-title: TiDB Lightning Table Filter
-summary: Use black and white lists to filter out tables, ignoring them during import.
-aliases: ['/docs/stable/tidb-lightning/tidb-lightning-table-filter/','/docs/v4.0/tidb-lightning/tidb-lightning-table-filter/','/docs/stable/reference/tools/tidb-lightning/table-filter/']
----
-
-# TiDB Lightning Table Filter
-
-TiDB Lightning supports setting up black and white lists to ignore certain databases and tables. This can be used to skip cache tables, or to manually partition the data source on shared storage so that multiple Lightning instances can work together without interfering with each other.
-
-The filtering rules are similar to MySQL's `replication-rules-db`/`replication-rules-table`.
-
-## Filtering databases
-
-```toml
-[black-white-list]
-do-dbs = ["pattern1", "pattern2", "pattern3"]
-ignore-dbs = ["pattern4", "pattern5"]
-```
-
-* If the `do-dbs` array in the `[black-white-list]` section is not empty,
-    * If the name of a database matches *any* pattern in the `do-dbs` array, the database is included.
-    * Otherwise, the database is skipped.
-* Otherwise, if the name matches *any* pattern in the `ignore-dbs` array, the database is skipped.
-* If a database's name matches patterns in *both* the `do-dbs` and `ignore-dbs` arrays, the database is included.
-
-The pattern can either be a simple name, or a regular expression in [Go dialect](https://golang.org/pkg/regexp/syntax/#hdr-syntax) if it starts with a `~` character.
-
-> **Note:**
->
-> The system databases `INFORMATION_SCHEMA`, `PERFORMANCE_SCHEMA`, `mysql`, and `sys` are always black-listed, regardless of the table filter settings.
-
-## Filtering tables
-
-```toml
-[[black-white-list.do-tables]]
-db-name = "db-pattern-1"
-tbl-name = "table-pattern-1"
-
-[[black-white-list.do-tables]]
-db-name = "db-pattern-2"
-tbl-name = "table-pattern-2"
-
-[[black-white-list.do-tables]]
-db-name = "db-pattern-3"
-tbl-name = "table-pattern-3"
-
-[[black-white-list.ignore-tables]]
-db-name = "db-pattern-4"
-tbl-name = "table-pattern-4"
-
-[[black-white-list.ignore-tables]]
-db-name = "db-pattern-5"
-tbl-name = "table-pattern-5"
-```
-
-* If the `do-tables` array is not empty,
-    * If the qualified name of a table matches *any* pair of patterns in the `do-tables` array, the table is included.
-    * Otherwise, the table is skipped.
-* Otherwise, if the qualified name matches *any* pair of patterns in the `ignore-tables` array, the table is skipped.
-* If a table's qualified name matches pairs of patterns in *both* the `do-tables` and `ignore-tables` arrays, the table is included.
-
-Note that the database filtering rules are applied before Lightning considers the table filtering rules. This means that if a database is ignored by `ignore-dbs`, none of its tables are considered, even if they match pattern pairs in the `do-tables` array.
-
-## Example
-
-To illustrate how these rules work, suppose the data source contains the following tables:
-
-```
-`logs`.`messages_2016`
-`logs`.`messages_2017`
-`logs`.`messages_2018`
-`forum`.`users`
-`forum`.`messages`
-`forum_backup_2016`.`messages`
-`forum_backup_2017`.`messages`
-`forum_backup_2018`.`messages`
-`admin`.`secrets`
-```
-
-Using this configuration:
-
-```toml
-[black-white-list]
-do-dbs = [
-    "forum_backup_2018",  # rule A
-    "~^(logs|forum)$",    # rule B
-]
-ignore-dbs = [
-    "~^forum_backup_",    # rule C
-]
-
-[[black-white-list.do-tables]] # rule D
-db-name = "logs"
-tbl-name = "~_2018$"
-
-[[black-white-list.ignore-tables]] # rule E
-db-name = "~.*"
-tbl-name = "~^messages.*"
-
-[[black-white-list.do-tables]] # rule F
-db-name = "~^forum.*"
-tbl-name = "messages"
-```
-
-First, apply the database rules:
-
-| Database                  | Outcome                                     |
-|---------------------------|---------------------------------------------|
-| `` `logs` ``              | Included by rule B                          |
-| `` `forum` ``             | Included by rule B                          |
-| `` `forum_backup_2016` `` | Skipped by rule C                           |
-| `` `forum_backup_2017` `` | Skipped by rule C                           |
-| `` `forum_backup_2018` `` | Included by rule A (rule C does not apply)  |
-| `` `admin` ``             | Skipped, since `do-dbs` is not empty and its name does not match any pattern |
-
-Then, apply the table rules:
-
-| Table                                | Outcome                                     |
-|--------------------------------------|---------------------------------------------|
-| `` `logs`.`messages_2016` ``         | Skipped by rule E                           |
-| `` `logs`.`messages_2017` ``         | Skipped by rule E                           |
-| `` `logs`.`messages_2018` ``         | Included by rule D (rule E does not apply)  |
-| `` `forum`.`users` ``                | Skipped, since `do-tables` is not empty and its qualified name does not match any pair of patterns |
-| `` `forum`.`messages` ``             | Included by rule F (rule E does not apply)  |
-| `` `forum_backup_2016`.`messages` `` | Skipped, since the database is already skipped |
-| `` `forum_backup_2017`.`messages` `` | Skipped, since the database is already skipped |
-| `` `forum_backup_2018`.`messages` `` | Included by rule F (rule E does not apply)  |
-| `` `admin`.`secrets` ``              | Skipped, since the database is already skipped |
diff --git a/tiflash/tiflash-faq.md b/tiflash/tiflash-faq.md
deleted file mode 100644
index 854b1dceae7f6..0000000000000
--- a/tiflash/tiflash-faq.md
+++ /dev/null
@@ -1,29 +0,0 @@
----
-title: TiFlash FAQ
-summary: Learn the frequently asked questions (FAQs) and answers about TiFlash.
-aliases: ['/docs/stable/tiflash/tiflash-faq/','/docs/v4.0/tiflash/tiflash-faq/','/docs/stable/reference/tiflash/faq/']
----
-
-# TiFlash FAQ
-
-This document lists the frequently asked questions (FAQs) and answers about TiFlash.
-
-## Does TiFlash support direct writes?
-
-Currently, TiFlash does not support direct writes. You can only write data to TiKV, which then replicates the data to TiFlash.
-
-## How can I estimate the storage resources if I want to add TiFlash to an existing cluster?
-
-You can evaluate which tables might require acceleration. The size of a single replica of these tables' data is roughly equal to the storage resources required by two TiFlash replicas. Note that you also need to account for the required free space.
-
-## How can data in TiFlash be highly available?
-
-TiFlash restores data through TiKV. As long as the corresponding Regions in TiKV are available, TiFlash can restore the data from those Regions.
-
-## How many TiFlash replicas are recommended?
-
-If you need highly available TiFlash services (rather than highly available data), it is recommended to set up two TiFlash replicas. If you can allow TiKV replicas to provide analytical services while TiFlash is down, a single TiFlash replica is sufficient.
-
-## Should I use TiSpark or the TiDB server for a query?
-
-It is recommended to use the TiDB server if you query a single table mainly with filtering and aggregation, because the TiDB server performs better on such queries over columnar storage. It is recommended to use TiSpark if your queries mainly involve joins.
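-
-As a hedged sketch of the first case (the table and columns are hypothetical), a single-table filter-and-aggregate query is the kind of statement the TiDB server serves well from columnar storage:
-
-```sql
--- Hypothetical table, for illustration only.
--- A single-table aggregation with a selective filter: a good fit
--- for running through the TiDB server rather than TiSpark.
-SELECT region, COUNT(*) AS order_count, SUM(amount) AS total_amount
-FROM orders
-WHERE created_at >= '2020-01-01'
-GROUP BY region;
-```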