From c9fc78315843e243c627fd0b14695a8eb1a4764f Mon Sep 17 00:00:00 2001
From: Ti Chi Robot
Date: Wed, 6 Nov 2024 14:16:51 +0800
Subject: [PATCH] tiup: add kvcdc server topylogy example (#19230) (#19344)

---
 tiup/tiup-cluster-topology-reference.md | 49 +++++++++++++++++++++++++
 1 file changed, 49 insertions(+)

diff --git a/tiup/tiup-cluster-topology-reference.md b/tiup/tiup-cluster-topology-reference.md
index c62bb83df8f1b..924a38bb63aed 100644
--- a/tiup/tiup-cluster-topology-reference.md
+++ b/tiup/tiup-cluster-topology-reference.md
@@ -24,6 +24,7 @@ A topology configuration file for TiDB deployment using TiUP might contain the f
 - [tikv_servers](#tikv_servers): The configuration of the TiKV instance. This configuration specifies the machines to which the TiKV component is deployed.
 - [tiflash_servers](#tiflash_servers): The configuration of the TiFlash instance. This configuration specifies the machines to which the TiFlash component is deployed.
 - [tiproxy_servers](#tiproxy_servers): The configuration of the TiProxy instance. This configuration specifies the machines to which the TiProxy component is deployed.
+- [kvcdc_servers](#kvcdc_servers): The configuration of the [TiKV-CDC](https://tikv.org/docs/7.1/concepts/explore-tikv-features/cdc/cdc/) instance. This configuration specifies the machines to which the TiKV-CDC component is deployed.
 - [cdc_servers](#cdc_servers): The configuration of the TiCDC instance. This configuration specifies the machines to which the TiCDC component is deployed.
 - [tispark_masters](#tispark_masters): The configuration of the TiSpark master instance. This configuration specifies the machines to which the TiSpark master component is deployed. Only one node of TiSpark master can be deployed.
 - [tispark_workers](#tispark_workers): The configuration of the TiSpark worker instance. This configuration specifies the machines to which the TiSpark worker component is deployed.
@@ -463,6 +464,54 @@ tiproxy_servers:
   - host: 10.0.1.22
 ```
 
+### `kvcdc_servers`
+
+`kvcdc_servers` specifies the machines to which the [TiKV-CDC](https://tikv.org/docs/7.1/concepts/explore-tikv-features/cdc/cdc/) services are deployed. It also specifies the service configuration on each machine. `kvcdc_servers` is an array. Each array element contains the following fields:
+
+- `host`: Specifies the machine to which the TiKV-CDC services are deployed. The field value is an IP address and is mandatory.
+
+- `ssh_port`: Specifies the SSH port to connect to the target machine for operations. If it is not specified, the `ssh_port` of the `global` section is used.
+
+- `port`: The listening port of the TiKV-CDC services. The default value is `8600`.
+
+- `deploy_dir`: Specifies the deployment directory. If it is not specified or specified as a relative directory, the directory is generated according to the `deploy_dir` directory configured in `global`.
+
+- `data_dir`: Specifies the directory that TiKV-CDC uses to store temporary files, primarily for sorting (optional). It is recommended that this directory have at least 500 GiB of free disk space.
+
+- `log_dir`: Specifies the log directory. If it is not specified or specified as a relative directory, the log is generated according to the `log_dir` directory configured in `global`.
+
+- `gc-ttl`: The TTL (Time to Live, in seconds) of the service-level GC safepoint in PD set by TiKV-CDC (optional). It is the maximum duration that replication tasks can be suspended. The default value is `86400` (24 hours). Note that suspending replication tasks delays the advance of the TiKV garbage collection safepoint. The longer the `gc-ttl`, the longer changefeeds can be suspended, but the more obsolete data is kept and the more space it occupies. A shorter `gc-ttl` has the opposite effect.
+
+- `tz`: The time zone that the TiKV-CDC services use. TiKV-CDC uses this time zone when internally converting time data types such as timestamp and when replicating data to the downstream. The default value is the local time zone where the process runs.
+
+- `numa_node`: Allocates the NUMA policy to the instance. Before specifying this field, you need to make sure that the target machine has [numactl](https://linux.die.net/man/8/numactl) installed. If this field is specified, cpubind and membind policies are allocated using [numactl](https://linux.die.net/man/8/numactl). This field is the string type. The field value is the ID of the NUMA node, such as "0,1".
+
+- `config`: The address of the configuration file that TiKV-CDC uses (optional).
+
+- `os`: The operating system of the machine specified in `host`. If this field is not specified, the default value is the `os` value in `global`.
+
+- `arch`: The architecture of the machine specified in `host`. If this field is not specified, the default value is the `arch` value in `global`.
+
+- `resource_control`: Resource control for the service. If this field is configured, the field content is merged with the `resource_control` content in `global` (if the two fields overlap, the content of this field takes effect). Then, a systemd configuration file is generated and sent to the machine specified in `host`. The configuration rules of `resource_control` are the same as the `resource_control` content in `global`.
+
+Among the above fields, you cannot modify the following after the deployment:
+
+- `host`
+- `port`
+- `deploy_dir`
+- `data_dir`
+- `log_dir`
+- `arch`
+- `os`
+
+A `kvcdc_servers` configuration example is as follows:
+
+```yaml
+kvcdc_servers:
+  - host: 10.0.1.21
+  - host: 10.0.1.22
+```
+
 ### `cdc_servers`
 
 `cdc_servers` specifies the machines to which the TiCDC services are deployed. It also specifies the service configuration on each machine. `cdc_servers` is an array. Each array element contains the following fields: