diff --git a/deploy/aliyun/README-CN.md b/deploy/aliyun/README-CN.md
index 853367f157..ade1a5a054 100644
--- a/deploy/aliyun/README-CN.md
+++ b/deploy/aliyun/README-CN.md
@@ -11,6 +11,19 @@
 > You can use Alibaba Cloud's [Cloud Shell](https://shell.aliyun.com) service for these operations; all the required tools are pre-installed and configured there.
 
+### Permissions
+
+A complete cluster deployment requires the following permissions:
+- AliyunECSFullAccess
+- AliyunESSFullAccess
+- AliyunVPCFullAccess
+- AliyunSLBFullAccess
+- AliyunCSFullAccess
+- AliyunEIPFullAccess
+- AliyunECIFullAccess
+- AliyunVPNGatewayFullAccess
+- AliyunNATGatewayFullAccess
+
 ## Overview
 
 With the default configuration, we will create:
@@ -42,11 +55,13 @@ export TF_VAR_ALICLOUD_SECRET_KEY=
 
 ```shell
 $ git clone https://github.com/pingcap/tidb-operator
-$ cd tidb-operator/deploy/alicloud
+$ cd tidb-operator/deploy/aliyun
 $ terraform init
 $ terraform apply
 ```
 
+If `terraform apply` reports an error, fix the problem according to the error message (for example, a missing permission) and then run `terraform apply` again.
+
 The whole installation takes about 5 to 10 minutes. Once it completes, the key information of the cluster is printed (run `terraform output` to view this information again):
 
 ```
@@ -55,16 +70,16 @@ Apply complete! Resources: 3 added, 0 changed, 1 destroyed.
 
 Outputs:
 
 bastion_ip = 1.2.3.4
-bastion_key_file = /root/tidb-operator/deploy/alicloud/credentials/tidb-cluster-bastion-key.pem
+bastion_key_file = /root/tidb-operator/deploy/aliyun/credentials/tidb-cluster-bastion-key.pem
 cluster_id = ca57c6071f31f458da66965ceddd1c31b
-kubeconfig_file = /root/tidb-operator/deploy/alicloud/.terraform/modules/a2078f76522ae433133fc16e24bd21ae/kubeconfig_tidb-cluster
+kubeconfig_file = /root/tidb-operator/deploy/aliyun/.terraform/modules/a2078f76522ae433133fc16e24bd21ae/kubeconfig_tidb-cluster
 monitor_endpoint = 1.2.3.4:3000
 region = cn-hangzhou
 tidb_port = 4000
 tidb_slb_ip = 192.168.5.53
 tidb_version = v3.0.0-rc.1
 vpc_id = vpc-bp16wcbu0xhbg833fymmc
-worker_key_file = /root/tidb-operator/deploy/alicloud/credentials/tidb-cluster-node-key.pem
+worker_key_file = /root/tidb-operator/deploy/aliyun/credentials/tidb-cluster-node-key.pem
 ```
 
 You can then operate the cluster with `kubectl` or `helm` (`cluster_name` defaults to `tidb-cluster`):
 
@@ -113,6 +128,13 @@ watch kubectl get pods --namespace tidb -o wide
 $ terraform destroy
 ```
 
+If the Kubernetes cluster was not created successfully, `terraform destroy` will report an error and cannot clean up properly. In that case, remove the Kubernetes resources from the local state manually:
+
+```shell
+$ terraform state list
+$ terraform state rm module.ack.alicloud_cs_managed_kubernetes.k8s
+```
+
 Destroying the cluster takes a relatively long time.
 
 > **Note:** The cloud disks mounted by the monitoring component need to be deleted manually in the Alibaba Cloud console.
diff --git a/deploy/aliyun/README.md b/deploy/aliyun/README.md
index 4397509784..7b8c9c29e3 100644
--- a/deploy/aliyun/README.md
+++ b/deploy/aliyun/README.md
@@ -11,7 +11,18 @@
 - [jq](https://stedolan.github.io/jq/download/) >= 1.6
 - [terraform](https://learn.hashicorp.com/terraform/getting-started/install.html) 0.11.*
 
-> You can use the Alibaba [Cloud Shell](https://shell.aliyun.com) service, which has all the tools pre-installed and properly configured.
+### Permissions
+
+The following permissions are required:
+- AliyunECSFullAccess
+- AliyunESSFullAccess
+- AliyunVPCFullAccess
+- AliyunSLBFullAccess
+- AliyunCSFullAccess
+- AliyunEIPFullAccess
+- AliyunECIFullAccess
+- AliyunVPNGatewayFullAccess
+- AliyunNATGatewayFullAccess
 
 ## Overview
 
@@ -53,6 +64,8 @@ $ terraform init
 $ terraform apply
 ```
 
+If you get an error while running `terraform apply`, fix the error (e.g. a lack of permissions) according to the error message and run `terraform apply` again.
+
 `terraform apply` will take 5 to 10 minutes to create the whole stack. Once complete, basic cluster information will be printed:
 
 > **Note:** You can use the `terraform output` command to get the output again.
 
 ```
@@ -63,16 +76,16 @@ Apply complete! Resources: 3 added, 0 changed, 1 destroyed.
 Outputs:
 
 bastion_ip = 1.2.3.4
-bastion_key_file = /root/tidb-operator/deploy/alicloud/credentials/tidb-cluster-bastion-key.pem
+bastion_key_file = /root/tidb-operator/deploy/aliyun/credentials/tidb-cluster-bastion-key.pem
 cluster_id = ca57c6071f31f458da66965ceddd1c31b
-kubeconfig_file = /root/tidb-operator/deploy/alicloud/.terraform/modules/a2078f76522ae433133fc16e24bd21ae/kubeconfig_tidb-cluster
+kubeconfig_file = /root/tidb-operator/deploy/aliyun/.terraform/modules/a2078f76522ae433133fc16e24bd21ae/kubeconfig_tidb-cluster
 monitor_endpoint = 1.2.3.4:3000
 region = cn-hangzhou
 tidb_port = 4000
 tidb_slb_ip = 192.168.5.53
 tidb_version = v3.0.0-rc.1
 vpc_id = vpc-bp16wcbu0xhbg833fymmc
-worker_key_file = /root/tidb-operator/deploy/alicloud/credentials/tidb-cluster-node-key.pem
+worker_key_file = /root/tidb-operator/deploy/aliyun/credentials/tidb-cluster-node-key.pem
 ```
 
 You can then interact with the ACK cluster using `kubectl` and `helm` (`cluster_name` is `tidb-cluster` by default):
 
@@ -124,6 +137,13 @@ It may take a while to finish destroying the cluster.
 $ terraform destroy
 ```
 
+The Alibaba Cloud Terraform provider does not handle Kubernetes creation errors properly, which will cause an error when destroying. In that case, you can remove the Kubernetes resource from the local state manually and proceed to destroy the remaining resources:
+
+```shell
+$ terraform state list
+$ terraform state rm module.ack.alicloud_cs_managed_kubernetes.k8s
+```
+
 > **Note:** You have to manually delete the cloud disk used by the monitoring node in Aliyun's console after destroying the cluster if you don't need it anymore.
 
 ## Customize
diff --git a/deploy/aliyun/ack/main.tf b/deploy/aliyun/ack/main.tf
index 4d77f696fc..85c2cf52a9 100644
--- a/deploy/aliyun/ack/main.tf
+++ b/deploy/aliyun/ack/main.tf
@@ -11,15 +11,15 @@ provider "alicloud" {}
 
 resource "alicloud_key_pair" "default" {
   count           = "${var.key_pair_name == "" ? 1 : 0}"
-  key_name_prefix = "${var.cluster_name}-key"
-  key_file        = "${var.key_file != "" ? var.key_file : format("%s/%s-key", path.module, var.cluster_name)}"
+  key_name_prefix = "${var.cluster_name_prefix}-key"
+  key_file        = "${var.key_file != "" ? var.key_file : format("%s/%s-key", path.module, var.cluster_name_prefix)}"
 }
 
 # If no vpc_id is specified, create a new VPC
 resource "alicloud_vpc" "vpc" {
   count      = "${var.vpc_id == "" ? 1 : 0}"
   cidr_block = "${var.vpc_cidr}"
-  name       = "${var.cluster_name}-vpc"
+  name       = "${var.cluster_name_prefix}-vpc"
 
   lifecycle {
     ignore_changes = ["cidr_block"]
@@ -32,12 +32,12 @@ resource "alicloud_vswitch" "all" {
   vpc_id            = "${alicloud_vpc.vpc.0.id}"
   cidr_block        = "${cidrsubnet(alicloud_vpc.vpc.0.cidr_block, var.vpc_cidr_newbits, count.index)}"
   availability_zone = "${lookup(data.alicloud_zones.all.zones[count.index%length(data.alicloud_zones.all.zones)], "id")}"
-  name              = "${format("vsw-%s-%d", var.cluster_name, count.index+1)}"
+  name              = "${format("vsw-%s-%d", var.cluster_name_prefix, count.index+1)}"
 }
 
 resource "alicloud_security_group" "group" {
   count       = "${var.group_id == "" ? 1 : 0}"
-  name        = "${var.cluster_name}-sg"
+  name        = "${var.cluster_name_prefix}-sg"
   vpc_id      = "${var.vpc_id != "" ? var.vpc_id : alicloud_vpc.vpc.0.id}"
   description = "Security group for ACK worker nodes"
 }
@@ -55,7 +55,8 @@ resource "alicloud_security_group_rule" "cluster_worker_ingress" {
 
 # Create a managed Kubernetes cluster
 resource "alicloud_cs_managed_kubernetes" "k8s" {
-  name = "${var.cluster_name}"
+  name_prefix = "${var.cluster_name_prefix}"
 
+  // split and join: a workaround for terraform's lack of conditional list expressions, similarly hereinafter
   vswitch_ids = ["${element(split(",", var.vpc_id != "" && (length(data.alicloud_vswitches.default.vswitches) != 0) ? join(",", data.template_file.vswitch_id.*.rendered) : join(",", alicloud_vswitch.all.*.id)), 0)}"]
   key_name    = "${alicloud_key_pair.default.key_name}"
@@ -97,7 +98,7 @@ resource "alicloud_ess_scaling_group" "workers" {
 
   # Remove the newest instance in the oldest scaling configuration
   removal_policies = [
     "OldestScalingConfiguration",
-    "NewestInstance"
+    "NewestInstance",
   ]
 
   lifecycle {
diff --git a/deploy/aliyun/ack/outputs.tf b/deploy/aliyun/ack/outputs.tf
index dc1f5fabdb..288c342855 100644
--- a/deploy/aliyun/ack/outputs.tf
+++ b/deploy/aliyun/ack/outputs.tf
@@ -1,11 +1,11 @@
 output "cluster_id" {
   description = "The id of the ACK cluster."
-  value       = "${alicloud_cs_managed_kubernetes.k8s.id}"
+  value       = "${alicloud_cs_managed_kubernetes.k8s.*.id}"
 }
 
 output "cluster_name" {
   description = "The name of the ACK cluster"
-  value       = "${alicloud_cs_managed_kubernetes.k8s.name}"
+  value       = "${alicloud_cs_managed_kubernetes.k8s.*.name}"
 }
 
 output "cluster_nodes" {
@@ -15,20 +15,20 @@
 
 output "vpc_id" {
   description = "The vpc id of the ACK cluster"
-  value       = "${alicloud_cs_managed_kubernetes.k8s.vpc_id}"
+  value       = "${alicloud_cs_managed_kubernetes.k8s.*.vpc_id}"
 }
 
 output "vswitch_ids" {
   description = "The vswitch ids of the ACK cluster"
-  value       = "${alicloud_cs_managed_kubernetes.k8s.vswitch_ids}"
+  value       = "${alicloud_cs_managed_kubernetes.k8s.*.vswitch_ids}"
 }
 
 output "security_group_id" {
   description = "The security_group_id of the ACK cluster"
-  value       = "${alicloud_cs_managed_kubernetes.k8s.security_group_id}"
+  value       = "${alicloud_cs_managed_kubernetes.k8s.*.security_group_id}"
 }
 
 output "kubeconfig_filename" {
   description = "The filename of the generated kubectl config."
-  value       = "${path.module}/kubeconfig_${var.cluster_name}"
+  value       = "${path.module}/kubeconfig_${var.cluster_name_prefix}"
 }
diff --git a/deploy/aliyun/ack/variables.tf b/deploy/aliyun/ack/variables.tf
index db7e4e4a9e..f36ad7b217 100644
--- a/deploy/aliyun/ack/variables.tf
+++ b/deploy/aliyun/ack/variables.tf
@@ -2,7 +2,7 @@ variable "region" {
   description = "Alicloud region"
 }
 
-variable "cluster_name" {
+variable "cluster_name_prefix" {
   description = "Kubernetes cluster name"
   default     = "ack-cluster"
 }
@@ -142,8 +142,8 @@ EOS
     "internet_charge_type"       = "PayByTraffic"
     "internet_max_bandwidth_in"  = 10
     "internet_max_bandwidth_out" = 10
-    "node_taints" = ""
-    "node_labels" = ""
+    "node_taints"                = ""
+    "node_labels"                = ""
   }
 }
diff --git a/deploy/aliyun/bastion.tf b/deploy/aliyun/bastion.tf
index 79f72311bf..9846dafc8b 100644
--- a/deploy/aliyun/bastion.tf
+++ b/deploy/aliyun/bastion.tf
@@ -19,7 +19,7 @@ module "bastion-group" {
     alicloud = "alicloud.this"
   }
 
-  vpc_id            = "${module.ack.vpc_id}"
+  vpc_id            = "${join("", module.ack.vpc_id)}"
   cidr_ips          = ["${var.bastion_ingress_cidr}"]
   group_description = "Allow internet SSH connections to bastion node"
   ip_protocols      = ["tcp"]
@@ -30,11 +30,11 @@ resource "alicloud_instance" "bastion" {
   provider = "alicloud.this"
   count    = "${var.create_bastion ? 1 : 0}"
 
-  instance_name   = "${var.cluster_name}-bastion"
+  instance_name   = "${var.cluster_name_prefix}-bastion"
   image_id        = "${var.bastion_image_name}"
   instance_type   = "${data.alicloud_instance_types.bastion.instance_types.0.id}"
   security_groups = ["${module.bastion-group.security_group_id}"]
-  vswitch_id      = "${module.ack.vswitch_ids[0]}"
+  vswitch_id      = "${element(module.ack.vswitch_ids[0], 0)}"
   key_name        = "${alicloud_key_pair.bastion.key_name}"
 
   internet_charge_type      = "PayByTraffic"
   internet_max_bandwidth_in = 10
diff --git a/deploy/aliyun/main.tf b/deploy/aliyun/main.tf
index 6ed2db8a6d..0be69d4f73 100644
--- a/deploy/aliyun/main.tf
+++ b/deploy/aliyun/main.tf
@@ -11,9 +11,9 @@ provider "alicloud" {
 
 locals {
   credential_path = "${path.module}/credentials"
 
-  kubeconfig       = "${local.credential_path}/kubeconfig_${var.cluster_name}"
-  key_file         = "${local.credential_path}/${var.cluster_name}-node-key.pem"
-  bastion_key_file = "${local.credential_path}/${var.cluster_name}-bastion-key.pem"
+  kubeconfig       = "${local.credential_path}/kubeconfig_${var.cluster_name_prefix}"
+  key_file         = "${local.credential_path}/${var.cluster_name_prefix}-node-key.pem"
+  bastion_key_file = "${local.credential_path}/${var.cluster_name_prefix}-bastion-key.pem"
 
   tidb_cluster_values_path      = "${path.module}/rendered/tidb-cluster-values.yaml"
   local_volume_provisioner_path = "${path.module}/rendered/local-volume-provisioner.yaml"
 }
@@ -34,17 +34,17 @@ module "ack" {
   }
 
   # TODO: support non-public apiserver
-  region           = "${var.ALICLOUD_REGION}"
-  cluster_name     = "${var.cluster_name}"
-  public_apiserver = true
-  kubeconfig_file  = "${local.kubeconfig}"
-  key_file         = "${local.key_file}"
-  vpc_cidr         = "${var.vpc_cidr}"
-  k8s_pod_cidr     = "${var.k8s_pod_cidr}"
-  k8s_service_cidr = "${var.k8s_service_cidr}"
-  vpc_cidr_newbits = "${var.vpc_cidr_newbits}"
-  vpc_id           = "${var.vpc_id}"
-  group_id         = "${var.group_id}"
+  region              = "${var.ALICLOUD_REGION}"
+  cluster_name_prefix = "${var.cluster_name_prefix}"
+  public_apiserver    = true
+  kubeconfig_file     = "${local.kubeconfig}"
+  key_file            = "${local.key_file}"
+  vpc_cidr            = "${var.vpc_cidr}"
+  k8s_pod_cidr        = "${var.k8s_pod_cidr}"
+  k8s_service_cidr    = "${var.k8s_service_cidr}"
+  vpc_cidr_newbits    = "${var.vpc_cidr_newbits}"
+  vpc_id              = "${var.vpc_id}"
+  group_id            = "${var.group_id}"
 
   default_worker_cpu_core_count = "${var.default_worker_core_count}"
diff --git a/deploy/aliyun/outputs.tf b/deploy/aliyun/outputs.tf
index 8f16b647e5..aa46ed3baf 100644
--- a/deploy/aliyun/outputs.tf
+++ b/deploy/aliyun/outputs.tf
@@ -7,7 +7,7 @@ output "cluster_id" {
 }
 
 output "cluster_name" {
-  value = "${var.cluster_name}"
+  value = "${var.cluster_name_prefix}"
 }
 
 output "kubeconfig_file" {
diff --git a/deploy/aliyun/variables.tf b/deploy/aliyun/variables.tf
index 0c16d3783d..90ae6d4d11 100644
--- a/deploy/aliyun/variables.tf
+++ b/deploy/aliyun/variables.tf
@@ -1,4 +1,4 @@
-variable "cluster_name" {
+variable "cluster_name_prefix" {
   description = "TiDB cluster name"
   default     = "tidb-cluster"
 }
@@ -88,7 +88,7 @@ variable "monitor_reserve_days" {
 
 variable "default_worker_core_count" {
   description = "CPU core count of default kubernetes workers"
-  default = 2
+  default     = 2
 }
 
 variable "create_bastion" {
@@ -122,7 +122,7 @@ variable "monitor_slb_network_type" {
 
 variable "monitor_enable_anonymous_user" {
   description = "Whether to enable anonymous user access for monitoring"
-  default = false
+  default     = false
 }
 
 variable "vpc_id" {
@@ -152,5 +152,5 @@ variable "k8s_service_cidr" {
 
 variable "vpc_cidr" {
   description = "VPC cidr_block, options: [192.168.0.0/16, 172.16.0.0/16, 10.0.0.0/8]; must not collide with the kubernetes service cidr and pod cidr. Cannot be changed once the vpc is created."
-  default = "192.168.0.0/16"
+  default     = "192.168.0.0/16"
 }
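Since this patch renames `cluster_name` to `cluster_name_prefix`, any existing overrides must be updated to the new variable name. A minimal `terraform.tfvars` sketch; the values below are illustrative, not repository defaults:

```hcl
# Illustrative overrides; variable names come from deploy/aliyun/variables.tf after this patch.
cluster_name_prefix           = "my-tidb"
default_worker_core_count     = 4
monitor_enable_anonymous_user = true
```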