Restructure bare-metal module to use a worker submodule
* Add an internal `worker` module to the bare-metal module, to
allow individual bare-metal machines to be defined and joined
to an existing bare-metal cluster. This is similar to the "worker
pools" modules for adding sets of nodes to cloud (AWS, GCP, Azure)
clusters, but on metal, each piece of hardware is potentially
unique.

New: Using the new `worker` module, a Kubernetes cluster can be defined
without any `workers` (i.e. just a control-plane). Use the `worker`
module to define each machine that should join the bare-metal
cluster and customize it in detail. This style is quite flexible and
suited for clusters with hardware that varies quite a bit.

```tf
module "mercury" {
  source = "git::https://github.com/poseidon/typhoon//bare-metal/flatcar-linux/kubernetes?ref=v1.26.2"

  # bare-metal
  cluster_name            = "mercury"
  matchbox_http_endpoint  = "http://matchbox.example.com"
  os_channel              = "flatcar-stable"
  os_version              = "2345.3.1"

  # configuration
  k8s_domain_name    = "node1.example.com"
  ssh_authorized_key = "ssh-rsa AAAAB3Nz..."

  # machines
  controllers = [{
    name   = "node1"
    mac    = "52:54:00:a1:9c:ae"
    domain = "node1.example.com"
  }]
}
```

```tf
module "mercury-node2" {
  source = "git::https://github.com/poseidon/typhoon//bare-metal/flatcar-linux/kubernetes/worker?ref=v1.26.2"

  cluster_name = "mercury"

  # bare-metal
  matchbox_http_endpoint  = "http://matchbox.example.com"
  os_channel              = "flatcar-stable"
  os_version              = "2345.3.1"

  # configuration
  name               = "node2"
  mac                = "52:54:00:b2:2f:86"
  domain             = "node2.example.com"
  kubeconfig         = module.mercury.kubeconfig
  ssh_authorized_key = "ssh-rsa AAAAB3Nz..."

  # optional
  snippets       = []
  node_labels    = []
  node_taints    = []
  install_disk   = "/dev/vda"
  cached_install = false
}
```
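
A worker defined this way can carry its own labels, taints, and install disk. The block below is a sketch only: the module label `mercury-node3`, the `role=storage` label and taint, and the `/dev/sdb` disk are hypothetical. The `key=value` and `key=value:effect` string formats are assumed to follow the Kubelet's `--node-labels` and `--register-with-taints` flags.

```tf
# Sketch: one worker with per-machine customization (values are illustrative).
module "mercury-node3" {
  source = "git::https://github.com/poseidon/typhoon//bare-metal/flatcar-linux/kubernetes/worker?ref=v1.26.2"

  cluster_name = "mercury"

  # bare-metal
  matchbox_http_endpoint = "http://matchbox.example.com"
  os_channel             = "flatcar-stable"
  os_version             = "2345.3.1"

  # configuration
  name               = "node3"
  mac                = "52:54:00:c3:61:77"
  domain             = "node3.example.com"
  kubeconfig         = module.mercury.kubeconfig
  ssh_authorized_key = "ssh-rsa AAAAB3Nz..."

  # optional: hypothetical label/taint reserving this machine for storage workloads
  node_labels  = ["role=storage"]
  node_taints  = ["role=storage:NoSchedule"]
  install_disk = "/dev/sdb"
}
```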

For clusters with fairly similar hardware, you may continue to
define `workers` directly within the cluster definition. This
reduces some repetition, but is not quite as flexible.

```tf
module "mercury" {
  source = "git::https://github.com/poseidon/typhoon//bare-metal/flatcar-linux/kubernetes?ref=v1.26.1"

  # bare-metal
  cluster_name            = "mercury"
  matchbox_http_endpoint  = "http://matchbox.example.com"
  os_channel              = "flatcar-stable"
  os_version              = "2345.3.1"

  # configuration
  k8s_domain_name    = "node1.example.com"
  ssh_authorized_key = "ssh-rsa AAAAB3Nz..."

  # machines
  controllers = [{
    name   = "node1"
    mac    = "52:54:00:a1:9c:ae"
    domain = "node1.example.com"
  }]
  workers = [
    {
      name   = "node2"
      mac    = "52:54:00:b2:2f:86"
      domain = "node2.example.com"
    },
    {
      name   = "node3"
      mac    = "52:54:00:c3:61:77"
      domain = "node3.example.com"
    }
  ]
}
```

Optional variables `snippets`, `worker_node_labels`, and
`worker_node_taints` are still defined as a map from machine name
to a list of snippets, labels, or taints respectively to allow some
degree of per-machine customization. However, fields like
`install_disk`, `kernel_args`, `cached_install` and future options
will not be designed this way. Instead, if your machines vary, it
is recommended to use the new `worker` module to define each node.
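
For reference, the map-based style might look like the following sketch. Map keys are machine names from `workers`; the label, taint, and snippet path are hypothetical. Machines missing from a map are assumed to get no extra labels, taints, or snippets, as the Fedora CoreOS profiles do with `lookup(..., [])`.

```tf
module "mercury" {
  source = "git::https://github.com/poseidon/typhoon//bare-metal/flatcar-linux/kubernetes?ref=v1.26.1"

  # bare-metal
  cluster_name           = "mercury"
  matchbox_http_endpoint = "http://matchbox.example.com"
  os_channel             = "flatcar-stable"
  os_version             = "2345.3.1"

  # configuration
  k8s_domain_name    = "node1.example.com"
  ssh_authorized_key = "ssh-rsa AAAAB3Nz..."

  # machines
  controllers = [{
    name   = "node1"
    mac    = "52:54:00:a1:9c:ae"
    domain = "node1.example.com"
  }]
  workers = [
    {
      name   = "node2"
      mac    = "52:54:00:b2:2f:86"
      domain = "node2.example.com"
    },
    {
      name   = "node3"
      mac    = "52:54:00:c3:61:77"
      domain = "node3.example.com"
    }
  ]

  # per-machine customization, keyed by machine name (values are hypothetical)
  worker_node_labels = {
    "node2" = ["role=special"]
  }
  worker_node_taints = {
    "node2" = ["role=special:NoSchedule"]
  }
  snippets = {
    "node3" = [file("./snippets/hello.yaml")]
  }
}
```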
dghubble committed Feb 9, 2023
1 parent d04d880 commit ddfd9f2
Showing 21 changed files with 637 additions and 225 deletions.
4 changes: 4 additions & 0 deletions CHANGES.md
@@ -4,6 +4,10 @@ Notable changes between versions.

## Latest

### Bare-Metal

* Add a `worker` module to allow customizing individual worker nodes ([#1295](https://github.com/poseidon/typhoon/pull/1295))

## v1.26.1

* Kubernetes [v1.26.1](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.26.md#v1261)
22 changes: 0 additions & 22 deletions bare-metal/fedora-coreos/kubernetes/groups.tf

This file was deleted.

7 changes: 7 additions & 0 deletions bare-metal/fedora-coreos/kubernetes/outputs.tf
@@ -3,6 +3,13 @@ output "kubeconfig-admin" {
sensitive = true
}

# Outputs for workers

output "kubeconfig" {
value = module.bootstrap.kubeconfig-kubelet
sensitive = true
}

# Outputs for debug

output "assets_dist" {
37 changes: 10 additions & 27 deletions bare-metal/fedora-coreos/kubernetes/profiles.tf
@@ -28,6 +28,16 @@ locals {
args = var.cached_install ? local.cached_args : local.remote_args
}

# Match a controller to a profile by MAC
resource "matchbox_group" "controller" {
count = length(var.controllers)
name = format("%s-%s", var.cluster_name, var.controllers.*.name[count.index])
profile = matchbox_profile.controllers.*.name[count.index]

selector = {
mac = var.controllers.*.mac[count.index]
}
}

// Fedora CoreOS controller profile
resource "matchbox_profile" "controllers" {
@@ -55,30 +65,3 @@ data "ct_config" "controllers" {
strict = true
snippets = lookup(var.snippets, var.controllers.*.name[count.index], [])
}

// Fedora CoreOS worker profile
resource "matchbox_profile" "workers" {
count = length(var.workers)
name = format("%s-worker-%s", var.cluster_name, var.workers.*.name[count.index])

kernel = local.kernel
initrd = local.initrd
args = concat(local.args, var.kernel_args)

raw_ignition = data.ct_config.workers.*.rendered[count.index]
}

# Fedora CoreOS workers
data "ct_config" "workers" {
count = length(var.workers)
content = templatefile("${path.module}/butane/worker.yaml", {
domain_name = var.workers.*.domain[count.index]
cluster_dns_service_ip = module.bootstrap.cluster_dns_service_ip
cluster_domain_suffix = var.cluster_domain_suffix
ssh_authorized_key = var.ssh_authorized_key
node_labels = join(",", lookup(var.worker_node_labels, var.workers.*.name[count.index], []))
node_taints = join(",", lookup(var.worker_node_taints, var.workers.*.name[count.index], []))
})
strict = true
snippets = lookup(var.snippets, var.workers.*.name[count.index], [])
}
33 changes: 0 additions & 33 deletions bare-metal/fedora-coreos/kubernetes/ssh.tf
@@ -15,7 +15,6 @@ resource "null_resource" "copy-controller-secrets" {
# matchbox groups are written, causing a deadlock.
depends_on = [
matchbox_group.controller,
matchbox_group.worker,
module.bootstrap,
]

@@ -45,45 +44,13 @@ resource "null_resource" "copy-controller-secrets" {
}
}

# Secure copy kubeconfig to all workers. Activates kubelet.service
resource "null_resource" "copy-worker-secrets" {
count = length(var.workers)

# Without depends_on, remote-exec could start and wait for machines before
# matchbox groups are written, causing a deadlock.
depends_on = [
matchbox_group.controller,
matchbox_group.worker,
]

connection {
type = "ssh"
host = var.workers.*.domain[count.index]
user = "core"
timeout = "60m"
}

provisioner "file" {
content = module.bootstrap.kubeconfig-kubelet
destination = "/home/core/kubeconfig"
}

provisioner "remote-exec" {
inline = [
"sudo mv /home/core/kubeconfig /etc/kubernetes/kubeconfig",
"sudo touch /etc/kubernetes",
]
}
}

# Connect to a controller to perform one-time cluster bootstrap.
resource "null_resource" "bootstrap" {
# Without depends_on, this remote-exec may start before the kubeconfig copy.
# Terraform only does one task at a time, so it would try to bootstrap
# while no Kubelets are running.
depends_on = [
null_resource.copy-controller-secrets,
null_resource.copy-worker-secrets,
]

connection {
@@ -59,12 +59,6 @@ systemd:
--container-runtime-endpoint=unix:///run/containerd/containerd.sock \
--hostname-override=${domain_name} \
--kubeconfig=/var/lib/kubelet/kubeconfig \
%{~ for label in compact(split(",", node_labels)) ~}
--node-labels=${label} \
%{~ endfor ~}
%{~ for taint in compact(split(",", node_taints)) ~}
--register-with-taints=${taint} \
%{~ endfor ~}
--node-labels=node.kubernetes.io/node
ExecStop=-/usr/bin/podman stop kubelet
Delegate=yes
63 changes: 63 additions & 0 deletions bare-metal/fedora-coreos/kubernetes/worker/matchbox.tf
Original file line number Diff line number Diff line change
@@ -0,0 +1,63 @@
locals {
  remote_kernel = "https://builds.coreos.fedoraproject.org/prod/streams/${var.os_stream}/builds/${var.os_version}/x86_64/fedora-coreos-${var.os_version}-live-kernel-x86_64"
  remote_initrd = [
    "--name main https://builds.coreos.fedoraproject.org/prod/streams/${var.os_stream}/builds/${var.os_version}/x86_64/fedora-coreos-${var.os_version}-live-initramfs.x86_64.img",
  ]

  remote_args = [
    "initrd=main",
    "coreos.live.rootfs_url=https://builds.coreos.fedoraproject.org/prod/streams/${var.os_stream}/builds/${var.os_version}/x86_64/fedora-coreos-${var.os_version}-live-rootfs.x86_64.img",
    "coreos.inst.install_dev=${var.install_disk}",
    "coreos.inst.ignition_url=${var.matchbox_http_endpoint}/ignition?uuid=$${uuid}&mac=$${mac:hexhyp}",
  ]

  cached_kernel = "/assets/fedora-coreos/fedora-coreos-${var.os_version}-live-kernel-x86_64"
  cached_initrd = [
    "/assets/fedora-coreos/fedora-coreos-${var.os_version}-live-initramfs.x86_64.img",
  ]

  cached_args = [
    "initrd=main",
    "coreos.live.rootfs_url=${var.matchbox_http_endpoint}/assets/fedora-coreos/fedora-coreos-${var.os_version}-live-rootfs.x86_64.img",
    "coreos.inst.install_dev=${var.install_disk}",
    "coreos.inst.ignition_url=${var.matchbox_http_endpoint}/ignition?uuid=$${uuid}&mac=$${mac:hexhyp}",
  ]

  kernel = var.cached_install ? local.cached_kernel : local.remote_kernel
  initrd = var.cached_install ? local.cached_initrd : local.remote_initrd
  args   = var.cached_install ? local.cached_args : local.remote_args
}

// Match a worker to a profile by MAC
resource "matchbox_group" "worker" {
  name    = format("%s-%s", var.cluster_name, var.name)
  profile = matchbox_profile.worker.name
  selector = {
    mac = var.mac
  }
}

// Fedora CoreOS worker profile
resource "matchbox_profile" "worker" {
  name   = format("%s-worker-%s", var.cluster_name, var.name)
  kernel = local.kernel
  initrd = local.initrd
  args   = concat(local.args, var.kernel_args)

  raw_ignition = data.ct_config.worker.rendered
}

# Fedora CoreOS workers
data "ct_config" "worker" {
  content = templatefile("${path.module}/butane/worker.yaml", {
    domain_name            = var.domain
    ssh_authorized_key     = var.ssh_authorized_key
    cluster_dns_service_ip = cidrhost(var.service_cidr, 10)
    cluster_domain_suffix  = var.cluster_domain_suffix
    node_labels            = join(",", var.node_labels)
    node_taints            = join(",", var.node_taints)
  })
  strict   = true
  snippets = var.snippets
}
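
Because each instance of this module creates exactly one `matchbox_group` and one `matchbox_profile`, named from `cluster_name` and `name`, several unique machines could be declared from one inventory map with Terraform's module `for_each` (available since Terraform 0.13). This is only a sketch: the `machines` local, its field names, the module label `mercury-workers`, the disk names, and the `ref` are assumptions, the `os_version` is the example value from the variable description, and `module.mercury` is taken to be a cluster defined with the parent module.

```tf
locals {
  # Hypothetical inventory: machine name => hardware details
  machines = {
    node2 = {
      mac          = "52:54:00:b2:2f:86"
      domain       = "node2.example.com"
      install_disk = "sda"
    }
    node3 = {
      mac          = "52:54:00:c3:61:77"
      domain       = "node3.example.com"
      install_disk = "nvme0n1"
    }
  }
}

# One worker module instance per entry in local.machines
module "mercury-workers" {
  for_each = local.machines
  source   = "git::https://github.com/poseidon/typhoon//bare-metal/fedora-coreos/kubernetes/worker?ref=v1.26.2"

  cluster_name = "mercury"

  # bare-metal
  matchbox_http_endpoint = "http://matchbox.example.com"
  os_stream              = "stable"
  os_version             = "31.20200310.3.0"

  # machine
  name         = each.key
  mac          = each.value.mac
  domain       = each.value.domain
  install_disk = each.value.install_disk

  # configuration
  kubeconfig         = module.mercury.kubeconfig
  ssh_authorized_key = "ssh-rsa AAAAB3Nz..."
}
```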

27 changes: 27 additions & 0 deletions bare-metal/fedora-coreos/kubernetes/worker/ssh.tf
Original file line number Diff line number Diff line change
@@ -0,0 +1,27 @@
# Secure copy kubeconfig to worker. Activates kubelet.service
resource "null_resource" "copy-worker-secrets" {
  # Without depends_on, remote-exec could start and wait for machines before
  # matchbox groups are written, causing a deadlock.
  depends_on = [
    matchbox_group.worker,
  ]

  connection {
    type    = "ssh"
    host    = var.domain
    user    = "core"
    timeout = "60m"
  }

  provisioner "file" {
    content     = var.kubeconfig
    destination = "/home/core/kubeconfig"
  }

  provisioner "remote-exec" {
    inline = [
      "sudo mv /home/core/kubeconfig /etc/kubernetes/kubeconfig",
      "sudo touch /etc/kubernetes",
    ]
  }
}
111 changes: 111 additions & 0 deletions bare-metal/fedora-coreos/kubernetes/worker/variables.tf
@@ -0,0 +1,111 @@
variable "cluster_name" {
  type        = string
  description = "Must be set to the `cluster_name` of cluster"
}

# bare-metal

variable "matchbox_http_endpoint" {
  type        = string
  description = "Matchbox HTTP read-only endpoint (e.g. http://matchbox.example.com:8080)"
}

variable "os_stream" {
  type        = string
  description = "Fedora CoreOS release stream (e.g. stable, testing, next)"
  default     = "stable"

  validation {
    condition     = contains(["stable", "testing", "next"], var.os_stream)
    error_message = "The os_stream must be stable, testing, or next."
  }
}

variable "os_version" {
  type        = string
  description = "Fedora CoreOS version to PXE and install (e.g. 31.20200310.3.0)"
}

# machine

variable "name" {
  type        = string
  description = "Unique name for the machine (e.g. node1)"
}

variable "mac" {
  type        = string
  description = "MAC address (e.g. 52:54:00:a1:9c:ae)"
}

variable "domain" {
  type        = string
  description = "Fully qualified domain name (e.g. node1.example.com)"
}

# configuration

variable "kubeconfig" {
  type        = string
  description = "Must be set to `kubeconfig` output by cluster"
}

variable "ssh_authorized_key" {
  type        = string
  description = "SSH public key for user 'core'"
}

variable "snippets" {
  type        = list(string)
  description = "List of Butane snippets"
  default     = []
}

variable "node_labels" {
  type        = list(string)
  description = "List of initial node labels"
  default     = []
}

variable "node_taints" {
  type        = list(string)
  description = "List of initial node taints"
  default     = []
}

# optional

variable "cached_install" {
  type        = bool
  description = "Whether Fedora CoreOS should PXE boot and install from matchbox /assets cache. Note that the admin must have downloaded the os_version into matchbox assets."
  default     = false
}

variable "install_disk" {
  type        = string
  description = "Disk device to install Fedora CoreOS (e.g. sda)"
  default     = "sda"
}

variable "kernel_args" {
  type        = list(string)
  description = "Additional kernel arguments to provide at PXE boot."
  default     = []
}

# unofficial, undocumented, unsupported

variable "service_cidr" {
  type        = string
  description = <<EOD
CIDR IPv4 range to assign Kubernetes services.
The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for coredns.
EOD
  default     = "10.3.0.0/16"
}

variable "cluster_domain_suffix" {
  description = "Queries for domains with the suffix will be answered by coredns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
  type        = string
  default     = "cluster.local"
}
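
As an illustration of the optional variables above, a worker that installs from the Matchbox `/assets` cache and adds a kernel argument might be declared as follows. Values are placeholders: the `os_version` is the example from the variable description, the serial console argument and disk name are hypothetical, the `ref` is assumed, and `module.mercury` is taken to be the cluster module.

```tf
module "mercury-node2" {
  source = "git::https://github.com/poseidon/typhoon//bare-metal/fedora-coreos/kubernetes/worker?ref=v1.26.2"

  cluster_name = "mercury"

  # bare-metal
  matchbox_http_endpoint = "http://matchbox.example.com"
  os_stream              = "stable"
  os_version             = "31.20200310.3.0"

  # machine
  name   = "node2"
  mac    = "52:54:00:b2:2f:86"
  domain = "node2.example.com"

  # configuration
  kubeconfig         = module.mercury.kubeconfig
  ssh_authorized_key = "ssh-rsa AAAAB3Nz..."

  # optional
  cached_install = true                        # os_version assets must already be in matchbox
  install_disk   = "nvme0n1"                   # hypothetical disk name
  kernel_args    = ["console=ttyS0,115200n8"]  # hypothetical serial console argument
}
```

Leaving `cached_install` at its default of `false` selects the remote kernel, initrd, and rootfs URLs instead.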
17 changes: 17 additions & 0 deletions bare-metal/fedora-coreos/kubernetes/worker/versions.tf
@@ -0,0 +1,17 @@
# Terraform version and plugin versions

terraform {
  required_version = ">= 0.13.0, < 2.0.0"
  required_providers {
    null = ">= 2.1"
    ct = {
      source  = "poseidon/ct"
      version = "~> 0.9"
    }
    matchbox = {
      source  = "poseidon/matchbox"
      version = "~> 0.5.0"
    }
  }
}
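
The worker module declares provider requirements but does not configure the providers, so the root configuration is assumed to do so. A minimal sketch, reusing the version constraints above: the endpoint and certificate paths are placeholders, and the `endpoint`, `client_cert`, `client_key`, and `ca` arguments should be checked against the `poseidon/matchbox` provider documentation.

```tf
terraform {
  required_providers {
    ct = {
      source  = "poseidon/ct"
      version = "~> 0.9"
    }
    matchbox = {
      source  = "poseidon/matchbox"
      version = "~> 0.5.0"
    }
  }
}

# Placeholder endpoint and TLS client credentials for the Matchbox gRPC API
provider "matchbox" {
  endpoint    = "matchbox.example.com:8081"
  client_cert = file(pathexpand("~/.config/matchbox/client.crt"))
  client_key  = file(pathexpand("~/.config/matchbox/client.key"))
  ca          = file(pathexpand("~/.config/matchbox/ca.crt"))
}

provider "ct" {}
```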
