Initial implementation for embedded datastore #500

Merged · 40 commits · Jul 3, 2024

Commits
a0c90e4 checkpoint (neoaggelos, Jun 12, 2024)
4fcede3 adjust embedded dqlite configs (neoaggelos, Jun 13, 2024)
76cb6d0 checkpoint 2 (neoaggelos, Jun 13, 2024)
6b86771 configure k8s-dqlite embedded (neoaggelos, Jun 13, 2024)
85b9d5b unit tests for embedded etcd (neoaggelos, Jun 13, 2024)
b762a98 remove-node not implemented yet (neoaggelos, Jun 13, 2024)
8d8b016 support embedded datastore in internal types (neoaggelos, Jun 13, 2024)
fba7c1b configurable peer and client port for embedded (neoaggelos, Jun 13, 2024)
56b8921 fixup k8s-dqlite embedded args (neoaggelos, Jun 13, 2024)
a4d51f3 fix embedded client paths (neoaggelos, Jun 13, 2024)
ff3e8f2 fix join with embedded etcd (neoaggelos, Jun 13, 2024)
3ba987c fix certificate generation checks (neoaggelos, Jun 13, 2024)
4a22d20 consistent paths for k8s-dqlite storage (neoaggelos, Jun 15, 2024)
ea5f996 update client URLs on embedded datastore (neoaggelos, Jun 15, 2024)
6a98134 add embedded datastore client (neoaggelos, Jun 15, 2024)
3119e98 implement remove-node for embedded datastore (neoaggelos, Jun 15, 2024)
5d8022a wait for apiserver before proceeding (neoaggelos, Jun 15, 2024)
811eef7 debug where remove hook runs (neoaggelos, Jun 15, 2024)
ec31277 attempt to remove-node on embedded datastore (neoaggelos, Jun 15, 2024)
20b1257 always remove node prior to datastore cleanup (neoaggelos, Jun 15, 2024)
8e7f7be add e2e test for embedded datastore (neoaggelos, Jun 16, 2024)
be913db initial docs for embedded datastore (neoaggelos, Jun 16, 2024)
17d517a cleanup unused tests (neoaggelos, Jun 17, 2024)
6df1528 temporary use k8s-dqlite branch with embedded implementation (neoaggelos, Jun 17, 2024)
3015104 include command in error output (neoaggelos, Jun 17, 2024)
0475fac rename embedded to etcd (neoaggelos, Jun 24, 2024)
3c3891c document datastore configs (neoaggelos, Jun 24, 2024)
b98ef2a titles and typos in docs (neoaggelos, Jun 24, 2024)
2fe0777 use separate path for etcd (neoaggelos, Jun 24, 2024)
aa6eb7a revert k8s-dqlite custom branch (neoaggelos, Jun 24, 2024)
c1fa42d make sure to create etcd directory (neoaggelos, Jun 25, 2024)
a1c5c63 documentation link fixes (neoaggelos, Jun 25, 2024)
3d00abf doc fixes (neoaggelos, Jun 25, 2024)
a7ecfe4 link to datastore explanation (neoaggelos, Jun 25, 2024)
5849a86 add link (neoaggelos, Jun 25, 2024)
9c0134e remove broken link (neoaggelos, Jun 25, 2024)
31eb475 adjust handling of etcd remove node result (neoaggelos, Jun 25, 2024)
37670e8 Fix datastore link (evilnick, Jun 25, 2024)
96cd8f6 Add important point (evilnick, Jun 25, 2024)
6599eab Merge branch 'main' into KU-961/embedded (neoaggelos, Jul 3, 2024)
2 changes: 1 addition & 1 deletion build-scripts/components/k8s-dqlite/version
@@ -1 +1 @@
-master
+KU-961/embedded
78 changes: 78 additions & 0 deletions docs/src/snap/explanation/datastore/etcd.md
@@ -0,0 +1,78 @@
# etcd datastore

Canonical Kubernetes supports using a managed etcd cluster as the underlying
datastore of the cluster.

This page explains the behaviour of the managed etcd cluster. See How-To
[Configure Canonical Kubernetes with etcd][how-to-etcd] for steps to deploy
Canonical Kubernetes with a managed etcd datastore.

## Topology

When using the managed etcd datastore, all the control plane nodes of the
cluster will be running an etcd instance. The etcd cluster is configured with
TLS for both client and peer traffic.

The etcd datastore uses port 2379 for client traffic and port 2380 for peer
traffic. These ports can be configured when bootstrapping the cluster.

## TLS

Canonical Kubernetes will generate a separate self-signed CA certificate for
the etcd cluster. If needed, it is possible to specify a custom CA certificate
when bootstrapping the cluster. Any of the following scenarios are supported:

- No certificates are given, Canonical Kubernetes will generate self-signed CA
and server certificates as needed.
- A custom CA certificate and private key are given during bootstrap. Canonical
Kubernetes will then use this to generate server and peer certificates as
needed.
- Only a custom CA certificate is passed. In this scenario, the server and peer
certificates and private keys must also be specified. This is required for the
bootstrap node, as well as any control plane nodes that join the cluster. If
any required certificate is not specified, the bootstrap or join process
will fail.
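
As a minimal sketch of the second scenario (custom CA certificate and key),
the bootstrap configuration would carry the `etcd-ca-crt` and `etcd-ca-key`
keys that are also shown in the how-to guide below (certificate contents are
placeholders):

```yaml
datastore-type: etcd
etcd-ca-crt: |
  -----BEGIN CERTIFICATE-----
  .....
  -----END CERTIFICATE-----
etcd-ca-key: |
  -----BEGIN RSA PRIVATE KEY-----
  .....
  -----END RSA PRIVATE KEY-----
```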

## Clustering

When adding a new control plane node to the cluster, Canonical Kubernetes will
perform the following steps:

1. The etcd CA is used to generate peer and server certificates for the new node.
2. The new node will automatically register itself on the etcd cluster (by
performing the equivalent of `etcdctl member add --peer-url ...`).
3. The new node will start and join the cluster quorum. If necessary, it will
force a new leader election in the etcd cluster (e.g. while transitioning
from 1 to 2 control plane nodes).
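
As a rough sketch, step 2 corresponds to an `etcdctl` invocation along the
following lines (the member name and peer URL are hypothetical; Canonical
Kubernetes performs this step through the etcd API rather than the CLI):

```bash
etcdctl member add cp-2 --peer-urls=https://10.0.0.12:2380
```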

Similarly, when removing a node from the cluster using `k8s remove-node`,
Canonical Kubernetes will make sure that the node is also removed from the etcd
cluster.

Canonical Kubernetes will also keep track of the active members of the etcd
cluster, and will periodically update the list of `--etcd-servers` in the
kube-apiserver arguments. This ensures that if the etcd service on the local
node misbehaves, `kube-apiserver` can still work by reaching the rest of
the etcd cluster members.
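
For example, on a three-node cluster the kube-apiserver arguments would
include a flag similar to the following (the addresses are placeholder
control plane node IPs):

```
--etcd-servers=https://10.0.0.11:2379,https://10.0.0.12:2379,https://10.0.0.13:2379
```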

## Quorum

When using the managed etcd datastore, all nodes participate equally in the
raft quorum. This means an odd number of **2k + 1** nodes is needed to maintain
a fault tolerance of **k** nodes (such that the remaining **k + 1** nodes
maintain an active quorum). For example, a 3-node cluster tolerates the failure
of 1 node, while a 5-node cluster tolerates the failure of 2.

## etcd configuration and data directory

The etcd configuration and data directories to be aware of are:

- `/var/snap/k8s/common/var/lib/k8s-dqlite/etcd.yaml`: YAML file with etcd
cluster configuration. This contains information for the initial cluster
members, TLS certificate paths and member peer and client URLs.
- `/var/snap/k8s/common/var/lib/k8s-dqlite/data`: etcd data directory.
- `/etc/kubernetes/pki/etcd`: contains certificates for the etcd cluster
(etcd CA certificate, server certificate and key, peer certificate and key).
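
A quick way to inspect these locations on a node is shown below (assuming
root access on a default installation):

```bash
# show the generated etcd cluster configuration
sudo cat /var/snap/k8s/common/var/lib/k8s-dqlite/etcd.yaml

# list the etcd TLS certificates and keys
sudo ls -l /etc/kubernetes/pki/etcd
```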

<!-- LINKS -->

[how-to-etcd]: /snap/howto/datastore/etcd
49 changes: 49 additions & 0 deletions docs/src/snap/explanation/datastore/external.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,49 @@
# external datastore

Canonical Kubernetes supports using an external etcd cluster as the underlying
datastore of the cluster.

This page explains the behaviour of Canonical Kubernetes when using an external
etcd cluster. See How-To
[Configure Canonical Kubernetes with an external datastore][how-to-external] for
steps to deploy Canonical Kubernetes with an external etcd datastore.

## Topology

When using an external etcd datastore, the control plane nodes of the cluster
will only run the Kubernetes services. The cluster administrator is responsible
for deploying, managing and operating the external etcd datastore.

The control plane nodes are expected to be able to reach the external etcd
cluster over the network.

## TLS

For production deployments, it is highly recommended that the etcd cluster uses
TLS for both client and peer traffic. It is the responsibility of the cluster
administrator to deploy the external etcd cluster accordingly.

## Clustering

When using an external etcd datastore, the cluster administrator provides the
known etcd server URLs, as well as any required client certificates when
bootstrapping the cluster.
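
As a minimal sketch, the bootstrap configuration for an external datastore
might look like the following (the exact key names are an assumption here;
refer to the external datastore how-to for the authoritative list):

```yaml
datastore-type: external
datastore-servers:
  - https://10.0.0.11:2379
  - https://10.0.0.12:2379
datastore-ca-crt: |
  -----BEGIN CERTIFICATE-----
  .....
  -----END CERTIFICATE-----
```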

When adding a new control plane node to the cluster, Canonical Kubernetes will
configure it to use the same list of etcd servers and client certificates.

Removing a cluster node using `k8s remove-node` will not have any side-effect
on the external datastore.

## Configuration and data directories

- `/etc/kubernetes/pki/etcd/ca.crt`: The CA certificate of the etcd cluster.
This file is created by Canonical Kubernetes and contains the CA certificate
specified when bootstrapping the cluster.
- `/etc/kubernetes/pki/apiserver-etcd-client.{crt,key}`: This is the client
certificate and key used by `kube-apiserver` to authenticate with the etcd
cluster.
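
Standard `openssl` tooling can be used to inspect these credentials; for
example, to print the subject and expiry of the client certificate:

```bash
sudo openssl x509 -in /etc/kubernetes/pki/apiserver-etcd-client.crt \
  -noout -subject -enddate
```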

<!-- LINKS -->

[how-to-external]: /snap/howto/datastore/external
52 changes: 52 additions & 0 deletions docs/src/snap/explanation/datastore/index.md
@@ -0,0 +1,52 @@
# Datastore

```{toctree}
:hidden:
Datastore <self>
```

One of the core components of a Kubernetes cluster is the datastore. The
datastore is where all of the cluster state is persisted. The `kube-apiserver`
communicates with the datastore using an [etcd API].

Canonical Kubernetes supports three different datastore types:

1. `k8s-dqlite` (**default**, managed): Control plane nodes form a dqlite
cluster and expose an etcd endpoint over a local unix socket. The dqlite
cluster is automatically updated when adding or removing cluster members.

For more details, see [k8s-dqlite].

2. `etcd` (managed): Control plane nodes form an etcd cluster. The etcd cluster
is automatically updated when adding or removing cluster members.

For more details, see [etcd].

3. `external`: Do not deploy or manage the datastore. The user is expected to
provision and manage an external etcd datastore, and provide the connection
credentials (URLs and client certificates) when bootstrapping the cluster.

For more details, see [external].
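
For reference, the datastore is selected with the `datastore-type` key of the
bootstrap configuration (shown here for the managed etcd option; the how-to
guides cover the remaining keys for each datastore type):

```yaml
datastore-type: etcd
```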

```{warning}
The selection of the backing datastore can only be done during the bootstrap
process. It is not possible to change the datastore type of a running cluster.

Instead, a new cluster should be deployed and workloads should be migrated to it
using a blue-green deployment method.
```

```{toctree}
:titlesonly:

k8s-dqlite
etcd
external
```

<!-- LINKS -->

[etcd API]: https://etcd.io/docs/v3.5/learning/api/
[k8s-dqlite]: k8s-dqlite
[etcd]: etcd
[external]: external
43 changes: 43 additions & 0 deletions docs/src/snap/explanation/datastore/k8s-dqlite.md
@@ -0,0 +1,43 @@
# k8s-dqlite datastore

Canonical Kubernetes supports using a managed dqlite cluster as the underlying
datastore of the cluster. This is the default option when no configuration is
specified.

This page explains the behaviour of the managed dqlite cluster. See How-To
[Configure Canonical Kubernetes with dqlite][how-to-dqlite] for steps to
deploy Canonical Kubernetes with a managed dqlite datastore.

## Topology

When using the managed dqlite datastore, all the control plane nodes of the
cluster will be running `k8s-dqlite`. Internal cluster communication happens
over TLS between the members. Each cluster member exposes a local unix socket
for `kube-apiserver` to access the datastore.

The dqlite cluster uses port 9000 on each node for cluster communication. This
port can be configured when bootstrapping the cluster.
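
To confirm that the dqlite cluster port is listening on a node, something
like the following works (assuming the default port and that `ss` is
available):

```bash
sudo ss -tlnp | grep 9000
```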

## Clustering

When adding a new control plane node to the cluster, Canonical Kubernetes will
add the node to the dqlite cluster.

Similarly, when removing a node from the cluster using `k8s remove-node`,
Canonical Kubernetes will make sure that the node is also removed from the
k8s-dqlite cluster.

Since `kube-apiserver` instances access the datastore over a local unix socket,
no reconfiguration is needed on that front.

## Configuration and data directory

The k8s-dqlite configuration and data paths to be aware of are:

- `/var/snap/k8s/common/args/k8s-dqlite`: Command line arguments for the
`k8s-dqlite` service.
- `/var/snap/k8s/common/var/lib/k8s-dqlite`: Data directory.
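
For example, to see the arguments the `k8s-dqlite` service was started with:

```bash
sudo cat /var/snap/k8s/common/args/k8s-dqlite
```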

<!-- LINKS -->

[how-to-dqlite]: /snap/howto/datastore/k8s-dqlite
1 change: 1 addition & 0 deletions docs/src/snap/explanation/index.md
@@ -14,6 +14,7 @@ Overview <self>
about
channels
clustering
datastore/index
ingress
/snap/explanation/security
```
127 changes: 127 additions & 0 deletions docs/src/snap/howto/datastore/etcd.md
@@ -0,0 +1,127 @@
# How to use the embedded etcd datastore

This guide walks you through bootstrapping a Canonical Kubernetes cluster
using the embedded etcd datastore.

## What you'll need

This guide assumes the following:

- You have root or sudo access to the machine
- You have installed the Canonical Kubernetes snap
(see How-to [Install Canonical Kubernetes from a snap][snap-install-howto]).
- You have not bootstrapped the Canonical Kubernetes cluster yet

## Adjust the bootstrap configuration

To use the embedded etcd datastore, a configuration file that contains the
required datastore parameters needs to be provided to the bootstrap command.
Create a configuration file and insert the contents below, replacing the
placeholder values to match your deployment.

```yaml
# must be set to "etcd"
datastore-type: etcd

# port number that will be used for client traffic (default is 2379)
etcd-port: 2379

# port number that will be used for peer traffic (default is 2380)
etcd-peer-port: 2380

# (optional) custom CA certificate and private key to use to generate TLS
# certificates for the etcd cluster, in PEM format. If not specified, a
# self-signed CA will be used instead.
etcd-ca-crt: |
  -----BEGIN CERTIFICATE-----
  .....
  -----END CERTIFICATE-----

etcd-ca-key: |
  -----BEGIN RSA PRIVATE KEY-----
  .....
  -----END RSA PRIVATE KEY-----
```

```{note}
The embedded etcd cluster will always be configured with TLS.
```

## Bootstrap the cluster

The next step is to bootstrap the cluster with our configuration file:

```bash
sudo k8s bootstrap --file /path/to/config.yaml
```

```{note}
The datastore can only be configured through the `--file` option,
and is not available in interactive mode.
```

## Confirm the cluster is ready

It is recommended to ensure that the cluster initialises properly and is
running without issues. Run the command:

```bash
sudo k8s status --wait-ready
```

This command will wait until the cluster is ready and then display
the current status. The command will time-out if the cluster does not reach a
ready state.

## Operations

This section documents common operations for interacting with the managed
etcd datastore.

### How to use etcdctl

You can interact with the embedded etcd cluster using the standard `etcdctl` CLI
tool. `etcdctl` is not included in Canonical Kubernetes and needs to be
installed separately if needed. To point `etcdctl` to the embedded cluster, you
need to set the following arguments:

```bash
sudo ETCDCTL_API=3 etcdctl \
  --endpoints https://${nodeip}:2379 \
  --cacert /etc/kubernetes/pki/etcd/ca.crt \
  --cert /etc/kubernetes/pki/apiserver-etcd-client.crt \
  --key /etc/kubernetes/pki/apiserver-etcd-client.key \
  member list
```
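
Other `etcdctl` commands follow the same pattern; for example, checking the
health of the local etcd endpoint:

```bash
sudo ETCDCTL_API=3 etcdctl \
  --endpoints https://${nodeip}:2379 \
  --cacert /etc/kubernetes/pki/etcd/ca.crt \
  --cert /etc/kubernetes/pki/apiserver-etcd-client.crt \
  --key /etc/kubernetes/pki/apiserver-etcd-client.key \
  endpoint health
```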

### Using k8s-dqlite dbctl

There is a `k8s-dqlite dbctl` subcommand that can be used from control
plane nodes to directly interact with the datastore if required. This tool is
intended as a lightweight alternative to common `etcdctl` commands:

```bash
sudo /snap/k8s/current/bin/k8s-dqlite dbctl --help
```

Some examples are shown below:

#### List cluster members

```bash
sudo /snap/k8s/current/bin/k8s-dqlite dbctl member list
```

#### Create a database snapshot

```bash
sudo /snap/k8s/current/bin/k8s-dqlite dbctl snapshot save ./file.db
```

The created `file.db` contains a point-in-time backup snapshot of the etcd
cluster, and can be used to restore the cluster if needed.
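
If `etcdctl` is installed, the snapshot can also be inspected before it is
stored (assuming etcdctl v3.5, where the `snapshot status` subcommand is
still available):

```bash
ETCDCTL_API=3 etcdctl snapshot status ./file.db --write-out=table
```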

<!-- LINKS -->

[snap-install-howto]: ./install/snap