gsimon75-cloud/gke_lb_demo

A demo deployment of a load-balanced web example project in Google Kubernetes Engine

Overview

This small project demonstrates how to set up a load-balanced environment that runs a web application together with its database backend, all in Google Kubernetes Engine, step by step right from the start.

It looks like a simple case, but believe me, there were quite a few unexpected obstacles along the way...

Prerequisites

You need an account on the Google Cloud Platform (obviously), and you should create a project that will enclose all the resources and entities we'll create (and keep them separate from your other things).

A GCP Service Account

Then create a Service Account within this project (Menu / IAM & Admin / Service Accounts), and generate a private key for it that our tooling will use later: click the three-dot icon to the right of the service account, choose 'Create key' and save the file, e.g. as service_account.json.

Then you need to grant this account certain roles in your project:

  • Go to IAM & Admin / IAM
  • Choose the service account, click its 'Edit' on the right
  • 'Add another role', choose 'Kubernetes Engine' / 'Kubernetes Engine Admin'
  • 'Add another role', choose 'Service Accounts' / 'Service Account User'
  • 'Add another role', choose 'Storage' / 'Storage Admin'
  • Save

Then copy that service_account.json here and tell the gcloud CLI to use it: gcloud auth activate-service-account --key-file=service_account.json

Then you can check the result with gcloud info, or actually test that it works: gcloud container clusters list

If you get error messages, then something is still wrong, but an empty list is completely normal if you haven't created any clusters yet. (We'll change that soon :) ...)
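Put together, and assuming you also want to set a default project (the project ID below is the one used in the later examples, substitute your own), the whole check looks like this:

# log in with the service account key and point gcloud at our project
gcloud auth activate-service-account --key-file=service_account.json
gcloud config set project networksandbox-232012

# sanity checks: an empty cluster list is fine, an error message is not
gcloud info
gcloud container clusters list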

Docker

We'll need to manipulate Docker images, so Docker must be installed, enabled and started.

The docker package in the CentOS repo is way too old (as of now: 1.13.1), so install the latest stable (as of now: 19.03.1) from the Docker repo instead.
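If you haven't set that up yet, the Docker-repo install on CentOS 7 goes roughly like this (the standard steps from the Docker documentation):

# add the Docker CE repo and install the current stable packages
sudo yum install -y yum-utils
sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
sudo yum install -y docker-ce docker-ce-cli containerd.io

# enable and start the daemon
sudo systemctl enable --now docker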

OS packages

docker, kubectl

python2-google-auth, python2-libcloud, python2-crypto, python2-openshift, python-netaddr, python-docker-py

Creating the cluster

A cluster consists of a bunch of hosts that run all the Kubernetes stuff, so first we'll need some instances to do this.

Fortunately we don't need to start at the instance level, because the GKE infrastructure will do all of this for us:

  • Provisioning the instances
  • Choosing an already Kubernetes-aware OS image for them
  • Configuring the cluster store, etc.
  • Registering the new instances as parts of the cluster

All we need is to define the parameters of the instances and of the cluster:

For the instances:

  • Machine type
  • Disk size

For the cluster:

  • Initial number of nodes
  • Minimal / maximal number of nodes (if we want autoscaling)
  • Whether we want auto-repair and auto-upgrade functionality

It's that simple! Well, almost...
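For reference, if you were doing this by hand with gcloud (rather than the Ansible playbooks we'll use below), those parameters would map to flags roughly like this; the autoscaling limits are just illustrative:

# cluster name and zone as used elsewhere in this README
gcloud container clusters create lb-demo-cluster \
    --zone=europe-west1-c \
    --machine-type=g1-small \
    --disk-size=10 \
    --num-nodes=1 \
    --enable-autoscaling --min-nodes=1 --max-nodes=4 \
    --enable-autorepair \
    --no-enable-autoupgrade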

Some details

In addition to the Cluster there is another entity: the Node Pool. As its name suggests, it contains (and manages) a bunch of nodes, so actually all those 'for the cluster' parameters belong to a Node Pool, and such Node Pools (note the plural) belong to a Cluster. In fact, the Cluster adds very little extra to those Pools...

And there is a small limitation here that causes some inconvenience for us.

Creating a Cluster involves storing a lot of information in its Pools, so we can't create a Cluster without a (default) Pool. But in the current state of the Ansible module gcp_container_cluster, not all Pool parameters can be configured through it. And we can't create the Pool first, without the Cluster, so we've got a chicken-and-egg problem here.

As of now, the only solution seems to be to:

  • Create the Cluster with a minimal Pool
  • Create a new Pool with all the features, and assign it to the Cluster
  • Dispose of that first, implicitly created minimal 'default' Pool
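Expressed as plain gcloud commands (only to illustrate what the playbooks do; the pool name is made up for the example), the second and third steps would look roughly like this, assuming the cluster itself was created with a single minimal g1-small node in its default pool:

# the real pool with all the features we want (sizes as in 'Misc notes' below)
gcloud container node-pools create lb-demo-pool \
    --cluster=lb-demo-cluster --zone=europe-west1-c \
    --machine-type=f1-micro --disk-size=10 --num-nodes=4 \
    --enable-autorepair --no-enable-autoupgrade

# drop the implicitly created minimal 'default-pool'
gcloud container node-pools delete default-pool \
    --cluster=lb-demo-cluster --zone=europe-west1-c --quiet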

This may change in the future, as the Cluster API documentation marks the nodeConfig and initialNodeCount fields of the Cluster as deprecated, and recommends specifying the initial Pool as part of the Cluster specification (nodePools[]), with the same full specification model as any Pools we would add afterwards.

So this is an Ansible limitation: the latest changes of the GKE API haven't yet been tracked in the corresponding module gcp_container_cluster. (Others have run into this problem too, only at that time the new GKE API may not have existed yet.)

As for that initial 'minimal pool', GKE also has some peculiar restrictions: if we chose the smallest machine type (f1-micro), at least 3 of them would be required to form a Node Pool, so we have to choose the next smallest (g1-small), of which one is enough.

Note #1

If the script cannot dump the cluster services but returns a terse Unauthorized error, check on the web console whether the cluster actually managed to start up, or whether it is stuck in a 'Pods unschedulable' state.

Note #2

If we drop a node pool while an auto-upgrade is in progress, we'll get that 'Pods unschedulable' state mentioned above. That's why the auto-upgrade option is disabled in the creator playbook.

Actually creating the cluster

We have an Ansible playbook for that: ./run.sh 0_create_cluster.yaml
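The exact contents of run.sh aren't reproduced here; a minimal wrapper of this kind could be as simple as the sketch below, where the environment variables are the standard fallbacks understood by the Ansible gcp_* modules (the real script may well do more):

#!/bin/sh
# hypothetical wrapper: let the gcp_* modules find the credentials and project,
# then run whichever playbook was named on the command line
export GCP_AUTH_KIND=serviceaccount
export GCP_SERVICE_ACCOUNT_FILE="$PWD/service_account.json"
export GCP_PROJECT=networksandbox-232012
exec ansible-playbook "$@"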

Checking the cluster

When we want to manage the cluster manually, we'll use the CLI tool kubectl, which needs access to the cluster, so we must tell it to ask gcloud for credentials.

This information must be described in ~/.kube/config in quite a nice syntax, but gcloud can do that for us:

gcloud container clusters get-credentials --zone=europe-west1-c lb-demo-cluster

Then, to check that we can actually access the cluster: kubectl cluster-info

NOTE: This ~/.kube/config is only needed for kubectl, as the playbooks access and use the credentials directly.

Deploying MariaDB

First of all, we need a persistent disk that will store all our data, and it must be initialised to contain the (empty) database files.

  1. Create the disk
  2. Create a temporary VM, attach the disk, create filesystem on it
  3. Shut down and destroy the VM

This is done by 1_create_db_disk.yaml.
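In gcloud terms those three steps would look something like this (the disk and VM names are illustrative; the playbook does the equivalent through the gcp_* modules):

# 1. create the persistent disk
gcloud compute disks create lb-demo-db-disk --zone=europe-west1-c --size=10GB

# 2. attach it to a throwaway VM and create a filesystem on it
gcloud compute instances create tmp-mkfs-vm --zone=europe-west1-c \
    --machine-type=f1-micro --disk=name=lb-demo-db-disk,device-name=dbdata
gcloud compute ssh tmp-mkfs-vm --zone=europe-west1-c \
    --command='sudo mkfs.ext4 -F /dev/disk/by-id/google-dbdata'

# 3. destroy the VM; the disk survives, as it wasn't created as part of the VM
gcloud compute instances delete tmp-mkfs-vm --zone=europe-west1-c --quiet

With the disk initialised, the next playbook deploys the database itself: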

  1. Generate random passwords for the (soon to be created) root and app users of the DB
  2. Deploy the MariaDB to the cluster (see the embedded resource definition in the .yaml)
  3. Create the MariaDB service
  4. Check out the App sources (to have the DB setup .sql scripts)
  5. Customise the DB setup scripts and the Frontend configs
  6. Remotely access the DB service and execute the DB setup scripts

This is done by 2_mariadb_deployment.yaml

NOTE: It may take some time (even minutes) until the service gets its external IP assigned; check the output of kubectl get service mariadb-server

The randomly generated passwords are stored in the local files db.root.password and db.user.password, so if we want to check the DB service manually, we can connect to it:

mysql -u root --password="$(<db.root.password)" -h <the service external IP address>
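If you don't want to copy the external IP by hand, kubectl can extract it for you (the jsonpath below assumes the usual LoadBalancer service status):

# grab the external IP of the mariadb-server service and connect as root
DB_IP=$(kubectl get service mariadb-server -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
mysql -u root --password="$(<db.root.password)" -h "$DB_IP"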

Deploying the webservers

The container image

The frontend is a standalone web application with our specific code and content, so we must create an image for it.

To push a local image to the Google Container Registry (GCR), the officially recommended way is to do it via docker:

  1. Create our local image
  2. Add a tag that refers to our project registry: docker tag lb_demo_frontend eu.gcr.io/networksandbox-232012/lb_demo_frontend
  3. Push the image: docker push eu.gcr.io/networksandbox-232012/lb_demo_frontend

Pushing needs some credentials, so the documentation recommends configuring docker to use gcloud as a credential helper: gcloud auth configure-docker --quiet, but DON'T do it yet. It wouldn't work as expected, so we'll use a workaround and therefore we won't need it.

(Btw, this command would just create/update ~/.docker/config.json with { "credHelpers": { "gcr.io": "gcloud", ... } }, so it doesn't deal with the credentials themselves, it only configures how to get them.)

docker as non-root

We are working as a plain, non-root user, so just saying docker whatever will only give us some error messages about not being able to write to /var/run/docker.sock.

There is a doc on how to make Docker accessible to non-root users, and another doc about why not to do it.

Starting from CentOS 7, the OS-supplied docker packages follow the 'root-only' discipline and state that if you want to make docker available to non-root users, you should configure sudo for them, because sudo is at least audited, while the implicit (and necessary) root-exec abilities of docker aren't.

At first glance there isn't much difference between docker whatever and sudo docker whatever, and for those commands that don't access the GCR registry (e.g. pull, build, tag) it indeed works.

For the push, however, it won't, because the credential handling doesn't work across sudo.

Credentials: docker vs. gcloud

If we told gcloud to tell docker to use gcloud as credential source (that gcloud auth configure-docker above), it would create/update the .docker/config.json in our home folder.

Then when we'd say sudo docker push ..., the docker command would run as root and use the home folder of root, and thus wouldn't even consider our config.json.

This could be circumvented by telling docker where to look for its config: sudo docker --config "$HOME/.docker" push ..., and it almost works. Almost...

The docker client can connect to the socket, talk to the docker daemon, prepare the images to send, but then it fails:

denied: Token exchange failed for project 'networksandbox-232012'. Caller does not have permission 'storage.buckets.get'. To configure permissions, follow instructions at: https://cloud.google.com/container-registry/docs/access-control

The problem is this:

  1. docker push is running as root
  2. It starts talking with the GCR server, reaches the authentication phase
  3. Calls out to gcloud for credentials, still as root
  4. So gcloud is also running as root
  5. It tries to get its active config and such information from ~/.config/gcloud/
  6. That refers to the home folder of root, and not to ours
  7. There certainly are no credentials for our logged-in service account

(It took about two hours of debugging the scripts and tracing docker with strace, but I finally managed to catch it :D !)
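You can see the mismatch for yourself by comparing the two environments (provided gcloud is on root's PATH at all; gcloud auth list shows which accounts each of them knows about):

# our own gcloud knows the activated service account...
gcloud auth list

# ...but the one invoked under sudo looks in /root/.config/gcloud and typically knows nothing
sudo gcloud auth list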

The solution

What docker actually does when asking gcloud for credentials is something like this: echo "https://eu.gcr.io" | gcloud auth docker-helper get, and it expects the credentials in JSON format: { "Secret": "...", "Username": "_dcgcloud_token" }

That is the username and password docker uses to access the server eu.gcr.io, so we may as well log in there with these credentials: sudo docker login -u "_dcgcloud_token" -p "..." eu.gcr.io.

The token has a limited validity period, so the login should be performed right before the sudo docker push ..., but then all the sudo docker ... commands work fine, and finally we can log out as well: sudo docker logout https://eu.gcr.io

NOTE: This token is the same as that in the structure returned by gcloud config config-helper --format=json, so the playbook uses that one.

So, the process goes like this:

  1. Create the frontend image (see details below)
  2. Add a tag that refers to our project registry: sudo docker tag lb_demo_frontend eu.gcr.io/networksandbox-232012/lb_demo_frontend
  3. Log out from any previous logins: sudo docker logout eu.gcr.io
  4. Get a fresh token: echo "https://eu.gcr.io" | gcloud auth docker-helper get
  5. Log in to the server: sudo docker login -u "_dcgcloud_token" -p "..." eu.gcr.io
  6. Push the image: sudo docker push eu.gcr.io/networksandbox-232012/lb_demo_frontend
  7. Log out: sudo docker logout eu.gcr.io
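As noted above, the playbook takes the token from gcloud config config-helper instead of the docker-helper; in shell terms the login-push-logout part is roughly this (the --format expression just pulls the access_token field out of the JSON shown near the end of this README):

# fetch a fresh short-lived token and use it as the docker password
TOKEN="$(gcloud config config-helper --format='value(credential.access_token)')"

sudo docker logout eu.gcr.io
sudo docker login -u "_dcgcloud_token" -p "$TOKEN" eu.gcr.io
sudo docker push eu.gcr.io/networksandbox-232012/lb_demo_frontend
sudo docker logout eu.gcr.io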

This is done by 3_create_frontend_image.yaml, and then its result can be checked:

gcloud container images list --repository=eu.gcr.io/networksandbox-232012

Creating the frontend image

The standalone docker command would be sudo docker image build -t frontend:latest .

Of course, it can be integrated into our playbooks, so this is also in 3_create_frontend_image.yaml.

Finally: deploying the webservers

Having reached this point, there are no surprises left. With the DB we have already deployed pods to the cluster (using an image) and created a service to make them available, and this step is even simpler (it needs no customisation of files, etc.):

4_frontend_deployment.yaml

And now we have the cluster up and running, at full functionality :D !
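To see it in action, list the services and poke the frontend's external IP; the frontend service name depends on the resource definition embedded in the playbook, so treat the one below as a placeholder:

# the frontend service should get an external IP just like the DB one did
kubectl get services

# substitute whatever the playbook actually calls the frontend service
FRONTEND_IP=$(kubectl get service lb-demo-frontend -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
curl "http://$FRONTEND_IP/"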

Misc notes

Time overhead because of the 'Cluster vs. Pool' creation problem

  • Creating the cluster with the implicit pool (1 g1-small with 10GB disk, no autoscaling, no auto-repair, no auto-upgrade): 2.5 minutes
  • Creating the real pool (4 f1-micro instances with 10GB disks): 5.5 minutes
  • Deleting the implicit pool: 2 minutes

It seems that the time overhead caused by the implicit pool (creating and destroying it) is about 3 minutes. That's not negligible, but not insufferable either.

SSL handling of k8s_facts

There is a bug; it was fixed, and the fix was merged on Jun 5, 2019. Ansible '2.8.1' was released on Jun 7 but it doesn't have the fix, and neither does '2.8.2', so it'll probably be released in '2.8.3'.

Until then, apply this workaround.

Contents of ~/.kube/config

It's a .yaml file with the following information:

fullname = "gke_{{ project_id }}_{{ zone }}_{{ cluster_name }}"

apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: "{{ cluster.masterAuth.clusterCaCertificate }}"
    server: "https://{{ cluster.endpoint }}"
  name: "{{ fullname }}"
contexts:
- context:
    cluster: "{{ fullname }}"
    user: "{{ fullname }}"
  name: "{{ fullname }}"
current-context: "{{ fullname }}"
kind: Config
preferences: {}
users:
- name: "{{ fullname }}"
  user:
    auth-provider:
      config:
        cmd-args: config config-helper --format=json
        cmd-path: /usr/lib64/google-cloud-sdk/bin/gcloud
        expiry-key: '{.credential.token_expiry}'
        token-key: '{.credential.access_token}'
      name: gcp

So the actual auth credentials are provided by gcloud config config-helper --format=json, which produces

{
  "configuration": {
    "active_configuration": "default",
    "properties": {
      "core": {
        "account": "...@....",
        "disable_usage_reporting": "True",
        "project": "networksandbox-232012"
      }
    }
  },
  "credential": {
    "access_token": "...",
    "token_expiry": "2019-07-23T10:54:19Z"
  },
  "sentinels": {
    "config_sentinel": ".../.config/gcloud/config_sentinel"
  }
}

That access_token is the same as the one returned by echo "https://gcr.io" | gcloud auth docker-helper get
