This repo contains some notes about setting up a K8s cluster on EC2+ECR without EKS, to see how fast it can be. EKS can take between 10 and 15 minutes to bootstrap a cluster.
Below is a changelog of this repo.
Commit: https://github.com/lucabrunox/experiments/tree/cd8378154c378
Define the AWS region that will be used for all the commands:
export AWS_REGION=eu-west-1
Create an S3 bucket for the Terraform state:
aws s3api create-bucket --acl private --bucket experiments-12345-terraform --create-bucket-configuration LocationConstraint=$AWS_REGION
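Optionally, versioning can also be enabled on that bucket so older state files stay recoverable (not part of the setup above, just a common precaution):
aws s3api put-bucket-versioning --bucket experiments-12345-terraform --versioning-configuration Status=Enabled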
Initialize:
cd tf
cat <<EOF > backend.conf
bucket = "experiments-12345-terraform"
key = "experiments/terraform.tfstate"
region = "$AWS_REGION"
EOF
terraform init --backend-config=backend.conf
terraform get
Apply the plan which will create a K8s cluster:
terraform apply \
-var "region=$AWS_REGION" \
-var "asg_desired_capacity=1" \
-var "nlb_enabled=true"
Use asg_desired_capacity=0 to tear down the cluster.
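For example, a teardown is just the same apply with the capacity set to 0:
terraform apply \
-var "region=$AWS_REGION" \
-var "asg_desired_capacity=0" \
-var "nlb_enabled=true"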
Commit: https://github.com/lucabrunox/experiments/tree/9cc3ac81d7f
Using raw K8s on EC2 instead of EKS, with Flannel instead of the AWS VPC CNI. Some interesting facts:
- It takes 2 minutes and 20 seconds until all containers are in the Running state (see the watch command after this list).
- A t4g.medium is needed to run a cluster. Using a t4g.nano with swap is not enough because the apiserver/etcd will keep timing out.
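To reproduce that timing, one way (using the kubectl access described below) is to watch the pods across all namespaces until everything is Running:
kubectl get pods -A --watch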
SSH into the EC2 instance and run crictl and kubectl commands to inspect the cluster:
sudo su
crictl ps
export KUBECONFIG=/etc/kubernetes/admin.conf
kubectl get all -A
To check init logs:
sudo cat /var/log/cloud-init-output.log
Commit: https://github.com/lucabrunox/experiments/tree/5216dfe5efd6
Set up following https://docs.djangoproject.com/en/5.1/intro/tutorial01/ with django-admin startproject mysite.
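Before containerizing it, the scaffolding can be checked with the tutorial's dev server (standard Django command, nothing specific to this repo):
python manage.py runserver 0.0.0.0:8000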
The Dockerfile is self-explanatory:
docker run -p 8000:8000 --rm -it $(docker build -q .)
To test GH Actions I've set up act to run in Docker, which seems to work fine:
./test_gh.sh
Commit: https://github.com/lucabrunox/experiments/tree/5216dfe5efd6
We're using a CronJob to get the ECR creds from the node metadata.
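Roughly, the job turns the instance role credentials into an image pull secret along these lines (a sketch only: the secret name, the ACCOUNT_ID variable, and the presence of the aws/kubectl CLIs in the job image are assumptions; the real manifest is in k8s/ecr-credentials.yaml):
# Sketch of the refresh the CronJob performs; names are illustrative
PASSWORD=$(aws ecr get-login-password --region "$AWS_REGION")
kubectl create secret docker-registry ecr-credentials \
--docker-server="${ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com" \
--docker-username=AWS \
--docker-password="$PASSWORD" \
--dry-run=client -o yaml | kubectl apply -f -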
Some notes:
- Need to untaint the control plane node in order to schedule pods (command shown after this list).
- Since we use t4g (ARM) instances, the GH Actions build was switched to ARM.
- The Python requirements also need tzdata.
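The untaint from the first note is the usual kubeadm one (assuming a recent K8s, where the taint key is node-role.kubernetes.io/control-plane):
kubectl taint nodes --all node-role.kubernetes.io/control-plane-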
In the end we're able to run the following kubectl commands on the EC2 instance to deploy the app and see it working:
kubectl apply -f k8s/ecr-credentials.yaml
kubectl apply -f frontend/k8s/manifest.yaml
curl $(kubectl get svc frontend -o=jsonpath='{.spec.clusterIP}'):8000
Commit: https://github.com/lucabrunox/experiments/tree/f03d8449f869
Some notes:
- Configured the NLB with a security group across 2 subnets.
- Django has an ALLOWED_HOSTS setting to prevent Host header attacks (demonstrated after the deploy commands below).
- Django detects a tty when logging to stdout.
kubectl apply -f k8s/ecr-credentials.yaml
kubectl apply -f frontend/k8s/manifest.yaml
curl http://$(terraform output --raw experiments_nlb_dns_name)
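To see the ALLOWED_HOSTS note in action, a request with a Host header that isn't allowed should be rejected by Django with a 400:
curl -H "Host: not-allowed.example.com" http://$(terraform output --raw experiments_nlb_dns_name)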
Commit: https://github.com/lucabrunox/experiments/tree/0427c9b777aaab
Planning to create multiple instances of the same app, so packaging it with helm.
Some notes:
- Created the chart with the default helm create scaffolding.
- Added the container hostname to ALLOWED_HOSTS for the health checks.
kubectl apply -f k8s/ecr-credentials.yaml
helm install --set-string image=ACCOUNT_ID.dkr.ecr.eu-west-1.amazonaws.com/experiments-frontend:vTAG frontend ./frontend/k8s/chart
curl http://$(terraform output --raw experiments_nlb_dns_name)
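To check the release afterwards (release name from the install above; the labels come from the default helm create scaffolding):
helm status frontend
kubectl get pods -l app.kubernetes.io/instance=frontend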
Commit: https://github.com/lucabrunox/experiments/tree/c6db952745cd9f
Created a self-contained k8s_control_plane_template module. Obviously, it's only a single node.
The rest stays in the main module.
Until now we've been manually getting the EC2 IP and then using admin.conf as root. With this change we instead tag the K8s node instance and give ec2-user admin access, so that we can SSH in and run kubectl commands straight away.
An interesting note is that kubectl certificate approve has a delay, hence we have to wait for the certificate to be ready before using it.
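A sketch of the kind of wait involved, with an illustrative CSR name (not necessarily what the bootstrap scripts use):
kubectl certificate approve ec2-user
# Wait until the approved CSR actually has an issued certificate
until [ -n "$(kubectl get csr ec2-user -o jsonpath='{.status.certificate}')" ]; do sleep 2; done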
ssh ec2-user@$(aws ec2 describe-instances --filters "Name=instance-state-name,Values=running" "Name=tag-key,Values=Name" "Name=tag-value,Values=experiments_k8s_control_plane" --query 'Reservations[].Instances[].PublicIpAddress' --output text)
kubectl get all -A