Tools and configuration for provisioning OS-diskless Container Linux clusters, running k3s and logging with Papertrail. Nodes boot from an iPXE flash drive, which also contains a dedicated volume for keeping secrets. Apply platform config changes and updates by rebooting.
The iPXE script downloads and boots the newest stable release of Flatcar Container Linux (RIP CoreOS), which then pulls in the `ignition.ign` file in this repo.
Since that is publicly visible, we cannot have any secrets kicking around in there. But we still need those values to be carried over somehow, so we capture them at iPXE flash drive provisioning time and throw them into the secrets volume. From there, all service definitions pull what they need from `/secrets`, which is mounted read-only.
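As a sketch of that pattern (the file name below is illustrative, not necessarily what this repo's scripts write):

```sh
# /secrets is populated at stick-provisioning time and mounted read-only,
# so nothing at runtime can rewrite what's in it.
NAMECHEAP_DDNS_PASS="$(cat /secrets/namecheap_ddns_pass)"  # hypothetical path
```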
A number of reasons, in no particular order:
- I didn't want boot drives stealing SATA ports and drive bays on my machines
- I didn't want another computing environment that turned into me managing a small pack of Debian/Fedora boxes
- I didn't want to shell out additional money for OS drives that would basically just hold logs
- I didn't want to manage a bunch of infrastructure (log collection, config management, presence monitoring) just to have a basic, reliable deployment
- I did want to try something new
You can swing a decent deployment using hosted services to the tune of less than $20/month. These are just the bare necessities to bootstrap and monitor a k3s cluster. For any actual applications, you may want dedicated storage. A way to accomplish that would be with a Rook-managed Ceph cluster (backed by local disks, probably), which is then either used by external clients or cluster-internal containers.
papertrail.com has a free tier with 48 hours of search and 7 days of archives.
Spin up a Vultr compute node with Debian or whatever and throw Postgres on it. You can get a 1 CPU/1GB RAM/25GB SSD node with automated filesystem backups for like $6/month. If you're interested in additional backups, you can use Backblaze B2 for cheap backup storage and just throw Postgres dumps there. I use an hourly/daily/weekly series of `pg-b2` cronjobs, monitored with healthchecks.io.
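For a rough idea of what such a cronjob boils down to (a sketch, not the actual `pg-b2` code; the bucket, database, and check UUID are placeholders):

```sh
#!/bin/sh
set -eu
# Dump, compress, and ship a Postgres backup to a B2 bucket, then ping
# healthchecks.io so it can alert if this job ever stops running.
# Connection details come from PG* env vars or ~/.pgpass.
DUMP="mydb_$(date -u +%Y%m%dT%H%M%SZ).sql.gz"
pg_dump mydb | gzip > "/tmp/$DUMP"
b2 upload-file my-backup-bucket "/tmp/$DUMP" "hourly/$DUMP"
rm -f "/tmp/$DUMP"
wget --spider "https://hc-ping.com/YOUR-CHECK-UUID"
```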
Heroku's "Hobby Basic" tier of hosted postgres allows for 10 million rows at
$9/month. This seems compelling at first. My "this is a single node doing
nothing" test hit 3k rows, which seemed promising, but trying to helm install
something hit the hobby tier's 20 connect limit. The next tier is $50/month,
which was high enough to push me away. Vultr offers 4 CPU/8GB RAM/160GB SSD for
only like $40/month, so I'll probably build out in that direction if I need to
grow.
You can get some super simple heartbeat monitoring with healthchecks.io. It's free as long as you fit within the hobby tier.
We support updating Namecheap dynamic DNS records for individual hosts (based upon the value of `hostname -f`). Additionally, all healthy k3s primaries will compete for a single record defined by `primary.$(hostname -d)`. This is a super dumb, hacky workaround that lets us define appropriate hosts for cluster ingress.
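For reference, a Namecheap DDNS update is just an HTTP GET against their documented endpoint; it works out to something roughly like this (a sketch, not this repo's exact script):

```sh
# Read the interface's IPv4 address and push it to Namecheap's DDNS endpoint.
IP="$(ip -4 -o addr show "$NAMECHEAP_DDNS_INTERFACE" | awk '{print $4}' | cut -d/ -f1)"
wget -qO- "https://dynamicdns.park-your-domain.com/update?host=$(hostname -s)&domain=$(hostname -d)&password=${NAMECHEAP_DDNS_PASS}&ip=${IP}"
```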
Pop in a flash drive you don't care about and then run the following:
```sh
sudo \
    SSH_KEY_PATH=$HOME/.ssh/id_rsa.pub \
    NODE_HOSTNAME=node1337.example.com \
    LOGEXPORT_HOST=logsX.papertrailapp.com LOGEXPORT_PORT=XXXXX \
    HEARTBEAT_URL=https://nudge.me/im_alive \
    NAMECHEAP_DDNS_PASS=foobarbaz \
    NAMECHEAP_DDNS_INTERFACE=eth0 \
    K3S_DATASTORE_ENDPOINT="see the k3s docs" \
    ./build_pxe_stick.sh /dev/sdX
```

Tack any other `K3S_*` env vars you probably want to set onto that same invocation.
If you don't specify expected `LOGEXPORT_*`, `HEARTBEAT_*` or `K3S_*` values, we'll throw up a warning in `build_pxe_stick.sh` and those components won't be activated at system runtime.
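The check is along these lines (an illustrative sketch of the guard, not the script's exact code):

```sh
# If an optional component's config is missing, warn and skip it rather
# than baking a half-configured unit into the stick.
if [ -z "${HEARTBEAT_URL:-}" ]; then
    echo "WARNING: HEARTBEAT_URL not set; heartbeat monitoring won't be activated" >&2
fi
```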
The bare hostname you want to set for the node, e.g. `node0`. If you don't set this, we won't do anything special to automatically set the hostname.
Path to an SSH public key, e.g. `$HOME/.ssh/id_rsa.pub` (the default). We need this to be something, as otherwise we're entirely reliant upon the values set when the Ignition config is applied.
Where you get these values will vary a lot based upon which log ingestion service you use. We support the "just pipe `journalctl -f` to `ncat`" approach that's used by Papertrail (and maybe others? idk).
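That approach amounts to a one-liner; a sketch of the idea (the unit this repo installs will differ in detail, and `--ssl` assumes your ingestion endpoint speaks TLS, as Papertrail's does):

```sh
# Stream the journal to the log ingestion host over TLS.
journalctl -f | ncat --ssl "$LOGEXPORT_HOST" "$LOGEXPORT_PORT"
```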
If you're using Papertrail, you can rip these from the Papertrail setup page. See the bit at the top that says "Your logs will go to...". These values seem to be scoped to the Papertrail account, rather than an individual sender, so feel free to reuse them across multiple nodes.
This can really be any URL; we're just gonna call it every 5 minutes with `wget --spider`. It wouldn't be rocket surgery to implement your own heartbeat monitor service, but there are services that offer this.
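Functionally, the heartbeat reduces to something like this (a sketch; the real thing is presumably a systemd timer rather than a loop):

```sh
# Poke the heartbeat URL every 5 minutes. --spider fetches headers only,
# so nothing is downloaded; the monitor just sees the request arrive.
while true; do
    wget --spider "$HEARTBEAT_URL"
    sleep 300
done
```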
Unlike with Papertrail, you need dedicated values for each host here. At the time of writing, healthchecks.io offers a free hobby tier with up to 20 checks. To get the host's `HEARTBEAT_URL`, you'll want to:
- head to the healthchecks.io dashboard
- sign in (if necessary) and select a project
- click `Add Check` at the bottom
- hit the small `edit` link at the top by the auto-generated UUID title to set a better title
- find the `Change Schedule...` button in the `Schedule` section near the bottom of the page
- set the period to 5 minutes and the grace time to 1 minute
- configure notifications in the "Notification Methods" section
- grab the `hc-ping.com` URL from the "How To Ping" section, and that's your `HEARTBEAT_URL` value
Unlike with Papertrail, you need dedicated values for each host here. To get those, you'll want to:

- head to the UptimeRobot dashboard
- click `+ Add New Monitor` at the top left (sorry, can't link it)
- select monitor type -> "Heartbeat (Beta)" (it's in beta at the time of writing)
- set the "Friendly Name" to the node's hostname
- set a monitoring interval of "every 5 minutes"
- work out who should be alerted when things break

That'll kick out a URL like `https://heartbeat.uptimerobot.com/BUNCH_OF_CHARS`. That's the value for `HEARTBEAT_URL`.
You can get the DDNS password by following Namecheap's docs. The DDNS interface is the interface whose IP we read for the DDNS update.
See k3s docs for more details here. If you want to use the embedded SQLite option, set this to an empty value explicitly.
Just add 'em in and we'll preserve 'em; `k3s` will start with those vars set.
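For example (values illustrative, passed alongside the other variables from the invocation above; the `K3S_DATASTORE_ENDPOINT` format is from the k3s HA docs and `K3S_TOKEN` is a standard k3s variable):

```sh
sudo \
    K3S_DATASTORE_ENDPOINT="postgres://user:pass@db.example.com:5432/k3s" \
    K3S_TOKEN=some-shared-secret \
    ./build_pxe_stick.sh /dev/sdX
```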
The `ignition.ign` file is generated from the `ignition.yml` file. You can perform that transformation with the following:
```sh
bash ./ignite.sh ignition.yml > ignition.ign
```
Note that the Ignition config is pulled down from GitHub each boot, so you'll need to push any changes in order to test them. There's likely some room here for certain iPXE values (such as the Ignition config URI) to be derived from the git repo itself...
You can have QEMU boot the flash drive for testing purposes. Note that this will result in downloading several hundred megabytes each run. To do this, use `./qemu_test /dev/sdX`. If you get stuck in the console, then `alt-2` and then `quit` should get you out. You can also run it with the `NO_CURSES=1` environment variable set if you want your console in a dedicated graphical window rather than a terminal: `NO_CURSES=1 ./qemu_test /dev/sdX`. Then you can just `ssh core@localhost -p 2222` when it's done booting.
Cool! You're probably gonna wanna call `build_pxe_stick.sh` with a custom `IGN_PATH`. The default is `https://raw.githubusercontent.com/mcsaucy/ephemeros/master/ignition.ign`, so you'll probably want to change that to the URI for your own.
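For instance (the URL below points at a hypothetical fork; you'd still pass the other env vars from earlier):

```sh
sudo \
    IGN_PATH=https://raw.githubusercontent.com/YOUR_USER/ephemeros/master/ignition.ign \
    ./build_pxe_stick.sh /dev/sdX
```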