Tools for bringing up OS disk-less Container Linux clusters.

mcsaucy/ephemeros

Ephemeros

Tools and configuration for provisioning OS diskless Container Linux clusters, running k3s and logging with Papertrail. Nodes boot from an iPXE flash drive, which also contains a dedicated volume for keeping secrets. Apply platform config changes and updates by rebooting.

The iPXE script downloads and boots the newest stable release of Flatcar Container Linux (RIP CoreOS), which then pulls in the ignition.ign file in this repo.

Since that file is publicly visible, we cannot have any secrets kicking around in there. But we still need those values to be carried over somehow, so we capture them at iPXE flash drive provisioning time and throw them into the secrets volume. From there, all service definitions pull what they need from /secrets, which is mounted read-only.

Why?

A number of reasons, in no particular order:

  • I didn't want boot drives stealing SATA ports and drive bays on my machines
  • I didn't want another computing environment that turned into me managing a small pack of Debian/Fedora boxes
  • I didn't want to shell out additional money for OS drives that would basically just hold logs
  • I didn't want to manage a bunch of infrastructure (log collection, config management, presence monitoring) just to have a basic, reliable deployment
  • I did want to try something new

What could a possible deployment look like?

You can swing a decent deployment using hosted services to the tune of less than $20/month. These are just the bare necessities to bootstrap and monitor a k3s cluster. For any actual applications, you may want dedicated storage. A way to accomplish that would be with a Rook-managed Ceph cluster (backed by local disks, probably), which is then either used by external clients or cluster-internal containers.

Log collection

papertrail.com has a free tier with 48 hours of search and 7 days of archives.

k3s datastore hosting

Spin up a Vultr compute node with Debian or whatever and throw Postgres on it. You can get a 1 CPU/1GB RAM/25GB SSD node with automated filesystem backups for like $6/month. If you're interested in additional backups, you can use Backblaze B2 for cheap backup storage and just throw Postgres dumps there. I use an hourly/daily/weekly series of pg-b2 cronjobs, monitored with healthchecks.io.

Heroku's "Hobby Basic" tier of hosted Postgres allows for 10 million rows at $9/month. This seems compelling at first. My "this is a single node doing nothing" test hit 3k rows, which seemed promising, but trying to helm install something hit the hobby tier's 20-connection limit. The next tier is $50/month, which was high enough to push me away. Vultr offers 4 CPU/8GB RAM/160GB SSD for only like $40/month, so I'll probably build out in that direction if I need to grow.

Heartbeat monitoring

You can get some super simple heartbeat monitoring with healthchecks.io. It's free as long as you fit within the hobby tier.

DDNS updating

We support updating Namecheap dynamic DNS records for individual hosts (based upon the value of hostname -f). Additionally, all healthy k3s primaries will compete for a single record defined by primary.$(hostname -d). This is a super dumb, hacky workaround that lets us define appropriate hosts for cluster ingress.
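
To make the record naming concrete, here is a minimal sketch of how an update URL could be built from a node's FQDN. The endpoint is Namecheap's documented dynamic DNS update URL; the ddns_update_url helper and the way it splits the FQDN into host and domain are assumptions for illustration, not necessarily what this repo's units do. The password is not URL-encoded here, so the sketch assumes it contains no special characters.

```shell
# Sketch only: build a Namecheap dynamic DNS update URL for one record.
# Assumption: the record's "host" is the first label of the FQDN and the
# "domain" is everything after it (what `hostname -d` would print).
ddns_update_url() {
    fqdn=$1
    pass=$2
    ip=$3
    host=${fqdn%%.*}     # e.g. node0
    domain=${fqdn#*.}    # e.g. example.com
    printf 'https://dynamicdns.park-your-domain.com/update?host=%s&domain=%s&password=%s&ip=%s\n' \
        "$host" "$domain" "$pass" "$ip"
}

# Hypothetical usage on a node (203.0.113.7 is a placeholder address):
#   curl -fsS "$(ddns_update_url "$(hostname -f)" "$NAMECHEAP_DDNS_PASS" 203.0.113.7)"
# The shared primary record would use "primary.$(hostname -d)" as the FQDN.
```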

Making boot media

Pop in a flash drive you don't care about and then run the following:

# Add any other K3S_* env vars you want alongside K3S_DATASTORE_ENDPOINT.
sudo \
    SSH_KEY_PATH="$HOME/.ssh/id_rsa.pub" \
    NODE_HOSTNAME=node1337.example.com \
    LOGEXPORT_HOST=logsX.papertrailapp.com LOGEXPORT_PORT=XXXXX \
    HEARTBEAT_URL=https://nudge.me/im_alive \
    NAMECHEAP_DDNS_PASS=foobarbaz \
    NAMECHEAP_DDNS_INTERFACE=eth0 \
    K3S_DATASTORE_ENDPOINT="see the k3s docs" \
    ./build_pxe_stick.sh /dev/sdX

If you don't specify expected LOGEXPORT_*, HEARTBEAT_* or K3S_*, we'll throw up a warning in build_pxe_stick.sh and those components won't be activated at system runtime.
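
As a rough illustration of that warn-but-continue behavior, a check along these lines would do the job. The warn_unset function name and the message wording are hypothetical; the real logic lives in build_pxe_stick.sh and may differ. The sketch deliberately treats "set but empty" as set, since an explicitly empty K3S_DATASTORE_ENDPOINT is a meaningful value.

```shell
# Sketch: warn about settings that are unset, without failing the build.
# Warnings go to stderr; the number of warnings is printed on stdout.
warn_unset() {
    missing=0
    for name in "$@"; do
        # ${VAR+x} expands to "x" only when VAR is set, even if it is empty.
        eval "is_set=\${$name+x}"
        if [ -z "$is_set" ]; then
            echo "WARNING: $name is not set; that component will not be activated" >&2
            missing=$((missing + 1))
        fi
    done
    echo "$missing"
}

# Hypothetical usage:
#   warn_unset LOGEXPORT_HOST LOGEXPORT_PORT HEARTBEAT_URL K3S_DATASTORE_ENDPOINT
```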

What values should be used?

NODE_HOSTNAME

The bare hostname you want to set for the node, e.g. node0. If you don't set this, we won't do anything special to automatically set the hostname.

SSH_KEY_PATH

Path to an SSH public key, e.g. $HOME/.ssh/id_rsa.pub (the default). We need this to be something, as otherwise we're entirely dependent on the values set when the Ignition config is applied.

LOGEXPORT_HOST and LOGEXPORT_PORT

Where you get these values will vary a lot based upon which log ingestion service you use. We support the "just pipe journalctl -f to ncat" approach that's used by Papertrail (and maybe others? idk).
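
For reference, that approach boils down to something like the sketch below. The export_logs wrapper is a made-up name, and the --ssl flag assumes the endpoint accepts TLS (drop it for a plaintext endpoint); the actual unit in ignition.yml may be wired up differently.

```shell
# Sketch of the "pipe journalctl -f to ncat" approach.
# Assumes journalctl and ncat (from nmap) are available on the node.
export_logs() {
    if [ -z "${LOGEXPORT_HOST:-}" ] || [ -z "${LOGEXPORT_PORT:-}" ]; then
        echo "LOGEXPORT_HOST/LOGEXPORT_PORT not set; skipping log export" >&2
        return 1
    fi
    # Follow the journal and stream every new line to the log service.
    journalctl -f | ncat --ssl "$LOGEXPORT_HOST" "$LOGEXPORT_PORT"
}
```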

Using Papertrail

If you're using Papertrail, you can rip these from the Papertrail setup page. See the bit at the top that says "Your logs will go to...". These values seem to be scoped to the Papertrail account, rather than an individual sender, so feel free to reuse those values across multiple nodes.

HEARTBEAT_URL

This can really be any URL; we're just gonna call it every 5 minutes with wget --spider. It wouldn't be rocket surgery to implement your own heartbeat monitor service, but there are services that offer this.
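
A single heartbeat amounts to roughly the following. The heartbeat_once name and the loop in the comment are illustrative assumptions; how the call is actually scheduled on the node (a systemd timer, most likely) is defined in the Ignition config, not here.

```shell
# Sketch: ping HEARTBEAT_URL once. Something else calls this every 5 minutes.
heartbeat_once() {
    # No URL means heartbeats are disabled; treat that as a no-op.
    [ -n "${HEARTBEAT_URL:-}" ] || return 0
    # --spider checks that the URL responds without saving the body.
    wget --spider --quiet "$HEARTBEAT_URL"
}

# Hypothetical scheduling loop:
#   while true; do heartbeat_once; sleep 300; done
```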

Using healthchecks.io

Unlike with Papertrail, you need dedicated values for each host here. At the time of writing, healthchecks.io offers a free hobby tier with up to 20 checks. To get the host's HEARTBEAT_URL, you'll want to:

  1. head to the healthchecks.io dashboard
  2. sign in (if necessary) and select a project
  3. click Add Check at the bottom
  4. hit the small edit link at the top by the auto-generated UUID title to set a better title
  5. find the Change Schedule... button in the Schedule section near the bottom of the page
  6. set the period to 5 minutes and the grace time to 1 minute
  7. configure notifications in the "Notification Methods" section
  8. grab the hc-ping.com URL from the "How To Ping" section, and that's your HEARTBEAT_URL value

Using UptimeRobot Heartbeat signals

Unlike with Papertrail, you need dedicated values for each host here. To get those, you'll want to:

  1. head to the UptimeRobot dashboard
  2. click + Add New Monitor at the top left (sorry, can't link it)
  3. select monitor type -> "Heartbeat (Beta)" (it's in beta at the time of writing)
  4. set the "Friendly Name" to the node's hostname
  5. set a monitoring interval of "every 5 minutes"
  6. work out who should be alerted when things break

That'll kick out a URL like https://heartbeat.uptimerobot.com/BUNCH_OF_CHARS. That's the value for HEARTBEAT_URL.

NAMECHEAP_DDNS_PASS and NAMECHEAP_DDNS_INTERFACE

You can get the DDNS password by following Namecheap's docs. The DDNS interface is the interface whose IP we read for the DDNS update.

K3S_DATASTORE_ENDPOINT

See k3s docs for more details here. If you want to use the embedded SQLite option, set this to an empty value explicitly.
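
As a quick illustration, the two cases look something like this. The Postgres URL follows the format given in the k3s datastore docs; the username, password, hostname and database name are all placeholders.

```shell
# External Postgres datastore. Every value in this URL is a placeholder.
K3S_DATASTORE_ENDPOINT="postgres://username:password@hostname:5432/k3s"

# Embedded SQLite instead: set the variable, but leave it empty.
# K3S_DATASTORE_ENDPOINT=""
```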

Other K3S_ vars

Just add em in and we'll preserve em. k3s will start with those vars set.

Updating the Ignition configs

The ignition.ign file is generated from the ignition.yml file. You can perform that transformation with the following:

bash ./ignite.sh ignition.yml > ignition.ign

Note that the Ignition config is pulled down from GitHub on each boot, so you'll need to push any changes in order to test them. There's likely some room here for certain iPXE values (such as the Ignition config URI) to be derived from the git repo itself...

Testing with qemu

You can have qemu boot the flash drive for testing purposes with ./qemu_test /dev/sdX. Note that each run downloads several hundred megabytes. If you get stuck in the console, alt-2 followed by quit should get you out. You can also set the NO_CURSES=1 environment variable if you want your console in a dedicated graphical window rather than a terminal:

NO_CURSES=1 ./qemu_test /dev/sdX

Then you can just ssh core@localhost -p 2222 when it's done booting.

I wanna mess with this

Cool! You're probably gonna wanna call build_pxe_stick.sh with a custom IGN_PATH. The default is https://raw.githubusercontent.com/mcsaucy/ephemeros/master/ignition.ign, so you'll probably want to change that to the URI for your own.
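
If your fork lives on GitHub, the URI follows the same raw.githubusercontent.com pattern as the default. Here's a small sketch for building it; the raw_ign_url helper is made up for illustration, and the usage line assumes IGN_PATH is passed as an environment variable like the other settings.

```shell
# Sketch: build the raw.githubusercontent.com URL for a fork's ignition.ign.
# Arguments: GitHub user, then optionally repo name and branch.
raw_ign_url() {
    user=$1
    repo=${2:-ephemeros}
    branch=${3:-master}
    printf 'https://raw.githubusercontent.com/%s/%s/%s/ignition.ign\n' \
        "$user" "$repo" "$branch"
}

# Hypothetical usage ("yourname" is a placeholder):
#   sudo IGN_PATH="$(raw_ign_url yourname)" ./build_pxe_stick.sh /dev/sdX
```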
