Are docker images pruned from runners after use? #210

Closed
Glen-Moonpig opened this issue Apr 1, 2020 · 15 comments
Labels
question ❔ Further information is requested

Comments

@Glen-Moonpig
Contributor

Hello Niek, I have a question you might be able to help me with.

I recently had to increase the root volume size of the runners because they were running out of disk space during job execution. I think this happened because some pipelines started using large Docker images. I suspect the images are pulled from the registry to the EC2 instance's root volume to execute the job, and after the job completes the image remains in the machine's local image store.

Do you have any idea how the local image registry is managed on the runners? Is there any kind of automatic image pruning? Do you have any recommendations on how to clean up old images from the runner volumes?

@fliphess
Contributor

fliphess commented Apr 2, 2020

@Glen-Moonpig I've run into this as well and fixed it by adding gitlab-runner-docker-cleanup to my post-install script, which works really well for me:

locals {
  userdata_post_install = <<-POST
  if ! docker ps --format '{{.Names}}' | grep -w gitlab-runner-docker-cleanup &> /dev/null; then
      docker run -d \
          -e LOW_FREE_SPACE=40G \
          -e EXPECTED_FREE_SPACE=50G \
          -e LOW_FREE_FILES_COUNT=1048576 \
          -e EXPECTED_FREE_FILES_COUNT=2097152 \
          -e DEFAULT_TTL=10m \
          -e USE_DF=1 \
          --restart always \
          -v /var/run/docker.sock:/var/run/docker.sock \
          --name=gitlab-runner-docker-cleanup \
          quay.io/gitlab/gitlab-runner-docker-cleanup &
  fi
  POST
}

And then set userdata_post_install = local.userdata_post_install in the module.
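
For reference, a minimal sketch of wiring that local into the module call; the registry source shown and the omitted required variables are assumptions and will differ per setup:

module "gitlab_runner" {
  source = "npalm/gitlab-runner/aws"

  # ... other required variables (VPC, subnets, registration token, etc.) ...

  # Post-install script from the locals block above
  userdata_post_install = local.userdata_post_install
}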

@fliphess
Contributor

fliphess commented Apr 2, 2020

It's good to know that pulling the image can take a long time, which caused a race condition while registering the runner and resulted in being blocked by the GitLab API. This was solved by pulling the image in the background, hence the appended & :)

@Glen-Moonpig
Contributor Author

Thanks @fliphess, I had seen this repo but wasn't sure if it would be compatible with this module; I will try it out. Does the image run on the agent machine and clean up images from the runner machines?

@fliphess
Contributor

fliphess commented Apr 2, 2020

It doesn't on docker-machine, only on the local runner. I'm currently working on a new gitlab-runner setup for the company I work for to create a scaling setup, and I haven't solved this yet.

GitLab provides tooling to clean up:
https://gitlab.com/gitlab-org/gitlab-runner/blob/master/packaging/root/usr/share/gitlab-runner/clear-docker-cache

You might be able to run this script as a post task, or configure a cronjob through --amazonec2-userdata.
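
A hedged sketch of the cronjob route, assuming a module version that exposes a docker_machine_options list (passed through to the runner's MachineOptions); the file path is a placeholder and must already exist on the agent instance, e.g. written there via userdata_post_install:

module "gitlab_runner" {
  # ...

  # Extra docker-machine flags; "amazonec2-userdata" points at a cloud-init/shell
  # file on the agent whose contents are passed to each docker-machine EC2 instance.
  docker_machine_options = [
    "amazonec2-userdata=/etc/gitlab-runner/machine-userdata.sh",
  ]
}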

@npalm
Collaborator

npalm commented Apr 4, 2020

@Glen-Moonpig I only use the docker-machine setup, with a short cycle of EC2 instances, and therefore don't run into this issue. Would you like to update the docker example with the post-install script? Sounds like a good approach.

@lsorber

lsorber commented Jul 2, 2020

@fliphess Thanks for sharing! What value are you using for runners_root_size with that userdata_post_install script?

@fliphess
Contributor

fliphess commented Jul 2, 2020

Hey @lsorber, we are using 150 GB disks.
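
For reference, a minimal sketch of the corresponding module setting (variable name as in older module versions; the value is in GB):

module "gitlab_runner" {
  # ...

  # Root volume size of the docker-machine instances, matching the 150 GB disks above
  runners_root_size = 150
}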

As we do some integration testing that requires pulling lots of different images at the same time, resulting in a large storage need, we added another cronjob that runs nightly to ensure images are removed as well, since the cleanup container wasn't always aware of images pulled directly through the docker daemon:

## Create cleanup cronjob (quote the heredoc delimiter so $3 is not expanded while writing the file)
cat > /usr/local/bin/clean-docker <<'CRON'
#!/bin/bash

# Remove exited containers
docker ps -a -q -f status=exited    | xargs --no-run-if-empty docker rm -v

# Remove dangling images
docker images -f "dangling=true" -q | xargs --no-run-if-empty docker rmi

# Remove unused images (tries to rmi everything; images still in use fail harmlessly)
docker images | awk '/ago/ { print $3 }' | xargs --no-run-if-empty docker rmi

# Remove dangling volumes
docker volume ls -qf dangling=true  | xargs --no-run-if-empty docker volume rm
CRON

# Make executable
chmod +x /usr/local/bin/clean-docker

# Run nightly at 01:00, with flock to avoid overlapping runs
echo -e 'MAILTO=you@example.com\n0 1 * * * root /bin/flock -n /tmp/.docker-clean.lock /usr/local/bin/clean-docker > /dev/null 2>&1\n' > /etc/cron.d/gitlab-runner-cleaner

Which is a bazooka shooting mosquitoes, but it ensures all leftovers are removed nightly.

@kayman-mk
Collaborator

As the link to the GitLab documentation is no longer working, try this one: https://docs.gitlab.com/runner/executors/docker.html

The solution is to either use one of the scripts above or the GitLab script I mentioned.

@fliphess
Contributor

And if required, the cache cleaner script is here (the URL has changed):

https://gitlab.com/gitlab-org/gitlab-runner/-/blob/main/packaging/root/usr/share/gitlab-runner/clear-docker-cache

@kayman-mk
Collaborator

@fliphess But the script has to run on the docker+machine instances, not on the agent. Any idea how to get it working?

@fliphess
Contributor

@kayman-mk Sorry, but I'm not using the module anymore; I switched jobs and am now using the Kubernetes executor.

Have a look at the --amazonec2-userdata setting for docker-machine (you can find examples of using it with gitlab-runner here and here); you should be able to add some commands there.

But tbh, I think the easiest way to do this is to bake your own AMI with all the required cleanup tools, crons etc. added to it, instead of adding complex scripts through userdata.
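
A hedged sketch of the custom-AMI route, assuming an older module version where ami_filter/ami_owners select the AMI for the docker-machine instances; the owner ID and name pattern are placeholders, and newer module versions use different variable names:

module "gitlab_runner" {
  # ...

  # Pre-baked AMI that already contains the cleanup cronjobs and tooling
  ami_owners = ["123456789012"]
  ami_filter = {
    name = ["my-gitlab-runner-worker-*"]
  }
}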

@kayman-mk
Collaborator

kayman-mk commented Oct 13, 2022

Activated the following via crontab on my agents (via userdata_post_install). Works like a charm and cleans the Docker images/containers/volumes every hour.

# install docker cleanup scripts for the docker+machine instances
cat << "EOF" > /etc/cron.hourly/clean-docker-machine-caches.sh
#!/usr/bin/env bash
# -q prints machine names only, skipping the header line of `docker-machine ls`
for i in `docker-machine ls -q`; do
  (echo "sudo -s;" && cat /usr/share/gitlab-runner/clear-docker-cache) | docker-machine ssh $i
done
EOF

chmod a+x /etc/cron.hourly/clean-docker-machine-caches.sh

@kayman-mk
Collaborator

@Glen-Moonpig Seems to be solved now and can be closed, right?

@kayman-mk kayman-mk added the question ❔ Further information is requested label Dec 31, 2022
@NIXKnight

Hi everyone. How can I run gitlab-runner-docker-cleanup with the current version of the module?

@kayman-mk
Collaborator

See #210 (comment). Meanwhile, the name of the variable has changed to runner_install.post_install_script.
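
For reference, a minimal sketch under the newer variable layout, where the post-install script moved into the runner_install block (other required variables omitted):

module "gitlab_runner" {
  # ...

  runner_install = {
    post_install_script = local.userdata_post_install
  }
}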
