feat: shutdown race resilience
A significant rewrite to ensure that we don't suffer from shutdown race
conditions when the prune condition is met while additional resources
are still being created.

Previously this would remove resources that were still in use. Now we
retry if we detect that new resources have been created within a window
of the prune condition triggering.

This supports the following new environment configuration settings:
- RYUK_REMOVE_RETRIES - The number of times to retry removing a resource.
- RYUK_REQUEST_TIMEOUT - The timeout for any Docker requests.
- RYUK_RETRY_OFFSET - The offset added to the start time of the prune
  pass that is used as the minimum resource creation time.
- RYUK_SHUTDOWN_TIMEOUT - The duration after shutdown has been requested
  when the remaining connections are ignored and prune checks start.

Update the README to correct the example: health is only valid for
containers, not the other resources, so it would cause failures.
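
The retry window described in the message can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from this commit: `resource`, `errStillChanging`, `createdAfter`, `pruneWithRetry` and `demo` are hypothetical names, and the listing and clock are injected as functions so the sketch runs without Docker.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// resource is a hypothetical stand-in for a container, network, volume or image.
type resource struct {
	ID      string
	Created time.Time
}

// errStillChanging is a hypothetical error returned when every pass saw new resources.
var errStillChanging = errors.New("resources still being created")

// createdAfter reports whether any resource was created after since.
func createdAfter(resources []resource, since time.Time) bool {
	for _, r := range resources {
		if r.Created.After(since) {
			return true
		}
	}
	return false
}

// pruneWithRetry lists resources up to retries times. A pass is accepted only
// if nothing was created after the pass start time plus retryOffset.
// It returns the accepted listing and the number of passes used.
func pruneWithRetry(list func() []resource, now func() time.Time, retryOffset time.Duration, retries int) ([]resource, int, error) {
	for pass := 1; pass <= retries; pass++ {
		since := now().Add(retryOffset) // start - 1s for an offset of -1s
		found := list()
		if !createdAfter(found, since) {
			return found, pass, nil
		}
	}
	return nil, retries, errStillChanging
}

// demo simulates a resource created exactly when the first pass starts, so the
// first pass is rejected and the second pass, five seconds later, is accepted.
func demo() (int, int, error) {
	clock := time.Unix(1000, 0)
	now := func() time.Time {
		t := clock
		clock = clock.Add(5 * time.Second) // each pass starts 5s after the last
		return t
	}
	list := func() []resource {
		return []resource{
			{ID: "old", Created: time.Unix(900, 0)},
			{ID: "new", Created: time.Unix(1000, 0)},
		}
	}
	found, passes, err := pruneWithRetry(list, now, -time.Second, 10)
	return passes, len(found), err
}

func main() {
	passes, removable, err := demo()
	fmt.Println(passes, removable, err)
}
```

With an offset of `-1s` the cutoff sits one second before the pass starts, so a resource created in that last second also forces another pass.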
stevenh committed Sep 7, 2024
1 parent e8ed9c9 commit c8d5692
Showing 15 changed files with 1,561 additions and 903 deletions.
6 changes: 6 additions & 0 deletions .gitignore
@@ -5,3 +5,9 @@

vendor/
bin/

# Binary
moby-ryuk

# VS Code
.vscode
44 changes: 33 additions & 11 deletions README.md
@@ -1,22 +1,36 @@
# Moby Ryuk

This project helps you to remove containers/networks/volumes/images by given filter after specified delay.
This project helps you to remove containers, networks, volumes and images by given filter after specified delay.

# Usage
## Building

To build the binary only run:

```shell
go build
```

To build the Linux docker container as the latest tag:

```shell
docker build -f linux/Dockerfile -t testcontainers/ryuk:latest .
```

## Usage

1. Start it:

$ RYUK_PORT=8080 ./bin/moby-ryuk
$ # You can also run it with Docker
$ docker run -v /var/run/docker.sock:/var/run/docker.sock -e RYUK_PORT=8080 -p 8080:8080 testcontainers/ryuk:0.6.0
RYUK_PORT=8080 ./bin/moby-ryuk
# You can also run it with Docker
docker run -v /var/run/docker.sock:/var/run/docker.sock -e RYUK_PORT=8080 -p 8080:8080 testcontainers/ryuk:0.6.0

1. Connect via TCP:

$ nc localhost 8080
nc localhost 8080

1. Send some filters:

label=testing=true&health=unhealthy
label=testing=true&label=testing.sessionid=mysession
ACK
label=something
ACK
@@ -37,7 +51,15 @@ This project helps you to remove containers/networks/volumes/images by given fil

## Ryuk configuration

- `RYUK_CONNECTION_TIMEOUT` - Environment variable that defines the timeout for Ryuk to receive the first connection (default: 60s). Value layout is described in [time.ParseDuration](https://golang.org/pkg/time/#ParseDuration) documentation.
- `RYUK_PORT` - Environment variable that defines the port where Ryuk will be bound to (default: 8080).
- `RYUK_RECONNECTION_TIMEOUT` - Environment variable that defines the timeout for Ryuk to reconnect to Docker (default: 10s). Value layout is described in [time.ParseDuration](https://golang.org/pkg/time/#ParseDuration) documentation.
- `RYUK_VERBOSE` - Environment variable that defines if Ryuk should print debug logs (default: false).
The following environment variables can be configured to change the behaviour:

| Environment Variable | Default | Format | Description |
| - | - | - | - |
| `RYUK_CONNECTION_TIMEOUT` | `60s` | [Duration](https://golang.org/pkg/time/#ParseDuration) | The duration without receiving any connections which will trigger a shutdown |
| `RYUK_PORT` | `8080` | `uint16` | The port to listen on for connections |
| `RYUK_RECONNECTION_TIMEOUT` | `10s` | [Duration](https://golang.org/pkg/time/#ParseDuration) | The duration after the last connection closes which will trigger resource clean up and shutdown |
| `RYUK_REQUEST_TIMEOUT` | `10s` | [Duration](https://golang.org/pkg/time/#ParseDuration) | The timeout for any Docker requests |
| `RYUK_REMOVE_RETRIES` | `10` | `int` | The number of times to retry removing a resource |
| `RYUK_RETRY_OFFSET` | `-1s` | [Duration](https://golang.org/pkg/time/#ParseDuration) | The offset added to the start time of the prune pass that is used as the minimum resource creation time. Any resource created after this calculated time will trigger a retry to ensure in use resources are not removed |
| `RYUK_VERBOSE` | `false` | `bool` | Whether to enable verbose aka debug logging |
| `RYUK_SHUTDOWN_TIMEOUT` | `10m` | [Duration](https://golang.org/pkg/time/#ParseDuration) | The duration after shutdown has been requested when the remaining connections are ignored and prune checks start |
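
The Usage steps in the README (connect via TCP, send a filter line, wait for `ACK`) could be exercised from Go roughly as shown here. This is a sketch under assumptions: it assumes each filter and each `ACK` is newline-terminated, as the README transcript suggests. `sendFilter`, `fakeServer` and `demo` are hypothetical names, and `fakeServer` is only a local stand-in that acknowledges every line, not Ryuk itself.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"strings"
)

// sendFilter writes one filter line and waits for the "ACK" reply line.
func sendFilter(conn net.Conn, r *bufio.Reader, filter string) error {
	if _, err := fmt.Fprintln(conn, filter); err != nil {
		return fmt.Errorf("write filter: %w", err)
	}
	reply, err := r.ReadString('\n')
	if err != nil {
		return fmt.Errorf("read reply: %w", err)
	}
	if strings.TrimSpace(reply) != "ACK" {
		return fmt.Errorf("unexpected reply %q", reply)
	}
	return nil
}

// fakeServer is a stand-in that acknowledges every line; it is not Ryuk.
func fakeServer(ln net.Listener) {
	conn, err := ln.Accept()
	if err != nil {
		return
	}
	defer conn.Close()
	scanner := bufio.NewScanner(conn)
	for scanner.Scan() {
		fmt.Fprintln(conn, "ACK")
	}
}

// demo sends the two filters from the README example to the stand-in server.
func demo() error {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return err
	}
	defer ln.Close()
	go fakeServer(ln)

	conn, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		return err
	}
	defer conn.Close()

	filters := []string{
		"label=testing=true&label=testing.sessionid=mysession",
		"label=something",
	}
	r := bufio.NewReader(conn)
	for _, f := range filters {
		if err := sendFilter(conn, r, f); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	if err := demo(); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("all filters acknowledged")
}
```

Against a real instance, the `net.Dial` address would be the host and `RYUK_PORT` the reaper listens on (`localhost:8080` in the README example).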
66 changes: 66 additions & 0 deletions config.go
@@ -0,0 +1,66 @@
package main

import (
	"fmt"
	"log/slog"
	"time"

	"github.com/caarlos0/env/v11"
)

// config represents the configuration for the reaper.
type config struct {
	// ConnectionTimeout is the duration without receiving any connections which will trigger a shutdown.
	ConnectionTimeout time.Duration `env:"RYUK_CONNECTION_TIMEOUT" envDefault:"60s"`

	// ReconnectionTimeout is the duration after the last connection closes which will trigger
	// resource clean up and shutdown.
	ReconnectionTimeout time.Duration `env:"RYUK_RECONNECTION_TIMEOUT" envDefault:"10s"`

	// RequestTimeout is the timeout for any Docker requests.
	RequestTimeout time.Duration `env:"RYUK_REQUEST_TIMEOUT" envDefault:"10s"`

	// RemoveRetries is the number of times to retry removing a resource.
	RemoveRetries int `env:"RYUK_REMOVE_RETRIES" envDefault:"10"`

	// RetryOffset is the offset added to the start time of the prune pass that is
	// used as the minimum resource creation time. Any resource created after this
	// calculated time will trigger a retry to ensure in use resources are not removed.
	RetryOffset time.Duration `env:"RYUK_RETRY_OFFSET" envDefault:"-1s"`

	// ShutdownTimeout is the maximum amount of time the reaper will wait
	// once signalled to shut down before it terminates, even if connections
	// are still established.
	ShutdownTimeout time.Duration `env:"RYUK_SHUTDOWN_TIMEOUT" envDefault:"10m"`

	// Port is the port to listen on for connections.
	Port uint16 `env:"RYUK_PORT" envDefault:"8080"`

	// Verbose is whether to enable verbose aka debug logging.
	Verbose bool `env:"RYUK_VERBOSE" envDefault:"false"`
}

// LogAttrs returns the configuration as a slice of attributes.
func (c config) LogAttrs() []slog.Attr {
	return []slog.Attr{
		slog.Duration("connection_timeout", c.ConnectionTimeout),
		slog.Duration("reconnection_timeout", c.ReconnectionTimeout),
		slog.Duration("request_timeout", c.RequestTimeout),
		slog.Duration("shutdown_timeout", c.ShutdownTimeout),
		slog.Int("remove_retries", c.RemoveRetries),
		slog.Duration("retry_offset", c.RetryOffset),
		slog.Int("port", int(c.Port)),
		slog.Bool("verbose", c.Verbose),
	}
}

// loadConfig loads the configuration from the environment
// applying defaults where necessary.
func loadConfig() (*config, error) {
	var cfg config
	if err := env.Parse(&cfg); err != nil {
		return nil, fmt.Errorf("parse env: %w", err)
	}

	return &cfg, nil
}
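
For readers unfamiliar with the duration format used by these settings, the following standard-library sketch shows how a value such as `-1s` or `10m` might be read from the environment with a default. It is not how `github.com/caarlos0/env` works internally. `durationEnv` is a hypothetical helper, and treating an empty value as unset is an assumption made for this example.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// durationEnv returns the duration stored in the named environment variable,
// or the parsed fallback when the variable is unset or empty.
func durationEnv(name, fallback string) (time.Duration, error) {
	value := os.Getenv(name)
	if value == "" {
		value = fallback
	}
	d, err := time.ParseDuration(value)
	if err != nil {
		return 0, fmt.Errorf("parse %s=%q: %w", name, value, err)
	}
	return d, nil
}

func main() {
	offset, err := durationEnv("RYUK_RETRY_OFFSET", "-1s")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("retry offset:", offset)
}
```

Negative values are valid input to `time.ParseDuration`, which is what allows a default of `-1s` for `RYUK_RETRY_OFFSET`.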
80 changes: 80 additions & 0 deletions config_test.go
@@ -0,0 +1,80 @@
package main

import (
	"os"
	"reflect"
	"testing"
	"time"

	"github.com/stretchr/testify/require"
)

// clearConfigEnv clears the environment variables for the config fields.
func clearConfigEnv(t *testing.T) {
	t.Helper()

	var cfg config
	typ := reflect.TypeOf(cfg)
	// Check failure on line 18 in config_test.go (GitHub Actions / lint):
	// cannot range over typ.NumField() (value of type int) (typecheck)
	for i := range typ.NumField() {
		field := typ.Field(i)
		if name := field.Tag.Get("env"); name != "" {
			if os.Getenv(name) != "" {
				t.Setenv(name, "")
			}
		}
	}
}

func Test_loadConfig(t *testing.T) {
	tests := map[string]struct {
		setEnv   func(*testing.T)
		expected config
	}{
		"defaults": {
			setEnv: clearConfigEnv,
			expected: config{
				ConnectionTimeout:   time.Minute,
				Port:                8080,
				ReconnectionTimeout: time.Second * 10,
				RemoveRetries:       10,
				RequestTimeout:      time.Second * 10,
				RetryOffset:         -time.Second,
				ShutdownTimeout:     time.Minute * 10,
			},
		},
		"custom": {
			setEnv: func(t *testing.T) {
				t.Helper()

				clearConfigEnv(t)
				t.Setenv("RYUK_PORT", "1234")
				t.Setenv("RYUK_CONNECTION_TIMEOUT", "2s")
				t.Setenv("RYUK_RECONNECTION_TIMEOUT", "3s")
				t.Setenv("RYUK_REQUEST_TIMEOUT", "4s")
				t.Setenv("RYUK_REMOVE_RETRIES", "5")
				t.Setenv("RYUK_RETRY_OFFSET", "-6s")
				t.Setenv("RYUK_SHUTDOWN_TIMEOUT", "7s")
			},
			expected: config{
				Port:                1234,
				ConnectionTimeout:   time.Second * 2,
				ReconnectionTimeout: time.Second * 3,
				RequestTimeout:      time.Second * 4,
				RemoveRetries:       5,
				RetryOffset:         -time.Second * 6,
				ShutdownTimeout:     time.Second * 7,
			},
		},
	}
	for name, tc := range tests {
		t.Run(name, func(t *testing.T) {
			if tc.setEnv != nil {
				tc.setEnv(t)
			}

			cfg, err := loadConfig()
			require.NoError(t, err)
			require.Equal(t, tc.expected, *cfg)
		})
	}
}
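
The lint failure reported against `range typ.NumField()` is the kind of error a toolchain or linter older than Go 1.22 produces, since ranging over an integer was only added in Go 1.22. If that is the cause here (an assumption, as the linter version is not shown), a classic three-clause loop is accepted by both old and new toolchains. The sketch below is illustrative only: `sample` and `envNames` are hypothetical names standing in for `config` and the tag lookup in `clearConfigEnv`.

```go
package main

import (
	"fmt"
	"reflect"
)

// sample is a hypothetical struct with env tags, standing in for config.
type sample struct {
	Port    uint16 `env:"RYUK_PORT"`
	Verbose bool   `env:"RYUK_VERBOSE"`
	Note    string // no env tag, so it is skipped
}

// envNames returns the env tag of every struct field that has one.
func envNames(v any) []string {
	typ := reflect.TypeOf(v)
	var names []string
	// Three-clause loop: also compiles with toolchains older than Go 1.22.
	for i := 0; i < typ.NumField(); i++ {
		if name := typ.Field(i).Tag.Get("env"); name != "" {
			names = append(names, name)
		}
	}
	return names
}

func main() {
	fmt.Println(envNames(sample{}))
}
```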
18 changes: 18 additions & 0 deletions consts.go
@@ -0,0 +1,18 @@
package main

const (
	// labelBase is the base label for testcontainers.
	labelBase = "org.testcontainers"

	// ryukLabel is the label used to identify reaper containers.
	ryukLabel = labelBase + ".ryuk"

	// fieldError is the log field key for errors.
	fieldError = "error"

	// fieldAddress is the log field key for a client or listening address.
	fieldAddress = "address"

	// fieldClients is the log field key for client counts.
	fieldClients = "clients"
)
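
As an illustration of why shared field keys are useful, the sketch below passes them to `log/slog` so every log line uses the same attribute names. The constants are copied from the file above. `errReset`, `connectionAttrs`, the messages and the address are hypothetical and do not come from this commit.

```go
package main

import (
	"errors"
	"log/slog"
	"os"
)

const (
	fieldError   = "error"
	fieldAddress = "address"
	fieldClients = "clients"
)

// errReset is a hypothetical error used only for this example.
var errReset = errors.New("connection reset")

// connectionAttrs builds the attributes for a hypothetical connection log line.
// The error attribute is only added when err is non-nil.
func connectionAttrs(addr string, clients int, err error) []any {
	attrs := []any{
		slog.String(fieldAddress, addr),
		slog.Int(fieldClients, clients),
	}
	if err != nil {
		attrs = append(attrs, slog.Any(fieldError, err))
	}
	return attrs
}

func main() {
	logger := slog.New(slog.NewTextHandler(os.Stdout, nil))
	logger.Info("client connected", connectionAttrs("127.0.0.1:54321", 1, nil)...)
	logger.Error("read failed", connectionAttrs("127.0.0.1:54321", 1, errReset)...)
}
```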
56 changes: 19 additions & 37 deletions go.mod
@@ -1,64 +1,46 @@
module github.com/testcontainers/moby-ryuk

go 1.21
go 1.22

require (
github.com/caarlos0/env/v11 v11.1.0
github.com/docker/docker v27.1.1+incompatible
github.com/stretchr/testify v1.9.0
github.com/testcontainers/testcontainers-go v0.33.0
gopkg.in/matryer/try.v1 v1.0.0-20150601225556-312d2599e12e
)

require (
dario.cat/mergo v1.0.0 // indirect
github.com/Azure/go-ansiterm v0.0.0-20210617225240-d185dfc1b5a1 // indirect
github.com/Microsoft/go-winio v0.6.2 // indirect
github.com/cenkalti/backoff/v4 v4.2.1 // indirect
github.com/cheekybits/is v0.0.0-20150225183255-68e9c0620927 // indirect
github.com/containerd/containerd v1.7.18 // indirect
github.com/containerd/log v0.1.0 // indirect
github.com/containerd/platforms v0.2.1 // indirect
github.com/cpuguy83/dockercfg v0.3.1 // indirect
github.com/davecgh/go-spew v1.1.1 // indirect
github.com/distribution/reference v0.6.0 // indirect
github.com/docker/go-connections v0.5.0 // indirect
github.com/docker/go-units v0.5.0 // indirect
github.com/felixge/httpsnoop v1.0.4 // indirect
github.com/go-logr/logr v1.4.1 // indirect
github.com/go-logr/logr v1.4.2 // indirect
github.com/go-logr/stdr v1.2.2 // indirect
github.com/go-ole/go-ole v1.2.6 // indirect
github.com/gogo/protobuf v1.3.2 // indirect
github.com/google/uuid v1.6.0 // indirect
github.com/klauspost/compress v1.17.4 // indirect
github.com/kr/text v0.2.0 // indirect
github.com/lufia/plan9stats v0.0.0-20211012122336-39d0f177ccd0 // indirect
github.com/magiconair/properties v1.8.7 // indirect
github.com/matryer/try v0.0.0-20161228173917-9ac251b645a2 // indirect
github.com/kr/pretty v0.3.1 // indirect
github.com/moby/docker-image-spec v1.3.1 // indirect
github.com/moby/patternmatcher v0.6.0 // indirect
github.com/moby/sys/sequential v0.5.0 // indirect
github.com/moby/sys/user v0.1.0 // indirect
github.com/moby/term v0.5.0 // indirect
github.com/morikuni/aec v1.0.0 // indirect
github.com/opencontainers/go-digest v1.0.0 // indirect
github.com/opencontainers/image-spec v1.1.0 // indirect
github.com/pkg/errors v0.9.1 // indirect
github.com/pmezard/go-difflib v1.0.0 // indirect
github.com/power-devops/perfstat v0.0.0-20210106213030-5aafc221ea8c // indirect
github.com/shirou/gopsutil/v3 v3.23.12 // indirect
github.com/shoenig/go-m1cpu v0.1.6 // indirect
github.com/sirupsen/logrus v1.9.3 // indirect
github.com/tklauser/go-sysconf v0.3.12 // indirect
github.com/tklauser/numcpus v0.6.1 // indirect
github.com/yusufpapurcu/wmi v1.2.3 // indirect
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.49.0 // indirect
go.opentelemetry.io/otel v1.24.0 // indirect
go.opentelemetry.io/otel/metric v1.24.0 // indirect
go.opentelemetry.io/otel/trace v1.24.0 // indirect
golang.org/x/crypto v0.24.0 // indirect
golang.org/x/net v0.26.0 // indirect
golang.org/x/sys v0.21.0 // indirect
google.golang.org/genproto/googleapis/api v0.0.0-20240318140521-94a12d6c2237 // indirect
google.golang.org/genproto/googleapis/rpc v0.0.0-20240318140521-94a12d6c2237 // indirect
github.com/rogpeppe/go-internal v1.12.0 // indirect
github.com/stretchr/objx v0.5.2 // indirect
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.53.0 // indirect
go.opentelemetry.io/otel v1.28.0 // indirect
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp v1.28.0 // indirect
go.opentelemetry.io/otel/metric v1.28.0 // indirect
go.opentelemetry.io/otel/sdk v1.28.0 // indirect
go.opentelemetry.io/otel/trace v1.28.0 // indirect
golang.org/x/net v0.27.0 // indirect
golang.org/x/sys v0.22.0 // indirect
golang.org/x/time v0.5.0 // indirect
google.golang.org/genproto/googleapis/rpc v0.0.0-20240725223205-93522f1f2a9f // indirect
google.golang.org/grpc v1.65.0 // indirect
gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c // indirect
gopkg.in/yaml.v3 v3.0.1 // indirect
gotest.tools/v3 v3.5.1 // indirect
)