Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecate ingester.ring.final-sleep in favor of shutdown-delay #3431

Closed
wants to merge 1 commit into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 2 additions & 1 deletion CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@
* [CHANGE] Experimental flag `-blocks-storage.tsdb.out-of-order-capacity-min` has been removed. #3261
* [CHANGE] Distributor: Wrap errors from pushing to ingesters with useful context, for example clarifying timeouts. #3307
* [CHANGE] The default value of `-server.http-write-timeout` has changed from 30s to 2m. #3346
* [CHANGE] Ingester: Deprecate `-ingester.ring.final-sleep` in favor of `-shutdown-delay` option for sleeping after SIGTERM before shutdown. #3431
* [FEATURE] Alertmanager: added Discord support. #3309
* [ENHANCEMENT] Added `-server.tls-min-version` and `-server.tls-cipher-suites` flags to configure cipher suites and min TLS version supported by HTTP and gRPC servers. #2898
* [ENHANCEMENT] Distributor: Add age filter to forwarding functionality, to not forward samples which are older than defined duration. If such samples are not ingested, `cortex_discarded_samples_total{reason="forwarded-sample-too-old"}` is increased. #3049 #3133
Expand All @@ -20,7 +21,7 @@
* [ENHANCEMENT] Ingester: reduced the memory footprint of active series custom trackers. #2568
* [ENHANCEMENT] Distributor: Include `X-Scope-OrgId` header in requests forwarded to configured forwarding endpoint. #3283
* [ENHANCEMENT] Alertmanager: reduced memory utilization in Mimir clusters with a large number of tenants. #3309
* [ENHANCEMENT] Add experimental flag `-shutdown-delay` to allow components to wait after receiving SIGTERM and before stopping. In this time the component returns 503 from /ready endpoint. #3298
* [ENHANCEMENT] Add advanced flag `-shutdown-delay` to allow components to wait after receiving SIGTERM and before stopping. In this time the component returns 503 from /ready endpoint. #3298
* [ENHANCEMENT] Go: update to go 1.19.3. #3371
* [ENHANCEMENT] Alerts: added `RulerRemoteEvaluationFailing` alert, firing when communication between ruler and frontend fails in remote operational mode. #3177 #3389
* [ENHANCEMENT] Clarify which S3 signature versions are supported in the error "unsupported signature version". #3376
Expand Down
13 changes: 1 addition & 12 deletions cmd/mimir/config-descriptor.json
Original file line number Diff line number Diff line change
Expand Up @@ -44,7 +44,7 @@
"fieldDefaultValue": 0,
"fieldFlag": "shutdown-delay",
"fieldType": "duration",
"fieldCategory": "experimental"
"fieldCategory": "advanced"
},
{
"kind": "block",
Expand Down Expand Up @@ -2627,17 +2627,6 @@
"fieldType": "duration",
"fieldCategory": "advanced"
},
{
"kind": "field",
"name": "final_sleep",
"required": false,
"desc": "Duration to sleep for before exiting, to ensure metrics are scraped.",
"fieldValue": null,
"fieldDefaultValue": 0,
"fieldFlag": "ingester.ring.final-sleep",
"fieldType": "duration",
"fieldCategory": "advanced"
},
{
"kind": "field",
"name": "readiness_check_ring_health",
Expand Down
4 changes: 1 addition & 3 deletions cmd/mimir/help-all.txt.tmpl
Original file line number Diff line number Diff line change
Expand Up @@ -1061,8 +1061,6 @@ Usage of ./cmd/mimir/mimir:
Etcd username.
-ingester.ring.excluded-zones comma-separated-list-of-strings
Comma-separated list of zones to exclude from the ring. Instances in excluded zones will be filtered out from the ring. This option needs be set on ingesters, distributors, queriers and rulers when running in microservices mode.
-ingester.ring.final-sleep duration
Duration to sleep for before exiting, to ensure metrics are scraped.
-ingester.ring.heartbeat-period duration
Period at which to heartbeat to the ring. 0 = disabled. (default 15s)
-ingester.ring.heartbeat-timeout duration
Expand Down Expand Up @@ -1872,7 +1870,7 @@ Usage of ./cmd/mimir/mimir:
-server.tls-min-version string
Minimum TLS version to use. Allowed values: VersionTLS10, VersionTLS11, VersionTLS12, VersionTLS13. If blank, the Go TLS minimum version is used.
-shutdown-delay duration
[experimental] How long to wait between SIGTERM and shutdown. After receiving SIGTERM, Mimir will report not-ready status via /ready endpoint.
How long to wait between SIGTERM and shutdown. After receiving SIGTERM, Mimir will report not-ready status via /ready endpoint.
-store-gateway.sharding-ring.consul.acl-token string
ACL Token used to interact with Consul.
-store-gateway.sharding-ring.consul.cas-retry-delay duration
Expand Down
2 changes: 0 additions & 2 deletions development/mimir-microservices-mode/config/mimir.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -18,8 +18,6 @@ ingester_client:

ingester:
ring:
# We want to start immediately.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

am I misunderstanding what final_sleep does or does this comment not make sense?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the comment was just wrong. Maybe it used to apply to join_after?

final_sleep: 0s
num_tokens: 512
kvstore:
store: consul
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -18,8 +18,6 @@ ingester_client:

ingester:
ring:
# We want to start immediately.
final_sleep: 0s
num_tokens: 512
kvstore:
store: consul
Expand Down
2 changes: 0 additions & 2 deletions development/mimir-monolithic-mode/config/mimir.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -18,8 +18,6 @@ ingester_client:

ingester:
ring:
# We want to start immediately.
final_sleep: 0s
num_tokens: 512
kvstore:
store: consul
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -112,7 +112,7 @@ where `default_value` is the value to use if the environment variable is undefin
# CLI flag: -auth.no-auth-tenant
[no_auth_tenant: <string> | default = "anonymous"]

# (experimental) How long to wait between SIGTERM and shutdown. After receiving
# (advanced) How long to wait between SIGTERM and shutdown. After receiving
# SIGTERM, Mimir will report not-ready status via /ready endpoint.
# CLI flag: -shutdown-delay
[shutdown_delay: <duration> | default = 0s]
Expand Down Expand Up @@ -784,11 +784,6 @@ ring:
# CLI flag: -ingester.ring.min-ready-duration
[min_ready_duration: <duration> | default = 15s]

# (advanced) Duration to sleep for before exiting, to ensure metrics are
# scraped.
# CLI flag: -ingester.ring.final-sleep
[final_sleep: <duration> | default = 0s]

# (advanced) When enabled the readiness probe succeeds only after all
# instances are ACTIVE and healthy in the ring, otherwise only the instance
# itself is checked. This option should be disabled if in your cluster
Expand Down
11 changes: 6 additions & 5 deletions pkg/ingester/ingester_ring.go
Original file line number Diff line number Diff line change
Expand Up @@ -43,9 +43,10 @@ type RingConfig struct {
UnregisterOnShutdown bool `yaml:"unregister_on_shutdown" category:"advanced"`

// Config for the ingester lifecycle control
ObservePeriod time.Duration `yaml:"observe_period" category:"advanced"`
MinReadyDuration time.Duration `yaml:"min_ready_duration" category:"advanced"`
FinalSleep time.Duration `yaml:"final_sleep" category:"advanced"`
ObservePeriod time.Duration `yaml:"observe_period" category:"advanced"`
MinReadyDuration time.Duration `yaml:"min_ready_duration" category:"advanced"`
// TODO: Remove in Mimir 2.7 (Mimir 2.5 is the first release in which this is deprecated).
FinalSleep time.Duration `yaml:"final_sleep" category:"advanced" doc:"hidden"`
ReadinessCheckRingHealth bool `yaml:"readiness_check_ring_health" category:"advanced"`

// Injected internally
Expand Down Expand Up @@ -92,7 +93,8 @@ func (cfg *RingConfig) RegisterFlags(f *flag.FlagSet, logger log.Logger) {
f.DurationVar(&cfg.ObservePeriod, prefix+"observe-period", 0*time.Second, "Observe tokens after generating to resolve collisions. Useful when using gossiping ring.")
flagext.DeprecatedFlag(f, prefix+"join-after", "Deprecated: this setting was used to set a period of time to wait before joining the hash ring. Mimir now behaves as this setting is always set to 0s.", logger)
f.DurationVar(&cfg.MinReadyDuration, prefix+"min-ready-duration", 15*time.Second, "Minimum duration to wait after the internal readiness checks have passed but before succeeding the readiness endpoint. This is used to slowdown deployment controllers (eg. Kubernetes) after an instance is ready and before they proceed with a rolling update, to give the rest of the cluster instances enough time to receive ring updates.")
f.DurationVar(&cfg.FinalSleep, prefix+"final-sleep", 0, "Duration to sleep for before exiting, to ensure metrics are scraped.")
// TODO: Remove in Mimir 2.7 (Mimir 2.5 is the first release in which this is deprecated).
flagext.DeprecatedFlag(f, prefix+"final-sleep", "Deprecated: this setting has been replaced by `-shutdown-delay` which can be used by any Mimir component. Duration to sleep for before exiting, to ensure metrics are scraped.", logger)

// Disable the ring health check in the readiness endpoint by default so that we can quickly rollout
// multiple ingesters in multi-zone deployments. It's also safe to disable it when deploying in a single zone,
Expand Down Expand Up @@ -132,7 +134,6 @@ func (cfg *RingConfig) ToLifecyclerConfig() ring.LifecyclerConfig {
lc.JoinAfter = cfg.JoinAfter
lc.MinReadyDuration = cfg.MinReadyDuration
lc.InfNames = cfg.InstanceInterfaceNames
lc.FinalSleep = cfg.FinalSleep
lc.TokensFilePath = cfg.TokensFilePath
lc.Zone = cfg.InstanceZone
lc.UnregisterOnShutdown = cfg.UnregisterOnShutdown
Expand Down
3 changes: 0 additions & 3 deletions pkg/ingester/ingester_ring_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -25,7 +25,6 @@ func TestRingConfig_DefaultConfigToLifecyclerConfig(t *testing.T) {
expected.RingConfig.KVStore.Store = "memberlist"
expected.NumTokens = cfg.NumTokens
expected.MinReadyDuration = cfg.MinReadyDuration
expected.FinalSleep = cfg.FinalSleep
expected.ReadinessCheckRingHealth = false
expected.HeartbeatPeriod = 15 * time.Second

Expand All @@ -52,7 +51,6 @@ func TestRingConfig_CustomConfigToLifecyclerConfig(t *testing.T) {
cfg.UnregisterOnShutdown = true
cfg.ObservePeriod = 10 * time.Minute
cfg.MinReadyDuration = 3 * time.Minute
cfg.FinalSleep = 2 * time.Minute
cfg.ReadinessCheckRingHealth = false
cfg.ListenPort = 10

Expand All @@ -71,7 +69,6 @@ func TestRingConfig_CustomConfigToLifecyclerConfig(t *testing.T) {
expected.JoinAfter = 0
expected.MinReadyDuration = cfg.MinReadyDuration
expected.InfNames = cfg.InstanceInterfaceNames
expected.FinalSleep = cfg.FinalSleep
expected.TokensFilePath = cfg.TokensFilePath
expected.Zone = cfg.InstanceZone
expected.UnregisterOnShutdown = cfg.UnregisterOnShutdown
Expand Down
1 change: 0 additions & 1 deletion pkg/ingester/lifecycle_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -43,7 +43,6 @@ func defaultIngesterTestConfig(t testing.TB) Config {
cfg.IngesterRing.ListenPort = 0
cfg.IngesterRing.InstanceAddr = "localhost"
cfg.IngesterRing.InstanceID = "localhost"
cfg.IngesterRing.FinalSleep = 0
cfg.ActiveSeriesMetricsEnabled = true

return cfg
Expand Down
2 changes: 1 addition & 1 deletion pkg/mimir/mimir.go
Original file line number Diff line number Diff line change
Expand Up @@ -97,7 +97,7 @@ type Config struct {
Target flagext.StringSliceCSV `yaml:"target"`
MultitenancyEnabled bool `yaml:"multitenancy_enabled"`
NoAuthTenant string `yaml:"no_auth_tenant" category:"advanced"`
ShutdownDelay time.Duration `yaml:"shutdown_delay" category:"experimental"`
ShutdownDelay time.Duration `yaml:"shutdown_delay" category:"advanced"`
PrintConfig bool `yaml:"-"`
ApplicationName string `yaml:"-"`

Expand Down