etcd: avoid creating large leases #96836

mborsz · 2020-11-24T14:21:17Z

Currently we create a single etcd lease for each 1m of events (code). With high event throughput, this can attach a large number of objects to the same lease. Although the lease_revoke operation in etcd is atomic, revoking such a large lease blocks all other operations for a long period of time.

Currently, in #96038 we are seeing occasional restarts of the event etcd. All of them happen approx. 1h after cluster start and correlate with lease_revoke operations on the initial events. After the lease_revoke I see a number of errors like /health error; QGET failed etcdserver: request timed out (status code 503).

To fix this issue (lease_revoke blocking for so long that the health check fails), we shouldn't be creating large etcd leases.

Proposal: Let's introduce a limit on the number of objects attached to a single lease. When the "prevLease" in leaseManager reaches the object limit, we force starting a new one. The exact limit needs to be determined (e.g. by running some scalability test with additional logs or by adding some new metric to kube-apiserver (what exactly?)).

/cc @wojtek-t


k8s-ci-robot · 2020-11-24T14:21:28Z

@mborsz: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

wojtek-t · 2020-11-24T14:31:29Z

/sig scalability

mborsz · 2020-11-25T07:21:21Z

In fact, we may consider using leaseReuseDurationSeconds with a value lower than 1m instead of introducing an object limit (or maybe both). This will reduce the number of objects per lease and will also spread the deletion of events over 1m (instead of scheduling the deletion of a full 1m of events at the same time).

pacoxu · 2020-11-25T09:11:07Z

How about making leaseReuseDurationSeconds configurable as a tuning option first?

mborsz · 2020-11-25T09:38:46Z

> How about making leaseReuseDurationSeconds configurable as a tuning option first?

Sounds reasonable to me.

goku321 · 2020-11-28T15:00:49Z

/assign

mborsz · 2020-11-30T07:48:20Z

After making leaseReuseDurationSeconds configurable, we still need to have better observability to be able to consciously tune this value:

1. Add a Prometheus metric with the size (= number of objects) of a lease
2. Log a warning if the size of a lease exceeds some threshold (TBD)

The rationale for adding 2 when we already have 1 is that, e.g., in scalability tests the most problematic case is when we create a huge number of events in a short period of time during cluster bootstrap, before Prometheus is running. I think those tests can be a good starting point for parameter tuning.

fejta-bot · 2021-02-28T07:52:59Z

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

wojtek-t · 2021-02-28T10:56:18Z

This has been addressed by linked PRs.

mborsz added the kind/bug label Nov 24, 2020

k8s-ci-robot added the needs-sig and needs-triage labels Nov 24, 2020

k8s-ci-robot added the sig/scalability label and removed the needs-sig label Nov 24, 2020

wojtek-t added the triage/accepted label and removed the needs-triage label Nov 24, 2020

k8s-ci-robot assigned goku321 Nov 28, 2020

lingsamuel mentioned this issue Dec 2, 2020

apiserver add --lease-reuse-duration-seconds to config lease reuse duration #97009

Merged

lingsamuel mentioned this issue Dec 23, 2020

apiserver add lease object count metric #97480

Merged

lingsamuel mentioned this issue Jan 21, 2021

lease manager limit max objects attached to a lease #98257

Merged

k8s-ci-robot added the lifecycle/stale label Feb 28, 2021

wojtek-t closed this as completed Feb 28, 2021

mborsz mentioned this issue Mar 10, 2021

[1.20] Automated cherry pick of fixes for "large leases overload event etcd" issue (96836) #100084

Merged

This was referenced Mar 22, 2021

[1.19] Automated cherry pick of fixes for "large leases overload event etcd" issue (96836) #100450

Merged

[1.18] Automated cherry pick of fixes for "large leases overload event etcd" issue (96836) #100452

Merged

marseel mentioned this issue Jun 20, 2024

Define an official performance validation suite for etcd etcd-io/etcd#16467

Open

2 tasks
