Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Etcd not respond to progress notification on newly created watch RPC until there is a event in a watch #17507

Closed
4 tasks done
serathius opened this issue Feb 28, 2024 · 7 comments · Fixed by #17557
Closed
4 tasks done
Assignees
Labels
priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. type/bug

Comments

@serathius
Copy link
Member

Bug report criteria

What happened?

When trying make K8s depend on progress notification I found an broken edge case when working on kubernetes/kubernetes#123513

What I found is that #15237 introduced a bug with deferredProgress logic.

If a progress request is send when watch is not synced (like when it's just newly created watch), etcd mark progress notification as deferred, and stop responding to progress notification request.

Progress notification can be later restored but only if there is an event to be sent. So if there is no event observed in the watch, etcd will never respond to progress notification requests.

What did you expect to happen?

Don't expect event to arrive, send the progress notification when watchers become synced.

How can we reproduce it (as minimally and precisely as possible)?

Take PR.

Run make test WHAT=./vendor/k8s.io/apiserver/pkg/storage/... GOFLAGS="-v" KUBE_TEST_ARGS='--count 10 --run TestListWithListFromCache --failfast'

Anything else we need to know?

No response

Etcd version (please run commands below)

Reproduced on v3.5.10 embedded server

Etcd configuration (command line flags or environment variables)

paste your configuration here

Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)

$ etcdctl member list -w table
# paste output here

$ etcdctl --endpoints=<member list> endpoint status -w table
# paste output here

Relevant log output

No response

@serathius serathius added type/bug priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels Feb 28, 2024
@serathius
Copy link
Member Author

Note, I don't expect this to be a super impactful thing, however definitely a blocker for Kubernetes to adopt and depend on manually requested progress notifications.

@ahrtr
Copy link
Member

ahrtr commented Feb 29, 2024

What I found is that #15237 introduced a bug with deferredProgress logic.

If a progress request is send when watch is not synced (like when it's just newly created watch), etcd mark progress notification as deferred, and stop responding to progress notification request.

Did not dig into the source code yet. But based on your input, It should be expected behaviour. I do not see any issue here. The progress notification events shouldn't be sent out until all events up to a revision have been already delivered. It's exactly what the Bookmarkable means. Please let me know if I misunderstood your point.

Take PR.

Run make test WHAT=./vendor/k8s.io/apiserver/pkg/storage/... GOFLAGS="-v" KUBE_TEST_ARGS='--count 10 --run TestListWithListFromCache --failfast'

Please do not expect every etcd contributor is familiar with K8s test case or has time to dig into K8s test case. Please create an dedicated etcd test case to clearly clarify the issue if you still think it's an etcd issue.

@ahrtr
Copy link
Member

ahrtr commented Mar 4, 2024

Can we close this ticket?

@serathius
Copy link
Member Author

serathius commented Mar 4, 2024

No, this is a big issue that blocks usage of manual progress notification in K8s.

Will work on this when I fix kubernetes/kubernetes#123532

@serathius serathius self-assigned this Mar 4, 2024
@ahrtr
Copy link
Member

ahrtr commented Mar 4, 2024

No, this is a big issue that blocks usage of manual progress notification in K8s.

What issue? Let me know if you have comment on #17507 (comment)

@serathius
Copy link
Member Author

Etcd not respond to progress notification on newly created watch RPC until there is a event in a watch

@ahrtr
Copy link
Member

ahrtr commented Mar 4, 2024

I already read your description, as I mentioned in #17507 (comment), it's expected behaviour.

I do not see any issue here. The progress notification events shouldn't be sent out until all events up to a revision have been already delivered. It's exactly what the Bookmarkable means. Please let me know if I misunderstood your point.

serathius added a commit to serathius/kubernetes that referenced this issue Mar 16, 2024
serathius added a commit to serathius/kubernetes that referenced this issue Apr 23, 2024
serathius added a commit to serathius/kubernetes that referenced this issue Apr 23, 2024
serathius added a commit to serathius/kubernetes that referenced this issue Apr 23, 2024
serathius added a commit to serathius/kubernetes that referenced this issue Apr 24, 2024
serathius added a commit to serathius/kubernetes that referenced this issue Apr 24, 2024
serathius added a commit to serathius/kubernetes that referenced this issue Apr 24, 2024
serathius added a commit to serathius/kubernetes that referenced this issue Apr 24, 2024
k8s-ci-robot added a commit to kubernetes/kubernetes that referenced this issue Apr 24, 2024
k8s-publishing-bot pushed a commit to kubernetes/apiserver that referenced this issue Apr 24, 2024
Kubernetes-commit: a08d1b5f3286c6f3698abf59022055dc0b4b922f
k8s-publishing-bot added a commit to kubernetes/apiserver that referenced this issue Apr 24, 2024
Remove workarounds for etcd-io/etcd#17507

Kubernetes-commit: 9c4d207d185da5a377ec1a9e92d8b71edb75085c
jingczhang pushed a commit to nokia/kubernetes that referenced this issue May 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. type/bug
Development

Successfully merging a pull request may close this issue.

2 participants