etcd cluster with a learner is not supported #26

Closed
git-yww opened this issue Oct 17, 2023 · 7 comments · Fixed by #28
Labels
bug Something isn't working

Comments

Contributor

git-yww commented Oct 17, 2023

We tried to use etcd-defrag to defragment our etcd clusters, and it failed quickly because the learner node in the cluster does not support the health check.

Here is the execution log:

Validating configuration.
Validating the defragmentation rule: dbQuotaUsage > 0.8 || dbSizeFree/dbQuotaUsage > 0.5 ... valid
Performing health check.
{"level":"warn","ts":"2023-10-12T17:51:18.358902+0800","logger":"client","caller":"v3@v3.6.0-alpha.0.0.20230803155134-cca200345ab2/retry_interceptor.go:65","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc00030a000/11.11.11.11:2379","method":"/etcdserverpb.KV/Range","attempt":0,"error":"rpc error: code = Unavailable desc = etcdserver: rpc not supported for learner"}
endpoint: https://11.11.11.11:2379/, health: false, took: 7.499546ms, error: etcdserver: rpc not supported for learner
endpoint: https://33.33.33.33:2379/, health: true, took: 7.733879ms, error:
endpoint: https://44.44.44.44:2379/, health: true, took: 9.555876ms, error:
endpoint: https://55.55.55.55:2379/, health: true, took: 10.164246ms, error:
endpoint: https://66.66.66.66:2379/, health: true, took: 9.741549ms, error:
endpoint: https://22.22.22.22:2379/, health: true, took: 43.014812ms, error:

So is this an ongoing issue?

Owner

ahrtr commented Oct 17, 2023

The workaround for now is to remove the learner from the --endpoints. Eventually the learner will be promoted to a voting member, right?

Contributor Author

git-yww commented Oct 17, 2023

> The workaround for now is to remove the learner from the --endpoints. Eventually the learner will be promoted to a voting member, right?

This did not work in our tests. The health check still queries every member in the cluster, regardless of what --endpoints specifies.

ahrtr added the bug label on Oct 17, 2023
Owner

ahrtr commented Oct 18, 2023

Thanks for raising this issue. Learner members can only serve statusRequest and serializable read requests. Refer to util.go#L141-L150
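For illustration, the rule stated above can be modeled as follows. This is a simplified sketch with hypothetical local types (`StatusRequest`, `RangeRequest`, `PutRequest`) and a hypothetical function name (`supportedForLearner`); it is not the etcd source, and it only mirrors the stated rule that a learner serves status requests and serializable reads.

```go
package main

import "fmt"

// Hypothetical stand-ins for etcd request types; not the real definitions.
type StatusRequest struct{}

type RangeRequest struct {
	Serializable bool // false means a linearizable read
}

type PutRequest struct{}

// supportedForLearner models the rule from the comment above: a learner
// serves only status requests and serializable range (read) requests.
func supportedForLearner(req interface{}) bool {
	switch r := req.(type) {
	case *StatusRequest:
		return true
	case *RangeRequest:
		return r.Serializable
	default:
		return false
	}
}

func main() {
	fmt.Println(supportedForLearner(&StatusRequest{}))                  // true
	fmt.Println(supportedForLearner(&RangeRequest{Serializable: true})) // true
	fmt.Println(supportedForLearner(&RangeRequest{}))                   // false
	fmt.Println(supportedForLearner(&PutRequest{}))                     // false
}
```

Under this model, a Range request that is not marked serializable is rejected by a learner, which would be consistent with the `rpc not supported for learner` error on `/etcdserverpb.KV/Range` in the log.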

So the solution is to programmatically remove learner members from the endpoint list. Would you be interested in delivering a PR?
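A minimal sketch of that filtering step, assuming the member list has already been fetched. The `Member` struct and `votingEndpoints` function are hypothetical names used for illustration, with fields modeled on what etcd's member list reports (client URLs and a learner flag). This is not the code from the fix in #28.

```go
package main

import "fmt"

// Member is a hypothetical stand-in holding only the fields of interest;
// the member type returned by etcd's member list API has more fields.
type Member struct {
	Name       string
	ClientURLs []string
	IsLearner  bool
}

// votingEndpoints returns the client URLs of all non-learner members.
func votingEndpoints(members []Member) []string {
	var eps []string
	for _, m := range members {
		if m.IsLearner {
			continue // learners reject the health-check RPC, so skip them
		}
		eps = append(eps, m.ClientURLs...)
	}
	return eps
}

func main() {
	members := []Member{
		{Name: "m1", ClientURLs: []string{"https://11.11.11.11:2379"}, IsLearner: true},
		{Name: "m2", ClientURLs: []string{"https://22.22.22.22:2379"}},
		{Name: "m3", ClientURLs: []string{"https://33.33.33.33:2379"}},
	}
	fmt.Println(votingEndpoints(members))
}
```

With the sample data, only the two voting members' URLs remain, so the health check and defragmentation would never contact the learner.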

Contributor Author

git-yww commented Oct 18, 2023

Sure, I’ll take care of it.

Owner

ahrtr commented Oct 18, 2023

> Sure, I’ll take care of it.

Thanks. Assigned to you.

Owner

ahrtr commented Oct 19, 2023

Will release a new version today.

Owner

ahrtr commented Oct 19, 2023

FYI. https://github.com/ahrtr/etcd-defrag/releases/tag/v0.7.0

docker pull ghcr.io/ahrtr/etcd-defrag:v0.7.0
