Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Auth performance and maintenance improvements #16036

Open
tjungblu opened this issue Jun 8, 2023 · 7 comments
Open

Auth performance and maintenance improvements #16036

tjungblu opened this issue Jun 8, 2023 · 7 comments

Comments

@tjungblu
Copy link
Contributor

tjungblu commented Jun 8, 2023

What would you like to be added?

(Below applies to only 3.6, so current state of the main branch)

After investigating performance problems in #15993 and implementing tests in #16005 we found several things that could be improved. This feature request should track what we want to improve on, also for further discussion on what not.

Opportunities for improvement

1. Testability of the auth package

As seen in #16005, it's currently not even possible to fully mock out the required dependencies to run an authstore. While the changes in the PR are a good start, there can be some more convenience methods around supplying users, passwords and other state necessary to test the auth applier properly.

2. Type conversion in Lease Keys

(Not directly related to auth)

The use of []string lease keys instead of the more common []byte in etcd is causing heavy GC and type conversion costs as seen in #15993.
AFAIU, this has been done in order to use a golang map, which does not work on []byte.

Let's discuss, whether it makes sense to have an alternative map implementation that supports this, or increase the memory usage and store the []byte alongside (eg in a parallel map, even though that might not actually save all/enough conversion costs).

3. Too much coupling in code that's adjacent to Leases

(Not strictly related to auth, also not exhaustive)

One example is that the Lease struct is not fully encapsulated, there are many places where the lock is taken and acquired and the itemSet is entirely replaced or updated.

https://github.com/etcd-io/etcd/blob/main/server/lease/lease.go#L37-L39

This is causing problems, because code using/testing the leases from other packages can't change that state. Going through the lessor requires a fully functioning backend/cluster and is cumbersome to mock.

Just from another PR, #16035 is touching a similar path which seems to be a memory leak and showcases that coupling pretty well.

4. Avoid using auth applier when auth is disabled

Currently the authApplier is part of the uberApplier chain, which is unnecesary overhead when auth isn't even enabled.
Auth can be enabled at runtime, so it makes sense to make it swap in and out depending on the current state.

Controlling whether auth is enabled through this chain allows us to avoid checking the same in the authStore all the time (under a lock in every method).

5. Test coverage in apply_auth.go is zero

As figured in #16005, there are no unit tests, even in the whole package there are none. This is risky, especially because the auth applier contains a lot of ACL logic.

Why is this needed?

I'll leave this up for a general debate, but I think each point individually can be its own ticket. Some of those refactorings can also be good for new joiners to the project, adding tests is always helpful to start with a codebase.

@serathius
Copy link
Member

serathius commented Jun 8, 2023

+1 for testing.

For the other things I don't have strong opinion. But if that will motivate adding a lot of tests then I'm for it.

PS: Please start from adding tests.

@CaojiamingAlan
Copy link
Contributor

CaojiamingAlan commented Jun 8, 2023

I'd like to help adding tests to appliers. Still relatively new to this part of code.

@chaochn47
Copy link
Member

Thanks @CaojiamingAlan, assigned to you.

@serathius
Copy link
Member

Note, appliers is the most fragile part of etcd code. We really try to avoid changes there until we have enough test coverage. Please prioritize designing and proper testing before anything else or your PRs might be rejected due to being too risky.

@CaojiamingAlan
Copy link
Contributor

I have a question about the how to organize the tests. Do we need to mock the all the dependencies from scratch, like what store_mock_test.go does, or should we create real backends and other stuff, like what's provided by betesting.go? It seems that both approaches are used in the project.

@serathius
Copy link
Member

Prefer to avoid mocks in this project as contracts in the etcdserver are not set in stone. See #16035 (comment) for an example of mocks behavior different behavior leading to subtile bugs like memory leak. Prefer to test on real backend, it's more resource intensive, but correct. We can worry about resource and test run time when it becomes a problem.

@CaojiamingAlan
Copy link
Contributor

CaojiamingAlan commented Jun 28, 2023

Found a potential bug #16150 when trying to add tests for quotaApplier.

iuriatan added a commit to iuriatan/etcd that referenced this issue Jul 13, 2023
* add `put` tests on quota applier

Following discussions on  etcd-io#16036 and etcd-io#16150, this patch aims to improve
appliers test coverage.

Signed-off-by: iuriatan <iuriatan@gmail.com>
iuriatan added a commit to iuriatan/etcd that referenced this issue Jul 14, 2023
* add `put` tests on quota applier

Following discussions on  etcd-io#16036 and etcd-io#16150, this patch aims to improve
appliers test coverage.

Signed-off-by: iuriatan <iuriatan@gmail.com>
iuriatan added a commit to iuriatan/etcd that referenced this issue Jul 14, 2023
* add `put` tests on quota applier

Following discussions on etcd-io#16036 and etcd-io#16150, this patch aims to improve
appliers test coverage.

Signed-off-by: iuriatan <iuriatan@gmail.com>
iuriatan added a commit to iuriatan/etcd that referenced this issue Jul 14, 2023
* add `put` tests on quota applier

Following discussions on etcd-io#16036 and etcd-io#16150, this patch improves appliers
test coverage.

Signed-off-by: iuriatan <iuriatan@gmail.com>
iuriatan added a commit to iuriatan/etcd that referenced this issue Jul 15, 2023
* add `put` tests on quota applier

Following discussions on etcd-io#16036 and etcd-io#16150, this patch improves appliers
test coverage.

Signed-off-by: iuriatan <iuriatan@gmail.com>
iuriatan added a commit to iuriatan/etcd that referenced this issue Jul 15, 2023
* add `put` tests on quota applier

Following discussions on etcd-io#16036 and etcd-io#16150, this patch improves appliers
test coverage.

Signed-off-by: iuriatan <iuriatan@gmail.com>
iuriatan added a commit to iuriatan/etcd that referenced this issue Jul 15, 2023
* add `put` tests on quota applier

Following discussions on etcd-io#16036 and etcd-io#16150, this patch improves appliers
test coverage.

Signed-off-by: iuriatan <iuriatan@gmail.com>
iuriatan added a commit to iuriatan/etcd that referenced this issue Jul 27, 2023
* add `put` tests on quota applier

Following discussions on etcd-io#16036 and etcd-io#16150, this patch improves appliers
test coverage.

Signed-off-by: iuriatan <iuriatan@gmail.com>
iuriatan added a commit to iuriatan/etcd that referenced this issue Aug 10, 2023
* add `put` tests on quota applier

Following discussions on etcd-io#16036 and etcd-io#16150, this patch improves appliers
test coverage.

Signed-off-by: iuriatan <iuriatan@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

5 participants