Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf(accesscontrol,schema): replace LRUExpire cache #247

Draft
wants to merge 12 commits into
base: master
Choose a base branch
from

Conversation

aruiz14
Copy link
Contributor

@aruiz14 aruiz14 commented Jul 18, 2024

Stacked on #246

We are investigating high memory usage due to accesscontrol.AccessSet and types.APISchemas, which correlate with the objects being cached by accesscontrol.AccessStore and schema.Collection respectively. Understanding how memory is actually distributed is hard, since the size of each object depends on the number of permissions granted to every user.

Something I noticed is that the LRUExpireCache from k8s.io/apimachinery, which is being used in these packages, only removes items when explicitly requested, but does not "actively" purge expired values from the cache. It only removes them when necessary. In practice, considering that our code never deletes items from the cache (only replaces them), this means that the size of the cache will slowly grow until reaching the maximum configured, then replace expired items one by one as needed.

In contrast, the same library also provides an Expiring implementation (without LRU mechanism), that does have a GC mechanism to prune all expired entries, not only one by one.
One downside of this is that, since it does not keep a list of recent accesses, it is not possible to set a maximum cache size (my proposal here is to just log when that happens). Also, this makes it more important to set a sensible value for the entries TTL.
This makes this option not ideal, but I think it's still a reasonably good and immediate alternative, which does not involve adding a new dependency or adding or very own cache implementation (as too many exist already).

Also, I added a way to revert the cache to the previous implementation, by setting the environment variable CATTLE_STEVE_CACHE_BACKEND to lru or LRU.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant