
Reduce the memory footprint by optimizing Kubernetes informer cache #5099

Merged: 1 commit into projectcontour:main on Apr 27, 2023

Conversation

@tsaarni (Member) commented Feb 16, 2023

This PR tries to avoid storing data in the controller-runtime / client-go cache when we know for sure we do not need it.

While working on #5064 (comment) it was discovered that considerable time was spent processing Secrets of type helm.sh/release.v1. These Secrets contain the Helm release manifest and are just noise for Contour. Besides the CPU impact, there is also a memory footprint impact, since these Secrets can get large. See the related discussion in cert-manager/cert-manager#5220.

I was looking for ways to avoid caching objects that do not fulfil given criteria, and found one: SelectorsByObject. It is based on filtering on the API server side by using selectors. Unfortunately, selectors are not flexible enough for this use case.

Another approach was making the objects smaller by providing a custom transform function. The change presented in this PR uses a transform function to set Secret.data to nil for objects that are clearly not relevant to Contour.
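
For illustration, here is a minimal sketch of what such a transform function can look like with controller-runtime v0.14-style cache options. The helper name and the exact set of Secret types kept are my own assumptions for the example; the PR's actual diff may differ in detail.

import (
    corev1 "k8s.io/api/core/v1"
    ctrl_cache "sigs.k8s.io/controller-runtime/pkg/cache"
)

// cacheOptionsSketch builds cache options whose transform function drops the
// data of Secret types that Contour never reads, so only the small metadata
// and type information stay resident in the informer cache.
func cacheOptionsSketch() ctrl_cache.Options {
    return ctrl_cache.Options{
        TransformByObject: ctrl_cache.TransformByObject{
            &corev1.Secret{}: func(obj interface{}) (interface{}, error) {
                secret, ok := obj.(*corev1.Secret)
                if !ok {
                    // Unexpected input (e.g. a tombstone); pass it through untouched.
                    return obj, nil
                }
                switch secret.Type {
                case corev1.SecretTypeTLS, corev1.SecretTypeOpaque:
                    // Potentially referenced by ingress configuration; keep as-is.
                    return secret, nil
                default:
                    // e.g. helm.sh/release.v1 payloads are irrelevant to Contour.
                    secret.Data = nil
                    return secret, nil
                }
            },
        },
    }
}

In controller-runtime v0.14 the resulting Options can be turned into a NewCacheFunc with ctrl_cache.BuilderWithOptions and handed to the manager, which appears to be the mechanism discussed later in this thread.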

Yet another, more complicated approach: metadata-only watches are interesting from a security perspective. It is not desirable for a process to needlessly hold all of the cluster's Secrets in memory. As a side effect, this approach could also lower the memory footprint. Curiously, the "actual" data can find its way into metadata too, in the form of the last-applied-configuration annotation; see for example here and here. To solve this, one could combine metadata-only watches with a transform function.

NOTE 1: I have not found an obvious way to prove whether this change has a meaningful impact on the memory footprint.

NOTE 2: If this optimization has already been discussed before, sorry for the repeat!

codecov bot commented Feb 16, 2023

Codecov Report

Merging #5099 (41d2e39) into main (97703a8) will decrease coverage by 0.08%.
The diff coverage is 0.00%.

❗ Current head 41d2e39 differs from the pull request's most recent head c08ed32. Consider uploading reports for commit c08ed32 to get more accurate results.

Additional details and impacted files:

@@            Coverage Diff             @@
##             main    #5099      +/-   ##
==========================================
- Coverage   77.86%   77.79%   -0.08%     
==========================================
  Files         138      138              
  Lines       18215    17857     -358     
==========================================
- Hits        14184    13891     -293     
+ Misses       3757     3696      -61     
+ Partials      274      270       -4     
Impacted Files Coverage Δ
cmd/contour/serve.go 21.21% <0.00%> (+1.17%) ⬆️

... and 12 files with indirect coverage changes

github-actions bot commented Mar 3, 2023

The Contour project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 14d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, the PR is closed

You can:

  • Mark this PR as fresh by commenting or pushing a commit
  • Close this PR
  • Offer to help out with triage

Please send feedback to the #contour channel in the Kubernetes Slack

github-actions bot added the lifecycle/stale label on Mar 3, 2023
@ravilr commented Mar 8, 2023

xref #4788 (comment)

@tsaarni The issue above seems relevant here.

@ravilr commented Mar 8, 2023

I was looking for ways to avoid caching objects that do not fulfil given criteria, and found one: SelectorsByObject. It is based on filtering on the API server side by using selectors. Unfortunately, selectors are not flexible enough for this use case.

Helm-created Secrets of type helm.sh/release.v1 have the owner: helm label. Traefik (another ingress controller) uses an owner!=helm selector for its Secrets informer cache: https://github.com/traefik/traefik/blob/56f7515ecd8572c6a4c73f35e54b33414e6efb79/pkg/provider/kubernetes/ingress/client.go#L186

IMO, the selector approach filters at the Kubernetes API server itself (both the list and watch API calls have the selector pushed down to the API server), so Contour's informer cache and event handlers would never even see those objects. The transformer approach only nullifies the data on the client cache side, which still means work for the Contour controller.
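
For illustration, a hedged sketch of how that label-selector push-down could be expressed with controller-runtime v0.14 cache options (the helper name is invented for the example; this is not the actual Traefik or Contour code):

import (
    corev1 "k8s.io/api/core/v1"
    "k8s.io/apimachinery/pkg/labels"
    ctrl_cache "sigs.k8s.io/controller-runtime/pkg/cache"
)

// secretLabelSelectorOptions asks the API server to exclude Helm release
// Secrets (labelled owner=helm) from Secret list and watch calls, so they
// never reach the informer cache or the event handlers at all.
func secretLabelSelectorOptions() (ctrl_cache.Options, error) {
    sel, err := labels.Parse("owner!=helm")
    if err != nil {
        return ctrl_cache.Options{}, err
    }
    return ctrl_cache.Options{
        SelectorsByObject: ctrl_cache.SelectorsByObject{
            &corev1.Secret{}: {Label: sel},
        },
    }, nil
}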

github-actions bot removed the lifecycle/stale label on Mar 9, 2023
@tsaarni (Member, Author) commented Mar 9, 2023

Helm-created Secrets of type helm.sh/release.v1 have the owner: helm label. Traefik (another ingress controller) uses an owner!=helm selector for its Secrets informer cache: https://github.com/traefik/traefik/blob/56f7515ecd8572c6a4c73f35e54b33414e6efb79/pkg/provider/kubernetes/ingress/client.go#L186

Thank you @ravilr, great find! 👍

I had not considered addressing Helm specifically, but you are correct: doing so would allow a label-selector-based approach. Since Helm Secrets are probably the biggest and most well-known problem, it makes sense to address them specifically, even though from a security perspective it would still be desirable not to keep any extra Secrets resident in the process's memory.

I'll make an attempt with selectors to address the Helm issue.

@ravilr commented Mar 10, 2023

Another option is to filter based on a field selector, but it appears client-go doesn't support set-based requirements for field selectors (e.g. type IN ('Opaque', 'kubernetes.io/tls')).

But, this seems to work:
kubectl get secrets -A --field-selector=type!=helm.sh/release.v1,type!=kubernetes.io/service-account-token

The above also filters out ServiceAccount tokens, which are auto-refreshed every hour and can generate significant event traffic in clusters with a large number of ServiceAccount resources.
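
The same idea expressed as a field selector with controller-runtime v0.14 cache options, mirroring the kubectl command above; this is only a sketch with an invented helper name, and it assumes the API server accepts the type field selector for Secrets, as the command suggests:

import (
    corev1 "k8s.io/api/core/v1"
    "k8s.io/apimachinery/pkg/fields"
    ctrl_cache "sigs.k8s.io/controller-runtime/pkg/cache"
)

// secretFieldSelectorOptions excludes Helm release Secrets and ServiceAccount
// tokens at the API server, before they ever reach the client-side cache.
func secretFieldSelectorOptions() (ctrl_cache.Options, error) {
    sel, err := fields.ParseSelector("type!=helm.sh/release.v1,type!=kubernetes.io/service-account-token")
    if err != nil {
        return ctrl_cache.Options{}, err
    }
    return ctrl_cache.Options{
        SelectorsByObject: ctrl_cache.SelectorsByObject{
            &corev1.Secret{}: {Field: sel},
        },
    }, nil
}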

github-actions bot commented:

The Contour project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 14d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, the PR is closed

You can:

  • Mark this PR as fresh by commenting or pushing a commit
  • Close this PR
  • Offer to help out with triage

Please send feedback to the #contour channel in the Kubernetes Slack

github-actions bot added the lifecycle/stale label on Mar 25, 2023
@sunjayBhatia (Member) commented Apr 10, 2023

We got some reports of high memory usage between Contour versions, and from some experiments it seems that whether Helm Secrets are added to the DAG cache doesn't make much difference:

Setup:

  • Kind cluster
  • Contour v1.23.0 in one ns
  • Contour v1.24.3 in another
  • One HTTPProxy (reconciled by both)
  • Enable metrics-server

Initial resource usage via kubectl top pod:

v1.23.0:

NAME                      CPU(cores)   MEMORY(bytes)
contour-547659bf5-qqscz   14m          21Mi
contour-547659bf5-sp5dw   21m          25Mi
envoy-gxvnm               7m           32Mi

v1.24.3:

NAME                     CPU(cores)   MEMORY(bytes)
contour-6f944569-crrn7   25m          26Mi
contour-6f944569-tzdvz   15m          22Mi
envoy-kz2ct              7m           35Mi

Add Secrets of type helm.sh/release.v1, 50 at a time, each with a data field of 2^17 bytes (131072), the largest power of 2 that still allowed the Secret to be created.
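
(For reference, a rough client-go sketch of how such load Secrets might be generated; the names, namespace, and counts are illustrative assumptions, not the exact script behind these numbers.)

package main

import (
    "context"
    "crypto/rand"
    "fmt"

    corev1 "k8s.io/api/core/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
    "k8s.io/client-go/tools/clientcmd"
)

func main() {
    cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
    if err != nil {
        panic(err)
    }
    client, err := kubernetes.NewForConfig(cfg)
    if err != nil {
        panic(err)
    }

    // 2^17 bytes of random payload, matching the experiment description above.
    payload := make([]byte, 1<<17)
    if _, err := rand.Read(payload); err != nil {
        panic(err)
    }

    for i := 0; i < 600; i++ {
        secret := &corev1.Secret{
            ObjectMeta: metav1.ObjectMeta{
                Name:      fmt.Sprintf("helm-noise-%d", i), // hypothetical name
                Namespace: "default",
            },
            Type: "helm.sh/release.v1",
            Data: map[string][]byte{"release": payload},
        }
        if _, err := client.CoreV1().Secrets("default").Create(
            context.Background(), secret, metav1.CreateOptions{}); err != nil {
            panic(err)
        }
    }
}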

After 600 secrets created in batches, and the cluster sitting idle for a few minutes:

v1.23.0:

NAME                      CPU(cores)   MEMORY(bytes)
contour-547659bf5-qqscz   14m          258Mi
contour-547659bf5-sp5dw   20m          299Mi
envoy-gxvnm               6m           34Mi

v1.24.3:

NAME                     CPU(cores)   MEMORY(bytes)
contour-6f944569-crrn7   23m          301Mi
contour-6f944569-tzdvz   14m          259Mi
envoy-kz2ct              7m           36Mi

After 1000 secrets created in batches, and the cluster sitting idle for a few minutes:

v1.23.0:

NAME                      CPU(cores)   MEMORY(bytes)
contour-547659bf5-qqscz   13m          393Mi
contour-547659bf5-sp5dw   22m          434Mi
envoy-gxvnm               6m           34Mi

v1.24.3:

NAME                     CPU(cores)   MEMORY(bytes)
contour-6f944569-crrn7   23m          435Mi
contour-6f944569-tzdvz   14m          395Mi
envoy-kz2ct              6m           36Mi

After 3000 secrets created in batches, and the cluster sitting idle for a few minutes:

v1.23.0:

NAME                      CPU(cores)   MEMORY(bytes)
contour-547659bf5-qqscz   13m          1051Mi
contour-547659bf5-sp5dw   21m          1070Mi
envoy-gxvnm               6m           34Mi

v1.24.3:

NAME                     CPU(cores)   MEMORY(bytes)
contour-6f944569-crrn7   23m          1105Mi
contour-6f944569-tzdvz   14m          1070Mi
envoy-kz2ct              6m           36Mi

github-actions bot removed the lifecycle/stale label on Apr 11, 2023
@luisdavim commented:

I think limiting what is cached to what is actually needed is a good idea.

@tsaarni (Member, Author) commented Apr 11, 2023

@sunjayBhatia Are you referring to #4788, and do you mean that unrelated Secrets are part of the DAG cache as well as the client-go cache? Could the reason for the smallish difference be that they both share the same underlying data?

Still, the figures you included seem to indicate an increase in footprint from an initial ~25Mi to ~1000Mi just from adding unrelated Secrets to the cluster, so from that perspective the impact of caching unrelated objects at the client-go level is huge.

I found it a bit surprising that there is no way to write a "predicate" function that controls caching on the client side, or to manage the cache footprint in general, beyond the approaches I mentioned in the description.

@sunjayBhatia (Member) commented Apr 11, 2023

@sunjayBhatia Are you referring to #4788, and do you mean that unrelated Secrets are part of the DAG cache as well as the client-go cache? Could the reason for the smallish difference be that they both share the same underlying data?

Still, the figures you included seem to indicate an increase in footprint from an initial ~25Mi to ~1000Mi just from adding unrelated Secrets to the cluster, so from that perspective the impact of caching unrelated objects at the client-go level is huge.

I found it a bit surprising that there is no way to write a "predicate" function that controls caching on the client side, or to manage the cache footprint in general, beyond the approaches I mentioned in the description.

Sorry, yes. To clarify: whether or not we explicitly add references to the Secrets does not seem to make a difference in memory usage; the client-go informer cache still contains this content.

Agreed that we should try to make sure unrelated content doesn't show up in the informer caches, but it seems that simply filtering what is added to the DAG cache doesn't have a material impact on memory usage.

@sunjayBhatia (Member) commented:

With PR 5099 (rebased on main):

Initial resource usage:

NAME                      CPU(cores)   MEMORY(bytes)
contour-869c8b576-h6gqc   1m           14Mi
contour-869c8b576-xhzcw   2m           17Mi
envoy-z6ffk               8m           29Mi

After 600 secrets created in batches, and the cluster sitting idle for a few minutes:

NAME                      CPU(cores)   MEMORY(bytes)
contour-869c8b576-h6gqc   1m           15Mi
contour-869c8b576-xhzcw   2m           18Mi
envoy-z6ffk               8m           29Mi

After 1000 secrets created in batches, and the cluster sitting idle for a few minutes:

NAME                      CPU(cores)   MEMORY(bytes)
contour-869c8b576-h6gqc   1m           15Mi
contour-869c8b576-xhzcw   2m           19Mi
envoy-z6ffk               7m           29Mi

After 3000 secrets created in batches, and the cluster sitting idle for a few minutes:

NAME                      CPU(cores)   MEMORY(bytes)
contour-869c8b576-h6gqc   1m           16Mi
contour-869c8b576-xhzcw   2m           19Mi
envoy-z6ffk               7m           29Mi

During secret creation, memory usage does increase slightly, but I never observed anything above 30Mi.

@tsaarni (Member, Author) commented Apr 11, 2023

Oh, that is cool @sunjayBhatia!

I was previously using pprof and somehow failed to see this; I must have misinterpreted the results. Since the change only sets the data field to nil, the object stays in the cache, just without its content.

I can resurrect this PR if wanted?

(A review thread on cmd/contour/serve.go was marked outdated and resolved.)
@skriss (Member) commented Apr 11, 2023

I can resurrect this PR if wanted?

+1, I think this would be a great optimization to get in.

@skriss (Member) commented Apr 11, 2023

FYI, I'm planning to merge #5214 shortly and it looks like this will conflict. Actually, before I merge, let's confirm that these two are compatible with each other.

@tsaarni (Member, Author) commented Apr 11, 2023

FYI, I'm planning to merge #5214 shortly and it looks like this will conflict. Actually, before I merge, let's confirm that these two are compatible with each other.

Apparently they are not compatible :(

BuilderWithOptions combines user-provided options with the default options (link), but MultiNamespacedCacheBuilder does not seem to provide a means to pass these extra parameters (link). I need to check tomorrow how to compose these into a NewCacheFunc that considers both aspects.

Meanwhile, I will push a rebased and cleaned-up version of this PR.

tsaarni added the release-note/small label on Apr 11, 2023
tsaarni changed the title from "Experiment: Attempt to reduce the memory footprint of Kubernetes cache" to "Reduce the memory footprint by optimizing Kubernetes informer cache" on Apr 11, 2023
@tsaarni tsaarni marked this pull request as ready for review April 11, 2023 16:47
@tsaarni tsaarni requested a review from a team as a code owner April 11, 2023 16:47
@tsaarni tsaarni requested review from stevesloka and skriss and removed request for a team April 11, 2023 16:47
@tsaarni (Member, Author) commented Apr 11, 2023

There is a forthcoming breaking change in the controller-runtime API. To my surprise, both the multi-namespace and transform-function use cases seem to be impacted by it. For example, kubernetes-sigs/controller-runtime/pull/2157 deprecates MultiNamespacedCacheBuilder(), which is used in #5214. The API change was merged, and further refactoring has happened since then; for example, the API this PR uses for transformations seems to have changed too.

I have not yet tested with controller-runtime main, but it seems feasible that all these changes might work out for the best after all, removing any conflict between this PR and #5214. None of this seems to target the latest 0.14 release track, which Contour currently uses.

// This is useful for saving memory by removing fields that are not needed by Contour.
TransformByObject: ctrl_cache.TransformByObject{
    &corev1.Secret{}: func(obj interface{}) (interface{}, error) {
        secret, _ := obj.(*corev1.Secret)
A project member commented on the snippet above:

The godoc for TransformFunc says:

// TransformFunc (similarly to ResourceEventHandler functions) should be able
// to correctly handle the tombstone of type cache.DeletedFinalStateUnknown

See e.g. https://github.com/projectcontour/contour/blob/main/internal/dag/cache.go#L279-L280 for how we handle that in the DAG cache currently - I guess we should do something similar here.
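
In other words, the transform should tolerate non-Secret inputs rather than assume the type assertion always succeeds. A minimal sketch of that guard (illustrative only, not necessarily the exact code that ended up in the PR):

import (
    corev1 "k8s.io/api/core/v1"
)

// stripSecretData is shaped like a client-go cache.TransformFunc: it only
// touches objects that really are *corev1.Secret and passes anything else,
// for example a cache.DeletedFinalStateUnknown tombstone, through unchanged.
func stripSecretData(obj interface{}) (interface{}, error) {
    secret, ok := obj.(*corev1.Secret)
    if !ok {
        return obj, nil
    }
    if secret.Type == "helm.sh/release.v1" {
        secret.Data = nil
    }
    return secret, nil
}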

@tsaarni (Member, Author) replied on Apr 26, 2023:

Thanks, I had missed this!

I added a check for the success of the type assertion for both TransformByObject and DefaultTransform. I just pass the original object through if the assertion fails.

In the case of TransformByObject the function is explicitly registered for the type corev1.Secret, but I did not find anything suggesting it could not be passed a cache.DeletedFinalStateUnknown as well, so it makes sense to add the check.

Though I'm a bit confused about why the TransformFunc would ever be passed a cache.DeletedFinalStateUnknown, if the tombstone object is associated with OnDelete. I thought the transformation happens when objects are inserted into the cache, not when they are removed. Maybe the transformers are also called on delete 🤨.

(A review thread on cmd/contour/serve.go was marked outdated and resolved.)
Added a transform function that will reduce the size of the objects before
they hit the informer cache. The optimization is targeting Secrets that
are unrelated to any ingress use case.

Signed-off-by: Tero Saarni <tero.saarni@est.tech>
@skriss (Member) left a review:

LGTM, thanks @tsaarni! My preference would be to get this PR merged for the upcoming 1.25 release, and then to follow up later with any refactoring needed once the new controller-runtime version comes out to enable this plus #5214.

@skriss skriss added this to the 1.25.0 milestone Apr 26, 2023
@skriss skriss merged commit 2c0fc61 into projectcontour:main Apr 27, 2023
Labels: release-note/small (A small change that needs one line of explanation in the release notes.)