kube-apiserver persistently high memory usage with large number of CRDs #2920
Labels
high-priority: Issues we intend to prioritize (security, outage, blocking bug)
matthchr added the high-priority label on May 4, 2023
matthchr added a commit to matthchr/azure-service-operator that referenced this issue on May 23, 2023:
Fixes Azure#1433. Fixes Azure#2920.
- By default, no CRDs are installed.
- If no CRDs are installed, the operator pod will exit with an error stating that there are no CRDs.
- Operator pod --crd-pattern command-line argument will now accept more than '*'.
- The --crd-pattern argument means NEW CRDs to install. Existing CRDs in the cluster will always be upgraded. This means that upgrading an existing ASO installation without specifying any new CRDs will upgrade all of the existing CRDs and install no new CRDs.
matthchr added four more commits to matthchr/azure-service-operator that referenced this issue, each with the same commit message: one on May 23, 2023 and three on May 24, 2023.
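The selection rule described in the commit message (existing CRDs are always upgraded, the pattern only adds new ones, and an empty result is an error) can be sketched roughly as follows. This is an illustration only, not ASO's implementation: the function name `crds_to_apply`, the use of bash glob matching, and the CRD names in the usage line are all assumptions made for the example.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the --crd-pattern behaviour described in the commit message.
# Glob matching via [[ == ]] is an assumption, not ASO's actual pattern syntax.

# usage: crds_to_apply PATTERN "EXISTING CRDS..." "AVAILABLE CRDS..."
crds_to_apply() {
  local pattern=$1 existing=" $2 " crd selected=""
  for crd in $3; do
    if [[ $existing == *" $crd "* ]]; then
      selected+="$crd "   # already in the cluster: always upgraded
    elif [[ -n $pattern && $crd == $pattern ]]; then
      selected+="$crd "   # newly requested by the pattern
    fi
  done
  if [[ -z $selected ]]; then
    # Nothing existing and nothing matched: mirror the "no CRDs" error exit.
    echo "error: no CRDs to install" >&2
    return 1
  fi
  echo "${selected% }"
}

# Illustrative names only: one CRD already installed, one new group requested.
crds_to_apply 'resources.azure.com/*' \
  'storage.azure.com/StorageAccount' \
  'resources.azure.com/ResourceGroup storage.azure.com/StorageAccount network.azure.com/VirtualNetwork'
```

With an empty pattern and no existing CRDs the function returns a non-zero status, which corresponds to the operator pod exiting with an error.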
github-project-automation (bot) moved this from Backlog to Recently Completed in Azure Service Operator Roadmap on May 24, 2023
matthchr moved this from Recently Completed to Ready for Release in Azure Service Operator Roadmap on May 30, 2023
kube-apiserver uses a surprisingly large amount of memory when CRDs are added.
ASO v2.0.0 installs ~125 CRDs. Quick testing on a local kind cluster (version 1.26.3) shows that resident memory grows as follows: 304 MB (empty cluster) -> 380 MB (cert-manager installed) -> 2.5 GB (ASO just installed) -> 2.0 GB (steady state).
This means that each of our 125 CRDs accounts for ~13 MB of memory usage in kube-apiserver.
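The per-CRD figure can be reproduced from the numbers quoted above. A minimal sketch, assuming the 380 MB reading (cert-manager installed) is the baseline and the 2.0 GB steady-state reading is the endpoint; the issue does not say which readings were used, and `per_crd_mb` is a name introduced here for illustration:

```shell
#!/usr/bin/env bash
# Rough per-CRD memory estimate from the figures quoted in this issue.
# Assumption: baseline = 380 MB (before ASO), endpoint = 2000 MB (steady state).

# usage: per_crd_mb STEADY_STATE_MB BASELINE_MB CRD_COUNT
per_crd_mb() {
  awk -v s="$1" -v b="$2" -v n="$3" 'BEGIN { printf "%.1f\n", (s - b) / n }'
}

per_crd_mb 2000 380 125   # (2000 - 380) / 125, in MB per CRD
```

Using the 2.5 GB peak instead of the steady-state value would give a higher estimate, so the ~13 MB figure is best read as the steady-state cost.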
Customer impact
It's very easy to exceed a managed cluster's kube-apiserver memory limits. If this happens, kube-apiserver may get OOMKilled by the managed cluster provider (such as AKS). This has negative downstream effects on nearly everything in the cluster, including pod restarts triggered by expired watches, failed Kubernetes API requests, and so on.
Quick instructions for profiling kind apiserver
kind
.aso
kubectl proxy --port=8080 &
go tool pprof -png "http://localhost:8080/debug/pprof/heap" > out.png
- The actual trailing path can be anything: /debug/pprof/profile, /debug/pprof/heap, etc.
- curl localhost:8080/debug/pprof/heap > out.pprof and then analyze it with go tool pprof out.pprof
Experiments
v1beta versions are removed?
kube-apiserver memory usage is coming from
Prior art
Some key snippets from Crossplane's summary:
On large CRDs:
Proposed fixes
Standard (previously Paid) tier, and not Free tier.