Enable serving supervisor metrics #10019

Merged (3 commits) on May 28, 2024

Conversation

brandond

@brandond brandond commented Apr 25, 2024

Proposed Changes

  • Refactor agent supervisor listener startup and authn/authz to use upstream auth delegators, which perform a SubjectAccessReview for access to metrics.
  • Convert spegel and pprof handlers over to new structure.
  • Promote --enable-pprof to agent flag to allow profiling agents. Access to the pprof endpoint now requires client cert auth, similar to the spegel registry api endpoint.
  • Add --supervisor-metrics server flag that configures all cluster members (both servers and agents) to serve metrics at 0.0.0.0:6443/metrics.
  • The pprof and metrics endpoints use the same listener as spegel. Enabling any one of the three features causes the listener to be enabled.

This is required to expose supervisor metrics on rke2. On k3s, the metrics will be the same as those currently available from the kubelet and apiserver metrics endpoints.

Types of Changes

enhancement

Verification

metrics:

  1. Start servers with `--supervisor-metrics`.
  2. Run `curl -vks --cert /var/lib/rancher/k3s/server/tls/client-admin.crt --key /var/lib/rancher/k3s/server/tls/client-admin.key https://node:6443/metrics` against agents and servers.
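
The metrics check above can be repeated across several nodes with a small wrapper. This is a minimal sketch, not part of the PR: the helper names `supervisor_url` and `check_metrics` are hypothetical, and the certificate paths and port are simply the ones used in the command above.

```shell
# Defaults taken from the verification command; override via environment.
CERT="${CERT:-/var/lib/rancher/k3s/server/tls/client-admin.crt}"
KEY="${KEY:-/var/lib/rancher/k3s/server/tls/client-admin.key}"
SUPERVISOR_PORT="${SUPERVISOR_PORT:-6443}"

# Hypothetical helper: build the supervisor URL for a node ($1) and path ($2).
supervisor_url() {
    printf 'https://%s:%s%s\n' "$1" "$SUPERVISOR_PORT" "$2"
}

# Hypothetical helper: print "<node> <HTTP status>" for a node's /metrics endpoint.
# -k skips server certificate verification, as in the original command.
check_metrics() {
    metrics_node="$1"
    metrics_code=$(curl -ks -o /dev/null -w '%{http_code}' \
        --cert "$CERT" --key "$KEY" \
        "$(supervisor_url "$metrics_node" /metrics)")
    printf '%s %s\n' "$metrics_node" "$metrics_code"
}
```

For example, `for n in server-1 agent-1; do check_metrics "$n"; done` prints one status line per node. The node names here are placeholders.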

pprof:

  1. Start a node (agent or server) with `--enable-pprof`.
  2. Run `curl -vks --cert /var/lib/rancher/k3s/server/tls/client-admin.crt --key /var/lib/rancher/k3s/server/tls/client-admin.key https://node:6443/debug/pprof/` against the node with pprof enabled.
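
Beyond listing the pprof index, a single profile can be saved for offline analysis. The sketch below assumes the endpoint serves the standard Go `net/http/pprof` profiles (such as `heap` and `goroutine`) under `/debug/pprof/`; the PR text only mentions the index path, and the helper name `fetch_profile` is hypothetical.

```shell
# Defaults taken from the verification command; override via environment.
CERT="${CERT:-/var/lib/rancher/k3s/server/tls/client-admin.crt}"
KEY="${KEY:-/var/lib/rancher/k3s/server/tls/client-admin.key}"

# Hypothetical helper: download one profile from a node started with --enable-pprof.
#   $1 = node hostname, $2 = profile name (assumed: heap, goroutine, ...), $3 = output file
# -f makes curl exit non-zero on HTTP errors such as 401 or 403.
fetch_profile() {
    curl -ksf --cert "$CERT" --key "$KEY" \
        -o "$3" "https://$1:6443/debug/pprof/$2"
}
```

A possible invocation is `fetch_profile agent-1 heap heap.pprof && go tool pprof heap.pprof`, where `agent-1` is a placeholder hostname.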

Testing

Linked Issues

User-Facing Change

`--enable-pprof` can now be set on agents to enable the debug/pprof endpoints. When set, agents will listen on the supervisor port.
`--supervisor-metrics` can now be set on servers to enable serving internal metrics on the supervisor endpoint. When set, agents will also listen on the supervisor port.
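
As an illustration of where the two options apply, the sketch below appends them to a k3s config file instead of passing them on the command line. It relies on k3s reading flags from its config file by flag name without the leading dashes. The helper name `enable_supervisor_endpoints` and the example paths are hypothetical.

```shell
# Hypothetical helper: append the new options to a k3s config file.
#   $1 = path to config file, $2 = node role ("server" or "agent")
enable_supervisor_endpoints() {
    cfg_file="$1"
    cfg_role="$2"
    mkdir -p "$(dirname "$cfg_file")"
    # pprof is enabled per node and is now valid on agents as well as servers.
    echo "enable-pprof: true" >> "$cfg_file"
    # supervisor-metrics is a server flag; per the PR it configures all cluster members.
    if [ "$cfg_role" = "server" ]; then
        echo "supervisor-metrics: true" >> "$cfg_file"
    fi
}
```

On a server this might be called as `enable_supervisor_endpoints /etc/rancher/k3s/config.yaml server`, followed by a restart of the k3s service.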

Further Comments

@brandond brandond requested a review from a team as a code owner April 25, 2024 04:03

codecov bot commented Apr 25, 2024

Codecov Report

Attention: Patch coverage is 24.09091%, with 167 lines in your changes missing coverage. Please review.

Project coverage is 41.70%. Comparing base (dba30ab) to head (bcdd0be).

| Files | Patch % | Lines |
|---|---|---|
| pkg/agent/https/https.go | 0.00% | 52 Missing ⚠️ |
| pkg/util/net.go | 14.81% | 45 Missing and 1 partial ⚠️ |
| pkg/profile/profile.go | 0.00% | 13 Missing ⚠️ |
| pkg/metrics/metrics.go | 0.00% | 9 Missing ⚠️ |
| pkg/agent/run.go | 0.00% | 4 Missing and 2 partials ⚠️ |
| pkg/cli/agent/agent.go | 40.00% | 3 Missing and 3 partials ⚠️ |
| pkg/cli/server/server.go | 66.66% | 3 Missing and 3 partials ⚠️ |
| pkg/spegel/spegel.go | 0.00% | 6 Missing ⚠️ |
| pkg/agent/config/config.go | 58.33% | 2 Missing and 3 partials ⚠️ |
| pkg/server/router.go | 50.00% | 5 Missing ⚠️ |

... and 7 more
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #10019      +/-   ##
==========================================
- Coverage   46.55%   41.70%   -4.85%     
==========================================
  Files         173      177       +4     
  Lines       14645    14740      +95     
==========================================
- Hits         6818     6148     -670     
- Misses       6540     7425     +885     
+ Partials     1287     1167     -120     
| Flag | Coverage Δ |
|---|---|
| e2etests | 36.38% <24.09%> (-10.00%) ⬇️ |
| inttests | 37.05% <20.90%> (?) |
| unittests | 11.31% <0.90%> (-0.08%) ⬇️ |


* Refactor agent supervisor listener startup and authn/authz to use upstream
  auth delegators, which perform a SubjectAccessReview for access to
  metrics.
* Convert spegel and pprof handlers over to new structure.
* Promote bind-address to agent flag to allow setting supervisor bind
  address for both agent and server.
* Promote enable-pprof to agent flag to allow profiling agents. Access
  to the pprof endpoint now requires client cert auth, similar to the
  spegel registry api endpoint.
* Add prometheus metrics handler.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>