Remove dependency for kube-rbac-proxy and add secure-metrics feature #3833

Merged 12 commits on Mar 8, 2024
4 changes: 1 addition & 3 deletions Taskfile.yml
@@ -47,8 +47,6 @@ vars:
CROSSPLANE_OUTPUT:
sh: 'realpath hack/crossplane/config'

KUBE_RBAC_PROXY: gcr.io/kubebuilder/kube-rbac-proxy

# how long to let tests against live resources run for
LIVE_TEST_TIMEOUT: 3h

@@ -693,7 +691,7 @@ tasks:
deps:
- controller:generate-kustomize
cmds:
- "{{.SCRIPTS_ROOT}}/generate-helm-manifest.sh {{.KUBE_RBAC_PROXY}} {{.LOCAL_REGISTRY_CONTROLLER_DOCKER_IMAGE}} {{.PUBLIC_REGISTRY}} {{.LATEST_VERSION_TAG}} `pwd`/"
- "{{.SCRIPTS_ROOT}}/generate-helm-manifest.sh {{.LOCAL_REGISTRY_CONTROLLER_DOCKER_IMAGE}} {{.PUBLIC_REGISTRY}} {{.LATEST_VERSION_TAG}} `pwd`/"

controller:install-helm:
desc: Generate and install helm chart on cluster
70 changes: 67 additions & 3 deletions docs/hugo/content/guide/metrics.md
@@ -7,7 +7,7 @@ The metrics exposed fall into two groups: Azure based metrics, and reconciler me

## Toggling the metrics

By default, metrics for ASOv2 are turned on and can be toggled by the following options:
By default, secure metrics for ASOv2 are turned on and can be toggled by the following options:
Review comment (Member): Are they really turned on, given that the option below says `default: false`?

matthchr marked this conversation as resolved.

- ### ASOv2 Helm Chart

@@ -16,7 +16,8 @@ By default, metrics for ASOv2 are turned on and can be toggled by the following

```
--set metrics.enable=true/false (default: true)
--set metrics.address=0.0.0.0:8080 (default)
--set metrics.secure-metrics=true/false (default: false)
--set metrics.address=0.0.0.0:8080 (default)
```

- ### Deployment YAML
@@ -29,8 +30,71 @@ By default, metrics for ASOv2 are turned on and can be toggled by the following
containers:
- args:
- --metrics-addr=0.0.0.0:8080 (default)
- --secure-metrics=true/false (default: false)
```


## Scraping Metrics Securely via HTTPs

A ServiceAccount token is required to scrape metrics securely. The corresponding ServiceAccount needs permissions on the "/metrics" and "debug/pprof" paths.
This can be achieved e.g. by following the [Kubernetes documentation](https://kubernetes.io/docs/concepts/cluster-administration/system-metrics/).

- Use the settings below in your deployment:
super-harsh marked this conversation as resolved.

- #### ASOv2 Helm Chart
Review comment (Member): I don't think this bullet point needs to be a heading, and if it were a heading, it would be third level, not fourth. Suggest changing it to plain text.

```
--set metrics.enable=true
--set metrics.secure-metrics=true
--set metrics.address=0.0.0.0:8443
```

- #### Deployment YAML
```
spec:
containers:
- args:
- --metrics-addr=0.0.0.0:8443
- --secure-metrics=true
```

- Deploy the following RBAC configuration:
super-harsh marked this conversation as resolved.
```
cat << EOT | kubectl apply -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: default-metrics
rules:
- nonResourceURLs:
- "/metrics"
- "/debug/pprof/*"
verbs:
- get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: default-metrics
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: default-metrics
subjects:
- kind: ServiceAccount
name: default
namespace: default
EOT
```
- Open a port-forward
super-harsh marked this conversation as resolved.

```
kubectl port-forward deployments/azureserviceoperator-controller-manager -n azureserviceoperator-system 8443
```
- Create a ServiceAccount token and scrape metrics
```
TOKEN=$(kubectl create token default)
curl https://localhost:8443/metrics --header "Authorization: Bearer $TOKEN" -k
```
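
For readers who would rather check the endpoint programmatically than with `curl`, the request in the last step can be sketched in Go. This is a minimal sketch, not part of the PR: the helper names `newScrapeRequest` and `scrapeMetrics` are hypothetical, the URL assumes the port-forward above is running, and `TOKEN` is assumed to hold the output of `kubectl create token default`. Skipping certificate verification mirrors `curl -k` and is only reasonable for local testing.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"io"
	"net/http"
	"os"
	"time"
)

// newScrapeRequest builds a GET request carrying the ServiceAccount token
// as a bearer token. (Hypothetical helper, not part of ASO.)
func newScrapeRequest(url, token string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	return req, nil
}

// scrapeMetrics sends the request and returns the response body.
// insecureSkipVerify mirrors curl's -k flag; use it only for local testing.
func scrapeMetrics(url, token string, insecureSkipVerify bool) (string, error) {
	client := &http.Client{
		Timeout: 10 * time.Second,
		Transport: &http.Transport{
			TLSClientConfig: &tls.Config{InsecureSkipVerify: insecureSkipVerify},
		},
	}
	req, err := newScrapeRequest(url, token)
	if err != nil {
		return "", err
	}
	resp, err := client.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status %s: %s", resp.Status, body)
	}
	return string(body), nil
}

func main() {
	// Assumes the port-forward to 8443 is active and TOKEN is exported.
	body, err := scrapeMetrics("https://localhost:8443/metrics", os.Getenv("TOKEN"), true)
	if err != nil {
		fmt.Fprintln(os.Stderr, "scrape failed:", err)
		os.Exit(1)
	}
	fmt.Print(body)
}
```

A non-200 status (for example when the ServiceAccount lacks the RBAC binding) is returned as an error that includes the response body, which usually makes the cause easier to spot.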

## Understanding the ASOv2 Metrics

| Metric | Description | Label 1 | Label 2 | Label 3 |
20 changes: 10 additions & 10 deletions scripts/v2/generate-helm-manifest.sh
@@ -4,11 +4,10 @@

set -e

KUBE_RBAC_PROXY=$1
LOCAL_REGISTRY_CONTROLLER_DOCKER_IMAGE=$2
PUBLIC_REGISTRY=$3
VERSION=$4
DIR=$5
LOCAL_REGISTRY_CONTROLLER_DOCKER_IMAGE=$1
PUBLIC_REGISTRY=$2
VERSION=$3
DIR=$4

ASO_CHART="$DIR"charts/azure-service-operator
GEN_FILES_DIR="$ASO_CHART"/templates/generated
@@ -43,8 +42,8 @@ rm "$GEN_FILES_DIR"/*_namespace_* # remove namespace as we will let Helm manage
sed -i "s/\(version: \)\(.*\)/\1${VERSION//v}/g" "$ASO_CHART"/Chart.yaml # find version key and update the value with the current version

# Deployment replacements
grep -E $KUBE_RBAC_PROXY "$GEN_FILES_DIR"/*_deployment_* > /dev/null # Ensure that what we're about to try to replace actually exists (if it doesn't we want to fail)
sed -i "s@$KUBE_RBAC_PROXY.*@{{.Values.image.kubeRBACProxy}}@g" "$GEN_FILES_DIR"/*_deployment_*
#grep -E $KUBE_RBAC_PROXY "$GEN_FILES_DIR"/*_deployment_* > /dev/null # Ensure that what we're about to try to replace actually exists (if it doesn't we want to fail)
#sed -i "s@$KUBE_RBAC_PROXY.*@{{.Values.image.kubeRBACProxy}}@g" "$GEN_FILES_DIR"/*_deployment_*
Review comment (Member): These two lines are commented out; either restore them or delete them.

sed -i "s@$LOCAL_REGISTRY_CONTROLLER_DOCKER_IMAGE@{{.Values.image.repository}}@g" "$GEN_FILES_DIR"/*_deployment_* # Replace hardcoded ASO image
# Perl multiline replacements - using this because it's tricky to do these sorts of multiline replacements with sed
perl -0777 -i -pe 's/(template:\n.*metadata:\n.*annotations:\n(\s*))/$1\{\{- if .Values.podAnnotations \}\}\n$2\{\{ toYaml .Values.podAnnotations \}\}\n$2\{\{- end \}\}\n$2/igs' "$GEN_FILES_DIR"/*_deployment_* # Add pod annotations
@@ -54,7 +53,8 @@ perl -0777 -i -pe 's/(spec:\n.*template:\n.*spec:\n(\s*))/$1\{\{- with .Values.a
perl -0777 -i -pe 's/(spec:\n.*template:\n.*spec:\n(\s*))/$1\{\{- with .Values.tolerations \}\}\n$2tolerations:\n$2\{\{- toYaml . | nindent 8 \}\}\n$2\{\{- end \}\}\n$2/igs' "$GEN_FILES_DIR"/*_deployment_* # Add pod annotations

# Metrics Configuration
flow_control "metrics-addr" "metrics-addr" "{{- if .Values.metrics.enable}}" "$GEN_FILES_DIR"/*_deployment_*
flow_control "metrics-addr" "secure-metrics" "{{- if .Values.metrics.enable}}" "$GEN_FILES_DIR"/*_deployment_*
sed -i "1,/secure-metrics=.*/s/\(secure-metrics=\)\(.*\)/\1{{ .Values.metrics.secureMetrics }}/g" "$GEN_FILES_DIR"/*_deployment_*
sed -i "1,/metrics-addr=.*/s/\(metrics-addr=\)\(.*\)/\1{{ tpl .Values.metrics.address . }}/g" "$GEN_FILES_DIR"/*_deployment_*
sed -i 's/containerPort: 8080/containerPort: {{ .Values.metrics.port | default 8080 }}/g' "$GEN_FILES_DIR"/*_deployment_*
sed -i '1 i {{- if .Values.metrics.enable -}}' "$GEN_FILES_DIR"/*controller-manager-metrics-service*
@@ -87,8 +87,8 @@ flow_control "aadpodidbinding" "aadpodidbinding" "$IF_TENANT" "$GEN_FILES_DIR"/*

flow_control "--enable-leader-election" "--enable-leader-election" "$IF_TENANT" "$GEN_FILES_DIR"/*_deployment_*

# TODO: This bit is tricky to exclude kube-rbac-proxy and webhook stuff.
flow_control "mountPath: \/tmp\/k8s-webhook-server\/serving-certs" "name: https" "$IF_CLUSTER" "$GEN_FILES_DIR"/*_deployment_*
sed -i "/mountPath: \/tmp\/k8s-webhook-server\/serving-certs/i \ \ $IF_CLUSTER" "$GEN_FILES_DIR"/*_deployment_*
sed -i "/nodeSelector:/i \ \ {{- end }}" "$GEN_FILES_DIR"/*_deployment_*
flow_control "- name: cert" "secretName" "$IF_CLUSTER" "$GEN_FILES_DIR"/*_deployment_*
flow_control "--webhook-cert-dir=" "--webhook-cert-dir=" "$IF_CLUSTER" "$GEN_FILES_DIR"/*_deployment_*
sed -i 's/\/tmp\/k8s-webhook-server\/serving-certs/{{ .Values.webhook.certDir }}/g' "$GEN_FILES_DIR"/*_deployment_*
3 changes: 3 additions & 0 deletions v2/charts/azure-service-operator/values.yaml
@@ -92,6 +92,9 @@ image:
# 'address' field defines the metrics binding address on which metrics
metrics:
enable: true
# secureMetrics controls whether metrics should be served via 'http' or 'https'.
matthchr marked this conversation as resolved.
# Flagging secureMetrics as true would use https
secureMetrics: false
Review comment (Member): The comment in this YAML file reinforces my earlier question about whether secure metrics are turned on by default.

address: 0.0.0.0:{{ .Values.metrics.port }}
port: 8080

5 changes: 5 additions & 0 deletions v2/cmd/controller/app/flags.go
@@ -16,6 +16,7 @@ import (

type Flags struct {
MetricsAddr string
SecureMetrics bool
HealthAddr string
WebhookPort int
WebhookCertDir string
@@ -44,6 +45,7 @@ func ParseFlags(args []string) (Flags, error) {
klog.InitFlags(flagSet)

var metricsAddr string
var secureMetrics bool
var healthAddr string
var webhookPort int
var webhookCertDir string
@@ -54,6 +56,8 @@ func ParseFlags(args []string) (Flags, error) {

// default here for 'MetricsAddr' is set to "0", which sets metrics to be disabled if 'metrics-addr' flag is omitted.
flagSet.StringVar(&metricsAddr, "metrics-addr", "0", "The address the metric endpoint binds to.")
flagSet.BoolVar(&secureMetrics, "secure-metrics", false, "Enable secure metrics. This will enable serving pprof endpoints and metrics securely using https")

flagSet.StringVar(&healthAddr, "health-addr", "", "The address the healthz endpoint binds to.")
flagSet.IntVar(&webhookPort, "webhook-port", 9443, "The port the webhook endpoint binds to.")
flagSet.StringVar(&webhookCertDir, "webhook-cert-dir", "", "The directory the webhook server's certs are stored.")
@@ -69,6 +73,7 @@ func ParseFlags(args []string) (Flags, error) {

return Flags{
MetricsAddr: metricsAddr,
SecureMetrics: secureMetrics,
HealthAddr: healthAddr,
WebhookPort: webhookPort,
WebhookCertDir: webhookCertDir,
29 changes: 26 additions & 3 deletions v2/cmd/controller/app/setup.go
@@ -9,6 +9,8 @@ import (
"context"
"fmt"
"math/rand"
"net/http"
"net/http/pprof"
"os"
"regexp"
"time"
@@ -131,9 +133,7 @@ func SetupControllerManager(ctx context.Context, setupLog logr.Logger, flgs Flag
LeaderElection: flgs.EnableLeaderElection,
LeaderElectionID: "controllers-leader-election-azinfra-generated",
HealthProbeBindAddress: flgs.HealthAddr,
Metrics: server.Options{
BindAddress: flgs.MetricsAddr,
},
Metrics: getMetricsOpts(flgs),
WebhookServer: webhook.NewServer(webhook.Options{
Port: flgs.WebhookPort,
CertDir: flgs.WebhookCertDir,
@@ -253,6 +253,29 @@ func SetupControllerManager(ctx context.Context, setupLog logr.Logger, flgs Flag
return mgr
}

func getMetricsOpts(flags Flags) server.Options {
var metricsOptions server.Options
if flags.SecureMetrics {
metricsOptions = server.Options{
BindAddress: flags.MetricsAddr,
SecureServing: flags.SecureMetrics,
// Note that pprof endpoints are meant to be sensitive and shouldn't be exposed publicly.
ExtraHandlers: map[string]http.Handler{
Review comment (Member): We should probably have a dedicated command-line flag that enables this, and it should default to off.

This is what apiserver does via the --profiling argument, although it defaults to true there.

I'd argue we should have both metrics.secure and metrics.profiling flags in Helm, and two command-line arguments in ASO to control these two behaviors.

In my opinion, secure metrics shouldn't require exposing pprof.

"/debug/pprof/": http.HandlerFunc(pprof.Index),
"/debug/pprof/cmdline": http.HandlerFunc(pprof.Cmdline),
"/debug/pprof/profile": http.HandlerFunc(pprof.Profile),
"/debug/pprof/symbol": http.HandlerFunc(pprof.Symbol),
"/debug/pprof/trace": http.HandlerFunc(pprof.Trace),
},
}
} else {
metricsOptions = server.Options{
BindAddress: flags.MetricsAddr,
}
}
return metricsOptions
}
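
The review comment on `ExtraHandlers` suggests decoupling pprof from secure serving. One way that might look is sketched below, under stated assumptions: the `--profiling` flag name is borrowed from the apiserver argument the reviewer mentions and is not part of this PR, and `metricsOptions` is a hypothetical stand-in for controller-runtime's metrics `server.Options`, reduced to the three fields this PR sets so the sketch compiles with the standard library only.

```go
package main

import (
	"flag"
	"fmt"
	"net/http"
	"net/http/pprof"
)

// metricsOptions is a hypothetical stand-in for controller-runtime's
// metrics server.Options, limited to the fields used in this PR.
type metricsOptions struct {
	BindAddress   string
	SecureServing bool
	ExtraHandlers map[string]http.Handler
}

// metricsFlags holds two independent switches. Profiling is an assumed
// addition taken from the review suggestion; it does not exist in the PR.
type metricsFlags struct {
	MetricsAddr   string
	SecureMetrics bool
	Profiling     bool
}

// parseMetricsFlags parses only the metrics-related flags.
func parseMetricsFlags(args []string) (metricsFlags, error) {
	var f metricsFlags
	fs := flag.NewFlagSet("controller", flag.ContinueOnError)
	fs.StringVar(&f.MetricsAddr, "metrics-addr", "0", "The address the metric endpoint binds to.")
	fs.BoolVar(&f.SecureMetrics, "secure-metrics", false, "Serve metrics over https.")
	fs.BoolVar(&f.Profiling, "profiling", false, "Expose pprof endpoints on the metrics server (assumed flag).")
	err := fs.Parse(args)
	return f, err
}

// buildMetricsOptions registers the pprof handlers only when profiling is
// requested, regardless of whether secure serving is enabled.
func buildMetricsOptions(f metricsFlags) metricsOptions {
	opts := metricsOptions{
		BindAddress:   f.MetricsAddr,
		SecureServing: f.SecureMetrics,
	}
	if f.Profiling {
		opts.ExtraHandlers = map[string]http.Handler{
			"/debug/pprof/":        http.HandlerFunc(pprof.Index),
			"/debug/pprof/cmdline": http.HandlerFunc(pprof.Cmdline),
			"/debug/pprof/profile": http.HandlerFunc(pprof.Profile),
			"/debug/pprof/symbol":  http.HandlerFunc(pprof.Symbol),
			"/debug/pprof/trace":   http.HandlerFunc(pprof.Trace),
		}
	}
	return opts
}

func main() {
	f, err := parseMetricsFlags([]string{"--metrics-addr=0.0.0.0:8443", "--secure-metrics=true"})
	if err != nil {
		fmt.Println("flag error:", err)
		return
	}
	opts := buildMetricsOptions(f)
	// Secure serving is on, but no pprof handlers are registered.
	fmt.Printf("secure=%t pprofHandlers=%d\n", opts.SecureServing, len(opts.ExtraHandlers))
}
```

With this shape, enabling https does not implicitly expose pprof, which is the behavior the reviewer argues for; whether pprof should additionally require secure serving is a separate design decision that the sketch leaves open.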

func getDefaultAzureCredential(cfg config.Values, setupLog logr.Logger) (*identity.Credential, error) {
tokenCred, err := getDefaultAzureTokenCredential(cfg, setupLog)
if err != nil {
1 change: 0 additions & 1 deletion v2/config/manager/kustomization.yaml
@@ -5,6 +5,5 @@ resources:
- manager_metrics_service.yaml

patchesStrategicMerge:
- manager_auth_proxy_patch.yaml
- manager_image_patch.yaml
- manager_pull_policy.yaml
1 change: 1 addition & 0 deletions v2/config/manager/manager.yaml
Original file line number Diff line number Diff line change
@@ -39,6 +39,7 @@ spec:
containers:
- args:
- --metrics-addr=:8080
- --secure-metrics=false
- --health-addr=:8081
- --enable-leader-election
- --v=2
24 changes: 0 additions & 24 deletions v2/config/manager/manager_auth_proxy_patch.yaml

This file was deleted.

14 changes: 0 additions & 14 deletions v2/config/rbac/auth_proxy_service.yaml

This file was deleted.

4 changes: 1 addition & 3 deletions v2/config/rbac/kustomization.yaml
@@ -10,8 +10,6 @@ resources:
- leader_election_role.yaml
- leader_election_role_binding.yaml
# Comment the following 3 lines if you want to disable
# the auth proxy (https://github.com/brancz/kube-rbac-proxy)
# which protects your /metrics endpoint.
- auth_proxy_service.yaml
# the SecureMetrics which protects your /metrics endpoint.
- auth_proxy_role.yaml
- auth_proxy_role_binding.yaml