Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove dependency for kube-rbac-proxy and add secure-metrics feature #3833

Merged
merged 12 commits into from
Mar 8, 2024
4 changes: 1 addition & 3 deletions Taskfile.yml
Original file line number Diff line number Diff line change
Expand Up @@ -47,8 +47,6 @@ vars:
CROSSPLANE_OUTPUT:
sh: 'realpath hack/crossplane/config'

KUBE_RBAC_PROXY: gcr.io/kubebuilder/kube-rbac-proxy

# how long to let tests against live resources run for
LIVE_TEST_TIMEOUT: 3h

Expand Down Expand Up @@ -694,7 +692,7 @@ tasks:
deps:
- controller:generate-kustomize
cmds:
- "{{.SCRIPTS_ROOT}}/generate-helm-manifest.sh {{.KUBE_RBAC_PROXY}} {{.LOCAL_REGISTRY_CONTROLLER_DOCKER_IMAGE}} {{.PUBLIC_REGISTRY}} {{.LATEST_VERSION_TAG}} `pwd`/"
- "{{.SCRIPTS_ROOT}}/generate-helm-manifest.sh {{.LOCAL_REGISTRY_CONTROLLER_DOCKER_IMAGE}} {{.PUBLIC_REGISTRY}} {{.LATEST_VERSION_TAG}} `pwd`/"

controller:install-helm:
desc: Generate and install helm chart on cluster
Expand Down
82 changes: 77 additions & 5 deletions docs/hugo/content/guide/metrics.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,19 +7,21 @@ The metrics exposed fall into two groups: Azure based metrics, and reconciler me

## Toggling the metrics

By default, metrics for ASOv2 are turned on and can be toggled by the following options:
By default, secure metrics for ASOv2 are turned on and can be toggled by the following options:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are they really turned on, given the option below says default: false ?

matthchr marked this conversation as resolved.
Show resolved Hide resolved

- ### ASOv2 Helm Chart
### ASOv2 Helm Chart

While installing the Helm chart, we can turn the metrics _**on**_ and _**off**_ and set the metrics expose address using the
below settings. Also, we can change the settings inside `values.yaml` file for ASOv2 Helm chart.

```
--set metrics.enable=true/false (default: true)
--set metrics.address=0.0.0.0:8080 (default)
--set metrics.secure=true/false (default: true)
--set metrics.profiling=true/false (default: false)
--set metrics.address=0.0.0.0:8443 (default)
```

- ### Deployment YAML
### Deployment YAML

In the deployment yaml, we can turn _**off**_ the metrics by omitting the `metrics-addr` flag. We can also change to use
a different metrics-addr by changing the default value of that same flag.
Expand All @@ -29,8 +31,78 @@ By default, metrics for ASOv2 are turned on and can be toggled by the following
containers:
- args:
- --metrics-addr=0.0.0.0:8080 (default)
- --secure-metrics=true/false (default: true)
- --profiling-metrics=true/false (default: false)
```


## Scraping Metrics Securely via HTTPs using RBAC

A ServiceAccount token is required to scrape metrics securely. The corresponding ServiceAccount needs permissions on the "/metrics" and "debug/pprof" paths.
This can be achieved e.g. by following the [Kubernetes documentation](https://kubernetes.io/docs/concepts/cluster-administration/system-metrics/).

Follow the steps below to scrape metrics securely.

{{< tabpane text=true left=true >}}
{{% tab header="Helm Chart" %}}
``` Helm Chart
--set metrics.enable=true
--set metrics.secure=true
--set metrics.profiling=true
--set metrics.address=0.0.0.0:8443
```
{{% /tab %}}
{{% tab header="Deployment YAML" %}}
``` Deployment YAML
spec:
containers:
- args:
- --metrics-addr=0.0.0.0:8443
- --secure-metrics=true
- --profiling-metrics=true
```
{{% /tab %}}
{{< /tabpane >}}

Deploy the following RBAC configuration. This creates a role that can scrape metrics.
```
cat << EOT | kubectl apply -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: default-metrics
rules:
- nonResourceURLs:
- "/metrics"
- "/debug/pprof/*"
verbs:
- get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: default-metrics
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: default-metrics
subjects:
- kind: ServiceAccount
name: default
namespace: default
EOT
```
Test locally:
- Open a port-forward

```
kubectl port-forward deployments/azureserviceoperator-controller-manager -n azureserviceoperator-system 8443
```
- Create a ServiceAccount token and scrape metrics
```
TOKEN=$(kubectl create token default)
curl https://localhost:8443/metrics --header "Authorization: Bearer $TOKEN" -k
```

## Understanding the ASOv2 Metrics

| Metric | Description | Label 1 | Label 2 | Label 3 |
Expand Down
23 changes: 11 additions & 12 deletions scripts/v2/generate-helm-manifest.sh
Original file line number Diff line number Diff line change
Expand Up @@ -4,11 +4,10 @@

set -e

KUBE_RBAC_PROXY=$1
LOCAL_REGISTRY_CONTROLLER_DOCKER_IMAGE=$2
PUBLIC_REGISTRY=$3
VERSION=$4
DIR=$5
LOCAL_REGISTRY_CONTROLLER_DOCKER_IMAGE=$1
PUBLIC_REGISTRY=$2
VERSION=$3
DIR=$4

ASO_CHART="$DIR"charts/azure-service-operator
GEN_FILES_DIR="$ASO_CHART"/templates/generated
Expand Down Expand Up @@ -43,8 +42,6 @@ rm "$GEN_FILES_DIR"/*_namespace_* # remove namespace as we will let Helm manage
sed -i "s/\(version: \)\(.*\)/\1${VERSION//v}/g" "$ASO_CHART"/Chart.yaml # find version key and update the value with the current version

# Deployment replacements
grep -E $KUBE_RBAC_PROXY "$GEN_FILES_DIR"/*_deployment_* > /dev/null # Ensure that what we're about to try to replace actually exists (if it doesn't we want to fail)
sed -i "s@$KUBE_RBAC_PROXY.*@{{.Values.image.kubeRBACProxy}}@g" "$GEN_FILES_DIR"/*_deployment_*
sed -i "s@$LOCAL_REGISTRY_CONTROLLER_DOCKER_IMAGE@{{.Values.image.repository}}@g" "$GEN_FILES_DIR"/*_deployment_* # Replace hardcoded ASO image
# Perl multiline replacements - using this because it's tricky to do these sorts of multiline replacements with sed
perl -0777 -i -pe 's/(template:\n.*metadata:\n.*annotations:\n(\s*))/$1\{\{- if .Values.podAnnotations \}\}\n$2\{\{ toYaml .Values.podAnnotations \}\}\n$2\{\{- end \}\}\n$2/igs' "$GEN_FILES_DIR"/*_deployment_* # Add pod annotations
Expand All @@ -54,11 +51,13 @@ perl -0777 -i -pe 's/(spec:\n.*template:\n.*spec:\n(\s*))/$1\{\{- with .Values.a
perl -0777 -i -pe 's/(spec:\n.*template:\n.*spec:\n(\s*))/$1\{\{- with .Values.tolerations \}\}\n$2tolerations:\n$2\{\{- toYaml . | nindent 8 \}\}\n$2\{\{- end \}\}\n$2/igs' "$GEN_FILES_DIR"/*_deployment_* # Add pod annotations

# Metrics Configuration
flow_control "metrics-addr" "metrics-addr" "{{- if .Values.metrics.enable}}" "$GEN_FILES_DIR"/*_deployment_*
flow_control "metrics-addr" "profiling-metrics" "{{- if .Values.metrics.enable}}" "$GEN_FILES_DIR"/*_deployment_*
sed -i "1,/secure-metrics=.*/s/\(secure-metrics=\)\(.*\)/\1{{ .Values.metrics.secure }}/g" "$GEN_FILES_DIR"/*_deployment_*
sed -i "1,/profiling-metrics=.*/s/\(profiling-metrics=\)\(.*\)/\1{{ .Values.metrics.profiling }}/g" "$GEN_FILES_DIR"/*_deployment_*
sed -i "1,/metrics-addr=.*/s/\(metrics-addr=\)\(.*\)/\1{{ tpl .Values.metrics.address . }}/g" "$GEN_FILES_DIR"/*_deployment_*
sed -i 's/containerPort: 8080/containerPort: {{ .Values.metrics.port | default 8080 }}/g' "$GEN_FILES_DIR"/*_deployment_*
sed -i 's/containerPort: 8080/containerPort: {{ .Values.metrics.port | default 8443 }}/g' "$GEN_FILES_DIR"/*_deployment_*
sed -i '1 i {{- if .Values.metrics.enable -}}' "$GEN_FILES_DIR"/*controller-manager-metrics-service*
sed -i 's/port: 8080/port: {{ .Values.metrics.port | default 8080 }}/g' "$GEN_FILES_DIR"/*controller-manager-metrics-service*
sed -i 's/port: 8080/port: {{ .Values.metrics.port | default 8443 }}/g' "$GEN_FILES_DIR"/*controller-manager-metrics-service*
sed -i -e '$a{{- end }}' "$GEN_FILES_DIR"/*controller-manager-metrics-service*
find "$GEN_FILES_DIR" -type f -exec sed -i 's/azureserviceoperator-system/{{ .Release.Namespace }}/g' {} \;

Expand Down Expand Up @@ -87,8 +86,8 @@ flow_control "aadpodidbinding" "aadpodidbinding" "$IF_TENANT" "$GEN_FILES_DIR"/*

flow_control "--enable-leader-election" "--enable-leader-election" "$IF_TENANT" "$GEN_FILES_DIR"/*_deployment_*

# TODO: This bit is tricky to exclude kube-rbac-proxy and webhook stuff.
flow_control "mountPath: \/tmp\/k8s-webhook-server\/serving-certs" "name: https" "$IF_CLUSTER" "$GEN_FILES_DIR"/*_deployment_*
sed -i "/mountPath: \/tmp\/k8s-webhook-server\/serving-certs/i \ \ $IF_CLUSTER" "$GEN_FILES_DIR"/*_deployment_*
sed -i "/nodeSelector:/i \ \ {{- end }}" "$GEN_FILES_DIR"/*_deployment_*
flow_control "- name: cert" "secretName" "$IF_CLUSTER" "$GEN_FILES_DIR"/*_deployment_*
flow_control "--webhook-cert-dir=" "--webhook-cert-dir=" "$IF_CLUSTER" "$GEN_FILES_DIR"/*_deployment_*
sed -i 's/\/tmp\/k8s-webhook-server\/serving-certs/{{ .Values.webhook.certDir }}/g' "$GEN_FILES_DIR"/*_deployment_*
Expand Down
16 changes: 14 additions & 2 deletions v2/charts/azure-service-operator/values.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -86,14 +86,26 @@ webhook:
# repository).
image:
repository: mcr.microsoft.com/k8s/azureserviceoperator:v2.6.0
kubeRBACProxy: gcr.io/kubebuilder/kube-rbac-proxy:v0.13.1

# 'metrics' define settings for the metrics from controller.
# 'address' field defines the metrics binding address on which metrics
metrics:
enable: true
# secure controls whether metrics should be served via 'http' or 'https'.
# Flagging secure as 'true' would use https
# Refer to https://azure.github.io/azure-service-operator/guide/metrics/ for more information
secure: true
# profiling exposes below endpoints.
# /debug/pprof/
# /debug/pprof/cmdline
# /debug/pprof/profile
# /debug/pprof/symbol
# /debug/pprof/trace
#
# pprof endpoints are sensitive and can only be enabled when serving metrics securely
profiling: false
address: 0.0.0.0:{{ .Values.metrics.port }}
port: 8080
port: 8443

# installCRDs configures if the operator attempts to install and manage the CRDs associated with ASO.
# If the operator does not install and manage the CRDs on its own, you must manually install the appropriate
Expand Down
14 changes: 12 additions & 2 deletions v2/cmd/controller/app/flags.go
Original file line number Diff line number Diff line change
Expand Up @@ -16,19 +16,23 @@ import (

type Flags struct {
MetricsAddr string
ProfilingMetrics bool
SecureMetrics bool
HealthAddr string
WebhookPort int
WebhookCertDir string
EnableLeaderElection bool
CRDManagementMode string
CRDPatterns string // This is a ; delimited string containing a collection of patterns
CRDPatterns string // This is a ';' delimited string containing a collection of patterns
PreUpgradeCheck bool
}

func (f Flags) String() string {
return fmt.Sprintf(
"MetricsAddr: %s, HealthAddr: %s, WebhookPort: %d, WebhookCertDir: %s, EnableLeaderElection: %t, CRDManagementMode: %s, CRDPatterns: %s, PreUpgradeCheck: %t",
"MetricsAddr: %s, SecureMetrics: %t, ProfilingMetrics: %t, HealthAddr: %s, WebhookPort: %d, WebhookCertDir: %s, EnableLeaderElection: %t, CRDManagementMode: %s, CRDPatterns: %s, PreUpgradeCheck: %t",
f.MetricsAddr,
f.SecureMetrics,
f.ProfilingMetrics,
f.HealthAddr,
f.WebhookPort,
f.WebhookCertDir,
Expand All @@ -44,6 +48,8 @@ func ParseFlags(args []string) (Flags, error) {
klog.InitFlags(flagSet)

var metricsAddr string
var profilingMetrics bool
var secureMetrics bool
var healthAddr string
var webhookPort int
var webhookCertDir string
Expand All @@ -54,6 +60,9 @@ func ParseFlags(args []string) (Flags, error) {

// default here for 'MetricsAddr' is set to "0", which sets metrics to be disabled if 'metrics-addr' flag is omitted.
flagSet.StringVar(&metricsAddr, "metrics-addr", "0", "The address the metric endpoint binds to.")
flagSet.BoolVar(&secureMetrics, "secure-metrics", true, "Enable secure metrics. This secures the pprof and metrics endpoints via Kubernetes RBAC and HTTPS")
flagSet.BoolVar(&profilingMetrics, "profiling-metrics", false, "Enable pprof metrics, only enabled in conjunction with secure-metrics. This will enable serving pprof metrics endpoints")

flagSet.StringVar(&healthAddr, "health-addr", "", "The address the healthz endpoint binds to.")
flagSet.IntVar(&webhookPort, "webhook-port", 9443, "The port the webhook endpoint binds to.")
flagSet.StringVar(&webhookCertDir, "webhook-cert-dir", "", "The directory the webhook server's certs are stored.")
Expand All @@ -69,6 +78,7 @@ func ParseFlags(args []string) (Flags, error) {

return Flags{
MetricsAddr: metricsAddr,
SecureMetrics: secureMetrics,
HealthAddr: healthAddr,
WebhookPort: webhookPort,
WebhookCertDir: webhookCertDir,
Expand Down
35 changes: 32 additions & 3 deletions v2/cmd/controller/app/setup.go
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,8 @@ import (
"context"
"fmt"
"math/rand"
"net/http"
"net/http/pprof"
"os"
"regexp"
"time"
Expand All @@ -31,6 +33,7 @@ import (
"sigs.k8s.io/controller-runtime/pkg/controller"
"sigs.k8s.io/controller-runtime/pkg/healthz"
"sigs.k8s.io/controller-runtime/pkg/manager"
"sigs.k8s.io/controller-runtime/pkg/metrics/filters"
"sigs.k8s.io/controller-runtime/pkg/metrics/server"
"sigs.k8s.io/controller-runtime/pkg/reconcile"
"sigs.k8s.io/controller-runtime/pkg/webhook"
Expand Down Expand Up @@ -131,9 +134,7 @@ func SetupControllerManager(ctx context.Context, setupLog logr.Logger, flgs Flag
LeaderElection: flgs.EnableLeaderElection,
LeaderElectionID: "controllers-leader-election-azinfra-generated",
HealthProbeBindAddress: flgs.HealthAddr,
Metrics: server.Options{
BindAddress: flgs.MetricsAddr,
},
Metrics: getMetricsOpts(flgs),
WebhookServer: webhook.NewServer(webhook.Options{
Port: flgs.WebhookPort,
CertDir: flgs.WebhookCertDir,
Expand Down Expand Up @@ -253,6 +254,34 @@ func SetupControllerManager(ctx context.Context, setupLog logr.Logger, flgs Flag
return mgr
}

func getMetricsOpts(flags Flags) server.Options {
var metricsOptions server.Options

if flags.SecureMetrics {
metricsOptions = server.Options{
BindAddress: flags.MetricsAddr,
SecureServing: true,
FilterProvider: filters.WithAuthenticationAndAuthorization,
}
// Note that pprof endpoints are meant to be sensitive and shouldn't be exposed publicly.
if flags.ProfilingMetrics {
metricsOptions.ExtraHandlers = map[string]http.Handler{
"/debug/pprof/": http.HandlerFunc(pprof.Index),
"/debug/pprof/cmdline": http.HandlerFunc(pprof.Cmdline),
"/debug/pprof/profile": http.HandlerFunc(pprof.Profile),
"/debug/pprof/symbol": http.HandlerFunc(pprof.Symbol),
"/debug/pprof/trace": http.HandlerFunc(pprof.Trace),
}
}
} else {
metricsOptions = server.Options{
BindAddress: flags.MetricsAddr,
}
}

return metricsOptions
}

func getDefaultAzureCredential(cfg config.Values, setupLog logr.Logger) (*identity.Credential, error) {
tokenCred, err := getDefaultAzureTokenCredential(cfg, setupLog)
if err != nil {
Expand Down
1 change: 0 additions & 1 deletion v2/config/manager/kustomization.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,5 @@ resources:
- manager_metrics_service.yaml

patchesStrategicMerge:
- manager_auth_proxy_patch.yaml
- manager_image_patch.yaml
- manager_pull_policy.yaml
6 changes: 4 additions & 2 deletions v2/config/manager/manager.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -38,7 +38,9 @@ spec:
serviceAccountName: default
containers:
- args:
- --metrics-addr=:8080
- --metrics-addr=:8443
- --secure-metrics=true
- --profiling-metrics=false
- --health-addr=:8081
- --enable-leader-election
- --v=2
Expand All @@ -47,7 +49,7 @@ spec:
- containerPort: 8081
name: health-port
protocol: TCP
- containerPort: 8080
- containerPort: 8443
name: metrics-port
protocol: TCP
livenessProbe:
Expand Down
24 changes: 0 additions & 24 deletions v2/config/manager/manager_auth_proxy_patch.yaml

This file was deleted.

2 changes: 1 addition & 1 deletion v2/config/manager/manager_metrics_service.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,6 @@ metadata:
spec:
ports:
- name: metrics
port: 8080
port: 8443
matthchr marked this conversation as resolved.
Show resolved Hide resolved
selector:
control-plane: controller-manager
13 changes: 0 additions & 13 deletions v2/config/rbac/auth_proxy_role.yaml

This file was deleted.

Loading
Loading