Skip to content

Commit

Permalink
Remove dependency for kube-rbac-proxy and add secure-metrics feature (#…
Browse files Browse the repository at this point in the history
…3833)

* Remove dependency for kube-rbac-proxy and add secure-metrics feature flag

* Update values.yaml

* Update kustomize.yaml

* Update to default enable secureMetrics; Update role and rolebindings

* Add metrics auth filter

* Add seperate arg for profiling-metrics

* Refactor

* fix typo

* Resolve comments

* gofmt file

* Add tabpane for securemetrics docs
  • Loading branch information
super-harsh authored Mar 8, 2024
1 parent d58d0f6 commit 034c9e3
Show file tree
Hide file tree
Showing 17 changed files with 233 additions and 100 deletions.
4 changes: 1 addition & 3 deletions Taskfile.yml
Original file line number Diff line number Diff line change
Expand Up @@ -47,8 +47,6 @@ vars:
CROSSPLANE_OUTPUT:
sh: 'realpath hack/crossplane/config'

KUBE_RBAC_PROXY: gcr.io/kubebuilder/kube-rbac-proxy

# how long to let tests against live resources run for
LIVE_TEST_TIMEOUT: 3h

Expand Down Expand Up @@ -694,7 +692,7 @@ tasks:
deps:
- controller:generate-kustomize
cmds:
- "{{.SCRIPTS_ROOT}}/generate-helm-manifest.sh {{.KUBE_RBAC_PROXY}} {{.LOCAL_REGISTRY_CONTROLLER_DOCKER_IMAGE}} {{.PUBLIC_REGISTRY}} {{.LATEST_VERSION_TAG}} `pwd`/"
- "{{.SCRIPTS_ROOT}}/generate-helm-manifest.sh {{.LOCAL_REGISTRY_CONTROLLER_DOCKER_IMAGE}} {{.PUBLIC_REGISTRY}} {{.LATEST_VERSION_TAG}} `pwd`/"

controller:install-helm:
desc: Generate and install helm chart on cluster
Expand Down
82 changes: 77 additions & 5 deletions docs/hugo/content/guide/metrics.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,19 +7,21 @@ The metrics exposed fall into two groups: Azure based metrics, and reconciler me

## Toggling the metrics

By default, metrics for ASOv2 are turned on and can be toggled by the following options:
By default, secure metrics for ASOv2 are turned on and can be toggled by the following options:

- ### ASOv2 Helm Chart
### ASOv2 Helm Chart

While installing the Helm chart, we can turn the metrics _**on**_ and _**off**_ and set the metrics expose address using the
below settings. Also, we can change the settings inside `values.yaml` file for ASOv2 Helm chart.

```
--set metrics.enable=true/false (default: true)
--set metrics.address=0.0.0.0:8080 (default)
--set metrics.secure=true/false (default: true)
--set metrics.profiling=true/false (default: false)
--set metrics.address=0.0.0.0:8443 (default)
```

- ### Deployment YAML
### Deployment YAML

In the deployment yaml, we can turn _**off**_ the metrics by omitting the `metrics-addr` flag. We can also change to use
a different metrics-addr by changing the default value of that same flag.
Expand All @@ -29,8 +31,78 @@ By default, metrics for ASOv2 are turned on and can be toggled by the following
containers:
- args:
- --metrics-addr=0.0.0.0:8080 (default)
- --secure-metrics=true/false (default: true)
- --profiling-metrics=true/false (default: false)
```

## Scraping Metrics Securely via HTTPs using RBAC

A ServiceAccount token is required to scrape metrics securely. The corresponding ServiceAccount needs permissions on the "/metrics" and "debug/pprof" paths.
This can be achieved e.g. by following the [Kubernetes documentation](https://kubernetes.io/docs/concepts/cluster-administration/system-metrics/).

Follow the steps below to scrape metrics securely.

{{< tabpane text=true left=true >}}
{{% tab header="Helm Chart" %}}
``` Helm Chart
--set metrics.enable=true
--set metrics.secure=true
--set metrics.profiling=true
--set metrics.address=0.0.0.0:8443
```
{{% /tab %}}
{{% tab header="Deployment YAML" %}}
``` Deployment YAML
spec:
containers:
- args:
- --metrics-addr=0.0.0.0:8443
- --secure-metrics=true
- --profiling-metrics=true
```
{{% /tab %}}
{{< /tabpane >}}

Deploy the following RBAC configuration. This creates a role that can scrape metrics.
```
cat << EOT | kubectl apply -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: default-metrics
rules:
- nonResourceURLs:
- "/metrics"
- "/debug/pprof/*"
verbs:
- get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: default-metrics
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: default-metrics
subjects:
- kind: ServiceAccount
name: default
namespace: default
EOT
```
Test locally:
- Open a port-forward

```
kubectl port-forward deployments/azureserviceoperator-controller-manager -n azureserviceoperator-system 8443
```
- Create a ServiceAccount token and scrape metrics
```
TOKEN=$(kubectl create token default)
curl https://localhost:8443/metrics --header "Authorization: Bearer $TOKEN" -k
```
## Understanding the ASOv2 Metrics
| Metric | Description | Label 1 | Label 2 | Label 3 |
Expand Down
23 changes: 11 additions & 12 deletions scripts/v2/generate-helm-manifest.sh
Original file line number Diff line number Diff line change
Expand Up @@ -4,11 +4,10 @@

set -e

KUBE_RBAC_PROXY=$1
LOCAL_REGISTRY_CONTROLLER_DOCKER_IMAGE=$2
PUBLIC_REGISTRY=$3
VERSION=$4
DIR=$5
LOCAL_REGISTRY_CONTROLLER_DOCKER_IMAGE=$1
PUBLIC_REGISTRY=$2
VERSION=$3
DIR=$4

ASO_CHART="$DIR"charts/azure-service-operator
GEN_FILES_DIR="$ASO_CHART"/templates/generated
Expand Down Expand Up @@ -43,8 +42,6 @@ rm "$GEN_FILES_DIR"/*_namespace_* # remove namespace as we will let Helm manage
sed -i "s/\(version: \)\(.*\)/\1${VERSION//v}/g" "$ASO_CHART"/Chart.yaml # find version key and update the value with the current version

# Deployment replacements
grep -E $KUBE_RBAC_PROXY "$GEN_FILES_DIR"/*_deployment_* > /dev/null # Ensure that what we're about to try to replace actually exists (if it doesn't we want to fail)
sed -i "s@$KUBE_RBAC_PROXY.*@{{.Values.image.kubeRBACProxy}}@g" "$GEN_FILES_DIR"/*_deployment_*
sed -i "s@$LOCAL_REGISTRY_CONTROLLER_DOCKER_IMAGE@{{.Values.image.repository}}@g" "$GEN_FILES_DIR"/*_deployment_* # Replace hardcoded ASO image
# Perl multiline replacements - using this because it's tricky to do these sorts of multiline replacements with sed
perl -0777 -i -pe 's/(template:\n.*metadata:\n.*annotations:\n(\s*))/$1\{\{- if .Values.podAnnotations \}\}\n$2\{\{ toYaml .Values.podAnnotations \}\}\n$2\{\{- end \}\}\n$2/igs' "$GEN_FILES_DIR"/*_deployment_* # Add pod annotations
Expand All @@ -54,11 +51,13 @@ perl -0777 -i -pe 's/(spec:\n.*template:\n.*spec:\n(\s*))/$1\{\{- with .Values.a
perl -0777 -i -pe 's/(spec:\n.*template:\n.*spec:\n(\s*))/$1\{\{- with .Values.tolerations \}\}\n$2tolerations:\n$2\{\{- toYaml . | nindent 8 \}\}\n$2\{\{- end \}\}\n$2/igs' "$GEN_FILES_DIR"/*_deployment_* # Add pod annotations

# Metrics Configuration
flow_control "metrics-addr" "metrics-addr" "{{- if .Values.metrics.enable}}" "$GEN_FILES_DIR"/*_deployment_*
flow_control "metrics-addr" "profiling-metrics" "{{- if .Values.metrics.enable}}" "$GEN_FILES_DIR"/*_deployment_*
sed -i "1,/secure-metrics=.*/s/\(secure-metrics=\)\(.*\)/\1{{ .Values.metrics.secure }}/g" "$GEN_FILES_DIR"/*_deployment_*
sed -i "1,/profiling-metrics=.*/s/\(profiling-metrics=\)\(.*\)/\1{{ .Values.metrics.profiling }}/g" "$GEN_FILES_DIR"/*_deployment_*
sed -i "1,/metrics-addr=.*/s/\(metrics-addr=\)\(.*\)/\1{{ tpl .Values.metrics.address . }}/g" "$GEN_FILES_DIR"/*_deployment_*
sed -i 's/containerPort: 8080/containerPort: {{ .Values.metrics.port | default 8080 }}/g' "$GEN_FILES_DIR"/*_deployment_*
sed -i 's/containerPort: 8080/containerPort: {{ .Values.metrics.port | default 8443 }}/g' "$GEN_FILES_DIR"/*_deployment_*
sed -i '1 i {{- if .Values.metrics.enable -}}' "$GEN_FILES_DIR"/*controller-manager-metrics-service*
sed -i 's/port: 8080/port: {{ .Values.metrics.port | default 8080 }}/g' "$GEN_FILES_DIR"/*controller-manager-metrics-service*
sed -i 's/port: 8080/port: {{ .Values.metrics.port | default 8443 }}/g' "$GEN_FILES_DIR"/*controller-manager-metrics-service*
sed -i -e '$a{{- end }}' "$GEN_FILES_DIR"/*controller-manager-metrics-service*
find "$GEN_FILES_DIR" -type f -exec sed -i 's/azureserviceoperator-system/{{ .Release.Namespace }}/g' {} \;

Expand Down Expand Up @@ -87,8 +86,8 @@ flow_control "aadpodidbinding" "aadpodidbinding" "$IF_TENANT" "$GEN_FILES_DIR"/*

flow_control "--enable-leader-election" "--enable-leader-election" "$IF_TENANT" "$GEN_FILES_DIR"/*_deployment_*

# TODO: This bit is tricky to exclude kube-rbac-proxy and webhook stuff.
flow_control "mountPath: \/tmp\/k8s-webhook-server\/serving-certs" "name: https" "$IF_CLUSTER" "$GEN_FILES_DIR"/*_deployment_*
sed -i "/mountPath: \/tmp\/k8s-webhook-server\/serving-certs/i \ \ $IF_CLUSTER" "$GEN_FILES_DIR"/*_deployment_*
sed -i "/nodeSelector:/i \ \ {{- end }}" "$GEN_FILES_DIR"/*_deployment_*
flow_control "- name: cert" "secretName" "$IF_CLUSTER" "$GEN_FILES_DIR"/*_deployment_*
flow_control "--webhook-cert-dir=" "--webhook-cert-dir=" "$IF_CLUSTER" "$GEN_FILES_DIR"/*_deployment_*
sed -i 's/\/tmp\/k8s-webhook-server\/serving-certs/{{ .Values.webhook.certDir }}/g' "$GEN_FILES_DIR"/*_deployment_*
Expand Down
16 changes: 14 additions & 2 deletions v2/charts/azure-service-operator/values.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -86,14 +86,26 @@ webhook:
# repository).
image:
repository: mcr.microsoft.com/k8s/azureserviceoperator:v2.6.0
kubeRBACProxy: gcr.io/kubebuilder/kube-rbac-proxy:v0.13.1

# 'metrics' define settings for the metrics from controller.
# 'address' field defines the metrics binding address on which metrics
metrics:
enable: true
# secure controls whether metrics should be served via 'http' or 'https'.
# Flagging secure as 'true' would use https
# Refer to https://azure.github.io/azure-service-operator/guide/metrics/ for more information
secure: true
# profiling exposes below endpoints.
# /debug/pprof/
# /debug/pprof/cmdline
# /debug/pprof/profile
# /debug/pprof/symbol
# /debug/pprof/trace
#
# pprof endpoints are sensitive and can only be enabled when serving metrics securely
profiling: false
address: 0.0.0.0:{{ .Values.metrics.port }}
port: 8080
port: 8443

# installCRDs configures if the operator attempts to install and manage the CRDs associated with ASO.
# If the operator does not install and manage the CRDs on its own, you must manually install the appropriate
Expand Down
14 changes: 12 additions & 2 deletions v2/cmd/controller/app/flags.go
Original file line number Diff line number Diff line change
Expand Up @@ -16,19 +16,23 @@ import (

type Flags struct {
MetricsAddr string
ProfilingMetrics bool
SecureMetrics bool
HealthAddr string
WebhookPort int
WebhookCertDir string
EnableLeaderElection bool
CRDManagementMode string
CRDPatterns string // This is a ; delimited string containing a collection of patterns
CRDPatterns string // This is a ';' delimited string containing a collection of patterns
PreUpgradeCheck bool
}

func (f Flags) String() string {
return fmt.Sprintf(
"MetricsAddr: %s, HealthAddr: %s, WebhookPort: %d, WebhookCertDir: %s, EnableLeaderElection: %t, CRDManagementMode: %s, CRDPatterns: %s, PreUpgradeCheck: %t",
"MetricsAddr: %s, SecureMetrics: %t, ProfilingMetrics: %t, HealthAddr: %s, WebhookPort: %d, WebhookCertDir: %s, EnableLeaderElection: %t, CRDManagementMode: %s, CRDPatterns: %s, PreUpgradeCheck: %t",
f.MetricsAddr,
f.SecureMetrics,
f.ProfilingMetrics,
f.HealthAddr,
f.WebhookPort,
f.WebhookCertDir,
Expand All @@ -44,6 +48,8 @@ func ParseFlags(args []string) (Flags, error) {
klog.InitFlags(flagSet)

var metricsAddr string
var profilingMetrics bool
var secureMetrics bool
var healthAddr string
var webhookPort int
var webhookCertDir string
Expand All @@ -54,6 +60,9 @@ func ParseFlags(args []string) (Flags, error) {

// default here for 'MetricsAddr' is set to "0", which sets metrics to be disabled if 'metrics-addr' flag is omitted.
flagSet.StringVar(&metricsAddr, "metrics-addr", "0", "The address the metric endpoint binds to.")
flagSet.BoolVar(&secureMetrics, "secure-metrics", true, "Enable secure metrics. This secures the pprof and metrics endpoints via Kubernetes RBAC and HTTPS")
flagSet.BoolVar(&profilingMetrics, "profiling-metrics", false, "Enable pprof metrics, only enabled in conjunction with secure-metrics. This will enable serving pprof metrics endpoints")

flagSet.StringVar(&healthAddr, "health-addr", "", "The address the healthz endpoint binds to.")
flagSet.IntVar(&webhookPort, "webhook-port", 9443, "The port the webhook endpoint binds to.")
flagSet.StringVar(&webhookCertDir, "webhook-cert-dir", "", "The directory the webhook server's certs are stored.")
Expand All @@ -69,6 +78,7 @@ func ParseFlags(args []string) (Flags, error) {

return Flags{
MetricsAddr: metricsAddr,
SecureMetrics: secureMetrics,
HealthAddr: healthAddr,
WebhookPort: webhookPort,
WebhookCertDir: webhookCertDir,
Expand Down
35 changes: 32 additions & 3 deletions v2/cmd/controller/app/setup.go
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,8 @@ import (
"context"
"fmt"
"math/rand"
"net/http"
"net/http/pprof"
"os"
"regexp"
"time"
Expand All @@ -31,6 +33,7 @@ import (
"sigs.k8s.io/controller-runtime/pkg/controller"
"sigs.k8s.io/controller-runtime/pkg/healthz"
"sigs.k8s.io/controller-runtime/pkg/manager"
"sigs.k8s.io/controller-runtime/pkg/metrics/filters"
"sigs.k8s.io/controller-runtime/pkg/metrics/server"
"sigs.k8s.io/controller-runtime/pkg/reconcile"
"sigs.k8s.io/controller-runtime/pkg/webhook"
Expand Down Expand Up @@ -131,9 +134,7 @@ func SetupControllerManager(ctx context.Context, setupLog logr.Logger, flgs Flag
LeaderElection: flgs.EnableLeaderElection,
LeaderElectionID: "controllers-leader-election-azinfra-generated",
HealthProbeBindAddress: flgs.HealthAddr,
Metrics: server.Options{
BindAddress: flgs.MetricsAddr,
},
Metrics: getMetricsOpts(flgs),
WebhookServer: webhook.NewServer(webhook.Options{
Port: flgs.WebhookPort,
CertDir: flgs.WebhookCertDir,
Expand Down Expand Up @@ -253,6 +254,34 @@ func SetupControllerManager(ctx context.Context, setupLog logr.Logger, flgs Flag
return mgr
}

func getMetricsOpts(flags Flags) server.Options {
var metricsOptions server.Options

if flags.SecureMetrics {
metricsOptions = server.Options{
BindAddress: flags.MetricsAddr,
SecureServing: true,
FilterProvider: filters.WithAuthenticationAndAuthorization,
}
// Note that pprof endpoints are meant to be sensitive and shouldn't be exposed publicly.
if flags.ProfilingMetrics {
metricsOptions.ExtraHandlers = map[string]http.Handler{
"/debug/pprof/": http.HandlerFunc(pprof.Index),
"/debug/pprof/cmdline": http.HandlerFunc(pprof.Cmdline),
"/debug/pprof/profile": http.HandlerFunc(pprof.Profile),
"/debug/pprof/symbol": http.HandlerFunc(pprof.Symbol),
"/debug/pprof/trace": http.HandlerFunc(pprof.Trace),
}
}
} else {
metricsOptions = server.Options{
BindAddress: flags.MetricsAddr,
}
}

return metricsOptions
}

func getDefaultAzureCredential(cfg config.Values, setupLog logr.Logger) (*identity.Credential, error) {
tokenCred, err := getDefaultAzureTokenCredential(cfg, setupLog)
if err != nil {
Expand Down
1 change: 0 additions & 1 deletion v2/config/manager/kustomization.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,5 @@ resources:
- manager_metrics_service.yaml

patchesStrategicMerge:
- manager_auth_proxy_patch.yaml
- manager_image_patch.yaml
- manager_pull_policy.yaml
6 changes: 4 additions & 2 deletions v2/config/manager/manager.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -38,7 +38,9 @@ spec:
serviceAccountName: default
containers:
- args:
- --metrics-addr=:8080
- --metrics-addr=:8443
- --secure-metrics=true
- --profiling-metrics=false
- --health-addr=:8081
- --enable-leader-election
- --v=2
Expand All @@ -47,7 +49,7 @@ spec:
- containerPort: 8081
name: health-port
protocol: TCP
- containerPort: 8080
- containerPort: 8443
name: metrics-port
protocol: TCP
livenessProbe:
Expand Down
24 changes: 0 additions & 24 deletions v2/config/manager/manager_auth_proxy_patch.yaml

This file was deleted.

2 changes: 1 addition & 1 deletion v2/config/manager/manager_metrics_service.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,6 @@ metadata:
spec:
ports:
- name: metrics
port: 8080
port: 8443
selector:
control-plane: controller-manager
13 changes: 0 additions & 13 deletions v2/config/rbac/auth_proxy_role.yaml

This file was deleted.

Loading

0 comments on commit 034c9e3

Please sign in to comment.