Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for securing the operator metrics endpoint with RBAC and TLS #7567

Merged
merged 30 commits into from
Mar 7, 2024
Merged
Show file tree
Hide file tree
Changes from 29 commits
Commits
Show all changes
30 commits
Select commit Hold shift + click to select a range
c7dbb50
WIP
naemono Feb 14, 2024
2420f8d
Final settings enabling authentication for the metrics endpoint.
naemono Feb 15, 2024
60b5b90
add servicemonitor template.
naemono Feb 15, 2024
bdf972d
Move non-cluster related roles to proper location.
naemono Feb 20, 2024
2d9f885
Remove non-cluster scoped option.
naemono Feb 20, 2024
7bf352f
Add back missing cluster role binding.
naemono Feb 20, 2024
4c6eabd
Docs for secure metrics.
naemono Feb 22, 2024
b0e8e65
Update docs for enabling secure metrics.
naemono Feb 22, 2024
21cf390
Fix all helm templating issues
naemono Feb 22, 2024
6233643
Expand fail directive
naemono Feb 23, 2024
4739249
update fail directive #2
naemono Feb 23, 2024
1db9145
fix helm if statement
naemono Feb 23, 2024
0f3ae45
Merge branch 'main' into 480-add-metrics-auth
naemono Feb 23, 2024
664cffb
Move to using 'metricsHost' instead of 'metricsBindAddress'
naemono Feb 26, 2024
6916389
Add serviceMonitor to default values.
naemono Feb 26, 2024
361ca67
WIP
naemono Feb 26, 2024
bd6f90d
Add additional information to docs.
naemono Feb 27, 2024
566f8f0
Add preceeding '-'
naemono Feb 27, 2024
e45f691
Fix all metricsports missing '-'
naemono Feb 27, 2024
7c41f51
securemode.enabled, not enable.
naemono Feb 27, 2024
1f1bacf
remove todo item.
naemono Feb 27, 2024
4348ce8
remove metrics-host from helm values, as it's not needed.
naemono Feb 27, 2024
d862223
remove secure-metrics link.
naemono Feb 27, 2024
11c1742
Updated documentation with full instructions on using custom tls cert…
naemono Mar 1, 2024
144e6f1
Update to include the whole configmap, not a script to patch it.
naemono Mar 4, 2024
119150e
Review comments
naemono Mar 5, 2024
3e35723
Review comments
naemono Mar 5, 2024
2f89b3b
Review comments.
naemono Mar 5, 2024
848e8ed
Addressing review comments
naemono Mar 5, 2024
e8eccd1
Lowercasing operator, and chart.
naemono Mar 6, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 9 additions & 3 deletions cmd/manager/main.go
Original file line number Diff line number Diff line change
Expand Up @@ -286,7 +286,12 @@ func Command() *cobra.Command {
cmd.Flags().Int(
operator.MetricsPortFlag,
DefaultMetricPort,
"Port to use for exposing metrics in the Prometheus format (set 0 to disable)",
"Port to use for exposing metrics in the Prometheus format. (set 0 to disable)",
)
cmd.Flags().String(
operator.MetricsHostFlag,
"0.0.0.0",
fmt.Sprintf("The host to which the operator should bind to serve metrics in the Prometheus format. Will be combined with %s.", operator.MetricsPortFlag),
naemono marked this conversation as resolved.
Show resolved Hide resolved
)
cmd.Flags().StringSlice(
operator.NamespacesFlag,
Expand Down Expand Up @@ -577,11 +582,12 @@ func startOperator(ctx context.Context) error {

// only expose prometheus metrics if provided a non-zero port
metricsPort := viper.GetInt(operator.MetricsPortFlag)
metricsHost := viper.GetString(operator.MetricsHostFlag)
if metricsPort != 0 {
log.Info("Exposing Prometheus metrics on /metrics", "port", metricsPort)
log.Info("Exposing Prometheus metrics on /metrics", "bindAddress", fmt.Sprintf("%s:%d", metricsHost, metricsPort))
}
opts.Metrics = metricsserver.Options{
BindAddress: fmt.Sprintf(":%d", metricsPort), // 0 to disable
BindAddress: fmt.Sprintf("%s:%d", metricsHost, metricsPort), // 0 to disable
}

webhookPort := viper.GetInt(operator.WebhookPortFlag)
Expand Down
13 changes: 13 additions & 0 deletions deploy/eck-operator/templates/_helpers.tpl
Original file line number Diff line number Diff line change
Expand Up @@ -114,6 +114,19 @@ elastic-webhook-server
{{- end -}}
{{- end -}}

{{/*
Determine the metrics port
*/}}
{{- define "eck-operator.metrics.port" -}}
{{- if .Values.config.metrics.port -}}
{{- .Values.config.metrics.port -}}
{{- else if .Values.config.metricsPort -}}
{{- .Values.config.metricsPort -}}
{{- else -}}
0
{{- end -}}
{{- end -}}

{{/*
RBAC permissions
NOTE - any changes made to RBAC permissions below require
Expand Down
22 changes: 22 additions & 0 deletions deploy/eck-operator/templates/auth-proxy-service.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
{{- if .Values.config.metrics.secureMode.enabled }}
{{- $metricsPort := int (include "eck-operator.metrics.port" .)}}
apiVersion: v1
kind: Service
metadata:
labels:
app.kubernetes.io/name: {{ include "eck-operator.name" . }}-metrics-service
app.kubernetes.io/instance: {{ .Release.Name }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
helm.sh/chart: {{ include "eck-operator.chart" . }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
name: "{{ include "eck-operator.fullname" . }}-metrics"
namespace: {{ .Release.Namespace }}
spec:
ports:
- name: https
port: {{ $metricsPort }}
protocol: TCP
targetPort: metrics
selector:
{{- include "eck-operator.selectorLabels" . | nindent 4 }}
{{- end }}
25 changes: 25 additions & 0 deletions deploy/eck-operator/templates/cluster-roles.yaml
Original file line number Diff line number Diff line change
@@ -1,3 +1,6 @@
{{- if and (not .Values.createClusterScopedResources) (.Values.config.metrics.secureMode.enabled) -}}
{{ fail "createClusterScopedResources is required to set config.metrics.secureMode.enabled to true" }}
{{- end }}
{{- if .Values.createClusterScopedResources -}}
---
apiVersion: rbac.authorization.k8s.io/v1
Expand Down Expand Up @@ -93,4 +96,26 @@ rules:
- apiGroups: ["logstash.k8s.elastic.co"]
resources: ["logstashes"]
verbs: ["create", "delete", "deletecollection", "patch", "update"]
{{- if .Values.config.metrics.secureMode.enabled }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
labels:
{{- include "eck-operator.labels" . | nindent 4 }}
name: "{{ include "eck-operator.fullname" . }}-proxy-role"
rules:
- apiGroups:
- authentication.k8s.io
resources:
- tokenreviews
verbs:
- create
- apiGroups:
- authorization.k8s.io
resources:
- subjectaccessreviews
verbs:
- create
{{- end }}
{{- end -}}
10 changes: 9 additions & 1 deletion deploy/eck-operator/templates/configmap.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -8,8 +8,16 @@ metadata:
{{- include "eck-operator.labels" . | nindent 4 }}
data:
eck.yaml: |-
{{- $metricsPort := int (include "eck-operator.metrics.port" .)}}
log-verbosity: {{ int .Values.config.logVerbosity }}
metrics-port: {{ int .Values.config.metricsPort }}
{{- if and .Values.config.metrics.secureMode.enabled (eq $metricsPort 0) }}
{{- fail "config.metrics.port must be greater than 0 when config.metrics.secureMode.enabled is true" }}
{{- end }}
{{- if .Values.config.metrics.secureMode.enabled }}
metrics-port: {{ add $metricsPort 1 }}
{{- else }}
metrics-port: {{ $metricsPort }}
{{- end }}
container-registry: {{ .Values.config.containerRegistry }}
{{- with .Values.config.containerSuffix }}
container-suffix: {{ . }}
Expand Down
2 changes: 1 addition & 1 deletion deploy/eck-operator/templates/operator-network-policy.yaml
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
{{- if .Values.softMultiTenancy.enabled -}}
{{- $kubeAPIServerIP := (required "kubeAPIServerIP is required" .Values.kubeAPIServerIP) -}}
{{- $metricsPort := int .Values.config.metricsPort -}}
{{- $metricsPort := int (include "eck-operator.metrics.port" .)}}
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
Expand Down
12 changes: 9 additions & 3 deletions deploy/eck-operator/templates/podMonitor.yaml
Original file line number Diff line number Diff line change
@@ -1,10 +1,16 @@
{{- $metricsPort := int .Values.config.metricsPort -}}
{{- $metricsPort := int (include "eck-operator.metrics.port" .)}}
{{- if and .Values.config.metrics.secureMode.enabled (eq $metricsPort 0) }}
{{- fail "config.metrics.port must be greater than 0 when config.metrics.secureMode.enabled is true" }}
{{- end }}
{{- if and .Values.podMonitor.enabled (gt $metricsPort 0) }}
naemono marked this conversation as resolved.
Show resolved Hide resolved
{{- if and .Values.podMonitor.enabled .Values.config.metrics.secureMode.enabled }}
{{- fail "podMonitor and config.metrics.secureMode are mutually exclusive" }}
{{- end }}
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
name: {{ include "eck-operator.fullname" . }}
namespace: {{ ternary .Values.podMonitor.namespace .Release.Namespace (not (empty .Values.podMonitor.namespace)) }}
namespace: {{ ternary .Values.podMonitor.namespace .Release.Namespace (not (and (.Values.podMonitor) (empty .Values.podMonitor.namespace))) }}
labels: {{- include "eck-operator.labels" . | nindent 4 }}
{{- with .Values.podMonitor.labels }}
{{- toYaml . | nindent 4 }}
Expand Down Expand Up @@ -33,4 +39,4 @@ spec:
- {{ .Release.Namespace }}
selector:
matchLabels: {{- include "eck-operator.selectorLabels" . | nindent 6 }}
{{- end }}
{{- end }}
18 changes: 18 additions & 0 deletions deploy/eck-operator/templates/role-bindings.yaml
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
{{- $operatorNSIsManaged := has .Release.Namespace .Values.managedNamespaces -}}
{{- $fullName := include "eck-operator.fullname" . -}}
{{- $svcAccount := include "eck-operator.serviceAccountName" . }}
{{- $enableSecureMetrics := .Values.config.metrics.secureMode.enabled -}}

{{- if not .Values.createClusterScopedResources }}
{{- range .Values.managedNamespaces }}
Expand Down Expand Up @@ -77,4 +78,21 @@ subjects:
- kind: ServiceAccount
name: {{ $svcAccount }}
namespace: {{ $.Release.Namespace }}
{{- if $enableSecureMetrics }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
labels:
{{- include "eck-operator.labels" $ | nindent 4 }}
name: "{{ include "eck-operator.fullname" . }}-proxy-rolebinding"
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: "{{ include "eck-operator.fullname" . }}-proxy-role"
subjects:
- kind: ServiceAccount
name: {{ $svcAccount }}
namespace: {{ $.Release.Namespace }}
{{- end }}
{{- end }}
31 changes: 31 additions & 0 deletions deploy/eck-operator/templates/serviceMonitor.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,31 @@
{{- if .Values.config.metrics.secureMode.enabled }}
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: {{ include "eck-operator.fullname" . }}
namespace: {{ ternary .Values.serviceMonitor.namespace .Release.Namespace (not (and (.Values.serviceMonitor) (empty .Values.serviceMonitor.namespace))) }}
labels: {{- include "eck-operator.labels" . | nindent 4 }}
spec:
namespaceSelector:
matchNames:
- {{ .Release.Namespace }}
selector:
matchLabels:
app.kubernetes.io/name: {{ include "eck-operator.name" . }}-metrics-service
app.kubernetes.io/instance: {{ .Release.Name }}
endpoints:
- port: https
path: /metrics
scheme: https
interval: 30s
tlsConfig:
insecureSkipVerify: {{ .Values.config.metrics.secureMode.tls.insecureSkipVerify | default false }}
{{- if (not .Values.config.metrics.secureMode.tls.insecureSkipVerify) }}
{{- with .Values.config.metrics.secureMode.tls.caSecret }}
{{- $leading_path := trimSuffix "/" .Values.config.metrics.secureMode.tls.caMountDirectory }}
caFile: "{{ $leading_path }}/{{ . }}/ca.crt"
{{- end }}
serverName: "{{ include "eck-operator.fullname" . }}-metrics.{{ .Release.Namespace }}.svc"
{{- end }}
bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
{{- end }}
49 changes: 45 additions & 4 deletions deploy/eck-operator/templates/statefulset.yaml
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
{{- $metricsPort := int .Values.config.metricsPort -}}
---
{{- $metricsPort := int (include "eck-operator.metrics.port" .)}}
apiVersion: apps/v1
kind: StatefulSet
metadata:
Expand Down Expand Up @@ -79,10 +79,10 @@ spec:
resources:
{{- toYaml . | nindent 12 }}
{{- end }}
{{- if or (gt $metricsPort 0) .Values.webhook.enabled }}
{{- if or .Values.webhook.enabled (gt $metricsPort 0) }}
ports:
{{- if (gt $metricsPort 0) }}
- containerPort: {{ .Values.config.metricsPort }}
{{- if and (gt $metricsPort 0) (not .Values.config.metrics.secureMode.enabled) }}
- containerPort: {{ $metricsPort }}
name: metrics
protocol: TCP
{{- end }}
Expand All @@ -104,6 +104,41 @@ spec:
{{- with .Values.volumeMounts }}
{{- toYaml . | nindent 12 }}
{{- end }}
{{- if .Values.config.metrics.secureMode.enabled }}
- name: kube-rbac-proxy
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- "ALL"
image: gcr.io/kubebuilder/kube-rbac-proxy:v0.15.0
args:
- "--secure-listen-address=0.0.0.0:{{ $metricsPort }}"
- "--upstream=http://127.0.0.1:{{ add $metricsPort 1 }}/"
- "--logtostderr=true"
- "--v=0"
{{- if .Values.config.metrics.secureMode.tls.certificateSecret }}
- "--tls-cert-file=/tls/tls.crt"
- "--tls-private-key-file=/tls/tls.key"
{{- end }}
{{- if .Values.config.metrics.secureMode.tls.certificateSecret }}
volumeMounts:
- mountPath: "/tls"
name: tls-certificate
readOnly: true
{{- end }}
ports:
- containerPort: {{ $metricsPort }}
protocol: TCP
name: metrics
resources:
limits:
cpu: 500m
memory: 128Mi
requests:
cpu: 5m
memory: 64Mi
{{- end }}
volumes:
- name: conf
configMap:
Expand All @@ -114,6 +149,12 @@ spec:
defaultMode: 420
secretName: {{ include "eck-operator.webhookSecretName" . }}
{{- end }}
{{- if .Values.config.metrics.secureMode.tls.certificateSecret }}
- name: tls-certificate
secret:
defaultMode: 420
secretName: {{ .Values.config.metrics.secureMode.tls.certificateSecret }}
{{- end }}
{{- with .Values.volumes }}
{{- toYaml . | nindent 8 }}
{{- end }}
Expand Down
74 changes: 71 additions & 3 deletions deploy/eck-operator/values.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -159,8 +159,67 @@ config:
# number greater than 0: Errors, warnings, information, and debug details.
logVerbosity: "0"

# metricsPort defines the port to expose operator metrics. Set to 0 to disable metrics reporting.
metricsPort: "0"
# (Deprecated: use metrics.port: will be removed in v2.14.0) metricsPort defines the port to expose operator metrics. Set to 0 to disable metrics reporting.
metricsPort: 0

metrics:
# port defines the port to expose operator metrics. Set to 0 to disable metrics reporting.
port: "0"
thbkrkr marked this conversation as resolved.
Show resolved Hide resolved
# secureMode contains the options for enabling and configuring RBAC and TLS/HTTPs for the metrics endpoint.
secureMode:
# secureMode.enabled specifies whether to enable RBAC and TLS/HTTPs for the metrics endpoint. (Will be enabled by default in v2.14.0)
# * This option requires using a ServiceMonitor to scrape the metrics and as such is mutually exclusive with the podMonitor.enabled option.
# * This option also requires using cluster scoped resources (ClusterRole, ClusterRoleBinding) to
# grant access to the /metrics endpoint. (createClusterScopedResources: true is required)
#
# This option requires the following settings within Prometheus to function:
# 1. RBAC settings for the Prometheus instance to access the metrics endpoint.
#
# - nonResourceURLs:
# - /metrics
# verbs:
# - get
#
# 2. If using the Prometheus Operator and your Prometheus instance is not in the same namespace as the operator you will need
# the Prometheus Operator configured with the following Helm values:
#
# prometheus:
# prometheusSpec:
# serviceMonitorNamespaceSelector: {}
# serviceMonitorSelectorNilUsesHelmValues: false
enabled: false
tls:
# certificateSecret is the name of the tls secret containing the custom TLS certificate and key for the secure metrics endpoint.
#
# * This is an optional setting and is only required if you are using a custom TLS certificate. A self-signed certificate will be generated by default.
# * TLS secret key must be named tls.crt.
# * TLS key's secret key must be named tls.key.
# * It is assumed to be in the same namespace as the ServiceMonitor.
#
# example: kubectl create secret tls eck-metrics-tls-certificate -n elastic-system \
# --cert=/path/to/tls.crt --key=/path/to/tls.key
certificateSecret: ""
# caSecret is the name of the secret containing the custom CA certificate used to generate the custom TLS certificate for the secure metrics endpoint.
#
# * This *must* be the name of the secret containing the CA certificate used to sign the custom TLS certificate.
# * This secret *must* be in the same namespace as the Prometheus instance that will scrape the metrics.
# * If using the Prometheus operator this secret must be within the `spec.secrets` field of the `Prometheus` custom resource such that it is mounted into the Prometheus pod at `caMountDirectory`, which defaults to /etc/prometheus/secrets/{secret-name}.
# * This is an optional setting and is only required if you are using a custom TLS certificate.
# * Key must be named ca.crt.
#
# example: kubectl create secret generic eck-metrics-tls-ca -n monitoring \
# --from-file=ca.crt=/path/to/ca.pem
caSecret: ""
# caMountDirectory is the directory at which the CA certificate is mounted within the Prometheus pod.
#
# * You should only need to adjust this if you are *not* using the Prometheus operator.
caMountDirectory: "/etc/prometheus/secrets/"
# insecureSkipVerify specifies whether to skip verification of the TLS certificate for the secure metrics endpoint.
#
# * If this setting is set to false, then the following settings are required:
# - certificateSecret
# - caSecret
insecureSkipVerify: true

# containerRegistry to use for pulling Elasticsearch and other application container images.
containerRegistry: docker.elastic.co
Expand Down Expand Up @@ -232,7 +291,7 @@ config:
podMonitor:

# enabled determines whether a podMonitor should deployed to scrape the eck metrics.
# This requires the prometheus operator and the config.metricsPort not to be 0
# This requires the prometheus operator and the config.metrics.port not to be 0
enabled: false

# labels adds additional labels to the podMonitor
Expand All @@ -258,6 +317,15 @@ podMonitor:
podMetricsEndpointConfig: {}
# honorTimestamps: true

# Prometheus ServiceMonitor configuration
# Only used when config.enableSecureMetrics is true
# Reference: https://github.com/prometheus-operator/prometheus-operator/blob/master/Documentation/api.md#servicemonitor
serviceMonitor: {}

# namespace determines in which namespace the serviceMonitor will be deployed.
# If not set the serviceMonitor will be created in the namespace where the Helm release is installed into
# namespace: monitoring

# Globals meant for internal use only
global:
# manifestGen specifies whether the chart is running under manifest generator.
Expand Down
Loading