📖 document metrics scraping and enable metrics services via annotations #4247

Closed
4 changes: 4 additions & 0 deletions bootstrap/kubeadm/config/rbac/auth_proxy_service.yaml
@@ -1,6 +1,10 @@
apiVersion: v1
kind: Service
metadata:
  annotations:
Contributor

should these changes also be applied in each infra provider?

Contributor Author

I will go through the other infra providers and add the missing annotation.

Contributor Author

@bavarianbidi bavarianbidi Mar 4, 2021

Hi @CecileRobertMichon

I just created three PRs for providers under the kubernetes-sigs org, because the CLA is already signed there. I have to check the legal stuff for the remaining providers first. I hope this doesn't block this PR ;-)

Summary:

Contributor

what about Azure?

Contributor Author

The annotation was already there but was removed last year.

I created issue #1222 to get in contact with the Azure team.

Contributor Author

@bavarianbidi bavarianbidi Apr 16, 2021

Azure PR #1320 created and merged

    prometheus.io/scrape: "true"
    prometheus.io/port: "8443"
    prometheus.io/scheme: "https"
  labels:
    control-plane: controller-manager
  name: controller-manager-metrics-service
4 changes: 4 additions & 0 deletions config/rbac/auth_proxy_service.yaml
@@ -1,6 +1,10 @@
apiVersion: v1
kind: Service
metadata:
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "8443"
    prometheus.io/scheme: "https"
  labels:
    control-plane: controller-manager
  name: controller-manager-metrics-service
4 changes: 4 additions & 0 deletions controlplane/kubeadm/config/rbac/auth_proxy_service.yaml
@@ -1,6 +1,10 @@
apiVersion: v1
kind: Service
metadata:
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "8443"
    prometheus.io/scheme: "https"
  labels:
    control-plane: controller-manager
  name: controller-manager-metrics-service
76 changes: 76 additions & 0 deletions docs/book/src/reference/metrics.md
@@ -0,0 +1,76 @@
## Metrics

By default, controller-runtime builds a global Prometheus registry and
publishes a collection of performance metrics for each controller.

### Protecting the Metrics

These metrics are protected by [kube-rbac-proxy](https://github.com/brancz/kube-rbac-proxy)
by default.

You will need to grant permissions to your Prometheus server so that it can
scrape the protected metrics. To achieve that, you can create a `ClusterRole` and a
`ClusterRoleBinding` that binds to the service account your Prometheus server uses.

Create a YAML file named `capi-metrics-reader-clusterrole.yaml` with the following content:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: capi-metrics-reader
rules:
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
```

Comment on lines +18 to +24

Member

We may as well add this to the infra components YAML and remove this step for the end user.

It probably makes sense to leave out the cluster role binding, as we don't know the namespace Prometheus may be deployed into.

Contributor Author

That's the reason I documented it this way.

Member

@randomvariable so did I get it right that you suggest removing this YAML here and adding it to the infra component YAMLs of all providers? Maybe I'm missing something, but in case we really want to add this to our deployments, it looks to me like we need this ClusterRole only once.

But I'm really not sure if we should add this to our YAMLs so that it's deployed everywhere. Imho Prometheus and RBAC setups can vary and (as far as I'm aware) there was no recurring demand for this in Slack. I assume nobody is really missing this in our YAMLs at the moment.

I have no strong opinion against adding the ClusterRole to our YAMLs, but if we do, we should do it right, and adding it to all infra providers seems redundant to me.

Contributor Author

If #4640 is merged, there is no need to add these objects to the infra components YAML. Let's wait and see what happens with #4640 and discuss again.

Apply the `ClusterRole` with:

```bash
kubectl apply -f capi-metrics-reader-clusterrole.yaml
```

Run the following kubectl command to create a `ClusterRoleBinding` that grants your Prometheus instance access to the `/metrics` endpoint. `<namespace>` must be the namespace where your Prometheus instance is running, and `<service-account-name>` must be the service account that your Prometheus instance is configured to use.

```bash
kubectl create clusterrolebinding capi-metrics-reader --clusterrole=capi-metrics-reader --serviceaccount=<namespace>:<service-account-name>
```
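
To confirm that the binding took effect, one option is to ask the API server whether the service account may read `/metrics`. The following is a minimal sketch and not part of the original change: the helper name `check_metrics_access` is hypothetical, and it assumes your own user is allowed to impersonate service accounts via `--as`.

```bash
# Hypothetical helper (not part of Cluster API): asks the API server whether
# the given service account may GET the non-resource URL /metrics.
# Assumes the calling user has impersonation rights for service accounts.
check_metrics_access() {
  namespace="$1"        # namespace of the Prometheus instance
  service_account="$2"  # service account used by Prometheus
  kubectl auth can-i get /metrics \
    --as="system:serviceaccount:${namespace}:${service_account}"
}

# Example call (values are placeholders):
#   check_metrics_access <namespace> <service-account-name>
```

`kubectl auth can-i` prints `yes` or `no`, so a `no` here points at the `ClusterRole` or `ClusterRoleBinding` rather than at the Prometheus configuration.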

### Scraping the Metrics with Prometheus

To scrape metrics, your Prometheus instance needs at least the following [`kubernetes_sd_config`](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#kubernetes_sd_config) section:

```yaml
# This job is primarily used for Pods with multiple metrics ports.
# One service is created and scraped per port.
- job_name: 'kubernetes-service-endpoints'
  tls_config:
    # If service endpoints use their own CA (e.g. via cert-manager) that isn't
    # signed by the cluster-internal CA, we must skip the cert validation.
    insecure_skip_verify: true
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  kubernetes_sd_configs:
  - role: endpoints
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
    action: keep
    regex: true
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
    action: replace
    target_label: __scheme__
    regex: (https?)
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
    action: replace
    target_label: __metrics_path__
    regex: (.+)
  - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
    action: replace
    target_label: __address__
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:$2
  - action: labelmap
    regex: __meta_kubernetes_service_label_(.+)
```
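
The `__address__` rewrite is the least obvious rule in this job, so here is a worked sketch of what it does. Prometheus joins the `source_labels` values with `;` and matches the fully anchored regex against the result. The hypothetical shell function `rewrite_address` imitates that with `sed`. It is only an approximation: `sed -E` has no non-capturing groups or `\d`, so the group numbers differ from the Prometheus regex.

```bash
# Hypothetical illustration of the __address__ relabel rule, using sed.
# $1 = value of __address__, $2 = value of the prometheus.io/port annotation.
rewrite_address() {
  # Prometheus concatenates the source label values with ";".
  joined="$1;$2"
  # Anchored match; print only when the substitution succeeds.
  rewritten="$(printf '%s\n' "$joined" |
    sed -nE 's/^([^:]+)(:[0-9]+)?;([0-9]+)$/\1:\3/p')"
  if [ -n "$rewritten" ]; then
    printf '%s\n' "$rewritten"
  else
    # No match (e.g. annotation missing): the label is left unchanged.
    printf '%s\n' "$1"
  fi
}

rewrite_address "10.0.0.5:9443" "8443"   # prints 10.0.0.5:8443
rewrite_address "10.0.0.5" "8443"        # prints 10.0.0.5:8443
```

In other words, the port from the `prometheus.io/port` annotation replaces whatever port the endpoint was discovered with, which is how the `8443` set on the metrics services above ends up in the scrape target.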

You are now able to check for metrics in your Prometheus instance. To verify, you can search with e.g. `{namespace="capi-system"}` to get all metrics from components running in the `capi-system` namespace.
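
The same check can be scripted against the Prometheus HTTP API. This is a sketch under assumptions: `query_namespace_metrics` is a hypothetical helper, and `PROMETHEUS_URL` defaults to `http://localhost:9090`, which presumes something like a local port-forward to your Prometheus instance.

```bash
# Hypothetical helper: runs an instant query for all series in a namespace.
# PROMETHEUS_URL is an assumption; point it at your own Prometheus instance.
query_namespace_metrics() {
  namespace="$1"
  # -G sends the URL-encoded data as a query string on a GET request.
  curl -sG "${PROMETHEUS_URL:-http://localhost:9090}/api/v1/query" \
    --data-urlencode "query={namespace=\"${namespace}\"}"
}

# Example call:
#   query_namespace_metrics capi-system
```

The response is JSON, so piping it through a JSON formatter makes the returned series easier to read.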