Move Prometheus as an Add-On (linkerd#4362)
This moves Prometheus to an add-on, making it optional but enabled by default. It also makes `linkerd-prometheus` more configurable and allows it to have its own life cycle for upgrades, configuration, etc.

This work will be followed by documentation that helps users configure an existing Prometheus to work with Linkerd.

**Changes Include:**
- moving the Prometheus manifests into a separate chart at `charts/add-ons/prometheus`, and adding it as a dependency of `linkerd2`
- implementing the `addOn` interface so that the CLI supports the same
- including the add-on's configuration in `linkerd-config-addons`

**User Facing Changes:**
The default install experience does not change much, but users who have already configured Prometheus differently will need to apply the same settings using the new configuration fields listed in the chart README.
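
As a rough illustration of that migration, the override file below maps a few of the removed top-level `prometheus*` values onto the new `prometheus.*` fields from the chart README. The file itself and the Alertmanager target are hypothetical, and whether a partial `args` override merges with the defaults may depend on how you install, so treat this as a sketch rather than a tested configuration.

```yaml
# Hypothetical override file for the parent linkerd2 chart (illustration only).
prometheus:
  enabled: true                       # add-on is enabled by default
  image: prom/prometheus:v2.15.2      # previously `prometheusImage`
  args:                               # replaces `prometheusLogLevel` and `prometheusExtraArgs`
    storage.tsdb.path: /data
    storage.tsdb.retention.time: 6h
    config.file: /etc/prometheus/prometheus.yml
    log.level: debug
  persistence:                        # previously `prometheusPersistence.*`; the template
    accessMode: ReadWriteOnce         # now checks only that this block is set
    size: 8Gi
  alertManagers:                      # previously `prometheusAlertmanagers`
    - static_configs:
        - targets:
            - alertmanager.example.svc:9093   # assumed address, not from the commit
```

With Helm, a file like this would be passed to the parent `linkerd2` chart with `-f`; with the Linkerd CLI, the add-on's `values.yaml` points to the `--addon-config` flag instead.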

Signed-off-by: Eric Solomon <errcsool@engineer.com>
Pothulapati authored and Eric Solomon committed Jul 15, 2020
1 parent 07dadd5 commit f84bbc4
Showing 73 changed files with 21,411 additions and 17,868 deletions.
22 changes: 22 additions & 0 deletions charts/add-ons/prometheus/.helmignore
@@ -0,0 +1,22 @@
# Patterns to ignore when building packages.
# This supports shell glob matching, relative path matching, and
# negation (prefixed with !). Only one pattern per line.
.DS_Store
# Common VCS dirs
.git/
.gitignore
.bzr/
.bzrignore
.hg/
.hgignore
.svn/
# Common backup files
*.swp
*.bak
*.tmp
*~
# Various IDEs
.project
.idea/
*.tmproj
.vscode/
9 changes: 9 additions & 0 deletions charts/add-ons/prometheus/Chart.yaml
@@ -0,0 +1,9 @@
apiVersion: v1
appVersion: "1.0"
description: A Helm chart for the prometheus add-on in Linkerd
name: prometheus
version: 0.1.0
maintainers:
- name: Linkerd authors
email: cncf-linkerd-dev@lists.cncf.io
url: https://linkerd.io/
4 changes: 4 additions & 0 deletions charts/add-ons/prometheus/requirements.yaml
@@ -0,0 +1,4 @@
dependencies:
- name: partials
version: 0.1.0
repository: file://../../partials
Original file line number Diff line number Diff line change
@@ -15,15 +15,10 @@ metadata:
{{.Values.global.createdByAnnotation}}: {{default (printf "linkerd/helm %s" .Values.global.linkerdVersion) .Values.global.cliVersion}}
data:
prometheus.yml: |-
{{- if .Values.prometheusAlertmanagers }}
alerting:
alertmanagers:
{{- toYaml .Values.prometheusAlertmanagers | trim | nindent 8 }}
{{- end }}
global:
scrape_interval: 10s
scrape_timeout: 10s
evaluation_interval: 10s
{{- if .Values.globalConfig -}}
{{- toYaml .Values.globalConfig | trim | nindent 6 }}
{{- end}}
rule_files:
- /etc/prometheus/*_rules.yml
@@ -34,7 +29,6 @@ data:
static_configs:
- targets: ['localhost:9090']
{{ if .Values.grafana.enabled -}}
- job_name: 'grafana'
kubernetes_sd_configs:
- role: pod
@@ -45,7 +39,6 @@ data:
- __meta_kubernetes_pod_container_name
action: keep
regex: ^grafana$
{{- end}}
# Required for: https://grafana.com/grafana/dashboards/315
- job_name: 'kubernetes-nodes-cadvisor'
@@ -54,7 +47,6 @@ data:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
insecure_skip_verify: true
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
kubernetes_sd_configs:
- role: node
relabel_configs:
@@ -153,6 +145,27 @@ data:
# Copy tmp labels into real labels
- action: labelmap
regex: __tmp_pod_label_(.+)
{{- if .Values.scrapeConfigs }}
{{- toYaml .Values.scrapeConfigs | trim | nindent 4 }}
{{- end }}
{{- if (or .Values.alertManagers .Values.alertRelabelConfigs) }}
alerting:
alert_relabel_configs:
{{- if .Values.alertRelabelConfigs }}
{{- toYaml .Values.alertRelabelConfigs | trim | nindent 6 }}
{{- end }}
alertmanagers:
{{- if .Values.alertManagers }}
{{- toYaml .Values.alertManagers | trim | nindent 6 }}
{{- end }}
{{- end }}
{{- if .Values.remoteWrite }}
remote_write:
{{- toYaml .Values.remoteWrite | trim | nindent 4 }}
{{- end }}
---
kind: Service
apiVersion: v1
@@ -191,7 +204,7 @@ metadata:
namespace: {{.Values.global.namespace}}
spec:
replicas: 1
{{- if .Values.prometheusPersistence.enabled }}
{{- if .Values.persistence }}
strategy:
type: Recreate
{{- end }}
@@ -219,14 +232,10 @@ spec:
fsGroup: 65534
containers:
- args:
- --storage.tsdb.path=/data
- --storage.tsdb.retention.time=6h
- --config.file=/etc/prometheus/prometheus.yml
- --log.level={{lower .Values.prometheusLogLevel}}
{{- range $key, $value := .Values.prometheusExtraArgs}}
{{- range $key, $value := .Values.args}}
- --{{ $key }}{{ if $value }}={{ $value }}{{ end }}
{{- end }}
image: {{.Values.prometheusImage}}
image: {{.Values.image}}
imagePullPolicy: {{.Values.global.imagePullPolicy}}
livenessProbe:
httpGet:
@@ -244,15 +253,15 @@ spec:
port: 9090
initialDelaySeconds: 30
timeoutSeconds: 30
{{- if .Values.prometheusResources -}}
{{- include "partials.resources" .Values.prometheusResources | nindent 8 }}
{{- if .Values.resources -}}
{{- include "partials.resources" .Values.resources | nindent 8 }}
{{- end }}
securityContext:
runAsNonRoot: true
runAsUser: 65534
runAsGroup: 65534
volumeMounts:
{{- range .Values.prometheusRuleConfigMapMounts }}
{{- range .Values.ruleConfigMapMounts }}
- name: {{ .name }}
mountPath: /etc/prometheus/{{ .subPath }}
subPath: {{ .subPath }}
@@ -264,25 +273,27 @@ spec:
name: prometheus-config
subPath: prometheus.yml
readOnly: true
{{- $tree := deepCopy . }}
{{- if not (empty .Values.prometheusProxyResources) }}
{{- $r := merge .Values.prometheusProxyResources .Values.global.proxy.resources }}
{{- $tree := deepCopy . }}
{{- if not (empty .Values.proxy) }}
{{- if not (empty .Values.proxy.resources) }}
{{- $r := merge .Values.proxy.resources .Values.global.proxy.resources }}
{{- $_ := set $tree.Values.global.proxy "resources" $r }}
{{- end }}
{{- end }}
- {{- include "partials.proxy" $tree | indent 8 | trimPrefix (repeat 7 " ") }}
{{ if not .Values.global.cniEnabled -}}
initContainers:
- {{- include "partials.proxy-init" . | indent 8 | trimPrefix (repeat 7 " ") }}
{{ end -}}
serviceAccountName: linkerd-prometheus
volumes:
{{- range .Values.prometheusRuleConfigMapMounts }}
{{- range .Values.ruleConfigMapMounts }}
- name: {{ .name }}
configMap:
name: {{ .configMap }}
{{- end }}
- name: data
{{- if .Values.prometheusPersistence.enabled }}
{{- if .Values.persistence }}
persistentVolumeClaim:
claimName: linkerd-prometheus
{{- else }}
@@ -298,7 +309,7 @@ spec:
- {{- include "partials.proxyInit.volumes.xtables" . | indent 8 | trimPrefix (repeat 7 " ") }}
{{ end -}}
- {{- include "partials.proxy.volumes.identity" . | indent 8 | trimPrefix (repeat 7 " ") }}
{{- if .Values.prometheusPersistence.enabled }}
{{- if .Values.persistence }}
---
kind: PersistentVolumeClaim
apiVersion: v1
@@ -312,11 +323,11 @@ metadata:
namespace: {{.Values.global.namespace}}
spec:
accessModes:
- {{ .Values.prometheusPersistence.accessMode | quote }}
- {{ .Values.persistence.accessMode | quote }}
resources:
requests:
storage: {{ .Values.prometheusPersistence.size | quote }}
{{- if .Values.prometheusPersistence.storageClass }}
storageClassName: "{{ .Values.prometheusPersistence.storageClass }}"
storage: {{ .Values.persistence.size | quote }}
{{- if .Values.persistence.storageClass }}
storageClassName: "{{ .Values.persistence.storageClass }}"
{{- end }}
{{- end }}
15 changes: 15 additions & 0 deletions charts/add-ons/prometheus/values.yaml
@@ -0,0 +1,15 @@
# This add-on's default property values are declared in `charts/add-ons/prometheus/values.yaml`.
# If installing/upgrading with Helm, you can override them through the usual `--set` or `-f` flags
# when installing with the parent linkerd2 chart
# Do not override them in this file!
# If installing/upgrading with linkerd's CLI, use the `--addon-config` flag.
image: prom/prometheus:v2.15.2
args:
storage.tsdb.path: /data
storage.tsdb.retention.time: 6h
config.file: /etc/prometheus/prometheus.yml
log.level: info
globalConfig:
scrape_interval: 10s
scrape_timeout: 10s
evaluation_interval: 10s
37 changes: 26 additions & 11 deletions charts/linkerd2/README.md
@@ -154,17 +154,6 @@ their default values.
| `identityPoxyResources` | CPU and Memory resources required by proxy injected into identity pod (see `global.proxy.resources` for sub-fields) | values in `global.proxy.resources` |
| `installNamespace` | Set to false when installing Linkerd in a custom namespace. See the [Linkerd documentation](https://linkerd.io/2/tasks/install-helm/#customizing-the-namespace) for more information. | `true` |
| `omitWebhookSideEffects` | Omit the `sideEffects` flag in the webhook manifests | `false` |
| `prometheusAlertmanagers` | Alertmanager instances the Prometheus server sends alerts to configured via the static_configs parameter. | `[]` |
| `prometheusExtraArgs` | Extra command line options for Prometheus | `{}` |
| `prometheusImage` | Docker image for the Prometheus container | `prom/prometheus:v2.15.2` |
| `prometheusLogLevel` | Log level for Prometheus | `info` |
| `prometheusResources` | CPU and Memory resources required by prometheus (see `global.proxy.resources` for sub-fields) | |
| `prometheusProxyResources` | CPU and Memory resources required by proxy injected into prometheus pod (see `global.proxy.resources` for sub-fields) | values in `global.proxy.resources` |
| `prometheusPersistence.enabled` | Boolean value to enable creation and mounting of PVC for prometheus data. | `false` |
| `prometheusPersistence.storageClass` | Storage class used to create prometheus data PV. | `nil` |
| `prometheusPersistence.accessMode` | PVC access mode. | `ReadWriteOnce` |
| `prometheusPersistence.size` | Prometheus data volume size. | `8Gi` |
| `prometheusRuleConfigMapMounts` | Alerting/recording rule ConfigMap mounts (sub-path names must end in `_rules.yml` or `_rules.yaml`) | `[]` |
| `proxyInjector.externalSecret` | Do not create a secret resource for the profileValidator webhook. If this is set to `true`, the value `proxyInjector.caBundle` must be set (see below). | false |
| `proxyInjector.crtPEM` | Certificate for the proxy injector. If not provided then Helm will generate one. | |
| `proxyInjector.keyPEM` | Certificate key for the proxy injector. If not provided then Helm will generate one. | |
@@ -218,6 +207,32 @@ The following table lists the configurable parameters for the Grafana Add-On.
| `grafana.resources.memory.request` | Amount of memory that the grafana container requests ||
| `grafana.proxy.resources` | Structure analog to the `resources` fields above, but overriding the resources of the linkerd proxy injected into the grafana pod. | values in `global.proxy.resources` of the linkerd2 chart. |

### Prometheus Add-On

The following table lists the configurable parameters for the Prometheus Add-On.

| Parameter | Description | Default |
|:--------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-------------------------------------|
| `prometheus.enabled` | Flag to enable prometheus instance to be installed | `true` |
| `prometheus.alert_relabel_configs` | Alert relabeling is applied to alerts before they are sent to the Alertmanager. | `[]` |
| `prometheus.alertManagers` | Alertmanager instances the Prometheus server sends alerts to configured via the static_configs parameter. | `[]` |
| `prometheus.args` | Command line options for Prometheus binary | `storage.tsdb.path: /data, storage.tsdb.retention.time: 6h, config.file: /etc/prometheus/prometheus.yml, log.level: *controller_log_level` |
| `prometheus.globalConfig` | The global configuration specifies parameters that are valid in all other configuration contexts. | `scrape_interval: 10s, scrape_timeout: 10s, evaluation_interval: 10s` |
| `prometheus.image` | Docker image for the prometheus instance | `prom/prometheus:v2.15.2` |
| `prometheus.proxy.resources` | CPU and Memory resources required by proxy injected into prometheus pod (see `global.proxy.resources` for sub-fields) | values in `global.proxy.resources` |
| `prometheus.persistence.storageClass` | Storage class used to create prometheus data PV. | `nil` |
| `prometheus.persistence.accessMode` | PVC access mode. | `ReadWriteOnce` |
| `prometheus.persistence.size` | Prometheus data volume size. | `8Gi` |
| `prometheus.resources.cpu.limit` | Maximum amount of CPU units that the prometheus container can use ||
| `prometheus.resources.cpu.request` | Amount of CPU units that the prometheus container requests ||
| `prometheus.resources.memory.limit` | Maximum amount of memory that prometheus container can use ||
| `prometheus.resources.memory.request` | Amount of memory that the prometheus container requests ||
| `prometheus.ruleConfigMapMounts` | Alerting/recording rule ConfigMap mounts (sub-path names must end in `_rules.yml` or `_rules.yaml`) | `[]` |
| `prometheus.scrapeConfigs` | A scrape_config section specifies a set of targets and parameters describing how to scrape them. | `[]` |

Most of the above configuration match directly with the official Prometheus
configuration which can be found [here](https://prometheus.io/docs/prometheus/latest/configuration/configuration)

### Tracing Add-On

The following table lists the configurable parameters for the Tracing Add-On.
7 changes: 5 additions & 2 deletions charts/linkerd2/requirements.lock
@@ -2,11 +2,14 @@ dependencies:
- name: partials
repository: file://../partials
version: 0.1.0
- name: prometheus
repository: file://../add-ons/prometheus
version: 0.1.0
- name: grafana
repository: file://../add-ons/grafana
version: 0.1.0
- name: tracing
repository: file://../add-ons/tracing
version: 0.1.0
digest: sha256:f92907b6d243e3b57b4288603ba76eced7c2f4ef913e76505c314971bb4afa21
generated: "2020-05-11T14:13:54.306010536-05:00"
digest: sha256:d2428770ae7d5134c5af6521c78a4c5f95da4c75f21bdea0f74fad6ab6e2e044
generated: "2020-06-24T11:07:53.924602129Z"
4 changes: 4 additions & 0 deletions charts/linkerd2/requirements.yaml
@@ -2,6 +2,10 @@ dependencies:
- name: partials
version: 0.1.0
repository: file://../partials
- name: prometheus
version: 0.1.0
repository: file://../add-ons/prometheus
condition: prometheus.enabled
- name: grafana
version: 0.1.0
repository: file://../add-ons/grafana
6 changes: 4 additions & 2 deletions charts/linkerd2/templates/controller.yaml
@@ -72,10 +72,12 @@ spec:
containers:
- args:
- public-api
- -prometheus-url=http://linkerd-prometheus.{{.Values.global.namespace}}.svc.{{.Values.global.clusterDomain}}:9090
- -destination-addr=linkerd-dst.{{.Values.global.namespace}}.svc.{{.Values.global.clusterDomain}}:8086
- -controller-namespace={{.Values.global.namespace}}
- -log-level={{.Values.controllerLogLevel}}
- -log-level={{.Values.global.controllerLogLevel}}
{{- if .Values.prometheus.enabled }}
- -prometheus-url=http://linkerd-prometheus.{{.Values.global.namespace}}.svc.{{.Values.global.clusterDomain}}:9090
{{- end}}
{{- include "partials.linkerd.trace" . | nindent 8 -}}
image: {{.Values.controllerImage}}:{{default .Values.global.linkerdVersion .Values.global.controllerImageVersion}}
imagePullPolicy: {{.Values.global.imagePullPolicy}}