[nr-k8s-otel-collector] Add top level controls (#1352)
#### Checklist
- [x] Chart Version bumped
- [x] Variables are documented in the README.md
- [x] Title of the PR starts with chart name (e.g. `[mychartname]`)

---------

Co-authored-by: csongnr <115833851+csongnr@users.noreply.github.com>
Co-authored-by: chris <csong@newrelic.com>
3 people authored May 29, 2024
1 parent fd3b9cd commit 3e7dc5d
Showing 15 changed files with 341 additions and 83 deletions.
2 changes: 1 addition & 1 deletion charts/nr-k8s-otel-collector/Chart.yaml
@@ -17,7 +17,7 @@ type: application
# This is the chart version. This version number should be incremented each time you make changes
# to the chart and its templates, including the app version.
# Versions are expected to follow Semantic Versioning (https://semver.org/)
version: 0.1.2
version: 0.2.0

dependencies:
- name: common-library
51 changes: 39 additions & 12 deletions charts/nr-k8s-otel-collector/README.md
@@ -18,7 +18,7 @@ You can install this chart directly from this Helm repository:

```shell
helm repo add newrelic https://helm-charts.newrelic.com
helm upgrade nr-k8s-otel-collector newrelic/nr-k8s-otel-collector -f your-custom-values.yaml -n newrelic --create-namespace --install
```
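A minimal `your-custom-values.yaml` for this command might look like the sketch below. `cluster` and `licenseKey` are the values documented as mandatory in the table further down; the values shown here are placeholders, not real credentials:

```yaml
# Name of the Kubernetes cluster being monitored (mandatory).
cluster: my-cluster
# New Relic ingest license key (placeholder value — substitute your own).
licenseKey: 0123456789abcdef0123456789abcdef01234567
```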

## Confirm installation
@@ -64,25 +64,52 @@ Options that can be defined globally include `affinity`, `nodeSelector`, `tolera

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| affinity | object | `{}` | Sets pod/node affinities |
| affinity | object | `{}` | Sets all pods' affinities. Can be configured also with `global.affinity` |
| cluster | string | `""` | Name of the Kubernetes cluster monitored. Mandatory. Can be configured also with `global.cluster` |
| containerSecurityContext | object | `{}` | Sets all security context (at container level). Can be configured also with `global.securityContext.container` |
| customSecretLicenseKey | string | `""` | In case you don't want to have the license key in your values, this allows you to point to the secret key where the license key is located. Can be configured also with `global.customSecretLicenseKey` |
| customSecretName | string | `""` | In case you don't want to have the license key in your values, this allows you to point to a user-created secret to get the key from. Can be configured also with `global.customSecretName` |
| daemonset.affinity | object | `{}` | Sets daemonset pod affinities. Overrides `affinity` and `global.affinity` |
| daemonset.containerSecurityContext | object | `{}` | Sets security context (at container level) for the daemonset. Overrides `containerSecurityContext` and `global.containerSecurityContext` |
| daemonset.nodeSelector | object | `{}` | Sets daemonset pod node selector. Overrides `nodeSelector` and `global.nodeSelector` |
| daemonset.podAnnotations | object | `{}` | Annotations to be added to the daemonset. |
| daemonset.podSecurityContext | object | `{}` | Sets security context (at pod level) for the daemonset. Overrides `podSecurityContext` and `global.podSecurityContext` |
| daemonset.resources | object | `{}` | Sets resources for the daemonset. |
| daemonset.tolerations | list | `[]` | Sets daemonset pod tolerations. Overrides `tolerations` and `global.tolerations` |
| deployment.affinity | object | `{}` | Sets deployment pod affinities. Overrides `affinity` and `global.affinity` |
| deployment.containerSecurityContext | object | `{}` | Sets security context (at container level) for the deployment. Overrides `containerSecurityContext` and `global.containerSecurityContext` |
| deployment.nodeSelector | object | `{}` | Sets deployment pod node selector. Overrides `nodeSelector` and `global.nodeSelector` |
| deployment.podAnnotations | object | `{}` | Annotations to be added to the deployment. |
| deployment.podSecurityContext | object | `{}` | Sets security context (at pod level) for the deployment. Overrides `podSecurityContext` and `global.podSecurityContext` |
| deployment.resources | object | `{}` | Sets resources for the deployment. |
| deployment.tolerations | list | `[]` | Sets deployment pod tolerations. Overrides `tolerations` and `global.tolerations` |
| dnsConfig | object | `{}` | Sets pod's dnsConfig. Can be configured also with `global.dnsConfig` |
| image.pullPolicy | string | `"IfNotPresent"` | The pull policy defaults to `IfNotPresent`, which skips pulling an image if it already exists on the node. |
| image.repository | string | `"otel/opentelemetry-collector-contrib"` | OTel collector image to be deployed. You can use your own collector as long as it meets the requirements mentioned below. |
| image.tag | string | `"0.91.0"` | Overrides the image tag whose default is the chart appVersion. |
| kube-state-metrics.enabled | bool | `true` | Install the [`kube-state-metrics` chart](https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-state-metrics) from the stable helm charts repository. This is mandatory if `infrastructure.enabled` is set to `true` and the user does not provide its own instance of KSM version >=1.8 and <=2.0. Note, kube-state-metrics v2+ disables labels/annotations metrics by default. You can enable the target labels/annotations metrics to be monitored by using the metricLabelsAllowlist/metricAnnotationsAllowList options described [here](https://github.com/prometheus-community/helm-charts/blob/159cd8e4fb89b8b107dcc100287504bb91bf30e0/charts/kube-state-metrics/values.yaml#L274) in your Kubernetes clusters. |
| kube-state-metrics.prometheusScrape | bool | `false` | Disable prometheus from auto-discovering KSM and potentially scraping duplicated data |
| labels | object | `{}` | Additional labels for chart objects |
| licenseKey | string | `""` | Sets the license key to use. Can be configured also with `global.licenseKey` |
| nodeSelector | object | `{}` | Sets pod's node selector. Can be configured also with `global.nodeSelector` |
| nodeSelector | object | `{}` | Sets all pods' node selector. Can be configured also with `global.nodeSelector` |
| nrStaging | bool | `false` | Send the metrics to the staging backend. Requires a valid staging license key. Can be configured also with `global.nrStaging` |
| podAnnotations | object | `{}` | Annotations to be added to each pod created by the chart |
| podSecurityContext | object | `{}` | Sets security context (at pod level). Can be configured also with `global.podSecurityContext` |
| resources | object | `{}` | The default set of resources assigned to the pods is shown below: |
| securityContext | object | `{"privileged":true}` | Sets security context (at container level). Can be configured also with `global.podSecurityContext` |
| tolerations | list | `[]` | Sets pod's tolerations to node taints. Can be configured also with `global.tolerations` |
| podLabels | object | `{}` | Additional labels for chart pods |
| podSecurityContext | object | `{}` | Sets all security contexts (at pod level). Can be configured also with `global.securityContext.pod` |
| priorityClassName | string | `""` | Sets pod's priorityClassName. Can be configured also with `global.priorityClassName` |
| rbac.create | bool | `true` | Specifies whether RBAC resources should be created |
| receivers.filelog.enabled | bool | `true` | Specifies whether the `filelog` receiver is enabled |
| receivers.hostmetrics.enabled | bool | `true` | Specifies whether the `hostmetrics` receiver is enabled |
| receivers.k8sCluster.enabled | bool | `true` | Specifies whether the `k8s_cluster` receiver is enabled |
| receivers.k8sEvents.enabled | bool | `true` | Specifies whether the `k8s_events` receiver is enabled |
| receivers.kubeletstats.enabled | bool | `true` | Specifies whether the `kubeletstats` receiver is enabled |
| receivers.prometheus.enabled | bool | `true` | Specifies whether the `prometheus` receiver is enabled |
| serviceAccount | object | See `values.yaml` | Settings controlling ServiceAccount creation |
| serviceAccount.create | bool | `true` | Specifies whether a ServiceAccount should be created |
| tolerations | list | `[]` | Sets all pods' tolerations to node taints. Can be configured also with `global.tolerations` |
| verboseLog | bool | `false` | Enables debug logs for this integration, or for all integrations if set globally. Can be configured also with `global.verboseLog` |

**Note:** If all receivers are disabled in the deployment or in the daemonset, the agent will not start.
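As an illustration of that note, a values file like the following sketch (using the receiver keys documented in the table above) disables every receiver, and the resulting collectors would fail to start:

```yaml
# Disabling all receivers leaves the pipelines empty — the agent will not start.
receivers:
  filelog:
    enabled: false
  hostmetrics:
    enabled: false
  k8sCluster:
    enabled: false
  k8sEvents:
    enabled: false
  kubeletstats:
    enabled: false
  prometheus:
    enabled: false
```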

## Common Errors

### Exporting Errors
@@ -91,7 +118,7 @@ Timeout errors while starting up the collector are expected as the collector att
These timeout errors can also appear over time while the collector is running, but they are transient and expected to self-resolve. Further improvements are underway to reduce the number of timeout errors we're seeing from the NR1 endpoint.

```
info exporterhelper/retry_sender.go:154 Exporting failed. Will retry the request after interval. {"kind": "exporter", "data_type": "metrics", "name": "otlphttp/newrelic", "error": "failed to make an HTTP request: Post \"https://staging-otlp.nr-data.net/v1/metrics\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)", "interval": "5.445779213s"}
```

### No such file or directory
@@ -100,11 +127,11 @@ Sometimes we see failed to open file errors on the `filelog` and `hostmetrics` r

`filelog` error:
```
Failed to open file {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "error": "open /var/log/pods/<podname>/<containername>/0.log: no such file or directory"}
```
`hostmetrics` error:
```
Error scraping metrics {"kind": "receiver", "name": "hostmetrics", "data_type": "metrics", "error": "error reading <metric> for process \"<process>\" (pid <PID>): open /hostfs/proc/<PID>/stat: no such file or directory; error reading <metric> info for process \"<process>\" (pid 511766): open /hostfs/proc/<PID>/<metric>: no such file or directory", "scraper": "process"}
```

## Maintainers
4 changes: 3 additions & 1 deletion charts/nr-k8s-otel-collector/README.md.gotmpl
@@ -20,7 +20,7 @@ You can install this chart directly from this Helm repository:

```shell
helm repo add newrelic https://helm-charts.newrelic.com
helm upgrade --install newrelic/nr-k8s-otel-collector -f your-custom-values.yaml -n newrelic --create-namespace
helm upgrade nr-k8s-otel-collector newrelic/nr-k8s-otel-collector -f your-custom-values.yaml -n newrelic --create-namespace --install
```

{{ template "chart.sourcesSection" . }}
@@ -66,6 +66,8 @@ Options that can be defined globally include `affinity`, `nodeSelector`, `tolera

{{ template "chart.valuesSection" . }}

**Note:** If all receivers are disabled in the deployment or in the daemonset, the agent will not start.

## Common Errors

### Exporting Errors
21 changes: 21 additions & 0 deletions charts/nr-k8s-otel-collector/templates/_affinity.tpl
@@ -0,0 +1,21 @@
{{- /*
A helper to return the affinity to apply to the deployment.
*/ -}}
{{- define "nrKubernetesOtel.deployment.affinity" -}}
{{- if .Values.deployment.affinity -}}
{{- toYaml .Values.deployment.affinity -}}
{{- else if include "newrelic.common.affinity" . -}}
{{- include "newrelic.common.affinity" . -}}
{{- end -}}
{{- end -}}

{{- /*
A helper to return the affinity to apply to the daemonset.
*/ -}}
{{- define "nrKubernetesOtel.daemonset.affinity" -}}
{{- if .Values.daemonset.affinity -}}
{{- toYaml .Values.daemonset.affinity -}}
{{- else if include "newrelic.common.affinity" . -}}
{{- include "newrelic.common.affinity" . -}}
{{- end -}}
{{- end -}}
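As a sketch of the precedence these helpers implement: a values file like the one below overrides affinity for the daemonset's pods only, while the deployment falls back to the chart-wide `affinity` (or `global.affinity`). The node label used here is purely illustrative:

```yaml
# Chart-wide affinity: applies to any workload without its own override.
affinity: {}

daemonset:
  # Takes precedence over `affinity` and `global.affinity`, daemonset only.
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: kubernetes.io/os   # illustrative selector
                operator: In
                values: ["linux"]
```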
21 changes: 21 additions & 0 deletions charts/nr-k8s-otel-collector/templates/_node_selector.tpl
@@ -0,0 +1,21 @@
{{- /*
A helper to return the nodeSelector to apply to the deployment.
*/ -}}
{{- define "nrKubernetesOtel.deployment.nodeSelector" -}}
{{- if .Values.deployment.nodeSelector -}}
{{- toYaml .Values.deployment.nodeSelector -}}
{{- else if include "newrelic.common.nodeSelector" . -}}
{{- include "newrelic.common.nodeSelector" . -}}
{{- end -}}
{{- end -}}

{{- /*
A helper to return the nodeSelector to apply to the daemonset.
*/ -}}
{{- define "nrKubernetesOtel.daemonset.nodeSelector" -}}
{{- if .Values.daemonset.nodeSelector -}}
{{- toYaml .Values.daemonset.nodeSelector -}}
{{- else if include "newrelic.common.nodeSelector" . -}}
{{- include "newrelic.common.nodeSelector" . -}}
{{- end -}}
{{- end -}}
43 changes: 43 additions & 0 deletions charts/nr-k8s-otel-collector/templates/_security_context.tpl
@@ -0,0 +1,43 @@
{{- /*
A helper to return the pod security context to apply to the deployment.
*/ -}}
{{- define "nrKubernetesOtel.deployment.securityContext.pod" -}}
{{- if .Values.deployment.podSecurityContext -}}
{{- toYaml .Values.deployment.podSecurityContext -}}
{{- else if include "newrelic.common.securityContext.pod" . -}}
{{- include "newrelic.common.securityContext.pod" . -}}
{{- end -}}
{{- end -}}

{{- /*
A helper to return the container security context to apply to the deployment.
*/ -}}
{{- define "nrKubernetesOtel.deployment.securityContext.container" -}}
{{- if .Values.deployment.containerSecurityContext -}}
{{- toYaml .Values.deployment.containerSecurityContext -}}
{{- else if include "newrelic.common.securityContext.container" . -}}
{{- include "newrelic.common.securityContext.container" . -}}
{{- end -}}
{{- end -}}

{{- /*
A helper to return the pod security context to apply to the daemonset.
*/ -}}
{{- define "nrKubernetesOtel.daemonset.securityContext.pod" -}}
{{- if .Values.daemonset.podSecurityContext -}}
{{- toYaml .Values.daemonset.podSecurityContext -}}
{{- else if include "newrelic.common.securityContext.pod" . -}}
{{- include "newrelic.common.securityContext.pod" . -}}
{{- end -}}
{{- end -}}

{{- /*
A helper to return the container security context to apply to the daemonset.
*/ -}}
{{- define "nrKubernetesOtel.daemonset.securityContext.container" -}}
{{- if .Values.daemonset.containerSecurityContext -}}
{{- toYaml .Values.daemonset.containerSecurityContext -}}
{{- else if include "newrelic.common.securityContext.container" . -}}
{{- include "newrelic.common.securityContext.container" . -}}
{{- end -}}
{{- end -}}
21 changes: 21 additions & 0 deletions charts/nr-k8s-otel-collector/templates/_tolerations.tpl
@@ -0,0 +1,21 @@
{{- /*
A helper to return the tolerations to apply to the deployment.
*/ -}}
{{- define "nrKubernetesOtel.deployment.tolerations" -}}
{{- if .Values.deployment.tolerations -}}
{{- toYaml .Values.deployment.tolerations -}}
{{- else if include "newrelic.common.tolerations" . -}}
{{- include "newrelic.common.tolerations" . -}}
{{- end -}}
{{- end -}}

{{- /*
A helper to return the tolerations to apply to the daemonset.
*/ -}}
{{- define "nrKubernetesOtel.daemonset.tolerations" -}}
{{- if .Values.daemonset.tolerations -}}
{{- toYaml .Values.daemonset.tolerations -}}
{{- else if include "newrelic.common.tolerations" . -}}
{{- include "newrelic.common.tolerations" . -}}
{{- end -}}
{{- end -}}
4 changes: 3 additions & 1 deletion charts/nr-k8s-otel-collector/templates/clusterrole.yaml
@@ -1,3 +1,4 @@
{{- if .Values.rbac.create }}
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
@@ -74,4 +75,5 @@ rules:
- watch
# following required for prometheus receiver
- nonResourceURLs: ["/metrics"]
verbs: ["get"]
{{- end -}}
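Gating both the ClusterRole and its binding on `rbac.create` lets operators that manage RBAC out of band skip rendering these resources entirely. A minimal values sketch:

```yaml
rbac:
  # Skip rendering the ClusterRole/ClusterRoleBinding; RBAC is managed externally.
  create: false
```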
@@ -1,3 +1,4 @@
{{- if .Values.rbac.create }}
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
@@ -12,3 +13,4 @@ roleRef:
kind: ClusterRole
name: {{ include "newrelic.common.naming.fullname" . }}
apiGroup: rbac.authorization.k8s.io
{{- end -}}
26 changes: 20 additions & 6 deletions charts/nr-k8s-otel-collector/templates/daemonset-configmap.yaml
@@ -303,7 +303,7 @@ data:
action: update
from_attribute: node
- key: k8s.namespace.name
action: upsert
action: upsert
from_attribute: namespace
batch:
@@ -320,14 +320,24 @@ data:
# insecure: true
service:
{{ if include "newrelic.common.verboseLog" . }}
{{- if include "newrelic.common.verboseLog" . }}
telemetry:
logs:
level: "debug"
{{ end }}
{{- end }}
pipelines:
{{- if or .Values.receivers.hostmetrics.enabled (or .Values.receivers.kubeletstats.enabled .Values.receivers.prometheus.enabled) }}
metrics:
receivers: [hostmetrics, kubeletstats, prometheus]
receivers:
{{- if .Values.receivers.hostmetrics.enabled }}
- hostmetrics
{{- end }}
{{- if .Values.receivers.kubeletstats.enabled }}
- kubeletstats
{{- end }}
{{- if .Values.receivers.prometheus.enabled }}
- prometheus
{{- end }}
processors:
# - transform/truncate
- filter/exclude_cpu_utilization
@@ -346,8 +356,12 @@ data:
- batch
exporters:
- otlphttp/newrelic
{{- end }}
{{- if .Values.receivers.filelog.enabled }}
logs:
receivers: [filelog]
receivers:
- filelog
processors: [transform/truncate, resource, k8sattributes, batch]
exporters:
            - otlphttp/newrelic
{{- end }}