Reapply "Bump otelcontribcol version to v0.106.0-gke.2 (GoogleContainerTools#1369)" (GoogleContainerTools#1376)

Due to an upstream refactor, the metric transform processor no longer treats `label_set: []` as a request to remove all labels. Instead, all labels fall through and are exported. See [issue](open-telemetry/opentelemetry-collector-contrib#34430) for details.

This change reapplies the upgrade to 0.106.0 to catch up with the latest otelcollector, and adds the following workaround:

A label `no_op_label` is added to the `label_set` of each metric that needs all of its labels removed. The [re-slice function in the core lib](https://github.com/open-telemetry/opentelemetry-collector/blob/main/pdata/pcommon/slice.go#L125) then filters out all of the metric's real labels. Nothing is done for `no_op_label` itself, because it is not among the metric's original attributes.
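
The workaround can be modelled in a few lines of Go. This is a minimal sketch of the intended semantics, not the metric transform processor's actual code: `DataPoint`, `aggregateLabelsMax`, and the `commit`/`kind` label names are hypothetical and chosen only for illustration. It keeps the labels named in `label_set`, merges data points that become identical, and takes the maximum, mirroring `aggregation_type: max` in the config.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// DataPoint is a hypothetical stand-in for one metric data point.
type DataPoint struct {
	Labels map[string]string
	Value  float64
}

// aggregateLabelsMax keeps only the labels named in labelSet, groups the
// points by their remaining labels, and reduces each group to its maximum.
func aggregateLabelsMax(points []DataPoint, labelSet []string) []DataPoint {
	keep := make(map[string]bool, len(labelSet))
	for _, l := range labelSet {
		keep[l] = true
	}

	groups := map[string]*DataPoint{}
	var order []string // preserves first-seen order of groups
	for _, p := range points {
		kept := map[string]string{}
		var parts []string
		for k, v := range p.Labels {
			if keep[k] {
				kept[k] = v
				parts = append(parts, k+"="+v)
			}
		}
		// Sketch-level grouping key; assumes label values contain no "," or "=".
		sort.Strings(parts)
		key := strings.Join(parts, ",")
		if g, ok := groups[key]; ok {
			if p.Value > g.Value {
				g.Value = p.Value
			}
			continue
		}
		groups[key] = &DataPoint{Labels: kept, Value: p.Value}
		order = append(order, key)
	}

	out := make([]DataPoint, 0, len(order))
	for _, k := range order {
		out = append(out, *groups[k])
	}
	return out
}

func main() {
	points := []DataPoint{
		{Labels: map[string]string{"commit": "abc", "kind": "Deployment"}, Value: 3},
		{Labels: map[string]string{"commit": "def", "kind": "Service"}, Value: 7},
	}
	// no_op_label matches nothing, so every real label is dropped.
	out := aggregateLabelsMax(points, []string{"no_op_label"})
	fmt.Println(len(out), len(out[0].Labels), out[0].Value) // prints: 1 0 7
}
```

Because no data point carries `no_op_label`, both points lose all of their labels and collapse into a single unlabeled point holding the maximum value.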


This reverts commit 1d33f46.
tiffanny29631 committed Aug 13, 2024
1 parent 3c7775d commit 23c6316
Showing 10 changed files with 96 additions and 30 deletions.
2 changes: 1 addition & 1 deletion Makefile
@@ -86,7 +86,7 @@ HELM_STAGING_DIR := $(OUTPUT_DIR)/third_party/helm
GIT_SYNC_VERSION := v4.2.3-gke.5__linux_amd64
GIT_SYNC_IMAGE_NAME := gcr.io/config-management-release/git-sync:$(GIT_SYNC_VERSION)

OTELCONTRIBCOL_VERSION := v0.103.0-gke.3
OTELCONTRIBCOL_VERSION := v0.106.0-gke.2
OTELCONTRIBCOL_IMAGE_NAME := gcr.io/config-management-release/otelcontribcol:$(OTELCONTRIBCOL_VERSION)

# Directory used for staging Docker contexts.
1 change: 1 addition & 0 deletions manifests/base/kustomization.yaml
@@ -22,6 +22,7 @@ resources:
- ../ns-reconciler-base-cluster-role.yaml
- ../root-reconciler-base-cluster-role.yaml
- ../otel-agent-cm.yaml
- ../otel-agent-reconciler-cm.yaml
- ../reconciler-manager-service-account.yaml
- ../reposync-crd.yaml
- ../rootsync-crd.yaml
20 changes: 1 addition & 19 deletions manifests/otel-agent-cm.yaml
@@ -32,24 +32,6 @@ data:
tls:
insecure: true
processors:
# Attributes processor adds custom configsync metric labels to applicable
# metrics to identify the sync object used to configure this deployment.
#
# Note: configsync.sync.generation is explicitly excluded here, because it
# is high cardinality. So we don't want to send it as a label, only as a
# resource attribute. That way it's only propagated to Prometheus, and not
# Monarch or Cloud Monitoring, which ignore custom resource attributes.
attributes:
actions:
- key: configsync.sync.kind
action: upsert
value: $CONFIGSYNC_SYNC_KIND
- key: configsync.sync.name
action: upsert
value: $CONFIGSYNC_SYNC_NAME
- key: configsync.sync.namespace
action: upsert
value: $CONFIGSYNC_SYNC_NAMESPACE
batch:
# Populate resource attributes from OTEL_RESOURCE_ATTRIBUTES env var and
# the GCE metadata service, if available.
@@ -62,7 +44,7 @@ data:
pipelines:
metrics:
receivers: [opencensus]
processors: [batch, resourcedetection, attributes]
processors: [batch, resourcedetection]
exporters: [opencensus]
telemetry:
logs:
69 changes: 69 additions & 0 deletions manifests/otel-agent-reconciler-cm.yaml
@@ -0,0 +1,69 @@
# Copyright 2024 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

apiVersion: v1
kind: ConfigMap
metadata:
name: otel-agent-reconciler
namespace: config-management-system
labels:
app: opentelemetry
component: otel-agent
configmanagement.gke.io/system: "true"
configmanagement.gke.io/arch: "csmr"
data:
otel-agent-reconciler-config.yaml: |
receivers:
opencensus:
exporters:
opencensus:
endpoint: otel-collector.config-management-monitoring:55678
tls:
insecure: true
processors:
# Attributes processor adds custom configsync metric labels to applicable
# metrics to identify the sync object used to configure this deployment.
#
# Note: configsync.sync.generation is explicitly excluded here, because it
# is high cardinality. So we don't want to send it as a label, only as a
# resource attribute. That way it's only propagated to Prometheus, and not
# Monarch or Cloud Monitoring, which ignore custom resource attributes.
attributes:
actions:
- key: configsync.sync.kind
action: upsert
value: ${CONFIGSYNC_SYNC_KIND}
- key: configsync.sync.name
action: upsert
value: ${CONFIGSYNC_SYNC_NAME}
- key: configsync.sync.namespace
action: upsert
value: ${CONFIGSYNC_SYNC_NAMESPACE}
batch:
# Populate resource attributes from OTEL_RESOURCE_ATTRIBUTES env var and
# the GCE metadata service, if available.
resourcedetection:
detectors: [env, gcp]
extensions:
health_check:
service:
extensions: [health_check]
pipelines:
metrics:
receivers: [opencensus]
processors: [batch, resourcedetection, attributes]
exporters: [opencensus]
telemetry:
logs:
level: "INFO"
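
The new ConfigMap writes its placeholders as `${CONFIGSYNC_SYNC_KIND}` where the old one used `$CONFIGSYNC_SYNC_KIND`. The sketch below illustrates the difference under the assumption, suggested by this change but not stated in the commit, that the upgraded collector expands only the braced form. It is an illustrative model, not the collector's own configuration code; `expandBraced` and the `RootSync` value are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// bracedVar matches ${NAME} placeholders only; bare $NAME is not matched.
var bracedVar = regexp.MustCompile(`\$\{([A-Za-z_][A-Za-z0-9_]*)\}`)

// expandBraced replaces each ${NAME} in s with lookup(NAME) and leaves
// everything else, including bare $NAME, untouched.
func expandBraced(s string, lookup func(string) string) string {
	return bracedVar.ReplaceAllStringFunc(s, func(m string) string {
		name := bracedVar.FindStringSubmatch(m)[1]
		return lookup(name)
	})
}

func main() {
	env := map[string]string{"CONFIGSYNC_SYNC_KIND": "RootSync"}
	lookup := func(name string) string { return env[name] }

	fmt.Println(expandBraced("value: ${CONFIGSYNC_SYNC_KIND}", lookup)) // value: RootSync
	fmt.Println(expandBraced("value: $CONFIGSYNC_SYNC_KIND", lookup))   // value: $CONFIGSYNC_SYNC_KIND
}
```

Under that assumption, a bare `$VAR` would reach the attributes processor as a literal string, which is consistent with the move to braces here.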
1 change: 1 addition & 0 deletions manifests/templates/otel-collector.yaml
@@ -101,6 +101,7 @@ spec:
# The prometheus transformer appends `_ratio` to gauge metrics: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/v0.86.0/pkg/translator/prometheus/normalize_name.go#L149
# Add the feature gate to enable metric suffix trimming.
- "--feature-gates=-pkg.translator.prometheus.NormalizeName"
- "--feature-gates=-component.UseLocalHostAsDefaultHost"
resources:
limits:
cpu: 1
9 changes: 5 additions & 4 deletions manifests/templates/reconciler-manager-configmap.yaml
@@ -168,10 +168,11 @@ data:
command:
- /otelcontribcol
args:
- "--config=/conf/otel-agent-config.yaml"
- "--config=/conf/otel-agent-reconciler-config.yaml"
# The prometheus transformer appends `_ratio` to gauge metrics: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/v0.86.0/pkg/translator/prometheus/normalize_name.go#L149
# Add the feature gate to enable metric suffix trimming.
- "--feature-gates=-pkg.translator.prometheus.NormalizeName"
- "--feature-gates=-component.UseLocalHostAsDefaultHost"
securityContext:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
@@ -184,7 +185,7 @@ data:
- containerPort: 8888 # Metrics.
protocol: TCP
volumeMounts:
- name: otel-agent-config-vol
- name: otel-agent-config-reconciler-vol
mountPath: /conf
readinessProbe:
httpGet:
@@ -273,9 +274,9 @@ data:
secret:
secretName: git-creds
defaultMode: 288
- name: otel-agent-config-vol
- name: otel-agent-config-reconciler-vol
configMap:
name: otel-agent
name: otel-agent-reconciler
defaultMode: 420
- name: service-account
emptyDir: {}
1 change: 1 addition & 0 deletions manifests/templates/reconciler-manager.yaml
@@ -71,6 +71,7 @@ spec:
# The prometheus transformer appends `_ratio` to gauge metrics: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/v0.86.0/pkg/translator/prometheus/normalize_name.go#L149
# Add the feature gate to enable metric suffix trimming.
- "--feature-gates=-pkg.translator.prometheus.NormalizeName"
- "--feature-gates=-component.UseLocalHostAsDefaultHost"
resources:
limits:
cpu: 1
1 change: 1 addition & 0 deletions manifests/templates/resourcegroup-manifest.yaml
@@ -232,6 +232,7 @@ spec:
- args:
- --config=/conf/otel-agent-config.yaml
- --feature-gates=-pkg.translator.prometheus.NormalizeName
- --feature-gates=-component.UseLocalHostAsDefaultHost
command:
- /otelcontribcol
env:
20 changes: 15 additions & 5 deletions pkg/metrics/otel.go
@@ -184,7 +184,9 @@ processors:
new_name: current_declared_resources
operations:
- action: aggregate_labels
label_set: []
# Using a no_op_label to get around issue in the upstream
# https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/34430
label_set: [no_op_label]
aggregation_type: max
- include: kcc_resource_count
action: update
@@ -255,14 +257,18 @@ processors:
new_name: resource_conflicts_count
operations:
- action: aggregate_labels
label_set: []
# Using a no_op_label to get around issue in the upstream
# https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/34430
label_set: [no_op_label]
aggregation_type: max
- include: internal_errors_total
action: update
new_name: internal_errors_count
operations:
- action: aggregate_labels
label_set: []
# Using a no_op_label to get around issue in the upstream
# https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/34430
label_set: [no_op_label]
aggregation_type: max
- include: remediate_duration_seconds
action: update
@@ -322,13 +328,17 @@ processors:
action: update
operations:
- action: aggregate_labels
label_set: []
# Using a no_op_label to get around issue in the upstream
# https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/34430
label_set: [no_op_label]
aggregation_type: max
- include: kustomize_build_latency
action: update
operations:
- action: aggregate_labels
label_set: []
# Using a no_op_label to get around issue in the upstream
# https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/34430
label_set: [no_op_label]
aggregation_type: max
extensions:
health_check:
2 changes: 1 addition & 1 deletion pkg/reconcilermanager/controllers/otel_controller_test.go
@@ -48,7 +48,7 @@ const (
// otel-collector ConfigMap.
// See `CollectorConfigGooglecloud` in `pkg/metrics/otel.go`
// Used by TestOtelReconcilerGooglecloud.
depAnnotationGooglecloud = "c2f6078a9afe1f32721173e9e15bbab5"
depAnnotationGooglecloud = "bfa02552b80a227256e825c807254b40"
// depAnnotationGooglecloud is the expected hash of the custom
// otel-collector ConfigMap test artifact.
// Used by TestOtelReconcilerCustom.
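
The expected hash changes because the `label_set` edits in `pkg/metrics/otel.go` alter the text of the collector config. The following is a minimal sketch of that dependency, assuming the annotation is an MD5 digest of the config text (the 32 hex characters are consistent with MD5, but the commit does not confirm the algorithm); `configHash` is a hypothetical helper, not the controller's own function.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
)

// configHash returns the hex-encoded MD5 digest of a config string.
// Assumption: MD5 is used here only because it yields 32 hex characters.
func configHash(config string) string {
	sum := md5.Sum([]byte(config))
	return hex.EncodeToString(sum[:])
}

func main() {
	before := "label_set: []\n"
	after := "label_set: [no_op_label]\n"
	// Any edit to the config text produces a different digest, so a
	// hard-coded expected value in a test has to be updated alongside it.
	fmt.Println(configHash(before) != configHash(after)) // true
	fmt.Println(len(configHash(after)))                  // 32
}
```

A digest like this lets the test detect any unintended drift in the generated config, at the cost of having to refresh the constant on every intentional change.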