fix: no default otel endpoint for the operator #154

Merged: 2 commits into main from fix/no-default-otel, Feb 19, 2024

Conversation

@nathanielc (Collaborator)

The operator no longer has a default OpenTelemetry endpoint and will only collect and send traces if an endpoint is configured.

This might address the memory leak, but either way it's a nice quality-of-life improvement, since we have not been collecting traces from the operator anyway.

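The gating described above can be sketched roughly as follows. This is a minimal illustration, not the operator's actual code: `OPERATOR_OTLP_ENDPOINT` is the env var shown in the diff further down, and `otlp_endpoint` is a hypothetical helper name.

```rust
use std::env;

/// Normalize the raw env value: unset or empty disables trace export.
/// (Illustrative sketch; not the operator's actual API.)
fn otlp_endpoint(raw: Option<String>) -> Option<String> {
    raw.filter(|ep| !ep.is_empty())
}

fn main() {
    // Traces are collected and sent only when an endpoint is configured.
    match otlp_endpoint(env::var("OPERATOR_OTLP_ENDPOINT").ok()) {
        Some(ep) => println!("exporting traces to {ep}"),
        None => println!("trace export disabled"),
    }
}
```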
@3benbox (Contributor) left a comment:
lgtm

@nathanielc (Collaborator, Author)

Turns out this breaks logging within the keramik operator. Still trying to figure out why.

@smrz2001 (Collaborator) left a comment:

🚀

Commit: With this change, logs will still be printed to STDOUT even if no OTLP endpoint is configured.
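The behavior of this second commit can be sketched as: the stdout log layer is installed unconditionally, while the OTLP trace exporter is added only when an endpoint is present. The layer names and function below are illustrative, not the operator's actual API:

```rust
use std::env;

/// Names of the telemetry "layers" that would be installed.
/// (Hypothetical sketch of the control flow, not real code.)
fn telemetry_layers(otlp_endpoint: Option<&str>) -> Vec<&'static str> {
    // The stdout log layer is always present, so logs keep flowing
    // even when no OTLP endpoint is configured.
    let mut layers = vec!["stdout-logs"];
    if otlp_endpoint.is_some() {
        // The trace exporter is added only when there is somewhere to send
        // traces, avoiding an ever-growing buffer of spans that can never
        // be flushed.
        layers.push("otlp-traces");
    }
    layers
}

fn main() {
    let endpoint = env::var("OPERATOR_OTLP_ENDPOINT").ok();
    println!("{:?}", telemetry_layers(endpoint.as_deref()));
}
```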
@@ -127,9 +127,6 @@ spec:
           containerPort: 9464
           protocol: TCP
       env:
-        # We are pointing to tempo or grafana tracing agent's otlp grpc receiver port
-        - name: OPERATOR_OTLP_ENDPOINT
-          value: "https://otel:4317"
@nathanielc (Collaborator, Author) commented on the diff:
We need to remove this config option from any operators we have configured; otherwise they will still try to send traces.

@nathanielc (Collaborator, Author)

@3benbox @smrz2001 I could use another review. The second commit fixes the issue where events were no longer being logged to stdout.

Local testing shows this fixed the OOM issues. The traces must have been cached indefinitely, since they could never be flushed.

In order to fix this in our prod envs, we will need to remove the OPERATOR_OTLP_ENDPOINT env var from the operator's deployment.

@nathanielc nathanielc added this pull request to the merge queue Feb 19, 2024
Merged via the queue into main with commit 039e15b Feb 19, 2024
5 checks passed
@nathanielc nathanielc deleted the fix/no-default-otel branch February 19, 2024 19:48