fix: no default otel endpoint for the operator #154
The operator no longer has a default OpenTelemetry endpoint and will only collect and send traces if an endpoint is configured. This might address the memory leak, but either way it's a nice quality-of-life improvement, as we have not been collecting traces from the operator anyway.
lgtm
Turns out this breaks logging within the keramik operator. Still trying to figure out why.
🚀
With this change, logs will still be printed to STDOUT even if no OTLP endpoint is configured.
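As a rough illustration of the behavior described in this thread (not the operator's actual code), the sketch below always attaches a stdout log handler and treats tracing as enabled only when an endpoint is set. The function name `configure_telemetry` and the logger name are hypothetical; only the `OPERATOR_OTLP_ENDPOINT` variable comes from the diff in this PR.

```python
import logging
import os
import sys


def configure_telemetry(env=os.environ):
    """Return (logger, otlp_endpoint); the endpoint is None when tracing is off.

    Hypothetical helper: it only decides what to enable and does not
    create a real OTLP exporter.
    """
    logger = logging.getLogger("operator-sketch")
    logger.setLevel(logging.INFO)

    # Logging to stdout is set up unconditionally, so it cannot be lost
    # when no trace endpoint is configured.
    if not logger.handlers:
        logger.addHandler(logging.StreamHandler(sys.stdout))

    # No default endpoint: an unset or blank variable means "no tracing".
    endpoint = (env.get("OPERATOR_OTLP_ENDPOINT") or "").strip() or None

    if endpoint is None:
        logger.info("no OTLP endpoint configured; traces are not collected")
    else:
        logger.info("exporting traces to %s", endpoint)
    return logger, endpoint
```

Keeping the stdout handler outside the endpoint check is the point of the sketch: the logging path does not depend on the tracing path being configured.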
@@ -127,9 +127,6 @@ spec:
containerPort: 9464
protocol: TCP
env:
# We are pointing to tempo or grafana tracing agent's otlp grpc receiver port
- name: OPERATOR_OTLP_ENDPOINT
value: "https://otel:4317"
We need to remove this config option from any operators we have configured; otherwise they will still try to send traces.
@3benbox @smrz2001 I could use another review. The second commit fixes the issue where events were no longer logged to stdout. Local testing shows this fixed the OOM issues. The traces were presumably being cached forever because they could never be flushed. To fix this in our prod envs, we will need to remove the OPERATOR_OTLP_ENDPOINT env var from the operator's deployment.