
Support OTel collector in flagd #563

Closed
beeme1mr opened this issue Mar 24, 2023 · 6 comments · Fixed by #586
@beeme1mr
Member

In order to get data into OTel, we need to be able to configure flagd to communicate with a collector. This entails:

  • a new CLI arg for flagd, specifying the collector (taking the collector URI)
    • if the CLI arg is set, configure OTLP in the OTel SDK (see example here, and the sketch after this list)
    • deploy a collector, configure it with Jaeger, and confirm it works as expected
  • documentation and tests for the above
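
For illustration, a minimal sketch of that OTLP wiring with the OTel Go SDK, assuming a hypothetical `buildTracerProvider` helper and a gRPC collector endpoint (not the final flagd implementation):

```go
package telemetry

import (
	"context"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
)

// buildTracerProvider wires an OTLP/gRPC exporter into the SDK when a
// collector target was supplied via the new CLI arg (hypothetical helper).
func buildTracerProvider(ctx context.Context, collectorTarget string) (*sdktrace.TracerProvider, error) {
	if collectorTarget == "" {
		// No collector configured; return a provider without an exporter.
		return sdktrace.NewTracerProvider(), nil
	}
	exp, err := otlptracegrpc.New(ctx,
		otlptracegrpc.WithEndpoint(collectorTarget), // e.g. "localhost:4317"
		otlptracegrpc.WithInsecure(),
	)
	if err != nil {
		return nil, err
	}
	tp := sdktrace.NewTracerProvider(sdktrace.WithBatcher(exp))
	otel.SetTracerProvider(tp)
	return tp, nil
}
```
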
@Kavindu-Dodan
Contributor

I will start working on this :)

@Kavindu-Dodan
Contributor

Kavindu-Dodan commented Apr 3, 2023

I researched this topic and now have a working solution inspired by the OTel demo [1]. With that knowledge, I have a few questions to clarify so that I can finalize the first version of the implementation.

  • Jaeger is specialized in tracing, and we currently do not record any traces. So if we want to show the Jaeger use case, we should add traces
  • Our metrics are currently backed by Prometheus metrics [2], so if we want metrics at the collector (and then exported to Prometheus), we should also migrate the metric recorder to be backed by OTel

A solution I see is to switch behavior based on a startup flag:

If an OTel collector is set up, then export metrics and traces to OTel; otherwise, fall back to Prometheus metrics.
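
A rough sketch of that switch, assuming a hypothetical `buildMeterProvider` helper and the OTel Go SDK's Prometheus and OTLP metric exporters (not the eventual flagd code):

```go
package telemetry

import (
	"context"

	"go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc"
	"go.opentelemetry.io/otel/exporters/prometheus"
	sdkmetric "go.opentelemetry.io/otel/sdk/metric"
)

// buildMeterProvider exports via OTLP when a collector target is configured,
// otherwise it keeps the existing Prometheus-backed behaviour (hypothetical helper).
func buildMeterProvider(ctx context.Context, collectorTarget string) (*sdkmetric.MeterProvider, error) {
	if collectorTarget != "" {
		exp, err := otlpmetricgrpc.New(ctx,
			otlpmetricgrpc.WithEndpoint(collectorTarget),
			otlpmetricgrpc.WithInsecure(),
		)
		if err != nil {
			return nil, err
		}
		// Push metrics to the collector on a periodic schedule.
		return sdkmetric.NewMeterProvider(
			sdkmetric.WithReader(sdkmetric.NewPeriodicReader(exp)),
		), nil
	}
	// Default: expose a Prometheus scrape endpoint, as flagd does today.
	promExp, err := prometheus.New()
	if err != nil {
		return nil, err
	}
	return sdkmetric.NewMeterProvider(sdkmetric.WithReader(promExp)), nil
}
```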

And regarding traces, I will start with a simple trace which we can expand on later.

Let me know your opinion @beeme1mr @thisthat @toddbaert :)

[1] - https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/examples/demo
[2] - https://github.com/open-feature/flagd/blob/main/core/pkg/runtime/from_config.go#L85


note - draft work (works on my machine 😉) #586

@thisthat
Member

thisthat commented Apr 4, 2023

I agree that, at the moment, Jaeger is not necessary since we don't produce any traces.
On the metric point, we're using OpenTelemetry metrics with a Prometheus exporter. We could switch the exporter to use OTLP, but I suggest keeping the Prom exporter as the default one to be more flexible. The OpenTelemetry collector supports scraping Prometheus endpoints (see Prometheus-receiver).

To provide flexibility, we could use the env var OTEL_METRIC_EXPORTER to decide which exporter to use (doc). Quickly looking at the go-sdk, this does not appear to be implemented yet, and there is a discussion around the topic (see spec). Hence, we would have to add that logic ourselves.
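
Since the Go SDK does not read that variable for us, a tiny hypothetical sketch of handling it manually (flagd would likely fold this into its existing env var handling):

```go
package telemetry

import "os"

// metricExporterFromEnv mimics OTEL_METRIC_EXPORTER by hand, since the Go SDK
// does not honour it yet (hypothetical helper).
func metricExporterFromEnv() string {
	if os.Getenv("OTEL_METRIC_EXPORTER") == "otlp" {
		return "otlp"
	}
	// Default to the existing Prometheus exporter for backwards compatibility.
	return "prometheus"
}
```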

@beeme1mr
Member Author

beeme1mr commented Apr 4, 2023

Hey @Kavindu-Dodan, there's another issue open for adding distributed traces to flagd. You can see that here.

I agree with @thisthat that we should leave the Prometheus exporter as the default. We just need to figure out a way to allow users to change the metric exporter. Perhaps we could do something similar to the source configuration. That would make it easy to use CLI flags when the use case is simple and a configuration when it becomes more complex. The OTEL_METRIC_EXPORTER env var may also be a good idea, but I think it should behave like other flagd env vars.

@Kavindu-Dodan
Contributor

Kavindu-Dodan commented Apr 4, 2023

Thank you @thisthat & @beeme1mr for the feedback.

Can we agree on the following, then:

  • Prometheus exporter will be the default metrics exporter
  • Configurability to change the metric exporter - METRIC_EXPORTER or a similar flag, with an accepted value of otlp (handled manually) to switch to the OTel exporter; a rough sketch of the flag wiring follows this list
  • Traces - OTel will be the default trace provider; this issue will set the groundwork for OTel Traces for evaluation in flagd #575
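
For illustration only, wiring those knobs into flagd's start command could look roughly like this (flag names match the eventual PR; the helper name is made up, and binding to viper/env vars is omitted):

```go
package cmd

import "github.com/spf13/cobra"

// addTelemetryFlags registers the agreed-upon configuration (hypothetical helper).
func addTelemetryFlags(cmd *cobra.Command) {
	cmd.Flags().String("metrics-exporter", "prometheus",
		"metrics exporter to use: 'prometheus' (default) or 'otel'")
	cmd.Flags().String("otel-collector-target", "",
		"OTel collector target, e.g. localhost:4317; used when the exporter is 'otel'")
}
```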

@Kavindu-Dodan
Contributor

PR is ready for review - #586

I can start work on #575 if this is the next priority task :)

Kavindu-Dodan added a commit that referenced this issue Apr 10, 2023
## This PR

Fixes #563

Introduces configuration for flagd to connect to an OTel collector. It also improves how telemetry configurations are handled.

Changes include:

- Package renaming - otel to telemetry - this avoids import conflicts

- Introduce a telemetry builder - the intention is to have a central location to handle telemetry configuration and build telemetry components

- Introduce a span processor builder - provides groundwork for #575 (needs a fallback mechanism); a rough sketch follows below
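
As a rough illustration of the span processor builder idea (hypothetical names; the real implementation lives in this PR):

```go
package telemetry

import (
	"context"

	"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
)

// buildSpanProcessor batches spans to an OTLP/gRPC exporter when a collector
// target is configured; the fallback mechanism (#575) is intentionally left out.
func buildSpanProcessor(ctx context.Context, collectorTarget string) (sdktrace.SpanProcessor, error) {
	exp, err := otlptracegrpc.New(ctx,
		otlptracegrpc.WithEndpoint(collectorTarget), // e.g. "localhost:4317"
		otlptracegrpc.WithInsecure(),
	)
	if err != nil {
		return nil, err
	}
	return sdktrace.NewBatchSpanProcessor(exp), nil
}
```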


## How to test?

Consider following this guide (doc generated from commit): https://github.com/open-feature/flagd/blob/81c66b3c89540b475fe0a46ac89869800f7b74ae/docs/configuration/flagd_telemetry.md

In short:

- create configuration files (docker-compose yaml, collector config yaml, Prometheus yaml)
- start the collector setup (docker-compose up)
- start flagd with the OTel collector override for metrics (`flagd start --uri file:/flags.json --metrics-exporter otel --otel-collector-target localhost:4317`)

Metrics will be available in Prometheus (http://localhost:9090/graph).
Traces are still missing, as we have yet to implement them.

---------

Signed-off-by: Kavindu Dodanduwa <kavindudodanduwa@gmail.com>
Signed-off-by: Kavindu Dodanduwa <Kavindu-Dodan@users.noreply.github.com>
Co-authored-by: Giovanni Liva <giovanni.liva@dynatrace.com>