
OTel Traces for evaluation in flagd #575

Closed
4 tasks done
beeme1mr opened this issue Mar 29, 2023 · 3 comments
beeme1mr commented Mar 29, 2023

Abstract

OpenTelemetry (OTel) provides a vendor-neutral way of collecting and exporting rich telemetry data. Support for OTel will help administrators assess the health of flagd and the impact it is having on a system.

Requirements

  • Start or extend a trace for a flag evaluation
  • Start or extend a trace for a change event (e.g. file, HTTP, grpc, kubernetes watcher)
  • Investigate sampling options
  • Investigate using the semantic conventions for gRPC and HTTP, and possibly feature flags where appropriate.

Resources


@Kavindu-Dodan

Given that I already worked on #563, I would like to continue with this topic :) @toddbaert or @thisthat, any opinions?

Kavindu-Dodan added a commit that referenced this issue Apr 10, 2023
## This PR

Fixes #563

Introduces configuration for flagd to connect to an OTel collector. Also improves how telemetry configuration is handled.

Changes include:

- Package renaming, `otel` to `telemetry` - this avoids import conflicts
- Introduce a telemetry builder - the intention is to have a central location to handle telemetry configuration and build telemetry components
- Introduce a span processor builder - provides groundwork for #575 (needs fallback mechanism)


## How to test?

Consider following this guide (doc generated from the commit):
https://github.com/open-feature/flagd/blob/81c66b3c89540b475fe0a46ac89869800f7b74ae/docs/configuration/flagd_telemetry.md

In short:

- create the configuration files (docker-compose YAML, collector config YAML, Prometheus YAML)
- start the collector setup (`docker-compose up`)
- start flagd with the OTel collector override for metrics (`flagd start --uri file:/flags.json --metrics-exporter otel --otel-collector-target localhost:4317`)

Metrics will be available in Prometheus (http://localhost:9090/graph). Traces are still missing, as we have yet to implement them.

---------

Signed-off-by: Kavindu Dodanduwa <kavindudodanduwa@gmail.com>
Signed-off-by: Kavindu Dodanduwa <Kavindu-Dodan@users.noreply.github.com>
Co-authored-by: Giovanni Liva <giovanni.liva@dynatrace.com>
beeme1mr pushed a commit that referenced this issue Apr 13, 2023
## This PR

Partially addresses #575 by introducing **traces** for flag evaluations and an OTLP **batch exporter** based on **OTLP gRPC**, aimed at use with the OTel collector.

Contents:

- Refactor the telemetry builder to register a `TracerProvider` [1].
- Tracers [2] for `flag_evaluator.go` & `json_evaluator.go`, passing context to derive the span [3] tree:

```
flagEvaluationService
|__ jsonEvaluator
```

Note that setting `otel-collector-uri` is essential to export collected traces. However, flagd can still be run **without** a collector URI; this results in no provider registration, so traces fall back to the `NoopTracerProvider`.

## How to test 

1. Set up flagd with the OTel collector - follow the documentation [4]
2. Perform flag evaluations [5]
3. Visit the Jaeger UI (typically http://127.0.0.1:16686/) to see traces


![overview](https://user-images.githubusercontent.com/8186721/231010260-b6be4cb2-1926-4692-a911-6ae3724afe88.png)

![trace](https://user-images.githubusercontent.com/8186721/231010269-3af505a7-48c4-45c9-ae4d-cc1bc269322e.png)



[1] - https://opentelemetry.io/docs/reference/specification/trace/api/#tracerprovider
[2] - https://opentelemetry.io/docs/reference/specification/trace/api/#tracer
[3] - https://opentelemetry.io/docs/reference/specification/trace/api/#span
[4] - https://github.com/open-feature/flagd/blob/main/docs/configuration/flagd_telemetry.md#export-to-otel-collector
[5] - https://github.com/open-feature/flagd/blob/main/docs/configuration/fractional_evaluation.md#example

---------

Signed-off-by: Kavindu Dodanduwa <kavindudodanduwa@gmail.com>

Kavindu-Dodan commented Apr 17, 2023

> Start or extend a trace for a change event (e.g. file, HTTP, grpc, kubernetes watcher)

Investigated distributed traces between the gRPC client & gRPC server implementation (e.g. flagd-proxy) for the gRPC change event. otelgrpc interceptor spans cover the full stream (i.e. stream start to stream end). This is not suitable for flagd use cases, as we have long-running gRPC streams.

Further, we already have the jsonEvaluator (setState) trace to track state updates pushed from sync sources. This should be sufficient to track state change events, their frequency, and the operation time of the sync.
