Introduce sampling score and propagate it with the trace #135
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,223 @@ | ||
# Associating sampling score with the trace | ||
|
||
Enable consistent sampling across a distributed application whose services use | ||
different sampling rates and probability calculation algorithms. | ||
|
||
## TL;DR | ||
|
||
**Score** is a floating point number associated with the trace. | ||
It's calculated when the trace starts and is propagated in the `tracestate`. | ||
|
||
*Score* is independent of sampling *probability* (aka *rate*), which represents | ||
the sampler's configuration and is not specific to a trace. | ||
|
||
A sampler can compare the *score* with the configured *probability* to make | ||
sampling decisions. | ||
|
||
The service that starts the trace calculates the score and adds it to the | ||
`tracestate` so downstream services can re-use it to make their sampling | ||
decisions *instead of* re-calculating the score as a function of trace-id | ||
(or trace-flags). This allows the sampling algorithm to be configured on the first | ||
service and avoids coordinating algorithms when multiple tracing tools are | ||
involved. | ||
|
||
## Motivation | ||
|
||
The goal is to enable a mechanism for consistent (best effort) sampling | ||
between services with different sampling rates and different probability | ||
calculation algorithms (for interoperability with existing tracing tools). | ||
|
||
|
||
Today, consistency across multiple services is achieved by the following means: | ||
|
||
1. Same hashing algorithms on trace-id applied on each span. | ||
Problems: | ||
- **same sampling algorithm must be used across multiple apps**: this is | ||
not always possible, e.g. when existing components in a system use a | ||
vendor-specific (pre-OpenTelemetry) tracing tool and a major upgrade is hard to | ||
justify, while new components are instrumented with OpenTelemetry. | ||
- **trace-id uniform distribution is not guaranteed**, therefore sampling | ||
decisions could be biased. | ||
|
||
2. Sampling flag propagated from the head component/app is used by downstream | ||
apps to sample in a given trace. | ||
It requires trusting the upstream decision and does not allow different | ||
sampling rates across different components. | ||
|
||
## Explanation | ||
|
||
The sampling score is generated by the first service that makes a sampling | ||
decision. It's a random floating-point number (IEEE-754 32-bit, 6-9 digits of | ||
precision) in the [0, 1] range. | ||
The score is stamped on the span and also propagated further within `tracestate`. | ||
Review comment: probably we need a sampler name + sampler details. So here would be "ExternalScoreSampler(0.5 /probabilty/, 0.12343 /score/)"
Reply: I don't see why we'd need the sampler details on the tracestate. What scenario are you thinking about?
||
|
||
Next service reads score from `tracestate` (instead of calculating it from | ||
trace-id) and compares it with its sampling rate to make sampling decision. | ||
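
As an illustration of reading and writing the score in `tracestate`, here is a minimal Python sketch. The helper names `read_score` and `write_score` are hypothetical, the header is treated as a plain comma-separated list of `key=value` members, and no validation against the W3C Trace Context grammar is attempted. Treating a malformed or out-of-range value as "no score" is an assumption, not something the proposal specifies.

```python
from typing import Optional

SCORE_KEY = "sampling.score"  # key name taken from the example in this document


def read_score(tracestate: str) -> Optional[float]:
    """Return the score carried in a tracestate header value, or None."""
    for member in tracestate.split(","):
        key, sep, value = member.strip().partition("=")
        if sep and key == SCORE_KEY:
            try:
                score = float(value)
            except ValueError:
                return None  # assumption: malformed score is treated as absent
            return score if 0.0 <= score <= 1.0 else None
    return None


def write_score(tracestate: str, score: float) -> str:
    """Prepend the score member, dropping any older member with the same key."""
    others = [
        member.strip()
        for member in tracestate.split(",")
        if member.strip() and member.strip().partition("=")[0] != SCORE_KEY
    ]
    return ",".join([f"{SCORE_KEY}={score:.8f}"] + others)
```

A downstream service would call `read_score` on the incoming header and only fall back to generating a score when it returns `None`.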
|
||
|
||
Score is exposed through span attributes. Vendors can leverage it | ||
to sort traces based on their completeness: the lower the score, | ||
the higher the chance that the trace was sampled in by every component. | ||
|
||
Vendors can enable interoperability (in terms of sampling) between legacy | ||
tools and OpenTelemetry: legacy libraries can be updated in a non-breaking way to | ||
support external score sampling. Updating the current vendor-specific library | ||
version on an existing service in a backward-compatible way is much easier | ||
than upgrading to OpenTelemetry. | ||
|
||
### Example | ||
|
||
``` | ||
+----------------------+ +----------------------+ +----------------------+ | ||
+ Service-A (rate 0.6) + --> + Service-B (rate 0.1) + --> + Service-C (rate 0.5) + | ||
+----------------------+ +----------------------+ +----------------------+ | ||
``` | ||
|
||
1. Service-A receives a request | ||
- starts a new trace, generates random trace-id | ||
- generates score: `0.17935003`. It's **smaller** than sampling rate | ||
(`0.6`), so decision is `RECORD_AND_SAMPLED` | ||
- span gets a new attribute `sampling.score = 0.17935003` | ||
- tracestate is modified: `sampling.score=0.17935003` is added | ||
2. Service-B gets request from A | ||
- reads trace-context from headers and `sampling.score` from the | ||
tracestate | ||
- decision is `NOT_RECORD` as `0.17935003` is **bigger** than its | ||
sampling rate (0.1) | ||
3. Service-C gets a request from B | ||
- reads trace-context from headers and `sampling.score` from the | ||
tracestate | ||
- decision is `RECORD_AND_SAMPLED` as `0.17935003` is **smaller** than its | ||
sampling rate (0.5) | ||
- span gets a new attribute `sampling.score = 0.17935003` | ||
- tracestate is left untouched | ||
|
||
As a result, spans from Service-A and Service-C are exported. | ||
It's not possible to restore relationship between A and C without B and the | ||
trace is broken, but Service-C can trace their own requests regardless of B's | ||
sampling rate and B can have smaller tracing budget regardless of A's decisions. | ||
All of them can still debug integration issues using common trace-id. | ||
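
The walk-through above can be replayed in a few lines of Python. This is only a sketch: the `decide` helper is hypothetical, and treating a score exactly equal to the rate as not sampled is an assumption (the example only covers the strictly smaller and strictly bigger cases).

```python
# Rates and score are the ones used in the example above.
RATES = {"Service-A": 0.6, "Service-B": 0.1, "Service-C": 0.5}
SCORE = 0.17935003  # generated once by Service-A and propagated in tracestate


def decide(score: float, rate: float) -> str:
    """Compare the propagated score with the local sampling rate."""
    return "RECORD_AND_SAMPLED" if score < rate else "NOT_RECORD"


decisions = {name: decide(SCORE, rate) for name, rate in RATES.items()}
exported = [name for name, d in decisions.items() if d == "RECORD_AND_SAMPLED"]
print(exported)  # ['Service-A', 'Service-C']
```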
|
||
Vendors can pick the most complete traces sorting them by score. | ||
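
A backend could do this with a plain sort on the `sampling.score` attribute. The sketch below uses made-up trace records purely for illustration; the record shape is an assumption and not part of the proposal.

```python
# Hypothetical trace summaries as a backend might hold them.
traces = [
    {"trace_id": "t1", "sampling.score": 0.42},
    {"trace_id": "t2", "sampling.score": 0.03},
    {"trace_id": "t3", "sampling.score": 0.17935003},
]

# Lower score means more components were likely to sample the trace in.
most_complete_first = sorted(traces, key=lambda t: t["sampling.score"])
ordered_ids = [t["trace_id"] for t in most_complete_first]
```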
|
||
## Internal details | ||
|
||
- Service that starts a trace makes the sampling decision. The user configures it | ||
to use `ExternalScoreSampler` (name TBD). Within the `ShouldSample` | ||
callback, the sampler: | ||
- generates a score in the [0, 1] interval using `SamplingScoreGenerator`, which can run | ||
a random or deterministic `hash(trace-id)` algorithm. | ||
- makes sampling decision by comparing generated score to configured rate | ||
- if decision is `RECORD` (or `RECORD_AND_SAMPLED`), sampler adds | ||
`sampling.score` attribute to attributes collection of to-be-created span | ||
- regardless of sampling decision: prepends `sampling.score` key-value pair | ||
into tracestate of to-be-created span | ||
- Downstream service continues a trace but has different sampling rate (it's | ||
also configured to use `ExternalScoreSampler`) | ||
Review comment on lines +113 to +114: What happens if not configured to use
Reply: same as today - whatever sampler you configured would work according to the spec. Score (if present in tracestate) will be ignored and propagated further
||
- `ExternalScoreSampler.ShouldSample` checks if score is provided in | ||
`tracestate`. | ||
- makes sampling decision by comparing upstream-generated score with its | ||
sampling rate | ||
- if span will be recorded: sampler adds `sampling.score` attribute to | ||
attributes collection of to-be-created span | ||
- If downstream service does not find a score in the tracestate, it falls back | ||
to the configured score generation algorithm and updates tracestate and | ||
attributes | ||
- Any service can be configured to use other samplers (e.g. `TraceIdRatioBased`) | ||
In this case, score in tracestate is not affecting sampling decisions and is | ||
re-calculated by sampler. | ||
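
The flow described in the list above can be sketched in Python as follows. This is not the proof of concept's code: the class shape, the `should_sample` signature, and the use of a dict to model `tracestate` are assumptions made for illustration, and a score equal to the rate is assumed to mean "not sampled".

```python
import random
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

SCORE_KEY = "sampling.score"  # tracestate key and attribute name from this document


@dataclass
class SamplingResult:
    decision: str
    attributes: Dict[str, float] = field(default_factory=dict)
    tracestate: Dict[str, str] = field(default_factory=dict)


class ExternalScoreSampler:
    """Hypothetical sampler; the proposal marks the name as TBD."""

    def __init__(self, rate: float, generate_score: Callable[[str], float]):
        self.rate = rate
        self.generate_score = generate_score

    @staticmethod
    def _read_score(tracestate: Dict[str, str]) -> Optional[float]:
        try:
            return float(tracestate[SCORE_KEY])
        except (KeyError, ValueError):
            return None

    def should_sample(self, trace_id: str, tracestate: Dict[str, str]) -> SamplingResult:
        score = self._read_score(tracestate)
        if score is None:
            # No usable upstream score: generate one and prepend it to tracestate.
            score = self.generate_score(trace_id)
            rest = {k: v for k, v in tracestate.items() if k != SCORE_KEY}
            tracestate = {SCORE_KEY: repr(score), **rest}
        if score < self.rate:
            # The attribute is only added when the span will be recorded.
            return SamplingResult("RECORD_AND_SAMPLED", {SCORE_KEY: score}, dict(tracestate))
        return SamplingResult("NOT_RECORD", {}, dict(tracestate))


# Head service uses a fixed score for the demo; downstream would use a random one.
head = ExternalScoreSampler(0.6, lambda trace_id: 0.17935003)
downstream = ExternalScoreSampler(0.1, lambda trace_id: random.random())
first = head.should_sample("trace-1", {"vendor": "x"})
second = downstream.should_sample("trace-1", first.tracestate)
```

In this sketch the downstream sampler never calls its own generator, because the head service already placed a score in the tracestate.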
|
||
`ExternalScoreSampler` is responsible for: | ||
|
||
- reading and writing score to the `tracestate` | ||
- if score is set on the tracestate it makes sampling decision | ||
- if score is not present, it generates one using `SamplingScoreGenerator`. | ||
|
||
`SamplingScoreGenerator` is responsible for: | ||
|
||
- calculating score in random or deterministic way based on sampling parameters. | ||
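
To make the two generation modes concrete, here is a hedged Python sketch of a random and a deterministic generator. The `generate(trace_id)` interface is an assumption, and deriving the deterministic score from the low 64 bits of a hex trace-id is one possible choice of `hash(trace-id)`, not something the proposal prescribes.

```python
import random
from typing import Optional, Protocol


class SamplingScoreGenerator(Protocol):
    """Assumed interface: map sampling parameters (here, trace-id) to a score."""

    def generate(self, trace_id: str) -> float: ...


class RandomGenerator:
    """Ignores the trace-id and draws a uniform random score."""

    def __init__(self, rng: Optional[random.Random] = None):
        self._rng = rng or random.Random()

    def generate(self, trace_id: str) -> float:
        return self._rng.random()


class TraceIdRatioGenerator:
    """Derives the score deterministically from the trace-id."""

    def generate(self, trace_id: str) -> float:
        # Assumption: trace_id is a hex string; use its last 16 hex digits
        # (the low 64 bits) and scale them into the [0, 1] interval.
        low64 = int(trace_id[-16:], 16)
        return low64 / 2**64
```

The deterministic variant gives every service the same score for the same trace-id even without `tracestate`, but it inherits any bias in how trace-ids are distributed.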
|
||
Here is a [proof of concept](https://github.com/lmolkova/opentelemetry-dotnet/pull/1) | ||
in .NET. | ||
|
||
### Specification Delta | ||
|
||
1. Add a convention for the `sampling.score` attribute on span (TBD). Check out | ||
[open questions](#open-questions) regarding attribute vs special field. | ||
2. Add the notion of a `SamplingScoreGenerator` that is capable of calculating a float | ||
score from sampling parameters. | ||
It has `TraceIdRatioGenerator`, `RandomGenerator` and possibly other | ||
implementations. | ||
- Change `TraceIdRatioBased` sampler to use corresponding generator and serve | ||
as generic probability sampler with configurable score generation approach. | ||
3. Add `ExternalScoreSampler` implementation of `Sampler`. It's created with | ||
probability value and implementation of `SamplingScoreGenerator`. | ||
|
||
### Trade-offs and mitigations | ||
|
||
This change would be the first (AFAIK) common use case of the `tracestate`. | ||
It comes with bandwidth and performance overhead: until now `tracestate` could | ||
just be propagated [blindly](https://github.com/open-telemetry/opentelemetry-specification/issues/478), | ||
whereas this overhead is incurred before the sampling decision and cannot be mitigated. | ||
|
||
Customers should configure it explicitly to avoid the overhead in the default | ||
case when interoperability is not necessary. | ||
|
||
Vendors may gradually update their existing solutions to support external | ||
score in order to interoperate with OpenTelemetry and should recommend | ||
that customers configure such a sampler. | ||
|
||
It may be the case that after migration to OpenTelemetry is finalized, the need | ||
for `sampling.score` will decrease and customers can remove | ||
`ExternalScoreSampler` from configuration. | ||
|
||
## Prior art and alternatives | ||
|
||
[TraceIdRatioBased](https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/sdk.md#traceidratiobased) sampler. | ||
|
||
Related discussions on [Probability sampler](https://github.com/open-telemetry/opentelemetry-specification/pull/570) | ||
||
|
||
### Sampling.score is NOT priority | ||
|
||
Priority is used by [OpenTracing](https://github.com/opentracing/specification/blob/master/semantic_conventions.md) | ||
as an implementation-specific hint for sampler to prioritize recording a span. | ||
|
||
[OpenTelemetry collector](https://github.com/open-telemetry/opentelemetry-collector/blob/60b03d0d2d503351501291b30836d2126487a741/processor/samplingprocessor/probabilisticsamplerprocessor/testdata/config.yaml#L10) | ||
uses `sampling.priority` to hint the collector's sampler decision. | ||
|
||
To avoid conflicts with existing implementations we do not reuse the term priority. | ||
|
||
## Open questions | ||
|
||
### Should we separate sampling from score generation? | ||
|
||
Rate-based sampling in this spec is separated from score generation. Sampler can | ||
be configured to use any algorithm on sampling parameters. Different samplers | ||
may reuse generation algorithms. | ||
|
||
### Attribute vs field on the span to-be-created | ||
|
||
The collection of attributes which is passed to the sampler is empty by default to | ||
minimize perf impact. Propagating the score back from sampler to span requires | ||
initializing the collection. | ||
|
||
Creating a new float field on `SamplingDecision` could be an alternative. | ||
It'd also require adding similar property on Span/SpanData. | ||
|
||
There are other scenarios when sampling information is useful for the | ||
exporter: e.g. sampling rate (or its inverse value: count of spans | ||
this span represents), exporters can use it to estimate metrics. | ||
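
As a worked example of that estimate: a span exported by a sampler with rate `r` stands for roughly `1 / r` spans. The helper below is a hypothetical sketch of how an exporter might sum those weights; it assumes each exported span carries the rate it was sampled with.

```python
from typing import Iterable


def estimated_span_count(sampled_rates: Iterable[float]) -> float:
    """Estimate the pre-sampling span count from the rates of exported spans."""
    total = 0.0
    for rate in sampled_rates:
        if not 0.0 < rate <= 1.0:
            raise ValueError(f"sampling rate must be in (0, 1], got {rate}")
        total += 1.0 / rate  # each exported span represents 1/rate spans
    return total


# Two spans sampled at 0.5 and one sampled at 0.1 represent about 14 spans.
estimate = estimated_span_count([0.5, 0.5, 0.1])
```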
|
||
Populating all sampling information on all spans may be inefficient in terms of | ||
event payload size and storage while being useful for a subset of vendors. | ||
|
||
Extensible solution may look like a `SamplingInfo` struct that carries all | ||
fields exporters may need. | ||
|
||
``` | ||
struct SamplingInfo | ||
Score, | ||
Rate/Count, | ||
... | ||
``` | ||
|
||
`SamplingResult` would allow the sampler to fill it in for the span-to-be-created. | ||
`Span` and its exportable representations will also need to be updated. |
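
One possible shape for this, sketched in Python: the field names and the idea of making every field optional are assumptions for illustration, and the elided fields of the struct above are intentionally left out.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class SamplingInfo:
    """Hypothetical container for sampling data an exporter may need."""

    score: Optional[float] = None  # propagated or generated sampling score
    rate: Optional[float] = None   # sampler's configured probability


@dataclass(frozen=True)
class SamplingResult:
    """Hypothetical result type that lets the sampler hand the info to the span."""

    decision: str
    sampling_info: SamplingInfo = field(default_factory=SamplingInfo)


result = SamplingResult("RECORD_AND_SAMPLED", SamplingInfo(score=0.17935003, rate=0.6))
```

Keeping the fields optional lets samplers that do not use a score leave the struct empty, which addresses the payload-size concern raised above.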
Review comment: Typo: ans -> and