Composite Sampler #1844

iNikem · 2021-08-03T16:17:36Z

Let's imagine for a second that #1597 is resolved and we have a sampler that drops server spans based on the url. It is safe to assume that for any particular configured Otel SDK we still will need one more sampler, to decide the fate of all other spans. This is already a second example of "composite samplers", the first being existing ParentBased sampler.

This raises several questions:

Should we formalise a notion of composite samplers in the spec?
If yes, should we specify rules of composition? Delegation or list of samplers? Should they be combined as logical "and" or "or"? Short-circuit?
Should env variable OTEL_TRACES_SAMPLER be expanded to allow for configuration of sampler composition?

The text was updated successfully, but these errors were encountered:

iNikem · 2021-08-03T16:17:44Z

/cc @carlosalberto

jmacd · 2021-08-05T15:10:20Z

I feel this is connected with the topic of "View configuration" in the metrics SDK:
https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/sdk.md#view

iNikem · 2021-08-06T07:21:06Z

I feel this is connected with the topic of "View configuration" in the metrics SDK:
https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/sdk.md#view

I probably know too little about metrics, but I cannot see any connection between Sampler and View :(

jmacd · 2021-09-15T15:57:03Z

People use "Sampler" to describe a standard mechanism for selecting which spans will be reported. People use "View" to describe a standard mechanism for selecting which metrics will be reported.
These usages are nearly identical.

jmacd · 2021-09-15T17:21:22Z

After reading @iNikem's comment on open-telemetry/oteps#175, I will take a stronger position here.

The term "non-probability sampler" is an oxymoron. Outside of OpenTelemetry, "Sampling" refers to a statistical process used to estimate something, and if there aren't probabilities involved, what you have may be called "Subjective analysis".

We are stuck with the term "Sampling" in OTel being used to mean both filtering and excluding spans from the population (what I call "View construction") and also the statistical process of counting the non-excluded non-filtered spans (what I call "Probability sampling", a form of span aggregation). OTEPs 170 and 168 are trying to finish this second aspect of what "Sampling" means, and you are asking to refine the first aspect of what Sampling means. In that sense, your comment open-telemetry/oteps#175 (comment) makes sense. It's confusing that the document talks about composition but doesn't answer to the need for View construction.

@iNikem asked above:

logical "and" or "or"? Short-circuit?

OTEP 170 specifies how to compose Samplers using logical-OR, which is useful in the sense that it allows composition of a probability sampler with a non-probability sampler (i.e., probability sampling plus subjective analysis). The concept of "zero adjusted count" is critical to understanding how we compose probability samplers and non-probability samplers.

Combining probability sampling with AND- and short-circuiting goes against the goals of probability sampling. Any span that is excluded before a probability sampler sees it is effectively uncounted, so placing filters in front of a Sampler distorts the result. I'm not convinced there's a real need to specify AND- and short-circuit types of composition, that doesn't seem like a natural way to configure what (IMO) users want to configure.

The form of sampler configuration we've seen from e.g., Jaeger remote sampling and AWS xRay look like a single Sampler with a list of rules that are internally followed, with short-circuiting. In pseudo-code you have a probability lookup like:

if ignoreSomeSpans(span) {
  return alwaysOff // These spans are UNCOUNTED, as if they never existed.
} else if firstConditionMatch(span) {
  return firstSamplingProbability
} else if secondConditionMatch(span) {
  return secondSamplingProbability
} else {
  return defaultSamplingProbability
}

My understanding is that users would like a way to configure this sort of logic, but it won't be through composition of multiple samplers, it will be through configuring one sophisticated sampler. The reason we talk about composing a probability and non-probability sampler is that we are aware of adaptive schemes that combine the two (open-telemetry/oteps#168 (comment)). Please treat the review of open-telemetry/oteps#168 and open-telemetry/oteps#175 as an independent topic from more general forms of View construction for spans.

It's possible in the distant future that OTel will evolve support for computing multiple samples in parallel with multiple independent head and tail probabilities. That will offer meaningfully more ways to sample and track down rare conditions in large-scale systems, but I don't think users will be asking for that soon. Besides, all of this discussion points to a larger problem with configuration in general; for now we do not expect to configure sophisticated samplers through the environment, we expect to use "jaeger_remote" or similar to select a module of code that performs the above logic and returns one of the built-in samplers. Please approve the two OTEPs so that we can proceed with necessary work on the built-in samplers.

iNikem · 2021-09-18T08:40:06Z

Thank you @jmacd for this explanation, I now understand current landscape much better.

I can agree with your suggestion that non-probabilistic samplers don't need to be composed and my problem of span "filtering" can be solved with one exhaustive Sampler. This approach will resolve this current issue. @anuraaga, taking this into account, should we change https://github.com/open-telemetry/opentelemetry-java-contrib/blob/53e8a363afe1d5ec6cd7afdf4285e4bf8dbc63b3/contrib-samplers/src/main/java/io/opentelemetry/contrib/samplers/RuleBasedRoutingSampler.java#L38 to accept fallback decision instead of fallback sampler?

@jmacd Do you think that we need to introduce a new entity, SpanFilter, to separate "View construction" from proper sampling?

jmacd · 2022-01-12T18:46:22Z

#2047 specifies the rules for composing samplers that I believe we need. However, we discovered a problem in the way the sampler APIs interact, discussed in #2179.

I would like to see #2047 merge, after that point I believe we can implement a composite sampler based on the rules there. Please review and approve #2047 -- we are blocked.

iNikem added the spec:trace Related to the specification/trace directory label Aug 3, 2021

github-actions bot assigned bogdandrutu Aug 3, 2021

carlosalberto added the area:sdk Related to the SDK label Aug 17, 2021

anuraaga mentioned this issue Aug 20, 2021

Sampling when span has links #1877

Open

iNikem mentioned this issue Sep 15, 2021

Probability sampler composition rules open-telemetry/oteps#175

Merged

jmacd mentioned this issue Sep 17, 2021

Add Resource to ShouldSample. #1658

Closed

bogdandrutu added the help wanted Extra attention is needed label Dec 16, 2021

iNikem mentioned this issue Jan 9, 2022

Requesting clarification for non-probabilistic samplers #2253

Open

jmacd mentioned this issue Feb 27, 2024

Probability Samplers based on W3C Trace Context Level 2 #3910

Closed

5 tasks

This was referenced Jul 17, 2024

add rule-based sampler #4152

Closed

Composite Samplers open-telemetry/oteps#250

Closed

This was referenced Jul 25, 2024

Randomness requirements following W3C Trace Context level 2 #4162

Draft

OpenTelemetry TraceIdRatioBased sampler requirements following OTEP 235 #4166

Draft

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Composite Sampler #1844

Composite Sampler #1844

iNikem commented Aug 3, 2021

iNikem commented Aug 3, 2021

jmacd commented Aug 5, 2021 •

edited

Loading

iNikem commented Aug 6, 2021

jmacd commented Sep 15, 2021

jmacd commented Sep 15, 2021

iNikem commented Sep 18, 2021

jmacd commented Jan 12, 2022

Composite Sampler #1844

Composite Sampler #1844

Comments

iNikem commented Aug 3, 2021

iNikem commented Aug 3, 2021

jmacd commented Aug 5, 2021 • edited Loading

iNikem commented Aug 6, 2021

jmacd commented Sep 15, 2021

jmacd commented Sep 15, 2021

iNikem commented Sep 18, 2021

jmacd commented Jan 12, 2022

jmacd commented Aug 5, 2021 •

edited

Loading