This repository has been archived by the owner on Dec 6, 2024. It is now read-only.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
A Sampling Configuration proposal #240
A Sampling Configuration proposal #240
Changes from 4 commits
f3d3878
28eb18c
0e87ea3
26fe48a
2970583
583fccf
e2d1a84
877a056
079e44d
2b66df6
76dbe46
3fd9677
fe4d4ab
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This issue is very relevant to my comment here. If we had a spec'd component for a composable rule bases sampler, we could define its configuration across languages in
opentelemetry-configuration
, and allow users to specify complex rule-based sampling configuration in a YAML syntax with file configuration.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the way this OTEP enables us to experiment with sampler configuration generally, because (as I think other commenters, including @spencerwilson have shown) this is complicated. After we have integrated the new configuration model, with the new tracestate information in #235 and demonstrated prototypes compatible with head sampling, tail sampling, and Jaeger (for example), then I think we can work on specifying a standard rule-based sampler configuration for OTel.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than a scalar value, it may be valuable to support this property being a map type, acting as a tagged union: https://jsontypedef.com/docs/jtd-in-5-minutes/#discriminator-schemas
Parseability and extensibility can be greatly enhanced by tagged unions. If everything's a string, then you must invent some kind of URI-type format to refer to things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's worth emphasizing this. If these configs aren't portable ("Here's a sampler config for a JVM workload; there's one for Node.js workload", e.g.), then that limits the ability of a central team to author and manage a small set of configs that many downstream workloads adopt.
Perhaps if we at least define behavior for when a sampler is not usable by a given platform, that'd be ok? E.g., if a Node.js workload has sampler config that directs it to use a sampler with type like
then the Node.js OTel SDK could say "Ah, I don't support
jvm.classpath
samplers. In this case the sampler is inactive". TBD precisely what "inactive" means for consistent probability sampling.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't believe we have such sampler type currently in the spec. The challenge with it is not introducing a configuration, but defining statistically meaningful behavior and sample weight calculation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, we don't. This proposal does not call for adding such sampler to the specs, it can remain an internal gadget, but adding it to the specs is also a possibility.
I admit I do not understand your second sentence.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there was a lot of effort put into the spec to develop sampling strategies that allow calculating summary statistics by capturing
weight
of each sample (simplest example: probabilistic 1-in-10 sampler gives each sample/span aweight=10
). When you have a sequential composite sampler like you have here, how is the weight of a sample will be calculated?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consistent Probability Sampling users have their own composite samplers, which properly handle logical-or and logical-and operations on consistent probability samplers, so they do not need AnyOf or AllOf samplers proposed here, and they should not use them.
Generally, composing consistent probability samplers with non-probabilistic samplers leads to undefined (or incorrect) adjusted counts anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Paraphrasing @yurishkuro in another thread:
I spent a lot of time thinking about this last year. The following is adapted from notes that I never quite edited enough to put up in a PR, but I did present to Sampling SIG in some biweekly meeting (though I can't find any notes/YouTube recording, unfortunately).
Also, I'll also note that I focused entirely on consistent probability samplers since I'm so enthusiastic about validly estimating span counts that I limited my analysis to a world where every sampler is a consistent probability sampler. For the OTel-instrumented projects I work on, I will always lobby hard for keeping our spans countable in a statistically valid manner.
Evaluating a tree of consistent probability samplers
When I was thinking about how to compute the weight/p-value/etc. remains well-defined in a tree of consistent probability samplers, I arrived at the following design:
Definitions:
First, start with a sampling policy: config specifying a tree of (predicate, sampler) pairs. Then, to make a sample/drop decision for a span you evaluate the tree:
true
, call that sampler 'active'.false
. Also discard any composite samplers with 0 active children. Call the resulting subtree the 'tree of active samplers'.(update p-value -> r-value or your preferred 'weight' notion as necessary)
Complication: Samplers with side effects
The above works well for stateless, no-side-effects samplers. But some samplers have side effects. For example, a sampler that collaborates with samplers in other processes in order to stay under a system-wide throughput limit does I/O to communicate with those processes. Or even simpler: a sampler that maintains some internal statistics and adapts its selectivity based only on what spans it has "seen". In what circumstances should it write updates to those statistics?
A policy containing a single effectful sampler that's activated on all traces has clear semantics: do all effects as normal. The semantics are less obvious when an effectful sampler only activates for some traces, or is composed with other samplers. In the 'tree-of-conditional-samplers' model, in what circumstances should a sampler perform its effects (vs. electing not to perform them at all)?
A sampler with side effects might have unconditional effects—effects triggered for every trace that is submitted to the policy—but it may want to also have effects that are conditional on things like the following:
You could imagine that samplers interested in knowing these facts could provide a callback which, if defined, the tree evaluator calls after a final decision has been made. It could invoke the callback with 'receipt' data that conveys 'Here are a bunch of facts about a decision just made, and your relation to it / impact on it.' The sampler could then do with that information what it will.
What systems confront and answer this question today?
Hopefully the above inspires some ideas! I'm not sure how much of this complexity must be addressed at this stage in order to avoid boxing ourselves in. I suspect much of it is 'severable', able to be punted on. I figured I'd raise it now though in case someone smarter than me sees something critical.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are challenges with stateful (or with other side effects) samplers, as well as with samplers that modify the tracestate. We've made some effort to provide a clear framework for these samplers to operate. We want to see how far we can go with useful enhancements to the samplers library without having to change the specs for the head samplers API, but that means we cannot provide support for your "used" and "pivotal" concepts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RuleBased
here reminds me of the consistent probability sampler I defined last year calledfirst
:The root sampler's p-value is that of the first active child sampler (child whose predicate is true). This is the evaluation model that both X-Ray and Refinery use, and expressiveness-wise it's a superset of Jaeger's config (a mapping from the pair (service name, span name) -> sampler). So I think it's a strong starting place for this conversation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@spencerwilson Do you have any references for it? I'll be happy to include them as prior art.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I never pushed any of the actual config speccing to the internet (prior to these comments, anyway).
For X-Ray and Refinery, their documentation are the primary sources. I probably linked to the relevant parts in my notes in #213.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here are some possible directions, for reference: