Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add rule-based sampler #4152

Closed
wants to merge 5 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,8 @@ release.

### Traces

- Add `RulesBased` sampler. [#4152](https://github.com/open-telemetry/opentelemetry-specification/pull/4152)

### Metrics

### Logs
Expand Down
1 change: 1 addition & 0 deletions spec-compliance-matrix.md
Original file line number Diff line number Diff line change
Expand Up @@ -79,6 +79,7 @@ formats is required. Implementing more than one format is optional.
| [Sampling](specification/trace/sdk.md#sampling) | Optional | Go | Java | JS | Python | Ruby | Erlang | PHP | Rust | C++ | .NET | Swift |
| Allow samplers to modify tracestate | | + | + | | + | + | + | + | + | + | + | + |
| ShouldSample gets full parent Context | | + | + | + | + | + | + | + | + | + | - | + |
| Sampler: [RulesBasedSampler](specification/trace/sdk.md#rulebased) | Optional | | | | | | | | | | | |
| Sampler: JaegerRemoteSampler | | + | + | | | | | | + | | | |
| [New Span ID created also for non-recording Spans](specification/trace/sdk.md#sdk-span-creation) | | + | + | | + | + | + | + | + | + | - | + |
| [IdGenerators](specification/trace/sdk.md#id-generators) | | + | + | | + | + | + | + | + | + | | + |
Expand Down
59 changes: 59 additions & 0 deletions specification/trace/sdk.md
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,7 @@ linkTitle: SDK
+ [TraceIdRatioBased](#traceidratiobased)
- [Requirements for `TraceIdRatioBased` sampler algorithm](#requirements-for-traceidratiobased-sampler-algorithm)
+ [ParentBased](#parentbased)
+ [RuleBased](#rulebased)
+ [JaegerRemoteSampler](#jaegerremotesampler)
- [Span Limits](#span-limits)
- [Id Generators](#id-generators)
Expand Down Expand Up @@ -447,6 +448,64 @@ Optional parameters:
|present|false|true|`localParentSampled()`|
|present|false|false|`localParentNotSampled()`|

#### RuleBased

A `RuleBased` sampler tests a span against a list of rules, and delegates the
sampling decision to another sampler when a rule matches. If no rules match, a
fallback sampler is used to make the sampling decision.

The obvious use-case for a `RuleBased` sampler is to avoid tracing frequent and
uninteresting transactions such as health checks.

* A `RuleBased` sampler MUST be configured with a set of rules and a fallback
`Sampler` to use if the `Span` does not match any of the configured rules. The
rules MUST be processed in the order they were provided.
* A `Rule` requires a list of tests, and a delegate `Sampler` to be invoked if
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This describes a specific implementation of Rule. As an abstraction a Rule is just a Sampler, is it not?

And the RuleBased sampler itself seems just like a stand-in for logical expression, discussed in #1844

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This describes a specific implementation of Rule. As an abstraction a Rule is just a Sampler, is it not?

No, it's not a Sampler by itself. A rule is a logical expression which is applied to a component of ShouldSample (eg span name, or attributes), and will evaluate to true or false. If all of the rules in a set are true, then the associated Sampler is invoked.

And the RuleBased sampler itself seems just like a stand-in for logical expression, discussed in #1844

It's related to #1844, because that spawned Java's RuleBasedRoutingSampler which in turn influenced this PR. So I think this PR is the implementation that resolves #1844 at spec level.

Copy link
Member

@yurishkuro yurishkuro Jul 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a logical expression which is applied to [a span], and will evaluate to true or false.

That's almost exactly the definition of a sampler.

So I think this PR is the implementation that resolves #1844 at spec level.

Actually I meant this OTEP https://github.com/open-telemetry/oteps/pull/250/files.

all of the tests match the `Span` being sampled. If multiple tests are
configured, then the rule MUST pass if *all* tests pass.
* If no rules match, the sampling decision MUST be made by the configured
fallback sampler.
* A test can be based on the parameters available to [`ShouldSample`](#shouldsample):
* `parent`
* `sampled` - boolean representing whether the parent is sampled
* `remote` - boolean representing whether the parent is remote
* `span_name`
* `pattern` - regular expression to match on
* `span_kind`
* `kind` - one of [SpanKind](./api.md#spankind), for example `SERVER`
* `attributes`
* `key` and `pattern` - attribute key and regular expression pattern to match on
* `link`
* `sampled` - boolean representing whether a link is sampled
* `remote` - boolean representing whether a link is remote
* Description MUST start with `RuleBased`, and MAY contain further details of the
configuration, such as `RuleBasedSampler{rules=[RuleSet{rules=[Attribute{key=http.request.url,pattern=^/health$}],delegate=always_off}],fallback=parent_based_always_on}`

An example YAML-based configuration for a `RuleBased` sampler with multiple rules:

```yaml
sampler:
rule_based:
rule_sets:
- rules:
- span_kind: { kind: SERVER }
- attribute: { key: url.path, pattern: "^/health$" }
delegate:
always_off: {}
- rules:
- link: { sampled: true }
delegate:
always_on: {}
fallback:
parent_based_always_on: {}
```

The above sampler will:

* Use an `AlwaysOff` sampler for any `SERVER` span with a url path matching the regular expression `^/health$`
* Use an `AlwaysOn` sampler for any span with a link that is sampled
* Use a `ParentBased,AlwaysOn` sampler for all other spans

#### JaegerRemoteSampler

[Jaeger remote sampler][jaeger-remote-sampling] allows remotely controlling the sampling configuration for the SDKs. The sampling configuration is periodically loaded from the backend (see [Remote Sampling API][jaeger-remote-sampling-api]), where it can be managed by operators via configuration files or even automatically calculated (see [Adaptive Sampling][jaeger-adaptive-sampling]). The sampling configuration retrieved by the remote sampler can instruct it to use either a single sampling method for the whole service (e.g., `TraceIdRatioBased`), or different methods for different endpoints (span names), for example, sample `/product` endpoint at 10%, `/admin` endpoint at 100%, and never sample `/metrics` endpoint.
Expand Down
Loading