Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Promote cardinality limit view API from experimental to stable #5444

Closed
CodeBlanch opened this issue Mar 13, 2024 · 11 comments · Fixed by #5926
Closed

Promote cardinality limit view API from experimental to stable #5444

CodeBlanch opened this issue Mar 13, 2024 · 11 comments · Fixed by #5926
Assignees
Labels
help wanted Good for taking. Extra help will be provided by maintainers metrics Metrics signal related pkg:OpenTelemetry Issues related to OpenTelemetry NuGet package
Milestone

Comments

@CodeBlanch
Copy link
Member

We have an experimental api now for setting cardinality limit via the view api (see: #5312 & #5328).

This issue is for tracking making this a stable feature.

@CodeBlanch CodeBlanch added pkg:OpenTelemetry Issues related to OpenTelemetry NuGet package metrics Metrics signal related needs-spec-change Issues which require the OpenTelemetry Specification to clarify or define behavior labels Mar 13, 2024
@CodeBlanch CodeBlanch added this to the 1.9.0 milestone Mar 13, 2024
@cijothomas cijothomas changed the title Promote cardinality limit view AP from experimental to stable Promote cardinality limit view API from experimental to stable Mar 19, 2024
@NickCraver
Copy link
Contributor

@CodeBlanch is this likely to make it into an upcoming release? We have severe memory issues with metrics due to needing high cardinality (for compatibility) on a single metric. Unless I'm mistaken, there's not a per-metric way to set higher cardinality until this lands and no current packages expose it, so we're in a tough spot on options here. Knowing where this stands would help me figure out timelines on options to unblock the next release for App Service.

@cijothomas
Copy link
Member

@CodeBlanch This should require no spec support, given we already support cardinality limit on MP, so adding it per Metric should be continuation of that. The current default behavior of drop-when-limit can be continued. For folding overflow into the special overflow bucket - requires spec change.

@vishweshbankwar vishweshbankwar modified the milestones: 1.9.0, Future May 21, 2024
@NickCraver
Copy link
Contributor

@CodeBlanch @cijothomas Checking in since the global limit is costing us an extra 11 terabytes of memory at all times. When do we expect to land this at a more granular level? Or is that already the case and this issue is behind current status?

@cijothomas
Copy link
Member

@NickCraver The status is still the same, the API is not part of stable release yet. Unfortunately, no firm ETA for that as well, given there is some spec dependency (though it can be argued that the spec dependency can be ignored if the overflow behavior is just drop).

@cijothomas
Copy link
Member

open-telemetry/opentelemetry-specification#3904 Spec issue tracking this.

@NickCraver
Copy link
Contributor

@cijothomas I could be misunderstanding the current state - isn't the behavior currently/already drop and log? I'm trying to understand how the spec for what to do when exceeding the limit is a blocker for setting the limit, can't the latter come before, with the former being further enhancement or refinement? It seems like they're related but orthogonal concerns we could parallel.

@cijothomas
Copy link
Member

You are right!
Current: The current default behavior is to drop when limit is exceeded. The limit is allowed to be configured only on Provider level.
Experimental: The experimental-spec is to "don't drop when limit exceeds, but put into overflow=true bucket". -- This is experimental in OTel .NET too, enabled via OTEL_DOTNET_EXPERIMENTAL_METRICS_EMIT_OVERFLOW_ATTRIBUTE=true

Also experimental is the ability to set cardinality on a per metric basis (using Views). This can be stabilized with the default behavior of dropping, and opt-in behavior to put into overflow bucket. Once spec is stabilized, this can be changed to default to overflow. But I believe the desire (based on what I gather from a community call few months ago!) is to do both together, to make the transition smoother.

@CodeBlanch CodeBlanch modified the milestones: Future, 1.10.0 Oct 7, 2024
@CodeBlanch CodeBlanch added help wanted Good for taking. Extra help will be provided by maintainers and removed needs-spec-change Issues which require the OpenTelemetry Specification to clarify or define behavior labels Oct 7, 2024
@CodeBlanch
Copy link
Member Author

CodeBlanch commented Oct 7, 2024

Update...

The spec is now stable so we should be able to promote our API to stable (remove experimental API stuff).

open-telemetry/opentelemetry-specification#4222

[edit]

For anyone who wants to pick this up, here is a previous PR which promoted something to stable which can be used as a sort of guide for how to do it: #5607

@xiang17
Copy link
Contributor

xiang17 commented Oct 7, 2024

I'd like to take this one.

@cijothomas
Copy link
Member

@xiang17 Can you confirm that you will take care of promoting the per metric cardinality limit to stable and also, remove the need for OTEL_DOTNET_EXPERIMENTAL_METRICS_EMIT_OVERFLOW_ATTRIBUTE env variable?

@xiang17
Copy link
Contributor

xiang17 commented Oct 17, 2024

@cijothomas Yes I'll do that as well. It's tracked in this issue: #5641.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Good for taking. Extra help will be provided by maintainers metrics Metrics signal related pkg:OpenTelemetry Issues related to OpenTelemetry NuGet package
Projects
None yet
5 participants