Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarification of StartTimeUnixNano in metrics proto #229

Closed
jrcamp opened this issue Oct 28, 2020 · 5 comments
Closed

Clarification of StartTimeUnixNano in metrics proto #229

jrcamp opened this issue Oct 28, 2020 · 5 comments
Labels
area:data-model priority:p2 release:allowed-for-ga Editorial changes that can still be added to the GA spec since they don't require action by SIGs spec:metrics

Comments

@jrcamp
Copy link

jrcamp commented Oct 28, 2020

The documentation around StartTimeUnixNano is a bit unclear because it says it MUST be set to the start of an interval:

// - StartTimeUnixNano MUST be set to the start of the interval when the data's
// type includes an AggregationTemporality. This field is not set otherwise.

Source

Then later it says 0 means it's unspecified.

// Value of 0 indicates that the timestamp is unspecified. In that case the
// timestamp may be decided by the backend.

Source

These seem a bit conflicting since I don't think an unspecified interval can be considered the start of an interval.

Also it's unclear what should be done when StartTimeUnixNano is unknown for a cumulative metric. Imagine that you are scraping an Apache status endpoint for "total_bytes_received" but you are unable to determine the start time of the server (when the metric was 0). What should StartTimeUnixNano be set to? (Perhaps depends on the outcome of the above issue.)

@bogdandrutu @jmacd

@jkwatson
Copy link
Contributor

jkwatson commented Oct 28, 2020

It seems like if you're scraping a cumulative metric, you should be recording that data via a SumObserver, which should use the LastValue aggregation, and end up as a Gauge-type metric in OTLP, which doesn't require a time range, IIRC.

@jrcamp
Copy link
Author

jrcamp commented Oct 29, 2020

It seems like if you're scraping a cumulative metric, you should be recording that data via a SumObserver, which should use the LastValue aggregation, and end up as a Gauge-type metric in OTLP, which doesn't require a time range, IIRC.

Looks like SumObserver ends up as Sum in OTLP. See this comment.

@jkwatson
Copy link
Contributor

It seems like if you're scraping a cumulative metric, you should be recording that data via a SumObserver, which should use the LastValue aggregation, and end up as a Gauge-type metric in OTLP, which doesn't require a time range, IIRC.

Looks like SumObserver ends up as Sum in OTLP. See this comment.

Hm. Well, if you don't have a start time, then you probably need to make it a ValueObserver, which does report as a gauge. It's a shame we lose a little bit of information in this approach (namely, that it's monotonic).

@andrewhsu andrewhsu added priority:p2 release:allowed-for-ga Editorial changes that can still be added to the GA spec since they don't require action by SIGs spec:metrics labels Nov 3, 2020
@jmacd
Copy link
Contributor

jmacd commented Mar 15, 2021

https://github.com/lightstep/opentelemetry-prometheus-sidecar takes an approach inherited from https://github.com/Stackdriver/stackdriver-prometheus-sidecar on this topic, which I've begun to think of as a cumulative-to-delta-to-cumulative transform. The idea is to leave a gap in the series at the point where a cumulative of unknown-reset-time appears, a point of transition where the actual rate of the series is unknown. The cumulative value first reported then becomes the reset value of the series, after which successive values are correctly timestamped cumulative / monotonic data points.

The gap in the series serves is reflected in one lost point, the reset value. In that gap, for all we know, any number of true resets may have taken place. Prometheus has similar ambiguity around resets, where instead of the gap proposed here, the value is presumed to have reset once any time the value falls from the previous known value in the series. https://github.com/lightstep/opentelemetry-prometheus-sidecar/blob/13e3b89cd20909a1dcc98d20e6fe1afdc437482a/retrieval/series_cache.go#L273

@jmacd
Copy link
Contributor

jmacd commented Apr 6, 2021

This is a duplicate of #292

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:data-model priority:p2 release:allowed-for-ga Editorial changes that can still be added to the GA spec since they don't require action by SIGs spec:metrics
Projects
None yet
Development

No branches or pull requests

4 participants