Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Request duration metric for object stores is inaccurate #3622

Closed
evenyag opened this issue Apr 1, 2024 · 0 comments
Closed

Request duration metric for object stores is inaccurate #3622

evenyag opened this issue Apr 1, 2024 · 0 comments
Labels
good first issue Good for newcomers

Comments

@evenyag
Copy link
Contributor

evenyag commented Apr 1, 2024

What type of enhancement is this?

Tech debt reduction

What does the enhancement do?

The timer of REQUESTS_DURATION_SECONDS in the prometheus metrics layer is observed after the inner method returns.

async fn write(&self, path: &str, args: OpWrite) -> Result<(RpWrite, Self::Writer)> {
REQUESTS_TOTAL
.with_label_values(&[&self.scheme, Operation::Write.into_static()])
.inc();
let timer = REQUESTS_DURATION_SECONDS
.with_label_values(&[&self.scheme, Operation::Write.into_static()])
.start_timer();
let write_res = self
.inner
.write(path, args)
.map(|v| {
v.map(|(rp, r)| {
(
rp,
PrometheusMetricWrapper::new(r, Operation::Write, &self.scheme),
)
})
})
.await;
timer.observe_duration();

However, the inner method returns a writer which indicates that the actual write operation does not perform yet. We should move the timer into the PrometheusMetricsWrapper and observe it after the writer is done. So the duration we collect is inaccurate.

Other methods also have similar issues:

  • read()
  • blocking_read()
  • blocking_write()

Implementation challenges

OpenDAL's metrics layer already fixed this issue. We could refer to their implementation.
https://github.com/apache/opendal/blob/50791255c6ef3ed259c88dcc04a4295fa60fa443/core/src/layers/metrics.rs#L489-L505

The MetricWrapper in that layer also updates the histogram in drop().
https://github.com/apache/opendal/blob/50791255c6ef3ed259c88dcc04a4295fa60fa443/core/src/layers/metrics.rs#L765-L773

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue Good for newcomers
Projects
None yet
Development

No branches or pull requests

2 participants