Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Report exponential_avg_bucket_processing_time which gives more weight to recent buckets #43189

Merged

Conversation

przemekwitek
Copy link
Contributor

Report exponential_avg_bucket_processing_time which gives more weight to recent buckets.

Related issue: #29857

@elasticmachine
Copy link
Collaborator

Pinging @elastic/ml-core

@przemekwitek przemekwitek force-pushed the anomaly_detector_weighted_avg branch from 32552f0 to cdb756d Compare June 13, 2019 09:00
@przemekwitek przemekwitek changed the title Report timing stats as part of the Job stats response Report exponential_avg_bucket_processing_time which gives more weight to recent buckets Jun 13, 2019
@przemekwitek przemekwitek removed the WIP label Jun 13, 2019
@przemekwitek przemekwitek marked this pull request as ready for review June 13, 2019 10:24
@przemekwitek przemekwitek force-pushed the anomaly_detector_weighted_avg branch from 191295e to 3c91f8c Compare June 13, 2019 11:36
@droberts195
Copy link
Contributor

Now the timing_stats object is complete it would be good to add it to the docs in docs/reference/ml/apis/jobcounts.asciidoc.

Also, it would be good to add an example timing_stats (that's believable given the other stats) into the example response in docs/reference/ml/apis/get-job-stats.asciidoc.

Copy link
Member

@benwtrent benwtrent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think adding docs and updating the yml REST tests would be good to do in this PR as well. My only comment in the review is not that important :D

/**
* Updates the statistics (min, max, avg) for the given data point (bucket processing time).
*/
public void updateStats(double bucketProcessingTimeMs) {
if (bucketProcessingTimeMs < 0.0) {
throw new IllegalArgumentException("bucketProcessingTimeMs must be positive, was: " + bucketProcessingTimeMs);
throw new IllegalArgumentException("bucketProcessingTimeMs must be non-negative, was: " + bucketProcessingTimeMs);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#pedantry
So, is 0.0 considered a negative number? 😂

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if you've seen the latest revision of this comment ("bucketProcessingTimeMs must be non-negative, was: ") or not. But now I believe it is correct i.e. if the value is less than 0 we throw an error that it should be non-negative (so 0 or greater than 0).

@przemekwitek
Copy link
Contributor Author

Now the timing_stats object is complete it would be good to add it to the docs in docs/reference/ml/apis/jobcounts.asciidoc.

Done.

Also, it would be good to add an example timing_stats (that's believable given the other stats) into the example response in docs/reference/ml/apis/get-job-stats.asciidoc.

Done.

@przemekwitek
Copy link
Contributor Author

I think adding docs and updating the yml REST tests would be good to do in this PR as well. My only comment in the review is not that important :D

Done.

(double) Maximum among all bucket processing times in milliseconds.

`average_bucket_processing_time_ms`::
(double) Cumulative moving average of all bucket processing times in milliseconds.
Copy link
Contributor

@droberts195 droberts195 Jun 14, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this contain the word "moving"?

To match the wording of min and max it could just be "Average of all bucket processing times in milliseconds."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

- gt: { jobs.0.timing_stats.minimum_bucket_processing_time_ms: 0.0 }
- gt: { jobs.0.timing_stats.maximum_bucket_processing_time_ms: 0.0 }
- gt: { jobs.0.timing_stats.average_bucket_processing_time_ms: 0.0 }
- gt: { jobs.0.timing_stats.exponential_average_bucket_processing_time_ms: 0.0 }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably safer to use gte on lines 104-107 and 143-146. The jobs are only creating results for 1 bucket, and if the bucket processing time for that 1 bucket happens to round down to 0 then the gt condition will fail.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Contributor

@droberts195 droberts195 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@przemekwitek
Copy link
Contributor Author

run elasticsearch-ci/default-distro

@przemekwitek przemekwitek force-pushed the anomaly_detector_weighted_avg branch from fda922a to 945cc39 Compare June 14, 2019 11:27
@przemekwitek
Copy link
Contributor Author

run elasticsearch-ci/1

Copy link
Member

@benwtrent benwtrent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📊

@przemekwitek przemekwitek force-pushed the anomaly_detector_weighted_avg branch from 945cc39 to c322f1d Compare June 14, 2019 19:02
@przemekwitek przemekwitek merged commit 13596c8 into elastic:master Jun 16, 2019
@przemekwitek przemekwitek deleted the anomaly_detector_weighted_avg branch June 16, 2019 18:41
przemekwitek added a commit to przemekwitek/elasticsearch that referenced this pull request Jun 16, 2019
przemekwitek added a commit that referenced this pull request Jun 17, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants