-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable circuit breaker for the preview API #11
Conversation
The preview API runs heavy queries/aggregation and modeling. We risk bringing the cluster down if there are too many preview requests. This PR installs a circuit breaker check for the preview API. Testing done: 1. added unit tests.
@@ -120,6 +126,10 @@ void previewExecute( | |||
ThreadContext.StoredContext context, | |||
ActionListener<PreviewAnomalyDetectorResponse> listener | |||
) { | |||
if (adCircuitBreakerService.isOpen()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we track memory usage of preview API?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, we don't currently
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a blocking issue, but I think it can helps if we track memory usage of preview API so we can reject too many preview API calls before circuit breaker triggered.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
preview API only lasts about seconds. Getting reserved memory can triggers evicting models in shared cache. I don't want that disturbance to happen if possible. How about set a max concurrent preview API run (e.g., 5)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if preview API can finish in seconds for multi-category HC. I'm ok to limit concurrent preview API run.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right, I showed the 2 hours data. Will add the code in the new commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added concurrent preview API limit. Also replaced scriipt in terms aggr with composite aggregation because a runtime more than 1 minute will throw the preview API useless.
…egation with composite aggregation
opensearch-project#128) * Improve preview and exception handling in AnomalyResultTransportAction This PR commits changes from the following PRs: kaituo#11 kaituo#12
opensearch-project#128) * Improve preview and exception handling in AnomalyResultTransportAction This PR commits changes from the following PRs: kaituo#11 kaituo#12
…n (#128) * Improve preview and exception handling in AnomalyResultTransportAction This PR commits changes from the following PRs: kaituo/anomaly-detection-1#11 kaituo/anomaly-detection-1#12
Description
The preview API runs heavy queries/aggregation and modeling. We risk bringing the cluster down if there are too many preview requests. This PR installs a circuit breaker check for the preview API.
Testing done:
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.