Add base class of rate limited queue #60

kaituo · 2021-05-24T19:06:41Z

Note: since there are a lot of dependencies, I only list the main class and test code to save reviewers' time. The build will fail due to missing dependencies. I will use this PR just for review and will not merge it. I will open one big PR at the end and merge it once all review PRs are approved. The code is currently missing unit tests; I am posting the PRs now to meet the cutoff date (June 1). I will add unit tests, run performance tests, and fix bugs before the official release.

Description

HCAD can bombard OpenSearch with "thundering herd" traffic, in which many entities make requests that need similar OpenSearch reads/writes at approximately the same time. To remedy this issue, we queue the requests and ensure that only a limited number of requests are outstanding for OpenSearch reads/writes at any time.

This PR adds the class RateLimitedQueue that is the parent abstract class of all of the queues. The process is asynchronous as the put and execute operations do not block each other. How to execute requests is abstracted out and left to RateLimitedQueue's subclasses to implement.
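As a rough illustration of the split described above (enqueueing on one side, subclass-defined execution on the other), here is a minimal sketch. It is not the PR's code: SketchRateLimitedQueue, process, and batchSize are hypothetical names, and a single ConcurrentLinkedQueue stands in for the real segment structure.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical, simplified stand-in for the abstract parent class.
abstract class SketchRateLimitedQueue<T> {
    // Lock-free queue, so producers never wait on a running execution.
    private final ConcurrentLinkedQueue<T> pending = new ConcurrentLinkedQueue<>();
    // Assumed knob: the maximum number of requests handed out per round.
    private final int batchSize;

    SketchRateLimitedQueue(int batchSize) {
        this.batchSize = batchSize;
    }

    // Enqueue a request without blocking.
    void put(T request) {
        pending.offer(request);
    }

    // Drain at most batchSize requests and hand them to the subclass.
    int process() {
        List<T> batch = new ArrayList<>();
        T next;
        while (batch.size() < batchSize && (next = pending.poll()) != null) {
            batch.add(next);
        }
        if (!batch.isEmpty()) {
            execute(batch);
        }
        return batch.size();
    }

    int size() {
        return pending.size();
    }

    // How a batch is executed is left to subclasses.
    protected abstract void execute(List<T> batch);
}
```

A subclass only supplies execute; limiting how many requests are outstanding comes from the bounded batch that process hands out.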

Each request is associated with a segment. That is, a queue consists of segments. Each segment has a priority: HIGH, MEDIUM, or LOW. An example of a HIGH priority request is an anomaly result that has an error or an anomaly grade larger than zero. An example of a MEDIUM priority request is a cold start request for an entity. An example of a LOW priority request is a checkpoint write request for a cold entity. LOW priority requests have the lowest chance of being selected for execution; MEDIUM and HIGH priority requests have higher chances. When the size of the queue exceeds its limit, LOW priority requests are more likely to be deleted than MEDIUM/HIGH priority requests.
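The eviction preference described above can be sketched as follows. This is an assumption-laden illustration, not the PR's implementation: SketchPriority and SketchPrioritizedBuffer are hypothetical names, the buffer is not thread-safe, and strict LOW-then-MEDIUM-then-HIGH eviction is a simplification of "LOW is more likely to be deleted".

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.EnumMap;
import java.util.Map;

// Hypothetical enum; declaration order doubles as eviction order (LOW first).
enum SketchPriority { LOW, MEDIUM, HIGH }

// Single-threaded sketch of a size-limited buffer with one segment per priority.
class SketchPrioritizedBuffer<T> {
    private final Map<SketchPriority, Deque<T>> segments = new EnumMap<>(SketchPriority.class);
    private final int limit;

    SketchPrioritizedBuffer(int limit) {
        this.limit = limit;
        for (SketchPriority p : SketchPriority.values()) {
            segments.put(p, new ArrayDeque<>());
        }
    }

    void add(SketchPriority priority, T request) {
        segments.get(priority).addLast(request);
        evictIfNeeded();
    }

    int size(SketchPriority priority) {
        return segments.get(priority).size();
    }

    int totalSize() {
        int total = 0;
        for (Deque<T> segment : segments.values()) {
            total += segment.size();
        }
        return total;
    }

    // Drop the oldest requests, starting from LOW, until the total fits the limit.
    private void evictIfNeeded() {
        for (SketchPriority p : SketchPriority.values()) {
            Deque<T> segment = segments.get(p);
            while (totalSize() > limit && !segment.isEmpty()) {
                segment.pollFirst();
            }
        }
    }
}
```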

Testing done:

Manual tests using 10 HCAD detectors and 12,000 entities in a 3 node cluster.

Issues Resolved

[List any issues this PR will resolve]

Check List

[ Y ] Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.


src/main/java/org/opensearch/ad/ratelimit/EntityFeatureRequest.java

src/main/java/org/opensearch/ad/ratelimit/RateLimitedQueue.java

jmazanec15 · 2021-05-24T20:49:09Z

src/main/java/org/opensearch/ad/ratelimit/RateLimitedQueue.java

+ */
+public abstract class RateLimitedQueue<RequestType extends QueuedRequest> implements MaintenanceState {
+    /**
+     * Each request is associated with a segment. That is, a queue consists of segments.


Where does the terminology segment come from? Why doesn't the queue just consist of requests?

I coined the word. Different requests have different priorities, and I want to differentiate them using segments.

I see. Segment doesn't seem intuitive to me, but I'm not sure I have a better name. A segment just appears to be a queue of requests, so why can't it just be RequestQueue?

It is a bit strange to call a variable queue inside a queue. Any other name?

But a segment is just a queue of requests with additional metadata describing when it was last accessed. Therefore, RequestQueue or AccessTimeStampedRequestQueue are good names: I can understand what the class is just from the name.

How about we rename RateLimitedQueue to RateLimitedRequestWorker and rename segment to RequestQueue?
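For readers following the naming discussion, a "queue of requests plus last-access metadata" could look roughly like the sketch below. SketchRequestQueue and isExpired are hypothetical; in particular, using the access time to decide expiry is an assumption about why the timestamp is kept, not something this thread states.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch of a request queue that remembers when it was last touched.
class SketchRequestQueue<T> {
    // Data structure to hold requests.
    private final BlockingQueue<T> content = new LinkedBlockingQueue<>();
    // Updated on every offer/poll.
    private volatile Instant lastAccessTime = Instant.now();

    boolean offer(T request) {
        lastAccessTime = Instant.now();
        return content.offer(request);
    }

    T poll() {
        lastAccessTime = Instant.now();
        return content.poll();
    }

    int size() {
        return content.size();
    }

    // Assumed use of the metadata: true if the queue was idle for longer than ttl at time now.
    boolean isExpired(Duration ttl, Instant now) {
        return lastAccessTime.plus(ttl).isBefore(now);
    }
}
```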

src/main/java/org/opensearch/ad/ratelimit/RateLimitedQueue.java

src/main/java/org/opensearch/ad/ratelimit/QueuedRequest.java

weicongs-amazon · 2021-05-24T21:21:37Z

src/main/java/org/opensearch/ad/ratelimit/RateLimitedQueue.java

+        private Instant lastAccessTime;
+        // data structure to hold requests
+        private BlockingQueue<RequestType> content;


Why do we use different ways to access these two variables? On line 229, segment.content is used to access content directly, but accessor methods such as setLastAccessTime are also provided.

yeah, removed setLastAccessTime and access variables directly.

weicongs-amazon · 2021-05-24T21:23:05Z

src/main/java/org/opensearch/ad/ratelimit/RateLimitedQueue.java

+    // For medium priority requests, the segment id is detector id. The objective
+    // is to separate requests from different detectors and fairly process requests
+    // from each detector.
+    protected final ConcurrentSkipListMap<String, Segment> requestSegments;


Do we need this map? As far as I can see, this class is always aware of segment priority, so we could add three separate queue fields instead of a map.

For medium priority requests, the segment id is the detector id, so we need the map.

weicongs-amazon · 2021-05-24T21:34:50Z

src/main/java/org/opensearch/ad/ratelimit/RateLimitedQueue.java

+ *
+ * @param <RequestType> Individual request type that is a subtype of ADRequest
+ */
+public abstract class RateLimitedQueue<RequestType extends QueuedRequest> implements MaintenanceState {


Should we use a better name? I think this class isn't just a queue; it also includes the logic to handle the requests.

Any suggestion?

weicongs-amazon · 2021-05-24T21:54:58Z

src/main/java/org/opensearch/ad/ratelimit/RateLimitedQueue.java

+        }
+
+        public void put(RequestType request) throws InterruptedException {
+            this.content.put(request);


Minor: put is a blocking call, although blocking is rare since the queue capacity is Integer.MAX_VALUE. If we want a capacity limit in the future, we probably don't want the putting thread to block when the queue is full.

The queue's size is maintained elsewhere; please check RateLimitedQueue.maintainForMemory. I know the queue won't be full, so I don't do anything about the blocking call here.
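For reference, the non-blocking alternative mentioned above is BlockingQueue.offer, which returns false when a bounded queue is full instead of waiting. A minimal sketch (SketchNonBlockingPut and tryPut are hypothetical names, not part of this PR):

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class SketchNonBlockingPut {
    // Returns false instead of blocking when the queue is at capacity.
    static <T> boolean tryPut(BlockingQueue<T> queue, T request) {
        return queue.offer(request);
    }

    // Convenience factory for a bounded queue; the capacity is the caller's choice.
    static <T> BlockingQueue<T> bounded(int capacity) {
        return new LinkedBlockingQueue<>(capacity);
    }
}
```

With the default LinkedBlockingQueue constructor the capacity is Integer.MAX_VALUE, so put and offer behave the same in practice; the difference only matters once a real bound is introduced.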

weicongs-amazon · 2021-05-24T23:54:15Z

src/main/java/org/opensearch/ad/ratelimit/EntityFeatureRequest.java

+    private final long dataStartTimeMillis;
+
+    public EntityFeatureRequest(
+        long expirationEpochMs,


How is this value decided? Does it need to be set per request?

The expiry time is the start timestamp of the next detector run. Yes, it needs to be set per request.

ylwu-amzn · 2021-05-25T17:02:07Z

src/main/java/org/opensearch/ad/ratelimit/RateLimitedQueue.java

+
+    private static final Logger LOG = LogManager.getLogger(RateLimitedQueue.class);
+
+    protected int queueSize;


Add volatile, since it will be used in the setting update consumer on line 159?
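To spell out the concern: if the limit is written by a settings-update callback on one thread and read by worker threads, a plain int gives no visibility guarantee, while volatile does. A minimal sketch, using java.util.function.Consumer as a stand-in for the settings framework (SketchResizableLimit and updateConsumer are hypothetical names):

```java
import java.util.function.Consumer;

class SketchResizableLimit {
    // volatile: written by the settings-update thread, read by worker threads.
    private volatile int queueSize;

    SketchResizableLimit(int initialSize) {
        this.queueSize = initialSize;
    }

    // Hypothetical hook; a settings framework would invoke it on its own thread.
    Consumer<Integer> updateConsumer() {
        return newSize -> this.queueSize = newSize;
    }

    boolean isSizeExceeded(int currentSize) {
        return currentSize > queueSize;
    }
}
```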

ylwu-amzn · 2021-05-25T17:21:46Z

src/main/java/org/opensearch/ad/ratelimit/RateLimitedQueue.java

+        try {
+            Segment requestQueue = requestSegments
+                .computeIfAbsent(
+                    SegmentPriority.MEDIUM == request.getPriority() ? request.getDetectorId() : request.getPriority().name(),


Why only consider MEDIUM priority here? Can you add some comments?

Because only medium priority segments use the detector id as the key of the segment map. Low and high priority requests just use the segment priority (i.e., low or high) as the key of the segment map. Added the comments.
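The key choice explained above can be illustrated with a small sketch. SketchSegmentPriority, SketchSegmentRouter, segmentId, and route are hypothetical names, and requests are plain strings for brevity; only the ternary mirrors the diff hunk.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for the PR's SegmentPriority enum.
enum SketchSegmentPriority { LOW, MEDIUM, HIGH }

class SketchSegmentRouter {
    // Map from segment id to its queue of requests.
    final ConcurrentSkipListMap<String, BlockingQueue<String>> segments = new ConcurrentSkipListMap<>();

    // MEDIUM requests get one segment per detector; HIGH and LOW share one segment each.
    static String segmentId(SketchSegmentPriority priority, String detectorId) {
        return SketchSegmentPriority.MEDIUM == priority ? detectorId : priority.name();
    }

    void route(SketchSegmentPriority priority, String detectorId, String request) {
        segments
            .computeIfAbsent(segmentId(priority, detectorId), k -> new LinkedBlockingQueue<>())
            .offer(request);
    }
}
```

Routing MEDIUM requests from detectors d1 and d2 plus one HIGH and one LOW request yields four segments: HIGH, LOW, d1, and d2.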

ylwu-amzn · 2021-05-25T17:26:34Z

src/main/java/org/opensearch/ad/ratelimit/RateLimitedQueue.java

+    // map from segment Id to its segment.
+    // For high priority requests, the segment id is SegmentPriority.HIGH.name().
+    // For low priority requests, the segment id is SegmentPriority.LOW.name().
+    // For medium priority requests, the segment id is detector id. The objective


Why not separate high priority requests at the detector level?

It makes things more complicated and I don't find the need now :)

ylwu-amzn · 2021-05-25T17:34:07Z

src/main/java/org/opensearch/ad/ratelimit/RateLimitedQueue.java

+
+                BlockingQueue<RequestType> requests = segment.content;
+
+                if (requests != null && false == requests.isEmpty()) {


minor: false == requests.isEmpty() -> !requests.isEmpty() ?

It is a style choice: I try to use false == instead of ! as it is easier to read: https://stackoverflow.com/questions/11831881/if-boolean-false-vs-if-boolean

Different people have different views: https://softwareengineering.stackexchange.com/questions/136908/why-use-boolean-variable-over-boolean-variable-false
Not a big problem.

ylwu-amzn · 2021-05-25T17:36:42Z

src/main/java/org/opensearch/ad/ratelimit/RateLimitedQueue.java

+                    startId = requestSegments.higherKey(startId);
+                }
+
+                if (startId.equals(SegmentPriority.LOW.name())) {


Should we have different logic for high priority as well? Do we just pull high and medium requests in a round-robin way?

Yes, we just pull high and medium requests in a round-robin way. We can definitely add a special branch for high priority requests in the future :)
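A round-robin walk over a sorted map can be built on higherKey, as in the hunk above. The sketch below adds wrap-around to the first key; SketchRoundRobin and nextKey are hypothetical, and the wrap-around and null handling are assumptions rather than a copy of the PR's loop.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

class SketchRoundRobin {
    // Returns the key after current in sorted order, wrapping to the first key.
    // Returns null when the map is empty.
    static <V> String nextKey(ConcurrentSkipListMap<String, V> map, String current) {
        String next = current == null ? null : map.higherKey(current);
        if (next != null) {
            return next;
        }
        // firstEntry returns null on an empty map, whereas firstKey would throw.
        Map.Entry<String, V> first = map.firstEntry();
        return first == null ? null : first.getKey();
    }
}
```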

ylwu-amzn · 2021-05-25T17:48:11Z

src/main/java/org/opensearch/ad/ratelimit/RateLimitedQueue.java

+            // remove until reaching below queueSize
+            do {
+                prune(requestSegments);
+            } while (isSizeExceeded());


Can we calculate how many requests we should prune directly to avoid this while loop?

Yes. Created an issue: #66

Will do it after cutoff.
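One possible shape for that follow-up is to compute the excess once and remove exactly that many requests, rather than re-checking the size after every prune call. This is only a sketch of the idea on a single queue: SketchDirectPrune and pruneExcess are hypothetical names, and how the excess would be distributed across segments is left open.

```java
import java.util.Queue;

class SketchDirectPrune {
    // Removes the oldest (queue.size() - limit) elements, if any, and returns the count removed.
    static <T> int pruneExcess(Queue<T> queue, int limit) {
        // The number of requests to drop is computed once, up front.
        int excess = Math.max(0, queue.size() - limit);
        int removed = 0;
        while (removed < excess && queue.poll() != null) {
            removed++;
        }
        return removed;
    }
}
```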

This PR is a conglomerate of the following PRs. #60 #64 #65 #67 #68 #69 #70 #71 #74 #75 #76 #77 #78 #79 #82 #83 #84 #92 #94 #93 #95 kaituo#1 kaituo#2 kaituo#3 kaituo#4 kaituo#5 kaituo#6 kaituo#7 kaituo#8 kaituo#9 kaituo#10 This spreadsheet contains the mappings from files to PR number (bug fix in my AD fork and tests are not included): https://gist.github.com/kaituo/9e1592c4ac4f2f449356cb93d0591167


kaituo requested review from weicongs-amazon and jmazanec15 May 24, 2021 19:07

jmazanec15 reviewed May 24, 2021


Remove unneeded license, bug fix, clean comments

8370a8c

weicongs-amazon reviewed May 24, 2021


ylwu-amzn reviewed May 25, 2021


jmazanec15 approved these changes May 25, 2021


Rename queue and segment

3f78926

ylwu-amzn approved these changes May 26, 2021


weicongs-amazon approved these changes May 26, 2021


kaituo closed this Jun 2, 2021

kaituo mentioned this pull request Jul 6, 2021

multi-category support, rate limiting, and pagination #121

Merged



