-
Notifications
You must be signed in to change notification settings - Fork 209
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BUG] Ownership can timeout on full buffer for pull based sources #4422
Comments
Maybe we can just have buffer accumulator take in a callback that runs when the buffer times out. |
I'm not sure how that would be different. When it times out, doesn't the current thread continue and then iterate back to getting ownership? Or is there something in between? If I understand the problem correctly, the ownership is expiring during the write to the buffer. |
@dlvenable Buffer accumulator currently will block and retry internally here ( Line 73 in ef39d4f
Line 93 in ef39d4f
|
Describe the bug
we currently update source coordination ownership for partitions synchronously in pull based sources like S3, OpenSearch, and DynamoDB. This happens in a loop approximately every 2 minutes, but when the buffer is very full, we spend time retrying to write to the buffer, which leads to expiring ownership of the partition, and reprocessing of that partition by another node of Data Prepper
Expected behavior
Asynchronously update ownership every 2 minutes without depending on the primary loop. For example, this is done here for DynamoDB (
data-prepper/data-prepper-plugins/dynamodb-source/src/main/java/org/opensearch/dataprepper/plugins/source/dynamodb/export/DataFileLoader.java
Line 206 in a20756c
Alternative consideration
Increase the ownership timeout to be a higher value or check ownership updates in between attempts to write to the buffer
Screenshots
If applicable, add screenshots to help explain your problem.
Environment (please complete the following information):
Additional context
Add any other context about the problem here.
The text was updated successfully, but these errors were encountered: