Don't allow two timeline_delete operations to run concurrently. #4313
1004 tests run: 962 passed, 0 failed, 42 skipped (full report)
The comment gets automatically updated with the latest test results.
f54eb39 at 2023-05-27T10:01:43.370Z
I think this is fine, because the tests asserted the response status code but I'll let Christian approve.
d664114 to 900f1af
Some comments in delete_timeline are now stale; please go through them and adjust.
Example
// NB: If you call delete_timeline multiple times concurrently, they will
// all go through the motions here. Make sure the code here is idempotent,
// and don't error out if some of the shutdown tasks have already been
// completed!
Other than that, I don't see why we need a mutex instead of an AtomicBool for this.
But I'm OK with using the tokio mutex.
We should really fix the #3478 issue, though.
Giving approval because I'm on vacation next week, but please fix those comments.
900f1af to e4a7624
Updated that and a few other comments. Also, this race cannot happen any more:
so I turned that into a panic instead.
The Mutex allows you to detect if another task is currently performing the deletion. With an AtomicBool, the boolean would need to indicate "is deletion in progress", and you'd need to be careful to set it back to false if the deletion fails half-way through. I think you'd need another bool for "did it already finish", or make it an atomic with three states:
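To make the three-state idea concrete, here is a rough sketch of what an atomic-based variant could look like. This is not code from the PR: the state names, `DeletionState`, `run_delete`, and `DeleteError` are all made up for illustration, and it is synchronous for brevity. It mainly shows the bookkeeping the comment warns about: the state has to be reset by hand when the deletion fails.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

// Hypothetical state encoding; these names are illustrative, not from the PR.
const NOT_STARTED: u8 = 0;
const IN_PROGRESS: u8 = 1;
const DONE: u8 = 2;

#[derive(Debug, PartialEq)]
pub enum DeleteError {
    AlreadyInProgress,
    Failed(String),
}

pub struct DeletionState(AtomicU8);

impl DeletionState {
    pub fn new() -> Self {
        DeletionState(AtomicU8::new(NOT_STARTED))
    }

    /// Runs `work` only if no other caller is currently deleting and the
    /// deletion has not already finished.
    pub fn run_delete<F>(&self, work: F) -> Result<(), DeleteError>
    where
        F: FnOnce() -> Result<(), String>,
    {
        // Atomically claim the deletion: NOT_STARTED -> IN_PROGRESS.
        match self.0.compare_exchange(
            NOT_STARTED,
            IN_PROGRESS,
            Ordering::AcqRel,
            Ordering::Acquire,
        ) {
            Ok(_) => {}
            Err(IN_PROGRESS) => return Err(DeleteError::AlreadyInProgress),
            // Assumption: a repeated request after completion is a no-op.
            Err(_) => return Ok(()),
        }

        match work() {
            Ok(()) => {
                self.0.store(DONE, Ordering::Release);
                Ok(())
            }
            Err(e) => {
                // The easy-to-forget part: reset the state so a retry is
                // possible. If `work` panics instead of returning an error,
                // the state stays IN_PROGRESS forever.
                self.0.store(NOT_STARTED, Ordering::Release);
                Err(DeleteError::Failed(e))
            }
        }
    }
}
```

With a mutex guard, the "in progress" part is released automatically when the guard is dropped, so only the "did it already finish" flag needs explicit handling.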
Currently, this uses
Yes, I'm doing that in #4314.
e4a7624 to f584db3
If the timeline is already being deleted, return an error. We used to notice the duplicate request and error out in persist_index_part_with_deleted_flag(), but it's better to detect it earlier. Add an explicit flag for the deletion.
Note: This doesn't do anything about the async cancellation problem (GitHub issue #3478): if the original HTTP request is dropped because the client disconnected, the timeline deletion stops half-way through the operation. That needs to be fixed, too, but that's a separate story.
Also, at the end of deletion we now really should find the timeline we're deleting in the 'timelines' map, so turn a failure to find it into a panic.
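For readers following along, the early detection described above could be sketched roughly as follows. This is a simplified illustration, not the PR's code: it uses std::sync::Mutex rather than the tokio mutex mentioned in the review so that it runs without extra crates, and `Timeline`, `delete_lock`, `delete`, and `DeleteTimelineError` are hypothetical names. Returning Ok(()) for an already-deleted timeline is also an assumption.

```rust
use std::sync::Mutex;

#[derive(Debug, PartialEq)]
pub enum DeleteTimelineError {
    AlreadyInProgress,
    Other(String),
}

pub struct Timeline {
    // Holding the lock means "a deletion is running right now".
    // The bool inside records whether a deletion has already completed.
    delete_lock: Mutex<bool>,
}

impl Timeline {
    pub fn new() -> Self {
        Timeline { delete_lock: Mutex::new(false) }
    }

    /// `steps` stands in for the actual deletion work.
    pub fn delete<F>(&self, steps: F) -> Result<(), DeleteTimelineError>
    where
        F: FnOnce() -> Result<(), String>,
    {
        // try_lock never blocks: if another caller holds the lock, report
        // the duplicate request right away. (In this sketch a poisoned lock
        // is reported the same way.)
        let mut deleted = self
            .delete_lock
            .try_lock()
            .map_err(|_| DeleteTimelineError::AlreadyInProgress)?;

        // Assumption: deleting an already-deleted timeline is a no-op.
        if *deleted {
            return Ok(());
        }

        // On failure the guard is dropped on return and the flag stays
        // false, so a later retry can proceed without any manual reset.
        steps().map_err(DeleteTimelineError::Other)?;

        *deleted = true;
        Ok(())
    }
}
```

Note that this sketch does nothing about cancellation either: it only rejects concurrent requests and allows a retry after a failure.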
f584db3 to f54eb39
(This is a simpler replacement for PR #4194. I'm also working on the cancellation shielding, I'll open separate PR for that.)