-
Notifications
You must be signed in to change notification settings - Fork 593
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CORE-2321: cloud_storage: Fix remote_partition stuck reader #17805
Conversation
The 'partition_record_batch_reader_impl' uses internal 'remote_segment_batch_reader' to fetch data. When the internal reader reaches the 'max_offset' it returns an empty result. The 'partition_record_batch_reader_impl::do_load_slice' also returns the empty result in this case. Since the top level reader is invoked by the 'model::record_batch_reader'. It has an internal async loop which is running until the EOS is reached. If its internal slice is not empty it's invoking a user provided callback, then it checks if the stream reached EOS (it exits if this is the case), and then it calls 'load_slice' of the underlying stream (this is the place where 'do_load_slice' method of the 'remote_segment_batch_reader' is invoked). Since the low level segment reader reached max_offset it will always return an empty value and the top level record batch reader will loop forever. This commit breaks this loop by checking the current offset of the segment reader if the empty result is returned. If the current offset overshoot max-offset it resets the low level stream. The top level loop will exit after that.
Add fuzz test that generates data with a lot of tx-fence batches and scans the entire partition reading one batch at a time by setting max_bytes=1 and max_offset to 'reader_config.start_offset + 1'.
The exception is no longer thrown anywhere so it's safe to remove the 'catch' block.
// Reader overshoot the offset. If we will not reset | ||
// the stream the loop inside the | ||
// record_batch_reader will keep calling this method | ||
// again and again. We will be returning empty | ||
// result every time because the current offset | ||
// overshoot the max allowed offset. Resetting the | ||
// segment reader fixes the issue. | ||
// | ||
// We can get into the situation when the current | ||
// reader returns empty result in several cases: | ||
// - we reached max_offset (covered here) | ||
// - we reached end of stream (covered above right | ||
// after the 'read_some' call) | ||
// | ||
// If we reached max-bytes then the result won't be | ||
// empty. It will have at least one record batch. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
/backport v23.3.x |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good
if (maybe_max_segments) { | ||
config::shard_local_cfg() | ||
.cloud_storage_max_materialized_segments_per_shard.set_value( | ||
maybe_max_segments); | ||
} | ||
if (maybe_max_readers) { | ||
config::shard_local_cfg() | ||
.cloud_storage_max_segment_readers_per_shard.set_value( | ||
maybe_max_readers); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe they should be the scoped_config, to autoreset at the end of the function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, but it doesn't matter that much for fuzz tests
but agree, we should do this eventually
auto headers_read | ||
= reader.consume(test_consumer(), model::no_timeout).get(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this the point where it could get stuck, without this fix?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep
This PR adds new fuzz test that reproduces that stuck reader problem and adds a fix. The fix resets the reader in case it reached the max_offset.
Fixes #17788
Backports Required
Release Notes
Bug Fixes