Introduce new abstraction between the prefetcher and GetObject calls #552
Conversation
    config: PrefetcherConfig,
    runtime: Runtime,
}

impl<Client, Runtime> Debug for PrefetcherInner<Client, Runtime> {
derive(Debug) did not seem to work with part_feed: Arc<dyn ObjectPartFeed<Client>...>. Not completely sure why (compared to client: Arc<Client>, which works) or whether there is a better solution.
It's because the Debug impl for part_feed: Arc<dyn ObjectPartFeed<Client>...> has to work for any ObjectPartFeed (because the concrete type isn't known), which is only possible if the ObjectPartFeed trait itself guarantees Debug. On the other hand, the Arc<Client> version only has to work for the concrete Client type(s) that we instantiate it with, which all happen to implement Debug.
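As a concrete illustration of the workaround (type and trait names below are hypothetical stand-ins, not the crate's actual definitions), a manual Debug impl can print a placeholder for the trait object, so the trait itself never needs to guarantee Debug:

```rust
use std::fmt::{self, Debug};
use std::sync::Arc;

// Hypothetical stand-in for ObjectPartFeed<Client>; it does not require Debug,
// so #[derive(Debug)] on a struct holding `Arc<dyn Feed>` cannot compile: the
// derive would need `dyn Feed: Debug` to hold for every possible implementor.
trait Feed {}

struct Inner {
    part_feed: Arc<dyn Feed + Send + Sync>,
}

// Manual impl: emit a placeholder for the trait object instead of requiring
// the trait to guarantee Debug.
impl Debug for Inner {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("Inner")
            .field("part_feed", &"dyn Feed")
            .finish()
    }
}
```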
@@ -45,8 +43,6 @@ pub struct PrefetcherConfig {
     pub sequential_prefetch_multiplier: usize,
     /// Timeout to wait for a part to become available
     pub read_timeout: Duration,
-    /// The size of the parts that the prefetcher is trying to align with
-    pub part_alignment: usize,
Now we retrieve this directly from the client
            .unwrap_or_default()
            .clamp(MIN_PART_SIZE, MAX_PART_SIZE);

        let get_object_result = match self.client.get_object(bucket, key, Some(range), Some(if_match)).await {
This is moved from PrefetchGetObject::spawn_next_request
    fn get_aligned_request_size(&self, offset: u64, preferred_length: usize) -> usize {
        // If the request size is bigger than a part size we will try to align it to part boundaries.
        let part_alignment = self.client.part_size().unwrap_or(8 * 1024 * 1024);
        let offset_in_part = (offset % part_alignment as u64) as usize;
Moved from PrefetchGetObject::get_next_request_size
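For readers skimming the diff, here is a standalone sketch of what this kind of part-boundary alignment does. It is illustrative only: the boundary handling is not copied from the PR, and the part size here would come from `client.part_size()` in the real type.

```rust
// Illustrative sketch of part-boundary alignment, not the PR's exact code.
fn align_request_size(offset: u64, preferred_length: usize, part_alignment: usize) -> usize {
    let offset_in_part = (offset % part_alignment as u64) as usize;
    if offset_in_part != 0 {
        // Unaligned start: stop at the current part boundary so that the
        // next request begins aligned.
        let remaining_in_part = part_alignment - offset_in_part;
        preferred_length.min(remaining_in_part)
    } else if preferred_length > part_alignment {
        // Aligned start and a large request: round down to whole parts.
        preferred_length - (preferred_length % part_alignment)
    } else {
        preferred_length
    }
}
```

For example, with 8 MiB parts, a 5 MiB read starting 6 MiB into a part would be trimmed to 2 MiB so that the following request starts on a part boundary.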
looking good
mountpoint-s3/src/prefetch.rs (outdated)
    // Tracks the size of read requests
    max_read_request_size: usize,
Suggested change:
-    // Tracks the size of read requests
-    max_read_request_size: usize,
+    /// Tracks the maximum size of read request submitted so far
+    max_read_request_size: usize,
Force-pushed from 7b730cf to d6d950e
struct PrefetcherInner<Client, Runtime> {
    client: Arc<Client>,
    part_feed: Arc<dyn ObjectPartFeed<Client> + Send + Sync>,
Any thoughts on whether this should be a generic type rather than dynamic dispatch? I guess we can get away with it here (unlike Client) because ObjectPartFeed is object-safe.
It is a pragmatic choice which we can revert later on. The downside of a new generic type here is that it would propagate up to the root S3FuseFilesystem and end up touching most of the code in this crate.

In fact, I had started working on an alternative option where I replaced ObjectClient with a new ObjectStore trait, with get_object_parts instead of get_object (and allowing for further divergence in the future), but that was a much larger change and would have slowed down work on the caching layer.

I still think something like that would be a preferable solution in the long term, but we can review the approach while we work on the cache or soon after.
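To make the trade-off concrete, here is a minimal sketch of the two shapes being discussed (trait body omitted and names simplified; this is not the crate's actual code):

```rust
use std::sync::Arc;

// Simplified stand-in for the real trait.
trait ObjectPartFeed<Client> {}

// Option taken in this PR: dynamic dispatch. The feed type is erased, so
// PrefetcherInner (and everything that embeds it, up to the filesystem root)
// keeps its existing type parameters.
struct PrefetcherDyn<Client, Runtime> {
    client: Arc<Client>,
    part_feed: Arc<dyn ObjectPartFeed<Client> + Send + Sync>,
    runtime: Runtime,
}

// Alternative: a third generic parameter. Every type that contains the
// prefetcher would have to grow a `Feed` parameter as well.
struct PrefetcherGeneric<Client, Feed, Runtime>
where
    Feed: ObjectPartFeed<Client>,
{
    client: Arc<Client>,
    part_feed: Feed,
    runtime: Runtime,
}
```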
Signed-off-by: Alessandro Passaro <alexpax@amazon.co.uk>
Force-pushed from d6d950e to 47c225d
/// [ObjectPartFeed] implementation which delegates retrieving object data to a [Client].
#[derive(Debug)]
pub struct ClientPartFeed<Client> {
    client: Arc<Client>,
}
Where do we plan to put the cache-based implementation of this? In this file also?
I'm wondering if there's a good way to break down this file.
The old test was hiding a bug because it used a hard coded part size of 8MB regardless of what the client used. awslabs#552 changed that and now this test runs out of memory a lot because it degrades to doing 1 byte requests. I don't think it's worth playing with the logic because it requires a weird config to get there, so just fix the test. Signed-off-by: James Bornholt <bornholt@amazon.com>
* Allow seeking forwards within the prefetch stream

  Right now we reset the prefetcher any time it seeks forwards, even if the distance it's seeking could be handled by inflight requests (in the worst case, the bytes are already in our buffers, and we just throw them away). That's expensive and slow! This change allows us to seek forwards a limited distance into the prefetch stream. When we see a seek of an acceptable distance, we fast-forward through the stream to the desired target offset, dropping the skipped bytes on the floor. We enforce a maximum seek distance, which is a trade-off between streaming a lot of unnecessary bytes versus an extra request's latency. I haven't put any careful thought into the number. This commit also sets us up to support backwards seeking, which will come in the future.

  Signed-off-by: James Bornholt <bornholt@amazon.com>

* Allow seeking backwards within a prefetch stream

  Linux asynchronous readahead confuses our prefetcher by sometimes making the stream appear to go backwards, even though the customer is actually just reading sequentially (#488). The problem is that with parallel FUSE threads, the two asynchronous read operations can arrive to the prefetcher out of order. This change allows us to tolerate a little bit of backwards seeking in a prefetch stream. We keep around a little bit of previously read data and can reload it in the event that a seek goes backwards. We do this by creating a fake new request containing the rewound bytes, so that the existing read logic will pick them up. I chose an arbitrary max for the backwards seek buffer, big enough to handle Linux readahead. This should fix the readahead issue: in my testing, I no longer saw slow sequential reads, and the logs confirmed this seeking logic was being triggered in both directions (forwards and backwards), consistent with the readahead requests sometimes arriving out of order.

  Signed-off-by: James Bornholt <bornholt@amazon.com>

* Fix Shuttle tests with new request size logic

  The old test was hiding a bug because it used a hard coded part size of 8MB regardless of what the client used. #552 changed that and now this test runs out of memory a lot because it degrades to doing 1 byte requests. I don't think it's worth playing with the logic because it requires a weird config to get there, so just fix the test.

  Signed-off-by: James Bornholt <bornholt@amazon.com>

---------

Signed-off-by: James Bornholt <bornholt@amazon.com>
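A rough sketch of the forward-seek idea described in the commit message above (a hypothetical helper, not the crate's code): if the target offset is within a bounded distance ahead of the current position, drain and discard bytes from the inflight stream instead of resetting the prefetcher.

```rust
// Arbitrary bound: trades streaming unnecessary bytes against the latency of
// issuing a fresh request.
const MAX_FORWARD_SEEK_DISTANCE: u64 = 16 * 1024 * 1024;

/// Returns the new offset if the seek was absorbed by the inflight stream, or
/// None if the prefetcher should be reset instead. `read_chunk` stands in for
/// pulling at most `n` buffered bytes from the inflight request.
fn try_seek_forward(
    current_offset: u64,
    target_offset: u64,
    mut read_chunk: impl FnMut(usize) -> Vec<u8>,
) -> Option<u64> {
    let distance = target_offset.checked_sub(current_offset)?;
    if distance > MAX_FORWARD_SEEK_DISTANCE {
        return None; // too far ahead: cheaper to start a new request
    }
    let mut remaining = distance;
    while remaining > 0 {
        // Drop the skipped bytes on the floor.
        let chunk = read_chunk(remaining.min(1 << 20) as usize);
        if chunk.is_empty() {
            return None; // stream ended before reaching the target
        }
        remaining = remaining.saturating_sub(chunk.len() as u64);
    }
    Some(target_offset)
}
```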
Description of change
Preliminary work that will allow, in the future, plugging in a caching layer between the prefetching logic and the GetObject calls on the client.

This change introduces an ObjectPartFeed trait that will be used by the prefetcher to obtain the chunks of object data and to adjust the size of requests so that they are "optimally" aligned. Currently the only implementation delegates retrieving object data to a GetObject call on the client and aligns requests to part_size boundaries, replicating the logic previously in the prefetcher.

Relevant issues: #255
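For readers who haven't opened the diff, a much-simplified sketch of the shape of such a trait (hypothetical signatures; the real ObjectPartFeed in this PR is async and feeds parts to the prefetcher rather than returning them directly):

```rust
use std::ops::Range;

// Simplified, hypothetical sketch; see the PR diff for the actual definition.
trait ObjectPartFeed<Client> {
    /// Fetch the given byte range of an object and deliver it as parts.
    fn get_object_parts(&self, bucket: &str, key: &str, if_match: &str, range: Range<u64>);

    /// Adjust a preferred request size so it lines up with "optimal" part
    /// boundaries (the client-backed implementation aligns to part_size).
    fn get_aligned_request_size(&self, offset: u64, preferred_size: usize) -> usize;
}
```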
Does this change impact existing behavior?
No functional change.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and I agree to the terms of the Developer Certificate of Origin (DCO).