
Introduce new abstraction between the prefetcher and GetObject calls #552

Merged
merged 1 commit from the object-part-feed branch into awslabs:main on Oct 16, 2023

Conversation

@passaro (Contributor) commented on Oct 11, 2023

Description of change

Preliminary work that will, in the future, allow plugging in a caching layer between the prefetching logic and the GetObject calls on the client.

This change introduces an ObjectPartFeed trait that the prefetcher will use to obtain chunks of object data and to adjust the size of requests so that they are "optimally" aligned. The only current implementation delegates retrieving object data to a GetObject call on the client and aligns requests to part_size boundaries, replicating the logic previously in the prefetcher.
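For orientation, the shape of the new abstraction is roughly as follows. This is a sketch reconstructed from the diff excerpts later in this conversation; the exact method names and signatures in the PR may differ (in particular, the real data-fetching method is asynchronous and feeds a part queue):

```rust
use std::ops::Range;
use std::sync::Arc;

/// Supplies the prefetcher with chunks of object data, hiding how they are fetched.
pub trait ObjectPartFeed<Client>: Send + Sync {
    /// Fetch the given byte range of an object and deliver its parts to the prefetcher.
    /// (Sketch only: the real method is async and its exact signature is not shown here.)
    fn get_object_parts(&self, bucket: &str, key: &str, range: Range<u64>, if_match: &str);

    /// Adjust a preferred request size so that it lines up with part boundaries.
    fn get_aligned_request_size(&self, offset: u64, preferred_length: usize) -> usize;
}

/// The single implementation in this PR delegates to GetObject on the client.
#[derive(Debug)]
pub struct ClientPartFeed<Client> {
    client: Arc<Client>,
}
```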

Relevant issues: #255

Does this change impact existing behavior?

No functional change.


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and I agree to the terms of the Developer Certificate of Origin (DCO).

config: PrefetcherConfig,
runtime: Runtime,
}

impl<Client, Runtime> Debug for PrefetcherInner<Client, Runtime> {
passaro (Contributor, Author):

derive(Debug) did not seem to work with part_feed: Arc<dyn ObjectPartFeed<Client>...>. Not completely sure why (it does work for client: Arc<Client>) or whether there is a better solution.

Member:

It's because the Debug impl for part_feed: Arc<dyn ObjectPartFeed<Client>...> has to work for any ObjectPartFeed (because the concrete type isn't known), which can only be possible if the ObjectPartFeed trait itself guarantees Debug. On the other hand, the Arc<Client> version only has to work for the concrete Client type(s) that we instantiate it with, which all happen to implement Debug.
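For illustration, a manual impl in this spirit (a sketch only; the body of the PR's actual impl is not shown in this excerpt) skips the trait-object field and formats only fields known to implement Debug:

```rust
use std::fmt::{self, Debug};

// With no `Debug` bound on `Client` or `Runtime`, only fields whose types are
// known to implement `Debug` (here, the config) can be formatted; the
// `part_feed` trait object is skipped entirely.
impl<Client, Runtime> Debug for PrefetcherInner<Client, Runtime> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("PrefetcherInner")
            .field("config", &self.config)
            .finish_non_exhaustive()
    }
}
```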

@@ -45,8 +43,6 @@ pub struct PrefetcherConfig {
pub sequential_prefetch_multiplier: usize,
/// Timeout to wait for a part to become available
pub read_timeout: Duration,
/// The size of the parts that the prefetcher is trying to align with
pub part_alignment: usize,
passaro (Contributor, Author):

Now we retrieve this directly from the client

.unwrap_or_default()
.clamp(MIN_PART_SIZE, MAX_PART_SIZE);

let get_object_result = match self.client.get_object(bucket, key, Some(range), Some(if_match)).await {
passaro (Contributor, Author):

This is moved from PrefetchGetObject::spawn_next_request

fn get_aligned_request_size(&self, offset: u64, preferred_length: usize) -> usize {
// If the request size is bigger than a part size we will try to align it to part boundaries.
let part_alignment = self.client.part_size().unwrap_or(8 * 1024 * 1024);
let offset_in_part = (offset % part_alignment as u64) as usize;
passaro (Contributor, Author):

Moved from PrefetchGetObject::get_next_request_size
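The rest of the function is not visible in this excerpt. Based on the PR description ("aligns requests to part_size boundaries"), the alignment logic is roughly of the following shape; this is a standalone sketch with an invented free-function name, not necessarily the exact code in the diff:

```rust
// Sketch of the alignment logic described above. In the PR this is a method on
// ClientPartFeed that reads the part size from the client.
fn aligned_request_size(offset: u64, preferred_length: usize, part_alignment: usize) -> usize {
    let offset_in_part = (offset % part_alignment as u64) as usize;
    if offset_in_part != 0 {
        // The request starts mid-part: only read up to the next part boundary,
        // so that subsequent requests start aligned.
        let remaining_in_part = part_alignment - offset_in_part;
        preferred_length.min(remaining_in_part)
    } else if preferred_length > part_alignment {
        // Aligned start and a large request: trim it to a whole number of parts.
        (preferred_length / part_alignment) * part_alignment
    } else {
        preferred_length
    }
}
```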

@passaro passaro requested a review from dannycjones October 11, 2023 13:36
@dannycjones (Contributor) left a comment:

looking good

Comment on lines 119 to 120
// Tracks the size of read requests
max_read_request_size: usize,
Contributor:

Suggested change
// Tracks the size of read requests
max_read_request_size: usize,
/// Tracks the maximum size of read request submitted so far
max_read_request_size: usize,

@passaro passaro marked this pull request as ready for review October 12, 2023 10:28
dannycjones
dannycjones previously approved these changes Oct 12, 2023
struct PrefetcherInner<Client, Runtime> {
client: Arc<Client>,
part_feed: Arc<dyn ObjectPartFeed<Client> + Send + Sync>,
Member:

Any thoughts on whether this should be a generic type rather than dynamic dispatch? I guess we can get away with it here (unlike Client) because ObjectPartFeed is object-safe.

passaro (Contributor, Author):

It is a pragmatic choice which we can revert later on. The downside of a new generic type here is that it would propagate up to the root S3FuseFilesystem and end up touching most of the code in this crate.
In fact, I had started working on an alternative option where I replaced ObjectClient with a new ObjectStore trait, with get_object_parts instead of get_object (and allowing for further divergence in the future), but that was a much larger change and would have slowed down work on the caching layer.
I still think something like that would be a preferable solution in the long term, but we can review the approach while we work on the cache or soon after.
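For context, the trade-off being discussed looks roughly like this. The type names below are illustrative only, not code from the PR:

```rust
use std::marker::PhantomData;
use std::sync::Arc;

// Option taken in this PR: dynamic dispatch. The trait object keeps the
// prefetcher's type parameters unchanged, so callers such as the filesystem
// layer do not need a new generic parameter.
struct PrefetcherInnerDyn<Client, Runtime> {
    part_feed: Arc<dyn ObjectPartFeed<Client> + Send + Sync>,
    runtime: Runtime,
}

// Alternative: a third generic parameter. Monomorphized (no vtable), but the
// extra parameter would propagate up through every type that holds a prefetcher.
struct PrefetcherInnerGeneric<Client, Feed: ObjectPartFeed<Client>, Runtime> {
    part_feed: Arc<Feed>,
    runtime: Runtime,
    _client: PhantomData<Client>,
}
```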

mountpoint-s3/src/prefetch.rs: outdated review comment (resolved)
Signed-off-by: Alessandro Passaro <alexpax@amazon.co.uk>
Comment on lines +36 to +40
/// [ObjectPartFeed] implementation which delegates retrieving object data to a [Client].
#[derive(Debug)]
pub struct ClientPartFeed<Client> {
client: Arc<Client>,
}
Contributor:

Where do we plan to put the cache-based implementation of this? In this file also?

I'm wondering if there's a good way to break down this file.

@passaro passaro added this pull request to the merge queue Oct 16, 2023
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Oct 16, 2023
@jamesbornholt jamesbornholt added this pull request to the merge queue Oct 16, 2023
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to no response for status checks Oct 16, 2023
@jamesbornholt jamesbornholt added this pull request to the merge queue Oct 16, 2023
Merged via the queue into awslabs:main with commit 8490e8b Oct 16, 2023
18 checks passed
jamesbornholt added a commit to jamesbornholt/mountpoint-s3 that referenced this pull request Oct 17, 2023
The old test was hiding a bug because it used a hard coded part size of
8MB regardless of what the client used. awslabs#552 changed that and now this
test runs out of memory a lot because it degrades to doing 1 byte
requests. I don't think it's worth playing with the logic because it
requires a weird config to get there, so just fix the test.

Signed-off-by: James Bornholt <bornholt@amazon.com>
github-merge-queue bot pushed a commit that referenced this pull request Oct 18, 2023
* Allow seeking forwards within the prefetch stream

Right now we reset the prefetcher any time it seeks forwards, even if
the distance it's seeking could be handled by inflight requests (in the
worst case, the bytes are already in our buffers, and we just throw them
away). That's expensive and slow!

This change allows us to seek forwards a limited distance into the
prefetch stream. When we see a seek of an acceptable distance, we
fast-forward through the stream to the desired target offset, dropping
the skipped bytes on the floor. We enforce a maximum seek distance,
which is a trade-off between streaming a lot of unnecessary bytes and
paying an extra request's latency. I haven't put any careful thought into the
number.
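As a rough illustration of the mechanism described above (hypothetical names and a simplified in-memory buffer; the real code fast-forwards through in-flight requests):

```rust
// Arbitrary cap on how far a forward seek may be absorbed, as noted above.
const MAX_FORWARD_SEEK_DISTANCE: u64 = 16 * 1024 * 1024;

/// Returns true if the seek was absorbed by fast-forwarding through already
/// prefetched bytes; false means the caller should reset the prefetcher.
fn try_seek_forward(stream_offset: &mut u64, target_offset: u64, buffered: &mut Vec<u8>) -> bool {
    let distance = target_offset.saturating_sub(*stream_offset);
    if distance > MAX_FORWARD_SEEK_DISTANCE || distance as usize > buffered.len() {
        // Too far ahead (or not buffered yet): fall back to a new request.
        return false;
    }
    // Fast-forward: drop the skipped bytes on the floor.
    buffered.drain(..distance as usize);
    *stream_offset = target_offset;
    true
}
```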

This commit also sets us up to support backwards seeking, which will
come in the future.

Signed-off-by: James Bornholt <bornholt@amazon.com>

* Allow seeking backwards within a prefetch stream

Linux asynchronous readahead confuses our prefetcher by sometimes making
the stream appear to go backwards, even though the customer is actually
just reading sequentially (#488). The problem is that with parallel FUSE
threads, the two asynchronous read operations can arrive to the
prefetcher out of order.

This change allows us to tolerate a little bit of backwards seeking in a
prefetch stream. We keep around a little bit of previously read data and
can reload it in the event that a seek goes backwards. We do this by
creating a fake new request containing the rewound bytes, so that the
existing read logic will pick them up. I chose an arbitrary max for the
backwards seek buffer, big enough to handle Linux readahead.
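A sketch of the idea described above, with invented names and a simplified buffer (the real code replays the rewound bytes through a synthetic request rather than returning them directly):

```rust
use std::collections::VecDeque;

// Arbitrary cap, sized to cover Linux readahead as noted above.
const MAX_BACKWARD_SEEK_DISTANCE: usize = 1024 * 1024;

/// Keeps a small window of recently returned bytes so short backwards seeks
/// can be served without restarting the prefetch stream.
struct SeekWindow {
    bytes: VecDeque<u8>, // most recently returned bytes, newest at the back
}

impl SeekWindow {
    /// Record bytes as they are handed to the reader, bounding the window size.
    fn push(&mut self, data: &[u8]) {
        self.bytes.extend(data.iter().copied());
        while self.bytes.len() > MAX_BACKWARD_SEEK_DISTANCE {
            self.bytes.pop_front();
        }
    }

    /// On a backwards seek of `distance` bytes, return the rewound bytes if we
    /// still have them; the caller wraps them in a synthetic request.
    fn rewind(&mut self, distance: usize) -> Option<Vec<u8>> {
        if distance > self.bytes.len() {
            return None; // too far back; the prefetcher must restart instead
        }
        Some(self.bytes.split_off(self.bytes.len() - distance).into())
    }
}
```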

This should fix the readahead issue: in my testing, I no longer saw slow
sequential reads, and the logs confirmed this seeking logic was being
triggered in both directions (forwards and backwards), consistent with
the readahead requests sometimes arriving out of order.

Signed-off-by: James Bornholt <bornholt@amazon.com>

* Fix Shuttle tests with new request size logic

The old test was hiding a bug because it used a hard coded part size of
8MB regardless of what the client used. #552 changed that and now this
test runs out of memory a lot because it degrades to doing 1 byte
requests. I don't think it's worth playing with the logic because it
requires a weird config to get there, so just fix the test.

Signed-off-by: James Bornholt <bornholt@amazon.com>

---------

Signed-off-by: James Bornholt <bornholt@amazon.com>
@passaro passaro deleted the object-part-feed branch October 26, 2023 11:11