Copy from S3 is slow #483
Comments
Hi, we are investigating this problem. What we can share right now is that we ran some tests and noticed that the kernel may try to improve copy performance by sending readahead requests to Mountpoint, but they are interpreted as random reads and end up disrupting Mountpoint's prefetcher logic. Could you share more about how many files you copied and which mount options you configured on Mountpoint?
Thank you!
We created a new issue (#488) to track the readahead problem, which is most likely the root cause of the slow copy. Until it's fixed, running Mountpoint in single-threaded mode (
After we changed to O_DIRECT, the read speed is still slow, about 200mb/s. However, the write speed is 1200mb/s.
Hey, I was digging into this issue. While we have #488 tracking a performance issue related to out-of-order reads, the issue here is that You can achieve a similar result by parallelizing the copy over multiple threads, which will also drive the read requests to Mountpoint in parallel. For example, I tested using GNU Parallel below. It creates a list of files in the directory and then parallelizes the copy over ten threads.
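The exact command from this comment is not preserved in the thread as captured here. As a rough stand-in, the following is a minimal sketch of the same idea (list the files, then copy them with several workers at once) using `find` and `xargs -P` rather than GNU Parallel. The function name `parallel_copy`, its arguments, and the default of ten workers are hypothetical choices for illustration, and it assumes an `xargs` that supports `-0`, `-I` and `-P` (GNU findutils does).

```shell
# Hypothetical helper, not the command from the original comment.
# Usage: parallel_copy SRC_DIR DEST_DIR [JOBS]
parallel_copy() {
  pc_src="$1"
  pc_dest="$2"
  pc_jobs="${3:-10}"   # assumed default: ten concurrent cp processes

  mkdir -p "$pc_dest"

  # Recreate the directory tree first so every cp has a target directory.
  (cd "$pc_src" && find . -type d -print0) |
    (cd "$pc_dest" && xargs -0 mkdir -p)

  # Copy regular files; xargs keeps up to $pc_jobs cp processes in flight,
  # so read requests reach the mount in parallel instead of one at a time.
  (cd "$pc_src" && find . -type f -print0) |
    xargs -0 -P "$pc_jobs" -I{} cp "$pc_src/{}" "$pc_dest/{}"
}

# Example (paths taken from the original report):
#   time parallel_copy mount-point /tmp 10
```

NUL-separated names (`-print0` / `-0`) keep paths with spaces or newlines intact. The number of workers is a tuning knob, not a recommendation.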
I tried a quick comparison. I populated my bucket with 6x 512MiB objects and 6000x 120KiB objects. I timed on my machine and saw that the serial Can you give this approach a try and let us know? As I said, we are looking into #488 but I suspect you'll see a much greater improvement by parallelizing these copies. Thanks!
Thank you! I'll try this method [hopefully next week],
I'm closing this issue since we provided the recommendation to parallelize UNIX We are separately working on #488 to fix the performance issue triggered during some parallel reading, but that's unrelated to this use case.
Mountpoint for Amazon S3 version
mount-s3 1.0.0
AWS Region
eu-central-1
Describe the running environment
Running on Ubuntu 22.04. The S3 bucket and the EC2 instance are in the same region.
What happened?
I'm trying to copy 3.7 GB of data from the S3 mount point to a local directory, and it takes more than 10 minutes (while 'aws s3 cp s3://my-test-bucket /tmp' takes 1 minute).
Using this command:
time cp -r mount-point/* /tmp/
Relevant log output