Feature Request: Allow adjustment of the minimum S3 upload part size #17175

Open
rvrangel opened this issue Nov 7, 2024 · 0 comments · May be fixed by #17171
Labels: Component: Backup and Restore, Type: Enhancement

Comments

Contributor

rvrangel commented Nov 7, 2024

Feature Description

Allow operators to define a minimum size for S3 upload parts, for cases where they want to avoid too many round trips when uploading a file and lower the number of PUT requests made to S3.

Use Case(s)

We have had issues with S3 throttling our backups because of the number of requests we make to S3. Because of the way the S3 upload part size is calculated, small shards end up with their backups divided into a huge number of parts unnecessarily.

As an example, a 50GB shard will have around 10k parts of 5MB each. When you have thousands of tablets backing up, this adds up and you can have a huge influx of requests. Each S3 bucket/partition can only handle around 3,500 PUT requests per second, so every now and then we see those being throttled.
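
To make the arithmetic above concrete, here is a rough sketch of how the part count follows from the file size and the part size. `partCount` is a hypothetical helper written for this illustration, not something taken from the Vitess code, and the 64MiB value is just an arbitrary larger part size picked for comparison.

```go
package main

import "fmt"

// partCount is a hypothetical helper: it returns how many parts a multipart
// upload needs for a file of fileSize bytes when every part except possibly
// the last one is partSize bytes.
func partCount(fileSize, partSize int64) int64 {
	if fileSize <= 0 || partSize <= 0 {
		return 0
	}
	return (fileSize + partSize - 1) / partSize // ceiling division
}

func main() {
	const (
		mib = int64(1) << 20
		gib = int64(1) << 30
	)
	shard := 50 * gib

	// With 5MiB parts, a 50GiB file is split into roughly 10k parts.
	fmt.Println(partCount(shard, 5*mib)) // 10240

	// With an illustrative 64MiB part size the same file needs far fewer.
	fmt.Println(partCount(shard, 64*mib)) // 800
}
```

Since each part is uploaded with its own request, the part count is a reasonable proxy for the number of upload requests a single file generates.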

It would be nice if we could set a minimum part size instead: if the normal calculation results in a smaller size, the minimum overrides it, but if the calculated size is larger (especially on huge, TB-sized shards), we still use the calculated value so we can go over the 5GB limit per file/part.
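
A minimal sketch of the behavior described above, under some assumptions: the default part size is taken to be 5MiB (matching the numbers in this issue), the calculated size comes from dividing the file size by S3's 10,000-part limit, and `effectivePartSize` and `minPartSize` are hypothetical names used only for illustration. The actual change may look different.

```go
package main

import "fmt"

const (
	mib = int64(1) << 20

	// Assumed default part size, matching the 5MB figure in this issue.
	defaultPartSize = 5 * mib

	// S3 allows at most 10,000 parts per multipart upload.
	maxUploadParts = int64(10000)
)

// effectivePartSize is a hypothetical helper. It picks the part size that the
// file size requires to stay within maxUploadParts, then raises it to
// minPartSize if the operator-configured minimum is larger. A minPartSize of
// 0 leaves the calculated value untouched.
func effectivePartSize(fileSize, minPartSize int64) int64 {
	size := defaultPartSize
	if fileSize > 0 {
		// Round up so that size * maxUploadParts always covers the file.
		calculated := (fileSize + maxUploadParts - 1) / maxUploadParts
		if calculated > size {
			size = calculated
		}
	}
	if minPartSize > size {
		size = minPartSize
	}
	return size
}

func main() {
	const gib = int64(1) << 30
	minPart := 64 * mib // illustrative operator setting

	// Small shard: the configured minimum wins over the calculated size.
	fmt.Println(effectivePartSize(50*gib, minPart))

	// TB-sized shard: the calculated size is larger, so it is kept.
	fmt.Println(effectivePartSize(2048*gib, minPart))
}
```

The minimum only ever raises the part size, so large files that already need bigger parts are unaffected.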

rvrangel added the Needs Triage label on Nov 7, 2024
rvrangel linked a pull request on Nov 7, 2024 that will close this issue
5 tasks
shlomi-noach added the Type: Enhancement and Component: Backup and Restore labels and removed the Needs Triage label on Nov 8, 2024