Parallel layer upload for s3 cache #5270

Merged
merged 1 commit into from
Aug 28, 2024


Contributor

@bpaquet bpaquet commented Aug 25, 2024

This PR allows multiple goroutines to upload layers to S3 in parallel.
Parallelism is controlled by `upload_parallelism`.

Individually, each layer is already sent in parallel (as concurrent parts), using the standard Upload Manager provided by the S3 SDK.

Inspired by 6c439bd.

@bpaquet bpaquet force-pushed the s3_parallel_upload branch 2 times, most recently from 7756b08 to a7a00b8 Compare August 25, 2024 21:12
README.md Outdated
@@ -578,6 +578,7 @@ Other options are:
* Multiple manifest names can be specified at the same time, separated by `;`. The standard use case is to use the git sha1 as name, and the branch name as duplicate, and load both with 2 `import-cache` commands.
* `ignore-error=<false|true>`: specify if error is ignored in case cache export fails (default: `false`)
* `touch_refresh=24h`: Instead of being uploaded again when not changed, blobs files will be "touched" on s3 every `touch_refresh`, default is 24h. Due to this, an expiration policy can be set on the S3 bucket to cleanup useless files automatically. Manifests files are systematically rewritten, there is no need to touch them.
* `upload_parallelism=10`: This parameter changes the number of layers uploaded to s3 in parallel. Each individual layer is uploaded with 5 threads, using the Upload manager provided by the AWS SDK.
Member

Isn't 10 maybe a bit too big for the default? In the registry we use 4 (+1 for meta-requests) as the default: https://github.com/moby/buildkit/blob/master/util/resolver/limited/group.go#L23

Contributor Author

OK, I set it to 4.

Signed-off-by: Bertrand Paquet <bertrand.paquet@gmail.com>
@bpaquet bpaquet force-pushed the s3_parallel_upload branch from a7a00b8 to 22f6b3e Compare August 28, 2024 06:41
@tonistiigi tonistiigi merged commit 6fdac94 into moby:master Aug 28, 2024
92 checks passed
2 participants