Buildkit support with buildx #43
Conversation
Thanks! I agree this is a good place to have a discussion. 😄 Some initial thoughts:
Point by point:
You did mention mounts though, so I remembered - buildkit also keeps a separate private cache in /var/lib/buildkit - so some functionality for cache pruning will need to be added, but it should just be a simple invocation of ...
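Presumably that invocation is just docker buildx prune shelled out from Go, the same way bashbrew already shells out to docker - a rough sketch, with the exact flags being an assumption rather than anything settled here:

package bashbrewbuild

import (
	"os"
	"os/exec"
)

// pruneBuildkitCache clears buildkit's private cache (backed by /var/lib/buildkit)
// by shelling out to buildx; --force just skips the confirmation prompt.
func pruneBuildkitCache() error {
	cmd := exec.Command("docker", "buildx", "prune", "--force")
	cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
	return cmd.Run()
}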
cmd/bashbrew/docker.go
"--cache-to", "type=inline", | ||
} | ||
for _, cache := range caches { | ||
args = append(args, "--cache-from", "type=registry,ref="+cache) |
So I don't love how I've done this here. Essentially, in the current implementation I've told buildkit to pull cache from all the tags that this entry was previously built with - which could be a few.
I can't use the bashbrew/cache:digest tag to import cache from, since that changes on each update to the GitCommit - is there a stable place where I could easily access "the last build of this manifest entry"?
In some other experiments, I've hacked around this by "guessing" a stable tag from the list of tags - it feels like it might be nice to have a stable Name field for each entry that attempts to stay consistent between updates and builds (even if the tags are changing).
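To illustrate the kind of "guessing" I mean (the helper and naming scheme below are hypothetical, not anything bashbrew currently has): take the first tag of the entry and derive a cache ref from it, so the ref stays stable as long as that tag does.

package bashbrewbuild

import "strings"

// stableCacheRef derives a cache image reference from an entry's tag list.
// Purely illustrative: it picks the first tag and replaces characters that
// are not valid in an image tag.
func stableCacheRef(repo string, tags []string) string {
	if len(tags) == 0 {
		return ""
	}
	tag := strings.NewReplacer("/", "-", ":", "-").Replace(tags[0])
	return "bashbrew/cache:" + repo + "-" + tag
}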
So I'm not quite sure where this would go - my best guess would be somewhere in docker-library/oi-janky-groovy as part of the CI multiarch scripts, as opposed to this tool directly? I can't quite find where the logic for tidying up cache lives.
Heya @tianon 👋 Everything's a bit busy atm, but was wondering if you'd had a chance to think about this PR/had any concerns? Was looking at how to move this forward 🎉
My initial thoughts on buildkit builds after doing some research and testing.

cache: how to correctly invalidate and how to get the cache for the "next" build

Pushing cache to a local registry is going to increase overall maintenance, since now there is a service to be maintained and cleaned regularly. "What is useful" cache in the registry is very hard to calculate. There is not a unique identifier for each tag grouping, and that is not something we want to attempt to push onto the image maintainers, nor is there a way to get the previous set of tags that could be considered for build cache (e.g. for ...). The most correct approach is that cache can and should be used for all images within a given repo that have at least one common parent image. Ideally we'd use a network mountable disk that works for ...

With moby/buildkit#1368 still not being included in a Docker release, and the same possibility still applying to all the other files in the context, we must do buildkit builds via a context piped to stdin (e.g. ...).
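For what it's worth, the piped-context approach works the same way under buildx: pass "-" as the context path and feed the tar over stdin. A rough Go sketch, assuming we already have the context as a tar stream (this is not the actual bashbrew code):

package bashbrewbuild

import (
	"io"
	"os"
	"os/exec"
)

// buildFromStdin runs a buildx build whose context is a tar stream piped
// over stdin ("-" as the context path), so nothing on disk gets reused.
func buildFromStdin(contextTar io.Reader, tag string) error {
	cmd := exec.Command("docker", "buildx", "build", "--tag", tag, "-")
	cmd.Stdin = contextTar
	cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
	return cmd.Run()
}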
Thanks for your insight @yosifkit, much appreciated! ❤️

Cache

The current bane of my existence 😢 - if only cache could be simple 😄
I think I might have accidentally been unclear with regards to cache! Buildkit still has a local cache, similar to the old builder. We could simply continue to use that for now - it should behave similarly to the existing cache and not add any additional maintenance overhead for the time being. However, this comes with the disclaimer that this cache is still per-node, and won't sync across multiple nodes - for multi-node caching, we'd need some form of cache storage, and there are a couple of storage options for that.
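Concretely, the storage options mostly differ in which --cache-to/--cache-from flags get passed. A sketch of the obvious variants (the flag syntax is standard buildx; the function, refs and paths are made up for illustration):

// cacheArgs returns buildx cache flags for a given backend. "ref" is an
// image reference for the registry backends, or a directory for the local one.
func cacheArgs(backend, ref string) []string {
	switch backend {
	case "inline": // cache metadata embedded in the pushed image (what this PR uses)
		return []string{
			"--cache-to", "type=inline",
			"--cache-from", "type=registry,ref=" + ref,
		}
	case "registry": // full cache pushed to a separate ref, shareable between nodes
		return []string{
			"--cache-to", "type=registry,ref=" + ref + ",mode=max",
			"--cache-from", "type=registry,ref=" + ref,
		}
	case "local": // cache exported to a directory, e.g. on network storage
		return []string{
			"--cache-to", "type=local,dest=" + ref,
			"--cache-from", "type=local,src=" + ref,
		}
	}
	return nil
}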
I have also seen a few interesting approaches that attempt to sync the ...

How is this currently solved for the official images program? Are all the right images pulled directly before the build to ensure that they're present on each node to populate the cache, or is there some more complex logic behind it?
Working out how to connect different builds to use the same cache is quite difficult - even assuming we have some way of calculating cache dependencies efficiently, some of the complex images could possibly require many different cache dependencies. I'm not quite sure if buildkit has any (soft/hard) limits on this 😕
Aha, yup that makes sense - we can get buildx to do exactly the same thing here. I think if we were to use the buildkit git contexts at some point, we'd be safe as well, since those would use unique commit hashes in their url, and so couldn't be cached at all. But agreed, sticking with piped contexts seems like the way to go for now 😄

Syntax
I think it makes sense to allow using syntax selectively - though those controls should probably be a check by maintainers, instead of a buildkit feature flag, since maintaining a matrix of potential controls to test between might quickly become a pain.

build output + parallel

Think that makes sense! Nothing to add here.

Multi-arch builds

Something I haven't mentioned yet, but might be worth discussing - using buildkit, we could consider combining some of the different templated Dockerfiles. This could potentially look something like:

FROM --platform=$BUILDPLATFORM scratch AS builder
ADD $BUILDPLATFORM.tar.gz /
CMD ["bash"]
FROM amd64-builder AS amd64
FROM arm64-builder AS arm64
# special case for s390x
FROM builder AS s390x
RUN echo "do things"
FROM ${TARGETARCH}${TARGETVARIANT}

Then, just based on what platform this is built for, the right stages will be selected and built - curious what you think about something like this for some of the simpler templated builds. I think it could potentially help simplify some dockerfiles in some cases (though definitely not everywhere).
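For example, with a Dockerfile like the above, a per-platform invocation would just be the usual buildx call (image name made up; buildkit resolves ${TARGETARCH}${TARGETVARIANT} so only the matching final stage and its dependencies get built):

// Build only the linux/arm64 variant; the final FROM resolves to the "arm64" stage.
cmd := exec.Command("docker", "buildx", "build",
	"--platform", "linux/arm64",
	"--tag", "example/hello-world:arm64",
	".")
// ...then run cmd as usual.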
Have pushed a smaller commit based on our conversations, it should be fairly self-explanatory 🎉 A couple notes:
If you want, I'm then happy to follow up and document the new functionality in separate PRs in the right official-images repos once this is in a ready-to-go state 👍
Sorry for the delay! Here are some thoughts I'd love to discuss more in the spirit of not letting "the perfect review" be the enemy of "useful feedback" that might lead to more useful discussion: 😅 ❤️
✔️ Agreed, I've proposed
✔️ have done this
✔️ Aha, yup, good point - I've added this to the "unique bits" now. I've specifically only added it if we detect a non-empty value, to preserve the contents of the cache hash for unchanged manifests, so only modified manifests with the new ... Also caught a couple of places where it needs to be used alongside ...
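Roughly, the non-empty guard looks like this (field and variable names here are illustrative, not the exact bashbrew code):

// Only fold the builder into the cache-hash inputs when it is actually set,
// so manifests that don't use the new field keep their existing cache hash.
if entry.Builder != "" {
	uniqueBits = append(uniqueBits, entry.Builder)
}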
❓ Would using an environment variable here be sufficient to allow overriding a default of ...?
Really, really appreciate your patience, @jedevc 🙇
Just one really tiny nit and I think this is ready 👀
(I'm happy to update it if you need me to - the delay is totally my fault 😅)
This patch adds a new `Builder` entry to the RFC2822 image manifest files, allowing individual image components to opt into buildkit support using docker buildx. Additionally, docker build is changed to explicitly set DOCKER_BUILDKIT=0 to force a non-buildkit build, in preparation for the default switching in upcoming docker versions.

Signed-off-by: Justin Chadwell <me@jedevc.com>
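The classic-builder fallback described in the commit message boils down to setting the environment variable on the docker build invocation - roughly like this (a sketch, not the literal patch; args is whatever argument list bashbrew already assembles):

// Force the classic builder regardless of what the docker client defaults to.
cmd := exec.Command("docker", append([]string{"build"}, args...)...)
cmd.Env = append(os.Environ(), "DOCKER_BUILDKIT=0")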
(finally 😩) made a few follow-ups to move this forward in the rest of the pipeline: 👀
Heya 👋
This isn't in a completed state yet, but thought it might be quicker to have a discussion around a PoC I was hacking around with, than to have a long-winded proposal 😄
TL;DR, proposal: official images could opt in to buildkit with a syntax like:
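Going by the commit message later in this thread (which adds a new `Builder` entry to the RFC2822 manifest files), the opt-in would presumably look something like the following, where the repo details and the exact field value are assumptions for illustration:

Maintainers: Example Maintainer <maintainer@example.com> (@example)
GitRepo: https://github.com/example-library/hello-world.git
GitCommit: 0123456789abcdef0123456789abcdef01234567
Tags: latest
Builder: buildkit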
In this case, instead of building using docker build, the image is built using docker buildx build. At some point, the default builder will likely switch to buildkit, but it might be nice to start allowing some images to opt in sooner, to reduce the complexity of adapting many images later down the road (ideally no changes would be needed in any of the Dockerfiles, but I haven't had the time to manually review every single one of them 😢). Additionally, this could open up some nice opportunities, like being able to use the buildkit parser for extracting FROM data (since it's backwards compatible with older dockerfiles), using the new imagetools for manifest management, etc.

Regarding caching, buildkit has a few cache backends, and while it might be worth investigating them in the future, for a first pass I think the inline cache option is most similar to what the old builder uses - this means that no separate cache policy needs to exist, it can stay the same. Additionally, buildkit has good support for caching multi-stage images - which might be nice to investigate later down the road 🎉
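As an example of the FROM extraction, the buildkit Dockerfile parser can be used standalone - a sketch assuming the github.com/moby/buildkit/frontend/dockerfile/parser package (nothing in this PR actually wires this up):

package main

import (
	"fmt"
	"os"
	"strings"

	"github.com/moby/buildkit/frontend/dockerfile/parser"
)

// printFroms prints the base image of every FROM instruction in a Dockerfile,
// using buildkit's parser (which also handles older Dockerfile syntax).
func printFroms(path string) error {
	f, err := os.Open(path)
	if err != nil {
		return err
	}
	defer f.Close()

	result, err := parser.Parse(f)
	if err != nil {
		return err
	}
	for _, node := range result.AST.Children {
		if strings.EqualFold(node.Value, "from") && node.Next != nil {
			fmt.Println(node.Next.Value)
		}
	}
	return nil
}

func main() {
	if err := printFroms("Dockerfile"); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}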
This would let image maintainers use some newer buildkit features if they want, and hopefully unlock some future performance gains (though I haven't done any benchmarking to test yet).
Looking forward to hearing what you think!