
Add multi-arch docker image build support #3015

Merged
merged 1 commit into master from nrxr:feature/add-docker-multiarch-support on Apr 27, 2023

Conversation

nrxr
Contributor

@nrxr nrxr commented Apr 8, 2023

Update the Dockerfile to support cross-compilation via GOOS and GOARCH. To use docker buildx's cross-compile feature, --platform=$BUILDPLATFORM is included in the builder stage. BUILDPLATFORM matches the runner machine's platform, and GOOS + GOARCH are used to tell Go to cross-compile for the specific combo passed.

Move from go install to go build, because go install can't install a cross-compiled binary; instead, the output is saved in /usr/bin/k6 and copied from that path in the final stage.

Change the build workflow from docker build to docker buildx create and docker buildx build. For the build command, the --platform flag is split into the TARGETOS and TARGETARCH variables.

This provides a faster builder stage using the native architecture, with the slow, emulated part left to the runtime stage, which only copies the final binary.

The resulting OCI image will provide an OCI Index that contains both architecture images. Adding support for more platform/arch combos is as simple as adding more entries to the docker buildx create and docker buildx build --platform flags.

More on this method of cross-compilation here:
https://www.docker.com/blog/faster-multi-platform-builds-dockerfile-cross-compilation-guide/
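As a rough sketch of the pattern described above (the Go version, base image tags, and stage layout here are illustrative, not the exact k6 Dockerfile):

```dockerfile
# syntax=docker/dockerfile:1

# The builder stage always runs on the host's native platform, so no
# QEMU emulation is involved; Go cross-compiles for the target instead.
FROM --platform=$BUILDPLATFORM golang:1.20-alpine AS builder
ARG TARGETOS
ARG TARGETARCH
WORKDIR /build
COPY . .
RUN GOOS=$TARGETOS GOARCH=$TARGETARCH CGO_ENABLED=0 \
    go build -o /usr/bin/k6 .

# The runtime stage is built once per requested platform; it only
# copies the already cross-compiled binary, so emulation stays cheap.
FROM alpine:3.17
COPY --from=builder /usr/bin/k6 /usr/bin/k6
ENTRYPOINT ["k6"]
```

TARGETOS and TARGETARCH are automatic build arguments that BuildKit populates from each entry of the --platform flag; they only need to be declared with ARG to become visible in the stage.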

@CLAassistant

CLAassistant commented Apr 8, 2023

CLA assistant check
All committers have signed the CLA.

@codecov-commenter

codecov-commenter commented Apr 8, 2023

Codecov Report

Merging #3015 (bb7cdbf) into master (1785ec0) will decrease coverage by 0.01%.
The diff coverage is n/a.

❗ Current head bb7cdbf differs from pull request most recent head c0005fd. Consider uploading reports for the commit c0005fd to get more accurate results

@@            Coverage Diff             @@
##           master    #3015      +/-   ##
==========================================
- Coverage   76.99%   76.99%   -0.01%     
==========================================
  Files         228      228              
  Lines       17058    17058              
==========================================
- Hits        13134    13133       -1     
- Misses       3081     3083       +2     
+ Partials      843      842       -1     
Flag Coverage Δ
ubuntu 76.99% <ø> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown.

see 4 files with indirect coverage changes


@nrxr nrxr force-pushed the feature/add-docker-multiarch-support branch from 13dcf8d to c0005fd on April 8, 2023 23:10
@nrxr
Contributor Author

nrxr commented Apr 8, 2023

I force-pushed a rebased commit with the change explained in the second paragraph. I wasn't aware there was a go install instead of a go build. To avoid changing anything else, like creating an /opt directory (which I personally prefer, since writing to /opt/bin matches the expectations of the FHS), I just put it in /usr/bin, since I'm certain that will exist no matter what image is used as the base for the Go build.

And this is just for a temporary build stage, so I guess that should be fine.

@oleiade
Member

oleiade commented Apr 11, 2023

Hi @nrxr

Thanks a lot for your contribution, it's very appreciated 🙇🏻

I'm not very familiar with docker buildx myself, and while I read the article you posted, I'm still rather unclear on what the goal of your PR is, from a feature/product perspective.

What would you say it allows you to do with k6+docker that's not currently possible? Why do you, as a user, need it? Finally, what value would it bring to k6 as a project, and to the other users of the project?

Cheers 🦖

@nrxr
Contributor Author

nrxr commented Apr 11, 2023

Hi @oleiade,

I wanted to run k6 on AWS Graviton instances and couldn't because the docker image is linux/amd64 only (the infamous exec format error). I made a build and ran my own image with GOOS and GOARCH to match linux/arm64 so it could run in our kubernetes instances, which are only Graviton-based.

It would be nice to use the project's original docker image instead of having my own. The purpose of my PR is to give the best single-image solution for cross-architecture availability of container images.

Value to users:

  • Can run the same Docker commands, no matter what architecture they are using in their host instance. Users can move from linux/amd64 to linux/arm64 and their docker commands stay the same.
  • Can run in their kubernetes clusters on whatever architecture they have in their node-groups. Could even run in a mixed-arch kubernetes cluster, where some nodes are amd64 and others are arm64 (or whatever architecture may arise in the future). Kubernetes deployment image values stay the same: in the same image you have both (or as many as you want) architectures embedded. You can swap all your instances from linux/amd64 to linux/arm64 and you don't need to change your deployment files.
  • Running big load tests on AWS Graviton spot instances is cheaper than on comparable x86-based instances, especially when it comes to CPU-intensive loads.

Value to the project:

Support for more architectures than linux/amd64 in the docker image, through a single entry point and without extra hassle for the user. Nothing to do on the user side. No special tags to use. runc, containerd, docker: it doesn't matter, they all understand the OCI specification telling them to run the image compatible with their host architecture.

EDIT: Probably someone at Marketing would love a "Run a k6-distributed load-test from your Raspberry Pi k3s-cluster!".

@oleiade oleiade requested a review from javaducky April 12, 2023 13:12
@oleiade
Member

oleiade commented Apr 13, 2023

Thanks a lot for your detailed explanation @nrxr, much appreciated 🙇🏻

Thanks to your explanation, and after looking into it some more, my understanding is now that essentially what the buildx command allows you to do is produce a docker image that is able to run on multiple architectures, as opposed to docker build, which targets a single architecture (most likely the one you're building on).

This can be achieved by passing the --platform argument, which supports multiple values, and which with the changes of your PR will be exposed as a $BUILDPLATFORM value to the Dockerfile, as per the docker docs.

We have discussed it internally with the team of maintainers, and agreed with you that this seems very useful indeed. We're keen to move forward with your PR but a few steps need to happen before:

  • @olegbespalov is away at the moment, he is the person we think might have the most ideas and view on this. Thus we'll wait on his feedback 🙇‍♂️
  • Ideally we would like to test this change somehow: that probably means adding a dedicated GitHub Actions test, to verify that an arm64-built k6 image actually works and passes the test suites on an arm machine.
  • We would need to document somehow how to build docker images from now on, if this has any impact on our users. At the moment, unless you've read the specific article you've posted, I anticipate a user wouldn't have a clue how to use that technique 😄

If you can, and are willing to help with that, it would be much appreciated indeed 👍

PS: @javaducky added you to this PR as you're also a M1 user, and we'd be curious to get your feedback on the process of using this on an arm-based machine.

 WORKDIR $GOPATH/src/go.k6.io/k6
 COPY . .
 RUN apk --no-cache add git=~2
-RUN CGO_ENABLED=0 go install -a -trimpath -ldflags "-s -w -X go.k6.io/k6/lib/consts.VersionDetails=$(date -u +"%FT%T%z")/$(git describe --tags --always --long --dirty)"
+RUN go build -a -trimpath -ldflags "-s -w -X go.k6.io/k6/lib/consts.VersionDetails=$(date -u +"%FT%T%z")/$(git describe --tags --always --long --dirty)" -o /usr/bin/k6 .
Member

@oleiade oleiade Apr 13, 2023

Is there a specific reason or benefit for writing the output binary to the /usr/bin folder instead of /go/bin?

Contributor Author

That /usr/bin will always exist, even if the golang container image maintainers decide to change $GOPATH in the future. As I explained originally, I would prefer /opt/bin since it is FHS-compatible, but for a temporary stage /usr/bin will be more than enough, unless the k6 binary starts becoming standard-issue on distributions :)

Contributor

/usr/local/bin might be safer in that case, but since we already place it in /usr/bin in the final image, this is fine.

I would prefer changing the WORKDIR to somewhere outside of $GOPATH/src, which is a remnant of Go's dark ages before Go modules 😮‍💨. Changing it to e.g. /opt/k6 would ensure the directory exists, and we could write the binary there. But that's out of scope for this PR and can be done later.


ENV GOOS=$TARGETOS \
GOARCH=$TARGETARCH \
CGO_ENABLED=0
Member

Is there a specific reason or benefit justifying setting the CGO_ENABLED as an ENV statement as opposed to prefix the RUN statement with it?

Contributor Author

Since I was already adding an ENV, I thought it would make sense to have all the environment variables together. Given docker's per-line caching, changing the value of the ENV statement would ideally re-trigger the lower layers.

This, in practical terms, does not matter much since adding an environment variable won't take much compute or storage. But stylistically, I think it looks way cleaner to have all the environment variables together, instead of leaving CGO_ENABLED by itself, alone, before the go build call.

@nrxr
Contributor Author

nrxr commented Apr 14, 2023

> Thanks a lot for your detailed explanation @nrxr, much appreciated 🙇🏻
>
> Thanks to your explanation, and after looking into it some more, my understanding is now that essentially what the buildx command allows you to do is produce a docker image that is able to run on multiple architectures. As opposed to docker build which would target a single architecture (most likely the one you're building on).
>
> This can be achieved by passing the --platform argument, which supports multiple values, and with the changes of your PR will be exposed as a $BUILDPLATFORM value to the Dockerfile as per docker docs.

Almost correct. $BUILDPLATFORM is the host platform. $TARGETPLATFORM is the platform you want to build for. This leverages Go's ability to cross-compile, which is faster than docker emulating the platform via QEMU.

Why almost correct? Because the --platform argument is used twice: once when creating the builder instance with docker buildx create, where the flag means which platforms the builder will support, and then with docker buildx build, to say which platforms the image is expected to run on.

If you look closely at the Dockerfile, you'll see FROM --platform=$BUILDPLATFORM is only used in the build stage. That's because we are telling docker what architecture to use for that stage. In the next stage, we don't use the --platform flag in the FROM call, which means BuildKit will use QEMU to run that stage (which is slower, compute-wise).

This is effectively using a linux/amd64 host to build both linux/amd64 and linux/arm64 during the builder stage, but using a QEMU-emulated instance for the copying part.

This could also be achieved without FROM --platform on the builder stage, but it would take around 13 minutes instead of the ~2 minutes it takes right now, as you can see in the GitHub Actions report :) (Because QEMU is too slow!)
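Concretely, the two uses of --platform described above look roughly like this (the builder name and image tag are illustrative):

```sh
# 1) For buildx create, --platform lists the platforms the builder
#    instance will support.
docker buildx create --name multiarch-builder \
    --platform linux/amd64,linux/arm64 --use

# 2) For buildx build, --platform lists the platforms to build for;
#    each entry becomes a TARGETOS/TARGETARCH pair in the Dockerfile.
docker buildx build --platform linux/amd64,linux/arm64 -t grafana/k6 .
```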

> We have discussed it internally with the team of maintainers, and agreed with you that this seems very useful indeed. We're keen to move forward with your PR but a few steps need to happen before:
>
> • @olegbespalov is away at the moment, he is the person we think might have the most ideas and view on this. Thus we'll wait on his feedback 🙇‍♂️
> • Ideally we would like to test this change somehow: which probably means adding a dedicated GitHub action test, to verify that an arm64 built k6 image does work and passes the test suites on an arm machine indeed.

This one may be hard unless there are self-hosted runners available for this. Github only provides x86_64 based runners.

> • We would need to document somehow how to build docker images from there, if that has any impact to our users. At the moment, unless you've read the specific article you've posted, I anticipate a user wouldn't have a clue how to use that technique 😄

There's no impact on users. If a user wants to build a derivative of grafana/k6 they can just have a Dockerfile as usual:

FROM grafana/k6:latest

... do your thing

The resulting architecture will be the one of the host used for docker build. If they want to do multi-architecture builds they just use docker buildx create ... and docker buildx build ... but should not use FROM --platform...

That means there's really no change to their Dockerfiles.

Why not use FROM --platform...? Because that's a trick for faster compilation times via cross-compilation. Using it for anything other than compilation is not a good idea.

> If you can, and are willing to help with that, it would be much appreciated indeed 👍

> PS: @javaducky added you to this PR as you're also a M1 user, and we'd be curious to get your feedback on the process of using this on an arm-based machine.

Sadly, Docker for Mac is already emulating linux, so it's not the ideal test bed for an ARM-to-ARM test :(

@olegbespalov
Contributor

Hey @nrxr !

I've checked the changes, and they LGTM. However, we are currently working on the upcoming release and will probably return to this PR straight after.

Anyway 👍 in advance and thanks for the contribution!

@olegbespalov olegbespalov added this to the v0.45.0 milestone Apr 20, 2023
@mstoykov mstoykov requested a review from imiric April 26, 2023 13:07
Contributor

@imiric imiric left a comment

Thanks a lot for this change, and the detailed explanations @nrxr! 🙇

As you can tell, we don't have much experience with multi-arch OCI images. But from reading the documentation, all of this makes sense.

One concern was if all container registries supported these multi-arch images, and it seems that ghcr.io does, so it shouldn't be a problem. The real test will be when we merge this PR and the image is published, but I don't expect any problems with that either.


@imiric imiric merged commit 9fa50b2 into grafana:master Apr 27, 2023
@imiric
Contributor

imiric commented Apr 27, 2023

@nrxr The grafana/k6:master image was pushed when this was merged, but we don't see linux/arm64 under the OS/Arch column on Docker Hub. Can you confirm if you can pull the ARM image on an ARM machine?

@nrxr
Contributor Author

nrxr commented May 1, 2023

@imiric sorry, I was away from this account for a bit. I was busy with work, and only today can I catch up on notifications again :-)

It seems I missed something: since we are using buildx now, docker push alone won't do the whole thing.

There are three options:

  • The simple one: add --push to the docker buildx build command, which will push it correctly right away. (I bet you won't want this, since you want to test the image locally and only push it afterwards, in specific cases.)
  • The hard one: create the manifest manually (see "the hard way" in this link), which I think is not desirable either.
  • The pragmatic one: you want to build and test in PRs, and push when approved and merged to main. In that case, maybe the best option is to move the current docker buildx create and docker buildx build to the push step, adding the --push param. That way you can test the docker image as you do already, and still simplify the push of multi-arch images to a single command.
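A minimal sketch of that pragmatic split (the tags and the smoke-test command here are illustrative):

```sh
# In PR builds: plain docker build, then smoke-test the native image.
docker build -t k6:pr-test .
docker run --rm k6:pr-test version

# On merge to master: build every target platform and push in one step.
docker buildx build --platform linux/amd64,linux/arm64 \
    -t grafana/k6:master --push .
```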

At least with docker you can't export it locally. I tried on my machine (it should be done with --load) and it spits out:

docker buildx build --load \
                  --platform linux/amd64,linux/arm64 \
                  -t grafana/k6 .
[+] Building 0.0s (0/0)
ERROR: docker exporter does not currently support exporting manifest lists

Sorry, I didn't know this could happen, because at work we build and push at the same time, so we use --push; I had no idea this would fail to push the manifests :-)

(Maybe docker on linux does work correctly with --load; sorry, I only have my Apple M1 available right now and it's too late at night to spin up a VM just for this. Also, I doubt it works, based on this issue.)

EDIT:

Apparently, another option is to not create a builder at all and use the default --builder. At least for me it just worked locally: https://hub.docker.com/r/nrxr/k6/tags

I just used that for docker buildx build and docker push nrxr/k6.

~/code/src/github.com/nrxr/k6 fix/multi-arch-container*
❯ docker images | grep nrxr
nrxr/k6         latest                                                           cc497db4d3eb   3 minutes ago    16.2MB
nrxr/k6         latest                                                           cc497db4d3eb   3 minutes ago    15.2MB

After pushing, docker manifest inspect shows it correctly. Try it for yourself:

❯ docker manifest inspect --verbose nrxr/k6:latest

I did try adding random tags and pushing. It worked. I guess the docker buildx create step in this case is not necessary, then.

Again, sorry: at work we use --push right away, so I never ran into this kind of issue :-)

nrxr added a commit to nrxr/k6 that referenced this pull request May 1, 2023
This is a fix for PR grafana#3015. Using an external builder creates
issues for docker loading images locally, and it then won't be able to
push the multi-platform images correctly.
@nrxr
Contributor Author

nrxr commented May 1, 2023

After my failure in #3046, I think the best approach is to try this out on a linux machine with Docker-Moby installed. Maybe --load works there.

EDIT: it doesn't. See #3047 .

@imiric
Contributor

imiric commented May 2, 2023

Sheesh, what a mess... 😓 Thanks for looking into it @nrxr 🙇

I wrongly assumed that docker push would implicitly push all architectures built by buildx, or at least allow the --platform option to be set. It's kind of insane that their documentation doesn't mention this incompatibility with buildx.

The 3 options seem sensible, and here are my thoughts on each:

  1. You're right that we would prefer to not just add --push, but to first test the image before pushing. The tests now are just smoke tests to ensure basic commands work correctly, but it's still a minor layer of validation that we're pushing a working binary.

  2. docker manifest is a bit convoluted, but not that difficult. I didn't manage to build the ARM image on linux/amd64 though:

     ---> [Warning] The requested image's platform (linux/arm64/v8) does not match the detected host platform (linux/amd64) and no specific platform was requested
     ---> Running in 8613d8ac5f88
    exec /bin/sh: exec format error
    

    So I suppose this needs to run in an ARM VM?

    If so, that's definitely too complicated and we should avoid it.

  3. > The pragmatic one: you want to build and test in PRs. You want to push when approved and merged to main.

    Correct.

    So if I'm understanding you correctly: keep using docker build for testing the image in PRs, but run docker buildx build --push on master. 🤔 Besides slightly increasing the build time, since the image will be built twice (well, 3 times, actually), my main concern is that we would be pushing a different binary, and potentially a different image, than the one we tested.

    Our current builds are not reproducible, and while this shouldn't have any functional differences, ideally we would like to push the same bit-for-bit image we tested.

    I suppose this was what you wanted to accomplish with --load, which also doesn't work for me on linux/amd64.

So I guess I would prefer to use docker manifest if possible, but given it's an experimental feature and the direction seems to be heading towards docker buildx, we probably shouldn't rely on it.
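For reference, the manual "hard way" with docker manifest would look roughly like this (the per-arch tag names are illustrative, not what the workflow currently produces):

```sh
# Push one single-arch image per platform first.
docker push grafana/k6:master-amd64
docker push grafana/k6:master-arm64

# Then stitch them into one manifest list under the shared tag.
docker manifest create grafana/k6:master \
    grafana/k6:master-amd64 grafana/k6:master-arm64
docker manifest push grafana/k6:master
```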

I'm not quite following what you did in your last steps. I tried using --builder default, but it errored out with:

ERROR: multiple platforms feature is currently not supported for docker driver. Please switch to a different driver (eg. "docker buildx create --use")

Maybe this is only supported on darwin/arm64, since it has an x86 emulation layer, but I'm not sure if this is what's happening in this case. I don't use Docker Desktop on Linux, BTW, just the plain Docker package, so it also might be related. 🤷‍♂️

Sorry this turned out to be a bigger problem than expected. 😓 It's not urgent to fix this, as the amd64 image is still built and pushed correctly, but we would appreciate it if you could propose another fix for the arm64 image.

@nrxr
Contributor Author

nrxr commented May 5, 2023

@imiric Hi!

Yes, that works with "Docker Desktop", which is not what's available on Linux; there it's Docker-Moby.

With "Docker Desktop", things work with --load if you enable the beta containerd image store for pulling and storing images. I then used GitHub Actions PRs to reproduce it, and none of it worked, because they run Docker-Moby, as you can see here.

I'll write a PR with the third approach, if you accept it. Let me explain why the third (pragmatic) approach is just fine:

The Dockerfile used will be the same, meaning that as long as the code is the same, it'll produce the same results. We have to remember docker just compiles and saves the result in a separate layer. The tests you're running are saying: the code compiled, and what we run from the separate layer was copied correctly. You already do that with docker build.

So, if you accept that, I'll work it out.

@imiric
Contributor

imiric commented May 8, 2023

> The Dockerfile used will be the same, meaning, as long as the code is the same, it'll have the same results. We have to remember Docker just compiles and saves in a separate layer. The tests you're running are saying: the code was compiled and what we are running in the separate layer is correctly copied. You do that with docker build already.

Hhmm, so the image layer cache is shared between docker build and docker buildx? Even if it is, if I'm understanding your third approach correctly, we would run build on PRs and buildx on master/v* builds, which would run on potentially different VMs, so I'm not sure the cache would be shared. Maybe we'd have to specifically tell GHA to do so? The code should compile to a functionally equivalent version, but ideally we'd avoid any recompilation and publish the exact same binary we test.

Anyway, I'm probably getting ahead of myself, so please propose the change in a PR, and we can discuss it there. Thanks! 🙇

@imiric
Contributor

imiric commented May 9, 2023

Hey @nrxr, someone is helping us with a similar multi-arch setup for xk6 over at grafana/xk6#66. I think we have the same problem there, where we can't test the same image we build before pushing it, but maybe the use of different actions might be helpful as a reference.

@imiric imiric mentioned this pull request Jun 6, 2023
imiric pushed a commit that referenced this pull request Jun 16, 2023
This reverts commit 9fa50b2.

Pushing of ARM images is not working[1], and there are issues with
pushing master images[2].

Rather than deal with risky fixes with buildx days before the v0.45.0
release, we've decided to roll back this change and migrate to buildx
later, once we've understood and tested the build process better.

[1]: #3015 (comment)

[2]: #3127
imiric pushed a commit that referenced this pull request Jun 16, 2023
This reverts commit 9fa50b2.

Pushing of ARM images is not working[1], and there are issues with
pushing master images[2].

Rather than deal with risky fixes with buildx days before the v0.45.0
release, we've decided to roll back this change and migrate to buildx
later, once we've understood and tested the build process better.

[1]: #3015 (comment)

[2]: #3127
7 participants