Add Platform linux/arm64 to Docker Build #10441
Conversation
I'm in two minds about this.

On the one hand, adding the extra architecture here is not a lot of code and is unlikely to create a lot of churn or maintenance relating to these lines of code added in this PR. Also, this is something that has been asked for previously: #9192

My main reservation is that with the amd64 image, we use it ourselves. That's what our own servers are deployed from. As such we are pretty likely to spot any problems with it. An arm64 image would be something we don't use ourselves, so we are unlikely to notice if we break it in some arch-specific way. We also do not run our test suite on arm64. If a user pops up and raises an issue saying "I'm using the official linux/arm64 shieldsio/shields image and I've spotted problem X", I basically have no easy way to reproduce that or help fix it.

I do acknowledge that shields is written in javascript, not C. The majority of what we're doing is sufficiently high level that it is very unlikely to be architecture specific. That said, we do build on top of a bunch of packages, libraries and images that are all working at lower levels of the stack.

I guess my point is: offering this also comes with some responsibility to have some confidence we are not breaking it as we change things, and some ability to reproduce problems. We don't really have that.

So here's another question for you: you've already built and pushed an image to https://github.com/smashedr/shields/pkgs/container/shields. Is there anything more we could do that would make it easier for you (or others) to easily maintain/update a 3rd party "shields, but built for [architecture I care about]" without actually taking on the support for those things ourselves?
The chance that the code would break on one architecture and not another is slim to none. I have been using an arm64 swarm cluster for a while now and have never had any issues. The only issue I run into is people not offering multi-architecture builds, like this project.
Maintaining your own build of an external project is not very easy and requires quite a few steps and workflows to work optimally, including but not limited to: updating repository permissions, adding actions credentials, linking the package and setting visibility, creating workflows with custom logic to determine when to re-build, linking it all to your deploy, and debugging the many steps/issues that can arise. All compared to simply deploying an image the project already publishes.

I recently set up my own build for github-readme-stats after the public deployments paused issue, and while it works just fine, I still need to take the time to write a custom workflow on a cron that checks for upstream updates and re-builds when found. The only reason I took the time to do this is because they don't offer a docker image to begin with. For reference: https://github.com/smashedr/github-readme-stats

At the end of the day, since docker images are already offered by this project, I feel at a minimum the additional architectures should be provided as-is, as a convenience, for people using those architectures, with no support guarantee. The image I built is currently running on my linux/arm64 swarm cluster and everything I have tested seems to work just fine. If you want to poke around yourself, just note I have not set up any credentials/tokens yet: https://shields.cssnr.com/

Providing support for people trying to set up their own builds would require a detailed guide that would most likely need many revisions before it stops getting issues opened against it, as well as maintaining workflows for them to use. Let me know what you think about just providing additional architectures without a support guarantee, as I feel this is the most convenient solution.
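For what it's worth, the cron-based rebuild workflow I mentioned looks roughly like this. This is just a sketch of the shape, not my actual workflow; the marker-file name and step layout are illustrative:

```yaml
# Hypothetical sketch: check upstream daily and rebuild only when it has moved.
name: rebuild-on-upstream-change
on:
  schedule:
    - cron: "0 6 * * *"
  workflow_dispatch: {}

jobs:
  check-and-build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - id: check
        name: Compare upstream HEAD with the last commit we built
        run: |
          upstream=$(git ls-remote https://github.com/badges/shields.git HEAD | cut -f1)
          last=$(cat .last-built-sha 2>/dev/null || echo none)
          echo "changed=$([ "$upstream" != "$last" ] && echo true || echo false)" >> "$GITHUB_OUTPUT"
      # build/push steps would go here, gated on steps.check.outputs.changed == 'true'
```

And that's only the re-build trigger; permissions, credentials, and package linking are all extra setup on top of this.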
Chris raises a good point, though I also agree it's something we could resolve with some of the existing suggestions noted previously. Not too long ago, Shields built and provided an "official" docker image but we didn't use the image itself, and trying to help people troubleshoot issues with it was a bit of a stretch sometimes. Even since we started using the image in the production shields.io environment, we do still occasionally get contacted by folks who are having issues running it in kubernetes environments, where our ability/willingness to help troubleshoot is comparatively limited.

One of the things we do in the Rust project is the notion of platform/target "tiers" that may be worth trying to adopt (at least in part) here, even if it's just something more simplistic and binary (e.g. x86 linux is tier 1 that the Shields project builds and tests for; everything else is no-guarantees and best-effort).
So just to be clear, my question isn't "does it work now?", it is more like "how do we ensure future changes do not accidentally break it?".

I guess my (imperfect) comparison point here is Windows compatibility. I think you'd also quite reasonably say "it's unlikely we're going to do anything that is OS dependent" in the same way as "it's unlikely we're going to do anything that is architecture dependent", but it has happened before (#8350, #8786). In that case, I was at least able to boot up a VM running Windows to fix the issues, and we added a Windows CI build to prevent future problems.

However, another comparison point here is: we are a website. People occasionally report a bug in the frontend that only manifests in Safari. That is also a giant pain in the ass to reproduce because I don't have a mac either. Sometimes people report a problem that only exists on a platform we don't use 🤷

I did do a bit of reading and had a look around to see if there is any good pattern for this, but as far as I can see, most public projects I looked at just seem to build and chuck it over the wall. Maybe "it builds" is enough of a smoke test? It would seem at least comparable with what other projects are doing. Perhaps one of the reasons for this is that although GitHub recently launched linux/arm64 runners, they are currently only available on GitHub Team or GitHub Enterprise Cloud plans.
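For context, the "build and chuck it over the wall" approach on a standard amd64 runner usually looks something like this; the QEMU emulation registered in the first step is exactly what makes the arm64 half of the build possible (and slow). The image/tag names here are illustrative:

```shell
# Register QEMU binfmt handlers so the amd64 runner can execute arm64 binaries
docker run --privileged --rm tonistiigi/binfmt --install arm64

# Create a buildx builder and build/push both platforms in one invocation
docker buildx create --use --name multiarch
docker buildx build \
  --platform linux/amd64,linux/arm64 \
  --tag shieldsio/shields:next \
  --push .
```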
Tbh, I wouldn't want to classify any of our operations as above "no-guarantees & best-effort". I think, on balance, building/pushing both but documenting in https://github.com/badges/shields/blob/master/doc/self-hosting.md#docker that:

is probably a reasonable approach here.
I'd agree 👍 This allows there to be an "official" image, one published by the Shields project, which allows for easier, and perhaps more comfortable, consumption (i.e. there's some portion of the user base that'd likely feel more at ease pulling an image produced by the project as opposed to one produced by a 3rd party). At the same time, it doesn't overcommit the maintainer team.
I've pushed another commit to this branch adding a note to the docs. Does that seem reasonable?
For testing, I know Amazon Web Services has a free tier of EC2 that allows for 750 hours of a t2.micro per month (always free) and is an ARM instance. If anyone is not using theirs and is willing to donate an instance, I can configure a GitHub Actions runner on it that can easily be used by any current Actions for testing.

Additionally, if any issues ever do get opened against the arm64 build that are not replicated in the amd64 build, I would be more than happy to help resolve the issue in any way possible. Just make sure to assign or mention me on the issue.
OK, so I've realised there is another issue with this that I had not spotted before :( I've just run a couple of builds and I've noticed that building for both architectures disproportionately increases the time it takes to build docker images. If I look at the last few builds where we are just building an amd64 image, these take just under 5 minutes to complete. The builds on this branch all consistently took around 35 minutes.

I'm fine with publishing more images, but we can't have a CI build that runs for 35 mins on every pull request. That is just too slow. Any idea why building for arm64 takes so much longer? I accept we're now building 2 images, so I think we can accept it will take about double the amount of time, but not 7x.
If there is no way to speed it up, maybe one thing we could consider is doing multi arch builds only on the tagged releases (server-YYYY-MM-DD), but not every pull request and push event? |
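One way to sketch that is to pick the platform list based on the trigger. This is not the actual workflow, just the shape of the condition, assuming the build step uses docker/build-push-action and the `server-YYYY-MM-DD` tag format:

```yaml
# Sketch: build both platforms only for tagged releases; PR/push builds stay amd64-only
- name: Build and push Docker image
  uses: docker/build-push-action@v5
  with:
    platforms: ${{ startsWith(github.ref, 'refs/tags/server-') && 'linux/amd64,linux/arm64' || 'linux/amd64' }}
    push: ${{ github.event_name != 'pull_request' }}
```

That keeps pull request CI at the current ~5 minutes while the slow emulated arm64 build only runs on releases.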
Agreed. I feel like our main goal here is to provide a convenience in the form of a project-produced image for arm, so no need to bog down CI with something we're explicitly not planning to use or test.
I agree too, that is way too big of an increase in build time. When I get some time today I will see if I can reduce the build times; otherwise, I can set it to only build these on tagged releases.
This is the workflow where we push the tagged releases |
I did some research and testing. The emulation layer GitHub runners use to build non-native architectures can be extremely slow on complex builds, in this case to the tune of 5x. I assume we want to build the arm platform on release and next builds, and just remove it from the CI builds. I have updated the PR to reflect this. |
Thanks for having a look at it. I am surprised to learn our build is considered "complex". I've pushed a couple more commits to this branch. My final proposal on this is we push That does mean that I think in order to change my mind on this we'd need to be able to either:
In any case, adding |
So, if I do a matrix build, the amd build will be pushed within 5 minutes, then 35 minutes later the arm build will be pushed. How do you feel about me adding a matrix? It's quite simple... In hindsight, the matrix approach is probably better; that way one architecture does not affect the other. I pushed the matrix changes up for you to look at.
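Roughly, this is the matrix shape I have in mind (simplified; step names and action versions are illustrative, not the exact diff):

```yaml
jobs:
  docker-build:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false          # a slow or failed arm64 build should not cancel amd64
      matrix:
        platform: [linux/amd64, linux/arm64]
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-qemu-action@v3
      - uses: docker/setup-buildx-action@v3
      - uses: docker/build-push-action@v5
        with:
          platforms: ${{ matrix.platform }}
          push: true
```

Each platform runs as its own job, so the fast amd64 image is available without waiting for the emulated arm64 build.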
Thanks. I tried this out on my fork and agree this seems like the right way to do it 👍 I've updated the docs and will merge this.
@chris48s I don't think this is working correctly on Docker Hub. That was the one thing I was unable to test myself. It seems one tag is overwriting the other.
I am also unable to view the packages on GHCR, but from my experience GHCR handles multiple architectures seamlessly; with the matrix build, however, it is worth verifying.
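If it helps with debugging: with a matrix, each job pushes the same tag independently, so whichever job finishes last wins, which would explain one tag overwriting the other on Docker Hub. The usual remedy is to push each architecture's image by digest and then stitch them into a single multi-arch manifest list, e.g. (digests below are placeholders):

```shell
# Combine per-arch images (already pushed by digest) into one multi-arch tag.
# <amd64-digest> and <arm64-digest> stand in for the real sha256 digests.
docker buildx imagetools create \
  --tag shieldsio/shields:next \
  shieldsio/shields@sha256:<amd64-digest> \
  shieldsio/shields@sha256:<arm64-digest>
```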
This reverts commit 4a37203.
OK. For the moment I have merged I won't really have a lot of time to dig into this for a week or so now. I'll have to come back to this another day. Reverting will put us back where we were and push an amd64 image back to the tip of |
I already created a new PR to address this going forward: #10476. When you get time, let's get that tested and merged.
It would be beneficial to provide multi-architecture docker images to allow deployment to linux/arm64 servers.
Instead of setting up my own build process, I figured it would be useful to incorporate these changes upstream. This was going to be a feature request, but I'm submitting it as a PR so I can be of assistance if necessary.
Personally, I don't use Docker Hub, but I do know GHCR is seamlessly compatible with multi-architecture builds and can be deployed to any architecture using a single `latest` tag.

My main swarm cluster uses linux/arm64 and, after an initial build/deploy of this project, it seems to run with no issues from GHCR.
Let me know your thoughts on this request, or any additional work you would like done. Thanks.