Bootstrapping Bazel in Alpine+ARM hangs forever #17220
Comments
Could you please paste the output for
I've seen similar behavior when VMs report running out of space (even when technically they haven't). |
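As a rough sketch of one way to check whether the build directory is short on space, assuming a POSIX `df` is available in the container: `free_kb` is a hypothetical helper name and `/tmp` is only an example path, not necessarily where the bootstrap writes its output.

```shell
# Hypothetical helper (not part of Bazel): print the available space, in KiB,
# on the filesystem that holds the given directory.
free_kb() {
  # -P forces the POSIX one-line-per-filesystem layout, -k reports KiB.
  # The 4th column of the data row is the "Available" figure.
  df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Example: report free space for /tmp (swap in the real build directory).
echo "free KiB in /tmp: $(free_kb /tmp)"
```

Running this before and during the bootstrap would show whether free space is shrinking toward zero while the build appears stuck.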
BTW, I'd recommend using |
@yesudeep I re-ran in a Docker container on my Mac with the Java path updated to
It still gets stuck on "Patching repository....". I will check one of the EC2 instances as well. |
Namaste @seanmor5, thank you for responding. I don't have an aarch64 instance or machine handy, but I have built a QEMU VM image using this quick and dirty script in case someone else wants to test. I was able to reproduce the problem while building Bazel on the VM in emulation mode on Linux. I'll look at it in more detail soonish.
|
This appears to be reproducible on Clear Linux running on an x86_64 machine (a Framework laptop) as well. Earlier, on the same system running Fedora 37, Bazel built fine. The file system in use was btrfs then and now
|
The build process is blocked waiting to read something that is perhaps unavailable, or it is stuck in an indefinite loop. I'm building this in tmpfs. Would that affect the process?
Update: On my machine the
it worked and started to build. The
On an aarch64 VM an
|
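Since btrfs and tmpfs both come up above as possible factors, here is a minimal sketch for recording which filesystem a build directory actually sits on. It assumes a `stat` that supports `-f -c %T` (GNU coreutils does, and BusyBox builds with format support should as well); `fs_type` is a hypothetical helper name and `/tmp` is only an example path.

```shell
# Hypothetical helper: print the filesystem type (for example tmpfs or btrfs)
# of the filesystem that holds the given directory.
fs_type() {
  # -f queries the filesystem rather than the file; %T is the type name.
  stat -f -c %T "$1"
}

# Example: show the filesystem type behind /tmp.
echo "/tmp is on: $(fs_type /tmp)"
```

Including this value in a report makes it easier to compare environments where the bootstrap hangs with ones where it succeeds.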
|
@yesudeep Hello, I just got around to this:
output:
Here is the tail of the output of running the compilation where it gets stuck:
|
I'm also experiencing this error sporadically while trying to build Bazel 6.0.0 for arm64 in Alpine for arm64 under QEMU, although it strangely succeeded once for me in CI. Locally it seems to fail at a slightly different place each time. The following Dockerfile hangs for me:
After some time, CPU and network usage go to zero but the command never exits. I can reproduce this in QEMU on x86_64 and on native arm64 AWS Graviton2 CPUs. I tried starting the Alpine image from scratch and adding strace to a few commands (this doesn't work when using QEMU, it works on native arm64 Docker only):
Tail of
I don't really understand any of this, but I noticed child processes are exiting in a sequence like 2212, 2213, etc. When this output was displayed, the container was running the following processes, including 2214 (which I think is hung):
Is the space in |
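For CI runs where the command never exits, a minimal sketch of bounding the build with `timeout` so that a hang turns into a failure instead of blocking forever. `run_with_deadline` is a hypothetical wrapper name; the exit status on expiry is an assumption that depends on the implementation (GNU `timeout` reports 124, while other implementations may surface the signal as 143).

```shell
# Hypothetical wrapper: run a command with a deadline in seconds.
# Usage: run_with_deadline SECONDS COMMAND [ARGS...]
run_with_deadline() {
  deadline="$1"
  shift
  status=0
  # timeout sends SIGTERM to the command once the deadline passes.
  timeout "$deadline" "$@" || status=$?
  if [ "$status" -eq 124 ] || [ "$status" -eq 143 ]; then
    echo "command exceeded ${deadline}s and was terminated" >&2
  fi
  return "$status"
}

# Example (commented out so the snippet has no side effects):
# run_with_deadline 7200 ./compile.sh
```

The deadline of 7200 seconds and the `./compile.sh` invocation are placeholders; the actual bootstrap command and a sensible limit depend on the environment, especially under emulation where a successful build is already very slow.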
@yesudeep I'm currently able to reliably reproduce this hang while building Bazel 6.0.0 under Alpine in QEMU in my local Fedora x86_64 environment with btrfs, but the exact same build succeeds (albeit very slowly) under GitHub Actions. Is there anything I can check to help narrow things down? |
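One thing that can be checked without strace is what a suspected hung process is waiting on. The following is a sketch only, assuming a Linux `/proc` filesystem is mounted inside the container; `proc_wait_info` is a hypothetical helper name, and the `wchan` file may be unreadable or report `0` depending on kernel configuration and permissions.

```shell
# Hypothetical helper: print the scheduler state and kernel wait channel
# of a process, read from /proc (Linux only).
proc_wait_info() {
  pid="$1"
  # State is a single letter, e.g. R (running), S (sleeping), D (uninterruptible).
  state=$(awk '/^State:/ { print $2 }' "/proc/$pid/status")
  # wchan names the kernel function the process is blocked in, when available.
  wchan=$(cat "/proc/$pid/wchan" 2>/dev/null || echo unknown)
  echo "pid=$pid state=$state wchan=$wchan"
}

# Example: inspect the current shell.
proc_wait_info "$$"
```

Calling it with the PID of the process that appears stuck, and repeating after a minute, would show whether that process stays in the same state and wait channel.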
I can reproduce this with Bazel 6.1.0 on native ARM as well. I've uploaded the tail 4000 lines of strace output here: https://pastebin.com/abN6i5qa
Stepping inside the hanging container and viewing the contents of
Full strace output, including "Building Bazel from scratch" and then hanging immediately after "Building Bazel with Bazel" (1.7MB txt file), can be downloaded here: https://drive.google.com/file/d/1m_dPN3xYRvNT8_f-k6KtKgt7K8swCHBx |
@yesudeep @meteorcloudy @seanmor5 @strophy @sgowroji I built it on mips64le and also got errors:
|
I tried building again with Bazel 7.2.1 and did not encounter this error anymore, allowing me to package Bazel for Alpine here: https://pkgs.alpinelinux.org/package/edge/testing/aarch64/bazel7 Can anyone else confirm this is no longer an issue? |
Description of the bug:
Hi there, a while ago I opened: #16484
I was able to get around the issue by running the container on an x86 Linux machine. I am now trying to do the same thing with aarch64. I assumed the issue was exclusive to Docker just being bad on Mac; however, I am running into the same issue on EC2. I've tried various versions of Alpine and Bazel (5.3, 6.0) with no success. This is what I run to bootstrap:
I've tried this on these EC2 AMIs, as well as on a Raspberry Pi with Alpine 3.16 installed:
alpine-ami-3.14.2-aarch64-r0 ami-00604621aea32b1f5
alpine-3.16.0-x86_64-bios-cloudinit-r0 ami-0c9f21a3f1772d2d8
What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
This is what I've been using to bootstrap:
Which operating system are you running Bazel on?
Alpine
What is the output of bazel info release?
n/a
If bazel info release returns development version or (@non-git), tell us how you built Bazel.
See above
What's the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD?
No response
Have you found anything relevant by searching the web?
No response
Any other information, logs, or outputs that you want to share?
Most of the time it just hangs at:
Or something like "patching repository" for one of the first few packages.