runtime: mlock of signal stack failed: 12 #37436
That is the consequence of trying to work around a kernel bug that significantly impacts Go programs. See #35777. The error message suggests the only two known available fixes: increase the ulimit or upgrade to a newer kernel.
Well, I'm running the official alpine docker image, the purpose of which is to be able to build a Go program. Apparently it cannot. IMHO the upstream image should be the one fixed to fulfill its purpose, not our build infra to hack around a bug in the upstream image.
Is the Alpine image maintained by the Go team? (Genuine question. I don’t know about it.) Either way, yes, the image should be fixed, ideally with a kernel upgrade.
I'm not fully sure who maintains the docker images (https://hub.docker.com/_/golang) or how, but the docker hub repo is an "Official Image", which is a super hard to obtain status, so I assume someone high enough up the food chain is responsible.
It's "maintained by the Docker Community". Issues should be filed at https://github.com/docker-library/golang/issues EDIT: the problem is the host kernel, not the Docker library image, so they can't fix it.
So, the official solution to Go crashing is to point fingers to everyone else to hack around your code? Makes sense.
@karalabe I would like to remind you of https://golang.org/conduct. In particular, please be respectful and be charitable.
Please answer the question |
It is standard practice to redirect issues to the correct issue tracking system. There is an extensive discussion of possible workarounds and fixes in the issue I linked to earlier, if you would like to see what options were considered on the Go side.
This issue does not happen with Go 1.13. Ergo, it is a bug introduced in Go 1.14. Saying you can't fix it and telling people to use workarounds is dishonest, because reverting a piece of code would actually fix it. An alternative solution would be to detect the problematic platforms / kernels and provide a fallback mechanism baked into Go. Telling people to use a different kernel is especially nasty, because it's not as if most people can go around and build themselves a new kernel. If alpine doesn't release a new kernel, there's not much most devs can do. And lastly, if your project relies on a stable infrastructure where you can't just swap out kernels, you're again in a pickle.
The fact that Go crashes is not the fault of docker. Redirecting a Go crash to a docker repo is deflection.
You could also disable preemptive scheduling at runtime
@ianlancetaylor we have a suggestion to do this when running on an affected kernel; is that viable? BTW, it's a known problem that Docker library images don't get timely updates, which is a security liability. Caveat emptor.
The kernel bug manifested as random memory corruption in Go 1.13 (both with and without preemptive scheduling). What is new in Go 1.14 is that we detect the presence of the bug, attempt to work around it, and prefer to crash early and loudly if that is not possible. You can see the details in the issue I referred you to. Since you have called me dishonest and nasty, I will remind you again about the code of conduct: https://golang.org/conduct. I am also done participating in this conversation.
@karalabe, I misspoke, the issue is your host kernel, not the Docker image. Are you unable to update it?
I'm on the latest Ubuntu and the latest available kernel. Based on the error message, apparently all available Ubuntu kernels (https://packages.ubuntu.com/search?keywords=linux-image-generic) are unsuitable for Go 1.14.
Can you add the output of
I've posted a note to golang-dev. cc @aclements
When you say you are on the latest ubuntu and kernel, what exactly do you mean (i.e. output of dpkg -l linux-image-*, lsb_release -a, uname -a, that sort of thing)? Because as far as I can see the fix is in the kernel in the updates pocket for both 19.10 (current stable release) and 20.04 (devel release). It's not in the GA kernel for 18.04 but is in the HWE kernel, but otoh those aren't built with gcc 9 and so shouldn't be affected anyway.
@networkimprov Disabling signal preemption makes the bug less likely to occur but it is still present. It's a bug in certain Linux kernel versions. The bug affects all programs in all languages. It's particularly likely to be observable with Go programs that use signal preemption, but it's present for all other programs as well. Go tries to work around the bug by mlocking the signal stack. That works fine unless you run into the mlock limit. I suppose that one downside of this workaround is that we make the problem very visible, rather than occasionally failing due to random memory corruption as would happen if we didn't do the mlock. At some point there is no way to work around a kernel bug.
which does satisfy the minimum version requirements. Similarly:
Can you clarify what you're seeing?
```dockerfile
FROM golang:1.14-alpine

RUN apk add --no-cache make gcc musl-dev linux-headers git wget

RUN \
  wget -O geth.tgz "https://github.com/ethereum/go-ethereum/archive/v1.9.11.tar.gz" && \
  mkdir /go-ethereum && tar -C /go-ethereum -xzf geth.tgz --strip-components=1 && \
  cd /go-ethereum && make geth
```
Sorry, my previous comment was misleading. Because of course the kernel version returned by
Hence per:
you need to upgrade the host OS kernel. FWIW, the steps you lay out above using Alpine to
Yes, but in my previous posts I highlighted that I'm already on the latest Ubuntu and have installed the latest available kernel from the package repository. I don't see how I could update my kernel to work with Go 1.14 apart from rebuilding the entire kernel from source. Maybe I'm missing something? |
Just to emphasize, I do understand what the workaround is and if I want to make it work, I can. I opened this issue report because I'd expect other people to hit the same problem eventually. If just updating my system would fix the issue I'd gladly accept that as a solution, but unless I'm missing something, the fixed kernel is not available for (recent) Ubuntu users, so quite a large userbase might be affected. |
Hm yes, I have just reproduced on focal too. The fix is present in the git for the Ubuntu eoan kernel: https://kernel.ubuntu.com/git/ubuntu/ubuntu-eoan.git/commit/?id=59e7e6398a9d6d91cd01bc364f9491dc1bf2a426 and that commit is in the ancestry for the 5.3.0-40.32 so the fix should be in the kernel you are using. In other words, I think we need to get the kernel team involved -- I'll try to do that. |
@karalabe - I've just realised my mistake: I thought I was using the latest Ubuntu, I am in fact using
@mwhudson - just one thing to note (although you're probably already aware of this): a superficial glance at the code responsible for this switch (go/src/runtime/os_linux_x86.go, lines 56 to 61 at 20a838a) seems to suggest that the Go side is checking for patch release 15 or greater. What does 5.3.0-40.32 report as a patch version? I'm guessing 0? Re-opening this discussion until we round out the issue here.
A little summary, because I had to piece it together myself:
- The kernel bug (#35777) is fixed in Ubuntu's 5.3.0-40.32 kernel via a backported patch.
- Go 1.14 decides whether to apply the mlock workaround by checking the upstream version triple (5.3.15+, 5.4.2+, or 5.5+), and `uname -r` on Ubuntu reports 5.3.0, so the check cannot see the backport.
So it seems like Ubuntu's kernel is patched, but the workaround gets enabled anyway.
Oh right, yes I should actually read the failure shouldn't I? This is the workaround failing rather than the original bug, in a case where the workaround isn't actually needed but there's no good way for Go to know this. I can patch the check out of the Go 1.14 package in Ubuntu but that doesn't help users running e.g. the docker golang:1.14-alpine image. Hrm. |
I guess the question is, how many users are using "vulnerable" kernels at this point. There can't be all that many distributions that are compiling an unpatched kernel with gcc 9 by now. |
- unit test signal handler - more DistributedProcessImpl unit tests - tweak .bazelrc: use newer docker image for docker builds/tests to resolve golang/go#37436 Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Check the issue: golang/go#37436
Older versions of k8s-await-election would lead to crashes on ubuntu systems. The golang version used to build (1.14.0) was not recognizing the ubuntu kernel version, applying a workaround for an already fixed issue that would end in random crashes. See: golang/go#37436
Has anyone else been getting this error on 1.14.7? Just had this reported, running on an EKS-provided EC2 image:
@RobertLucian See the discussion at https://golang.org/wiki/LinuxKernelSignalVectorBug |
@iamoryanmoshe thank you! I'll go through that wiki and investigate further. |
Bumps submodule to use go 1.14.4 to pick up a workaround for the linux kernel bug described in this issue golang/go#37436 Signed-off-by: hasheddan <georgedanielmangum@gmail.com>
We are facing issue golang/go#37436 in CI for multiple PRs. Trying to switch to latest go version from 1.14 tree. Signed-off-by: Sanket Sudake <sanketsudake@gmail.com>
Without this, `go get` will fail on Linux 5.4.0 (Ubuntu 20.04) when it hits golang/go#37436, with the following signature:

```
walt@work:~/git/gravity/docs$ make docs
mkdir -p ../build/docs
// snip ...
go: downloading golang.org/x/net v0.0.0-20200707034311-ab3426394381
runtime: mlock of signal stack failed: 12
runtime: increase the mlock limit (ulimit -l) or
runtime: update your kernel to 5.3.15+, 5.4.2+, or 5.5+
fatal error: mlock failed
```

While this won't affect our current release infra (kernel 3.10.0), it is an important fix for developers running affected kernel versions.
What version of Go are you using (`go version`)?

Does this issue reproduce with the latest release?

I hit this with the golang:1.14-rc-alpine docker image; the error does not happen in 1.13.

What operating system and processor architecture are you using (`go env`)?

go env Output

What did you do?

Clone https://github.com/ethereum/go-ethereum, replace the builder version in the Dockerfile with golang:1.14-rc-alpine (or use the Dockerfile from below), then from the root build the docker image:

$ docker build .

What did you expect to see?

Go should run our build scripts successfully.

What did you see instead?