-
Notifications
You must be signed in to change notification settings - Fork 20.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
consensus/ethash: avoid runtime errors due to OOD on mmap writes #23799
Conversation
16cd366
to
7b04f14
Compare
consensus/ethash/mmap_help_linux.go
Outdated
|
||
import ( | ||
"os" | ||
"syscall" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In https://pkg.go.dev/syscall they say: Deprecated: this package is locked down. Callers should use the corresponding package in the golang.org/x/sys repository instead. That is also where updates required by new systems or versions should be applied. See https://golang.org/s/go1.4-syscall for more information.
consensus/ethash/mmap_help_linux.go
Outdated
|
||
// Package ethash implements the ethash proof-of-work consensus engine. | ||
|
||
//go:build linux && (386 || amd64 || arm || arm64 || mips || mips64 || mips64le || ppc || pcc64 || pcc64le || riscv64 || s390x || sparc64) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you need to specify all the possible $GOARCH
in there? A simple
//go:build linux && (386 || amd64 || arm || arm64 || mips || mips64 || mips64le || ppc || pcc64 || pcc64le || riscv64 || s390x || sparc64) | |
//go:build linux |
should work, I think, as I see no reason for fallocate
not to be present for a given architecture.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, indeed, it is present in all files. I was confused because they didn't put it in in syscall_linux.go
, but instead added it individually in all files.
7b04f14
to
7ecade8
Compare
Fixed, ptal |
@gballet lgty? |
@@ -358,7 +361,7 @@ func (d *dataset) generate(dir string, limit int, lock bool, test bool) { | |||
if err != nil { | |||
logger.Error("Failed to generate mapped ethash dataset", "err", err) | |||
|
|||
d.dataset = make([]uint32, dsize/2) | |||
d.dataset = make([]uint32, dsize/4) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note for reviewers: check line 330
, and you'll see that this is the same as what's used already whenever we don't use disk-based dag.
…ereum#23799) When we map a file for generating the DAG, we do a simple truncate to e.g. 1Gb. This is fine, even if we have nowhere near 1Gb disk available, as the actual file doesn't take up the full 1Gb, merely a few bytes. When we start generating into it, however, it eventually crashes with a unexpected fault address . This change fixes it (on linux systems) by using the Fallocate syscall, which preallocates suffcient space on disk to avoid that situation. Co-authored-by: Felix Lange <fjl@twurst.com>
…ereum#23799) When we map a file for generating the DAG, we do a simple truncate to e.g. 1Gb. This is fine, even if we have nowhere near 1Gb disk available, as the actual file doesn't take up the full 1Gb, merely a few bytes. When we start generating into it, however, it eventually crashes with a unexpected fault address . This change fixes it (on linux systems) by using the Fallocate syscall, which preallocates suffcient space on disk to avoid that situation. Co-authored-by: Felix Lange <fjl@twurst.com>
This PR avoids a pretty rare crash.
Some more info here: edsrzf/mmap-go#12 (comment) .
When we map a file for generating the DAG, we do a simple truncate to e.g.
1Gb
. This is fine, even if we have nowhere near1Gb
disk available, as the actual file doesn't take up the full1Gb
, merely a few bytes. When we start generating into it, however, it eventually crashes with aunexpected fault address
.Examples:
There are two ways we can fix this, the simplest being to use the
fallocate
linux system call.The second way is to convert the internals of ethash; instead of generating into a
[]byte
, we generate into an interface which supportsWriteAt
. Then we can first generate into a file (or into a wrapped memory slice), and later do a readonly-mmap on it.This PR implements the first approach. I tried the second approach, but it becomes a pretty large refactor, especially since it also touches the cache generator (which uses the same schema), and the cache-generator is not as 'linear' -- it reads from self in a way which the DAG-generator does not, and thus is more difficult to convert this way. In the end, I thought it seems a bit wasteful to do a major refactoring of ethash at this point.
This PR also fixes an over-allocation that could happen if the mmap fails, and we generate the dag in-memory, and does some more cleanup in case of failures.
Re testing
I found this error when I tried running
go test . -run - -bench BenchmarkHashimotoFullMmap/WithLock -v
. It generates the dag to the system temp, which in my case was limited to1Gb
. Without this fix, the benchmark crashed, but it runs fine with this fix.Re platforms
I got the supported-platform info from this