-
Notifications
You must be signed in to change notification settings - Fork 20.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
tests: fix goroutine leak related to state snapshot generation #28974
Conversation
Exec speed of
Ok, I guess we keep the paralellism! |
tests/state_test_util.go
Outdated
@@ -240,6 +240,9 @@ func (t *StateTest) RunNoVerify(subtest StateSubtest, vmconfig vm.Config, snapsh | |||
vmconfig.ExtraEips = eips | |||
|
|||
block := t.genesis(config).ToBlock() | |||
// MakePreState invokes snapshot.New, which will start a goroutine to | |||
// generate.go:generate(). This goroutine cannot exit until the abort-stats | |||
// can be delivered, either via a call to Journal or e.g Disable. | |||
triedb, snaps, statedb := MakePreState(rawdb.NewMemoryDatabase(), t.json.Pre, snapshotter, scheme) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about adding this here?
if snaps != nil {
defer snaps.Disable()
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And while here, we should probably also use defer triedb.Close()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should only close these things unless we return them to caller. If we do, it's the callers job to clean up
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh, but that's a design smell then. So there are quite a few cases where we need to take care of closing it. The other way of using defer
here would be naming the returns, and then adding:
defer func() {
if err != nil {
triedb.Close()
if snaps != nil {
snaps.Disable()
}
}
}()
What I'm trying to get at here, is we should try to have worry-free return
whenever possible, and having to add a condition before every return
is not great.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I agree. Once the CI runs is done, I can change it to use that defer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This made the tests fail, for some reason https://ci.appveyor.com/project/ethereum/go-ethereum/builds/49167581/job/flhcsji6lgpk99kt#L1068
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not happy about this weird leak of snapshot internals, but I guess there is no good solution otherwise.
It doesn't work because we sometimes want to return an error from |
st.Snapshots.Disable() | ||
st.Snapshots.Release() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to make Disable
safe to call multiple times. Right now, it will halt, because nothing is listening to dl.genAbort
. I think maybe we should nil
the genAbort
when the generator becomes inactive
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
diff --git a/core/state/snapshot/snapshot.go b/core/state/snapshot/snapshot.go
index 6389842382..52e6fca56c 100644
--- a/core/state/snapshot/snapshot.go
+++ b/core/state/snapshot/snapshot.go
@@ -258,6 +258,13 @@ func (t *Tree) Disable() {
for _, layer := range t.layers {
switch layer := layer.(type) {
case *diskLayer:
+
+ layer.lock.RLock()
+ if layer.genMarker == nil {
+ // Generator is already aborted or finished
+ layer.lock.RUnlock()
+ break
+ }
// If the base layer is generating, abort it
if layer.genAbort != nil {
abort := make(chan *generatorStats)
I think this would fix it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly a simpler fix could be to set st.Snapshots = nil
here?
d87a07c
to
fdfb4c2
Compare
|
||
layer.lock.RLock() | ||
generating := layer.genMarker != nil | ||
layer.lock.RUnlock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure this is safe. We are iterating the layers here, and locking them individually. What happens if genMarker
is updated in-between?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good question. However, there's only one singleton disklayer. The genmarker is only ever set to nil
after completion. I don't see how this can go wrong, other than false-negative: after we release this lock, the generation completes, and we again get stuck trying to send stuff over a channel where nobody is listening.
The mechanism re genabort is not wonderful to work with.
…eum#28974) --------- Co-authored-by: Felix Lange <fjl@twurst.com>
This PR fixes two things,
At.Parallel
early in the state test execution causes a massive parallelization: basically all statetest files are read, and initiated. Once all state-tests are initiated, they are all halted, or, put into a wait, so that testing can proceed at the specified paralellism. I suspect that the early forking of testcases does not improve the testing-speed on appveyor/travis, so I removed it here. Will check which run is fastest, with or without it.