Add protection when mmap somehow fails #362
Thank you @ahrtr. This seems aligned with bbolt_unix.go so LGTM. I might be missing something but in general the
I am on "unix" and since
I think it is a better approach as it works for all platforms. That being said, there are more edge cases where this can break things. I'm afraid I don't have the time to write a test for this right now, though.
To sum this up, I don't think this fix is as simple as it seems.
This PR should fix the following issues:
db.go
Outdated
@@ -482,6 +482,14 @@ func (db *DB) mmap(minsz int) error {

// munmap unmaps the data file from memory.
func (db *DB) munmap() error {
	defer func() {
		db.dataref = nil
I would consider moving it to an explicit db.invalidateUnsafe() method (and adding a comment that it should be executed under the exclusive mmapLock and metaLock).
Updated (see below). I prefer not to add the "Unsafe" suffix, nor a comment along the lines of "it should be executed under the exclusive mmapLock and metaLock". BoltDB has a well-designed locking mechanism (see below), so we shouldn't need to worry about locking in each individual internal method/function.
A write transaction acquires the rwlock at db.rwlock.Lock(), and acquires the mmaplock when it needs to allocate more space and therefore remap the db (releasing the lock once the mapping finishes); see db.mmaplock.Lock().
A read-only transaction only acquires the mmaplock at db.mmaplock.RLock(), to prevent the db file from being remapped by another write transaction.
The newly added db.invalidate() is only called in db.munmap(), so it's safe.
func (db *DB) invalidate() {
	db.dataref = nil
	db.data = nil
	db.datasz = 0
	db.meta0 = nil
	db.meta1 = nil
}
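For readers who want to see the locking discipline described above in one place, here is a minimal sketch. It is not bbolt's code: miniDB, remap and view are hypothetical stand-ins, and only the rwlock/mmaplock pairing from the comment is modeled.

```go
package main

import (
	"fmt"
	"sync"
)

// miniDB is a hypothetical stand-in for bbolt's DB; only the fields
// needed to illustrate the locking discipline are modeled.
type miniDB struct {
	rwlock   sync.Mutex   // held by the single active write transaction
	mmaplock sync.RWMutex // guards data while the file is being remapped
	data     []byte
}

// invalidate clears the mapped data. It takes no lock itself; in this
// sketch it is only reached from remap, which holds mmaplock exclusively.
func (db *miniDB) invalidate() {
	db.data = nil
}

// remap simulates a write transaction that needs more space: it holds
// rwlock for the whole transaction and mmaplock only while remapping.
func (db *miniDB) remap(size int) {
	db.rwlock.Lock()
	defer db.rwlock.Unlock()

	db.mmaplock.Lock()
	db.invalidate()
	db.data = make([]byte, size)
	db.mmaplock.Unlock()
}

// view simulates a read-only transaction: it takes only the read side of
// mmaplock, so a concurrent remap has to wait until the reader is done.
func (db *miniDB) view(fn func(data []byte)) {
	db.mmaplock.RLock()
	defer db.mmaplock.RUnlock()
	fn(db.data)
}

func main() {
	db := &miniDB{}
	db.remap(16)
	db.view(func(data []byte) { fmt.Println("mapped bytes:", len(data)) })
}
```

The point of the sketch is that invalidate stays lock-free because its only caller already holds the exclusive lock, which is the argument made in the comment above.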
There is an etcd convention (which I thought was even a Go convention, but I haven't found any sign of that) to name methods 'UnsafeFoo' when the caller needs to take care of proper locking before calling them.
$ git grep Unsafe | grep func | nl
shows 93 cases.
I do think that the call of 'invalidate' is safe in that context. I just want to avoid a situation where someone calls 'invalidate' from a context where the locks are not taken, because that person hasn't read carefully that the body of the method interacts with db.data and db.meta, and that these fields must be mutated under the exclusive lock.
I'm not insisting on the "Unsafe" convention, but I don't think a comment costs us anything here.
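As an illustration of the 'UnsafeFoo' vs 'Foo' naming convention mentioned above, here is a small sketch. The store type and its methods are hypothetical and exist only for this example; they are not taken from etcd or bbolt.

```go
package main

import (
	"fmt"
	"sync"
)

// store is a hypothetical type used only to illustrate the naming convention.
type store struct {
	mu    sync.Mutex
	items map[string]string
}

// UnsafePut writes without locking; the caller must already hold s.mu.
func (s *store) UnsafePut(k, v string) {
	s.items[k] = v
}

// Put acquires the lock itself and then delegates to UnsafePut.
func (s *store) Put(k, v string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.UnsafePut(k, v)
}

func main() {
	s := &store{items: map[string]string{}}

	// Ordinary callers use the locking variant.
	s.Put("a", "1")

	// A caller that batches writes takes the lock once and uses the
	// Unsafe variant for each write.
	s.mu.Lock()
	s.UnsafePut("b", "2")
	s.UnsafePut("c", "3")
	s.mu.Unlock()

	fmt.Println("items:", len(s.items))
}
```

The name makes the locking requirement visible at every call site, which is the concern raised above about someone calling 'invalidate' without the locks held.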
Yes, etcd has the convention (UnsafeFoo vs Foo), but there is NO such convention in bbolt for now. If we change it to invalidateUnsafe, then it won't be consistent with the other internal methods.
So I'd like to keep it as is for now.
Of course, please feel free to raise a separate ticket if you think we should follow the same convention as etcd.
		db.meta0 = nil
		db.meta1 = nil
	}()
	if err := munmap(db); err != nil {
		return fmt.Errorf("unmap error: " + err.Error())
It's a topic for a separate PR, but I think we should start logging errors from failed syscalls early.
There is a high probability they will lead to a panic later (either on the bbolt side or, more likely, in the customer's application), and a significant probability that the customer will not log the error themselves.
This might lead to hard-to-diagnose problems of "etcd not working" when the actual cause was OOM or no disk space.
Failed syscalls are something that should not happen on well-configured machines, so this should not make the app logs spammy.
Will think about this separately.
Signed-off-by: Benjamin Wang <wachao@vmware.com>
Thank you for this.
Reported #382 to cover it with better (or any) integration testing.
Thanks @ptabor for the review.
Prevent bbolt from panicking in case #262
In (*DB) mmap, it first munmaps the file, then mmaps it again. If the munmap somehow fails, then db.data is reset to nil. In this case, there is no need to execute rollback.
Signed-off-by: Benjamin Wang <wachao@vmware.com>