-
Notifications
You must be signed in to change notification settings - Fork 33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
periodic GC for badger datastore #72
Conversation
datastore.go
Outdated
@@ -52,7 +59,9 @@ var DefaultOptions Options | |||
|
|||
func init() { | |||
DefaultOptions = Options{ | |||
gcDiscardRatio: 0.1, | |||
GcDiscardRatio: 0.5, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will result is GC discarding less when it is run. It is set to a lower ratio so that it at least GCs something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(in many cases it seems we don't reach a ratio of deletions up to 0.5 for GC to do anything (but there might have been bugs in Badger too).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hsanjuan I see. Badger recommends a value of 0.5 in the docs with the premise that a lower value will cause more GC's to run. But, I wasn't aware that we don't usually fill up 50 % of a value log file. I'll change it to a lower value.
func (d *Datastore) periodicGC() { | ||
for { | ||
select { | ||
case <-time.After(d.gcInterval): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't it more natural to have a Ticker
? Or a timer that can be stopped when before closing.
(also depends if the interval should be a fixed interval or actually a countdown since the last GC, in which case the user should not expect GCs happening every 15 mins but after 15 mins the previous GC finished (that need to be more clarified in docs)).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hsanjuan I think waiting for gcInterval
between two iterations makes more sense for a use case like GC. I'll add some documentation to that regard.
break LOOP | ||
default: | ||
if err == nil { | ||
err = d.DB.RunValueLogGC(d.gcDiscardRatio) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do I understand this correctly as by default running RunValueLogGC() repeteadly for 1 minute non-stop? I'm not sure this works. In order to timeout a GC operation this would need support from badger.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hsanjuan From the Badger GC docs :
One call to `DB.RunValueLogGC()` would only result in removal of
at max one log file. As an optimization, you could also immediately
re-run it whenever it returns nil error (indicating a successful
value log GC)
The idea is to keep calling DB.RunValueLogGC()
till Badger no longer has any log files to GC(which would be indicated by an error). The 1 minute timeout is to ensure we do not keep calling GC in case Badger has accumulated excessive garbage. However, we will ofcourse finish earlier if Badger has nothing left to GC. I am adding some docs to the code to explain this.
@hsanjuan Thank you for taking the time out to review this. I have made changes based on your suggestions & added some documentation . Please take a look when you can. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I have a few things I'd like to improve on but this is strictly better than what we currently have. Thanks @aarshkshah1992!
For #51
@Stebalien
Also took the liberty of addressing the issues raised on #56.
We can close that PR once this gets merged.