Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

recreate compacted boltdb files from compactor when they are more than 12 hours old #4853

Merged

Conversation

sandeepsukhani
Copy link
Contributor

@sandeepsukhani sandeepsukhani commented Dec 1, 2021

What this PR does / why we need it:
This PR changes the compactor code to re-create compacted boltdb files using bbolt.Compact for 2 reasons:

  1. When files are deleted, boltdb leaves free pages in the file. The only way to drop those free pages is to re-create the file.
    See database file size not updating? boltdb/bolt#308 for more details.
  2. boltdb by default fills only about 50% of the page in the file. See https://github.com/etcd-io/bbolt/blob/master/bucket.go#L26.
    This setting is optimal for unordered writes.
    bbolt.Compact fills the whole page by setting FillPercent to 1 which works well here since while copying the data, it receives
    the index entries in order. The storage space goes down from anywhere between 25% to 50% as per my tests.

The compacted dbs would be re-created when:

  1. We have just a single file in table.
  2. File was compacted more than 12 hours ago to avoid re-creating the files too often while the file is still being modified by compaction or retention.
  3. We have not already re-created the db since re-creating again would not have any benefits.

To simplify the code, I have also changed the compactor code to drop the requirement of having more than 4 files for doing the compaction.

Checklist

  • Tests updated
  • Add an entry in the CHANGELOG.md about the changes.

Copy link
Contributor

@cyriltovena cyriltovena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

This file might need some refactoring it start to be a bit too complex for what it does.

@sandeepsukhani
Copy link
Contributor Author

LGTM

This file might need some refactoring it start to be a bit too complex for what it does.

Thanks for the review! I am taking care of it as part of per user index changes.

@sandeepsukhani sandeepsukhani merged commit 3ac9818 into grafana:main Dec 2, 2021
sandeepsukhani added a commit that referenced this pull request Dec 3, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants