Prioritize compaction of entry logs with the lowest amount of remaining usable data #3389
hangc0276
pushed a commit
that referenced
this issue
Jul 22, 2022
…nt of remaining usable data (#3390)
Descriptions of the changes in this PR:
### Motivation
Prioritize compaction to free up more space faster.
### Changes
doCompactEntryLogs() iterates over entry logs in whatever natural order they happen to be, picks the first with usage below thresholds and starts compacting.
Added a Priority Queue of entry logs to pick ones with the most compactable space first; it also helps when the time for compaction is limited (via majorCompactionMaxTimeMillis / minorCompactionMaxTimeMillis), instead of spending time on rewriting files with more data we'll pick the files with the least amount of data first.
Master Issue: #3389
zymap pushed a commit that referenced this issue on Aug 2, 2022: same commit message as above (#3390), cherry picked from commit 1825677.
dlg99 added a commit to datastax/bookkeeper that referenced this issue on Nov 19, 2022: same commit message (apache#3390), cherry picked from commit 1825677 and then from commit 063cc8b.
dlg99 added a commit to dlg99/bookkeeper that referenced this issue on Mar 14, 2023: same commit message (apache#3390), cherry picked from commit 1825677.
dlg99 added a commit to dlg99/bookkeeper that referenced this issue on Mar 14, 2023: same commit message (apache#3390), cherry picked from commit 1825677.
dlg99 added a commit to dlg99/bookkeeper that referenced this issue on Mar 16, 2023: same commit message (apache#3390), cherry picked from commit 1825677.
dlg99 added a commit to datastax/bookkeeper that referenced this issue on Mar 16, 2023: "#6)", which bundles "[Issue 3389] Prioritize compaction of entry logs with the lowest amount of remaining usable data (apache#3390)" (cherry picked from commit 1825677) with "checkstyle in random files" and "flaky test".
Ghatage pushed a commit to sijie/bookkeeper that referenced this issue on Jul 12, 2024: same commit message (apache#3390).
FEATURE REQUEST
Prioritize compaction to free up more space faster.
must-have
Looking at GarbageCollectorThread:
doCompactEntryLogs() iterates over entry logs in whatever natural order they happen to be, picks the first with usage below thresholds and starts compacting.
For major compaction, this means we can start by compacting an entry log at 80% utilization instead of, e.g., one at 10%.
This can be fixed easily by building a PriorityQueue of entry logs ordered by lowest utilization (meta.getUsage()), so that the emptiest logs are compacted first and more space is freed sooner.
Building the queue should not take much time, and it can be combined with doGcEntryLogs(), which already iterates over all entries in entryLogMetaMap; memory-wise it should be fine too.
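A minimal sketch of the proposed ordering, to make the idea concrete. This is not the actual GarbageCollectorThread code: the class CompactionOrderSketch, the LogMeta stand-in, its fields, and the compactionOrder() helper are all hypothetical names invented for illustration. Only the idea of ordering by getUsage() and compacting logs below a threshold comes from the description above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class CompactionOrderSketch {

    /** Hypothetical stand-in for an entry log's metadata; NOT the BookKeeper class. */
    static final class LogMeta {
        final long logId;
        final long totalSize;      // bytes originally written to the log
        final long remainingSize;  // bytes still referenced by live ledgers

        LogMeta(long logId, long totalSize, long remainingSize) {
            this.logId = logId;
            this.totalSize = totalSize;
            this.remainingSize = remainingSize;
        }

        /** Fraction of the log that is still in use (assumed meaning of usage). */
        double getUsage() {
            return totalSize == 0 ? 0.0 : (double) remainingSize / totalSize;
        }
    }

    /**
     * Returns the ids of logs whose usage is below the threshold,
     * lowest usage first (hypothetical helper).
     */
    static List<Long> compactionOrder(Iterable<LogMeta> metas, double threshold) {
        Comparator<LogMeta> byUsage = Comparator.comparingDouble(LogMeta::getUsage);
        PriorityQueue<LogMeta> queue = new PriorityQueue<>(byUsage);

        // One pass over all metadata, as could be done alongside the GC scan.
        for (LogMeta meta : metas) {
            if (meta.getUsage() < threshold) {
                queue.add(meta);
            }
        }

        // Polling yields the emptiest remaining log each time.
        List<Long> order = new ArrayList<>();
        while (!queue.isEmpty()) {
            order.add(queue.poll().logId);
        }
        return order;
    }

    public static void main(String[] args) {
        List<LogMeta> metas = Arrays.asList(
                new LogMeta(1, 100, 75),   // 75% used
                new LogMeta(2, 100, 10),   // 10% used
                new LogMeta(3, 100, 95),   // 95% used, above threshold
                new LogMeta(4, 100, 40));  // 40% used
        System.out.println(compactionOrder(metas, 0.8)); // prints [2, 4, 1]
    }
}
```

With this ordering, a compaction run that is cut short by a time limit has already handled the logs that free the most space per byte rewritten.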