Add Mergeability column to support automatic merges #5187
base: main
This adds a new Mergeability column that marks if and when tablets are eligible to be merged by the system based on a threshold. The column stores a duration relative to SteadyTime, the time the Manager uses.

There are 3 possible states for the value:
1) -1: the tablet will never merge automatically
2) 0: the tablet is eligible to merge automatically now
3) positive duration: the tablet is eligible to merge after the given duration relative to the current system SteadyTime, i.e. the tablet can be merged once SteadyTime is later than the given duration

This change only adds the new column itself and populates it. The default is to never merge automatically in all cases except when the system automatically splits tablets. In that case the newly split tablets are marked as eligible to merge immediately.

Future updates will add API enhancements to allow setting and viewing the mergeability setting, and will enable automatic merging by the system based on this new column value.

When automatic merging is enabled, a user who wants to make a tablet eligible to be merged in the future would do so by adding a period of time to the current SteadyTime. For example, to make a tablet eligible to be merged 3 days from now, the user would read the current SteadyTime value (represented as a duration), add 3 days to it, and store that new duration in the column. Once the current steady time passes that duration, the tablet is eligible to be merged.

See apache#5014 for more details.
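To make the three states concrete, here is a minimal sketch of the eligibility check described above. It is not the code in this PR: the class `MergeabilitySketch` and the methods `eligibleAfter` and `isMergeable` are hypothetical names, and SteadyTime is modeled as a plain `java.time.Duration`. The use of `>=` for the comparison is also an assumption, chosen so that a value of 0 is always eligible.

```java
import java.time.Duration;

// Hypothetical sketch of the three column states; not the class added by this PR.
final class MergeabilitySketch {
  // -1 ns is the sentinel for "never merge automatically".
  static final Duration NEVER = Duration.ofNanos(-1);
  // Zero means "eligible to merge now".
  static final Duration NOW = Duration.ZERO;

  // Computes the value to store for "eligible after `delay` from now":
  // the current steady time plus the requested delay.
  static Duration eligibleAfter(Duration currentSteadyTime, Duration delay) {
    if (delay.isNegative()) {
      throw new IllegalArgumentException("delay must not be negative");
    }
    return currentSteadyTime.plus(delay);
  }

  // A tablet is mergeable once the current steady time has reached the stored value.
  // Assumption: >= is used so that NOW (zero) is eligible at any steady time.
  static boolean isMergeable(Duration stored, Duration currentSteadyTime) {
    if (stored.isNegative()) {
      return false; // NEVER
    }
    return currentSteadyTime.compareTo(stored) >= 0;
  }
}
```

With a current steady time of 10 days, `eligibleAfter(Duration.ofDays(10), Duration.ofDays(3))` yields a stored value of 13 days, which matches the "3 days from now" example in the description.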
70dfb91 to 346f936
} else if (isNever()) {
  return "TabletMergeability=NEVER";
}
return "TabletMergeability=AFTER:" + delay.toNanos();
We may want to convert this to ms or s in the toString method for ease of understanding.
Yeah, not a bad idea, nanoseconds would be pretty tough to understand in a log.
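One way the suggestion could look, as a sketch only: the helper below prints the delay in milliseconds instead of nanoseconds. The class name `MergeabilityFormat`, the method `describe`, the `NOW` string, and the `ms` suffix are all assumptions for illustration; only the `NEVER` and `AFTER:` strings come from the diff above.

```java
import java.time.Duration;

// Hypothetical formatting helper; not part of this PR.
final class MergeabilityFormat {
  static String describe(Duration delay) {
    if (delay.isNegative()) {
      return "TabletMergeability=NEVER";
    }
    if (delay.isZero()) {
      // Assumed label for the zero state; the diff does not show this branch.
      return "TabletMergeability=NOW";
    }
    // Milliseconds are easier to read in a log than nanoseconds.
    return "TabletMergeability=AFTER:" + delay.toMillis() + "ms";
  }
}
```

For example, `describe(Duration.ofSeconds(90))` returns `TabletMergeability=AFTER:90000ms`.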
private static final long serialVersionUID = 1L;

public static final TabletMergeability NEVER = new TabletMergeability(Duration.ofNanos(-1));
public static final TabletMergeability NOW = new TabletMergeability(Duration.ofNanos(0));
- public static final TabletMergeability NOW = new TabletMergeability(Duration.ofNanos(0));
+ public static final TabletMergeability NOW = new TabletMergeability(Duration.ZERO);
@@ -392,6 +393,8 @@ interface TabletUpdates<T> {

T putCloned();

T putTabletMergeability(TabletMergeability tabletMergeability);
Eventually we will need a steady time to evaluate whether something should merge. It seems like the steady time should always be set when the mergeability is set, although it's only needed when the duration is > 0. The steady time could be encoded in the same column's value.
- T putTabletMergeability(TabletMergeability tabletMergeability);
+ T putTabletMergeability(TabletMergeability tabletMergeability, SteadyTime steadyTime);
Steady time could be added in later PRs. Just wondering how it will be persisted.
My original plan was to persist everything as one value by adding the current manager time to the value. So to compute the mergeability duration you would read the current SteadyTime, add the offset to it, and store the result; later you could compare it to the current time by taking a diff.
However, after thinking about it more, I think storing it as two separate values makes sense because it allows better debugging (logging), and you can see extra information such as when the value was created. Keeping them separate also allows other things, like updating/replacing the original SteadyTime and other metric calculations, so I'll change it.
> My original plan was to persist everything as one value by adding the current manager time to the value.
OK, I had thought this would store the exact duration specified by the user. Summing the steady time and the user's duration would work. There may be a slight advantage to keeping them separate for debugging purposes, as mentioned.
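For comparison, a rough sketch of the "two separate values" option discussed in this thread. Everything here is hypothetical: the class `MergeabilityWithTime` and its fields are illustrative names, and SteadyTime is again modeled as a `java.time.Duration`. It only shows how eligibility could be computed from the user's delay plus the steady time recorded when the value was set.

```java
import java.time.Duration;

// Hypothetical value holder; not the persisted format chosen in this PR.
final class MergeabilityWithTime {
  private final Duration delay;           // delay exactly as the user specified it
  private final Duration steadyTimeAtSet; // steady time when the value was stored

  MergeabilityWithTime(Duration delay, Duration steadyTimeAtSet) {
    this.delay = delay;
    this.steadyTimeAtSet = steadyTimeAtSet;
  }

  // Eligible once at least `delay` of steady time has elapsed since the value was set.
  boolean isMergeable(Duration currentSteadyTime) {
    if (delay.isNegative()) {
      return false; // negative delay is treated as "never"
    }
    Duration elapsed = currentSteadyTime.minus(steadyTimeAtSet);
    return elapsed.compareTo(delay) >= 0;
  }

  // Both values stay visible, which is the debugging advantage mentioned above.
  @Override
  public String toString() {
    return "delay=" + delay + ", setAt=" + steadyTimeAtSet;
  }
}
```

The summed form and the separate form give the same eligibility answer; the separate form just retains when the value was created and what the user originally asked for.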
@@ -187,6 +187,20 @@ public Repo<Manager> call(FateId fateId, Manager manager) throws Exception {
DeleteRows.getMergeTabletAvailability(range, tabletAvailabilities));
tabletMutator.putPrevEndRow(firstTabletMeta.getPrevEndRow());

// TODO: How should we determine this value after merging?
// Do we just keep the value that's already in the last tablet in the range?
Keeping the one on the last tablet as is makes sense if we think of mergeability as being set on a split point. Some mergeability was set on the last split in the past, so we need to keep it. The other split points are being merged away, so we no longer care about their mergeability.
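A small sketch of that reasoning, under the assumption that each tablet's mergeability belongs to the split point at its end row. The types `MergeSketch` and `Tablet` are hypothetical stand-ins and do not correspond to the tablet metadata classes touched in this PR.

```java
import java.time.Duration;
import java.util.List;

// Hypothetical model of merging a range of tablets; not code from this PR.
final class MergeSketch {
  static final class Tablet {
    final String endRow;         // split point that ends this tablet
    final Duration mergeability; // value associated with that split point

    Tablet(String endRow, Duration mergeability) {
      this.endRow = endRow;
      this.mergeability = mergeability;
    }
  }

  // The merged tablet keeps the last tablet's end row, so it also keeps that
  // split point's mergeability. The interior split points disappear.
  static Tablet merge(List<Tablet> range) {
    if (range.isEmpty()) {
      throw new IllegalArgumentException("range must not be empty");
    }
    Tablet last = range.get(range.size() - 1);
    return new Tablet(last.endRow, last.mergeability);
  }
}
```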