-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[HUDI-7665] Support rolling upgrade to table version 8 #12250
base: master
Are you sure you want to change the base?
Conversation
String instantTime, SupportsUpgradeDowngrade upgradeDowngradeHelper) { | ||
HoodieTable table = upgradeDowngradeHelper.getTable(config, context); | ||
HoodieTableMetaClient metaClient = table.getMetaClient(); | ||
HoodieTableConfig tableConfig = metaClient.getTableConfig(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can trigger a rollback for pending instants first, then remove the any log file markers explicitly if there are any.
This way we can clean up the table as much as possible before any other steps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack, changed the code to rollback and compact in one step.
...hudi-client-common/src/main/java/org/apache/hudi/table/upgrade/SixToEightUpgradeHandler.java
Outdated
Show resolved
Hide resolved
throw new HoodieException(e); | ||
} | ||
}; | ||
lsmTimelineWriter.write(Collections.singletonList(ActiveAction.fromInstants(archivedTimeline.getInstants())), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The list for archived timeline is huge, we need to split the list into small batchs and write into the LSM timeline per-batch, by default, just use 10 instants as a batch which is in line with the current behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another concern is that the legacy archived timeline may contain enormous instants there (like several GBs of avro logs), it would be very time-consuming to load the whole legacy archived timeline, maybe we just load the latest avro log(which should be enough for file slicing).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, i thought about it and I agree. I wanted to quickly do some local testing to validate the upgrade path. I will do the batching soon.
|
||
// Migrate the LSM timeline back to the old archived timeline format | ||
try { | ||
// TODO: Convert instants from the LSM format to the old format |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need timeline archiver v1 from Balaj's PR. cc @bvaradar
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack. For now, i've ported over some legacy archiver code (copied from #11923) for local testing. Once that PR lands, I will rebase.
af2c110
to
4cfaa0a
Compare
12271aa
to
9d9683f
Compare
} | ||
|
||
@Override | ||
public int archiveInstants(HoodieEngineContext context, List<HoodieInstant> instantsToArchive, boolean acquireLock) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this acquireLock
is always false.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is true in UpgradeDowngradeUtils methods upgradeToLSMTimeline and downgradeFromLSMTimeline
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why we need lock for upgrade/downgrade, is there any concurrency here?
port some legacy timeline archiver code and compact post rollback
9d9683f
to
f5b1aa8
Compare
Change Logs
TODO:
Impact
Support rolling upgrade to table version 8.
Risk level (write none, low medium or high below)
high
Documentation Update
Describe any necessary documentation update if there is any new feature, config, or user-facing change. If not, put "none".
ticket number here and follow the instruction to make
changes to the website.
Contributor's checklist