Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[HUDI-8395] Fix metaClient handling when running upgrade or downgrade #12224

Merged
merged 4 commits into from
Nov 15, 2024

Conversation

codope
Copy link
Member

@codope codope commented Nov 8, 2024

Change Logs

After debugging, I found that upgrade path is invoked but the metaClient is not reloaded after that. Simply reloading in tryUpgrade is not enough because we call that in a lambda. So, this PR fixes the issue as follows:

  • Updated doInitTable() to return the potentially reloaded metaClient, allowing initTable() to access the updated metaClient reference after an upgrade.
  • Enhanced HoodieCommitMetadata.fromBytes() method with a fallback mechanism for JSON deserialization given that commit metadata serialization changed from json to avro in 1.0.

Tested manually by running the same steps as mentioned in HUDI-8395.

Impact

Refactor metaClient handling during upgrade/downgrade, and implement fallback in JSON deserialization.

Risk level (write none, low medium or high below)

medium

Documentation Update

Describe any necessary documentation update if there is any new feature, config, or user-facing change. If not, put "none".

  • The config description must be updated if new configs are added or the default value of the configs are changed
  • Any new feature or user-facing change requires updating the Hudi website. Please create a Jira ticket, attach the
    ticket number here and follow the instruction to make
    changes to the website.

Contributor's checklist

  • Read through contributor's guide
  • Change Logs and Impact were stated clearly
  • Adequate tests were added if applicable
  • CI passed

@github-actions github-actions bot added the size:S PR with lines of changes in (10, 100] label Nov 8, 2024
@codope codope marked this pull request as ready for review November 9, 2024 06:06
@@ -1447,11 +1457,11 @@ protected void tryUpgrade(HoodieTableMetaClient metaClient, Option<String> insta
tableServiceClient.rollbackFailedWrites(pendingRollbacks, true, true);
}

new UpgradeDowngrade(metaClient, config, context, upgradeDowngradeHelper)
.run(HoodieTableVersion.current(), instantTime.orElse(null));

metaClient.reloadActiveTimeline();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just move the timeline reload out of the tryUpgrade method and do it outside, so that we can eliminate the tricks of this atomic reference get/set.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated now, please check. I have made tryUpgrade return a boolean and if upgrade happened then reload the meta client outside tryUpgrade but still within the transaction.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a new method reloadTableConfig to address the problem.

UpgradeDowngrade upgradeDowngrade =
new UpgradeDowngrade(metaClient, config, context, upgradeDowngradeHelper);

if (upgradeDowngrade.needsUpgradeOrDowngrade(HoodieTableVersion.current())) {
metaClient = HoodieTableMetaClient.reload(metaClient);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this line is problematic, if we always use the same reference passed in, it should work, that is: use metaClient.reloadActiveTimeline(); here instead of replace the metaClient reference.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually it is not the timeline, but the table config that is not refreshed.

return fromJsonString(fromUTF8Bytes(convertCommitMetadataToJsonBytes(deserializeCommitMetadata(bytes), org.apache.hudi.avro.model.HoodieCommitMetadata.class)), clazz);
} catch (Exception e) {
// fall back to the alternative method (0.x)
LOG.warn("Primary method failed; trying alternative deserialization method.", e);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should migrate to @bvaradar 's new SER/DE API instead of catching exceptions: #11923

@hudi-bot
Copy link

CI report:

Bot commands @hudi-bot supports the following commands:
  • @hudi-bot run azure re-run the last Azure build

@codope codope merged commit 7107995 into apache:master Nov 15, 2024
45 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size:S PR with lines of changes in (10, 100]
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants