[HUDI-6758] Fixing deducing spurious log blocks due to spark retries #9611
Conversation
Some flink tests in TestStreamWriteOperatorCoordinator are failing. Please look into that.
cc @danny0405
Resolved review threads (outdated):
- hudi-common/src/main/java/org/apache/hudi/common/table/log/AbstractHoodieLogRecordReader.java
- hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieAppendHandle.java
- hudi-common/src/main/java/org/apache/hudi/common/engine/LocalTaskContextSupplier.java
If we have not tested it well, can we revert #9545 first? And please do not merge such changes when we do not have enough tests, especially during the release process.
hey @danny0405 : this patch fixes it end to end. I don't see a reason why we need to revert it, though. If we didn't have a solution yet, I would agree we could revert it, but we already have a working solution e2e, and it has been tested with spark retries as well.
Resolved review thread:
- hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieAppendHandle.java
Okay, I'm just scared of any regression because the changes are very core. I know we already have e2e tests manually, but I still think we should give the community some buffer time for testing. Pushing it into 0.14.0 is not a very sensible choice.
Force-pushed from 2d333fd to 3e8fb83
@hudi-bot run azure
```diff
@@ -62,4 +62,9 @@ public Option<String> getProperty(EngineProperty prop) {
     return Option.empty();
   }

+  @Override
+  public Supplier<Integer> getAttemptNumberSupplier() {
+    return () -> -1;
```
In Flink we already have getAttemptIdSupplier, which serves the same purpose. What is its usage in Spark, then? Should we use getAttemptIdSupplier instead?
As of now, I have disabled it for flink. Can you test it out and fix it?
Here is the situation with spark.
Let's say we want to spin up 10 tasks for a stage.
In the first attempt, each task will be assigned a number from 1 to 10 for attemptId, but attemptNumber will be 0 for all of them.
When a subset of the tasks is retried, the new attemptIds will be 11, 12, etc., but attemptNumber will be 1.
This is how it works in spark. I am not very sure about flink. Anyway, with the latest commit, we are avoiding writing the block identifier for flink.
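The numbering described above can be sketched with a small simulation (hypothetical class and method names, purely illustrative; this is not Spark's or Hudi's API):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of Spark-style task attempt bookkeeping, illustrating
// why attemptNumber (per-task retry count) differs from attemptId
// (globally unique across all attempts in the stage).
class StageAttemptModel {
    private long nextAttemptId = 1;

    static final class TaskAttempt {
        final int taskIndex;      // which of the N tasks in the stage
        final long attemptId;     // unique across every attempt in the stage
        final int attemptNumber;  // 0 for the first try, 1 for the first retry, ...

        TaskAttempt(int taskIndex, long attemptId, int attemptNumber) {
            this.taskIndex = taskIndex;
            this.attemptId = attemptId;
            this.attemptNumber = attemptNumber;
        }
    }

    // Launch one attempt of each listed task; attemptNumber is the retry round.
    List<TaskAttempt> launch(List<Integer> taskIndexes, int attemptNumber) {
        List<TaskAttempt> attempts = new ArrayList<>();
        for (int taskIndex : taskIndexes) {
            attempts.add(new TaskAttempt(taskIndex, nextAttemptId++, attemptNumber));
        }
        return attempts;
    }
}
```

With 10 tasks, the first launch hands out attemptIds 1 to 10 with attemptNumber 0; retrying a subset continues at attemptId 11 while attemptNumber becomes 1, matching the description above.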
So, in summary, the attemptId supplier is not the right one to use, at least in spark. It has to be the attempt number.
Does a spark retry always take the same data set? Is it possible that a retried task goes to another executor/container and takes a different input dataset there?
With regard to log files, it should not, because only updates go to log files.
In case of the bucket index, the record key to fileId mapping is hash based anyway, so we should be good.
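As a rough illustration of the hash-based routing mentioned above (a simplified stand-in, not Hudi's actual bucket index implementation):

```java
// Simplified illustration of hash-based bucket routing: the same record key
// always maps to the same bucket, regardless of which task (or task retry)
// happens to process the record.
class SimpleBucketIndex {
    private final int numBuckets;

    SimpleBucketIndex(int numBuckets) {
        this.numBuckets = numBuckets;
    }

    int bucketFor(String recordKey) {
        // Math.floorMod keeps the result non-negative even when hashCode() is negative.
        return Math.floorMod(recordKey.hashCode(), numBuckets);
    }
}
```

Because the mapping is a pure function of the key, a retried task that re-processes the same updates writes them to the same file group as the original attempt.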
"only updates go to log files."
That is only true for spark, so you are fixing a bug that depends on Spark's write characteristics.
Yes, we have disabled it for flink as of now. In Java, MOR is not fully functional anyway, from what I know, but I am open to disabling it for Java as well. Mainly it is an issue for ExpressionPayload and any other custom payloads; most of the other payloads are idempotent even if there are duplicate log blocks.
Let's make the Flink impl right first by using this.flinkRuntimeContext.getAttemptNumber()
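For reference, a hedged sketch of what an attempt-number-aware supplier could look like (Flink's RuntimeContext does expose getAttemptNumber(); the wrapper class below is illustrative and not Hudi's actual LocalTaskContextSupplier):

```java
import java.util.function.Supplier;

// Illustrative only: a context supplier that exposes the engine's attempt
// number. In Flink this could delegate to RuntimeContext#getAttemptNumber();
// here a plain int stands in for the runtime so the sketch is self-contained.
class AttemptAwareContextSupplier {
    private final int attemptNumber;

    AttemptAwareContextSupplier(int attemptNumber) {
        this.attemptNumber = attemptNumber;
    }

    // Mirrors the getAttemptNumberSupplier() shape from the diff above,
    // but returns a real attempt number instead of the disabled -1.
    Supplier<Integer> getAttemptNumberSupplier() {
        return () -> attemptNumber;
    }
}
```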
sure, will sync up w/ you directly to better understand this.
https://issues.apache.org/jira/browse/HUDI-6844
Force-pushed from ab1897d to f2cc702
Force-pushed from 5620dc4 to e86aa76
Force-pushed from e86aa76 to 375c15d
@hudi-bot run azure
…9611) - We attempted a fix to avoid reading spurious log blocks on the reader side with #9545. When I tested the patch end to end, I found some gaps. Specifically, the attempt id we had with taskContextSupplier was not referring to the task's attempt number, so this patch fixes it. Tested end to end by simulating spark retries and spurious log blocks; the reader is able to detect them and ignore multiple copies of log blocks.
@nsivabalan Good job.
For the third case, where the whole stage is retried, the task attempt number might go back to the original value. We might need to use …
I would rather we revert this change first if there is no thorough solution, or put a flag for switching it, disabled by default.
hey @beyond1920 : thanks for flagging this. I agree with danny. Will probably revert the change for now.
@nsivabalan hi, I have a question. Why can't we use …
Change Logs
We attempted a fix to avoid reading spurious log blocks on the reader side with #9545.
When I tested the patch end to end, I found some gaps. Specifically, the attempt id we had with taskContextSupplier was not referring to the task's attempt number, so this patch fixes it. Tested end to end by simulating spark retries and spurious log blocks; the reader is able to detect them and ignore multiple copies of log blocks.
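A much-simplified sketch of the reader-side idea (hypothetical types; the real logic lives in AbstractHoodieLogRecordReader and also tracks block sequence numbers): for a given file slice, when log blocks from several task attempts overlap, keep only the blocks written by the highest attempt number and drop the spurious copies from failed attempts.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical reader-side dedup for one file slice: log blocks carry the
// writer's attempt number, and a spark retry rewrites all of that task's
// blocks, so keeping the highest attempt's blocks discards spurious copies.
class LogBlockDeduper {
    static final class LogBlock {
        final int attemptNumber;   // writer task's attempt number (0 = first try)
        final int sequenceNumber;  // position of the block within the attempt

        LogBlock(int attemptNumber, int sequenceNumber) {
            this.attemptNumber = attemptNumber;
            this.sequenceNumber = sequenceNumber;
        }
    }

    static List<LogBlock> dedupe(List<LogBlock> blocks) {
        int maxAttempt = blocks.stream().mapToInt(b -> b.attemptNumber).max().orElse(0);
        List<LogBlock> kept = new ArrayList<>();
        for (LogBlock block : blocks) {
            if (block.attemptNumber == maxAttempt) {
                kept.add(block);
            }
        }
        return kept;
    }
}
```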
Impact
Properly deduce and skip spurious log blocks on the reader
Risk level (write none, low, medium or high below)
medium
Documentation Update
Describe any necessary documentation update if there is any new feature, config, or user-facing change:
ticket number here and follow the instruction to make changes to the website.
Contributor's checklist