puller(ticdc): fix wrong update splitting behavior after table scheduling (#11296) #11303
This is an automated cherry-pick of #11296
What problem does this PR solve?
Issue Number: close #11219
What is changed and how it works?
Consider two TiCDC nodes `A` and `B`, where `B` starts before `A`, that is `thresholdTS_B < thresholdTS_A`:

1. Table `t` is first replicated on node `A`. It has an update event whose `commitTS` is smaller than `thresholdTS_A` and larger than `thresholdTS_B`, so the update event is split into a delete event and an insert event on node `A`.
2. After `t` is scheduled to node `B`, the same update event is received by node `B` again.
3. Node `B` does not split the event, because its `commitTS` is larger than `thresholdTS_B`. Node `B` just sends an update SQL to the downstream, which causes data inconsistency.

There is another thing to notice: after scheduling, node `B` will send some events to the downstream that were already sent by node `A`, so node `B` must send these events in an idempotent way. Previously, this was handled by fetching a `replicateTS` in the sink module when the sink starts, and splitting the events whose `commitTS` is smaller than `replicateTS`. But that mechanism was removed in #11030, so we need to handle this case in the puller too.

In this PR, instead of maintaining a separate `thresholdTS` in the source manager, we fetch the `replicateTS` from the sink when the puller needs to decide whether to split an update event. Since the puller module starts working before the sink module, `replicateTS` defaults to `MaxUint64`, which means all update events are split. After the sink starts working, `replicateTS` is set to the correct value.

The last thing to notice: when the sink restarts due to some error, it may re-send some events to the downstream that were already sent before the restart. These events also need to be sent in an idempotent way. But these events are already in the sorter, so restarting only the sink cannot accomplish this goal. Therefore this PR forbids restarting the sink, and instead restarts the whole changefeed when an error occurs.
Check List
Tests
Questions
Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?
Release note