…oad as PS-9092
https://perconadev.atlassian.net/browse/PS-9144
Problem:
--------
ALTER TABLE which rebuilds an InnoDB table using the INPLACE algorithm might
sometimes lose rows if a concurrent purge happens on the table being ALTERed.
Analysis:
---------
A new implementation of parallel ALTER TABLE INPLACE in InnoDB was introduced
in MySQL 8.0.27. Its code is used for online table rebuild even in the
single-threaded case.
This implementation iterates over all the rows in the table, in the general
case handling different subtrees of the B-tree in different threads. This
iteration over table rows needs to be paused from time to time to commit the
InnoDB MTR and release the page latches it holds. This is necessary to give
way to concurrent actions on the B-tree being scanned, or before flushing rows
of the new version of the table from the in-memory buffer to the B-tree. To
resume iteration after such a pause, the persistent cursor position saved
before the pause is restored.
The cause of the problem described above lies in the
PCursor::restore_position() method. This method is used for two purposes in
the code:
1) To resume iteration after it was paused in the scenario described
above.
2) To initialize the cursor when a thread starts iterating through the subtree
it was assigned to process.
In scenario 2) we restore the cursor to a record which has not been
processed yet. If the record whose position was saved originally, when
subtrees/ranges to process were assigned to threads, has been purged
meanwhile, the cursor will be restored to the preceding record. The
cursor then needs to be moved to the next record, so that we don't start
processing our subtree from a record belonging to a different thread.
PCursor::restore_position() handles this by detecting the situation when the
saved record was purged and moving to the next record in this case.
However, in scenario 1) we actually restore the cursor to a record which
has been processed already, and from which we will step to the next
record right after the restore. So moving to the next record if the saved
record was purged, as PCursor::restore_position() does and as is
necessary in case 2), leads to a double step forward, and our scan
misses a record!
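The double step can be reproduced in a toy model. The sketch below is not
InnoDB code: the B-tree is replaced by a std::set of integer keys, and all
names in it (Tree, restore, next, scan_with_pause, kBeforeFirst) are
hypothetical stand-ins chosen for illustration. It assumes only what is
described above: restoring a purged position lands on the preceding record,
and the scan loop always steps forward after a resume.

```cpp
#include <cassert>
#include <iterator>
#include <limits>
#include <set>
#include <vector>

// Toy stand-in for the leaf level of a B-tree: an ordered set of record keys.
using Tree = std::set<int>;

// Hypothetical marker for "cursor is positioned before the first record".
constexpr int kBeforeFirst = std::numeric_limits<int>::min();

struct Restored {
  int key;                // record the cursor points to after the restore
  bool saved_was_purged;  // true if the saved record no longer exists
};

// Model of a plain position restore: the saved record if it still exists,
// otherwise the closest preceding record (or kBeforeFirst if there is none).
Restored restore(const Tree &tree, int saved) {
  assert(saved != kBeforeFirst);  // a saved position is always a real record
  auto it = tree.lower_bound(saved);
  if (it != tree.end() && *it == saved) return {saved, false};
  if (it == tree.begin()) return {kBeforeFirst, true};
  return {*std::prev(it), true};
}

// Advance pos to the next record; returns false when the scan is finished.
bool next(const Tree &tree, int &pos) {
  auto it = tree.upper_bound(pos);
  if (it == tree.end()) return false;
  pos = *it;
  return true;
}

// Scan all records, pausing once after processing `pause_after`. During the
// pause `purge_key` is removed, as a concurrent purge might do.
// extra_step_on_purge == true models the pre-fix resume logic.
std::vector<int> scan_with_pause(Tree tree, int pause_after, int purge_key,
                                 bool extra_step_on_purge) {
  std::vector<int> seen;
  int pos = kBeforeFirst;
  while (next(tree, pos)) {
    seen.push_back(pos);  // "process" the record
    if (pos == pause_after) {
      const int saved = pos;  // position of an already processed record
      tree.erase(purge_key);  // purge runs while latches are released
      const Restored r = restore(tree, saved);
      pos = r.key;
      if (r.saved_was_purged && extra_step_on_purge) {
        next(tree, pos);  // first step; the loop condition adds a second one
      }
    }
  }
  return seen;
}
```

With records {10, 20, 30, 40}, a pause after 20 and a purge of 20 during the
pause, the pre-fix variant visits 10, 20, 40 and silently skips 30, while the
variant without the extra step visits all four records.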
Fix:
----
This patch solves the problem by using different logic for restoring the
cursor position in these two cases.
For case 1) we simply restore the position which was saved, using
btr_pcur_t::restore_position(). If the record to which the cursor is
supposed to point has been purged meanwhile, this method will point
the cursor to the preceding record. Then the calling code will iterate to
the next record (i.e. the successor of the purged record) after the restore.
For case 2) we keep the pre-fix behavior: if the record whose position was
saved originally has been purged, we correct the cursor position after
restoring by moving to the next record in the subtree to be processed.
The PCursor::restore_position() method, which implements handling of this
case, has been renamed to PCursor::restore_position_for_range() and
greatly simplified.
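The split between the two cases can be sketched with a small model. This is
an illustration only, not the patched InnoDB code: the B-tree is a std::set of
integer keys, and restore_plain, next_key, resume_after_pause and
first_record_of_range are hypothetical names. Only the described behavior is
assumed: a plain restore of a purged position lands on the preceding record.

```cpp
#include <cassert>
#include <iterator>
#include <limits>
#include <set>

// Toy stand-in for the leaf level of a B-tree: an ordered set of record keys.
using Tree = std::set<int>;

// Hypothetical sentinels for positions outside the set of records.
constexpr int kBeforeFirst = std::numeric_limits<int>::min();
constexpr int kPastLast = std::numeric_limits<int>::max();

struct Restored {
  int key;                // record the cursor points to after the restore
  bool saved_was_purged;  // true if the saved record no longer exists
};

// Plain restore: the saved record if it still exists, otherwise the closest
// preceding record (or kBeforeFirst if there is none).
Restored restore_plain(const Tree &tree, int saved) {
  assert(saved != kBeforeFirst && saved != kPastLast);
  auto it = tree.lower_bound(saved);
  if (it != tree.end() && *it == saved) return {saved, false};
  if (it == tree.begin()) return {kBeforeFirst, true};
  return {*std::prev(it), true};
}

// Key of the record following pos, or kPastLast if there is none.
int next_key(const Tree &tree, int pos) {
  auto it = tree.upper_bound(pos);
  return it == tree.end() ? kPastLast : *it;
}

// Case 1: `saved` was already processed. Restore without any correction and
// let the caller's single step forward find the next record to process.
int resume_after_pause(const Tree &tree, int saved) {
  return next_key(tree, restore_plain(tree, saved).key);
}

// Case 2: `saved_start` has not been processed yet. If it was purged, the
// restore lands in the preceding range, so step forward once to re-enter ours.
int first_record_of_range(const Tree &tree, int saved_start) {
  const Restored r = restore_plain(tree, saved_start);
  return r.saved_was_purged ? next_key(tree, r.key) : r.key;
}
```

In this model, with records {10, 20, 40} left after 30 was purged, both
resume_after_pause(tree, 30) and first_record_of_range(tree, 30) yield 40, and
each path steps forward exactly once past the purged record.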