-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ensure no uncommitted ops when open readonly engine #41317
Conversation
Pinging @elastic/es-distributed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left some comments
* so peer recovery of closed indices can skip phase 2 (i.e., not replaying translog operations) without losing data. | ||
*/ | ||
private void ensureNoUncommittedOperation(SeqNoStats seqNoStats, SegmentInfos segmentInfos) throws IOException { | ||
// we can't enforce this check on an old index - should we prevent this engine as a recovery source? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what would that mean? Can we restrict this more ie. doesn't it only apply if we used to be a follower engine?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I pushed 8ba5ca5 to skip this check if max_seq_no equals to the global checkpoint.
this.indexCommit = Lucene.getIndexCommit(lastCommittedSegmentInfos, directory); | ||
reader = open(indexCommit); | ||
reader = wrapReader(reader, readerWrapperFunction); | ||
searcherManager = new SearcherManager(reader, searcherFactory); | ||
if (seqNoStats == null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when does this happen ie. when is this null?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should run this check only when this read-only engine is the only engine opened with its shard (i.e., when obtain write lock is true).
localCheckpointTracker = createLocalCheckpointTracker(engineConfig, segmentInfos, logger, | ||
() -> new Searcher("build_checkpoint_tracker", new IndexSearcher(reader), () -> {}), LocalCheckpointTracker::new); | ||
} | ||
try (Translog translog = new Translog(engineConfig.getTranslogConfig(), translogUUID, translogDeletionPolicy, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am a bit afraid of what would happen if that translog is very large? I mean we are reading the entire thing off disk no? Also isn't this expected to be empty or at least small?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With 8ba5ca5, we now only execute this check if we were a follower before and have gaps in sequence numbers. Even in that situation, we would read a few last translog generations since we need to read only operations since the local checkpoint of the index commit.
@s1monw Thanks for looking. I've addressed your comments. |
Discussed with @ywelsch on another channel, we preferred not to proceed with this change for it might not be sufficient for closed follower indices. We will explore another option using a read-only marker. |
Closing a follower index can make the cluster become red because the sanity check (i.e., max_seq_no equals to the global checkpoint) does not hold for closed follow indices. The main purpose of this sanity check is to ensure that peer recovery of closed indices can safely skip phase 2 (won't replay translog operations) without losing data. We can achieve the same goal by making sure that all existing operations are committed in the last index commit.
Relates #33888
See #38767 (comment)