Trim translog for closed indices #43156
Pinging @elastic/es-distributed
I think we should not trim the translog in NoOpEngine because it extends ReadOnlyEngine, which should not modify the translog (or the Lucene index). How about trimming the translog in InternalEngine when we execute the verify-before-close step (after we flush)?
We talked about this today and agreed that we can move forward with the proposed solution, as long as the translog trimming is correctly documented in NoOpEngine. This solution will help clean up the translog files of indices closed in 7.2. We also agreed on trimming translog files more aggressively when indices are verified before being closed, as suggested by @dnhatn and others, but that would require peer recoveries to no longer use the translog at all. This will become possible in the near future, once peer recovery retention leases are implemented (see #41536).
Looking good. Thanks @tlrx. I left a comment.
server/src/main/java/org/elasticsearch/index/engine/NoOpEngine.java
I left a smaller comment but LGTM. Thanks @tlrx.
Can you follow-up with a change that exposes correct TranslogStats for NoOpEngine, and reloads them after the trimming?
try (ReleasableLock lock = readLock.acquire()) {
    ensureOpen();
    final List<IndexCommit> commits = DirectoryReader.listCommits(store.directory());
    if (commits.size() == 1) {
Are we guaranteed to have this on a successful close? How do we know that previous commits have been cleaned up on verify-before-close?
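To make the guard under discussion concrete, here is a minimal sketch in plain Java. It is not the NoOpEngine code: `Commit`, `minTranslogGeneration` and `generationsToTrim` are hypothetical names, and the sketch assumes that each commit records the lowest translog generation it still needs. Under that assumption, trimming is only attempted when exactly one commit remains, and only generations below that commit's requirement are candidates for deletion.

```java
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

/** Illustrative model only; none of these types exist in Elasticsearch or Lucene. */
public class SingleCommitTrimSketch {

    /** Hypothetical stand-in for a commit point. */
    static final class Commit {
        // Assumption: the commit knows the oldest translog generation it depends on.
        final long minTranslogGeneration;

        Commit(long minTranslogGeneration) {
            this.minTranslogGeneration = minTranslogGeneration;
        }
    }

    /**
     * Returns the translog generations that could be deleted.
     * If there is not exactly one commit, nothing is returned, because an
     * older commit might still reference older translog generations.
     */
    static SortedSet<Long> generationsToTrim(List<Commit> commits, SortedSet<Long> translogGenerations) {
        if (commits.size() != 1) {
            return new TreeSet<>();
        }
        long required = commits.get(0).minTranslogGeneration;
        // headSet is exclusive: it keeps the generation the commit still needs.
        return new TreeSet<>(translogGenerations.headSet(required));
    }

    public static void main(String[] args) {
        SortedSet<Long> generations = new TreeSet<>(List.of(1L, 2L, 3L, 4L));
        System.out.println(generationsToTrim(List.of(new Commit(3)), generations));
        System.out.println(generationsToTrim(List.of(new Commit(2), new Commit(3)), generations));
    }
}
```

The sketch does not answer the question itself: whether a successful close always leaves a single commit depends on the engine's deletion policy, which the model leaves out.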
Today, when an index is closed, all its shards are force-flushed but the translog files are left around. As explained in #42445, we'd like to trim the translog for closed indices in order to consume less disk space. This commit reuses the existing AsyncTrimTranslogTask task and re-enables it for closed indices. At the time the task is executed, we should have the guarantee that nothing holds the translog files that are going to be removed. It also leaves a short period of time (10 min) during which the translog files of a recently closed index are still present on disk. This could also help in cases where the closed index is reopened shortly after being closed (in order to update an index setting, for example). Relates to #42445
…#43825) This commit changes NoOpEngine so that it refreshes its translog stats once translog is trimmed. Relates elastic#43156
This commit changes NoOpEngine so that it refreshes its translog stats once translog is trimmed. Relates #43156
Today, when an index is closed, all its shards are force-flushed but the translog files are left around. As explained in #42445, we'd like to trim the translog for closed indices in order to consume less disk space.
Instead of trimming the translog at closing time (which can be challenging, as explained in #42445) or during the initialization of the noop engine, this pull request proposes to reuse the existing AsyncTrimTranslogTask task and to re-enable it for closed indices.
At the time the task is executed, we should have the guarantee that nothing holds the translog files that are going to be removed. It also leaves a short period of time (10 min) during which the translog files of a recently closed index are still present on disk. This could also help in cases where the closed index is reopened shortly after being closed (in order to update an index setting, for example).
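As a rough illustration of the idea, the sketch below models one run of a periodic trim task in plain Java. It is not the AsyncTrimTranslogTask implementation: `Shard`, `trimTranslog`, `runOnce` and the `includeClosed` flag are hypothetical names, and the sketch assumes that the task previously skipped shards of closed indices.

```java
import java.util.List;

/** Illustrative model only; none of these types exist in Elasticsearch. */
public class TrimTaskSketch {

    /** Hypothetical stand-in for a shard and its leftover translog files. */
    static final class Shard {
        final boolean indexClosed;
        int unreferencedTranslogFiles;

        Shard(boolean indexClosed, int unreferencedTranslogFiles) {
            this.indexClosed = indexClosed;
            this.unreferencedTranslogFiles = unreferencedTranslogFiles;
        }

        /** Removes the files nothing refers to anymore and returns how many were removed. */
        int trimTranslog() {
            int removed = unreferencedTranslogFiles;
            unreferencedTranslogFiles = 0;
            return removed;
        }
    }

    /**
     * One run of the periodic task. With includeClosed == false, shards of
     * closed indices are skipped (the behaviour assumed before this change);
     * with includeClosed == true they are trimmed like any other shard.
     */
    static int runOnce(List<Shard> shards, boolean includeClosed) {
        int removed = 0;
        for (Shard shard : shards) {
            if (shard.indexClosed && !includeClosed) {
                continue;
            }
            removed += shard.trimTranslog();
        }
        return removed;
    }

    public static void main(String[] args) {
        List<Shard> shards = List.of(new Shard(false, 2), new Shard(true, 3));
        System.out.println(runOnce(shards, false)); // trims the open shard only
        System.out.println(runOnce(shards, true));  // also trims the closed shard
    }
}
```

Because the task runs on a schedule instead of at close time, the files of a freshly closed index survive until the next run, which is where the delay mentioned above comes from.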