Use soft-deletes to maintain document history #29530

Closed
14 tasks done
dnhatn opened this issue Apr 16, 2018 · 2 comments
Labels
:Distributed Indexing/Engine Anything around managing Lucene and the Translog in an open shard. Meta

Comments

@dnhatn
Member

dnhatn commented Apr 16, 2018

With the introduction of soft-delete in Lucene, a history of a document can be maintained. This meta issue tracks work on migrating from hard-deletes to soft-deletes.

Cut over

Retention and source

Translog

TBD

  • Avoid having multiple docs for the same stale operation. We currently defer deduplication until search time, but we might revisit this decision and do it at index time instead (to be done in Per doc replica rollbacks #31637).
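
To make the search-time option concrete, here is a rough Python sketch of deduplicating by sequence number while reading. It is an illustration only: `dedup_by_seq_no` and the dict-based docs are hypothetical, and it assumes that copies of the same operation share a seq#.

```python
def dedup_by_seq_no(docs):
    """Return one doc per seq_no, ordered by seq_no.

    Assumption of this sketch: duplicates of an operation carry the same
    seq_no, so keeping the first copy seen is enough.
    """
    seen = set()
    unique = []
    # sorted() is stable, so for equal seq_no the earlier input copy wins.
    for doc in sorted(docs, key=lambda d: d["seq_no"]):
        if doc["seq_no"] in seen:
            continue
        seen.add(doc["seq_no"])
        unique.append(doc)
    return unique
```

Doing the same work at index time would mean checking for an existing copy before adding the stale doc, which trades read-time cost for write-time cost.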

Misc

@dnhatn added the Meta and :Distributed Indexing/Engine labels Apr 16, 2018
@elasticmachine
Collaborator

Pinging @elastic/es-distributed

@bleskes
Contributor

bleskes commented Apr 16, 2018

👍

@dnhatn changed the title from "Use soft-deletes maintain document history" to "Use soft-deletes to maintain document history" Apr 17, 2018
dnhatn added a commit that referenced this issue Apr 20, 2018
Today we can use the soft-deletes feature from Lucene to maintain a
history of a document. This change simply replaces hard-deletes with
soft-deletes in the Engine.

Besides marking a document as deleted, we also index a tombstone
associated with that delete operation. Storing delete tombstones allows
us to have a history of sequence-based operations which can serve in
recovery or rollback.

Relates #29530
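
For readers new to the idea, here is a minimal Python sketch of what this commit message describes. It is a toy in-memory model, not the Engine code: `ToyEngine`, `Doc`, and the `soft_deleted` flag are hypothetical names, and treating the tombstone as itself soft-deleted (so that search never returns it) is an assumption of this sketch.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Doc:
    doc_id: str
    seq_no: int
    source: Optional[dict]       # None for tombstones
    soft_deleted: bool = False   # stands in for a soft-deletes marker field
    tombstone: bool = False


@dataclass
class ToyEngine:
    docs: list = field(default_factory=list)

    def _soft_delete_live_copies(self, doc_id):
        # A soft-delete only flags the old copy; the copy stays in the index.
        for doc in self.docs:
            if doc.doc_id == doc_id and not doc.soft_deleted:
                doc.soft_deleted = True

    def index(self, doc_id, seq_no, source):
        self._soft_delete_live_copies(doc_id)
        self.docs.append(Doc(doc_id, seq_no, source))

    def delete(self, doc_id, seq_no):
        self._soft_delete_live_copies(doc_id)
        # The tombstone records the delete operation itself in the history.
        self.docs.append(
            Doc(doc_id, seq_no, None, soft_deleted=True, tombstone=True)
        )

    def search(self):
        # Searches only see docs that are not soft-deleted.
        return [doc for doc in self.docs if not doc.soft_deleted]

    def history(self):
        # The history covers every operation, including deletes.
        return sorted(doc.seq_no for doc in self.docs)
```

After indexing a document twice and then deleting it, `search()` returns nothing while `history()` still lists all three sequence numbers, which is the property recovery or rollback would rely on.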
dnhatn added a commit that referenced this issue Apr 27, 2018
Today, when processing out-of-order operations, we only add them to the
translog and skip adding them to Lucene. The translog, therefore, has a
complete history of sequence numbers while Lucene does not.

Since we would like to have a complete history in Lucene, this change
makes sure that stale operations are added to Lucene as soft-deleted
documents if required.

Relates #29530
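
A small Python sketch of the behaviour described above, under simplifying assumptions: `apply_index` and the dict-based docs are hypothetical, and "stale" is reduced to "has a lower seq# than the newest operation seen for the same id".

```python
def apply_index(docs, latest_seq_no, doc_id, seq_no, source):
    """Apply an index operation that may arrive out of order.

    docs: list of dicts acting as the toy Lucene index.
    latest_seq_no: dict mapping doc id to the newest seq_no applied so far.
    """
    if seq_no < latest_seq_no.get(doc_id, -1):
        # Stale operation: keep it for history, but never expose it to search.
        docs.append({"id": doc_id, "seq_no": seq_no,
                     "source": source, "soft_deleted": True})
        return "stale"
    # Newer operation: soft-delete the previous copies, then add the live doc.
    for doc in docs:
        if doc["id"] == doc_id:
            doc["soft_deleted"] = True
    docs.append({"id": doc_id, "seq_no": seq_no,
                 "source": source, "soft_deleted": False})
    latest_seq_no[doc_id] = seq_no
    return "applied"
```

If seq# 5 arrives before seq# 3 for the same id, the live document stays at seq# 5 and seq# 3 is still present in the toy index as a soft-deleted entry.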
dnhatn added a commit that referenced this issue May 2, 2018
This commit adds a tombstone document into Lucene for every No-op.
With this change, the Lucene index is expected to have a complete history
of operations, like the translog. Note, however, that this guarantee is
subject to the soft-deletes retention merge policy.

Relates #29530
dnhatn added a commit that referenced this issue May 5, 2018
…30335)

This commit introduces a soft-deletes retention merge policy based on
the global checkpoint. Some notes on this simple retention policy:

- This policy keeps all operations whose seq# is greater than the 
persisted global checkpoint and configurable extra operations prior to
the global checkpoint. This is good enough for querying history changes.

- This policy is not watertight for peer-recovery. We send the
safe-commit in peer-recovery, thus we also need to send all operations
after the local checkpoint of that commit. This is analogous to the min
translog generation for recovery.

- This policy is too simple to support rollback.

Relates #29530
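
The first bullet above can be sketched in a few lines of Python. This is only an illustration of the stated rule, not the merge policy itself: `min_retained_seq_no`, `can_reclaim`, and the `retention_operations` parameter name are hypothetical.

```python
def min_retained_seq_no(persisted_global_checkpoint, retention_operations):
    """Smallest seq_no that must be kept.

    Keeps every operation above the persisted global checkpoint plus
    `retention_operations` extra operations at or below it.
    """
    return max(0, persisted_global_checkpoint + 1 - retention_operations)


def can_reclaim(seq_no, soft_deleted, persisted_global_checkpoint,
                retention_operations):
    """A merge may drop a doc only if it is soft-deleted and old enough."""
    threshold = min_retained_seq_no(persisted_global_checkpoint,
                                    retention_operations)
    return soft_deleted and seq_no < threshold
```

With a persisted global checkpoint of 100 and 10 extra operations, everything from seq# 91 upward is retained. The sketch deliberately ignores the safe commit's local checkpoint, which is why, as the second bullet notes, a rule like this is not sufficient for peer-recovery on its own.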
dnhatn pushed a commit that referenced this issue May 9, 2018
This commit adds an API to read a translog snapshot from Lucene,
then cuts CCR over from the existing translog to the new API.

Relates #30086
Relates #29530
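
As a rough picture of what "reading a snapshot from Lucene" means, the Python sketch below returns all operations in a seq# range from a toy index that still contains its soft-deleted docs. `read_history` and the `require_full_range` flag are hypothetical names for this illustration and do not mirror the actual API.

```python
def read_history(docs, from_seq_no, to_seq_no, require_full_range=False):
    """Return operations with from_seq_no <= seq_no <= to_seq_no, in order.

    docs: list of dicts with a "seq_no" key; soft-deleted docs are included
    on purpose, since they are the history.
    """
    ops = sorted(
        (d for d in docs if from_seq_no <= d["seq_no"] <= to_seq_no),
        key=lambda d: d["seq_no"],
    )
    if require_full_range:
        expected = list(range(from_seq_no, to_seq_no + 1))
        if [d["seq_no"] for d in ops] != expected:
            # A gap means the retention policy already dropped something.
            raise ValueError("history is incomplete for the requested range")
    return ops
```

A consumer that needs every operation in the range can pass `require_full_range=True` and handle the failure when history has already been merged away.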
dnhatn added a commit that referenced this issue May 10, 2018
Today we can use the soft-update feature from Lucene to maintain a
history of a document. This change simply replaces hard-update in the
Engine with soft-update methods. Stale operations, deletes, and no-ops
will be handled in subsequent changes. This change is just a cut-over
from hard-update to soft-update; no new functionality has been introduced.

Relates #29530
dnhatn added a commit that referenced this issue May 15, 2018
Since #29458, we use a searcher to calculate the number of documents for
commit stats. Sadly, that approach is flawed: the searcher might no
longer point to the last commit if it has been refreshed. As synced-flush
requires an exact numDocs to work correctly, we have to exclude all
soft-deleted docs.

This commit makes synced-flush stop using CommitStats and instead read an
exact numDocs directly from an index commit.

Relates #29458
Relates #29530
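
The arithmetic behind an exact numDocs can be sketched as follows. This Python snippet is an assumption-laden illustration: it models a commit as a list of per-segment counters with hypothetical key names (`max_doc`, `hard_deleted`, `soft_deleted`) rather than reading a real index commit.

```python
def num_docs_in_commit(segments):
    """Count live docs in a commit, excluding hard- and soft-deleted ones.

    segments: list of dicts with "max_doc", "hard_deleted", "soft_deleted".
    """
    return sum(
        seg["max_doc"] - seg["hard_deleted"] - seg["soft_deleted"]
        for seg in segments
    )
```

Because the counters belong to the commit itself, the result does not change when a searcher is refreshed, which is the property synced-flush needs.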
dnhatn added a commit that referenced this issue Jun 21, 2018
This commit adds Lucene soft-deletes as another source for peer-recovery
besides translog.

Relates #29530
dnhatn added a commit to dnhatn/elasticsearch that referenced this issue Aug 28, 2018
Today a file-based recovery will replay all existing translog operations from
the primary on a replica so that the replica has the same full history in its
translog as the primary. However, with soft-deletes enabled, we should not do
this because:

1. All operations before the local checkpoint of the safe commit exist in the
commit already.

2. The number of operations before the local checkpoint may be considerable and
requires a significant amount of time to replay on a replica.

Relates elastic#30522
Relates elastic#29530
dnhatn added a commit that referenced this issue Aug 28, 2018
For now, we do not support changing the soft-deletes setting even with
closed indices. Therefore we should make it a final setting.

Relates #29530
dnhatn added a commit that referenced this issue Aug 28, 2018
This commit makes the primary-replica resyncer use Lucene as the source of
operation history instead of the translog if soft-deletes is enabled. With
this change, we no longer expose the translog snapshot directly in IndexShard.

Relates #29530
dnhatn added a commit that referenced this issue Aug 28, 2018
…es (#33190)

Today a file-based recovery will replay all existing translog operations
from the primary on a replica so that that replica can have a full
history in translog as the primary. However, with soft-deletes enabled,
we should not do it because:

1. All operations before the local checkpoint of the safe commit exist in
the commit already.

2. The number of operations before the local checkpoint may be considerable
and requires a significant amount of time to replay on a replica.

Relates #30522
Relates #29530
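
The two points above amount to a simple filter on which operations get replayed. The Python sketch below illustrates it under stated assumptions: `operations_to_replay` is a hypothetical helper, and operations are represented only by their sequence numbers.

```python
def operations_to_replay(history_seq_nos, safe_commit_local_checkpoint,
                         soft_deletes_enabled):
    """Pick the seq_nos a file-based recovery replays on the replica.

    Without soft-deletes the whole history is replayed. With soft-deletes,
    operations at or below the safe commit's local checkpoint are skipped
    because the copied commit already contains them.
    """
    if not soft_deletes_enabled:
        return sorted(history_seq_nos)
    return sorted(
        s for s in history_seq_nos if s > safe_commit_local_checkpoint
    )
```

For a history of seq# 0 to 5 and a safe commit whose local checkpoint is 3, only seq# 4 and 5 are replayed when soft-deletes is enabled.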
dnhatn added a commit that referenced this issue Aug 30, 2018
Today we add a NoOp to Lucene and translog if we fail to process an
indexing operation. However, for failed delete operations we only add
NoOps to the translog. In order to have a complete history in Lucene, we
should add NoOps for failed delete operations to both Lucene and translog.

Relates #29530
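
A minimal Python sketch of the intended outcome, with hypothetical names throughout (`record_failed_operation`, list-based stand-ins for Lucene and the translog): whatever the failed operation was, its sequence number ends up in both histories as a NoOp.

```python
def record_failed_operation(lucene_history, translog, seq_no, op_type, reason):
    """Record a failed index or delete operation as a NoOp in both places."""
    no_op = {
        "seq_no": seq_no,
        "type": "no_op",
        "reason": f"{op_type} failed: {reason}",
    }
    # In the toy Lucene history the NoOp is stored as a soft-deleted entry
    # (an assumption of this sketch) so that it never shows up in search.
    lucene_history.append(dict(no_op, soft_deleted=True))
    translog.append(no_op)
    return no_op
```

Treating index and delete failures identically keeps the two histories free of gaps at the same sequence numbers.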
dnhatn added a commit that referenced this issue Aug 31, 2018
This PR integrates Lucene soft-deletes (LUCENE-8200) into Elasticsearch.
Highlights of this PR include:

- Replace hard-deletes by soft-deletes in InternalEngine
- Use _recovery_source if _source is disabled or modified (#31106)
- Soft-deletes retention policy based on the global checkpoint (#30335)
- Read operation history from Lucene instead of translog (#30120)
- Use Lucene history in peer-recovery (#30522)

Relates #30086
Closes #29530

---
This work was done by the whole team; however, the following individuals
(in lexical order) contributed significantly to coding and reviewing:

Co-authored-by: Adrien Grand <jpountz@gmail.com>
Co-authored-by: Boaz Leskes <b.leskes@gmail.com>
Co-authored-by: Jason Tedor <jason@tedor.me>
Co-authored-by: Martijn van Groningen <martijn.v.groningen@gmail.com>
Co-authored-by: Nhat Nguyen <nhat.nguyen@elastic.co>
Co-authored-by: Simon Willnauer <simonw@apache.org>