Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move trimming unsafe commits from the Engine constructor to Store #29260

Merged
merged 11 commits into from
Mar 29, 2018

Conversation

bleskes
Copy link
Contributor

@bleskes bleskes commented Mar 27, 2018

As follow up to #28245 , this PR removes the logic for selecting the right start commit from the Engine constructor in favor of explicitly trimming them in the Store, before the engine is opened. This makes the constructor in engine follow standard Lucene semantics and use the last commit.

bleskes added 7 commits March 18, 2018 17:20
# Conflicts:
#	docs/reference/indices/flush.asciidoc
#	server/src/main/java/org/elasticsearch/index/engine/EngineDiskUtils.java
#	server/src/main/java/org/elasticsearch/index/shard/IndexShard.java
#	server/src/test/java/org/elasticsearch/index/engine/EngineDiskUtilsTests.java
…_unsafe

# Conflicts:
#	server/src/main/java/org/elasticsearch/index/engine/InternalEngine.java
#	server/src/test/java/org/elasticsearch/index/engine/InternalEngineTests.java
@bleskes bleskes requested review from s1monw and dnhatn March 27, 2018 10:46
@bleskes bleskes added >enhancement v7.0.0 v6.3.0 :Distributed Indexing/Engine Anything around managing Lucene and the Translog in an open shard. labels Mar 27, 2018
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed

@@ -93,12 +93,12 @@ which returns something similar to:
{
"commit" : {
"id" : "3M3zkw2GHMo2Y4h4/KFKCg==",
"generation" : 4,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one was tricky to trace - using a starting commit Lucene, immediately increments the pending changes counter, which means that the flush issued in the docs will actually cause a commit in Lucene. Now that we don't use a starting commit, the uncommittedChanges counter in Lucene stays 0 when opening the engine and the flush in the docs becomes a noop.

Copy link
Contributor

@s1monw s1monw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM I think it's good to make as many decisions as possible before we create the engine!

// note that we can't just use IndexCommit.delete() as we really want to make sure that those files won't be used
// even if a virus scanner causes the files not to be used.

// TODO: speak to @s1monw about the fact that we can' use getUserData(writer) as that uses that last's commit user
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you open an IW based on a commit it will use that commits user data if you use getUserData(writer) are you sure that is not the case. I just looked at the code and it looks correct

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@s1monw
The new commit will use segment files from the starting commit but userData from the last commit by default. Therefore we need to manually set the userData from the starting commit to the new commit.

metadataLock.writeLock().lock();
try {
final List<IndexCommit> existingCommits = DirectoryReader.listCommits(directory);
if (existingCommits.size() == 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: use isEmpty()?

@@ -1329,6 +1329,9 @@ private void innerOpenEngineAndTranslog() throws IOException {

assertMaxUnsafeAutoIdInCommit();

final long minRetainedTranslogGen = Translog.readMinTranslogGeneration(translogConfig.getTranslogPath(), translogUUID);
store.trimUnsafeCommits(globalCheckpoint, minRetainedTranslogGen, config.getIndexSettings().getIndexVersionCreated());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Member

@dnhatn dnhatn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

// note that we can't just use IndexCommit.delete() as we really want to make sure that those files won't be used
// even if a virus scanner causes the files not to be used.

// TODO: speak to @s1monw about the fact that we can' use getUserData(writer) as that uses that last's commit user
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dnhatn dnhatn self-assigned this Mar 28, 2018
dnhatn added 4 commits March 29, 2018 10:40
# Conflicts:
#	server/src/main/java/org/elasticsearch/index/engine/InternalEngine.java
@dnhatn dnhatn merged commit eb8b317 into elastic:master Mar 29, 2018
@dnhatn
Copy link
Member

dnhatn commented Mar 29, 2018

I am merging this on behalf of Boaz. Thanks @bleskes and @s1monw.

dnhatn added a commit that referenced this pull request Mar 29, 2018
Since #29260, unsafe commits must be trimmed before opening an engine.
This makes the engine constructor follow Lucene standard semantics and
use the last commit. However, we haven't fully applied this change in some
tests.

Relates #29260
dnhatn pushed a commit that referenced this pull request Mar 31, 2018
As follow up to #28245 , this PR removes the logic for selecting the
right start commit from the Engine constructor in favor of explicitly
trimming them in the Store, before the engine is opened. This makes the
constructor in engine follow standard Lucene semantics and use the last
commit.

Relates #28245
Relates #29156
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Apr 3, 2018
* master: (80 commits)
  Remove HTTP max content length leniency (elastic#29337)
  Begin moving XContent to a separate lib/artifact (elastic#29300)
  Java versions for ci (elastic#29320)
  Minor cleanup in the InternalEngine (elastic#29241)
  Clarify expectations of false positives/negatives (elastic#27964)
  Update docs on vertex ordering (elastic#27963)
  Revert "REST high-level client: add support for Indices Update Settings API (elastic#28892)" (elastic#29323)
  [test] remove Streamable serde assertions (elastic#29307)
  Improve query string docs (elastic#28882)
  fix query string example for boolean query (elastic#28881)
  Resolve unchecked cast warnings introduced with elastic#28892
  REST high-level client: add support for Indices Update Settings API (elastic#28892)
  Search: Validate script query is run with a single script (elastic#29304)
  [DOCS] Added info on WGS-84. Closes issue elastic#3590 (elastic#29305)
  Increase timeout on Netty client latch for tests
  Build: Use branch specific refspec sysprop for bwc builds (elastic#29299)
  TEST: trim unsafe commits before opening engine
  Move trimming unsafe commits from engine ctor to store (elastic#29260)
  Fix incorrect geohash for lat 90, lon 180 (elastic#29256)
  Do not load global state when deleting a snapshot (elastic#29278)
  ...
@colings86 colings86 removed the v7.0.0 label Feb 7, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Indexing/Engine Anything around managing Lucene and the Translog in an open shard. >enhancement v6.3.0 v7.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants