Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable acked indexing #17038

Merged
merged 40 commits into from
Apr 6, 2016
Merged

Enable acked indexing #17038

merged 40 commits into from
Apr 6, 2016

Conversation

jasontedor
Copy link
Member

@jasontedor jasontedor commented Mar 9, 2016

Recent work on the distributed systems front brings us to a point where
we are very close to being able to enable the acked indexing test. We
have chased down one additional failure that will require the primary
terms work to address, but let's get this preparatory work in.

Closes #7572

bleskes and others added 17 commits February 12, 2016 10:06
* master: (350 commits)
  Note to configuration docs on number of threads
  Reduce maximum number of threads in boostrap check
  Limit generic thread pool
  Remove NodeService injection to Discovery
  Prevent closing index during snapshot restore
  [TEST] Fix newline issue in PluginCliTests on Windows
  ParseFieldMatcher should log when using deprecated settings. #16988
  fix checkstyle error
  Add test for the index_options on a keyword field. #16990
  Analysis : Allow string explain param in JSON
  Analysis : Allow string explain param in JSON
  fix typo
  Remove SNAPSHOT from versions in plugin descriptors
  Add support for alpha versions
  Enable unmap hack for java 9
  Simplify mock scripts
  Adding `time_zone` parameter to daterange-aggregation docs
  Adding tests for `time_zone` parameter for date range aggregation
  Added ingest info to node info API, which contains a list of available processors.
  Remove bw compat from size mapper
  ...
This commit adds a guard against an old cluster state that arrives out
of order from the last seen cluster state from the current master from
polluting the pending cluster states queue. Without this guard, such a
state can end up stuck in the pending states queue.
@@ -820,24 +821,64 @@ void finishBecauseUnavailable(ShardId shardId, String message) {
}
}


static ConcurrentMap<IndexShardReference, String> openShardReferences;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels way more natural on the IndexShard level - I felt that way when I made it but couldn't find a way (at the time) to push it to IndexShard while getting the debugging information to be used as the string value here. We can pass it to IndexShard.acquirePrimaryOperationLock and acquireReplicaOperationLock but that will force a toString on the request object. How about making the reason an Object type and only call toString on it in IndexShard?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or even better a reason supplier called on demand?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that it's more natural there, but I prefer to keep it where it is until there's a clear need for it at the lower level.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

guys we can't have static maps on classes like this. If we want some kind of static assertions we should inject some mock services just as we do for for SearchContext in MockSearchService. I am a bit confused why we have a static factory method on literally the only impl of an interface in IndexShardReferenceImpl this smells like a design flaw to me. I think with this change together we should rather expose a pluggalbe service or move the creation into IndexService and let folks plug in a factory into IndexModule in order to impl this assertion? but static per JVM concurrent maps seems like a playing russian roulette

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed in 0e5b22a.

@bleskes
Copy link
Contributor

bleskes commented Mar 30, 2016

LGTM. Thanks again @jasontedor

* master: (156 commits)
  Make JNA calls optional
  Added RPM metadata
  Remove PROTOTYPE from MLT.Item
  Remove PROTOTYPE from VersionType
  Fix mistake in TopHits change
  Remove PROTOTYPEs from highlighting
  Clean up some log messages
  Command line arguments with comma must be quoted on windows
  Cluster Health should run on applied states, even if waitFor=0 #17440
  ingest: make concrete processor impl final, like all other processor concrete impls.
  Improve some test method comments.
  Document task id's as string in the rest spec
  Replace FieldStatsProvider with a method on MappedFieldType. #17334
  cleanup test
  Remove MathUtils. #17454
  Addressing review comments
  fix javadocs
  Make TranslogConfig immutable and pass TranslogGeneration as a ctor arg to Translog
  [reindex] Don't get rejected
  Remove redundant commit - #openTranslog() already commits in that case
  ...
This commit adds a guard preventing old cluster states from entering
into the pending queue when we are following a master, and cleans old
cluster states from the pending queue when processing a commit.
@@ -164,12 +164,17 @@ public synchronized void markAsProcessed(ClusterState state) {
currentMaster
);
}
} else if (state.supersedes(pendingState) && pendingContext.committed()) {
} else if (state.version() >= pendingState.version()) {
assert state.supersedes(pendingState) || (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is we're here, then the masters are equal. In this case supersedes == false, means version equality (because it's can't be lower. Also state.nodes().getMasterNodeId() is never null (and assigned to currentMaster.getId())) . I'm not sure what we're asserting than. Also note that it disables the next if clause, checking for uuid equality, so maybe we want to do the next if clause first?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed assertion, reordered checks in c2ed5a1.

@jasontedor
Copy link
Member Author

@bleskes I've addressed all comments from the latest feedback round.

This commit removes a superfluous check when validing incoming cluster
states. The check in question prevents out-of-order cluster states from
the same master from entering the queue. However, such out-of-order
cluster states will be cleaned from the queue when a commit message for
that cluster state arrives or a commit message for any higher-versioned
cluster state arrives.
@bleskes
Copy link
Contributor

bleskes commented Apr 6, 2016

LGTM again. Thanks for all the hard work @jasontedor

@jasontedor jasontedor removed the review label Apr 6, 2016
@jasontedor jasontedor merged commit 0a69985 into elastic:master Apr 6, 2016
@jasontedor jasontedor deleted the enable_acked branch April 6, 2016 22:13
@jasontedor jasontedor changed the title Prepare for enabling acked indexing Enable acked indexing Apr 8, 2016
@clintongormley clintongormley added :Distributed Indexing/Distributed A catch all label for anything in the Distributed Area. Please avoid if you can. and removed :Cluster labels Feb 13, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Indexing/Distributed A catch all label for anything in the Distributed Area. Please avoid if you can. >enhancement resiliency v5.0.0-alpha2
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Indexing] A network partition can cause in flight documents to be lost
4 participants