Add wait_for_no_initializing_shards to cluster health API #27489
Conversation
This adds a new option to the cluster health request allowing the caller to wait until there are no initializing shards. Closes elastic#25623
Left some minor comments and asks, general change looks good.
ClusterHealthRequest randomRequest() {
    ClusterHealthRequest request = new ClusterHealthRequest();
    request.waitForStatus(randomFrom(ClusterHealthStatus.values()));
    request.waitForNodes(randomAlphaOfLengthBetween(5, 10));
Wondering why this works; this should be a number or an expression like >=5, <7, ...?
We did not validate the request. I've updated the nodes expression.
    .build();
final ShardId shardId = new ShardId(new Index("index", "uuid"), 0);
final IndexRoutingTable.Builder routingTable = new IndexRoutingTable.Builder(indexMetaData.getIndex())
    .addShard(TestShardRouting.newShardRouting(shardId, "node-0", true, ShardRoutingState.STARTED));
Primary always started? What if that one is initializing? This is not well randomized... it just randomizes the number of initializing replicas...
Agreed. I've increased the randomness.
body:
  settings:
    index:
      number_of_replicas: 0
creating an index automatically waits for primary to be allocated (wait_for_active_shards is 1 by default). This means that the health check later is fully redundant, it will always succeed. This test does not really test anything here.
Removed.
How about changing it so that the index is created with wait_for_active_shards: 0? Then this health check will actually wait for the primaries to be allocated, and the output should guarantee that the number of initializing shards is 0.
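The suggestion above might look like the following in the YAML REST test format used elsewhere in this file; this is a sketch, not the PR's actual diff, and the index name `test_index` is illustrative:

```yaml
- do:
    indices.create:
      index: test_index
      wait_for_active_shards: 0
      body:
        settings:
          index:
            number_of_replicas: 0

# With wait_for_active_shards: 0 the create call can return before the
# primary is allocated, so the health check genuinely has to wait.
- do:
    cluster.health:
      wait_for_no_initializing_shards: true

- match: { initializing_shards: 0 }
```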
if (request.indices() == null || request.indices().length == 0) { // check that they actually exist in the meta data
    waitFor--;

if (request.waitForNoInitializingShards()) {
let's move this directly under waitForNoRelocatingShards
My bad. I fixed it.
@ywelsch I have addressed your comments. Could you please take another look? Thank you.
Left one more comment. Can you also update the documentation here:
A boolean value which controls whether to wait (until the timeout provided)
for the cluster to have no shard initializations. Defaults to false, which means
it will not wait for initializing shards.
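For illustration, a minimal invocation of the new option in the YAML REST test syntax; the timeout value is an arbitrary example, not mandated by the PR:

```yaml
- do:
    cluster.health:
      wait_for_no_initializing_shards: true
      timeout: 30s
```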
Document
oh missed that, sorry.
LGTM
body:
  settings:
    index:
      number_of_shards: 50
The default of 5 is good enough (slow CI machines ftw)
Yep, I had to increase it to see the difference on my machine.
"cluster health basic test, one index with wait for no initializing shards":
- skip:
    version: " - 6.99.99"
    reason: "wait_for_no_initializing_shard is introduced in 7.0.0"
wait_for_no_initializing_shards
- match: { active_primary_shards: 50 }
- gt: { active_shards: 0 }
- gte: { relocating_shards: 0 }
- match: { initializing_shards: 0 }
I think it's good enough to just test this match. The other ones are covered by other tests in this same file.
Thanks @ywelsch.
This adds a new option to the cluster health request allowing the caller to wait until there are no initializing shards. Closes #25623
* es/master: (38 commits)
  - Backport wait_for_initialiazing_shards to cluster health API
  - Carry over version map size to prevent excessive resizing (#27516)
  - Fix scroll query with a sort that is a prefix of the index sort (#27498)
  - Delete shard store files before restoring a snapshot (#27476)
  - Replace `delimited_payload_filter` by `delimited_payload` (#26625)
  - CURRENT should not be a -SNAPSHOT version if build.snapshot is false (#27512)
  - Fix merging of _meta field (#27352)
  - Remove unused method (#27508)
  - unmuted test, this has been fixed by #27397
  - Consolidate version numbering semantics (#27397)
  - Add wait_for_no_initializing_shards to cluster health API (#27489)
  - [TEST] use routing partition size based on the max routing shards of the second split
  - Adjust CombinedDeletionPolicy for multiple commits (#27456)
  - Update composite-aggregation.asciidoc
  - Deprecate `levenstein` in favor of `levenshtein` (#27409)
  - Automatically prepare indices for splitting (#27451)
  - Validate `op_type` for `_create` (#27483)
  - Minor ShapeBuilder cleanup
  - muted test
  - Decouple nio constructs from the tcp transport (#27484)
  - ...
* es/6.x: (30 commits)
  - Add wait_for_no_initializing_shards to cluster health API (#27489)
  - Carry over version map size to prevent excessive resizing (#27516)
  - Fix scroll query with a sort that is a prefix of the index sort (#27498)
  - Delete shard store files before restoring a snapshot (#27476)
  - CURRENT should not be a -SNAPSHOT version if build.snapshot is false (#27512)
  - Fix merging of _meta field (#27352)
  - test: do not run percolator query builder bwc test against 5.x versions
  - Remove unused method (#27508)
  - Consolidate version numbering semantics (#27397)
  - Adjust CombinedDeletionPolicy for multiple commits (#27456)
  - Minor ShapeBuilder cleanup
  - [GEO] Deprecate ShapeBuilders and decouple geojson parse logic
  - Improve docs for split API in 6.1/6.x (#27504)
  - test: use correct pre 6.0.0-alpha1 format
  - Update composite-aggregation.asciidoc
  - Deprecate `levenstein` in favor of `levenshtein` (#27409)
  - Decouple nio constructs from the tcp transport (#27484)
  - Bump version from 6.1 to 6.2
  - Fix whitespace in Security.java
  - Tighten which classes can exit
  - ...