[CI] Failure in org.elasticsearch.index.mapper.DynamicMappingTests#testMappingVersionAfterDynamicMappingUpdate #38428

Closed
original-brownbear opened this issue Feb 5, 2019 · 2 comments · Fixed by #38873
Labels: :Search Foundations/Mapping, Team:Search Foundations, >test-failure, v7.2.0

Comments

@original-brownbear
Member

https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.x+periodic/670/console

Suite: org.elasticsearch.index.mapper.DynamicMappingTests
  2> REPRODUCE WITH: ./gradlew :server:unitTest -Dtests.seed=2ED74AF20810AE4B -Dtests.class=org.elasticsearch.index.mapper.DynamicMappingTests -Dtests.method="testMappingVersionAfterDynamicMappingUpdate" -Dtests.security.manager=true -Dtests.locale=tr -Dtests.timezone=Australia/North -Dcompiler.java=11 -Druntime.java=8

Failed with:

java.lang.AssertionError: 
Expected: <2L>
     but: was <1L>
	at __randomizedtesting.SeedInfo.seed([2ED74AF20810AE4B:BBF3A8A2770A8162]:0)
	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
	at org.junit.Assert.assertThat(Assert.java:956)
	at org.junit.Assert.assertThat(Assert.java:923)
	at org.elasticsearch.index.mapper.DynamicMappingTests.testMappingVersionAfterDynamicMappingUpdate(DynamicMappingTests.java:777)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1750)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:938)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:974)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:988)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
	at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
	at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:947)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:832)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:883)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:894)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
	at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
	at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
	at java.lang.Thread.run(Thread.java:748)

@original-brownbear original-brownbear added :Search Foundations/Mapping Index mappings, including merging and defining field types >test-failure Triaged test failures from CI v6.7.0 labels Feb 5, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-search

@dakrone dakrone self-assigned this Feb 6, 2019
@dakrone
Member

dakrone commented Feb 7, 2019

I am able to reproduce this after running the tests thousands of times at once. I'm still looking into the cause.

dakrone added a commit to dakrone/elasticsearch that referenced this issue Feb 7, 2019
This assertBusy is necessary. When a mapping update is needed as a document is
indexed, the document is tried and rejected (due to mapping conflict), a mapping
update is sent off, and the document is then *immediately* retried to see
whether the mapping change has already been applied. If it has, indexing does
not wait for the next cluster state before moving ahead. In very rare cases this
immediate retry succeeds, so the indexing request completes (because it was
successful) before the new cluster state has been fully propagated. In that
case, we need to wait because the mapping version will eventually be updated,
it just hasn't been updated *yet*.
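
To illustrate why polling helps here, the following is a minimal sketch of a busy-assert helper. It is a simplified stand-in, not the test framework's own `assertBusy`: the class `BusyAssertSketch`, the `Check` interface, the backoff values, and the `demo` method are all assumptions made for illustration. The demo models a mapping version that reaches 2 only after a short delay, which is the situation described above.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical, simplified stand-in for a test framework's assertBusy helper.
final class BusyAssertSketch {
    /** A check that signals "not yet" by throwing AssertionError. */
    interface Check {
        void run() throws Exception;
    }

    /** Re-runs the check with a growing sleep until it passes or the deadline expires. */
    static void assertBusy(Check check, long maxWait, TimeUnit unit) throws Exception {
        long deadline = System.nanoTime() + unit.toNanos(maxWait);
        long sleepMillis = 1;
        while (true) {
            try {
                check.run();
                return; // the assertion held, so stop polling
            } catch (AssertionError e) {
                if (System.nanoTime() >= deadline) {
                    throw e; // out of time: surface the last failure
                }
            }
            Thread.sleep(sleepMillis);
            sleepMillis = Math.min(sleepMillis * 2, 100);
        }
    }

    /** Models a mapping version that is bumped shortly after indexing has returned. */
    static long demo() {
        AtomicLong mappingVersion = new AtomicLong(1);
        Thread publisher = new Thread(() -> {
            try {
                Thread.sleep(50); // the "cluster state" arrives a little later
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            mappingVersion.incrementAndGet();
        });
        publisher.start();
        try {
            assertBusy(() -> {
                long v = mappingVersion.get();
                if (v != 2L) {
                    throw new AssertionError("Expected: <2L> but: was <" + v + "L>");
                }
            }, 10, TimeUnit.SECONDS);
            publisher.join();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return mappingVersion.get();
    }
}
```

A single, non-polling assertion made right after the indexing call would read the version before the publisher thread has run, which mirrors the `Expected: <2L> but: was <1L>` failure in the report.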

The first mapping-update-necessary check:

https://github.com/elastic/elasticsearch/blob/622a7f1e207a552af56fec993045286abc3839e9/server/src/main/java/org/elasticsearch/action/bulk/TransportShardBulkAction.java#L487

Followed immediately by the mapping update:

https://github.com/elastic/elasticsearch/blob/622a7f1e207a552af56fec993045286abc3839e9/server/src/main/java/org/elasticsearch/action/bulk/TransportShardBulkAction.java#L490

And then the *immediate* retry:

https://github.com/elastic/elasticsearch/blob/622a7f1e207a552af56fec993045286abc3839e9/server/src/main/java/org/elasticsearch/action/bulk/TransportShardBulkAction.java#L499

In the event the immediate retry fails (99.9999% of the time), the context is
marked as needing to wait for a new cluster state before proceeding:

https://github.com/elastic/elasticsearch/blob/622a7f1e207a552af56fec993045286abc3839e9/server/src/main/java/org/elasticsearch/action/bulk/TransportShardBulkAction.java#L501-L504

In the 0.0001% case, the immediate retry succeeds, causing the test to fail.
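
The sequence above can be sketched as a small, deterministic model. This is not the `TransportShardBulkAction` code behind the links; `ImmediateRetrySketch`, `Shard`, `execute`, and the `updateAppliedBeforeRetry` flag are hypothetical names, and the flag stands in for timing that is nondeterministic in a real cluster.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model only: every name here is hypothetical and none of it is
// the real TransportShardBulkAction code.
final class ImmediateRetrySketch {
    enum Result { SUCCESS, MAPPING_UPDATE_REQUIRED }

    /** Minimal stand-in for a shard whose local mapping may lag behind the update. */
    static final class Shard {
        boolean fieldMapped = false;

        Result index() {
            return fieldMapped ? Result.SUCCESS : Result.MAPPING_UPDATE_REQUIRED;
        }
    }

    /**
     * Runs one indexing operation and returns the steps taken.
     * updateAppliedBeforeRetry models the rare timing in which the mapping
     * change is already visible to the shard when the immediate retry runs.
     */
    static List<String> execute(Shard shard, boolean updateAppliedBeforeRetry) {
        List<String> steps = new ArrayList<>();
        if (shard.index() == Result.SUCCESS) {
            steps.add("indexed");
            return steps;
        }
        steps.add("mapping update required");
        steps.add("mapping update sent");
        if (updateAppliedBeforeRetry) {
            shard.fieldMapped = true; // rare: the update lands before the retry
        }
        if (shard.index() == Result.SUCCESS) {
            steps.add("immediate retry succeeded");
            steps.add("completed without waiting for cluster state");
        } else {
            steps.add("immediate retry failed");
            steps.add("waiting for new cluster state");
            shard.fieldMapped = true; // the new cluster state arrives
            steps.add(shard.index() == Result.SUCCESS ? "indexed after cluster state" : "failed");
        }
        return steps;
    }
}
```

Calling `execute(new Shard(), false)` follows the common path that waits for the cluster state, while `execute(new Shard(), true)` follows the rare path in which the request completes early and a test reading the mapping version straight away can still see the old value.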

I was able to reproduce this bug about once every 10,000 tests. With the
awaitsFix I ran this 100,000 times with no failures.

Resolves elastic#38428
dakrone added a commit to dakrone/elasticsearch that referenced this issue Feb 13, 2019
Prior to this commit, when an indexing operation resulted in an
`Engine.Result.Type.MAPPING_UPDATE_REQUIRED`, TransportShardBulkAction
immediately retried the indexing operation to see if it succeeded. If the retry
succeeded, the context did not wait until the mapping update had propagated
through the cluster state before finishing the indexing.

In some of our tests we rely on mappings being available as soon as they've been
introduced in a document that indexed correctly. By removing the immediate retry
we always wait for this to be the case.

Resolves elastic#38428
Supersedes elastic#38579
Relates to elastic#38711
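
As a rough sketch of the behavior this commit message describes (always waiting, with no optimistic retry), the model below blocks until the mapping update has been applied before the indexing call returns. It is an assumption-laden illustration, not the actual change: `WaitForMappingSketch`, `ClusterStateModel`, and `indexWithDynamicField` are hypothetical names, and a `CompletableFuture` stands in for the cluster state update.

```java
import java.util.concurrent.CompletableFuture;

// Illustrative model only: all names are hypothetical, not Elasticsearch APIs.
final class WaitForMappingSketch {
    /** Toy cluster state holding a mapping version and whether the new field is mapped. */
    static final class ClusterStateModel {
        private long mappingVersion = 1;
        private boolean fieldMapped = false;

        void applyMappingUpdate() {
            fieldMapped = true;
            mappingVersion++;
        }

        long mappingVersion() {
            return mappingVersion;
        }

        boolean fieldMapped() {
            return fieldMapped;
        }
    }

    /**
     * Indexes a document that introduces a new field and returns the mapping
     * version visible to the caller once indexing has finished.
     */
    static long indexWithDynamicField(ClusterStateModel state) {
        if (state.fieldMapped() == false) {
            // Send the mapping update; the future completes once it is applied.
            CompletableFuture<Void> applied = CompletableFuture.runAsync(state::applyMappingUpdate);
            // Always wait for the update instead of retrying optimistically.
            applied.join();
        }
        return state.mappingVersion();
    }
}
```

Because the call returns only after the update has been applied, a caller that reads the mapping version immediately afterwards observes the incremented value.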
dakrone added a commit that referenced this issue Feb 14, 2019
Prior to this commit, when an indexing operation resulted in an
`Engine.Result.Type.MAPPING_UPDATE_REQUIRED`, TransportShardBulkAction
immediately retried the indexing operation to see if it succeeded. If the retry
succeeded, the context did not wait until the mapping update had propagated
through the cluster state before finishing the indexing.

In some of our tests we rely on mappings being available as soon as they've been
introduced in a document that indexed correctly. By removing the immediate retry
we always wait for this to be the case.

Resolves #38428
Supersedes #38579
Relates to #38711
@javanna javanna added the Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch label Jul 16, 2024