Fix boundary condition in indexing pressure test #5168

andrross · 2022-11-09T00:18:19Z

This updates the boundary condition in an assertion in two tests in ShardIndexingPressureConcurrentExecutionTests. I could reliably reproduce errors here by running:

./gradlew ':server:test' -Dtests.iters=10000 --tests "org.opensearch.index.ShardIndexingPressureConcurrentExecutionTests.testReplicaThreadedUpdateToShardLimits"

In every failure, the computed value was exactly 0.95, which failed the strict less-than check. This change accepts 0.95 and also refactors the test to give a better error message on failure.
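
The boundary behavior described above can be illustrated with a minimal sketch in plain Java. The class name `BoundaryCheck`, its methods, and the byte values are hypothetical and are not taken from the OpenSearch test. The sketch only shows why a ratio of exactly 0.95 fails a strict less-than check but passes an inclusive one, and how a failure message that includes the observed value makes such a failure easier to diagnose.

```java
// Hypothetical illustration; this is not the actual OpenSearch test code.
final class BoundaryCheck {
    static final double LOWER = 0.75;
    static final double UPPER = 0.95;

    // Old-style check: the upper bound is exclusive, so exactly 0.95 fails.
    static boolean strictUpperBound(double ratio) {
        return ratio > LOWER && ratio < UPPER;
    }

    // New-style check: the upper bound is inclusive, so exactly 0.95 passes.
    static boolean inclusiveUpperBound(double ratio) {
        return ratio > LOWER && ratio <= UPPER;
    }

    // Returns null when the ratio is in range, otherwise a message
    // that reports the observed value instead of a bare "false".
    static String failureMessage(double ratio) {
        if (inclusiveUpperBound(ratio)) {
            return null;
        }
        return "Expected: a value in (" + LOWER + ", " + UPPER + "] but: was <" + ratio + ">";
    }

    public static void main(String[] args) {
        // Assumed example values: 285 bytes in use against a 300-byte limit.
        double ratio = (double) 285 / 300;
        System.out.println("strict check passes:    " + strictUpperBound(ratio));
        System.out.println("inclusive check passes: " + inclusiveUpperBound(ratio));
        System.out.println("message for 0.96:       " + failureMessage(0.96));
    }
}
```

A boolean assertion over the old-style check can only report that the condition was false, whereas a matcher-style assertion (as in the diff in this pull request) reports the offending value, which is what the refactoring is meant to achieve.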

Issues Resolved

Closes #2241
Closes #4215

Check List

New functionality includes testing.
- All tests pass
New functionality has been documented.
- New functionality has javadoc added
Commits are signed per the DCO using --signoff
Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

andrross · 2022-11-09T00:20:48Z

server/src/test/java/org/opensearch/index/ShardIndexingPressureConcurrentExecutionTests.java

            (double) (NUM_THREADS * 15) / shardIndexingPressure.shardStats()
                .getIndexingPressureShardStats(shardId1)
-                .getCurrentPrimaryAndCoordinatingLimits() > 0.75
+                .getCurrentPrimaryAndCoordinatingLimits(),
+            allOf(greaterThan(0.75), lessThanOrEqualTo(0.95))


@getsaurabh02 To be honest I don't fully understand this assertion. Can you verify it is safe/correct to change this assertion from "less than 0.95" to "less than or equal to 0.95"?

github-actions · 2022-11-09T00:38:33Z

Gradle Check (Jenkins) Run Completed with:

RESULT: FAILURE ❌
URL: https://build.ci.opensearch.org/job/gradle-check/6662/
CommitID: 07507c8
Please examine the workflow log, locate, and copy-paste the failure below, then iterate to green.
Is the failure a flaky test unrelated to your change?

github-actions · 2022-11-09T03:00:05Z

Gradle Check (Jenkins) Run Completed with:

RESULT: UNSTABLE ❌
URL: https://build.ci.opensearch.org/job/gradle-check/6667/
CommitID: e61fbac
Please examine the workflow log, locate, and copy-paste the failure below, then iterate to green.
Is the failure a flaky test unrelated to your change?

getsaurabh02 · 2022-11-09T03:29:00Z

The additional memory allocation per shard is governed by the peak memory utilization threshold of its current allocation. This is a configurable setting and defaults to 95%.

Once utilization of the current allocation is greater than this threshold, new shard limits are calculated and allocated. Since this condition checks for the current value being strictly greater than the threshold, it should be safe to assume that utilization of the current limit can reach exactly 95% (0.95). Hence "less than or equal to 0.95" should be good.
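
As a rough model of the reasoning above, the following sketch grows a shard limit only when utilization is strictly greater than the 0.95 threshold, so utilization of exactly 0.95 leaves the limit unchanged. The class `ShardLimitSketch`, the 0.85 resize target, and the byte values are assumptions made for illustration; they do not reproduce the actual OpenSearch limit calculation.

```java
// Simplified, hypothetical model of threshold-driven limit growth.
final class ShardLimitSketch {
    // Default threshold mentioned in the comment above.
    static final double UPPER_THRESHOLD = 0.95;
    // Assumed target utilization after a resize (illustrative only).
    static final double TARGET_FACTOR = 0.85;

    private long limitBytes;
    private long currentBytes;

    ShardLimitSketch(long initialLimitBytes) {
        this.limitBytes = initialLimitBytes;
    }

    // Records new usage and grows the limit only if utilization
    // is strictly greater than the threshold.
    void acquire(long bytes) {
        currentBytes += bytes;
        if (utilization() > UPPER_THRESHOLD) {
            limitBytes = (long) Math.ceil(currentBytes / TARGET_FACTOR);
        }
    }

    double utilization() {
        return (double) currentBytes / limitBytes;
    }

    long limitBytes() {
        return limitBytes;
    }

    public static void main(String[] args) {
        ShardLimitSketch shard = new ShardLimitSketch(300);

        shard.acquire(285); // 285 / 300 is exactly 0.95: not greater, so no growth
        System.out.println(shard.limitBytes() + " " + shard.utilization());

        shard.acquire(15);  // 300 / 300 exceeds 0.95: the limit is recalculated
        System.out.println(shard.limitBytes() + " " + shard.utilization());
    }
}
```

Under this model, an assertion on utilization has to allow 0.95 itself, because the resize is triggered only past that value.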

andrross · 2022-11-09T17:20:04Z

Ironically, a different shard indexing pressure test failed:

REPRODUCE WITH: ./gradlew ':server:internalClusterTest' --tests "org.opensearch.index.ShardIndexingPressureIT.testShardIndexingPressureTrackingDuringBulkWrites" -Dtests.seed=567CA44C57BDC19B -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=sr-Latn-ME -Dtests.timezone=Pacific/Palau -Druntime.java=19

org.opensearch.index.ShardIndexingPressureIT > testShardIndexingPressureTrackingDuringBulkWrites FAILED
    java.lang.AssertionError: 
    Expected: <22616L>
         but: was <22954L>
        at __randomizedtesting.SeedInfo.seed([567CA44C57BDC19B:DC58B8B44B77E953]:0)
        at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18)
        at org.junit.Assert.assertThat(Assert.java:964)
        at org.junit.Assert.assertThat(Assert.java:930)
        at org.opensearch.index.ShardIndexingPressureIT.testShardIndexingPressureTrackingDuringBulkWrites(ShardIndexingPressureIT.java:209)

#5176

github-actions · 2022-11-09T17:56:42Z

Gradle Check (Jenkins) Run Completed with:

RESULT: FAILURE ❌
URL: https://build.ci.opensearch.org/job/gradle-check/6688/
CommitID: e61fbac
Please examine the workflow log, locate, and copy-paste the failure below, then iterate to green.
Is the failure a flaky test unrelated to your change?

github-actions · 2022-11-09T18:25:40Z

Gradle Check (Jenkins) Run Completed with:

RESULT: SUCCESS ✅
URL: https://build.ci.opensearch.org/job/gradle-check/6694/
CommitID: 9f6d53a

github-actions · 2022-11-09T18:34:41Z

Gradle Check (Jenkins) Run Completed with:

RESULT: SUCCESS ✅
URL: https://build.ci.opensearch.org/job/gradle-check/6695/
CommitID: 179c1e7

andrross marked this pull request as ready for review November 9, 2022 00:19

andrross requested review from a team and reta as code owners November 9, 2022 00:19

andrross added the skip-changelog label Nov 9, 2022

andrross force-pushed the shard-indexing-pressure-test branch from 07507c8 to e61fbac November 9, 2022 01:09

getsaurabh02 approved these changes Nov 9, 2022

andrross force-pushed the shard-indexing-pressure-test branch from e61fbac to 9f6d53a November 9, 2022 17:55

andrross force-pushed the shard-indexing-pressure-test branch from 9f6d53a to 179c1e7 November 9, 2022 17:58

Poojita-Raj approved these changes Nov 9, 2022

Poojita-Raj merged commit 3423f44 into opensearch-project:main Nov 9, 2022

andrross added the backport 2.x and backport 1.x labels Nov 9, 2022

This was referenced Nov 9, 2022

[Backport 2.x] Fix boundary condition in indexing pressure test #5179

Merged

[Backport 1.x] Fix boundary condition in indexing pressure test #5180

Merged

andrross added the backport 2.4 label Nov 9, 2022

opensearch-trigger-bot mentioned this pull request Nov 9, 2022

[Backport 2.4] Fix boundary condition in indexing pressure test #5182

Merged

andrross deleted the shard-indexing-pressure-test branch December 16, 2022 18:35
