Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix flaky SearchScrollWithFailingNodesIT #10374

Merged
merged 1 commit into from
Oct 10, 2023

Conversation

andrross
Copy link
Member

@andrross andrross commented Oct 4, 2023

The test intended to stop a data node and called a method named stopRandomNonClusterManagerNode() in order to do that. However, that method would stop a random node that was not the currently elected cluster manager, regardless of node role. I have also renamed that method hoping to be more clear.

Resolves #10137

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)
  • GitHub issue/PR created in OpenSearch documentation repo for the required public documentation changes (#[Issue/PR number])

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions
Copy link
Contributor

github-actions bot commented Oct 4, 2023

Compatibility status:

Checks if related components are compatible with change e8fc524

Incompatible components

Incompatible components: [https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/ml-commons.git]

Skipped components

Compatible components

Compatible components: [https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/custom-codecs.git, https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/reporting.git]

@github-actions
Copy link
Contributor

github-actions bot commented Oct 5, 2023

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.smoketest.SmokeTestMultiNodeClientYamlTestSuiteIT.test {yaml=pit/10_basic/Delete all}
      1 org.opensearch.remotestore.RemoteStoreStatsIT.testStatsOnRemoteStoreRestore

@codecov
Copy link

codecov bot commented Oct 5, 2023

Codecov Report

Merging #10374 (505456a) into main (22ddcf1) will decrease coverage by 0.18%.
Report is 3 commits behind head on main.
The diff coverage is n/a.

❗ Current head 505456a differs from pull request most recent head e8fc524. Consider uploading reports for the commit e8fc524 to get more accurate results

@@             Coverage Diff              @@
##               main   #10374      +/-   ##
============================================
- Coverage     71.28%   71.11%   -0.18%     
+ Complexity    58480    58288     -192     
============================================
  Files          4843     4830      -13     
  Lines        275265   274840     -425     
  Branches      40076    40048      -28     
============================================
- Hits         196227   195455     -772     
- Misses        62621    63030     +409     
+ Partials      16417    16355      -62     

see 524 files with indirect coverage changes

Copy link
Member

@dblock dblock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should have a name for nodes that aren't current cluster manager, like a "worker" node to avoid things like RandomNonMasterNodeNotCurrentClusterManagerFactoryResolverAdapterFunctor class names ;)

@andrross andrross added the backport 2.x Backport to 2.x branch label Oct 5, 2023
The test intended to stop a data node and called a method named
`stopRandomNonClusterManagerNode()` in order to do that. However, that
method would stop a random node that was not the currently elected
cluster manager, regardless of node role. I have also renamed that
method hoping to be more clear.

Resolves opensearch-project#10137

Signed-off-by: Andrew Ross <andrross@amazon.com>
@andrross andrross force-pushed the fix-search-scroll-test branch from 505456a to e8fc524 Compare October 10, 2023 13:54
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.client.PitIT.testDeleteAllAndListAllPits

@andrross andrross merged commit 562e3b2 into opensearch-project:main Oct 10, 2023
17 checks passed
@andrross andrross deleted the fix-search-scroll-test branch October 10, 2023 15:22
opensearch-trigger-bot bot pushed a commit that referenced this pull request Oct 10, 2023
The test intended to stop a data node and called a method named
`stopRandomNonClusterManagerNode()` in order to do that. However, that
method would stop a random node that was not the currently elected
cluster manager, regardless of node role. I have also renamed that
method hoping to be more clear.

Resolves #10137

Signed-off-by: Andrew Ross <andrross@amazon.com>
(cherry picked from commit 562e3b2)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
opensearch-trigger-bot bot pushed a commit that referenced this pull request Oct 10, 2023
The test intended to stop a data node and called a method named
`stopRandomNonClusterManagerNode()` in order to do that. However, that
method would stop a random node that was not the currently elected
cluster manager, regardless of node role. I have also renamed that
method hoping to be more clear.

Resolves #10137

Signed-off-by: Andrew Ross <andrross@amazon.com>
(cherry picked from commit 562e3b2)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
dblock pushed a commit that referenced this pull request Oct 10, 2023
The test intended to stop a data node and called a method named
`stopRandomNonClusterManagerNode()` in order to do that. However, that
method would stop a random node that was not the currently elected
cluster manager, regardless of node role. I have also renamed that
method hoping to be more clear.

Resolves #10137


(cherry picked from commit 562e3b2)

Signed-off-by: Andrew Ross <andrross@amazon.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
deshsidd pushed a commit to deshsidd/OpenSearch that referenced this pull request Oct 19, 2023
The test intended to stop a data node and called a method named
`stopRandomNonClusterManagerNode()` in order to do that. However, that
method would stop a random node that was not the currently elected
cluster manager, regardless of node role. I have also renamed that
method hoping to be more clear.

Resolves opensearch-project#10137

Signed-off-by: Andrew Ross <andrross@amazon.com>
Signed-off-by: Siddhant Deshmukh <deshsid@amazon.com>
austintlee pushed a commit to austintlee/OpenSearch that referenced this pull request Oct 23, 2023
The test intended to stop a data node and called a method named
`stopRandomNonClusterManagerNode()` in order to do that. However, that
method would stop a random node that was not the currently elected
cluster manager, regardless of node role. I have also renamed that
method hoping to be more clear.

Resolves opensearch-project#10137

Signed-off-by: Andrew Ross <andrross@amazon.com>
shiv0408 pushed a commit to Gaurav614/OpenSearch that referenced this pull request Apr 25, 2024
The test intended to stop a data node and called a method named
`stopRandomNonClusterManagerNode()` in order to do that. However, that
method would stop a random node that was not the currently elected
cluster manager, regardless of node role. I have also renamed that
method hoping to be more clear.

Resolves opensearch-project#10137

Signed-off-by: Andrew Ross <andrross@amazon.com>
Signed-off-by: Shivansh Arora <hishiv@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Backport to 2.x branch backport 2.11 bug Something isn't working flaky-test Random test failure that succeeds on second run skip-changelog
Projects
None yet
2 participants