[Bug] Check phase name before SearchRequestOperationsListener onPhaseStart #12035

Merged
merged 1 commit into from
Jan 30, 2024

Conversation

dzane17
Contributor

@dzane17 dzane17 commented Jan 26, 2024

Description

In some cases the SearchRequestOperationsListener onPhaseStart() method is called twice in a row during the query phase. This happens whenever the query phase follows the can_match phase. onPhaseStart() should be called only once per search phase, and each call must be paired with either onPhaseEnd() or onPhaseFailure().

Solution

  1. Rename the first instance of the duplicate "query" phase to "none", because at that point execution has not yet reached SearchQueryThenFetchAsyncAction
  2. Call onPhaseStart() only when the current phase is one of the 6 search phases we track:
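public enum SearchPhaseName {
    DFS_PRE_QUERY("dfs_pre_query"),
    QUERY("query"),
    FETCH("fetch"),
    DFS_QUERY("dfs_query"),
    EXPAND("expand"),
    CAN_MATCH("can_match");

    // Field, constructor, and accessor filled in so the excerpt compiles on its own.
    private final String name;
    SearchPhaseName(final String name) { this.name = name; }
    public String getName() { return name; }
}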
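
The two steps above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code merged in this PR: `PhaseGateSketch`, `RecordingListener`, `isTracked`, `notifyPhaseStart`, and `replay` are hypothetical names, and the `can_match`, `none`, `query`, `fetch` order is an assumed sequence chosen to mirror the scenario in the description.

```java
import java.util.ArrayList;
import java.util.List;

public class PhaseGateSketch {

    /** The six tracked phases, as listed in the PR description. */
    enum SearchPhaseName {
        DFS_PRE_QUERY("dfs_pre_query"),
        QUERY("query"),
        FETCH("fetch"),
        DFS_QUERY("dfs_query"),
        EXPAND("expand"),
        CAN_MATCH("can_match");

        private final String name;

        SearchPhaseName(String name) {
            this.name = name;
        }

        /** Hypothetical helper: true only if the string names a tracked phase. */
        static boolean isTracked(String candidate) {
            for (SearchPhaseName phase : values()) {
                if (phase.name.equals(candidate)) {
                    return true;
                }
            }
            return false;
        }
    }

    /** Hypothetical stand-in for a listener; it only records the phases it is told about. */
    static final class RecordingListener {
        final List<String> started = new ArrayList<>();

        void onPhaseStart(String phaseName) {
            started.add(phaseName);
        }
    }

    /** Notifies the listener only for tracked phases, so the placeholder "none" phase is skipped. */
    static void notifyPhaseStart(RecordingListener listener, String phaseName) {
        if (SearchPhaseName.isTracked(phaseName)) {
            listener.onPhaseStart(phaseName);
        }
    }

    /** Replays the assumed phase sequence and returns what the listener saw. */
    static List<String> replay() {
        RecordingListener listener = new RecordingListener();
        for (String phase : List.of("can_match", "none", "query", "fetch")) {
            notifyPhaseStart(listener, phase);
        }
        return listener.started;
    }

    public static void main(String[] args) {
        System.out.println(replay()); // prints [can_match, query, fetch]
    }
}
```

With the intermediate phase renamed to "none" and the name check in place, the listener in this sketch sees "query" start exactly once.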

Testing

  • existing UTs
  • phase_took search
% curl -XGET 'localhost:9200/_search?pretty&phase_took' -H 'X-Opaque-Id: my-id' -H 'Content-Type: application/json' -d '{
  "query": { "match_all": {} }
}'

{
  "took" : 63,
  "phase_took" : {
    "dfs_pre_query" : 0,
    "query" : 32,
    "fetch" : 17,
    "dfs_query" : 0,
    "expand" : 0,
    "can_match" : 0
  },
  "timed_out" : false,
  "_shards" : {
    "total" : 10,
    "successful" : 10,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "test2",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "field2" : "value2"
        }
      },
      {
        "_index" : "test2",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "field1" : "value1"
        }
      }
    ]
  }
}

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Failing checks are inspected and point to the corresponding known issue(s) (See: Troubleshooting Failing Builds)
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)
  • Public documentation issue/PR created

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@dzane17 dzane17 changed the title [Bug] Check phase name before SearchRequestOperationsListener onPhase… [Bug] Check phase name before SearchRequestOperationsListener onPhaseStart Jan 26, 2024
Contributor

❌ Gradle check result for 8f0e5b8: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Contributor

github-actions bot commented Jan 26, 2024

Compatibility status:

Checks if related components are compatible with change 99a8028

Incompatible components

Incompatible components: [https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/performance-analyzer-rca.git]

Skipped components

Compatible components

Compatible components: [https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/custom-codecs.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/alerting.git]

Contributor

❕ Gradle check result for 87110b5: UNSTABLE

  • TEST FAILURES:
      1 org.opensearch.action.admin.cluster.node.tasks.ResourceAwareTasksTests.testTaskResourceTrackingDuringTaskCancellation

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

@deshsidd
Contributor

Looks like some more tests are needed for code coverage, although this might not be related to your change.

@dzane17 dzane17 force-pushed the can-match-fix branch 2 times, most recently from 6f3132a to df8c8bd Compare January 29, 2024 06:06
Contributor

❕ Gradle check result for 6f3132a: UNSTABLE

  • TEST FAILURES:
      1 org.opensearch.repositories.azure.AzureBlobContainerRetriesTests.testWriteBlobWithRetries
      1 org.opensearch.index.IndexServiceTests.testAsyncTranslogTrimTaskOnClosedIndex

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

Contributor

❕ Gradle check result for df8c8bd: UNSTABLE

  • TEST FAILURES:
      1 org.opensearch.remotestore.RemoteIndexPrimaryRelocationIT.testPrimaryRelocationWhileIndexing

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

@msfroh
Collaborator

msfroh commented Jan 29, 2024

Is there any risk with calling onPhaseEnd without calling onPhaseStart? With this change, that would be possible, right?

Would it make sense to gate all of the listener callbacks behind this check?

@dzane17
Contributor Author

dzane17 commented Jan 29, 2024

With the current SearchRequestOperationsListener subscribers I don't see any risk, but it makes sense to maintain the 1:1 relationship between start and end calls. I added the check to onPhaseEnd and onPhaseFailure as well.
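
To illustrate the pairing concern raised above, here is a small hedged sketch rather than the actual change: `PhasePairingSketch`, `PairingListener`, and `replay` are hypothetical names, and the listener simply counts end or failure callbacks that arrive without a preceding start. It compares gating only onPhaseStart against gating all three callbacks behind the same phase-name check.

```java
import java.util.Set;

public class PhasePairingSketch {

    /** Names of the six tracked phases from the PR description. */
    static final Set<String> TRACKED = Set.of(
        "dfs_pre_query", "query", "fetch", "dfs_query", "expand", "can_match");

    /** Hypothetical listener that flags an end or failure arriving without a matching start. */
    static final class PairingListener {
        int open = 0;
        int unpaired = 0;

        void onPhaseStart(String phase) {
            open++;
        }

        void onPhaseEnd(String phase) {
            close();
        }

        void onPhaseFailure(String phase) {
            close();
        }

        private void close() {
            if (open == 0) {
                unpaired++; // nothing was started, so this call has no partner
            } else {
                open--;
            }
        }
    }

    /**
     * Replays start/end for the untracked "none" phase, then start/failure for "query".
     * When gateAll is false only onPhaseStart is gated; when true all three callbacks are.
     * Returns the number of unpaired end/failure calls the listener observed.
     */
    static int replay(boolean gateAll) {
        PairingListener listener = new PairingListener();
        for (String phase : new String[] {"none", "query"}) {
            boolean tracked = TRACKED.contains(phase);
            if (tracked) {
                listener.onPhaseStart(phase);
            }
            if (tracked || !gateAll) {
                if (phase.equals("query")) {
                    listener.onPhaseFailure(phase);
                } else {
                    listener.onPhaseEnd(phase);
                }
            }
        }
        return listener.unpaired;
    }

    public static void main(String[] args) {
        System.out.println("start gated only: " + replay(false));   // prints 1
        System.out.println("all callbacks gated: " + replay(true)); // prints 0
    }
}
```

In this sketch, gating only the start callback leaves one orphaned onPhaseEnd for the "none" phase, while applying the same check to all three callbacks keeps every start matched with exactly one end or failure.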

Contributor

❌ Gradle check result for 0020c46: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Contributor

❌ Gradle check result for cfceb50: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

…Start

Signed-off-by: David Zane <davizane@amazon.com>
Contributor

❕ Gradle check result for 99a8028: UNSTABLE

  • TEST FAILURES:
      1 org.opensearch.search.SearchTimeoutIT.testSimpleTimeout {p0={"search.concurrent_segment_search.enabled":"true"}}

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

@msfroh msfroh added backport 2.x Backport to 2.x branch v2.12.0 Issues and PRs related to version 2.12.0 labels Jan 30, 2024
@msfroh msfroh merged commit fb2c5f2 into opensearch-project:main Jan 30, 2024
37 checks passed
@opensearch-trigger-bot
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/OpenSearch/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/OpenSearch/backport-2.x
# Create a new branch
git switch --create backport/backport-12035-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 fb2c5f21868ee364170363d881329b6e0afdfdb3
# Push it to GitHub
git push --set-upstream origin backport/backport-12035-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/OpenSearch/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-12035-to-2.x.

@dzane17 dzane17 deleted the can-match-fix branch January 30, 2024 20:56
dzane17 added a commit to dzane17/OpenSearch that referenced this pull request Jan 30, 2024
…Start (opensearch-project#12035)

Signed-off-by: David Zane <davizane@amazon.com>
(cherry picked from commit fb2c5f2)
peteralfonsi pushed a commit to peteralfonsi/OpenSearch that referenced this pull request Mar 1, 2024
rayshrey pushed a commit to rayshrey/OpenSearch that referenced this pull request Mar 18, 2024
shiv0408 pushed a commit to Gaurav614/OpenSearch that referenced this pull request Apr 25, 2024
…Start (opensearch-project#12035)

Signed-off-by: David Zane <davizane@amazon.com>
Signed-off-by: Shivansh Arora <hishiv@amazon.com>
Labels
backport 2.x Backport to 2.x branch backport-failed v2.12.0 Issues and PRs related to version 2.12.0
5 participants