Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Backport 2.x] Initial search pipelines implementation (#6587) #7075

Merged
merged 3 commits into from
Apr 13, 2023

Conversation

msfroh
Copy link
Collaborator

@msfroh msfroh commented Apr 10, 2023

  • Initial search pipelines implementation

This commit includes the basic features of search pipelines (see opensearch-project/search-processor#80).

Search pipelines are modeled after ingest pipelines and provide a simple, clean API for components to modify search requests and responses.

With this commit we can:

  1. Can create, retrieve, update, and delete search pipelines.
  2. Transform search requests and responses by explicitly referencing a pipeline.

Later work will include:

  1. Adding an index setting to specify a default search pipeline.
  2. Allowing search pipelines to be defined within a search request (for development/testing purposes, akin to simulating an ingest pipeline).
  3. Adding a collection of search pipeline processors to support common useful transformations. (Suggestions welcome!)

Signed-off-by: Michael Froh froh@amazon.com

  1. SearchPipelinesClient: JavaDoc fix
  2. SearchRequest: Check versions when (de)serializing new "pipeline" property.
  3. Rename SearchPipelinesPlugin -> SearchPipelinePlugin.
  4. Pipeline: Change visibility to package private
  5. SearchPipelineProcessingException: New exception type to wrap exceptions thrown when executing a pipeline.

Bonus: Added an integration test for filter_query request processor.

Signed-off-by: Michael Froh froh@amazon.com

  • Register SearchPipelineProcessingException

Also added more useful messages to unit tests to explicitly explain what hoops need to be jumped through in order to add a new serializable exception.

Signed-off-by: Michael Froh froh@amazon.com

  • Remove unneeded dependencies from search-pipeline-common

I had copied some dependencies from ingest-common, but they are not used by search-pipeline-common (yet).

Signed-off-by: Michael Froh froh@amazon.com

  • Avoid cloning SearchRequest if no SearchRequestProcessors

Also, add tests to confirm that a pipeline with no processors works fine (as a no-op).

Signed-off-by: Michael Froh froh@amazon.com

  • Use NamedWritableRegistry to deserialize SearchRequest

Queries are serialized as NamedWritables, so we need to use a NamedWritableRegistry to deserialize.

Signed-off-by: Michael Froh froh@amazon.com

  • Check for empty pipeline with CollectionUtils.isEmpty

Signed-off-by: Michael Froh froh@amazon.com

  • Update server/src/main/java/org/opensearch/search/pipeline/SearchPipelineService.java

Co-authored-by: Navneet Verma vermanavneet003@gmail.com
Signed-off-by: Michael Froh froh@amazon.com

Signed-off-by: Michael Froh froh@amazon.com

  • Incorporate feedback from @reta
  • Renamed various classes from "SearchPipelinesSomething" to "SearchPipelineSomething" to be consistent.
  • Refactored NodeInfo construction in NodeService to avoid ternary operator and improved readability.

Signed-off-by: Michael Froh froh@amazon.com

  • Gate search pipelines behind a feature flag

Also renamed SearchPipelinesRequestConverters.

Signed-off-by: Michael Froh froh@amazon.com

  • More feature flag fixes for search pipeline testing
  • Don't use system properties for SearchPipelineServiceTests.
  • Enable feature flag for multinode smoke tests.

Signed-off-by: Michael Froh froh@amazon.com

  • Move feature flag into constructor parameter

Thanks for the suggestion, @reta!

Signed-off-by: Michael Froh froh@amazon.com

  • Move REST handlers behind feature flag

Signed-off-by: Michael Froh froh@amazon.com


Signed-off-by: Michael Froh froh@amazon.com
Co-authored-by: Navneet Verma vermanavneet003@gmail.com
(cherry picked from commit ee990bd)

* Initial search pipelines implementation

This commit includes the basic features of search pipelines
(see opensearch-project/search-processor#80).

Search pipelines are modeled after ingest pipelines and provide a
simple, clean API for components to modify search requests and
responses.

With this commit we can:

1. Can create, retrieve, update, and delete search pipelines.
2. Transform search requests and responses by explicitly referencing a
   pipeline.

Later work will include:

1. Adding an index setting to specify a default search pipeline.
2. Allowing search pipelines to be defined within a search request (for
   development/testing purposes, akin to simulating an ingest
   pipeline).
3. Adding a collection of search pipeline processors to support common
   useful transformations. (Suggestions welcome!)

Signed-off-by: Michael Froh <froh@amazon.com>

* Incorporate feedback from @reta and @navneet1v

1. SearchPipelinesClient: JavaDoc fix
2. SearchRequest: Check versions when (de)serializing new "pipeline"
   property.
3. Rename SearchPipelinesPlugin -> SearchPipelinePlugin.
4. Pipeline: Change visibility to package private
5. SearchPipelineProcessingException: New exception type to wrap
   exceptions thrown when executing a pipeline.

Bonus: Added an integration test for filter_query request processor.

Signed-off-by: Michael Froh <froh@amazon.com>

* Register SearchPipelineProcessingException

Also added more useful messages to unit tests to explicitly explain
what hoops need to be jumped through in order to add a new serializable
exception.

Signed-off-by: Michael Froh <froh@amazon.com>

* Remove unneeded dependencies from search-pipeline-common

I had copied some dependencies from ingest-common, but they are not used
by search-pipeline-common (yet).

Signed-off-by: Michael Froh <froh@amazon.com>

* Avoid cloning SearchRequest if no SearchRequestProcessors

Also, add tests to confirm that a pipeline with no processors works
fine (as a no-op).

Signed-off-by: Michael Froh <froh@amazon.com>

* Use NamedWritableRegistry to deserialize SearchRequest

Queries are serialized as NamedWritables, so we need to use a
NamedWritableRegistry to deserialize.

Signed-off-by: Michael Froh <froh@amazon.com>

* Check for empty pipeline with CollectionUtils.isEmpty

Signed-off-by: Michael Froh <froh@amazon.com>

* Update server/src/main/java/org/opensearch/search/pipeline/SearchPipelineService.java

Co-authored-by: Navneet Verma <vermanavneet003@gmail.com>
Signed-off-by: Michael Froh <froh@amazon.com>

* Incorporate feedback from @noCharger

Signed-off-by: Michael Froh <froh@amazon.com>

* Incorporate feedback from @reta

- Renamed various classes from "SearchPipelinesSomething" to
"SearchPipelineSomething" to be consistent.
- Refactored NodeInfo construction in NodeService to avoid ternary
  operator and improved readability.

Signed-off-by: Michael Froh <froh@amazon.com>

* Gate search pipelines behind a feature flag

Also renamed SearchPipelinesRequestConverters.

Signed-off-by: Michael Froh <froh@amazon.com>

* More feature flag fixes for search pipeline testing

- Don't use system properties for SearchPipelineServiceTests.
- Enable feature flag for multinode smoke tests.

Signed-off-by: Michael Froh <froh@amazon.com>

* Move feature flag into constructor parameter

Thanks for the suggestion, @reta!

Signed-off-by: Michael Froh <froh@amazon.com>

* Move REST handlers behind feature flag

Signed-off-by: Michael Froh <froh@amazon.com>

---------

Signed-off-by: Michael Froh <froh@amazon.com>
Co-authored-by: Navneet Verma <vermanavneet003@gmail.com>
(cherry picked from commit ee990bd)
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@msfroh
Copy link
Collaborator Author

msfroh commented Apr 10, 2023

Oops -- looks like the merge misunderstood how to handle these version checks:

        if (in.getVersion().onOrAfter(LegacyESVersion.V_7_10_0)) {
            addInfoIfNonNull(AggregationInfo.class, in.readOptionalWriteable(AggregationInfo::new));
        if (in.getVersion().onOrAfter(Version.V_3_0_0)) { // TODO: Change if/when we backport to 2.x
            addInfoIfNonNull(SearchPipelineInfo.class, in.readOptionalWriteable(SearchPipelineInfo::new));
        }
        }

I'll go back and pull the one if out of the other.

1. Can't reference version 3.0.0.
2. Bad merges of adjacent version checks.
3. Use of Apache HTTP client 4 (vs 5).
4. Use of old cluster manager naming in REST params.
5. CollectionUtils didn't have isEmpty for collections.

Signed-off-by: Michael Froh <froh@amazon.com>
@msfroh msfroh force-pushed the backport/backport-6587-to-2.x branch from c1e8cea to 744d5f9 Compare April 10, 2023 21:48
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

Signed-off-by: Michael Froh <froh@amazon.com>
@msfroh msfroh force-pushed the backport/backport-6587-to-2.x branch from 5ba734e to 4126cfc Compare April 10, 2023 23:12
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.indices.replication.SegmentReplicationIT.testScrollCreatedOnReplica
      1 org.opensearch.indices.replication.SegmentReplicationIT.testReplicationPostDeleteAndForceMerge

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=pit/10_basic/Delete all}

@codecov-commenter
Copy link

codecov-commenter commented Apr 10, 2023

Codecov Report

Merging #7075 (4126cfc) into 2.x (932f47b) will increase coverage by 0.46%.
The diff coverage is 53.37%.

📣 This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more

@@             Coverage Diff              @@
##                2.x    #7075      +/-   ##
============================================
+ Coverage     70.35%   70.81%   +0.46%     
- Complexity    59492    60043     +551     
============================================
  Files          4822     4849      +27     
  Lines        285965   286731     +766     
  Branches      41562    41649      +87     
============================================
+ Hits         201178   203042    +1864     
+ Misses        67995    67079     -916     
+ Partials      16792    16610     -182     
Impacted Files Coverage Δ
...search/client/SearchPipelineRequestConverters.java 0.00% <0.00%> (ø)
...eline/common/SearchPipelineCommonModulePlugin.java 0.00% <0.00%> (ø)
...ion/admin/cluster/node/info/NodesInfoResponse.java 3.03% <0.00%> (-0.10%) ⬇️
...search/action/search/GetSearchPipelineRequest.java 0.00% <0.00%> (ø)
...earch/action/search/GetSearchPipelineResponse.java 0.00% <0.00%> (ø)
.../org/opensearch/client/support/AbstractClient.java 31.33% <0.00%> (-0.43%) ⬇️
...search/cluster/service/ClusterManagerTaskKeys.java 0.00% <ø> (ø)
...pensearch/common/settings/FeatureFlagSettings.java 50.00% <ø> (ø)
...va/org/opensearch/common/util/CollectionUtils.java 78.87% <0.00%> (-0.56%) ⬇️
...a/org/opensearch/plugins/SearchPipelinePlugin.java 0.00% <0.00%> (ø)
... and 37 more

... and 478 files with indirect coverage changes

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@@ -102,6 +103,9 @@ public NodeInfo(StreamInput in) throws IOException {
if (in.getVersion().onOrAfter(LegacyESVersion.V_7_10_0)) {
addInfoIfNonNull(AggregationInfo.class, in.readOptionalWriteable(AggregationInfo::new));
}
if (in.getVersion().onOrAfter(Version.V_2_7_0)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@msfroh we would need to bring these changes to main as well, could you please prepare the pull request so we will merge it right after this one? thank you

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thanks! Here we go: #7135

@msfroh msfroh closed this Apr 11, 2023
@msfroh
Copy link
Collaborator Author

msfroh commented Apr 11, 2023

I'm canceling this PR for now, just to make sure it doesn't go out in its current state for 2.7. I'll see about reopening after 2.7 is released.

@msfroh
Copy link
Collaborator Author

msfroh commented Apr 12, 2023

Talked it over with a colleague, who mentioned that the feature flag is enough that it should be safe to ship this code in 2.7.

@msfroh msfroh reopened this Apr 12, 2023
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=pit/10_basic/Delete all}

@reta
Copy link
Collaborator

reta commented Apr 13, 2023

@andrross just to confirm, good to go for 2.7.0? thank you

@reta reta merged commit 6024485 into opensearch-project:2.x Apr 13, 2023
@saratvemulapalli
Copy link
Member

Talked it over with a colleague, who mentioned that the feature flag is enough that it should be safe to ship this code in 2.7.

Late here. Im the mystery colleague. As long as we dont break semver and the feature is gated we should be good.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants