[Enhancement] Implement pruning for neural sparse search #988

zhichao-aws · 2024-11-15T06:46:38Z

Description

Implement pruning for sparse vectors to save disk space and speed up search, at the cost of a small loss in search relevance. #946

Implement pruning in the sparse_encoding ingestion processor. Users can configure the pruning strategy when creating the processor, and the processor prunes the sparse vectors before writing them to the index.
Implement pruning in neural_sparse 2-phase search. Users can configure the pruning strategy when searching with a neural_sparse query. The query builder prunes the query before searching the index.
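
To make the idea concrete, here is a minimal sketch of two pruning strategies applied to a sparse vector held as a token-to-weight map. The class and method names are hypothetical, and the top_k and max_ratio semantics shown are assumptions for illustration, not the code in this PR.

```java
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper for illustration; not the PruneUtils class from this PR.
final class SparseVectorPruneSketch {

    // Assumed top_k semantics: keep only the k highest-weighted tokens.
    static Map<String, Float> pruneTopK(Map<String, Float> vector, int k) {
        if (k <= 0) {
            throw new IllegalArgumentException("k must be a positive integer");
        }
        return vector.entrySet()
            .stream()
            .sorted(Map.Entry.<String, Float>comparingByValue(Comparator.reverseOrder()))
            .limit(k)
            .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue, (a, b) -> a, LinkedHashMap::new));
    }

    // Assumed max_ratio semantics: keep tokens whose weight is at least ratio * (largest weight).
    static Map<String, Float> pruneMaxRatio(Map<String, Float> vector, float ratio) {
        if (ratio < 0f || ratio >= 1f) {
            throw new IllegalArgumentException("ratio must be in [0, 1)");
        }
        float max = 0f;
        for (float weight : vector.values()) {
            max = Math.max(max, weight);
        }
        float threshold = max * ratio;
        Map<String, Float> kept = new LinkedHashMap<>();
        for (Map.Entry<String, Float> entry : vector.entrySet()) {
            if (entry.getValue() >= threshold) {
                kept.put(entry.getKey(), entry.getValue());
            }
        }
        return kept;
    }
}
```

In both cases the dropped tokens are simply never written to the index, which is where the disk-space saving would come from.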

Related Issues

#946

Check List

New functionality includes testing.
New functionality has been documented.
API changes companion pull request created.
Commits are signed per the DCO using --signoff.
Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

zhichao-aws · 2024-11-20T07:34:56Z

This PR is ready for review now

heemin32

Could you provide an overview of how the overall API will look? I initially thought this change would only affect the query side, but it seems it will also modify the parameters for neural_sparse_two_phase_processor.

Additionally, the current implementation appears to be focused on two-phase processing with different strategies for splitting vectors, rather than a combination of pruning and two-phase processing?

src/main/java/org/opensearch/neuralsearch/processor/factory/SparseEncodingProcessorFactory.java

src/main/java/org/opensearch/neuralsearch/util/prune/PruneUtils.java

zhichao-aws · 2024-11-21T03:25:53Z

> Could you provide an overview of how the overall API will look? I initially thought this change would only affect the query side, but it seems it will also modify the parameters for neural_sparse_two_phase_processor.

Based on our benchmark results in #946, applying pruning in 2-phase search outperforms applying it to the neural sparse query body at search time, on both precision and latency. Therefore, enhancing the existing 2-phase search pipeline makes more sense.
To maintain compatibility with existing APIs, the overall API will look like this:

# ingestion pipeline
PUT /_ingest/pipeline/sparse-pipeline
{
    "description": "Calling sparse model to generate expanded tokens",
    "processors": [
        {
            "sparse_encoding": {
                "model_id": "fousVokBjnSupmOha8aN",
                "pruning_type": "alpha_mass",
                "pruning_ratio": 0.8,
                "field_map": {
                    "body": "body_sparse"
                }
            }
        }
    ]
}

# two phase pipeline
PUT /_search/pipeline/neural_search_pipeline
{
  "request_processors": [
    {
      "neural_sparse_two_phase_processor": {
        "tag": "neural-sparse",
        "description": "Creates a two-phase processor for neural sparse search.",
        "pruning_type": "alpha_mass",
        "pruning_ratio": 0.8
      }
    }
  ]
}

> Additionally, the current implementation appears to be focused on two-phase processing with different strategies for splitting vectors, rather than a combination of pruning and two-phase processing?

The existing two-phase processor uses the max_ratio prune criterion. This PR adds support for other criteria as well.
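
As an illustration of what "other criteria" could mean for two-phase search, the sketch below splits query tokens into a high-score set (first phase) and a low-score set (rescoring) using an alpha_mass style rule. The class name, the `Split` holder, and the exact rule (take tokens in descending weight order until their cumulative weight reaches ratio times the total) are assumptions, not the implementation in this PR.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of an alpha_mass style split for two-phase search.
final class TwoPhaseSplitSketch {

    // Simple holder for the two token groups.
    static final class Split {
        final Map<String, Float> high = new LinkedHashMap<>();
        final Map<String, Float> low = new LinkedHashMap<>();
    }

    static Split splitByAlphaMass(Map<String, Float> queryTokens, float ratio) {
        // Sort tokens by weight, largest first.
        List<Map.Entry<String, Float>> sorted = new ArrayList<>(queryTokens.entrySet());
        sorted.sort((a, b) -> Float.compare(b.getValue(), a.getValue()));

        double total = 0;
        for (Map.Entry<String, Float> entry : sorted) {
            total += entry.getValue();
        }
        double target = total * ratio;

        // Assumed rule: keep adding tokens to the high set until the target mass is reached.
        Split split = new Split();
        double running = 0;
        for (Map.Entry<String, Float> entry : sorted) {
            if (running < target) {
                split.high.put(entry.getKey(), entry.getValue());
                running += entry.getValue();
            } else {
                split.low.put(entry.getKey(), entry.getValue());
            }
        }
        return split;
    }
}
```

With weights 4, 3, 2, 1 and a ratio of 0.5, this rule would put the two heaviest tokens in the first phase and leave the other two for rescoring.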

codecov · 2024-11-22T06:45:04Z

Codecov Report

Attention: Patch coverage is 96.85535% with 5 lines in your changes missing coverage. Please review.

Project coverage is 81.27%. Comparing base (3c7f275) to head (7486ee8).

Files with missing lines	Patch %	Lines
...opensearch/neuralsearch/util/prune/PruneUtils.java	96.80%	2 Missing and 1 partial ⚠️
...earch/processor/NeuralSparseTwoPhaseProcessor.java	94.11%	0 Missing and 1 partial ⚠️
...h/neuralsearch/query/NeuralSparseQueryBuilder.java	83.33%	1 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff              @@
##               main     #988      +/-   ##
============================================
+ Coverage     80.47%   81.27%   +0.79%     
- Complexity     1000     1054      +54     
============================================
  Files            78       80       +2     
  Lines          3411     3535     +124     
  Branches        578      611      +33     
============================================
+ Hits           2745     2873     +128     
+ Misses          425      423       -2     
+ Partials        241      239       -2

heemin32

LGTM. Thanks!

martin-gaievski

Apart from a minor comment: why is this PR trying to merge into main?
If this changes the API used to define the processor, it should be checked with application security. For that we need to merge into a feature branch in the main repo first, and only after it is cleared there merge from the feature branch to main.

src/main/java/org/opensearch/neuralsearch/processor/NeuralSparseTwoPhaseProcessor.java

src/main/java/org/opensearch/neuralsearch/processor/SparseEncodingProcessor.java

src/main/java/org/opensearch/neuralsearch/processor/factory/SparseEncodingProcessorFactory.java

martin-gaievski · 2024-11-25T02:44:23Z

src/main/java/org/opensearch/neuralsearch/processor/factory/SparseEncodingProcessorFactory.java

+            );
+        } else {
+            // if we don't have prune type, then prune ratio field must not have value
+            if (config.containsKey(PruneUtils.PRUNE_RATIO_FIELD)) {


We can merge this if with the previous else into a single else-if block.

This else means PruneType is NONE, right? It seems this check could be moved to https://github.com/opensearch-project/neural-search/pull/988/files#diff-8453ea75f8259ba96c246d483b2de9e21601fb9c3d033e8902756f5d101f2238R262, where the input ratio is validated.

> We can merge this if with the previous else into a single else-if block.

ack

> This else means PruneType is NONE, right? It seems this check could be moved to https://github.com/opensearch-project/neural-search/pull/988/files#diff-8453ea75f8259ba96c246d483b2de9e21601fb9c3d033e8902756f5d101f2238R262, where the input ratio is validated.

We want to validate that the PRUNE_RATIO field is not provided at all. Any value would be illegal here.
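
The outcome of this thread can be sketched as follows: the nested if is folded into an else-if, and the mere presence of a ratio without a prune type is rejected. The class name is hypothetical, and the key strings are taken from the API overview earlier in this conversation, so they are assumptions about the config keys rather than the constants in PruneUtils.

```java
import java.util.Map;

// Hypothetical sketch of the factory-side check discussed above.
final class PruneConfigValidationSketch {
    // Assumed key names, copied from the API overview in this thread.
    static final String PRUNE_TYPE_FIELD = "pruning_type";
    static final String PRUNE_RATIO_FIELD = "pruning_ratio";

    static void validate(Map<String, Object> config) {
        if (config.containsKey(PRUNE_TYPE_FIELD)) {
            // Type-specific validation of the ratio would go here (omitted in this sketch).
        } else if (config.containsKey(PRUNE_RATIO_FIELD)) {
            // No prune type: the ratio field must be absent, whatever its value.
            throw new IllegalArgumentException(PRUNE_RATIO_FIELD + " is not supported when " + PRUNE_TYPE_FIELD + " is not provided");
        }
    }
}
```

Checking `containsKey` rather than the parsed value is what makes any supplied ratio illegal in this branch, including values that would otherwise look harmless.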

src/main/java/org/opensearch/neuralsearch/util/prune/PruneType.java

src/main/java/org/opensearch/neuralsearch/util/prune/PruneUtils.java

martin-gaievski · 2024-11-25T02:58:56Z

src/main/java/org/opensearch/neuralsearch/util/prune/PruneUtils.java

+
+        switch (pruneType) {
+            case TOP_K:
+                return pruneRatio > 0 && pruneRatio == Math.floor(pruneRatio);


Suggested change

-                return pruneRatio > 0 && pruneRatio == Math.floor(pruneRatio);
+                return pruneRatio > 0 && pruneRatio == Math.rint(pruneRatio);

This is more reliable for floating-point numbers; otherwise there is a chance of a false positive.

It doesn't seem correct to replace floor with rint. By definition, rint returns the even integer when two integers are equally close to the input. I tested with input 3.5: floor gives 3 but rint gives 4.

Could you please give an example of a false positive?
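
For reference, the two predicates under discussion can be compared directly. The sketch below uses hypothetical class and method names and only covers the whole-number test, leaving out the pruneRatio > 0 part. As far as I can tell, both comparisons accept exactly the whole numbers for finite inputs, even though floor and rint return different results for an input such as 3.5.

```java
// Hypothetical comparison of the two whole-number checks from the review thread.
final class WholeNumberCheckSketch {

    // Check used in the PR: a value is whole if it equals its floor.
    static boolean isWholeViaFloor(float x) {
        return x == Math.floor(x);
    }

    // Check from the suggested change: a value is whole if it equals its rint.
    static boolean isWholeViaRint(float x) {
        return x == Math.rint(x);
    }

    public static void main(String[] args) {
        float[] samples = { 1.0f, 2.5f, 3.0f, 3.5f, 0.1f, 1e10f };
        for (float x : samples) {
            System.out.println(x + " floor=" + isWholeViaFloor(x) + " rint=" + isWholeViaRint(x));
        }
    }
}
```

A fractional input differs from both its floor and its rint, so neither check would let it through, and a whole input equals both.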

martin-gaievski · 2024-11-25T03:03:49Z

src/main/java/org/opensearch/neuralsearch/util/prune/PruneUtils.java

+            }
+        }
+
+        switch (pruneType) {


same as above, can we use map instead of switch?

zhichao-aws · 2024-11-25T03:36:19Z

@martin-gaievski Thanks for the comments. We didn't create a feature branch because no other contributors are working on this, and we regard the PR branch as the feature branch.

I'm on PTO this week. I will follow up on the app sec issue and address the comments next week.

zane-neo · 2024-12-05T07:21:29Z

src/main/java/org/opensearch/neuralsearch/util/prune/PruneUtils.java

+
+        switch (pruneType) {
+            case TOP_K:
+                return pruneRatio > 0 && pruneRatio == Math.floor(pruneRatio);


It doesn't seem correct to replace floor with rint. By definition, rint returns the even integer when two integers are equally close to the input. I tested with input 3.5: floor gives 3 but rint gives 4.

src/main/java/org/opensearch/neuralsearch/util/prune/PruneUtils.java

zane-neo · 2024-12-05T07:22:40Z

src/main/java/org/opensearch/neuralsearch/util/prune/PruneUtils.java

+     * @param pruneType The type of prune strategy
+     * @throws IllegalArgumentException if prune type is null
+     */
+    public static String getValidPruneRatioDescription(PruneType pruneType) {


[nit] this can be refactored to a static map.

Please refer to the discussion with Martin above.
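
For readers wondering what the static-map variant would look like, here is a rough sketch. The enum constants other than TOP_K and NONE, and all description strings, are placeholders assumed for illustration; they are not the values used in PruneUtils.

```java
import java.util.Collections;
import java.util.EnumMap;
import java.util.Map;

// Hypothetical sketch of replacing the switch with a static lookup table.
final class PruneRatioDescriptionSketch {

    // Stand-in for the PR's PruneType enum; the constant list is an assumption.
    enum PruneType { NONE, TOP_K, ALPHA_MASS, MAX_RATIO }

    private static final Map<PruneType, String> DESCRIPTIONS;

    static {
        EnumMap<PruneType, String> map = new EnumMap<>(PruneType.class);
        // Placeholder texts; the real messages may differ.
        map.put(PruneType.NONE, "prune ratio must not be provided");
        map.put(PruneType.TOP_K, "prune ratio should be a positive integer");
        map.put(PruneType.ALPHA_MASS, "prune ratio should be in the range [0, 1)");
        map.put(PruneType.MAX_RATIO, "prune ratio should be in the range [0, 1)");
        DESCRIPTIONS = Collections.unmodifiableMap(map);
    }

    static String getValidPruneRatioDescription(PruneType pruneType) {
        if (pruneType == null) {
            throw new IllegalArgumentException("Prune type cannot be null");
        }
        return DESCRIPTIONS.get(pruneType);
    }
}
```

The trade-off is mostly stylistic: a map keeps the texts in one table, while a switch keeps each message next to any type-specific logic.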

zane-neo · 2024-12-05T07:41:14Z

src/main/java/org/opensearch/neuralsearch/processor/factory/SparseEncodingProcessorFactory.java

+            );
+        } else {
+            // if we don't have prune type, then prune ratio field must not have value
+            if (config.containsKey(PruneUtils.PRUNE_RATIO_FIELD)) {


This else means PruneType is NONE, right? It seems this check could be moved to https://github.com/opensearch-project/neural-search/pull/988/files#diff-8453ea75f8259ba96c246d483b2de9e21601fb9c3d033e8902756f5d101f2238R262, where the input ratio is validated.

src/main/java/org/opensearch/neuralsearch/util/prune/PruneUtils.java

Signed-off-by: zhichao-aws <zhichaog@amazon.com>

src/main/java/org/opensearch/neuralsearch/util/prune/PruneUtils.java

martin-gaievski

Left a minor comment. Relaxing the potential merge blocker: from an app sec point of view, the risk of this enhancement looks low.

App security flagged this as a low-risk change, so I am removing the blocker.

Signed-off-by: zhichao-aws <zhichaog@amazon.com>

opensearch-trigger-bot · 2024-12-18T03:07:45Z

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-2.x 2.x
# Navigate to the new working tree
cd .worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-988-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 e8fe2847a5237a03edd414a333799f7a5d2d8c7d
# Push it to GitHub
git push --set-upstream origin backport/backport-988-to-2.x
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-988-to-2.x.

* add impl Signed-off-by: zhichao-aws <zhichaog@amazon.com>
* add UT Signed-off-by: zhichao-aws <zhichaog@amazon.com>
* rename pruneType; UT Signed-off-by: zhichao-aws <zhichaog@amazon.com>
* changelog Signed-off-by: zhichao-aws <zhichaog@amazon.com>
* ut Signed-off-by: zhichao-aws <zhichaog@amazon.com>
* add it Signed-off-by: zhichao-aws <zhichaog@amazon.com>
* change on 2-phase Signed-off-by: zhichao-aws <zhichaog@amazon.com>
* UT Signed-off-by: zhichao-aws <zhichaog@amazon.com>
* it Signed-off-by: zhichao-aws <zhichaog@amazon.com>
* rename Signed-off-by: zhichao-aws <zhichaog@amazon.com>
* enhance: more detailed error message Signed-off-by: zhichao-aws <zhichaog@amazon.com>
* refactor to prune and split Signed-off-by: zhichao-aws <zhichaog@amazon.com>
* changelog Signed-off-by: zhichao-aws <zhichaog@amazon.com>
* fix UT cov Signed-off-by: zhichao-aws <zhichaog@amazon.com>
* address review comments Signed-off-by: zhichao-aws <zhichaog@amazon.com>
* enlarge score diff range Signed-off-by: zhichao-aws <zhichaog@amazon.com>
* address comments: check lowScores non null instead of flag Signed-off-by: zhichao-aws <zhichaog@amazon.com>
---------
Signed-off-by: zhichao-aws <zhichaog@amazon.com>
(cherry picked from commit e8fe284)

martin-gaievski mentioned this pull request Nov 16, 2024

[FEATURE] Enhanced adaptive token pruning for neural sparse search #989

Open

zhichao-aws force-pushed the pruning_dev branch from 1e55b7c to 46b9d9a Compare November 20, 2024 07:34

zhichao-aws marked this pull request as ready for review November 20, 2024 07:34

zhichao-aws requested review from heemin32, navneet1v, VijayanB, vamshin, jmazanec15, naveentatikonda, junqiu-lei, martin-gaievski, sean-zheng-amazon, model-collapse, zane-neo, vibrantvarun, yuye-aws and minalsha as code owners November 20, 2024 07:34

heemin32 reviewed Nov 20, 2024

zhichao-aws changed the title from "[Feature] Implement pruning for neural sparse search" to "[Enhancement] Implement pruning for neural sparse search" Nov 22, 2024

zhichao-aws requested a review from heemin32 November 22, 2024 07:18

heemin32 approved these changes Nov 22, 2024

martin-gaievski previously requested changes Nov 25, 2024

zane-neo reviewed Dec 5, 2024

zhichao-aws added 3 commits December 10, 2024 15:05

add impl

1e09989

Signed-off-by: zhichao-aws <zhichaog@amazon.com>

add UT

adca9be

Signed-off-by: zhichao-aws <zhichaog@amazon.com>

rename pruneType; UT

2cc0d10

Signed-off-by: zhichao-aws <zhichaog@amazon.com>

zhichao-aws added 12 commits December 10, 2024 15:05

changelog

26098cc

Signed-off-by: zhichao-aws <zhichaog@amazon.com>

ut

6af02b8

Signed-off-by: zhichao-aws <zhichaog@amazon.com>

add it

2ac90de

Signed-off-by: zhichao-aws <zhichaog@amazon.com>

change on 2-phase

97963f1

Signed-off-by: zhichao-aws <zhichaog@amazon.com>

UT

c5dd602

Signed-off-by: zhichao-aws <zhichaog@amazon.com>

it

c7f0031

Signed-off-by: zhichao-aws <zhichaog@amazon.com>

rename

0fd2597

Signed-off-by: zhichao-aws <zhichaog@amazon.com>

enhance: more detailed error message

09e4765

Signed-off-by: zhichao-aws <zhichaog@amazon.com>

refactor to prune and split

5b8ab70

Signed-off-by: zhichao-aws <zhichaog@amazon.com>

changelog

6cabbc0

Signed-off-by: zhichao-aws <zhichaog@amazon.com>

fix UT cov

cffd829

Signed-off-by: zhichao-aws <zhichaog@amazon.com>

address review comments

0d928a9

Signed-off-by: zhichao-aws <zhichaog@amazon.com>

zhichao-aws force-pushed the pruning_dev branch from 2a3e2cf to 0d928a9 Compare December 10, 2024 08:43

enlarge score diff range

7486ee8

Signed-off-by: zhichao-aws <zhichaog@amazon.com>

zane-neo approved these changes Dec 16, 2024

merge main

b32b471

Signed-off-by: zhichao-aws <zhichaog@amazon.com>

martin-gaievski added backport 2.x, enhancement, v2.19.0 labels Dec 17, 2024

martin-gaievski reviewed Dec 17, 2024

martin-gaievski reviewed Dec 17, 2024

zhichao-aws added 2 commits December 18, 2024 10:26

merge main

8d03256

Signed-off-by: zhichao-aws <zhichaog@amazon.com>

address comments: check lowScores non null instead of flag

185fc51

Signed-off-by: zhichao-aws <zhichaog@amazon.com>

zhichao-aws merged commit e8fe284 into opensearch-project:main Dec 18, 2024
39 checks passed

zhichao-aws added the backport 2.x label and removed the backport 2.x label Dec 18, 2024

zhichao-aws mentioned this pull request Dec 18, 2024

[Backport] manually backport 988 to 2.x #1030

Open

5 tasks
