Fix: text chunking processor ingestion bug on multi-node cluster #713
Hi maintainers. This PR fixes a bug in the text chunking processor. Please attach the backport 2.x and backport 2.13 labels to this PR.
This PR is still a work in progress. Before it is merged, it must satisfy the following conditions:
d3bc602 to 51adb22
aa08ffb to f3a81f6
@model-collapse @zane-neo This PR is now ready for review. Please merge it once all CI workflows pass.
Shall we add an IT to cover this "configured shard number is less than the number of nodes" scenario? It can be done in a separate issue and PR.
In the current CI, all ITs run on a single node. I think we can enhance the CI framework by adding the build with |
Signed-off-by: yuye-aws <yuyezhu@amazon.com>
bae4e66 to fe6eda0
All concerns addressed.
We can't merge the PR until the BWC tests pass.
@model-collapse The GH workflows are failing. Let's ensure the GH Actions runs succeed before approving PRs.
Even the Gradle checks are failing @yuye-aws
The "Model not deployed yet" error is coming from ml-commons: opensearch-project/ml-commons#2382
This is a separate issue related to the ml-commons main branch; we'll track it in a new issue. We'll merge this PR for now because it fixes a critical issue that could impact customers.
* fix multi node text chunking processor index bug
* add change log
* bug fix: no max token count setting in index
* make program faster without creating index settings object
* add comment
* fix comment
* resolve code review
* simplify the code given toInt in NumberUtils
* resolve code review comments
---------
Signed-off-by: yuye-aws <yuyezhu@amazon.com>
(cherry picked from commit 2d42408)
The backport to
To backport manually, run these commands in your terminal:
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-2.13 2.13
# Navigate to the new working tree
cd .worktrees/backport-2.13
# Create a new branch
git switch --create backport/backport-713-to-2.13
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 2d42408c70e01b95825744bea0182ff361090a4e
# Push it to GitHub
git push --set-upstream origin backport/backport-713-to-2.13
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-2.13
Then, create a pull request where the |
…nsearch-project#713) (cherry picked from commit 2d42408)
… (#725) (cherry picked from commit 2d42408) Signed-off-by: yuye-aws <yuyezhu@amazon.com>
… (#724) (cherry picked from commit 2d42408) Co-authored-by: yuye-aws <yuyezhu@amazon.com>
… (#723) (cherry picked from commit 2d42408) Co-authored-by: yuye-aws <yuyezhu@amazon.com>
Description
On a multi-node cluster, the text chunking processor produces a "no such index" error when the configured shard number is less than the number of nodes. This happens because some nodes hold no shard of the index and therefore have no local information about it. When the processor reads the max token count setting on such a node, indicesService fails to find the index.
Issues Resolved
Fix ingestion bug on multi-node cluster
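The failure mode in the Description can be illustrated with a small, dependency-free Java sketch. It is not the code from this PR: the class `MaxTokenCountLookup`, the `Node` stand-in, and both lookup methods are hypothetical, and the setting key and default value are assumptions used only for illustration. The sketch contrasts a lookup against a node's local index registry, which fails on a node that holds no shard of the index, with a lookup against cluster-wide index metadata that falls back to a default when the index or the setting is missing.

```java
import java.util.HashMap;
import java.util.Map;

public class MaxTokenCountLookup {
    // Assumed default and setting key, for illustration only.
    static final int DEFAULT_MAX_TOKEN_COUNT = 10000;
    static final String MAX_TOKEN_COUNT_KEY = "index.analyze.max_token_count";

    /** Hypothetical stand-in for a node: it only knows indices that have a shard on it. */
    static final class Node {
        final Map<String, Map<String, String>> localIndices = new HashMap<>();
    }

    /** Node-local lookup: fails when the node holds no shard of the index. */
    static int fromLocalIndices(Node node, String indexName) {
        Map<String, String> settings = node.localIndices.get(indexName);
        if (settings == null) {
            throw new IllegalStateException("no such index [" + indexName + "]");
        }
        return toInt(settings.get(MAX_TOKEN_COUNT_KEY), DEFAULT_MAX_TOKEN_COUNT);
    }

    /** Cluster-wide lookup: every node sees the same metadata; missing values fall back to the default. */
    static int fromClusterMetadata(Map<String, Map<String, String>> clusterMetadata, String indexName) {
        Map<String, String> settings = clusterMetadata.get(indexName);
        if (settings == null) {
            return DEFAULT_MAX_TOKEN_COUNT;
        }
        return toInt(settings.get(MAX_TOKEN_COUNT_KEY), DEFAULT_MAX_TOKEN_COUNT);
    }

    /** Parses an int, returning the default for null or malformed input. */
    static int toInt(String value, int defaultValue) {
        if (value == null) {
            return defaultValue;
        }
        try {
            return Integer.parseInt(value.trim());
        } catch (NumberFormatException e) {
            return defaultValue;
        }
    }

    public static void main(String[] args) {
        Map<String, String> docsSettings = Map.of(MAX_TOKEN_COUNT_KEY, "500");

        // One shard, two nodes: only nodeA holds the index locally.
        Node nodeA = new Node();
        nodeA.localIndices.put("docs", docsSettings);
        Node nodeB = new Node();

        Map<String, Map<String, String>> clusterMetadata = new HashMap<>();
        clusterMetadata.put("docs", docsSettings);

        System.out.println("nodeA local lookup: " + fromLocalIndices(nodeA, "docs"));
        try {
            fromLocalIndices(nodeB, "docs");
        } catch (IllegalStateException e) {
            System.out.println("nodeB local lookup failed: " + e.getMessage());
        }
        System.out.println("cluster metadata lookup: " + fromClusterMetadata(clusterMetadata, "docs"));
    }
}
```

In the sketch, the cluster-metadata variant never throws for an unknown index, which also covers the "no max token count setting in index" case listed in the commit messages by returning the default.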
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.