Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding back [Time series based workload desc order optimization through reverse segment read (#7244)] with fixes #7967

Merged
merged 5 commits into from
Jun 12, 2023

Conversation

gashutos
Copy link
Contributor

@gashutos gashutos commented Jun 8, 2023

Description

This PR adds back reverted PR #7244 due to bug introduced in re-index in this issue #7878.
The issue was, re-index operation on timeseries based workload was only able to re-index documents in its last segment and was skipping all other segments, hence we had data loss.
The reason behind this, we do early terminate the sort queries if lucene index is indexsorted or sort field is _doc. As per TopFieldCollector code, in case of IndexSort or _doc field sort, it does earlyTerminate and searches only very first segment. ReIndex plugin queries scroll query on _doc field in asc order. Hence the issue was with Reindex.

In this PR, I am adding check not to enable this optimization in case if sorting is not on @timestamp field as well if it is IndexSorted.

Related Issues

Resolves #7878

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

…rough reverse segment read (opensearch-project#7244)" (opensearch-project#7892)"

This reverts commit bb26536.

Signed-off-by: gashutos <gashutos@amazon.com>
…also ASC order reverse should only consider in @timestamp field

Signed-off-by: gashutos <gashutos@amazon.com>
@gashutos
Copy link
Contributor Author

gashutos commented Jun 8, 2023

Please reivew second commit 59285c02385bb13bf98f710c6e683b67a2af041c only, The first commit in this PR contains just revert of #7244

@reta @andrross @Hailong-am.

many thanks for Hailong-am for catching this and @andrross to help reverting to unblock 2.8 release.

Signed-off-by: gashutos <gashutos@amazon.com>
@github-actions
Copy link
Contributor

github-actions bot commented Jun 8, 2023

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.search.backpressure.SearchBackpressureIT.testSearchShardTaskCancellationWithHighCpu
      1 org.opensearch.remotestore.SegmentReplicationUsingRemoteStoreIT.testDropPrimaryDuringReplication
      1 org.opensearch.remotestore.RemoteStoreRefreshListenerIT.testRemoteRefreshRetryOnFailure
      1 org.opensearch.indices.replication.SegmentReplicationIT.testScrollCreatedOnReplica

@github-actions
Copy link
Contributor

github-actions bot commented Jun 8, 2023

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.search.backpressure.SearchBackpressureIT.testSearchShardTaskCancellationWithHighCpu
      1 org.opensearch.remotestore.SegmentReplicationUsingRemoteStoreIT.testDropPrimaryDuringReplication

@codecov
Copy link

codecov bot commented Jun 8, 2023

Codecov Report

Merging #7967 (52cd6bd) into main (584617e) will increase coverage by 0.53%.
The diff coverage is 35.71%.

@@             Coverage Diff              @@
##               main    #7967      +/-   ##
============================================
+ Coverage     70.87%   71.40%   +0.53%     
- Complexity    56504    56940     +436     
============================================
  Files          4719     4719              
  Lines        267408   267446      +38     
  Branches      39196    39210      +14     
============================================
+ Hits         189521   190970    +1449     
+ Misses        61861    60657    -1204     
+ Partials      16026    15819     -207     
Impacted Files Coverage Δ
.../main/java/org/opensearch/index/IndexSettings.java 85.93% <0.00%> (-0.67%) ⬇️
...n/java/org/opensearch/index/MergePolicyConfig.java 74.66% <ø> (ø)
...va/org/opensearch/index/engine/InternalEngine.java 75.35% <0.00%> (+0.58%) ⬆️
...ensearch/search/internal/ContextIndexSearcher.java 71.60% <20.00%> (-5.73%) ⬇️
...va/org/opensearch/cluster/metadata/DataStream.java 84.15% <22.22%> (-6.06%) ⬇️
...ava/org/opensearch/index/mapper/MappingLookup.java 92.79% <50.00%> (-0.79%) ⬇️
...in/java/org/opensearch/index/shard/IndexShard.java 70.63% <66.66%> (-0.15%) ⬇️
...java/org/opensearch/index/engine/EngineConfig.java 98.50% <100.00%> (+0.04%) ⬆️
...g/opensearch/index/engine/EngineConfigFactory.java 92.95% <100.00%> (+0.10%) ⬆️

... and 456 files with indirect coverage changes

@reta
Copy link
Collaborator

reta commented Jun 8, 2023

The issue was, re-index operation on timeseries based workload was only able to re-index documents in its last segment and was skipping all other segments, hence we had data loss.

@gashutos would be great if we could somehow add test case(s) for that, what do you think?

@andrross
Copy link
Member

andrross commented Jun 9, 2023

would be great if we could somehow add test case(s) for that, what do you think?

Agreed. @gashutos Can you add an integration or unit test to ensure we're not still hitting the early terminate behavior?

…termination

Signed-off-by: gashutos <gashutos@amazon.com>
@gashutos
Copy link
Contributor Author

Report

would be great if we could somehow add test case(s) for that, what do you think?

Agreed. @gashutos Can you add an integration or unit test to ensure we're not still hitting the early terminate behavior?

Oops, I missed adding file to commit. Just added now.
Sorry, just got back from short vacation so delay in response. Added integ test to commit now.

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      2 org.opensearch.remotestore.RemoteStoreRefreshListenerIT.testRemoteRefreshRetryOnFailure
      1 org.opensearch.search.backpressure.SearchBackpressureIT.testSearchShardTaskCancellationWithHighCpu
      1 org.opensearch.action.admin.indices.create.SplitIndexIT.testSplitIndexPrimaryTerm

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.search.backpressure.SearchBackpressureIT.testSearchShardTaskCancellationWithHighCpu
      1 org.opensearch.remotestore.SegmentReplicationUsingRemoteStoreIT.testReplicaHasDiffFilesThanPrimary
      1 org.opensearch.indices.replication.SegmentReplicationIT.testScrollCreatedOnReplica
      1 org.opensearch.indices.replication.SegmentReplicationIT.testReplicationPostDeleteAndForceMerge

@gashutos
Copy link
Contributor Author

would be great if we could somehow add test case(s) for that, what do you think?

Agreed. @gashutos Can you add an integration or unit test to ensure we're not still hitting the early terminate behavior?

Oops, I missed adding file to commit. Just added now.
Sorry, just got back from short vacation so delay in response. Added integ test to commit now.

I was trying a lot about adding unit test to cover codcov happy. I think for this scenario, Integ tests are fine but unit tests doesnt add much value even if we end up covering lines. Let me know if you guys think otherwise.

@andrross andrross added the backport 2.x Backport to 2.x branch label Jun 12, 2023
@andrross andrross merged commit 5c32256 into opensearch-project:main Jun 12, 2023
@opensearch-trigger-bot
Copy link
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-7967-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 5c3225692dcea1eddbb3e76ae19f47de5ea23a96
# Push it to GitHub
git push --set-upstream origin backport/backport-7967-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-7967-to-2.x.

@reta
Copy link
Collaborator

reta commented Jun 12, 2023

@gashutos mind please to create a manual backport to 2.x, thank you

gashutos added a commit to gashutos/OpenSearch that referenced this pull request Jun 13, 2023
…gh reverse segment read (opensearch-project#7244)] with fixes (opensearch-project#7967)

* Revert "Revert "Time series based workload desc order optimization through reverse segment read (opensearch-project#7244)" (opensearch-project#7892)"

This reverts commit bb26536.

Signed-off-by: gashutos <gashutos@amazon.com>

* Enable time series optimization only if it is not IndexSorted index, also ASC order reverse should only consider in @timestamp field

Signed-off-by: gashutos <gashutos@amazon.com>

* Modifying CHANGELOG

Signed-off-by: gashutos <gashutos@amazon.com>

* Adding integ test for scroll API where sort by _doc is getting early termination

Signed-off-by: gashutos <gashutos@amazon.com>

---------

Signed-off-by: gashutos <gashutos@amazon.com>
gbbafna pushed a commit that referenced this pull request Jun 13, 2023
…gh reverse segment read (#7244)] with fixes (#7967) (#8037)

Signed-off-by: gashutos <gashutos@amazon.com>
gaiksaya pushed a commit to gaiksaya/OpenSearch that referenced this pull request Jun 26, 2023
imRishN pushed a commit to imRishN/OpenSearch that referenced this pull request Jun 27, 2023
…gh reverse segment read (opensearch-project#7244)] with fixes (opensearch-project#7967)

* Revert "Revert "Time series based workload desc order optimization through reverse segment read (opensearch-project#7244)" (opensearch-project#7892)"

This reverts commit bb26536.

Signed-off-by: gashutos <gashutos@amazon.com>

* Enable time series optimization only if it is not IndexSorted index, also ASC order reverse should only consider in @timestamp field

Signed-off-by: gashutos <gashutos@amazon.com>

* Modifying CHANGELOG

Signed-off-by: gashutos <gashutos@amazon.com>

* Adding integ test for scroll API where sort by _doc is getting early termination

Signed-off-by: gashutos <gashutos@amazon.com>

---------

Signed-off-by: gashutos <gashutos@amazon.com>
Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
shiv0408 pushed a commit to Gaurav614/OpenSearch that referenced this pull request Apr 25, 2024
…gh reverse segment read (opensearch-project#7244)] with fixes (opensearch-project#7967)

* Revert "Revert "Time series based workload desc order optimization through reverse segment read (opensearch-project#7244)" (opensearch-project#7892)"

This reverts commit bb26536.

Signed-off-by: gashutos <gashutos@amazon.com>

* Enable time series optimization only if it is not IndexSorted index, also ASC order reverse should only consider in @timestamp field

Signed-off-by: gashutos <gashutos@amazon.com>

* Modifying CHANGELOG

Signed-off-by: gashutos <gashutos@amazon.com>

* Adding integ test for scroll API where sort by _doc is getting early termination

Signed-off-by: gashutos <gashutos@amazon.com>

---------

Signed-off-by: gashutos <gashutos@amazon.com>
Signed-off-by: Shivansh Arora <hishiv@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Backport to 2.x branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] Reindex is break can not copy all documents from source
3 participants