Cleanup file cache on deleting index/shard directory #11443

Merged
merged 2 commits into from
Feb 3, 2024


bugmakerrrrrr
Contributor

Description

Currently, file cache cleanup relies on the index event listener. This works when an index is simply deleted. However, in some situations, such as shard relocation, the shard data may be deleted asynchronously, and the registered index event listener is never triggered (as noted in the comments on org.opensearch.indices.cluster.IndicesClusterStateService#removeIndices/removeShards). To address this, I suggest performing file cache cleanup (both cache entry eviction and file deletion) when the index/shard directory is deleted in NodeEnvironment.
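
To make the proposal concrete, here is a minimal, standalone sketch of tying cache cleanup to directory deletion rather than to an event listener. It is an illustration only, not the code in this PR: `ShardDirectoryCleanup`, its `put`/`size`/`deleteShardDirectory` methods, and the path-to-size map are all hypothetical stand-ins, and the real OpenSearch file cache and NodeEnvironment APIs are assumed to differ.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical stand-in for a node-level file cache; NOT the OpenSearch FileCache API.
class ShardDirectoryCleanup {
    // Cached file path -> size in bytes (assumed bookkeeping for this sketch).
    private final Map<Path, Long> cache = new ConcurrentHashMap<>();

    void put(Path file, long sizeInBytes) {
        cache.put(file.toAbsolutePath().normalize(), sizeInBytes);
    }

    int size() {
        return cache.size();
    }

    // Evicts every cache entry under shardDir, then deletes the directory tree.
    // Because both steps live in the deletion path, they run no matter which
    // caller (synchronous or asynchronous) triggers the deletion.
    void deleteShardDirectory(Path shardDir) throws IOException {
        Path root = shardDir.toAbsolutePath().normalize();
        cache.keySet().removeIf(p -> p.startsWith(root));
        if (Files.notExists(root)) {
            return;
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(root)) {
            // Reverse order puts children before their parent directories.
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }

    public static void main(String[] args) throws IOException {
        Path shardDir = Files.createTempDirectory("shard-0");
        Path cached = Files.createFile(shardDir.resolve("segment.block"));
        ShardDirectoryCleanup env = new ShardDirectoryCleanup();
        env.put(cached, 0L);
        env.deleteShardDirectory(shardDir);
        System.out.println("entries=" + env.size() + ", exists=" + Files.exists(shardDir));
    }
}
```

The point of the sketch is the placement of the eviction: it happens inside the same method that removes the directory, so no separate notification has to fire for the cache to stay consistent with the disk.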

Related Issues

Resolves #[Issue number to be closed when this PR is merged]

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Failing checks are inspected and point to the corresponding known issue(s) (See: Troubleshooting Failing Builds)
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)
  • Public documentation issue/PR created

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Contributor

github-actions bot commented Dec 4, 2023

❌ Gradle check result for 50c3b60: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Contributor

github-actions bot commented Dec 4, 2023

Compatibility status:

Checks if related components are compatible with change e1e1472

Incompatible components

Incompatible components: [https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/performance-analyzer-rca.git]

Skipped components

Compatible components

Compatible components: [https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/custom-codecs.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/alerting.git]

@bugmakerrrrrr
Contributor Author

@andrross @adnapibar Would you mind taking a look?

@opensearch-trigger-bot
Contributor

This PR is stalled because it has been open for 30 days with no activity.

opensearch-trigger-bot bot added the stalled (Issues that have stalled) label Jan 11, 2024
@bugmakerrrrrr
Contributor Author

@andrross Sorry to bother you again. I believe this is a bug that needs to be fixed. Could you take a look?

@andrross
Member

@bugmakerrrrrr Apologies for the delay, I'll look into this today, thanks!

andrross added the backport 2.x (Backport to 2.x branch), bug (Something isn't working), and Search:Remote Search labels and removed the stalled (Issues that have stalled) label Jan 11, 2024
@andrross andrross dismissed their stale review January 13, 2024 01:05

Accidentally clicked approve...

Signed-off-by: panguixin <panguixin@bytedance.com>
Signed-off-by: panguixin <panguixin@bytedance.com>
@bugmakerrrrrr
Contributor Author

@andrross I added a new listener, IndexStoreListener, to NodeEnvironment. Can you take a look?
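
For readers following the thread, the listener approach can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the merged code: `StoreListenerSketch`, `EnvironmentSketch`, and `CacheCleanerSketch` are hypothetical names, and the callback names and parameters (`beforeShardPathDeleted`, `beforeIndexPathDeleted`, plain strings and ints instead of OpenSearch types) are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical callback interface; method names and parameters are assumptions.
interface StoreListenerSketch {
    default void beforeShardPathDeleted(String index, int shard) {}

    default void beforeIndexPathDeleted(String index) {}
}

// Simplified stand-in for the node environment: it notifies the listener
// first and only then removes the directory.
class EnvironmentSketch {
    private final StoreListenerSketch listener;
    final List<String> deleted = new ArrayList<>();

    EnvironmentSketch(StoreListenerSketch listener) {
        this.listener = listener;
    }

    void deleteShardDirectory(String index, int shard) {
        listener.beforeShardPathDeleted(index, shard);
        deleted.add(index + "/" + shard); // placeholder for the real directory deletion
    }

    void deleteIndexDirectory(String index) {
        listener.beforeIndexPathDeleted(index);
        deleted.add(index); // placeholder for the real directory deletion
    }
}

// Hypothetical cleaner that records what it would evict from the file cache.
class CacheCleanerSketch implements StoreListenerSketch {
    final List<String> events = new ArrayList<>();

    @Override
    public void beforeShardPathDeleted(String index, int shard) {
        events.add("evict shard " + index + "/" + shard);
    }

    @Override
    public void beforeIndexPathDeleted(String index) {
        events.add("evict index " + index);
    }

    public static void main(String[] args) {
        CacheCleanerSketch cleaner = new CacheCleanerSketch();
        EnvironmentSketch env = new EnvironmentSketch(cleaner);
        env.deleteShardDirectory("logs", 0);
        env.deleteIndexDirectory("logs");
        System.out.println(cleaner.events);
    }
}
```

Compared with putting cache logic directly in the environment, a listener keeps the environment unaware of the file cache while still guaranteeing the callback runs on every deletion path that goes through it.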

@andrross
Member

andrross commented Feb 2, 2024

@bugmakerrrrrr I'll take a look today, thanks!

Contributor

github-actions bot commented Feb 2, 2024

❕ Gradle check result for 36a6bb1: UNSTABLE

  • TEST FAILURES:
      1 org.opensearch.remotestore.RemoteIndexPrimaryRelocationIT.testPrimaryRelocationWhileIndexing

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.


codecov bot commented Feb 2, 2024

Codecov Report

Attention: 17 lines in your changes are missing coverage. Please review.

Comparison is base (57cc0dd) 71.34% compared to head (e1e1472) 71.41%.
Report is 3 commits behind head on main.

Files Patch % Lines
...index/store/remote/filecache/FileCacheCleaner.java 60.86% 5 Missing and 4 partials ⚠️
.../main/java/org/opensearch/env/NodeEnvironment.java 57.89% 8 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #11443      +/-   ##
============================================
+ Coverage     71.34%   71.41%   +0.07%     
- Complexity    59486    59511      +25     
============================================
  Files          4927     4927              
  Lines        279663   279682      +19     
  Branches      40663    40666       +3     
============================================
+ Hits         199513   199745     +232     
+ Misses        63533    63335     -198     
+ Partials      16617    16602      -15     


Contributor

github-actions bot commented Feb 2, 2024

❕ Gradle check result for e1e1472: UNSTABLE

  • TEST FAILURES:
      1 org.opensearch.cluster.coordination.AwarenessAttributeDecommissionIT.testConcurrentDecommissionAction
      1 org.opensearch.cluster.allocation.ClusterRerouteIT.testDelayWithALargeAmountOfShards

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

@andrross andrross merged commit c564ee3 into opensearch-project:main Feb 3, 2024
32 of 34 checks passed
opensearch-trigger-bot bot pushed a commit that referenced this pull request Feb 3, 2024
* cleanup file cache on deleting index/shard directory

Signed-off-by: panguixin <panguixin@bytedance.com>

* add index store listener

Signed-off-by: panguixin <panguixin@bytedance.com>

---------

Signed-off-by: panguixin <panguixin@bytedance.com>
(cherry picked from commit c564ee3)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
andrross pushed a commit that referenced this pull request Feb 20, 2024
* cleanup file cache on deleting index/shard directory



* add index store listener



---------


(cherry picked from commit c564ee3)

Signed-off-by: panguixin <panguixin@bytedance.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
peteralfonsi pushed a commit to peteralfonsi/OpenSearch that referenced this pull request Mar 1, 2024
…ect#11443)

* cleanup file cache on deleting index/shard directory

Signed-off-by: panguixin <panguixin@bytedance.com>

* add index store listener

Signed-off-by: panguixin <panguixin@bytedance.com>

---------

Signed-off-by: panguixin <panguixin@bytedance.com>
rayshrey pushed a commit to rayshrey/OpenSearch that referenced this pull request Mar 18, 2024
…ect#11443)

* cleanup file cache on deleting index/shard directory

Signed-off-by: panguixin <panguixin@bytedance.com>

* add index store listener

Signed-off-by: panguixin <panguixin@bytedance.com>

---------

Signed-off-by: panguixin <panguixin@bytedance.com>
shiv0408 pushed a commit to Gaurav614/OpenSearch that referenced this pull request Apr 25, 2024
…ect#11443)

* cleanup file cache on deleting index/shard directory

Signed-off-by: panguixin <panguixin@bytedance.com>

* add index store listener

Signed-off-by: panguixin <panguixin@bytedance.com>

---------

Signed-off-by: panguixin <panguixin@bytedance.com>
Signed-off-by: Shivansh Arora <hishiv@amazon.com>
Labels
backport 2.x Backport to 2.x branch bug Something isn't working Search:Remote Search skip-changelog

2 participants