Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

File System Caching for remote store #5641

Conversation

aabukhalil
Copy link
Contributor

@aabukhalil aabukhalil commented Dec 27, 2022

Signed-off-by: Ahmad AbuKhalil abukhali@amazon.com

Description

Implement FileSystem Tracking LRU Cache with RefCounting
Part of this commit was developed from code and concepts initially implemented
in Amazon OpenSearch Service as part of the UltraWarm feature.

Issues Resolved

#4965

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.snapshots.DedicatedClusterSnapshotRestoreIT.testIndexDeletionDuringSnapshotCreationInQueue

@codecov-commenter
Copy link

Codecov Report

Merging #5641 (c80e883) into main (b8c9b9a) will decrease coverage by 0.23%.
The diff coverage is 47.96%.

@@             Coverage Diff              @@
##               main    #5641      +/-   ##
============================================
- Coverage     71.13%   70.90%   -0.24%     
+ Complexity    58648    58630      -18     
============================================
  Files          4759     4773      +14     
  Lines        279468   280088     +620     
  Branches      40343    40394      +51     
============================================
- Hits         198803   198583     -220     
- Misses        64440    65308     +868     
+ Partials      16225    16197      -28     
Impacted Files Coverage Δ
...g/opensearch/common/cache/RemovalNotification.java 100.00% <ø> (ø)
...ch/common/cache/refcounted/stats/StatsCounter.java 0.00% <0.00%> (ø)
...search/index/store/remote/fc/CachedIndexInput.java 0.00% <0.00%> (ø)
...ch/index/store/remote/fc/FileCachedIndexInput.java 0.00% <0.00%> (ø)
...dex/store/remote/file/OnDemandBlockIndexInput.java 73.01% <ø> (ø)
...indices/fielddata/cache/IndicesFieldDataCache.java 80.19% <0.00%> (ø)
...arch/index/store/remote/utils/TransferManager.java 3.03% <4.76%> (+3.03%) ⬆️
...pensearch/common/util/ThrowableTranslateUtils.java 11.11% <11.11%> (ø)
...rg/opensearch/index/store/remote/fc/FileCache.java 13.04% <13.04%> (ø)
...ava/org/opensearch/common/collect/LinkedDeque.java 29.45% <29.45%> (ø)
... and 503 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

*
* @opensearch.internal
*/
public final class LinkedDeque<E extends Linked<E>> extends AbstractCollection<E> implements Deque<E> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know where this implementation comes from? It seems to be widely copied into projects on GitHub.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really what I'm getting at is whether we need to implement this data structure or if we can use something built into the JDK instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know where this implementation comes from? It seems to be widely copied into projects on GitHub.

No, I don't know where this file originally came from. But I see almost all cache implementations (Guava, Caffeine, ...) have this and no I'm not aware of any Java built in alternative

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andrross this copied from [1], which is the predecessor of [2], we should not be doing that (IMHO)

[1] https://github.com/ben-manes/concurrentlinkedhashmap/
[2] https://github.com/ben-manes/caffeine

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@reta There's now more details about why we're doing this in #4964. I would love to find a library implementation we can use for this and the caching logic in general, but at the moment we don't have one and I'd like to keep making incremental progress here. #6225 has been opened to track that effort.

Also, for additional context, this and other code in this PR has been adapted from UltraWarm (see the commit message) which is another reason why this implementation is our first approach.

What do you think?

Copy link
Collaborator

@reta reta Feb 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andrross sure, have a suggestion here (#5641 (comment)) on isolating this code (in separate library)

* @return the difference between this instance and {@code other}
*/

public CacheStats minus(CacheStats other) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the plus() and minus() methods used anywhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nope they are not used anywhere for now but there are few more methods which are not used but will be used once cache stats is exposed to nodes stats and maybe plus will be one of them. For now will remove plus and minus and if needed will add them back

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove the unused code, methods will be introduced when needed

* @opensearch.internal
*/
class LRUCache<K, V> implements RefCountedCache<K, V> {
private long capacity;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can all these fields be final?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

made them final

@dblock
Copy link
Member

dblock commented Jan 23, 2023

@aabukhalil What are we doing with this PR?

@aabukhalil
Copy link
Contributor Author

@aabukhalil What are we doing with this PR?

new commit should be uploaded today

@aabukhalil aabukhalil force-pushed the feature/searchable_snapshot_caching_01 branch from c80e883 to b940bb0 Compare January 25, 2023 01:46
@aabukhalil aabukhalil requested a review from gbbafna as a code owner January 25, 2023 01:46
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

* compatible open source license.
*/

package org.opensearch.index.store.remote.utils.cache;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nknize What do you think about the location for this code? I kind of like keeping it contained here until either it matures a bit or we replace it with something else. But marking it experimental in the opensearch-commons library essentially accomplishes the same thing, so I'm happy to defer to your preference.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I put my thoughts in the comments below..

@aabukhalil aabukhalil force-pushed the feature/searchable_snapshot_caching_01 branch 2 times, most recently from 49757a5 to ff26dde Compare February 9, 2023 00:18
@github-actions
Copy link
Contributor

github-actions bot commented Feb 9, 2023

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

github-actions bot commented Feb 9, 2023

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.indices.replication.SegmentReplicationStatsIT.testSegmentReplicationStatsResponse
      1 org.opensearch.cluster.allocation.AwarenessAllocationIT.testThreeZoneOneReplicaWithForceZoneValueAndLoadAwareness

@reta
Copy link
Collaborator

reta commented Feb 9, 2023

This is adding some general purpose caching classes to the :server module

The common namespace was something of a moving target so I suggested keeping it in server until the recent changes were finalized. If we keep the general purpose caching classes then opensearch-common is probably the right place for it.

@andrross @nknize since we are copying a large chunk of specialized code, may be we could move it off to separate module libs/opensearch-cache and not pollute core (or common) with cache implementation? It looks fairly isolated as well, should be doable.

*/
public class SegmentedCache<K, V> implements RefCountedCache<K, V> {

private static final int HASH_BITS = 0x7fffffff; // usable bits of normal ActivenessAwareCache hash
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ActivenessAwareCache, some leftovers from UltraWarm need clearing :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes :)

@nknize
Copy link
Collaborator

nknize commented Feb 9, 2023

...may be we could move it off to separate module libs/opensearch-cache and not pollute core (or common) with cache implementation?

I don't think we want to introduce a whole new opensearch-cache library artifact just yet since it's still very experimental.

...marking it experimental in the opensearch-commons library essentially accomplishes the same thing, so I'm happy to defer to your preference.

I would like to pull it out of :server and into a library so we can prevent tight coupling (inadvertent or not) w/ other classes in the server module but I think it's probably too soon since it's still very much tied to the remote store implementation only.

How about this: could we refactor the entire remote store implementation to a store-remote module in a follow up PR? Then assess whether to move the Cache components to :libs:opensearch-core?

I realize I didn't jump into the original PR conversation and raise the plugin / module question. Since it's already released as a part of 2.5 I think that takes away the plugin option.

@@ -136,7 +136,7 @@ public void testMultivaluedGeoPointsAggregation() throws Exception {
*
* @param geometry {@link Geometry}
* @param geoShapeDocValue {@link GeoShapeDocValue}
* @param intersectingWithBB
* @param intersectingWithBB enable intersectingWithBB
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

strange this change crept into this unrelated PR. spotless?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is failing on gradlew precommit

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you building with JDK 14? We've seen weirdness with JDK 14 enforcing some javadoc rules in tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes I'm building with JDK 14

* compatible open source license.
*/

package org.opensearch.index.store.remote.utils.cache;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I put my thoughts in the comments below..

@andrross
Copy link
Member

andrross commented Feb 9, 2023

How about this: could we refactor the entire remote store implementation to a store-remote module in a follow up PR? Then assess whether to move the Cache components to :libs:opensearch-core?

@nknize The generic cache and the more specific "FileCache" components can be moved around quite easily I think. As for making the whole thing a module, I'm sure that's possible but if my understanding is correct we'll have to make sure we have all the necessary extension points in the Plugin interface. I honestly don't know the scope of that but we can definitely explore it.

@aabukhalil aabukhalil force-pushed the feature/searchable_snapshot_caching_01 branch from ff26dde to f3f4f04 Compare February 9, 2023 22:01
@github-actions
Copy link
Contributor

github-actions bot commented Feb 9, 2023

Gradle Check (Jenkins) Run Completed with:

@nknize
Copy link
Collaborator

nknize commented Feb 9, 2023

I'm sure that's possible but if my understanding is correct we'll have to make sure we have all the necessary extension points in the Plugin interface

This sounds like a good follow on PR. I'm happy to help out here as well. Thoughts @reta?

@reta
Copy link
Collaborator

reta commented Feb 10, 2023

I'm sure that's possible but if my understanding is correct we'll have to make sure we have all the necessary extension points in the Plugin interface

This sounds like a good follow on PR. I'm happy to help out here as well. Thoughts @reta?

+1 from me as well

@andrross
Copy link
Member

I'm sure that's possible but if my understanding is correct we'll have to make sure we have all the necessary extension points in the Plugin interface

This sounds like a good follow on PR. I'm happy to help out here as well. Thoughts @reta?

Created #6281 to track this

Copy link
Member

@andrross andrross left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm good with this PR, though we've got some new items to follow up on going forward.

Copy link
Collaborator

@nknize nknize left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉 LGTM! Thx all!

Part of this commit was developed from code and concepts initially implemented
in Amazon OpenSearch Service as part of the UltraWarm feature. Thank you to the
following developers and the entire UltraWarm team.

Co-authored-by: Min Zhou <minzho@amazon.com>
Co-authored-by: Ankit Malpani <malpani@amazon.com>
Co-authored-by: Rohit Nair <rohinair@amazon.com>
Co-authored-by: Sorabh Hamirwasia <hsorabh@amazon.com>
Co-authored-by: Ankit Jain <akjain@amazon.com>
Co-authored-by: Tianru Zhou <tianruz@amazon.com>
Co-authored-by: Neetika Singhal <neetiks@amazon.com>
Co-authored-by: Amit Khandelwal <mkhnde@amazon.com>
Co-authored-by: Vigya Sharma <vigyas@amazon.com>
Co-authored-by: Prateek Sharma <shrprat@amazon.com>
Co-authored-by: Venkata Jyothsna Donapati <donapv@amazon.com>
Co-authored-by: Vlad Rozov <vrozov@amazon.com>
Co-authored-by: Mohit Agrawal <agramohi@amazon.com>
Co-authored-by: Shweta Thareja <tharejas@amazon.com>
Co-authored-by: Palash Hedau <phhedau@amazon.com>
Co-authored-by: Saurabh Singh <sisurab@amazon.com>
Co-authored-by: Piyush Daftary <pdaftary@amazon.com>
Co-authored-by: Payal Maheshwari <pmaheshw@amazon.com>
Co-authored-by: Kunal Khatua <kkhatua@amazon.com>
Co-authored-by: Gulshan Kumar <kumargu@amazon.com>
Co-authored-by: Rushi Agrawal <agrrushi@amazon.com>
Co-authored-by: Ketan Verma <vermketa@amazon.com>
Co-authored-by: Gaurav Chandani <chngau@amazon.com>
Co-authored-by: Dharmesh Singh <sdharms@amazon.com>

Signed-off-by: Ahmad AbuKhalil <abukhali@amazon.com>
@aabukhalil aabukhalil force-pushed the feature/searchable_snapshot_caching_01 branch from f3f4f04 to c6d46cf Compare February 10, 2023 18:39
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@andrross
Copy link
Member

Windows precommit failed with an out of memory error, so almost certainly unrelated. Retrying...

@andrross andrross merged commit 0ca51a7 into opensearch-project:main Feb 10, 2023
@opensearch-trigger-bot
Copy link
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-5641-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 0ca51a774211184835c4825dfeff38b23198352e
# Push it to GitHub
git push --set-upstream origin backport/backport-5641-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-5641-to-2.x.

andrross pushed a commit to andrross/OpenSearch that referenced this pull request Feb 10, 2023
…oject#5641)

Part of this commit was developed from code and concepts initially implemented
in Amazon OpenSearch Service as part of the UltraWarm feature. Thank you to the
following developers and the entire UltraWarm team.

Co-authored-by: Min Zhou <minzho@amazon.com>
Co-authored-by: Ankit Malpani <malpani@amazon.com>
Co-authored-by: Rohit Nair <rohinair@amazon.com>
Co-authored-by: Sorabh Hamirwasia <hsorabh@amazon.com>
Co-authored-by: Ankit Jain <akjain@amazon.com>
Co-authored-by: Tianru Zhou <tianruz@amazon.com>
Co-authored-by: Neetika Singhal <neetiks@amazon.com>
Co-authored-by: Amit Khandelwal <mkhnde@amazon.com>
Co-authored-by: Vigya Sharma <vigyas@amazon.com>
Co-authored-by: Prateek Sharma <shrprat@amazon.com>
Co-authored-by: Venkata Jyothsna Donapati <donapv@amazon.com>
Co-authored-by: Vlad Rozov <vrozov@amazon.com>
Co-authored-by: Mohit Agrawal <agramohi@amazon.com>
Co-authored-by: Shweta Thareja <tharejas@amazon.com>
Co-authored-by: Palash Hedau <phhedau@amazon.com>
Co-authored-by: Saurabh Singh <sisurab@amazon.com>
Co-authored-by: Piyush Daftary <pdaftary@amazon.com>
Co-authored-by: Payal Maheshwari <pmaheshw@amazon.com>
Co-authored-by: Kunal Khatua <kkhatua@amazon.com>
Co-authored-by: Gulshan Kumar <kumargu@amazon.com>
Co-authored-by: Rushi Agrawal <agrrushi@amazon.com>
Co-authored-by: Ketan Verma <vermketa@amazon.com>
Co-authored-by: Gaurav Chandani <chngau@amazon.com>
Co-authored-by: Dharmesh Singh <sdharms@amazon.com>

Signed-off-by: Ahmad AbuKhalil <abukhali@amazon.com>
(cherry picked from commit 0ca51a7)
andrross pushed a commit to andrross/OpenSearch that referenced this pull request Feb 10, 2023
…oject#5641)

Part of this commit was developed from code and concepts initially implemented
in Amazon OpenSearch Service as part of the UltraWarm feature. Thank you to the
following developers and the entire UltraWarm team.

Co-authored-by: Min Zhou <minzho@amazon.com>
Co-authored-by: Ankit Malpani <malpani@amazon.com>
Co-authored-by: Rohit Nair <rohinair@amazon.com>
Co-authored-by: Sorabh Hamirwasia <hsorabh@amazon.com>
Co-authored-by: Ankit Jain <akjain@amazon.com>
Co-authored-by: Tianru Zhou <tianruz@amazon.com>
Co-authored-by: Neetika Singhal <neetiks@amazon.com>
Co-authored-by: Amit Khandelwal <mkhnde@amazon.com>
Co-authored-by: Vigya Sharma <vigyas@amazon.com>
Co-authored-by: Prateek Sharma <shrprat@amazon.com>
Co-authored-by: Venkata Jyothsna Donapati <donapv@amazon.com>
Co-authored-by: Vlad Rozov <vrozov@amazon.com>
Co-authored-by: Mohit Agrawal <agramohi@amazon.com>
Co-authored-by: Shweta Thareja <tharejas@amazon.com>
Co-authored-by: Palash Hedau <phhedau@amazon.com>
Co-authored-by: Saurabh Singh <sisurab@amazon.com>
Co-authored-by: Piyush Daftary <pdaftary@amazon.com>
Co-authored-by: Payal Maheshwari <pmaheshw@amazon.com>
Co-authored-by: Kunal Khatua <kkhatua@amazon.com>
Co-authored-by: Gulshan Kumar <kumargu@amazon.com>
Co-authored-by: Rushi Agrawal <agrrushi@amazon.com>
Co-authored-by: Ketan Verma <vermketa@amazon.com>
Co-authored-by: Gaurav Chandani <chngau@amazon.com>
Co-authored-by: Dharmesh Singh <sdharms@amazon.com>

Signed-off-by: Ahmad AbuKhalil <abukhali@amazon.com>
(cherry picked from commit 0ca51a7)
Signed-off-by: Andrew Ross <andrross@amazon.com>
andrross added a commit that referenced this pull request Feb 10, 2023
Part of this commit was developed from code and concepts initially implemented
in Amazon OpenSearch Service as part of the UltraWarm feature. Thank you to the
following developers and the entire UltraWarm team.

Co-authored-by: Min Zhou <minzho@amazon.com>
Co-authored-by: Ankit Malpani <malpani@amazon.com>
Co-authored-by: Rohit Nair <rohinair@amazon.com>
Co-authored-by: Sorabh Hamirwasia <hsorabh@amazon.com>
Co-authored-by: Ankit Jain <akjain@amazon.com>
Co-authored-by: Tianru Zhou <tianruz@amazon.com>
Co-authored-by: Neetika Singhal <neetiks@amazon.com>
Co-authored-by: Amit Khandelwal <mkhnde@amazon.com>
Co-authored-by: Vigya Sharma <vigyas@amazon.com>
Co-authored-by: Prateek Sharma <shrprat@amazon.com>
Co-authored-by: Venkata Jyothsna Donapati <donapv@amazon.com>
Co-authored-by: Vlad Rozov <vrozov@amazon.com>
Co-authored-by: Mohit Agrawal <agramohi@amazon.com>
Co-authored-by: Shweta Thareja <tharejas@amazon.com>
Co-authored-by: Palash Hedau <phhedau@amazon.com>
Co-authored-by: Saurabh Singh <sisurab@amazon.com>
Co-authored-by: Piyush Daftary <pdaftary@amazon.com>
Co-authored-by: Payal Maheshwari <pmaheshw@amazon.com>
Co-authored-by: Kunal Khatua <kkhatua@amazon.com>
Co-authored-by: Gulshan Kumar <kumargu@amazon.com>
Co-authored-by: Rushi Agrawal <agrrushi@amazon.com>
Co-authored-by: Ketan Verma <vermketa@amazon.com>
Co-authored-by: Gaurav Chandani <chngau@amazon.com>
Co-authored-by: Dharmesh Singh <sdharms@amazon.com>

(cherry picked from commit 0ca51a7)

Signed-off-by: Ahmad AbuKhalil <abukhali@amazon.com>
Signed-off-by: Andrew Ross <andrross@amazon.com>
Co-authored-by: Ahmad AbuKhalil <abukhali@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Backport to 2.x branch skip-changelog
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Searchable Snapshot] Implement FileSystem Tracking LRU Cache with RefCounting
7 participants