-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add RemoteDirectory instance to Store as a secondary directory #3285
Add RemoteDirectory instance to Store as a secondary directory #3285
Conversation
this(shardId, indexSettings, directory, null, shardLock, OnClose.EMPTY); | ||
} | ||
|
||
public Store(ShardId shardId, IndexSettings indexSettings, Directory directory, Directory secondaryDirectory, ShardLock shardLock) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 extra constructor methods due to secondaryDirectory
. Should we use a builder instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we use a builder instead?
Yes, and the builder should validate the arguments and throw a RuntimeException
if the REMOTE_STORE
feature flag is not set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the latest commit, I have removed extra parameter to Store
. So, this is not required anymore.
Fixing |
399f1df
to
bcfd328
Compare
❌ Gradle Check failure 399f1dff56b54dc3010b29e46b523e29c478ed9d |
Directory remoteDirectory = null; | ||
if (this.indexSettings.isRemoteStoreEnabled()) { | ||
try { | ||
Repository repository = repositoriesService.repository("dragon-stone"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This repository name is hardcoded for now. It would be made configurable as a part of #3286
Please refer this comment to understand why we went with having |
if (secondaryDirectory != null) { | ||
ByteSizeCachingDirectory sizeCachingSecondaryDir = new ByteSizeCachingDirectory(secondaryDirectory, refreshInterval); | ||
this.secondaryDirectory = new StoreDirectory(sizeCachingSecondaryDir, Loggers.getLogger("index.store.deletes", shardId)); | ||
} else { | ||
this.secondaryDirectory = null; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should we have a Directory factory instead instead of always having two directories always. Primary and secondary directory sounds confusing as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is your view on having a separate store instance with RemoteDirectory
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The presence of branching logic with primary / secondary is confusing, please consider subclassing the Store
(fe CachedStore
) to have a clean separation between regular Store
and Store
backed by a cache / secondary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I think more on the comments from @Bukhtawar and @eirsep, we may need to have a CompositeDirectory
which will have 2 directories (we can create compositions for multiple directories if required). CompositeDirectory
will be abstract and its implementations can define how to handle operations on the encapsulated directories. Implementation can be parallel or sequential.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Going forward I do feel a need to have an abstraction at a Store
level, espl with #2900 where we might extend search functionality directly on a remote store. Then we have to perform StoreRecovery
operations when the shard data is lost, integrity, checksums which is what the store abstraction provides. How about a CompositeStore
with each Store based out of their corresponding Directory
implementation and IndexShard
interfacing with this CompositeStore
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I think more, going either with CompositeStore
or CompositeDirectory
would be too early. I was able to get a quick implementation out with CompositeDirectory but it would require most of the RemoteDirectory methods to be implemented before we can actually start using it. Using these abstractions will not avoid having a reconciliation mechanism between local and remote directory. I made some changes in the latest commit and Store
is not changed in the latest change. Please take a look.
remoteDirectory = remoteDirectoryFactory.newDirectory(this.indexSettings, path, repository); | ||
} catch (RepositoryMissingException e) { | ||
throw new IllegalArgumentException( | ||
"Repository should be created before creating index with remote_store enabled setting", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is repository created by user or by system?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As of now, it is created by user. Once we make the name configurable as a part of #3286, we can check if the repository can be created during bootstrap.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, you're saying as of now it will have to be created by user but he can only name it as "dragon-stone"? That doesn't seem correct right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -174,6 +174,7 @@ public class Store extends AbstractIndexShardComponent implements Closeable, Ref | |||
|
|||
private final AtomicBoolean isClosed = new AtomicBoolean(false); | |||
private final StoreDirectory directory; | |||
private final StoreDirectory secondaryDirectory; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can store not have to worry about how many directories are there and just have one StoreDirectory which is extended by a Remote+Local store Directory wrapper?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is secondaryDirectory
just a cache? It looks like it, so why not to call it as such?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check the latest changes. I have removed reference to secondaryDirectory from the Store
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to incorporate the REMOTE_STORE feature flag and throw a RuntimeException
if the system is trying to bootstrap the remote directories with the feature flag disabled. Bypassing the feature flag here is trappy.
this(shardId, indexSettings, directory, null, shardLock, OnClose.EMPTY); | ||
} | ||
|
||
public Store(ShardId shardId, IndexSettings indexSettings, Directory directory, Directory secondaryDirectory, ShardLock shardLock) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we use a builder instead?
Yes, and the builder should validate the arguments and throw a RuntimeException
if the REMOTE_STORE
feature flag is not set.
import java.util.Collections; | ||
|
||
import static org.mockito.ArgumentMatchers.any; | ||
import static org.mockito.Mockito.*; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No wildcard imports
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replaced with individual imports. I will check if we can add this check to spotless
ce5f6cd
to
d13a660
Compare
Yes, while creating instance of RemoteDirectory, I am checking if remote_store setting is enabled or not. As the setting itself is not active if feature flag is not set, we don't need explicit feature flag check, right? |
…hListener Signed-off-by: Sachin Kale <kalsac@amazon.com>
d13a660
to
207941d
Compare
|
||
@Override | ||
public void afterRefresh(boolean didRefresh) throws IOException { | ||
// ToDo Add implementation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actual implementation will be added in the next commit
✅ Gradle Check success d13a660cf4f666d36f222c3b54f2b2f00f5b953b |
|
start gradle check |
public class RemoteDirectoryFactory implements IndexStorePlugin.RemoteDirectoryFactory { | ||
|
||
@Override | ||
public Directory newDirectory(IndexSettings indexSettings, ShardPath path, Repository repository) throws IOException { | ||
assert repository instanceof BlobStoreRepository : "repository should be instance of BlobStoreRepository"; | ||
BlobPath blobPath = new BlobPath(); | ||
blobPath = blobPath.add(indexSettings.getIndex().getName()).add(String.valueOf(path.getShardId().getId())); | ||
BlobContainer blobContainer = ((BlobStoreRepository) repository).blobStore().blobContainer(blobPath); | ||
return new RemoteDirectory(blobContainer); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the factory supposed to vend out another variation based on the inputs? If not do we need a factory?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, there will be a directory instance per shard as path will be different per directory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wondering if that qualifies to be a factory f.e look at FsDirectoryFactory
which creates different a HybridDirectory
or NIOFSDirectory
based on certain attributes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. You mean, if it is only one variant each time, we do not need a factory, right?
I don't have very strong inclination towards having a factory. It is mostly to keep things similar to what we have with FS based factories. It hides the complexity of understanding all the metadata and creating BlobContainer
. Without factory, the code would be in the constructor of RemoteDirectory
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we call it a Builder
instead?
final List<ReferenceManager.RefreshListener> internalRefreshListener; | ||
if (remoteStoreRefreshListener != null && shardRouting.primary()) { | ||
internalRefreshListener = Arrays.asList(new RefreshMetricUpdater(refreshMetric), remoteStoreRefreshListener); | ||
} else { | ||
internalRefreshListener = Collections.singletonList(new RefreshMetricUpdater(refreshMetric)); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets push this down to IndexShard
and pass a NoopRemoteStoreRefreshListener
as needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not get this. You mean push it down to EngineConfig
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I meant IndexServices that create IndexShard
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this to avoid null check? As RemoteStoreRefreshListener
is injected into IndexShard, tests can have the NoopRemoteStoreRefreshListener
if does not want to actually upload the data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am fine with what we have now, but if the checks grow at multiple place we might want to give this a revisit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is identical to the listener for segrep - here. What about building a list of additional listeners in IndexService and pass directly to IndexShard?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Bukhtawar Agree, adding a task for the same.
@mch2 Yes, once we merge with SegRep code, we can abstract it out.
4408e6b
into
opensearch-project:feature/durability-enhancements
…hListener (opensearch-project#3285) Co-authored-by: Sachin Kale <kalsac@amazon.com> Signed-off-by: Sachin Kale <kalsac@amazon.com>
Signed-off-by: Sachin Kale kalsac@amazon.com
Description
primaryDirectory
andsecondaryDirectory
. The idea is to encapsulate RemoteDirectory in Store instead of creating a separate instance of Store, remoteStore.Issues Resolved
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.