-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Remote Store] Inject remote store in IndexShard instead of RemoteStoreRefreshListener #3703
[Remote Store] Inject remote store in IndexShard instead of RemoteStoreRefreshListener #3703
Conversation
✅ Gradle Check success 8a3c14c5fe347910bcc5938b69d01352464e7263 |
✅ Gradle Check success 7b3a31d510b5dd02ddfd5c6244ca3e139826e92a |
@@ -3205,8 +3211,9 @@ private EngineConfig newEngineConfig(LongSupplier globalCheckpointSupplier) { | |||
|
|||
final List<ReferenceManager.RefreshListener> internalRefreshListener = new ArrayList<>(); | |||
internalRefreshListener.add(new RefreshMetricUpdater(refreshMetric)); | |||
if (remoteStoreRefreshListener != null && shardRouting.primary()) { | |||
internalRefreshListener.add(remoteStoreRefreshListener); | |||
if (remoteStore != null && shardRouting.primary()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe extract it into remoteStoreEnabled()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -329,7 +331,7 @@ public IndexShard( | |||
final RetentionLeaseSyncer retentionLeaseSyncer, | |||
final CircuitBreakerService circuitBreakerService, | |||
@Nullable final SegmentReplicationCheckpointPublisher checkpointPublisher, | |||
@Nullable final RemoteStoreRefreshListener remoteStoreRefreshListener | |||
@Nullable final Store remoteStore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if the Store
should be abstracted as a CompositeStore
that has data backed in remote. IndexShard shouldn't worry about exposing or maintaining another Store
that is remote
class CompositeStore extends Store
{
private Store localStore;
private Store remoteStore,
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need some time to think about this abstraction. Tracking here: #3719
✅ Gradle Check success 49412c44fd45b4c154f66620c5e2c7b2ee1c0c88 |
if (isRemoteStoreEnabled()) { | ||
Directory remoteDirectory = ((FilterDirectory) ((FilterDirectory) remoteStore.directory()).getDelegate()).getDelegate(); | ||
internalRefreshListener.add(new RemoteStoreRefreshListener(store.directory(), remoteDirectory)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I couldn't find where are we handling the reference counting for Store. The call to directory()
requires the Store to be Open(ref counter > 0) so this call would fail?
OpenSearch/server/src/main/java/org/opensearch/index/store/Store.java
Lines 208 to 211 in 3d4d5ca
public Directory directory() { | |
ensureOpen(); | |
return directory; | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When store is created, default refCount
is 1.
OpenSearch/libs/core/src/main/java/org/opensearch/common/util/concurrent/AbstractRefCounted.java
Lines 37 to 42 in 3d4d5ca
/** | |
* A basic RefCounted implementation that is initialized with a | |
* ref count of 1 and calls {@link #closeInternal()} once it reaches | |
* a 0 ref count | |
*/ | |
public abstract class AbstractRefCounted implements RefCounted { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @sachinpkale so when should Store
be closed for remote store as in reference count decremented or explicitly closed? I see IndexShard#closeShard closing the local store, do we need to handle close for remote store as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As of now, we are not closing remoteStore
. Let me add remoteStore close as a part of IndexShard.close()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For local store we have ShardStoreDeleter
as a listener that cleans up data, for remote we are deleting it as a part of close. Also when a shard relocates to another node, the shard engine I believe closes, would that mean we are removing data from the remote? I guess that would be no, but please verify if that could happen
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just have a feeling we might be leaking the Store on abrupt Engine failures?
OpenSearch/server/src/main/java/org/opensearch/index/engine/InternalEngine.java
Lines 2231 to 2273 in 492ffc0
/** | |
* Closes the engine without acquiring the write lock. This should only be | |
* called while the write lock is hold or in a disaster condition ie. if the engine | |
* is failed. | |
*/ | |
@Override | |
protected final void closeNoLock(String reason, CountDownLatch closedLatch) { | |
if (isClosed.compareAndSet(false, true)) { | |
assert rwl.isWriteLockedByCurrentThread() || failEngineLock.isHeldByCurrentThread() | |
: "Either the write lock must be held or the engine must be currently be failing itself"; | |
try { | |
this.versionMap.clear(); | |
if (internalReaderManager != null) { | |
internalReaderManager.removeListener(versionMap); | |
} | |
try { | |
IOUtils.close(externalReaderManager, internalReaderManager); | |
} catch (Exception e) { | |
logger.warn("Failed to close ReaderManager", e); | |
} | |
try { | |
IOUtils.close(translogManager.getTranslog()); | |
} catch (Exception e) { | |
logger.warn("Failed to close translog", e); | |
} | |
// no need to commit in this case!, we snapshot before we close the shard, so translog and all sync'ed | |
logger.trace("rollback indexWriter"); | |
try { | |
indexWriter.rollback(); | |
} catch (AlreadyClosedException ex) { | |
failOnTragicEvent(ex); | |
throw ex; | |
} | |
logger.trace("rollback indexWriter done"); | |
} catch (Exception e) { | |
logger.warn("failed to rollback writer on close", e); | |
} finally { | |
try { | |
store.decRef(); | |
logger.debug("engine closed [{}]", reason); | |
} finally { | |
closedLatch.countDown(); | |
} |
Can we add assertions in the tests to ensure the state of the remote store?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we should not delete data on close. I think I have removed the blobcontainer.delete()
as a part of some other commit which is not included in this PR. Let me add that.
I did not get the store leaking problem though. Are you talking about store or remoteStore?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In a case of abrupt Engine failure should we close the remote store similar to how a local store gets closed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But internal engine is not aware of the remoteStore. Its scope is limited to IndexShard.
fa4996f
to
2326fca
Compare
Build is failing with following exception:
Running on local worked fine. |
Signed-off-by: Sachin Kale <kalsac@amazon.com>
Signed-off-by: Sachin Kale <kalsac@amazon.com>
Signed-off-by: Sachin Kale <kalsac@amazon.com>
Signed-off-by: Sachin Kale <kalsac@amazon.com>
Signed-off-by: Sachin Kale <kalsac@amazon.com>
2326fca
to
3e0aa6b
Compare
…reRefreshListener (opensearch-project#3703) * Inject remote store in IndexShard instead of RemoteStoreRefreshListener Signed-off-by: Sachin Kale <kalsac@amazon.com> * Pass supplier of RepositoriesService to RemoteDirectoryFactory Signed-off-by: Sachin Kale <kalsac@amazon.com> * Create isRemoteStoreEnabled function for IndexShard Signed-off-by: Sachin Kale <kalsac@amazon.com> * Explicitly close remoteStore on indexShard close Signed-off-by: Sachin Kale <kalsac@amazon.com> * Change RemoteDirectory.close to a no-op Signed-off-by: Sachin Kale <kalsac@amazon.com> Co-authored-by: Sachin Kale <kalsac@amazon.com>
…reRefreshListener (opensearch-project#3703) * Inject remote store in IndexShard instead of RemoteStoreRefreshListener Signed-off-by: Sachin Kale <kalsac@amazon.com> * Pass supplier of RepositoriesService to RemoteDirectoryFactory Signed-off-by: Sachin Kale <kalsac@amazon.com> * Create isRemoteStoreEnabled function for IndexShard Signed-off-by: Sachin Kale <kalsac@amazon.com> * Explicitly close remoteStore on indexShard close Signed-off-by: Sachin Kale <kalsac@amazon.com> * Change RemoteDirectory.close to a no-op Signed-off-by: Sachin Kale <kalsac@amazon.com> Co-authored-by: Sachin Kale <kalsac@amazon.com>
…reRefreshListener (opensearch-project#3703) * Inject remote store in IndexShard instead of RemoteStoreRefreshListener Signed-off-by: Sachin Kale <kalsac@amazon.com> * Pass supplier of RepositoriesService to RemoteDirectoryFactory Signed-off-by: Sachin Kale <kalsac@amazon.com> * Create isRemoteStoreEnabled function for IndexShard Signed-off-by: Sachin Kale <kalsac@amazon.com> * Explicitly close remoteStore on indexShard close Signed-off-by: Sachin Kale <kalsac@amazon.com> * Change RemoteDirectory.close to a no-op Signed-off-by: Sachin Kale <kalsac@amazon.com> Co-authored-by: Sachin Kale <kalsac@amazon.com>
…reRefreshListener (opensearch-project#3703) * Inject remote store in IndexShard instead of RemoteStoreRefreshListener Signed-off-by: Sachin Kale <kalsac@amazon.com> * Pass supplier of RepositoriesService to RemoteDirectoryFactory Signed-off-by: Sachin Kale <kalsac@amazon.com> * Create isRemoteStoreEnabled function for IndexShard Signed-off-by: Sachin Kale <kalsac@amazon.com> * Explicitly close remoteStore on indexShard close Signed-off-by: Sachin Kale <kalsac@amazon.com> * Change RemoteDirectory.close to a no-op Signed-off-by: Sachin Kale <kalsac@amazon.com> Co-authored-by: Sachin Kale <kalsac@amazon.com>
…4380) * [Remote Store] Upload segments to remote store post refresh (#3460) * Add RemoteDirectory interface to copy segment files to/from remote store Signed-off-by: Sachin Kale <kalsac@amazon.com> Co-authored-by: Sachin Kale <kalsac@amazon.com> * Add index level setting for remote store Signed-off-by: Sachin Kale <kalsac@amazon.com> Co-authored-by: Sachin Kale <kalsac@amazon.com> * Add RemoteDirectoryFactory and use RemoteDirectory instance in RefreshListener Co-authored-by: Sachin Kale <kalsac@amazon.com> Signed-off-by: Sachin Kale <kalsac@amazon.com> * Upload segment to remote store post refresh Signed-off-by: Sachin Kale <kalsac@amazon.com> Co-authored-by: Sachin Kale <kalsac@amazon.com> Signed-off-by: Sachin Kale <kalsac@amazon.com> * [Remote Store] Inject remote store in IndexShard instead of RemoteStoreRefreshListener (#3703) * Inject remote store in IndexShard instead of RemoteStoreRefreshListener Signed-off-by: Sachin Kale <kalsac@amazon.com> * Pass supplier of RepositoriesService to RemoteDirectoryFactory Signed-off-by: Sachin Kale <kalsac@amazon.com> * Create isRemoteStoreEnabled function for IndexShard Signed-off-by: Sachin Kale <kalsac@amazon.com> * Explicitly close remoteStore on indexShard close Signed-off-by: Sachin Kale <kalsac@amazon.com> * Change RemoteDirectory.close to a no-op Signed-off-by: Sachin Kale <kalsac@amazon.com> Co-authored-by: Sachin Kale <kalsac@amazon.com> * [Remote Store] Add remote store restore API implementation (#3642) * Add remote restore API implementation Signed-off-by: Sachin Kale <kalsac@amazon.com> * [Remote Store] Add support to add nested settings for remote store (#4060) * Add support to add nested settings for remote store Signed-off-by: Sachin Kale <kalsac@amazon.com> * [Remote Store] Add rest endpoint for remote store restore (#3576) * Add rest endpoint for remote store restore Signed-off-by: Sachin Kale <kalsac@amazon.com> * [Remote Store] Add validator that forces segment replication type before enabling remote store (#4175) * Add validator that forces segment replication type before enabling remote store Signed-off-by: Sachin Kale <kalsac@amazon.com> * [Remote Store] Change remote_store setting validation message to make it more clear (#4199) * Change remote_store setting validation message to make it more clear Signed-off-by: Sachin Kale <kalsac@amazon.com> * [Remote Store] Add RemoteSegmentStoreDirectory to interact with remote segment store (#4020) * Add RemoteSegmentStoreDirectory to interact with remote segment store Signed-off-by: Sachin Kale <kalsac@amazon.com> * Use RemoteSegmentStoreDirectory instead of RemoteDirectory (#4240) * Use RemoteSegmentStoreDirectory instead of RemoteDirectory Signed-off-by: Sachin Kale <kalsac@amazon.com> Signed-off-by: Sachin Kale <kalsac@amazon.com> Co-authored-by: Sachin Kale <kalsac@amazon.com>
Signed-off-by: Sachin Kale kalsac@amazon.com
Description
RemoteStoreRefreshListener
intoIndexShard
where IndexService has the logic of creating an instance ofRemoteStoreRefreshListener
. As IndexShard already takes Store as one of the parameters, to make the behavior consistent, this change injects another instance ofStore
(created with RemoteDisrectory) to IndexShard.RepositoriesService
is injected intoRemoteDirectoryFactory
and factory takes care of fetching the repository for the given name.Issues Resolved
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.