[Searchable Snapshots] Add a new node role for remote search capabilities #4652
Comments
Is this an important static capability, or is it an optimization that can be modeled with a dynamic node role that doesn't always exist and is simply preferred? Meaning, if no node was a "remote searcher", could you fall back to a "data" node? If that's the case then you can use #3436 with no code changes.
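The "preferred role with fallback" idea in the question above can be sketched roughly as follows. This is a minimal illustration in Python, not OpenSearch code: the `Node` type, the `remote_searcher` role string, and the `pick_nodes` helper are hypothetical names introduced here, and the behavior of #3436 is not modeled.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    # Hypothetical stand-in for a cluster node: a name plus a set of role names.
    name: str
    roles: frozenset = field(default_factory=frozenset)


def pick_nodes(nodes, preferred="remote_searcher", fallback="data"):
    """Return nodes that carry the preferred role; if none exist, fall back."""
    preferred_nodes = [n for n in nodes if preferred in n.roles]
    if preferred_nodes:
        return preferred_nodes
    # No node advertises the preferred role, so use the fallback role instead.
    return [n for n in nodes if fallback in n.roles]


cluster = [
    Node("n1", frozenset({"data"})),
    Node("n2", frozenset({"data", "ingest"})),
]
fallback_names = [n.name for n in pick_nodes(cluster)]

cluster_with_searcher = cluster + [Node("n3", frozenset({"remote_searcher"}))]
preferred_names = [n.name for n in pick_nodes(cluster_with_searcher)]
```

With no "remote searcher" present the selection degrades to the data nodes; once one exists, only it is chosen. Whether such a fallback is desirable is exactly what the replies below debate.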
This is a required capability: a node will need additional configuration to perform as a remote searcher. We would have to design specifically for a fallback scenario, and it will not work with the current design.
My intuition is that we want this as an important static capability. I believe it's possible to design the system so that a regular data node can fall back to acting as a remote searcher, but it is generally a sub-optimal setup. The static role would require a user to be intentional about such a setup by applying both roles to a given node. Having typed all of that, I suspect the same is true of any of the dynamic roles that are used to select "preferred" nodes, so I'm open to being convinced otherwise.
@kotwanikunal What are the things that would go into the additional node configuration? @andrross Maybe my question implied fallback too much. We can also use the dynamic node capability without fallback (fail as fallback). Reading the issue, the tl;dr difference between a remote search node and a regular data node is that the entire shard will not be downloaded onto the remote search node, correct? Is there more to this node? Does it need to be a first-class node type? Finally, is there a better name than "remote searcher"? Is "search" a better name for this, one that can in the future collect other search-only capabilities?
The intent is for the role to serve two purposes:
@andrross Thanks, makes sense. I do want to try to explain why I am suggesting not introducing a "remote search" role: it's a kind of "search node". I suspect that in the grand scheme of things users want to separate and independently scale indexing from searching. It would be real simple to think of these as "index" and "search" roles, with OpenSearch making decisions such as "a node is both index and search, therefore shards are downloaded to the node" vs. "a node is just search, and therefore only remote shards are allocated to the node and part of the disk is reserved for caching". Am I over-simplifying this? When we've implemented all known storage and search ideas that are already discussed out there, what will this picture look like, and will a "remote search" node still make sense?
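The two decisions quoted in the comment above can be written down as a small rule table. This is only a sketch of the proposal as worded there, in Python for brevity; `shard_placement` and the returned labels are hypothetical, and combinations the comment does not mention are deliberately left undecided.

```python
def shard_placement(roles):
    """Map a node's roles to the placement described in the comment above."""
    roles = set(roles)
    if {"index", "search"} <= roles:
        # "a node is both index and search, therefore shards are downloaded"
        return "download full shards"
    if "search" in roles:
        # "a node is just search": remote shards only, part of disk is cache
        return "remote shards with local cache"
    # Any other combination is not covered by the comment, so stay undecided.
    return None


both = shard_placement(["index", "search"])
search_only = shard_placement(["search"])
index_only = shard_placement(["index"])
```

Under this reading the role set alone determines placement, which is the simplification the comment is asking about.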
I do expect there to be separate "index" and "search" nodes as has been discussed. It's a fair question whether the "remote" aspect of it should be baked into the role. The purpose of the "remote" part is to ensure the requisite cache configuration has been supplied, but we could either define reasonable defaults or fail at runtime if the required configuration is not present. I do like the simplicity of "index" and "search". (Note that there may well be a remote variant of indexers as well, i.e. indexers that index to local disk and replicate via seg rep versus indexers that write directly to a remote store and searchers access the remote data. If we go with the "remote search" role then we're potentially looking at 4 distinct roles.)
I would be much more comfortable with a "search" role vs. "remote search", as a node optimized for search, with other parameters such as "MB of disk reserved for cache" becoming configuration that is independent of the role (but may have some semantics like "cannot be enabled on a node that's not 'search'").
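The constraint suggested in the comment above, a cache setting that is independent of the role but only valid on a "search" node, could look roughly like this. It is a sketch under assumptions: `validate_node_config` and the `cache_mb` parameter are hypothetical names, not OpenSearch settings.

```python
def validate_node_config(roles, cache_mb=0):
    """Return the cache reservation in MB, rejecting it on non-search nodes."""
    if cache_mb < 0:
        raise ValueError("cache_mb must not be negative")
    if cache_mb > 0 and "search" not in set(roles):
        # The setting is separate from the role, but it depends on the role.
        raise ValueError("cache reservation requires the 'search' role")
    return cache_mb


search_cache = validate_node_config(["search"], cache_mb=512)
data_cache = validate_node_config(["data"])  # no cache requested, so valid

try:
    validate_node_config(["data"], cache_mb=512)
    rejected = False
except ValueError:
    rejected = True
```

Keeping the check in configuration validation rather than in the role name means a misconfigured node fails early without adding a separate "remote search" role.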
Is your feature request related to a problem? Please describe.
REMOTE_SEARCHER node role
Describe the solution you'd like
cluster_manager, data, ingest listed here.
Additional context