[Feature Request] Introduce PULL replicas with segrep #11819
Comments
Would the read replicas go out of sync in the absence of the no-op replication? Any thoughts on how to keep them eventually in sync (with some delay)?
No, they won't. Checkpoint publication would still go through on refreshes/segment creation, and with segrep the PULL replicas will download the new segments. If for some reason a copy isn't able to and its replication lag increases, we would fail that shard copy, status quo.
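For illustration, here is a minimal sketch of how a PULL replica might react to a published checkpoint. All type names (`ReplicationCheckpoint`, `SegmentReplicationService`, `ShardFailureListener`) are hypothetical stand-ins, not the actual OpenSearch segment replication API:

```java
// Illustrative sketch only: every type here is hypothetical and not
// part of the actual OpenSearch segment replication API.
import java.time.Duration;
import java.time.Instant;

interface ReplicationCheckpoint { long segmentInfosVersion(); }

interface ReplicationCallback {
    void onDone();
    void onFailure(Exception e);
}

interface SegmentReplicationService {
    void startReplication(ReplicationCheckpoint checkpoint, ReplicationCallback callback);
}

interface ShardFailureListener {
    void failShard(String reason, Exception cause);
}

class PullReplicaCheckpointListener {
    private final SegmentReplicationService replicationService;
    private final ShardFailureListener failureListener;
    private final Duration maxLag;
    private volatile Instant lastCaughtUp = Instant.now();

    PullReplicaCheckpointListener(SegmentReplicationService replicationService,
                                  ShardFailureListener failureListener,
                                  Duration maxLag) {
        this.replicationService = replicationService;
        this.failureListener = failureListener;
        this.maxLag = maxLag;
    }

    // Called when the primary publishes a checkpoint on refresh/segment
    // creation; no per-operation no-op replication is involved.
    void onNewCheckpoint(ReplicationCheckpoint checkpoint) {
        replicationService.startReplication(checkpoint, new ReplicationCallback() {
            @Override
            public void onDone() {
                lastCaughtUp = Instant.now(); // copy caught up again
            }

            @Override
            public void onFailure(Exception e) {
                // Status quo behaviour: fail the copy once lag grows too large.
                if (Duration.between(lastCaughtUp, Instant.now()).compareTo(maxLag) > 0) {
                    failureListener.failShard("replication lag exceeded " + maxLag, e);
                }
            }
        });
    }
}
```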
Sounds good. This is a good optimization which comes into play when the number of replicas is high; we should pursue this.
What is the overhead here? My understanding is that this is a super lightweight message, with its only goal to ensure that the node sending the message is in fact still the primary. Could we achieve this optimization without introducing a new type of replica shard? I'm thinking of something like revisiting the "ensure I'm still the primary" check with something like "send the no-op replication message to min(3, num_replicas) replicas" or "broadcast the no-op replication to all replicas but continue once a quorum of responses has been received". These are off-the-top-of-my-head suggestions and may be very bad, but my broader point is whether we should focus the optimization on the no-op replication logic as opposed to introducing a new replica shard type.
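For concreteness, a rough sketch of the "continue once a quorum of responses has been received" variant, with a placeholder `NoOpSender` standing in for the real transport layer:

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

// Placeholder for "send a no-op replication message to one replica and
// complete when it acks"; in reality this would be a transport call.
interface NoOpSender {
    CompletableFuture<Void> sendNoOp(String replicaId);
}

class QuorumNoOpReplicator {
    private final NoOpSender sender;

    QuorumNoOpReplicator(NoOpSender sender) {
        this.sender = sender;
    }

    // Broadcast the no-op to every replica, but complete as soon as a
    // quorum of acks arrives so a single slow copy no longer gates the write.
    CompletableFuture<Void> replicateNoOp(List<String> replicaIds) {
        if (replicaIds.isEmpty()) {
            return CompletableFuture.completedFuture(null);
        }
        int quorum = replicaIds.size() / 2 + 1;
        CompletableFuture<Void> done = new CompletableFuture<>();
        AtomicInteger acks = new AtomicInteger();
        AtomicInteger failures = new AtomicInteger();
        for (String replicaId : replicaIds) {
            sender.sendNoOp(replicaId).whenComplete((v, e) -> {
                if (e == null) {
                    if (acks.incrementAndGet() >= quorum) {
                        done.complete(null);
                    }
                } else if (failures.incrementAndGet() > replicaIds.size() - quorum) {
                    done.completeExceptionally(e); // quorum is no longer reachable
                }
            });
        }
        return done;
    }
}
```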
Feels like a smart tradeoff to solve the search freshness lag with segrep. I am guessing we would be introducing a new search preference strategy as well, to ensure PULL replicas are preferred for search?
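If such a preference strategy were added, the routing decision could look roughly like the sketch below; the `isPullReplica` flag and the fallback behaviour are assumptions for illustration, not an existing OpenSearch preference:

```java
import java.util.List;
import java.util.Optional;

// Minimal stand-in for a shard routing entry; the isPullReplica flag is an
// assumption for illustration, not an existing OpenSearch field.
record ShardCopy(String nodeId, boolean isPrimary, boolean isPullReplica) {}

class PullReplicaFirstPreference {
    // Prefer a PULL replica for search; fall back to any active copy so
    // queries still succeed when no PULL replica is available.
    // Assumes activeCopies is non-empty.
    static ShardCopy select(List<ShardCopy> activeCopies) {
        Optional<ShardCopy> pull = activeCopies.stream()
            .filter(ShardCopy::isPullReplica)
            .findFirst();
        return pull.orElse(activeCopies.get(0));
    }
}
```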
Every indexing request has to necessarily touch all copies in a way that couples write and read availability. With a large number of replicas this would cause bulk latencies to increase as soon as one of the copies goes slow. To improve tail latencies and reduce chattiness we need a mechanism to mark certain replicas as PULL replicas, ensuring only the remaining copies can get promoted to primary. So indexing and failover flows both need to understand the protocol. I am all in for a simplified way to accomplish the same.
Is your feature request related to a problem? Please describe
With a large cluster setup and a replication group containing a lot of read replicas, no-op replication might become an overhead. Typically we would need replica shards for write availability and pull-replica shards for read availability.
So essentially we can have these two variants: replica shards, which keep receiving no-op replication and remain failover-eligible, and PULL replica shards, which only pull segments via segrep and serve reads.
Describe the solution you'd like
We don't need to perform no-op replication for operation-level durability on all copies, but only on a limited subset of failover copies to ensure they can become failover primaries.
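As a sketch of that split (all names hypothetical): the per-operation no-op fan-out would touch only the failover-eligible subset, while PULL copies rely solely on checkpoint publication:

```java
import java.util.List;
import java.util.stream.Collectors;

// Illustrative only: a copy is either failover-eligible (still receives
// per-operation no-ops) or a PULL copy (checkpoint publication only).
record Copy(String id, boolean failoverEligible) {}

class ReplicationGroupPartition {
    // The limited subset that must see every no-op so any of them can be
    // promoted to primary on failover.
    static List<Copy> noOpTargets(List<Copy> replicas) {
        return replicas.stream()
            .filter(Copy::failoverEligible)
            .collect(Collectors.toList());
    }

    // PULL copies: skipped by no-op replication, kept fresh by checkpoint
    // publication plus segment replication, never promoted on failover.
    static List<Copy> pullCopies(List<Copy> replicas) {
        return replicas.stream()
            .filter(c -> !c.failoverEligible())
            .collect(Collectors.toList());
    }
}
```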
Related component
Indexing:Replication
Describe alternatives you've considered
A separate reader & writer architecture is one option, but it would require significant infra changes.
Additional context
No response