Request for Comments for Cross-Cluster Replication #1

Closed
naveenpajjuri opened this issue Feb 22, 2021 · 1 comment

naveenpajjuri commented Feb 22, 2021

Overview

Today, Open Distro for Elasticsearch users don't have a native solution for replicating data across multiple clusters. With this repository, we are announcing an experimental release of cross-cluster replication. The key drivers for the new native replication feature are:

  • High Availability (HA): Cross-cluster replication ensures uninterrupted service availability with the ability to failover to an alternate cluster in case of failure or outages on the primary cluster.
  • Reduced Latency: Replicating data to a cluster that is closer to the application users reduces query latency.
  • Horizontal scalability: Splitting a query-heavy workload across multiple replica clusters improves application availability.
  • Aggregated reports: Enterprise customers can roll up reports continually from smaller clusters belonging to different lines of business into a central cluster for consolidated reports, dashboards or visualizations.

Design

Cross-Cluster Replication follows an active-passive replication model in which the follower cluster (the cluster the data is replicated to) pulls data from the leader (source) cluster.

The replication machinery is implemented as an Elasticsearch plugin that exposes APIs to control replication, spawns background persistent tasks to replicate indices asynchronously, and uses the snapshot repository abstraction to facilitate bootstrapping. For connectivity, replication relies on a cross-cluster connection set up from the follower cluster to the leader cluster. Once replication is initiated on an index, one background persistent task per primary shard on the follower cluster continuously polls the corresponding shard of the leader index and applies the changes to the follower shard. The cross-cluster replication plugin integrates with the Open Distro for Elasticsearch Security plugin for secure data transfer and access control.
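
To make the pull model concrete, here is a minimal in-memory sketch of what one per-shard task does: ask the leader shard for changes after the follower's checkpoint, apply them, and advance the checkpoint. This is only an illustration of the idea described above, not the plugin's implementation. The class and method names (`LeaderShard`, `FollowerShard`, `changes_since`, `poll_once`) and the batch size are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class LeaderShard:
    """Hypothetical stand-in for a primary shard on the leader cluster."""
    operations: list = field(default_factory=list)  # ordered change log

    def index(self, doc_id, source):
        # Record a write on the leader.
        self.operations.append((doc_id, source))

    def changes_since(self, checkpoint, batch_size):
        # Return up to batch_size operations that follow the first
        # `checkpoint` operations in the log.
        return self.operations[checkpoint:checkpoint + batch_size]


@dataclass
class FollowerShard:
    """Hypothetical stand-in for the matching shard on the follower cluster."""
    docs: dict = field(default_factory=dict)
    checkpoint: int = 0  # number of leader operations applied so far

    def poll_once(self, leader, batch_size=2):
        """One iteration of the per-shard task: pull a batch, then apply it."""
        batch = leader.changes_since(self.checkpoint, batch_size)
        for doc_id, source in batch:
            self.docs[doc_id] = source
        self.checkpoint += len(batch)
        return len(batch)


leader = LeaderShard()
for i in range(5):
    leader.index(f"doc-{i}", {"value": i})

follower = FollowerShard()
# Assumption: a real task would keep polling; this loop stops once caught up.
while follower.poll_once(leader):
    pass
```

Because the follower initiates every request, the leader only has to answer reads, which matches the active-passive model described above.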

Refer to the Request for Comments (RFC) document for detailed design.

Key User Stories

  • Users should be able to start replication for desired indices on the leader cluster onto the follower cluster.
  • Users should be able to start replication for indices matching a wildcard pattern on the leader cluster onto the follower cluster.
  • Users should be able to stop replication for desired indices on the follower cluster.
  • Users should be able to see the status of ongoing replication activities.
  • Users should be able to use the node-to-node encryption feature of the Open Distro for Elasticsearch Security plugin to encrypt cross-cluster replication traffic.
  • Users should be able to control access to replication activities via the Open Distro for Elasticsearch Security plugin.
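
As a rough illustration of the wildcard user story, the sketch below selects leader indices by pattern. It assumes shell-style matching from Python's standard `fnmatch` module, which also accepts `?` and `[...]` and may therefore differ from whatever matcher the plugin uses. The helper `select_indices` and the index names are hypothetical.

```python
from fnmatch import fnmatchcase


def select_indices(leader_indices, pattern):
    """Return the leader index names that match a wildcard pattern, sorted."""
    # fnmatchcase is case-sensitive on every platform.
    return sorted(name for name in leader_indices if fnmatchcase(name, pattern))


# Hypothetical index names on the leader cluster.
leader_indices = ["logs-2021.02.21", "logs-2021.02.22", "metrics-2021.02.22"]
selected = select_indices(leader_indices, "logs-*")
```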

Comments/feedback

Please note that cross-cluster replication is being released in experimental mode.

  • For any feedback on the problem statement or design, please leave your comments on this issue.
  • For any issues or specific feature requests, please report them by following the guidelines.

krishna-ggk changed the title from "cross-cluster-replication Key Features." to "Request for Comments for Cross-Cluster Replication" on Feb 22, 2021

vinoov commented Mar 4, 2021

Congratulations on the release of the feature.
