This repository has been archived by the owner on Jan 13, 2021. It is now read-only.
Database Refactor - Better HA + Fault Tolerance #473
Labels
help wanted (Extra attention is needed)
priority:high (High priority)
type:improvement (New feature or request)
Overview
In light of the temporary impact to IPFS HTTP API directory uploads caused by a database sync issue, we need to refactor the way our database tooling works. We need better high availability (HA) and fault tolerance so that if the incident repeats, we can automatically fail over to a working database.
Our current database system consists of three nodes, all in logical replication. This allows us to conduct manual fail-over in the event of an incident, and it ensures that we have replicated copies of our databases in addition to hourly backups. However, this isn't as smooth as it could be.
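To make the goal of automatic fail-over concrete, here is a minimal sketch of the node-selection step: walk an ordered list of database nodes and return the first one that passes a health check. Everything in it is an assumption for illustration only. The function name `pick_database`, the injected `is_healthy` probe, and the host names are hypothetical and do not come from our codebase or from any of the tools listed under Research.

```python
from typing import Callable, Optional, Sequence


def pick_database(
    nodes: Sequence[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first node that passes the health check, or None."""
    for node in nodes:
        try:
            if is_healthy(node):
                return node
        except Exception:
            # A probe that raises is treated the same as an unhealthy node.
            continue
    return None


# Hypothetical host names, listed in order of preference.
nodes = ["db-1.example.internal", "db-2.example.internal", "db-3.example.internal"]

# Stand-in health check: pretend the first node is down.
down = {"db-1.example.internal"}
print(pick_database(nodes, lambda node: node not in down))
```

This only covers choosing a node to talk to. It does not deal with replication lag, write conflicts between logically replicated nodes, or split-brain, which is where the solutions under Research would come in.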
While this endeavour falls on me to accomplish, it has the help wanted label, as this is an area of database administration I'm not familiar with, and I would welcome community input.
End Goals
Research
Possible Implementations
Will contain analysis, pros, cons, etc... of the available solutions
Standby Databases
Clusters
DRBD (Distributed Replicated Block Device)
Pgpool II
Citus CE
Postgres-XL
CockroachDB
Bucardo
Links