Database Refactor - Better HA + Fault Tolerance #473

bonedaddy · 2020-04-13T05:16:42Z

Overview

In light of the temporary impact to IPFS HTTP API directory uploads caused by a database sync issue, we need to refactor the way our database tooling works. We need better HA and fault tolerance so that if the incident repeats, we can automatically fail over to a working database.

Our current database system consists of three nodes, all in logical replication. This lets us conduct a manual fail-over in the event of an incident, and ensures that we have replicated copies of our databases in addition to hourly backups. However, this isn't as smooth as it could be.

While this endeavour falls on me to accomplish, it has the help wanted label because this is an area of database administration I'm not familiar with, and I would welcome community input.

End Goals

Multi-master replication
Automatic fail-over
Load balanced requests
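To make the last two goals concrete, here is a rough sketch of what "automatic fail-over" plus "load balanced requests" could look like from the client side, assuming a multi-master setup where any healthy node can serve a request. Everything in it is hypothetical: the `NodePool` class, the `db-1`/`db-2`/`db-3` node names, and the `is_healthy` callback (a stand-in for a real check such as opening a connection). In practice a dedicated load balancer or the database's own tooling would probably do this job, so treat it as an illustration of the idea, not a proposed implementation.

```python
from itertools import count
from typing import Callable, Sequence


class NodePool:
    """Round-robin over database nodes, skipping any that fail a health check."""

    def __init__(self, nodes: Sequence[str], is_healthy: Callable[[str], bool]):
        if not nodes:
            raise ValueError("at least one node is required")
        self._nodes = list(nodes)
        # Hypothetical health probe; a real one might open a connection
        # and run a trivial query against the node.
        self._is_healthy = is_healthy
        self._counter = count()

    def pick(self) -> str:
        """Return the next healthy node, or raise if every node is down."""
        start = next(self._counter)
        # Walk the ring once, starting at the round-robin position,
        # so a failed node is skipped instead of failing the request.
        for offset in range(len(self._nodes)):
            node = self._nodes[(start + offset) % len(self._nodes)]
            if self._is_healthy(node):
                return node
        raise RuntimeError("no healthy database node available")


# Example: db-2 is down, so requests rotate over the remaining nodes.
down = {"db-2"}
pool = NodePool(["db-1", "db-2", "db-3"], lambda node: node not in down)
picks = [pool.pick() for _ in range(4)]
```

Requests are spread across nodes while everything is healthy, and a failed node is skipped without manual intervention. The sketch only makes sense if every node accepts writes, which is why multi-master replication is listed first.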

Research

track research notes and such

Possible Implementations

Will contain analysis, pros, cons, etc. of the available solutions

Standby Databases

Clusters

DRBD (Distributed Replicated Block Device)

Corosync + Pacemaker + DRBD

Pgpool II

Citus CE

Postgres-XL

CockroachDB

On-prem deployment

Bucardo

Links


bonedaddy · 2020-04-28T09:15:06Z

Going to use CockroachDB as it appears to be the easiest to maintain. The other solutions seem to require a pretty solid understanding of databases and general DBA stuff, which I definitely do not have; I dropped out of a college program to become a DBA, so yeah 😂

bonedaddy self-assigned this Apr 13, 2020

bonedaddy added the help wanted, priority:high, and type:improvement labels Apr 13, 2020
