Digital Ocean Managed Postgre Primary Failover #1026

bjg2 · 2021-02-09T10:27:33Z

Hey everyone!

In my org, we use Digital Ocean Managed Postgres with a setup of 1 primary and 2 standbys. Digital Ocean does not provide hosts for all 3 nodes, just one host that always points to the primary node (something like XXX.a.db.ondigitalocean.com). Sometimes, by design, a primary failover happens: the primary node becomes a secondary, and one of the secondaries becomes the primary.

The problem is that when this happens, connections seem to remain open to the node that was the primary and has just become a secondary. From then on, those connections fail with `write tcp YYY->ZZZ: write: connection timed out` and never recover.

The mitigation I came up with was `db.SetConnMaxLifetime(time.Minute)`, but that's not ideal. Is there a better way around this problem at the moment?

PS: I saw a similar issue, #683, but I don't think it applies to our problem, as we are not given multiple hosts, just one host string that points to the current primary.


Lekensteyn · 2021-02-22T18:16:32Z

Possibly a duplicate of #835.

bjg2 · 2022-06-20T10:33:02Z

We need to find a better way to mitigate this, as we noticed that a short connection lifetime has a really bad performance impact. Should we expect this issue to be fixed by #1013? It is very hard to test, as DO does not provide an interface for triggering a primary failover...

zak905 · 2024-02-26T12:39:01Z

Any updates?
@bjg2 I'd be interested to know whether you have found new ways to handle this issue.

bjg2 · 2024-02-27T17:22:09Z

I don't think we've had that issue for a long time. I guess pq fixed it with the fix linked above.

bjg2 changed the title ~~Digital Ocean Managed Postgre Failover~~ Digital Ocean Managed Postgre Primary Failover Feb 9, 2021

bjg2 closed this as completed Feb 27, 2024
