Implement Retries on Transient Roomlog DB Connection Errors #10776
Problem:
Previously, any unexpected DB connection termination resulted in lost logs, since we only retried on `42P01` (table not found). Other transient failures (e.g., “Connection terminated unexpectedly”) were crashlogged without a retry, causing data loss.

Solution:
Introduce a bounded retry mechanism (3 attempts) specifically for transient “Connection terminated unexpectedly” errors. After a brief wait, the query is re-run. If `ignoreFailure` is `true`, no retry is attempted. In that case, or once the retries are exhausted, the function relies on crashlogs as before.

Benefits:
- `ignoreFailure`: Honors the `ignoreFailure` flag to bypass retries when log insertion isn't critical.
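
The retry flow described under Solution could be sketched roughly as follows. This is an illustrative TypeScript sketch, not the code from this PR: `insertWithRetry`, `isTransientConnectionError`, `reportCrash`, and the 100 ms delay are hypothetical names and values, and "3 attempts" is assumed to mean three tries in total. Only the error message, the attempt bound, and the `ignoreFailure` flag come from the description.

```typescript
// Hypothetical sketch of a bounded retry for transient connection errors.
const MAX_ATTEMPTS = 3; // assumed to mean three tries in total
const RETRY_DELAY_MS = 100; // assumed; the description only says "a brief wait"

// True only for the transient error this change targets.
function isTransientConnectionError(err: unknown): boolean {
	return err instanceof Error && err.message.includes('Connection terminated unexpectedly');
}

const sleep = (ms: number) => new Promise<void>(resolve => setTimeout(resolve, ms));

// Runs `runQuery`, retrying on transient connection errors unless `ignoreFailure` is set.
// Any error that is not retried is handed to `reportCrash` (standing in for the crashlogger).
async function insertWithRetry<T>(
	runQuery: () => Promise<T>,
	ignoreFailure: boolean,
	reportCrash: (err: unknown) => void,
	delayMs: number = RETRY_DELAY_MS
): Promise<T | undefined> {
	for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
		try {
			return await runQuery();
		} catch (err) {
			const canRetry = !ignoreFailure &&
				isTransientConnectionError(err) &&
				attempt < MAX_ATTEMPTS;
			if (!canRetry) {
				// Not transient, retries exhausted, or the caller opted out: crashlog as before.
				reportCrash(err);
				return undefined;
			}
			await sleep(delayMs);
		}
	}
	return undefined;
}
```

Passing the delay as a parameter keeps the wait easy to tune or to zero out in tests. Matching on the error message is brittle, so a real implementation may prefer a driver-specific error code if one is available.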