Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dangling transactions after 'failed to deallocate cached statement' errors #2100

Open
Valodim opened this issue Aug 1, 2024 · 1 comment
Labels

Comments

@Valodim
Copy link

Valodim commented Aug 1, 2024

Hey there. First of all, thank you for all your work on this library 👍

Describe the bug

We are using pgx in a fairly typical backend application, that serves requests to clients via grpc. We use pgx as a driver via sqlx. Starting with version 5.5.4, we started to notice some problems:

  • Our production logs contained lines "failed to deallocate cached statement: conn closed". Roughly it was a couple hundred times a day, for a couple hundred thousand requests a day where each request is a transaction.
  • This happened predominantly (but not exclusively) in streaming requests. Those are ended when cancelled by the user, so we assume it's some race condition around cancelled context behavior.
  • The logged error happens from a call to tx.Rollback(), which is called from a defer that is set up with the transaction and run at the end of the cancelled rpc method.
  • We also started to see dangling transactions left in an idle in transaction state, indefinitely (hours), causing issues particularly when they held on to locks. While this is not definitely related to the log entries, it started happening at the same time and the log entries indicate an error during transaction cleanup.

It seems like something goes wrong around transaction cleanup, but we can't tell what's going on at all.

To Reproduce

Unfortunately, we were unable to reproduce this bug outside our production environment, which also isn't source available. :( Rolling back to 5.5.3 "fixed" the issue for us, so it might be caused by 046f497. We tried some things in our codebase around transaction behavior and made sure we begin/end transactions consistently throughout, to no avail. If you have any ideas, we'd be happy to try.

Version

  • Go: $ go version -> go version go1.21.5 linux/amd64
  • PostgreSQL: $ psql --no-psqlrc --tuples-only -c 'select version()' -> PostgreSQL 15.7 on x86_64-pc-linux-musl, compiled by gcc (Alpine 13.2.1_git20240309) 13.2.1 20240309, 64-bit
  • pgx: $ grep 'github.com/jackc/pgx/v[0-9]' go.mod -> v5.5.4
@Valodim Valodim added the bug label Aug 1, 2024
@Valodim Valodim changed the title dangling transactions after f'failed to deallocate cached statement' errors dangling transactions after 'failed to deallocate cached statement' errors Aug 1, 2024
@jackc
Copy link
Owner

jackc commented Aug 3, 2024

I do not think that failed to deallocate cached statement: conn closed is directly connected to dangling transactions -- after all the connection is closed -- it can't be holding open a connection. I suppose it could be worth checking that the connection actually is closed on the server though.

I suppose it's possible this has something to do with the context cancellation. But because of the difficulty in reliably interrupting a query and recovering the connection, pgx simply closes the connection when a query is interrupted. So it would seem there's not much that could go wrong there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants