
XXX: Hack statement ID to be random instead of sequential #1740

Closed
wants to merge 1 commit from mq/hack_random-statement-id

Conversation

mareq
Contributor

@mareq mareq commented Mar 9, 2022

Hi, we are trying to use PgBouncer, but there is a (known?) problem with duplicate IDs of prepared statements used by distinct client instances sharing database connections (related issue: #67 (comment)).

One nasty, hacky way to get around the problem (which works most of the time, statistically) is to use random statement IDs instead of a sequence. This is obviously not the real solution; it is only a proof of concept to verify that this is the root cause.

So I am not really looking to merge this PR. Instead, I wanted to ask what your thoughts are on this. What would you see as the best approach to resolving the problem?
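The hack in question can be sketched in std-only Rust. Note this is a hypothetical model, not the actual PR diff: the `sqlx_s_` naming scheme and the use of `RandomState` as a dependency-free entropy source are assumptions for illustration.

```rust
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hasher};

/// Produce a random prepared-statement name instead of "sqlx_s_{counter}".
fn random_statement_id() -> String {
    // Each RandomState is seeded with fresh process-level entropy, so
    // successive calls yield effectively independent 64-bit values.
    let n = RandomState::new().build_hasher().finish();
    format!("sqlx_s_{}", n)
}

fn main() {
    let a = random_statement_id();
    let b = random_statement_id();
    // Collisions remain possible, just statistically unlikely.
    println!("{a} {b}");
}
```

As the comment above says, this only makes collisions unlikely; it does not eliminate them.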

Use random statement IDs instead of sequence to avoid conflicts (almost
100% of the time). This obviously is not the solution to the problem.
It is only the proof of concept to verify that this is the root cause
of the problem.
@mareq force-pushed the mq/hack_random-statement-id branch from a74e03f to 5543374 on March 9, 2022 at 15:55
@abonander
Collaborator

abonander commented Mar 9, 2022

Serious question: why use PgBouncer? It doesn't seem to offer anything special over using a connection pool in the application. I suppose it makes it easier to pool connections from multiple applications, and it lets you pretend to have more than the maximum by multiplexing requests, but it just seems like a really hacky way to do it in general.

In turn you seem to give up one really powerful feature: prepared statements, along with a bunch of other nice-to-have features like advisory locks and LISTEN/NOTIFY.

I think the ideal solution to me is for us to somehow detect when we're talking to PgBouncer and just stop caching prepared statements. But that's not quite enough as the Prepare and Execute messages could go to two different sessions. If it's in transaction pooling mode we could just always prepare and execute in a transaction.

In statement pooling mode, though, we're kind of out of luck.

@abonander
Collaborator

@mareq did you see my above comment?

@mareq
Contributor Author

mareq commented Mar 24, 2022

@abonander Yes, sorry, I got side-tracked. You are making a valid point there. We have managed to dodge the issue by essentially throwing more money at the problem, so PgBouncer is no longer needed. We still need to investigate our root cause, and based on that we will come up with a resolution, so I do not know yet whether it will involve PgBouncer or not.

In any case, thank you for your help 👍. If you do not mind keeping this open for the time being, I will come back once I have something to share. Otherwise please reject this PR and I will eventually open a new one if and when needed.

@greyblake

@mareq Thank you! I may use this hack too.

@abonander

Serious question: why use PgBouncer?

In my case I have no choice. My application needs to query the NEAR Explorer DB (https://github.com/near/near-indexer-for-explorer) and it looks like they use PgBouncer. This causes quite a few undesired errors in my app in production :/

@abonander
Collaborator

abonander commented Apr 4, 2022

I think the best advice I can give if you have no choice but to use PgBouncer is to avoid prepared statements entirely. If you invoke methods of Executor directly with query strings, it will not use prepared statements. You can't use bind arguments this way, sadly. That's inherently tied to the prepared statement interface.

Unfortunately, the way PgBouncer's transaction pooling mode is designed, I don't think there's anything we can do from the client-side to ever guarantee with any level of confidence that prepared statements will work correctly.

The one exception would be to always begin a transaction before using the prepared statement interface via sqlx::query() or sqlx::query!() et al, which ensures that you have a single session assigned for the duration of the transaction. However, that only works if you know for sure that the PgBouncer instance is in transaction pooling mode and not statement pooling mode.

In that case, we should provide some way to ensure that prepared statement IDs don't collide, but I don't know if I like just generating random IDs and hoping for the best. It seems inelegant, makes the IDs on average longer than they need to be, doesn't help with debugging, and has a nonzero chance to still suffer collisions (especially in a VM or container environment where /dev/random may always start with the same seed).

Since Postgres returns a specific error code if the client submits a duplicate prepared statement ID, I'm thinking we should use that to detect this condition and set PgConnection::next_statement_id to a random value, retry the prepare step, and if successful, log the old value for next_statement_id and the new one, then let it continue autoincrementing from there. This way the driver can handle collisions even with low-quality random numbers and the behavior shouldn't change at all for non-PgBouncer users.
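The recovery strategy described above can be modeled std-only, without a real database. Everything here is a stand-in for illustration: `FakeSession` mimics a pooled server session that may already hold other clients' statements, `Conn` mimics the driver's counter, and `42P05` is the real Postgres SQLSTATE for `duplicate_prepared_statement`.

```rust
use std::collections::hash_map::RandomState;
use std::collections::HashSet;
use std::hash::{BuildHasher, Hasher};

/// Stand-in for a pooled server session that may already hold
/// statements prepared by other clients sharing the connection.
struct FakeSession {
    taken: HashSet<String>,
}

impl FakeSession {
    fn prepare(&mut self, name: &str) -> Result<(), &'static str> {
        if self.taken.insert(name.to_string()) {
            Ok(())
        } else {
            Err("42P05") // duplicate_prepared_statement
        }
    }
}

/// Stand-in for the driver side, holding the statement-ID counter.
struct Conn {
    next_statement_id: u64,
}

impl Conn {
    fn prepare(&mut self, session: &mut FakeSession) -> String {
        loop {
            let name = format!("sqlx_s_{}", self.next_statement_id);
            match session.prepare(&name) {
                Ok(()) => {
                    // Success: resume normal autoincrement from here.
                    self.next_statement_id += 1;
                    return name;
                }
                Err("42P05") => {
                    // Collision with another client's statement:
                    // jump the counter to a random value and retry.
                    self.next_statement_id =
                        RandomState::new().build_hasher().finish();
                }
                Err(e) => panic!("unexpected error: {e}"),
            }
        }
    }
}

fn main() {
    let mut session = FakeSession { taken: HashSet::new() };
    // Simulate another client having already prepared "sqlx_s_1".
    session.taken.insert("sqlx_s_1".to_string());
    let mut conn = Conn { next_statement_id: 1 };
    let name = conn.prepare(&mut session);
    println!("recovered with statement name {name}");
}
```

As described, non-PgBouncer users never hit the error branch, so their behavior is unchanged; collisions are handled even with low-quality randomness because a second collision just triggers another retry.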

Otherwise, I would also recommend joining the discussion on the PgBouncer side via the following issue, and see if it's possible to convince them to properly support prepared statements: pgbouncer/pgbouncer#695

@greyblake

@abonander Thank you! Using the method from Executor helped indeed.
I am lucky that I had to do it only for a few requests, but it is a very big pain relief for me and my users =)

@abonander
Collaborator

Closing this since we're unlikely to accept it as-is. If anyone wants to take another stab at it, I'd like to see the approach I described in #1740 (comment)

@levkk
Contributor

levkk commented Jun 14, 2023

You can't use bind arguments this way, sadly. That's inherently tied to the prepared statement interface.

I don't think that's true. Postgres has the extended protocol, with support for anonymous prepared statements which just sends a P message, then B with parameters, and then E which executes the query. This is pipelined, so it's pretty fast. Anonymous prepared statements aren't cached afaik, so you can just use the extended protocol with all of its features.
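For concreteness, here is a sketch of the P (Parse) message for the anonymous statement, built per the documented PostgreSQL frontend/backend protocol layout: a `'P'` tag, an Int32 length (counting itself), a NUL-terminated statement name (empty for the unnamed statement), the NUL-terminated query, and an Int16 count of declared parameter-type OIDs. The function name is mine, not a sqlx API.

```rust
/// Encode a Parse ('P') message targeting the unnamed (anonymous)
/// prepared statement, declaring no parameter-type OIDs.
fn parse_message(query: &str) -> Vec<u8> {
    let mut body: Vec<u8> = Vec::new();
    body.push(0); // empty statement name => the unnamed statement
    body.extend_from_slice(query.as_bytes());
    body.push(0); // NUL-terminate the query string
    body.extend_from_slice(&0i16.to_be_bytes()); // zero declared param-type OIDs
    let mut msg = vec![b'P'];
    // The length field counts itself (4 bytes) plus the body.
    msg.extend_from_slice(&(body.len() as i32 + 4).to_be_bytes());
    msg.extend(body);
    msg
}

fn main() {
    let msg = parse_message("SELECT $1");
    println!("{msg:?}");
}
```

A Bind (`B`) and Execute (`E`) message would follow in the same pipeline, which is where the parameter values actually travel.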

If you use pgbench, it has a --protocol option that describes the three ways to talk to PG: simple (Q), extended (P, B, E), and prepared, which is the same as extended except with cached and re-used prepared statements. SQLx should support the extended protocol, since that would cover both the need for bind parameters and poolers' lack of support for prepared statements.

Adding prepared statement support to poolers isn't straightforward. When multiplexing N clients with M servers, the pooler has to keep track of which client prepared which statement on which server, and since clients don't randomize their prepared statement names (and the names are not guaranteed to be unique), I don't think there is really a way to know whether a statement has already been prepared.

@levkk
Contributor

levkk commented Jun 15, 2023

Serious question: why use PgBouncer? It doesn't seem to offer anything special over using a connection pool in the application. I suppose it makes it easier to pool connections from multiple applications, and it lets you pretend to have more than the maximum by multiplexing requests, but it just seems like a really hacky way to do it in general.

That's exactly how one uses Postgres at scale. It's not uncommon for enterprises to launch thousands of containers to serve traffic, or to have hundreds of microservices connecting to the same database. We've run Postgres with over 200,000 clients connecting to PgBouncer and multiplexing probably 10-15 server connections.

Postgres is known for not handling many concurrent open connections well, and a pooler is presently the only way to use Postgres at scale.

Coming from a Rust world, I can see how one could run a large business from just a few servers, but in other languages that require a bit more compute and have global locks on the runtime, like Python, Ruby, and others, the natural way to parallelize computation is processes, not threads, so many connections from many different instances of the app need to be open to the database to serve traffic.

Even in apps that can easily multithread, a connection cannot be shared between threads, since it's stateful. So a connection pool with # connections = # of threads is usually required. Multiply this by hundreds of instances of the same app running in a highly available, at-scale deployment and you'll run out of Postgres connections very quickly without a pooler.
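The multiplication above is worth making explicit. The numbers below are illustrative assumptions, not measurements from any particular deployment:

```rust
/// Total server-side connections when each app instance holds a
/// pool sized to its thread count (# connections = # of threads).
fn total_server_connections(instances: u32, threads_per_instance: u32) -> u32 {
    instances * threads_per_instance
}

fn main() {
    // e.g. 200 instances with 32-thread pools: already well beyond
    // Postgres's default max_connections of 100.
    println!("{}", total_server_connections(200, 32));
}
```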

@mrl5
Contributor

mrl5 commented Jul 25, 2023

@levkk this is great: postgresml/pgcat#474. I'm surprised that no one mentioned it here :)

For anyone else who loves sqlx and wants to use an "external" connection pooler: consider using PgCat over PgBouncer. The cool part is that PgCat recognizes SQLx :) https://postgresml.org/blog/making-postgres-30-percent-faster-in-production

This is not only a performance benefit, but also a usability improvement for client libraries that have to use prepared statements, like the popular Rust crate SQLx. Until now, the typical recommendation was to just not use a pooler.
