Detect connection pool deadlocks #63
Nesting a transaction should work as you have it there. Which db driver are you using?
Fluent
Sigh. And by Fluent I mean PostgreSQL.
This is all it takes to cause a deadlock in the app:

```swift
func badMonkey(req: Request) throws -> EventLoopFuture<HTTPStatus> {
    req.db.transaction { db in
        Point.query(on: req.db).all().transform(to: .noContent)
    }
}
```

Notice how I used `req.db` inside the transaction closure instead of `db`.
Hmm, that's different from the original example. It makes sense that would not work, since you should only use the database supplied to the closure while in a transaction. Try refactoring like this:

```swift
func badMonkey(req: Request) throws -> EventLoopFuture<HTTPStatus> {
    let points = Point.query(on: req.db).all()
    return req.db.transaction { db in
        points...
    }.transform(to: .noContent)
}
```

Or this:

```swift
func badMonkey(req: Request) throws -> EventLoopFuture<HTTPStatus> {
    req.db.transaction { db in
        Point.query(on: db).all().transform(to: .noContent)
    }
}
```

To clarify, for this to work you need two connections:

```swift
req.db.transaction { db in
    Point.query(on: req.db).all().transform(to: .noContent)
}
```

One in a transaction and one normal connection. These must be separate connections, since once a connection is in a transaction, all queries on that connection will be part of the transaction. This is deadlocking immediately since FluentPostgresDriver currently defaults to 1 connection max per pool: https://github.com/vapor/fluent-postgres-driver/blob/master/Sources/FluentPostgresDriver/FluentPostgresConfiguration.swift#L22

Increasing that number should stop this from always deadlocking, but I'm not sure you can avoid deadlocks here. Say you up the number to 2 max connections per event loop, and two requests come in at the same time to this route: each one checks out a connection for its transaction, and each then waits for a second connection that can never be freed. Fluent takes care of this if you use the `db` supplied to the transaction closure, since then each request only ever needs one connection.
Exactly. If you have some way to flag it happened, awesome; otherwise close this. I know I'm supposed to always use the `db` passed to the closure.
I'll leave this open as a reminder to try and find a way to detect deadlocks.
Why would this cause deadlocks? My understanding from @tanner0101's explanations was that each query should get a new connection, so this shouldn't cause problems as long as the pool has at least 2 connections.
Also, shouldn't the connection from the first query have been returned to the pool once the code inside the `flatMap` block is run?
> On 8. May 2020, at 18:21, grosch wrote:
>
> Something like this will hang the client, and in the db log we get:
>
> ```
> LOG: unexpected EOF on client connection with an open transaction
> ```
>
> ```swift
> Client.find(id, on: req.db).flatMap {
>     /// Validate we want to make updates
>     req.db.transaction { db in
>         // Do stuff
>     }
> }
> ```
>
> I don't really want to create a transaction if I don't have to. In the case where I discovered this I'm doing multiple queries against multiple tables, and only under certain conditions do I then create a transaction and update multiple different tables.
You can create a deadlock by doing something like:

```swift
let pool = EventLoopConnectionPool(..., maxConnections: 1)
pool.withConnection { a in
    pool.withConnection { b in
        pool.eventLoop.makeSucceededFuture(())
    }
}
```

Since the pool only has one connection, and we're waiting for another connection from the pool before we can return the first connection, this code can never finish. Fluent queries, when run on `req.db`, lease a connection from the pool in the same way.
Would that also apply if the pool has 2 or more connections? Wouldn't the second call to `withConnection` simply receive the second connection from the pool?
Yes, if the pool had a max of two connections, then that code would not be a problem. To generalize: if you have a pool with a maximum of n connections, nesting more than n `withConnection` calls will deadlock. If you have:

```swift
let pool = EventLoopConnectionPool(..., maxConnections: 2)
pool.withConnection { first in
    pool.withConnection { second in
        pool.withConnection { third in
            // we have three connections at the same time
            // this is impossible since maxConnections = 2
            print(first, second, third)
            ...
        }
    }
}
```
Right, that is the way I would have expected it to work. As I said, my personal "solution" is to just make sure my connection pool is large enough...
Yup, that's a good solution. I do think it would be helpful if the connection pool could detect deadlocks, to assist in debugging issues around a connection pool being too small. But I'm not sure that will be possible to implement.
Agreed, to both the usefulness and (sadly) the feasibility.
From taking a quick peek at the code, wouldn't simply keeping count of the leased connections, and failing a lease request once that count reaches the maximum, be enough to catch this? Though I might be oversimplifying it here :)
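A minimal sketch of that counting idea, as a toy pool rather than the real `EventLoopConnectionPool` API (all names here are hypothetical, and the real pool is asynchronous rather than lock-based):

```swift
import Dispatch

enum ToyPoolError: Error { case exhausted }

// Toy pool that counts leased connections and fails fast when the
// maximum is reached, instead of waiting for a connection to free up.
final class CountingPool {
    private let maxConnections: Int
    private var leased = 0
    private let lock = DispatchQueue(label: "counting.pool.lock")

    init(maxConnections: Int) { self.maxConnections = maxConnections }

    func withConnection<T>(_ body: () throws -> T) throws -> T {
        try lock.sync {
            guard leased < maxConnections else { throw ToyPoolError.exhausted }
            leased += 1
        }
        // Return the "connection" to the pool when the body finishes.
        defer { lock.sync { leased -= 1 } }
        return try body()
    }
}

let pool = CountingPool(maxConnections: 1)
do {
    // A nested lease on a max-1 pool now surfaces as an error
    // instead of hanging forever.
    try pool.withConnection { _ = try pool.withConnection { } }
} catch {
    print("caught:", error)
}
```

As the next reply points out, fail-fast semantics like this would also reject perfectly healthy requests that merely arrive while the pool is momentarily full.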
Yes, but then any requests received while all (e.g. 5) connections are momentarily in use would simply fail. So we would start to randomly drop requests any time all connections are taken, rather than simply waiting until they become available again. However, I'd say that having a timeout for this waiting could be a good idea. That way, "deadlocked" requests could be released after, say, 10 seconds. Not ideal, but better than locking indefinitely, I guess.
Very good point, totally overlooked that. In that case a timeout does sound better than deadlocking. It would at least free up the connections after some amount of time, instead of at some point deadlocking everything because there are no more connections that can be freed.
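The timeout idea can be sketched like so, again as toy blocking code rather than the real async pool (names and API are assumptions for illustration only):

```swift
import Dispatch

enum LeaseError: Error { case acquireTimeout }

// Toy pool where a lease waits for a free connection, but gives up
// after a deadline instead of waiting forever.
final class TimeoutPool {
    private let slots: DispatchSemaphore
    private let acquireTimeout: DispatchTimeInterval

    init(maxConnections: Int, acquireTimeout: DispatchTimeInterval) {
        self.slots = DispatchSemaphore(value: maxConnections)
        self.acquireTimeout = acquireTimeout
    }

    func withConnection<T>(_ body: () throws -> T) throws -> T {
        guard slots.wait(timeout: .now() + acquireTimeout) == .success else {
            // A "deadlocked" request is released with an error after
            // the deadline rather than hanging indefinitely.
            throw LeaseError.acquireTimeout
        }
        defer { slots.signal() }
        return try body()
    }
}

let pool = TimeoutPool(maxConnections: 1, acquireTimeout: .milliseconds(50))
do {
    // The inner lease cannot be satisfied while the outer one is held,
    // so it errors out after ~50 ms instead of deadlocking.
    try pool.withConnection { _ = try pool.withConnection { } }
} catch {
    print("released:", error)
}
```

A timeout never distinguishes a true deadlock from a pool that is simply too small under load, which is why it is a mitigation rather than detection.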
I feel it should not be possible to deadlock Vapor apps if me or my developers accidentally use `req.db` inside a transaction closure. I wonder if it's possible that, once a transaction is started, the value of `req.db` could be swapped for the transaction's database, so that queries inside the closure reuse the transaction's connection. I understand some mitigation has been done in #67, but I'd rather see Vapor make design choices that prevent these kinds of scenarios in the first place. To err is human.
It's a difficult balancing act. Vapor should try to prevent common errors, but it should also not get in the way of the developer. If we silently swap out `req.db` while in a transaction, that could be just as surprising in its own way.

At least #67 has prevented applications from hanging forever if deadlocks are created. But I agree we can do better. A few ideas:

**"leaky" pooling**: I've opened an issue for this here: #72. This allows the connection pool to temporarily open new connections beyond its maximum if needed.

**detect deadlocks**: Maybe there is some way for the pool to notice when every pending lease is waiting on a connection that can never be returned.

**researching other frameworks**: How have other web frameworks dealt with this? Node.js / Java would probably be good places to look.
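The "leaky" pooling idea from #72 could look roughly like this toy sketch (hypothetical names; not the real `EventLoopConnectionPool` implementation):

```swift
import Dispatch

// Toy pool that, when exhausted, temporarily opens an extra
// "overflow" connection beyond the maximum instead of waiting.
final class LeakyPool {
    private let maxConnections: Int
    private var pooled = 0
    private let lock = DispatchQueue(label: "leaky.pool.lock")

    init(maxConnections: Int) { self.maxConnections = maxConnections }

    func withConnection<T>(_ body: (_ isOverflow: Bool) -> T) -> T {
        let isOverflow: Bool = lock.sync {
            if pooled < maxConnections { pooled += 1; return false }
            return true  // beyond max: open a temporary, unpooled connection
        }
        defer {
            // Pooled connections go back to the pool; an overflow
            // connection would simply be closed instead.
            if !isOverflow { lock.sync { pooled -= 1 } }
        }
        return body(isOverflow)
    }
}

let pool = LeakyPool(maxConnections: 1)
pool.withConnection { outer in
    pool.withConnection { inner in
        // The nested lease no longer deadlocks: the inner lease is
        // served by a temporary overflow connection.
        print("outer overflow:", outer, "inner overflow:", inner)
    }
}
```

The trade-off is that the database's own connection limit becomes the real ceiling, so "leaking" must itself be bounded.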
**detect deadlocks**: I think with the async nature of our connection pool this is going to be a tricky one, mostly because requesting a connection doesn't block: it just returns a future that completes whenever a connection becomes available.

**researching other frameworks**: Did a quick peek into Python's Peewee ORM, and they either block for a certain time, block indefinitely, or raise an error when requesting a connection from an exhausted pool. I don't think always raising when requesting a connection from a full pool is the right fit for us. Also looked at Node's Sequelize ORM, and they accept timeouts for acquiring a new connection, along with a min and max pool size. The full set of pool options is listed in the Sequelize documentation.
For Java I found the Hibernate ORM and its documentation on connection pooling. The documentation mentions two options there. Overall it seems none of the ORMs fully prevent deadlocks; the best they offer is a timeout for when you are stuck in one, and I think this is the approach we should follow too.
I don’t agree that this is an acceptable compromise, but this is not my project; you have your own list of priorities that you adhere to. In our app I wrote a wrapper that prevents this from happening, and I'm OK with that solution.
We’re using https://github.com/candor/sublimate which is the “synchronous” solution I was talking about ~6 months ago. Inside of which the …

Overall, I aim for reliability above all, so this is why the set of trade-offs for me ends up there, but I certainly understand every library and framework has different base policies that affect everything above them. We’re all experienced in releasing such tooling here, and I have much respect for this project.