Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bugfix: statement prepared in TX is closed with TX #117

Closed
wants to merge 8 commits into from

Conversation

letFunny
Copy link
Collaborator

@letFunny letFunny commented Oct 20, 2023

If a sqlair.Statement is first used inside a transaction, it is prepared in the transaction's dedicated connection. When the transaction is closed, the connection is closed as well, meaning the statement is no longer valid.

Why do we prepare the statement in a new dedicated connection in the first place? Because if the database pool has only one connection and we create a transaction statement in that connection, we create a deadlock. This is because a transaction needs a dedicated connection so, if the pool is configured to have a maximum of one, it will wait indefinitely. Additionally, SQLite will only have one connection if opened with :memory: which is a common source of trouble (ref).

There are several solutions:

  1. The first is adding a timeout to the context when preparing the sql.Statement. This will avoid the deadlock and will present the user with the error. However, we have to set an arbitrary timeout which can result in false positives depending on the user's network latency.
  2. The second approach is to not care about the deadlock. The sql package documentation clearly states that setting a max number of connections is akin to acquiring locks [1]. If the user sets the number, we can expect him to also use a context with a timeout to prevent deadlocks.
  3. The third approach is to bypass the cache preparing and closing the statement directly.
  4. The last one is to add the TX as another key to the cache. Where right now we have db and statement, we would have db, TX and statement.

I am more inclined to go for 2. as it seems this is a problem with how the database is configured, or in the case of SQLite a problem with the defaults (it only has one connection); and not something we should worry about.

[1]:

Keep in mind that setting a limit makes database usage similar to acquiring a lock or semaphore, with the result that your application can deadlock waiting for a new database connection.

If a sqlair.Statement is first used inside a transaction, it is prepared
in the transaction dedicated connection. When the transaction is closed,
the connection is closed as well, meaning the statement is no longer
valid.
@letFunny letFunny marked this pull request as draft October 23, 2023 07:44
@letFunny letFunny changed the title buxfix: statement prepared in TX is closed with TX bugfix: statement prepared in TX is closed with TX Nov 7, 2023
Copy link
Collaborator

@Aflynn50 Aflynn50 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good spot on this bug. We are not actually caching any statements for transactions at all right now because of this. The code was working because even thought the sql.Stmt prepared on the sql.Conn was closed, when this sql.Stmt was moved on to subsequent transactions with tx.Stmt it would silently re-prepare the statement after finding it closed.

package_test.go Outdated Show resolved Hide resolved
package_test.go Outdated Show resolved Hide resolved
@@ -1165,40 +1165,6 @@ func (s *PackageSuite) TestPreparedStmtCaching(c *C) {
checkCacheEmpty()
}

func (s *PackageSuite) TestTransactionWithOneConn(c *C) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should have some protection against this still. I have a few ideas but I'm not sure about any of them.

  1. In NewDB we return an error if the max connections is 1. The problem with this is that a deadlock can still if the number of in progress transactions matches the number connections in the pool. Also the user can still increase the max connections.
  2. We could track the number of connections ourselves and have a sqlair version of SetMaxOpenConns. Then we could not prepare on the DB if we know it would block. Again, the problem with this is that the user can still use the "plain" DB and we would not control it. A solution for this would be to also override the Open method and remove PlainDB however this would be a breaking change.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding 2. I don't think that is the responsibility of SQLair, it should be done at the sql std library instead. Essentially we would be having a parallel "pool" implementation/accounting.

package_test.go Outdated

// Run the same existing statement outside the transaction.
q = db.Query(nil, selectStmt)
c.Assert(q.Run(), IsNil)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should close the DB at the end of the test.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean by closing the DB? When the db object gets out of scope the finalizer will be called closing the underlying sql db.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just had another look at it, I meant drop the tables but actually this test doesn't create and tables so it should be ok.

@Aflynn50 Aflynn50 added the Bug An undesired feature ;-) label Nov 10, 2023
@letFunny letFunny marked this pull request as ready for review November 28, 2023 15:21
func (db *DB) prepareStmt(ctx context.Context, ps prepareSubstrate, s *Statement) (*sql.Stmt, error) {
// prepareStmt prepares a Statement on a DB. It first checks in the cache to
// see if it has already been prepared.
func (db *DB) prepareStmt(ctx context.Context, s *Statement) (*sql.Stmt, error) {
var err error
cacheMutex.RLock()
Copy link
Collaborator

@manadart manadart Dec 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to see this tucked inside a cache implementation that uses the singleton pattern discussed prior, but not in this patch.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR #126 does this.

@@ -561,7 +542,7 @@ func (tx *TX) Query(ctx context.Context, s *Statement, inputArgs ...any) *Query
return &Query{ctx: ctx, err: ErrTXDone}
}

sqlstmt, err := tx.db.prepareStmt(ctx, tx.sqlconn, s)
sqlstmt, err := tx.db.prepareStmt(ctx, s)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about this?

  • prepareStmt checks dbStats to see if MaxOpenConnections is set to 1.
  • If it is, we return a typed error.
  • If we catch that error here, we log a warning and just prepare against the transaction instead of the db, skipping the cache.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That could work but only for that specific case. If we had MaxOpenConnections set to 2 and we created two transactions we are going to have the same problem. And the same applies to n connections and n transactions. My other concern is that MaxOpenConnections can be changed concurrently from another thread accessing the underlying sqldb which is going to lead to some bugs that are difficult to trace.

Another solution would be to check whether there is a max number of connections and issue a warning or even return an error. We could even require an environment variable to be set just to be explicit about the number of connections being a problem. The issue, once again, is the concurrency aspect.

@letFunny
Copy link
Collaborator Author

As discussed in a meeting today the caching system has some potential issues which require a lot of discussion and testing. Right now, we are not able to commit the time investment, we will focus on delivering the roadmap features (slices, insert, primitive types, etc.) first.

The decision has been to remove the cache from the library and revisit it in the future. We will need to create a through design that addresses all of the issues and that is performant. For the latter, we will need to create a test harness and benchmark the different implementations to confirm our assumptions.

For reference, here is a high level list of the issues:

  • Finalizers run serially on a single goroutine which means that code that creates a lot of statements potentially will have some contention in the GC.
  • Because of the cache design, when the max number of connections is 1 and a transaction is created, we actually need 2 connections which is not intuitive. Incidentally, this is also because of the sql standard library design which can block when requesting new connections and the only way around it is to use a context with a timeout.
  • It is not clear whether preparing in SQLite is an improvement in the performance (SQLite does not do a lot of processing when preparing a statement). On top of that, Juju's workload is mainly creating transactions and reusing statements across them. Because of how preparing on a connection works, we are not getting a lot of benefits. We would need to pool the connections ourselves or be smarter about it.

@letFunny letFunny closed this Dec 14, 2023
Aflynn50 added a commit that referenced this pull request Jul 5, 2024
* Add a cache for driver prepared statements.

When the same SQLair query is executed multiple times against the database it is recompiled by the database every time. This is inefficient.

It is possible to instead prepare a SQL query against the database driver and cache it in SQLair to allow for reuse of the query.

With this change, when a SQLair statement is executed directly against a database it is first prepared and then executed, the prepared statement is put in the cache for reuse. If the statement is executed in a transaction, the SQLair statement is looked up in the cache but if it is not found it is not prepared on the driver. This is due to the issues in #117.

* Allow a single statement to correspond to multiple SQL strings in cache

* SQUASH Review corrections
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug An undesired feature ;-)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants