Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Close connection to base DB after cleanup #5

Merged
merged 5 commits into from
Apr 4, 2024

Conversation

ebabani
Copy link
Contributor

@ebabani ebabani commented Jan 8, 2024

When running multiple quick tests in parallel it's possible to get a connection failure due to too many clients connected.

@peterldowns
Copy link
Owner

peterldowns commented Jan 8, 2024

Interesting, thank you for reporting the issue and for suggesting a fix. Can you explain any more about how you hit the problem and how it manifests? Any suggested way for me to try to reproduce the issue? Running multiple quick tests in parallel is the expected use case of this library and I haven't seen this problem myself.

I'll have time later in the week to look into this further and I'll use this PR as a starting point. Thank you again for writing in.

When running multiple quick tests in parallel it's possible to get a connection failure due to too many clients connected.
@ebabani ebabani force-pushed the eb/cleanup-basedb-connection branch from bdeb9d9 to 7923c6f Compare January 8, 2024 19:37
@ebabani
Copy link
Contributor Author

ebabani commented Jan 8, 2024

Hi,

I pushed another change to the tests and docker compose. This issues only shows up when the number of parallel tests can potentially be higher than the max connections to the base DB.

I updated the docker compose to set max-connections to 100, and the parallel tests to run 50 tests in parallel.

Running without this change you should see tests failing due to too many connections.

In our code base we use Ginkgo for writing and running tests. After we switched to pgtestdb we noticed that running our ~200 DB related tests in parallel would hit our connection limit (100), but not when running in serial which led me to look at the cleanup code.

@peterldowns
Copy link
Owner

peterldowns commented Apr 4, 2024

@ebabani hi, thanks for your patience. I've fixed the implementation of create() to only use a single database connection at any point in time — previously, as you pointed out, it used 2 simultaneous connections. I was able to verify the behavior by adding some time.Sleep calls in the library and then connecting to the test database while running some tests:

Now, when you call `pgtestdb.New()` or `pgtestdb.Custom()`, pgtestdb connects, ensures the template exists, creates the instance db, then cleans everything up like before, but only uses a single connection to the server at any point in time.
  • Here, you see a single active connection to the postgres database to start (that's me connected in psql.)
postgres=# SELECT datname, sum(numbackends) FROM pg_stat_database WHERE datname is not null group by 1;
                   datname                   | sum
---------------------------------------------+-----
 postgres                                    |   1
 template0                                   |   0
 template1                                   |   0
  • I start running a test. pgtestdb connects to the postgres database as well, this is the baseDB in the code. It uses this connection to create the template database, run migrations, and then create the instance database for the specific test.
postgres=# SELECT datname, sum(numbackends) FROM pg_stat_database WHERE datname is not null group by 1;
                   datname                   | sum
---------------------------------------------+-----
 postgres                                    |   2
 template0                                   |   0
 template1                                   |   0
  • pgtestdb connects to the instance database, then hands it to the test. this is the db := pgtestdb.New(...) connection.
postgres=# SELECT datname, sum(numbackends) FROM pg_stat_database WHERE datname is not null group by 1;
                             datname                             | sum
-----------------------------------------------------------------+-----
 template1                                                       |   0
 testdb_tpl_ed8ae75db1176559951eadb85d6be0db_inst_8ed74b29d41d8c |   1
 postgres                                                        |   1
 testdb_tpl_ed8ae75db1176559951eadb85d6be0db                     |   0
 template0                                                       |   0
  • The test case finishes, and pgtestdb closes the connection to the instance, connects back to the baseDB, and deletes the instance.
postgres=# SELECT datname, sum(numbackends) FROM pg_stat_database WHERE datname is not null group by 1;
                   datname                   | sum
---------------------------------------------+-----
 postgres                                    |   2
 testdb_tpl_ed8ae75db1176559951eadb85d6be0db |   0
 template0                                   |   0
 template1                                   |   0
  • pgtestdb closes its connection, the test is entirely finished running.
postgres=# SELECT datname, sum(numbackends) FROM pg_stat_database WHERE datname is not null group by 1;
                   datname                   | sum
---------------------------------------------+-----
 postgres                                    |   1
 testdb_tpl_ed8ae75db1176559951eadb85d6be0db |   0
 template0                                   |   0
 template1                                   |   0

Thanks again for reporting this bug. In my testing, re-connecting to the database adds some slight overhead per test, but it's fundamentally more correct, and the database connection time should be dominated by the time taken during each test so overall this is worth the performance impact because it allows you to double the amount of tests you are running at once!

I also updated the FAQ to explain the problem, show some example error messages, and provide some ideas for how to work around it.

@peterldowns
Copy link
Owner

^ github actions is having some issues right now but once they're running again, I'll go ahead and merge this.

@peterldowns peterldowns merged commit 15a8a8a into peterldowns:main Apr 4, 2024
3 checks passed
@ebabani
Copy link
Contributor Author

ebabani commented Apr 8, 2024

Awesome, thanks for the update and the new release. I'll try to update sometimes this week. Thanks for maintaining this project, has been really helpful in parallelizing and speeding up our tests.

@peterldowns
Copy link
Owner

@ebabani of course, you're welcome. Thank you for using it. Please report any further bugs or questions, I'm happy to make improvements.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants