Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Database connections remain, even after switching profiles with load_profile(..., allow_switch=True) #5506

Closed
jbweston opened this issue Apr 28, 2022 · 6 comments · Fixed by #5728
Labels
Milestone

Comments

@jbweston
Copy link

jbweston commented Apr 28, 2022

Describe the bug

Switching profiles from profile-a to profile-b with aiida.load_profile('profile-b', allow_switch=True) does not close all connections to the database of profile-a. This can be verified by trying to delete profile-a after loading profile-b:

Steps to reproduce

Create 2 fresh profiles, profile-a and profile-b with verdi quicksetup, then run the following script:

import aiida
import aiida.orm
from aiida.manage.configuration import get_config

aiida.load_profile("profile-a")
aiida.orm.Int()  # force storage to be loaded

aiida.load_profile("profile-b", allow_switch=True)
aiida.orm.Int()  # force storage to be loaded

get_config().delete_profile("profile-a")  # This fails

Here's me running this in a jupyter notebook (but running it in a regular Python interpreter from the commandline gives the same result):
image

Expected behavior

I would expect switching profiles to close all connections to the database, and to be able to delete a profile once it has been unloaded.

Your environment

  • Operating system: Linux
  • Python version: 3.9
  • aiida-core version: 2.0.1
  • sqlalchemy version: 1.4.35

Additional context

I find this behavior very confusing: unloading the profile invokes the close() method of PsqlDosBackend, and that method indeed seems to close all connections and properly call dispose on the sqlalchemy engine:

def close(self) -> None:
if self._session_factory is None:
return # the instance is already closed, and so this is a no-op
# close the connection
# pylint: disable=no-member
engine = self._session_factory.bind
if engine is not None:
engine.dispose() # type: ignore
self._session_factory.expunge_all()
self._session_factory.close()
self._session_factory = None

Maybe there are extra sessions that are created elsewhere, which we are not tracking, and dispose maybe doesn't do anything until those are closed?

@jbweston
Copy link
Author

tagging @ltalirz at his request.

@ltalirz
Copy link
Member

ltalirz commented Apr 28, 2022

Pinging @chrisjsewell - do you happen to know what is going on here?

@chrisjsewell
Copy link
Member

do you happen to know what is going on here?

Heya, well as @jbweston notes the unload-> close should indeed close all connections, so nothing immediately comes to mind.

You could try looking at the _sessions dict that sqlalchemy maintains, to see what other sessions are open: https://github.com/sqlalchemy/sqlalchemy/blob/6a496a5f40efe6d58b09eeca9320829789ceaa54/lib/sqlalchemy/orm/session.py#L142

You could also try changing the sqlalchemy logging to DEBUG (via verdi config), and see if that gives any clues

@jbweston
Copy link
Author

Here's another datapoint: adding gc.collect() after manager.unload_profile() seems to solve the issue.

@ltalirz
Copy link
Member

ltalirz commented Apr 29, 2022

Thanks a lot for investigating @jbweston !

Do I understand correctly that this means there is a reference cycle that isn't being garbage-collected automatically/fast enough?

Is anybody opposed to adding an explicit gc.collect() here:

def unload_profile(self) -> None:
"""Unload the current profile, closing any associated resources."""
self.reset_profile()
self._profile = None

If not, I'm happy to prepare a PR, maybe even with a test :-)

It seems to me that we don't need to worry about possible performance overheads for this operation, and clearing out the remaining sessions seems important.

@ltalirz
Copy link
Member

ltalirz commented Oct 28, 2022

You could try looking at the _sessions dict that sqlalchemy maintains, to see what other sessions are open: https://github.com/sqlalchemy/sqlalchemy/blob/6a496a5f40efe6d58b09eeca9320829789ceaa54/lib/sqlalchemy/orm/session.py#L142

I confirm that after manager.unload_profile(), sqlalchemy.orm.session._sessions contains 1 weakref to a session.
After running gc.collect(), it is empty.
Will open PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants