Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: close sqla connections when unloading profile #5728

Merged
merged 1 commit into from
Nov 1, 2022

Conversation

ltalirz
Copy link
Member

@ltalirz ltalirz commented Oct 28, 2022

Fixes #5506

After unloading an AiiDA profile in the psql_dos backend, sqlalchemy would keep a weakref to a connection alive (which points to a reference cycle somewhere).
Calling the garbage collector explicitly cleares the connections.

@ltalirz ltalirz marked this pull request as ready for review October 28, 2022 11:40
Copy link
Contributor

@sphuber sphuber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ltalirz . Is there no way through the API of sqlalchemy to tell it to clear weak references? In PsqlDosBackend.close we are calling engine.close() and self._session_factory.expunge_all() and self._session_factory.close(). Maybe we are missing something here?

If this is nothing to do with sqlalchemy and this is the only way, fine for me to add it, but since it is specific to the PsqlDosBackend why not add it to its close method? This is what is ultimately called when Manager.unload_profile is called. Would also move the test from test_nodes.py to test_backend.py, which is more apt.

@ltalirz ltalirz force-pushed the fix/5506-close-db-connections branch from d767baf to 1eaa081 Compare October 28, 2022 13:12
@ltalirz
Copy link
Member Author

ltalirz commented Oct 28, 2022

Thanks @ltalirz . Is there no way through the API of sqlalchemy to tell it to clear weak references? In PsqlDosBackend.closewe are callingengine.close()andself._session_factory.expunge_all()andself._session_factory.close()`. Maybe we are missing something here?

I guess the problem is that a reference to the session object itself survives somewhere:

from sqlalchemy import Column
from sqlalchemy import create_engine
from sqlalchemy import inspect
from sqlalchemy import Integer
from sqlalchemy import String
from sqlalchemy.orm import declarative_base
from sqlalchemy.orm import Session
from sqlalchemy.orm.session import _sessions

Base = declarative_base()


class A(Base):
    __tablename__ = "a"

    id = Column(Integer, primary_key=True)
    data = Column(String)


e = create_engine("sqlite://", echo=True)
Base.metadata.create_all(e)

s = Session(e)

a1 = A()
s.add(a1)
s.commit()

# object is not detached
assert not inspect(a1).detached

# close session, which will detach a1
s.close()

# object is detached
assert inspect(a1).detached
assert a1 not in s

assert len(_sessions) == 1
del a1
assert len(_sessions) == 1
e.dispose()
assert len(_sessions) == 1
s.expunge_all()
assert len(_sessions) == 1

del s
assert len(_sessions) == 0

Looking at the code it's not entirely clear to me how this happens (I tried del engine after it's no longer needed but that was not it).

def close(self) -> None:
if self._session_factory is None:
return # the instance is already closed, and so this is a no-op
# close the connection
# pylint: disable=no-member
engine = self._session_factory.bind
if engine is not None:
engine.dispose() # type: ignore
self._session_factory.expunge_all()
self._session_factory.close()
self._session_factory = None

I incorporated your other suggestions - thanks!

@ltalirz
Copy link
Member Author

ltalirz commented Oct 28, 2022

P.S. The original issue raised by Joe was about an error "database ... is being accessed by other users".
I guess this should not really be the case if the session is closed (even if it remains in memory).

It almost seems like some other place in AiiDA grabs this session and does stuff to it.

@ltalirz ltalirz requested a review from sphuber October 31, 2022 18:41
@ltalirz
Copy link
Member Author

ltalirz commented Oct 31, 2022

Note: if I don't load the profile again after unloading it, numerous of the following tests fail with errors like

2022-10-28T13:27:55.5118491Z     def get_profile_storage(self) -> 'StorageBackend':
2022-10-28T13:27:55.5118765Z         """Return the current profile's storage backend, loading it if necessary."""
2022-10-28T13:27:55.5118905Z         from aiida.common import ConfigurationError
2022-10-28T13:27:55.5119045Z         from aiida.common.log import configure_logging
2022-10-28T13:27:55.5119218Z         from aiida.manage.profile_access import ProfileAccessManager
2022-10-28T13:27:55.5119290Z     
2022-10-28T13:27:55.5119487Z         # if loaded, return the current storage backend (which is "synced" with the global profile)
2022-10-28T13:27:55.5119611Z         if self._profile_storage is not None:
2022-10-28T13:27:55.5119709Z             return self._profile_storage
2022-10-28T13:27:55.5119782Z     
2022-10-28T13:27:55.5119900Z         # get the currently loaded profile
2022-10-28T13:27:55.5120009Z         profile = self.get_profile()
2022-10-28T13:27:55.5120105Z         if profile is None:
2022-10-28T13:27:55.5120219Z >           raise ConfigurationError(
2022-10-28T13:27:55.5120555Z                 'Could not determine the current profile. Consider loading a profile using `aiida.load_profile()`.'
2022-10-28T13:27:55.5120617Z             )
2022-10-28T13:27:55.5121053Z E           aiida.common.exceptions.ConfigurationError: Could not determine the current profile. Consider loading a profile using `aiida.load_profile()`.
2022-10-28T13:27:55.5121079Z 

Logs example: https://github.com/aiidateam/aiida-core/actions/runs/3345686746/jobs/5541529122

Copy link
Contributor

@sphuber sphuber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ltalirz . I think this fix might be fine for now, don't see how it can hurt. Still I wonder where the actual problem is. In the shell tests that you shared, you showed that once you delete the reference to the session with del session the weakref was released. Looking at the PsqlDosBackend, I noticed that we actually don't even keep track of sessions but simply call the factory each time. When I quickly switched this to create a single session and keep the reference and then in close calling del on it explicitly, the test you added also passed, even when I remove the explicit gc.collect. The problem is though that this does not seem a 100% reproducible. Sometimes the test still fails. I am wondering if time plays a role here and sometimes the assert comes before the weakref has had the chance to be resolved. Anyway, I think we can still merge this for the time being after the test is updated.

tests/storage/psql_dos/test_backend.py Outdated Show resolved Hide resolved
@ltalirz ltalirz requested a review from sphuber October 31, 2022 22:36
When calling `Manager.unload_profile`, the `close` method is called on
the `StorageBackend`. For the `PsqlDosBackend` this means cleaning up
all SqlAlchemy connections and sessions. This was implemented, however,
SqlAlchemy would often still be holding on to at least one session in
the `WeakrefValueDictionary` of `sqlalchemy.orm.sessions._sessions`. It
turns out however that the reference was already cleared in fact, as
calling the garbage collector explicitly cleans the session references.
@sphuber sphuber force-pushed the fix/5506-close-db-connections branch from 4a553bf to a99981e Compare November 1, 2022 07:11
@sphuber sphuber merged commit 798ded5 into aiidateam:main Nov 1, 2022
@sphuber sphuber deleted the fix/5506-close-db-connections branch November 1, 2022 08:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Database connections remain, even after switching profiles with load_profile(..., allow_switch=True)
2 participants