QueryBuilder: Fix exception when storing node within iterall #5804

Merged
merged 5 commits on Dec 12, 2022

Conversation

@sphuber sphuber commented Nov 28, 2022

Fixes #5802

Storing a node while iterating over the result of QueryBuilder.iterall
would raise sqlalchemy.exc.InvalidRequestError with the message:

Can't operate on closed transaction inside context manager.

The problem was that the Node implementation for the PsqlDosBackend,
the SqlaNode class, would not consistently open a transaction, using
the PsqlDosBackend.transaction method, whenever it mutated the state
of the session, and would instead commit directly on the current session.
For example, when storing a new node, the store method would simply
call save. Through the ModelWrapper, this would call commit on the
session, but that was the same session being used for the iteration.

The same problem was present in the SqlaGroup implementation, which had
a number of places where session state was mutated without opening a
transaction first.

The problem is therefore fixed by consistently opening a transaction
before making changes to the session. The transaction implementation
is slightly changed: any SqlaIntegrityError raised during the context
is now converted into an aiida.common.exceptions.IntegrityError to make
it backend independent.

@sphuber sphuber requested a review from chrisjsewell November 28, 2022 23:06
espenfl commented Nov 30, 2022

@sphuber I hereby confirm that this fixed the failing node storage tests for us.

@sphuber sphuber force-pushed the fix/5802/querybuilder-iterall-store branch 4 times, most recently from 1ce59c1 to 304cb6a Compare December 1, 2022 14:15
@sphuber sphuber requested a review from chrisjsewell December 1, 2022 14:37
sphuber commented Dec 1, 2022

Seems like it is just the docs build that is failing now. Super annoying because all it says is:

<string>:1: (WARNING/2) Inline interpreted text or phrase reference start-string without end-string.

It doesn't give any info on where that might be. I have had this before, when it was caused by using an alphanumeric character right after inline markup, such as:

multiple ``WorkChain``s

See https://docutils.sourceforge.io/docs/ref/rst/restructuredtext.html#inline-markup-recognition-rules. But I cannot find any such instances. There are next to no changes to docstrings in this PR, so I am at a loss. Do you have any ideas?

sphuber commented Dec 1, 2022

@chrisjsewell Thanks for the review. I have addressed all comments and this is ready for review again. In the meantime I will look into the docs issue.

@sphuber sphuber force-pushed the fix/5802/querybuilder-iterall-store branch from 7651013 to c05388c Compare December 2, 2022 09:14
sphuber commented Dec 2, 2022

@chrisjsewell the docs are failing because there is an exception when executing the intro/tutorial.md notebook. Running it locally, this is the exception in the log file:

Traceback (most recent call last):
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/jupyter_cache/executors/utils.py", line 51, in single_nb_execution
    executenb(
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/nbclient/client.py", line 1204, in execute
    return NotebookClient(nb=nb, resources=resources, km=km, **kwargs).execute()
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/nbclient/util.py", line 84, in wrapped
    return just_run(coro(*args, **kwargs))
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/nbclient/util.py", line 62, in just_run
    return loop.run_until_complete(coro)
  File "/usr/lib/python3.9/asyncio/base_events.py", line 647, in run_until_complete
    return future.result()
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/nbclient/client.py", line 663, in async_execute
    await self.async_execute_cell(
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/nbclient/client.py", line 965, in async_execute_cell
    await self._check_raise_for_error(cell, cell_index, exec_reply)
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/nbclient/client.py", line 862, in _check_raise_for_error
    raise CellExecutionError.from_cell_and_msg(cell, exec_reply_content)
nbclient.exceptions.CellExecutionError: An error occurred while executing the following cell:
------------------
%verdi node show 1
------------------

---------------------------------------------------------------------------
InvalidOperation                          Traceback (most recent call last)
/tmp/ipykernel_45194/2387781979.py in <module>
----> 1 get_ipython().run_line_magic('verdi', 'node show 1')

~/.virtualenvs/aiida_dev/lib/python3.9/site-packages/IPython/core/interactiveshell.py in run_line_magic(self, magic_name, line, _stack_depth)
   2362                 kwargs['local_ns'] = self.get_local_scope(stack_depth)
   2363             with self.builtin_trap:
-> 2364                 result = fn(*args, **kwargs)
   2365             return result
   2366 

~/.virtualenvs/aiida_dev/lib/python3.9/site-packages/decorator.py in fun(*args, **kw)
    230             if not kwsyntax:
    231                 args, kw = fix(args, kw, sig)
--> 232             return caller(func, *(extras + args), **kw)
    233     fun.__name__ = func.__name__
    234     fun.__doc__ = func.__doc__

~/.virtualenvs/aiida_dev/lib/python3.9/site-packages/IPython/core/magic.py in <lambda>(f, *a, **k)
    185     # but it's overkill for just that one bit of state.
    186     def magic_deco(arg):
--> 187         call = lambda f, *a, **k: f(*a, **k)
    188 
    189         if callable(arg):

~/code/aiida/env/dev/aiida-core/aiida/tools/ipython/ipython_magics.py in verdi(self, line, local_ns)
     74         profile = get_profile()
     75         obj = AttributeDict({'config': config, 'profile': profile})
---> 76         return verdi(shlex.split(line), prog_name='%verdi', obj=obj, standalone_mode=False)  # pylint: disable=too-many-function-args,unexpected-keyword-arg
     77 
     78     @magic.needs_local_scope

~/.virtualenvs/aiida_dev/lib/python3.9/site-packages/click/core.py in __call__(self, *args, **kwargs)
   1128     def __call__(self, *args: t.Any, **kwargs: t.Any) -> t.Any:
   1129         """Alias for :meth:`main`."""
-> 1130         return self.main(*args, **kwargs)
   1131 
   1132 

~/.virtualenvs/aiida_dev/lib/python3.9/site-packages/click/core.py in main(self, args, prog_name, complete_var, standalone_mode, windows_expand_args, **extra)
   1052         try:
   1053             try:
-> 1054                 with self.make_context(prog_name, args, **extra) as ctx:
   1055                     rv = self.invoke(ctx)
   1056                     if not standalone_mode:

~/.virtualenvs/aiida_dev/lib/python3.9/site-packages/click/core.py in make_context(self, info_name, args, parent, **extra)
    918 
    919         with ctx.scope(cleanup=False):
--> 920             self.parse_args(ctx, args)
    921         return ctx
    922 

~/.virtualenvs/aiida_dev/lib/python3.9/site-packages/click/core.py in parse_args(self, ctx, args)
   1611             ctx.exit()
   1612 
-> 1613         rest = super().parse_args(ctx, args)
   1614 
   1615         if self.chain:

~/.virtualenvs/aiida_dev/lib/python3.9/site-packages/click/core.py in parse_args(self, ctx, args)
   1376 
   1377         for param in iter_params_for_processing(param_order, self.get_params(ctx)):
-> 1378             value, args = param.handle_parse_result(ctx, opts, args)
   1379 
   1380         if args and not ctx.allow_extra_args and not ctx.resilient_parsing:

~/.virtualenvs/aiida_dev/lib/python3.9/site-packages/click/core.py in handle_parse_result(self, ctx, opts, args)
   2358 
   2359             try:
-> 2360                 value = self.process_value(ctx, value)
   2361             except Exception:
   2362                 if not ctx.resilient_parsing:

~/.virtualenvs/aiida_dev/lib/python3.9/site-packages/click/core.py in process_value(self, ctx, value)
   2314 
   2315     def process_value(self, ctx: Context, value: t.Any) -> t.Any:
-> 2316         value = self.type_cast_value(ctx, value)
   2317 
   2318         if self.required and self.value_is_missing(value):

~/.virtualenvs/aiida_dev/lib/python3.9/site-packages/click/core.py in type_cast_value(self, ctx, value)
   2302             return tuple(convert(x) for x in check_iter(value))
   2303 
-> 2304         return convert(value)
   2305 
   2306     def value_is_missing(self, value: t.Any) -> bool:

~/.virtualenvs/aiida_dev/lib/python3.9/site-packages/click/types.py in __call__(self, value, param, ctx)
     80     ) -> t.Any:
     81         if value is not None:
---> 82             return self.convert(value, param, ctx)
     83 
     84     def get_metavar(self, param: "Parameter") -> t.Optional[str]:

~/code/aiida/env/dev/aiida-core/aiida/cmdline/params/types/profile.py in convert(self, value, param, ctx)
     68 
     69         if self._load_profile:
---> 70             load_profile(profile.name)
     71 
     72         ctx.obj.profile = profile

~/code/aiida/env/dev/aiida-core/aiida/manage/configuration/__init__.py in load_profile(profile, allow_switch)
    151     """
    152     from aiida.manage import get_manager
--> 153     return get_manager().load_profile(profile, allow_switch)
    154 
    155 

~/code/aiida/env/dev/aiida-core/aiida/manage/manager.py in load_profile(self, profile, allow_switch)
    124 
    125         if self._profile and self.profile_storage_loaded and not allow_switch:
--> 126             raise InvalidOperation(
    127                 f'cannot switch to profile {profile.name!r} because profile {self._profile.name!r} storage '
    128                 'is already loaded and allow_switch is False'

InvalidOperation: cannot switch to profile 'main' because profile 'myprofile' storage is already loaded and allow_switch is False

Why does this fail now? Nothing related has changed in this PR. Just to see whether the docs would then build, I added allow_switch=True to load_profile in aiida/cmdline/params/types/profile.py. The previous error disappeared as expected, but then the next one cropped up, which is even weirder:

Traceback (most recent call last):
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/jupyter_cache/executors/utils.py", line 51, in single_nb_execution
    executenb(
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/nbclient/client.py", line 1204, in execute
    return NotebookClient(nb=nb, resources=resources, km=km, **kwargs).execute()
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/nbclient/util.py", line 84, in wrapped
    return just_run(coro(*args, **kwargs))
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/nbclient/util.py", line 62, in just_run
    return loop.run_until_complete(coro)
  File "/usr/lib/python3.9/asyncio/base_events.py", line 647, in run_until_complete
    return future.result()
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/nbclient/client.py", line 663, in async_execute
    await self.async_execute_cell(
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/nbclient/client.py", line 965, in async_execute_cell
    await self._check_raise_for_error(cell, cell_index, exec_reply)
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/nbclient/client.py", line 862, in _check_raise_for_error
    raise CellExecutionError.from_cell_and_msg(cell, exec_reply_content)
nbclient.exceptions.CellExecutionError: An error occurred while executing the following cell:
------------------
multiply(x, y)
------------------

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_162213/3141548813.py in <module>
----> 1 multiply(x, y)

~/code/aiida/env/dev/aiida-core/aiida/engine/processes/functions.py in decorated_function(*args, **kwargs)
    177         def decorated_function(*args, **kwargs):
    178             """This wrapper function is the actual function that is called."""
--> 179             result, _ = run_get_node(*args, **kwargs)
    180             return result
    181 

~/code/aiida/env/dev/aiida-core/aiida/engine/processes/functions.py in run_get_node(*args, **kwargs)
    150 
    151             try:
--> 152                 result = process.execute()
    153             finally:
    154                 # If the `original_handler` is set, that means the `kill_process` was bound, which needs to be reset

~/code/aiida/env/dev/aiida-core/aiida/engine/processes/functions.py in execute(self)
    364     def execute(self) -> Optional[Dict[str, Any]]:
    365         """Execute the process."""
--> 366         result = super().execute()
    367 
    368         # FunctionProcesses can return a single value as output, and not a dictionary, so we should also return that

~/.virtualenvs/aiida_dev/lib/python3.9/site-packages/plumpy/processes.py in func_wrapper(self, *args, **kwargs)
     84         if self._closed:
     85             raise exceptions.ClosedError('Process is closed')
---> 86         return func(self, *args, **kwargs)
     87 
     88     return func_wrapper

~/.virtualenvs/aiida_dev/lib/python3.9/site-packages/plumpy/processes.py in execute(self)
   1185             self.loop.run_until_complete(self.step_until_terminated())
   1186 
-> 1187         return self.future().result()
   1188 
   1189     @ensure_not_closed

~/.virtualenvs/aiida_dev/lib/python3.9/site-packages/plumpy/process_states.py in execute(self)
    226                 try:
    227                     self._running = True
--> 228                     result = self.run_fn(*self.args, **self.kwargs)
    229                 finally:
    230                     self._running = False

~/code/aiida/env/dev/aiida-core/aiida/engine/processes/functions.py in run(self)
    408                 kwargs[name] = value
    409 
--> 410         result = self._func(*args, **kwargs)
    411 
    412         if result is None or isinstance(result, ExitCode):

/tmp/ipykernel_162213/3198245695.py in multiply(x, y)
      3 @engine.calcfunction
      4 def multiply(x, y):
----> 5     return x * y

~/code/aiida/env/dev/aiida-core/aiida/orm/nodes/data/numeric.py in inner(self, other)
     36         """Decorator wrapper."""
     37         assert not isinstance(other, NumericType)
---> 38         return to_aiida_type(func(self.value, other))
     39 
     40     return inner

~/code/aiida/env/dev/aiida-core/aiida/orm/nodes/data/numeric.py in __rmul__(self, other)
     66     @_right_operator
     67     def __rmul__(self, other):
---> 68         return other * self
     69 
     70     @_left_operator

TypeError: unsupported operand type(s) for *: 'SinglefileData' and 'int'

As far as I can tell, nowhere in intro/tutorial.md is a SinglefileData node even constructed. Any idea what is going on?

@sphuber sphuber force-pushed the fix/5802/querybuilder-iterall-store branch from c05388c to cb32263 Compare December 5, 2022 16:02
chrisjsewell commented Dec 5, 2022

Why does this fail now?

I also get the local issue; it seems to be a regression whereby, when loading the (temporary) profile, the Config is not also set to that profile. So verdi then thinks no profile is loaded and tries to load the default.
Not sure why this does not happen on the CI, but indeed this is separate from this PR.

Definitely don't add allow_switch=True, because then you will be running verdi against a local profile.

In conf.py you should add `nb_execution_show_tb = "READTHEDOCS" in os.environ`, so we can see what the error is on the CI.

chrisjsewell commented Dec 5, 2022

So, after adding that (and stripping the ANSI colors), this is what https://readthedocs.org/projects/aiida-core/builds/18825707/ shows.

This definitely looks like an issue caused by this PR:

nbclient.exceptions.CellExecutionError: An error occurred while executing the following cell:
------------------
multiply(x, y)
------------------

---------------------------------------------------------------------------
InvalidRequestError                       Traceback (most recent call last)
/tmp/ipykernel_466/3141548813.py in <cell line: 1>()
----> 1 multiply(x, y)

~/checkouts/readthedocs.org/user_builds/aiida-core/envs/5804/lib/python3.8/site-packages/aiida/engine/processes/functions.py in decorated_function(*args, **kwargs)
    177         def decorated_function(*args, **kwargs):
    178             """This wrapper function is the actual function that is called."""
--> 179             result, _ = run_get_node(*args, **kwargs)
    180             return result
    181 

~/checkouts/readthedocs.org/user_builds/aiida-core/envs/5804/lib/python3.8/site-packages/aiida/engine/processes/functions.py in run_get_node(*args, **kwargs)
    129                 raise ValueError(f'{function.__name__} does not support these kwargs: {kwargs.keys()}')
    130 
--> 131             process = process_class(inputs=inputs, runner=runner)
    132 
    133             # Only add handlers for interrupt signal to kill the process if we are in a local and not a daemon runner.

~/checkouts/readthedocs.org/user_builds/aiida-core/envs/5804/lib/python3.8/site-packages/plumpy/base/state_machine.py in __call__(cls, *args, **kwargs)
    193         inst = super().__call__(*args, **kwargs)
--> 194         inst.transition_to(inst.create_initial_state())
    195         call_with_super_check(inst.init)
    196         return inst

~/checkouts/readthedocs.org/user_builds/aiida-core/envs/5804/lib/python3.8/site-packages/plumpy/base/state_machine.py in transition_to(self, new_state, *args, **kwargs)
    337                 raise
    338             self._transition_failing = True
--> 339             self.transition_failed(initial_state_label, label, *sys.exc_info()[1:])
    340         finally:
    341             self._transition_failing = False

~/checkouts/readthedocs.org/user_builds/aiida-core/envs/5804/lib/python3.8/site-packages/plumpy/processes.py in transition_failed(self, initial_state, final_state, exception, trace)
   1001         # If we are creating, then reraise instead of failing.
   1002         if final_state == process_states.ProcessState.CREATED:
-> 1003             raise exception.with_traceback(trace)
   1004 
   1005         self.transition_to(process_states.ProcessState.EXCEPTED, exception, trace)

~/checkouts/readthedocs.org/user_builds/aiida-core/envs/5804/lib/python3.8/site-packages/plumpy/base/state_machine.py in transition_to(self, new_state, *args, **kwargs)
    322 
    323             try:
--> 324                 self._enter_next_state(new_state)
    325             except StateEntryFailed as exception:
    326                 # Make sure we have a state instance

~/checkouts/readthedocs.org/user_builds/aiida-core/envs/5804/lib/python3.8/site-packages/plumpy/base/state_machine.py in _enter_next_state(self, next_state)
    382     def _enter_next_state(self, next_state: State) -> None:
    383         last_state = self._state
--> 384         self._fire_state_event(StateEventHook.ENTERING_STATE, next_state)
    385         # Enter the new state
    386         next_state.do_enter()

~/checkouts/readthedocs.org/user_builds/aiida-core/envs/5804/lib/python3.8/site-packages/plumpy/base/state_machine.py in _fire_state_event(self, hook, state)
    298     def _fire_state_event(self, hook: Hashable, state: Optional[State]) -> None:
    299         for callback in self._event_callbacks.get(hook, []):
--> 300             callback(self, hook, state)
    301 
    302     @super_check

~/checkouts/readthedocs.org/user_builds/aiida-core/envs/5804/lib/python3.8/site-packages/plumpy/processes.py in <lambda>(_s, _h, state)
    327         event_hooks = {
    328             state_machine.StateEventHook.ENTERING_STATE:
--> 329             lambda _s, _h, state: self.on_entering(cast(process_states.State, state)),
    330             state_machine.StateEventHook.ENTERED_STATE:
    331             lambda _s, _h, from_state: self.on_entered(cast(Optional[process_states.State], from_state)),

~/checkouts/readthedocs.org/user_builds/aiida-core/envs/5804/lib/python3.8/site-packages/plumpy/processes.py in on_entering(self, state)
    681         state_label = state.LABEL
    682         if state_label == process_states.ProcessState.CREATED:
--> 683             call_with_super_check(self.on_create)
    684         elif state_label == process_states.ProcessState.RUNNING:
    685             call_with_super_check(self.on_run)

~/checkouts/readthedocs.org/user_builds/aiida-core/envs/5804/lib/python3.8/site-packages/plumpy/base/utils.py in call_with_super_check(wrapped, *args, **kwargs)
     27     call_count = getattr(self, '_called', 0)
     28     self._called = call_count + 1
---> 29     wrapped(*args, **kwargs)
     30     msg = f"Base '{wrapped.__name__}' was not called from '{self.__class__}'\nHint: Did you forget to call the super?"
     31     assert self._called == call_count, msg

~/checkouts/readthedocs.org/user_builds/aiida-core/envs/5804/lib/python3.8/site-packages/aiida/engine/processes/process.py in on_create(self)
    400             if isinstance(current, Process):
    401                 self._parent_pid = current.pid  # type: ignore[assignment]
--> 402         self._pid = self._create_and_setup_db_record()  # pylint: disable=attribute-defined-outside-init
    403 
    404     @override

~/checkouts/readthedocs.org/user_builds/aiida-core/envs/5804/lib/python3.8/site-packages/aiida/engine/processes/process.py in _create_and_setup_db_record(self)
    599         if self.metadata.store_provenance:
    600             try:
--> 601                 self.node.store_all()
    602                 if self.node.is_finished_ok:
    603                     self._state = Finished(self, None, True)  # pylint: disable=attribute-defined-outside-init

~/checkouts/readthedocs.org/user_builds/aiida-core/envs/5804/lib/python3.8/site-packages/aiida/orm/nodes/node.py in store_all(self, with_transaction)
    427                 link_triple.node.store(with_transaction=with_transaction)
    428 
--> 429         return self.store(with_transaction)
    430 
    431     def store(self, with_transaction: bool = True) -> 'Node':  # pylint: disable=arguments-differ

~/checkouts/readthedocs.org/user_builds/aiida-core/envs/5804/lib/python3.8/site-packages/aiida/orm/nodes/node.py in store(self, with_transaction)
    464                 self._store_from_cache(same_node, with_transaction=with_transaction)
    465             else:
--> 466                 self._store(with_transaction=with_transaction, clean=True)
    467 
    468             if self.backend.autogroup.is_to_be_grouped(self):

~/checkouts/readthedocs.org/user_builds/aiida-core/envs/5804/lib/python3.8/site-packages/aiida/orm/nodes/node.py in _store(self, with_transaction, clean)
    481 
    482         links = self.base.links.incoming_cache
--> 483         self._backend_entity.store(links, with_transaction=with_transaction, clean=clean)
    484 
    485         self.base.links.incoming_cache = []

~/checkouts/readthedocs.org/user_builds/aiida-core/envs/5804/lib/python3.8/site-packages/aiida/storage/sqlite_zip/orm.py in store(self, *args, **kwargs)
     71         if getattr(backend, '_read_only', False):
     72             raise ReadOnlyError(f'Cannot store entity in read-only backend: {backend}')
---> 73         super().store(*args, **kwargs)  # type: ignore # pylint: disable=no-member
     74 
     75 

~/checkouts/readthedocs.org/user_builds/aiida-core/envs/5804/lib/python3.8/site-packages/aiida/storage/psql_dos/orm/nodes.py in store(self, links, with_transaction, clean)
    218             if links:
    219                 for link_triple in links:
--> 220                     self._add_link(*link_triple)
    221 
    222             return self

~/checkouts/readthedocs.org/user_builds/aiida-core/envs/5804/lib/python3.8/site-packages/aiida/storage/psql_dos/orm/nodes.py in _add_link(self, source, link_type, link_label)
    197     def _add_link(self, source, link_type, link_label):
    198         """Add a single link"""
--> 199         with self.backend.transaction() as session:
    200             try:
    201                 link = self.LINK_CLASS(input_id=source.pk, output_id=self.pk, label=link_label, type=link_type.value)

~/.pyenv/versions/3.8.6/lib/python3.8/contextlib.py in __enter__(self)
    111         del self.args, self.kwds, self.func
    112         try:
--> 113             return next(self.gen)
    114         except StopIteration:
    115             raise RuntimeError("generator didn't yield") from None

~/checkouts/readthedocs.org/user_builds/aiida-core/envs/5804/lib/python3.8/site-packages/aiida/storage/sqlite_temp/backend.py in transaction(self)
    150             session.commit()
    151         else:
--> 152             with session.begin():
    153                 with session.begin_nested():
    154                     yield session

<string> in begin(self, subtransactions, nested, _subtrans)

~/checkouts/readthedocs.org/user_builds/aiida-core/envs/5804/lib/python3.8/site-packages/sqlalchemy/util/deprecations.py in warned(fn, *args, **kwargs)
    307                         stacklevel=3,
    308                     )
--> 309             return fn(*args, **kwargs)
    310 
    311         doc = fn.__doc__ is not None and fn.__doc__ or ""

~/checkouts/readthedocs.org/user_builds/aiida-core/envs/5804/lib/python3.8/site-packages/sqlalchemy/orm/session.py in begin(self, subtransactions, nested, _subtrans)
   1318             )
   1319 
-> 1320         if self._autobegin():
   1321             if not subtransactions and not nested and not _subtrans:
   1322                 return self._transaction

~/checkouts/readthedocs.org/user_builds/aiida-core/envs/5804/lib/python3.8/site-packages/sqlalchemy/orm/session.py in _autobegin(self)
   1258         if not self.autocommit and self._transaction is None:
   1259 
-> 1260             trans = SessionTransaction(self, autobegin=True)
   1261             assert self._transaction is trans
   1262             return True

~/checkouts/readthedocs.org/user_builds/aiida-core/envs/5804/lib/python3.8/site-packages/sqlalchemy/orm/session.py in __init__(self, session, parent, nested, autobegin)
    525         autobegin=False,
    526     ):
--> 527         TransactionalContext._trans_ctx_check(session)
    528 
    529         self.session = session

~/checkouts/readthedocs.org/user_builds/aiida-core/envs/5804/lib/python3.8/site-packages/sqlalchemy/engine/util.py in _trans_ctx_check(cls, subject)
    197         if trans_context:
    198             if not trans_context._transaction_is_active():
--> 199                 raise exc.InvalidRequestError(
    200                     "Can't operate on closed transaction inside context "
    201                     "manager.  Please complete the context manager "

InvalidRequestError: Can't operate on closed transaction inside context manager.  Please complete the context manager before emitting further commands.

sphuber commented Dec 5, 2022

Thanks, now the problem is clear. It seems that SqliteTempBackend also has a transaction method, which is now incorrect, but it isn't covered by the unit tests, since they all pass.

@sphuber sphuber force-pushed the fix/5802/querybuilder-iterall-store branch from b9ac7e1 to 4014ad6 Compare December 5, 2022 19:39
sphuber commented Dec 5, 2022

@chrisjsewell problems fixed. Ready for review

@chrisjsewell

Wanna update the branch? Cheers

@sphuber sphuber force-pushed the fix/5802/querybuilder-iterall-store branch from 4014ad6 to fa36c74 Compare December 6, 2022 06:26
sphuber commented Dec 6, 2022

updated

@sphuber sphuber force-pushed the fix/5802/querybuilder-iterall-store branch 2 times, most recently from 58cf7ba to 27c23a5 Compare December 7, 2022 08:23
sphuber commented Dec 7, 2022

@chrisjsewell could you please give this a final review? Thanks

aiida/storage/psql_dos/orm/groups.py
@@ -192,7 +193,7 @@ def check_node(given_node):
if not given_node.is_stored:
raise ValueError('At least one of the provided nodes is unstored, stopping...')

- with utils.disable_expire_on_commit(self.backend.get_session()) as session:
+ with self.backend.transaction() as session, disable_expire_on_commit(session) as session:

chrisjsewell (Member):

Is this the right way round? You are going to exit the disable_expire_on_commit first, and only then exit the transaction (which is the point when things actually get committed).

chrisjsewell (Member):

Also, it feels like there are no tests actually checking this disable_expire_on_commit behaviour.
I assume it is actually desired (I didn't add it), and if so then I think there should be a test for it.

sphuber (Contributor Author):

Yeah, you are right, this is incorrect. And indeed, I don't think there is a proper test for this. If memory serves correctly, this was added way back when as a performance improvement: a daemon worker can build up a session containing quite a large number of instances, and committing would invalidate all of them whenever just a single node was added to a group, which is not at all necessary.

I added a test that checks that a node instance is not expired after adding it to a group; however, the test passes both with and without the disable_expire_on_commit. I am not sure whether I am using the SQLA API incorrectly to check whether an instance is expired, or whether, with the way we are using the transaction, the node would not get expired anyway. Could you take a quick look at the test and see if you spot anything weird or wrong? Maybe we can just remove the disable_expire_on_commit?

And if we do need to keep it, do you know how to make it work? I cannot do

with disable_expire_on_commit(self.backend.transaction()) as session

because then disable_expire_on_commit would receive a GeneratorContext instead of a Session. How do we call disable_expire_on_commit while still passing it the session yielded by the transaction call?

@@ -249,7 +247,7 @@ def check_node(node):

list_nodes = []

- with utils.disable_expire_on_commit(self.backend.get_session()) as session:
+ with self.backend.transaction() as session, disable_expire_on_commit(session) as session:

@chrisjsewell chrisjsewell Dec 7, 2022

Same as above, regarding order and testing.

@chrisjsewell

@sphuber I'm not sure whether any of the original PR description still makes sense?

Firstly, there is now no new get_session method; it's a modification of transaction.

Secondly, the description says "This will ensure that commit is only called if not already in a transaction.", but I don't think this is the case, given that: "When the context manager yielded by begin_nested() completes, it "commits" the savepoint" (see https://docs.sqlalchemy.org/en/20/orm/session_transaction.html#using-savepoint).

For reference, this is the transaction context manager:

            if session.in_transaction():
                with session.begin_nested():
                    yield session
            else:
                with session.begin():
                    with session.begin_nested():
                        yield session

@sphuber sphuber force-pushed the fix/5802/querybuilder-iterall-store branch 4 times, most recently from 0a7ed52 to b3476ec Compare December 9, 2022 12:15
sphuber commented Dec 9, 2022

@chrisjsewell the PR description and commit message have been updated to correctly reflect the latest changes:

Storing a node while iterating over the result of `QueryBuilder.iterall`
would raise `sqlalchemy.exc.InvalidRequestError` with the message:

    Can't operate on closed transaction inside context manager.

The problem was that the `Node` implementation for the `PsqlDosBackend`,
the `SqlaNode` class, would not consistently open a transaction, using
the `PsqlDosBackend.transaction` method, whenever it mutated the state
of the session, and would instead commit directly on the current session.
For example, when storing a new node, the `store` method would simply
call `save`. Through the `ModelWrapper`, this would call commit on the
session, but that was the same session being used for the iteration.

The same problem was present in the `SqlaGroup` implementation, which had
a number of places where session state was mutated without opening a
transaction first.

The problem is therefore fixed by consistently opening a transaction
before making changes to the session. The `transaction` implementation
is slightly changed: any `SqlaIntegrityError` raised during the context
is now converted into an `aiida.common.exceptions.IntegrityError` to make
it backend independent.
The `SqliteTempBackend` suffered from the same problem: when trying to store
nodes, for example by running a calcfunction, SQLAlchemy would throw an
exception complaining that the transaction had already been closed and that
the context manager had to be completed first. The implementation is fixed by
matching it with that of the `PsqlDosBackend` storage backend.

The error was only discovered through the execution of the `tutorial.md`
notebook in the documentation. An explicit unit test is added to make it
more obvious and easier to debug in case a regression is introduced.
The test passes, but it does so independently of whether the utility
`disable_expire_on_commit` is used. It looks like this is because a
transaction is now properly used to add or remove nodes, and so the
instances in the parent session are not expired. It seems that the
proper use of transactions essentially makes `disable_expire_on_commit`
obsolete. When the same test is run on `v2.1`, where transactions were
not properly used, the test fails when the context manager is removed.
@sphuber sphuber force-pushed the fix/5802/querybuilder-iterall-store branch from b3476ec to 858cbc9 Compare December 9, 2022 19:03
sphuber commented Dec 9, 2022

@chrisjsewell I fixed the disable_expire_on_commit order. I tested the test I added against main, and there the test does fail if I remove the context manager, while it passes with the context manager enabled. This seems to suggest to me that the test is correct. I think the reason for it passing on this PR with and without the disable expire is because maybe now we are using a proper transaction for the addition/removal of nodes, and committing that transaction does not expire instances of the outer session. I tried to confirm this with the SQLAlchemy docs (this was the best I could find: https://docs.sqlalchemy.org/en/20/orm/session_basics.html#committing) but it is not 100% clear. Still, I think this PR is now ready to be merged.

@chrisjsewell

I think the reason for it passing on this PR with and without the disable expire, is because maybe now we are using a proper transaction for the addition/removal of nodes, and committing that transaction does not expire instances of the outer session.

Cheers, if it indeed is working without the need for disable_expire_on_commit, then does it not make sense to just remove it?

sphuber commented Dec 12, 2022

I think the reason for it passing on this PR with and without the disable expire, is because maybe now we are using a proper transaction for the addition/removal of nodes, and committing that transaction does not expire instances of the outer session.

Cheers, if it indeed is working without the need for disable_expire_on_commit, then does it not make sense to just remove it?

I thought about it, but I am a bit uncomfortable with that since I am not really sure whether I wrote the test correctly. But if you think it looks correct as well, then I think we can decide to remove it.

@chrisjsewell

Yeah, I think unless there is a demonstrable way to show that it is actually doing something, we should just remove it.

@sphuber sphuber force-pushed the fix/5802/querybuilder-iterall-store branch from 858cbc9 to f952553 Compare December 12, 2022 19:00
sphuber commented Dec 12, 2022

@chrisjsewell ok, I have removed the utility and the test. I have done so in a separate commit and have kept the commit that added the test, such that we have a record of what happened. This will make it easier to come back to this if ever we find a performance regression in the future.

With that, I think this is ready to be merged then. Could you please give the final approve? Thanks!

@chrisjsewell chrisjsewell left a comment

Cheers!

Sebastiaan Huber and others added 2 commits December 12, 2022 22:03
The implementation of the backend `Group` ORM class for the
`PsqlDosBackend` used the `disable_expire_on_commit` context manager in
the `add_nodes` and `remove_nodes` methods. By default, the engine used
sets `expire_on_commit=True`, which means that all database model
instances in the session are expired once the session is committed,
which forces the instances to be refreshed from the database the next
time they are accessed.

This is a sensible and important setting, as it ensures that we are in
sync with the database when multiple processes are committing to the
database. However, expiring instances comes at a cost because they will
have to be refreshed at some point. It is unknown exactly why this change
was originally applied, but it is probably because expiring all nodes in
the session upon addition or removal of nodes from a group affected
performance badly, and the risk of inconsistencies introduced by omitting
the expiry was considered acceptable.

The behavior was never tested, though, and with the recent refactoring of
the transaction control in the `PsqlDosBackend` the utility to explicitly
disable the expire-on-commit no longer seems necessary. In the previous
commit a test was added to check that the utility is correctly applied,
but it passes whether or not the utility is used at all. This suggests
that with the current proper use of a nested transaction in `add_nodes`
and `remove_nodes`, the nodes in the outer session are no longer expired,
making the utility obsolete; it is therefore removed.

There was a backtick missing in the `help` attribute of an input port
declaration of the `CalcJob` class. This class is rendered automatically
through our Sphinx plugin in the `topics/calculations/usage.rst` file in
the "Options" section. No idea why this didn't fail before, because this
mistake was not introduced in this PR.

Also add config option for `myst-nb` to `conf.py` so that it shows the
stacktrace if an exception is raised during notebook execution.