Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: permit class member functions as calcfunctions #4963

Merged
merged 2 commits into from
Feb 1, 2023

Conversation

dev-zero
Copy link
Contributor

It is already possible to specify a class member function as a "calcfunction"
and use it when not running via the daemon (and not with caching enabled):

class MyWorkChain(WorkChain):
    @calcfunction
    def mysum(a, b):
        return a+b

    def workfunc(self):
        self.ctx.sum = MyWorkChain.mysum(self.inputs.x, self.inputs.y)

Here I simply extend the loading to unbreak caching and running via the daemon,
while brushing up error handling a bit (catch one exception more and cleanup the traceback).

@dev-zero dev-zero force-pushed the feature/class-member-calcfunctions branch from e8fc6f5 to cf5899d Compare May 26, 2021 12:15
@dev-zero dev-zero requested a review from sphuber May 26, 2021 12:25
@greschd
Copy link
Member

greschd commented May 26, 2021

As far as I can see, this doesn't bind either the class or the instance in any way - should there maybe be a @staticmethod somewhere to indicate that?
From the call site it seems clear enough, but looking only at the function definition was somewhat puzzling to me.

@dev-zero
Copy link
Contributor Author

@greschd at first I thought I'd have to do that too. Indeed my first working attempt was this (since an unbound staticmethod is not a callable you have to call it via the descriptor protocol):

class MyWorkChain(WorkChain):
    @staticmethod
    def mysum(a, b):
        return a + b

MyWorkChain.mysum = calcfunction(MyWorkChain.mysum)

But the @calcfunction decorator wraps/replaces the function with a new type anyway, hence it doesn't seem necessary.

@greschd
Copy link
Member

greschd commented May 26, 2021

Right, I understand the @staticmethod isn't actually necessary. What I'm a bit concerned about is that the decorated function looks very similar to a plain method (that would bind self), but doesn't act that way. I'm wondering if there's some way we could make that distinction more clear.

@dev-zero
Copy link
Contributor Author

Good point, yes.

Maybe some more things:

  • while doing testing related to this I encountered some weird database errors when a process could not be loaded (e.g. when loading from caching). My guess is that the loading failed but the in the error path the exception would have been added to the incompletely loaded process node.
  • I started messing with this since I wanted to see whether it would be possible to have class member calcfunctions which pull their arguments from the context or input automatically, such that
    @calcfunction
    def mysum(a, b):
        return a+b

    def workfunc(self):
        self.ctx.sum = MyWorkChain.mysum(self.ctx.x, self.ctx.y)

could be written as

    @calcfunction
    @args_from_context  # automatically passes members x, y from self.ctx
    def sum(x, y):
        return x+y

In the sense that such functions could be used in the outline, but also directly act as calcfunctions. So instead of having explicit delayed/async function calls as calcfunctions those class member calcfunctions in the outline would then more or less push/pop from the ctx "stack".

@greschd
Copy link
Member

greschd commented May 29, 2021

* while doing testing related to this I encountered some weird database errors when a process could not be loaded (e.g. when loading from caching). My guess is that the loading failed but the in the error path the exception would have been added to the incompletely loaded process node.

Happy to take a look at the caching case if you can create a reproducing example.

I like the idea of a calcfunction that operates directly on the ctx.

@sphuber
Copy link
Contributor

sphuber commented Feb 10, 2022

Was just experimenting with something similar and turning a class member method of a Data node also works:

from .int import Int
from aiida.engine import calcfunction

class CustomInt(Int):
    """`Data` sub class to represent an integer value."""

    @calcfunction
    def add(self, b):
        return self + b

when running it

In [1]: from aiida.orm.nodes.data.customint import CustomInt

In [2]: i = CustomInt(1)

In [3]: i.add(Int(2))
Out[3]: <Int: uuid: 9da61dda-71ea-4dbf-9d98-8f46676cea8d (pk: 346) value: 3>

In [5]: !verdi process list -a -p1
  PK  Created    Process label    Process State    Process status
----  ---------  ---------------  ---------------  ----------------
 345  8s ago     addFinished [0]

Total results: 1

It is a bit surprising after reading the analysis of having a calcfunction within a process class definition. You seem to be saying that self doesn't get bound, but in this examples, it clearly is. So not sure what the difference is.

@dev-zero
Copy link
Contributor Author

My guess would be that the self here is serializable, while a workchain class instance is not.

@sphuber sphuber force-pushed the feature/class-member-calcfunctions branch from cf5899d to 17104cd Compare November 2, 2022 22:57
@sphuber
Copy link
Contributor

sphuber commented Nov 2, 2022

@dev-zero as you may have noticed, I have been going through old PRs to clean house a bit. This one I think can still be merged, so I took the opportunity to rebase your branch, hope you don't mind. I have also added tests that test the proposed functionality. I am testing two ways of declaring the calcfunction; as a proper staticmethod of the workchain and as a class attribute. The tests validate that it can be run as well as submitted and also that caching works. The question is now whether we should officially recommend these approaches in the documentation. Maybe we should only recommend the staticmethod approach as it seems to make explicit why self is not getting bound and passed the the calcfunction when invoked. Thoughts?

@sphuber sphuber force-pushed the feature/class-member-calcfunctions branch from 17104cd to dc909f9 Compare November 9, 2022 12:16
@sphuber sphuber force-pushed the feature/class-member-calcfunctions branch 5 times, most recently from 9288351 to 14ad1bc Compare December 14, 2022 11:37
@sphuber
Copy link
Contributor

sphuber commented Dec 14, 2022

@chrisjsewell I have fixed the failing test and added the documentation. Please have a final look, thanks!

@sphuber sphuber force-pushed the feature/class-member-calcfunctions branch from 14ad1bc to 3d7fca3 Compare December 14, 2022 15:47
dev-zero and others added 2 commits February 1, 2023 17:09
The `FunctionProcess` class that is generated dynamically from the
process function signature used the function's `__name__` attribute to
construct the new dynamic type. This prevented process functions from
being defined as class member methods. By using `__qualname__` instead,
this is now enabled and allows for example the following:

    class CalcFunctionWorkChain(WorkChain):

        @classmethod
        def define(cls, spec):
            super().define(spec)
            spec.input('x')
            spec.input('y')
            spec.output('sum')
            spec.outline(
                cls.run_compute_sum,
            )

        @staticmethod
        @calcfunction
        def compute_sum(x, y):
            return x + y

        def run_compute_sum(self):
            self.out('sum', self.compute_sum(self.inputs.x, self.inputs.y))

The changes also allow these class member process functions to be valid
cache sources.
The test defines a workchain that defines a calcfunction as a class
member in two ways:

 * A proper staticmethod
 * A class attribute

Both versions work, but the former is more correct and is the only one
that can be called from an instance of the workchain. The other has to
be invoked by retrieving the calcfunction as an attribute and then
calling it. The former is the only one that is offcially documented.

The tests verify that both methods of declaring the calcfunction can be
called within the `WorkChain` and that it also works when submitting the
workchain to the daemon. A test is also added that verifies that caching
of the calcfunctions functions properly.

The changes also broke one existing test. The test took a calcfunction,
originally defined on the module level, and modifying it within the test
function scope. The goal of the test was twofold:

 * Check that changing the source code would be recognized by the
   caching mechanism and so an invocation of the changed function would
   not be cached from an invocation of the original
 * Check that it is possible to cache from a function defined inside the
   scope of a function.

The changes to the dynamically built `FunctionProcess`, notably changing
the type from `func.__name__` to `func.__qualname__` stopped the second
point from working. In the original code, the type name would be simply
`tests.engine.test_calcfunctions.add_function`, both for the module
level function as well as the inlined function. However, with the change
this becomes:

 `tests.engine.test_calcfunctions.TestCalcFunction.test_calcfunction_caching_change_code.<locals>.add_calcfunction`

this can no longer be loaded by the `ProcessNode.process_class` property
and so `is_valid_cache` returns `False`, whereas in the original code
it was a valid cache as the process class could be loaded.

Arguably, the new code is more correct, but it is breaking. Before
inlined functions were valid cache sources, but that is no longer the
case. In exchange, class member functions are now valid cache sources
where they weren't before. Arguably, it is preferable to support class
member functions over inline functions.

The broken test is fixed by moving the inlined `add_calcfunction` to a
separate module such that it becomes a valid cache source again.
@sphuber sphuber force-pushed the feature/class-member-calcfunctions branch from 3d7fca3 to 8032a8a Compare February 1, 2023 16:09
@sphuber sphuber merged commit e5ab1fb into aiidateam:main Feb 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants