This discussion is a spin-off of the discussions started in #361.
Here, I want to present new ways to define task products and discuss their usefulness.
The goal is not to select just one approach over all the others. All approaches have their strengths and weaknesses. It is more about finding a combination of approaches that offers ease of use to beginners, a nice interface and all necessary flexibility and correctness for power users.
@pytask.mark.produces and magic argument produces
The decorator @pytask.mark.produces seems unnecessary. When using the @pytask.mark.task decorator, it is already possible to pass values to produces by either using the kwargs keyword or a default value for the argument. It will also be generally possible without the task decorator in v0.4.0.
Thus, having produces as a magical keyword argument gives us the same functionality as the decorator, but one does not need to use the decorator.
+ People know it.
+ No decorator anymore.
- Users are forced to put all their products under the argument name produces.
- For multiple products, the argument has to hold a container, and with different kinds of products (as you will see later) the type of the argument becomes even less clear.
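To make the trade-off concrete, here is a minimal sketch of the magic-argument style using only the standard library. `collect_products` is a hypothetical helper that imitates how a default value on `produces` could be discovered; it is not pytask's implementation, and the task names and paths are made up.

```python
import inspect
from pathlib import Path


def task_write_report(produces: Path = Path("report.txt")) -> None:
    """Single product, declared as the default value of ``produces``."""
    produces.write_text("Hello, world!")


def task_split_data(
    produces: dict[str, Path] = {"train": Path("train.csv"), "test": Path("test.csv")},
) -> None:
    """Several products have to share the one argument via a container."""
    for name, path in produces.items():
        path.write_text(f"{name}\n")


def collect_products(func):
    """Hypothetical helper: return the default of ``produces`` or None."""
    parameter = inspect.signature(func).parameters.get("produces")
    if parameter is None or parameter.default is inspect.Parameter.empty:
        return None
    return parameter.default


single_product = collect_products(task_write_report)
multiple_products = collect_products(task_split_data)
```

The second task shows the downside named above: as soon as there is more than one product, everything is funneled through one dictionary and the argument's type says little about the individual products.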
The Product annotation
The Product annotation allows users to declare any argument of a task as a product using an annotation.
+ Products do not need to be called produces anymore.
+ Multiple products can be spread across multiple argument names, making much better use of the namespace.
- People have to learn about annotations.
- People need to be able to annotate the task function, which is not possible with third-party functions (more on this later).
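A rough sketch of what this could look like follows. The `Product` class below is a local stand-in for the marker discussed here, and `find_product_arguments` is a hypothetical helper showing one way such annotations can be detected with `typing.Annotated`; neither is taken from pytask's code.

```python
from pathlib import Path
from typing import Annotated, get_type_hints


class Product:
    """Local stand-in for the marker that flags an argument as a product."""


def task_split_data(
    train: Annotated[Path, Product] = Path("train.csv"),
    test: Annotated[Path, Product] = Path("test.csv"),
    seed: int = 0,
) -> None:
    """Each product gets its own, descriptively named argument."""
    train.write_text(f"train,{seed}\n")
    test.write_text(f"test,{seed}\n")


def find_product_arguments(func) -> list[str]:
    """Hypothetical helper: names of the arguments annotated with ``Product``."""
    hints = get_type_hints(func, include_extras=True)
    return [
        name
        for name, hint in hints.items()
        if name != "return" and Product in getattr(hint, "__metadata__", ())
    ]


product_arguments = find_product_arguments(task_split_data)
```

Compared to a single `produces` argument, the products are spread across `train` and `test`, and the ordinary argument `seed` is left alone because it carries no marker.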
Allowing tasks to return
So far, task functions could not return anything, which seems unintuitive at
first, but many users have made their peace with it.
Mainly, return annotations allow delegating all of the I/O to pytask and
removing it from the task function. This requires a bit more knowledge about the
internals of pytask, which is why it is probably more of an interface for intermediate to
advanced users.
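pytask works with protocols for nodes. Anything that follows the Node protocol is a valid dependency or product of a task. The protocols are defined in pytask/src/_pytask/node_protocols.py (lines 10 to 35 in 6017b82):

```python
from __future__ import annotations

from abc import abstractmethod
from typing import Any, Protocol, runtime_checkable

@runtime_checkable
class MetaNode(Protocol):
    """Protocol for an intersection between nodes and tasks."""

    name: str | None
    """The name of node that must be unique."""

    @abstractmethod
    def state(self) -> Any:
        ...


@runtime_checkable
class Node(MetaNode, Protocol):
    """Protocol for nodes."""

    value: Any

    def load(self) -> Any:
        ...

    def save(self, value: Any) -> Any:
        ...

    def from_annot(self, value: Any) -> Any:
        ...
```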
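To illustrate what "following the protocol" means, here is a sketch of a custom node that stores arbitrary Python objects with pickle. `PickleNode` is hypothetical, the `Node` protocol is a condensed local copy so that the example runs on its own, and the behavior of `state` and `from_annot` is my reading of the method names rather than something the protocol itself prescribes.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Any, Optional, Protocol, runtime_checkable


@runtime_checkable
class Node(Protocol):
    """Condensed local copy of the node protocol, for illustration only."""

    name: Optional[str]
    value: Any

    def state(self) -> Any: ...
    def load(self) -> Any: ...
    def save(self, value: Any) -> Any: ...
    def from_annot(self, value: Any) -> Any: ...


class PickleNode:
    """Hypothetical node that keeps any picklable object in a file."""

    def __init__(self, name: str, path: Path) -> None:
        self.name = name
        self.value = path

    def state(self) -> Optional[str]:
        # Assumption: the modification time marks changes, None means "missing".
        if self.value.exists():
            return str(self.value.stat().st_mtime)
        return None

    def load(self) -> Any:
        return pickle.loads(self.value.read_bytes())

    def save(self, value: Any) -> None:
        self.value.write_bytes(pickle.dumps(value))

    def from_annot(self, value: Any) -> None:
        # Assumption: a path found in an annotation becomes the storage location.
        self.value = Path(value)


with tempfile.TemporaryDirectory() as tmp:
    node = PickleNode(name="result", path=Path(tmp) / "result.pkl")
    is_node = isinstance(node, Node)  # structural check, no inheritance needed
    state_before = node.state()
    node.save({"accuracy": 0.9})
    loaded = node.load()
```

Because the check is structural, `PickleNode` never inherits from `Node`. It only has to provide the attributes and methods the protocol lists.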
Here are two proposals for an interface that allows returns.
Returns via annotations
Similar to function argument annotations, we can use return annotations to specify how the function result should be stored. Here, we specify a path in the annotation. Internally, the path will be converted to a PathNode that can store strings and bytes. It is also possible to return any PyTree in the function and match it to a PyTree with the same structure in the annotations.
+ Returns are not function arguments anymore.
- Return annotations are only possible if the user defines the function.
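The idea can be sketched with the standard library alone. I assume here that the path is attached to the return type via `typing.Annotated`. `run_and_store` is a hypothetical runner that plays the role pytask would take over, and it only handles the single-path case.

```python
import tempfile
from pathlib import Path
from typing import Annotated, get_type_hints


def task_create_greeting() -> Annotated[str, Path("greeting.txt")]:
    """The task only computes the value; it does no I/O itself."""
    return "Hello, World!"


def run_and_store(func, root: Path) -> Path:
    """Hypothetical runner: call ``func`` and write the result to the annotated path."""
    hint = get_type_hints(func, include_extras=True)["return"]
    (path,) = hint.__metadata__  # exactly one path is expected in this sketch
    target = root / path
    result = func()
    # Mirror the "strings and bytes" behavior described for path-based storage.
    if isinstance(result, bytes):
        target.write_bytes(result)
    else:
        target.write_text(result)
    return target


with tempfile.TemporaryDirectory() as tmp:
    stored_file = run_and_store(task_create_greeting, Path(tmp))
    stored_name = stored_file.name
    content = stored_file.read_text()
```

The task function contains no file handling at all, which is the main selling point. The limitation from the list above is also visible: the annotation has to be written on the function definition itself.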
Returns via @pytask.mark.task
Similar to kwargs, @pytask.mark.task should receive another argument, for example produces, which takes the same PyTree that you would otherwise define in the return annotation.
+ This approach also works with third-party functions in contrast to return annotations.
+ More suitable for a programmatic API where tasks could be lambda or external functions.
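As a sketch of why this helps with functions you did not write, the following uses a hypothetical stand-in decorator called `task` with `produces` and `kwargs` arguments. It mimics the proposed interface with the standard library only and is not pytask's decorator.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Optional


def task(*, produces: Path, kwargs: Optional[dict] = None) -> Callable:
    """Hypothetical stand-in: remember where the return value should be stored."""

    def wrapper(func: Callable) -> Callable:
        def run(root: Path) -> Path:
            target = root / produces
            target.write_text(func(**(kwargs or {})))
            return target

        return run

    return wrapper


# Applied as plain function calls, so the wrapped functions need no annotations.
task_dump_config = task(
    produces=Path("config.json"), kwargs={"obj": {"n_draws": 100}}
)(json.dumps)
task_greet = task(produces=Path("greeting.txt"))(lambda: "Hello, World!")

with tempfile.TemporaryDirectory() as tmp:
    config_text = task_dump_config(Path(tmp)).read_text()
    greeting_text = task_greet(Path(tmp)).read_text()
```

Neither `json.dumps` nor the lambda can carry a return annotation of our choosing, yet both become tasks with a declared product.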