-
-
Notifications
You must be signed in to change notification settings - Fork 638
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
@rules as coroutines #5580
@rules as coroutines #5580
Conversation
Select(Address)]) | ||
def resolve_unhydrated_struct(address_mapper, address_family, address): | ||
@rule(UnhydratedStruct, [Select(AddressMapper), Select(Address)]) | ||
def resolve_unhydrated_struct(address_mapper, address): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was just looking at this exact rule today to try to understand how to use SelectProjection and this change is RIDICULOUSLY cool imo
"""Given an Address and its AddressFamily, resolve an UnhydratedStruct. | ||
|
||
Recursively collects any embedded addressables within the Struct, but will not walk into a | ||
dependencies field, since those are requested explicitly by tasks using SelectDependencies. | ||
""" | ||
|
||
address_family = yield Get(AddressFamily, Dir(address.spec_path)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If no Dir -> AddressFamily
path exists in the rule graph, does it error at this line? Would the previous use of SelectProjection
have errored out earlier, during the rule graph construction?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, exactly. And that's the primary downside of this change as it stands: we would lose the validation that all rules are satisfiable.
Have been thinking more about this though, and I think that with some very, very basic parsing (essentially, "capture the RHS of yield statements"), we could get back the static checking... which might be justified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...and also, as it stands the exception would be created in the rust code rather than literally on this line. It looks like it is easy to send an exception to the generator though with generator.throw()
.
EDIT: Hmm, but actually: the downside of this would be that it would introduce the possibility of people accidentally/intentionally implementing error checking. And frankly, I'd rather get the static checks fixed and discourage all error handling in @rule
s.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think "capture the RHS of yield statements" is a highly reasonable goal, but I don't know if it needs to be done by parsing -- that being said I remember python's ast module is supposed to be not awful? I can't see what the exact code would look like right now, but since we know we can get the desired info we want before invoking the python rule definition (with the previous example using SelectProjection
), I think there's hopefully a middle ground where we can specify rule graphs without too much ceremony (probably exactly the same in spirit as parsing for yield statements). I would love to parse some python for yield statements if it gets us there, but I think there might be a neater way -- will think on it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is now resolved: all Get
statements in the body of an @rule
are statically checked at RuleGraph
construction time.
Parsing So I think that with one relatively minor restriction (which technically already existed) we can preserve the static checks; namely: the request/Get/yield would need to include both the product and subject type explicitly. An example syntax might be:
EDIT: And after thinking about it for a few minutes more: we'd probably want to parse the calls to the |
ec40733
to
051c0e2
Compare
I like it, much cleaner than |
051c0e2
to
c568241
Compare
3c86fa8
to
73f73b5
Compare
I've implemented the static checks for The next order of business: bikeshedding. What should |
Very cool :) Though I'm not sure I entirely follow the fine details of everything going on. I'm happy to keep the name My next request would be a few Python unit tests which should this actually in use |
@@ -356,7 +372,7 @@ impl Select { | |||
product: self.product().clone(), | |||
variants: self.variants.clone(), | |||
task: task, | |||
entry: entry.clone(), | |||
entry: Arc::new(entry.clone()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems a little strange for entry to be both Arc
'd and clone
'd here. My instinct (which may be wrong) is that either it's a cheap to clone type, and we shouldn't be Arc
ing it, or it's an expensive to clone type, and we should always be Arc
ing it...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, I have a separate review that attempts to Arc up the rule-graph, since it is full of tiny Vec
s. On the other hand, it does not appear to impact performance in a significant way... so perhaps this would not either. It's very surprising to me that all of the allocation doesn't have a noticeable impact.
c4b40a3
to
b51dae5
Compare
9e22ec5
to
2b70546
Compare
I haven't written any new unit tests, but I updated the documentation in the topmost commit and added validation to the |
2b70546
to
56dd7bd
Compare
A link to the rendered README: https://github.com/twitter/pants/blob/56dd7bdd4df03cc8e6a7864bf1ab8231a2500656/src/python/pants/engine/README.md |
I like the direction, but I'm not in love with the new syntax / concept / mixed-point dependency injection. Perhaps best to explain with a summary example. From the current change we now have: class Get(datatype('Get', ['product_type', 'subject'])):
pass
@rule(UnhydratedStruct, [Select(AddressMapper), Select(Address)])
def resolve_unhydrated_struct(address_mapper, address):
address_family = yield Get(AddressFamily, Dir(address.spec_path)
... So we have dependency injection through function arguments and, now, co-routine results. This could alternatively be modeled with just dependency injection through function arguments if we allow for both terminal values and rules (functions) to be injected: class Rule(datatype('Rule', ['product_type', 'subject_type'])):
def __call__(subject):
pass # IE: Get
def map(subjects):
pass # IE: GetMulti
@rule(UnhydratedStruct, [Select(AddressMapper), Select(Address), Rule(AddressFamily, Dir)])
def resolve_unhydrated_struct(address_mapper, address, address_family_rule):
address_family = address_family_rule(address.spec_path)
... IOW, I'm a Rule author who knows how to operate on certain selected values but I need to enlist the help of other rules imperatively. The proposed syntax change says this more directly I think and allows eliminating AST gymnastics. Taken further (but importantly ignoring variant selects), this just means @rule becomes nothing but type annotations for a python function; ie: things eventuially could just boil down to: @rule(UnhydratedStruct, [AddressMapper, Address, Rule(AddressFamily, Dir)])
def resolve_unhydrated_struct(address_mapper, address, address_family_rule): |
Thanks @jsirois , that's definitely an interesting suggestion! One downside I see is that it removes the trampolining that happens when we
I believe that it's possible that the EDIT: Although, of course, they would still be coroutines, and thus harder to invoke. |
Another advantage that @illicitonion might point out is that avoiding coroutines would mean that we could more easily use Skylark at some point (which doesn't support them, afaik). EDIT: Then again, skylark probably doesn't support decorators either. |
I may be missing something here, but couldn't the implementation of the call into the engine be masked, deferring the optimizations for behind-the-scenes work?: >>> def coroutine(value):
... result = yield rust(value)
... yield result
...
>>> def rust(value):
... return value * 2
...
>>> def using_coroutine(value):
... c = coroutine(value)
... for r in c:
... if r:
... return r
...
>>> using_coroutine(3)
6 Where |
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Finished some discussion about coroutines vs rule stubs injected as values like all other values in the function arguments in slack #engine. I'd still prefer the latter on behalf of the average rule writer, but my instincts may be wrong and I don't strongly object.
After dancing around the realization a bit, it occurred to us this morning that if we have synchronous calls back and forth between python and rust, we can't have async io. While fully switching to synchronous io via |
56dd7bd
to
9879ff2
Compare
The top commit now adds a test utility function that makes it easier to test |
9879ff2
to
f62743a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like the direction this is going.
The yield syntax in rules is a lot more understandable than the projecting selects.
I'd like to think a bit more about how we can validate that rules construct the Get
s in expected ways going forward, but I don't think that blocks getting this in.
I do have a few minor requests about the doc updates, but I don't think they're blocking. I think doing a couple copy edit passes on the docs might be worthwhile as a separate task.
src/python/pants/engine/README.md
Outdated
@rule(StringType, [Select(IntType)]) | ||
def int_to_str(an_int): | ||
return str(an_int) | ||
The first argument to the `@rule` decorator is the Product (ie, return) type for the @rule. The |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: second @rule
needs back ticks
src/python/pants/engine/README.md
Outdated
it will next look for any other registered Rules that can compute an IntType Product for the | ||
the subject is already of `type(subject) == IntType`, then the @rule will be satisfiable without | ||
any other depenencies. On the other hand, if the type _doesn't_ match, the engine doesn't give up: | ||
it will next look for any other registered @rules that can compute an IntType Product for the | ||
Subject (and so on, recursively.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a couple more @rule
references here without backticks. Also s/depenencies/dependencies/
src/python/pants/engine/README.md
Outdated
@rule(FormattedInt, [Select(IntType)]) | ||
def int_to_str(an_int): | ||
return FormattedInt('{}'.format(an_int)) | ||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it'd make sense to add a paragraph here explaining the FormattedInt, and maybe providing an example of what a subject would be that would cause it to be evaluated. I think discussing subjects here would help the next section do its job of explaining how to change subjects.
src/python/pants/engine/README.md
Outdated
In cases where this is necessary, `@rule`s may be written as coroutines (ie, using the python | ||
`yield` statement) that yield "`Get` requests" that request products for other subjects. Just like | ||
`@rule` parameter Selectors, `Get` requests instatiated in the body of an `@rule` are statically | ||
checked to be satisfiable in the set of installed `@rules`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: s/instatiated/instantiated/
src/python/pants/engine/README.md
Outdated
pass | ||
This @rule declares that: "for any Subject for which we can compute `Files`, we can also compute | ||
`ConcattedFiles`". Each yielded `Get` request results in FileContent for a different File Subject | ||
from the Files list. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels like it could use a little more explanation in it. Maybe something like
This `@rule` declares that: "for any Subject for which we can compute `Files`, we can also compute
`ConcattedFiles`". In order to do that, it needs the `FileContent` for each of the files in the selected
`Files`. It gets those contents by yielding `Get` requests that ask for `FileContent`s for with each
file as a Subject. Once those have been collected, it concatenates the contents and yields the result.
if isinstance(node, ast.FunctionDef) and node.name == func.__name__: | ||
rule_visitor = _RuleVisitor() | ||
rule_visitor.visit(node) | ||
gets.extend(Get(resolve_type(p), resolve_type(s)) for p, s in rule_visitor.gets) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is pretty slick. After we nail down the API further, or as part of doing so, I'd like to see a bunch of tests that cover different ways that Gets
are defined and the error cases defined here.
src/python/pants/engine/selectors.py
Outdated
return (product_type.id, subject_type.id) | ||
else: | ||
raise Exception('Invalid {}; expected either two or three args, but ' | ||
'got: ({})'.format(Get.__name__, render_args())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps these should raise ValueError
?
…cies with the generator syntax.
f62743a
to
febb531
Compare
`SelectProjection` is entirely superseded by the new `@rule` coroutine syntax from #5580.
Problem
The "complex"
Selector
types used to select@rule
inputs for subjects other than the current subject of an@rule
have proven to be entirely too complicated. Additionally, their lack of flexibility meant that we needed to introduce many intermediate types to provide links between@rules
in cases where imperative logic was needed for filtering or conditionals.The only remaining motivating factor for these complex selectors was that they allow for closed-world compilation of the
@rule
graph, which avoids the need for error checking in rule bodies and provides for easier composition of sets of@rules
.Solution
After experimenting with using coroutines (ie: functions containing
yield
statements that receive a result on their left hand side), it became clear both that:And so, this change removes all usage of
SelectProjection
and most usage ofSelectDependencies
in favor of@rule
coroutines. It does not yet remove the implementation of thoseSelectors
(although doing so will shed a massive amount of complexity from the codebase).Result
@rules
are still statically checked, but are now much easier to write. Performance is unaffected.