Add ability to register algorithm passes #1377
Thanks for putting this together. I like what this API is trying to accomplish -- agree we need a better API to sort the ordering of algorithms -- but wonder if it would be helpful to work through the design a bit more? My main questions with this API are:
- Should an algorithm register a pass, or do we want this functionality to live outside of the algorithm? I was thinking for modularity it could be helpful for each algorithm to self-contain this information. But, this would not be possible via this API (nor via the status quo), since algorithms do not have access to the engine.
- Would it be confusing to set the ordering of passes (via the index argument) correctly? Mainly thinking if multiple (different) sources are both inserting passes -- then the order in which `engine.register_pass` is called is important.
- Do we want a full function for algorithm sorting that would be opaque to the engine? Or would it be helpful to give the engine more visibility into the scheduling requirements with something like a DAG, where each algorithm could have a `run_before(self, event) -> Sequence[Type[Algorithm]]` and a `run_after(self, event) -> Sequence[Type[Algorithm]]` method? The engine could then rearrange algorithms (so long as it satisfies the DAG requirements) for optimal performance (e.g. when running with XLA and lazy execution).
I didn't review the code in detail; happy to do that if we would like to go with this design.
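To make the DAG idea above concrete, here is a minimal sketch of how an engine might order algorithms from `run_before` / `run_after` declarations. Everything in it is hypothetical: the `Algorithm` base class is a stand-in, the `schedule` helper is an invented name, and the two algorithm classes only borrow their names from the PR description. It illustrates the proposal and is not code from this PR.

```python
from graphlib import TopologicalSorter  # Python 3.9+
from typing import List, Sequence, Type


class Algorithm:
    """Hypothetical stand-in base class (not the real one)."""

    def run_before(self, event: str) -> Sequence[Type["Algorithm"]]:
        return ()

    def run_after(self, event: str) -> Sequence[Type["Algorithm"]]:
        return ()


class GatedLinearUnits(Algorithm):
    pass


class FusedLayerNorm(Algorithm):
    def run_after(self, event: str) -> Sequence[Type[Algorithm]]:
        # Must see the layer norms that GatedLinearUnits adds.
        return (GatedLinearUnits,)


def schedule(algorithms: List[Algorithm], event: str) -> List[Algorithm]:
    """Return the algorithms in an order satisfying every declared constraint."""
    sorter: TopologicalSorter = TopologicalSorter()
    for i, algo in enumerate(algorithms):
        sorter.add(i)  # register the node even if it has no constraints
        after = tuple(algo.run_after(event))
        before = tuple(algo.run_before(event))
        for j, other in enumerate(algorithms):
            if i == j:
                continue
            if isinstance(other, after):
                sorter.add(i, j)  # other (j) must run before algo (i)
            if isinstance(other, before):
                sorter.add(j, i)  # algo (i) must run before other (j)
    # static_order() raises graphlib.CycleError on contradictory constraints.
    return [algorithms[i] for i in sorter.static_order()]
```

With this shape, `schedule([FusedLayerNorm(), GatedLinearUnits()], "init")` would place `GatedLinearUnits` first regardless of the input order, and the engine stays free to choose any order among algorithms that declare no constraints on each other.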
I don't think this warrants a full design discussion; this is refactoring the existing design for better extensibility and readability (rather than hiding all the algorithm passes inside |
- Interactions are messy between algorithms -- I don't think you can include per algorithm
- Also don't love weird indexing, but fine leaving as a super user feature
- We likely will need to rewrite this with a more complicated scheduling algorithm -- agree with many of the points Ravi raised. With that said, I'm fine with this refactor (which is much better than the current code imo) because I don't think we're at the point where we need to do a full design on algorithm DAGs -- I'd punt this to later
The order in which algorithms are run matters significantly during composition. For example, `FusedLayerNorm` must run after `GatedLinearUnits` (which adds layer norms), to ensure that all layer norms are converted into fused versions. To enforce such orderings, we use algorithm passes, which operate on lists of algorithms.

This PR refactors these algorithm passes into their own `passes` module, and allows the user to register custom passes (for custom algorithms) into the Engine.

Also coming along for the ride are some readability and code quality improvements to the engine.
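As a rough illustration of what "a pass that operates on a list of algorithms" could look like, here is a self-contained toy sketch. It does not reproduce this PR's implementation: `ToyEngine`, `run_passes`, and the pass signature `(algorithms, event) -> algorithms` are assumptions made for the example, the `index` semantics (append when omitted, otherwise `list.insert`) are a guess, and algorithms are represented by plain strings.

```python
from typing import Callable, List, Optional

# Assumed pass signature: take the algorithm list and the event name,
# return the (possibly reordered) list.
AlgorithmPass = Callable[[List[str], str], List[str]]


def move_fused_layernorm_last(algorithms: List[str], event: str) -> List[str]:
    """Stable sort that pushes FusedLayerNorm to the end of the list."""
    return sorted(algorithms, key=lambda name: name == "FusedLayerNorm")


class ToyEngine:
    """Hypothetical engine holding an ordered list of passes."""

    def __init__(self) -> None:
        self.passes: List[AlgorithmPass] = []

    def register_pass(self, algorithm_pass: AlgorithmPass,
                      index: Optional[int] = None) -> None:
        # Assumed semantics: append by default, otherwise insert at `index`.
        if index is None:
            self.passes.append(algorithm_pass)
        else:
            self.passes.insert(index, algorithm_pass)

    def run_passes(self, algorithms: List[str], event: str) -> List[str]:
        # Passes are applied in registration order, each seeing the
        # previous pass's output.
        for algorithm_pass in self.passes:
            algorithms = algorithm_pass(algorithms, event)
        return algorithms
```

Because each pass sees the output of the one before it, the position at which a pass is registered can change the final order, which is the concern raised in review about several sources inserting passes by index.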
todos: