Broadcasting with differentiable functions (Remove Cast?) #12
Comments
Relevant dev note (will turn into an advanced usage doc section or something at some point): Line 142 in 6cfd4b9
It's a half-placeholder 😉 It kind of expresses the idea, but the implementation is mainly a toy compared to e.g. https://github.com/jrevels/MixedModeBroadcastAD.jl. Regardless, unless I'm misunderstanding something (which could always be the case), none of the default rule definitions should be a barrier to adoption by downstream AD, since downstream ADs can overload rules in whatever manner best fits their specific implementations. Zygote can define the broadcast rule in whatever way makes the most sense for Zygote.
It assumes that forward mode is the correct choice of "fallback" mode for unary scalar functions, which is almost certainly the case. It doesn't make any assumptions about non-unary functions, and/or functions where there's a specialized Anyway, even if we had a more general
Okay, sounds good to me. Could you elaborate a little on how a downstream package would change the default behaviour? If the intention is to literally override the default method in ChainRules, will this not cause annoying warnings? (Not that this is the end of the world, of course...)
Or is the intention that Cassette will come to the rescue?
Related: #122
Most parts of this are resolved or have been moved to other issues.
The current implementation of broadcast assumes that the function being broadcasted doesn't contain any differentiable bits, and that we can therefore safely assume that there is no gradient information to be associated with it. It also assumes that the forwards-inside-reverse-mode trick is the correct choice for implementing the adjoint, which isn't necessarily the case.

Presumably this implementation is a placeholder. However, it will definitely be necessary to relax the above assumptions before e.g. Zygote is able to adopt ChainRules, so I believe it should be addressed as a priority.
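To make the two assumptions concrete, here is a minimal, self-contained sketch of the forwards-inside-reverse idea for a unary scalar function. It is not the ChainRules implementation: the `Dual` type and the `broadcast_with_pullback` helper are hypothetical names invented for this illustration, and only the handful of arithmetic methods the demo needs are defined.

```julia
# Minimal dual number: a value plus a single derivative component.
# (Hypothetical type for illustration; real code would use e.g. ForwardDiff.)
struct Dual
    val::Float64
    der::Float64
end

# Just enough arithmetic for the demo below.
Base.:*(a::Dual, b::Dual) = Dual(a.val * b.val, a.der * b.val + a.val * b.der)
Base.:*(a::Real, b::Dual) = Dual(a * b.val, a * b.der)
Base.sin(a::Dual) = Dual(sin(a.val), cos(a.val) * a.der)

# Hypothetical helper, not a ChainRules function. The forward pass evaluates
# f on dual numbers, so f.(x) and the elementwise derivatives come out
# together. The returned pullback only scales the incoming cotangent.
function broadcast_with_pullback(f, x::AbstractVector{<:Real})
    duals = map(xi -> f(Dual(xi, 1.0)), x)
    y = map(d -> d.val, duals)
    dydx = map(d -> d.der, duals)
    # Note: this produces a cotangent for x only; nothing flows back to f.
    pullback(ybar) = ybar .* dydx
    return y, pullback
end

x = [0.0, 1.0, 2.0]

# Plain function: the sketch recovers d/dx sin(x) = cos(x) elementwise.
y, back = broadcast_with_pullback(sin, x)
dx = back(ones(3))

# Closure capturing a parameter w: the derivative with respect to x is
# still right (2w*x), but no gradient with respect to w is ever produced.
w = 3.0
g = z -> w * z * z
y2, back2 = broadcast_with_pullback(g, x)
dx2 = back2(ones(3))
```

The closure case is where the first assumption bites: `g` depends on `w`, yet the pullback has no way to report a cotangent for it. A rule that relaxes the assumption would need to return gradient information for the broadcasted function itself as well as for its arguments.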