Dealing with non-array/scalar structured values #8
Comments
I'm also not entirely sure what the correct answer is here. The other thing we should sort out while making these choices is whether, when dealing with non- |
Something else to consider here is objects with "virtual fields," for which in some cases the actual fields may be considered "private." For example, |
So my attempt to summarize the discussion right now: The normal case: (Without NamedTuples) Now the new case is if one of the inputs was a Struct. This makes sense as at the end of the day a Coming off this, to make things nice we might want to add This also maximises our expressive power,
Also we want to do the same for |
Also note that we might not actually need |
@willtebbutt do you remember why we decided on a That latter feels better to me now. |
I think it was probably something to do with a named tuple of differentials being something that can reasonably participate immediately in whatever computations are required downstream, whereas a named tuple of rules would require further unwrapping etc. For example, if you have a heavily nested namedtuple (e.g. corresponding to the adjoint w.r.t. a struct inside a struct inside a struct etc in Zygote) you'll wind up having to repeat a two-step procedure to recursively evaluate everything as required as you work through the nested named tuple, rather than just having every already evaluated. I think, in short, it's just simpler to have a named tuple of differentials than a named tuple of rules. |
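To make the trade-off concrete, here is a minimal sketch in plain Julia that uses no ChainRules types. The zero-argument closures only stand in for unevaluated rules, and `force` is a hypothetical helper invented for this illustration. It shows the two-step "call, then recurse" procedure that every consumer of a nested named tuple of rules would have to repeat.

```julia
# Hypothetical helper: recursively evaluate a nested named tuple whose leaves
# may be zero-argument closures (standing in for unevaluated rules).
force(x) = x                              # already a plain value
force(f::Function) = force(f())           # step 1: call the deferred rule
force(nt::NamedTuple) = map(force, nt)    # step 2: recurse into each field

# A nested named tuple of deferred computations, as might arise for the
# adjoint of a struct inside a struct.
lazy = (a = () -> 1.0, inner = (b = () -> 2.0, c = 3.0))

# Every consumer has to force the structure before doing arithmetic with it.
eager = force(lazy)
```

With a named tuple of differentials, `eager` is what the caller would receive directly, so fields such as `eager.inner.b` can be used immediately.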
The annoying case is in particular the fact that most function derivative w.r.t. self, which is by far the most common case.
Also means all deferring of work now has to be done via I guess I will define |
So in the post #30 world we do not have Rules anymore so everything we do in this space is more elegant. At least in terms of having resolves fully the question of Right now for
Ever since #16 we don't have To go with that, we should I think add Also we should have some macro or function to convert NamedTuples and Tuples to the Differential type them. |
This seems reasonable to me. It is kind of sad that we don't have |
I don't think we will encounter more (touch wood). There is an interesting case around things like DateTime. If you have, say, a function of |
The |
@willtebbutt and I have concluded that we only need 1 composite type. Also @willtebbutt is keen on the idea of it being parameterized by the original object type, |
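One way a single composite type parameterized by the original object type could look is sketched below. This is an assumption-laden illustration, not the design that was adopted: `SketchComposite`, its `backing` field, and the `Point` struct are hypothetical names made up for the example.

```julia
# Hypothetical sketch, not the ChainRules implementation: one composite
# differential type, tagged with the primal type `P` it is a differential for.
struct SketchComposite{P, T<:NamedTuple}
    backing::T
end

# Build from keyword arguments, e.g. SketchComposite{Point}(x = 1.0, y = 2.0).
function SketchComposite{P}(; kwargs...) where {P}
    backing = (; kwargs...)
    return SketchComposite{P, typeof(backing)}(backing)
end

# Differentials for the same primal type add field by field.
function Base.:+(a::SketchComposite{P}, b::SketchComposite{P}) where {P}
    return SketchComposite{P}(; map(+, a.backing, b.backing)...)
end

# Example primal type (hypothetical).
struct Point
    x::Float64
    y::Float64
end

d1 = SketchComposite{Point}(x = 1.0, y = 2.0)
d2 = SketchComposite{Point}(x = 0.5, y = -2.0)
total = d1 + d2
```

Carrying `P` in the type means that adding differentials belonging to two different primal types fails with a `MethodError` instead of silently combining unrelated fields.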
I am thinking The interesting one in this framework is the Composite |
Turns out one of the cases we need this for is |
I think we do want |
* Eagerly evaluate scalers rules

  Master behavior

  ```julia
  julia> @scalar_rule(one(x), Zero())

  julia> frule(one, 1, Zero(), [1, 2])
  (1, Zero())

  julia> frule(one, 1, Zero(), One())
  (1, Zero())
  ```

  Desirable behavior

  ```julia
  julia> @scalar_rule(one(x), Zero())

  julia> frule(one, 1, Zero(), [1, 2])
  (1, [0, 0])

  julia> frule(one, 1, Zero(), One())
  (1, Thunk(var"#8#10"()) )
  ```
* New release
* Add tests
* Revert "Eagerly evaluate scalers rules"

  This reverts commit dbe7765.
* Redefine * between ::Zero and ::Any
* Make it nicer
* Add tests and move zero(::AbstractDifferential) to the right folder
closed by #59 |
Consider the case of an eigenvalue decomposition from the `eigen` function, which produces both eigenvalues and eigenvectors in an `Eigen` object. Giles provides forward- and reverse-mode sensitivities for the decomposition, which depend on both the eigenvalues and vectors. That raises the question of how this should be expressed in ChainRules.

In a conversation with @jrevels, he said that his vision for this was to use named tuples in cases such as this, which involve structures other than arrays and scalars. However, we weren't able to come to a concrete conclusion on whether it makes more sense for a `Rule` for e.g. `eigen` to produce a named tuple upon application of the rule, or if `frule`/`rrule` should themselves produce a named tuple of `Rule`s.

It seems that the author of a rule can reuse computations for the eigenvalues and vectors if applying the rule yields a single named tuple. This would look something along the lines of (untested):

However, if `frule`/`rrule` returns a named tuple of rules, one can specify accessor functions much more conveniently, something like:

To quote Jarrett directly, at what stage do we want the caller to know that they should deal with a named tuple?
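As a rough illustration of the first option (one named tuple whose fields share intermediate work), the sketch below computes forward-mode sensitivities for a real symmetric matrix using only `LinearAlgebra`. It does not use any ChainRules API: `eigen_pushforward` is a hypothetical name, the formulas are the standard symmetric-case forward-mode result, and distinct eigenvalues are assumed.

```julia
using LinearAlgebra

# Hypothetical helper, not a ChainRules API: forward-mode sensitivities of a
# symmetric eigendecomposition, returned together as one named tuple.
# Assumes the eigenvalues of A are distinct.
function eigen_pushforward(A::Symmetric, dA::AbstractMatrix)
    F = eigen(A)
    λ, U = F.values, F.vectors
    P = U' * dA * U                       # intermediate shared by both fields
    dλ = diag(P)                          # eigenvalue sensitivities
    n = length(λ)
    # E[i, j] = 1 / (λ[j] - λ[i]) off the diagonal, 0 on the diagonal
    E = [i == j ? 0.0 : 1 / (λ[j] - λ[i]) for i in 1:n, j in 1:n]
    dU = U * (E .* P)                     # eigenvector sensitivities
    return F, (values = dλ, vectors = dU)
end

A = Symmetric([2.0 1.0; 1.0 3.0])
dA = [0.0 1.0; 1.0 0.0]
F, dF = eigen_pushforward(A, dA)
```

Here `P` is computed once and reused for both `dF.values` and `dF.vectors`, which is the reuse that a named tuple of separate rules would make harder to express.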