-
Notifications
You must be signed in to change notification settings - Fork 89
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rules for cumsum
#573
Rules for cumsum
#573
Conversation
##### `cumsum` | ||
##### | ||
|
||
function frule((_, xdot), ::typeof(cumsum), x::AbstractArray; dims::Integer) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps an frule
for cumsum!
here as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are forward rlues for mutating functions safe? I haven't thought that through.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For reverse mode, #521 has thorny examples.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I believe so, since the primal and tangent are mutated simultaneously. We already have several frule
s for mutating functions. See e.g. https://github.com/JuliaDiff/ChainRules.jl/blob/main/src/rulesets/LinearAlgebra/factorization.jl.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, nice.
Then someone should do a big PR adding this to half the Base functions, right? sum!
and circshift!
et. al.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Must we define methods like cumsum!([0,0,0], ZeroTangent())
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, not certain. @oxinabox didn't we decide at some point that it's up to the AD to handle cases where all (co)tangents are zero tangents?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we can solve this when or if it comes up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah we expect the AD to handle cases in reverse mode when inputs are all ZeroTangent
.
For forwards mode, i would need to think more, but i think even with mutation https://en.wikipedia.org/wiki/Nothing_comes_from_nothing
We can absolutely add fixes as things come up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dod not understand by what way this works, but the tests show it does
No description provided.