collect OneElement when used with implicit Params #989
It seems to me that there must have been a bug in accum, rather than a need to use OneElement? It's pretty standard to be truthful about the element types we deal with, and yes, we would not want to rule out immutable arrays (or any other such data type). We should explore what the flaw with accum (before OneElement) was.
Why does OneElement attempt to modify the return type or the type of output gradient at all? That doesn't seem right.
Zygote already doesn't enforce that your returned gradient type matches the input parameters at all. I'm not saying that this is a good thing, but:

    using Zygote
    Zygote.gradient(sum, rand(4))[1] # 4-element Fill{Float64}: entries equal to 1.0

It's just generally a problem with Zygote. As a band-aid this is fine. What really should be happening is one should
Good example. I think this I'd argue that the ideal thing isn't to satisfy Flux by unnecessarily materialising things, but for Flux to at least check before blindly mutating, but ideally update
I think that change should be done regardless. Flux tries to make sure you aren't mutating, but then requires that the types can mutate, which seems to just generally be a combination that would cause trouble.
I can't think of any areas off the top of my head where Flux tries to prevent mutation beyond what Zygote complains about. Certainly there is a laundry list of pain points with the current optimizers, and that's what Optimisers.jl is trying to address. See in particular this PR, which brings in ArrayInterface to avoid mutating immutable parameter types. Now, there are still a couple of roadblocks. Foremost is that not every parameter type in Flux is a proper (Abstract)Array. Dhairya and I talked about that yesterday and it hopefully shouldn't be a problem for too much longer. The second, more fundamental issue is that not being able to mutate wreaks all sorts of havoc when using implicit params. I've been ruminating over writing a "Taking Explicit Params Seriously" issue for a while now, but need to figure out how to structure it to avoid too much scorched earth ;)
Related to FluxML/Flux.jl#1510. I agree with @ChrisRackauckas, we
Closing in favour of fixing this in Flux, then. Both a quick band-aid, and ultimately a better design.
1613: use ArrayInterface.restructure in update! r=CarloLucibello a=CarloLucibello Suggestion coming from @ChrisRackauckas in FluxML/Zygote.jl#989. Now `update!` handles basically any gradient Zygote emits, e.g. FillArrays and Zygote.OneElement. Fix #1510 Co-authored-by: CarloLucibello <carlo.lucibello@gmail.com>
This is to address SciML/Surrogates.jl#279, by ensuring that when using implicit parameters, the arrays in `Grads` are mutable ones. Current behaviour:

Perhaps deserves a bit more thought before merging. Do we insist that gradients in `Grads` are mutable?

The stack trace from Flux looks like this, shouldn't it be updating `x::Vector{Float32}` from `xs::Zygote.Params`, not updating `x̄::Zygote.OneElement`?

But before updating `x`, Flux scales the gradient, to apply the learning rate from the optimiser. That's a slightly strange feature, maybe it shouldn't do that?
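
For illustration, the failure mode discussed here can be sketched in plain Julia without Zygote or Flux. The `OneHotVec` type and the `try_inplace_scale!` helper below are hypothetical names made up for this sketch; `OneHotVec` is only a stand-in for an immutable gradient such as `Zygote.OneElement`, not its actual definition, and the update rule is a simplified guess at what an optimiser step does, not Flux's code.

```julia
# Hypothetical stand-in for an immutable, lazily defined gradient
# (in the spirit of Zygote.OneElement): getindex is defined, setindex! is not.
struct OneHotVec{T} <: AbstractVector{T}
    val::T
    ind::Int
    len::Int
end
Base.size(v::OneHotVec) = (v.len,)
Base.getindex(v::OneHotVec{T}, i::Int) where {T} = i == v.ind ? v.val : zero(T)

# Hypothetical helper: attempt to scale a gradient in place by the learning
# rate, returning whether the mutation succeeded.
function try_inplace_scale!(g, η)
    try
        g .*= η          # needs setindex! on g
        return true
    catch
        return false     # immutable array types end up here
    end
end

x  = Float32[1, 2, 3]        # parameter, an ordinary mutable Vector
x̄  = OneHotVec(1f0, 2, 3)    # gradient, immutable
η  = 0.1f0                   # learning rate

mutated = try_inplace_scale!(x̄, η)   # false: the gradient cannot be mutated

# Scaling out of place and mutating only the parameter works for any
# AbstractArray gradient.
x .-= η .* x̄
```

Under these assumptions only `x` needs to be mutable, which is the direction the thread converges on: have the update step avoid mutating the gradient instead of forcing Zygote to materialise it.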