Fixes gradient, Jacobian, Hessian, and vjp tests #2
Conversation
AbstractDifferentiation.jl: Pass 117 / Total 117

The issue within the jvp tests was that v = (rand(length(xvec)), rand(length(yvec))) was interpreted as one directional derivative, i.e.,

x = to_vec(xvec, yvec)
ẋ = to_vec(v)
fdm(ε -> f(x .+ ε .* ẋ), zero(eltype(x)))

The fix now just augments v accordingly.
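For reference, a runnable sketch of that flattened interpretation, using hypothetical stand-ins (f, xvec, and yvec below are not the actual test fixtures from the suite):

using FiniteDifferences

f(x, y) = x .* exp.(y)                        # hypothetical two-argument function
xvec, yvec = rand(3), rand(3)
v = (rand(length(xvec)), rand(length(yvec)))
fdm = central_fdm(5, 1)

# Flatten the inputs and the tangents into single vectors, as in the snippet above.
x, from_vec = to_vec((xvec, yvec))
ẋ, _ = to_vec(v)

# One combined directional derivative of the flattened function along ẋ:
jvp_combined = fdm(ε -> to_vec(f(from_vec(x .+ ε .* ẋ)...))[1], zero(eltype(x)))

This mirrors the snippet quoted above; the contrast with the per-argument interpretation is spelled out further down in this thread.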
return y * derivative(d.backend, d.f, d.xs...)
end

function Base.:*(d::LazyDerivative, y::Union{Number,Tuple})
I don't think we should support multiplication if the derivative returns a tuple. This part of the code can be simplified.
Don't we want to support functions like fder(x, y) = exp(y) * x + y * log(x) for the LazyOperators as well?
Hmm I guess we could.
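For illustration of what that would mean: the derivative of such a two-argument function is one partial derivative per argument, so a scalar times a LazyDerivative would have to act elementwise on a tuple. A minimal sketch with hand-written partials, independent of any backend:

fder(x, y) = exp(y) * x + y * log(x)
dfder(x, y) = (exp(y) + y / x, x * exp(y) + log(x))   # (∂f/∂x, ∂f/∂y)

3.0 .* dfder(1.5, 2.0)   # what 3.0 * LazyDerivative(backend, fder, (1.5, 2.0)) would have to return

Whether the lazy operators should handle this tuple case is exactly the question above.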
Base.:*(y, d::LazyGradient) = y * gradient(d.backend, d.f, d.xs...)

function Base.:*(d::LazyGradient, y::Union{Number,Tuple})
    if d.xs isa Tuple
Why is xs sometimes a tuple and sometimes not?
I wanted to support:

AD.LazyJacobian(fdm_backend, x->fjac(x, yvec), xvec)

where xs is then only a vector xvec and not (xvec,).
I see.
src/AbstractDifferentiation.jl (Outdated)
end
end

function Base.:*(d::LazyJacobian, ys::AbstractArray)
I don't think this does the right thing if ys is a matrix.
Should we throw an error in that case or fix that manually by putting a vec? I now combined these AbstractArray specializations with the general fallback. They were just differing by stuff like (y,).
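A rough sketch of what such a combined fallback could look like; this is only an illustration of the idea, assuming the LazyJacobian fields backend, f, and xs used above, not the exact method merged in this PR:

function Base.:*(d::LazyJacobian, ys)
    # Normalize a single tangent and a bare (non-tuple) xs, instead of keeping
    # separate AbstractArray/Number/Tuple specializations that only differ by (y,).
    ys_tuple = ys isa Tuple ? ys : (ys,)
    xs_tuple = d.xs isa Tuple ? d.xs : (d.xs,)
    return pushforward_function(d.backend, d.f, xs_tuple...)(ys_tuple)
end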
src/AbstractDifferentiation.jl (Outdated)
end
end

function Base.:*(d::LazyHessian, ys::AbstractArray)
how is this function different from the general fallback?
src/AbstractDifferentiation.jl (Outdated)
end
end

function Base.:*(ys::AbstractArray, d::LazyHessian)
can we combine this with the general fallback?
Good idea! See my comment above as well. When I started, I thought it made sense to dispatch them all separately because of these tiny differences between Arrays, Tuples, and Numbers as input, but it really leads to much more code and gets less readable.
@@ -350,6 +510,19 @@ function define_pushforward_function_and_friends(fdef)
pff(cols)
end
end
elseif eltype(identity_like) <: AbstractMatrix
Could you please comment this part? It's not clear to me what's happening here.
This was the fix I added for the computation of the Hessian. identity_like and cols look as follows in that case:

identity_like = ([1.0 0.0 0.0 0.0 0.0; 0.0 1.0 0.0 0.0 0.0; 0.0 0.0 1.0 0.0 0.0; 0.0 0.0 0.0 1.0 0.0; 0.0 0.0 0.0 0.0 1.0],)
identity_like[1] = [1.0 0.0 0.0 0.0 0.0; 0.0 1.0 0.0 0.0 0.0; 0.0 0.0 1.0 0.0 0.0; 0.0 0.0 0.0 1.0 0.0; 0.0 0.0 0.0 0.0 1.0]
cols = [1.0, 0.0, 0.0, 0.0, 0.0]
cols = [0.0, 1.0, 0.0, 0.0, 0.0]
cols = [0.0, 0.0, 1.0, 0.0, 0.0]
cols = [0.0, 0.0, 0.0, 1.0, 0.0]
cols = [0.0, 0.0, 0.0, 0.0, 1.0]

So it was mainly to fix the input/output for the additional pushforward that is used. I'll need to check in a bit more detail if one can simplify that function a bit more. IIRC, without that function I got dimension errors in the jvp of FiniteDifferences.jl because it would have pushed forward matrices like identity_like[1] instead of the columns.
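A self-contained sketch of that column-wise pushforward, using a hypothetical scalar function and point (the real logic lives inside define_pushforward_function_and_friends):

using FiniteDifferences, LinearAlgebra

f(x) = sum(abs2, x) / 2                       # hypothetical test function
x0 = rand(5)
fdm = central_fdm(5, 1)

# The tuple-wrapped identity shown above:
identity_like = (Matrix{Float64}(I, length(x0), length(x0)),)

# Push one basis column at a time through the gradient to get one Hessian column
# per pushforward, instead of pushing the whole matrix identity_like[1] forward
# at once (which is what triggered the dimension errors in the jvp).
g(x) = FiniteDifferences.grad(fdm, f, x)[1]
hess_cols = map(eachcol(identity_like[1])) do cols
    fdm(ε -> g(x0 .+ ε .* cols), 0.0)
end
H = reduce(hcat, hess_cols)                   # ≈ I for this particular f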
But I think you are right that this needs some additional care. I think the current version will break down once we'd like to compute a tuple of Hessians.
end

function test_fdm_jvp(fdm_backend)
    v = (rand(length(xvec)), rand(length(yvec)))
    pf1 = AD.pushforward_function(fdm_backend, fjac, xvec, yvec)(v)

    if fdm_backend isa FDMBackend2 # augmented version of v
Could you explain this part?
This is related to the comment above: #2 (comment). The resulting error traces back to the following: when we use v = (rand(length(xvec)), rand(length(yvec))) with our FDMBackend2 as the vector that we'd like to push forward through a function fvec(x, y) that takes two inputs x and y, we get an output that corresponds to the directional derivative with v[1] as the direction along x and v[2] as the direction along y. The other backends, however, interpret v as: change x along the direction v[1] while keeping y fixed, and change y along the direction v[2] while keeping x fixed.
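A sketch of the two interpretations side by side, again with hypothetical stand-ins for the actual test function and inputs:

using FiniteDifferences

fvec(x, y) = x .* exp.(y)                     # hypothetical two-argument function
xvec, yvec = rand(3), rand(3)
v = (rand(length(xvec)), rand(length(yvec)))
fdm = central_fdm(5, 1)

# Per-argument interpretation (the other backends): change x along v[1] with y
# kept fixed, and change y along v[2] with x kept fixed, giving two results.
jvp_x = fdm(ε -> fvec(xvec .+ ε .* v[1], yvec), 0.0)
jvp_y = fdm(ε -> fvec(xvec, yvec .+ ε .* v[2]), 0.0)

# FDMBackend2 (FiniteDifferences.jvp): one directional derivative along the
# combined tangent, which equals the sum of the two pieces above.
jvp_combined = FiniteDifferences.jvp(fdm, fvec, (xvec, v[1]), (yvec, v[2]))
# jvp_combined ≈ jvp_x .+ jvp_y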
Looks good overall. Just left a few comments because I couldn't immediately understand what's happening in a couple of places.
Current test status:
There are basically two fixes:

1. The wrong x was used to compute the gradients, so the value was computed wrongly.
2. When the input is an AbstractMatrix, it didn't loop over its columns but vectorized the entire matrix, which led to a dimension error.

It remains to fix the jvp case. For example, fdm_backend2 results in a single output vector instead of a tuple of two output vectors.