Explicit Rules for Higher Order "Adjoints" #67

jessebett · 2019-07-31T19:02:04Z

Can the API for adding an adjoint rule allow for explicitly specifying the rule for a higher order adjoint?

e.g. `D(sin, x) = v -> v * cos(x)`, but I also know that `D(sin, x; n=2) = (v1, v2) -> v2 * v1 * (-sin(x))`


willtebbutt · 2019-07-31T19:39:19Z

Hmmm I don't think that we currently have that interface. I would be very interested to know what you think it should look like though, preferably with reference to the existing API.


oxinabox · 2020-01-04T18:41:03Z

So I think I worked out the API this should have.
It's basically frule (cf. #74),
except instead of giving perturbations (input sensitivities) as a single differential,
you pass them in as a series of Tensor Coefficients (to use Griewank's term) of length N,
and the Taylor rule returns the output and a series of tensor coefficients, also of length N, which is the perturbation having been pushed forward (output sensitivities).

We don't take N directly; we get it implicitly from whatever we are pushing forward.
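
A minimal sketch of what such a rule could look like for `exp`. It assumes the series are stored as normalized Taylor coefficients (the `x_k` in `x(t) = x0 + x_1 t + ... + x_N t^N`) rather than unscaled derivative coefficients, and the name `taylorrule` and its signature are hypothetical, not part of ChainRulesCore:

```julia
# Hypothetical Taylor-mode rule for exp; not an existing ChainRulesCore function.
# `x0` is the primal input and `xs` holds the Taylor coefficients x_1..x_N
# of the input path x(t) = x0 + x_1*t + ... + x_N*t^N.
function taylorrule(::typeof(exp), x0::Real, xs::AbstractVector{<:Real})
    N = length(xs)              # the order comes implicitly from the input series
    y0 = exp(x0)
    ys = zeros(typeof(y0), N)   # Taylor coefficients y_1..y_N of exp(x(t))
    y(k) = k == 0 ? y0 : ys[k]  # coefficient lookup that includes the primal term
    for k in 1:N
        # From y'(t) = x'(t) * y(t):  k*y_k = sum_{j=1}^k j*x_j*y_{k-j}
        ys[k] = sum(j * xs[j] * y(k - j) for j in 1:k) / k
    end
    return y0, ys
end

# Input path x(t) = t around x0 = 0, pushed forward to third order.
y0, ys = taylorrule(exp, 0.0, [1.0, 0.0, 0.0])
```

For the path `x(t) = t` at `x0 = 0` this returns the Taylor coefficients of `exp(t)`, i.e. `1`, `1/2` and `1/6`.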

jessebett · 2020-01-07T19:48:16Z

@oxinabox this is exactly what we've done with jax.jet. Though, I'm not sure if our choice to pass around Tensor Coefficients instead of scaled Taylor Coefficients (again, Griewank's distinction) was actually a good choice. Something else to consider.
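
For reference, a sketch of the scaling that separates the two conventions, assuming a scalar path `x(t)` expanded around `t = 0`. The scaled Taylor coefficients `x_k` carry a `1/k!` factor, while the unscaled derivative coefficients `d^k x / dt^k` do not:

```latex
x(t) = \sum_{k=0}^{N} x_k \, t^k ,
\qquad
x_k = \frac{1}{k!} \left. \frac{d^k x}{dt^k} \right|_{t=0}
```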

oxinabox · 2020-01-16T22:57:41Z

Originally posted by @shashi in #74 (comment)

It turns out we need to change #88 to be of the form:

res = frule(f, x..., partials...)
if res !== nothing
    fx, pushforward = res
    partials = pushforward(Zero(), partials...)
end

We're starting to think about Taylor mode FD where we need to differentiate through pushforward. If we don't have pushforward as a separate function, then we'd have to differentiate a call to frule which also re-runs the primal computation.

oxinabox · 2020-01-16T23:01:32Z

I don't think that's right.
What I think we need to do is to define frules (or perhaps call them taylorrules instead)
that give you the Taylor series of the right dimension.
And if they do that via defining some pushforward function that they then call AD on
(cf. #68) then that's fine.

Though I suspect that most of the frules needed have well-known Taylor series that we could write out much more efficiently.

Or perhaps ones that we would like to get using symbolic AD.

oxinabox · 2020-01-17T19:41:57Z

Paraphrasing the second part of my post in #102 (comment)

We can't generate higher order frules from lower order frules (either with fused or unfused pushforwards) via AD, because it runs into the same problems that recursive forward mode runs into in the first place:
the problems that Taylor mode wants to avoid,
or that at the very least need to be carefully programmed around.

Because all functions we want to write frules for call other functions themselves, any frule we write for them implicitly (or explicitly) invokes the chain rule.
The nth order generalization of that is Faà di Bruno's formula, which dates from the 1800s.

The nth order derivative of f(g(x)) needs a bunch of different combinations of intermediate values that have already been computed when taking the (n-1)th derivative (or earlier).
Recursively applying AD recomputes them, so that approach robs us of that possible efficiency.

Here are the Faà di Bruno formulas for the derivatives of f(g(x)):

* 0th: `f(g(x))`

* 1st:  `g'(x) f'(g(x))`
  - reuses `g(x)`

* 2nd:  `g'(x)^2 f''(g(x)) + g''(x) f'(g(x))`
  - reuses `g(x)`, `g'(x)` and `f'(g(x))`

* 3rd:  `f'''(g(x)) g'(x)^3 + 3 g'(x) g''(x) f''(g(x)) + g'''(x) f'(g(x))`
  - reuses `g(x)`, `g'(x)`, `g'(x)^2`, `g''(x)`, `f''(g(x))` and `f'(g(x))`

So naive attempts at using AD to generate the rules via recursive calls will fail to put us in a position to easily reuse these values.
If we use symbolic AD to do it all at once we might be better off, if ModelingToolkit can do very aggressive common subexpression elimination (CSE) on the final product.
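
To make the reuse concrete, here is a small sketch that evaluates the formulas listed above up to third order while computing each shared intermediate only once. The function name `compose_derivatives` and its argument layout are illustrative assumptions, not an existing API:

```julia
# Derivatives of h(x) = f(g(x)) up to third order via Faà di Bruno's formula.
# `gs` = (g(x), g'(x), g''(x), g'''(x)) as values at the point x;
# `fs` = (f, f', f'', f''') as functions. Both layouts are illustrative only.
function compose_derivatives(fs, gs)
    g0, g1, g2, g3 = gs
    # Evaluate f and its derivatives at g(x) once, then reuse them below.
    f0, f1, f2, f3 = map(f -> f(g0), fs)
    g1sq = g1^2                      # shared by the 2nd and 3rd order terms
    h0 = f0
    h1 = g1 * f1
    h2 = g1sq * f2 + g2 * f1
    h3 = f3 * g1sq * g1 + 3 * g1 * g2 * f2 + g3 * f1
    return h0, h1, h2, h3
end

# Example: h(x) = sin(x^2) at x = 1, so f = sin and g(x) = x^2.
x = 1.0
fs = (sin, cos, u -> -sin(u), u -> -cos(u))
gs = (x^2, 2x, 2.0, 0.0)
h0, h1, h2, h3 = compose_derivatives(fs, gs)
```

A recursive AD call that derives the third-order rule from the second-order one has no natural way to see that `f'(g(x))`, `f''(g(x))` and `g'(x)^2` are already available, which is the efficiency being lost.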

oxinabox · 2020-01-18T10:23:32Z

@jessebett
for Taylor mode, does one even want the nth order adjoint?
Or does one want the nth Taylor coefficient?
Given that those stop being equal for n>2.

I am leaning towards us leaving frule as it is (for first order only), and then adding trule for Taylor mode rules.
Then frule can fall back to trule on a single term.
So providing the trule gives you the frule.

That way in Taylor mode you either hit good trules that give you exactly what you want,
or you don't and you just do the normal thing of doing the computation on the polynomial,
since for higher orders that computation will be more efficient than e.g. an AD'd frule anyway.
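
A rough sketch of how that fallback could be wired up. The names `trule` and `frule_via_trule` and their signatures are hypothetical (nothing here is defined by ChainRulesCore), and the series are assumed to hold normalized Taylor coefficients:

```julia
# Hypothetical sketch: `trule` and `frule_via_trule` are illustrative names only.
# trule(f, x0, xs) returns (f(x0), output series), where `xs` holds the Taylor
# coefficients x_1..x_N of the input path x(t) = x0 + x_1*t + ... + x_N*t^N.

# Hand-written Taylor rule for sin, using the coupled sin/cos recurrence.
function trule(::typeof(sin), x0::Real, xs::AbstractVector{<:Real})
    N = length(xs)
    s0, c0 = sincos(x0)
    ss = zeros(typeof(s0), N)   # Taylor coefficients of sin(x(t))
    cs = zeros(typeof(c0), N)   # Taylor coefficients of cos(x(t)), needed by the recurrence
    s(k) = k == 0 ? s0 : ss[k]
    c(k) = k == 0 ? c0 : cs[k]
    for k in 1:N
        ss[k] =  sum(j * xs[j] * c(k - j) for j in 1:k) / k
        cs[k] = -sum(j * xs[j] * s(k - j) for j in 1:k) / k
    end
    return s0, ss
end

# Generic fallback: no Taylor rule is known for this function.
trule(f, x0, xs) = nothing

# First-order rule obtained by calling the Taylor rule on a single-term series.
function frule_via_trule(f, x0, dx)
    res = trule(f, x0, [dx])
    res === nothing && return nothing
    y0, ys = res
    return y0, ys[1]
end

y, dy = frule_via_trule(sin, 0.5, 2.0)   # dy is 2.0 * cos(0.5)
```

When `trule` returns `nothing`, a Taylor-mode AD system would presumably fall back to propagating the polynomial through the function's own body, which is the "normal thing" described above.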

oxinabox transferred this issue from JuliaDiff/ChainRules.jl Nov 21, 2019

YingboMa pushed a commit to YingboMa/ChainRulesCore.jl that referenced this issue Dec 21, 2019

Merge pull request JuliaDiff#67 from JuliaDiff/ox/reorg

e24c190

Big re-organization

oxinabox added the speculative fairly out there ideas for consideration in longer term label Jan 4, 2020

willtebbutt mentioned this issue Jan 16, 2020

Seperating frule and pushforward prevents efficient solutions (fuse pushforward) #74

Closed

YingboMa mentioned this issue Jan 16, 2020

Make frule return a closure which returns the derivative #102

Closed
