Planned backends to implement #40

Open · 4 of 10 tasks
sethaxen opened this issue Jan 26, 2022 · 14 comments
Labels: feature (New feature or request), help wanted (Extra attention is needed)

@sethaxen (Member) commented Jan 26, 2022

We should add backends for the following AD/FD packages:
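For context, whichever packages get wrapped, each backend is exposed through the same unified operator API. A minimal sketch with a backend that already exists (the function here is made up for the example, and ForwardDiff is assumed to be loaded):

```julia
import AbstractDifferentiation as AD
using ForwardDiff  # loading the package enables AD.ForwardDiffBackend()

backend = AD.ForwardDiffBackend()

f(x) = [sum(abs2, x), prod(x)]
x = [1.0, 2.0, 3.0]

# Results come back as a tuple, one entry per differentiated argument.
J, = AD.jacobian(backend, f, x)
g, = AD.gradient(backend, x -> sum(abs2, x), x)
```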

@AriMKatz

Can you also add Yota?

@sethaxen (Member, Author)

Yota is ChainRules-compatible, so it should be covered with the others.

@wsmoses (Collaborator) commented Jan 26, 2022

Make sure to add both Enzyme forward and reverse modes!
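For reference, a rough sketch of the two modes through Enzyme's public entry point; exact call signatures have shifted across Enzyme.jl versions, so treat this as illustrative only:

```julia
using Enzyme

f(x) = x^2 + sin(x)

# Reverse mode: Active marks a scalar argument whose derivative we want;
# the result is a nested tuple containing df/dx.
rev = Enzyme.autodiff(Reverse, f, Active, Active(2.0))

# Forward mode: Duplicated pairs the primal value with a tangent seed.
fwd = Enzyme.autodiff(Forward, f, Duplicated(2.0, 1.0))
```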

@sethaxen (Member, Author)

Will do! Should I start with the public API? @frankschae said you had mentioned we might want to use some internal functions (he pointed me to https://github.com/wsmoses/Enzyme.jl/blob/2ce81ffa8f56c5bf44a4d85234c2110fa9d6eb0a/src/compiler.jl#L1745)

@wsmoses (Collaborator) commented Jan 27, 2022

I might not go quite that low level, to save yourself some common LLVM setup. I would probably use the thunk level (https://github.com/wsmoses/Enzyme.jl/blob/2ce81ffa8f56c5bf44a4d85234c2110fa9d6eb0a/src/compiler.jl#L2700) instead, which has options for:

  • a "combined" augmented forward pass + gradient,
  • an augmented forward pass (storing values from the original function that need preservation),
  • a standalone gradient (just running the reverse pass, using the values stored by an augmented forward pass), and
  • forward-mode AD.

This is used, for example, to generate the high-level autodiff/fwddiff routines (https://github.com/wsmoses/Enzyme.jl/blob/2ce81ffa8f56c5bf44a4d85234c2110fa9d6eb0a/src/Enzyme.jl#L173), and it is currently the highest-level point that exposes "split mode" (i.e. the separate augmented forward pass and standalone gradient).

@mohamed82008 (Member) commented Feb 4, 2022

I would like to add a "batch" version of Zygote as a backend, which falls back on Zygote except for `jacobian`, where the pullback is called with all the basis vectors simultaneously (i.e. `pb(I)`, where `I` is the identity matrix). This can be useful for preserving the sparsity of Jacobians, provided all the rules are written in a way that preserves sparsity.

@mohamed82008 (Member)

And a SparseDiffTools backend to optimise for sparsity structure
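Roughly what such a backend would wrap, as a hedged sketch (the toy function and hand-written sparsity pattern below are made up for the example):

```julia
using SparseDiffTools, SparseArrays

# Toy in-place map whose Jacobian is banded (diagonal + subdiagonal).
function f!(y, x)
    for i in eachindex(y)
        y[i] = x[i]^2
        i > 1 && (y[i] += x[i - 1])
    end
    return nothing
end

x = rand(10)
# Known sparsity pattern, written by hand here; it could also come from a
# sparsity-detection tool.
J = spdiagm(0 => ones(10), -1 => ones(9))
colors = matrix_colors(J)  # column coloring compresses the number of dual-number sweeps
forwarddiff_color_jacobian!(J, f!, x, colorvec = colors)
```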

@sethaxen (Member, Author) commented Feb 4, 2022

> I would like to add a "batch" version of Zygote as a backend, which falls back on Zygote except for `jacobian`, where the pullback is called with all the basis vectors simultaneously (i.e. `pb(I)`, where `I` is the identity matrix). This can be useful for preserving the sparsity of Jacobians, provided all the rules are written in a way that preserves sparsity.

Is this a feature Zygote actually supports, or just something that sometimes works?

@ChrisRackauckas (Member)

It requires that the function being differentiated acts independently on each column. A neural network, for example, satisfies this.

@mohamed82008 (Member)

> or just something that sometimes works?

Something that sometimes works. The goal is to make it easy to define a sparse Jacobian in an `rrule` and then get it back when calling `Zygote.jacobian`.
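A minimal sketch of that idea; the operator and its hand-written rule are hypothetical, just to show `pb(I)` returning a sparse (transposed) Jacobian in a single call:

```julia
using Zygote, ChainRulesCore, LinearAlgebra, SparseArrays

const A = sprand(6, 4, 0.3)
myop(x) = A * x

# A rule whose pullback also accepts a matrix cotangent, so all output basis
# vectors can be pulled back at once and sparsity is preserved.
function ChainRulesCore.rrule(::typeof(myop), x)
    myop_pullback(ȳ) = (NoTangent(), A' * ȳ)
    return A * x, myop_pullback
end

x = rand(4)
y, pb = Zygote.pullback(myop, x)

Jt, = pb(sparse(1.0I, length(y), length(y)))  # pb(I): one call for all output bases
J = sparse(Jt')                               # equals A and stays sparse
@assert J ≈ Zygote.jacobian(myop, x)[1]
```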

@JTaets commented Feb 7, 2022

Is adding Symbolics.jl also planned?

In my field (control theory), symbolic differentiation is used almost exclusively: when derivatives need to be evaluated many times, it is fast because it avoids the overhead of the AD logic in the forward pass and the associated allocations. The same applies to machine learning with a constant graph, which could also benefit once common subexpression elimination (CSE) in Symbolics.jl is fully functional.

Calculating the derivative would happen by symbolically tracing the function, generating the derivative/gradient/Jacobian function, and then passing the inputs to that generated function.

This becomes useful once caches are added to this package: for Symbolics.jl, the cache would simply be the generated derivative function, so evaluating the derivative would have essentially no overhead. A sketch of that workflow follows below.
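A rough sketch of that workflow with Symbolics.jl (the traced function is made up for the example):

```julia
using Symbolics

# Function to differentiate; it must be traceable with symbolic inputs.
f(x) = [x[1]^2 + sin(x[2]), x[1] * x[2]]

@variables x[1:2]
xs = collect(x)                          # vector of scalar symbolic variables
J_sym = Symbolics.jacobian(f(xs), xs)    # symbolic Jacobian, computed once

# Compile the symbolic Jacobian into a plain Julia function; this generated
# function is exactly what a cache would store.
J_fun = build_function(J_sym, xs; expression = Val{false})[1]

J_fun([1.0, 2.0])   # numeric evaluation with no tracing or AD overhead
```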

@sethaxen (Member, Author) commented Feb 7, 2022

I think it would be good to support this. As you say, this would require support for caching. See #41.

@prbzrg commented Feb 7, 2023

Via GitHub advanced search, I found some other AD packages as well:

  • gdalle/ImplicitDifferentiation.jl
  • avigliotti/AD4SM.jl
  • JuliaDiff/TaylorDiff.jl
  • abap34/JITrench.jl
  • sshin23/MadDiff.jl

https://github.com/search?l=&o=desc&q=Automatic+Differentiation+stars%3A%3E10+pushed%3A%3E2022-01-01+language%3AJulia&s=stars&type=Repositories

@gdalle (Member) commented May 24, 2023

> gdalle/ImplicitDifferentiation.jl

Actually, ImplicitDifferentiation.jl now uses AbstractDifferentiation.jl under the hood to call any AD package as a backend. Could it be a backend itself? I don't think that's a good idea, so no need to include it on the list :)

@gdalle added the feature (New feature or request) and help wanted (Extra attention is needed) labels on Oct 5, 2023