Some limited definitions for nested AD #77
base: master
This adds a bunch of definitions to make nested AD work and adds tests for second-order AD. Unfortunately, by the time we get to third-order AD, the types get so large that base decides to go on vacation while it thinks about whether or not it might be willing to compile a function with a type of such complexity. Additionally, Zygote introduces some unnecessary stacks, which then prevent higher-order AD. I plan to work on both of those issues, but in the meantime, here are the changes to Zygote required to make this work.
The Zygote parts that break nested AD are addressed by #78. Still takes |
b560693 to bfcacfa
In general it'd be cleaner if we could avoid defining second order gradients in favour of e.g. having dgetindex use Also, what's the motivation for
Somewhat surprisingly, the huge types appear not to be a problem (or at least not to make things significantly worse here). I suspect the main issue is simply that we're generating, differentiating and inferring the equivalent of several thousand lines of code. If so, we may not be able to solve this without either running AD on typed IR with aggressive DCE, or (perhaps in the short term) switching to tracing, where DCE is more valuable than other optimisations.
Yes, agreed, but I just wanted to make things work for now.
Just as a debugging aid for now. While we don't have |
340bcb0 to 6b1b80f
Finding third-order derivatives is very slow. Is this the related PR to fix it? BTW, this branch is quite outdated and has a lot of conflicts with master; I hope someone can fix it 😃