
basic manifold diff sandbox test #1697

Merged: 14 commits merged into master on Mar 27, 2023
Conversation

@dehann (Member) commented Mar 15, 2023

Hi @mateuszbaran,

Hope things are well. I was looking at the new ManifoldDiff and Manopt stuff you put up recently. Thanks for putting that up!

I was wondering if this quick example PR (see files changed tab) was in-line with how you'd recommend we implement?

Best,
Dehann

PS, We'd like to build towards Riemannian optimization even though the current IncrementalInference implementation still has a few remaining AbstractGroupManifold requirements.

PPS, Looks like this basic example has a very heavy recompilation load. We'll be looking to get down to 0% compile time first (most important), then move towards 0 heap allocations from there.

       # and solve
       @time m1 = gradient_descent(M, f, grad_f_FD, data[1])
       
       @info "Basic Manopt test" string(m1')
       @test isapprox(p, m1; atol=0.15)
  0.483038 seconds (421.98 k allocations: 30.606 MiB, 3.25% gc time, 90.31% compilation time: 96% of which was recompilation)
┌ Info: Basic Manopt test
└   string(m1') = "[0.7285775291286883 0.07535342266464667 0.6808058796319686]"
Test Passed
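
For reference, the quoted timing comes from a setup along these lines (a minimal sketch following the standard Manopt.jl mean-on-the-sphere example; the exact definitions are in this PR's files-changed tab, so the names here are assumptions):

    using Manifolds, Manopt, ManifoldDiff, FiniteDifferences, Random, Test

    Random.seed!(42)
    M = Sphere(2)
    p = [1.0, 0.0, 1.0] / sqrt(2)
    data = [exp(M, p, 0.2 * rand(M; vector_at=p)) for _ in 1:100]

    # Riemannian center-of-mass cost
    f(M, q) = sum(distance(M, q, x)^2 for x in data) / (2 * length(data))

    # finite-difference Riemannian gradient via ManifoldDiff.jl
    r_backend = ManifoldDiff.TangentDiffBackend(ManifoldDiff.FiniteDifferencesBackend())
    grad_f_FD(M, q) = ManifoldDiff.gradient(M, x -> f(M, x), q, r_backend)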

codecov bot commented Mar 15, 2023

Codecov Report

Merging #1697 (5791e0c) into master (78d5f54) will decrease coverage by 22.16%.
The diff coverage is 0.00%.

@@             Coverage Diff             @@
##           master    #1697       +/-   ##
===========================================
- Coverage   76.48%   54.33%   -22.16%     
===========================================
  Files          74       73        -1     
  Lines        5601     5595        -6     
===========================================
- Hits         4284     3040     -1244     
- Misses       1317     2555     +1238     
Impacted Files Coverage Δ
src/IncrementalInference.jl 87.50% <ø> (ø)
src/ManifoldsExtentions.jl 49.09% <0.00%> (-32.31%) ⬇️

... and 45 files with indirect coverage changes


@mateuszbaran

Hi!

Thank you, things are going mostly OK. When I was working on finishing the ManifoldDiff.jl AD rules it occurred to me that my code is quite far from being polished and well tested. There are a lot of edge cases that are quite difficult to work out. I haven't published some parts yet because I have too few good applications that would show which parts are actually important to polish.

Here are some of the conclusions from the work I've done:

  1. We haven't put much effort into limiting recompilation in Manifolds.jl yet. I've checked ManifoldsBase.jl and it looks quite clean regarding invalidation/recompilation issues. Manifolds.jl/Manopt.jl recompilation is something I'd like to tackle at some point so I can help with that.
  2. As far as I'm aware, all AD libraries are fairly (re-)compilation-heavy. ForwardDiff.jl adds type-level tags everywhere, Zygote.jl does lots of runtime stuff, ReverseDiff.jl is often quite slow (runtime) in my experience, Enzyme's rule system is still somewhat experimental... There are surely things that can be improved but 0% compilation time with AD doesn't look particularly realistic to me, unless you want that only for specific functions, which would turn it into AOT gradient compilation.
  3. 0 heap allocations with AD is very hard on nontrivial code. In reverse mode IIRC only Enzyme can do that and in forward mode it requires fairly advanced tricks. Have you considered writing gradients, Jacobians and Hessians by hand instead? Note that this also opens up additional optimisation opportunities. For example, currently Julia's AD ecosystem doesn't do any inter-procedural (non-inlined) optimisations AFAICT (as opposed to libraries like PyTorch or TensorFlow). This is a complicated topic. I'm open to an online chat, which would make this discussion easier 🙂. (A minimal hand-written gradient sketch follows after this list.)
  4. I have recently spent a decent amount of time comparing Optim.jl's and Manopt.jl's Riemannian optimisation performance on some problems from multivariate data analysis (things like ICA or Varimax). For most (but not all) objective functions Optim.jl converged faster than Manopt.jl. I have a small piece of code that can turn Manifolds.jl projections into Optim.jl retract and project_tangent so Optim.jl can work with any of our manifolds. It's reasonably easy to switch between Optim.jl and Manopt.jl, but setting things up for switching isn't particularly obvious. I think the setup I have is better than using Optimization.jl (which doesn't even have an interface for manifold constraints yet).
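
To make item 3 concrete with the example quoted at the top of this PR (a sketch reusing M and data from there): the Riemannian gradient of distance(M, q, x)^2 / 2 at q is -log(M, q, x), so the whole gradient of that mean-on-the-sphere cost can be written by hand as

    # hand-written Riemannian gradient of
    # f(M, q) = sum(distance(M, q, x)^2 for x in data) / (2 * length(data))
    grad_f(M, q) = -sum(log(M, q, x) for x in data) / length(data)
    # and passed straight to gradient_descent(M, f, grad_f, data[1]), no AD involved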

I hope this helps. Feel free to ask further questions; there is more that I can share, but I don't know what would be interesting for you 🙂.

@dehann (Member, Author) commented Mar 16, 2023

Oh great, thanks for the information. Johan and I spoke on a call earlier to work out the best steps to follow in the short term. We would also like to take you up on the offer to call and chat (thanks!), but we want to prepare and do some legwork first so we don't waste your time. We'll work through some examples and reach out to you when ready.

Two immediate steps for us (scope of 2-3 weeks of work) are: first, to get the IncrementalInference.jl computations onto StaticArrays through all the hot loops, or as far as possible; second, a stop-gap solution for gradients, which will be limited to a few GroupManifolds only. We would also need to do (what we call) "partial constraints" -- e.g. a camera sees a feature as an i,j pixel coordinate, hence a dim-2 measurement inside a dim-6 SpecialEuclidean(3) mechanization (rough sketch below).
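
To make "partial constraint" concrete, here is a rough hypothetical sketch (the names and the camera model are illustrative only, not IncrementalInference.jl API): a pinhole residual that constrains a 6-dof pose through just a 2-dimensional pixel measurement, written on StaticArrays so the hot loop stays free of heap allocations.

    using StaticArrays, LinearAlgebra

    # hypothetical "partial constraint": the pose (R, t) has 6 degrees of freedom,
    # but the measurement only pins down the 2 pixel coordinates of a known landmark
    function pixel_residual(R::SMatrix{3,3}, t::SVector{3}, landmark::SVector{3},
                            meas_ij::SVector{2}; focal=500.0, center=SVector(320.0, 240.0))
        l_cam = R' * (landmark - t)                                        # landmark in the camera frame
        pred = focal .* SVector(l_cam[1], l_cam[2]) ./ l_cam[3] .+ center  # pinhole projection
        return pred - meas_ij                                              # dim-2 residual, stack-allocated
    end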

I have a small code that can turn Manifolds.jl projections into Optim.jl retract and project_tangent so Optim.jl can work with any of our manifolds. It's reasonably easy to switch between Optim.jl and Manopt.jl but setting things up for switching isn't particularly obvious.

Would you perhaps be open to sharing that code snippet? Ultimately we would like to use Manopt with full support for Riemannian operations. We currently use Optim and are considering whether it's worth it to use Optim.manifold as a stop gap while we transition.

There are surely things that can be improved but 0% compilation time with AD doesn't look particularly realistic to me, unless you want that only for specific functions, which would turn it into AOT gradient compilation.

Our current best path would be to only do AD on a few specific functions, which get compiled once or AOT. When we build large cost functions, we'd assemble the cost via a for loop. The non-parametric case would leverage the same few AD functions (using a different kind of for-loop assembly).
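
As a hypothetical illustration of "AD on a few specific functions only": pre-building the ForwardDiff.jl machinery per residual so the chunk/tag setup and compilation happen once, outside the assembly loop.

    using ForwardDiff, DiffResults

    # one specific cost function we are willing to pay AD compilation for, once
    cost(x) = x[1]^2 + sin(x[2]) * x[1]

    x0 = zeros(2)
    cfg = ForwardDiff.GradientConfig(cost, x0)   # chunk/tag setup done once up front
    res = DiffResults.GradientResult(x0)

    # reuse the same config (and already-compiled gradient code) in the hot loop
    for x in ([0.5, 1.0], [1.0, 2.0])
        ForwardDiff.gradient!(res, cost, x, cfg)
        @show DiffResults.value(res) DiffResults.gradient(res)
    end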

I haven't published some parts yet because I have too few good applications that would show which parts are actually important to polish.

Perhaps this is an area where we can help. At this stage we are likely using only a few manifolds relating to 2D and 3D mechanics with a few sensor or derivative variations.

0 heap allocations with AD is very hard on nontrivial code. In reverse mode IIRC only Enzyme can do that and in forward mode it requires fairly advanced tricks. Have you considered writing gradients, Jacobians and Hessians by hand instead?

I'd rather write a few specializations needed by Manifolds to avoid allocations or type-stability issues. One of the ideas I'm trying to avoid is building LazyLieGroup types and then writing ChainRules.jl rules for these new "driven" types. I'd rather stay closer to Manifolds.jl native and just overload the three or four functions that really make AD or Manopt run fast -- from there we can upstream the specialized functions to JuliaManifolds (if they help / fit in).

@mateuszbaran

Would you perhaps be open to sharing that code snippet? Ultimately we would like to use Manopt with full support for Riemannian operations. We currently use Optim and are considering whether it's worth it to use Optim.manifold as a stop gap while we transition.

Sure! I've made a gist here: https://gist.github.com/mateuszbaran/0354c0edfb9cdf25e084a2b915816a09. It would be good to put it in some package, but I'm not sure which one. I think Optim.Manifold with my glue code is a perfectly fine way of doing Riemannian optimisation. Of course, it has a very limited set of ways of handling Riemannian constraints compared to Manopt.jl, but for some problems they are sufficient. It's definitely worth a try if you already use Optim.
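
The details are in the gist; the general shape of the glue (a hedged sketch here, not necessarily line-for-line what the gist does) is a thin wrapper type that forwards Optim.jl's manifold callbacks to ManifoldsBase.jl projections:

    using Optim, Manifolds, ManifoldsBase

    # sketch only -- see the gist for the actual code
    struct ManifoldWrapper{TM<:AbstractManifold} <: Optim.Manifold
        M::TM
    end
    function Optim.retract!(W::ManifoldWrapper, x)
        copyto!(x, project(W.M, x))      # map the iterate back onto the manifold
        return x
    end
    function Optim.project_tangent!(W::ManifoldWrapper, g, x)
        copyto!(g, project(W.M, x, g))   # project the gradient onto the tangent space at x
        return g
    end

    # usage: any first-order Optim.jl method that takes a `manifold` keyword
    f(x) = -x[3]
    g!(G, x) = copyto!(G, [0.0, 0.0, -1.0])
    optimize(f, g!, [1.0, 0.0, 0.0], ConjugateGradient(manifold=ManifoldWrapper(Manifolds.Sphere(2))))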

Our current best path would be to only do AD on a few specific functions, which get compiled once or AOT. When we build large cost functions, we'd assemble the cost via a for loop. The non-parametric case would leverage the same few AD functions (using a different kind of for-loop assembly).

I see. If you can write down which functions you want to do AD on, I may have some more specific thoughts on that.

Perhaps this is an area where we can help. At this stage we are likely using only a few manifolds relating to 2D and 3D mechanics with a few sensor or derivative variations.

My current examples are also limited to a few manifolds (hyperbolic space, sphere, tangent bundle), so any new example could help 🙂. I was exploring Stiefel and Grassmann constraints, but the issue there is that it is too easy to derive gradients using https://www.matrixcalculus.org/ and hand-optimise the code beyond what AD in Julia could do (a quick sketch below). Or maybe I'm just looking at this from the wrong side, and using https://www.matrixcalculus.org/ is actually too complex for some people interested in this kind of thing? Designing good interfaces is hard and usually underappreciated.
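
As a sketch of that "derive by hand" route for Stiefel (assuming the default embedded metric, where the Riemannian gradient is the tangent-space projection of the Euclidean one):

    using Manifolds, LinearAlgebra

    # for f(X) = tr(X'AX) the Euclidean gradient is (A + A')X
    # (exactly the kind of result matrixcalculus.org hands you)
    A = randn(5, 5)
    M = Stiefel(5, 2)
    f(M, X) = tr(X' * A * X)
    grad_f(M, X) = project(M, X, (A + A') * X)   # Riemannian gradient by projection
    # ready for e.g. Manopt.gradient_descent(M, f, grad_f, rand(M))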

Anyway, back to the main topic, it would be great to incorporate your needs into this work and have another point of view.

I'd rather write a few specializations needed by Manifolds to avoid allocations or type-stability issues. One of the ideas I'm trying to avoid is building LazyLieGroup types and then writing ChainRules.jl rules for these new "driven" types. I'd rather stay closer to Manifolds.jl native and just overload the three or four functions that really make AD or Manopt run fast -- from there we can upstream the specialized functions to JuliaManifolds (if they help / fit in).

That sounds like a good idea, I can help with that when I know which functions are needed.

@dehann dehann merged commit c855c31 into master Mar 27, 2023
@Affie Affie modified the milestones: v0.33.1, v0.34.0 Apr 28, 2023