
Promote state and time-span in schroedinger to the type from H(t)*psi #356

Merged
merged 42 commits into from
Jan 25, 2023

Conversation

AmitRotem
Contributor

@AmitRotem AmitRotem commented Dec 28, 2022

When optimizing `cost` with respect to `par`, it is often useful to have the gradient of `cost` with respect to `par`.
Promoting the state and time-span to the type of `Hamiltonian * state` makes it possible to use ForwardDiff.jl on `cost` - if you happen to define the Hamiltonian in a manner compatible with ForwardDiff.jl.

For example:

using QuantumOptics
import Preferences, UUIDs
Preferences.set_preferences!(UUIDs.UUID("f6369f11-7733-5829-9624-2563aa707210"), "nansafe_mode" => true)
import ForwardDiff as FD

# 3 level kerr transmon with drive
ba = FockBasis(2)
T2 = 1e4
function get_Ht(p::Vector{Tp}) where Tp
    ω0, α = Tp(5.0), Tp(-0.2)  # promote constants so all factors share the eltype Tp
    A, freq, ϕ, T = p
    op = 2π*[number(ba), 2\create(ba)*number(ba)*destroy(ba), im*(create(ba)-destroy(ba))]
    fω(t) = ω0                 # qubit frequency (constant)
    fα(t) = α                  # anharmonicity (constant)
    fΩ(t) = A*cospi(2t*freq + 2ϕ)*sinpi(t/T)^2  # pulsed drive envelope
    H_at_t = LazySum(zeros(Tp, length(op)), op)
    function Ht(t, _)
        H_at_t.factors .= (fω(t), fα(t), fΩ(t))
        return H_at_t
    end
    return Ht
end

ψ0 = Operator(SpinBasis(1/2), basisstate(ba, 1), basisstate(ba, 2)) # isometry embedding the qubit in the two lowest transmon levels
target = ψ0*exp(im*0.5π*dense(sigmax(SpinBasis(1/2)))) # x gate
function cost(par)
    T = last(par)
    Ht = get_Ht(par)
    ts = (0.0, T)
    _, ψT = timeevolution.schroedinger_dynamic(ts, ψ0, Ht)
    1-abs2(tr(target'last(ψT))/2)*exp(-T/T2)
end

p0 = [0.03, 5.0, 0.25, 100.0]
cost(p0) |> display
FD.gradient(cost, p0)
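A gradient obtained this way is worth sanity-checking against finite differences. A minimal, hypothetical helper in base Julia (`fd_gradient` is not part of the PR):

```julia
# Central finite-difference gradient, handy for cross-checking FD.gradient.
function fd_gradient(f, p; h=1e-6)
    g = zeros(length(p))
    for i in eachindex(p)
        e = zeros(length(p)); e[i] = h
        g[i] = (f(p .+ e) - f(p .- e)) / (2h)   # central difference in coordinate i
    end
    return g
end

# Check on an analytic example: the gradient of sum(abs2, x) is 2x.
fd_gradient(x -> sum(abs2, x), [1.0, 2.0])
```

Comparing `FD.gradient(cost, p0)` against `fd_gradient(cost, p0)` entrywise gives a quick correctness check, at the cost of extra solver runs.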

@codecov

codecov bot commented Dec 28, 2022

Codecov Report

Merging #356 (c9ad058) into master (64108ba) will increase coverage by 0.02%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #356      +/-   ##
==========================================
+ Coverage   98.03%   98.05%   +0.02%     
==========================================
  Files          16       16              
  Lines        1422     1438      +16     
==========================================
+ Hits         1394     1410      +16     
  Misses         28       28              
Impacted Files             Coverage Δ
src/timeevolution_base.jl  100.00% <ø> (ø)
src/schroedinger.jl        93.75% <100.00%> (+3.12%) ⬆️


@Krastanov
Collaborator

Krastanov commented Dec 28, 2022

You probably already knew most of this (this being marked as a draft) but:

  • Could you remove Manifest.toml from this pull request?
  • I do not think the creation of the example folder is very appropriate. Nothing else in this library uses that style. Maybe this should go to a dedicated section in the documentation?
  • This probably needs tests added
  • What is the performance cost of the changes you made? Does it still work as fast as before in cases where you do not need differentiation?

It would be quite wonderful to have autodiff working with this library.

@AmitRotem
Contributor Author

AmitRotem commented Dec 28, 2022

For this small example:

p0 = [0.03, 5.0, 0.25, 100.0]
cost(p0);
@benchmark cost(p0)

On this branch, benchmarking gives a run time of 21.429 ms ± 797.528 μs with 289 allocations.
On the master branch, benchmarking gives a run time of 22.229 ms ± 1.182 ms with 277 allocations.
So mostly a small allocation overhead?

I'm guessing this will be true for the general case of working with ComplexF64.
Do you think there are edge cases where the promotion will result in some abstract type, causing lots of allocations?
This happens in this example when taking the gradient if I don't promote ω0 or α.

# general case is Ts<:Complex, Tt<:Real
Ts = eltype(H)
Tt = real(Ts)
(isconcretetype(Ts) && isconcretetype(Tt)) || @warn "For using `ForwardDiff` on `schroedinger` the element type of `real(H(t,psi)*psi)` must be concrete !!\nGot elements of type $Tt \nTry promoting the Hamiltonian elements based on the parameters you are differentiating by."
Collaborator

In what situations would this warning message be seen? I feel it might be confusing as it will show up in front of users that have not touched ForwardDiff or schroedinger.

Contributor Author

I'll remove it.
This was trying to warn about something like

ω0, α = 5.0, -0.2 # not promoting
#...
fun = [fω, fα, fΩ]
H_at_t = LazySum([f(0.0) for f=fun], op)
@show H_at_t.factors #  type is abstract !! Complex{Real}

in the definition of get_Ht.
But it doesn't cover all possible "bad" definitions of get_Ht.
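The abstract-eltype trap can be reproduced in base Julia alone; a hypothetical minimal sketch (not code from the PR):

```julia
# A container built from homogeneous factors has a concrete element type,
# while an annotated or mixed container can end up abstract, which is what
# the isconcretetype check is guarding against.
factors_good = [complex(5.0), complex(-0.2)]   # eltype ComplexF64, concrete
factors_bad  = Number[5.0, -0.2 + 0.0im]       # eltype Number, abstract
@assert isconcretetype(eltype(factors_good))
@assert !isconcretetype(eltype(factors_bad))
@assert !isconcretetype(Complex{Real})         # the eltype seen in the snippet above
```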

Ts = eltype(H)
Tt = real(Ts)
(isconcretetype(Ts) && isconcretetype(Tt)) || @warn "For using `ForwardDiff` on `schroedinger` the element type of `real(H(t,psi)*psi)` must be concrete !!\nGot elements of type $Tt \nTry promoting the Hamiltonian elements based on the parameters you are differentiating by."
tspan = Tt.(tspan)
Collaborator

I imagine there are situations where no promotion is necessary. Can we shortcircuit those?

Contributor Author

I'll check for the Dual type and only promote in that case; that would probably be safer.

@Krastanov
Collaborator

The extra allocations do not seem particularly problematic at first sight (as long as it is just a constant, non-growing overhead), but they can probably be removed with a bit of extra logic in the promotion function.

But I am a bit confused: why is promotion even necessary? You are not deriving with respect to the state or tspan, so why is it necessary to make them dual numbers? I thought ForwardDiff could multiply by a constant? This works, for instance, without explicitly promoting the vector [1,2,3]:

julia> cost(H) = sum(H*[1.,2.,3.])

julia> ForwardDiff.gradient(cost, rand(3,3))
...

@Krastanov
Collaborator

I guess the answer to my question about why promotion is necessary is because there are in-place operations done by the diff eq solver?

How about we just shortcircuit on the isconcrete checks. If the types are concrete and if they are different from the types of tspan and state, then we modify them. If not, we do not touch them?
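A minimal sketch of that shortcircuit on plain arrays, with a hypothetical name (`maybe_promote` is not a function from the PR):

```julia
# Promote tspan and the state only when the element types actually differ;
# otherwise return the inputs untouched, so the common path pays nothing.
function maybe_promote(tspan, psi, H)
    Ts = eltype(H)           # e.g. Complex{Dual{...}} when differentiating
    Tt = real(Ts)
    (isconcretetype(Ts) && isconcretetype(Tt)) || return tspan, psi  # bail out on abstract types
    eltype(psi) === Ts && eltype(tspan) === Tt && return tspan, psi  # fast path: nothing to do
    return Tt.(tspan), Ts.(psi)
end
```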

@Krastanov
Copy link
Collaborator

I am also a bit worried: how does this work when another autodiff tool is used? Is this making Zygote and Enzyme happier or is it breaking things for them.

@AmitRotem
Contributor Author

I guess the answer to my question about why promotion is necessary is because there are in-place operations done by the diff eq solver?

I think it's a QO.jl issue, meaning in-place operations on states and operators. Using DiffEq.jl directly with a plain matrix and vector, there is no need for the user to promote anything.

@Krastanov
Collaborator

I guess the answer to my question about why promotion is necessary is because there are in-place operations done by the diff eq solver?

I think it's a QO.jl issue, meaning in-place operations on states and operators. Using DiffEq.jl directly with a plain matrix and vector, there is no need for the user to promote anything.

I think the correct solution would be to fix that issue in QO.jl instead of working around it. My worry is that workarounds that do not follow the DiffEq.jl style would just become more and more difficult to keep functional as the autodiff capabilities in julia evolve.

Any idea why DiffEq does not have this problem, but QO has it? Anything in particular that QO does that makes things difficult?

@AmitRotem
Contributor Author

I am also a bit worried: how does this work when another autodiff tool is used? Is this making Zygote and Enzyme happier or is it breaking things for them.

Last time I tried, Zygote didn't work with QO, and I didn't check whether the promotion changed anything.
Anyway, if we promote only when H is Dual, it shouldn't affect other packages, right?

@Krastanov
Collaborator

I am also a bit worried: how does this work when another autodiff tool is used? Is this making Zygote and Enzyme happier or is it breaking things for them.

Last time I tried, Zygote didn't work with QO, and I didn't check whether the promotion changed anything. Anyway, if we promote only when H is Dual, it shouldn't affect other packages, right?

That is true, but similarly to my comment above, this would be a fragile fix around some other underlying issue. I would be hesitant to use a library that has such special cases in its code, and such a library would be more difficult for others to continue developing. Moreover, that approach would need to add ForwardDiff as a dependency.

Figuring out why QO does not support something that is supported by the underlying DiffEq would probably lead to simpler code and support for other unrelated autodiff tools. Could you give a minimum example of a working autodiff with DiffEq?

@AmitRotem
Contributor Author

I guess the answer to my question about why promotion is necessary is because there are in-place operations done by the diff eq solver?

I think it's a QO.jl issue, meaning in-place operations on states and operators. Using DiffEq.jl directly with a plain matrix and vector, there is no need for the user to promote anything.

I think the correct solution would be to fix that issue in QO.jl instead of working around it. My worry is that workarounds that do not follow the DiffEq.jl style would just become more and more difficult to keep functional as the autodiff capabilities in julia evolve.

Any idea why DiffEq does not have this problem, but QO has it? Anything in particular that QO does that makes things difficult?

I think it's the preallocation of dpsi; mul! fails with ForwardDiff because it tries to convert the Dual values from H*psi to ComplexF64 in dpsi - related to qojulia/QuantumOpticsBase.jl#76.

DiffEq.jl also had this issue in the past, meaning the user had to promote time and state for ForwardDiff to work. Not sure what solution they implemented internally.
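The same failure mode can be illustrated without ForwardDiff. In this hypothetical base-Julia analogue, `Float64` stands in for the preallocated `ComplexF64` buffer and `Complex` stands in for `Dual`:

```julia
# In-place assignment cannot store a wider element type in a preallocated
# buffer with a narrower eltype: the conversion throws an InexactError, just
# as writing Dual-valued H*psi into a ComplexF64 dpsi does.
buf = zeros(Float64, 2)            # plays the role of the preallocated dpsi
try
    buf .= [1.0 + 2.0im, 3.0]      # plays the role of Dual-valued H*psi
    error("unreachable")
catch err
    @assert err isa InexactError
end
```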

@Krastanov
Collaborator

I think the fix should be in integrate or recast! where we are interacting with the low level DiffEq library.

https://github.com/qojulia/QuantumOptics.jl/blob/master/src/timeevolution_base.jl#L14

https://github.com/qojulia/QuantumOptics.jl/blob/master/src/schroedinger.jl#L57

Maybe we should try to expose some of SciMLSensitivity https://docs.sciml.ai/SciMLSensitivity/stable/

@Krastanov
Collaborator

This might be of use: https://github.com/SciML/PreallocationTools.jl

@AmitRotem
Contributor Author

AmitRotem commented Dec 29, 2022

I am also a bit worried: how does this work when another autodiff tool is used? Is this making Zygote and Enzyme happier or is it breaking things for them.

Last time I tried, Zygote didn't work with QO, and I didn't check whether the promotion changed anything. Anyway, if we promote only when H is Dual, it shouldn't affect other packages, right?

That is true, but similarly to my comment above, this would be a fragile fix around some other underlying issue. I would be hesitant to use a library that has such special cases in its code, and such a library would be more difficult for others to continue developing. Moreover, that approach would need to add ForwardDiff as a dependency.

Figuring out why QO does not support something that is supported by the underlying DiffEq would probably lead to simpler code and support for other unrelated autodiff tools. Could you give a minimum example of a working autodiff with DiffEq?

Cheating a bit, because there is an issue with complex numbers... "Dual on Complex is ambiguous".

using LinearAlgebra, DifferentialEquations, SciMLSensitivity
import ForwardDiff as FD
import Zygote as Zy

H0 = randn(5,5)
H0 -= H0'   # antisymmetric drift generator
A = randn(5,5)
A -= A'     # antisymmetric drive generator
u0 = hcat(normalize(randn(5)), normalize(randn(5)))  # two normalized initial states

function Ht!(du,u,p,t)
    a,b,c = p
    du .= A*u
    du .*= a*cos(b*t+c)  # modulated drive term
    du .+= H0*u          # drift term
    nothing
end

function loss(p)
    prob = ODEProblem(Ht!,u0,(0.0,1.0),p)
    sol = solve(prob)
    abs2(tr(u0'last(sol.u)))
end

loss(rand(3))
FD.gradient(loss, rand(3))
Zy.gradient(loss, rand(3))

@AmitRotem
Contributor Author

Not sure if there is a nice way to deal with complex numbers here.
https://docs.sciml.ai/SciMLSensitivity/dev/optimal_control/SDE_control/ - there they split the equation into real and imaginary parts.

If everything is nice and analytic, I think it's safe to define Dual on Complex:

function FD.Dual{T, V, N}(Z::Complex) where {T, V, N}
    complex(FD.Dual{T, V, N}(real(Z)), FD.Dual{T, V, N}(imag(Z)))
end

then we only need to promote the time-span, and the above DiffEq.jl example works with complex numbers.

I'm not sure how or why this issue does not show up with QO.jl.

@AmitRotem
Contributor Author

I will look into using PreallocationTools.jl to better handle edge cases

@Krastanov
Collaborator

I tried to read through the DiffEq code, and it seems they do the promotions themselves already:

https://github.com/SciML/DiffEqBase.jl/blob/82a69aecf6f20062658f6a554695b56821f4f25d/src/solve.jl#L936

And they do the promotion in a way that is very similar to what you have created here:

https://github.com/SciML/DiffEqBase.jl/blob/aad604cc44d9ae2584a94936b8c8eddc29293662/src/forwarddiff.jl#L207

It would be fantastic if we can figure out why the promotion does not work by default for us. That way we would not need to add extra code to QO.jl, and we would have it for free for all dynamical equations, not just the schroedinger one.

@Krastanov
Collaborator

The DiffEq promotion does not automatically work, because their promotion happens only if there is a dual in the p parameters argument. We never use the parameters argument, so it never gets triggered. Here is an example of this not working that does not involve QuantumOptics:

using OrdinaryDiffEq
import ForwardDiff

function make_f(param)
    function fun!(du,u,p,t)
        du .= param .* u   # closes over `param`; nothing ever enters `p`
    end
end
function cost(param)
    fun! = make_f(param)
    prob = ODEProblem(fun!,[1.0,0.0],(0.0,0.1))
    state = solve(prob,Tsit5(),save_on=false,save_end=true,save_start=false,wrap=false)[end]
    sum(state)
end

cost(1.0)
ForwardDiff.derivative(cost,1.0)  # breaks
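For contrast, a hedged sketch of the same cost with the parameter routed through `p`, so that DiffEqBase's own promotion sees the Dual (`cost_p` is a hypothetical name, and OrdinaryDiffEq plus ForwardDiff are assumed installed):

```julia
using OrdinaryDiffEq
import ForwardDiff

# Identical dynamics, but `param` now travels through the `p` slot, which is
# what DiffEqBase inspects when deciding whether to promote u0 and tspan.
function cost_p(param)
    f!(du, u, p, t) = (du .= p[1] .* u)
    prob = ODEProblem(f!, [1.0, 0.0], (0.0, 0.1), [param])
    state = solve(prob, Tsit5(), save_on=false, save_end=true, save_start=false)[end]
    sum(state)
end

cost_p(1.0)                          # ≈ exp(0.1), since u(t) = exp(p*t) .* u0
ForwardDiff.derivative(cost_p, 1.0)  # works: u0 is promoted to Dual internally
```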

@Krastanov
Collaborator

My personal belief is that the "proper" solution is to rework a lot of the internals of QO.jl in order to use DiffEq.jl in a more idiomatic manner. That would significantly simplify the code.

However, that is a gigantic amount of work that I do not think should be blocking important features like the one you are currently suggesting.

Therefore, I am now in favor of the original approach you had started with, even if it re-implements features that supposedly we should get from DiffEq for free. I would suggest using a check similar to eltype(...) <: ForwardDiff.Dual so that promotion is veeeeeeeeery limited.

@AmitRotem
Contributor Author

AmitRotem commented Dec 29, 2022

The DiffEq promotion does not automatically work, because their promotion happens only if there is a dual in the p parameters argument.
....

This reminds me of how to get ForwardDiff.jl working on an ODE with complex numbers:

function loss(p)
    Tp = eltype(p)
    Ts = promote_type(Tp, eltype(u0))
    function f(u,qwe,t)  # out-of-place variant of the earlier Ht!; `qwe` is an unused params slot
        Ht(u,p,t)
    end
    prob = ODEProblem( f, Ts.(u0), Tp.((0.0,1.0)) )  # promote u0 and tspan by hand
    sol = solve(prob)
    abs2(tr(first(sol.u)'last(sol.u)))
end

This bypasses the promotion in DiffEqBase.jl by using the type from p, as you mentioned.
I'm not sure why DiffEqBase.promote_u0 fails on complex.

I've opened an issue SciML/DiffEqBase.jl#858

FDgrad(cost1_with_dt, p0)
### test vs finite difference
#### there is sometimes an issue with the gradient at p0, which is a lot less accurate
@test all([test_vs_fin_diff(cost1_with_dt, p; atol=1e-6) for p=rp.(range(1e-9, 0.1, tests_repetition))])
Contributor Author

There is a small unstable region around p0, where AD and finite diff differ by 1e-2. Not sure why that is.

Collaborator

Would just removing that test make sense? Is it really testing anything that the previous set of much simpler tests do not? This is about testing the autodiff, not about a study of numerical stability.

Contributor Author

Yes.
I didn't check this example vs DiffEq, but given that the other examples I compared showed less than a 1e-12 difference, I think we can remove this test.
And also the 1st test?

Collaborator

I am not sure which one you refer to as 1st. Testing both the static and dynamic Hamiltonians seems valuable to me.

Contributor Author

Sure

Collaborator

Keeping the two tests you specifically cited below seems reasonable to me. You can also guide yourself by the coverage report - we should aim to have all new lines covered by tests.

@AmitRotem AmitRotem marked this pull request as ready for review January 8, 2023 17:07
## static
function get_H(p)
op = create(ba0)+destroy(ba0)
return sin(p)*op
Contributor Author

I mean, keep this static test

target2 = randstate(ba2)
function cost2(par)
a,b = par
Ht(t,u) = A + a*cos(b*t)*B/10
Contributor Author

And this dynamic test

@AmitRotem
Contributor Author

Are we happy with this?

@Krastanov
Collaborator

I think the way it is done is quite clean, or at least as clean as possible.

I am not the biggest fan of the style of the tests: my personal philosophy is to not include tests that are sensitive to numerical instabilities (if the code being tested is unrelated to numerical instability control). But being practical, I do not think it is necessary to redo the tests or block the merge. Maybe in a couple of years there will be a test failure due to some change to default DiffEq / ForwardDiff algorithms, but that is fine.

I am a bit worried, though, that this adds a whole 3 minutes to the test run. I would have expected testing this to take 10-20 seconds. Wouldn't both of these issues (instabilities and long test runs) be fixed if we just test fewer examples? Testing fewer examples would not drop the test coverage percentage or really change much.

Minor notes:

  • There is not much point in seeding the random number generator: the generator algorithm is not stable between versions and architectures. If you insist on it, you can use StableRNG.jl but that seems like overkill
  • The comments in the test file were a bit disorganized, e.g. it jumps from Example 0 to Example 2.

Anyway, I do not see blocking issues. @david-pl , is there a policy/philosophy about what the tests need to look like. These seem good enough even if unpolished and a bit long.

More importantly, thanks for fixing some of the ergonomics of autodiff in this package. This will be much more pleasant for me to use now.

@david-pl
Member

@AmitRotem thanks for all the work here! I have to admit that my knowledge of AD is a bit limited, so I trust @Krastanov here - thanks for the detailed review!

Regarding the tests: they are a bit of a mess throughout the package already. Also, time is a bit of an issue as the CI already takes quite long, but we'll have to look at that in a separate PR. There's no real philosophy in how to do things, except maybe that we have one test per file. So long as there are tests for new code, I'm happy!

Any mess we have in the tests can be cleaned up in the future. For that I'd simply stick to the way SciML tests are structured.

So all good from my side. @Krastanov just give me one last go ahead once you're happy with the PR and I'll merge it.

@Krastanov
Collaborator

I will do one last pass and rerun the tests tonight and will give that go-ahead then.

@Krastanov
Collaborator

@AmitRotem , if I understand correctly, this depends on a change you contributed to DiffEqBase in version 6.113.0. Could you add this to the Project dependencies and compat? You can also then directly do using DiffEqBase instead of getting it through the OrdinaryDiffEq namespace.

@david-pl good to merge in my opinion after adding the dependency. Details below.

The comments about the tests stand, but as you mentioned that requires an overall cleanup, probably following SafeTestsets as done in SciML.

This also introduces a dependence on a private DiffEqBase function (the promotion one); however, that function has existed since 2018 and very early versions of DiffEqBase. We cannot really rely on semver to detect issues with this, but it seems relatively safe (and testing will detect any possible issues in the future). Amit has added another method to that function upstream to make sure it does not break on complex numbers.

As I already mentioned. The actual changes look neat and clean. Once this is merged I will create issues to track the addition of the same features to the rest of the solvers.

@AmitRotem
Contributor Author

Most of the time is compilation time for ForwardDiff.jl.
Even if I keep only 2 tests (not repeating over random inputs), checking that promotion runs without error and comparing AD on QO.jl vs AD on DiffEq.jl (which I think is the bare minimum to test), the run time (locally) goes from 77 s to 34 s.

add DiffEqBase to Project and compat
@Krastanov
Collaborator

Looks good to merge.

I do not actually think there is that much need for the tests against DiffEq. We are using DiffEq internally anyway, so just checking for correctness with a quick finite differences approximation on a numerically stable example would also have been enough.

@AmitRotem
Contributor Author

If you think this minimal version of the test is enough, I'll remove the longer version.

@Krastanov
Collaborator

Thanks for the test cleanup! Looks good to merge to me (the only changes from the previous time I said that is that tests were streamlined).

@david-pl david-pl merged commit 1abc40f into qojulia:master Jan 25, 2023
@AmitRotem AmitRotem deleted the promote-state-and-time branch January 26, 2023 00:14