
Scale block not supporting chainrules/Zygote diff yet #323

Closed
vincentelfving opened this issue Dec 20, 2021 · 3 comments
Comments

@vincentelfving
Scale blocks appear to be unsupported by the chainrules in YaoBlocks

A minimal example:

using Zygote
using Yao
using YaoBlocks

N=2
psi_0 = zero_state(N)
U0 = chain(N, put(1=>Rx(0.0)), put(2=>Ry(0.0)))
C = 2.1*sum([chain(N, put(k=>Z)) for k=1:N])

function loss(theta)
    U = dispatch(U0, theta)
    psi0 = copy(psi_0)
    psi1 = apply(psi0, U)
    psi2 = apply(psi1, C)
    result = real(sum(conj(state(psi1)) .* state(psi2)))
    return result
end

theta = [1.7,2.5]
println(expect'(C, copy(psi_0) => dispatch(U0, theta))[2])
grad = Zygote.gradient(theta->loss(theta), theta)[1]
println(grad)

The above loss function effectively computes an expectation value equivalent to expect(C, psi_0 => U). Computing expect' works fine, but using Zygote instead produces the following error:

[-2.0824961019501838, -1.2567915026183087]
ERROR: LoadError: UndefKeywordError: keyword argument in not assigned
Stacktrace:
  [1] apply_back!(st::Tuple{ArrayReg{1, ComplexF64, Matrix{ComplexF64}}, ArrayReg{1, ComplexF64, Matrix{ComplexF64}}}, circuit::Add{2}, collector::Vector{Any})
    @ YaoBlocks.AD ~/.julia/packages/YaoBlocks/amVAv/src/autodiff/apply_back.jl:112
  [2] apply_back!(st::Tuple{ArrayReg{1, ComplexF64, Matrix{ComplexF64}}, ArrayReg{1, ComplexF64, Matrix{ComplexF64}}}, block::Scale{Float64, 2, Add{2}}, collector::Vector{Any})
    @ YaoBlocks.AD ~/.julia/packages/YaoBlocks/amVAv/src/autodiff/apply_back.jl:98
  [3] apply_back(st::Tuple{ArrayReg{1, ComplexF64, Matrix{ComplexF64}}, ArrayReg{1, ComplexF64, Matrix{ComplexF64}}}, block::Scale{Float64, 2, Add{2}}; kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ YaoBlocks.AD ~/.julia/packages/YaoBlocks/amVAv/src/autodiff/apply_back.jl:151
  [4] apply_back(st::Tuple{ArrayReg{1, ComplexF64, Matrix{ComplexF64}}, ArrayReg{1, ComplexF64, Matrix{ComplexF64}}}, block::Scale{Float64, 2, Add{2}})
    @ YaoBlocks.AD ~/.julia/packages/YaoBlocks/amVAv/src/autodiff/apply_back.jl:150
  [5] (::YaoBlocks.AD.var"#47#48"{Scale{Float64, 2, Add{2}}, ArrayReg{1, ComplexF64, Matrix{ComplexF64}}})(outδ::ArrayReg{1, ComplexF64, Matrix{ComplexF64}})
    @ YaoBlocks.AD ~/.julia/packages/YaoBlocks/amVAv/src/autodiff/chainrules_patch.jl:80
  [6] ZBack
    @ ~/.julia/packages/Zygote/AlLTp/src/compiler/chainrules.jl:204 [inlined]
  [7] Pullback
    @ ~/zygote_scale_bug.jl:14 [inlined]
  [8] (::typeof(∂(loss)))(Δ::Float64)
    @ Zygote ~/.julia/packages/Zygote/AlLTp/src/compiler/interface2.jl:0
  [9] Pullback
    @ ~/zygote_scale_bug.jl:21 [inlined]
 [10] (::typeof(∂(#35)))(Δ::Float64)
    @ Zygote ~/.julia/packages/Zygote/AlLTp/src/compiler/interface2.jl:0
 [11] (::Zygote.var"#55#56"{typeof(∂(#35))})(Δ::Float64)
    @ Zygote ~/.julia/packages/Zygote/AlLTp/src/compiler/interface.jl:41
 [12] gradient(f::Function, args::Vector{Float64})
    @ Zygote ~/.julia/packages/Zygote/AlLTp/src/compiler/interface.jl:76
 [13] top-level scope
    @ ~/zygote_scale_bug.jl:21
 [14] include(fname::String)
    @ Base.MainInclude ./client.jl:444
 [15] top-level scope
    @ REPL[4]:1
 [16] top-level scope
    @ ~/.julia/packages/CUDA/YpW0k/src/initialization.jl:52
zygote_scale_bug.jl:21

However, if we instead put the scale factor in front of each Z rather than in front of the whole sum(...) block, i.e.

C = sum([chain(N, put(k=>2.1*Z)) for k=1:N])

then expect' and Zygote.gradient yield the same result, [-2.0824961019501838, -1.2567915026183087], as expected.
The two methods are mathematically equivalent, but support for the former would be useful/cleaner!
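For reference (my own check, not part of the original report): the two constructions produce the same matrix, so the difference is only in the block-tree structure that the AD rules traverse — `2.1 * sum(...)` is a `Scale` wrapping an `Add`, while the second form puts a `Scale` inside each summand.

```julia
using Yao

N = 2
# Scale wrapping an Add block (the failing form):
C1 = 2.1 * sum([chain(N, put(k => Z)) for k = 1:N])
# Scale inside each term of the Add block (the working form):
C2 = sum([chain(N, put(k => 2.1 * Z)) for k = 1:N])

# Same matrix representation, different block trees:
@assert Matrix(mat(C1)) ≈ Matrix(mat(C2))
```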

@GiggleLiu
Member

GiggleLiu commented Dec 20, 2021

Thanks for the issue. This is not because the Scale block is unsupported; rather, the Add block is not supported when back-propagating through the apply function, since applying a sum of operators is not reversible. If you do not need the gradients of the Hamiltonian, please use Zygote.@ignore to skip it.

julia> N=2;

julia> psi_0 = zero_state(N);

julia> U0 = chain(N, put(1=>Rx(0.0)), put(2=>Ry(0.0)));

julia> C = 2.1*sum([chain(N, put(k=>Z)) for k=1:N]);

julia> function loss(theta)
           U = dispatch(U0, theta)
           psi0 = copy(psi_0)
           psi1 = apply(psi0, U)
           psi2 = Zygote.@ignore apply(psi1, C)
           result = real(sum(conj(state(psi1)) .* state(psi2)))
           return result
       end

julia> theta = [1.7,2.5];

julia> println(expect'(C, copy(psi_0) => dispatch(U0, theta))[2])
[-2.0824961019501838, -1.2567915026183087]

julia> grad = Zygote.gradient(theta->loss(theta), theta)[1];

julia> println(grad)
[-1.0412480509750919, -0.6283957513091544]

@vincentelfving
Author

vincentelfving commented Dec 21, 2021

Thanks, that could work for most cases!
Why do you think C = sum([chain(N, put(k=>2.1*Z)) for k=1:N]) works correctly? That is also an Add block, and I apply it without Zygote.@ignore, right?

Also, why is your returned gradient off by a factor of 2 in both components?

@GiggleLiu
Member

> Thanks, that could work for most cases! Why do you think C = sum([chain(N, put(k=>2.1*Z)) for k=1:N]) works correctly? That is also an Add block, and I apply it without Zygote.@ignore, right?
>
> Also, why is your returned gradient off by a factor of 2 in both components?

If you are asking why expect' returns the correct gradient, it is because you are using Yao's built-in AD engine, which handles the Hamiltonian automatically. The factor-of-2 difference is probably because the @ignore macro also discards half of psi's gradient at the same time.
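A sketch of where the factor 2 comes from (my own reading, assuming C is Hermitian; not from the thread): with the `@ignore` in place, `psi2 = C|ψ1⟩` is treated as a constant, so only the `conj(state(psi1))` branch is differentiated. That yields `Re⟨∂θψ|C|ψ⟩`, which is exactly half of the full derivative `2 Re⟨∂θψ|C|ψ⟩` of the expectation value `⟨ψ|C|ψ⟩`. Doubling the loss should therefore recover the `expect'` result:

```julia
# Sketch: compensate for the ignored branch by doubling the loss.
function loss2(theta)
    U = dispatch(U0, theta)
    psi1 = apply(copy(psi_0), U)
    # psi2 is constant under AD, so only the bra ⟨ψ1| path is differentiated:
    psi2 = Zygote.@ignore apply(psi1, C)
    return 2 * real(sum(conj(state(psi1)) .* state(psi2)))
end
```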
