Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LinearAlgebra: adjoint for bidiag/tridiag may preserve structure #54027

Merged
merged 2 commits into from
May 20, 2024

Conversation

jishnub
Copy link
Contributor

@jishnub jishnub commented Apr 10, 2024

After this,

julia> B = Bidiagonal(rand(ComplexF64,3), rand(ComplexF64,2), :U)
3×3 Bidiagonal{ComplexF64, Vector{ComplexF64}}:
 0.0444083+0.405889im  0.982608+0.798427im                
                      0.723202+0.839624im    0.35967+0.482086im
                                          0.0729189+0.835082im

julia> B'
3×3 Bidiagonal{ComplexF64, Base.ReshapedArray{ComplexF64, 1, Adjoint{ComplexF64, Vector{ComplexF64}}, Tuple{}}}:
 0.0444083-0.405889im                                    
  0.982608-0.798427im  0.723202-0.839624im                
                       0.35967-0.482086im  0.0729189-0.835082im

julia> B'' === B # false on master
true

Similar changes for Tridiagonal. This makes matrix multiplication much more optimized, as we hit the O(N) tridiag methods. On master,

julia> T = Tridiagonal(rand(ComplexF64,399), rand(ComplexF64,400), rand(ComplexF64,399));

julia> v = rand(ComplexF64, size(T,2));

julia> @btime $T * $v;
  1.241 μs (3 allocations: 6.31 KiB)

julia> @btime $T' * $v;
  537.109 μs (3 allocations: 6.31 KiB)

This PR

julia> @btime $T' * $v;
  1.304 μs (4 allocations: 6.39 KiB)

jishnub added a commit that referenced this pull request Apr 23, 2024
This specialized `copyto!` for combinations of banded structured matrix
types so that the copy may be O(N) instead of the fallback O(N^2)
implementation.

E.g.:
```julia
julia> T = Tridiagonal(zeros(999), zeros(1000), zeros(999));

julia> B = Bidiagonal(ones(1000), fill(2.0, 999), :U);

julia> @Btime copyto!($T, $B);
  1.927 ms (0 allocations: 0 bytes) # master
  229.870 ns (0 allocations: 0 bytes) # PR
```

This also changes the `copyto!` implementation for mismatched matrix
sizes, bringing it closer to the docstring. So, the following works on
master:
```julia
julia> Ddest = Diagonal(zeros(4));

julia> Dsrc = Diagonal(ones(2));

julia> copyto!(Ddest, Dsrc)
4×4 Diagonal{Float64, Vector{Float64}}:
 1.0   ⋅    ⋅    ⋅ 
  ⋅   1.0   ⋅    ⋅ 
  ⋅    ⋅   0.0   ⋅ 
  ⋅    ⋅    ⋅   0.0
```
but this won't work anymore with this PR. This was inconsistent anyway,
as materializing the matrices produces a different result, which
shouldn't be the case:
```julia
julia> copyto!(Matrix(Ddest), Dsrc)
4×4 Matrix{Float64}:
 1.0  0.0  0.0  0.0
 0.0  1.0  0.0  0.0
 0.0  0.0  0.0  0.0
 1.0  0.0  0.0  0.0
```
After this PR, the way to carry out the copy would be
```julia
julia> copyto!(Ddest, CartesianIndices(Dsrc), Dsrc, CartesianIndices(Dsrc))
4×4 Diagonal{Float64, Vector{Float64}}:
 1.0   ⋅    ⋅    ⋅ 
  ⋅   1.0   ⋅    ⋅ 
  ⋅    ⋅   0.0   ⋅ 
  ⋅    ⋅    ⋅   0.0
```
This change fixes https://github.com/JuliaLang/julia/issues/46005.

Also fixes https://github.com/JuliaLang/julia/issues/53997
After this,
```julia
julia> @Btime copyto!(C, B) setup=(n = 1_000; B = Bidiagonal(randn(n), randn(n-1), :L); C = Bidiagonal(randn(n), randn(n-1), :L));
  158.405 ns (0 allocations: 0 bytes)

julia> @Btime copyto!(C, B) setup=(n = 10_000; B = Bidiagonal(randn(n), randn(n-1), :L); C = Bidiagonal(randn(n), randn(n-1), :L));
  4.706 μs (0 allocations: 0 bytes)

julia> @Btime copyto!(C, B) setup=(n = 100_000; B = Bidiagonal(randn(n), randn(n-1), :L); C = Bidiagonal(randn(n), randn(n-1), :L));
  120.880 μs (0 allocations: 0 bytes)
```
which is roughly linear scaling.

Taken along with #54027, the
speed-ups would also apply to the adjoints of banded matrices.
dkarrasch

This comment was marked as outdated.

@dkarrasch dkarrasch merged commit 8ac1db5 into master May 20, 2024
5 of 7 checks passed
@dkarrasch dkarrasch deleted the jishnub/bitridiagcomplexadj branch May 20, 2024 15:21
KristofferC pushed a commit to JuliaLang/LinearAlgebra.jl that referenced this pull request Nov 14, 2024
This specialized `copyto!` for combinations of banded structured matrix
types so that the copy may be O(N) instead of the fallback O(N^2)
implementation.

E.g.:
```julia
julia> T = Tridiagonal(zeros(999), zeros(1000), zeros(999));

julia> B = Bidiagonal(ones(1000), fill(2.0, 999), :U);

julia> @Btime copyto!($T, $B);
  1.927 ms (0 allocations: 0 bytes) # master
  229.870 ns (0 allocations: 0 bytes) # PR
```

This also changes the `copyto!` implementation for mismatched matrix
sizes, bringing it closer to the docstring. So, the following works on
master:
```julia
julia> Ddest = Diagonal(zeros(4));

julia> Dsrc = Diagonal(ones(2));

julia> copyto!(Ddest, Dsrc)
4×4 Diagonal{Float64, Vector{Float64}}:
 1.0   ⋅    ⋅    ⋅ 
  ⋅   1.0   ⋅    ⋅ 
  ⋅    ⋅   0.0   ⋅ 
  ⋅    ⋅    ⋅   0.0
```
but this won't work anymore with this PR. This was inconsistent anyway,
as materializing the matrices produces a different result, which
shouldn't be the case:
```julia
julia> copyto!(Matrix(Ddest), Dsrc)
4×4 Matrix{Float64}:
 1.0  0.0  0.0  0.0
 0.0  1.0  0.0  0.0
 0.0  0.0  0.0  0.0
 1.0  0.0  0.0  0.0
```
After this PR, the way to carry out the copy would be
```julia
julia> copyto!(Ddest, CartesianIndices(Dsrc), Dsrc, CartesianIndices(Dsrc))
4×4 Diagonal{Float64, Vector{Float64}}:
 1.0   ⋅    ⋅    ⋅ 
  ⋅   1.0   ⋅    ⋅ 
  ⋅    ⋅   0.0   ⋅ 
  ⋅    ⋅    ⋅   0.0
```
This change fixes https://github.com/JuliaLang/julia/issues/46005.

Also fixes https://github.com/JuliaLang/julia/issues/53997
After this,
```julia
julia> @Btime copyto!(C, B) setup=(n = 1_000; B = Bidiagonal(randn(n), randn(n-1), :L); C = Bidiagonal(randn(n), randn(n-1), :L));
  158.405 ns (0 allocations: 0 bytes)

julia> @Btime copyto!(C, B) setup=(n = 10_000; B = Bidiagonal(randn(n), randn(n-1), :L); C = Bidiagonal(randn(n), randn(n-1), :L));
  4.706 μs (0 allocations: 0 bytes)

julia> @Btime copyto!(C, B) setup=(n = 100_000; B = Bidiagonal(randn(n), randn(n-1), :L); C = Bidiagonal(randn(n), randn(n-1), :L));
  120.880 μs (0 allocations: 0 bytes)
```
which is roughly linear scaling.

Taken along with JuliaLang/julia#54027, the
speed-ups would also apply to the adjoints of banded matrices.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
linear algebra Linear algebra
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants