Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

special case copyto!(::Bidiagonal, ::Bidiagonal) #1061

Closed
dlfivefifty opened this issue Apr 8, 2024 · 0 comments · Fixed by JuliaLang/julia#54041
Closed

special case copyto!(::Bidiagonal, ::Bidiagonal) #1061

dlfivefifty opened this issue Apr 8, 2024 · 0 comments · Fixed by JuliaLang/julia#54041

Comments

@dlfivefifty
Copy link
Contributor

At the moment copyto! with Bidiagonal is O(n^2):

julia> n = 1000; B = Bidiagonal(randn(n), randn(n-1), :L); @time copyto!(B, B);
  0.002047 seconds (3 allocations: 16.031 KiB)

julia> n = 10_000; B = Bidiagonal(randn(n), randn(n-1), :L); @time copyto!(B, B);
  0.127239 seconds (5 allocations: 156.375 KiB)

julia> n = 100_000; B = Bidiagonal(randn(n), randn(n-1), :L); @time copyto!(B, B);
 11.586487 seconds (5 allocations: 1.526 MiB)
jishnub referenced this issue in JuliaLang/julia Apr 23, 2024
This specialized `copyto!` for combinations of banded structured matrix
types so that the copy may be O(N) instead of the fallback O(N^2)
implementation.

E.g.:
```julia
julia> T = Tridiagonal(zeros(999), zeros(1000), zeros(999));

julia> B = Bidiagonal(ones(1000), fill(2.0, 999), :U);

julia> @Btime copyto!($T, $B);
  1.927 ms (0 allocations: 0 bytes) # master
  229.870 ns (0 allocations: 0 bytes) # PR
```

This also changes the `copyto!` implementation for mismatched matrix
sizes, bringing it closer to the docstring. So, the following works on
master:
```julia
julia> Ddest = Diagonal(zeros(4));

julia> Dsrc = Diagonal(ones(2));

julia> copyto!(Ddest, Dsrc)
4×4 Diagonal{Float64, Vector{Float64}}:
 1.0   ⋅    ⋅    ⋅ 
  ⋅   1.0   ⋅    ⋅ 
  ⋅    ⋅   0.0   ⋅ 
  ⋅    ⋅    ⋅   0.0
```
but this won't work anymore with this PR. This was inconsistent anyway,
as materializing the matrices produces a different result, which
shouldn't be the case:
```julia
julia> copyto!(Matrix(Ddest), Dsrc)
4×4 Matrix{Float64}:
 1.0  0.0  0.0  0.0
 0.0  1.0  0.0  0.0
 0.0  0.0  0.0  0.0
 1.0  0.0  0.0  0.0
```
After this PR, the way to carry out the copy would be
```julia
julia> copyto!(Ddest, CartesianIndices(Dsrc), Dsrc, CartesianIndices(Dsrc))
4×4 Diagonal{Float64, Vector{Float64}}:
 1.0   ⋅    ⋅    ⋅ 
  ⋅   1.0   ⋅    ⋅ 
  ⋅    ⋅   0.0   ⋅ 
  ⋅    ⋅    ⋅   0.0
```
This change fixes https://github.com/JuliaLang/julia/issues/46005.

Also fixes https://github.com/JuliaLang/julia/issues/53997
After this,
```julia
julia> @Btime copyto!(C, B) setup=(n = 1_000; B = Bidiagonal(randn(n), randn(n-1), :L); C = Bidiagonal(randn(n), randn(n-1), :L));
  158.405 ns (0 allocations: 0 bytes)

julia> @Btime copyto!(C, B) setup=(n = 10_000; B = Bidiagonal(randn(n), randn(n-1), :L); C = Bidiagonal(randn(n), randn(n-1), :L));
  4.706 μs (0 allocations: 0 bytes)

julia> @Btime copyto!(C, B) setup=(n = 100_000; B = Bidiagonal(randn(n), randn(n-1), :L); C = Bidiagonal(randn(n), randn(n-1), :L));
  120.880 μs (0 allocations: 0 bytes)
```
which is roughly linear scaling.

Taken along with #54027, the
speed-ups would also apply to the adjoints of banded matrices.
KristofferC referenced this issue Nov 14, 2024
This specialized `copyto!` for combinations of banded structured matrix
types so that the copy may be O(N) instead of the fallback O(N^2)
implementation.

E.g.:
```julia
julia> T = Tridiagonal(zeros(999), zeros(1000), zeros(999));

julia> B = Bidiagonal(ones(1000), fill(2.0, 999), :U);

julia> @Btime copyto!($T, $B);
  1.927 ms (0 allocations: 0 bytes) # master
  229.870 ns (0 allocations: 0 bytes) # PR
```

This also changes the `copyto!` implementation for mismatched matrix
sizes, bringing it closer to the docstring. So, the following works on
master:
```julia
julia> Ddest = Diagonal(zeros(4));

julia> Dsrc = Diagonal(ones(2));

julia> copyto!(Ddest, Dsrc)
4×4 Diagonal{Float64, Vector{Float64}}:
 1.0   ⋅    ⋅    ⋅ 
  ⋅   1.0   ⋅    ⋅ 
  ⋅    ⋅   0.0   ⋅ 
  ⋅    ⋅    ⋅   0.0
```
but this won't work anymore with this PR. This was inconsistent anyway,
as materializing the matrices produces a different result, which
shouldn't be the case:
```julia
julia> copyto!(Matrix(Ddest), Dsrc)
4×4 Matrix{Float64}:
 1.0  0.0  0.0  0.0
 0.0  1.0  0.0  0.0
 0.0  0.0  0.0  0.0
 1.0  0.0  0.0  0.0
```
After this PR, the way to carry out the copy would be
```julia
julia> copyto!(Ddest, CartesianIndices(Dsrc), Dsrc, CartesianIndices(Dsrc))
4×4 Diagonal{Float64, Vector{Float64}}:
 1.0   ⋅    ⋅    ⋅ 
  ⋅   1.0   ⋅    ⋅ 
  ⋅    ⋅   0.0   ⋅ 
  ⋅    ⋅    ⋅   0.0
```
This change fixes https://github.com/JuliaLang/julia/issues/46005.

Also fixes https://github.com/JuliaLang/julia/issues/53997
After this,
```julia
julia> @Btime copyto!(C, B) setup=(n = 1_000; B = Bidiagonal(randn(n), randn(n-1), :L); C = Bidiagonal(randn(n), randn(n-1), :L));
  158.405 ns (0 allocations: 0 bytes)

julia> @Btime copyto!(C, B) setup=(n = 10_000; B = Bidiagonal(randn(n), randn(n-1), :L); C = Bidiagonal(randn(n), randn(n-1), :L));
  4.706 μs (0 allocations: 0 bytes)

julia> @Btime copyto!(C, B) setup=(n = 100_000; B = Bidiagonal(randn(n), randn(n-1), :L); C = Bidiagonal(randn(n), randn(n-1), :L));
  120.880 μs (0 allocations: 0 bytes)
```
which is roughly linear scaling.

Taken along with JuliaLang/julia#54027, the
speed-ups would also apply to the adjoints of banded matrices.
@KristofferC KristofferC transferred this issue from JuliaLang/julia Nov 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant