-
-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
special case copyto!(::Bidiagonal, ::Bidiagonal) #1061
Comments
jishnub
referenced
this issue
in JuliaLang/julia
Apr 23, 2024
This specialized `copyto!` for combinations of banded structured matrix types so that the copy may be O(N) instead of the fallback O(N^2) implementation. E.g.: ```julia julia> T = Tridiagonal(zeros(999), zeros(1000), zeros(999)); julia> B = Bidiagonal(ones(1000), fill(2.0, 999), :U); julia> @Btime copyto!($T, $B); 1.927 ms (0 allocations: 0 bytes) # master 229.870 ns (0 allocations: 0 bytes) # PR ``` This also changes the `copyto!` implementation for mismatched matrix sizes, bringing it closer to the docstring. So, the following works on master: ```julia julia> Ddest = Diagonal(zeros(4)); julia> Dsrc = Diagonal(ones(2)); julia> copyto!(Ddest, Dsrc) 4×4 Diagonal{Float64, Vector{Float64}}: 1.0 ⋅ ⋅ ⋅ ⋅ 1.0 ⋅ ⋅ ⋅ ⋅ 0.0 ⋅ ⋅ ⋅ ⋅ 0.0 ``` but this won't work anymore with this PR. This was inconsistent anyway, as materializing the matrices produces a different result, which shouldn't be the case: ```julia julia> copyto!(Matrix(Ddest), Dsrc) 4×4 Matrix{Float64}: 1.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 ``` After this PR, the way to carry out the copy would be ```julia julia> copyto!(Ddest, CartesianIndices(Dsrc), Dsrc, CartesianIndices(Dsrc)) 4×4 Diagonal{Float64, Vector{Float64}}: 1.0 ⋅ ⋅ ⋅ ⋅ 1.0 ⋅ ⋅ ⋅ ⋅ 0.0 ⋅ ⋅ ⋅ ⋅ 0.0 ``` This change fixes https://github.com/JuliaLang/julia/issues/46005. Also fixes https://github.com/JuliaLang/julia/issues/53997 After this, ```julia julia> @Btime copyto!(C, B) setup=(n = 1_000; B = Bidiagonal(randn(n), randn(n-1), :L); C = Bidiagonal(randn(n), randn(n-1), :L)); 158.405 ns (0 allocations: 0 bytes) julia> @Btime copyto!(C, B) setup=(n = 10_000; B = Bidiagonal(randn(n), randn(n-1), :L); C = Bidiagonal(randn(n), randn(n-1), :L)); 4.706 μs (0 allocations: 0 bytes) julia> @Btime copyto!(C, B) setup=(n = 100_000; B = Bidiagonal(randn(n), randn(n-1), :L); C = Bidiagonal(randn(n), randn(n-1), :L)); 120.880 μs (0 allocations: 0 bytes) ``` which is roughly linear scaling. Taken along with #54027, the speed-ups would also apply to the adjoints of banded matrices.
KristofferC
referenced
this issue
Nov 14, 2024
This specialized `copyto!` for combinations of banded structured matrix types so that the copy may be O(N) instead of the fallback O(N^2) implementation. E.g.: ```julia julia> T = Tridiagonal(zeros(999), zeros(1000), zeros(999)); julia> B = Bidiagonal(ones(1000), fill(2.0, 999), :U); julia> @Btime copyto!($T, $B); 1.927 ms (0 allocations: 0 bytes) # master 229.870 ns (0 allocations: 0 bytes) # PR ``` This also changes the `copyto!` implementation for mismatched matrix sizes, bringing it closer to the docstring. So, the following works on master: ```julia julia> Ddest = Diagonal(zeros(4)); julia> Dsrc = Diagonal(ones(2)); julia> copyto!(Ddest, Dsrc) 4×4 Diagonal{Float64, Vector{Float64}}: 1.0 ⋅ ⋅ ⋅ ⋅ 1.0 ⋅ ⋅ ⋅ ⋅ 0.0 ⋅ ⋅ ⋅ ⋅ 0.0 ``` but this won't work anymore with this PR. This was inconsistent anyway, as materializing the matrices produces a different result, which shouldn't be the case: ```julia julia> copyto!(Matrix(Ddest), Dsrc) 4×4 Matrix{Float64}: 1.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 ``` After this PR, the way to carry out the copy would be ```julia julia> copyto!(Ddest, CartesianIndices(Dsrc), Dsrc, CartesianIndices(Dsrc)) 4×4 Diagonal{Float64, Vector{Float64}}: 1.0 ⋅ ⋅ ⋅ ⋅ 1.0 ⋅ ⋅ ⋅ ⋅ 0.0 ⋅ ⋅ ⋅ ⋅ 0.0 ``` This change fixes https://github.com/JuliaLang/julia/issues/46005. Also fixes https://github.com/JuliaLang/julia/issues/53997 After this, ```julia julia> @Btime copyto!(C, B) setup=(n = 1_000; B = Bidiagonal(randn(n), randn(n-1), :L); C = Bidiagonal(randn(n), randn(n-1), :L)); 158.405 ns (0 allocations: 0 bytes) julia> @Btime copyto!(C, B) setup=(n = 10_000; B = Bidiagonal(randn(n), randn(n-1), :L); C = Bidiagonal(randn(n), randn(n-1), :L)); 4.706 μs (0 allocations: 0 bytes) julia> @Btime copyto!(C, B) setup=(n = 100_000; B = Bidiagonal(randn(n), randn(n-1), :L); C = Bidiagonal(randn(n), randn(n-1), :L)); 120.880 μs (0 allocations: 0 bytes) ``` which is roughly linear scaling. Taken along with JuliaLang/julia#54027, the speed-ups would also apply to the adjoints of banded matrices.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
At the moment
copyto!
withBidiagonal
is O(n^2):The text was updated successfully, but these errors were encountered: