Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix performance issue with diagonal multiplication #44651

Merged
merged 4 commits into from
Mar 29, 2022
Merged

Conversation

dkarrasch
Copy link
Member

This an attempt to fix a major performance regression in diagonal multiplication, see #42321 (comment), following suggestions by @N5N3. @staticfloat may want to keep an eye on CI runtime here.

On my machine, in a complete test-LinearAlgebra run, I had a runtime of the Diagonal test only of nearly 1000 seconds (of course including compilation time). Locally, my make test-LinearAlgebra is currently broken, but when I ran the Diagonal test suite, it finished in less than 300 seconds. Not sure if this is fair comparison, but let's see what the online CI says.

@DilumAluthge
Copy link
Member

$ trap 'kill -- $$' INT TERM QUIT; make --output-sync -j${JULIA_CPU_THREADS:?} check-whitespace
 
Whitespace check found 1 issues:
stdlib/LinearAlgebra/src/diagonal.jl:374 -- trailing whitespace
make: *** [Makefile:102: check-whitespace] Error 1
🚨 Error: The command exited with status 2

@N5N3
Copy link
Member

N5N3 commented Mar 17, 2022

The regression fix is good. But vectorization should not accelete matrix size (that much) used by test (about 10~20?)
So I think we still need some effort to find the reason.

@dkarrasch dkarrasch added performance Must go faster linear algebra Linear algebra backport 1.8 Change should be backported to release-1.8 labels Mar 17, 2022
@staticfloat
Copy link
Sponsor Member

Are these test failures related?

Test threw exception
Expression: (H * x)::UpperHessenberg == Array(H) * x
TypeError: in typeassert, expected Main.Furlongs.Furlong{1, Float64}, got a value of type Bool

@dkarrasch
Copy link
Member Author

Yes, and there are more. I think I'll need to write out the alpha, beta = 0 and != 0 cases manually. This will look scary, but I don't see another option.

Co-authored-by: Dilum Aluthge <dilum@aluthge.com>
@dkarrasch
Copy link
Member Author

@staticfloat After spending quite some time, I realized I don't really know what I'm after, actually. The addmul tests you mentioned are completely inconclusive regarding runtime because there is a lot of randomness in them (size of test matrices, types of test matrices, number of matrix wrapper types considered). Did you also observe in other test submodules a slowdown? Performancewise, all I did here doesn't make any difference TBH.

@KristofferC KristofferC mentioned this pull request Mar 23, 2022
22 tasks
@dkarrasch
Copy link
Member Author

dkarrasch commented Mar 29, 2022

For the record, this PR does improve the speed of [l/r]mul! with Diagonals, and it's even faster than 3-arg mul!!!! As it should be, but wasn't (since v1.6 or even earlier). 🎉

EDIT: This PR

julia> using LinearAlgebra

julia> A = rand(500, 500);

julia> C = copy(A);

julia> D = Diagonal(rand(500));

julia> using BenchmarkTools

julia> @btime rmul!($A, $D);
  900.866 μs (0 allocations: 0 bytes)

julia> @btime lmul!($D, $A); # ca. 1 ms
  1.053 ms (0 allocations: 0 bytes)

julia> @btime mul!($C, $A, $D);
  1.085 ms (0 allocations: 0 bytes)

julia> @btime mul!($C, $D, $A);
  1.094 ms (0 allocations: 0 bytes)

One day old master

julia> using LinearAlgebra

julia> A = rand(500, 500);

julia> C = copy(A);

julia> D = Diagonal(rand(500));

julia> using BenchmarkTools

julia> @btime rmul!($A, $D); # ca. 1 ms
  2.874 ms (2 allocations: 96 bytes)

julia> @btime lmul!($D, $A); # ca. 1 ms
  2.402 ms (0 allocations: 0 bytes)

julia> @btime mul!($C, $A, $D);
  1.222 ms (2 allocations: 96 bytes)

julia> @btime mul!($C, $D, $A);
  1.215 ms (0 allocations: 0 bytes)

@KristofferC KristofferC mentioned this pull request Mar 29, 2022
67 tasks
@dkarrasch
Copy link
Member Author

Test failures are unrelated. Let's go!

@dkarrasch dkarrasch merged commit 03af781 into master Mar 29, 2022
@dkarrasch dkarrasch deleted the dk/diagmul branch March 29, 2022 15:44
KristofferC pushed a commit that referenced this pull request Apr 19, 2022
@KristofferC KristofferC removed the backport 1.8 Change should be backported to release-1.8 label May 26, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
linear algebra Linear algebra performance Must go faster
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants