Rewrite `mul!` to dispatch based on memory layout, not matrix type #25558
Conversation
Some notes to myself as I delete code to make sure to restore behaviour:
```
@@ -488,6 +488,7 @@ mul2!(A::Tridiagonal, B::AbstractTriangular) = A*full!(B) # is this necessary?
mul!(C::AbstractMatrix, A::AbstractTriangular, B::Tridiagonal) = mul!(C, copyto!(similar(parent(A)), A), B)
mul!(C::AbstractMatrix, A::Tridiagonal, B::AbstractTriangular) = mul!(C, A, copyto!(similar(parent(B)), B))
<<<<<<< HEAD
```
Merge/rebase error
Just pushed a fix.
Some (but only some) of the builds are failing because of this:

```
Test Failed at C:\projects\julia\julia-d966fe96ce\share\julia\site\v0.7\IterativeEigensolvers\test\runtests.jl:104
  Expression: sort((eigs(A, B, nev=k, sigma=1.0))[1]) ≈ (sort(eigvals(A, B)))[2:4]
   Evaluated: [0.00646459, 0.0779555, 0.312055] ≈ [-Inf, 0.00646459, 0.0779555]
ERROR: LoadError: Test run finished with errors
```

Anyone have any idea why the changes here would affect generalized eigenvalues? I suspect on some architectures it's returning one

On my machine, it looked like

EDIT: Nevermind, it's because
I'm for this in principle, but I have two issues with the proposed

So, it seems like we would want at least four "dense" layouts:

And then I guess you also need conjugated versions of the above, but it seems more flexible to be able to conjugate any layout, i.e. have a `ConjLayout` wrapper that can be applied to any of them.
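A minimal sketch of what such a conjugating wrapper could look like (`ConjLayout` matches the name used later in this thread; the `conjlayout` helper and its behaviour are illustrative, not this PR's actual code):

```julia
# A layout wrapper meaning "same strides as the wrapped layout, but the
# elements are conjugated"; it composes with any layout type.
struct ConjLayout{L}
    layout::L
end

# Conjugation only matters for complex element types, so for real eltypes
# the wrapper can collapse away (illustrative helper).
conjlayout(::Type{<:Complex}, layout) = ConjLayout(layout)
conjlayout(::Type{<:Real}, layout) = layout
```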
I like your suggestion. I suppose we can keep the
Also, to play devil's advocate here, what exactly is the advantage of making this a trait, rather than just a runtime check of the array's strides?

Pro: traits let these checks be done at compile-time, although the cost of a runtime check would likely be negligible next to the work of the multiplication itself.

Con: now array types have to implement yet another method. And, for cases like NumPy arrays, where the layout is only known at runtime, a compile-time trait cannot capture the distinction anyway.
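To make that trade-off concrete, here is a small self-contained sketch (all names here are illustrative stand-ins, not this PR's code): the trait version resolves the branch by dispatch, the runtime version by inspecting strides.

```julia
# Trait-based: each array type declares its layout once, and dispatch picks
# the fast path from the type alone.
abstract type MemoryLayout end
struct ColumnMajor <: MemoryLayout end
struct UnknownLayout <: MemoryLayout end
MemoryLayout(::Type{<:Array}) = ColumnMajor()
MemoryLayout(::Type) = UnknownLayout()

pickpath(A) = pickpath(A, MemoryLayout(typeof(A)))
pickpath(A, ::ColumnMajor) = :blas       # would call gemv!/gemm!
pickpath(A, ::MemoryLayout) = :generic   # would call generic_matvecmul!

# Runtime check: nothing new for array authors to implement, but the branch
# happens at run time and assumes stride(A, 1) is defined at all.
pickpath_runtime(A) = stride(A, 1) == 1 ? :blas : :generic
```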
I think the benefit of traits is that it allows other formats like triangular, symmetric, and banded. The main point of the pull request from my perspective is so that banded matrices can pipe into the `mul!` machinery.
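As a hedged illustration of that banded use case (everything below is invented for the example and reuses the toy `MemoryLayout` trait from the sketch above; BandedMatrices.jl's real types differ), the whole opt-in is one trait method:

```julia
struct BandedLayout <: MemoryLayout end

# A toy banded type storing bands columnwise, LAPACK-style.
struct MyBanded{T} <: AbstractMatrix{T}
    data::Matrix{T}   # (l + u + 1) × n matrix of bands
    l::Int            # number of sub-diagonals
    u::Int            # number of super-diagonals
end
Base.size(A::MyBanded) = (size(A.data, 2), size(A.data, 2))
Base.getindex(A::MyBanded{T}, i::Int, j::Int) where T =
    -A.l ≤ j - i ≤ A.u ? A.data[A.u + 1 + i - j, j] : zero(T)

# Declaring the layout is the entire opt-in...
MemoryLayout(::Type{<:MyBanded}) = BandedLayout()

# ...after which layout-based dispatch can route to a banded kernel, without
# MyBanded overloading mul! itself (a real method would call BLAS.gbmv!).
pickpath(A, ::BandedLayout) = :gbmv
```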
I'm not sure what proposed changes are left: @mbauman had suggestions for simplifying the type hierarchy and some renamings (e.g.,

Adding

I've also looked through the usage of
Thanks for the update! I'm not too worried about the names and

No,
Can you give an example of such an array that is not dense column major?
```julia
struct OnesMatrix <: AbstractArray{Int,2} end
Base.getindex(::OnesMatrix, i::Int) = 1
Base.size(::OnesMatrix) = (2, 2)
Base.IndexStyle(::Type{OnesMatrix}) = IndexLinear()
```

(Edit: or in the wild: FillArrays.jl)
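A follow-up observation (mine, not from the thread): this array has no backing memory at all, so a stride-based fast path cannot even be queried, which is exactly the situation a layout trait can encode as "unknown":

```julia
A = OnesMatrix()
A[1, 2]       # 1, computed on the fly; no stored data exists
# strides(A) has no applicable method: there is no strided memory to
# describe, so any BLAS dispatch must fall back to a generic path here.
```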
I'm only following this at a very high level, but here's my 2¢:
Sure, these aren't perfectly orthogonal, but we can come up with examples to fill in most of the boxes if we put a grid of these together:
If conflating traits is the issue, another option would be to move everything back to LinearAlgebra (now possible as
I think the main sticking point at this moment is whether

Here are the arguments in favour of

```julia
function _mul!(y::AbstractVector{T}, A::AbstractMatrix{T}, x::AbstractVector{T},
               ::AbstractStridedLayout, ::AbstractStridedLayout, ::AbstractStridedLayout) where T<:BlasFloat
    if stride(A,1) == 1 && stride(A,2) ≥ stride(A,1)
        gemv!(y, 'N', A, x)
    elseif stride(A,2) == 1 && stride(A,1) ≥ stride(A,2)
        gemv!(y, 'T', transpose(A), x)
    elseif stride(A,2) ≥ stride(A,1)
        generic_matvecmul!(y, 'T', transpose(A), x)
    else
        generic_matvecmul!(y, 'N', A, x)
    end
end
# repeat above for ConjLayout{<:AbstractStridedLayout} with 'C' in place of 'T'
```

Compare this with the current version:

```julia
_mul!(y::AbstractVector{T}, A::AbstractMatrix{T}, x::AbstractVector{T},
      ::AbstractStridedLayout, ::AbstractColumnMajor, ::AbstractStridedLayout) where {T<:BlasFloat} =
    gemv!(y, 'N', A, x)
_mul!(y::AbstractVector{T}, A::AbstractMatrix{T}, x::AbstractVector{T},
      ::AbstractStridedLayout, ::AbstractRowMajor, ::AbstractStridedLayout) where {T<:BlasFloat} =
    gemv!(y, 'T', transpose(A), x)
_mul!(y::AbstractVector, A::AbstractMatrix, x::AbstractVector, _1, _2, _3) =
    generic_matvecmul!(y, 'N', A, x)
```

The first two points are very minor issues. The last point is really the sticking point: the code becomes complex handling each possible stride variant, and so I won't have the time to make that change. If @mbauman or someone else still feels strongly about removing
Thanks for that great explanation and narrowing our focus down to just that one type (well, two including

The status quo on master is:

This leads us to the three types I proposed in my earlier comment. Now, this PR, especially with the

I'm still not sure how much work such a refactor would be, but I'd like to try and tackle it once I'm done with the broadcasting stuff.
👍 Though note keeping

```julia
function _mul!(y::AbstractVector, A::AbstractMatrix, x::AbstractVector,
               ::AbstractStridedLayout, ::AbstractStridedLayout, ::AbstractStridedLayout)
    if stride(A,1) == 1 && stride(A,2) ≥ stride(A,1)
        _mul!(y, A, x, MemoryLayout(y), ColumnMajor(), MemoryLayout(x))
    elseif stride(A,2) == 1 && stride(A,1) ≥ stride(A,2)
        _mul!(y, A, x, MemoryLayout(y), RowMajor(), MemoryLayout(x))
    elseif stride(A,2) ≥ stride(A,1)
        generic_matvecmul!(y, 'T', transpose(A), x)
    else
        generic_matvecmul!(y, 'N', A, x)
    end
end
```
@mbauman I had a look at the new Broadcast interface, and I'm now wondering if this PR would be better rewritten to replicate that. What I mean is:

```julia
struct Mul{StyleA, StyleX, AType, XType}
    style_A::StyleA
    style_x::StyleX
    A::AType
    x::XType
end

Mul(A::AbstractMatrix, x::AbstractVecOrMat) = Mul(MulStyle(A), MulStyle(x), A, x)

mul!(y::AbstractVector, A::AbstractMatrix, x::AbstractVector) = copyto!(y, Mul(A, x))

copyto!(y::AbstractVector{T}, M::Mul{DenseColumnMajor,<:AbstractStridedLayout,<:AbstractMatrix{T},<:AbstractVector{T}}) where T<:BlasFloat =
    BLAS.gemv!('N', one(T), M.A, M.x, zero(T), y)
```

One could imagine combining this with broadcasting, so that

```julia
z .= α .* A*x .+ β .* y
```

translates to

```julia
materialize!(z, Broadcasted((α, Ax, β, y) -> α*Ax + β*y, (α, Mul(A,x), y)))
```
That would indeed be a wonderful feature, but the trouble is that we need to know when to actually perform the multiplication. Broadcasting is able to figure it out because the dot-fusion happens at the parser level: it's a very clear syntax that has well-defined bounds. We materialize as soon as you hit a non-dotted function call.
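A small illustration of that boundary (my example, not from the thread): dotted calls fuse into a single loop, while any non-dotted call forces its arguments to be materialized first.

```julia
x = rand(3)
y = exp.(2 .* x)      # fully dotted: fuses into one loop, no temporary for 2 .* x
z = exp(sum(2 .* x))  # sum is not dotted, so 2 .* x is materialized into a
                      # temporary array before sum (and then exp) ever runs
```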
The "one could imagine" line was more of a pipe-dream. But for near term I was proposing *(A::AbstractMatrix, b::AbstractVector) = materialize(Mul(A, b)) which is certainly doable without parser changes. |
I've done a mock-up of the lazy `Mul` approach at https://github.com/dlfivefifty/LazyLinearAlgebra.jl and it works surprisingly well with the new broadcast, to give nice versions of BLAS routines:

```julia
A = randn(5,5); b = randn(5); c = similar(b);

c .= Mul(A,b)
@test all(c .=== BLAS.gemv!('N', 1.0, A, b, 0.0, similar(c)))

c .= 2.0 .* Mul(A,b)
@test all(c .=== BLAS.gemv!('N', 2.0, A, b, 0.0, similar(c)))

c = copy(b)
c .= Mul(A,b) .+ c
@test all(c .=== BLAS.gemv!('N', 1.0, A, b, 1.0, copy(b)))

c = copy(b)
c .= Mul(A,b) .+ 2.0 .* c
@test all(c .=== BLAS.gemv!('N', 1.0, A, b, 2.0, copy(b)))

c = copy(b)
c .= 3.0 .* Mul(A,b) .+ 2.0 .* c
@test all(c .=== BLAS.gemv!('N', 3.0, A, b, 2.0, copy(b)))

d = similar(c)
c = copy(b)
d .= 3.0 .* Mul(A,b) .+ 2.0 .* c
@test all(d .=== BLAS.gemv!('N', 3.0, A, b, 2.0, copy(b)))
```

I think I'm going to focus on developing that approach as a package, instead of working on it as a PR, as it can exist outside of StdLib. Then the various linear algebra packages (BlockArrays.jl, BandedMatrices.jl, etc.) can use it.
FYI I'm planning to make a new proposal for how to approach this based on https://github.com/JuliaMatrices/ArrayLayouts.jl. This used to be in LazyArrays.jl, which became too heavy. Essentially the idea is very close to the previous PR, except for the following:
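For a taste of the package's interface (hedged: this reflects my reading of ArrayLayouts.jl's API, which may differ between versions), layouts are queried from instances and propagate through wrappers:

```julia
using LinearAlgebra, ArrayLayouts

A = randn(5, 5)
MemoryLayout(A)             # DenseColumnMajor(): contiguous column-major storage
MemoryLayout(transpose(A))  # DenseRowMajor(): the transposed view of the above
```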
This is in part to address #10385 (based on discussions there), and is the second stage of #25321.
The aim of this pull request is to use memory layout, instead of array type, to decide when to dispatch to BLAS/LAPACK routines. This will lead to a simplified dispatch of linear algebra, without so many special cases for `Adjoint`, `Transpose`, etc. It also makes it easier for array types outside of Base (e.g. `BandedMatrix`) to use BLAS linear algebra routines.

I've only adapted `mul!` so far (with `ldiv!`, etc. yet to be adapted), but I wanted to get comments before finishing the changes. I'll finish implementing this pull request if it's decided to be a good solution to the problem.

Note I use the name `CTransposeStridedLayout` instead of `AdjointStridedLayout` as it refers to memory layout, not mathematical structure. Alternatively, it could be `RowMajorStridedLayout` and `ConjRowMajorStridedLayout`.

@timholy @stevengj Any thoughts would be appreciated.
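To make the proposed pattern concrete, here is a minimal self-contained sketch of layout-based dispatch (the layout names echo this PR's naming; the methods and `mymul!` are my illustration, not the PR's code):

```julia
using LinearAlgebra

abstract type MemoryLayout end
struct StridedLayout <: MemoryLayout end            # column-major strided data
struct TransposeStridedLayout <: MemoryLayout end   # a row-major view of such data
struct UnknownLayout <: MemoryLayout end

MemoryLayout(::Type{<:Array}) = StridedLayout()
MemoryLayout(::Type{<:Transpose{<:Any,<:Array}}) = TransposeStridedLayout()
MemoryLayout(::Type) = UnknownLayout()

# Dispatch on layout, not array type: Matrix and Transpose both reach BLAS,
# each with the right 'N'/'T' flag, through one set of methods.
mymul!(y, A, x) = _mymul!(y, A, x, MemoryLayout(typeof(A)))
_mymul!(y::Vector{Float64}, A, x::Vector{Float64}, ::StridedLayout) =
    BLAS.gemv!('N', 1.0, A, x, 0.0, y)
_mymul!(y::Vector{Float64}, A, x::Vector{Float64}, ::TransposeStridedLayout) =
    BLAS.gemv!('T', 1.0, parent(A), x, 0.0, y)
_mymul!(y, A, x, ::MemoryLayout) = copyto!(y, A * x)   # generic fallback
```

Here `mymul!(similar(x), transpose(A), x)` takes the `'T'` path even though no `mymul!` method mentions the `Transpose` wrapper itself.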