Add SHermitianCompact #358
Codecov Report
@@ Coverage Diff @@
## master #358 +/- ##
==========================================
- Coverage 93.36% 91.41% -1.96%
==========================================
Files 37 38 +1
Lines 2713 2771 +58
==========================================
Hits 2533 2533
- Misses 180 238 +58
Continue to review full report at Codecov.
|
Cool - I've always wanted to try this and see how it panned out! There's no reason this doesn't belong here, but I guess there is a question as to what the killer application of this beastie is? I guess it could be useful for storing lots of little, Hermitian matrices in memory? Also, we need to keep in mind that
Oh dear, we should probably change that, the |
In continuum mechanics, symmetric second-order tensors are very common (we have something similar to this in Tensors.jl). |
So that would be a |
Yes. In addition, the derivative of one symmetric second-order tensor with respect to another is a symmetric fourth-order tensor, where the storage saving is more substantial (36 vs 81 elements for symmetric vs unsymmetric fourth-order tensors). |
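The element counts quoted above can be reproduced with a short sketch. It assumes three spatial dimensions and a fourth-order tensor that is symmetric in each index pair; `independent_entries` is a hypothetical helper written for illustration, not a function from StaticArrays or Tensors.jl.

```julia
# Hypothetical helper (not part of any package): number of independent
# entries in a symmetric n×n matrix, i.e. the n-th triangular number.
independent_entries(n) = n * (n + 1) ÷ 2

dim = 3
second_order_full = dim^2                      # all entries of a 3×3 tensor
second_order_sym  = independent_entries(dim)   # lower triangle only

# A fourth-order tensor that is symmetric in each index pair acts like a
# matrix mapping symmetric second-order tensors to symmetric second-order tensors.
fourth_order_full = dim^4
fourth_order_sym  = second_order_sym^2

println((second_order_sym, second_order_full))   # (6, 9)
println((fourth_order_sym, fourth_order_full))   # (36, 81)
```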
I was personally hoping to use this in RigidBodyDynamics to speed up operations involving moments of inertia. Made a bit more progress. Next up is
Thanks for the heads up; fixed this. I did notice that this is actually currently not the case for Lines 58 to 70 in a83cac6
|
Yeah I think this was a design decision I was trying to ignore when we first wrote static arrays. :) We should fix that up. I'm slightly worried that result here might be that the eltype is a |
Maybe I don't follow, but isn't that problem avoided here, since @inline transpose(a::SSymmetricCompact) = SSymmetricCompact(transpose.(a.lowertriangle)) |
Alright, this is ready for review now. Regarding the recursiveness of
a = Symmetric([[rand(Complex{Int}) for i = 1 : 2] for row = 1 : 3, col = 1 : 3])
@test transpose(SSymmetricCompact{3}(a)) == transpose(a)
passes with the current code, but not with
transpose(a::SSymmetricCompact) = SSymmetricCompact((map(transpose, a.lowertriangle))) |
Is that v0.6 or v0.7? |
0.6 and 0.7 both have (different) issues. On 0.6.1:
julia> a = Symmetric([[Complex(row * i, col * i + 1) for i = 1 : 2] for row = 1 : 3, col = 1 : 3])
3×3 Symmetric{Array{Complex{Int64},1},Array{Array{Complex{Int64},1},2}}:
Complex{Int64}[1+2im, 2+3im] Complex{Int64}[1+3im, 2+5im] Complex{Int64}[1+4im, 2+7im]
Complex{Int64}[1+3im, 2+5im] Complex{Int64}[2+3im, 4+5im] Complex{Int64}[2+4im, 4+7im]
Complex{Int64}[1+4im, 2+7im] Complex{Int64}[2+4im, 4+7im] Complex{Int64}[3+4im, 6+7im]
julia> transpose(a)[1]
2-element Array{Complex{Int64},1}:
1+2im
2+3im
julia> transpose(Array(a))[1]
ERROR: MethodError: Cannot `convert` an object of type RowVector{Complex{Int64},Array{Complex{Int64},1}} to an object of type Array{Complex{Int64},1}
This may have arisen from a call to the constructor Array{Complex{Int64},1}(...),
since type constructors fall back to convert methods.
Stacktrace:
[1] transpose_f!(::Base.#transpose, ::Array{Array{Complex{Int64},1},2}, ::Array{Array{Complex{Int64},1},2}) at ./linalg/transpose.jl:54
[2] transpose(::Array{Array{Complex{Int64},1},2}) at ./linalg/transpose.jl:121
On 5-day-old master:
julia> a = Symmetric([[Complex(row * i, col * i + 1) for i = 1 : 2] for row = 1 : 3, col = 1 : 3])
3×3 Symmetric{Array{Complex{Int64},1},Array{Array{Complex{Int64},1},2}}:
[1+2im, 2+3im] [1+3im, 2+5im] [1+4im, 2+7im]
[1+3im, 2+5im] [2+3im, 4+5im] [2+4im, 4+7im]
[1+4im, 2+7im] [2+4im, 4+7im] [3+4im, 6+7im]
julia> transpose(a)[1]
2-element Array{Complex{Int64},1}:
1 + 2im
2 + 3im
julia> transpose(Array(a))[1]
1×2 Transpose{Complex{Int64},Array{Complex{Int64},1}}:
1+2im 2+3im |
@tkoolen thought I should let you know I'm trying to ignore this until all the v0.7 carnage is dealt with, but I am generally supportive of this addition. |
Also relevant for JuliaIO/NRRD.jl#30 |
I think |
As in, in a separate package, perhaps. The |
? I said (or meant to say) that it belongs here. |
* Extract out @pure triangularnumber and triangularroot functions, use to simplify code
* Add similar_type overload
* Speed up == overload
* Generalize +, - overloads with two SSymmetricCompact matrices by overloading _fill
* More complete list of scalar-array ops
* Make transpose recursive
* Overloads for rand, randn, randexp
Ah, I misunderstood then. |
I've brought this up to date and fixed some issues with recursive transposes and adjoints. @timholy, could I ask you to review this? |
By the way, if you're wondering about the efficiency of
julia> sa = SSymmetricCompact(rand(SMatrix{5, 5}));
julia> f(sa, i) = @inbounds getindex(sa, i)
f (generic function with 1 method)
julia> @code_native f(sa, 1)
.text
; ┌ @ REPL[23]:1 within `f'
; │┌ @ SSymmetricCompact.jl:82 within `getindex' @ REPL[23]:1
shlq $4, %rsi
; │└
; │┌ @ pair.jl:50 within `getindex'
movabsq $139903100985504, %rax # imm = 0x7F3DBAA320A0
; │└
; │┌ @ SSymmetricCompact.jl:83 within `getindex' @ SVector.jl:37 @ tuple.jl:24
movq -16(%rsi,%rax), %rax
; ││ @ SSymmetricCompact.jl:84 within `getindex'
vmovsd -8(%rdi,%rax,8), %xmm0 # xmm0 = mem[0],zero
; │└
retq
nopw (%rax,%rax)
; └
I don't understand the trick the compiler is using yet, but since the native code for various matrix sizes only seems to differ in one constant (e.g. Edit: that one constant is just a memory address, probably of the precomputed tuple of indices that it's indexing into; it's not using some magical formula. |
Very nice!
src/SSymmetricCompact.jl
Outdated
end
end

Base.@propagate_inbounds function Base.getindex(a::SSymmetricCompact{N}, i::Int) where {N}
Why implement this rather than getindex(a, i::Int, j::Int) and make it IndexCartesian?
The main reasons are that getindex(a, i::Int, j::Int) would also not be trivial, requiring a solution similar to _symmetric_compact_indices to be fast, and a lot of functionality inside of StaticArrays implicitly assumes that all StaticArrays are IndexLinear, e.g.:
StaticArrays.jl/src/mapreduce.jl
Lines 34 to 35 in 407c65f
tmp = [:(a[$j][$i]) for j ∈ 1:length(a)]
exprs[i] = :(f($(tmp...)))
Regarding the nontrivial, doesn't
i, j = i >= j ? (i, j) : (j, i)
idx = i + (j-1)*N
do the job?
If you've not benchmarked it or checked generated code, this seems to be worth doing. But most likely you have and you know that it's all fine.
No, that's not quite right.
julia> N = 3; L = StaticArrays.triangularnumber(N)
6
julia> m = SSymmetricCompact(SVector{L}(1 : L))
3×3 SSymmetricCompact{3,Int64,6}:
1 2 3
2 4 5
3 5 6
julia> i, j = 2, 3;
julia> m[i, j]
5
julia> i, j = i >= j ? (i, j) : (j, i)
(3, 2)
julia> idx = i + (j-1)*N
6
julia> m.lowertriangle[idx]
6
D'oh! Try
idx = i - j + 1 + ((j-1)*(2N+2-j)÷2)
The point being: your implementation looks slow, though perhaps it's not.
That's not quite right either I'm afraid. After removing the current getindex and adding
Base.IndexStyle(::Type{<:SSymmetricCompact}) = Base.IndexCartesian()
Base.@propagate_inbounds function Base.getindex(a::SSymmetricCompact{N}, i::Int, j::Int) where N
idx = i - j + 1 + ((j-1)*(2N+2-j)÷2)
x = a.lowertriangle[idx]
end
we get
julia> N = 3; L = StaticArrays.triangularnumber(N)
6
julia> m = SSymmetricCompact(SVector{L}(1 : L))
3×3 SSymmetricCompact{3,Int64,6}:
1 3 4
2 4 5
3 5 6
but it's supposed to be
3×3 SSymmetricCompact{3,Int64,6}:
1 2 3
2 4 5
3 5 6
As I showed in #358 (comment), the compiler does a very good job on the whole _symmetric_compact_indices thing. Basically, it computes that tuple at compile time for a given N, and then the getindex method simply indexes into that constant tuple.
Oh wait, I should still add the i, j = i >= j ? (i, j) : (j, i); it works then. I'll benchmark.
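The closed-form index from this thread can be checked independently of StaticArrays. The sketch below is plain Julia; `packed_index` and `reference_indices` are hypothetical names used only for illustration. It compares the formula, including the index swap, against a brute-force enumeration of the lower triangle in column-major order.

```julia
# Hypothetical helper: position of entry (i, j) of a symmetric N×N matrix
# in a vector holding the lower triangle in column-major order.
function packed_index(N::Int, i::Int, j::Int)
    i, j = i >= j ? (i, j) : (j, i)        # reflect upper-triangle indices
    return i - j + 1 + ((j - 1) * (2N + 2 - j)) ÷ 2
end

# Reference: enumerate the lower triangle column by column.
function reference_indices(N::Int)
    table = zeros(Int, N, N)
    k = 0
    for j in 1:N, i in j:N
        k += 1
        table[i, j] = k
        table[j, i] = k
    end
    return table
end

# The formula agrees with the enumeration for several sizes.
for N in 1:6
    expected = reference_indices(N)
    @assert all(packed_index(N, i, j) == expected[i, j] for i in 1:N, j in 1:N)
end

packed_index(3, 2, 3)  # 5, matching m[2, 3] == 5 in the N = 3 example
```

The term `(j - 1) * (2N + 2 - j) ÷ 2` is the number of stored entries in the columns before column `j`, so the product is always even and the integer division is exact.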
Alright, with
Base.IndexStyle(::Type{<:SSymmetricCompact}) = Base.IndexCartesian()
Base.@propagate_inbounds function Base.getindex(a::SSymmetricCompact{N}, i::Int) where N
# override StaticArray overload, replace with AbstractArray version.
# Base._getindex(IndexStyle(a), a, Base.to_indices(a, (i,))...)
c, r = divrem(i - 1, N)
c, r = c + 1, r + 1
a[r, c]
end
Base.@propagate_inbounds function Base.getindex(a::SSymmetricCompact{N}, i::Int, j::Int) where N
i, j = i >= j ? (i, j) : (j, i)
idx = i - j + 1 + ((j-1)*(2N+2-j)÷2)
x = a.lowertriangle[idx]
end
we get
julia> @btime a[i] setup = begin
a = SSymmetricCompact(rand(SMatrix{6, 6}))
i = rand(1 : length(a))
end
3.272 ns (0 allocations: 0 bytes)
whereas before,
julia> @btime a[i] setup = begin
a = SSymmetricCompact(rand(SMatrix{6, 6}))
i = rand(1 : length(a))
end
1.783 ns (0 allocations: 0 bytes)
Do note that multiplying two SSymmetricCompacts results in the same runtime before and after this change, as the indices are known at compile time and so it looks like the compiler is able to constant-propagate everything in either case. But just randomly linearly indexing into an SSymmetricCompact is faster with the _symmetric_compact_indices approach.
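For readers who have not looked at the source, the lookup-table idea can be sketched in plain Julia. This is not the PR's `_symmetric_compact_indices` implementation; `linear_to_packed`, `TABLE_3`, `LOWER` and `getindex_packed` are hypothetical names, and the table here is built at load time rather than at compile time.

```julia
# Hypothetical stand-in for a precomputed index table: for each linear index
# of the full N×N matrix, the position in the packed lower triangle.
function linear_to_packed(N::Int)
    table = Vector{Int}(undef, N * N)
    k = 0
    for col in 1:N, row in col:N
        k += 1
        table[row + (col - 1) * N] = k   # entry in the lower triangle
        table[col + (row - 1) * N] = k   # its mirror image
    end
    return Tuple(table)
end

const TABLE_3 = linear_to_packed(3)
const LOWER = (1, 2, 3, 4, 5, 6)         # packed lower triangle, N = 3

# Linear indexing is then a table lookup followed by a load.
getindex_packed(i::Int) = LOWER[TABLE_3[i]]

full = [getindex_packed(i) for i in 1:9]
reshape(full, 3, 3)  # [1 2 3; 2 4 5; 3 5 6]
```

With the table fixed for a given `N`, a linear `getindex` needs neither `divrem` nor the closed-form formula, which is consistent with the faster timing reported for random linear indexing.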
Right, my proposal is really for IndexCartesian, where you never have to call divrem (which is insanely slow).
But I just benchmarked with
Base.@propagate_inbounds function Base.getindex(a::SSymmetricCompact{N}, i::Int, j::Int) where {N}
i, j, lower = j > i ? (j, i, false) : (i, j, true)
idx = i - j + 1 + ((j-1)*(2N+2-j))>>UInt(1)
@inbounds value = a.lowertriangle[idx]
return lower ? value : transpose(value)
end
and to my surprise it's no better. So I'm happy going with yours.
For constant y the compiler does a very good job on div(x, y) and divrem(x, y), almost to the point where it produces the same code as the manual >> UInt(1) optimization:
julia> f(x) = div(x, 2);
julia> g(x) = x >> UInt(1);
julia> @code_native f(3)
.text
; ┌ @ REPL[15]:1 within `f'
; │┌ @ REPL[15]:1 within `div'
movq %rdi, %rax
shrq $63, %rax
leaq (%rax,%rdi), %rax
sarq %rax
; │└
retq
nop
; └
julia> @code_native g(3)
.text
; ┌ @ REPL[16]:1 within `g'
; │┌ @ REPL[16]:1 within `>>'
sarq %rdi
; │└
movq %rdi, %rax
retq
nopw (%rax,%rax)
; └
Similarly for divrem(i, N) for fixed N.
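The few extra instructions in `f` have a simple explanation: `div` truncates toward zero, while an arithmetic right shift rounds toward negative infinity, so the two only agree for non-negative inputs. A small sketch, assuming 64-bit integers (`f_manual` is a hypothetical name for the hand-written sign correction):

```julia
f(x) = div(x, 2)
g(x) = x >> UInt(1)

# Manual version of the sign correction: add the sign bit, then shift.
f_manual(x::Int64) = (x + (x >>> 63)) >> 1

for x in Int64[-7, -2, -1, 0, 1, 7]
    @assert f(x) == f_manual(x)   # truncation toward zero
    @assert g(x) == fld(x, 2)     # rounding toward -Inf
end

f(-7), g(-7)   # (-3, -4)
```

For the index computation in this thread the operands are non-negative, so either form gives the same result.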
Thanks for the quick review!
This looks pretty neat. I'm worried that the Tuple constructor will interact quite badly with the similar_type machinery? Might need to have SMatrix as the result of similar_type.
Some tests of functionality for non symmetry-preserving operations should be added I think.
src/SSymmetricCompact.jl
Outdated
An `SSymmetricCompact` may be constructed either:

* from an `AbstractVector` containing the lower triangular elements; or
* from a `Tuple` containing both upper and lower triangular elements (in column major order); or
So I guess this inconsistency between Tuple and AbstractVector is due to the rest of StaticArrays assuming dense storage? That's unfortunate.
Does this Tuple constructor ignoring the top half, combined with the similar_type machinery mean that a lot of StaticArrays functionality is actually broken with SSymmetricCompact? For example, matrix multiply of SSymmetricCompact with SMatrix?
It's a bit of a pickle (see #358 (comment), item 2), but I think the current implementation still makes sense.
Any StaticArray definitely needs to be constructable from a Tuple, which needs to have the interpretation of the full list of column-major-ordered elements; that's the basic assumption in convert.jl. That also means we can't use Tuple as the type of the lowertriangle field without causing all kinds of confusion with the constructors. So I chose to use an SVector as the storage type for the lower triangular elements and designed the constructors in accordance. The precedent for this was SDiagonal (which has since been replaced with Diagonal{T, SVector{<:Any, T}}; maybe the same will happen for this case in time).
Note that the StaticMatrix subtype constructors / convert methods currently also have the job of implementing reshape,
StaticArrays.jl/src/convert.jl
Lines 9 to 10 in 407c65f
# this covers most conversions and "statically-sized reshapes"
@inline convert(::Type{SA}, sa::StaticArray) where {SA<:StaticArray} = SA(Tuple(sa))
which I'm not 100% on board with anyway. SSymmetricCompact hijacks this particular use case, but I'd say it's niche enough that it's OK. Alternatively, a new type could be added precisely for this purpose, but I don't think there's much of an advantage in doing so.
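To make the Tuple interpretation concrete, here is a plain-Julia sketch of what "a full column-major tuple whose top half is ignored" means. `lower_triangle` is a hypothetical helper, not the PR's constructor, and it returns an ordinary `Vector` rather than an `SVector`.

```julia
# Hypothetical helper: keep only the lower-triangular entries of a full
# column-major tuple describing an N×N matrix.
function lower_triangle(full::NTuple{M,T}, N::Int) where {M,T}
    @assert M == N * N
    return [full[row + (col - 1) * N] for col in 1:N for row in col:N]
end

symmetric_input = (1, 2, 3, 2, 4, 5, 3, 5, 6)   # symmetric 3×3, column-major
lower_triangle(symmetric_input, 3)               # [1, 2, 3, 4, 5, 6]

# Upper-triangle entries are never read, so a non-symmetric input is
# silently symmetrized from its lower half.
lower_triangle((1, 2, 3, 0, 4, 5, 0, 0, 6), 3)   # [1, 2, 3, 4, 5, 6]
```

The second call illustrates the concern raised in review: a dense result passed through such a constructor would lose its upper triangle, which is why the result type chosen by similar_type matters.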
Does this Tuple constructor ignoring the top half, combined with the similar_type machinery mean that a lot of StaticArrays functionality is actually broken with SSymmetricCompact? For example, matrix multiply of SSymmetricCompact with SMatrix?
No. similar_type is not overloaded for SSymmetricCompact, so the default SMatrix is used. So in that sense, SSymmetricCompact is very similar to e.g. Rotations.Quat, and so the only thing that really matters is that getindex(::StaticArray, i::Int) works.
possibly relevant comment: JuliaLang/julia#31836 (comment) |
I'd be fine with turning this into |
OK, changed to |
Thanks, all! |
I was just playing around yesterday with a statically-sized symmetric matrix type that (as opposed to Base.Symmetric) only stores the lower triangle.
This is currently lacking all tests and some things still need to be implemented, but what's there now seems to be working in my scrap notebook. Some operations are slightly faster, some slightly slower than plain SMatrix depending on the operation, matrix size, and processor architecture. It's exciting to see that multiplying two SSymmetricCompact{3, 3}s (using the generic code in matrix_multiply.jl!) is slightly faster than multiplying regular SMatrix{3, 3}s. Overall I was slightly underwhelmed by the performance, but I still think this could be useful.
I was just wondering:
SDiagonal (which I used for inspiration), so maybe?
SSymmetricCompact(::Tuple) should do? I saw that for SDiagonal, it assumes that the Tuple is just the diagonal, which I thought was kind of strange given the use of the Tuple methods in convert.jl. It's also not how the Rotations.jl types do it, and I ended up following those examples.