-
-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Re-associating matrix-chain multiplication yields unpredictable results #1043
Comments
Why is this a correctness bug? Writing |
This method could be specialized to Real and Complex. It would lose some extensibility for user-defined associative types that are not real or complex, but are there any? |
Yes, quaternions for example. |
this is not necessarily what Julia should nor, nor is it necessarily a reasonable assumption but purely as a matter of practicality, I suspect most users would expect that in the absence of parens, maybe an |
Is this really behavior we want to have undefined? If you go by https://docs.julialang.org/en/v1/manual/mathematical-operations/#Operator-Precedence-and-Associativity then help?> *
search: *
<lines omitted>
*(A, B::AbstractMatrix, C)
A * B * C * D
Chained multiplication of 3 or 4 matrices is done in the most efficient sequence, based on the sizes of the
arrays. That is, the number of scalar multiplications needed for (A * B) * C (with 3 dense matrices) is
compared to that for A * (B * C) to choose which of these to execute.
If the last factor is a vector, or the first a transposed vector, then it is efficient to deal with these
first. In particular x' * B * y means (x' * B) * y for an ordinary column-major B::Matrix. Unlike dot(x, B,
y), this allocates an intermediate array.
If the first or last factor is a number, this will be fused with the matrix multiplication, using 5-arg mul!.
See also muladd, dot.
│ Julia 1.7
│
│ These optimisations require at least Julia 1.7. (note that this docstring does not appear in the online Julia docs, which should probably be resolved) I'm here to argue that an order that is undefined and results in mathematical errors (i.e., more than just roundoff) is something we should consider not having because it makes generic code very difficult to write. You cannot write any syntactically-chainable Personally, I always use parentheses to express my chained matrix multiplication because I didn't even know about these specializations until this week (and they've only existed since v1.7 and don't appear in the online docs). I never found it particularly onerous and there are packages that handle this anyway (and much more generally) by introducing dedicated functions for re-associative Another fun consequence of our special-casing of 3-4 argument chained matmul is this: julia> [x;;] * [y;;] * [z;;] * [1;] # right-associative
1-element Vector{Octonion{Int64}}:
Octonion{Int64}(-652, -1064, 612, -736, -1148, -888, -1420, -1376)
julia> [1;;] * [x;;] * [y;;] * [z;;] * [1;] # chain another matrix and it reverts to left-associative
1-element Vector{Octonion{Int64}}:
Octonion{Int64}(-652, -1064, 612, -736, -860, -1128, -1036, -1712) I find this behavior very difficult to defend. |
Maybe the first question is whether The assumption that LinearAlgebra deals with associative multiplication also exists in matrix powers, and in using Octonions
begin
a = Octonion(rand(1:99, 8)...)
M = [Octonion(rand(1:99, 8)...) for _ in 1:2, _ in 1:2]
v = [Octonion(rand(1:99, 8)...) for _ in 1:2]
w = [Octonion(rand(1:99, 8)...) for _ in 1:2]
end;
@assert M^3 == M*(M*M) != (M*M)*M # power_by_squaring, since before Julia 1.0
@assert M^4 == (M^2) * (M^2) != ((M*M)*M)*M
@assert dot(v,M,w) == (v'*M)*w == v'*M*w != dot(v, M*w) # disagrees with docstring, Julia 1.4
@assert dot(v,M',w) == v'*(M'*w) == dot(v, M'*w) # calls adjoint(dot(w, M.parent, v)) [Edit, I misunderstood a comment above]. Quaternions are associative (but non-commutative). The methods for fused What examples are there besides Octonions? Are there some which are useful in the wild? Useful to feed into generic code? (We have a vector cross product |
Version 1.9.4.
Obviously, this transformation slightly changes results for floating point numbers due to their mild non-associativity. However, it is also applied in contexts with mathematics that are decisively non-associative. For example:
We like to compose packages in Julia. Arbitrary re-assocation is incompatible with this. Since "standard" associativity is undefined, it becomes necessary for a "correct" package to explicitly force every association with parentheses. N-ary evaluation cannot be correctly applied without tight control of the context that ensures re-association is mathematically valid. The presence of re-association in any insufficiently-controlled context (like the above matrix-chain specialization) will lead to correctness issues across the ecosystem.
Spiritually related to JuliaLang/julia#52333.
The text was updated successfully, but these errors were encountered: