Proposal for AbstractWrappedArray in order to mitigate "AbstractArray-fallback" #31563
@ViralBShah, @StefanKarpinski, @andreasnoack, I would like to bring this PR to your attention. It is introducing a new abstract type into the standard type hierarchy, and is slightly breaking. |
Interesting. This seems to be a competing option to doing something like #25558 — that is, instead of traits this encodes the "basetype" as a parameter in the type tree. |
@mbauman, yes, I was also thinking of a traits solution at first, but I favored the additional-type-parameter approach for these reasons:
|
Failing test seems unrelated. |
@mbauman or @KristofferC, please merge if this looks good to you. |
Isn't this heavily breaking since it adds type parameters? |
Yes, I think this needs a more careful design review with an eye towards what we want our final state to be. I've simply not had a chance to do so myself at this point. |
The appended type parameter is a breaking change, because we have to replace for example
by
(I am not sure, if the |
bu mp ;-) |
bummpp |
What about a definition like this:
abstract type AbstractWrappedArray{T,N,P} <: AbstractArray{T,N}; end
basetype(::Type{<:AbstractWrappedArray{T,N,P}}) where {T,N,P} = basetype(P)
We'd only store the "one-level up" parent in the type but allow
We can discuss it on an upcoming triage call.
Edit: ah, yes, that's what you suggested in #31563 (comment).
foo(A::AbstractWrappedArray) = _foo(basetype(typeof(A)), A)
_foo(::Type{<:SparseMatrixCSC}, A) = ... |
The complete story would look like:
foo(A::AbstractWrappedArray) = _foo(basetype(typeof(A)), A)
_foo(::Type{<:SparseMatrixCSC}, A) = foo(sparse(A))
_foo(::Type{<:AbstractMatrix}, A) = invoke(foo, Tuple{AbstractMatrix}, A) # avoid dispatch loop
foo(A) if we want to re-use the pre-existing A. additional type parameter in wrapped types + one additional method per use case (method) In both cases the new |
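For readers who want to try the trait-style dispatch discussed in the comments above, here is a minimal sketch. It is not the PR's implementation: `MyWrapper` is a hypothetical wrapper type invented for this illustration (Base's `Adjoint`, `SubArray`, etc. are not subtypes of any `AbstractWrappedArray`), the terminating `basetype` method for plain arrays is an assumption, and `foo` is restricted to two-dimensional wrappers so that it is unambiguously more specific than the `AbstractMatrix` method.

```julia
using SparseArrays

# Hypothetical abstract type from the discussion; P is the type of the direct parent.
abstract type AbstractWrappedArray{T,N,P} <: AbstractArray{T,N} end

# Hypothetical wrapper, defined only for this sketch.
struct MyWrapper{T,N,P<:AbstractArray{T,N}} <: AbstractWrappedArray{T,N,P}
    parent::P
end
# Explicit outer constructor: T and N are recovered from the parent type.
MyWrapper(p::P) where {T,N,P<:AbstractArray{T,N}} = MyWrapper{T,N,P}(p)
Base.size(A::MyWrapper) = size(A.parent)
Base.getindex(A::MyWrapper, I::Int...) = A.parent[I...]

# basetype recurses through the parent types until it reaches a non-wrapper.
basetype(::Type{<:AbstractWrappedArray{T,N,P}}) where {T,N,P} = basetype(P)
basetype(::Type{P}) where {P<:AbstractArray} = P   # assumed base case

foo(A::AbstractMatrix) = :generic    # stands in for a getindex-heavy fallback
foo(A::SparseMatrixCSC) = :sparse    # stands in for a sparsity-aware method

# Wrapped matrices are routed through their base type.
foo(A::AbstractWrappedArray{<:Any,2}) = _foo(basetype(typeof(A)), A)
_foo(::Type{<:SparseMatrixCSC}, A) = foo(sparse(A))
# invoke calls the AbstractMatrix method directly, avoiding a dispatch loop.
_foo(::Type{<:AbstractMatrix}, A) = invoke(foo, Tuple{AbstractMatrix}, A)

S = sparse([1.0 0.0; 0.0 2.0])
W = MyWrapper(MyWrapper(S))          # two levels of wrapping
D = MyWrapper([1 2; 3 4])            # wrapper around a dense matrix
```

In this sketch `foo(W)` reaches the sparse method even though `W` is wrapped twice, while `foo(D)` still ends up in the generic method.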
Inserting new abstract types into the hierarchy is allowed and considered backwards-compatible. Adding the type parameters, as observed, is breaking. |
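One way to see why appending a type parameter is breaking: a type expression that used to name a concrete type no longer does. The sketch below uses two hypothetical stand-in structs, `OldWrap` and `NewWrap`, rather than the real wrappers touched by the PR.

```julia
# Hypothetical stand-ins for a wrapper before and after adding a base-type parameter B.
struct OldWrap{T,S<:AbstractVector{T}}
    parent::S
end
struct NewWrap{T,S<:AbstractVector{T},B}
    parent::S
end

old_t = OldWrap{Int,Vector{Int}}   # fully specified: a concrete type
new_t = NewWrap{Int,Vector{Int}}   # B is left open: a UnionAll, not a concrete type

old_is_concrete = isconcretetype(old_t)
new_is_concrete = isconcretetype(new_t)
```

Existing user code that spells out the old two-parameter form, for example as a field type or container element type, would silently become abstractly typed after such a change.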
So I will backpedal on the additional type parameter, but insert
|
BTW, is there a chance in the future to have a breaking change approved? |
I see there have been no comments on this PR for a while, so I wonder if any decision has been reached? The problem that is being addressed here, "AbstractArray-fallback", can occur in various situations, not just for sparse matrices. I stumbled into this when trying to train a deep learning model on a GPU with Zygote. This problem causes the broadcast result of a computation involving GPU arrays to be materialised on the CPU, so the remainder of the gradient backward pass (which should take place on the GPU) blows up. See JuliaGPU/GPUArrays.jl#244 and linked issues. This issue can easily cause errors downstream that are hard to debug. Either a general solution (not necessarily this one) needs to be adopted, or individual patches for various downstream use cases (e.g. sparse matrices, broadcasting, GPU, Zygote, etc.) need to be implemented. |
The issue needs fixing, but I think several of us are skeptical that this is the right approach. Approaches that might work include (1) submitting specialized methods for the cases you're hitting, and/or (2) resurrecting https://github.com/timholy/ArrayIteration.jl and using it to implement the generic fallbacks.
Given that you need it, I would love it if you picked up ArrayIteration and pushed it over the finish line. If memory serves, you probably want the |
https://github.com/timholy/ArrayIteration.jl is very interesting. It has similar goals to @ChrisRackauckas's https://github.com/JuliaDiffEq/ArrayInterface.jl and @vchuravy's https://github.com/JuliaGPU/KernelAbstractions.jl, so I wonder if it makes sense to coordinate on next-gen array work that can solve the problems raised in this thread. |
The problem that this could fix is that it would allow you to flip functions to be below the wrapper by default. The problem comes up if you compose wrappers. Examples of this are on
However, this points to the fact that the PR is missing a few pieces of an AbstractWrappedArray interface that would be necessary:
|
See https://github.com/JuliaGPU/Adapt.jl/blob/master/src/Adapt.jl for a |
Yes, Adapt.jl is great, but what it's doing should be what happens by default since right now if you don't use Adapt you end up with these issues. Since a lot of Julia users don't even know about Adapt, this is probably what I run into in most generic codes as the blocker for why GPUs or TrackedArrays don't "just work". |
This is interesting 👍 I've done something similar recently in a private package I'm developing. I have a package with an abstract type Since it is a very common pattern to wrap arrays, it makes sense to me to have an I'm actually working on a tensor algebra package now to illustrate some points made in this issue: JuliaLang/LinearAlgebra.jl#729 Having an Small comment: I would probably prefer |
There's another potential advantage of this PR: latency. I've been seeing significant package load-time regressions now that I've started more heavily using the That means however it would need to be possible to use |
If I understand right, this is now handled in the dispatch system for |
This PR proposes a principled approach that would work with all arrays, and not a handful of operations. I don't think it should have been closed. Sadly, doing so removed the upstream branch, so it cannot be re-opened. |
The new abstract type AbstractWrappedArray{T,N,P,B} <: AbstractArray{T,N} is inserted in the type hierarchy of some wrapping arrays: Adjoint <: AbstractWrappedArray; Symmetric, SubArray, etc. as well.
The purpose is to avoid the "AbstractArray-fallback" trap by defining additional methods that dispatch on the additional type parameter B (base type).
Example:
1. foo(A::AbstractMatrix) = ... using getindex massively ...
2. foo(A::SparseMatrixCSC) = ... exploiting sparsity of A - not using getindex ...
When calling foo(Adjoint(sparse([1 2; 3 4]))), method 1 is invoked; for really big sparse matrices that is prohibitive (the "AbstractArray-fallback" use case).
Adding the following method improves the situation considerably:
3. foo(A::AbstractWrappedArray{<:Any,2,<:Any,<:SparseMatrixCSC}) = foo(sparse(A))
Now the wrapped matrix A is converted to a SparseMatrixCSC by sparse, and then method 2 is invoked. This way the "AbstractArray-fallback" has been replaced by a "SparseMatrixCSC-fallback", which has much better performance.
Note that the wrapped types may be arbitrarily deeply nested; sparse is now reasonably fast after PR #30552.
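The three methods from the description can be tried out in current Julia with a small stand-alone sketch. Since `Adjoint` is not actually a subtype of `AbstractWrappedArray` in released Julia, the sketch uses a hypothetical wrapper, `TransposeLike`, and an assumed helper `basetype` to fill in the `B` parameter at construction time; the symbols returned by `foo` merely mark which method ran.

```julia
using SparseArrays
using LinearAlgebra

# Hypothetical four-parameter abstract type: P = direct parent type, B = innermost base type.
abstract type AbstractWrappedArray{T,N,P,B} <: AbstractArray{T,N} end

# Assumed helper used to compute B when a wrapper is constructed.
basetype(::Type{<:AbstractWrappedArray{T,N,P,B}}) where {T,N,P,B} = B
basetype(::Type{P}) where {P<:AbstractArray} = P

# Hypothetical transpose-like wrapper, defined only for this sketch.
struct TransposeLike{T,P<:AbstractMatrix{T},B} <: AbstractWrappedArray{T,2,P,B}
    parent::P
end
TransposeLike(p::P) where {T,P<:AbstractMatrix{T}} = TransposeLike{T,P,basetype(P)}(p)
Base.size(A::TransposeLike) = reverse(size(A.parent))
Base.getindex(A::TransposeLike, i::Int, j::Int) = A.parent[j, i]

foo(A::AbstractMatrix) = :generic    # method 1: stands in for getindex-heavy code
foo(A::SparseMatrixCSC) = :sparse    # method 2: stands in for sparsity-aware code
# Method 3: any 2-d wrapper whose base type is sparse is converted first.
foo(A::AbstractWrappedArray{<:Any,2,<:Any,<:SparseMatrixCSC}) = foo(sparse(A))

S = sparse([1 2; 3 4])
nested = TransposeLike(TransposeLike(S))   # B is SparseMatrixCSC at every nesting level
dense_wrapped = TransposeLike([1 2; 3 4])

today = foo(Adjoint(S))          # the real Adjoint still hits method 1
proposed = foo(nested)           # the hypothetical wrapper reaches method 2 via method 3
```

Because `B` is carried along at every level, method 3 matches regardless of nesting depth, which is the point of storing the base type in addition to the direct parent.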