Deprecate vectorized round methods in favor of compact broadcast syntax #18590

Closed

Sacha0
Member

@Sacha0 Sacha0 commented Sep 19, 2016

This PR deprecates all remaining vectorized round methods (less those for SparseVectors, separate PR) in favor of compact broadcast syntax. Ref. #16285, #17302, #18495, #18512, #18513, #18558, #18564, #18566, #18571, #18575, #18576, and #18586.

This PR fails one test, specifically

@inferred sprand(1, 1, 1.0, x->round(Int,rand(x)*100))

In that test, replacing round(Int, rand(x)*100) with round.(Int, rand(x)*100) changes the inferred result type from SparseMatrixCSC{Int64,Int64} to SparseMatrixCSC{Tv,Int64}. Ref. #16074 and cc @KristofferC.

Best!

(Unlike with float, real, etc., the remaining vectorized round methods never alias their input. This PR should be less controversial than #18495, #18512, and #18513 as a result.)

@TotalVerb
Contributor

There are so many of these functions for various matrix types. I wonder if broadcast can directly support them?

@StefanKarpinski
Member

I'm not following you, @TotalVerb.

@TotalVerb
Contributor

TotalVerb commented Sep 19, 2016

@StefanKarpinski These PRs are subtly changing the semantics of broadcast, in terms of what type they produce. For instance, before this PR, we have

julia> broadcast(round, Bidiagonal([1, 0, 0], [1, 0], true))
3×3 Array{Int64,2}:
 1  1  0
 0  0  0
 0  0  0

and after it, we have

julia> broadcast(round, Bidiagonal([1, 0, 0], [1, 0], true))
3×3 Bidiagonal{Int64}:
 1  1  
   0  0
     0

But if a user defines myround = x -> round(x), then the former behaviour persists, which could possibly be confusing. This is also a concern with loop fusion, which will fail to protect the underlying matrix type, even if it's fusing two operations that preserve it (such as round.(abs.(X))).

My question is whether there could possibly be a way for broadcast itself to recognize certain special kinds of matrices, and certain safe operations that cannot unzero zeroes, and avoid promoting them to Array. That would substantially reduce the number of special-case methods of broadcast necessary.

@TotalVerb
Contributor

Example of change of behaviour: in v0.5, abs(round(Bidiagonal([1, 0, 0], [1, 0], true))) works and returns Bidiagonal. In v0.6, abs.(round.(Bidiagonal([1, 0, 0], [1, 0], true))) will turn into a fused broadcast, which will produce a plain old array. It would be nice if this discrepancy could be repaired on the broadcast end instead of loading it up with many special cases.

@StefanKarpinski
Member

Thanks for clarifying, @TotalVerb.

@Sacha0
Member Author

Sacha0 commented Sep 19, 2016

My question is whether there could possibly be a way for broadcast itself to recognize certain special kinds of matrices, and certain safe operations that cannot unzero zeroes, and avoid promoting them to Array. That would substantially reduce the number of special-case methods of broadcast necessary.

I've been thinking along the same lines while writing these PRs.

To some degree SparseMatrixCSCs and SparseVectors realize what you describe above, see base/sparse/sparsematrix.jl. Having similar broadcast methods for the special matrix types could be advantageous. What exists for SparseMatrixCSCs and SparseVectors is limited; improving the sophistication of that code could be great, for example with logic to check zero-preserving behavior for functions other than those explicitly identified as zero-preserving (which fusion makes more important as you note). Ref. #11474 and #18309.

Apart from the existing SparseMatrixCSC/SparseVector model, annotating scalar functions with a ZeroPreserving trait for broadcast dispatch might be better than the status quo involving lists of zero-preserving functions that are replicated in multiple files (for example see base/sparse/sparsematrix.jl and base/sparse/sparsevector.jl). Best!

@StefanKarpinski
Member

Check if f(zero(T)) == zero(T)? More broadly, array types might need to have a mechanism for expressing the set of values that must be preserved in order to be able to use them.

@TotalVerb
Contributor

@StefanKarpinski I like that idea, though it would have to be f(zero(T)) == zero(f(zero(T))) to deal with cases with physical units where zero(f(zero(T))) may not compare equal to zero(T). Being able to check directly that zeroes are preserved is much more flexible and versatile than using a trait.
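
A minimal sketch of the check under discussion, written in Julia 1.x syntax. The name `preserveszero` is hypothetical (it is not part of Base), and the sketch assumes that calling f on zero(T) has no side effects, which is exactly the point debated in the following comments.

```julia
# Hypothetical helper (not in Base): does f map the zero of T to a zero
# of its own output type? Comparing against zero(y) rather than zero(T)
# follows the physical-units caveat above.
function preserveszero(f, ::Type{T}) where {T}
    y = f(zero(T))
    return y == zero(y)
end

preserveszero(sin, Float64)    # true:  sin(0.0) == 0.0
preserveszero(cos, Float64)    # false: cos(0.0) == 1.0
preserveszero(round, Float64)  # true:  round(0.0) == 0.0
```

Because the answer is a runtime Bool, branching on it inside broadcast would not by itself be type stable.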

@Sacha0
Member Author

Sacha0 commented Sep 19, 2016

Something along those lines :). But whether calling f for that purpose is acceptable might be contentious (re. side effects).

@TotalVerb
Contributor

Side effects should just be prohibited within broadcast. They are evaluated in undefined order with loop fusion anyway.

@Sacha0
Member Author

Sacha0 commented Sep 19, 2016

Side effects should just be prohibited within broadcast. They are evaluated in undefined order with loop fusion anyway.

I agree. Just noting that (IIRC) that point has been contentious. But perhaps that debate has been settled? If so, great! Makes life relatively straightforward. Best!

@tkelman
Contributor

tkelman commented Sep 20, 2016

I don't think it has. https://github.com/JuliaLang/julia/issues/7010

@Sacha0
Member Author

Sacha0 commented Sep 20, 2016

Ref. #10536

@martinholters
Member

Ok, here is a radical idea (more intended as food for thought than actual realization):

  • Introduce a zero-preserving trait for functions. E.g.
iszeropreserving(::Function) = NotZeroPreserving()
iszeropreserving(::typeof(round)) = ZeroPreserving()
# ...
  • Provide a broadcast method which accepts a tuple of functions as first argument and applies their composition, special-casing if they are all zero-preserving (and hence, their composition is, too). Fused dot calls would be lowered to this method.

    - OR -

    Lowering still constructs the composed function but also emits code to attach the trait.
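
A rough sketch of the first variant, in Julia 1.x syntax and restricted to Diagonal for brevity. The names `ZeroTrait`, `combine`, `chainbroadcast`, and `_chain` are hypothetical, as is the choice of which functions opt in; nothing here reflects how lowering actually works.

```julia
using LinearAlgebra

abstract type ZeroTrait end
struct ZeroPreserving <: ZeroTrait end
struct NotZeroPreserving <: ZeroTrait end

# Pessimistic default; individual functions opt in.
iszeropreserving(::Function) = NotZeroPreserving()
iszeropreserving(::typeof(round)) = ZeroPreserving()
iszeropreserving(::typeof(abs)) = ZeroPreserving()

# A composition preserves zeros if every member does.
combine(::ZeroPreserving, ::ZeroPreserving) = ZeroPreserving()
combine(::ZeroTrait, ::ZeroTrait) = NotZeroPreserving()
iszeropreserving(fs::Tuple) = mapreduce(iszeropreserving, combine, fs)

# Hypothetical entry point: fs = (f, g, h) stands for f ∘ g ∘ h.
chainbroadcast(fs::Tuple, D::Diagonal) =
    _chain(iszeropreserving(fs), foldl(∘, fs), D)
_chain(::ZeroPreserving, f, D::Diagonal) = Diagonal(map(f, D.diag))
_chain(::NotZeroPreserving, f, D::Diagonal) = map(f, Matrix(D))

chainbroadcast((abs, round), Diagonal([-1.4, 2.6]))  # stays Diagonal
chainbroadcast((cos,), Diagonal([0.0, 0.0]))         # falls back to a dense Matrix
```

The trait is sufficient but not necessary: a composition such as x -> sin(cos(x) - 1) preserves zero even though cos alone does not, and this scheme would classify it pessimistically.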

@TotalVerb
Contributor

I feel like this is more complicated than it needs to be. Testing whether a function is zero-preserving should be as simple as applying it to zero. Functions that behave badly when called on zero should not be applied to matrix types with unstored zeroes anyway. But if we are going the trait route, we should introduce a `Base.ComposedFunction{Tuple{#f,#g,#h}}` or similar type and have broadcast produce those. This would have the greatest generality.

@mauro3
Contributor

mauro3 commented Sep 20, 2016

This is the trait grouping function:

julia> @generated function iszeropreserving{T}(fn, dt::Type{T})
         f = fn.instance
         val = f(zero(T)) == zero(f(zero(T)))
         :($val)
       end
iszeropreserving (generic function with 1 method)

julia> iszeropreserving(sin, Int) 
true

julia> iszeropreserving(cos, Int)
false

But it is a generated function. Could this be done with an @pure function in a performant manner?

@Sacha0
Member Author

Sacha0 commented Sep 20, 2016

Obviating the need for a trait would be great. Discussion in #10536 converged on accepting a condition weaker than purity: that return values of f (and particularly f(0)) may be reused in the relevant higher-order functions (summary in #10536 (comment)). That condition should suffice for checking zero-preservation behavior when calling relevant higher-order functions on structured or sparse matrices. A naive implementation over #17265 should be straightforward (also note #18309); a performant implementation might be trickier.

@stevengj
Member

stevengj commented Sep 21, 2016

Actually, side effects are not in an undefined order with loop fusion, because fusion is guaranteed to occur.

One of the advantages of making this a syntactic guarantee was precisely that we don't have to worry about proving that the result is equivalent to the result without fusion (as would be required if this were a mere optimization).

@TotalVerb
Contributor

@stevengj My terminology was inaccurate. But the point stands that the order is not simply a linear order. I really don't think side effects in broadcast are all that common or desirable. If side effects are desired, the loop should really just be written out.
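
To make the ordering point concrete, here is a small illustration in Julia 1.x syntax. The functions `f` and `g` and the `calls` log are made up for the example; the interleaved order shown for the fused case is what I would expect for a plain Vector, not a documented guarantee for every array type.

```julia
# Record the order in which two side-effecting functions are called.
const calls = Symbol[]
g(x) = (push!(calls, :g); x + 1)
f(x) = (push!(calls, :f); 2x)

f.(g.([1, 2, 3]))     # fused: one loop computing f(g(x)) per element
fused = copy(calls)   # expected here: g, f, g, f, g, f

empty!(calls)
tmp = g.([1, 2, 3])   # unfused: every g call finishes before any f call
f.(tmp)
unfused = copy(calls) # g, g, g, f, f, f
```

Both versions return the same values; only the interleaving of side effects differs, which is why treating fusion as a syntactic guarantee rather than an optimization matters.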

@stevengj
Member

stevengj commented Sep 21, 2016

However, in the special case of operating on a sparse array, calling f(0) seems reasonable to me, because broadcasting an operation with side-effects to a sparse array seems unlikely to deserve well-defined semantics.

@kshyatt kshyatt added the broadcast Applying a function over a collection label Sep 21, 2016
@Sacha0
Member Author

Sacha0 commented Sep 25, 2016

Some thoughts on generic structure-preserving broadcast methods for sparse and structured matrices. The first section explores the simple case of broadcasting unary operations. The second section explores broadcasting binary operations (as a model of broadcasting operations with two or more arguments). The third section explores one approach to some of the challenges mentioned in the first and second sections (inference of zero preserving behavior). Below, Diagonal matrices serve as a simple model of structured and sparse matrices in general.

Apologies this is so long. Lots of material.

Broadcasting unary operations

Consider magicbroadcast(f, D::Diagonal), which broadcasts unary operation f over D and preserves D's structure for zero-preserving f. The naive implementation

function magicbroadcast(f, D::Diagonal)
    if f(zero(eltype(D))) == zero(f(zero(eltype(D))))
        return Diagonal(f.(D.diag))
    else
        return f.(convert(Matrix, D))
    end
end

is not type stable due to the runtime branch (for general f). The implementation

type ZeroPreserving{T} end
iszeropreserving{T<:Number}(::typeof(sin), ::Type{T}) = ZeroPreserving{true}()
iszeropreserving{T<:Number}(::typeof(cos), ::Type{T}) = ZeroPreserving{false}()
magicbroadcast(f, D::Diagonal) = _magicbroadcast(f, iszeropreserving(f, eltype(D)), D::Diagonal)
_magicbroadcast(f, ::ZeroPreserving{true}, D::Diagonal) = Diagonal(f.(D.diag))
_magicbroadcast(f, ::ZeroPreserving{false}, D::Diagonal) = f.(convert(Matrix, D))

has the virtue of being type stable, but the disadvantage of requiring the ZeroPreserving trait. This approach may be workable (if cumbersome and limited), but significantly extending its sophistication (to avoid the need for most trait declarations) seems possible (see the third section). (Edit: Per @vtjnash, the @generated function below is illegal, so please skip the following struck-through text.) @mauro3's trick

@generated function iszeropreserving{dT}(fT, ::Type{dT})
    f = fT.instance
    if f(zero(dT)) == zero(f(zero(dT)))
        return :(ZeroPreserving{true}())
    else
        return :(ZeroPreserving{false}())
    end
end

obviates the necessity of specifying the ZeroPreserving trait for each unary operation, but at the cost of the conditions on f that calling f within a @generated function requires. Specifying the ZeroPreserving trait for a particular function, however, in turn obviates the need for the @generated function call, thereby providing an escape hatch from those restrictions.
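
For readers on current Julia, the trait-dispatch variant above can be transcribed roughly as follows. This is only a sketch in Julia 1.x syntax: `ZPClass`, `zpclass`, and `structbroadcast` are hypothetical stand-ins for ZeroPreserving, iszeropreserving, and magicbroadcast, and the two declarations for sin and cos are the same assumed examples.

```julia
using LinearAlgebra

# Hypothetical trait type; the Bool parameter plays the role of
# ZeroPreserving{true} / ZeroPreserving{false} above.
struct ZPClass{B} end

# Declarations for two example functions.
zpclass(::typeof(sin), ::Type{<:Number}) = ZPClass{true}()
zpclass(::typeof(cos), ::Type{<:Number}) = ZPClass{false}()

# Dispatch on the trait value, so the return type follows from typeof(f)
# instead of from a runtime branch.
structbroadcast(f, D::Diagonal) = _structbroadcast(f, zpclass(f, eltype(D)), D)
_structbroadcast(f, ::ZPClass{true}, D::Diagonal) = Diagonal(f.(D.diag))
_structbroadcast(f, ::ZPClass{false}, D::Diagonal) = f.(Matrix(D))

D = Diagonal([0.5, 1.0, 1.5])
structbroadcast(sin, D)   # stays Diagonal
structbroadcast(cos, D)   # dense Matrix; off-diagonal entries are cos(0.0) == 1.0
```

A function without a `zpclass` declaration raises a MethodError here; a real design would need a fallback, which is the gap the inference idea in the third section tries to fill.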

Broadcasting binary operations

Consider magicbroadcast(f, A::Diagonal, B::Diagonal), which broadcasts binary operation f over A and B and preserves the structure of A and B insofar as possible. In contrast to the unary case where the operation either preserves zeros or not, in the binary case many possible zero-preservation classes exist depending on what information (e.g. argument domain) is available/considered; here we consider the following five zero-preservation classes:

(1) if either x or y is zero, f(x,y) returns zero ('zero preserving in both arguments');
(2) if x is zero, f(x,y) returns zero independent of y ('zero preserving in the first argument');
(3) if y is zero, f(x,y) returns zero independent of x ('zero preserving in the second argument');
(4) if both x and y are zero, f(x,y) returns zero ('zero-pair preserving');
(5) f does not belong to any of classes 1-4 ('not zero preserving').

Also in contrast to the unary case where sampling f(0) enables classification, in the binary case sampling does not enable classification: On the one hand, if f(0,0) does not yield zero, then f is not zero preserving (belongs solely to class five). On the other hand, if f(0,0) does yield zero, then the operation is at least zero-pair preserving (belongs to class four). But the operation may also belong to any of classes 1-3, and membership in any of those classes can only be excluded via sampling, not established.

So for operations accepting two or more arguments, it seems zero preserving behavior must either be declared or inferred from other declarations (if one wants classification beyond merely classes 4/5).
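
A small illustration of why sampling can refute but not establish membership in classes 1-3. The helper names (`zeropair`, `refutes_first`, `refutes_second`, `sneaky`) are hypothetical, and everything is evaluated at Float64 for simplicity.

```julia
# Sampling helpers (hypothetical names), all evaluated at Float64.
zeropair(f) = iszero(f(0.0, 0.0))                          # necessary for classes 1-4
refutes_first(f, ys) = any(y -> !iszero(f(0.0, y)), ys)    # rules out classes 1 and 2
refutes_second(f, xs) = any(x -> !iszero(f(x, 0.0)), xs)   # rules out classes 1 and 3

samples = (1.0, 2.0)

zeropair(+), zeropair(*)      # both true: f(0,0) == 0 cannot tell + and * apart
refutes_first(+, samples)     # true:  0 + 1 != 0, so + is at most class 4
refutes_first(*, samples)     # false: consistent with class 1, but not a proof

# A function that passes every sample above yet is not zero preserving
# in its first argument: it misbehaves only at y == 3.
sneaky(x, y) = x * y + (y == 3.0)
refutes_first(sneaky, samples)  # false, although sneaky(0.0, 3.0) == 1.0
```

No finite sample set closes this gap in general, which is the motivation for declaring or inferring the class instead.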

Inference

As with type, zero-preservation class might be inferrable in many cases (given declaration of: (1) the zero-preservation class of a set of common 'atomic' functions; and (2) rules for computing the zero-preservation class of compositions of those functions). Below is a demo (which for simplicity ignores argument types). I know next to nothing about type inference, so I imagine the following snippet may be horrific and/or amusing to those who know more; any direction would be much appreciated.

What the demo code (below) enables

julia> # unary zero preserving
julia> magicbroadcast(x -> 4*abs(sin(x)*cos(x)) + (1/2)*tan(x)*exp(x), Diagonal(rand(4)))
4×4 Diagonal{Float64}:
 0.163829                   
          2.69932           
                  1.48974   
                          1.25917

julia> # unary not zero preserving
julia> magicbroadcast(x -> 4*abs(sin(x)*cos(x)) - log(x)*exp(x), Diagonal(rand(4)))
4×4 Array{Float64,2}:
   2.51054  Inf        Inf        Inf
 Inf          2.65709  Inf        Inf
 Inf        Inf          2.73759  Inf
 Inf        Inf        Inf          2.77127

julia> # binary zero preserving in the first argument, first arg structured
julia> magicbroadcast((x,y) -> sin(x)*cos(y), Diagonal(rand(4)), rand(4,4))
4×4 Diagonal{Float64}:
 0.37013                     
         0.367892            
                  0.258526   
                           0.506654

julia> # binary zero preserving in the first argument, first arg dense
julia> magicbroadcast((x,y) -> sin(x)*cos(y), rand(4,4), Diagonal(rand(4)))
4×4 Array{Float64,2}:
 0.64847   0.470474  0.380607  0.273377
 0.42008   0.549254  0.109858  0.449444
 0.322848  0.18293   0.5724    0.649076
 0.361723  0.561342  0.170636  0.227352

julia> # binary zero-pair preserving, both args structured
julia> magicbroadcast((x,y) -> sin(x) + sin(y), Diagonal(rand(4)), Diagonal(rand(4)))
4×4 Diagonal{Float64}:
 1.05201                     
         0.905931            
                  0.990544   
                           1.24777

julia> # binary zero-pair preserving, one arg dense
julia> magicbroadcast((x,y) -> sin(x) + sin(y), Diagonal(rand(4)), rand(4,4))
4×4 Array{Float64,2}:
 0.880026  0.432476  0.822629   0.653721
 0.309337  1.01615   0.523797   0.720296
 0.672002  0.546849  1.27934    0.173061
 0.494756  0.791128  0.0729191  0.714148

The demo code

import Base: *, +, -, abs, exp, log, sin, cos, tan

type ZeroPreserving{T} end
# ZeroPreserving{true} - zero preserving in all args (unary zero preserving, or binary zero preserving in both args)
# ZeroPreserving{false} - zero nonpreserving in all args (unary not zero preserving, or binary not zero preserving)
# ZeroPreserving{(true,true)} - binary zero-pair preserving
# ZeroPreserving{(true,false)} - binary zero preserving in first argument
# ZeroPreserving{(false,true)} - binary zero preserving in second argument

# Some short names
const ZP = ZeroPreserving
const AT = true
const AF = false
const TT = (true, true)
const TF = (true, false)
const FT = (false, true)

# unary magicbroadcast
magicbroadcast(f, D::Diagonal) = _magicbroadcast(f, iszeropreserving(f, (eltype(D),)), D::Diagonal)
_magicbroadcast(f, ::ZP{AT}, D::Diagonal) = broadcastnz(f, D)
_magicbroadcast(f, ::ZP{AF}, D::Diagonal) = broadcastall(f, D)
broadcastnz(f, D) = Diagonal(f.(D.diag))
broadcastall(f, D) = f.(convert(Matrix, D))

# binary magicbroadcast
magicbroadcast(f, A, B) = _magicbroadcast(f, iszeropreserving(f, (eltype(A), eltype(B))), A, B)
_magicbroadcast(f, ::ZP, A, B) = broadcastall(f, A, B)
_magicbroadcast(f, ::ZP{AT}, A, B) = broadcastnz(f, A, B)
_magicbroadcast(f, ::ZP{TF}, A::Diagonal, B) = broadcastnz(f, A, B)
_magicbroadcast(f, ::ZP{FT}, A, B::Diagonal) = broadcastnz(f, A, B)
_magicbroadcast(f, ::ZP{TT}, A::Diagonal, B::Diagonal) = broadcastnz(f, A, B)
broadcastnz(f, A::Diagonal, B) = Diagonal(f.(diag(A), diag(B)))
broadcastnz(f, A, B::Diagonal) = Diagonal(f.(diag(A), diag(B)))
broadcastnz(f, A::Diagonal, B::Diagonal) = Diagonal(f.(A.diag, B.diag)) # disambiguate
broadcastall(f, A, B) = f.(convert(Matrix, A), convert(Matrix, B))

iszeropreserving(f, ::Tuple{Type}) = f(ZP{AT}())
iszeropreserving(f, ::Tuple{Type,Type}) = f(ZP{TF}(), ZP{FT}())

# Below we define rules for computing the zero-preservation class of an operation formed via composition of operations with known zero-preservation class

# Rules for unary operations
declareunaryZPT(fns...) = for fn in fns; (::typeof(fn))(t::ZP) = t; end # zero preserving
declareunaryZPF(fns...) = for fn in fns; (::typeof(fn))(::ZP) = ZP{AF}(); end # not zero preserving
declareunaryZPT(abs, sin, tan)
declareunaryZPF(cos, log, exp)

# Rules for binary operations that are zero preserving in both arguments
function declarebinaryZPT(fns...)
    for fn in fns
        # ZP{T} in either argument
        (::typeof(fn))(::ZP{AT}, ::ZP) = ZP{AT}()
        (::typeof(fn))(::ZP, ::ZP{AT}) = ZP{AT}()
        (::typeof(fn))(::ZP{AT}, ::ZP{AT}) = ZP{AT}() # disambiguate
        # ZP{F} in either argument
        (::typeof(fn))(::ZP{AF}, t::ZP) = t
        (::typeof(fn))(t::ZP, ::ZP{AF}) = t
        (::typeof(fn))(::ZP{AF}, ::ZP{AF}) = ZP{AF}() # disambiguate
        (::typeof(fn))(::ZP{AF}, ::ZP{AT}) = ZP{AT}() # disambiguate
        (::typeof(fn))(::ZP{AT}, ::ZP{AF}) = ZP{AT}() # disambiguate
        # remaining ZP{TT} cases
        (::typeof(fn))(::ZP{TT}, ::ZP{TT}) = ZP{TT}()
        (::typeof(fn)){C<:Union{ZP{TF},ZP{FT}}}(::ZP{TT}, t::C) = t
        (::typeof(fn)){C<:Union{ZP{TF},ZP{FT}}}(t::C, ::ZP{TT}) = t
        # remaining ZP{TF} and ZP{FT} cases
        (::typeof(fn))(::ZP{TF}, ::ZP{TF}) = ZP{TF}()
        (::typeof(fn))(::ZP{FT}, ::ZP{FT}) = ZP{FT}()
        (::typeof(fn))(::ZP{TF}, ::ZP{FT}) = ZP{AT}()
        (::typeof(fn))(::ZP{FT}, ::ZP{TF}) = ZP{AT}()
        # combinations involving numeric constants
        (::typeof(fn))(::ZP{AT}, ::Number) = ZP{AT}()
        (::typeof(fn))(::Number, ::ZP{AT}) = ZP{AT}()
    end
end
declarebinaryZPT(*)

# Rules for binary operations that are zero-pair preserving
function declarebinaryZPTT(fns...)
    for fn in fns
        # ZP{T} in either argument
        (::typeof(fn))(::ZP{AT}, t::ZP) = t
        (::typeof(fn))(t::ZP, ::ZP{AT}) = t
        (::typeof(fn))(::ZP{AT}, ::ZP{AT}) = ZP{AT}() # disambiguate
        # ZP{F} in either argument
        (::typeof(fn))(::ZP{AF}, ::ZP) = ZP{AF}()
        (::typeof(fn))(::ZP, ::ZP{AF}) = ZP{AF}()
        (::typeof(fn))(::ZP{AF}, ::ZP{AF}) = ZP{AF}() # disambiguate
        (::typeof(fn))(::ZP{AF}, ::ZP{AT}) = ZP{AF}() # disambiguate
        (::typeof(fn))(::ZP{AT}, ::ZP{AF}) = ZP{AF}() # disambiguate
        # remaining ZP{TT} cases
        (::typeof(fn))(::ZP{TT}, ::ZP{TT}) = ZP{TT}()
        (::typeof(fn)){C<:Union{ZP{TF},ZP{FT}}}(::ZP{TT}, ::C) = ZP{TT}()
        (::typeof(fn)){C<:Union{ZP{TF},ZP{FT}}}(::C, ::ZP{TT}) = ZP{TT}()
        # remaining ZP{TF} and ZP{FT} cases
        (::typeof(fn))(::ZP{TF}, ::ZP{TF}) = ZP{TF}()
        (::typeof(fn))(::ZP{FT}, ::ZP{FT}) = ZP{FT}()
        (::typeof(fn))(::ZP{TF}, ::ZP{FT}) = ZP{TT}()
        (::typeof(fn))(::ZP{FT}, ::ZP{TF}) = ZP{TT}()
    end
end
declarebinaryZPTT(+, -)

Some extensions and challenges

Handling numeric constants in additional cases should be possible, for example for binary operations that preserve zeros in both arguments consider another two rules

(::typeof(fn))(::ZP{AF}, x::Number) = ifelse(x == 0, ZP{AT}(), ZP{AF}())
(::typeof(fn))(x::Number, ::ZP{AF}) = ifelse(x == 0, ZP{AT}(), ZP{AF}())

Though in principle the branches in these rules could be eliminated at compile time (x being constant), that does not seem to be the case yet (#17880 ?), so these rules introduce type instability.

Given the zero-preservation class of a binary operation, intelligently combining different structured and/or sparse matrix types should be possible. For example, broadcasting a binary operation that is zero preserving in the first argument (but not in the second) over a (Tridiagonal, Diagonal) matrix pair could type-stably yield a Tridiagonal matrix.
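
As a sketch of that (Tridiagonal, Diagonal) case in Julia 1.x syntax: the function `broadcast_firstzp` is hypothetical and simply assumes, without checking, that f is zero preserving in its first argument.

```julia
using LinearAlgebra

# Assumes f(0, y) == 0 for every y. Outside T's three bands the first
# argument is zero, so the result is zero there and fits in a Tridiagonal.
# On the off-diagonal bands, D contributes its structural zero.
function broadcast_firstzp(f, T::Tridiagonal, D::Diagonal)
    z = zero(eltype(D))
    return Tridiagonal(f.(T.dl, z), f.(T.d, D.diag), f.(T.du, z))
end

f(x, y) = x * (y + 1)   # zero preserving in x, but not in y
T = Tridiagonal([1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0])
D = Diagonal([1.0, 2.0, 3.0])
R = broadcast_firstzp(f, T, D)   # a Tridiagonal, built without densifying
```

The result agrees entrywise with the dense computation f.(Matrix(T), Matrix(D)) while touching only the 3n - 2 stored entries.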

Some operations pose challenges (perhaps common enough operations to sink this ship?). For example, not knowing that the domain of /'s second argument excludes zero, one must pessimistically assume that / is not zero preserving (whereas assuming that restriction, / is zero preserving in the first argument). ^ poses a similar challenge. Some such domain information might be inferrable given additional (declared) knowledge of various operations; Guy Steele's JuliaCon keynote comes to mind, specifically the segment on declaring all sorts of mathematical properties of types for the Fortress compiler to (at least in principle) exploit.
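
The floating-point cases behind that remark, for concreteness (standard IEEE behavior as exposed by Julia):

```julia
0.0 / 2.0   # 0.0: a zero first argument is preserved when the divisor is nonzero
0.0 / 0.0   # NaN: but not when the second argument is also zero
0.0 ^ 2.0   # 0.0
0.0 ^ 0.0   # 1.0: ^ unzeroes a zero base when the exponent is zero
```

So over a pair of structured matrices, where off-structure entries pair zero with zero, neither / nor ^ can be treated as zero preserving in the first argument without extra domain knowledge.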

Complement or alternative

The approach discussed in #7010 seems either a simpler alternative to the above approach or a nice complement for when the above approach fails. (#7010 discusses providing versions of the common higher-order functions that, when operating over argument combinations involving sparse / structured matrices, allow the user to conveniently specify what subsection of the matrices they want to operate over / what type to return).

Thoughts? Thanks and best!

@TotalVerb
Contributor

TotalVerb commented Sep 25, 2016

Reference #14324, #17880, which I believe together would eliminate the runtime branch and type instability of magicbroadcast. Neither is a particularly easy change though.

@stevengj
Member

stevengj commented Oct 7, 2016

@tkelman, I'm not sure what you exactly mean by the "adjacency structure input", but changing an adjacency graph or weights does still not strike me as something you'd (sensibly) do by broadcast on a whole matrix.

Look, if you just want single-argument functions to work, we could do:

function Base.broadcast{T<:Number}(f::Function, D::Diagonal{T})
    d = f.(diag(D))
    z = f(zero(T)) # for sparse matrices, we assume f(0) is pure and deterministic
    if z == 0
        return spdiagm((d,), (0,))
    else
        return ... dense SparseMatrixCSC with d on diagonal and z offdiagonal
    end
end

and similarly for Tridiagonal etcetera. This is only a small number of methods to define. I don't think they are terribly useful, but the price in code complexity is small.
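
One way the snippet above might look in Julia 1.x syntax, with the elided branch filled in by one plausible choice. This is a sketch, not the proposed Base method: it is deliberately named `diagbroadcast` (hypothetical) rather than extending Base.broadcast, and building the fully stored result via a dense intermediate is my assumption.

```julia
using LinearAlgebra, SparseArrays

function diagbroadcast(f, D::Diagonal{T}) where {T<:Number}
    d = f.(D.diag)
    z = f(zero(T))   # assumes f(0) is pure and deterministic
    if iszero(z)
        return spdiagm(0 => d)          # structure survives as a sparse diagonal
    else
        n = length(d)
        A = fill(z, n, n)               # every off-diagonal entry becomes f(0)
        for i in 1:n
            A[i, i] = d[i]
        end
        return sparse(A)                # SparseMatrixCSC with all entries stored
    end
end

diagbroadcast(sin, Diagonal([1.0, 2.0]))   # 2 stored entries
diagbroadcast(cos, Diagonal([0.0, 0.0]))   # 4 stored entries, all 1.0
```

Returning SparseMatrixCSC from both branches is what keeps the method type stable despite the runtime check on z.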

When you start to get into multi-argument broadcasts (which includes the round function!), however, I think the price in complexity is far too high in light of the negligible practical need for these things.

@andyferris
Member

andyferris commented Nov 7, 2016

Just a suggestion... if sparse structures stored their "default" element, then this could be calculated once for the output. E.g. you could make a SparseVector whose default element was 1, :black, or Nullable{T}(), rather than calling zero(T) (similarly, it could track a function to call, which defaults to zero). Then you can always broadcast sparse structures to sparse structures, and compute the new default element just once and the filled elements once each.
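
A toy sketch of the idea in Julia 1.x syntax. `DefaultSparseVector`, `lookup`, and `mapsparse` are hypothetical names invented for illustration; this is not an existing type in Base or SparseArrays, and it ignores everything beyond elementwise mapping.

```julia
# Hypothetical container: entries not listed in `idx` read as `default`.
struct DefaultSparseVector{T}
    n::Int
    idx::Vector{Int}
    vals::Vector{T}
    default::T
end

# Look up entry i, falling back to the stored default.
function lookup(v::DefaultSparseVector, i::Integer)
    k = findfirst(==(i), v.idx)
    return k === nothing ? v.default : v.vals[k]
end

# Apply f elementwise: once per stored entry plus once for the default.
mapsparse(f, v::DefaultSparseVector) =
    DefaultSparseVector(v.n, copy(v.idx), map(f, v.vals), f(v.default))

v = DefaultSparseVector(5, [2, 4], [10, 20], 0)
w = mapsparse(x -> x + 1, v)   # still two stored entries; default is now 1
```

Here x -> x + 1 is not zero preserving, yet the result stays sparse because the unstored value moves from 0 to 1.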

(Some of the difficulty shifts to matrix multiplication but type-wise that is much simpler to deal with).

@nalimilan
Member

@andyferris That's been discussed before at #14963, and it doesn't look like people support this solution.

@kmsquire
Member

kmsquire commented Nov 7, 2016

It would be reasonable to explore a sparse type with defaults in a package, if anyone were up to it. (I don't have the bandwidth.)

@andyferris
Member

Thanks @nalimilan. I read that post, and it seemed like an awful lot of worrying about odd corner cases, when we could make sparse structures much more powerful (e.g. sparse .+ 1) with a settable default value almost for free and have no corner cases...

@kmsquire good point

@tkelman
Contributor

tkelman commented Nov 8, 2016

It wouldn't be free though, every single method that handles sparse input would need to be checked and most of them modified or rewritten to correctly handle a non-zero default value. If there were any precedent for making that design choice in any other sparse linear algebra library then I think there would be more enthusiasm for it.

@andyferris
Member

andyferris commented Nov 8, 2016

I wasn't suggesting it would be free for the developers :) The run-time checking of zero default values could easily be eliminated by dispatching on zero or some similar trick.

As for precedent, Julia is clearly the only language I've seen that is generic enough to make this worthwhile for the end user, fast enough to bother with such optimizations, and flexible enough (with e.g. the trick I mention above) that it is possible without run-time compromise.

PS - the gain is also outside of linear algebra, e.g. you get a kickass NullableArray-like thing.

@stevengj
Member

stevengj commented Nov 8, 2016

@andyferris, the runtime overhead is not the concern; sparse arrays are only performant if the arrays are large, in which case the overhead of checking a single element is negligible. The concern is the developer overhead of having to modify/rewrite a ton of working code, and of the subtle trap that "sparse" wouldn't mean what it normally means. Worse, it's not clear whether all this work would actually accomplish anything useful; who needs a .+ 1 for sparse arrays?

The nullable use case seems so different that I don't think you would be able to usefully re-use the "sparse" code anyway.

@andyferris
Member

andyferris commented Nov 8, 2016

@stevengj you make two excellent points, and of course generalizations could be dealt with in a package.

The nullable use case seems so different that I don't think you would be able to usefully re-use the "sparse" code anyway.

You might be right, I'm not sure. My interest is partially because I have been spending time on this and have been thinking of possible related array storage containers.

@Sacha0
Member Author

Sacha0 commented Dec 24, 2016

Rebased. Best!

@Sacha0 Sacha0 force-pushed the devecround branch 2 times, most recently from 5a98ec5 to 33e6f99 Compare December 24, 2016 23:41
@Sacha0 Sacha0 changed the title from "WIP: Deprecate vectorized round methods in favor of compact broadcast syntax" to "Deprecate vectorized round methods in favor of compact broadcast syntax" Dec 26, 2016
@Sacha0
Member Author

Sacha0 commented Dec 31, 2016

Subsumed by #19791.

@Sacha0 Sacha0 closed this Dec 31, 2016
@Sacha0 Sacha0 deleted the devecround branch December 31, 2016 05:07
@tkelman
Contributor

tkelman commented Dec 31, 2016

Is there an issue yet for making broadcast fall back to returning sparse instead of dense on the structured array types where that would make sense?

@Sacha0
Member Author

Sacha0 commented Dec 31, 2016

Is there an issue yet for making broadcast fall back to returning sparse instead of dense on the structured array types where that would make sense?

Not yet. I have the code prototyped though. Hoping to polish that off tomorrow, time allowing. Best!
