Julep: Syntax for reduction (overloadable splatting?) #32860
I'm pretty strongly against this. Actually, I think the idea is not best described as "overloadable splatting", but as "changing the meaning of splatting". Splatting is supposed to be completely orthogonal to function calling. Calling functions is the core operation in the language, and splatting is just an alternate way to provide arguments. If this proposal were adopted, we would need some other way to call a function on arguments taken from a container. For example, this change makes it impossible to implement utilities that generically take some arguments and then pass all of them on to another function as if it were called directly with them. So in short, I'd favor coming up with some syntax for `reduce` rather than re-purposing `...`.
Would it be a problem if there were a way to seal methods (#31222)? I imagine that #31222 would allow us to construct the lowering such that it is impossible to change the behavior when splatting tuples. I understand that function calls are too fundamental for you to want to play any magic with them. At the same time, precisely because they are so fundamental, many people understand their semantics, including users who are not very familiar with functional programming constructs like `reduce`.
Yes, if we can get syntax for `reduce` that would be great. Maybe we just need more dots in more places 😬

+...(xs.^2) # mapreduce(x->x^2, +, xs)
# [Edit: really reduce(+, broadcasted(literal_pow, ^, xs, Val(2)))]

(motivation: an ellipsis indicates more items in the list. In this case, "many `+`s".)
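For reference, the meaning this `+...` spelling is aiming at can be written with today's `mapreduce`; a quick check (the `+...` syntax itself is hypothetical):

```julia
xs = [1, 2, 3]

# What `+...(xs.^2)` is intended to mean, spelled with existing functions;
# `mapreduce` avoids materializing the squared array.
total = mapreduce(x -> x^2, +, xs)
println(total)  # 14
```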
@c42f Moving the dots to the function-name side is an interesting idea. By the way, we only need the syntax for `foldl`.
Agreed; all I really mean is that you want to be able to compile to something as efficient as an explicit loop.
I checked whether To my surprise:
Haha! What.
Ok, harder problem: can we get syntax for an initial value? For example:

0 +... xs # Now we're talking!
f...(0, xs)

For reductions over slices:

(+...).(eachslice(A, dims=2)) # reduce(+, A, dims=1) if A is a matrix
+....(eachslice(A, dims=2)) # ?? Is it even possible to parse this? :-(
(+...).(eachslice(A, dims=2).^2) # Well, it should work with broadcast at least

That's... a lot of dots... Though a bit tangential, this discussion also reminds me of @mbauman's comment at #31217 (comment)
Dealing with keyword arguments seems tractable:

f...(a, xs; h=1, g=2) # reduce((x1,x2)->f(x1, x2; h=1, g=2), xs, init=a)

but that naturally leads to wondering about other arguments to the "binary" operator:

f...(a, xs, b, c; h=1, g=2) # Unsure whether this can have a sensible meaning.
Maybe throw in more dots? How about:

+...(x.^2 .+ 1)[., ., :]
# Equivalent to:
dropdims(reduce(+, (for x.^2 .+ 1); dims=(1, 2)); dims=(1, 2))
# `.`s inside `[., ., :]` become `dims`

and

y .= +...(x.^2 .+ 1)[., ., :]
# Equivalent to:
reducedim!(+, reshape(fill!(y, 0), (1, 1, size(y)...)), (for x.^2 .+ 1))
# `.`s insert singleton dimensions in the destination array

(I'm using `(for x.^2 .+ 1)` as a placeholder for the lazy, non-materialized broadcast object.)

Regarding the splatting analogy:

+(a, b...) # reduce(+, (a, reduce(+, b))) or reduce(+, b; init=a)
+(a, b..., c..., d) # reduce(+, (a, reduce(+, b), reduce(+, c), d))
push!([], b...) # foldl(push!, b; init=[])
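The glosses above are expressible with today's `reduce`/`foldl`; a sanity check of the intended semantics (the `...`-as-reduction syntax itself is hypothetical):

```julia
a, b, c, d = 1, [2, 3], [4], 5

# +(a, b..., c..., d) under the proposal:
@assert reduce(+, (a, reduce(+, b), reduce(+, c), d)) == 15

# push!([], b...) under the proposal:
@assert foldl(push!, b; init=[]) == [2, 3]
```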
Actually, how about `[.]` instead of `...`:

+(a, b[.]) # => reduce(+, (a, reduce(+, b))) or reduce(+, b; init=a)
+(a, b[.], c[.], d) # => reduce(+, (a, reduce(+, b), reduce(+, c), d))
push!([], b[.]) # => foldl(push!, b; init=[])
write(io, a[.]) # => foreach(x -> write(io, x), a)
+(x.^2 .+ 1)[., ., :]
# => dropdims(reduce(+, (for x.^2 .+ 1); dims=(1, 2)); dims=(1, 2))
y .= +(x.^2 .+ 1)[., ., :]
# => reducedim!(+, reshape(fill!(y, 0), (1, 1, size(y)...)), (for x.^2 .+ 1))

It also motivates the `[., ., :]` notation.
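The `write(io, a[.])` line above is a `foreach`, not a fold, since the per-call results are discarded; in current Julia:

```julia
io = IOBuffer()
a = ["x", "y"]

# write(io, a[.]) under the proposal: call `write` once per element.
foreach(x -> write(io, x), a)
@assert String(take!(io)) == "xy"
```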
Very interesting idea. I think we're tackling two somewhat orthogonal problems here — slice iteration (AKA the curse of the metastasizing `dims` keyword arguments) and reduction syntax. Suppose `a[., ., :]` is a lazy slice iterator:

a[., ., :]
# (@view(a[i,j,:]) for i=axes(a,1), j=axes(a,2)) # or equivalent iterator type

Then you can express various useful things independently from having reduction syntax:

sum.(a[., ., :])
# sum(a, dims=3)
sum.(a[., :, :])
# sum(a, dims=(2,3))
sum.(a[., :, 1])
# sum(a[:,:,1], dims=2)
diff.((a .* b)[., :])
# Fused version of diff(a .* b, dims=2) ?

Still, it would be neat if the `dims` keywords could be removed from the reduction functions.
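The hypothetical `a[., ., :]` slice iterator can be emulated today with a generator of views, which makes the first gloss checkable:

```julia
a = reshape(1.0:24.0, 2, 3, 4)

# Hypothetical a[., ., :]: iterate over dims 1 and 2, keep dim 3 as a slice.
slices = (view(a, i, j, :) for i in axes(a, 1), j in axes(a, 2))

# sum.(a[., ., :]) would then match sum over dim 3 (with the dims dropped):
@assert sum.(slices) == dropdims(sum(a; dims=3); dims=3)
```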
If we are to go in this direction, I think it'd be better to drop the `dims` keyword arguments entirely. That is to say, there is a lazy object like

struct Axed{T, D <: Tuple{Vararg{Int}}}
    x::T
    dims::D
end

such that the dotted indexing lowers to something like

Axed(a, (1, 2))

and reductions dispatch on it:

sum(a::Axed) = sum(a.x; dims=a.dims)

or maybe

sum(a::Axed) = Axed(sum(a.x; dims=a.dims), a.dims)

so that the axis information propagates. Having said that, I think the claim that slice iteration makes reduction syntax unnecessary is not quite right (although I half agree). Specifying "loop axes" (`dims`) is only part of the problem. Can we have other syntax for the reduction itself?
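A runnable sketch of the `Axed` idea; `mysum` stands in for overloading `Base.sum`, and the automatic `dropdims` is one possible design choice:

```julia
struct Axed{T, D<:Tuple{Vararg{Int}}}
    x::T      # the underlying array
    dims::D   # the "loop axes" marked by dots
end

# Stand-in for overloading `sum`: reduce over the marked dims, then drop them.
mysum(a::Axed) = dropdims(sum(a.x; dims=a.dims); dims=a.dims)

A = [1 2 3; 4 5 6]
@assert mysum(Axed(A, (1,))) == [5, 7, 9]
```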
It looks like prefix `...` can work:

+(...a)
+(a, ...b)
+(a, ...b, ...c, d)

So I guess we can use

y .= +(...(x.^2 .+ 1)[., ., :])

if "syntax orthogonalization" is desirable.
Right, this makes particular sense for functions like `vcat`. I agree that reduction is closely related and that the syntaxes should compose in a way which allows for lowering into efficient fused loops (i.e., allows the compiler to remove the intermediate arrays).

I must admit, I'm still somewhat taken by the infix form:

+(a, ...bs, ...cs, d)
# vs
a +... bs +... cs + d

— the second seems much more like natural Julia syntax to me. For more complex cases it's admittedly a toss-up:

y .= +(...(x.^2 .+ 1)[., ., :])
# vs
y .= +...(x.^2 .+ 1)[., ., :]

The placeholder dots could perhaps be written `?` instead:

y .= +...(x.^2 .+ 1)[?, ?, :]
I guess you can already do this with the current broadcasting facility, like this?

f.(view.(a, axes(a, 1), reshape(axes(a, 2), 1, :), Ref(:)))

Yeah, I know it's super ugly. But I think syntax to make this cleaner has already been discussed elsewhere. I think we "just" need

f.(@view a.[:, :, &:])

(or maybe even without `@view`).
Good point. Thank you for pointing it out!
I totally agree that infix syntax is more natural in Julia for operations like `+`. The call syntax works better for functions usually written in prefix form:

vcat(...f.(as), b, ...cs)
intersect(...f.(as), b, ...cs)
push!([], ...f.(as), b, ...cs)
write(io, ...f.(as), b, ...cs)

But, to make the infix version and the function-call version compatible, I think

a + ...bs + ...cs + d

would make more sense (although maybe the parser could handle both cases anyway?).
Some of these proposals remind me of the JuliennedArrays package:

using JuliennedArrays
Base.getindex(A::Array, fs::Function...) = Slices(A, map(f -> f==(:) ? True() : False(), fs)...)

ones(2,3)[:,*] # 3-element Slices, ≈ eachcol(ones(2,3))
sum(ones(2,3)[:,*]) # == dropdims(sum(ones(2,3), dims=2), dims=2)
sum.(ones(2,3)[:,*]) # == dropdims(sum(ones(2,3), dims=1), dims=1)

A version of this (perhaps with a different symbol) might be nice as an alternative to `eachslice`. Reducing a broadcast without materializing it seems like a closely related problem.
@mcabbott I think non-materializing broadcast is orthogonal to the issue here, because non-materializing broadcast is useful outside the context of reduction. For example, it lets you write a function

f(x) = @: x.^2 .+ 1

which does not have to care at all how this expression will be materialized (if at all).
You're welcome; I think I should thank @andyferris for this observation.
Thanks for pointing these out again, they're extremely relevant. There seems to be some confusion over at #30845 as to the intended semantics.
At this point I'm thinking out loud, and it's unclear to me how this could be fused with the reduction.
@tkf I agree it would be awesome if some syntax for reduction could be used for functions like `vcat`. Perhaps you could explain more why you'd like syntax for reducing multiple collections in one call? To me this seems sufficiently unusual that you could just chain the reductions (provided that the chained form can still be made efficient):

vcat(...f.(as), b, ...cs)
# vs the much uglier
vcat...(vcat(vcat...(f.(as)), b), cs)
Actually for the case of |
Ah, I missed that. I think you are right.
In a previous comment I said otherwise, but:
Isn't it already fused? I mean, it's a nested loop and allocating as-is, but if you specify the destination then it would be equivalent to

y .= f.(view.(a, axes(a, 1), reshape(axes(a, 2), 1, :), Ref(:)))

So the loops over axes 1 and 2 are fused with copying to `y`.
It may be tricky but it's something understandable once you think about the rules, isn't it? I mean, there are two nested loops (a loop inside
In the OP, my proposal was to not lower splatting directly to `reduce` but to an overloadable function.
So,
Those properties are already satisfied for splatting. That is to say,

+(numbers...)
push!([], objects...)
print(printables...)

already follow the guideline I just mentioned. It means that users only have to learn one change of spelling.
It was hard to come up with the case for

...matrices * vector

(Oh, actually, this is why the OP has the matrix-vector multiplication example.)
Other than the efficiency reason you just explained, we already have

vcat(f.(as)..., b, cs...)

which works nicely for tuples. So why not have a syntax that works with arbitrary iterators?
It would be confusing, but
The allocation for the destination array
Right, that seems clear. Sorry — I'd glossed over that part of the OP, given that we're not going to use actual splatting for reduction. I wonder what a good default is out of `reduce` and `foldl`. By the way, I would love to have this for `write`.
Good point. I guess it can be special-cased in
Ah, I was assuming that
I'd say it makes sense for

acc = 0
for x in xs
    acc += write(io, x)
end
acc

to be equivalent with the reduced form of `write(io, ...xs)`.
OK. You will hate to know what I proposed... I wanted |
Yes you're right; |
If we had a function like

printlnio(io, x) = (println(io, x); io)

then the reduction formulation would work for `println` as well. Alternatively, maybe having a helper function like

ln(x) = string(x, "\n")

for use with `print` would be enough.
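The `printlnio` helper is foldable precisely because it returns the accumulator (`io`); a quick check:

```julia
printlnio(io, x) = (println(io, x); io)

io = IOBuffer()
# `foldl` threads `io` through, printing each element on its own line.
foldl(printlnio, ["a", "b", "c"]; init=io)
@assert String(take!(io)) == "a\nb\nc\n"
```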
Another syntax option is to have the dots on the function and mark the non-reduced arguments:

vcat...(as, &b, cs)
# vs
vcat(...as, b, ...cs)

The idea is that it makes the common case terse, at the cost of the uncommon one:

vcat...(&a, &b, &c) # ow my eyes.

It's a stronger analogy with broadcasting syntax, where the dots attach to the function. Having said all that, a cleaner syntax for this last case is just to use a tuple:

vcat...((a,b,c))
# or
vcat(...(a,b,c))

These are still much uglier than an ordinary call, though.
I don't think there is any fundamental reason why automatic lifting cannot be made to work.
Me too. In fact, I think we have already arrived at an API where iteration gives you reduction. (Well, I don't strictly agree with the "reduction is about the iteration API" part, because reduction is much more general; e.g., it's parallelizable. But I'm guessing that's probably not what he meant.)
One of the disadvantages of this approach is that we lose the chance to assign a meaning to prefix `...`. IIUC your motivation is to "auto-define" splatting-like behavior? Then why not use abstract types to do it (#32860 (comment))? It would make vararg actually work, and no new language constructs would have to be introduced. (But types like these have their own problems.)
Actually, I'm confused by what you want this to mean.
My observation is that Put another way, if you want |
But it's a very different way to address the observation that varargs methods of some functions arguably "shouldn't even exist" and should instead be replaced by a reduction involving only the binary version of that function (if only we had a sufficiently slick syntax; honestly, I'm not sure this is possible).
My mental model is "prefix `...` marks the reduced argument".
I see. So remove vararg definitions from various functions? But...
Yes, it's that differing code path which bothers me; I think it highlights a deeper design flaw in this way of arranging things. Again, compare to broadcast, where nobody needs to worry about defining "broadcastable" functions; all functions are naturally broadcastable with a uniform syntax and implementation.
The thing I find confusing about this is that it supposes the splatted version exists and has the meaning of a reduction. This is fine for associative operators, but not in general.

An alternative meaning of prefix-`...` could be a `foreach`-style lowering:

println(io, ...xs, " ", ...ys)
# foreach((x,y)->println(io, x, " ", y), xs, ys)

Or in general,

foo(a, b, ...cs, d, ...es)
# foreach((c,e)->foo(a, b, c, d, e), cs, es)

I realize this diverges in rather a different direction. Coming back to an earlier point you made about `write`:
On the other hand, the way arguments are broadcast is highly customizable; there is an intricate pipeline for processing arguments, both to tweak the semantics (e.g., some types are considered scalar by default) and to choose the best broadcasting strategy (style). Also, strictly speaking, you can tweak `Broadcast.broadcasted` per function. Another way to look at it is that in broadcasting the behavior of the arguments is highly customizable, while in the prefix-`...` proposal it is the function that is customizable.
Isn't that an argument in favor of the proposal? If I were to point out the design flaw, it's the fact that the vararg method must already exist.
I was just explaining an easy mnemonic. Perhaps it is better to formalize it:

bclift(f) = (args...) -> broadcast(f, args...)

Then

f.(...xs) == bclift(f)(...xs) == reduce(bclift(f), xs)

This is broadcasting, so it can be written with the existing machinery.
It's an argument in favor of having some systematic way to treat reductions.

f.(...xs) == bclift(f)(...xs) == reduce(bclift(f), xs)

Right, this makes sense, though I do find it confusing syntax. I think it's because the modifiers apply in a non-obvious order. It's true that attaching the reduction to arguments does make a difference in what can be easily expressed. Roughly,

f.(...xs) == reduce((x,y)->map(f, x, y), xs)
# vs
(f...).(xs) == map(x->reduce(f, x), xs)
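The difference between the two lifting orders is observable today with plain `map` and `reduce` (using `+` for `f`):

```julia
xs = ([1, 2], [3, 4], [5, 6])

# f.(...xs): broadcast-lift `+` first, then reduce across the collections.
@assert reduce((x, y) -> map(+, x, y), xs) == [9, 12]

# (f...).(xs): reduce-lift `+` first, then map over the collections.
@assert map(x -> reduce(+, x), xs) == (3, 7, 11)
```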
I didn't think of cases like

reduce((x, y) -> ismissing(x) || ismissing(y) ? missing : x + y, xs)

But under the "closer modifier lifts the function first" rule, it wouldn't make sense to write it the other way.
I guess this is typical when composing multiple higher-order functions (e.g., transducer composition "feels" like it happens in the reverse order)? I think that's why a mnemonic based on splatting is important. Now that I've used the word "lift", I think I may understand one of your points. Since the transformation of a two-argument function into a reduction is a kind of "lift", the syntax should act on the function, not the arguments. Is this your point? But I think an important observation is that we also need an "unlifting" operation.
I took the liberty of renaming this topic again, as I think it's been a very interesting exploration of reduction syntax even though the option of overriding existing splatting syntax was quickly discarded.
Yes, I think that's accurate. I just noticed today that this is the model used by APL, where the `/` reduction operator attaches to the function. I'll have to think about unlifting some more. I'm still a bit confused about how precisely this interacts with the argument-marking version.
Thanks, I guess the updated title reflects better what is going on here.
FYI, there is a table of foldl/foldr syntaxes of various languages in https://en.wikipedia.org/wiki/Fold_(higher-order_function)#In_various_languages — J and Scala may be interesting in that they support an initial value with APL-like syntax. I think what I suggest is closest to C++. Reading the table, I realized that another axis of discussion is whether or not it makes sense to support the reduce/foldl/foldr distinction at the syntax level. Languages like J, Scala and C++ seem to support it. I think it makes sense to leave it to the implementers of the binary functions (as in the OP), since what should happen is most of the time clear from the function and the argument types.
Just in case looking at an implementation can help, here is what I'd suggest for non-associative binary functions:

const FoldableFunction = Union{
    typeof(/),
    typeof(-),
    typeof(intersect!),
    typeof(union!),
    typeof(merge!),
    typeof(append!),
    typeof(push!),
}

struct ReduceMe{T}  # need to find a better name :)
    value::T
end

function apply(op::FoldableFunction, x, xs...)
    init = x isa ReduceMe ? foldl(op, x.value) : x
    return foldl(xs; init=init) do acc, input
        if input isa ReduceMe
            foldl(op, input.value, init=acc)
        else
            op(acc, input)
        end
    end
end

using Test

# push!(...[[], 1, 2])
@test apply(push!, ReduceMe([[], 1, 2])) == [1, 2]
# push!([], ...[1, 2])
@test apply(push!, [], ReduceMe([1, 2])) == [1, 2]
# push!([], ...[1, 2], 3, ...[4])
@test apply(push!, [], ReduceMe([1, 2]), 3, ReduceMe([4])) == [1, 2, 3, 4]

I am not super sure if the first argument should be allowed to be a "ReduceMe"; it works for `push!`, but maybe not in general:

# intersect!(...[[1, 2], [1], [0, 1]])
@test apply(intersect!, ReduceMe([[1, 2], [1], [0, 1]])) == [1]
An interesting data point is `something`; we can't simply have

something(xs...) === reduce(something, xs)

Instead, a more useful definition of the binary operation would be

firstsomething(x, y) = x isa Nothing ? y : x
something(reduce(firstsomething, xs; init=nothing))
# something(reduce(right, ReduceIf(!isnothing), xs)) # using Transducers
Here is an example where a fold with a non-associative function is natural:

julia> xs = (:Base, :CoreLogging, :Info);

julia> foldl(getproperty, xs, init=Main)
Info

This code snippet is highly useful, but there is no nice surface syntax for it.
Very interesting example. Again it makes me wonder if we should be looking to delete methods whose varargs versions are implicitly reductions :-)
I think it's OK to get rid of vararg definitions as long as the "lifting" occurs on a per-argument basis rather than for the whole function.
I have been thinking a bit about reductions over lazy collections. I've got a different tangent to suggest. We have these awesome generators which can do mapping and filtering:

julia> (x^2 for x in 1:10 if iseven(x))
Base.Generator{Base.Iterators.Filter{typeof(iseven),UnitRange{Int64}},getfield(Main, Symbol("##41#42"))}(getfield(Main, Symbol("##41#42"))(), Base.Iterators.Filter{typeof(iseven),UnitRange{Int64}}(iseven, 1:10))

julia> collect(x^2 for x in 1:10 if iseven(x))
5-element Array{Int64,1}:
   4
  16
  36
  64
 100

So we have these beautiful almost-English sentences describing in a rather declarative way an iterator (I kind of think of it as a "SQL for iterators"). However, one operation I often want to do is reduction. Take this example:

julia> reduce(+, x^2 for x in 1:10 if iseven(x))
220

So what if I could use a "generator-style" syntax for reductions as well? I'm going to suggest the syntax/operator `over`:

julia> + over (x^2 for x in 1:10 if iseven(x)) # Note: this doesn't actually work
220

The thing on the right of `over` can be any iterable:

julia> + over [4, 16, 36, 64, 100]
220

Therefore `over` is simply

over(f, x) = reduce(f, x)

If you like... we could simply make `reduce` itself infix:

julia> + reduce [4, 16, 36, 64, 100]
220

:)
I'm not sure how much `over` buys us over plain `reduce`. But the connection to comprehension expressions is interesting. It makes me think that my emphasis on "lifting per argument" is yet another example of non-orthogonal features. The observation is that what I've been proposing can be written as

foldl(f, (x for y in (&a, b, c, &d) for x in y))

if marked arguments are treated as containers and the rest as one-element containers. We could then write

f...(.[&a; b; c; &d])

or maybe we can have a shorthand for this:

f.[&a; b; c; &d]

It is nice because we have syntax for orthogonal features.
Alternatively, the container-literal part could be its own construct. Having said that, we can also treat it specially.
Sure, but the generator/comprehension thing seems well-received (in more languages than Julia), and the `over` form reads naturally with it.
I just realized that the infix form almost works already:

julia> const ^ᵒᵛᵉʳ = reduce;

julia> vcat ^ᵒᵛᵉʳ (1:x for x in 1:3)
6-element Array{Int64,1}:
 1
 1
 2
 1
 2
 3

julia> (+) ^ᵒᵛᵉʳ (x^2 for x in 1:3) # + has to be in ()
14

Yeah, so it makes sense to make it a keyword if we go in this direction, to get rid of the extra parentheses. I still think the argument-marking approach is more general. But one advantage might be that an infix operator requires no new lowering.
Why does this need new syntax? You can already do:

julia> struct Over{F}
           f::F
       end

julia> (o::Over)(args...; kwargs...) = reduce(o.f, args...; kwargs...)

julia> Over(push!)((1,2,3); init=[])
3-element Array{Any,1}:
 1
 2
 3
I think that's the same as asking why `f.(x)` is better than `map(f, x)`.
I think julia has so much special syntax at this point that almost any new syntax that's not just a slight generalization of existing syntax makes the language harder to learn. There are a huge number of special forms to learn already. Furthermore, I really don't see how this one pays for itself.
Personally, I like the idea of an infix reduction operator.
I think that you'd be lucky if 5 of them guessed correctly. From that point of view, adding a syntax like that is just another step along the road to becoming Perl.
The most important thing I care about at this point in this discussion is to make reduction as easy to reach for as iteration. For example, Fortress has "big operator" syntax for exactly this. Anyway, it's conceivable that "don't be Perl" is the best option possible at this point. I don't really know. But I don't think we have explored the syntax design space enough, and I think it's too early to say "no more syntax".
I suggest a mechanism to customize splatting behavior such that code like `op(f.(xs)...)` can be executed efficiently without any allocations.
The idea is inspired by discussion in #29114 by @c42f et al.
Idea
Like the dot-call syntax, I suggest lowering splatting to a series of function calls that are overloadable. (To avoid recursion, this lowering should happen only when the call includes splatting.)
When the dot-call syntax appears in the splatting operand, I suggest not materializing the dot-call. That is to say, for example, in

op(f.(xs)...)

the broadcasted operand stays lazy. This lets us evaluate

op(f.(xs)...)

without any allocation once `reduce`/`foldl` supports `Broadcasted` objects (#31020 started to tackle this).

Interface
The lowering above requires the following interface functions and types:

- `apply` must be dispatched on the first argument type and may be dispatched on the second argument type.
- `splattable` is analogous to `broadcastable` (@yurivish suggested this in "Performance of splatting a number" #29114 (comment)).
- `VA` (whose name can/should be improved) must be used only for defining `apply`; its constructor must not be overloaded. This is for making it hard to break splatting semantics. `Arguments` must not be overloaded for the same reason.

Using the current `Core._apply`, the default `apply` can be implemented directly.

Example overloads
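A minimal sketch of a default `apply` under the proposal's assumptions (`apply`, `Arguments`, and `VA` are hypothetical names from this Julep, not existing Base API; the flattening here is only illustrative):

```julia
struct VA{T}               # marks a splatted argument
    xs::T
end

struct Arguments{T<:Tuple}
    args::T
    Arguments(args...) = new{typeof(args)}(args)
end

# Default `apply`: recover ordinary splatting semantics by flattening the
# wrapped argument containers back into a single argument list.
flatten1(x) = x isa VA ? Tuple(x.xs) : (x,)
apply(f, a::Arguments) = f(Iterators.flatten(map(flatten1, a.args))...)

# Under the proposal, `f(a, xs..., b)` would lower to
# `apply(f, Arguments(a, VA(xs), b))`:
@assert apply(+, Arguments(1, VA([2, 3]), 4)) == 10
```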
Associative binary operators

Many useful operations can be expressed using splatting into associative operators, or "mapped-splatting". (This is reminiscent of the "big operator" in Fortress.)

There are also various other associative binary operators in Base. Invoking `reduce` via splatting could be useful: for example, concatenating vectors would be efficiently done via

vcat(vectors...)

thanks to #27188. It is also possible to support splatting a lazy collection, such that it is computed as a reduction over that collection.

Note that, since `reduce` would degrade to `foldl` when the input is not an array (or not `Broadcasted` after #31020), we can also use it to fuse filtering with reduction.

Non-associative binary functions
Splatting is useful for non-associative binary functions, or more generally for folds.

Matrix-vector multiplications

Not sure how many people need this, but

*(matrices..., vector)

can be (somewhat) efficiently evaluated by defining a specialized `apply`. (Of course, allocation could be reduced much further if we really wanted.)
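One way to get the "(somewhat) efficient" evaluation is a right fold, so that every intermediate result is a vector and no matrix-matrix product is ever formed; a sketch using a plain helper (the `apply` hook itself is the proposal's):

```julia
# Evaluate *(matrices..., vector) right-to-left: every intermediate is a
# vector, so no matrix-matrix product is ever formed.
matvec(mats, v) = foldr(*, mats; init=v)

A = [1.0 2.0; 3.0 4.0]
B = [0.0 1.0; 1.0 0.0]
v = [1.0, 1.0]
@assert matvec([A, B], v) == A * (B * v)
```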
Similar optimization can be done for `∘(fs...)`, but I'm not sure about the exact use case.

Higher-order functions (`map(f, iters...)` etc.)

As this mechanism lets any function optimize splatting, higher-order functions that may splat a given function's arguments can be optimized by defining their own `apply` specialization. For example, `map(f, iters...)` can be specialized using a helper `_zipsplat(iters)` that behaves like `zip(iters...)` but whose element type does not have to be a `Tuple`. Defining a similar overload for `broadcasted` may be possible provided that the object returned by `_zipsplat(iters)` is indexable. This lets us nest reduction inside mapping and avoid allocation in some cases. Other `map`-like functions, including `map!` and `foreach`, can also implement this overload.

`print`-like functions

`print`, `println` and `write` can be invoked with varargs. We can make, e.g., `println(xs...)` more compiler-friendly and efficient when `xs` is a generic iterator. Note that `apply(string, Arguments(xs))` can also be implemented in terms of `apply(print, Arguments(io, xs))`.

`splattable`

`splattable` may be used to solve the performance problem discussed in #29114. In the case of `Broadcasted`, it can be used for calling `instantiate`.