Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Document that mapreduce(f, op, _) may exploit commutativity of op and purity of f #36424

Closed
wants to merge 2 commits into from

Conversation

tkf
Copy link
Member

@tkf tkf commented Jun 25, 2020

Currently reduce etc. document that it may exploit associativity of the operator. However, we already exploit commutativity of certain operations like max

julia/base/reduce.jl

Lines 627 to 630 in 98a845f

v1 = _fast(op, v1, f(A[i+0]))
v2 = _fast(op, v2, f(A[i+1]))
v3 = _fast(op, v3, f(A[i+2]))
v4 = _fast(op, v4, f(A[i+3]))

Also, using @simd arguably requires this caveat. So, I propose to document the current behavior.

If we do this, I think it makes sense to also require that f and getindex(A, _) for mapreduce(f, op, A) are reasonably pure and they may be called out of order. Again, this is already assumed in mapreduce(f, max, array) since it runs over the array (up to) twice.

julia/base/reduce.jl

Lines 642 to 650 in 98a845f

# enforce correct order of 0.0 and -0.0
# e.g. maximum([0.0, -0.0]) === 0.0
# should hold
if isbadzero(op, v)
for i in first:last
x = @inbounds A[i]
isgoodzero(op,x) && return x
end
end

In particular, it means that it's possible to implement a more efficient maximum(f, ::PermutedDimsArray) etc.

close #36081

@tkf tkf added fold sum, maximum, reduce, foldl, etc. minor change Marginal behavior change acceptable for a minor release labels Jun 25, 2020
@tkf tkf requested a review from mbauman June 25, 2020 00:41
@tkf tkf added the docs This change adds or pertains to documentation label Jun 25, 2020
base/reduce.jl Outdated Show resolved Hide resolved
base/reduce.jl Outdated Show resolved Hide resolved
@@ -270,6 +270,10 @@ In general, it will be necessary to provide `init` to work with empty collection
intermediate collection needs to be created. See documentation for [`reduce`](@ref) and
[`map`](@ref).

Known commutativity of the operation `op` such as `+` and `max` may be exploited in the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does "known commutativity" mean? Are we strict about the fact that max should be commutative, or are we limiting ourselves to only some max(::T, ::T)s?

We already say that the associativity of reduce isn't specified (and that if you want to demand a particular associativity to use fold[lr])... I know that's not the same as commutativity, but it feels very similar since it means the accumuland might appear on either side.

Copy link
Member Author

@tkf tkf Jun 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I meant by "known commutativity" was that max implementer must make it commutative.

I guess we can instead restrict this to known functions and input types (so methods). But then we can't use this for something like maximum(f, xs) because robustly dispatching on return type is not really possible.

Are there use cases for non-commutative max, min, +, &, | etc.? If I need something similar to max but not commutative, I think I'm OK with creating a new function maxish. I think it's better to demand more to the implementer so that the higher order functions can optimize more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bigger point here is simply: are there reduction algorithms that care about commutativity but not associativity? Associativity seems like the bigger deal to me, but perhaps that's just a limitation of my own imagination.

My gut reaction would lean towards making (map)fold[lr] the implementations that guarantee both associativity and commutativity, and free mapreduce to do whatever it wants on both properties.

Copy link
Member Author

@tkf tkf Jun 26, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Associativity seems like the bigger deal to me

That's my understanding, too.

From a sequential computation tree

        +
       / \
      +   4
     / \
    +   3
   / \
  +   2
 / \
0   1

you can't lower the height of the tree (= data dependency) by using only commutativity:

  +
 / \
4    +
    / \
   +   3
  / \
 2   +
    / \
   0   1

while it's possible with associativity

     +
   /   \
  +     +
 / \   / \
1   2 3   4

I can't see how modifying the tree using commutativity helps the computation. I think commutativity is useful only if the op already is associative. (Though I'd love to know if there is something you can do only with commutativity.)

free mapreduce to do whatever it wants on both properties

I think my proposal is a bit more subtle. For the associativity, it's the caller that asserts it. For the commutativity, it's the implementer of op. I guess this "asymmetry" of who asserts what is not really the most beautiful API. Some alternatives are:

(1) Add a wrapper type

struct Commutative{F} <: Function
    op::F
end

(f::Commutative)(a, b) = f.op(a, b)

so that mapreduce can dispatch on op::Commutative{<:Union{typeof(max), typeof(min)}}. This way, it'd always be the caller who asserts certain properties. For example, you'd write maximum(f, xs) = mapreduce(f, Commutative(max), xs) and document that max(::eltype(xs), ::eltype(xs)) must be commutative. (This is something I meant to explore in JuliaFolds/Transducers.jl#143.)

(2) Add reduce_commutative (with a better name) that requires op to be associative and commutative. If we go this way, maybe we can use a consistent naming scheme like folda (= reduce) for associative op and foldac for associative and commutative op.

(3) Add reduce_dwim (with a better name) that also auto-detect known associativity. This way, it's always the implementer who asserts the associativity.

Ultimately, I think using reduce for (3) would be ideal for users. They can just call reduce and it dispatches to an optimal implementation. But it is not possible to do this until Julia 2.0. I think what I suggest in this PR can be used as an intermediate step for implementing (3) in Julia 2.0.

Co-authored-by: Stefan Karpinski <stefan@karpinski.org>
@tkf tkf requested a review from nalimilan June 25, 2020 21:14
Comment on lines +274 to +275
implementation. This means that the elements of `itr` may be accessed out of order and
the order in which `f` is called is implementation-defined.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mbauman What do you think about the purity/out-of-order part?

@vtjnash
Copy link
Member

vtjnash commented Apr 21, 2021

@mbauman should we close this, since the docs already mention associativity, and the additional mention of commutativity doesn't sound that useful? perhaps we should actually just suggest that all implementations be commutative, and someday worry about changing some like max to do that?

@vtjnash vtjnash closed this May 4, 2021
@vtjnash vtjnash requested a review from mbauman May 4, 2021 19:03
@jtrakk
Copy link

jtrakk commented Jul 7, 2021

Commutativity can be useful for table join operations, because join(smalltable, bigtable) is often written with a full sequential scan on the left and a hash lookup on the right. Being able to swap the operands can make it much faster. It doesn't have to be written this way of course, but I think it's a good example of where commutativity is helpful.

Commutativity will also help whenever you're receiving items asynchronously out of order and you want to combine them with an accumulator. If the 4th item arrives before the 3rd, you can proceed without waiting.

Consider too when the arguments are in different locations. For example, a + (c + b) if a and b are close to each other (e.g. on the same server). Then it may be better to evaluate (a + b) first.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs This change adds or pertains to documentation fold sum, maximum, reduce, foldl, etc. minor change Marginal behavior change acceptable for a minor release
Projects
None yet
Development

Successfully merging this pull request may close these issues.

maximum(::Broadcasted) can produce buggy outputs with stateful functions
5 participants