fix non numeric accumulate(op, v0, x) (#25506) #25515

jw3126 · 2018-01-11T20:25:31Z

jw3126 · 2018-01-12T06:32:04Z

@simonbyrne You said in the issue #25506:

A full solution to this would probably involve extending #25051 to accumulate

Can you explain whats wrong with the fix in this PR and provide an example that fails?

simonbyrne · 2018-01-12T19:00:36Z

I don't think this will fix

julia> accumulate(*, ['a','b','c'])
ERROR: MethodError: Cannot `convert` an object of type String to an object of type Char
Stacktrace:
 [1] setindex!(::Array{Char,1}, ::String, ::Int64) at ./array.jl:684
 [2] setindex! at ./multidimensional.jl:451 [inlined]
 [3] accumulate!(::typeof(*), ::Array{Char,1}, ::Array{Char,1}, ::Int64) at ./multidimensional.jl:1028
 [4] accumulate(::Function, ::Array{Char,1}, ::Int64) at ./multidimensional.jl:957
 [5] accumulate(::Function, ::Array{Char,1}) at ./multidimensional.jl:984
 [6] top-level scope

jw3126 · 2018-01-14T19:20:04Z

@simonbyrne I think the accumulate case is actually simpler then (map)reduce. Because

accumulate over empty collection is empty, we don't need to come up with an element
reduce does a lot of widening, which accumulate does not

jw3126 · 2018-01-16T20:13:24Z

Can somebody review this?

simonbyrne · 2018-01-16T20:18:51Z

reduce does a lot of widening, which accumulate does not

I would like to change that to be consistent

jw3126 · 2018-01-16T20:21:13Z

Its was a conscious design decision, that accumulate does not widen.

simonbyrne · 2018-01-16T20:24:30Z

I'm actually trying to find that discussion at the moment: do you have a link?

I would like to change accumulate to use the same machinery as reduce, so that accumulate(+,...) won't widen, but cumsum(...) will.

jw3126 · 2018-01-16T20:26:58Z

Okay #18364 for cumsum and a bit in #18931.

jw3126 · 2018-01-16T20:30:51Z

FWIW I am also in favor of making this consistent with reduce.

simonbyrne · 2018-01-16T20:34:15Z

Thanks for the links, I'll leave that to a separate issue.

Even if we don't widen, I would still prefer to use reduce_first instead of defining rcum_convert.

simonbyrne · 2018-01-17T16:12:10Z

base/multidimensional.jl

    n == 1 && return result
    _accumulate_pairwise!(op, result, v, v1, i1+1, n-1)
    return result
 end

 function accumulate_pairwise(op, v::AbstractVector{T}) where T
-    out = similar(v, rcum_promote_type(op, T))
+    out = similar(v, promote_op(op, T))


This (and others) should probably be promote_op(op, T, T): though built-in operators provide unary versions, anonymous functions typically won't.

Oh thanks! I somehow assumed promote_op(op,T) = promote_op(op,T,T) without checking. Documenting it is a good idea!

jw3126 · 2018-01-19T10:43:32Z

Can someone rerun CI?

martinholters · 2018-01-19T11:05:49Z

The correct thing to do would be to do the same as broadcast for figuring out the return eltype, namely base it on the actual values, and only fallback on promote_op for the empty case.

jw3126 · 2018-01-19T15:27:45Z

@martinholters What do you mean by empty case? Are you talking about accumulate(op, v0, iter) vs accumulate(op, iter) or about iter being empty? I am not familiar with the broadcast code, can you explain the return type mechanism there or link to some relevant lines?

martinholters · 2018-01-22T15:37:53Z

What do you mean by empty case?

The situation where the output is empty, so where iter is empty. If v0 is present it might be better to us its type as the eltype instead of calling promote_op, not sure.

I am not familiar with the broadcast code, can you explain the return type mechanism there

Conceptually, it works as follows:

Compute first output element, allocate output container with first element's type as eltype, store first element.
Compute next element. If its type is a subtype of the containers eltype, store it. Otherwise, typejoin the current eltype and the type of the newly computed element, allocate new container with the new eltype, copy previous work, store new element.
Repeat at 2) until all elements have been computed.

If the code is type stable, inference can figure out that the type check in 2) can be optimized away. So a sufficiently tweaked implementation can be quite efficient.

However,

julia> accumulate(*, ['a', 'b'])
2-element Array{Any,1}:
 'a' 
 "ab"

may not be what you want in the end. (But then again, accumulate(*, "", ['a', 'b']) would give the desired Vector{String}.)

simonbyrne · 2018-01-22T18:37:54Z

To save you some time hunting around (the broadcast code is a bit confusing): broadcast calls Base._return_type: if it is a concrete type, this is used to create an output array, otherwise it checks if empty and determines the initial element, then uses the iterative widening discussed above.

jw3126 · 2018-01-22T23:11:52Z

Mhhh, seeing

julia> accumulate(*, ['a', 'b'])
2-element Array{Any,1}:
 'a' 
 "ab"

hurts my eyes. Getting accumulate(*, ['a', 'b']) right, was one of the motivating examples of this PR 😄.

simonbyrne · 2018-01-22T23:13:43Z

I think we should still be able to use reduce_first with this idea.

jw3126 · 2018-01-23T06:59:29Z

@simonbyrne okay you are right, we could still use reduce_first. I am still not convinced, this adds so much complexity to accumulate. What would be an example, that goes haywire with this PR but can be handled gracefully by the broadcast approach?

martinholters · 2018-01-23T08:39:45Z

Getting accumulate(*, ['a', 'b']) right, was one of the motivating examples of this PR

Maybe we have different opinions of what "right" is here? To me, accumulate(*, ['a', 'b']) == ['a', 'a'*'b'] looks pretty right. Anything else has lot of DWIM to it. Compare with

julia> accumulate(*, [1m, 1m]) # using Unitful (result faked)
2-element Array{Unitful.Quantity{Int64,D,U} where U where D,1}:
   1 m
 1 m^2

which is basically the same situation. Would we want to enforce some other element type (and likely cause a failure by doing so)?

StefanKarpinski · 2018-01-23T14:57:46Z

To me the difference between reduce(*, ['a', 'b']) and reduce(*, [1m, 1m]) is simply that 'a' and "ab" should probably promote to String as common type – we don't currently do that, which is something we could easily change. On the other hand, 1m and 1m^2 should certainly not promote to a common concrete type (what unit would it be?). So perhaps a side fix here would be adding sufficiently general versions of these methods:

Base.promote_rule(::Type{Char}, ::Type{S}) where S<:AbstractString = S
Base.convert(::Type{String}, c::Char) = string(c)

allowing this to work:

julia> promote('a', "ab")
("a", "ab")

That would, I believe, naturally make the reduce(*, ['a', 'b']) example act as desired. Note that we were once reluctant to be too fast and loose with character <=> string conversions since characters were subtypes of integers as code points, but that's no longer the case, so I think it should be safe.

jw3126 · 2018-01-24T10:09:36Z

Okay maybe the issue is indeed more with general Char <=> String conversion policy, then with accumulate. I still think, accumulate(*, ['a']) = ["a"] and reduce(*, ['a']) = "a" are the behavior we should aim for.
But it agree it is not ideal to hack that into reduce and accumulate and it should instead come out of the general conversion machinery.

jekbradbury · 2018-01-24T19:12:38Z

Ultimately the reason you want reduce(*, ['a']) = "a" isn't necessarily because you generally want 'a' and "b" to promote to "a" and "b" but because you want to initialize the reduction with ""? So maybe something like one(::Char) = ""?

simonbyrne · 2018-01-24T19:16:55Z

@jekbradbury There isn't always such an identity element, e.g. min over an array of BigInts.

StefanKarpinski · 2018-01-24T19:20:55Z

Given that 'a'*'b' === "ab" for "multiplicative purposes, characters are embedded into strings as one-character strings. Therefore, I think that when you ask what the multiplicative identity of characters is, the only reasonable answer is that it's the multiplicative identity for strings, i.e. "".

simonbyrne · 2018-01-24T19:25:47Z

I agree, which is why I defined a reduce_empty(::typeof(*), ::Type{Char}) = "" method in #25051. My point was that it isn't well-defined for all possible operations/types.

martinholters · 2018-01-25T12:51:33Z

Assuming #25728 lands, could this serve as a prototype implementation?

function accumulate(f, x)
    res = Vector{Any}(uninitialized, length(x))
    res[1] = x[1]
    for i in 2:length(x)
        res[i] = f(res[i-1], x[i])
    end
    restype = mapreduce(typeof, promote_type, res)
    return convert(Vector{restype}, res)
end

Obviously, this is not the most efficient way to go about it and is restricted to non-empty Vector inputs, but would it produce the results we want?

simonbyrne · 2018-01-26T22:25:05Z

I've created a WIP fix in #25766, which I think might be a more general solution.

jw3126 · 2018-01-31T07:56:00Z

I close this in favor of #25766.

jw3126 added 4 commits January 17, 2018 13:43

fix non numeric accumulate(op, v0, x) (JuliaLang#25506)

3f3c1bc

rcum_convert

65dff7a

get rid of rcum_convert

696c916

get rid of obsolete rcum_promote_type

6069c76

jw3126 force-pushed the accumulate branch from 9699be5 to 6069c76 Compare January 17, 2018 13:18

simonbyrne reviewed Jan 17, 2018

View reviewed changes

simonbyrne mentioned this pull request Jan 17, 2018

docs for Base.promote_op #25609

Closed

use two argument for of promote_op

1b44727

simonbyrne approved these changes Jan 17, 2018

View reviewed changes

KristofferC closed this Jan 19, 2018

KristofferC reopened this Jan 19, 2018

StefanKarpinski mentioned this pull request Jan 24, 2018

convert & promote_rule for Char and strings #25728

Closed

jw3126 closed this Jan 31, 2018

jw3126 mentioned this pull request Mar 20, 2018

create Accumulate iterator #25766

Open

2 tasks

simonbyrne reopened this Mar 29, 2018

simonbyrne closed this Mar 29, 2018

simonbyrne mentioned this pull request Mar 29, 2018

Change accumulation promotion behaviour #26658

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix non numeric accumulate(op, v0, x) (#25506) #25515

fix non numeric accumulate(op, v0, x) (#25506) #25515

jw3126 commented Jan 11, 2018

jw3126 commented Jan 12, 2018

simonbyrne commented Jan 12, 2018

jw3126 commented Jan 14, 2018

jw3126 commented Jan 16, 2018

simonbyrne commented Jan 16, 2018

jw3126 commented Jan 16, 2018

simonbyrne commented Jan 16, 2018 •

edited

Loading

jw3126 commented Jan 16, 2018

jw3126 commented Jan 16, 2018

simonbyrne commented Jan 16, 2018

simonbyrne Jan 17, 2018

jw3126 Jan 17, 2018

jw3126 commented Jan 19, 2018

martinholters commented Jan 19, 2018

jw3126 commented Jan 19, 2018

martinholters commented Jan 22, 2018

simonbyrne commented Jan 22, 2018 •

edited

Loading

jw3126 commented Jan 22, 2018

simonbyrne commented Jan 22, 2018

jw3126 commented Jan 23, 2018

martinholters commented Jan 23, 2018

StefanKarpinski commented Jan 23, 2018 •

edited

Loading

jw3126 commented Jan 24, 2018

jekbradbury commented Jan 24, 2018

simonbyrne commented Jan 24, 2018

StefanKarpinski commented Jan 24, 2018

simonbyrne commented Jan 24, 2018 •

edited

Loading

martinholters commented Jan 25, 2018

simonbyrne commented Jan 26, 2018

jw3126 commented Jan 31, 2018

fix non numeric accumulate(op, v0, x) (#25506) #25515

fix non numeric accumulate(op, v0, x) (#25506) #25515

Conversation

jw3126 commented Jan 11, 2018

jw3126 commented Jan 12, 2018

simonbyrne commented Jan 12, 2018

jw3126 commented Jan 14, 2018

jw3126 commented Jan 16, 2018

simonbyrne commented Jan 16, 2018

jw3126 commented Jan 16, 2018

simonbyrne commented Jan 16, 2018 • edited Loading

jw3126 commented Jan 16, 2018

jw3126 commented Jan 16, 2018

simonbyrne commented Jan 16, 2018

simonbyrne Jan 17, 2018

Choose a reason for hiding this comment

jw3126 Jan 17, 2018

Choose a reason for hiding this comment

jw3126 commented Jan 19, 2018

martinholters commented Jan 19, 2018

jw3126 commented Jan 19, 2018

martinholters commented Jan 22, 2018

simonbyrne commented Jan 22, 2018 • edited Loading

jw3126 commented Jan 22, 2018

simonbyrne commented Jan 22, 2018

jw3126 commented Jan 23, 2018

martinholters commented Jan 23, 2018

StefanKarpinski commented Jan 23, 2018 • edited Loading

jw3126 commented Jan 24, 2018

jekbradbury commented Jan 24, 2018

simonbyrne commented Jan 24, 2018

StefanKarpinski commented Jan 24, 2018

simonbyrne commented Jan 24, 2018 • edited Loading

martinholters commented Jan 25, 2018

simonbyrne commented Jan 26, 2018

jw3126 commented Jan 31, 2018

simonbyrne commented Jan 16, 2018 •

edited

Loading

simonbyrne commented Jan 22, 2018 •

edited

Loading

StefanKarpinski commented Jan 23, 2018 •

edited

Loading

simonbyrne commented Jan 24, 2018 •

edited

Loading