Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix #17970 (output of map over BitArrays) #18198

Merged
merged 1 commit into from
Nov 4, 2016

Conversation

pabloferz
Copy link
Contributor

@pabloferz pabloferz commented Aug 23, 2016

With this patch the output of mapping over BitArrays is properly decided based on the return type of the function (if Bool the return type will be a BitArray and an Array of the appropriate type otherwise).

@martinholters Can you take a look?

Cc: @JeffBezanson

@pabloferz
Copy link
Contributor Author

pabloferz commented Aug 23, 2016

Also, can anyone check that there are not performance regressions (once tests pass)?

@JeffBezanson
Copy link
Sponsor Member

I think this is too inference-dependent; our current rule is that the result can only depend on inference for empty arrays. I would prefer the simple and predictable behavior of only returning a BitArray for the set of boolean functions listed in the code (~, !, &, |, etc.)

@pabloferz
Copy link
Contributor Author

pabloferz commented Aug 23, 2016

Ok, I changed so it returns BitArray only for the special boolean functions ~, !, &, etc. (returning, for any other case, an Array of the appropriate type).

EDIT: It turns out that the fall-back map (which relies on collect) works properly with BitArrays, so in the end, the solution was to remove some methods and reorganize the specialized ones.

@tkelman
Copy link
Contributor

tkelman commented Aug 23, 2016

@nanosoldier runbenchmarks(ALL, vs = ":master")

@JeffBezanson
Copy link
Sponsor Member

Ah yes, that's a good solution.

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels

@martinholters
Copy link
Member

LGTM - hooray, BitChunkFunctor is gone!
I doesn't look to me like any of the performance regressions are caused by this. But I also wonder if mapping over BitArrays is exercised at all in the benchmarks. Did you do some specific benchmarking locally? Thinking about specialization, would bit_map!{F<:Function}(f::F, ...) be better?

@tkelman tkelman added the potential benchmark Could make a good benchmark in BaseBenchmarks label Aug 24, 2016
@pabloferz
Copy link
Contributor Author

pabloferz commented Aug 24, 2016

AFAICS, I also believe that the performance regressions are unrelated. And I think you're right, it doesn't seem like there are benchmarks for mapping over BitArrays. Locally I checked that the special cases (&, ~, etc.) weren't affected.

As for the specialization, that might help a bit in some cases, but I'm not sure.

@pabloferz
Copy link
Contributor Author

our current rule is that the result can only depend on inference for empty arrays

@JeffBezanson Is there any consideration of using inference when the result type is concrete as an optimization and falling back to use the values otherwise (or that is problematic in any way)?

@yuyichao
Copy link
Contributor

Is there any consideration of using inference when the result type is concrete as an optimization

Type inference is supposed to handle that automatically.

@pabloferz
Copy link
Contributor Author

@yuyichao I should have been more precise, I was particularly referring to doing it here https://github.com/JuliaLang/julia/blob/master/base/array.jl#L312. We currently have

function _collect(c, itr, ::EltypeUnknown, isz::Union{HasLength,HasShape})
    st = start(itr)
    if done(itr,st)
        return _similar_for(c, _default_eltype(typeof(itr)), itr, isz)
    end
    v1, st = next(itr, st)
    collect_to_with_first!(_similar_for(c, typeof(v1), itr, isz), v1, itr, st)
end

but this could be improved a bit like this

function _collect(c, itr, ::EltypeUnknown, isz::Union{HasLength,HasShape})
    et = _default_eltype(typeof(itr))
    isleaftype(et) && return copy!(_similar_for(c, et, itr, isz), itr)
    st = start(itr)
    if done(itr,st)
        return _similar_for(c, et, itr, isz)
    end
    v1, st = next(itr, st)
    collect_to_with_first!(_similar_for(c, typeof(v1), itr, isz), v1, itr, st)
end

@yuyichao
Copy link
Contributor

Yes and I'm under the impression that the type inference should handle the first case just fine if the callback is inferrable. Is that not the case?

@pabloferz
Copy link
Contributor Author

pabloferz commented Aug 24, 2016

Well, it is handled as it should. What I should have probably had to say was that copy! is faster than collect_to_with_first! when the type can be inferred.

@yuyichao
Copy link
Contributor

yuyichao commented Aug 24, 2016

collect_to_with_first! looks pretty efficient to me.

We can certainly tweak the implementation at many points if benchmarking different cases suggests that there's a good reason to do so. I don't see a reason we have to special case the leaf inferred type to get the same performance though.

@JeffBezanson
Copy link
Sponsor Member

It should be possible to make collect_to_with_first! where the type is known produce the same code as any other efficient loop. I'm pretty sure it does already, at least in some cases.

@pabloferz
Copy link
Contributor Author

pabloferz commented Aug 24, 2016

I see, good to know. Sorry for the noise, then.

EDIT: The reason I brought it up the first time was because I was under the impression that it was not the case for some example I was testing. But after benchmarking it properly I see it works a you point out.

@pabloferz
Copy link
Contributor Author

Should this one be backported?

@martinholters
Copy link
Member

IIUC, this does not change behavior for anything that works in master (or 0.5), but enables something that used to work in 0.4 (and really should work). So backporting sounds like a good idea.

@pabloferz
Copy link
Contributor Author

Bump. This should be good to go as well as back-portable.

@JeffBezanson JeffBezanson merged commit 7b41e72 into JuliaLang:master Nov 4, 2016
@pabloferz pabloferz deleted the pz/mapbitarray branch November 4, 2016 19:46
tkelman pushed a commit that referenced this pull request Feb 22, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
potential benchmark Could make a good benchmark in BaseBenchmarks
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants