Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use broadcasting instead of convert() for disallowmissing() #135

Open
kescobo opened this issue Jul 21, 2021 · 3 comments
Open

Use broadcasting instead of convert() for disallowmissing() #135

kescobo opened this issue Jul 21, 2021 · 3 comments

Comments

@kescobo
Copy link

kescobo commented Jul 21, 2021

From discussion on Slack with @pdeffebach, broadcasting seems to be a bit faster

using Missings, BenchmarkTools

v = Vector{Union{Float64, Missing}}(rand(10000));

@benchmark disallowmissing($v)

@benchmark Float64.($v)

image

@kescobo kescobo changed the title Use broadcasting instead of convert() for dropmissing() Use broadcasting instead of convert() for disallowmissing() Jul 21, 2021
@kescobo
Copy link
Author

kescobo commented Jul 21, 2021

Hmm - actually, it must be whatever else disallowmissing is doing, since

@benchmark convert.(Float64, $v)

is about the same:

image

@pdeffebach
Copy link
Contributor

The reason Missings.jl uses convert(AbstractArray{T}, ...) is because of categorical arrays and other non-Vector{T} array types.

But for Vector{T} we can define a more specialized method that is faster.

@nalimilan
Copy link
Member

But for Vector{T} we can define a more specialized method that is faster.

Note that the benchmark above uses convert.(Float64, $v), not convert(Vector{Float64}, $v). The latter seems to be faster than convert(AbstractVector{Float64}, $v), but still slower than broadcasting.

I think these performance differences should be reported against Julia. disallowmissing isn't the problem here, it's definition is trivial.

(disallowmissing cannot use broadcasting as the element type should be preserved disregarding the actual contents of the vector.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants