Fast versions of isequal() and isless() for Nullables #18304

nalimilan · 2016-08-31T10:42:39Z

This is a minimal version of #16988, including only its less controversial parts. From the numerous discussions we've had on the subject, it seems nobody disagrees about the semantics of isequal(::Nullable, ::Nullable)::Bool (which is already implemented, but slow) and of isless(::Nullable, ::Nullable)::Bool. These basically mirror those of NaN
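As a rough illustration of the NaN analogy in the description above, the sketch below assumes that a null is never less than anything and that every non-null value is less than a null, which is how `NaN` behaves under `isless`. `DemoNullable` and `demo_isless` are hypothetical names made up for this example. They are not the code in this PR, and `Base.Nullable` itself is not available in current Julia versions.

```julia
# Hypothetical stand-in for Base.Nullable, defined here so the example is self-contained.
struct DemoNullable{T}
    isnull::Bool
    value::T
end

# Assumed ordering: a null is never less than anything,
# and every non-null value is less than a null.
function demo_isless(x::DemoNullable, y::DemoNullable)
    x.isnull && return false
    y.isnull && return true
    return isless(x.value, y.value)
end

xs = [DemoNullable(false, 3), DemoNullable(true, 0), DemoNullable(false, 1)]
sorted = sort(xs, lt = demo_isless)   # non-null values in increasing order, then the null
```

Under this assumption, sorting an array of nullables puts the nulls at the end, the same place `sort` puts `NaN` in a float array.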

tkelman · 2016-08-31T13:38:00Z

base/nullable.jl

+    isequal(x::Nullable, y::Nullable)
+
+If neither `x` nor `y` is null, compares them according to their values
+(i.e. `isless(get(x), get(y))`). Else, returns `true` if both arguments are null,


Funny, I remember checking exactly this because I knew I was going to make this mistake. Will fix in next round of updates.

eschnett · 2016-08-31T14:13:25Z

base/nullable.jl

+## Operators
+
+"""
+    null_safe_op(f::Any, ::Type)::Bool


uninit_safe_op?

Yeah, there may be better names. uninit sounds too much like an action, though. Maybe undef_safe_op?

undef_op_is_safe?

Don't spend too much time on this; if this feature will be useful in other places as well (map on an array of nullables?), then we can change the name later.

...and this variant sounds like op is undef to me. Finding names is hard. :-)

I guess this is a combination of three things:

1. The operation is expected to be not much more expensive than a branch, so the optimization is reasonably profitable.

2. The type is pointer-free and so cannot be #undef.

3. The operation is supported and will not fail on any possible bit pattern.

so perhaps isbitstotal? Like isbits, but also total under a certain operation?
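The three conditions above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the implementation in this PR: `SketchNullable`, `sketch_safe_op`, and `sketch_isequal` are hypothetical names, and the struct is a stand-in for `Base.Nullable` that leaves `value` uninitialized for nulls.

```julia
# Hypothetical stand-in for Base.Nullable; a null leaves `value` uninitialized.
struct SketchNullable{T}
    isnull::Bool
    value::T
    SketchNullable{T}() where {T} = new{T}(true)       # null: `value` is never set
    SketchNullable{T}(v) where {T} = new{T}(false, v)  # non-null
end

# Hypothetical trait in the spirit of null_safe_op: false unless declared otherwise.
sketch_safe_op(f, ::Type, ::Type) = false
sketch_safe_op(::typeof(isequal), ::Type{Int}, ::Type{Int}) = true

function sketch_isequal(x::SketchNullable{S}, y::SketchNullable{T}) where {S,T}
    if sketch_safe_op(isequal, S, T)
        # Safe case: always compare the values, even if they hold arbitrary bits,
        # and combine with non-short-circuiting & and | so no branch is needed.
        return (x.isnull & y.isnull) |
               (!x.isnull & !y.isnull & isequal(x.value, y.value))
    else
        # General case: short-circuit so that `value` is only read when set.
        return (x.isnull && y.isnull) ||
               (!x.isnull && !y.isnull && isequal(x.value, y.value))
    end
end
```

With `Int`, reading the `value` of a null only yields meaningless bits, so the eager path is harmless. With `BigInt`, the field of a null is an undefined reference and reading it would throw, which is why that combination stays on the short-circuiting path.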

"Total operation" does not evoke much to me, but maybe that's just my lack of mathematical culture. Another possibility is to call this eager_op or null_eager_op.

Opinions?

Honestly, I think null_safe_op is fine.

null_safe_op is fine by me as well. In math terms, this is basically an assertion that an operation is uniformly safe over the space of all bit patterns, but I think that term is too mathematical to make sense in our codebase.

nalimilan · 2016-09-03T09:11:00Z

More comments?

StefanKarpinski · 2016-09-13T19:14:45Z

Technically breaking (but minorly so); needs review and acceptance.

nalimilan · 2016-09-15T19:54:39Z

See JuliaCI/BaseBenchmarks.jl#24 for benchmarks (I'll add isless after merging the PR).

johnmyleswhite

This is good by me, but I think we might be able to simplify and generalize the implementation by using a tuple of types.

johnmyleswhite · 2016-09-16T18:10:22Z

base/nullable.jl

+always computing the result even for null values, a branch is avoided, which helps
+vectorization.
+"""
+null_safe_op(f::Any, ::Type) = false


Should we replace this definition and the one below with something more like code_llvm, where the second argument is always a tuple of types? That way we'd handle arbitrary-arity functions in one function definition, while still getting appropriate specialization.

Do you have a case in mind where varargs wouldn't work the same as a tuple? I followed the style of promote_op here.

I'm not sure. Constructing varargs seems superfluous given that this doesn't happen in user-facing code, but I don't know how different the performance would be, so I'm not confident it's worth debating much.

tkelman · 2016-09-16T19:26:16Z

base/nullable.jl

+returning `true` means that the operation may be called on any bit pattern without
+throwing an error (though returning invalid or nonsensical results is not a problem).
+In particular, this means that the operation can be applied on the whole domain of the
+type *and on uninitialized objects*. As a general rule, these proporties are only true for


tkelman · 2016-09-16T19:27:37Z

base/nullable.jl

+isequal(x::Nullable, y::Nullable{Union{}}) = x.isnull
+
+"""
+    isless(x::Nullable, y::Nullable)


was this elsewhere in helpdb or the rst? should it be added to the rst if not?

edit: likewise with isequal

johnmyleswhite · 2016-09-16T20:02:00Z

base/nullable.jl

+vectorization.
+"""
+null_safe_op(f::Any, ::Type) = false
+null_safe_op(f::Any, ::Type, ::Type) = false


My concern is really to remove this line and the precedent it seems to set for an ever-increasing number of customized-by-arity functions.

I could write these as null_safe_op(f::Any, ::Type...) = false.

That's ok by me.

Actually, null_safe_op(f::Any, ::Type, ::Type...) = false.
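A small sketch of what the varargs signature above buys, using a hypothetical `demo_safe_op` so that it does not claim to be the definition merged in this PR: one fallback covers every arity, specific combinations opt in with their own methods, and requiring a leading `::Type` means a call with no types at all is rejected.

```julia
# Hypothetical trait mirroring the shape of the signature proposed above.
demo_safe_op(f, ::Type, ::Type...) = false                       # fallback: one or more types
demo_safe_op(::typeof(isless), ::Type{Int}, ::Type{Int}) = true  # opt-in for one combination

demo_safe_op(isless, Int, Int)        # true
demo_safe_op(isless, BigInt, BigInt)  # false
demo_safe_op(-, Int)                  # false
# demo_safe_op(isless) would throw a MethodError, since at least one type is required.
```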

nalimilan · 2016-09-16T21:54:57Z

The new version should address all comments.

tkelman · 2016-09-16T22:12:48Z

I don't see any methods right now that would make null_safe_op(isless, S, T) ever true?

johnmyleswhite · 2016-09-16T22:26:28Z

Wait, I'm confused: isn't isless(?Int, ?Int) safe, but isless(?BigInt, ?BigInt) not safe where ?T = Nullable{T}?

nalimilan · 2016-09-17T11:00:07Z

Good catch, looks like I forgot to add these (or more precisely lost them when rebasing my old branch). Should be good now.

Unfortunately, tests cannot check whether the fast path was used for these operations; for those which return a Nullable, we can check the contents of the value field even for nulls to do that. At least, the new benchmarks will ensure there's no regression.

nalimilan · 2016-09-19T20:58:39Z

Let's see whether the new benchmarks are ready:

@nanosoldier runbenchmarks("nullable", vs = ":master")

nanosoldier · 2016-09-19T22:48:12Z

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels

TotalVerb · 2016-09-19T22:49:30Z

This is great! 👍

nalimilan · 2016-09-20T08:45:05Z

OK, as expected the tests for isequal are two to ten times faster than before. "Possible regressions" come from sum benchmarks, which shouldn't be affected at all by this. So I'm going to merge.

@jrevels Maybe the perf_sum benchmarks need a higher tolerance?

nalimilan mentioned this pull request Aug 31, 2016

Add isless(::Nullable, ::Nullable) -> Bool JuliaStats/NullableArrays.jl#141

Merged

nalimilan force-pushed the nl/nullableops branch from 4f63108 to b2acbeb Compare August 31, 2016 10:53

tkelman reviewed Aug 31, 2016

tkelman added the "potential benchmark" label Aug 31, 2016

eschnett reviewed Aug 31, 2016

nalimilan force-pushed the nl/nullableops branch 2 times, most recently from a375963 to 3b74f93 Compare September 3, 2016 09:08

nalimilan added the "missing data" label Sep 6, 2016

StefanKarpinski modified the milestones: 0.5.x, 0.6.0 Sep 13, 2016

StefanKarpinski assigned johnmyleswhite Sep 13, 2016

johnmyleswhite approved these changes Sep 16, 2016

tkelman reviewed Sep 16, 2016

johnmyleswhite reviewed Sep 16, 2016

Make isequal(::Nullable, ::Nullable) faster for safe types

277c21f

Introduce the null_safe_op() function to allow declaring which combinations of operators and types are safe (i.e. can be computed without branching even when null). This function will be used for other operators in the future.

nalimilan force-pushed the nl/nullableops branch from 3b74f93 to 689e139 Compare September 16, 2016 21:54

nalimilan added 2 commits September 17, 2016 12:58

Add isless(::Nullable, ::Nullable)::Bool

4046cd9

This method follows the semantics of the already implemented isequal(::Nullable, ::Nullable). It is needed to sort arrays of Nullable.

Add docstrings for isequal() and isless() on Nullables

2660cb0

nalimilan force-pushed the nl/nullableops branch from 689e139 to 2660cb0 Compare September 17, 2016 11:00

Inline isequal() and isless() for Nullable

4777f7a

This enables SIMD for basic types when null_safe_op() is true. Inlining of the isequal() call on values still depends on the element type.

nalimilan merged commit 721ba7a into master Sep 20, 2016

nalimilan deleted the nl/nullableops branch September 20, 2016 08:45

Sacha0 referenced this pull request Oct 1, 2016

Merge pull request #18736 from ninjin/nin/pairconv

3398c7e

RFC: Pair to Pair conversions

nalimilan mentioned this pull request Oct 19, 2016

Implement more operators on Nullable with lifting semantics #19034

Closed

KristofferC removed the "potential benchmark" label Oct 8, 2018
