numeric equality ignoring type with NaNs equal #5314

StefanKarpinski · 2014-01-06T15:44:48Z

One thing I find highly objectionable about the current state of affairs with ==, isequal and === is that there is no way to compare numeric values of different types so that type doesn't matter and NaNs are equal. I was trying to do this test of the fidelity of the a + b*im construct on my sk/imaginary branch and there's just no obvious way to check that a + b*im and Complex(a,b) are the same value even if they are different types, where a and/or b may be NaN.


StefanKarpinski · 2014-01-06T15:47:12Z

The whole issue can be reduced to the fact that there's no equality operation such that NaN and NaN32 are equal:

julia> NaN == NaN32
false

julia> isequal(NaN,NaN32)
false

julia> NaN === NaN32
false
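A minimal sketch of the kind of comparison being asked for, written in current Julia syntax. The name `numequal` is hypothetical and not part of Base; it simply compares values with == and additionally treats any two NaNs as equal, component by component for complex values.

```julia
# Hypothetical helper (not in Base): numeric equality that ignores type
# and treats all NaNs as equal to each other.
numequal(x::Real, y::Real) = (x == y) || (isnan(x) && isnan(y))

# For general numbers, compare real and imaginary parts separately.
# real() and imag() are also defined for Real arguments.
numequal(x::Number, y::Number) =
    numequal(real(x), real(y)) && numequal(imag(x), imag(y))

numequal(NaN, NaN32)                                # true
numequal(Complex(NaN, 1.0), Complex(NaN32, 1f0))    # true
numequal(1, 1.0)                                    # true
numequal(1.0, 2)                                    # false
```

Note that this sketch inherits the behaviour of == for signed zeros, so 0.0 and -0.0 compare equal under it.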

JeffBezanson · 2014-01-06T19:59:25Z

Maybe we should try again here. We could have every type generally implement ==, and then add

isequal(x::Real, y::Real) = ((x==y) & (signbit(x)==signbit(y))) | (isnan(x)&isnan(y))

It would be nice for isequal to only be defined where it is different from == (same for isless and <). Although this is simpler, it has worse pitfalls than what we do now. Now, we can say <(x,y) = isless(x,y), because a total order isless suffices for <, but not the other way around. Therefore if a type only provides < or isless, everything is correct. If we switched to the simpler approach we would have to make sure every type with a partial < implements both functions.
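To illustrate the fallback described above, here is a small sketch in current Julia syntax. The type `Celsius` is hypothetical; it defines only `isless`, and < then works through the generic fallback in Base.

```julia
# Hypothetical type that provides only a total order via isless.
struct Celsius
    deg::Float64
end

Base.isless(a::Celsius, b::Celsius) = isless(a.deg, b.deg)

# No method of < is defined for Celsius; the generic fallback
# <(x, y) = isless(x, y) is used instead.
Celsius(1.0) < Celsius(2.0)    # true
Celsius(2.0) < Celsius(1.0)    # false
```

The reverse direction does not hold: a type that defines only a partial < does not get a valid `isless` for free, which is the pitfall the paragraph above points out.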

Also, I'm not sure it's necessary for every data structure to have == that calls == recursively, and isequal that calls isequal recursively. Arguably, the IEEE standard only applies when an argument to == is a float. Having == for a non-IEEE-754 type like Dict call isequal on its elements might be the right thing to do. Getting false when comparing two identical data structures just because there's a NaN buried inside always seemed kind of strange to me.

StefanKarpinski · 2014-01-06T21:14:35Z

Fucking NaN. Honestly, I'm starting to think that the idea of a thing that isn't equal to itself is just not compatible with a sane notion of equality. I'm starting to think we should just throw errors when we encounter NaN :-\

JeffBezanson · 2014-01-06T21:56:36Z

That's what makes me want to push NaN into a corner where it only matters for a==b where a and/or b is floating-point. Every other context can do something more sane.

Most operations that can give NaN should throw errors instead; the only problem is performance-critical functions like +.

Then there is also the issue of signed zero.
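As a sketch of how signed zero interacts with the definition proposed earlier in the thread, the same expression can be tried under a hypothetical name, `isequal_sketch`, so that `Base.isequal` is left untouched.

```julia
# Hypothetical name; the body is the isequal definition proposed above.
function isequal_sketch(x::Real, y::Real)
    return ((x == y) & (signbit(x) == signbit(y))) | (isnan(x) & isnan(y))
end

0.0 == -0.0                   # true: == ignores the sign of zero
isequal_sketch(0.0, -0.0)     # false: the sign bits differ
isequal_sketch(NaN, NaN32)    # true: both are NaN
isequal_sketch(1, 1.0)        # true: equal values, same sign
```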

Switching briefly to ordering, some other types, like sets, also have a canonical strictly-partial order. Currently isless for sets is strict-subset, and I'm not sure that's right. That should be a < method, and we could maybe also define lexcmp for sets.

lindahua · 2014-01-06T22:20:37Z

FWIW, MATLAB has a function isequaln that determines array equality, treating NaN values as equal.

Reference: http://www.mathworks.com/help/matlab/ref/isequaln.html

pao · 2014-01-06T22:24:29Z

@lindahua I believe we had that function for a while (I may have even added it). It may also have been renamed. I seem to recall a discussion along those lines.

lindahua · 2014-01-06T22:26:37Z

This function doesn't live in Julia base though.

pao · 2014-01-06T22:33:53Z

Yeah, it didn't make the transition out of extras.jl. I don't see any explanation as to why, so it might have been either oversight or disgust; can't say for sure. :D

JeffBezanson · 2014-01-06T22:51:59Z

isequal is supposed to do that.

jiahao · 2014-01-07T04:10:25Z

If a=[1.0, NaN] and == is changed to call isequal elementwise and isequal is changed to become true on NaNs, then a==a != all(a.==a)
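The gap between whole-array and elementwise comparison can be seen in a short sketch. It assumes a recent Julia release, where == on arrays compares elements with == and `isequal` on arrays compares elements with `isequal`.

```julia
a = [1.0, NaN]
b = copy(a)                  # a distinct array with the same contents

elementwise = all(a .== b)   # false: NaN == NaN is false
whole_eq    = (a == b)       # false: == compares elements with ==
whole_iseq  = isequal(a, b)  # true: isequal treats NaNs as equal
```

If == on arrays called `isequal` on the elements instead, `whole_eq` would become true while `elementwise` stayed false, which is the inconsistency described above.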

jiahao · 2014-01-07T04:13:57Z

#5234

JeffBezanson · 2014-01-07T04:43:42Z

Numeric arrays are also fairly "number-like" and so would probably decide to call == recursively. But I'm not sure this applies to all data structures.

nalimilan · 2014-01-07T08:49:59Z

FWIW, there's a very similar issue when dealing with NAs in DataArrays, and there's also the question of == vs. isequal(), with the former returning NA as soon as one NA is found. JuliaStats/DataArrays.jl#46

gitfoxi · 2014-01-07T19:07:48Z

As a side thing, in general, it would be cool if arbitrary data structures with the same contents would evaluate as being equal. Instead you have to write a custom isequal for every type.

The reason is probably that if the structure were to have circular references you'd end up in an infinite loop. But there are some straightforward algorithms for comparing possibly-recursive structures which detect cycles by keeping track of everywhere they have already been and terminating recursion, which is some overhead, but maybe worth it?
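A minimal sketch of that cycle-detection idea, in current Julia syntax. The `Node` type and the `structeq` function are hypothetical and only cover a simple linked structure; a general version would have to walk the fields of arbitrary types.

```julia
# Hypothetical linked node that may point back to itself or an ancestor.
mutable struct Node
    value::Float64
    next::Union{Node,Nothing}
end

# Compare two possibly cyclic chains. `seen` records pairs of objects
# that are already being compared, so a cycle ends the recursion.
function structeq(a::Node, b::Node, seen = Set{Tuple{UInt,UInt}}())
    key = (objectid(a), objectid(b))
    key in seen && return true          # pair already visited: stop here
    push!(seen, key)
    isequal(a.value, b.value) || return false
    return structeq(a.next, b.next, seen)
end

structeq(::Nothing, ::Nothing, seen) = true   # both chains end together
structeq(a, b, seen) = false                  # one chain ends early

# Two separate one-element cycles with the same contents.
a = Node(1.0, nothing); a.next = a
b = Node(1.0, nothing); b.next = b
structeq(a, b)    # true, and the comparison terminates
```

The overhead mentioned above is the `seen` set: one entry per pair of mutable objects visited.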


stevengj · 2014-02-12T22:56:42Z

Note that it's not just NaN != NaN32. As per the IEEE-754 standard, NaN != NaN (it is the only numeric value that does not compare equal to itself).

JeffBezanson · 2014-02-12T23:10:13Z

That is settled; we're thinking of isequal here.

JeffBezanson · 2014-02-19T18:34:35Z

I want to point out that there is no equality predicate that always does what you want. All we can do is provide some reasonable primitives, and in specific cases you must reason about what kind of equality you need. For example, this exact same issue could have been filed about sign bits instead of NaN (a comparison like ==, but also requiring sign bits to match). Sometimes you will have to write a==b && signbit(a)==signbit(b), or a==b || (isnan(a) && isnan(b)). Or you might have to write a==b && typeof(a)==typeof(b).
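The three ad hoc comparisons mentioned above can be written out as small predicates. The function names are hypothetical; the bodies are the expressions from the comment.

```julia
# Like ==, but also requires the sign bits to match.
eq_signed(a, b) = a == b && signbit(a) == signbit(b)

# Like ==, but treats any two NaNs as equal.
eq_nan(a, b) = a == b || (isnan(a) && isnan(b))

# Like ==, but also requires the types to match.
eq_typed(a, b) = a == b && typeof(a) == typeof(b)

eq_signed(0.0, -0.0)    # false
eq_nan(NaN, NaN32)      # true
eq_typed(1, 1.0)        # false
eq_typed(1.0, 1.0)      # true
```

Each predicate answers a different question, which is why no single built-in equality can stand in for all of them.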

JeffBezanson · 2014-02-19T19:37:03Z

Perhaps isequal should ignore the types of NaNs. Sounds like a hack, but it is true that nearly all numeric functions will do the same thing given a NaN of any type --- it will propagate and you'll get another NaN. This isn't the case with non-NaNs; different precision arguments will yield different answers.

StefanKarpinski · 2014-02-20T13:22:49Z

How does this help?

isequal(Complex(1e0,1e0),Complex(1f0,1f0)) ==> false
isequal(Complex(NaN,NaN),Complex(NaN32,NaN32)) ==> true

Clearly you can't satisfy everyone, but this is a case that is particularly nasty to express generically.

The more I think about it, the more I think that making isequal behave as much like == as possible is the best approach. That minimizes the number of things for programmers to think about since isequal isn't really an entirely new thing – it's just == modified as little as possible to make it work for hashing. There just isn't any sensible intermediate between === and == that isn't fundamentally arbitrary.

I also think this is a bit of a mistake:

julia> 1 == "foo"
false

julia> 1 < "foo"
ERROR: no method isless(ASCIIString, Int64)
 in < at operators.jl:18

It seems to me that == and <= should fail or work on exactly the same arguments. Making === work everywhere makes sense though since it's a completely global comparison predicate.

JeffBezanson · 2014-02-20T15:17:39Z

I do think there is a non-arbitrary intermediate between === and ==, which is to behave like === but treat everything as immutable. This is justified because many values are immutable by convention.

It's really helpful for == to fall back to ===. Otherwise it would be hard to know what methods are needed. For example, if you have a value-or-nothing argument, you might informally check if x==nothing. Having that not work is an unnecessary gotcha. Equality is really different from ordering, since we already admit that there is at least one valid equivalence relation among all values.
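A short sketch of the fallback being described, in current Julia syntax. The `Box` type is hypothetical and deliberately defines no == method of its own, so the generic fallback to === applies.

```julia
# Hypothetical mutable type with no custom == method.
mutable struct Box
    v::Int
end

a = Box(1)
b = Box(1)

a == a          # true: the same object, so === holds
a == b          # false: distinct mutable objects, even with equal fields

x = nothing
x == nothing    # true: works without any extra methods
```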

StefanKarpinski · 2014-02-20T15:21:48Z

Unfortunately, that definition is not at all what people expect hashing to do. To play devil's advocate, currently it's unclear whether you should write x == nothing or x === nothing, leading to a lot of variation in people's code. If x == nothing was a no method error, then it would be clear that x === nothing should be used.

JeffBezanson · 2014-02-20T15:53:27Z

I agree that one can argue that definition is not useful. The first example I always think of is that string encoding shouldn't matter for hash keys. But for numeric types I'm not so sure.

It's a tough call. It is always nice to trap questionable operations like comparing totally unrelated things, but (1) a lot of people want to just use == everywhere and forget about it, (2) it does make code more generic. Otherwise we might start to see boilerplate like if (isa(x,AbstractArray) && x==y) || x===y.

stevengj · 2014-02-20T16:22:20Z

I think that not defining == for comparison of unrelated types would just confuse people for no purpose; we should only throw an error in cases like 3 < "hello" where the desired meaning is not clear.

StefanKarpinski · 2014-02-20T16:24:05Z

Fair enough. That's really a tangential issue.

JeffBezanson · 2014-05-07T19:56:50Z

Fixed by #6624

StefanKarpinski mentioned this issue Feb 12, 2014

hashing of ranges is awful #5778

Closed

StefanKarpinski modified the milestones: 0.4, 0.3 Mar 28, 2014

JeffBezanson mentioned this issue Apr 19, 2014

No promotion of integertypes for Dict keys #6580

Closed

JeffBezanson closed this as completed May 7, 2014
