Should std::{f32,f64}::NAN be a QNAN or SNAN ? #52897

gnzlbg · 2018-07-31T07:34:36Z

Right now, whether they are a QNAN or an SNAN depends on the architecture. Currently, they are an SNAN on MIPS (EDIT: r5 and older) at least, and a QNAN on most others.

It probably makes sense to offer consistent behavior here independently of whether LLVM preservers payloads, signaling/quietness, etc.

cc @rkruppe @draganmladjenovic

est31 · 2018-07-31T09:30:04Z

Note that "on MIPS at least" is imprecise: which bit pattern is considered quiet and which pattern is considered signaling depends on the used MIPS version and configuration. See #46246 for more info.

gnzlbg · 2018-07-31T09:52:14Z

@est31 indeed. #52666 suggests leaving the current mips targets for the "older" mips versions (r5 and older), and to add a new targets (e.g. mipsisa64r6-unknown-linux-gnuabi64) for the newer versions (r6 and newer).

MIPS in the issue refers to r5 and older since currently we don't have any targets for r6. I've added an edit to clarify that.

hanna-kruppe · 2018-07-31T10:41:49Z

It probably makes sense to offer consistent behavior here independently of whether LLVM preservers payloads, signaling/quietness, etc.

Why? Considering how LLVM behaves (and the differences in how hardware handle payloads, too!), I can't think of any (edit: portable) use case that can reliably use any particular NAN constant for a bitwise equality check. It seems to me you'd always want to test with is_nan().

varkor · 2019-05-28T20:16:16Z

The guarantees provided by std::{f32,f64}::NAN should be documented. At the moment, there is no information, but it might be helpful to be explicit about the fact that nothing is guaranteed other than being a NaN.

silwol · 2020-02-05T17:15:40Z

In Debian, we hit what I believe to be the issue at hand.
I bisected the rustup versions of compilers, and it seems the behavior changed from rustc 1.36 to 1.37. Was this intentional, or maybe triggered by an update of a library used by rustc, e.g. llvm?

The issue surfaced in rust-num/num-traits#151 where the tests worked before and don't work any longer now in mipsel and mips64el.

nikic · 2020-02-05T18:16:50Z

@silwol I'd expect that minnum/maxnum use libcall legalization on mips, so you're likely seeing libm behavior here. Can you check whether that's the case?

Per the C standard, I believe fmin/fmax should be treating SNaN the same as QNaN (contrary to IEEE754).

silwol · 2020-02-06T07:06:53Z

@nikic I hope I checked this the right way. If not, please give me a hint how / where to try next. The important information here is that it's not the num-traits code that fails, but instead there is a doctest inside the project which falls back to std's min and max implementation if their std feature is enabled, and that is what fails.

Here is what I tried:

$ cat example.c 
#include <math.h>
#include <stdio.h>

int main() {
    double a = 1;
    double b = NAN;
    printf("fmin(%f, %f) = %f\n", a, b, fmin(a, b));
    return 0;
}
$ clang -Wall -Werror -std=c99 -lm -fno-builtin -o example example.c 
$ ./example
fmin(1.000000, nan) = 1.000000
$

This behaves exactly the same on Debian mipsel and amd64.

In contrast, the following behaved the same on these architectures with rustc up to v1.36.0 (installed through rustup.rs), and fails on mipsel since rustc v1.37.0.

General information on my mipsel VM:

$ uname -a
Linux debian 5.4.0-3-4kc-malta #1 SMP Debian 5.4.13-1 (2020-01-19) mips GNU/Linux
$ cat src/lib.rs 
#[cfg(test)]
mod tests {
    #[test]
    fn min() {
        let nan = 0f64 / 0f64;
        assert_eq!(1f64.min(nan), 1f64);
    }

    #[test]
    fn max() {
        let nan = 0f64 / 0f64;
        assert_eq!(1f64.max(nan), 1f64);
    }
}

Trying with latest stable rustc installed with rustup, behaves the same with the rustc 1.40.0 installed from the Debian repository:

$ rustc --version
rustc 1.41.0 (5e1a79984 2020-01-27)
$ cargo test
    Finished test [unoptimized + debuginfo] target(s) in 0.57s
     Running target/debug/deps/floatextremestest-23756e6931b17aa8

running 2 tests
test tests::max ... FAILED
test tests::min ... FAILED

failures:

---- tests::max stdout ----
thread 'main' panicked at 'assertion failed: `(left == right)`
  left: `NaN`,
 right: `1.0`', src/lib.rs:12:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace.

---- tests::min stdout ----
thread 'main' panicked at 'assertion failed: `(left == right)`
  left: `NaN`,
 right: `1.0`', src/lib.rs:6:9


failures:
    tests::max
    tests::min

test result: FAILED. 0 passed; 2 failed; 0 ignored; 0 measured; 0 filtered out

Trying with latest known-to-work stable rustc:

$ rustc --version
rustc 1.36.0 (a53f9df32 2019-07-03)
$ cargo test
    Finished dev [unoptimized + debuginfo] target(s) in 9.38s
     Running target/debug/deps/floatextremestest-3c9ba83b13a3b688

running 2 tests
test tests::max ... ok
test tests::min ... ok

test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

   Doc-tests floatextremestest

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

I'm not very familiar with compiler internals, so please let me know if that might be a different problem from this one, then I'll move it to a new issue.
For the record, this is tracked in Debian at https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=950583

est31 · 2020-02-06T07:29:37Z

@silwol which mips version does your cpu have? E.g. what does running cat /proc/cpuinfo | grep "model name" show?

silwol · 2020-02-06T07:41:57Z

For checking if the error is reproducible, I started a qemu mipsel VM with qemu-system-mipsel -M malta …. /proc/cpuinfo doesn't contain a "model name" entry here, but instead has this line:

cpu model		: MIPS 24Kc V0.0  FPU V0.0

Regarding the Debian infrastructure, I don't have direct access to these machines. The failing test logs can be found on https://buildd.debian.org/status/package.php?p=rust-num-traits from where you can follow the link to the machine information in the Buildd column. There you have a "Build machine info" entry somewhere in the head, showing a list of links to the details of each machine.
The affected num-traits crate has been built multiple times on different machines. The details are on https://buildd.debian.org/status/logs.php?pkg=rust-num-traits&ver=0.2.11-1&arch=mips64el and https://buildd.debian.org/status/logs.php?pkg=rust-num-traits&ver=0.2.11-1&arch=mipsel (corresponding the "all" link in the "Logs" column).

silwol · 2020-02-06T07:55:37Z

I just looked up the information about the physical machines that built the project. They're all listed with Processor: Rhino Labs UTM8. There are machines available with Processor: LS3A-RS780-1w (Quad Core Loongson 3A) and Processor: Lemote 3A ITX-A1101 (Quad Core Loongson 3A), but I don't know if there is any way to assign builds explicitly to a specific machine type.

est31 · 2020-02-06T07:56:12Z

The f32::max docs say:

If one of the arguments is NaN, then the other argument is returned.

Which is clearly not the case here.

f32::max itself is only a wrapper of the maxnumf32 intrinsic, which maps to llvm.maxnum.f32.

The docs of that LLVM intrinsic say:

Follows the IEEE-754 semantics for maxNum except for the handling of signaling NaNs. This matches the behavior of libm’s fmax.

If either operand is a NaN, returns the other non-NaN operand.

So this promise seems to be violated as well. However, this does not imply a bug in llvm just yet, as we provide some intrinsics to llvm. It might be the intrinsics that are wrong. The Rust based libm implementation has this fmaxf code. So maybe the bug lurks there? The code looks alright though.

silwol · 2020-02-07T06:44:31Z

Thanks @est31 for the hints.
I found f64::from_bits docs mentioning a difference between MIPS and the other platforms in the signaling bit. Also the fmaxf code you referenced has a comment telling that sNaN is not handled due to not being supported yet. Interestingly enough, the implementation that the num_traits crate uses for no-std environments succeeds in the test.
The question remains what changed from rustc 1.36 to 1.37 to reveal these circumstances.

est31 · 2020-02-07T08:18:48Z

@silwol yeah there is a fundamental difference between older MIPS targets and newer MIPS targets in that the signaling bit is opposite to IEEE 754-2008 vs follows the standard. But even older MIPS targets follow IEEE 754-1985 which clearly specifies the bit patterns of NaNs (it only doesn't specify the bit pattern distinction between qNaN and sNaN).

@silwol if you have access to a MIPS machine you could try finding the nightly that caused the regression using cargo-bisect-rustc.

Either way, I think you should file a proper issue for your bug report.

vinc17fr · 2021-08-19T01:47:26Z

IEEE 754-2008 is obsolete. You should follow IEEE 754-2019, where the poorly designed minNum and maxNum operations (as being non-associative) have been replaced by other operations, which handle qNaN and sNaN in the same way (except for the signaled invalid operation exception in case of sNaN).

est31 · 2021-08-19T08:15:49Z

@vinc17fr there is a discussion about which IEEE 754 version to use in #83984 .

colt-browning · 2024-12-02T20:18:56Z

As of today, std::{f32,f64}::NAN are deprecated in favour of {f32,f64}::NAN, for which the absence of any guarantees is documented. On the other hand, the relatively recent RFC 3514-float-semantics specifies that "the only way to ever see a signaling NaN in a program is to create one with from_bits" (and accepts that the old MIPS is non-compliant).

gnzlbg mentioned this issue Jul 31, 2018

run-pass/simd-intrinsic-float-minmax: Force use of qNaN on Mips #52895

Merged

jonas-schievink added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Jan 27, 2019

varkor added the A-docs Area: documentation for any part of the project, including the compiler, standard library, and tools label May 28, 2019

silwol mentioned this issue Feb 7, 2020

Different NaN types on mipsel and mips64el than on most other architectures #68925

Open

ecstatic-morse mentioned this issue Jun 13, 2020

Document guarantees (or lack thereof) regarding sign, quietness, and payload of NaNs #73328

Closed

ecstatic-morse added the A-floating-point Area: Floating point numbers and arithmetic label Jun 15, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Should std::{f32,f64}::NAN be a QNAN or SNAN ? #52897

Should std::{f32,f64}::NAN be a QNAN or SNAN ? #52897

gnzlbg commented Jul 31, 2018 •

edited

Loading

est31 commented Jul 31, 2018

gnzlbg commented Jul 31, 2018

hanna-kruppe commented Jul 31, 2018 •

edited

Loading

varkor commented May 28, 2019

silwol commented Feb 5, 2020

nikic commented Feb 5, 2020

silwol commented Feb 6, 2020 •

edited

Loading

est31 commented Feb 6, 2020

silwol commented Feb 6, 2020

silwol commented Feb 6, 2020

est31 commented Feb 6, 2020 •

edited

Loading

silwol commented Feb 7, 2020 •

edited

Loading

est31 commented Feb 7, 2020

vinc17fr commented Aug 19, 2021

est31 commented Aug 19, 2021

colt-browning commented Dec 2, 2024

Should std::{f32,f64}::NAN be a QNAN or SNAN ? #52897

Should std::{f32,f64}::NAN be a QNAN or SNAN ? #52897

Comments

gnzlbg commented Jul 31, 2018 • edited Loading

est31 commented Jul 31, 2018

gnzlbg commented Jul 31, 2018

hanna-kruppe commented Jul 31, 2018 • edited Loading

varkor commented May 28, 2019

silwol commented Feb 5, 2020

nikic commented Feb 5, 2020

silwol commented Feb 6, 2020 • edited Loading

est31 commented Feb 6, 2020

silwol commented Feb 6, 2020

silwol commented Feb 6, 2020

est31 commented Feb 6, 2020 • edited Loading

silwol commented Feb 7, 2020 • edited Loading

est31 commented Feb 7, 2020

vinc17fr commented Aug 19, 2021

est31 commented Aug 19, 2021

colt-browning commented Dec 2, 2024

gnzlbg commented Jul 31, 2018 •

edited

Loading

hanna-kruppe commented Jul 31, 2018 •

edited

Loading

silwol commented Feb 6, 2020 •

edited

Loading

est31 commented Feb 6, 2020 •

edited

Loading

silwol commented Feb 7, 2020 •

edited

Loading