How better handle negative NaNs? #1
Comments
so I suppose it should be handled as |
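const F64 = new Float64Array(1);
// 32-bit view over the same 8 bytes; index 1 is the high word (with the sign bit) on little-endian platforms
const U64 = new Uint32Array(F64.buffer);
F64[0] = NaN;
console.log('0x' + U64[1].toString(16));
F64[0] = -NaN;
console.log('0x' + U64[1].toString(16));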
> 0x7ff80000
> 0xfff80000 |
There is no negative NaN in spec though
https://tc39.github.io/ecma262/#sec-ecmascript-language-types-number-type |
I found that all engines actually have different raw bits for For example, ChakraCore implemented it in chakra-core/ChakraCore#5905. It seems Chrome also implemented it recently (version 79+), though I have not had time to find the original PR. |
// use TypedArray to expose the sign bit
// note this also use the coercion `ToNumber` semantic
Math.signbit = (() => {
const LE = new Uint8Array(new Uint16Array([1]).buffer)[0]
return function signbit(n) {
const f64 = new Float64Array([n])
const i32 = new Uint32Array(f64.buffer)
return (i32[LE] >>> 31) === 1
}
})()
console.log(Math.signbit(0))
console.log(Math.signbit(-0))
console.log(Math.signbit(Infinity))
console.log(Math.signbit(-Infinity))
console.log(Math.signbit(NaN))
console.log(Math.signbit(-NaN))
console.log(Math.signbit(-(-NaN)))
const negNaN = Number.POSITIVE_INFINITY / Number.NEGATIVE_INFINITY
console.log(Math.signbit(negNaN))
Note: all tests were run on my MacBook Air (macOS High Sierra 10.13.6, Intel Core i5) |
Interesting. Btw, you could use a simpler approach, because JS typed arrays are little-endian on x86:
const F64 = new Float64Array(1);
const U64 = new Uint32Array(F64.buffer);
const signbit = x => (F64[0] = x, Boolean(U64[1] >>> 31)); |
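Both snippets above depend on the host byte order. A minimal sketch that avoids this, assuming a hypothetical helper name `signbitViaDataView` (not from the thread): `DataView` accessors default to big-endian, so the sign bit is always the top bit of byte 0. What it reports for NaN inputs is still up to the engine.

```javascript
// Hypothetical helper, not from the thread: reads the sign bit through a DataView,
// whose accessors default to big-endian regardless of the host CPU.
const view = new DataView(new ArrayBuffer(8));

function signbitViaDataView(x) {
  view.setFloat64(0, x);                  // coerces x with ToNumber, then stores its 8 bytes
  return (view.getUint8(0) & 0x80) !== 0; // top bit of the most significant byte
}

console.log(signbitViaDataView(0));         // false
console.log(signbitViaDataView(-0));        // true
console.log(signbitViaDataView(-Infinity)); // true
console.log(signbitViaDataView(-NaN));      // engine-dependent
```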
I came along and was wondering why special casing was made for NaNs too. It wouldn't act like C's signbit at all then, but according to @chicoxyzzy, JS doesn't have a negative NaN. If it isn't possible to create/use a NaN with an arbitrary bitset, then wouldn't one be able to use the bit manipulation implementations that most other languages use for signbit, without special casing NaNs, relying on the JS VM to canonicalize the NaN upon writing/reading/serializing it? |
I think, as my previous tests show, engines actually have negative |
Exposing the bit patterns of NaN is a massive mistake in Typed Arrays, and one we should not extend anywhere else. |
If I may ask, why? NaN is just as much of a number as 53.5 is, as 8 is, as 0 is, as -0 is, as infinity is, etc, at least according to IEEE 754 semantics and rules. All of them have a hard bit-pattern, and because TypedArrays expose any of them, I'd argue that they should all be exposed. Maybe... just maybe, the language spec should be changed to reflect modern implementations, and have different NaNs? |
@crimsoncodes0 because in JS, explicitly and intentionally, there is supposed to only be one observable NaN value. Typed Arrays expose them because the implementations that led to them didn't canonicalize. That doesn't mean it's a good decision. Nothing should ever be added to the language that widens this unfortunate exposure. |
Yes, according to IEEE 754 a negative NaN is canonical and fully valid (the sign should be preserved and propagated)
@ljharb In my opinion the big mistake is trying to fix IEEE 754 at the software (language or VM) level. Even WebAssembly, which tries to be the most deterministic ISA/VM, doesn't try to do this |
All of JavaScript does this already, outside of typed arrays. It’s part of the language design. |
Would it be a web compatibility-breaking change to add to the TypedArray's spec that implementations must canonicalize NaN values from the Float{32,64}Array numerical accessors and DataView.getFloat{32,64}? Presently, it sounds like the language is quite frankly... broken. Yes, it's a small thing, but it still breaks a fundamental part of the ES language spec, and explicitly putting a step into the algorithms for reading memory into JS floats would fill this hole, and clear up this issue, as JS implementations would no longer expose NaN bit patterns. |
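To make "canonicalize" concrete at the byte level, here is a user-space sketch that rewrites any NaN encoding stored in a buffer to a single pattern. It is not what any engine does: the helper name `canonicalizeNaNAt` is hypothetical, and the choice of 0x7ff8000000000000 as "the" canonical pattern is an assumption, since the spec does not name one.

```javascript
// Assumption: treat the common quiet-NaN pattern 0x7ff8000000000000 as canonical.
const CANONICAL_HI = 0x7ff80000;
const CANONICAL_LO = 0x00000000;

// Hypothetical helper: if the 8 bytes at byteOffset (big-endian) encode any NaN,
// overwrite them with the assumed canonical pattern. Returns true if it rewrote them.
function canonicalizeNaNAt(view, byteOffset) {
  const hi = view.getUint32(byteOffset);     // sign, exponent, top 20 mantissa bits
  const lo = view.getUint32(byteOffset + 4); // low 32 mantissa bits
  const exponentAllOnes = (hi & 0x7ff00000) === 0x7ff00000;
  const mantissaNonZero = (hi & 0x000fffff) !== 0 || lo !== 0;
  if (exponentAllOnes && mantissaNonZero) {
    view.setUint32(byteOffset, CANONICAL_HI);
    view.setUint32(byteOffset + 4, CANONICAL_LO);
    return true;
  }
  return false; // finite values and infinities are left untouched
}
```

The helper only uses integer accessors, so it never round-trips the NaN through a JS Number, which is the step where engines may or may not preserve the bits.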
It wouldn't likely break the web, but the committee explicitly decided in 2015 to not mandate NaN canonicalization in Typed Arrays, for performance reasons, and I'm quite confident there's no appetite to revisit that decision. |
And this totally makes sense. How about relaxing NaN canonicalization in other parts of the language? I don't think it would break the web |
@MaxGraey other language parts aren't used in hot paths or perf-sensitive code like Typed Arrays are (that's their reason for existing). I would be strongly opposed to any attempt to further worsen the situation around NaN canonicalization in the language. |
Why? In user space the bit signature of a NaN doesn't matter at all. It could still be canonicalized for FFI or something like that if necessary. Relaxing this requirement would simplify and speed up JS engines |
Off-topic, but does ECMAScript's canonical NaN value have a canonical bitset?
And is there any documented reasoning behind that decision? If so, could it be linked, so that we may at least understand this situation (a bit) better? |
@crimsoncodes0 no, since the only bits of it are exposed via Typed Arrays. The spec itself: https://tc39.es/ecma262/#sec-ecmascript-language-types-number-type.
|
I can't open the spec's multi-megabyte webpage without causing my entire device to lag, or crashing my (mobile) browser, is there a way to open only a small section of the spec? Besides that, I have one last question to help me assess this problem: does the ES spec say that the floating point number (5.0) has a bitset? Does it acknowledge that it has one or otherwise say that it does? If it acknowledges that any numbers have bit-patterns, it should acknowledge that all numbers do, including not-a-number, otherwise the specification makes no sense whatsoever, and ought to be changed. If it does not acknowledge that any numbers have bit-patterns, then TypedArrays and DataViews are just plain broken features in JavaScript, since they clearly expose these "non-existent" bit-patterns to user scripts. |
Here's the same section on the multipage build: https://tc39.es/ecma262/multipage/ecmascript-data-types-and-values.html#sec-ecmascript-language-types-number-type There's a note in there about the bit pattern; not sure if that answers your question. That the language here is incongruous between "typed arrays" and "everything else" is true, but doesn't mean anything can change it. It also doesn't mean the incongruity should be worsened. |
But I think the semantic of It also keep the simple invariant of |
I don't think that invariant is possible; There are no guarantees once you have a |
The first step of the unary negation algorithm canonicalizes the NaN, therefore this is merely a double canonicalization, thus the NaN should be the exact same NaN and consequently have the same bit-pattern, so I don't follow? If the above is correct, then current engines aren't implementing it at all. |
Feel free to experiment with it in various engines - when writing https://npmjs.com/get-nans, I found a lot of unpredictable and unintuitive behavior. |
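For this kind of experiment it helps to dump all 64 bits rather than just the high word. A small sketch, assuming a hypothetical helper name `bitsHex` and an engine with `DataView.prototype.getBigUint64` (ES2020); the output for the NaN cases is deliberately not annotated, since it varies by engine:

```javascript
// Hypothetical helper: hex dump of the 64 bits a number is stored as.
const scratch = new DataView(new ArrayBuffer(8));

function bitsHex(x) {
  scratch.setFloat64(0, x); // big-endian store
  return '0x' + scratch.getBigUint64(0).toString(16).padStart(16, '0');
}

console.log(bitsHex(1));        // 0x3ff0000000000000
console.log(bitsHex(-0));       // 0x8000000000000000
console.log(bitsHex(NaN));      // engine-dependent
console.log(bitsHex(-NaN));     // engine-dependent
console.log(bitsHex(-(-NaN)));  // engine-dependent
```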
As my previous test (#1 (comment)) showed, most engines keep the bit pattern.
@hax it's not consistent, btw.
If you call signbit long enough, it will give different results. Try this in the console:
After ~3k calls it gives a different result. |
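The snippet from this comment did not survive in this copy of the thread. The following is not the original, only a hypothetical sketch of how such an experiment could look: it reuses the little-endian `signbit` one-liner from earlier in the thread and a made-up helper `firstChange` that reports the first call whose result differs from the initial one. Whether and when it flips depends on the engine and its JIT tiers.

```javascript
// signbit as posted earlier in the thread (assumes a little-endian host).
const F64 = new Float64Array(1);
const U32 = new Uint32Array(F64.buffer);
const signbit = x => (F64[0] = x, Boolean(U32[1] >>> 31));

// Hypothetical helper: calls fn(arg) up to maxCalls times and returns the index
// of the first call whose result differs from the very first call, or -1.
function firstChange(fn, arg, maxCalls) {
  const first = fn(arg);
  for (let i = 1; i < maxCalls; i++) {
    if (fn(arg) !== first) return i;
  }
  return -1;
}

console.log(firstChange(signbit, -NaN, 100000)); // engine-dependent
```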
@dy as I explained on nodejs/node#56373 (comment)
It would be perfectly reasonable in this non-TA API to canonicalize all NaN values, and treat them all the same. |
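A minimal sketch of that option, with no typed arrays involved. The function name `signbitCanonical` is hypothetical, and returning `false` for every NaN is an assumption about what "treat them all the same" would mean here, not something the spec text quoted in this thread settles:

```javascript
// Hypothetical sketch: a sign test that never looks at NaN bit patterns.
function signbitCanonical(x) {
  const n = +x;               // ToNumber coercion
  if (n !== n) return false;  // assumption: every NaN reports false
  return n < 0 || Object.is(n, -0);
}

console.log(signbitCanonical(-0));        // true
console.log(signbitCanonical(-Infinity)); // true
console.log(signbitCanonical(NaN));       // false
console.log(signbitCanonical(-NaN));      // false
```

Because the result for NaN is fixed up front, it cannot vary between engines or between JIT tiers, at the cost of diverging from C's `signbit` for NaN inputs.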
I think we need to better clarify how to handle negative NaNs. Most implementations in the built-ins of LLVM, GCC, Go and Rust are not sign-agnostic for NaNs, like:
But this is not strictly mentioned in the spec, and it seems we always need to handle signed and unsigned NaNs as false? Related to this discussion