For floating point operations, allow inputs to be arbitrary, including SNaNs. #883

Merged
merged 6 commits from feature/llvm-nan-but-fast into master on Oct 25, 2019

Conversation

nlewycky
Contributor

@nlewycky nlewycky commented Oct 17, 2019

Description

For floating point operations, allow inputs to be arbitrary, including SNaNs.

Instead of ensuring inputs are canonical NaNs on every operation, we tag outputs as pending a canonicalization check, so that a sequence of computations can share a single canonicalization step at the end.
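
To illustrate the deferred check, here is a minimal Rust sketch (not the actual LLVM codegen in this PR; the `PendingF32` wrapper and `canonicalize` helper are hypothetical names used only to mirror the idea):

```rust
/// Hypothetical wrapper marking a float whose NaN payload has not yet been
/// canonicalized. The real change tags LLVM values during codegen; this
/// sketch only mirrors the idea at the Rust level.
#[derive(Clone, Copy)]
struct PendingF32(f32);

impl PendingF32 {
    /// Arithmetic on pending values stays pending: no per-operation check.
    fn add(self, rhs: PendingF32) -> PendingF32 {
        PendingF32(self.0 + rhs.0)
    }

    /// One canonicalization at the end of the sequence: any NaN
    /// (including an SNaN input) is replaced by the canonical quiet NaN.
    fn canonicalize(self) -> f32 {
        if self.0.is_nan() {
            f32::from_bits(0x7fc0_0000) // canonical qNaN bit pattern
        } else {
            self.0
        }
    }
}

fn main() {
    let snan = PendingF32(f32::from_bits(0x7fa0_0000)); // a signaling NaN
    let x = PendingF32(1.5);
    // A chain of operations with no per-operation canonicalization...
    let result = snan.add(x).add(PendingF32(2.0));
    // ...and a single canonicalization step when the value is observed.
    println!("{:#010x}", result.canonicalize().to_bits()); // 0x7fc00000
}
```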

There's an extra wrinkle for SIMD. The Wasm type system only types SIMD values as V128, so we might see a sequence like F32x4Add, I8x16Add, F64x2Add in a row with no other computations in between. Since a canonicalization may change the bit pattern in a way that turns one non-NaN value into a different non-NaN value under the next instruction's interpretation, most SIMD functions apply pending canonicalizations to their inputs, even the integer SIMD operations.
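
To make the bit-pattern concern concrete, here is a small sketch (the helper name is hypothetical and it only mimics a per-lane canonicalization on raw bits, not the actual SIMD codegen): canonicalizing the F32x4 view of a v128 rewrites bytes that a following I8x16 operation would read as perfectly ordinary integers.

```rust
/// Hypothetical helper: canonicalize each f32 lane of a v128,
/// represented here as four u32 bit patterns.
fn canonicalize_f32x4(lanes: [u32; 4]) -> [u32; 4] {
    lanes.map(|bits| {
        if f32::from_bits(bits).is_nan() {
            0x7fc0_0000 // canonical quiet NaN bit pattern
        } else {
            bits
        }
    })
}

fn main() {
    // Lane 0 happens to hold an SNaN bit pattern. A later I8x16 operation
    // would read these same 16 bytes as plain integers, so their exact
    // values matter.
    let v: [u32; 4] = [0x7fa0_1234, 1, 2, 3];
    let canon = canonicalize_f32x4(v);
    // Under the integer interpretation the value changed, even though the
    // float interpretation is still "a NaN":
    assert_ne!(v[0], canon[0]);
    println!("{:#010x} -> {:#010x}", v[0], canon[0]); // 0x7fa01234 -> 0x7fc00000
}
```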

Review

  • Add a short description of the change to the CHANGELOG.md file

…g SNaNs.

Instead of ensuring outputs are arithmetic NaNs on every function, we tag them as pending such a check, so that a sequence of computations can have a single canonicalization step at the end.

There's an extra wrinkle for SIMD. The Wasm type system only types SIMD values as V128, so we might see a sequence like F32x4Add, I8x16Add, F64x2Add in a row with no other computations in between. Thus, most SIMD functions apply pending canonicalizations to their inputs, even integer SIMD operations.
@nlewycky
Contributor Author

@syrusakbary I'd appreciate it if you could benchmark with this PR, please!

@nlewycky
Contributor Author

bors try

bors bot added a commit that referenced this pull request Oct 17, 2019
@bors
Contributor

bors bot commented Oct 17, 2019

try

Build succeeded

  • wasmerio.wasmer

@nlewycky
Contributor Author

nlewycky commented Oct 22, 2019

I ran some benchmarks with particle-repel:

master
simd: 36.56s 36.69s 36.43s
non-simd: 100.65s 100.66s 100.62s

feature/llvm-nan-but-fast
simd: 48.69s 48.70s 48.74s
non-simd: 62.48s 62.63s 62.40s

The non-SIMD speedup is what I expected. I know we have to handle SIMD differently, and that this can sometimes insert canonicalizations where we previously had none, but I was not expecting a slowdown compared to our current approach.

It looks like wasi-sdk-5 doesn't emit a SIMD FDiv operation, but instead emits individual scalar FDivs. On master we got lucky and the LLVM autovectorizer reassembled them into a SIMD operation, but with this branch we got unlucky and one of the fdivs moved into a different basic block. Overall we emit fewer instructions with this patch, even in the SIMD case. That usually translates into better performance, but in this particular benchmark the fdiv performance dominates.

@syrusakbary
Member

Good to merge once the changes are integrated into the CHANGELOG.

@nlewycky
Contributor Author

bors r+

bors bot added a commit that referenced this pull request Oct 25, 2019
883: For floating point operations, allow inputs to be arbitrary, including SNaNs. r=nlewycky a=nlewycky

Co-authored-by: Nick Lewycky <nick@wasmer.io>
Co-authored-by: nlewycky <nick@wasmer.io>
Co-authored-by: Syrus Akbary <me@syrusakbary.com>
@bors
Contributor

bors bot commented Oct 25, 2019

Build succeeded

  • wasmerio.wasmer

@bors bors bot merged commit dae9949 into master Oct 25, 2019
@bors bors bot deleted the feature/llvm-nan-but-fast branch October 25, 2019 20:50