From 78a77e5faa810c210c3071c3d50ef6db671b1c27 Mon Sep 17 00:00:00 2001
From: Marat Dukhan
Date: Tue, 21 May 2019 16:57:14 -0700
Subject: [PATCH] Add Quasi-Fused Multiply-Add/Subtract instructions

---
 proposals/simd/BinarySIMD.md |  5 +++++
 proposals/simd/SIMD.md       | 12 ++++++++++++
 2 files changed, 17 insertions(+)

diff --git a/proposals/simd/BinarySIMD.md b/proposals/simd/BinarySIMD.md
index 919dd767b..f4446a132 100644
--- a/proposals/simd/BinarySIMD.md
+++ b/proposals/simd/BinarySIMD.md
@@ -25,6 +25,7 @@ instr ::= ...
 Some SIMD instructions have additional immediate operands following `simdop`.
 The `v8x16.shuffle` instruction has 16 bytes after `simdop`.
 
+<<<<<<< HEAD
 | Instruction                | `simdop` | Immediate operands |
 | ---------------------------|---------:|--------------------|
 | `v128.load`                |    `0x00`| m:memarg           |
@@ -141,6 +142,8 @@ The `v8x16.shuffle` instruction has 16 bytes after `simdop`.
 | `f32x4.abs`                |    `0x95`| -                  |
 | `f32x4.neg`                |    `0x96`| -                  |
 | `f32x4.sqrt`               |    `0x97`| -                  |
+| `f32x4.qfma`               |    `0x98`| -                  |
+| `f32x4.qfms`               |    `0x99`| -                  |
 | `f32x4.add`                |    `0x9a`| -                  |
 | `f32x4.sub`                |    `0x9b`| -                  |
 | `f32x4.mul`                |    `0x9c`| -                  |
@@ -150,6 +153,8 @@ The `v8x16.shuffle` instruction has 16 bytes after `simdop`.
 | `f64x2.abs`                |    `0xa0`| -                  |
 | `f64x2.neg`                |    `0xa1`| -                  |
 | `f64x2.sqrt`               |    `0xa2`| -                  |
+| `f64x2.qfma`               |    `0xa3`| -                  |
+| `f64x2.qfms`               |    `0xa4`| -                  |
 | `f64x2.add`                |    `0xa5`| -                  |
 | `f64x2.sub`                |    `0xa6`| -                  |
 | `f64x2.mul`                |    `0xa7`| -                  |
diff --git a/proposals/simd/SIMD.md b/proposals/simd/SIMD.md
index 0f1379207..28352aeb0 100644
--- a/proposals/simd/SIMD.md
+++ b/proposals/simd/SIMD.md
@@ -778,6 +778,18 @@ Lane-wise IEEE `multiplication`.
 
 Lane-wise IEEE `squareRoot`.
 
+### Quasi-Fused Multiply-Add
+* `f32x4.qfma(a: v128, b: v128, c: v128) -> v128`
+* `f64x2.qfma(a: v128, b: v128, c: v128) -> v128`
+
+Lane-wise multiplication and addition (`a + b * c`), either with or without intermediate rounding. A WebAssembly implementation may execute this instruction as either an IEEE Fused-Multiply-Add (FMA) or a combination of IEEE `multiplication` and IEEE `addition` operations, depending on the availability and performance of the FMA instruction on the target native platform. All `qfma` instructions in a WebAssembly module must execute either as fused operations or as unfused operations.
+
+### Quasi-Fused Multiply-Subtract
+* `f32x4.qfms(a: v128, b: v128, c: v128) -> v128`
+* `f64x2.qfms(a: v128, b: v128, c: v128) -> v128`
+
+Lane-wise multiplication and subtraction (`a - b * c`), either with or without intermediate rounding. A WebAssembly implementation may execute this instruction as either an IEEE Fused-Multiply-Subtract (FMS) or a combination of IEEE `multiplication` and IEEE `subtraction` operations, depending on the availability and performance of the FMS instruction on the target native platform. All `qfms` instructions in a WebAssembly module must execute either as fused operations or as unfused operations.
+
 ## Conversions
 ### Integer to floating point
 * `f32x4.convert_i32x4_s(a: v128) -> v128`
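
Reviewer note (illustration only, not part of the patch): the two permitted behaviours of `qfma` can give different results for the same inputs, which is why the text requires a module to use one behaviour consistently. The sketch below models `f64x2.qfma` on two-element Python lists. The helper names `qfma_unfused` and `qfma_fused` are hypothetical, and the fused path is emulated with exact rational arithmetic rather than a hardware FMA, so it assumes finite inputs and does not model NaN, infinities, or the sign of zero.

```python
from fractions import Fraction

def qfma_unfused(a, b, c):
    """Lane-wise a + b * c with two roundings: after the multiply and after the add."""
    return [x + y * z for x, y, z in zip(a, b, c)]

def qfma_fused(a, b, c):
    """Lane-wise a + b * c with a single rounding at the end.

    Assumption: finite inputs only. The exact result is computed as a
    rational number and rounded once when converted back to a float.
    """
    return [float(Fraction(x) + Fraction(y) * Fraction(z))
            for x, y, z in zip(a, b, c)]

# Lane 0: b * c is exactly 1 - 2**-60, which rounds to 1.0 when the
# product is rounded on its own, so the unfused sum cancels to zero.
# Lane 1: small integers, where both behaviours agree.
a = [-1.0, 1.0]
b = [1.0 + 2.0**-30, 2.0]
c = [1.0 - 2.0**-30, 3.0]

print(qfma_unfused(a, b, c))
print(qfma_fused(a, b, c))
```

In lane 0 the unfused path loses the low-order bits of the product before the addition, while the fused path keeps them; in lane 1 both paths return the same value. `qfms` behaves analogously with `a - b * c`.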