From 94daf3e9461e07b66e65d713fcbf8d7232bf12ce Mon Sep 17 00:00:00 2001
From: Marat Dukhan
Date: Sun, 27 Oct 2019 21:20:20 -0700
Subject: [PATCH] i32x4.dot2_s and i32x4.dot2_acc_s instructions

---
 proposals/simd/BinarySIMD.md           |  2 ++
 proposals/simd/ImplementationStatus.md |  2 ++
 proposals/simd/SIMD.md                 | 11 +++++++++++
 3 files changed, 15 insertions(+)

diff --git a/proposals/simd/BinarySIMD.md b/proposals/simd/BinarySIMD.md
index 8617d725d..75c00307a 100644
--- a/proposals/simd/BinarySIMD.md
+++ b/proposals/simd/BinarySIMD.md
@@ -189,3 +189,5 @@ The `v8x16.shuffle` instruction has 16 bytes after `simdop`.
 | `i64x2.load32x2_s` | `0xd6`| m:memarg |
 | `i64x2.load32x2_u` | `0xd7`| m:memarg |
 | `v128.andnot` | `0xd8`| - |
+| `i32x4.dot2_s` | `0xd9`| - |
+| `i32x4.dot2_add_s` | `0xda`| - |
diff --git a/proposals/simd/ImplementationStatus.md b/proposals/simd/ImplementationStatus.md
index 4778d4fa0..e64d89042 100644
--- a/proposals/simd/ImplementationStatus.md
+++ b/proposals/simd/ImplementationStatus.md
@@ -100,6 +100,8 @@
 | `i16x8.sub_saturate_s` | `-msimd128` | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
 | `i16x8.sub_saturate_u` | `-msimd128` | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
 | `i16x8.mul` | `-msimd128` | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
+| `i16x8.dot2_s` | | | | |
+| `i16x8.dot2add_s` | | | | |
 | `i32x4.neg` | `-msimd128` | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
 | `i32x4.any_true` | `-msimd128` | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
 | `i32x4.all_true` | `-msimd128` | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
diff --git a/proposals/simd/SIMD.md b/proposals/simd/SIMD.md
index 0f1379207..f46b84203 100644
--- a/proposals/simd/SIMD.md
+++ b/proposals/simd/SIMD.md
@@ -380,6 +380,17 @@ def S.mul(a, b):
     return S.lanewise_binary(mul, a, b)
 ```
 
+### Integer dot product
+* `i32x4.dot2_s(a: v128, b: v128) -> v128`
+
+Lane-wise multiply signed 16-bit integers in the two input vectors and add adjacent pairs of the full 32-bit results.
+
+### Integer dot product with accumulation
+
+* `i32x4.dot2_add_s(a: v128, b: v128, c: v128) -> v128`
+
+Lane-wise multiply signed 16-bit integers in the two input vectors, add adjacent pairs of the full 32-bit results, and accumulate with corresponding 32-bit lanes of `c`. This operation is equivalent to `i32x4.add(i32x4.dot2_s(a, b), c)`.
+
 ### Integer negation
 * `i8x16.neg(a: v128) -> v128`
 * `i16x8.neg(a: v128) -> v128`
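
The semantics that the SIMD.md hunk describes for `i32x4.dot2_s` and `i32x4.dot2_add_s` can be illustrated with a small reference model. This is only a sketch and not part of the patch: vectors are modelled as plain Python lists of lane values rather than `v128`, the helper `_wrap_i32` is a hypothetical name, and wrapping the pair sum to 32 bits is an assumption, since the patch text does not say what happens when both products are `(-32768) * (-32768)`. The accumulate variant follows the stated equivalence to `i32x4.add(i32x4.dot2_s(a, b), c)`.

```python
def _wrap_i32(value):
    """Hypothetical helper: wrap an integer to a signed 32-bit lane (two's complement)."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value >= (1 << 31) else value


def dot2_s(a, b):
    """Model of i32x4.dot2_s: a and b each hold eight signed 16-bit lanes.

    Each output lane is the sum of two adjacent 16x16 -> 32-bit products.
    Wrapping the sum to 32 bits is an assumption, not stated in the patch.
    """
    assert len(a) == 8 and len(b) == 8
    assert all(-32768 <= x <= 32767 for x in a + b)
    return [
        _wrap_i32(a[2 * i] * b[2 * i] + a[2 * i + 1] * b[2 * i + 1])
        for i in range(4)
    ]


def dot2_add_s(a, b, c):
    """Model of i32x4.dot2_add_s as i32x4.add(i32x4.dot2_s(a, b), c)."""
    assert len(c) == 4
    return [_wrap_i32(d + acc) for d, acc in zip(dot2_s(a, b), c)]
```

For example, `dot2_s([1, 2, 3, 4, 5, 6, 7, 8], [1] * 8)` pairs lanes as (1, 2), (3, 4), (5, 6), (7, 8) and yields `[3, 7, 11, 15]`.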