Incorporate review suggestions from WebAssembly#77
pinging @dtig, @tlively, @Maratyszcza, and @arunetm
Looks fine to me from a tooling perspective. @ngzhian would the renumbering of the narrowing and widening operations here be problematic?
LGTM
Thanks for distilling multiple PRs/Issues into this one. I'm in favor of merging this because collaborators from different application domains have indicated that this would be a useful operation to have. It has been a requested addition since 2017 and seems to be reasonably well supported on all architectures. A couple of outstanding things to make sure we handle previous concerns.
There was general consensus on removing i8x16.mul. @penzn can we update this PR to address it as well?
Do you mean the discussion in #28? Will add a commit to remove those if there are no objections.
Consensus for the removal is documented in WebAssembly#28 and WebAssembly#98.
Can we add formal pseudocode to SIMD.md for these instructions?
Good point, maybe we should.
Sorry for double-posting. We don't have semantics for memory ops, and I am not sure how to describe that yet. The "extend" part of the operation should be very similar to the "widen" operation, which does not have semantics yet either.
Ok, I'm fine with merging this without pseudocode. I haven't thought of any ambiguities in the semantics here.
Only i16x8 and i32x4 are encoded in this commit mainly because i8x16 and i64x2 do not have simple encodings in x86. i64x2 is not required by the SIMD spec and there is discussion (WebAssembly/simd#98 (comment)) about removing i8x16.
And remove i8x16.mul, as documented in WebAssembly#28 and WebAssembly#98.
Sorry I'm late to point this out:
You are right that
Thanks, I think I'll implement it using an add to a temporary first, and then add the optimization to use VLDR if the offset can be an immediate.
@ngzhian are you working on implementing that in V8? We wanted to get some timings for this and can lend a hand with the implementation.
Yup I am, I have done it all for x64, so if you are interested only in x64 you can build locally and run. I have not started on arm/arm64/ia32 yet, so if you want to pick those up, lmk!
I believe @rrwinterton was interested in testing this. Arm: probably not. Let us think about ia32; somebody else might be willing to take it as well.
Sounds good, I'll be working on arm/arm64 soon, and will leave ia32 to y'all for now. Please keep me updated (here or via email) so we don't overlap :) Thanks!
Rebasing #77 on current master and incorporating the latest review feedback.
This change proposes six new load instructions that combine a memory read of a "half-size" vector with extending each lane to the next standard lane size. Motivating workloads are machine learning, image compression, video rendering, and data processing. There is widespread hardware support. See also #23, #28, and #77.
Hardware support from #23:
PMOVZXWD xmm, [mem] on x86 with SSE4.1
MOVQ xmm, [mem] + PXOR xmm0, xmm0 + PUNPCKLWD xmm, xmm0 on SSE2
VLD1.16 {dX}, [rAddr] + VMOVL.U16 qX, dX on ARMv7+NEON
LD1 {Vx.4H}, xAddr + UXTL Vx.4S, Vx.4H on ARM64
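
Since the thread agreed to merge without formal pseudocode, here is an informal scalar sketch of what a load-and-extend does. It is not the spec's definition. It assumes the six variants are three narrow lane widths (8, 16, 32 bits) with either sign or zero extension, and that memory is little-endian as elsewhere in WebAssembly. The names `load_extend` and `unpack_low_with_zero` are hypothetical helpers made up for this illustration; the second one mimics the SSE2 MOVQ + PXOR + PUNPCKLWD sequence listed above to show that interleaving with zeros gives the same result as a zero-extending load.

```python
def load_extend(memory, addr, lane_bits, lanes, signed):
    """Read `lanes` narrow lanes (64 bits total) and widen each to 2 * lane_bits.

    Hypothetical reference model, not an API from the proposal.
    Returns the wide lanes as unsigned bit patterns.
    """
    lane_bytes = lane_bits // 8
    end = addr + lane_bytes * lanes
    if addr < 0 or end > len(memory):
        raise IndexError("out-of-bounds load")
    wide_mask = (1 << (2 * lane_bits)) - 1
    result = []
    for i in range(lanes):
        start = addr + i * lane_bytes
        # Interpret the narrow lane as signed or unsigned, little-endian.
        narrow = int.from_bytes(memory[start:start + lane_bytes], "little", signed=signed)
        # Masking to the wide lane turns a negative value into its
        # sign-extended two's-complement bit pattern.
        result.append(narrow & wide_mask)
    return result


def unpack_low_with_zero(memory, addr):
    """Model of the SSE2 fallback for 16-bit to 32-bit zero extension."""
    half = memory[addr:addr + 8]                     # MOVQ: load 64 bits
    interleaved = bytearray()
    for i in range(0, 8, 2):
        interleaved += half[i:i + 2] + b"\x00\x00"   # PUNPCKLWD with a zeroed register
    return [int.from_bytes(interleaved[i:i + 4], "little") for i in range(0, 16, 4)]


mem = bytes([0x01, 0x00, 0xFF, 0xFF, 0x34, 0x12, 0x00, 0x80])
zero_ext = load_extend(mem, 0, 16, 4, signed=False)  # [0x1, 0xFFFF, 0x1234, 0x8000]
sign_ext = load_extend(mem, 0, 16, 4, signed=True)   # [0x1, 0xFFFFFFFF, 0x1234, 0xFFFF8000]
```

The sign-extending variants have no equally cheap zero-interleave trick, which is one reason the per-architecture instruction sequences above differ between the signed and unsigned forms.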