This repository has been archived by the owner on Dec 22, 2021. It is now read-only.

Re-introduce some 64x2 instructions #101

Closed
ngzhian opened this issue Sep 9, 2019 · 5 comments · Fixed by #115

Comments

@ngzhian
Member

ngzhian commented Sep 9, 2019

In #81, we wanted some performance data to support the inclusion of 64x2 instructions. In https://github.com/ngzhian/simd-benchmarks we documented steps to run benchmarks, and the results of running those benchmarks.

I would like to suggest adding 64x2 instructions back into the spec, because we see real-world workloads that benefit from them.

However, not all 64x2 instructions get the same usage and exposure, so we might want to consider including only a subset of them.

Option 1, all 64x2 instructions 👀

  • i64x2.{splat,extract_lane,replace_lane}
  • i64x2.{add,sub,mul,neg}
  • i64x2.{shl,shr_s,shr_u}
  • i64x2.{any_true,all_true}
  • i64x2.{eq,ne,lt_s,lt_u,gt_s,gt_u,le_s,le_u,ge_s,ge_u}
  • i64x2.{abs,min,max}
  • i64x2.trunc_{s,u}/f64x2
  • f64x2.{splat,extract_lane,replace_lane}
  • f64x2.{add,sub,mul,neg}
  • f64x2.{div,sqrt}
  • f64x2.{shl,shr_s,shr_u}
  • f64x2.{any_true,all_true}
  • f64x2.{eq,le,lt,ge,gt}
  • f64x2.{abs,min,max}
  • f64x2.convert_{s,u}/i64x2

Option 2, only f64x2 instructions 🚀

  • all the f64x2 instructions (including conversions) from Option 1

Option 3, f64x2 instructions and common i64x2 instructions 🎉

  • all the f64x2 instructions (including conversions) from Option 1
  • i64x2.{splat,extract_lane,replace_lane}
  • i64x2.{add,sub,mul,neg}
  • i64x2.{shl,shr_s,shr_u}

I can put up a PR for changes to the docs when one of these options is decided.

Feel free to vote on this issue, eyes 👀 for 1, rocket 🚀 for 2, tada 🎉 for 3.

@tlively
Member

tlively commented Sep 9, 2019

I'm not sure we actually ever updated the docs to reflect that any of these instructions had been removed in the first place. The entire CG blessed the operations that were present during the in-person meeting, so I think we should make sure to go back at some point and get the CG's blessing for any operations we've added since then, including any 64x2 operations we want to get back into the proposal.

@arunetm
Collaborator

arunetm commented Sep 9, 2019

@ngzhian Thanks. What kind of workloads use the newly introduced common i64x2 instructions?
Great idea to share the newly added instructions at the CG meeting. We have a few outstanding PRs on quasi fma, load extend, etc. It will be better to go to the CG after we have merged the best versions of the ones we are targeting for the initial spec.

@dtig
Member

dtig commented Sep 9, 2019

It's true we didn't end up updating the spec. I think the intent here is to remove from the spec the operations we decide should not be included as part of v1. Aside from these, the ones that were included after the poll were the Load and Splat instructions and the swizzle, which have direct mappings to instructions and are ergonomic variants of operations that already exist in the current subset of operations, so I'm not sure these need an explicit CG poll.

It would be good to keep this issue focused on polling interest in just the 64x2 options listed by @ngzhian above, so we can put this class of operations to a vote at an upcoming CG meeting. I'd look at this more as a pre-poll to get consensus from the folks involved in the SIMD proposal.

@ngzhian
Member Author

ngzhian commented Sep 9, 2019

@arunetm the SIMD-oriented Mersenne Twister implementation uses a lot of i64x2 shifts, and many hash algorithms like Blake2b use a lot of i64x2 shifts and bitwise and. f64x2 instructions feature heavily in matrix operations, machine learning models, and optimization/solvers (the specific instructions depend on what the workload is doing).
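
As a rough illustration of why 64-bit lane shifts matter for hashing: BLAKE2b's mixing function rotates 64-bit words right by 32, 24, 16 and 63 bits, and without a dedicated rotate instruction each rotation is built from a left shift, a logical right shift, and a bitwise or. The sketch below models this on two lanes at once in plain Python. The helper names are hypothetical stand-ins for `i64x2.shl`, `i64x2.shr_u` and `v128.or`, and this is not taken from the benchmark code.

```python
# Two-lane model of a 64-bit rotate-right built from shifts and or.
# Helper names are illustrative stand-ins for the SIMD instructions.

MASK64 = (1 << 64) - 1


def shl(v, s):
    # Models i64x2.shl: shift each lane left, discarding overflow bits.
    return tuple((x << s) & MASK64 for x in v)


def shr_u(v, s):
    # Models i64x2.shr_u: logical right shift of each lane.
    return tuple(x >> s for x in v)


def v128_or(a, b):
    # Models v128.or restricted to two 64-bit lanes.
    return tuple(x | y for x, y in zip(a, b))


def rotr_i64x2(v, r):
    """Rotate each 64-bit lane right by r bits, for 0 < r < 64."""
    return v128_or(shr_u(v, r), shl(v, 64 - r))


# The four rotation amounts used by BLAKE2b's mixing function.
BLAKE2B_ROTATIONS = (32, 24, 16, 63)
```

With i64x2 shifts available, two independent 64-bit words can be rotated in three vector instructions rather than being processed one lane at a time in scalar code.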

@arunetm
Collaborator

arunetm commented Sep 9, 2019

Got it. Thanks ngzhian!

ngzhian added a commit to ngzhian/simd that referenced this issue Oct 1, 2019
A poll was held at CG meeting today, the consensus was for Option 3 of
WebAssembly#101.

This means we have all the f64x2 instructions, and some of the more
common i64x2 that have been benchmarked. This change reflects that
option in the proposal text, binary text, and implementation status.
ngzhian added a commit to ngzhian/simd that referenced this issue Oct 1, 2019
A poll was held at CG meeting today, the consensus was for Option 3 of
WebAssembly#101.

This means we have all the f64x2 instructions, and some of the more
common i64x2 that have been benchmarked. This change reflects that
option in the proposal text, binary text, and implementation status.

Fixes WebAssembly#101.
Honry pushed a commit to Honry/simd that referenced this issue Oct 19, 2019
A poll was held at CG meeting today, the consensus was for Option 3 of
WebAssembly#101.

This means we have all the f64x2 instructions, and some of the more
common i64x2 that have been benchmarked. This change reflects that
option in the proposal text, binary text, and implementation status.

Fixes WebAssembly#101.