SIMD Support #15

Aatch · 2014-03-19T02:16:30Z

RFC for improving SIMD support.

First time I've done something like this, so apologies for anything I may have gotten wrong.

Acknowledgements to @jensnockert's post here: http://blog.aventine.se/2013/07/16/my-vision-for-rust-simd.html for forming the basis of this RFC.

sfackler · 2014-03-19T02:24:06Z

active/0000-simd-support.md

+As such, these would be used like so:
+
+```rust
+fn make_vec3() -> simd![f32,..3] {


Syntax extensions cannot currently expand into types, FYI. I don't believe that it would be a hard change to support it, but it's something that should probably at least be discussed.

One point on this that I wasn't sure how to fit into the RFC itself is that simd types could be special-cased in the parser, so they aren't actually macros there. That's what I've done in my current branch.

huonw · 2014-03-19T02:24:46Z

cc @jensnockert @cartazio @sanxiyn

huonw · 2014-03-19T02:33:16Z

active/0000-simd-support.md

+
+# Unresolved questions
+
+1. Syntax - should it stay as proposed or is there a better alternative?


The one you proposed in IRC is also reasonable: `let v: <f32, .. 4> = <0.0, 1.0, 2.0, 3.0> + <0.0, .. 4>;`.

(Although I'm mildly worried about some subtle grammar interaction appearing.)

+1 for the simd![] syntax

I really like the <f32, ..4> syntax, but on the other hand, we may want to save special syntax for things that appear more often in normal code. The simd! one could just be a normal syntax extension, and then people could add more crazy types that they need. (MIPS accumulators come to mind.)

That is, unless we decide that cornering the dense linear algebra/DSP/GPGPU market is important for Rust, in which case really slick SIMD syntax might be worth it.

cartazio · 2014-03-19T02:39:31Z

I like the skeleton I see here.

There's still the need to have a type-safe way of expressing shuffles (but nothing in this proposal precludes figuring out that support later). Not just the "llvm shuffle", but also the various CPU target microarch-specific shuffles. This is loosely related to "compile time evaluation", but where you really, really need to have those arguments be fixed at compile time.
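One way to read "arguments fixed at compile time" in present-day Rust is to pass the lane indices as const generic parameters. The sketch below is only an illustration under that assumption: `F32x4`, `LaneCheck` and `shuffle` are hypothetical names, the vector is a plain array rather than a real SIMD type, and nothing here is taken from the RFC or from LLVM.

```rust
/// Hypothetical stand-in for a 4-lane vector (a plain array, not a hardware SIMD type).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct F32x4(pub [f32; 4]);

/// Helper whose associated constant fails to evaluate if any index is out of range.
struct LaneCheck<const A: usize, const B: usize, const C: usize, const D: usize>;

impl<const A: usize, const B: usize, const C: usize, const D: usize> LaneCheck<A, B, C, D> {
    // Evaluated once per instantiation, at compile time.
    const OK: () = assert!(A < 4 && B < 4 && C < 4 && D < 4);
}

impl F32x4 {
    /// Shuffle whose lane indices are part of the type, so they cannot vary at run time.
    pub fn shuffle<const A: usize, const B: usize, const C: usize, const D: usize>(self) -> F32x4 {
        // Referencing the constant forces the range check for this set of indices.
        let () = LaneCheck::<A, B, C, D>::OK;
        F32x4([self.0[A], self.0[B], self.0[C], self.0[D]])
    }
}
```

With this shape, `v.shuffle::<3, 2, 1, 0>()` reverses the lanes, while an index such as `7` is rejected when the program is built rather than when it runs.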

cartazio · 2014-03-19T02:56:38Z

I kinda like the simd![ ] type syntax. It's like putting a warning on the tin that simd vectors != normal rust vectors/arrays. (And I think that's a very valuable thing!)

cartazio · 2014-03-19T03:16:08Z

active/0000-simd-support.md

+# Unresolved questions
+
+1. Syntax - should it stay as proposed or is there a better alternative?
+2. Shuffle support via field access. It's a nice feature that I would like to


This would essentially be sugar for the llvm_shuffle, right?

Clarify repeat syntax requirements. Clear up section on comparisons. Fix error in example code.

Elaborate on similarity to OpenCL vector support. Clarify link between element access and shuffle syntax. Add more detail to the "shuffle assign" operation. Remove unresolved question regarding shuffling. It's here to stay, even if not in the current format.

Aatch · 2014-03-23T04:25:54Z

I've implemented a fair amount of this RFC here: https://github.com/Aatch/rust/tree/simd-support

I'd also like some feedback on the syntax. I'm not too bothered as to what it actually is, but I want it resolved quickly.

brendanzab · 2014-03-24T01:20:31Z

+1 on the syntax for me. Looks quite nice, and fits in with the fixed length vector syntax.

Aatch · 2014-03-24T03:34:08Z

By syntax, I'm also including the field access syntax. As strange as it may seem, it's actually simpler to implement most of the features this way, instead of potentially adding yet another expression node and all that.

pnkfelix · 2014-03-25T21:21:33Z

active/0000-simd-support.md

+let x = v + u
+```
+
+will be quivalent to:


nit: "equivalent"

nikomatsakis · 2014-03-25T21:34:42Z

I feel a bit confused as to what the alternatives are. I'd like to know how this might look if it were more purely macro- or library-based. (For example, the swizzle ops might operate over any type T that implements a SIMD trait and so on).

I'd also like to know what the path is for extending the set of operations beyond "the basics" -- i.e., to cover things like Neon's fancy table lookup instructions and so on. Ideally we'd be able to smoothly extend the set of ops.

I'm surprised that 3-vectors work, but I guess LLVM handles that.
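To make the "more purely library-based" alternative concrete, here is a rough sketch of a swizzle that works over any type implementing a SIMD-like trait. Everything in it is an assumption for illustration: the trait `SimdVector`, its methods, and the free function `swizzle` are hypothetical, the implementation is over plain arrays, and it relies on associated types and const generics, which the language did not have when this thread was written.

```rust
/// Hypothetical trait describing a fixed-width vector of lanes.
pub trait SimdVector: Copy {
    type Elem: Copy;
    const LANES: usize;
    /// Read one lane.
    fn lane(self, i: usize) -> Self::Elem;
    /// Build a vector by computing each lane from its index.
    fn from_fn(f: impl FnMut(usize) -> Self::Elem) -> Self;
}

// Plain arrays stand in for real SIMD types in this sketch.
impl<T: Copy, const N: usize> SimdVector for [T; N] {
    type Elem = T;
    const LANES: usize = N;
    fn lane(self, i: usize) -> T {
        self[i]
    }
    fn from_fn(f: impl FnMut(usize) -> T) -> Self {
        std::array::from_fn(f)
    }
}

/// Generic swizzle: output lane `i` is input lane `idx[i]`.
/// Input and output widths may differ, as long as the element type matches.
pub fn swizzle<V, W>(v: V, idx: &[usize]) -> W
where
    V: SimdVector,
    W: SimdVector<Elem = V::Elem>,
{
    assert_eq!(idx.len(), W::LANES, "one index per output lane");
    W::from_fn(|i| v.lane(idx[i]))
}
```

Unlike a compiler-integrated shuffle, the indices here are ordinary run-time values, so this shows the API shape only and makes no claim about what code would be generated.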

auroranockert · 2014-03-25T21:36:13Z

3-vectors in LLVM are 4-vectors due to OpenCL, iirc.

EDIT: I was wrong on the reason… the new rules for type legalization are described at http://blog.llvm.org/2011/12/llvm-31-vector-changes.html

Aatch · 2014-03-25T21:49:48Z

@nikomatsakis LLVM handles every size vector you throw at it. I had a test function that swizzled a 3-element vector up to a 6-element vector.

I'm really, really reluctant to even start writing up what it would look like as a library type without associated items. Hell, there's a ton of stuff I want to do with this implementation that won't really work without it. We need associated items in order for any API around this to be at all palatable without compiler integration. There is no way to avoid massive API explosion otherwise, traits or not. With a stronger type system than C/C++, we would end up needing a frankly obscene number of types and intrinsics to make this work. It's why nobody has done anything with the minimal support we have now!

pnkfelix · 2014-03-25T22:28:24Z

@Aatch on the flip side, how reasonable does a library type look once one adds associated items?

I'm not really opposed to adding something (as long as it looks reasonable) now under a feature guard, but if there's a chance that there's a cleaner variant that makes use of something like associated items that I'm pretty sure we'll want to put in post-1.0, then that affects the decision about what to do now (namely in terms of what to identify as "interim syntax until other language features land" versus "syntactic forms we all stand behind as being good regardless of what other features are likely to land in the future").

Aatch · 2014-03-25T23:12:08Z

@pnkfelix it does look better with associated items. However, there is technically no new syntax here, except I suppose the macro-in-type-position, but I don't think that should count. That was a part of my reasoning for it, as well. I guess the only major issue is the swizzle/shuffling syntax, which does require specific support from the compiler, however it is syntactically just field access. I'd be sad to see it go, though.

pnkfelix · 2014-03-26T04:31:18Z

@Aatch well, since you brought it up, regarding the field access syntax for swizzle/shuffle, I think part of my problem there is that it is a conceptual mismatch. You cannot take the address of a shuffle's field nor assign into it, right? (That was my motivation for suggesting the use of method invocations rather than field accesses for swizzles/shuffles in the team meeting.)

Is there some way in which field access is the "right thing" here? Or are you merely trying to avoid putting a trailing () (i.e. for the method invocation syntax) on a bunch of swizzle/shuffles?

Update: Sorry, I overlooked line 142 in the RFC: "The same field accessor syntax may be used to set arbitrary components of a vector all at once:"

Still not sure I actually like it, nonetheless. (But I can at least see why this is more appealing than a method.) E.g. you still hit the problem that you cannot take the address of these "fields".
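For comparison, the method-based alternative could look roughly like the sketch below. The type `Vec4` and the methods `zyx` and `set_xw` are hypothetical names chosen for illustration; they are not from the RFC. Because a swizzle is returned by value and a "shuffle assign" is an explicit setter, the question of taking the address of a swizzled "field" never comes up.

```rust
/// Hypothetical 4-lane vector with private storage.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Vec4 {
    lanes: [f32; 4],
}

impl Vec4 {
    pub fn new(x: f32, y: f32, z: f32, w: f32) -> Self {
        Vec4 { lanes: [x, y, z, w] }
    }

    /// Read swizzle: returns a fresh value, so there is nothing to borrow or take the address of.
    pub fn zyx(&self) -> [f32; 3] {
        [self.lanes[2], self.lanes[1], self.lanes[0]]
    }

    /// Write swizzle ("shuffle assign"): sets the x and w lanes in one call.
    pub fn set_xw(&mut self, v: [f32; 2]) {
        self.lanes[0] = v[0];
        self.lanes[3] = v[1];
    }

    pub fn to_array(self) -> [f32; 4] {
        self.lanes
    }
}
```

The cost is the trailing `()` and one method per combination, which is the trade-off being debated in this part of the thread.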

cartazio · 2014-03-26T04:39:50Z

https://github.com/mozilla/rust/wiki/Meeting-weekly-2014-03-25#simd better link for those trying to look up the thread via email

cartazio · 2014-03-26T04:40:21Z

I'm not sure how the shuffle complexity can be punted to macros (cf. the meeting notes); it would be interesting to see such a design.

dobkeratops · 2014-03-27T18:03:05Z

Very interesting stuff. I agree with the final point in the original linked post: it would be nice to have direct overrides to avoid relying on compiler optimization.

Q1) Would you consider supporting comparison operations that generate select masks?
That is, a comparison yielding a vectorized bool, plus a way to do selects. I guess many of the operators would just be vectorized. Would you catch the additional cases with more ! directives, or start adding intrinsic functions like C? Rust safety being what it is, how might you represent it in the type system? For example, with a: simd![f32,..4] and b: simd![f32,..4], write cmp: simd![bool32,..4] = a>b; c = simd_sel(cmp, d, e), where d and e are another 2 unrelated vectors. That could also have been written c = simd_sel(a>b, d, e).

Q2) Would you be supporting Intel Haswell gather (SIMD indexing)? It's less important to me since it's not widespread across platforms. (I think the vload/vstore are just misaligned loads?) Perhaps it would actually make sense to allow passing a vector index into a [] operator on a vector, yielding a vector, e.g. foo[index_vector] = simd![foo[index_vector[0]], foo[index_vector[1]], foo[index_vector[2]], foo[index_vector[3]]]. Or perhaps that's best left to an intrinsic :)

Q3) Tangential question: I had been wondering if you had considered a simple space in the type for [T,..N], e.g. [T N], or is that too ambiguous or just not clear? I realise why [T*N] didn't stay.

Q4) Would you have monstrous u128, u256, ... types that these can be coerced to? (I can see you wouldn't want to call them int/unsigned int, since they won't support int arithmetic; they'd just be raw bit patterns.) I suppose you'd coerce to simd![u32,..4] for vectorized bitwise operations, fancy compression/conversions and so on, but you want to say it's an operation on the bit pattern, not on a converted float.

I'm sure the idea of arbitrary data in raw bit fields is going to raise alarm bells, but sometimes data is flipped (AoS<->SoA): loaded from standard non-SIMD structs, permuted into SIMD vectors, worked on, permuted back and stored. In the middle you're saying you want to just transpose 4x4 blocks of memory, basically.
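The indexing idea in Q2 can be written out as a scalar reference version, which may help pin down the intended semantics. This is only a sketch: `gather` is a hypothetical helper over slices and arrays, it performs one ordinary load per lane, and it does not use or model the Haswell gather instructions.

```rust
/// Hypothetical gather: result lane `k` is `data[idx[k]]`.
/// Returns `None` if any index is out of bounds, instead of reading invalid memory.
pub fn gather<T: Copy, const N: usize>(data: &[T], idx: [usize; N]) -> Option<[T; N]> {
    if idx.iter().any(|&i| i >= data.len()) {
        return None;
    }
    // All indices were checked above, so the indexing below cannot panic.
    Some(idx.map(|i| data[i]))
}
```

A call such as `gather(&foo, [4, 0, 2, 2])` plays the role of `foo[index_vector]` in the question; whether that deserves operator syntax or an intrinsic is the open design point.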

Aatch · 2014-03-27T20:41:29Z

@dobkeratops

A1) To fit in with the type system, equality operations yielding vectors would be intrinsics, something like fn simd_eq<T, n>(v1: simd![T,..n], v2: simd![T,..n]) -> simd![bool,..n]. In LLVM IR, the result of a comparison is a vector of i1. Those vectors are what LLVM expects to be passed to its select instruction, which is used for generating blend instructions, and they are also what any intrinsic expecting a mask would take.
A2) Again, LLVM supports this via vectors of pointers, which is something I am currently undecided on supporting. I certainly see the utility; however, I would have to think about the best way to present the functionality. I have been careful to try to avoid limiting future extensions to SIMD support, though.
A3) I'm not sure what the question is here...
A4) Again, I'm not sure what the question is here. However, I would not support coercion. In fact, I removed a similar coercion from this RFC after finding that it didn't feel at all natural in the rest of Rust, which largely avoids most coercions.
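The compare-then-select pattern from Q1 and A1 can be sketched lane by lane as follows. The names `simd_gt` and `simd_select` are hypothetical, the "vectors" are plain arrays, and the mask is an array of `bool` standing in for LLVM's vector of i1; this illustrates the semantics and is not the proposed intrinsic API.

```rust
/// Lane-wise `a > b`, producing one boolean per lane (the mask).
pub fn simd_gt<T: PartialOrd + Copy, const N: usize>(a: [T; N], b: [T; N]) -> [bool; N] {
    std::array::from_fn(|i| a[i] > b[i])
}

/// Lane-wise select: takes the lane from `if_true` where the mask is set,
/// otherwise from `if_false`.
pub fn simd_select<T: Copy, const N: usize>(
    mask: [bool; N],
    if_true: [T; N],
    if_false: [T; N],
) -> [T; N] {
    std::array::from_fn(|i| if mask[i] { if_true[i] } else { if_false[i] })
}
```

In the notation of Q1, `c = simd_sel(a>b, d, e)` corresponds to `simd_select(simd_gt(a, b), d, e)`.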

cartazio · 2014-03-27T20:53:37Z

1. Certain classes of shuffle masks have to be fixed at compile time, or the CPU and LLVM will both barf. This isn't always true, but on most archs it is.
2. That should be an intrinsic.
3. No clue / opinion.
4. I'd be slightly meh about that, though LLVM does have "target lowering" that can do this. But this would have a lowering that could be very, very bad depending on the -march=FOO settings.

dobkeratops · 2014-03-27T21:12:30Z

@Aatch @cartazio let me clarify Q4 with some C pseudocode.
This is a technique that was efficient for some cases on the PS3 CELL:

struct Foo {
    f32 x; f32 y; f32 z; i32 flags;
    /* flags is some packed integer control data. It could be anything; the point is this is 3 floats, not 4.
       It could have been {posx,posy,posz, velx,vely,velz, flags, pad}.. whatever. */
};

void process_foo_x4(Foo* f0, Foo* f1, Foo* f2, Foo* f3, ..outputs..) {
    raw128simd_t r0 = (raw128simd_t&) *f0;
    raw128simd_t r1 = (raw128simd_t&) *f1;
    raw128simd_t r2 = (raw128simd_t&) *f2;
    raw128simd_t r3 = (raw128simd_t&) *f3;
    raw128simd_t x0123, y0123, z0123, flags0123;
    // just transposes 4x4 32-bit values, opaquely
    transpose_4x4(/*input:*/ r0, r1, r2, r3, /*output:*/ &x0123, &y0123, &z0123, &flags0123);
    // now operate on 4 elements, e.g. lengths
    len_squared_0123 = x0123*x0123 + y0123*y0123 + z0123*z0123;
    // store square lengths in output

    // could have been anything, e.g. doing some update on x/y/z, permuting it back to write out
}

I guess you might call the inner loop "AOSOA4" or something: batching scalar structs in 4's with permuting.
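The same AoS-to-SoA idea can be sketched in safe Rust by treating each struct as four raw 32-bit words, transposing, and reinterpreting the x/y/z rows as floats. This is a scalar illustration only: the function names and the use of `[u32; 4]` in place of `raw128simd_t` are assumptions made here, and no SIMD instructions are implied.

```rust
#[derive(Clone, Copy)]
pub struct Foo {
    pub x: f32,
    pub y: f32,
    pub z: f32,
    pub flags: i32, // packed integer control data, not a float
}

/// View one `Foo` as four opaque 32-bit patterns (stand-in for a raw 128-bit load).
fn raw(f: &Foo) -> [u32; 4] {
    [f.x.to_bits(), f.y.to_bits(), f.z.to_bits(), f.flags as u32]
}

/// Transpose a 4x4 block of 32-bit words without interpreting them.
pub fn transpose_4x4(rows: [[u32; 4]; 4]) -> [[u32; 4]; 4] {
    std::array::from_fn(|c| std::array::from_fn(|r| rows[r][c]))
}

/// Squared lengths of four structs at once, computed on the transposed (SoA) form.
pub fn len_squared_x4(foos: [&Foo; 4]) -> [f32; 4] {
    let [xs, ys, zs, _flags] = transpose_4x4(foos.map(raw));
    std::array::from_fn(|i| {
        // Only now are the bit patterns reinterpreted as floats.
        let (x, y, z) = (
            f32::from_bits(xs[i]),
            f32::from_bits(ys[i]),
            f32::from_bits(zs[i]),
        );
        x * x + y * y + z * z
    })
}
```

The explicit `to_bits`/`from_bits` calls mark the point made in Q4: the transpose operates on bit patterns, and the choice of float or integer interpretation is made separately.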

dobkeratops · 2014-03-27T21:15:51Z

Re the vectorized comparison: intrinsics would be perfectly OK, and maybe even preferable at this level of granularity.

(I have seen people use operator overloading in C++ for this, but I was never so keen on that. At that level C++ was a hazard, not a benefit; you're probably writing something where you want to reason about the asm more.)

cartazio · 2014-03-27T21:20:18Z

I agree that having the proper simd vector immediate type is a good idea, just that it shouldn't be conflated with some "really really wide (un)signed int" type.

dobkeratops · 2014-05-15T13:56:27Z

Is it complex to allow SIMD support for (T,T,T,T) tuples and struct {T x,T y,T z,T w} as well? Sometimes these are just more appealing than [T,..4], e.g. reserving bracket syntax for larger collections: [i].x vs [i][0]. This is just an issue of personal style, I know.

DiamondLovesYou · 2014-05-15T14:12:26Z

@dobkeratops Already done: use #[simd] on your type declarations.

pczarn · 2014-05-15T14:18:41Z

A distant idea of mine is

type UInt<N: static uint = 64, V: static uint = 1>
type Float<N: static uint, V: static uint = 1>

sparrisable · 2014-05-16T07:17:01Z

I came here from the reddit discussion: http://www.reddit.com/r/rust/comments/25mdvz/is_it_time_to_integrate_vectors_like_vec4f_as/

boost.simd is mentioned in there and another possible inspiration might be sierra for c++ : http://www.cdl.uni-saarland.de/papers/lhh14.pdf

I don't know if it is realistic to include a general abstract simd implementation in the scope of this rfc but I thought it would be nice to have the paper referenced.

brson · 2014-06-05T21:40:55Z

I would love to make progress on SIMD, but this is an incredibly important subject that we don't want to risk doing wrong while we're focused on more high-priority tasks.

Right now, the most promising way forward is to get an RFC that has minimal language impact, and that lets authors experiment with SIMD libraries out-of-tree. If somebody wants to keep pushing SIMD forward please propose an RFC that does nothing but add experimental intrinsics, no new types.

Closing.

pnkfelix · 2014-06-05T21:42:33Z

FYI, just to follow up on some discussion of swizzle syntax as proposed here, I made some macros for generating the full set of swizzle accessors in my rust-glm port, see here for example usage: https://github.com/pnkfelix/rust-glm/blob/17c4e4de2cc59b174ef644a3beeb1eca38365933/src/vector.rs#L1110

I haven't tried making mutators yet (i.e. a method for assigning to a swizzled name), but IMO I am not convinced that one really needs to use field access syntax to express this.
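A macro-generated set of swizzle accessors could be sketched along these lines. This is not the code from the linked rust-glm port; the macro name `swizzle2`, the `Vec3` type and the chosen method names are hypothetical, and only a handful of the possible two-component swizzles are generated.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Vec3 {
    pub x: f32,
    pub y: f32,
    pub z: f32,
}

/// Generates one read-only accessor method per `name => a, b;` entry.
macro_rules! swizzle2 {
    ($($name:ident => $a:ident, $b:ident;)*) => {
        impl Vec3 {
            $(
                pub fn $name(&self) -> [f32; 2] {
                    [self.$a, self.$b]
                }
            )*
        }
    };
}

// The full set would list every pair; four entries are enough to show the pattern.
swizzle2! {
    xy => x, y;
    yx => y, x;
    zx => z, x;
    zz => z, z;
}
```

Calls then read `v.zx()` rather than `v.zx`, which is the method-invocation form discussed earlier in the thread.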

SIMD Support

ef71699

Aatch added 2 commits March 20, 2014 16:46

Address Comments

a159858

Update SIMD RFC

c2b8f40

brson mentioned this pull request Apr 3, 2014

Alignment not respected on the heap rust-lang/rust#13094

Closed

pcwalton added a commit that referenced this pull request May 2, 2014

Assert is RFC #15

5e28254

brson closed this Jun 5, 2014

brson added the postponed label Jun 5, 2014

rust-highfive mentioned this pull request Sep 24, 2014

SIMD Support #280

Closed

petrochenkov added the T-lang label and removed the postponed label Feb 24, 2018

wycats pushed a commit to wycats/rust-rfcs that referenced this pull request Mar 5, 2019

Merge pull request rust-lang#15 from emberjs/ember-2.0-rfc

ae0215d

The Road to Ember 2.0 RFC
