Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add intrinsics & target features for rd{rand,seed} #38561

Merged
merged 1 commit into from
Feb 14, 2017

Conversation

nagisa
Copy link
Member

@nagisa nagisa commented Dec 22, 2016

One question is whether or not we want to map feature name rdrnd to rdrand instead.

EDIT: as for use case, I would like to port my rdrand crate from inline assembly to these intrinsics.

@rust-highfive
Copy link
Collaborator

r? @brson

(rust_highfive has picked a reviewer for you, use r? to override)

@pnkfelix pnkfelix added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Jan 3, 2017
@pnkfelix
Copy link
Member

pnkfelix commented Jan 3, 2017

@eddyb points out that these are platform-intrinsics (not Rust intrinsics), and platform-intrinsics are "sort of half-compiler half-libs"

@pnkfelix pnkfelix added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Jan 3, 2017
@pnkfelix
Copy link
Member

pnkfelix commented Jan 3, 2017

@rfcbot fcp merge

@rfcbot
Copy link

rfcbot commented Jan 3, 2017

Team member @pnkfelix has proposed to merge this. The next step is review by the rest of the tagged teams:

Concerns:

Once these reviewers reach consensus, this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@brson
Copy link
Contributor

brson commented Jan 3, 2017

I question whether the use case is strong enough, and the wisdom of continuing to splat every possible intrinsic into the rust spec. There is nothing wrong with using assembly code. If we need to make using assembly easier we should.

@brson
Copy link
Contributor

brson commented Jan 3, 2017

@rfcbot concern Adding more intrinsics instead of using assembly

Per my above comment.

@eddyb
Copy link
Member

eddyb commented Jan 3, 2017

@brson Keep in mind these are platform-intrinsics which strive to be complete, as opposed to regular Rust intrinsics which expose more general architecture-independent functionality.
We might in the future expose something much like the vendor's own C APIs to have the hypothetical Rust spec defer to them instead of having our own lists that differ in subtle ways.

@nrc
Copy link
Member

nrc commented Jan 6, 2017

I don't understand what 'platform intrinsic' means here, nor what it means to be complete. Could someone explain please?

@nagisa
Copy link
Member Author

nagisa commented Jan 6, 2017

Platform intrinsic is a set of functions which expose instructions supported by underlying hardware. Notable examples of such platform intrinsics are operations on the SSE vectors. rd{rand,seed} is a rarer instruction but is essentially in the same bucket as all the SSE intrinsics. These have ABI "platform-intrinsic" and differ from the "rust-intrinsic" in many ways – most notably call of a platform-intrinsic function usually ends up being exactly one instruction. Call to a rust-intrinsic causes arbitrary codegen (as implemented in trans) to happen.

For the platform-intrinsics to be complete it would mean having intrinsics for every instruction set extension supported by a supported CPU target. That means platform-intrinsics would exclude base ISA and the insns which the compiler would generate by itself anyway, but would include esoteric stuff like AVX512, rdrand, SHA, AES, etc.

There’s multiple points in favour of platform-intrinsics compared to the inline assembly:

  • Inline assembly in rust is broken and fixing that is non-trivial;

  • Writing inline assembly even for a single insn correctly is non-trivial to get right. Forgot that memory/flag clobber? Rockets get launched and world wars get started.

    In contrast there’s no way to call a platform-intrinsic function incorrectly if it type-checks;

  • Calling platform-intrinsics also gives considerably better diagnostics in case of mistakes;

  • And finally, inline assembly inhibits optimisation and, potentially, optimal register allocation as well. Inline assembly is black-box to LLVM, platform intrinsics are not.


You could also do SSE with inline assembly ;)

@eddyb
Copy link
Member

eddyb commented Jan 6, 2017

@nrc They're exposing similar per-platform functionality to what each vendor describes in terms of C APIs.
In the case of this PR, it's these intrinsics that are being added to the (already pretty large) list.
The ideal final result would be to support every single one listed on those pages, and then the same for every single architecture. This is not as automatable as we'd like, sadly.

Also, the original champion, @huonw, is no longer able to routinely get involved in any of this.

@nikomatsakis
Copy link
Contributor

I think the question that @brson is getting at is probably -- to what extent are we committed to supporting these forever (i.e., how publicly exposed are these "platform intrinsics").

That said, a few things:

  • I definitely agree with @nagisa that there are many advantages to the intrinsics.
  • As long as they are in close affinity to actual CPU instructions (which seems to be the case), it seems like the portability hazard is very small. Every backend must support inline assembly (because that is offered by Rust), which means that, at worst, the backend can support the intrinsic as being mapped to an inline assembly. This won't be as readily optimized, but it ought to work.

So basically I'm feeling pretty positive about "platform intrinsics" as a thing we offer, but I also kind of feel ... surprised. I guess I was just unaware of their existence. =) They do seem like a thing we should decide about affirmatively, once and for all. i.e., deciding piecemeal whether to include a particular platform intrinsic feels silly. What is at issue is whether the concept* of platform intrinsics is something we want, no?

Why do I feel so ignorant of this? Have they been in there forever and I just forgot/never-knew about them?

@BurntSushi
Copy link
Member

BurntSushi commented Jan 6, 2017

@nikomatsakis I'm hoping we can answer this question once and for all in the stabilization effort for SIMD: https://internals.rust-lang.org/t/getting-explicit-simd-on-stable-rust/4380 --- You can see my current implementation work here (which is based on the consensus from that thread): https://github.com/BurntSushi/stdsimd --- The TL;DR is that we expose normal Rust functions (in std) that adhere to the vendor intrinsic C APIs but use LLVM intrinsics (and other tricks, examples) as internal implementation details. AIUI, this would replace our existing "platform intrinsic" story.

My plan is to write up an RFC for this, but I haven't gotten to it yet. In the mean time, I feel like it's probably not hurting anyone to continue to add to the existing platform intrinsics.

@BurntSushi
Copy link
Member

In the mean time, I feel like it's probably not hurting anyone to continue to add to the existing platform intrinsics.

On the other hand, if people are resigned to nightly Rust anyway, then they shouldn't have to add new platform intrinsics. They can link to LLVM intrinsics directly. @nagisa Is there a reason why you can't just use the LLVM intrinsics directly in your crate? (At least until something is properly stabilized.)

@nagisa
Copy link
Member Author

nagisa commented Jan 7, 2017

@nagisa Is there a reason why you can't just use the LLVM intrinsics directly in your crate?

I indeed could. The issue comes with the target feature whitelist which you need to be able to properly lower the intrinsics and adding entries to whitelist doesn’t make much sense to me without also adding corresponding intrinsics.


EDIT: using LLVM intrinsics directly also requires using unadjusted ABI (yet another perma-unstable thing), as the return type of the intrinsic is non-trivial (has to be precisely { i<bitwidth>, i32 } at LLVM level).

@nagisa
Copy link
Member Author

nagisa commented Jan 19, 2017

Still interested in making this land.

@BurntSushi
Copy link
Member

I also think this is OK to do. @brson Could you elaborate a bit more on your concern?

@brson
Copy link
Contributor

brson commented Jan 20, 2017

I also think this is OK to do. @brson Could you elaborate a bit more on your concern?

Rust has a history of upstreaming every LLVM feature without thinking, and regretting it. Every feature should be added to Rust on its own merits, not because it is technically possible. With platform intrinsics we are in the process of making yet another unstable mess that we have to extract ourselves from someday.

As @nagisa states "for the platform-intrinsics to be complete it would mean having intrinsics for every instruction set extension supported by a supported CPU target". The way I read that is "literally every minor CPU instruction will have an intrinsic". I'd suggest that needs some deep consideration, not blindly walking down that path.

For the ops stated use case, porting from inline assembly to platform intrinsics, the only obvious advantage I see is reducing duplicate inline-assembly blocks - neither inline assembly nor platform intrinsics is going to be stable any time soon, so @nagisa's crate will continue to not work on stable Rust. I don't find any of the arguments for platform intrinsics to be particularly compelling in this case - a few rand instructions is not hard to get right and if LLVM can optimize rand instructions let's see proof.

That said, we have added other platform intrinsics for obscure things, like PTX I think, (though rejecting those would also be a totally reasonable decision).

@arielb1
Copy link
Contributor

arielb1 commented Jan 20, 2017

@brson

When did we regret adding a platform intrinsic? Inline assembly sucks because it is type-unsafe.

@BurntSushi
Copy link
Member

@brson Note that platform intrinsics aren't necessarily just llvm features, their interface is defined by external vendors.

While I don't think platform intrinsics should be our future, I do think at least adding the cpu features should be fine, no? (That's not really an llvm thing either, since AIUI, those feature names are also defined by vendors as well.)

@BurntSushi
Copy link
Member

@nagisa In the interest of not adding more platform intrinsics that we might end up abandoning anyway, might you be willing to remove those but keep the added CPU targets? That way, you can link to the llvm intrinsics directly until we iron out the SIMD stability story, but still avoid writing inline assembly.

@nagisa
Copy link
Member Author

nagisa commented Jan 24, 2017

That way, you can link to the llvm intrinsics directly until we iron out the SIMD stability story, but still avoid writing inline assembly.

I said this already, this is tricky – these intrinsics are one of the very few which return not a single value but rather a pair (at LLVM level a { i{bitwidth}, i32 }), which is very hard to achieve from Rust as is.

@BurntSushi
Copy link
Member

BurntSushi commented Jan 24, 2017

@nagisa I didn't see your edit above. Namely:

EDIT: using LLVM intrinsics directly also requires using unadjusted ABI (yet another perma-unstable thing), as the return type of the intrinsic is non-trivial (has to be precisely { i, i32 } at LLVM level).

Could you say more about this? Why doesn't (i<bitwidth>, i32) work?

The other interesting piece here is that the platform intrinsics you've defined don't conform to the vendor intrinsic API, which store the random value in the pointer passed as a parameter. The return value is just the "success" indicator.

@nagisa
Copy link
Member Author

nagisa commented Jan 24, 2017

Could you say more about this? Why doesn't (i, i32) work?

Because handling of (ixy, i32) is ABI dependent, and not always C ABI will be equal to { ixy, i32 } in LLVM. Rust ABI is not specified at all, so that does not work. unadjusted ABI is going away as soon as I get around fixing LLVM’s handling of i128 intrinsics on windows. The other x86 ABIs are invalid for declaring LLVM intrinsics.

The other interesting piece here is that the platform intrinsics you've defined don't conform to the vendor intrinsic API, which store the random value in the pointer passed as a parameter. The return value is just the "success" indicator.

Fair point. Thinking about it, matching the C intrinsic API would involve promoting these to regular intrinsics – storing random number to a pointee needs extra store insn to get translated, which is not accommodated by the current platform-intrinsics infrastructure.

@BurntSushi
Copy link
Member

BurntSushi commented Jan 24, 2017

Because handling of (ixy, i32) is ABI dependent, and not always C ABI will be equal to { ixy, i32 } in LLVM. Rust ABI is not specified at all, so that does not work. unadjusted ABI is going away as soon as I get around fixing LLVM’s handling of i128 intrinsics on windows. The other x86 ABIs are invalid for declaring LLVM intrinsics.

Could you recommend any reading I could do on this topic? I feel like I still don't fully grok the issue here (nor the failure modes), and that somewhat frightens me (because I am formulating an RFC for SIMD).

For example, you said it would be "hard to link to llvm intrinsics directly." Could you humor me and say more about what you would actually have to do? (I have very little experience working with llvm.)

@BurntSushi
Copy link
Member

Fair point. Thinking about it, matching the C intrinsic API would involve promoting these to regular intrinsics – storing random number to a pointee needs extra store insn to get translated, which is not accommodated by the current platform-intrinsics infrastructure.

Yeah. I believe this is what Clang does for these specific intrinsics.

@nagisa
Copy link
Member Author

nagisa commented Jan 24, 2017

Could you recommend any reading I could do on this topic?

There’s not a single place which would have all the particularities written down on this topic, sadly. If you want to piggy-back on any of the ABIs supported by rust currently (other than unadjusted), you need to be aware of how they map to LLVM (they don’t always map to intuitively the same thing. C ABI adjustments rustc does (also the cabi* files in the same directory) are a good place to look at to find out why the C ABI is unsuitable in particular. You could also just read the C ABI specs for each of the targets.

Rust ABI is obviously unsuitable. Even if (i16, i32) might map to {i16, i32} in LLVM today, it will become {i32, i16} as soon as layout optimisations get involved, for example. You cannot use a repr(C) struct either, as it might map to, say, {i32, i16, [i8x2]} eventually, and that’s not the same as {i32, i16} LLVM expects.

(Failure mode) Luckily LLVM will complain to you very nicely by aborting if the intrinsic declaration does not match what LLVM expect, so there’s that.

In the end I believe having a direct way to specify how the declarations will end up looking in LLVM is the best. That’s what platform-intrinsics infrastructure achieves well. The caveat is that LLVM declaration won’t always match the C intrinsic signature. That’s true for rdrand intrinsics and it might also be true for some other (cough, SIMD) intrinsics as well.

@brson
Copy link
Contributor

brson commented Jan 24, 2017

@BurntSushi @nagisa My stance on this has softened, and I'm accepting that we are adding platform intrinsics and cpu features at will for the present, and will figure out how to rationalize a stable interface for it later. Do as you like with this PR.

@BurntSushi
Copy link
Member

All right. Given @nagisa's points, I'm also OK with it as is. This was a good PR, because it raised an interesting issue that I'll need to address in the RFC if we are to remove platform intrinsics.

@Aatch, @arielb1, @bkoropoff, @dotdash Any concerns?

@alexcrichton
Copy link
Member

Discussed during libs triage today our conclusion was to move forward with this PR, so:

@bors: r+

@bors
Copy link
Contributor

bors commented Feb 14, 2017

📌 Commit b2cf6df has been approved by alexcrichton

@bors
Copy link
Contributor

bors commented Feb 14, 2017

⌛ Testing commit b2cf6df with merge 61b93bd...

bors added a commit that referenced this pull request Feb 14, 2017
Add intrinsics & target features for rd{rand,seed}

One question is whether or not we want to map feature name `rdrnd` to `rdrand` instead.

EDIT: as for use case, I would like to port my rdrand crate from inline assembly to these intrinsics.
@bors
Copy link
Contributor

bors commented Feb 14, 2017

☀️ Test successful - status-appveyor, status-travis
Approved by: alexcrichton
Pushing 61b93bd to master...

@bors bors merged commit b2cf6df into rust-lang:master Feb 14, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.