Implement Intel HLE and RTM intrinsics #718

gnzlbg · 2019-04-14T08:21:39Z

@mtak- mentioned _xbegin recently; we should probably expose all of the RTM intrinsics via core::arch.

The Intel RTM (Restricted Transactional Memory) intrinsics (Intel Intrinsics Guide, clang header; CPUID feature RTM: leaf EAX=7, ECX=0 (Extended Features), EBX bit 11):

void _xabort (const unsigned int imm8) (assert_instr(xabort), llvm.x86.xabort)
unsigned int _xbegin (void) (assert_instr(xbegin), llvm.x86.xbegin, note: can return multiple times)
void _xend (void) (assert_instr(xend), llvm.x86.xend)
unsigned char _xtest (void) (assert_instr(xtest), llvm.x86.xtest)

We'd need to whitelist the rtm feature in rust-lang/rust as part of this.
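For reference, the CPUID bit listed above can be checked at runtime with the stable `__cpuid_count` intrinsic. This is only a minimal sketch: the helper names `ebx_reports_rtm` and `cpu_has_rtm` are made up for illustration and are not part of core::arch.

```rust
/// Bit 11 of EBX in CPUID leaf 7, sub-leaf 0 advertises RTM.
const RTM_EBX_BIT: u32 = 11;

/// Hypothetical pure helper: does this EBX value (from CPUID EAX=7, ECX=0)
/// have the RTM bit set?
fn ebx_reports_rtm(ebx: u32) -> bool {
    (ebx & (1 << RTM_EBX_BIT)) != 0
}

/// Hypothetical helper: queries CPUID on x86_64.
/// `unused_unsafe` is allowed because newer toolchains may treat the
/// CPUID intrinsics as safe functions.
#[cfg(target_arch = "x86_64")]
#[allow(unused_unsafe)]
fn cpu_has_rtm() -> bool {
    use core::arch::x86_64::{__cpuid, __cpuid_count};
    // Leaf 0 returns the highest supported basic leaf in EAX.
    let max_leaf = unsafe { __cpuid(0).eax };
    if max_leaf < 7 {
        return false;
    }
    let leaf7 = unsafe { __cpuid_count(7, 0) };
    ebx_reports_rtm(leaf7.ebx)
}

/// On every other architecture, report that RTM is unavailable.
#[cfg(not(target_arch = "x86_64"))]
fn cpu_has_rtm() -> bool {
    false
}
```

Calling `cpu_has_rtm()` before taking a transactional code path is the runtime counterpart to enabling the rtm target feature at compile time; the bit can also be clear on CPUs where TSX has been disabled.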

I've asked on the LLVM Bugzilla whether xbegin needs to be marked returns_twice: https://bugs.llvm.org/show_bug.cgi?id=41493. Craig Topper suggested on IRC submitting an LLVM patch to mark this intrinsic / clang intrinsic as returns_twice (cc @ctopper, I hope I got the GitHub id right), but @TNorthover mentioned that this might be neither necessary nor sufficient. We don't have to figure this out for the initial implementation since, as @mtak- mentions, clang does not do this, but we should not forget about this issue.


TNorthover · 2019-04-14T10:42:01Z

I've done some more thinking, and I now don't think returns_twice is the right model at all. The only way the processor is getting back to the xbegin itself is via normal control flow that LLVM can see.

I think what we actually have is something akin to a call that might throw an exception. From LLVM's perspective, either the transaction eventually succeeds, in which case the xbegin has acted like a normal intrinsic call, or it fails, in which case execution proceeds from the landingpad as if nothing had happened.

Someone would need to investigate how well the existing landingpad actually fits in with what happens, though. Key questions to answer will be:

- Current landingpads in use provide two values to the handler. One is usually (but not always) in rax so should be good; is the other just harmlessly undef, or will its presence break things? Is it even needed?
- What registers are preserved through to the landingpad and how does LLVM know? It could either be special logic, or inherited from the assumptions about the preceding call.
- I think landingpads need a personality function right now, but there isn't really one here. As far as I know it's only used for unwinding metadata, which we also don't need. So perhaps the real issue here is to make sure we don't try to generate that metadata for RTM invokes.

xabort should probably be noreturn, though I don't think that would affect correctness; it'll just allow more dead code elimination.

gnzlbg · 2019-04-14T12:30:50Z

cc @Amanieu: for the abort intrinsic to be noreturn, it has to return the never type in Rust ('fn() -> !').

Amanieu · 2019-04-14T12:57:06Z

If _xabort is used outside of a transaction then it acts as a no-op, so it shouldn't return !.

gnzlbg · 2019-04-14T14:09:01Z

Indeed, then it can't be noreturn.

mtak- · 2019-04-14T17:21:15Z

In case it's helpful here's swym-htm's llvm bindings:
https://github.com/mtak-/swym/blob/b854eed11cc99b8551168934aeeb1f05ee1e04b2/swym-htm/src/x86_64.rs#L16

The signatures don't match the list above. Taken from llvm here (is there a better source?):
https://github.com/llvm-mirror/llvm/blob/993ef0ca960f8ffd107c33bfbf1fd603bcf5c66c/test/CodeGen/X86/rtm.ll
https://github.com/llvm-mirror/llvm/blob/993ef0ca960f8ffd107c33bfbf1fd603bcf5c66c/test/CodeGen/X86/xtest.ll

TNorthover · 2019-04-14T17:32:16Z

Oops, it looks like I didn't investigate just what @llvm.x86.xbegin did properly. It's a slightly higher level wrapper over the xbegin instruction that does seem to re-merge both success and failure paths into its return value.

So returns_twice is looking a lot more sensible again for that particular intrinsic now (though I still think an invoke-like interface would be more powerful and natural).
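Since the intrinsic merges both paths into its return value, a caller ends up branching on that value. Here is a rough sketch of what that decoding could look like in Rust. The constant values mirror the `_XBEGIN_STARTED` / `_XABORT_*` macros from Intel's headers (they are not quoted anywhere in this thread), and `BeginOutcome` and `classify` are hypothetical names used only for illustration.

```rust
/// Value returned by `_xbegin` when the transaction has started (all bits set).
const XBEGIN_STARTED: u32 = !0;
/// Abort was caused by an explicit `_xabort`; its imm8 is in bits 31:24.
const XABORT_EXPLICIT: u32 = 1 << 0;
/// The transaction may succeed if retried.
const XABORT_RETRY: u32 = 1 << 1;
/// Another logical processor conflicted with memory used in the transaction.
const XABORT_CONFLICT: u32 = 1 << 2;
/// An internal buffer overflowed.
const XABORT_CAPACITY: u32 = 1 << 3;

/// Hypothetical decoded view of the status word.
#[derive(Debug, PartialEq, Eq)]
enum BeginOutcome {
    Started,
    Aborted {
        explicit_code: Option<u8>,
        may_retry: bool,
        conflict: bool,
        capacity: bool,
    },
}

/// Hypothetical helper: turn the raw status into a structured outcome.
fn classify(status: u32) -> BeginOutcome {
    if status == XBEGIN_STARTED {
        return BeginOutcome::Started;
    }
    // The abort code is only meaningful when the EXPLICIT bit is set.
    let explicit_code = if (status & XABORT_EXPLICIT) != 0 {
        Some((status >> 24) as u8)
    } else {
        None
    };
    BeginOutcome::Aborted {
        explicit_code,
        may_retry: (status & XABORT_RETRY) != 0,
        conflict: (status & XABORT_CONFLICT) != 0,
        capacity: (status & XABORT_CAPACITY) != 0,
    }
}
```

This only decodes a value that has already been returned; it says nothing about how LLVM should model the control flow that produced it.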

mtak- · 2019-04-14T18:03:21Z

@TNorthover What is special from an LLVM perspective about _xbegin?

IIUC xbegin is a lot like a read from volatile memory and then a branch based on the value read (was the transaction started or aborted). Even though we know it might take both branches, it's "as-if" only one were taken for any single call to _xbegin. Very much the same as the CPU speculating that it might take a certain branch, and then rolling that back when it realizes that it mispredicted the branch.
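To make the "as-if" view concrete, here is a small sketch of the usual retry-then-fallback pattern, written against a closure instead of the real intrinsic so it runs anywhere. The `begin` parameter is a hypothetical stand-in for `_xbegin`, `run_with_retries` and `Path` are made-up names, and the constants are assumed to match Intel's `_XBEGIN_STARTED` and `_XABORT_RETRY`.

```rust
/// Assumed to match Intel's `_XBEGIN_STARTED` (all bits set).
const XBEGIN_STARTED: u32 = !0;
/// Assumed to match Intel's `_XABORT_RETRY`.
const XABORT_RETRY: u32 = 1 << 1;

/// Which path the caller ended up on.
#[derive(Debug, PartialEq, Eq)]
enum Path {
    Transactional,
    Fallback,
}

/// Hypothetical driver: each call to `begin` yields exactly one status,
/// and the caller branches on it like on any other value it has read.
/// Returns the path taken and the number of attempts made.
fn run_with_retries(mut begin: impl FnMut() -> u32, max_attempts: u32) -> (Path, u32) {
    let mut attempts = 0;
    while attempts < max_attempts {
        attempts += 1;
        let status = begin();
        if status == XBEGIN_STARTED {
            // With real RTM, the transactional body and `_xend` would go here.
            return (Path::Transactional, attempts);
        }
        if (status & XABORT_RETRY) == 0 {
            // The status does not suggest that retrying would help.
            break;
        }
    }
    // With real RTM, this is where a lock-based fallback would run.
    (Path::Fallback, attempts)
}
```

The mock cannot reproduce the rollback itself; it only shows that, from the caller's side, every call observes a single status and takes a single branch.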

whitelist RTM x86 target cpu feature: This PR adds support for Intel's restricted transactional memory CPU feature. I mostly copied what was done for the [movbe](rust-lang#57999) feature. rust-lang/stdarch#718

mtak- mentioned this issue Apr 18, 2019

whitelist RTM x86 target cpu feature rust-lang/rust#60060

Merged

gnzlbg added the feature-request label Apr 18, 2019
