
Don't make atomic loads and stores volatile #30962

Merged (1 commit, Feb 4, 2016)

Conversation

@Amanieu (Member) commented Jan 16, 2016

Rust currently emits atomic loads and stores with the LLVM volatile qualifier. This is unnecessary and prevents LLVM from performing optimization on these atomic operations.

@rust-highfive (Collaborator) commented

r? @alexcrichton

(rust_highfive has picked a reviewer for you, use r? to override)

@nagisa (Member) commented Jan 17, 2016

I do not think we can make this change anymore. It has the potential to (and probably will) silently and horribly break certain programs, e.g. those that rely on atomics as a stable way to do volatile loads/stores. We should add new intrinsics and functions for non-volatile operations instead.

(I think we also had an issue/rfc somewhere to add volatile/non-volatile differentiation for atomic operations).

@Amanieu (Member, Author) commented Jan 17, 2016

Does any code actually rely on this? The volatile semantics of atomic types were never documented anywhere and any code that would need it for MMIO is using nightly which has volatile read/write intrinsics.

Note that the volatile semantics only applied to atomic load and store, but not compare_and_swap or read-modify-write operations like fetch_add. This is very inconsistent if someone is relying on atomics providing volatile semantics.

Making atomic types have volatile semantics on some operations is a terrible default behavior to have, especially since it hurts compiler optimizations.

@nagisa (Member) commented Jan 17, 2016

Does any code actually rely on this?

The only way to really find out is to break a stable release and wait for complaints.

The volatile semantics of atomic types were never documented anywhere

I’ve seen atomic store/load recommended by, I think, @huonw (?) as a stable replacement for volatile loads and stores, at least once.

Note that the volatile semantics only applied to atomic load and store, but not compare_and_swap or read-modify-write operations like fetch_add. This is very inconsistent if someone is relying on atomics providing volatile semantics.

You needn't use volatile CAS/RMW at all if all you're doing is reading and writing bytes at a hardware port. Basically, one might rely on the volatility behaviour but not the atomicity behaviour, ignoring all of the more complex atomic operations for their use case.

Making atomic types have volatile semantics on some operations is a terrible default behavior to have, especially since it hurts compiler optimizations.

I do not disagree, I’m simply pointing out the potential hazard of making these ops non-volatile, now that their volatility, albeit undocumented, is stable.

@Zoxc (Contributor) commented Jan 17, 2016

Any code relying on atomics to be volatile is very misguided and is likely broken in other ways too. It is pretty well known that volatile is orthogonal to atomic operations.

@huonw (Member) commented Jan 17, 2016

@huonw (?)

Nope, although I did recommend against it in the thread on reddit, so you may've just mixed up the user names. :)

On that note, this change seems like the right choice to me. cc @rust-lang/libs and especially @aturon.

@aturon (Member) commented Jan 17, 2016

This seems reasonable to me, but I have to admit I don't fully grasp the subtleties around volatile accesses in LLVM. I think @rust-lang/compiler may be interested as well.

@Gankra (Contributor) commented Jan 17, 2016

Definition of LLVM's volatile, for reference: http://llvm.org/docs/LangRef.html#volatile-memory-accesses

Certain memory accesses, such as load's, store's, and llvm.memcpy's may be marked volatile. The optimizers must not change the number of volatile operations or change their order of execution relative to other volatile operations. The optimizers may change the order of volatile operations relative to non-volatile operations. This is not Java's "volatile" and has no cross-thread synchronization behavior.

IR-level volatile loads and stores cannot safely be optimized into llvm.memcpy or llvm.memmove intrinsics even when those intrinsics are flagged volatile. Likewise, the backend should never split or merge target-legal volatile load/store instructions.

Rationale
Platforms may rely on volatile loads and stores of natively supported data width to be executed as single instruction. For example, in C this holds for an l-value of volatile primitive type with native hardware support, but not necessarily for aggregate types. The frontend upholds these expectations, which are intentionally unspecified in the IR. The rules above ensure that IR transformations do not violate the frontend’s contract with the language.

@Gankra (Contributor) commented Jan 17, 2016

I have a feeling this is why Arc couldn't be trivially pwned by a mem::forget loop. A Sufficiently Smart compiler should be able to optimize an Arc mem::forget loop into a single atomic add (I think), but it never did.

@alexcrichton (Member) commented

To me this seems "correct" in a vacuum (atomics should not be volatile by default), and along those lines I think we should pursue what we need to do to make this change. Unfortunately crater would not be great at evaluating a change such as this, but this is also why we have nightly and beta periods (the change will bake for ~12 weeks).

I would personally be in favor of merging close to after we branch the next beta to give this maximal time to bake, and we should be ready to back it out if necessary, but like @Amanieu I would expect very little breakage.

@alexcrichton (Member) commented

An alternative, if it does cause breakage, would be to simultaneously add {load,store}_volatile which can be recommended if code breaks.

@whitequark (Member) commented

Note that sequentially consistent atomic operations provide guarantees that are a strict superset of volatile. Assuming no downstream code uses the more finely grained acquire/release semantics, this PR would cause no functional change.

@Amanieu (Member, Author) commented Jan 19, 2016

Note that sequentially consistent atomic operations provide guarantees that are a strict superset of volatile. Assuming no downstream code uses the more finely grained acquire/release semantics, this PR would cause no functional change.

That's not actually true. For example a compiler is allowed to optimize this:

x.fetch_add(2, SeqCst);
x.fetch_add(2, SeqCst);

into

x.fetch_add(4, SeqCst);

only if the operation is not volatile. Volatile semantics are only really useful for memory-mapped I/O, where the compiler must preserve the exact sequence of loads and stores in the program without merging or eliminating them.

@whitequark (Member) commented

Hm, you're right, "no functional change" was too strong of an assertion. Still, I think a look at downstream code would be in order.

@nikomatsakis (Contributor) commented

This is a tricky case. Given that the docs did not promise volatile semantics, I think we are within our rights to make this change, but it seems like it could affect some users -- it might be worth trying to advertise the change to the "O/S" and "embedded" communities, for example. (I'm not sure how best to reach those users.)


@huonw (Member) commented Jan 19, 2016

In practice, I expect there to be no functional change at the moment: compilers are generally pretty hands-off for atomics. For example, rustc doesn't optimise @Amanieu's example to a single instruction, and neither do any of gcc, clang or icc (for the C++11 equivalent below); it's always two lock xadd's.

#include <atomic>

void foo(std::atomic<long>& x) {
    x.fetch_add(2, std::memory_order_seq_cst);
    x.fetch_add(2, std::memory_order_seq_cst);
}

This is true even after pushing the ordering all the way down to relaxed.

And, e.g., LLVM's performance tips say:

Be wary of ordered and atomic memory operations. They are hard to optimize and may not be well optimized by the current optimizer. Depending on your source language, you may consider using fences instead.

(That's not to say there aren't examples that are optimised, but I haven't seen anything non-trivial: even x.store(1); x.store(2); isn't optimised to just x.store(2).)

@brson brson added the relnotes Marks issues that should be documented in the release notes of the next release. label Jan 20, 2016
@brson (Contributor) commented Jan 20, 2016

I also think we can do this as a bugfix, but we should advertise it.

@nikomatsakis (Contributor) commented

Given @huonw's comment, it seems like there's no reason not to make the change (but yes, advertise it).


@sfackler sfackler added I-nominated T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jan 21, 2016
@sfackler (Member) commented

Nominating for discussion by the libs and lang teams. It sounds like people are on board with this, but it's probably a good idea to make sure everyone is aware of what's going on.

@sfackler sfackler added T-lang Relevant to the language team, which will review and decide on the PR/issue. and removed T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jan 21, 2016
@Gankra (Contributor) commented Jan 21, 2016

Didn't the libs team already agree that this is fine?

@aturon (Member) commented Jan 30, 2016

The lang team is on board with this change. The main remaining question is whether to try to offer this functionality by some other means before landing the change (as suggested by @alexcrichton).

@briansmith (Contributor) commented

This is a tricky case. Given that the docs did not promise volatile semantics, I think we are within our rights to make this change

FWIW, I agree with the above.

The main remaining question is whether to try to offer this functionality by some other means before landing the change (as suggested by @alexcrichton).

I don't think it is worth the effort. These are early days for Rust. People need to write code based on what is guaranteed in the language spec/docs, not on what the compiler accidentally does. This change isn't going to break anybody who wrote correct code in the first place.

@alexcrichton (Member) commented

The libs team discussed this during triage today and the conclusion was that everyone seems on board, and the only question is whether we need to also add {load,store}_volatile like I proposed above. At this time that doesn't seem urgent or pressing, so we decided to move forward with the PR.

@bors: r+ 01112d1

Thanks @Amanieu!

@bors (Contributor) commented Feb 4, 2016

⌛ Testing commit 01112d1 with merge 1096e7a...

bors added a commit that referenced this pull request Feb 4, 2016
Rust currently emits atomic loads and stores with the LLVM `volatile` qualifier. This is unnecessary and prevents LLVM from performing optimization on these atomic operations.
@bors bors merged commit 01112d1 into rust-lang:master Feb 4, 2016