LLVM loop optimization can make safe programs crash #28728
Comments
The LLVM IR of the optimised code is:

; Function Attrs: noreturn nounwind readnone uwtable
define internal void @_ZN4main20h5ec738167109b800UaaE() unnamed_addr #0 {
entry-block:
  unreachable
}

This kind of optimisation breaks the main assumption that should normally hold on uninhabited types: it should be impossible to have a value of that type. |
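For readers who have not met uninhabited types, here is a minimal sketch of the assumption being discussed. All names (`Void`, `absurd`, `never_returns`, `unwrap_infallible`) are hypothetical, and this is not the code from the linked gist. It only shows why a `match` with zero arms is accepted by the compiler, and why that relies on a function returning `Void` never actually returning.

```rust
// Hypothetical names throughout; this is an illustration, not the gist's code.

/// An uninhabited type: it has no variants, so no value of it can be built.
enum Void {}

/// Since `Void` has no values, a match with zero arms is exhaustive.
/// The compiler is entitled to treat this body as unreachable.
fn absurd(v: Void) -> ! {
    match v {}
}

/// In safe code, the only way to "return" a `Void` is to never return.
/// If an optimizer deletes this loop, a caller would continue into code
/// that was assumed to be unreachable. It is deliberately not called here.
#[allow(dead_code)]
fn never_returns() -> Void {
    loop {}
}

/// A sound use of the same assumption: the `Err` case can never occur.
fn unwrap_infallible(r: Result<u32, Void>) -> u32 {
    match r {
        Ok(x) => x,
        Err(v) => absurd(v),
    }
}

fn main() {
    let value = unwrap_infallible(Ok(5));
    println!("{}", value);
}
```

The sketch stays sound only as long as `never_returns` really loops forever, which is exactly the property the optimisation described in this thread removes.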
triage: I-nominated Seems bad! If LLVM doesn't have a way to say "yes, this loop really is infinite" though then we may just have to sit-and-wait for the upstream discussion to settle. |
A way to prevent infinite loops from being optimised away is to add |
Is this related to #18785? That one's about infinite recursion being UB, but it sounds like the fundamental cause might be similar: LLVM doesn't consider not halting to be a side effect, so if a function has no side effects other than not halting, it's happy to optimize it away. |
It's the same issue. |
Yes, looks like it's the same. Further down that issue, they show how to get |
👍 |
Crash, or, possibly even worse, heartbleed: https://play.rust-lang.org/?gist=15a325a795244192bdce&version=stable |
So I've been wondering how long until somebody reports this. :) In my opinion, the best solution would of course be if we could tell LLVM not to be so aggressive about potentially infinite loops. Otherwise, the only thing I think we can do is to do a conservative analysis in Rust itself that determines whether:
Either of these should be enough to avoid undefined behavior. |
triage: P-medium We'd like to see what LLVM will do before we invest a lot of effort on our side, and this seems relatively unlikely to cause problems in practice (though I have personally hit this while developing the compiler as well). There are no backwards incompatibility issues to be concerned about. |
Quoting from the LLVM mailing list discussion:
|
@dotdash The excerpt you are quoting comes from the C++ specification; it is basically the answer to "how it [having side effects] is defined in C" (also confirmed by the standard committee: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1528.htm ). Regarding what is the expected behaviour of the LLVM IR there is some confusion. https://llvm.org/bugs/show_bug.cgi?id=24078 shows that there seems to be no accurate & explicit specification of the semantics of infinite loops in LLVM IR. It aligns with the semantics of C++, most likely for historical reasons and for convenience (I only managed to track down https://groups.google.com/forum/#!topic/llvm-dev/j2vlIECKkdE which apparently refers to a time when infinite loops were not optimised away, some time before the C/C++ specs were updated to allow it). From the thread it is clear that there is the desire to optimise C++ code as effectively as possible (i.e. also taking into account the opportunity to remove infinite loops), but in the same thread several developers (including some that actively contribute to LLVM) have shown interest in the ability to preserve infinite loops, as they are needed for other languages. |
@ranma42 I'm aware of that, I just quoted that for reference, because one possibility to work-around this would be to detect such loops in rust and add one of the above to it to stop LLVM from performing this optimization. |
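A rough sketch of what "a loop the optimizer has to keep" can look like in source code. It assumes the quoted list covers the usual C++ forward-progress items (library I/O calls, volatile accesses, atomic or synchronization operations). The function name `spin_until_set` is hypothetical. This is not what rustc inserts; it only illustrates a loop whose every iteration performs an atomic operation.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Hypothetical helper: spins until `flag` becomes true and returns how many
/// iterations it waited. Each iteration performs an atomic load, which is an
/// observable operation rather than a side-effect-free loop body.
fn spin_until_set(flag: &AtomicBool) -> u64 {
    let mut spins = 0u64;
    while !flag.load(Ordering::Acquire) {
        // Hint to the CPU that this is a busy-wait loop.
        std::hint::spin_loop();
        spins = spins.wrapping_add(1);
    }
    spins
}

fn main() {
    let flag = Arc::new(AtomicBool::new(false));

    // Another thread sets the flag after a short delay.
    let setter = {
        let flag = Arc::clone(&flag);
        thread::spawn(move || {
            thread::sleep(Duration::from_millis(10));
            flag.store(true, Ordering::Release);
        })
    };

    let spins = spin_until_set(&flag);
    setter.join().unwrap();
    println!("flag observed after {} spins", spins);
}
```

Whether an analysis in rustc could reliably find the loops that need such treatment is the open question in this thread; the sketch says nothing about that part.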
Is this a soundness issue? If so, we should tag it as such. |
Yes, following @ranma42's example, this way shows how it readily defeats array bounds checks. playground link |
The policy is that wrong-code issues that are also soundness issues (i.e. most of them) should be tagged |
* Update rrt0 and remove the empty loop - Allows the rrt0 start function to panic when main returns - Fixes undefined behavior - See: rust-lang/rust#28728 * Fix link in doc comment
I don't see it mentioned here: LLVM now has a "mustprogress" attribute:
It appears to implement the proposal from way back in 2015, wherein C/C++ must annotate their functions "mustprogress" to enable these types of optimizations. The way it reads to me is that Rust should stop miscompiling these situations in the future just by not using this attribute. Edit: oh I see @RalfJung linked this previously in some collapsed comments. |
This should be mostly fixed in LLVM 12 by https://reviews.llvm.org/D94106. There's still a minor issue remaining (side-effects can be hoisted across infinite loops). |
The upstream issue is marked as fixed now. https://bugs.llvm.org/show_bug.cgi?id=965 |
Is there an LLVM bug to track this issue? (I presume it's not something rustc can address on its own?) Also, if this issue were fully addressed, would we want to revert #77972? If so, that would be worth tracking somewhere. Even though it appears to cause no noticeable regressions, not doing unnecessary work seems valuable. |
This issue is fixed by https://reviews.llvm.org/D95288. There might be more places with problematic assumptions, but at least I'm not aware of any. (In default-enabled passes that is. The attributor has known issues.)
We shouldn't revert that yet as Rust also supports older LLVM versions, but we can limit sideeffect insertion to older versions at least. |
I just found the following miscompilation resulting from this bug, which simply compiles to

pub fn oops() -> u32 {
    (0..).sum() // or .last().unwrap() instead of .sum()
}

It's a minified version of the following snippet where I mistyped

pub fn foo(end: u32) -> u32 {
    (0..)
        .map(|i| i * i)
        .filter(|&i| i < end)
        .sum()
}
|
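As a side note on why `foo` never finishes: `filter` keeps pulling from the unbounded range `(0..)` even after the squares exceed `end`, so `sum` never sees the end of the iterator. One possible terminating variant is sketched below, on the assumption that the intent was to stop at the first square that is not below `end`. The name `sum_squares_below` and the switch to `u64` (to keep the squares and the sum from overflowing) are my own choices, not part of the original comment.

```rust
/// Hypothetical variant of `foo`: sums all squares strictly below `end`.
/// `take_while` ends the iterator at the first square >= `end`, so the
/// pipeline is finite. Arithmetic is done in `u64` to avoid overflow.
pub fn sum_squares_below(end: u32) -> u64 {
    let end = u64::from(end);
    (0u64..)
        .map(|i| i * i)
        .take_while(|&sq| sq < end)
        .sum()
}

fn main() {
    // Squares below 10 are 0, 1, 4 and 9.
    println!("{}", sum_squares_below(10)); // prints 14
}
```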
I just want to note that this is a bug that beginner Rustaceans may stumble upon when working on the final exercise from The Book (Building a Multithreaded Web Server). When I was playing with the web server's code, for a moment I had an empty loop body:

impl Worker {
    fn new(id: usize, receiver: Arc<Mutex<mpsc::Receiver<Job>>>) -> Worker {
        let thread = thread::spawn(|| loop {});
        Worker { id, thread }
    }
}

instead of (Listing 20-20):

impl Worker {
    fn new(id: usize, receiver: Arc<Mutex<mpsc::Receiver<Job>>>) -> Worker {
        let thread = thread::spawn(move || loop {
            let job = receiver.lock().unwrap().recv().unwrap();
            println!("Worker {} got a job; executing.", id);
            job();
        });
        Worker { id, thread }
    }
}

Running the first version ends with
I keep my fingers crossed for solving this issue 🤞 |
Closed by #81451 |
Thank you to everyone who helped to get this fixed. However, I would like to note that it took over five years from the original report, and four years after LLVM devs said they would fix it on their side (#28728 (comment)), for this memory-unsafe soundness bug to actually be fixed in Rust. Is this the expected length of time for fixes to memory-unsafe soundness bugs? :-/ It could, of course, have been fixed much earlier by just adding a side-effect into infinite loops [edit: I know that this is a strong statement but I have seen nothing that contradicts it], and then removing that workaround as a performance optimization after LLVM fixed their bug. It doesn't appear to be the only example: #52652 (which also could be triggered from safe code as noted in #52652 (comment)) took over three years to be fixed, measuring from the original attempted fix in #46833. Based on the evidence of this bug and #52652, there appears to be a de facto policy to leave tricky unsoundness bugs unfixed for long periods, whenever fixing them in the most obvious way could cause any potential performance regression or break any (no matter how obscure) code. I would like to register my objection to that de facto policy, which I do not think serves the majority of Rust users well. |
@daira While I think you make good points, closed issues are generally considered resolved and not in need of attention, so new comments there are easy to miss. Consider starting a new discussion elsewhere (though to be honest I’m not sure which venue would be best in this case.) |
https://internals.rust-lang.org/ might be a reasonable place. (FWIW, I disagree with your assessment @daira, in particular about "just" adding a side-effect into infinite loops -- it's not that simple; as usual, one has to be careful with "why don't you just"-style questions.) |
I used "just" advisedly. It really wouldn't have been hard, as compiler bugs go, to use that approach. Specifically, my understanding is that adding |
That bug was 'fixed' very quickly, but the fix had to be reverted because it broke the |
The following snippet crashes when compiled in release mode on current stable, beta and nightly:
https://play.rust-lang.org/?gist=1f99432e4f2dccdf7d7e&version=stable
This is based on the following example of LLVM removing a loop that I was made aware of: https://github.com/simnalamburt/snippets/blob/12e73f45f3/rust/infinite.rs.
What seems to happen is that since C allows LLVM to remove endless loops that have no side-effect, we end up executing a match that has no arms.