Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Insufficient synchronization in Arc::get_mut #51780

Closed
Tracked by #10
jhjourdan opened this issue Jun 25, 2018 · 8 comments
Closed
Tracked by #10

Insufficient synchronization in Arc::get_mut #51780

jhjourdan opened this issue Jun 25, 2018 · 8 comments
Labels
C-bug Category: This is a bug. I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@jhjourdan
Copy link
Contributor

jhjourdan commented Jun 25, 2018

Consider the following Rust code:

extern crate rayon;

use std::sync::{Arc, Mutex};
use std::mem;
use rayon::join;

fn main() {
    let a1 = Arc::new(Mutex::new(0));
    let mut a2 = &mut a1.clone();
    join(
        || {
            { let mut guard = a1.lock().unwrap();
              *guard += 1;
              mem::forget(guard); }
            drop(a1);
        },
        || {
            loop {
                match Arc::get_mut(&mut a2) {
                    None => (),
                    Some(m) =>
                    { *m.get_mut().unwrap() += 1;
                      return; }
                }
            }
        }
    );
}

The first thread acquires the lock, modifies the variable, and then drop its Arc without unlocking (that's the point of the mem::forget).

The second thread waits until the first thread decrements the count by dropping its Arc, and then uses get_mut to access the content without taking the lock (at that time, the mutex is still locked).

My claim is that there is a race between the two accesses of the content of the mutex. The only reason the two accesses would be in a happens-before relationship would be that Arc::drop and Arc::get_mut would establish this happens-before relationship. However, even though Arc::drop does use a release write, Arc::get_mut only uses a relaxed read of the strong counter (via is_unique).

The fix is to use an acquire read in is_unique. I do not expect significant performance penalty here, since is_unique already contains several release-acquire accesses (of the weak count).

CC @RalfJung

EDIT

As @RalfJung noted, we do not actually need leaking memory to exploit this bug (hence this is not another instance of Leakpocalypse). The following piece of code exhibit the same problem:

extern crate rayon;

use std::sync::Arc;
use rayon::join;

fn main() {
    let a1 = Arc::new(0);
    let mut a2 = a1.clone();
    join(
        || {
            let _ : u32 = *a1;
            drop(a1);
        },
        || {
            loop {
                match Arc::get_mut(&mut a2) {
                    None => {}
                    Some(m) => {
                        *m = 1u32;
                        return;
                    }
                }
            }
        }
    );
}
@jhjourdan
Copy link
Contributor Author

Actually, the race also exists if we were not using Mutex : Arc<u32> also triggers the bug, but then this is a race between a read in the first thread and a write in the second thread.

@jhjourdan
Copy link
Contributor Author

That bug was discovered while trying to formally prove soundness of Arc in the RustBelt project.

@RalfJung
Copy link
Member

Another bug found by formal verification :)

@jhjourdan
Copy link
Contributor Author

Actually, the race also exists if we were not using Mutex : Arc also triggers the bug, but then this is a race between a read in the first thread and a write in the second thread.

I should emphasize that this new version of the bug does not use mem::forget. So this is not another instance of the leakpocalypse.

@kennytm kennytm added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness C-bug Category: This is a bug. labels Jun 25, 2018
@jhjourdan jhjourdan changed the title Insuficient synchronization in Arc::get_mut Insufficient synchronization in Arc::get_mut Jun 25, 2018
@jhjourdan
Copy link
Contributor Author

jhjourdan commented Jun 25, 2018

That bug was discovered while trying to formally prove soundness of Arc in the RustBelt project.

More detail on this: The bug was found while Hai (@hans89), Derek and me were trying to formally verify Arc.

@RalfJung
Copy link
Member

RalfJung commented Jun 25, 2018

This may be confusing because I claimed previously that we (that was including @jhjourdan) already had verified Arc; the answer is that the previous verification was assuming sequential consistency whereas @jhjourdan et al are now doing the verification with a much weaker memory model.

@bstrie
Copy link
Contributor

bstrie commented Jul 3, 2018

I should emphasize that this new version of the bug does not use mem::forget. So this is not another instance of the leakpocalypse.

@jhjourdan Can you update the original issue with the new version that doesn't use mem::forget? Leakpocalypse is what my mind immediately jumped to upon seeing that example.

@RalfJung
Copy link
Member

RalfJung commented Jul 3, 2018

@bstrie here's the code:

extern crate rayon;

use std::sync::Arc;
use rayon::join;

fn main() {
    let a1 = Arc::new(0);
    let mut a2 = a1.clone();
    join(
        || {
            let _ : u32 = *a1;
            drop(a1);
        },
        || {
            loop {
                match Arc::get_mut(&mut a2) {
                    None => {}
                    Some(m) => {
                        *m = 1u32;
                        return;
                    }
                }
            }
        }
    );
}

The read and write access satisfy the condition of a race according to C11: Two accesses not ordered by happens-before, at least one a write, and at least one non-atomic.

(It is debatable whether this should be a race because it seems no compiler optimization actually invalidates said code. But that's a different question.)

RalfJung added a commit to RalfJung/rust that referenced this issue Jul 3, 2018
Previously, `is_unique` would not synchronize at all with a `drop` that returned
early because it was not the last reference, leading to a data race.

Fixes rust-lang#51780
kennytm added a commit to kennytm/rust that referenced this issue Jul 6, 2018
Strenghten synchronization in `Arc::is_unique`

Previously, `is_unique` would not synchronize at all with a `drop` that returned
early because it was not the last reference, leading to a data race.

Fixes rust-lang#51780

Unfortunately I have no idea how to add a test for this.

Cc @jhjourdan
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: This is a bug. I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

4 participants