ArcCell: wait-free get() #81

Vtec234 · 2016-08-02T17:48:48Z

As an idea I got from @schets in #80, I wrote a slightly less locking implementation of ArcCell. In this version:
get() is always wait-free.
set() is wait-free with respect to other set()s. Whenever any get() arrives, the set() will block and have to wait until no concurrent get()s are active, behaving like the original implementation would in the degenerate case of the spinlock always going to get()s.
Since this will scale better for many readers, but most likely worse for many writers, I may also PR this as a new type, even though the change is backwards-compatible in terms of behaviour.

arthurprs

This is very nice, maybe we should extend it with a cas-like function too.

arthurprs · 2016-10-31T11:55:41Z

src/sync/arc_cell.rs

-        debug_assert_eq!(self.0.load(Ordering::SeqCst), 0);
-        self.0.store(unsafe { mem::transmute(t) }, Ordering::Release);
+    /// Create a new `ArcCell` from the given `Arc` interior.
+    pub fn with_val(v: T) -> ArcCell<T> {


I see no reason to shorten value to val.

arthurprs · 2017-02-28T10:26:57Z

I was looking at this again today and was wondering if it makes sense to set the high bit if the writer can't acquire the lock after a while. A reader seeing that should backoff.

mstewartgallus · 2017-03-12T01:12:28Z

If you already have an Arc then isn't what you want to implement https://en.wikipedia.org/wiki/Read-copy-update ?

schets · 2017-03-12T02:52:30Z

@sstewartgallus RCU requires a memory management scheme to go along with the read-copy-update part. Imagine that you load the Arc ptr, stall, somebody drops the arc, and then resume trying to acquire the pointer. The Linux kernel scheme for doing so is often referred to RCU as a whole. Crossbeam EBR could also be used to manage the memory. This is managing it independently of the EBR with what's essentially a second refcounting scheme.

The orderings here are subtly incorrect. In get, the SeqCst load doesn't prevent the refcount increment from reordering past it, see generated code here. SeqCst on a load is essentially an acquire load that's also ordered with respect to all other SeqCst operations and observe a global order of all SeqCst operations.

In an ideal world an Acquire on the fetch_add should work, but on aarch64 the load could still reorder behind the store part of the ll/sc pair (it's basically load-linked-acquire->store-conditional). aarch64 rmw orderings are extremely screwy and a major stretch of the c++11 atomic memory model, but oh well. I guess acquire only applies to the load ordering technically. I'm not sure of a performance-friendly way to do this without platform specific code.

schets · 2017-04-05T14:54:48Z

src/sync/arc_cell.rs

-        old
+        unsafe {
+            let t: usize = mem::transmute(t);
+            let old: Arc<T> = mem::transmute(self.ptr.swap(t, Ordering::Acquire));


This ordering is subtly incorrect, see my comment in the main thread for why. There's a crate atomic_utilities that has a fence/ordering combo that will create an optimal rmw setup that is ordered-before future loads and stores. I'm adding docs to it now but there's an example in the code.

arthurprs · 2017-04-17T17:55:43Z

@schets I was checking this thread, what's wrong with the linked code? I get

load(std::atomic<int>&, std::atomic<int>&):
        nop nop nop

.L2:
        ldxr    w2, [x1]
        add     w2, w2, 1
        stxr    w3, w2, [x1]
        cbnz    w3, .L2
        nop nop nop

        ldar    w0, [x0]
        nop nop nop

.L3:
        ldxr    w2, [x1]
        sub     w2, w2, #1
        stxr    w3, w2, [x1]
        cbnz    w3, .L3
        nop nop nop

        ret

I know little about arm64 assembly but it doesn't look unordered to me.

schets · 2017-04-17T18:05:51Z

@arthurprs ldar prevents memory accesses after the load from happening before it but doesn't do anything about memory accesses before. Arm also allows loads to speculatively move before branches and loads are allowed to move before an stxr (it can return true before the store is finished, meaning speculative execution isn't required for this problem to manifest).

Given all that, it's possible for the ldar instruction to be ordered-before the stxr and I'm pretty sure the ldxr instruction, so it's possible for the SeqCst load to happen partially or completely before the fetch_add/swap. That's definitely a problem in get(), and I believe a problem in set as well (less clear on that one).

arthurprs · 2017-04-17T18:14:20Z

Ahh I misunderstood the "reorder", you meant inside the execution units. Just like I say, my arm64 skills suck.

You suggest using this, right? https://docs.rs/crate/atomic_utilities/0.5.0/source/src/fence_rmw.rs

schets · 2017-04-19T14:29:27Z

That's what I've used for similar cases. It basically does:

acquire rmw on x86, since that's just an rmw + full fence all in one
relaxed rmw + seqcst fence, since every other architecture needs that sort of fencing or does it for acquire

Vtec234 · 2018-03-07T19:08:43Z

More general types have been proposed in crossbeam-rs/rfcs#28.

Vtec234 added 2 commits August 24, 2016 13:18

ArcCell: wait-free get()

d3c92cc

ArcCell: fix ordering

e600f24

Vtec234 force-pushed the free_arc_cell branch from 21a3efe to e600f24 Compare August 24, 2016 11:19

arthurprs reviewed Oct 31, 2016

View reviewed changes

Vtec234 mentioned this pull request Feb 15, 2017

Primitive leaks memory nivekuil/rculock#1

Closed

schets reviewed Apr 5, 2017

View reviewed changes

Vtec234 mentioned this pull request Dec 11, 2017

ArcCell not lock-free #160

Open

Vtec234 closed this Mar 7, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ArcCell: wait-free get() #81

ArcCell: wait-free get() #81

Vtec234 commented Aug 2, 2016

arthurprs left a comment

arthurprs Oct 31, 2016

arthurprs commented Feb 28, 2017

mstewartgallus commented Mar 12, 2017

schets commented Mar 12, 2017

schets Apr 5, 2017

arthurprs commented Apr 17, 2017

schets commented Apr 17, 2017 •

edited

Loading

arthurprs commented Apr 17, 2017 •

edited

Loading

schets commented Apr 19, 2017

Vtec234 commented Mar 7, 2018

ArcCell: wait-free get() #81

ArcCell: wait-free get() #81

Conversation

Vtec234 commented Aug 2, 2016

arthurprs left a comment

Choose a reason for hiding this comment

arthurprs Oct 31, 2016

Choose a reason for hiding this comment

arthurprs commented Feb 28, 2017

mstewartgallus commented Mar 12, 2017

schets commented Mar 12, 2017

schets Apr 5, 2017

Choose a reason for hiding this comment

arthurprs commented Apr 17, 2017

schets commented Apr 17, 2017 • edited Loading

arthurprs commented Apr 17, 2017 • edited Loading

schets commented Apr 19, 2017

Vtec234 commented Mar 7, 2018

schets commented Apr 17, 2017 •

edited

Loading

arthurprs commented Apr 17, 2017 •

edited

Loading