-
Notifications
You must be signed in to change notification settings - Fork 7
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you thought about adding these functions:
- Logical operations:
and
,or
,xor
- Min/max:
min
,max
(support for these will be added to Rust soon)
In a more general sense, do we want to have both AtomicCell<T>
and Atomic<T>
(from atomic-rs)? The main differences that I can see if that AtomicCell<T>
supports non-Copy types, but only supports SeqCst
orderings.
src/atomic/atomic_cell.rs
Outdated
atomic_compare_and_swap(self.value.get(), current, new) | ||
}; | ||
|
||
if byte_eq(&previous, ¤t) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than using byte_eq
here, you could just use the Result
returned by compare_exchange_weak
.
]; | ||
|
||
// If the modulus is a constant number, the compiler will use crazy math to transform this into | ||
// a sequence of cheap arithmetic operations rather than using the slow modulo instruction. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you actually check that the number you used (499) actually produces such a cheap instruction sequence?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the instruction sequence: https://godbolt.org/g/Pc6qoC
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, FWIW, changing LEN
to 512 doesn't affect the benchmark numbers at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just wanted to say that it is.. crazy and beautiful.
src/atomic/atomic_cell.rs
Outdated
|
||
let mut step = 0usize; | ||
|
||
while lock.compare_and_swap(false, true, Ordering::Acquire) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using compare_exchange_weak
here will produce slightly more efficient code on ARM. Also the failure ordering can be Relaxed
.
src/atomic/atomic_cell.rs
Outdated
{ | ||
a = &*(dst as *const _ as *const _); | ||
mem::transmute_copy( | ||
&a.compare_and_swap( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would recommend using the explicit compare_exchange
or compare_exchange_weak
operations rather than compare_and_swap
. They return a Result
which avoids the need to do a byte_eq
afterwards.
/// If this value can be transmuted into a primitive atomic type, it will be treated as such. | ||
/// Otherwise, all potentially concurrent operations on this data will be protected by a global | ||
/// lock. | ||
value: UnsafeCell<T>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally we would use alignment to round this up to the size of the next atomic type. This is what _Atomic
in C11 does. However there is currently no way to express this in Rust.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Amanieu by "next atomic type" what do you mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably that alignment/size of _Atomic char[3]
is the same as alignment/size of _Atomic uint64_t
(because that's how it gets represented internally).
src/atomic/atomic_cell.rs
Outdated
/// ``` | ||
pub fn update<F>(&self, mut f: F) -> T | ||
where | ||
F: FnMut(T) -> T, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you considered making the function return an Option<T>
? I find that this is very useful in practice. See this comment.
src/atomic/atomic_cell.rs
Outdated
}; | ||
|
||
if byte_eq(&previous, ¤t) { | ||
return new; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The general convention for atomic operations is to return the previous value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, if you change the return value of F
to Option<T>
then it would make more sense to return a bool
here.
/// | ||
/// This type is equivalent to [`Cell`], except it can also be shared among multiple threads. | ||
/// | ||
/// Operations on `AtomicCell`s use atomic instructions whenever possible, and synchronize using |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm just curious -- can we avoid global locks? If so, do we prefer global locks to per-object locks?
I'd like to ask if we document a little bit more on this design choice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we can do that without sticking an AtomicBool
next to absolutely every object, even if they don't use locking. Note that std::atomic
in C++ is implemented in a similar manner.
In case an object does need locking, I'm not really sure whether we prefer global locks over per-object locks. It's a tradeoff between memory consumption and performance.
src/atomic/atomic_cell.rs
Outdated
pub fn add(&self, val: $t) -> $t { | ||
if can_transmute::<$t, atomic::AtomicUsize>() { | ||
let a = unsafe { &*(self.value.get() as *const atomic::AtomicUsize) }; | ||
a.fetch_add(val as usize, Ordering::SeqCst).wrapping_add(val as usize) as $t |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not a fan of using SeqCst
in loads and stores. I described the reason here: rust-lang/rust#31650 (comment) In short, as contrary to commonly believed, using SeqCst
for loads, stores, and rmws doesn't give you a useful guarantee over Release
/Acquire
semantics. In particular, it has never been proved that using SeqCst
for a location gives you a perfect ordering for that particular location.
For AtomicCell
, it seems Release
and Acquire
will be enough, I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel I still haven't fully grasped the consequences of the broken C/C++11 model. :( Maybe we should take a look at an example. Here's one with 4 threads:
// THREAD 1
a.set(1);
// THREAD 2
b.set(1);
// THREAD 3
a.get(); // 1
b.get(); // 0
// THREAD 4
b.get(); // 1
a.get(); // 0
Now, two questions:
-
Is this scenario possible if all
get
operations useAcquire
and allset
operations useRelease
? I would say "yes". -
Is this scenario possible if all
get
and allset
operations useSeqCst
? I would say "no" because the model prohibits this behavior. But you're saying the answer is "yes" because the compilers have no way of compiling this correctly on Power/ARM? Is my understanding correct?
I've rebased onto master and replaced Benchmark results (operation on
Regarding
Yes. I'm mostly taking inspiration from Java here, where My thinking on let new = loop {
let x = a.get();
let y = x * 5;
if a.compare_and_set(x, y) {
break y;
}
}; This is not a lot of code, but writing the same thing in the form of this one-liner is much nicer: let new = a.update(|x| x * 5); Now, if the update function (the lambda) had to return an let new = a.update(|x| Some(x * 5)).unwrap(); There are two unfortunate angles to this. First, the one-liner just became a bit longer. Second, we have this Consider that the original CAS loop didn't have Finally, even in the case of failing update functions, the improvement of // Using a CAS loop.
let new = loop {
let x = a.get();
if x == 0 {
break None;
} else {
let y = 1000 / x;
if a.compare_and_set(x, y) {
break Some(y);
}
}
};
// Using `update`.
let new = a.update(|x| {
if x == 0 {
None
} else {
Some(1000 / x)
}
}).unwrap(); In the end, it just seems to me the payoff is greater if the update function doesn't return an
That is true, and the reason most likely lies in the fact that returning the previous value is strictly more powerful - one can obtain the new value by feeding the old one into the update function. Java has two variants: Again, my feeling is that using the new value is more commonly desirable, so manually recomputing it could be unergonomic and brittle: let new = a.fetch_update(|x| x * 10) * 10; Maybe we should just have two variants
Yes, we should eventually add those, too. I'm a bit stuck on the naming problem:
|
@stjepang Note that I was suggesting only having the result of the closure be an The rationale behind this is that in a lot of my code, I end up having loops that look like this: loop {
if some_condition {
return false;
}
match atomic.compare_exchange_weak(
oldval,
newval,
Ordering::Relaxed,
Ordering::Relaxed,
) {
Ok(_) => return true,
Err(x) => oldval = x,
}
} I only want to update the value if |
Rebased onto master and:
I removed method @jeehoonkang Regarding memory orderings, perhaps let's leave them as strong as possible now and relax in a future PR? @Amanieu How do you feel about adding |
src/atomic/atomic_cell.rs
Outdated
/// let ptr = a.as_ptr(); | ||
/// ``` | ||
pub fn as_ptr(&self) -> *mut T { | ||
self.value.get() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this return a *const T
pointer? Since its only a reference to itself, not a mutable one.
Also shouldn't this be an unsafe function? If users threats this as a regular pointer, while it's being shared across threads, suprising behaviour might happen from non-atomic/non-locked operations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right. I'll remove this function as it doesn't make much sense for atomic types (Cell<T>
has it, but not AtomicUsize
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Thomasdezeeuw Probably it's okay not to mark it unsafe
because using *mut T
and *const T
requires unsafe
. Anyway users should make their code unsafe
, so I think both *const T
and *mut T
are fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still concerned with orderings. I think SeqCst
obscures the specification: what do we expect from the methods of AtomicCell
? I think the word "sequentially consistent" doesn't really help, because practically none understand the term precisely. Even worse, using SeqCst
may guarantee some ordering properties which users will rely on, after which we're stuck and cannot optimize it further.
So I suggest that, even we're using SeqCst
here, let's make clear that the specification of AtomicCell
is unclear. (Oh, this sentence is too pedantic... Sorry.) And in a later PR presumably we will confirm that its specification is actually release/acquire.
Also on the global locks: I agree global locks are the only sane choice we can use now. I still prefer per-object locking to global locking because I think it better conveys programmers' intent, but sadly there's no way to implement per-object locking efficiently. In an ideal world, I hope atomic_is_lock_free()
be a const function, and we can use its results in at compile time so that lock-free AtomicCell<T>
doesn't have a lock and non-lock-free AtomicCell<T>
has one.
/// If this value can be transmuted into a primitive atomic type, it will be treated as such. | ||
/// Otherwise, all potentially concurrent operations on this data will be protected by a global | ||
/// lock. | ||
value: UnsafeCell<T>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Amanieu by "next atomic type" what do you mean?
src/atomic/atomic_cell.rs
Outdated
/// let ptr = a.as_ptr(); | ||
/// ``` | ||
pub fn as_ptr(&self) -> *mut T { | ||
self.value.get() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Thomasdezeeuw Probably it's okay not to mark it unsafe
because using *mut T
and *const T
requires unsafe
. Anyway users should make their code unsafe
, so I think both *const T
and *mut T
are fine.
src/atomic/atomic_cell.rs
Outdated
/// assert_eq!(a.compare_and_set(Foo(11), Foo(12)), true); | ||
/// assert_eq!(a.get().0, 12); | ||
/// ``` | ||
pub fn compare_and_set(&self, mut current: T, new: T) -> bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding its signature: how about returning Result<T, T>
, where Ok(cur)
means CAS succeeded and Err(cur)
means CAS was unsuccessful?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. I think we have three options here:
fn compare_and_set(&self, current: T, new: T) -> bool;
fn compare_and_swap(&self, current: T, new: T) -> T;
fn compare_exchange(&self, current: T, new: T) -> Result<T, T>;
Maybe let's get rid of compare_and_set
and implement the last two since those are already established idioms?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I'm concerned, I think even having only compare_exchange()
is also fine :)
I'll just leave the decision to you.
src/atomic/atomic_cell.rs
Outdated
} | ||
|
||
// The compare-exchange operation has failed and didn't store `new`. However, | ||
// `previous` is semantically equal to `current`, although not byte-equal. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They can also be byte-equal, as we're using the weak version of CAS here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True. I've updated the comment.
]; | ||
|
||
// If the modulus is a constant number, the compiler will use crazy math to transform this into | ||
// a sequence of cheap arithmetic operations rather than using the slow modulo instruction. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just wanted to say that it is.. crazy and beautiful.
while lock.compare_exchange_weak(false, true, Ordering::Acquire, Ordering::Relaxed).is_err() { | ||
if step < 5 { | ||
// Just try again. | ||
} else if step < 10 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably it's a good spot to use Backoff
:)
@jeehoonkang Regarding I guess my somewhat hand-wavy definition would go like this: To me, the fact that |
Just one more question: do you prefer So |
@stjepang I'm concerned with documentation, actually. It says its semantics is sequentially consistent, but I intend to propose to weaken it to release/acquire in the future. So I'm asking to leave the documentation somewhat unspecified at this moment. |
Oh, I see. Is the documentation all right now? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your work on AtomicCell
! I think it'll be very useful.
I'm still concerned with the comment "This operation uses the SeqCst
ordering", it kinda looks like a spec, but let's resolve it in a later PR...
The doc comment applies only to private functions so we haven't actually commited to it. |
Some changes:
|
@Amanieu @jeehoonkang Are you OK with merging with these changes? |
It's wonderful. Thanks!! |
This type is a thread-safe version of
Cell
.It is a good replacement for
AtomicOption
and is more general. The new equivalent ofAtomicOption<T>
would beAtomicCell<Option<Box<T>>>
.AtomicCell
is also very similar toLockCell
in rustc, except it uses atomic instructions whenever possible and doesn't embed any additional data in the object for locking.