Implement volatile barrier APIs #107843
Now that I'm on the correct branch
Note regarding the
|
reads are not reordered after writes in x86. In short - WriteBarrier needs to wait for reads/writes in progress to complete before allowing more writes.
|
Hmm, I still don't understand. What's the difference between:
...x
Volatile.ReadBarrier(); //emits nothing
Volatile.WriteBarrier(); //emits nothing
...y
And:
...x
Interlocked.MemoryBarrier(); //emits lock ...
...y
@VSadov can you give me an example of some x & y where the behaviour is allowed to be different, so I can see an example of what I'm not understanding?
Edit: I think I understand now (leaving this here for my future reference)
could be re-ordered to: read a, read c, write b, write d, whereas |
- And fix missed file from jit-format
Would Using my example from earlier: read a, write b, barrier/s, read c, write d (where these represent arbitrary quantities of reads & writes in any order):
So it would seem to me as though |
At first glance it seems that the combination is as good as a full barrier.
It's actually not as strong as a full barrier, since it doesn't order b before c, which I think, based on what you were saying, is the same ordering that x86 doesn't give by default.
Ah, right, it still does not order Store-Load. It could be cheaper then, since it guarantees less.
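The Store-Load gap discussed above can be sketched as a store-buffering litmus test. This is a C++ analogue, not the C# APIs from this PR: it assumes that an acquire fence followed by a release fence is a fair stand-in for `Volatile.ReadBarrier(); Volatile.WriteBarrier();` and that a `seq_cst` fence stands in for `Interlocked.MemoryBarrier()`. The names `FenceKind`, `fence` and `count_both_zero` are hypothetical.

```cpp
#include <atomic>
#include <thread>

// Hypothetical selector for the two fence strategies being compared.
enum class FenceKind { AcquireReleasePair, Full };

inline void fence(FenceKind kind) {
    if (kind == FenceKind::Full) {
        // Full barrier: also orders an earlier store before a later load.
        std::atomic_thread_fence(std::memory_order_seq_cst);
    } else {
        // Read-ReadWrite followed by ReadWrite-Write: no Store-Load ordering.
        std::atomic_thread_fence(std::memory_order_acquire);
        std::atomic_thread_fence(std::memory_order_release);
    }
}

// Runs the litmus test `iterations` times and returns how many runs ended
// with both threads reading 0, i.e. each load passed the other thread's store.
int count_both_zero(FenceKind kind, int iterations) {
    int both_zero = 0;
    for (int i = 0; i < iterations; ++i) {
        std::atomic<int> x{0}, y{0};
        int r1 = -1, r2 = -1;

        std::thread t1([&] {
            x.store(1, std::memory_order_relaxed);   // "write b"
            fence(kind);
            r1 = y.load(std::memory_order_relaxed);  // "read c"
        });
        std::thread t2([&] {
            y.store(1, std::memory_order_relaxed);
            fence(kind);
            r2 = x.load(std::memory_order_relaxed);
        });
        t1.join();
        t2.join();

        if (r1 == 0 && r2 == 0) {
            ++both_zero;
        }
    }
    return both_zero;
}
```

With `FenceKind::Full` the C++ memory model forbids the both-zero outcome, so the count is always 0. With `FenceKind::AcquireReleasePair` the outcome is permitted, although whether it actually shows up depends on the hardware, the compiler and timing.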
I did some testing based on code I gave in the use case section of my api proposal issue (converted to C++) on a M-series macbook and got about a 1.4% regression overall (don't interpret that as the pair is exactly 1.4% slower than just |
- And fix up some FIXMEs & comments re cpobj and cpblk
Fixes #98837
This implements the proposed Read-ReadWrite and ReadWrite-Write barriers. Note: I haven't implemented any tests yet.
/cc @jkotas @VSadov @kouvel
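For readers who know the C++ memory model better, the typical use of the two barrier shapes can be sketched as a publish/consume pair. This is a C++ analogue rather than the API added here: it assumes the Read-ReadWrite barrier behaves like an acquire fence and the ReadWrite-Write barrier like a release fence. The helper name `publish_and_consume` is hypothetical.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper: one thread publishes a plain value behind a flag,
// the other waits for the flag and then reads the value.
int publish_and_consume() {
    int payload = 0;                 // plain, non-atomic data
    std::atomic<bool> ready{false};  // flag, accessed with relaxed operations only
    int observed = -1;

    std::thread producer([&] {
        payload = 42;
        // ReadWrite-Write: earlier reads and writes stay before later writes,
        // so the payload is written before the flag becomes visible.
        std::atomic_thread_fence(std::memory_order_release);
        ready.store(true, std::memory_order_relaxed);
    });

    std::thread consumer([&] {
        while (!ready.load(std::memory_order_relaxed)) {
            std::this_thread::yield();
        }
        // Read-ReadWrite: the flag read stays before later reads and writes,
        // so the payload read cannot be hoisted above it.
        std::atomic_thread_fence(std::memory_order_acquire);
        observed = payload;
    });

    producer.join();
    consumer.join();
    return observed;
}
```

Calling `publish_and_consume()` returns 42, because the two fences establish the happens-before relationship that the relaxed flag accesses alone would not.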