
Add SGX thread parker #123

Merged: 8 commits, merged into Amanieu:master on Mar 22, 2019
Conversation

@faern (Collaborator) commented Mar 18, 2019

Adds a ThreadParker implementation that is specific to SGX, so that platform does not need to fall back to the generic spin-loop-based implementation.

This is in preparation of rust-lang/rust#56410.

Ping @jethrogb, who is familiar with the platform. I have read the documentation for wait and send, and I have tried cross-compiling this code with --target x86_64-fortanix-unknown-sgx to make sure it at least compiles. But I have not actually run it, since I'm not sure how to do that. Please advise on the implementation if you have any feedback.
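For readers unfamiliar with the platform, the two usercalls pair up roughly like this (a minimal sketch against the std::os::fortanix_sgx::usercalls API; the helper names are mine, not from the PR):

use std::io;
use std::os::fortanix_sgx::usercalls::{
    self,
    raw::{Tcs, EV_UNPARK, WAIT_INDEFINITE},
};

// Parked side: block until EV_UNPARK is delivered to this thread's TCS.
// The call may return spuriously, so the caller must re-check its own state.
fn wait_for_unpark() -> io::Result<u64> {
    usercalls::wait(EV_UNPARK, WAIT_INDEFINITE)
}

// Unparking side: deliver EV_UNPARK to the parked thread's TCS.
fn send_unpark(target: Tcs) -> io::Result<()> {
    usercalls::send(EV_UNPARK, Some(target))
}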

It would be good if someone with access to an SGX machine could run the benchmark tool in parking_lot before and after this PR, so we can get some rough numbers on the performance of parking_lot compared to the standard library locks. See this comment: rust-lang/rust#56410 (comment)

@jethrogb left a comment

Cool!

(Review threads on core/src/lib.rs and core/src/thread_parker/sgx.rs, resolved)
@jethrogb

I tried to run the benchmark but it just hangs. I can investigate later.

@faern (Collaborator, Author) commented Mar 18, 2019

> I tried to run the benchmark but it just hangs. I can investigate later.

Let's save the benchmarks until we have verified that the implementation is correct, then. How about the normal test suite?

@jethrogb

> Reverse SGX cfg check to solve experimental target_vendor

Wow, neat trick! NB. target_vendor was stabilized in 1.33.
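(Context for other readers: before 1.33, cfg(target_vendor) was feature-gated, so merely writing #[cfg(target_vendor = "fortanix")] was a hard error on stable compilers. The commit title suggests flipping the check so the unstable cfg is never named; one plausible shape of that trick, keying off the stable target_env instead, purely as an illustration:)

// Select the SGX parker without ever naming the unstable `target_vendor`;
// among supported targets, only x86_64-fortanix-unknown-sgx sets
// target_env = "sgx".
#[cfg(all(target_arch = "x86_64", target_env = "sgx"))]
mod imp {
    // SGX ThreadParker would live here.
}

// Everything else falls back to the generic spin-loop parker.
#[cfg(not(all(target_arch = "x86_64", target_env = "sgx")))]
mod imp {
    // Generic fallback ThreadParker would live here.
}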

@jethrogb

All unit tests pass, except rwlock::tests::test_rwlock_recursive, which calls the unsupported std::thread::sleep.

@faern (Collaborator, Author) commented Mar 18, 2019

Well, that's a really good start! I have now modified the test to not use thread::sleep on SGX, but rather just yield the thread 100 times, in the hope that this gives the second thread time to reach the write-locking call. I'm not sure 100 is a suitable number of yields; I guess running the tests a few times will have to determine that.
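The change is along these lines (a sketch; wait_a_bit is a hypothetical helper, the real test inlines this, and the sleep duration is illustrative):

#[cfg(not(target_env = "sgx"))]
fn wait_a_bit() {
    std::thread::sleep(std::time::Duration::from_millis(100));
}

#[cfg(target_env = "sgx")]
fn wait_a_bit() {
    // thread::sleep is unsupported on SGX, so yield repeatedly instead to
    // give the other thread a chance to reach its write-lock call.
    for _ in 0..100 {
        std::thread::yield_now();
    }
}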

@faern (Collaborator, Author) commented Mar 18, 2019

If all tests pass but the benchmarks appear to "hang", then maybe they are just incredibly slow and need some more time to complete? 🤔 If not, we might have a bug that the tests don't catch.

@jethrogb

But it says it's going to run for one second?

@faern (Collaborator, Author) commented Mar 18, 2019

> But it says it's going to run for one second?

Yeah, but the last argument, testIterations, multiplies that. If testIterations is 2, then each test will run for two seconds.

EDIT: Yes, it's supposed to be time limited. But you never know; when things act weird, try weird things :) You said earlier that it might be sketchy to depend on Instant::now() for scheduling on SGX, so maybe whatever the benchmarks do to limit the time of each run does not work there for some reason.

(Review thread on .travis.yml, resolved)
@jethrogb commented Mar 19, 2019

The hanging seems to be somehow related to sleep panicking. It's possible this is fixed as part of rust-lang/rust#59136. Alternatively, this could be rust-lang/rust#58042 (which looks like it might not get merged).

I was able to run the benchmarks by getting rid of the sleeps:

diff --git a/benchmark/src/mutex.rs b/benchmark/src/mutex.rs
index 519b2c6..2cd96cb 100644
--- a/benchmark/src/mutex.rs
+++ b/benchmark/src/mutex.rs
@@ -16,7 +16,7 @@ use std::cell::UnsafeCell;
 use std::sync::atomic::{AtomicBool, Ordering};
 use std::sync::{Arc, Barrier};
 use std::thread;
-use std::time::Duration;
+use std::time::{Duration, SystemTime};
 
 trait Mutex<T> {
     fn new(v: T) -> Self;
@@ -133,7 +133,10 @@ fn run_benchmark<M: Mutex<f64> + Send + Sync + 'static>(
         }));
     }
 
-    thread::sleep(Duration::from_secs(seconds_per_test as u64));
+    let start = SystemTime::now();
+    let max = Duration::from_secs(seconds_per_test as u64);
+    while start.elapsed().unwrap() < max {}
+
     keep_going.store(false, Ordering::Relaxed);
     threads.into_iter().map(|x| x.join().unwrap().0).collect()
 }
diff --git a/benchmark/src/rwlock.rs b/benchmark/src/rwlock.rs
index 0b724b5..23837c8 100644
--- a/benchmark/src/rwlock.rs
+++ b/benchmark/src/rwlock.rs
@@ -17,7 +17,7 @@ use std::cell::UnsafeCell;
 use std::sync::atomic::{AtomicBool, Ordering};
 use std::sync::{Arc, Barrier};
 use std::thread;
-use std::time::Duration;
+use std::time::{Duration, SystemTime};
 
 trait RwLock<T> {
     fn new(v: T) -> Self;
@@ -210,7 +210,10 @@ fn run_benchmark<M: RwLock<f64> + Send + Sync + 'static>(
         }));
     }
 
-    thread::sleep(Duration::new(seconds_per_test as u64, 0));
+    let start = SystemTime::now();
+    let max = Duration::from_secs(seconds_per_test as u64);
+    while start.elapsed().unwrap() < max {}
+
     keep_going.store(false, Ordering::Relaxed);
 
     let run_writers = writers

Here are some bad benchmarks (testIterations=10):

mutex - 66d67bb (master)

- Running with 2 threads
- 1 iterations inside lock, 0 iterations outside lock
- 1 seconds per test
        name         |    average     |     median     |    std.dev.   
parking_lot::Mutex   |  16676.166 kHz |  15104.691 kHz |   7120.573 kHz
std::sync::Mutex     |     97.798 kHz |     92.837 kHz |     31.932 kHz

mutex - e958443 (this PR)

- Running with 2 threads
- 1 iterations inside lock, 0 iterations outside lock
- 1 seconds per test
        name         |    average     |     median     |    std.dev.   
parking_lot::Mutex   |  17004.625 kHz |  16708.983 kHz |   8024.508 kHz
std::sync::Mutex     |     79.578 kHz |     69.929 kHz |     33.624 kHz

rwlock - 66d67bb (master)

- Running with 1 writer threads and 1 reader threads
- 1 iterations inside lock, 0 iterations outside lock
- 1 seconds per test
parking_lot::RwLock  - [write]  28628.331 kHz [read]    428.672 kHz
seqlock::SeqLock     - [write]  31885.639 kHz [read]  11868.928 kHz
std::sync::RwLock    - [write]    453.986 kHz [read]   4768.948 kHz

rwlock - e958443 (this PR)

- Running with 1 writer threads and 1 reader threads
- 1 iterations inside lock, 0 iterations outside lock
- 1 seconds per test
parking_lot::RwLock  - [write]  39197.440 kHz [read]    183.720 kHz
seqlock::SeqLock     - [write]  35924.117 kHz [read]   4215.011 kHz
std::sync::RwLock    - [write]    521.821 kHz [read]   7776.407 kHz

The RwLock read performance seems worrisome. Then again, this is on my dual-core system, where one core is just busy-waiting.

@faern (Collaborator, Author) commented Mar 19, 2019

> The RwLock read performance seems worrisome. Then again, this is on my dual-core system, where one core is just busy-waiting.

Given that the same benchmark run also shows seqlock::SeqLock as 2.8x slower and std::sync::RwLock 1.6x faster, even though the code for those did not change, I do not trust the accuracy of that benchmark very much.

If you occupy 50% of your CPU resources with busy-looping in the benchmark runner, it's not very strange that the results become a bit unreliable.

Can you do multiple runs and try to see some general trend/average?

@jethrogb

I've run it multiple times; RwLock read is always very slow. I can try running this on a machine with more cores later.

@faern (Collaborator, Author) commented Mar 19, 2019

Not sure if it will affect your results, but try running the benchmarks with --features nightly, since that will activate all the bells and whistles of parking_lot.

@Amanieu (Owner) commented Mar 20, 2019

This is because parking_lot prioritizes writers over readers when there is contention. If you do a benchmark with only reader threads then you will get much better numbers.

@faern force-pushed the add-sgx-thread-parker branch 2 times, most recently from 9b1508d to 9abfb1c, on March 20, 2019
@faern (Collaborator, Author) commented Mar 20, 2019

@jethrogb Can you verify whether the change I made to the test_rwlock_recursive test works reliably?

I have tried rebasing my libstd PR, #119, on top of this locally and ran the entire dist-various-2 test, which includes SGX. It passed. So unless we find more evidence that this is slowing things down, I'm starting to feel this is ready for review/merge.

I tried running SGX stuff myself, including the tests/benches, but my Rust build server is running in a Xen VM, and it seems that's not really possible without patching Xen?

    // released to avoid blocking the queue for too long.
    #[inline]
    pub fn unpark(self) {
        usercalls::send(EV_UNPARK, Some(self.0)).expect("send returned error");
@Amanieu (Owner):

If we implement timeouts in the future, the send here can actually fail if the thread's wait timed out and it then exited. I assume that all wait calls are supposed to properly handle unexpected EV_UNPARK events.

@jethrogb:

> the send here can actually fail if the thread's wait timed out and it then exited.

In that case, the event will actually be queued for the next thread that's going to run at that TCS. But I suppose once we implement SGX2, TCSes can be dynamic, so your point still holds then.

> I assume that all wait calls are supposed to properly handle unexpected EV_UNPARK events.

Yes.

@faern (Collaborator, Author):

OK, I will change this to just ignore any returned error. Or do you think I should check specifically for the InvalidInput error returned on an invalid TCS and ignore only that ErrorKind?


I'm always in favor of more specific error checking :)

@faern (Collaborator, Author) commented Mar 20, 2019

I have now changed it to only panic on errors other than InvalidInput. But if we decide to use debug_assert_eq! in the other discussion, then the consistent thing here would be to only check this error when debug_assertions are enabled as well.
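Concretely, the error handling described here can look like this (a sketch of the described behavior; the panic message and exact structure are mine):

use std::io::ErrorKind;
use std::os::fortanix_sgx::usercalls::{
    self,
    raw::{Tcs, EV_UNPARK},
};

fn unpark(tcs: Tcs) {
    if let Err(error) = usercalls::send(EV_UNPARK, Some(tcs)) {
        // InvalidInput means the target TCS is invalid (e.g. the thread's
        // wait timed out and it exited); anything else is unexpected
        // misbehavior by the operating environment.
        if error.kind() != ErrorKind::InvalidInput {
            panic!("send returned an unexpected error: {:?}", error);
        }
    }
}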

    pub fn park(&self) {
        while self.parked.load(Ordering::Acquire) {
            let res = usercalls::wait(EV_UNPARK, WAIT_INDEFINITE).expect("wait returned error");
            assert_eq!(res & EV_UNPARK, EV_UNPARK);
@Amanieu (Owner):

Can you make this a debug_assert_eq!, for consistency with the other parkers, and so that we avoid crashes on unexpected errors in production code? Same with the expect.

@jethrogb:

That wouldn't be appropriate:

https://edp.fortanix.com/docs/api/fortanix_sgx_abi/struct.Usercalls.html#method.wait

> Enclaves must not assume that this call only returns in response to valid events generated by the enclave. This call may return for invalid event sets, or before timeout has expired even though no event is pending.

I guess it doesn't say it explicitly, but even though those conditions mustn't be assumed, any deviation should be considered misbehavior by the operating environment, and an abort is appropriate.

@faern (Collaborator, Author):

The parking algorithm does not really care why or how we were woken up. All it needs to do is check self.parked and possibly go back to sleep.

So I guess ultimately this comes down to whether we want parking_lot to fail fast and loud on any event we did not anticipate, or to just ignore it and carry on with its task. The second seems to be what it does on other platforms, so I'll change to that for now, and you can debate which one should be the final version.
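A sketch of that tolerant loop (with the checks demoted to debug assertions, as suggested above; the field layout is assumed):

use std::os::fortanix_sgx::usercalls::{
    self,
    raw::{EV_UNPARK, WAIT_INDEFINITE},
};
use std::sync::atomic::{AtomicBool, Ordering};

struct ThreadParker {
    parked: AtomicBool,
}

impl ThreadParker {
    fn park(&self) {
        while self.parked.load(Ordering::Acquire) {
            // Whatever woke us (a real unpark, a spurious return, or an odd
            // event set), just loop around and re-check `parked`.
            if let Ok(events) = usercalls::wait(EV_UNPARK, WAIT_INDEFINITE) {
                debug_assert_eq!(events & EV_UNPARK, EV_UNPARK);
            }
        }
    }
}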

@jethrogb

> I have tried rebasing my libstd PR, #119, on top of this locally and ran the entire dist-various-2 test, which includes SGX. It passed.

SGX is a tier 2 target, so the CI only builds it; it doesn't run any tests.

> I tried running SGX stuff myself, including the tests/benches, but my Rust build server is running in a Xen VM, and it seems that's not really possible without patching Xen?

The only hypervisor I'm aware of with production-ready SGX support is Hyper-V, and only in Azure.

I'll do some more testing today.

@Amanieu merged commit ba43375 into Amanieu:master on Mar 22, 2019
@Amanieu (Owner) commented Mar 22, 2019

Thanks!

@faern deleted the add-sgx-thread-parker branch on March 22, 2019