Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

Don't Sleep(1) in some spin-wait loops #21722

Merged
merged 1 commit into from
Dec 31, 2018
Merged

Don't Sleep(1) in some spin-wait loops #21722

merged 1 commit into from
Dec 31, 2018

Conversation

kouvel
Copy link
Member

@kouvel kouvel commented Dec 30, 2018

  • In spin-wait loops that are not expected to last too long, Sleep(1) significantly delays threads from completing the operation
  • From eliminating Sleep(1) in such spin-wait loops, there can be a tradeoff, better fairness vs worse throughput, because Sleep(1) removes threads from contention and in some cases fewer threads can make faster progress at the cost of delaying the Sleep(1) threads relatively significantly. Eliminating the Sleep(1) in such spin-wait loops seems to be a good tradeoff.
  • Fixes https://github.com/dotnet/corefx/issues/29595

- In spin-wait loops that are not expected to last too long, Sleep(1) significantly delays threads from completing the operation
- From eliminating Sleep(1) in such spin-wait loops, there can be a tradeoff, better fairness vs worse throughput, because Sleep(1) removes threads from contention and in some cases fewer threads can make faster progress at the cost of delaying the Sleep(1) threads relatively significantly. Eliminating the Sleep(1) in such spin-wait loops seems to be a good tradeoff.
@kouvel kouvel added this to the 3.0 milestone Dec 30, 2018
@kouvel kouvel self-assigned this Dec 30, 2018
@kouvel kouvel requested a review from stephentoub December 30, 2018 23:46
@kouvel
Copy link
Member Author

kouvel commented Dec 30, 2018

With a short delay between operations (more realistic), there are mostly improvements. ConcurrentStack's Pop in particular reaches Sleep(1) very quickly and so was very unfair to begin with.

Spin                                           Left score       Right score      ∆ Score   ∆ Score %  Comment
---------------------------------------------  ---------------  ---------------  --------  ---------  ---------------
ConcurrentBagFairness 02T                      98851.58 ±0.34%  99358.90 ±0.10%    507.32      0.51%
ConcurrentBagFairness 04T                      99509.70 ±0.07%  99613.40 ±0.19%    103.69      0.10%
ConcurrentBagFairness 08T                      95944.50 ±0.96%  95945.79 ±0.82%      1.29      0.00%
ConcurrentBagFairness 16T                        533.12 ±1.31%    555.91 ±1.66%     22.79      4.28%
ConcurrentBagThroughput 02T                     5465.33 ±0.48%   5546.60 ±0.46%     81.27      1.49%  Likely improved
ConcurrentBagThroughput 04T                     8475.14 ±0.77%   8502.81 ±0.38%     27.67      0.33%
ConcurrentBagThroughput 08T                    12536.09 ±0.17%  12764.54 ±0.44%    228.45      1.82%  Improved
ConcurrentBagThroughput 16T                    12478.13 ±0.10%  12747.11 ±0.30%    268.97      2.16%  Improved
ConcurrentQueueFairness 02T                    98676.06 ±0.36%  98652.12 ±0.79%    -23.94     -0.02%
ConcurrentQueueFairness 04T                    98799.26 ±0.86%  98828.64 ±0.91%     29.39      0.03%
ConcurrentQueueFairness 08T                    19827.74 ±8.26%  65681.26 ±3.09%  45853.52    231.26%  Improved
ConcurrentQueueFairness 16T                      765.69 ±1.14%    783.43 ±1.30%     17.73      2.32%
ConcurrentQueueThroughput 02T                   5721.85 ±0.56%   5676.11 ±0.47%    -45.74     -0.80%
ConcurrentQueueThroughput 04T                   7590.31 ±0.25%   7651.99 ±0.10%     61.69      0.81%  Likely improved
ConcurrentQueueThroughput 08T                   8957.50 ±0.78%   9065.22 ±0.25%    107.71      1.20%
ConcurrentQueueThroughput 16T                   9310.72 ±0.60%   9236.79 ±0.77%    -73.93     -0.79%
ConcurrentStackFairness 02T                    96485.08 ±0.23%  96877.09 ±0.23%    392.01      0.41%
ConcurrentStackFairness 04T                      576.30 ±2.07%  79495.05 ±0.14%  78918.75  13693.95%  Improved
ConcurrentStackFairness 08T                      379.10 ±0.88%  48063.94 ±0.30%  47684.84  12578.40%  Improved
ConcurrentStackFairness 16T                      223.52 ±0.71%  10238.24 ±0.19%  10014.71   4480.37%  Improved
ConcurrentStackThroughput 02T                   4993.09 ±0.43%   5574.10 ±0.27%    581.01     11.64%  Improved
ConcurrentStackThroughput 04T                   5220.53 ±0.49%   6149.84 ±0.12%    929.32     17.80%  Improved
ConcurrentStackThroughput 08T                   5448.92 ±0.44%   6174.03 ±0.02%    725.11     13.31%  Improved
ConcurrentStackThroughput 16T                   5482.35 ±0.49%   6025.84 ±0.03%    543.49      9.91%  Improved

With no delay between operations (less realistic), the tradeoff is more visible in throughput regressions. In a scenario where multiple operations are done frequently in short order on a concurernt collection, beyond a threshold a lock with a non-concurrent collection may perform better.

Spin                                           Left score       Right score      ∆ Score    ∆ Score %  Comment
---------------------------------------------  ---------------  ---------------  ---------  ---------  ---------------
ConcurrentBagFairness 02T                      99449.49 ±0.06%  99320.03 ±0.16%    -129.46     -0.13%
ConcurrentBagFairness 04T                      99790.30 ±0.06%  99613.59 ±0.13%    -176.71     -0.18%
ConcurrentBagFairness 08T                      94718.08 ±0.77%  93844.13 ±0.83%    -873.94     -0.92%
ConcurrentBagFairness 16T                        525.12 ±0.95%    561.36 ±1.05%      36.24      6.90%  Improved
ConcurrentBagThroughput 02T                    10208.16 ±0.28%  10041.72 ±0.25%    -166.45     -1.63%  Regressed
ConcurrentBagThroughput 04T                    11687.04 ±0.60%  11629.58 ±0.45%     -57.46     -0.49%
ConcurrentBagThroughput 08T                    16980.23 ±0.22%  17099.60 ±0.09%     119.36      0.70%  Likely improved
ConcurrentBagThroughput 16T                    16833.99 ±0.06%  17275.41 ±0.41%     441.41      2.62%  Improved
ConcurrentQueueFairness 02T                    99541.56 ±0.01%  99480.95 ±0.03%     -60.61     -0.06%
ConcurrentQueueFairness 04T                    42182.13 ±4.44%  99462.63 ±0.03%   57280.50    135.79%  Improved
ConcurrentQueueFairness 08T                     1928.71 ±2.78%  83839.00 ±1.20%   81910.28   4246.89%  Improved
ConcurrentQueueFairness 16T                     1289.63 ±2.27%   4618.29 ±0.59%    3328.66    258.11%  Improved
ConcurrentQueueThroughput 02T                  10725.42 ±1.96%   9764.64 ±0.47%    -960.78     -8.96%  Regressed
ConcurrentQueueThroughput 04T                   8377.90 ±1.29%   7770.49 ±0.56%    -607.41     -7.25%  Regressed
ConcurrentQueueThroughput 08T                   8846.80 ±0.47%   9080.83 ±0.10%     234.03      2.65%  Improved
ConcurrentQueueThroughput 16T                   9045.76 ±0.86%   9165.55 ±0.13%     119.80      1.32%
ConcurrentStackFairness 02T                     1698.35 ±5.23%  97748.43 ±0.02%   96050.08   5655.51%  Improved
ConcurrentStackFairness 04T                      551.04 ±1.88%  65313.83 ±0.22%   64762.79  11752.81%  Improved
ConcurrentStackFairness 08T                      366.45 ±1.13%  40789.38 ±0.28%   40422.93  11031.02%  Improved
ConcurrentStackFairness 16T                      233.11 ±0.94%  11143.66 ±0.24%   10910.55   4680.36%  Improved
ConcurrentStackThroughput 02T                  28438.66 ±0.49%  25582.18 ±0.05%   -2856.48    -10.04%  Regressed
ConcurrentStackThroughput 04T                  28893.91 ±0.14%  17577.92 ±0.10%  -11315.98    -39.16%  Regressed
ConcurrentStackThroughput 08T                  28715.13 ±0.05%   9811.45 ±0.47%  -18903.69    -65.83%  Regressed
ConcurrentStackThroughput 16T                  28405.37 ±0.16%  11763.53 ±0.44%  -16641.84    -58.59%  Regressed

Code for tests is here: #13670 (comment)

kouvel added a commit to kouvel/corefx that referenced this pull request Dec 30, 2018
@kouvel
Copy link
Member Author

kouvel commented Dec 30, 2018

I also experimented with removing the spin-wait in compare-exchange loops in some cases where a thread is always guaranteed to make progress (compare-exchanging a recently observed state to a new state unconditionally), it appeared to be worse in fairness and throughput so I left in the spin-waits.

@jkotas jkotas merged commit 7bd31ac into dotnet:master Dec 31, 2018
@jkotas
Copy link
Member

jkotas commented Dec 31, 2018

@Anipik Looks like the CoreLib mirror is down. Could you please take a look?

@Anipik
Copy link

Anipik commented Dec 31, 2018

There was a merge conflict due to this dotnet/corert@dfb9267

i tried to use the corert version

Dotnet-GitSync-Bot pushed a commit to Dotnet-GitSync-Bot/corert that referenced this pull request Dec 31, 2018
- In spin-wait loops that are not expected to last too long, Sleep(1) significantly delays threads from completing the operation
- From eliminating Sleep(1) in such spin-wait loops, there can be a tradeoff, better fairness vs worse throughput, because Sleep(1) removes threads from contention and in some cases fewer threads can make faster progress at the cost of delaying the Sleep(1) threads relatively significantly. Eliminating the Sleep(1) in such spin-wait loops seems to be a good tradeoff.

Signed-off-by: dotnet-bot <dotnet-bot@microsoft.com>
Dotnet-GitSync-Bot pushed a commit to Dotnet-GitSync-Bot/mono that referenced this pull request Dec 31, 2018
- In spin-wait loops that are not expected to last too long, Sleep(1) significantly delays threads from completing the operation
- From eliminating Sleep(1) in such spin-wait loops, there can be a tradeoff, better fairness vs worse throughput, because Sleep(1) removes threads from contention and in some cases fewer threads can make faster progress at the cost of delaying the Sleep(1) threads relatively significantly. Eliminating the Sleep(1) in such spin-wait loops seems to be a good tradeoff.

Signed-off-by: dotnet-bot <dotnet-bot@microsoft.com>
@jkotas
Copy link
Member

jkotas commented Dec 31, 2018

The CoreCLR version of the file is the correct one. I have force pushed the fix to #21724. Is that ok?

@Anipik
Copy link

Anipik commented Dec 31, 2018

i think that should be fine. But i think that would made the corert version out of sync. I will correct everything tomorrow morning when i have access to the commit database.
I will update everything to the coreclr version

jkotas pushed a commit to dotnet/corert that referenced this pull request Dec 31, 2018
- In spin-wait loops that are not expected to last too long, Sleep(1) significantly delays threads from completing the operation
- From eliminating Sleep(1) in such spin-wait loops, there can be a tradeoff, better fairness vs worse throughput, because Sleep(1) removes threads from contention and in some cases fewer threads can make faster progress at the cost of delaying the Sleep(1) threads relatively significantly. Eliminating the Sleep(1) in such spin-wait loops seems to be a good tradeoff.

Signed-off-by: dotnet-bot <dotnet-bot@microsoft.com>
@kouvel kouvel deleted the SpinFix branch December 31, 2018 08:54
marek-safar pushed a commit to mono/mono that referenced this pull request Dec 31, 2018
- In spin-wait loops that are not expected to last too long, Sleep(1) significantly delays threads from completing the operation
- From eliminating Sleep(1) in such spin-wait loops, there can be a tradeoff, better fairness vs worse throughput, because Sleep(1) removes threads from contention and in some cases fewer threads can make faster progress at the cost of delaying the Sleep(1) threads relatively significantly. Eliminating the Sleep(1) in such spin-wait loops seems to be a good tradeoff.

Signed-off-by: dotnet-bot <dotnet-bot@microsoft.com>
stephentoub pushed a commit to dotnet/corefx that referenced this pull request Jan 2, 2019
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
- In spin-wait loops that are not expected to last too long, Sleep(1) significantly delays threads from completing the operation
- From eliminating Sleep(1) in such spin-wait loops, there can be a tradeoff, better fairness vs worse throughput, because Sleep(1) removes threads from contention and in some cases fewer threads can make faster progress at the cost of delaying the Sleep(1) threads relatively significantly. Eliminating the Sleep(1) in such spin-wait loops seems to be a good tradeoff.

Commit migrated from dotnet/coreclr@7bd31ac
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants