Fix some regressions in ASP.NET benchmarks
- dotnet#44265 seems to have caused large regressions on Windows and Linux-arm64. During that change we had tested adding `Sleep(1)` to some `ConcurrentQueue` operations in contending cases, and not spin-waiting at all in forward-progressing cases. Not spin-waiting at all where possible seemed to be better than or equal to spin-waiting without `Sleep(1)` for the most part under contention, so I have removed spin-waiting in the forward-progressing cases in `ConcurrentQueue` (see the spin-policy sketch after this list).
- There were some regressions from the portable thread pool on Windows. I have moved/tweaked a slight delay that I had added early on; changes made after it was added had caused it to lose its original intent. With this change it goes back to the original intent and seems to close some of the gap, though maybe not all of it in some tests. We'll check the graphs after this change and see whether there is more to investigate. There are also other things to improve on Windows; many of those may be separate from the portable thread pool, but some may be relevant to the changes in perf characteristics. A standalone sketch of the delay pattern also follows this list.
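
For illustration only, here is a minimal standalone sketch of the spin policy this commit moves `ConcurrentQueue` toward. The type and fields are hypothetical stand-ins, not the actual `ConcurrentQueueSegment<T>` code; the only real API assumed is `SpinWait.SpinOnce(int sleep1Threshold)`, where `-1` means "never fall back to `Sleep(1)`".

```csharp
using System.Threading;

// Hypothetical sketch: spin-wait only when this thread is waiting on another thread's
// pending write; never spin-wait when the next loop iteration is guaranteed to observe
// newer shared state (the "always-forward-progressing" cases in the diff below).
internal static class SpinPolicySketch
{
    private static int s_head;
    private static volatile bool s_published;

    // Forward-progressing contention: a failed CompareExchange means another thread has
    // already advanced s_head, so the next iteration reads newer state. Per this change,
    // there is no spin-wait before retrying.
    public static int AdvanceHead()
    {
        while (true)
        {
            int head = Volatile.Read(ref s_head);
            if (Interlocked.CompareExchange(ref s_head, head + 1, head) == head)
            {
                return head;
            }
            // Lost the race, but another thread made progress: just retry.
        }
    }

    // Not forward-progressing: this thread must wait for another thread to finish a
    // write it has already claimed. Spin-wait to avoid a pure busy-wait, but pass
    // sleep1Threshold: -1 so SpinWait never escalates to Thread.Sleep(1).
    public static void WaitForPublication()
    {
        SpinWait spinner = default;
        while (!s_published)
        {
            spinner.SpinOnce(sleep1Threshold: -1);
        }
    }
}
```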
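
For the thread pool change, here is a rough standalone sketch of the delay that this commit moves into the worker loop, before serving the next pending work request. `Thread.UninterruptibleSleep0()` and `Environment.IsSingleProcessor` are runtime-internal helpers, so `Thread.Sleep(0)` and `Environment.ProcessorCount` stand in for them here; this approximates the intent and is not the real thread pool code.

```csharp
using System;
using System.Threading;

internal static class WorkerDelaySketch
{
    // Called when a worker has finished its current batch but there is still a pending
    // request for work: introduce a slight delay before serving the next request so that
    // workers don't immediately re-enter Dispatch and thrash on shared data and
    // interlocked operations during short bursts of work.
    public static void DelayBeforeServingNextRequest()
    {
        // Yield the rest of the time slice to another ready thread, if there is one.
        Thread.Sleep(0);

        // Sleep(0) is effectively a no-op when no other thread is ready to run, so also
        // spin briefly; skip the spin on single-processor machines.
        if (Environment.ProcessorCount > 1)
        {
            Thread.SpinWait(1);
        }
    }
}
```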
kouvel committed Dec 16, 2020
1 parent 8c863fe commit 6084dea
Showing 2 changed files with 56 additions and 30 deletions.
@@ -37,9 +37,6 @@ internal sealed class ConcurrentQueueSegment<T>
internal ConcurrentQueueSegment<T>? _nextSegment; // SOS's ThreadPool command depends on this name
#pragma warning restore 0649

/// <summary>Threshold to spin before allowing threads to sleep when there is contention.</summary>
private const int Sleep1Threshold = 8;

/// <summary>Creates the segment.</summary>
/// <param name="boundedLength">
/// The maximum number of elements the segment can contain. Must be a power of 2.
@@ -162,6 +159,9 @@ public bool TryDequeue([MaybeNullWhen(false)] out T item)
}
return true;
}

// The head was already advanced by another thread. A newer head has already been observed and the next
// iteration would make forward progress, so there's no need to spin-wait before trying again.
}
else if (diff < 0)
{
@@ -182,11 +182,19 @@ public bool TryDequeue([MaybeNullWhen(false)] out T item)

// It's possible it could have become frozen after we checked _frozenForEnqueues
// and before reading the tail. That's ok: in that rare race condition, we just
// loop around again.
// loop around again. This is not necessarily an always-forward-progressing
// situation since this thread is waiting for another to write to the slot and
// this thread may have to check the same slot multiple times. Spin-wait to avoid
// a potential busy-wait, and then try again.
spinner.SpinOnce(sleep1Threshold: -1);
}
else
{
// The item was already dequeued by another thread. The head has already been updated beyond what was
// observed above, and the sequence number observed above as a volatile load is more recent than the update
// to the head. So, the next iteration of the loop is guaranteed to see a new head. Since this is an
// always-forward-progressing situation, there's no need to spin-wait before trying again.
}

// Lost a race. Spin a bit, then try again.
spinner.SpinOnce(Sleep1Threshold);
}
}

@@ -243,11 +251,19 @@ public bool TryPeek([MaybeNullWhen(false)] out T result, bool resultUsed)

// It's possible it could have become frozen after we checked _frozenForEnqueues
// and before reading the tail. That's ok: in that rare race condition, we just
// loop around again.
// loop around again. This is not necessarily an always-forward-progressing
// situation since this thread is waiting for another to write to the slot and
// this thread may have to check the same slot multiple times. Spin-wait to avoid
// a potential busy-wait, and then try again.
spinner.SpinOnce(sleep1Threshold: -1);
}
else
{
// The item was already dequeued by another thread. The head has already been updated beyond what was
// observed above, and the sequence number observed above as a volatile load is more recent than the update
// to the head. So, the next iteration of the loop is guaranteed to see a new head. Since this is an
// always-forward-progressing situation, there's no need to spin-wait before trying again.
}

// Lost a race. Spin a bit, then try again.
spinner.SpinOnce(Sleep1Threshold);
}
}

@@ -261,7 +277,6 @@ public bool TryEnqueue(T item)
Slot[] slots = _slots;

// Loop in case of contention...
SpinWait spinner = default;
while (true)
{
// Get the tail at which to try to return.
@@ -292,6 +307,9 @@ public bool TryEnqueue(T item)
Volatile.Write(ref slots[slotsIndex].SequenceNumber, currentTail + 1);
return true;
}

// The tail was already advanced by another thread. A newer tail has already been observed and the next
// iteration would make forward progress, so there's no need to spin-wait before trying again.
}
else if (diff < 0)
{
@@ -302,9 +320,14 @@
// we need to enqueue in order.
return false;
}

// Lost a race. Spin a bit, then try again.
spinner.SpinOnce(Sleep1Threshold);
else
{
// Either the slot contains an item, or it is empty but because the slot was filled and dequeued. In either
// case, the tail has already been updated beyond what was observed above, and the sequence number observed
// above as a volatile load is more recent than the update to the tail. So, the next iteration of the loop
// is guaranteed to see a new tail. Since this is an always-forward-progressing situation, there's no need
// to spin-wait before trying again.
}
}
}

@@ -59,6 +59,24 @@ private static void WorkerThreadStart()
alreadyRemovedWorkingWorker = true;
break;
}

if (threadPoolInstance._separated.numRequestedWorkers <= 0)
{
break;
}

// In highly bursty cases with short bursts of work, especially in the portable thread pool
// implementation, worker threads are being released and entering Dispatch very quickly, not finding
// much work in Dispatch, and soon afterwards going back to Dispatch, causing extra thrashing on
// data and some interlocked operations, and similarly when the thread pool runs out of work. Since
// there is a pending request for work, introduce a slight delay before serving the next request.
// The spin-wait is mainly for when the sleep is not effective due to there being no other threads
// to schedule.
Thread.UninterruptibleSleep0();
if (!Environment.IsSingleProcessor)
{
Thread.SpinWait(1);
}
}

// Don't spin-wait on the semaphore next time if the thread was actively stopped from processing work,
@@ -141,21 +159,6 @@ private static void RemoveWorkingWorker(PortableThreadPool threadPoolInstance)
currentCounts = oldCounts;
}

if (currentCounts.NumProcessingWork > 1)
{
// In highly bursty cases with short bursts of work, especially in the portable thread pool implementation,
// worker threads are being released and entering Dispatch very quickly, not finding much work in Dispatch,
// and soon afterwards going back to Dispatch, causing extra thrashing on data and some interlocked
// operations. If this is not the last thread to stop processing work, introduce a slight delay to help
// other threads make more efficient progress. The spin-wait is mainly for when the sleep is not effective
// due to there being no other threads to schedule.
Thread.UninterruptibleSleep0();
if (!Environment.IsSingleProcessor)
{
Thread.SpinWait(1);
}
}

// It's possible that we decided we had thread requests just before a request came in,
// but reduced the worker count *after* the request came in. In this case, we might
// miss the notification of a thread request. So we wake up a thread (maybe this one!)