-
Notifications
You must be signed in to change notification settings - Fork 773
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Lock-free updates for Histogram #2951
Lock-free updates for Histogram #2951
Conversation
Codecov Report
@@ Coverage Diff @@
## main #2951 +/- ##
==========================================
+ Coverage 83.98% 84.02% +0.04%
==========================================
Files 254 254
Lines 8942 8946 +4
==========================================
+ Hits 7510 7517 +7
+ Misses 1432 1429 -3
|
Great analysis! 👍 |
this.histogramBuckets.RunningBucketCounts[i]++; | ||
if (Interlocked.Exchange(ref this.histogramBuckets.UsingHistogram, 1) == 0) | ||
{ | ||
this.runningValue.AsLong++; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't look into the underlying code, wonder if this ++
might throw (e.g. integer overflow case). If that's the case, we might need to make sure we release the spinlock.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't acquire any lock in the first place to be released. SpinWait
is used to just smartly apply context switch for the thread.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one Ln335 we set this.histogramBuckets.UsingHistogram = 0
, if we throw before this, all other threads would spin I guess?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think unchecked
would solve the problem here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://github.com/open-telemetry/opentelemetry-dotnet/blob/main/src/OpenTelemetry/Internal/CircularBuffer.cs#L139-L143 We could explore if we need to have an exit plan like done for CircularBuffer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM once https://github.com/open-telemetry/opentelemetry-dotnet/pull/2951/files#r815210688 is confirmed.
Approach LGTM 🚀 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add to changelog and we are good to go.
…om/utpilla/opentelemetry-dotnet into utpilla/Lock-free-Histogram-Update
Co-authored-by: Cijo Thomas <cithomas@microsoft.com>
Changes
Histogram
MetricPoint
struct as we add a newint
which would be used for synchronizationIf this approach seems good, I can implement it for
HistogramSumCount
as well.Stress Test Results
To better understand the perf improvement, we need to look at two scenarios:
MetricPoint
. This was tested by running the following in theRun
method of stress test:MetricPoint
. This was tested by running the following in theRun
method of stress test:While there is a perf improvement in both the cases, there is a substantial improvement in the second case where there is high contention to update the same
MetricPoint
.For the first scenario, Loops/ second go up from ~20M to ~23M (~15% increase)
For the second scenario, Loops/second go up from ~5.8M to ~7.6M (~31% increase)
Here are the numbers for the 1st and 2nd scenario respectively:
main branch
With this PR