[SDK] Volatile reads + MetricPoint improvements #3458
Codecov Report
@@ Coverage Diff @@
## main #3458 +/- ##
==========================================
+ Coverage 86.21% 86.54% +0.32%
==========================================
Files 265 264 -1
Lines 9598 9584 -14
==========================================
+ Hits 8275 8294 +19
+ Misses 1323 1290 -33
Makes me wonder how the stress test might be adapted to have discovered this bug. I assume with enough concurrent threads some amount of metric updates would reliably be lost by the snapshot thread resetting things.
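One way such a stress check could be shaped, as a minimal sketch only: several writer threads update a point while a snapshot thread repeatedly reads and resets it, and the sum of all snapshots is compared with the number of updates issued. `CounterPointSketch` and `LostUpdateStressSketch` are hypothetical names used for illustration and are not part of the SDK or its existing stress test. With writers and the snapshot thread sharing one `lock`, the totals should match; a point whose two paths do not exclude each other would be expected to come up short under enough contention.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for a metric point; NOT the SDK's MetricPoint.
public sealed class CounterPointSketch
{
    private readonly object lockObject = new object();
    private long running;

    // Writer path: mutate the running value under the shared lock.
    public void Update(long value)
    {
        lock (this.lockObject)
        {
            this.running += value;
        }
    }

    // Snapshot path: read and reset under the same lock.
    public long TakeSnapshotAndReset()
    {
        lock (this.lockObject)
        {
            long snapshot = this.running;
            this.running = 0;
            return snapshot;
        }
    }
}

public static class LostUpdateStressSketch
{
    // Returns the total collected across all snapshots.
    // If no update is lost, this equals writers * updatesPerWriter.
    public static long Run(int writers, int updatesPerWriter)
    {
        var point = new CounterPointSketch();
        long collected = 0;
        int writersDone = 0;

        var threads = new Thread[writers];
        for (int i = 0; i < writers; i++)
        {
            threads[i] = new Thread(() =>
            {
                for (int j = 0; j < updatesPerWriter; j++)
                {
                    point.Update(1);
                }

                Interlocked.Increment(ref writersDone);
            });
        }

        // Snapshot thread: keeps resetting the point while writers are active.
        var snapshotThread = new Thread(() =>
        {
            while (Volatile.Read(ref writersDone) < writers)
            {
                collected += point.TakeSnapshotAndReset();
                Thread.Yield();
            }
        });

        snapshotThread.Start();
        foreach (var t in threads) { t.Start(); }
        foreach (var t in threads) { t.Join(); }
        snapshotThread.Join();

        // Final drain picks up anything written after the last snapshot.
        collected += point.TakeSnapshotAndReset();
        return collected;
    }
}
```

Calling `LostUpdateStressSketch.Run(4, 50000)` and comparing the result with `200000` is the whole check. Swapping one of the two paths to a different synchronization mechanism is the kind of change this comparison is meant to expose, though how reliably it fails would depend on timing and hardware.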
private void UpdateHistogramSumCount(double number)
{
    lock (this.histogramBuckets.LockObject)
@cijothomas Possibly, but there be dragons. I'll paraphrase what @noahfalk told me.
Consider code like this...
if (Interlocked.Exchange(ref this.histogramBuckets.IsCriticalSectionOccupied, 1) == 0)
{
    this.runningValue.AsLong++;
    this.histogramBuckets.RunningSum += number;
    this.histogramBuckets.RunningBucketCounts[i]++;
    this.histogramBuckets.IsCriticalSectionOccupied = 0;
}
The compiler is free/able (by spec) to rewrite that as:
if (Interlocked.Exchange(ref this.histogramBuckets.IsCriticalSectionOccupied, 1) == 0)
{
    this.histogramBuckets.IsCriticalSectionOccupied = 0; // Danger!
    this.runningValue.AsLong++;
    this.histogramBuckets.RunningSum += number;
    this.histogramBuckets.RunningBucketCounts[i]++;
}
That is why it is risky to try and make our own mechanism. If we just use lock there, order is guaranteed to not change.
I think we can use Interlocked.Exchange (or maybe Volatile.Write) instead of simply assigning that to zero to avoid that risk.
There is an example in the docs for Interlocked that shows this approach: https://docs.microsoft.com/en-us/dotnet/api/system.threading.interlocked?view=net-6.0#examples. This example does not even use Volatile.Write. It just uses an Interlocked.Exchange to update that value.
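To make the suggestion concrete, here is a minimal sketch of a flag-guarded section that acquires with `Interlocked.Exchange` and releases with `Volatile.Write`. `FlagGuardedSumSketch` and its members are hypothetical and only mirror the shape of the snippets in this thread. One deliberate difference, which is an assumption of this sketch: it spins until the flag is free instead of making the single `if` attempt shown above, so no update is skipped. `Volatile.Write` is documented to prevent earlier reads and writes from being moved after it, which is the property the plain `= 0` assignment lacks.

```csharp
using System.Threading;

// Hypothetical illustration; NOT the SDK's HistogramBuckets or MetricPoint.
public sealed class FlagGuardedSumSketch
{
    private int isCriticalSectionOccupied; // 0 = free, 1 = held
    private long count;
    private double sum;

    public void Update(double number)
    {
        this.Acquire();

        // Plain (non-interlocked) writes, protected by the flag.
        this.count++;
        this.sum += number;

        this.Release();
    }

    public (long Count, double Sum) Read()
    {
        this.Acquire();
        var result = (this.count, this.sum);
        this.Release();
        return result;
    }

    private void Acquire()
    {
        // Interlocked.Exchange acts as a full fence; spin until we flip 0 -> 1.
        var spinner = new SpinWait();
        while (Interlocked.Exchange(ref this.isCriticalSectionOccupied, 1) != 0)
        {
            spinner.SpinOnce();
        }
    }

    private void Release()
    {
        // Release semantics: the writes above cannot be moved after this store.
        Volatile.Write(ref this.isCriticalSectionOccupied, 0);
    }
}
```

This only works if every reader and writer goes through the same flag. It does not address the case where another code path, such as a snapshot, uses a different mechanism.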
@utpilla Let's say we did it like this...
if (Interlocked.Exchange(ref this.histogramBuckets.IsCriticalSectionOccupied, 1) == 0)
{
    Interlocked.Increment(ref this.runningValue.AsLong);
    Interlocked.Add(ref this.histogramBuckets.RunningSum, number); // Not possible with double, but for discussion's sake
    Interlocked.Increment(ref this.histogramBuckets.RunningBucketCounts[i]);
    Interlocked.Exchange(ref this.histogramBuckets.IsCriticalSectionOccupied, 0);
}
(Or a different version with a mix of Interlocked vs Volatile.)
That would work (I think) but would also be a lot slower for the happy-path?
I thought maybe Thread.MemoryBarrier would help us. But the docs do recommend a lock over that 🤷
Let's say we did it like this...
if (Interlocked.Exchange(ref this.histogramBuckets.IsCriticalSectionOccupied, 1) == 0)
{
    unchecked
    {
        this.runningValue.AsLong++;
        this.histogramBuckets.RunningSum += number;
        this.histogramBuckets.RunningBucketCounts[i]++;
    }
    Interlocked.Exchange(ref this.histogramBuckets.IsCriticalSectionOccupied, 0);
}
I don't think that works, because the stuff in unchecked could be cached. It needs some kind of a memory fence. But I could be wrong; this stuff is confusing.
When I suggested using Interlocked, I didn't mean that we should switch every statement to use Interlocked like shown here:
if (Interlocked.Exchange(ref this.histogramBuckets.IsCriticalSectionOccupied, 1) == 0)
{
    Interlocked.Increment(ref this.runningValue.AsLong);
    Interlocked.Add(ref this.histogramBuckets.RunningSum, number); // Not possible with double, but for discussion sake
    Interlocked.Increment(ref this.histogramBuckets.RunningBucketCounts[i]);
    Interlocked.Exchange(ref this.histogramBuckets.IsCriticalSectionOccupied, 0);
}
I was indeed referring to this:
if (Interlocked.Exchange(ref this.histogramBuckets.IsCriticalSectionOccupied, 1) == 0)
{
    unchecked
    {
        this.runningValue.AsLong++;
        this.histogramBuckets.RunningSum += number;
        this.histogramBuckets.RunningBucketCounts[i]++;
    }
    Interlocked.Exchange(ref this.histogramBuckets.IsCriticalSectionOccupied, 0);
}
"I don't think that works. Because the stuff in unchecked could be cached. Needs some kind of a memory fence."
How does using a lock instead help with this? ^
I think it should be okay even if the instructions inside the unchecked block are re-ordered.
"the stuff in unchecked could be cached"
But does a memory barrier/fence even help with caching/freshness?
Do we have a clear understanding of what's the use case for
There was some discussion on #3384 regarding this.
Apart from the
This PR was marked stale due to lack of activity and will be closed in 7 days. Commenting or Pushing will instruct the bot to automatically remove the label. This bot runs once per day.
Closed as inactive. Feel free to reopen if this PR is still being worked on.
[Builds off of #3384]
Changes
Switch to Volatile.Read where we were doing Interlocked.Read or Interlocked.CompareExchange (for doubles).
Today there are two different locking mechanisms in play for histograms:
opentelemetry-dotnet/src/OpenTelemetry/Metrics/MetricPoint.cs
Line 340 in 2ebea48
opentelemetry-dotnet/src/OpenTelemetry/Metrics/MetricPoint.cs
Line 498 in 2ebea48
Now we use lock (this.histogramBuckets.LockObject) in both places. I think it was bugged as it existed. I ran it by @noahfalk and he suggested just using the lock, as trying to make our own mechanism is risky.
Some misc optimizations: for example, Update(long) now calls into dedicated Histogram update methods instead of entering Update(double), which needs to switch again.
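For the first change listed above, here is a minimal sketch of the two ways of reading a double field that the description contrasts. `DoubleReadSketch` and its members are hypothetical and are not taken from MetricPoint.cs; only `Interlocked.CompareExchange` and `Volatile.Read`/`Volatile.Write` are real BCL calls. The compare-exchange form performs an interlocked read-modify-write purely to obtain the current value, while `Volatile.Read` is a read with acquire semantics.

```csharp
using System.Threading;

// Hypothetical illustration; NOT code from MetricPoint.cs.
public sealed class DoubleReadSketch
{
    private double runningValue;

    public void Set(double value)
    {
        Volatile.Write(ref this.runningValue, value);
    }

    // Older pattern: a compare-exchange used only to read the current value.
    // If the field equals 0 it is replaced with 0, so the value never changes.
    public double ReadWithCompareExchange()
    {
        return Interlocked.CompareExchange(ref this.runningValue, 0d, 0d);
    }

    // Pattern the PR description says it switches to.
    public double ReadWithVolatile()
    {
        return Volatile.Read(ref this.runningValue);
    }
}
```

Both methods return the same value for the same field state. Whether the volatile read is measurably cheaper in the SDK's hot path is not something this sketch demonstrates.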