The AllocationTick threshold is computed by a Poisson process with a 100 KB mean. #85750
Conversation
Tagging subscribers to this area: @dotnet/gc

Issue Details: The fixed 100 KB threshold used to trigger the AllocationTick event does not allow the real allocations to be estimated accurately. Using a Poisson process to compute the threshold provides much better statistical results. This first step replaces the fixed threshold with an exponentially distributed one for the separate SOH/LOH/POH thresholds/counters. The next steps will be to:
Relates to #49424.
if (EVENT_ENABLED(GCAllocationTick_V4))
#endif
{
    // compute the next threshold based on a Poisson process with a etw_allocation_tick_mean average
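As a sketch of what computing such a threshold can look like (the function name and shape are hypothetical, not the actual gc.cpp implementation), the exponential draw uses inverse-transform sampling, x = -mean * ln(1 - u) with u uniform in [0, 1):

```cpp
#include <cmath>
#include <cstddef>
#include <random>

// Hypothetical sketch (not the actual gc.cpp code): draw the next
// AllocationTick threshold from an exponential distribution whose mean is
// mean_bytes (e.g. 100 * 1024), via inverse-transform sampling:
// x = -mean * ln(1 - u), with u uniform in [0, 1).
static size_t next_allocation_tick_threshold(double mean_bytes, std::mt19937_64& rng)
{
    std::uniform_real_distribution<double> uniform(0.0, 1.0); // u in [0, 1)
    double u = uniform(rng);
    // 1 - u lies in (0, 1], so the logarithm is defined and the result is >= 0.
    return static_cast<size_t>(-mean_bytes * std::log(1.0 - u));
}
```

Individual thresholds vary, but averaged over many draws they converge to the configured mean, which is what removes the bias of a fixed threshold.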
many of us aren't greatly familiar with statistics, so making this explanation less vague would be helpful. instead of saying "based on a Poisson process" it'd be much more helpful to start with something like "we are treating this as a Poisson process because each sample we take has no influence on any other sample; the samples are exponentially distributed in a Poisson process, meaning that the probability of the next sample having happened is given by (1 - e^(-lambda*x))", and then explain what lambda and x would be in this particular context so the readers know how the formula you are using came to be.

also, -ln(1 - uniformly_random_number_between_0_and_1) has the same distribution as -ln(uniformly_random_number_between_0_and_1), so I don't think you need the 1 - part.
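The identity in this comment can be checked numerically: if U is uniform on (0, 1) then 1 - U is too, so both expressions are exponentially distributed with mean 1. The helper below is a hypothetical illustration, not code from the PR:

```cpp
#include <cmath>
#include <random>

// Computes the sample mean of -ln(1 - u) or -ln(u) over n uniform draws.
// Both should approach 1, since U and 1 - U are identically distributed
// on (0, 1) and -ln of either is exponential with mean 1.
static double mean_neg_log(bool use_one_minus, long n, unsigned seed)
{
    std::mt19937_64 rng(seed);
    // Start just above 0 to keep log() away from log(0).
    std::uniform_real_distribution<double> uniform(1e-12, 1.0);
    double sum = 0.0;
    for (long i = 0; i < n; ++i)
    {
        double u = uniform(rng);
        sum += use_one_minus ? -std::log(1.0 - u) : -std::log(u);
    }
    return sum / static_cast<double>(n);
}
```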
can you please show the results of running this on some workloads where it is much better than the current implementation? also, have you tried just a uniformly random distribution instead of an exponential distribution?
I updated the description accordingly, with additional information about the upscaling formula.
I'm currently simulating the results based on a web application for which I record ALL allocations using ICorProfilerCallback::ObjectAllocated(), and I check the real sizes against the sampled-then-upscaled sizes. The variance of the upscaled results is almost random with the fixed threshold, much better with the variable threshold as in the first commit, and a little better still if sampling could happen within an allocation context.

Since the recorder is available in the Datadog profiler only, it will be complicated to generate the corresponding .balloc files (i.e. the list of allocations - type + size) used by the simulation to show results on an arbitrary application. BTW, is there any sample application that you would like to see used as an example?
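For readers without access to those recordings, the kind of simulation described here can be sketched on a synthetic allocation stream. Everything below is a hypothetical stand-in for the .balloc replay (names and structure are mine, not the Datadog recorder): fire a simulated AllocationTick whenever the bytes since the last tick cross an exponentially distributed threshold, then upscale and compare with the real total.

```cpp
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

struct SamplingResult { double real_bytes; double upscaled_bytes; };

// Replay an allocation stream; a simulated tick fires whenever the bytes
// allocated since the last tick reach an exponentially distributed threshold
// (mean 100,000 bytes). The sampled bytes are then upscaled with
// p = 1 - e^(-average_object_size / mean) as in the PR description.
static SamplingResult simulate(const std::vector<size_t>& allocations,
                               double average_object_size, unsigned seed)
{
    const double mean = 100000.0;
    std::mt19937_64 rng(seed);
    std::uniform_real_distribution<double> uniform(0.0, 1.0);
    double threshold = -mean * std::log(1.0 - uniform(rng));
    double since_last_tick = 0.0, real = 0.0, sampled = 0.0;
    for (size_t size : allocations)
    {
        real += static_cast<double>(size);
        since_last_tick += static_cast<double>(size);
        if (since_last_tick >= threshold) // tick: record this allocation
        {
            sampled += static_cast<double>(size);
            since_last_tick = 0.0;
            threshold = -mean * std::log(1.0 - uniform(rng));
        }
    }
    double p = 1.0 - std::exp(-average_object_size / mean);
    return { real, sampled / p };
}
```

On a single-type stream the upscaled total lands close to the real total because the exponential threshold is memoryless: each object of size s is sampled with probability 1 - e^(-s/mean) regardless of what happened before it.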
Keeping the 1 - part sticks to the mathematical way the formula is derived. Since the result has the same distribution either way, I would recommend keeping it as it is, but I have no problem changing it.
Based on the GC team’s current schedule and an offline discussion with @chrisnas, we have decided to close this PR and re-evaluate it during our .NET 9 planning time. We definitely recognize the value of this idea and would like to pursue it in the future, but right now the GC team simply does not have the time to see this through. We really want to thank you for your contribution and will absolutely let you know about our plan in .NET 9.
The fixed 100 KB threshold used to trigger the AllocationTick event does not allow the real allocations to be estimated accurately. The main issue is not really the sampling itself but the fact that the sampled sizes cannot be upscaled back to sizes of the same order of magnitude as the real ones. Keeping the relative size differences between allocated types is also important.
As a solution, it is possible to model the sampling as a Poisson process, because each sample we take has no influence on any other sample. The samples are exponentially distributed in a Poisson process, meaning that the probability that the next sample has happened is given by the exponential cumulative distribution function CDF(x) = 1 - e^(-lambda*x).
In our context, this means we need to compute x (the allocated size to wait for before taking the next sample) from the mean of the distribution; lambda is the probability of sampling any given byte, i.e. 1/100,000 for the current 100 KB threshold. With this sampling in place, the "real" allocated sizes can be estimated from the sampled sizes with the following upscaling formula, where average size is the mean allocated size of the type being upscaled:

upscaled size = sampled size / (1 - e^(-average size / 100,000))
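As a worked illustration of this formula (the helper below is hypothetical, not PR code; the constant 100,000 matches the 100 KB mean above): a type whose average object size is 100,000 bytes has per-object sampling probability 1 - e^(-1), about 0.632, so its sampled bytes are divided by that factor.

```cpp
#include <cmath>

// Upscaling formula from the description: divide the sampled bytes of a type
// by the probability that an object of that type is sampled at least once,
// p = 1 - e^(-average_object_size / mean_bytes), with mean_bytes = 100,000.
static double upscale(double sampled_bytes, double average_object_size,
                      double mean_bytes = 100000.0)
{
    double p = 1.0 - std::exp(-average_object_size / mean_bytes);
    return sampled_bytes / p;
}
```

For example, 1,000 sampled bytes of a type with a 100,000-byte average object size upscale to roughly 1,000 / 0.632 ≈ 1,582 bytes; for very large objects p approaches 1 and the sampled size is already close to the real one.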
Three additional threshold sampling scenarios are implemented to compare against the fixed 100 KB one:
The last one seems to provide the best results once upscaled.
Note that using the Poisson process to upscale even the current fixed-threshold scenario gives better results than the simple upscaling mechanism based on per-type allocated size / total allocated size.
Relates to #49424.