
The AllocationTick threshold is computed by a Poisson process with a 100 KB mean. #85750

Closed

Conversation

chrisnas (Contributor) commented May 4, 2023

The fixed 100 KB threshold used to trigger the AllocationTick event does not give a good way to estimate the real allocations. The main issue is not the sampling itself but the fact that it is not possible to upscale the sampled sizes back to values of the same order of magnitude as the real ones. Keeping the relative size differences between allocated types is also important.

As a solution, it is possible to model allocation sampling as a Poisson process, because each sample we take has no influence on any other sample. In a Poisson process the inter-sample distances are exponentially distributed, meaning that the probability that the next sample has happened within x bytes is given by the exponential cumulative distribution function CDF = 1 - e^(-lambda*x).

In our context, this means that we need to compute x (the number of allocated bytes to wait before the next sample) from the mean of the distribution (lambda is the probability of sampling any given byte - in our case 1/100,000 for the current 100 KB threshold). With this sampling, it is possible to upscale/estimate the "real" allocated sizes from the sampled sizes with the following formula:
upscaled size = sampled size / (1 - e^(-average size / 100,000))

Three additional threshold sampling scenarios are implemented to compare against the fixed 100 KB one:

  • 100 KB +/- 50 KB random
  • exponential
  • exponential triggered within an allocation context

The last one seems to provide the best results once upscaled. Note that using the Poisson process to upscale the current fixed scenario already gives better results than the simple upscaling mechanism based on per-type allocation size / total allocated size.
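The exponential scenarios draw each new threshold by inverse transform sampling of the exponential CDF. The following is a hypothetical sketch of that draw; the identifiers are illustrative and are not the actual CoreCLR symbols (only `etw_allocation_tick_mean` is mentioned in the PR's code comment).

```cpp
// Hypothetical sketch of drawing the next AllocationTick threshold from an
// exponential distribution via inverse transform sampling. Names are
// illustrative, not the actual CoreCLR identifiers.
#include <cmath>
#include <cstdint>
#include <random>

// Mean of the distribution: 100 KB, matching the current fixed threshold.
constexpr double etw_allocation_tick_mean = 100 * 1024.0;

// Inverting CDF = 1 - e^(-x / mean) gives x = -mean * ln(1 - u),
// with u uniformly distributed in [0, 1).
uint64_t NextAllocationTickThreshold(std::mt19937_64& rng)
{
    std::uniform_real_distribution<double> uniform(0.0, 1.0);
    double u = uniform(rng);
    return static_cast<uint64_t>(-etw_allocation_tick_mean * std::log(1.0 - u));
}
```

Averaged over many draws, the thresholds come out around 100 KB, but individual thresholds vary, which removes the sampling bias that a fixed period introduces for repetitive allocation patterns.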

Relates to #49424.

@ghost ghost added the community-contribution Indicates that the PR has been added by a community member label May 4, 2023
ghost commented May 4, 2023

Tagging subscribers to this area: @dotnet/gc
See info in area-owners.md if you want to be subscribed.

Issue Details

The fixed 100 KB threshold used to trigger the AllocationTick event does not give a good way to estimate the real allocations. Using a Poisson process to compute the threshold provides much better statistical results.

This first step changes the fixed threshold to an exponential distribution for the SOH/LOH/POH thresholds/counters.

The next steps will be to:

  • add a new GC keyword to enable that new behaviour without changing the AllocationTick_V4 payload. The check should be made by calling GCEventStatus::IsEnabled(provider, keyword, level) instead of EVENT_ENABLED(GCAllocationTick_V4).
  • try to check for variable threshold when it ends within an allocation context

Relates to #49424.

Author: chrisnas
Assignees: -
Labels: area-GC-coreclr
Milestone: -

    if (EVENT_ENABLED(GCAllocationTick_V4))
    #endif
    {
        // compute the next threshold based on a Poisson process with a etw_allocation_tick_mean average
A Member commented:
Many of us aren't greatly familiar with statistics, so making this explanation less vague would be helpful. Instead of saying "based on a Poisson process", it'd be much more helpful to start with something like "we are treating this as a Poisson process because each sample we take has no influence on any other sample; the samples are exponentially distributed in a Poisson process, meaning that the probability of the next sample happening is calculated by (1 - e^(-lambda*x))", and then explain what lambda and x would be in this particular context so the readers know how the formula you are using came to be.

Also, -ln(1 - uniformly_random_number_between_0_and_1) is distributed the same as -ln(uniformly_random_number_between_0_and_1), so I don't think you need the "1 -" part.

Can you please show the results of running this on some workloads where this is much better compared to the current implementation? Also, have you tried with just a uniformly random distribution instead of an exponential distribution?

chrisnas (Contributor, Author) replied:

> Many of us aren't greatly familiar with statistics, so making this explanation less vague would be helpful. Instead of saying "based on a Poisson process", it'd be much more helpful to start with something like "we are treating this as a Poisson process because each sample we take has no influence on any other sample; the samples are exponentially distributed in a Poisson process, meaning that the probability of the next sample happening is calculated by (1 - e^(-lambda*x))", and then explain what lambda and x would be in this particular context.

I updated the description accordingly, with additional information about the upscaling formula.

chrisnas (Contributor, Author) replied:

> Can you please show the results of running this on some workloads where this is much better compared to the current implementation? Also, have you tried with just a uniformly random distribution instead of an exponential distribution?

I'm currently simulating the results based on a web application for which I'm recording ALL allocations using ICorProfilerCallback::ObjectAllocated() and checking them against the sampled-then-upscaled sizes. The variance of the results is almost random for the fixed threshold, much better for the variable threshold as in the first commit, and a little better still if sampling could happen within an allocation context.
Since the recorder is available in the Datadog profiler only, it will be complicated to generate the corresponding .balloc files (i.e. lists of allocations - type+size) used by the simulation to show results on an arbitrary application. BTW, is there any sample application that you would like to see used as an example?

chrisnas (Contributor, Author) commented May 7, 2023:

> Also, -ln(1 - uniformly_random_number_between_0_and_1) is distributed the same as -ln(uniformly_random_number_between_0_and_1), so I don't think you need the "1 -" part.

This sticks to the mathematical derivation of the formula. Since the result is the same either way, I would recommend keeping it as it is, but I have no problem changing it.
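The equivalence both sides agree on here can be checked numerically: if u is uniform on (0, 1) then so is 1 - u, so -ln(1 - u) and -ln(u) draw from the same exponential distribution. A small sketch (function name is illustrative):

```cpp
// Numerical check that -ln(1 - u) and -ln(u) give the same exponential
// distribution (mean 1 here); only the mapping from u to sample differs.
#include <cmath>
#include <cstdint>
#include <random>

double MeanOfDraws(bool useOneMinusU, int n, uint64_t seed)
{
    std::mt19937_64 rng(seed);
    // Lower bound slightly above 0 keeps log() away from log(0).
    std::uniform_real_distribution<double> uniform(1e-12, 1.0);
    double sum = 0.0;
    for (int i = 0; i < n; ++i)
    {
        double u = uniform(rng);
        sum += useOneMinusU ? -std::log(1.0 - u) : -std::log(u);
    }
    return sum / n;
}
```

Both variants converge to the same mean; the "1 -" form simply mirrors the textbook inversion of the CDF, which is why the PR keeps it.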

Maoni0 (Member) commented Jun 6, 2023

Based on the GC team's current schedule and an offline discussion with @chrisnas, we have decided to close this PR and re-evaluate it during our .NET 9 planning. We definitely recognize the value of this idea and would like to pursue it in the future, but right now the GC team simply does not have the time to see it through. We really want to thank you for your contribution, and we will absolutely let you know about our plan in .NET 9.

@Maoni0 Maoni0 closed this Jun 6, 2023
@ghost ghost locked as resolved and limited conversation to collaborators Jul 7, 2023