-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Methods JITed during startup #67398
Comments
Tagging subscribers to this area: @JulieLeeMSFT Issue DetailsRepro:
Actual result: Expected result: R2R version of
|
Could this be caused by the use of |
Cc @brianrob |
crossgen with
|
Yes, it is probably related. However, we should have enough flexibility in crossgen now to make it possible to precompile this successfully (as long as the app is running on current hardware at least). |
Related: dotnet/designs#173 (and dotnet/core#7131) I assume we're going to default to Avx2 so it kinds of fixes itself |
@trylek @dotnet/crossgen-contrib @richlander . Yeah we have been discussing compiling with avx2 by default. |
As @EgorBo raises, it seems like the base question on why the R2R code cannot be used as-is is the basic question. If all non-AVX2 code is thrown out, then that R2R code is useless, right? Ideally, the following is true:
As we all know, we are not doing this today. For this conversation, it seems like the first topic is the concern. |
From my understanding the main problem that // This method is Jitted (uses AVX2)
void Method1()
{
Console.WriteLine(Method2(new Vector<long>(1)));
// Expected: {2,2,2,2}
// Actual: {2,2,1,1} or {2,2,0,0} or {2,2,garbage values}
}
// This method is R2R'd with SSE2
Vector<long> Method2(Vector<long> v)
{
return v * 2; // {v.long0 * 2, v.long1 * 2 }
} So we don't allow this to happen and when we see Vector on a non-trimmed path and don't have AVX2 we fallback to JIT. if (Sse2.IsSupported)
{
}
else if (Vector.IsHardwareAccelerated)
{
Vector<T>
} |
Right. That's my understanding, too. To my mind, our approach doesn't make sense. Why bother R2R-compiling methods that won't be used on most hardware? We should either compile those methods for the most common hardware (not SSE2) or not compile them at all.
trimmed? Do you mean tiered? |
Trimmed, but it's rather a wrong word here, consider the following example: public static void Test1()
{
if (Sse2.IsSupported)
{
Console.WriteLine(Vector128.Create(2) * 2);
}
else
{
Console.WriteLine(new Vector<int>(2) * 2);
}
}
public static void Test2(bool cond)
{
if (cond && Sse2.IsSupported) // the only difference is `cond &&`
{
Console.WriteLine(Vector128.Create(2) * 2);
}
else
{
Console.WriteLine(new Vector<int>(2) * 2);
}
} When we run crossgen for this snippet only
|
There are few potentional solutions for the
|
List of methods in
|
Option 4 seems the most attractive. It's simple. I think we should validate that it is unacceptable before looking at more exotic options. @ivdiazsa has been doing startup tests with R2R+AVX2 vs retail (what we ship today). He's been able to demonstrate a win with R2R composite but not with our regular crossgen/R2R configuration. That seems odd to me. It's hard for me to imagine that JITing would be faster than no JITing. I'm guessing that there is something else at play. It wouldn't surprise me if there is something else in our system not expecting AVX2 code in our R2R images. Also, the test that @ivdiazsa was using might not be the best one (TE Plaintext). He's going to switch to a different ASP.NET test that is known to rely more on SIMD. I'm not sure if he was collecting the diff in # of JITed methods with those configuration. That would also be interesting to know. Perhaps @EgorBo could pair with @ivdiazsa to double the brain power on this problem. You know, we could crunch more data with the same clock cycles. |
Wouldn't option 4 mean blocking creating non AVX2 R2R images? |
I assume it was Platform-Plaintext? I've just tried it on our perflab (linux-citrine 28cores machine) and here is the results:
|
No. What led you to that conclusion? There are three decision points:
We're exclusively talking about the last topic. The middle topic is also interesting. There is no discussion about the first topic.
That configuration is not what I intended nor what @ivdiazsa is testing. There are two things to measure:
Ideally, the app is NOT R2R compiled but JITs. That's what most customer apps do. Measuring all R2R is also interesting but a SECONDARY scenario since most customers don't do that.
|
@richlander sorry, but I am still not sure I understand what is being tested. I fully realize that the most common case is when only BCL libs are prejitted (well, maybe a few 3rd party libs) and the rest is jitted - against what we test this mode? Against noR2R even for BCL?
What does this mean exactly? VM/JIT compiled with |
I'm only talking about R2R not how we configure the native toolset. AFAIK, the ability to specify AVX2 as the SIMD instruction set is not specific to composite. It was at one point, but I believe that's been resolved. That's certainly something to validate! I'd expect that all the managed assemblies we have today are R2R compiled the same way as today (not composite) with the addition of a flag to affect SIMD instructions. This necessarily include NetCoreApp and ASP.NET Core frameworks. We're very close to having container images (hopefully this coming week; it is Sunday here) that have this configuration. That will enable us to do A vs B perf testing much easier. It will also enable smart people inspect the configuration to validate that it is correct. It is VERY EASY to do this wrong. We've been working on this problem for months and have had a lot of problems. |
@EgorBo @richlander is this still pending on more perf validation of non-composite AVX2 config? |
I'm not sure. Do we have any data that we can share? |
Neither am I, the root issue here can be fixed if we bump minimal baseline for R2R (dotnet/designs#173) do we plan to do it in 7.0? |
We're not making any more significant changes in 7.0, as you likely know. We had a fair number of conversations about this topic over the last few months. Here's the basic context/problem:
Did I get that right? @ivdiazsa @mangod9 @davidwrighton @tannergooding |
Btw, |
IIRC, corelib is special and ignores and works under the presumption that we "do the right thing" and ensure all paths behave identically. While user code has the presumption that the two paths could differ in behavior and so will always throw out the R2R implementation if any path attempted to use an incompatible higher ISA. -- That is, SSE3, SSSE3, SSE4.1, and SSE4.2 are "encoding compatible with" SSE2 and so the There are a few things to then keep in mind for corelib code:
The discussion around targeting AVX/AVX2 is really one around targeting There is also a consideration that this impacts managed code but not native. Continuing to have the JIT, GC, and other native bits target a 20 year old baseline computer has some negative downsides with the main upside being that devs can "xcopy" their bits onto a flash drive and then run it on any computer. Even Arm64 has the consideration that I expect we could see some reasonable perf gains if we experimented with allowing a "less-portable" version of the JIT/GC that targeted a higher baseline (for both x64 and Arm64). I expect that this would be a net benefit for a majority of cases with it being "worse" only for a small subset of users with hardware that more than 15 years old (at which point, they likely aren't running a modern OS and likely have many other considerations around general app usage). |
is this something which can be done in 7? Otherwise we could move this broader decision around avx2 to 8. |
We'll take another run in .NET 8. |
we have enabled avx2 for the new "composite" container images in 8. Is there any other specific work required here? |
Just tried
I don't see |
#85791 is marked as Future. Surprised that |
public static void ThrowIfNegativeOrZero<T>(T value, [CallerArgumentExpression(nameof(value))] string? paramName = null)
where T : INumberBase<T>
{
if (T.IsNegative(value) || T.IsZero(value))
ThrowNegativeOrZero(paramName, value);
} I guess it's the same SVM issue so presumably it will be fixed with #87438 |
hmm, makes sense |
Repro:
Console.WriteLine("Hello world".Replace('e', 'a'));
Actual result:
String.Replace
is JITed in the foregroundExpected result: R2R version of
String.Replace
is, no JITing ofString.Replace
observed.The text was updated successfully, but these errors were encountered: