
[NETSDKE2E][ARM64][intermittent] Get "Fatal error. System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt." when running the commands quickly #41588

Closed
v-ainigao opened this issue Jun 14, 2024 · 34 comments · Fixed by #44930

@v-ainigao commented Jun 14, 2024

Build info:
.NET 8.0.400-preview.0.24312.1 (runtime 8.0.5) SDK

Steps:
1. Install the .NET 8 preview 4 SDK.
2. Quickly run the following commands (paste them in directly):

dotnet new sln -o s1
cd s1
dotnet new console -o a 
dotnet new classlib -o b -f netstandard2.1
dotnet new classlib -o c 
dotnet sln add a\a.csproj
dotnet sln add b\b.csproj
dotnet sln add c\c.csproj
cd a
dotnet add reference ..\b\b.csproj
dotnet add reference ..\c\c.csproj

Expected result:
The commands run successfully without returning any errors.

Actual result:
Get "Fatal error. System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt." when running the commands quickly.

Fatal error. System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
Repeat 2 times:
--------------------------------
   at Microsoft.VisualStudio.Setup.Configuration.IEnumSetupInstances.Next(Int32, Microsoft.VisualStudio.Setup.Configuration.ISetupInstance[], Int32 ByRef)
--------------------------------
   at Microsoft.DotNet.Workloads.Workload.VisualStudioWorkloads.GetVisualStudioInstances()
   at Microsoft.DotNet.Workloads.Workload.VisualStudioWorkloads.GetInstalledWorkloads(Microsoft.NET.Sdk.WorkloadManifestReader.IWorkloadResolver, Microsoft.DotNet.Workloads.Workload.List.InstalledWorkloadsCollection, System.Nullable`1<Microsoft.NET.Sdk.WorkloadManifestReader.SdkFeatureBand>)
   at Microsoft.DotNet.Workloads.Workload.List.WorkloadInfoHelper.AddInstalledVsWorkloads(System.Collections.Generic.IEnumerable`1<Microsoft.NET.Sdk.WorkloadManifestReader.WorkloadId>)
   at Microsoft.DotNet.Workloads.Workload.List.WorkloadInfoHelper.get_InstalledAndExtendedWorkloads()
   at Microsoft.DotNet.Tools.New.WorkloadsInfoProvider.GetInstalledWorkloadsAsync(System.Threading.CancellationToken)
   at Microsoft.TemplateEngine.Edge.Constraints.WorkloadConstraintFactory+WorkloadConstraint+<ExtractWorkloadInfoAsync>d__9.MoveNext()
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[Microsoft.TemplateEngine.Edge.Constraints.WorkloadConstraintFactory+WorkloadConstraint+<ExtractWorkloadInfoAsync>d__9, Microsoft.TemplateEngine.Edge, Version=8.0.400.0, Culture=neutral, PublicKeyToken=adb9793829ddae60]](<ExtractWorkloadInfoAsync>d__9 ByRef)
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[[System.ValueTuple`2[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]], System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].Start[[Microsoft.TemplateEngine.Edge.Constraints.WorkloadConstraintFactory+WorkloadConstraint+<ExtractWorkloadInfoAsync>d__9, Microsoft.TemplateEngine.Edge, Version=8.0.400.0, Culture=neutral, PublicKeyToken=adb9793829ddae60]](<ExtractWorkloadInfoAsync>d__9 ByRef)
   at Microsoft.TemplateEngine.Edge.Constraints.WorkloadConstraintFactory+WorkloadConstraint.ExtractWorkloadInfoAsync(System.Collections.Generic.IEnumerable`1<Microsoft.TemplateEngine.Abstractions.Components.IWorkloadsInfoProvider>, Microsoft.Extensions.Logging.ILogger, System.Threading.CancellationToken)
   at Microsoft.TemplateEngine.Edge.Constraints.WorkloadConstraintFactory+WorkloadConstraint+<CreateAsync>d__6.MoveNext()
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[Microsoft.TemplateEngine.Edge.Constraints.WorkloadConstraintFactory+WorkloadConstraint+<CreateAsync>d__6, Microsoft.TemplateEngine.Edge, Version=8.0.400.0, Culture=neutral, PublicKeyToken=adb9793829ddae60]](<CreateAsync>d__6 ByRef)
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].Start[[Microsoft.TemplateEngine.Edge.Constraints.WorkloadConstraintFactory+WorkloadConstraint+<CreateAsync>d__6, Microsoft.TemplateEngine.Edge, Version=8.0.400.0, Culture=neutral, PublicKeyToken=adb9793829ddae60]](<CreateAsync>d__6 ByRef)
   at Microsoft.TemplateEngine.Edge.Constraints.WorkloadConstraintFactory+WorkloadConstraint.CreateAsync(Microsoft.TemplateEngine.Abstractions.IEngineEnvironmentSettings, Microsoft.TemplateEngine.Abstractions.Constraints.ITemplateConstraintFactory, System.Threading.CancellationToken)
   at Microsoft.TemplateEngine.Edge.Constraints.WorkloadConstraintFactory+<Microsoft-TemplateEngine-Abstractions-Constraints-ITemplateConstraintFactory-CreateTemplateConstraintAsync>d__5.MoveNext()
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[Microsoft.TemplateEngine.Edge.Constraints.WorkloadConstraintFactory+<Microsoft-TemplateEngine-Abstractions-Constraints-ITemplateConstraintFactory-CreateTemplateConstraintAsync>d__5, Microsoft.TemplateEngine.Edge, Version=8.0.400.0, Culture=neutral, PublicKeyToken=adb9793829ddae60]](<Microsoft-TemplateEngine-Abstractions-Constraints-ITemplateConstraintFactory-CreateTemplateConstraintAsync>d__5 ByRef)
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].Start[[Microsoft.TemplateEngine.Edge.Constraints.WorkloadConstraintFactory+<Microsoft-TemplateEngine-Abstractions-Constraints-ITemplateConstraintFactory-CreateTemplateConstraintAsync>d__5, Microsoft.TemplateEngine.Edge, Version=8.0.400.0, Culture=neutral, PublicKeyToken=adb9793829ddae60]](<Microsoft-TemplateEngine-Abstractions-Constraints-ITemplateConstraintFactory-CreateTemplateConstraintAsync>d__5 ByRef)
   at Microsoft.TemplateEngine.Edge.Constraints.WorkloadConstraintFactory.Microsoft.TemplateEngine.Abstractions.Constraints.ITemplateConstraintFactory.CreateTemplateConstraintAsync(Microsoft.TemplateEngine.Abstractions.IEngineEnvironmentSettings, System.Threading.CancellationToken)
   at System.Threading.Tasks.Task`1[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].InnerInvoke()
   at System.Threading.ExecutionContext.RunFromThreadPoolDispatchLoop(System.Threading.Thread, System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
   at System.Threading.Tasks.Task.ExecuteWithThreadLocal(System.Threading.Tasks.Task ByRef, System.Threading.Thread)
   at System.Threading.ThreadPoolWorkQueue.Dispatch()
   at System.Threading.PortableThreadPool+WorkerThread.WorkerThreadStart()
dotnet --info output:

.NET SDK:
Version: 8.0.400-preview.0.24312.1
Commit: f13e0daa5c
Workload version: 8.0.400-manifests.c1c70047
MSBuild version: 17.11.0-preview-24310-01+74e23a98d

Runtime Environment:
OS Name: Windows
OS Version: 10.0.22631
OS Platform: Windows
RID: win-arm64
Base Path: C:\Program Files\dotnet\sdk\8.0.400-preview.0.24312.1\

.NET workloads installed:
Configured to use loose manifests when installing new manifests.
[wasm-tools]
Installation Source: SDK 8.0.400-preview.0
Manifest Version: 8.0.5/8.0.100
Manifest Path: C:\Program Files\dotnet\sdk-manifests\8.0.100\microsoft.net.workload.mono.toolchain.current\8.0.5\WorkloadManifest.json
Install Type: FileBased

Host:
Version: 8.0.5
Architecture: arm64
Commit: 087e15321b

.NET SDKs installed:
8.0.400-preview.0.24312.1 [C:\Program Files\dotnet\sdk]

.NET runtimes installed:
Microsoft.AspNetCore.App 8.0.5 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
Microsoft.NETCore.App 8.0.5 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
Microsoft.WindowsDesktop.App 8.0.5 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]

Other architectures found:
None

Environment variables:
Not set

global.json file:
Not found

Learn more:
https://aka.ms/dotnet/info

Download .NET:
https://aka.ms/dotnet/download

@dotnet-issue-labeler dotnet-issue-labeler bot added Area-Workloads untriaged Request triage from a team member labels Jun 14, 2024
@v-ainigao (author) commented Jun 14, 2024

After switching to other accounts, I tried 6 times and could reproduce it.
This issue also reproduces on the .NET 9 preview 6 daily build.

@v-ainigao (author):

This issue also reproduces on the .NET 9.0.100-preview.6.24314.10 SDK.

@v-ainigao (author) commented Jul 2, 2024

This issue also reproduces on the .NET 9.0.100-preview.6.24328.19 SDK, and it seems to be easier to reproduce in an environment with both VS and the SDK installed.

@Forgind (member) commented Jul 16, 2024

I tried this with today's 9.0 preview on an x64 machine and couldn't reproduce this. I suspect it's ARM64-specific.

@Forgind Forgind added the needs team triage Requires a full team discussion label Jul 16, 2024
@marcpopMSFT (member):

This might be related to the same parallelism issue we initially found in the template engine with the Aspire template. They were able to work around it, but we never found the underlying issue.

@v-ainigao does this only repro on arm64? What version of VS are you using?

@v-ainigao (author):

> This might be related to the same parallelism issue we found in the template engine with the Aspire template initially. They were able to workaround it but we never found the underlying issue.
>
> @v-ainigao does this only repro on arm64? What version of VS are you using?

Yes, it only reproduces on ARM64, and the VS I use is a daily-test build of the main branch.

@marcpopMSFT marcpopMSFT removed untriaged Request triage from a team member needs team triage Requires a full team discussion labels Jul 23, 2024
@marcpopMSFT marcpopMSFT added this to the 10.0.1xx milestone Jul 23, 2024
@v-ainigao (author):

This issue also reproduces on the latest .NET 9 preview 7 SDK.
Build info: VS 17.12.0 Preview 1.0 [35129.218.main] + .NET 9.0.100-preview.7.24379.15 (runtime 9.0.0-preview.7.24376.15)

@v-ainigao (author):

This issue also reproduces on the .NET 9.0.100-rc.1.24406.16 SDK.
Environment information: VS 17.12.0 Preview 2.0 [35206.223.main] + .NET 9.0.100-rc.1.24406.16 SDK.

@gasparnagy:

Since 2 September, we have been experiencing the same issue in the GitHub Actions build of the Reqnroll project. A few of our tests run a dotnet new command to generate a class-library project, and we get the same error; my feeling is that it happens about 50% of the time. See for example: https://github.com/reqnroll/Reqnroll/actions/runs/10556606022/job/29653266376

It only occurs on our Windows job, not on the Linux one. For the Windows agents we use windows-latest.

It might be related to the .NET SDK installed on the agent: the failing builds' agents contain .NET SDK 8.0.400 and .NET Runtime 8.0.8, but I checked earlier builds and they had .NET SDK 8.0.303 and .NET Runtime 8.0.7. It seems the agents have been using .NET SDK 8.0.400 since 27 August.

Reqnroll.TestProjectGenerator.ProjectCreationNotPossibleException: Execution of dotnet new failed. ---> System.Exception: The template "Class Library" was created successfully.

Processing post-creation actions...
Restoring C:\Users\runneradmin\AppData\Local\Temp\RR\Re38129a7\S1bd692a1\DefaultTestProject\DefaultTestProject.csproj:
Fatal error. System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
Repeat 2 times:
--------------------------------
   at Microsoft.VisualStudio.Setup.Configuration.IEnumSetupInstances.Next(Int32, Microsoft.VisualStudio.Setup.Configuration.ISetupInstance[], Int32 ByRef)
--------------------------------
   at Microsoft.DotNet.Workloads.Workload.List.VisualStudioWorkloads.GetVisualStudioInstances()
   at Microsoft.DotNet.Workloads.Workload.List.VisualStudioWorkloads.GetInstalledWorkloads(Microsoft.NET.Sdk.WorkloadManifestReader.IWorkloadResolver, Microsoft.NET.Sdk.WorkloadManifestReader.SdkFeatureBand, Microsoft.DotNet.Workloads.Workload.List.InstalledWorkloadsCollection)
   at Microsoft.DotNet.Workloads.Workload.List.WorkloadInfoHelper.AddInstalledVsWorkloads(System.Collections.Generic.IEnumerable`1<Microsoft.NET.Sdk.WorkloadManifestReader.WorkloadId>)
   at Microsoft.DotNet.Workloads.Workload.List.WorkloadInfoHelper.get_InstalledAndExtendedWorkloads()
   at Microsoft.DotNet.Tools.New.WorkloadsInfoProvider.GetInstalledWorkloadsAsync(System.Threading.CancellationToken)
   at Microsoft.TemplateEngine.Edge.Constraints.WorkloadConstraintFactory+WorkloadConstraint+<ExtractWorkloadInfoAsync>d__9.MoveNext()
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[Microsoft.TemplateEngine.Edge.Constraints.WorkloadConstraintFactory+WorkloadConstraint+<ExtractWorkloadInfoAsync>d__9, Microsoft.TemplateEngine.Edge, Version=7.0.410.0, Culture=neutral, PublicKeyToken=adb9793829ddae60]](<ExtractWorkloadInfoAsync>d__9 ByRef)
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[[System.ValueTuple`2[[System.__Canon, System.Private.CoreLib, Version=7.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[System.__Canon, System.Private.CoreLib, Version=7.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]], System.Private.CoreLib, Version=7.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].Start[[Microsoft.TemplateEngine.Edge.Constraints.WorkloadConstraintFactory+WorkloadConstraint+<ExtractWorkloadInfoAsync>d__9, Microsoft.TemplateEngine.Edge, Version=7.0.410.0, Culture=neutral, PublicKeyToken=adb9793829ddae60]](<ExtractWorkloadInfoAsync>d__9 ByRef)
   at Microsoft.TemplateEngine.Edge.Constraints.WorkloadConstraintFactory+WorkloadConstraint.ExtractWorkloadInfoAsync(System.Collections.Generic.IEnumerable`1<Microsoft.TemplateEngine.Abstractions.Components.IWorkloadsInfoProvider>, Microsoft.Extensions.Logging.ILogger, System.Threading.CancellationToken)
   at Microsoft.TemplateEngine.Edge.Constraints.WorkloadConstraintFactory+WorkloadConstraint+<CreateAsync>d__6.MoveNext()
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[Microsoft.TemplateEngine.Edge.Constraints.WorkloadConstraintFactory+WorkloadConstraint+<CreateAsync>d__6, Microsoft.TemplateEngine.Edge, Version=7.0.410.0, Culture=neutral, PublicKeyToken=adb9793829ddae60]](<CreateAsync>d__6 ByRef)
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[[System.__Canon, System.Private.CoreLib, Version=7.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].Start[[Microsoft.TemplateEngine.Edge.Constraints.WorkloadConstraintFactory+WorkloadConstraint+<CreateAsync>d__6, Microsoft.TemplateEngine.Edge, Version=7.0.410.0, Culture=neutral, PublicKeyToken=adb9793829ddae60]](<CreateAsync>d__6 ByRef)
   at Microsoft.TemplateEngine.Edge.Constraints.WorkloadConstraintFactory+WorkloadConstraint.CreateAsync(Microsoft.TemplateEngine.Abstractions.IEngineEnvironmentSettings, Microsoft

@v-ainigao (author):

This issue also reproduces on the .NET 9.0.100-rc.1.24452.12 (runtime 9.0.0-rc.1.24431.7) SDK.
Environment information: sign-off Version 17.12.0 Preview 2.0 [35305.153.d17.12]

@gasparnagy:

> I tried this with today's 9.0 preview on an x64 machine and couldn't reproduce this. I suspect it's ARM64-specific.

@Forgind We can reproduce it using the standard windows-latest GitHub Actions image, which has

  • PROCESSOR_ARCHITECTURE: AMD64
  • PROCESSOR_IDENTIFIER: AMD64 Family 25 Model 1 Stepping 1, AuthenticAMD

So I don't think it is ARM64-specific.

@gasparnagy:

Could it be related to Visual Studio 17.11, perhaps? That was also recently updated (from 17.10) in the GitHub Actions image.

@v-ainigao Which VS version do you have where you can reproduce the problem?

@v-ainigao (author) commented Sep 6, 2024

> Can it be anyhow related to the Visual Studio 17.11 version maybe? Because that was also recently updated (from 17.10) in the GitHub actions image.
>
> @v-ainigao Which VS version do you have where you can reproduce the problem?

It reproduces in cmd when the commands are entered too quickly. The VS I am using today is 17.12.0 Preview 2.0 [35305.153.d17.12], and the SDK it carries is .NET 9.0.100-rc.1.24452.12 (runtime 9.0.0-rc.1.24431.7).

@gasparnagy:

> It will repro in the CLI when running the code too fast.

What do you mean by that exactly? (Because in our case we run it from a test and maybe that is also "too fast".)

@v-ainigao (author) commented Sep 6, 2024

> It will repro in the CLI when running the code too fast.
>
> What do you mean by that exactly? (Because in our case we run it from a test and maybe that is also "too fast".)

It means entering the commands very quickly, for example pasting a whole block of commands into cmd and pressing Enter; that makes it easier to reproduce.
For example, if I paste this block directly into cmd, it will most likely reproduce:

mkdir 9
cd 9                         
dotnet new sln -o s1
cd s1
dotnet new console -o a 
dotnet new classlib -o b -f netstandard2.1
dotnet new classlib -o c 
dotnet sln add a\a.csproj
dotnet sln add b\b.csproj
dotnet sln add c\c.csproj
cd a
dotnet add reference ..\b\b.csproj
dotnet add reference ..\c\c.csproj
notepad Program.cs


@gasparnagy:

I see. I even tried putting it into a PowerShell loop, but locally I cannot reproduce it, only on GitHub Actions... 😞

@v-ainigao (author) commented Sep 6, 2024

> I see. I tried to put it even into a PowerShell loop, but locally I cannot reproduce, just on GitHub Actions... 😞

I just tried PowerShell and can reproduce it there too, on an ARM64 OS created in Azure.

@v-hozha1:

I also encounter this issue on the main build (Version 17.12.0 Preview 3.0 [35325.140.main]), which is integrated with the 9.0.100-rc.2.24474.11 SDK.

@v-ainigao (author):

This issue does not reproduce on the .NET 9 RTM SDK.
Build info: 9.0.100-rtm.24508.43 (runtime 9.0.0-rtm.24503.8)
I will keep tracking it on the next SDK version.

@gasparnagy:

@marcpopMSFT Could you please confirm whether you have made some changes that could have fixed this, as @v-ainigao reported?

Because I reproduced it locally on my x64 Intel processor with .NET 8.0 just a few days ago. (I don't even have the .NET 9 runtime or SDK installed on this machine.) Is there a chance that the same fix can also be applied to the .NET 8 SDK?

@v-ainigao (author):

> @marcpopMSFT Could you please confirm that you have made some changes that could have fixed this as @v-ainigao reported?
>
> Because I have just reproduced it locally on my x64 Intel processor with .NET 8.0 just a few days ago. (I don't even have .NET 9 runtime or SDK installed on this machine.) Is there a chance that the same fix can be also applied to .NET 8 SDK?

I didn't make any changes; it just no longer reproduces the way it used to.

@marcpopMSFT (member):

We have not made any changes to resolve this. Our best guess earlier was that it was a race condition, but we had trouble reproducing it, so we weren't able to track down the underlying issue. Given that it's an unreliable repro, that would explain why it might have gone away in .NET 9 in some configurations but still show up in existing .NET 8 configurations. I'm not sure what next steps to suggest or what information to have you collect: without a repro we can consistently trigger, we can't get the debug information needed to determine what is causing the access violation.

@gasparnagy:

Thank you @marcpopMSFT for the response; that is quite understandable. I think this normally doesn't cause a big problem, as users can retry or use Visual Studio to create a new project when it doesn't work.

Maybe our open-source project (Reqnroll) is the only one that bases its testing on generating test projects with dotnet new. In our CI build, we run dotnet new approximately 500 times and the tests run in parallel, so we unintentionally do a kind of stress test; even with this setup it is not reliably reproducible, but it still regularly causes build failures.

We finally decided to avoid using dotnet new so extensively: we only use it once, and every other time we just copy the generated project folder as files. With this we can reduce the dotnet new invocations to 10, so hopefully we will no longer be victims of this issue.

It is also worth noting that dotnet new is pretty slow. In an average test case (one that creates a project, builds it, and invokes dotnet test), more than 25% of the time was taken by the two dotnet new invocations, even locally where all packages were already cached. So by replacing them with a "manual" solution, we can also gain test performance.

@marcpopMSFT (member):

If you're running new in automation, that might explain why you are more likely to hit this issue. CC @dsplaisted @nagilson @baronfel: we're going to try to make a change in 9.0.2xx that should improve one specific known case of this, but I don't think it would help in your case, as the fix is narrowly focused on Aspire templates.

Did reducing your usage of new help with the reliability here? @baronfel any thoughts on why new might be slow? Is it the restore step needing to hit the network a second time or potentially something else?

@baronfel (member):

I don't know much about the perf characteristics of dotnet new; we've mostly focused on functionality in the past. We probably owe it a perf trace to find hot spots. It does do an automatic restore as a post action (which, IIRC, can be skipped).

@gasparnagy:

> Did reducing your usage of new help with the reliability here?

@marcpopMSFT At first glance it did. But I could not fully verify it because of some side effects my workaround caused, and I haven't yet had time to investigate those. I will send an update when things are clearer.

@gasparnagy commented Oct 23, 2024

> I don't know much about the perf characteristics of dotnet new - we've mostly been focused on functionality in the past. We probably owe a perf trace to find hot spots. It does do an automatic restore as a post action (which should be able to be skipped IIRC).

@baronfel My suspicion is that most of the performance problems are rooted in the restore post-action, but I don't have evidence for that. Nevertheless, the option of skipping this restore would definitely be helpful: we add new packages to the created project and do a build afterwards anyway, so the restore has to run again, and I don't think we get much benefit from the restore during new.

If you are interested in performance results, you can download the test result TRX file of our test execution, where you will see the different commands with a time stamp in the test std out. For example in our latest build they are here: https://github.com/reqnroll/Reqnroll/actions/runs/11369395224/artifacts/2064457469

@dsplaisted (member):

I believe most or all of the templates have a --no-restore option which will skip the restore operation when the template is created.
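As a quick illustration (a hedged sketch: the flag is documented per template, and the output directory name here is just an example), skipping the restore post-action looks like:

```shell
# Create a class library without running the implicit restore post-action;
# restore then happens later, e.g. as part of the first build or an explicit
# 'dotnet restore'. The '-h' output of each template lists whether it
# supports --no-restore.
dotnet new classlib -o mylib --no-restore
```

This is what Reqnroll's scenario benefits from, since packages are added and a build (which restores anyway) follows immediately.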

@gasparnagy:

@dsplaisted Wow, thank you! This is cool; I did not know about it. Indeed, the classlib template we use has this option; I have tested it. I was checking the dotnet new documentation, which did not mention it, and I did not think that the individual templates might have options that change the behavior of the dotnet new process. But now I see that it is actually documented with the templates, e.g. for classlib.

Thanks for the hint. Really useful. I will give it a quick try.

@dsplaisted (member):

> @dsplaisted Wow! Thank you! This is cool and I did not know about that. Indeed the classlib template we use has this, I have tested. I was checking the dotnet new documentation here, but that did not contain this and I did not think that the individual templates might have options that change the behavior of the dotnet new process. But now I see that it is actually documented at the templates, like for classlib here.
>
> Thanks for the hint. Really useful. I will give it a quick try.

FYI @baronfel: it looks like the --no-restore option isn't very discoverable, because it is in the options for each template, not in the overall dotnet new help.

@baronfel (member):

We have very explicit control over the help output for top-level dotnet new - it would be pretty straightforward to make a help section for "common template options".

@gasparnagy:

Feedback on performance: I did some basic measurements, and it seems that using --no-restore for new classlib improves it by approx. 60%, which is great.

@jaredpar (member) commented Nov 7, 2024

Reactivating as I was able to reproduce this issue and get a crash dump of dotnet new.

https://github.com/jaredpar/complog/actions/runs/11713633968
Crash Dump Artifact

The core bug seems to be that Microsoft.VisualStudio.Setup.Configuration.Native is not thread-safe, but the dotnet new command calls into it from multiple threads. The AV happens in a native method that performs a lazy init in roughly the following fashion:

```cpp
wstring s_str;

const wstring& GetStr() {
  if (s_str.empty()) {
    ...
    s_str = ...;
  }
  return s_str;
}
```

This is not safe to call from multiple threads, as the assignment can race with .empty() and lead to an AV. In the crash dump you can see that the AV happens in the assignment, while another thread is calling through GetVisualStudioInstances, which also calls into this code.

Crashing frame:

 	Microsoft.VisualStudio.Setup.Configuration.Native.dll!memcpy() Line 470	Unknown
 	[Inline Frame] Microsoft.VisualStudio.Setup.Configuration.Native.dll!wmemmove(wchar_t *) Line 247	C++
 	[Inline Frame] Microsoft.VisualStudio.Setup.Configuration.Native.dll!std::_WChar_traits<wchar_t>::move(wchar_t * const) Line 347	C++
 	[Inline Frame] Microsoft.VisualStudio.Setup.Configuration.Native.dll!std::wstring::assign(const wchar_t * const) Line 2663	C++
 	Microsoft.VisualStudio.Setup.Configuration.Native.dll!std::wstring::assign(const wchar_t * const _Ptr) Line 2676	C++
 	[Inline Frame] Microsoft.VisualStudio.Setup.Configuration.Native.dll!std::wstring::operator=(const wchar_t * const) Line 2498	C++
>	[Inline Frame] Microsoft.VisualStudio.Setup.Configuration.Native.dll!Policy::get_DefaultCachePath() Line 174	C++
 	Microsoft.VisualStudio.Setup.Configuration.Native.dll!Policy::get_CachePath() Line 191	C++
 	Microsoft.VisualStudio.Setup.Configuration.Native.dll!StateManager<FileStateManager>::GetRootDirectory() Line 68	C++
 	Microsoft.VisualStudio.Setup.Configuration.Native.dll!FileStateManager::GetInstanceIds() Line 18	C++
 	Microsoft.VisualStudio.Setup.Configuration.Native.dll!StateManager<FileStateManager>::GetInstanceIds() Line 49	C++
 	Microsoft.VisualStudio.Setup.Configuration.Native.dll!EnumSetupInstances::Initialize() Line 47	C++
 	Microsoft.VisualStudio.Setup.Configuration.Native.dll!EnumSetupInstances::Next(unsigned long celt, ISetupInstance * * rgelt, unsigned long * pceltFetched) Line 69	C++

Other thread accessing this same data:

 	Microsoft.DotNet.TemplateLocator.dll!Microsoft.NET.Sdk.WorkloadManifestReader.WorkloadResolver.ComposeWorkloadManifests() Line 246	C#
 	Microsoft.DotNet.TemplateLocator.dll!Microsoft.NET.Sdk.WorkloadManifestReader.WorkloadResolver.InitializeManifests() Line 165	C#
 	Microsoft.DotNet.TemplateLocator.dll!Microsoft.NET.Sdk.WorkloadManifestReader.WorkloadResolver.GetAvailableWorkloadDefinitions() Line 568	C#
 	System.Linq.dll!System.Linq.Enumerable.SelectEnumerableIterator<(Microsoft.NET.Sdk.WorkloadManifestReader.WorkloadDefinition, Microsoft.NET.Sdk.WorkloadManifestReader.WorkloadManifest), string>.MoveNext()	Unknown
 	System.Linq.dll!System.Linq.Enumerable.Contains<string>(System.Collections.Generic.IEnumerable<string> source, string value, System.Collections.Generic.IEqualityComparer<string> comparer)	Unknown
>	dotnet.dll!Microsoft.DotNet.Workloads.Workload.VisualStudioWorkloads.GetInstalledWorkloads(Microsoft.NET.Sdk.WorkloadManifestReader.IWorkloadResolver workloadResolver, Microsoft.DotNet.Workloads.Workload.List.InstalledWorkloadsCollection installedWorkloads, Microsoft.NET.Sdk.WorkloadManifestReader.SdkFeatureBand? sdkFeatureBand) Line 59	C#
 	dotnet.dll!Microsoft.DotNet.Workloads.Workload.List.WorkloadInfoHelper.AddInstalledVsWorkloads(System.Collections.Generic.IEnumerable<Microsoft.NET.Sdk.WorkloadManifestReader.WorkloadId> sdkWorkloadIds) Line 66	C#
 	dotnet.dll!Microsoft.DotNet.Workloads.Workload.List.WorkloadInfoHelper.InstalledAndExtendedWorkloads.get() Line 38	C#
 	dotnet.dll!Microsoft.DotNet.Tools.New.WorkloadsInfoProvider.GetInstalledWorkloadsAsync(System.Threading.CancellationToken cancellationToken) Line 27	C#

@dsplaisted (member):

Based on the stack trace, it looks like this is caused by the same parallelism issue as dotnet/templating#7946
