dotnet build leaves running dotnet.exe processes after it finishes (.csproj specific) #9487

Closed
jhudsoncedaron opened this issue Jun 11, 2018 · 27 comments
Comments

@jhudsoncedaron

Steps to reproduce

.csproj files attached
buildleak.zip
Enter c directory
Open task manager/details view
observe no dotnet.exe processes are running (clean test environment)
run dotnet build

Expected behavior

No dotnet.exe processes left

Actual behavior

dotnet.exe processes left

Environment data

dotnet --info output:

.NET Core SDK (reflecting any global.json):
Version: 2.1.300
Commit: adab45b

Runtime Environment:
OS Name: Windows
OS Version: 10.0.17134
OS Platform: Windows
RID: win10-x64
Base Path: C:\Program Files\dotnet\sdk\2.1.300\

Host (useful for support):
Version: 2.1.0
Commit: caa7b7e2ba

.NET Core SDKs installed:
1.0.0 [C:\Program Files\dotnet\sdk]
1.0.1 [C:\Program Files\dotnet\sdk]
1.0.4 [C:\Program Files\dotnet\sdk]
1.1.0 [C:\Program Files\dotnet\sdk]
2.0.0 [C:\Program Files\dotnet\sdk]
2.1.200 [C:\Program Files\dotnet\sdk]
2.1.300 [C:\Program Files\dotnet\sdk]

.NET Core runtimes installed:
Microsoft.AspNetCore.All 2.1.0 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
Microsoft.AspNetCore.App 2.1.0 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
Microsoft.NETCore.App 1.0.4 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
Microsoft.NETCore.App 1.0.5 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
Microsoft.NETCore.App 1.1.1 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
Microsoft.NETCore.App 1.1.2 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
Microsoft.NETCore.App 2.0.0 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
Microsoft.NETCore.App 2.0.7 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
Microsoft.NETCore.App 2.1.0 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]

To install additional .NET Core runtimes or SDKs:
https://aka.ms/dotnet-download

@livarcocc
Contributor

This is by design in 2.1.300.

As part of our perf improvements, we introduced three persistent servers with the goal of reducing our JIT time. The servers are the Razor compilation server, the VBCSCompiler server and the MSBuild node re-use server.

If you don't want them staying around when you are done building, you can invoke dotnet build-server shutdown to turn them off.

If you don't want them to start to begin with, you can set different properties/environment variables to prevent them from starting. From @peterhuene:

Use -p:UseRazorBuildServer=false to disable the Razor (rzc) server.

Use -p:UseSharedCompilation=false to disable the Roslyn (vbcscompiler) server.

For MSBuild, pass /nodeReuse:false on the command line to disable node re-use.
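
Put together, a build that avoids all three servers might look like the sketch below. It assumes a POSIX-style shell; build_without_servers is a made-up helper name, the two -p: properties are the ones quoted above, and -nodeReuse:false is the dash-prefixed spelling of the /nodeReuse:false MSBuild switch.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run `dotnet build` with the three persistent
# servers (Razor, Roslyn, MSBuild node reuse) disabled for this invocation.
build_without_servers() {
    dotnet build \
        -p:UseRazorBuildServer=false \
        -p:UseSharedCompilation=false \
        -nodeReuse:false \
        "$@"
}

# Example (not run here): build_without_servers MyApp.csproj -c Release
```

Extra arguments are passed through unchanged, so the helper can stand in for a plain dotnet build call in a script.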

@jhudsoncedaron
Author

They're causing spurious build failures due to holding locks.

@eikster-dk

eikster-dk commented Aug 6, 2018

@livarcocc I have a whole bunch of TeamCity agents that are having the same issue as @jhudsoncedaron. The processes somehow don't release the files during a build of a project. Random JSON files/DLLs are not released and the build fails.

If I try to rerun the same build, it fails again. If I kill the 3 .NET Core host processes, I can successfully build again once or twice, then the error occurs again.

Build steps are like:

  1. dotnet restore
  2. dotnet build
  3. dotnet test
  4. dotnet publish

It can fail randomly under each step.

This happened after we upgraded our TC agents with the latest version of dotnet core:

.NET Core SDK (reflecting any global.json):
Version: 2.1.302
Commit: 9048955

Runtime Environment:
OS Name: Windows
OS Version: 10.0.14393
OS Platform: Windows
RID: win10-x64
Base Path: C:\Program Files\dotnet\sdk\2.1.302\

Host (useful for support):
Version: 2.1.2
Commit: 811c3ce6c0

.NET Core SDKs installed:
2.1.4 [C:\Program Files\dotnet\sdk]
2.1.105 [C:\Program Files\dotnet\sdk]
2.1.302 [C:\Program Files\dotnet\sdk]

.NET Core runtimes installed:
Microsoft.AspNetCore.All 2.1.2 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
Microsoft.AspNetCore.App 2.1.2 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
Microsoft.NETCore.App 2.0.5 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
Microsoft.NETCore.App 2.0.7 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
Microsoft.NETCore.App 2.1.2 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]

To install additional .NET Core runtimes or SDKs:
https://aka.ms/dotnet-download

@livarcocc
Contributor

Have you tried the steps that I suggested above?

@eikster-dk

Yes, I have tried all of the above and the builds continued to fail.

I ended up adding /m:1, as I read on Stack Overflow, to disable the multi-process build, and now every build is green.

I'm not sure if my issue is related to dotnet/cli#9648

@jhudsoncedaron
Author

jhudsoncedaron commented Aug 7, 2018

But that's (#9648) my issue. If it weren't for the fact that this is the older number I'd say it was a duplicate.

@jasondaicoder

dotnet build-server shutdown doesn't work on Mac. I can still see that the dotnet processes are alive and consuming lots of CPU.

@peterhuene
Contributor

@jasondaicoder what DLLs are those dotnet processes executing? That is to say, if you view the command lines for those processes, what was the argument to dotnet or dotnet exec?

pbsamsung referenced this issue in SamsungDForum/JuvoPlayer Oct 22, 2019
Relates to https://github.com/dotnet/cli/issues/9458

The processes somehow doesn't release the files under a build of a project.
Random json files/dll's are not released and the build fails.

Signed-off-by: Pawel Panek <p.panek@samsung.com>

@TheXenocide

TheXenocide commented Dec 19, 2019

We are also experiencing locked files in build environments being held by the dotnet process. If the work it's performing is done, shouldn't it be releasing the files it was using? Even in the case of Razor, different builds could use different versions of the package so shouldn't there be some sort of AssemblyLoadContext to allow GC or a non-locking load behavior (load by byte-array, copy before load, etc.)? This also prevents clearing the NuGet package cache, even in idle environments.

EDIT: here's an issue from TeamCity restoring packages

[18:46:52] [restore] Access to the path 'Microsoft.AspNetCore.Razor.Language.dll' is denied.
[18:46:52] [restore] Process exited with code 1
[18:46:52] [restore] Process exited with code 1 (Step: Restore NuGet Packages (NuGet Installer))

@abeham

abeham commented Mar 5, 2020

Same here: a Linux GitLab build server had lots of dotnet processes running /usr/share/dotnet/dotnet /usr/share/dotnet/sdk/3.1.101/MSBuild.dll /usr/share/dotnet/sdk/3.1.101/MSBuild.dll /nologo /nodemode:1 /nodeReuse:true from different dates. I counted at least 23 instances. The build hangs in restore. I killed all the processes and now it works, but for how long? Should there really be 23 instances all running under the same user?
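
To see which servers are behind a pile-up like this, it can help to list the lingering processes together with their command lines. A rough sketch, assuming a POSIX ps and assuming the servers show up as MSBuild.dll, VBCSCompiler.dll or rzc.dll in the arguments (names inferred from this thread, not an exhaustive list). filter_build_servers and list_build_servers are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Keep only lines whose command line mentions one of the build-server DLLs.
# The DLL names are an assumption based on the servers named in this thread.
filter_build_servers() {
    # `|| true` keeps the exit status at 0 when nothing matches.
    grep -Ei 'msbuild\.dll|vbcscompiler\.dll|rzc\.dll' || true
}

# Print "PID command-line" for every process that looks like a build server.
list_build_servers() {
    ps -A -o pid= -o args= | filter_build_servers
}
```

Running list_build_servers before and after a build gives a quick view of which processes were left behind.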

@jhudsoncedaron
Copy link
Author

Over the years, most of the "locks" problems have been fixed, but there are two big problems left.

  1. There's one process or set of processes started per directory you build from.
  2. Under Windows, a process holds a lock on the directory it was started from, even if its current directory is changed later. In order to get rid of this lock pileup, the spawning process needs to carefully change its current directory to C:\Windows or something like that before starting the persistent child.
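
The second point amounts to launching long-lived children from a directory nobody will want to delete. A minimal shell sketch of that idea, assuming a POSIX-style shell; run_from_neutral_dir and NEUTRAL_DIR are hypothetical names, and whether this actually avoids the Windows directory lock for the dotnet servers has not been verified here.

```shell
#!/usr/bin/env bash
# Run a command with its working directory set to a neutral location.
# The function body is a subshell, so the caller's own working directory
# is left untouched.
run_from_neutral_dir() (
    cd "${NEUTRAL_DIR:-/}" || exit 1
    "$@"
)

# Example (not run here): run_from_neutral_dir some-long-lived-command &
```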

@jjxtra

jjxtra commented Sep 26, 2020

I have dozens of these lingering processes. Please revert the default behavior so that they are not left hanging around.

@tpak

tpak commented Jul 1, 2021

FYI for the next person: we're on Ubuntu 18/20.04 and currently stuck with an ancient version of .NET (we're trying to get rid of it, and I'm too embarrassed to quote the version here in public). Adding the flags to our restore, build, and publish steps helped, and at the end we added the shutdown mentioned above as well for good measure. Now our build scripts are at least being polite to each other, and we can focus on fixing the issues and getting onto a new version of .NET.

As per @livarcocc's recommendation above, we added the following two flags to each step (restore, build, publish):
dotnet restore -p:UseRazorBuildServer=false -p:UseSharedCompilation=false

And as per @jasondaicoder's recommendation above we added this to the end of our build:
dotnet build-server shutdown

I hope this helps someone else stuck with a ratty old version of .net that is causing a headache.

@NinjaCross

NinjaCross commented Nov 11, 2021

The problem is still present in .NET 6.0.100.
It's also impossible to shut down the processes using the command dotnet build-server shutdown.

As you can see here, the command starts but never completes.
It also hangs exactly when it tries to shut down the hanging process.

image

I feel this issue should be reopened ASAP, since it's blocking all our CI/CD and developers local machines too.

@schotime

schotime commented May 3, 2022

We are getting this too after upgrading to .NET 6.

image

@tpak

tpak commented May 3, 2022

We managed to get to .net5 in January and "dotnet build-server shutdown" is working for us. We're about to move to .net6 and we'll test and report back what we find, but we're on Ubuntu 20.04. We'll also likely move from our current Jenkins build to GitHub actions for our pipeline and that will relieve me of worry since the container is ephemeral. You might try containerized runners for your CI/CD of choice.

@emmenlau

emmenlau commented Apr 6, 2023

The solutions given in this issue do not fully resolve it. Here is what I observe:

On macOS we use dotnet for the build. As suggested above, we use -p:UseRazorBuildServer=false -p:UseSharedCompilation=false on all calls to dotnet to ensure it will not use any build servers. However, dotnet starts MSBuild.dll with options /nodemode:1 /nodeReuse:true. After that, a number of dotnet processes remain at the end of the build.

Using dotnet build-server shutdown is not a solution because it will only execute on build success, not on error. On build error, the CI aborts and does not execute any further calls such as dotnet build-server shutdown. This cannot easily be worked around in the CI. However, subsequent (independent) builds of other projects will fail, because the rogue dotnet processes keep hogging the build lock.

The exact same build scripts with -p:UseRazorBuildServer=false -p:UseSharedCompilation=false work successfully on Linux! So generally this can work. It just does not seem to work on macOS.

@KalleOlaviNiemitalo
Contributor

@emmenlau, can you make the CI run dotnet build-server shutdown at the start of each build, rather than at the end? That way, the CI wouldn't have yet detected a build error and aborted.

What CI are you using anyway?

@emmenlau

emmenlau commented Apr 6, 2023

@KalleOlaviNiemitalo we have about 200 C++ projects and 2 dotnet projects. I would need to add dotnet build-server shutdown in the build scripts of 200 projects that do not even depend on dotnet (and may sometimes run on build machines that do not have dotnet installed) just to fix 2 projects that leave spurious processes behind.

We're using a self-hosted instance of GitLab CI, not with the default scripting (which uses platform-dependent shells) but with a bash shell on all platforms.

@jhudsoncedaron
Author

For anyone who's still fighting with this, check if dotnet build --disable-build-servers solves your problem.

@KalleOlaviNiemitalo
Contributor

@emmenlau, does the GitLab pipeline system support .NET SDK specifically, or do you just configure it to run dotnet build commands via Bash? In the latter case, I imagine you could make the pipeline instead run a Bash script that runs both dotnet build and dotnet build-server shutdown, setting the exit code appropriately. That way, GitLab would see the failure exit code and abort the CI run only after dotnet build-server shutdown has already finished.
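
A minimal sketch of such a script, assuming bash; build_then_shutdown is a hypothetical name. It records the build's exit status, always runs the shutdown, and then returns the recorded status so the CI still sees a failed build as failed.

```shell
#!/usr/bin/env bash
# Hypothetical CI step: build, always shut the servers down, then report
# the build's own exit status.
build_then_shutdown() {
    local status=0
    dotnet build "$@" || status=$?
    # A failing shutdown should not mask the build result.
    dotnet build-server shutdown || true
    return "$status"
}

# Example (not run here): build_then_shutdown MyApp.sln || exit $?
```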

@baronfel
Member

baronfel commented Apr 6, 2023

@emmenlau what version of the .NET SDK are you seeing this behavior on? If you want to disable MSBuild node sharing (which is the default behavior) then you definitely want to use the --disable-build-servers flag as @jhudsoncedaron just mentioned. That will disable the use of

  • razor build server
  • shared compilation
  • MSBuild Node Reuse

for you - and as we add new kinds of build servers we also hook them into this flag.

@emmenlau

emmenlau commented Apr 6, 2023

@baronfel I'll try that one! So do I understand correctly that --disable-build-servers is the new superset of -p:UseRazorBuildServer=false -p:UseSharedCompilation=false, and so these ones are not needed anymore? Great!

@jhudsoncedaron
Author

Too bad there's not a command to start the build servers explicitly. I'm thinking of building a dotnet-wrap tool that does the logical equivalent of (severe language mixing)

    dotnet build-server start
    try {
         return dotnet "$@"
    } finally {
        dotnet build-server shutdown
    }
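
In plain bash, the try/finally part of that idea can be approximated with an EXIT trap. This is only a sketch: dotnet_wrap here is a hypothetical shell function, and since there is no explicit command to start the build servers, that first step is simply left out.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run any dotnet command, then always shut down the
# build servers while keeping the wrapped command's exit status.
dotnet_wrap() (
    # The function body is a subshell, so this EXIT trap fires as soon as
    # the wrapped command finishes, on success or on failure.
    trap 'status=$?; dotnet build-server shutdown; exit "$status"' EXIT
    dotnet "$@"
)

# Example (not run here): dotnet_wrap test MyApp.Tests.csproj
```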

@baronfel
Member

baronfel commented Apr 6, 2023

@emmenlau that's correct - though we don't have it documented. I raised dotnet/docs#34903 to cover that. Separately we may need to plumb this through a few other commands that can trigger builds to get full coverage - I'll chat with the team about that:

  • dotnet test
  • dotnet clean

(Logged #31651 to track this work)

ALSO - I believe this option is only available in .NET 7 SDKs onwards, so you'll need that SDK version to take advantage of it. If you're building .NET 6 apps you can still do that from a 7 SDK, so there should be no technical blockers from doing that.
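
For scripts that have to run against both older and newer SDKs, the choice of flags could be made from the SDK version. A sketch, assuming bash and taking the .NET 7 cutoff from the comment above (stated there as a belief, not a documented guarantee); server_flags_for_sdk is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Print the flags that disable build servers for a given SDK version string
# (as printed by `dotnet --version`, e.g. "6.0.100" or "7.0.203").
server_flags_for_sdk() {
    local major="${1%%.*}"   # text before the first dot
    if [ "$major" -ge 7 ]; then
        echo "--disable-build-servers"
    else
        echo "-p:UseRazorBuildServer=false -p:UseSharedCompilation=false -nodeReuse:false"
    fi
}

# Example (not run here; the substitution is left unquoted on purpose so
# the flags are split into separate arguments):
#   dotnet build $(server_flags_for_sdk "$(dotnet --version)")
```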

@emmenlau

emmenlau commented Apr 6, 2023

That is great news, thanks! I'll switch to that flag in the future!

For now, I have found that MSBUILDDISABLENODEREUSE=1 further helps ensure that no dotnet processes remain on macOS. So people with .NET 6 may use a combination of:

set MSBUILDDISABLENODEREUSE=1

dotnet build -p:UseRazorBuildServer=false -p:UseSharedCompilation=false ...
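
Note that set VAR=value is the Windows cmd.exe spelling. In bash, the variable would be set with export MSBUILDDISABLENODEREUSE=1 or as a per-command prefix, roughly as sketched below; build_net6_without_servers is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper for .NET 6 SDKs: disable MSBuild node reuse through
# the environment (for this one command only) and the other two servers
# through MSBuild properties.
build_net6_without_servers() {
    MSBUILDDISABLENODEREUSE=1 dotnet build \
        -p:UseRazorBuildServer=false \
        -p:UseSharedCompilation=false \
        "$@"
}

# Example (not run here): build_net6_without_servers MyApp.csproj
```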

@tacitomv

Here's another way: I'm currently using .NET 6 on Linux (Ubuntu/Mint flavor) and sometimes the process gets stuck (usually I'm running some ASP.NET Core server, so the evidence is 'Cannot bind to port x - the address is already in use').

So first I use sudo netstat -ltnp | grep -w 'dotnet' to show what is still bound to those ports, and then I drop a kill -9 <process_id>.
