Most of libs partition is not incrementally buildable #49800

Closed
agocke opened this issue Mar 18, 2021 · 20 comments · Fixed by #64000
Comments

@agocke
Member

agocke commented Mar 18, 2021

Running build clr+libs -ninja and then build libs -ninja results in 217 csc invocations when there should be 0.

It looks like the system that creates *.Forwards.cs files may not be incremental.

@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label Mar 18, 2021
@ghost

ghost commented Mar 18, 2021

Tagging subscribers to this area: @Anipik, @safern, @ViktorHofer
See info in area-owners.md if you want to be subscribed.


@ViktorHofer
Member

From what I can see, the genpartialfacades target has inputs and outputs, but those may be getting invalidated. @agocke can you please share a binlog?

https://github.com/dotnet/arcade/blob/6cc4c1e9e23d5e65e88a8a57216b3d91e9b3d8db/src/Microsoft.DotNet.GenFacades/build/Microsoft.DotNet.GenPartialFacadeSource.targets#L20-L49

cc @Anipik @ericstj
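As background for the discussion that follows, here is a minimal, hypothetical sketch of the mechanism in question: MSBuild skips a target's body when every file in Outputs is newer than every file in Inputs, so anything that perturbs the input set between builds defeats the check. The target and item names below are illustrative, not copied from the linked GenPartialFacadeSource targets.

```xml
<!-- Hypothetical sketch of Inputs/Outputs-based incrementality, not the real target.
     If all outputs are newer than all inputs, MSBuild skips the target entirely;
     if any input (for example a wildcard-resolved reference) changes between builds,
     the target re-runs and downstream projects rebuild too. -->
<Target Name="GenerateForwardsExample"
        Inputs="@(ExampleContractAssembly);$(MSBuildProjectFile)"
        Outputs="$(IntermediateOutputPath)$(MSBuildProjectName).Forwards.cs">
  <!-- Generation task elided; only the up-to-date check matters for this issue. -->
</Target>
```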

@ViktorHofer ViktorHofer added this to the 6.0.0 milestone Mar 24, 2021
@ViktorHofer ViktorHofer removed the untriaged New issue has not been triaged by the area owner label Mar 24, 2021
@agocke
Member Author

agocke commented May 14, 2021

Happy to share whatever info is necessary privately. If someone wants to look into this, feel free to reach out.

@ericstj
Member

ericstj commented May 14, 2021

We obviously want this to work and have spent time to make sure it works. Given how much we rely on ProjectReferences now, incremental build is especially important to the inner loop. @ViktorHofer can you have a look at this when you have cycles? Never mind, I'll take it.

It looks like the system that creates *.Forwards.cs files may not be incremental.

This is not the case. I did a small check on incremental build of System.Runtime.csproj and it does not invoke CSC at all, so something more subtle is happening. Perhaps the culprit is earlier in the build sequence (or even in scripts).

@ericstj
Member

ericstj commented May 14, 2021

So I tried this in the background today and found at least one culprit:

```xml
<!-- reference everything but self -->
<!-- Type duplicated in Microsoft.Extensions.DependencyModel. Exclude it for now: https://github.com/dotnet/runtime/issues/34420 -->
<ReferencePath Include="$(NetCoreAppCurrentRefPath)*.dll"
               Exclude="$(NetCoreAppCurrentRefPath)$(MSBuildProjectName).dll;
                        $(NetCoreAppCurrentRefPath)netstandard.dll;
                        $(NetCoreAppCurrentRefPath)Microsoft.Extensions.DependencyModel.dll" />
```

This will cause the shims to rebuild every time, then trigger lots of downstream builds since many things get the shims from the ref-pack. I'll see if I can fix it.
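To make the failure mode concrete, here is a rough sketch of the direction a fix could take; it is not the actual change, and the assembly names are placeholders. Enumerating references explicitly keeps the shim's input set stable, so rebuilding an unrelated ref assembly no longer invalidates it.

```xml
<!-- Illustrative only: reference the assemblies a given shim actually needs instead of
     globbing the whole ref pack. With an explicit list, a timestamp change on an
     unrelated assembly in $(NetCoreAppCurrentRefPath) no longer dirties this project. -->
<ItemGroup>
  <ReferencePath Include="$(NetCoreAppCurrentRefPath)System.Runtime.dll" />
  <ReferencePath Include="$(NetCoreAppCurrentRefPath)System.Runtime.InteropServices.dll" />
  <!-- ...one explicit item per assembly the shim depends on... -->
</ItemGroup>
```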

@ericstj ericstj assigned ericstj and unassigned ViktorHofer May 14, 2021
@ericstj
Member

ericstj commented May 14, 2021

Looks like there's another problem with the shims:

```xml
<!-- Build generated shims with the full reference closure, including OOB assemblies. -->
<Target Name="RebuildGeneratedShims"
        AfterTargets="BuildNonNetCoreAppProjects"
        Condition="'@(GeneratedShimProject)' != ''">
  <MSBuild Targets="Build"
           Projects="@(GeneratedShimProject)"
           Properties="$(TraversalGlobalProperties);_Incremental=true" />
</Target>
```

Even if we fix the referencing between the shims, we're force-rebuilding them after all the other references are built. This is effectively a cycle, since we put those in the reference pack.

This rebuild is due to the netfx compat shims needing to have references to things outside of netcoreapp. Things outside netcoreapp now use the netcoreapp ref-pack to build. None of the projects that build in dotnet/runtime should require netfx compat shims, so we could exclude those from references to avoid retriggering builds. We can get netstandard out of that set since it doesn't need to reference anything outside netcoreapp.

@ViktorHofer
Member

None of the projects that build in dotnet/runtime should require netfx compat shims, so we could exclude those from references to avoid retriggering builds.

I agree, those should not be referenced, and I was under the impression that they aren't by default, since we define references granularly for .NETCoreApp configurations.

This is effectively a cycle since we put those in the reference pack.

I agree, and I would prefer to break the cycle either just locally or entirely if possible. Preferably we would not build these shims locally at all (and have a mode in the sfx projects to not error out if they are missing). I don't think we have any protection for these shims in place today, so I assume not building them locally would not add risk to the build?

Alternatively, if we believe the type forwards generated as part of a shim build don't change, should we just check them in, similar to #52793? AFAIK the compat shims only come into play when referencing a .NET Framework assembly (from .NET Standard or .NETCoreApp). Are we OK with freezing the current set of type forwards, or do we want to keep generating them live based on the types available on .NETCoreApp?

@ericstj
Member

ericstj commented May 17, 2021

I was under the impression that those aren't by default as we define references granularly for .NETCoreApp configurations.

netstandard is being built in the same way and ends up getting referenced by most src projects.

Checking in the type forwards doesn't solve anything here, unfortunately: the process of generating them isn't what's causing trouble, it's the references that are changing. The type-forward generation task was just the first thing to notice that; if it didn't, the compile phase later on would. The only way to remove that would be to compile from IL so that we wouldn't need references at all.

I believe the way forward here is to:

  1. Remove the broken wildcard references when building individual shims.
  2. Make the netstandard shim different from the rest, and build it after building the netcoreapp references, only once.
  3. Exclude the .NETFramework shims from any project's references: no wildcards, and remove them if the SDK happened to pull them in from our live-built ref pack. We shouldn't need these to build any of the libraries in dotnet/runtime. (this may be a non-issue; see the sketch after this list)
  4. Build the .NETFramework shims once after all other libs have been built, and remove the rebuild hack that we do now. (this may also be a non-issue if no one is referencing them)
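A minimal sketch of what step 3 could look like; the item name and the shim list are placeholders for illustration, not the change that actually landed.

```xml
<!-- Illustrative only: filter .NETFramework compat shims out of the resolved references
     so that live-built shims in the ref pack cannot re-trigger downstream builds.
     The shim names listed here are placeholders, not the real set. -->
<ItemGroup>
  <NetFxCompatShim Include="mscorlib;System;System.Core;System.Data" />
  <ReferencePath Remove="@(NetFxCompatShim->'$(NetCoreAppCurrentRefPath)%(Identity).dll')" />
</ItemGroup>
```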

@safern
Member

safern commented Dec 6, 2021

From: @jkoritzinsky on #62444

It would be interesting to investigate using MSBuild static graph with the libraries build to see what improvements that provides if any. I think Viktor did some investigation a while back but I never saw the numbers.

@safern
Member

safern commented Dec 6, 2021

I just did a quick smoke run on my computer and an incremental build takes 2 minutes, which for doing "nothing" is quite some time.

@carlossanlop

This comment has been minimized.

@ericstj
Member

ericstj commented Dec 7, 2021

@carlossanlop -- that's very different from incremental build.

When we say "incremental build" we mean building only the things that changed and the components downstream of those. In particular, if nothing changed, then nothing should rebuild (though the build needs to do some, ideally minimal, work to realize that). Argument validation is probably a separate issue. To fix it you can look at the eng\build.* scripts, which are the main entry point.

@carlossanlop
Member

Thanks for the clarification, @ericstj. I hid my comment as off-topic and opened #62535 to track that request separately.

@ViktorHofer
Member

ViktorHofer commented Jan 20, 2022

I had some time during my vacation, stumbled upon this problem, and submitted a set of PRs to fix it.

I also filed dotnet/sdk#23517 for an issue that causes 111 seconds to be spent re-validating something that shouldn't run at all in an incremental no-op packaging operation.

EDIT:
With all these changes combined (including dotnet/arcade@8c1b3eb and dotnet/arcade@1c78e43, which make the build even faster), a libraries subset no-op incremental build (without tests) takes only 70 seconds on my Surface Book 2. On a desktop dev machine it would probably be under a minute.

Here are the remaining Top 10 running tasks:

[screenshot of the top 10 longest-running tasks omitted]

Summary:

  • Restore could be a bit faster in no-op scenarios (cc @jeffkl).
  • RAR (ResolveAssemblyReference) takes more than 5 seconds, but considering that we are building 300+ projects incrementally that doesn't seem like much. This data might still be interesting to folks like @rainersigwald and @dsplaisted.
  • The no-op invocation of Ninja, which detects that nothing changed, takes quite a long time. AFAIK shelling out into the Dev Command Prompt contributes strongly to this number. cc @jkoritzinsky for ideas on how to improve this.
  • The custom libraries binplacing and TargetFramework.Sdk logic add a bit to the no-op build. I'm wondering why this is the case when nothing needs to be copied. The TargetFramework.Sdk overhead is likely due to this MSBuild target. cc @ericstj

@jeffkl

jeffkl commented Jan 20, 2022

@ViktorHofer can you get me a binlog from restore?

SET RESTORE_TASK_BINLOG_PARAMETERS=LogFile=nuget.binlog

Then run a no-op restore and send me nuget.binlog.

@ViktorHofer
Member

ViktorHofer commented Jan 20, 2022

@jeffkl you got mail 👍

What I forgot to mention in the mail is that you can clearly see in the "outer" binlog that the Restore target runs for ~16s, whereas the "inner" binlog generated by static graph restore shows a duration of only 12.5s, which means the additional ~3s are spent somewhere in between. IIRC static graph restore happens out-of-process in a console app, so shelling out might contribute to the number.

@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Jan 28, 2022
@jeffkl

jeffkl commented Jan 31, 2022

Well, I think the long pole is how long it's taking to evaluate the projects:

Evaluated 1686 project(s) in 13603ms (1686 builds, 0 failures).

When I developed static graph-based restore, I was using a project tree with 750 projects which took 4 seconds to do a no-op restore. This repo is 1,686 projects and is taking 13.6 seconds (8ms per project on average). That is pretty darn fast in my opinion :) The NuGet side of things is only taking 3 seconds to determine that everything is up-to-date.

MSBuild does not cache evaluations from build to build, so I'm not sure how to make this faster. We'd have to look at the evaluations to see if we could speed them up. Static graph-based restore disables globs (stuff like *.cs), so that's unlikely to be causing slowness. The evaluations and builds are also very parallel, but if the project tree's dependencies are declared in a way that limits concurrency, we could see less parallelism. @rainersigwald what's your opinion, is there anything we can do to make evaluating 1,686 projects faster than 13 seconds?

@ViktorHofer
Member

ViktorHofer commented Jan 31, 2022

I don't think we actually have 1686 projects but 1686 evaluations (from both the inner and the outer dimensions).

@ViktorHofer
Member

Thanks for double checking that restore itself isn't the culprit for slowness. I have some PRs in the pipeline that will reduce about 200 unnecessary outer builds by switching from TargetFrameworks to TargetFramework.
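For context, a hedged sketch of the kind of change described above; the property value shown is a placeholder. A project that sets the plural TargetFrameworks property, even with a single value, gets an "outer" multi-targeting evaluation plus an "inner" per-framework build, whereas the singular TargetFramework property skips the outer dimension entirely.

```xml
<!-- Before (illustrative): plural property with a single value still produces an outer
     evaluation/build in addition to the inner one, i.e. two evaluations per project. -->
<PropertyGroup>
  <TargetFrameworks>$(NetCoreAppCurrent)</TargetFrameworks>
</PropertyGroup>

<!-- After (illustrative): singular property, only the inner build remains. -->
<PropertyGroup>
  <TargetFramework>$(NetCoreAppCurrent)</TargetFramework>
</PropertyGroup>
```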

@jeffkl

jeffkl commented Jan 31, 2022

I don't think we actually have 1686 projects but 1686 evaluations (from both the inner and the outer dimensions).

Yes, good point; 1,686 evaluations is what I meant.

@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Feb 24, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Mar 26, 2022