Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Analyzer adversely affecting build performance #18618

Closed
roji opened this issue Oct 28, 2019 · 43 comments
Closed

Analyzer adversely affecting build performance #18618

roji opened this issue Oct 28, 2019 · 43 comments
Assignees
Labels
area-perf closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. type-bug
Milestone

Comments

@roji
Copy link
Member

roji commented Oct 28, 2019

See dotnet/core#3663.

@Pilchie suggested:

I suspect that the advice will be to switch to using Operation analyzers (at least for the memberaccess and objectcreationexpression).
That should let you re-use some already bound info instead of your GetSymbolInfo calls causing the compiler to do a bunch of extra work to re-bind them.

Also look into whether we can/should exclude migration files.

/cc @bricelam

@Pilchie
Copy link
Member

Pilchie commented Oct 28, 2019

@agocke suggested that the overloads taking lambdas are causing a lot of the time. Not sure how much moving to IOperation would help if the underlying binding is already taking a long time. @mavasani - any ideas?

@mavasani
Copy link

We have had some internal teams recently move from SyntaxNode/GetSymbolInfo based analyzer implementation to IOperation based implementation and they saw 4x-5x improvement in analyzer performance. I think it would be a good change to make regardless of whether or not it completely addresses the underlying performance concern of the analysis scenario.

@ajcvickers
Copy link
Contributor

Notes from triage:

  • Putting this in 3.1 to:
    • Ensure that generated code is being skipped
    • If the perf is still poor after skipping generated code, then consider also skipping migrations themselves. (It is possible these will be edited to use internal code, but that's not going to be common.)
  • Issue Revisit model-snapshot for perf #18620 covers the more general perf issue for big models and/or many migrations

@bricelam
Copy link
Contributor

bricelam commented Oct 28, 2019

Confirmed that we generate already generate this in the snapshot and code behind:

// <auto-generated />

Maybe Roslyn doesn't handle self-closing tags? (or attributes for that matter)

@roji
Copy link
Member Author

roji commented Oct 28, 2019

Will investigate what's going on tomorrow.

@Pilchie
Copy link
Member

Pilchie commented Oct 28, 2019

The roslyn code just looks at the start of the string, but the comment has to be the first thing in the file.

@Pilchie
Copy link
Member

Pilchie commented Oct 28, 2019

It also excludes some files by file name patterns, so it might be working already

@agocke
Copy link
Member

agocke commented Oct 28, 2019

So, modifying the analyzer could be useful, but the performance is still not great, even just building the code, because of the pattern of generated code.

The key thing on the analyzer side is that we're somehow avoiding re-using symbols from binding, effectively causing us to bind twice. Obviously this slows things down even more, but the underlying binding is still unfortunate. Overall on the analyzer side I'd say:

  1. I agree they probably shouldn't be running on generated code. Something I can debug.
  2. An analyzer model that re-uses results from binding would be helpful. I don't know offhand what the best architecture to get this result is

@mavasani
Copy link

RE: Generated code analysis

Please ensure that analyzer explicitly configures generated code analysis mode by invoking AnalysisContext.ConfigureGeneratedCodeAnalysis in the Initialize callback.

@bricelam
Copy link
Contributor

the performance is still not great, even just building the code, because of the pattern of generated code

@agocke We've filed #18620 to address this

@mavasani
Copy link

@bricelam Thanks.

I notice that you invoke GetSymbolInfo on every single member access and object creation in the compilation. That is going to be extremely expensive, and certainly not recommended for semantic analysis. @sharwell has done bunch of performance measurements in this space, and can confirm the same. You should certainly move to RegisterOperationAction for these callbacks. You can register for following OperationKinds:

  1. OperationKind.ObjectCreation - Gives back an IObjectCreationOperation
  2. OperationKind.Invocation - Gives back an IInvocationOperation
  3. OperationKind.FieldReference - Gives back an IFieldReferenceOperation
  4. OperationKind.MethodReference - Gives back an IMethodReferenceOperation
  5. OperationKind.PropertyReference - Gives back an IPropertyReferenceOperation
  6. OperationKind.EventReference - Gives back an IEventReferenceOperation

Each of the above operations directly expose the underlying target symbol and would avoid invoking GetSymbolInfo.

I would also avoid registering a syntax node action for ClassDeclarationSyntax and invoking GetDeclaredSymbol. Instead you can invoke RegisterSymbolAction with SymbolKind.NamedType and explicitly check the type kind on the call back to be TypeKind.Class.

@roji
Copy link
Member Author

roji commented Oct 29, 2019

tl;dr this is fixed in 3.1.

@bricelam added GeneratedCodeAnalysisFlags.None to the analyzer in 68b864c, which is in 3.1 but not in 3.0 (of course I had totally forgotten). So the 3.0 version of the analyzer indeed processes all code-generated, but 3.1 does not. As @Pilchie wrote above, Roslyn treats all *.Designer.cs as auto-generated (and some other name patterns), as well as any file starting with a comment that contains "<autogenerated" or "<auto-generated" (see full Roslyn logic here). So we're fine.

Did some tests to confirm that all is good. With latest release/3.1, compiling the provided migration code, we get the following times (in seconds):

Scenario Time (seconds)
Current 3.1 11.481
With GeneratedCodeAnalysisFlags.Analyze (as 3.0) 19.449
With null analyzer (no RegisterSyntaxNodeAction) 10.971
With no analyzer (no PackageReference) 7.603

We can definitely still optimize the analyzer as per @mavasani's suggestion, opened #18637 to track.

Unless someone objects, will leave this issue as closed-fix in 3.1 for future reference, as we don't have any other visible issue discussing it.

@roji
Copy link
Member Author

roji commented Oct 29, 2019

@Sally-Xu it can you please try the EF Core 3.1.0 daily builds and confirm the results?

@roji roji added the closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. label Oct 29, 2019
@bricelam bricelam closed this as completed Nov 2, 2019
@Sally-Xu
Copy link

Sally-Xu commented Nov 3, 2019

@roji,

I just downloaded EF Core 3.1.0-preview2.19525.5 and the problem is still there. Does the EF Core 3.1.0-preview2.19525.5 include this fix?

I haven't got time to setup my machine to try the daily build. Would rather wait for NuGet package to try it.

@roji
Copy link
Member Author

roji commented Nov 4, 2019

@Sally-Xu 3.1.0-preview2.19525.5 indeed contains the fix. I retested and as far as I can see the issue is gone, can you please double-check? If you're still seeing a slow build time, could you please provide detailed instructions (similar to the below)?

To test, I created a minimal project with your migrations (see https://github.com/roji/AnalyzerBuildPerf). I've done three tests, executing the following commandline:

dotnet clean; dotnet build; touch Program.cs; time dotnet build

Results:

Scenario Build Time (seconds)
Depending on EF Core 3.0 19.868
Depending on EF Core 3.1.0-preview2.19525.5 8.786
Analyzer disabled 6.022

The analyzer was disabled via a special fragment in the csproj, which allows passing /p:UseRoslynAnalyzers=false on the commandline.

@Sally-Xu
Copy link

Sally-Xu commented Nov 7, 2019

@roji, I just tested it again. Yes, the build time with 3.1.0-preview2.19525.5 is better than v3.0.0, but it is still not ideal.

For our project the build time is 53s (EF3.0) vs 32s (EF 3.1.0-preview2.19525.5 ) vs. 2.7s (without Migrations folder)

For your EFGames testing project, the build time on my machine is 34s( EF3.0) vs 9s (EF 3.1.0-preview2.19525.5) vs. 1s (Without Migrations folder)

The migrations code compile still takes the biggest chunk of build time.

Here are the binlog files: https://1drv.ms/u/s!AqUgraw9MWG6gaIdG0dcrq2wa0eAzA?e=L3qmDN

@roji
Copy link
Member Author

roji commented Nov 7, 2019

It's weird though... When @Sally-Xu deletes the migration directory, the build is super fast (~3s), implying that the non-migration code isn't problematic. At the same time, when I run a project containing only the migrations, I get an ~8s build (~6s with the analyzer totally disabled).

@Pilchie
Copy link
Member

Pilchie commented Nov 7, 2019

My point is that if the compiler has to consider overloads declared outside the migrations folder to resolve the calls inside the migrations folder, it could lead to this sort of complexity where both pieces are required to see the full effect of the perf problem

@roji
Copy link
Member Author

roji commented Nov 7, 2019

Right, understood - although migrations typically aren't supposed to interact in complex with the rest of the project code.

Anyway, it would be good to be able to look at the actual project producing this. Regardless I'll definitely also give #18637 a try as previously suggested.

@Sally-Xu
Copy link

Sally-Xu commented Nov 7, 2019

@roji ,

I created a new mini Web API project (https://github.com/Sally-Xu/net3buildissue) only contains the Migrations Folder and DBContext.

The build time on my machine is ~40 s (with Migrations) vs. 2 s (without Migrations).

You can un-comment the following setting in the .csproj file to test both situations:

<ItemGroup> <!--<Compile Remove="Migrations\**" /> --> </ItemGroup>

@roji
Copy link
Member Author

roji commented Nov 7, 2019

Thanks @Sally-Xu, I'll investigate this.

@roji
Copy link
Member Author

roji commented Nov 7, 2019

I've dug some more and indeed saw a big build perf difference between the repo @Sally-Xu provided and my own. I managed to narrow it down to a quite surprising factor: specifying <Project Sdk="Microsoft.NET.Sdk.Web"> in the csproj makes the build time shoot up from around 6s to around 16s. Here are some measurements:

Scenario Build time
Microsoft.NET.Sdk.Web 16s
Microsoft.NET.Sdk 6s
Microsoft.NET.Sdk.Web, without auto-generated migration files 4.5s
Microsoft.NET.Sdk, without auto-generated migration files 2s
Microsoft.NET.Sdk.Web with migration files but analyzer disabled 6.3s
Microsoft.NET.Sdk with migration files but analyzer disabled 5s

This was with EF Core 3.1.0-preview3.19554.8, on SDK 3.1.100-preview2-014569. I've updated the repo used (https://github.com/roji/AnalyzerBuildPerf) to be very similar to what @Sally-Xu provided (but changed from Npgsql to SQL Server to exclude any unknown issues, and some other minor changes).

Note that running with EF Core 3.0.0, where the analyzer didn't yet specify GeneratedCodeAnalysisFlags.None, makes the build time go up to 30s. Also, while with 3.0.0 I can see a warning by placing an offending internal snippet in auto-generated migration files, in 3.1.0 I can't. In other words, GeneratedCodeAnalysisFlags.None does seem to be doing its job - diagnostics aren't generated and build time is cut from 30s to 15s, suggesting the auto-generated files aren't processed.

Does anyone have any idea where to go from here? @Pilchie @mavasani @agocke @jaredpar @rainersigwald

@Pilchie
Copy link
Member

Pilchie commented Nov 7, 2019

The Web Sdk pulls in a bunch of other analyzers for MVC and stuff, so it's possible that they have similar problems. @mkArtakMSFT ?

@rainersigwald
Copy link
Member

@roji Is that all end-to-end build time? Where is the time going? The task time summary in MSBuild is probably a nice place to start.

@roji
Copy link
Member Author

roji commented Nov 7, 2019

@Pilchie that sounds like a promising direction.

@rainersigwald yeah, sorry, this is end-to-end time (am not a big build perf expert). Here's the msbuild detailed summary from an even slower machine, although it doesn't seem very helpful:

  Detailed Build Summary
  ======================


  ============================== Build Hierarchy (IDs represent configurations) =======================================
  ==============
  Id                  : Exclusive Time   Total Time   Path (Targets)
  ---------------------------------------------------------------------------------------------------------------------
  --------------
  0                   : 25.129s           25.129s       C:\projects\AnalyzerBuildPerf\web-api.csproj ()

  ============================== Node Utilization (IDs represent configurations) ======================================
  ==============
  Timestamp:            1        Duration   Cumulative
  ---------------------------------------------------------------------------------------------------------------------
  --------------
  637087586518145680:   0        25.129s     25.129s +++++ (scale 1:100)
  ---------------------------------------------------------------------------------------------------------------------
  --------------
  Utilization:          100.0    Average Utilization: 100.0

@rainersigwald
Copy link
Member

@roji I think the more interesting sections will be Project Evaluation Performance Summary, Project Performance Summary, Target Performance Summary, and Task Performance Summary.

Specifically:

  • If the Csc task is taking the bulk of the time, it's likely code complexity or analzyers.
  • If evaluation is slow we have more data we can get.

I suspect the former but it's cheap to confirm.

@roji
Copy link
Member Author

roji commented Nov 7, 2019

@rainersigwald I can dig into this more tomorrow if nobody else does before that (am in Berlin)

@roji
Copy link
Member Author

roji commented Nov 8, 2019

I investigated this further and all seems clear. Using Microsoft.NET.Sdk.Web brings in three additional analyzer packages: Microsoft.AspNetCore.Analyzers, Microsoft.AspNetCore.Mvc.Analyzers AND Microsoft.AspNetCore.Components.Analyzers. Lots of the contained analyzers do perform generated code analysis, explaining the situation. Did an analyzer-by-analyzer analysis (ha), and opened dotnet/aspnetcore#16922 to discuss possibly opting out for some/all of the analyzers.

@Sally-Xu thanks for your patience and for providing the repro project!

@larrytamnjong
Copy link

Looking at this comment thread 5 years later and we're facing the same problem built time on our Asp.net core 8 wep API app takes almost 2 hours. We have approximately 80 migrations.

@roji
Copy link
Member Author

roji commented Oct 25, 2024

@larrytamnjong we are not aware of any specific pending performance issues. How big are your migrations? Does everything run fast when you disable analyzers (see above)? If not, can you see which analyzers are taking how much time?

@larrytamnjong
Copy link

larrytamnjong commented Oct 25, 2024

Thanks, @roji . We have 88 migrations in total. When I exclude them using the following configuration:

<PropertyGroup> 
   <TargetFramework>net8.0</TargetFramework> 
   <DefaultItemExcludes Condition="'$(Configuration)' == 'Debug' ">$(DefaultItemExcludes);Migrations\\*.Designer.cs</DefaultItemExcludes> 
</PropertyGroup>

the project builds in just a few seconds. However, with migrations included, the build process becomes extremely slow—it can sometimes take over an hour and even freezes the computer. Disabling the analyzers hasn’t helped either.

We have migration files that are over 4 Mb I don't know if that's normal.

@larrytamnjong
Copy link

@roji is there anything else you need me to provide to be able to remedy the situation or to improve the build time of out solution with the migrations.

@roji
Copy link
Member Author

roji commented Oct 26, 2024

@larrytamnjong migration files over 4MB are generally not a good thing, and are very likely to cause pretty bad performance problems even regardless of any specific analyzer. What's the cumulative size of all of your migrations? What's the reason they're so big?

@larrytamnjong
Copy link

larrytamnjong commented Oct 26, 2024

@roji I have 88 migrations, which are exactly 498 Mb in size. I don't know why they're so big. Each designer file contains migrations from all the previous migrations I don't know if that's what's causing the huge size. The migration files themselves are small a few kb but the designer files are very heavy some are reaching 6Mb

@roji
Copy link
Member Author

roji commented Oct 27, 2024

@larrytamnjong then I suggest you take a look inside and understand what it is that's causing the size; for example, you may be using HasData() to seed huge data, which is very much discouraged. Having such huge migration source files is definitely not normal, and will likely affect your project in negative ways regardless of analyzers.

@larrytamnjong
Copy link

Thanks @roji we are actually using HasData to seed huge data I didn't figure this could be causing the issue. I'm confused about how to fix this considering the fact we already have migrations in production. Do you have any recommendations?

@roji
Copy link
Member Author

roji commented Oct 27, 2024

FWIW the problems around using HasData with huge data is called out in the docs. Going forward, the simplest would probably be to reset your migrations, starting again from scratch without HasData. Otherwise, you can manually modify the migration files to remove all HasData.

@larrytamnjong
Copy link

larrytamnjong commented Oct 27, 2024 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-perf closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. type-bug
Projects
None yet
Development

No branches or pull requests

9 participants