Trimming Tools in .NET 8 #78671

Closed
Tracked by #69739
agocke opened this issue Nov 22, 2022 · 20 comments
Labels: area-Tools-ILLink (.NET linker development as well as trimming analyzers), User Story (A single user-facing feature. Can be grouped under an epic.)

@agocke
Member

agocke commented Nov 22, 2022

There are two categories of tools that would help users with trimming:

  1. App size
  2. Trimming roots

For (1), users could use the tool to examine the size of items after trimming and see what's taking up the most space.

For (2), users could find out what's keeping a particular type or member around, and then remove the roots.

@dotnet-issue-labeler

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@ghost ghost added the untriaged New issue has not been triaged by the area owner label Nov 22, 2022
@agocke agocke added this to the 8.0.0 milestone Nov 22, 2022
@ghost ghost removed the untriaged New issue has not been triaged by the area owner label Nov 22, 2022
@agocke agocke added untriaged New issue has not been triaged by the area owner area-Tools-ILLink .NET linker development as well as trimming analyzers labels Nov 22, 2022
@ghost

ghost commented Nov 22, 2022

Tagging subscribers to this area: @agocke, @sbomer, @vitek-karas
See info in area-owners.md if you want to be subscribed.

@agocke agocke removed the untriaged New issue has not been triaged by the area owner label Nov 22, 2022
@agocke agocke added the User Story A single user-facing feature. Can be grouped under an epic. label Dec 5, 2022
@agocke agocke changed the title from "Add trimming analysis tools" to "Trimming Tools in .NET 8" Dec 12, 2022
@eerhardt
Member

eerhardt commented Jan 3, 2023

In case this helps drive features/solutions here, the most helpful tools I've found are:

App Size

  1. Set <PublishAot>true</PublishAot> in the .csproj
  2. Set <IlcGenerateMstatFile>true</IlcGenerateMstatFile> in the .csproj
  3. Publish your app
  4. In obj\<Configuration>\<Target_Framework>\<RID>\native\ there will be an <AppName>.mstat file. This file records which methods, types, etc. were left in the app and gives a rough size of the methods in the app.
  5. The mstat file is a binary file and has an interesting format. It can be read by a tool like: https://gist.github.com/MichalStrehovsky/2c7cb3d623c7f8901541914dab04238d. Just change the file path on line 12 to your app's.
    • The mstat tool needs <PackageReference Include="Mono.Cecil" Version="0.11.4" /> in order to build/run.
    • I have also modified this code to dump all the methods and their sizes in a specific namespace. This allows me to see "what's so big in namespace 'X'?". It also allows you to visualize things like generic expansions, as you will see the generic methods filled with each ValueType. Code for this is here, with the diff being the code in #if TRUE. Change the namespaceName as needed.
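
Step 4 above can be scripted so the path segments do not have to be typed by hand. The following is a minimal sketch, not part of any .NET tooling: the helper name `find_mstat_files` is hypothetical, and it only assumes that the .mstat file ends up somewhere under the project's obj directory, as described above. It does not read the binary format itself.

```python
from pathlib import Path


def find_mstat_files(project_dir):
    """Return every *.mstat file under <project_dir>/obj, largest first."""
    obj_dir = Path(project_dir) / "obj"
    if not obj_dir.is_dir():
        return []
    # rglob walks obj/<Configuration>/<Target_Framework>/<RID>/native/
    # without hard-coding any of those path segments.
    found = [p for p in obj_dir.rglob("*.mstat") if p.is_file()]
    return sorted(found, key=lambda p: p.stat().st_size, reverse=True)
```

Calling `find_mstat_files(".")` from the project directory after publishing should list one file per configuration/RID that was built.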

Trimming roots

For seeing "why" a chunk of code is included in the app, the easiest approach I found is:

  1. Set the following properties in the .csproj
    <PublishAot>false</PublishAot>
    <PublishTrimmed>true</PublishTrimmed>
    <SelfContained>true</SelfContained>
    <TrimMode>full</TrimMode>
    <DynamicCodeSupport>false</DynamicCodeSupport>
    <EventSourceSupport>false</EventSourceSupport>
  2. Publish your app
  3. Open all the published bin\<Configuration>\<Target_Framework>\<RID>\publish\*.dll files in ILSpy (or similar)
  4. Find the code you are interested in analyzing and use the (right-click) Analyze feature to see what is calling it.

This usually finds 80-90% of the reasons why a certain method/class is required by the app. When this doesn't tell me why, then I end up using https://github.com/dotnet/linker/tree/main/src/analyzer as my next option.
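
Conceptually, asking "what is calling this?" is a breadth-first walk over reversed call edges. The sketch below illustrates that idea only; it is not how ILSpy or the linker analyzer is implemented, and the `calls` dictionary (caller name to list of callee names) is a hypothetical input you would have to produce yourself.

```python
from collections import deque


def referrers(calls, target, max_depth=3):
    """Return (depth, caller) pairs for everything that reaches `target`
    within `max_depth` call edges, nearest callers first."""
    # Invert caller -> callees into callee -> callers.
    callers = {}
    for caller, callees in calls.items():
        for callee in callees:
            callers.setdefault(callee, set()).add(caller)

    seen = {target}
    queue = deque([(target, 0)])
    result = []
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the requested number of levels
        for caller in sorted(callers.get(node, ())):
            if caller not in seen:
                seen.add(caller)
                result.append((depth + 1, caller))
                queue.append((caller, depth + 1))
    return result
```

A caller with no callers of its own in the result is a candidate root: it is kept for some reason other than a direct call in the graph.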

@smhmhmd
Contributor

smhmhmd commented Mar 2, 2023

For (1), users could use the tool to examine the size of items after trimming and see what's taking up the most space.

A 'readelf'-like utility that prints information about assemblies (#73913 (comment)) could be useful for App size. If it does not already exist, can this be added as a task?

SizeBench seems to be a Windows-only utility.

@agocke
Member Author

agocke commented Mar 2, 2023

@eerhardt @amcasey do you think the above would be helpful for you? What format would you prefer to see?

@amcasey
Member

amcasey commented Mar 2, 2023

@agocke I'm not specifically familiar with either of those tools but my initial impression is that they both present a more or less flat list (albeit with granular categories). I guess the two main things I'd want from a solution are:

  1. Aggregation: member < type < namespace < assembly. A UI would be nice for this, but there are plenty of ways to see aggregating views of tabular data, so it doesn't seem essential.
  2. Some indication of causality: using aggregation, I've identified that X seems worth improving, so which other things refer to / cause X? Because the refers-to graph is so large, in my experience, you either need a (relatively complicated) UI for manually navigating it (showing and hiding portions on-demand) or you need some high-confidence heuristic for laying out a key subgraph non-interactively (e.g. here are the top three space users and three levels of referrers for each).
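
The aggregation in point 1 needs no UI to prototype. Here is a minimal sketch that rolls member sizes up to type, namespace, and assembly totals. The row layout (assembly, namespace, type, member, size in bytes) and the function names are assumptions for illustration, not the output format of any existing tool.

```python
from collections import defaultdict


def aggregate_sizes(rows):
    """Roll member sizes up into type, namespace, and assembly totals.

    rows: iterable of (assembly, namespace, type_name, member, size_bytes).
    """
    totals = {level: defaultdict(int) for level in ("assembly", "namespace", "type")}
    for assembly, namespace, type_name, _member, size in rows:
        totals["assembly"][assembly] += size
        totals["namespace"][(assembly, namespace)] += size
        totals["type"][(assembly, namespace, type_name)] += size
    return totals


def top(totals, level, n=3):
    """Return the n largest entries at one level as (key, size) pairs."""
    ranked = sorted(totals[level].items(), key=lambda kv: (-kv[1], str(kv[0])))
    return ranked[:n]
```

Keying namespaces and types by their full path keeps identically named namespaces in different assemblies apart.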

@eerhardt
Member

eerhardt commented Mar 2, 2023

I definitely don't think a UI is worth it at this point. Dumping the information to a text file or to stdout is good enough to start.

stdout/text file also allows for easy diffing between 2 apps/versions. I've found myself looking at diffs a lot lately. For example, if I add component X, I see my app size jump xxMB. What were all the changes between the two versions?
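
As a rough illustration of the diffing workflow, the sketch below compares two text dumps and lists what grew, shrank, appeared, or disappeared. The dump format (one entry per line, name and size in bytes separated by a tab) is an assumption made for this example, and both function names are hypothetical.

```python
def parse_dump(text):
    """Parse tab-separated 'name, size' lines into a dict of name -> size."""
    sizes = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split on the last tab so names containing tabs would still parse.
        name, size = line.rsplit("\t", 1)
        sizes[name] = int(size)
    return sizes


def diff_sizes(before, after):
    """Return (name, delta) pairs for every changed entry, largest growth first."""
    names = before.keys() | after.keys()
    deltas = [(n, after.get(n, 0) - before.get(n, 0)) for n in names]
    changed = [d for d in deltas if d[1] != 0]
    return sorted(changed, key=lambda d: (-d[1], d[0]))
```

Entries missing from one side are treated as size 0, so newly added code shows up as pure growth and removed code as a negative delta.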

@smhmhmd
Contributor

smhmhmd commented Mar 2, 2023

Dumping the information to a text file or to stdout is good enough to start.

My vote is also for text output

@amcasey
Member

amcasey commented Mar 2, 2023

Dumping the information to a text file or to stdout is good enough to start.

My vote is also for text output

Preferably, structured text output (i.e. json or csv).

@eerhardt
Member

eerhardt commented Mar 2, 2023

I would imagine the output format to be configurable. For example bloaty has this:

$ bloaty --help
Bloaty McBloatface: a size profiler for binaries.

USAGE: bloaty [OPTION]... FILE... [-- BASE_FILE...]

Options:

  --csv              Output in CSV format instead of human-readable.
  --tsv              Output in TSV format instead of human-readable.
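
A configurable output format along those lines is cheap to prototype. This is a sketch only: the `render` function and the (name, size) row shape are assumptions for illustration and do not mirror bloaty's implementation or any planned .NET tool.

```python
import csv
import io
import json


def render(rows, fmt="text"):
    """Render (name, size) rows as 'text', 'csv', 'tsv', or 'json'."""
    if fmt == "json":
        return json.dumps([{"name": n, "size": s} for n, s in rows], indent=2)
    if fmt in ("csv", "tsv"):
        buf = io.StringIO()
        delimiter = "," if fmt == "csv" else "\t"
        writer = csv.writer(buf, delimiter=delimiter, lineterminator="\n")
        writer.writerow(["name", "size"])
        writer.writerows(rows)
        return buf.getvalue()
    if fmt == "text":
        # Human-readable: right-aligned size, then the name.
        return "".join(f"{s:>10}  {n}\n" for n, s in rows)
    raise ValueError(f"unknown format: {fmt}")
```

Using the csv module rather than joining strings matters here because generic instantiations contain commas, and the writer quotes those fields automatically.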

@birojnayak
Contributor

Text output should be fine.

@smhmhmd
Contributor

smhmhmd commented Mar 8, 2023

For (2), users could find out what's keeping a particular type or member around, and then remove the roots.

@agocke
Could something like nm be helpful in tracking the roots?
nm has output like this:

0000000000000000 a metadata_test.cpp.
U mktime@GLIBC_2.2.5
U nanosleep@GLIBC_2.2.5
U opendir@GLIBC_2.2.5
U OPENSSL_cleanse@OPENSSL_3.0.0
U pclose@GLIBC_2.2.5
U perror@GLIBC_2.2.5
U popen@GLIBC_2.2.5

@sbomer
Member

sbomer commented Mar 8, 2023

For anyone interested in working on this, I have an incomplete write-up of a potential design for this kind of tool. It might be useful as a starting point: sbomer/linker@961474c?short_path=bcecbad#diff-bcecbad3970ec34a98732056ca2a36fdd2c7cfb1f7c285be0329d5116f6cf56c

@smhmhmd
Contributor

smhmhmd commented Mar 9, 2023

@sbomer, thanks for starting the write-up. I added a comment in it.

@MichalStrehovsky
Member

We also have a lossy text format that has this info. <IlcGenerateMapFile>true</IlcGenerateMapFile> will generate it in the obj directory. It looks like this:

<?xml version="1.0" encoding="utf-8"?>
<ObjectNodes>
  <ConstructedEEType Name="??_7NAotHello_Program@@6B@" Length="56" Hash="d4817aa5497628e7c77e6b606107042bbba3130888c5f47a375e6179be789fbb" />
  <ConstructedEEType Name="??_7System_Console_Microsoft_CodeAnalysis_EmbeddedAttribute@@6B@" Length="56" Hash="d4817aa5497628e7c77e6b606107042bbba3130888c5f47a375e6179be789fbb" />
  ...
</ObjectNodes>

It loses a bunch of information about assemblies or namespaces (well, it's still there, but behind mangling).
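
Even with mangled names, the map file is enough for a coarse breakdown. The sketch below sums the Length attribute per element kind using Python's standard xml.etree.ElementTree. It assumes the file has the shape of the sample above (children of a single ObjectNodes root, each with a decimal Length attribute, presumably in bytes); the function name is hypothetical and no demangling is attempted.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def sizes_by_node_kind(xml_text):
    """Sum the Length attribute of each child of <ObjectNodes>,
    grouped by element name (e.g. ConstructedEEType)."""
    root = ET.fromstring(xml_text)
    totals = Counter()
    for node in root:
        # Assumption: Length is a decimal count, as in the sample above.
        # Nodes without a Length attribute contribute 0.
        totals[node.tag] += int(node.get("Length", "0"))
    return totals
```

`sizes_by_node_kind(text).most_common()` then lists the node kinds from largest to smallest total.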

As the output of the compiler, we need something that doesn't lose any information. Either a text format that will be very complex, or a binary format. We already have the binary format. I'm open to having a text format, but it will not be easy to parse.

E.g. for a method, we need to capture:

Owning type name, namespace, assembly, including any generic instantiation parameters if the type is generic. If the generic parameters are specified, we need to be able to also map them to things like name, namespace, assembly, etc. Name of the method. If the method is generic, also generic arguments to the method in a form where we can easily identify what type that is (assembly name, namespace, if it's constructed like an array, the deconstruction of it etc.). Methods can have overloads so we also need the signature, encoding the types of all parameters.
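
To make the amount of information concrete, here is a minimal sketch of a record that carries those pieces for one method. The class and field names are hypothetical and do not correspond to the mstat format or to any existing API. Constructed types such as arrays are not modeled; they would need an extra field for the element type.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TypeRef:
    """A type identified by assembly, namespace, name, and generic arguments."""
    assembly: str
    namespace: str
    name: str
    generic_args: Tuple["TypeRef", ...] = ()

    def display(self) -> str:
        args = ", ".join(a.display() for a in self.generic_args)
        suffix = f"<{args}>" if self.generic_args else ""
        return f"[{self.assembly}]{self.namespace}.{self.name}{suffix}"


@dataclass(frozen=True)
class MethodRef:
    """A method identified by owner, name, generic arguments, and parameter types."""
    owner: TypeRef
    name: str
    generic_args: Tuple[TypeRef, ...] = ()
    parameters: Tuple[TypeRef, ...] = ()  # needed to tell overloads apart

    def display(self) -> str:
        gargs = ", ".join(a.display() for a in self.generic_args)
        suffix = f"<{gargs}>" if self.generic_args else ""
        params = ", ".join(p.display() for p in self.parameters)
        return f"{self.owner.display()}::{self.name}{suffix}({params})"
```

Because every generic argument is itself a full TypeRef, even a short method name expands into a long string, which hints at why a lossless text format would be awkward to parse.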

I'm imagining future tools that might be able to take you to the source code of the problematic thing. So we need the full fidelity. I see an IL based binary format as best equipped for this. Tools can make various textual "views" of this information.

@agocke
Member Author

agocke commented Mar 20, 2023

So that we have a place to work and contribute code, I've created a new branch feature/aot-tools in dotnet/runtimelab: https://github.com/dotnet/runtimelab/tree/feature/aot-tools. I think we can start moving some of the existing tools that people have built into that repository.

@agocke
Member Author

agocke commented Mar 24, 2023

I've added the existing MstatDump in dotnet/runtimelab#2238

@MichalStrehovsky
Member

I've built a GUI tool to inspect/diff/rootcause Native AOT size. Repo here: https://github.com/MichalStrehovsky/sizoscope. Personal repo since it's my vacation project and "code reviews" and "driving consensus" is the opposite of a vacation.

@NCLnclNCL

I've built a GUI tool to inspect/diff/rootcause Native AOT size. Repo here: https://github.com/MichalStrehovsky/sizoscope. Personal repo since it's my vacation project and "code reviews" and "driving consensus" is the opposite of a vacation.

Thanks

@agocke
Member Author

agocke commented Aug 10, 2023

Closing out as we don't plan any further work in .NET 8.

@agocke agocke closed this as completed Aug 10, 2023
@ghost ghost locked as resolved and limited conversation to collaborators Sep 9, 2023
8 participants