This repository has been archived by the owner on Nov 1, 2020. It is now read-only.

Base class library size on disk footprint workitems #5013

Open
2 of 6 tasks
MichalStrehovsky opened this issue Nov 23, 2017 · 2 comments

@MichalStrehovsky
Member

MichalStrehovsky commented Nov 23, 2017

I spent some time looking into why an optimized Hello World is 3.8 MB on Windows. There are two ways to look at this problem: what we do in the compiler to make it this big, and what we do in the class libraries that forces it to be so big. This issue tracks the latter.

With a set of hacks here, Hello World can go down to about 2.4 MB. I focused mostly on getting the reflection stack out of the binary.

The problems:

  • Some pieces of the class library have ETW tracing enabled. This heavily relies on reflection to work and brings pretty much everything into the binary.
  • ResourceManager. This is mostly used to bring localized exception messages. Too bad we don't have any in the framework. ResourceManager parses custom attributes, so custom attribute support, reflection method invocation, reflection field access etc. is all brought in.
  • Unhandled exception stack trace experience eagerly populates MethodBase when initializing StackFrame, bringing a lot of MethodInfo support in. This is totally my fault. I asked for this to be implemented similar to CoreCLR, but we have good reasons to make StackFrame a bit different.
  • Enum.ToString. We'll probably want a more low level version of this. There's no way around generating this. Someone will box an enum and someone will call a ToString. We'll need to generate this method. It had better be lightweight.
  • ValueType.Equals/GetHashCode. This reflects on fields. We've heard complaints that when we fall back to this implementation, it's super slow. Maybe we want something more low level here (e.g. small NativeFormat data structures describing field offsets and their types, if valuetype). If we build new data structures, we'll also need to teach the type loader to build them. Fixed in Remove reflection in ValueType.Equals/GetHashCode #5436.
  • IsByRefLike. This is implemented as a custom attribute search. Maybe we shouldn't be calling it from the codepath where it's being called right now. Fixed in Move check for IsByRefLike to MakeArrayType #5439.

There are also some other opportunities I didn't measure yet:

  • There's still a bunch of type loader left in the image. We're not loading new types. It shouldn't be needed.
  • Globalization support. We should look into exposing dummy globalization as a build option.
  • Manifest resources. We're still emitting 160 kB of resource strings even though the resource manager is gone.
  • There's a bunch more reflection stack in there, mostly to support Object.ToString().
@jkotas
Member

jkotas commented Nov 24, 2017

> Someone will box an enum and someone will call a ToString. We'll need to generate this method. It better be lightweight.

Are you trying to get rid of dependency on native metadata completely? I would think that saving the backing data for enums in the native metadata should be fine. Maybe the real problem here is that there is too much code required to fetch the enum values and names from native metadata.

> ETW tracing enabled

> Globalization support

These may be candidates for readonly-value overrides as implemented by Mono AOT: https://github.com/mono/mono/blob/50fa04c1365f68f309c6d0613c96672deb0d07fc/man/mono.1#L274 . The basic idea is that you place the call to the code you want to exclude behind a property and you tell the compiler what this property is going to return. It would be nice to do this optimization in IL Linker as an optional pass since it is generally applicable, though this one is low-level and policy-free enough that we can live with it in the ILCompiler too.
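A minimal sketch of the pattern described above, assuming a guard property that a compiler or linker pass could be told always returns `false`. All names here (`FeatureSwitches`, `TracingEnabled`, `Worker`, the `Sample.TracingEnabled` switch) are hypothetical and are not CoreRT or Mono code. The substitution step itself is tool configuration and is not shown.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical names throughout; this only illustrates the shape of the pattern.
public static class FeatureSwitches
{
    // If a compiler/linker pass is told that this getter always returns false,
    // it can treat the property as a constant and drop every branch guarded by it.
    public static bool TracingEnabled
    {
        get
        {
            return AppContext.TryGetSwitch("Sample.TracingEnabled", out bool on) && on;
        }
    }
}

public static class Worker
{
    public static readonly List<string> Trace = new List<string>();

    public static int DoWork(int x)
    {
        if (FeatureSwitches.TracingEnabled)
        {
            // Code reachable only from here (in the real case, the
            // reflection-heavy tracing support) becomes dead once the
            // guard is known to be false.
            Trace.Add("DoWork(" + x + ")");
        }
        return x * 2;
    }
}
```

Without any substitution the property is evaluated at run time, so the sketch behaves the same either way; the size win only appears when the toolchain folds the guard to a constant and removes the unreachable code.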

For globalization, we should base it on https://github.com/dotnet/corefx/blob/master/Documentation/architecture/globalization-invariant-mode.md. (It did not exist when the Dummy globalization was added.)

> ResourceManager

It would be nice to have an option to replace the localizable error strings with hardcoded single-language messages, or with no error strings at all. Again, it would be nice to do this as an optional pass in ILLinker.

> IsByRefLike. This is implemented as a custom attribute search

It should be fetched from EEType when we have one.

@MichalStrehovsky
Member Author

MichalStrehovsky commented Nov 24, 2017

> Maybe the real problem here is that there is too much code required to fetch the enum values and names from native metadata

This. To do a ToString, the higher level reflection stack will do a custom attribute walk to check for the Flags attribute. Checking for attributes is super expensive because it involves resolving ConstructorInfos and that's a lot of code. I would want us to have a low level fast path that uses the MetadataReader directly.
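To make the idea of a low level fast path concrete, here is a rough sketch of enum formatting that works only on pre-read names, values, and a flags bit, with no custom attribute or `ConstructorInfo` resolution involved. `SimpleEnumInfo` is a hypothetical type; how the arrays would be filled from native metadata via the MetadataReader is not shown, and duplicate enum values are ignored for simplicity.

```csharp
using System;
using System.Text;

// Hypothetical stand-in for data read directly from metadata.
public sealed class SimpleEnumInfo
{
    private readonly ulong[] _values;  // sorted ascending
    private readonly string[] _names;  // _names[i] is the name of _values[i]
    private readonly bool _isFlags;    // whether the enum is a flags enum

    public SimpleEnumInfo(ulong[] sortedValues, string[] names, bool isFlags)
    {
        _values = sortedValues;
        _names = names;
        _isFlags = isFlags;
    }

    public string Format(ulong value)
    {
        // Exact match: a single defined member.
        int index = Array.BinarySearch(_values, value);
        if (index >= 0)
            return _names[index];

        if (!_isFlags)
            return value.ToString();

        // Flags: peel off defined members from the largest value down,
        // prepending names so the result lists them in ascending order.
        var sb = new StringBuilder();
        ulong remaining = value;
        for (int i = _values.Length - 1; i >= 0 && remaining != 0; i--)
        {
            ulong v = _values[i];
            if (v != 0 && (remaining & v) == v)
            {
                remaining -= v;
                if (sb.Length > 0)
                    sb.Insert(0, ", ");
                sb.Insert(0, _names[i]);
            }
        }

        // Bits left over that no member covers: fall back to the number.
        return remaining == 0 ? sb.ToString() : value.ToString();
    }
}
```

The point of the sketch is the dependency shape: formatting needs two arrays and a boolean, so nothing in this path has to pull in the general reflection stack.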

> It should be fetched from EEType when we have one.

This is in a code path where all we have is a System.Type that might not have an EEType. So we need to compile the code that can deal with that, which involves custom attribute searches, and those are expensive.

MichalStrehovsky added a commit to MichalStrehovsky/corert that referenced this issue Jan 8, 2018
Contributes to dotnet#5013.

This adds support for emitting new data structures that describe instance field layout of valuetypes that can't be memcompared. The data structure enables us to iterate over fields at runtime and compare them individually.
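As an illustration only, a field layout descriptor of this kind could be pictured as a list of (offset, kind) pairs that the runtime walks to compare two instances field by field. The sketch below assumes raw instance images held in `byte[]` and only two field kinds; `FieldKind`, `FieldDescriptor`, and `LayoutComparer` are hypothetical names and do not reflect the actual data structures emitted by the commit.

```csharp
using System;

// Hypothetical field kinds; a real descriptor would cover many more cases.
public enum FieldKind { Int32, Double }

// Hypothetical descriptor: where a field lives in the instance and how to compare it.
public struct FieldDescriptor
{
    public readonly int Offset;
    public readonly FieldKind Kind;

    public FieldDescriptor(int offset, FieldKind kind)
    {
        Offset = offset;
        Kind = kind;
    }
}

public static class LayoutComparer
{
    // Compares two raw instance images field by field. Bytes that no
    // descriptor covers (for instance padding) are never looked at.
    public static bool FieldsEqual(byte[] a, byte[] b, FieldDescriptor[] layout)
    {
        foreach (FieldDescriptor field in layout)
        {
            switch (field.Kind)
            {
                case FieldKind.Int32:
                    if (BitConverter.ToInt32(a, field.Offset) != BitConverter.ToInt32(b, field.Offset))
                        return false;
                    break;

                case FieldKind.Double:
                    // Double.Equals semantics rather than a bit comparison:
                    // 0.0 and -0.0 have different bits but compare equal.
                    if (!BitConverter.ToDouble(a, field.Offset).Equals(BitConverter.ToDouble(b, field.Offset)))
                        return false;
                    break;
            }
        }
        return true;
    }
}
```

With a layout of an `Int32` at offset 0 and a `Double` at offset 8, two images that differ only in the padding bytes at offsets 4 to 7, or that hold `0.0` and `-0.0`, compare equal here even though a plain memory comparison would report a difference.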
MichalStrehovsky added a commit to MichalStrehovsky/corert that referenced this issue Feb 27, 2018
Contributes to dotnet#5013.

Having reflection field access and custom attribute parsing support in a code path reachable from `Enum.ToString` means that any "hello world"-style app needs to have pretty much the full reflection stack embedded in it. The reflection stack is huge.

I'm creating a shortcut EnumInfo that takes advantage of the fact that the TypeInfo is native metadata based and has a full type handle. It uses metadata APIs directly to read the metadata, bypassing the reflection stack.
MichalStrehovsky added a commit that referenced this issue Apr 8, 2019
Contributes to #5013.

Having reflection field access and custom attribute parsing support in a code path reachable from `Enum.ToString` means that any "hello world"-style app needs to have pretty much the full reflection stack embedded in it. The reflection stack is huge. This also makes access to uncached `EnumInfo` marginally faster.

This pretty much restores #3801, where we replaced the specialized code paths with the common reflection path to fix a bug around blocked types. I fix that bug by simply returning an empty `EnumInfo`.

I had to make generic type definition EETypes carry their CorElementType to make this work properly on generic type definitions of enums (for the corner case of an enum type nested under a generic type). I'll see how difficult it is to add this to the binder on the Project N side when this ports over. If it's too complex, I'll simply restore the logic that accesses the first instance field type on generic definitions using reflection (under `#if PROJECTN`).