Inline resource strings in the compiler #80896

Merged
merged 6 commits into from
Jan 30, 2023

MichalStrehovsky
Member

Contributes to #80165.

Allows getting rid of resource manager.

Cc @dotnet/ilc-contrib

@ghost
ghost commented Jan 20, 2023

Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas
See info in area-owners.md if you want to be subscribed.

Author: MichalStrehovsky
Assignees: MichalStrehovsky
Labels:

area-NativeAOT-coreclr

Milestone: -

Member Author

@MichalStrehovsky MichalStrehovsky left a comment

Pointing out the hacks, but they're quite obvious.

Comment on lines 698 to 699
string resourceName1 = $"{module.Assembly.GetName().Name}.Strings.resources";
string resourceName2 = $"FxResources.{module.Assembly.GetName().Name}.SR.resources";
Member Author

We now need to inline knowledge about these names. I don't like this one either, but I don't see a way out.
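As a rough illustration of what "inlining knowledge about these names" means, the sketch below probes a list of manifest resource names for the two conventions quoted above. The class and method names (`ResourceNameProbe`, `FindStringResource`) are hypothetical, and it uses reflection only for the demo in `Main`; the compiler itself works on its own type system, so this is an analogy rather than the PR's code.

```csharp
using System;
using System.Linq;
using System.Reflection;

static class ResourceNameProbe
{
    // Returns the first manifest resource name matching one of the two
    // hardcoded conventions, or null if neither is present.
    public static string? FindStringResource(string assemblyName, string[] manifestResourceNames)
    {
        string resourceName1 = $"{assemblyName}.Strings.resources";
        string resourceName2 = $"FxResources.{assemblyName}.SR.resources";
        return manifestResourceNames.FirstOrDefault(
            n => n == resourceName1 || n == resourceName2);
    }

    static void Main()
    {
        // Demo only: inspect the core library that is loaded at run time.
        Assembly coreLib = typeof(object).Assembly;
        string? found = FindStringResource(
            coreLib.GetName().Name!, coreLib.GetManifestResourceNames());
        Console.WriteLine(found ?? "(no matching string resource)");
    }
}
```

The weakness discussed in this thread is visible here: any assembly that names its string resource differently is silently missed.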

Member

How does resource manager find it? In theory this has to be written somewhere in the compiled code, so the compiler could get it from there. But it's probably VERY complicated to do so.

Member Author

Resource manager grabs it from a piece of (generated) code. It looks like this:

namespace FxResources.System.Private.DisabledReflection
{
    internal static class SR { }
}
namespace System
{
    internal static partial class SR
    {
        private static global::System.Resources.ResourceManager s_resourceManager;
        internal static global::System.Resources.ResourceManager ResourceManager => s_resourceManager ?? (s_resourceManager = new global::System.Resources.ResourceManager(typeof(FxResources.System.Private.DisabledReflection.SR)));
    }
}

So we'd need to crack open the cctor on the System.SR class and find the type that is passed to resource manager's constructor. That's the name of the resource.

It's not impossible. But also not completely straightforward.

Member Author

Hm, but this is all generated by Arcade logic, so we can just change it: https://github.com/dotnet/arcade/blob/083fc0406279952794c853630a08a141eef7244f/src/Microsoft.DotNet.Arcade.Sdk/src/GenerateResxSource.cs

Looks like the extra type exists only to accommodate .NET Native (likely some old version of .NET Native that still did ILMerging, based on the comment).

So we can remove the useless type and instead call into the ResourceManager ctor that takes the resource name.

That will be a net improvement everywhere. I like those kinds of changes.

namespace System
{
    internal static partial class SR
    {
        private const string ResourceName = "FxResources.System.Private.DisabledReflection.SR";
        private static global::System.Resources.ResourceManager s_resourceManager;
        internal static global::System.Resources.ResourceManager ResourceManager => s_resourceManager ?? (s_resourceManager = new global::System.Resources.ResourceManager(ResourceName, typeof(SR).Assembly));
    }
}

Member

Nice - although it probably won't help that much to detect it from the compiler. We would still have to understand the .cctor and the like. I'd say keep the current "hardcoded" version and file a bug that this could be improved by analyzing the static .cctor. Even that would still rely on specific patterns, but it would be more robust.

Member Author

We'd be looking for the const string - that one is easy to find and can be just part of the protocol.
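To show why a const string is easy to find, here is a minimal sketch that reads a private constant named `ResourceName` from a type's metadata. The stand-in `SR` class, its constant value, and the helper `ReadResourceName` are illustrative assumptions. Reflection is used here for brevity, whereas the compiler would read the same field metadata through its own type system.

```csharp
using System;
using System.Reflection;

static class SR
{
    // Stand-in for the generated class; the value is illustrative only.
    private const string ResourceName = "FxResources.Example.SR";
}

static class ResourceNameProtocol
{
    // Reads the constant without running any code from the type:
    // a const is stored in metadata, so no .cctor analysis is needed.
    public static string? ReadResourceName(Type srType)
    {
        FieldInfo? field = srType.GetField(
            "ResourceName", BindingFlags.NonPublic | BindingFlags.Static);
        if (field is null || !field.IsLiteral || field.FieldType != typeof(string))
            return null;
        return (string?)field.GetRawConstantValue();
    }

    static void Main()
    {
        Console.WriteLine(ReadResourceName(typeof(SR)) ?? "(not found)");
    }
}
```

Because the value lives in metadata rather than in IL, the lookup does not depend on recognizing any particular code pattern in the property getter.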

case ParseFailureKind.FormatWithOriginalDateTimeAndParameter:
return new FormatException(SR.Format(SR.GetResourceString(result.failureMessageID)!, new string(result.originalDateTimeString), result.failureMessageFormatArgument));
//case ParseFailureKind.ArgumentNull:
// return new ArgumentNullException(result.failureArgumentName, SR.GetResourceString(result.failureMessageID));
Member Author

Would need to stop delaying grabbing the resource string. I'm not sure this one is on the hello world path, but we'd need to audit everything.

@marek-safar
Contributor

Does it make sense to optimize via "hacks" for hello world?

@MichalStrehovsky
Member Author

Does it make sense to optimize via "hacks" for hello world?

Looking for ways to make it non-hacky! That's why this is a draft.

In .NET 7 we shipped NativeAOT for command-line app scenarios. Lots of command-line apps can get by without ever touching the resource manager. For a small command-line utility, this can be a 40% saving (this brings hello world to 1.6 MB by default without any other feature switches or optimizations, just dotnet publish a release build).

@am11
Member

am11 commented Jan 20, 2023

Can linker help with inlining of string resources more broadly? Perhaps with some existing criteria, e.g. <SatelliteResourceLanguages>en-US</SatelliteResourceLanguages> which SDK recognizes.

@vitek-karas
Member

Linker could in theory do this as well - as a middle ground between full resources and resource keys only. The priority in linker is lower though because the size saving is relatively smaller (the constant 10MB of the native runtime outweighs any managed code in the app for small apps).

That said we should design this such that the same technique could be used in the linker as well.

@MichalStrehovsky MichalStrehovsky marked this pull request as ready for review January 25, 2023 09:33
@MichalStrehovsky
Member Author

Pushed out a new iteration. This addresses the caching issue. The optimization now also gracefully falls back if someone used SR.GetResourceString directly instead of the accessors. I don't think we would be able to police this in the BCL, and sometimes the use is legitimate. If we detect that, the resource blob is not optimized away.

{
string? displayName = SR.GetResourceString("Globalization_cp_" + codePage.ToString());
if (string.IsNullOrEmpty(displayName))
Member Author

I think this was dead code. The only situation in which GetResourceString could return null is if we failed to find the manifest resource blob.

I traced this back to dotnet/corert@db4c4e1 "this implementation does not try to find resource strings for EncodingName, because ProjectN does not support this yet". 6 years passed. We added support at some point.

@MichalStrehovsky
Member Author

Goldilocks before 16,486,400 bytes. Goldilocks after 16,224,768 bytes (1.6% savings).

Hello world before ~2.5 MB. Hello world after 1,857,536 bytes (massive% savings).

@VSadov
Member

VSadov commented Jan 26, 2023

The change in HelloWorld size is quite impressive

We can now optimize away the assembly even when resource strings are in use.
@MichalStrehovsky
Member Author

/azp run runtime-extra-platforms

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@MichalStrehovsky
Member Author

@dotnet/ilc-contrib could someone have a look?

On a high level:

  • When we're looking at IL to do substitutions, we additionally look for calls to SR.SomeResourceName. These are generated properties (generated by a piece of code in Arcade) that basically just do GetResourceString(nameof(SomeResourceName)). We look up what the resource string is (in the manifest resource) and replace the call with the looked up string literal.
  • We also keep track of calls to SR.GetResourceString. Seeing this in the graph means that the optimization was defeated: someone bypassed the generated accessors. If we see one, we add a dependency graph node representing the manifest resource that holds the strings.
  • When generating managed resources we skip over the one that has the strings unless the above dependency node is in the graph. This allows optimizing away the resource blobs if all accesses were inlined.
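The three steps above can be sketched with a toy model in which each call site is just the name of the SR member being called. All names (`InliningSketch`, `Process`, the textual "ldstr"/"call" output) are hypothetical and stand in for the real IL rewriting and dependency graph. Keeping the blob when an accessor's string cannot be found is my own conservative assumption, not something stated in the PR.

```csharp
using System;
using System.Collections.Generic;

static class InliningSketch
{
    // srCalls: names of SR members called by the program. The special name
    // "GetResourceString" models a direct call that bypasses the accessors.
    public static (List<string> Rewritten, bool KeepResourceBlob) Process(
        IEnumerable<string> srCalls,
        IReadOnlyDictionary<string, string> resourceStrings)
    {
        var rewritten = new List<string>();
        bool keepBlob = false;

        foreach (string member in srCalls)
        {
            if (member == "GetResourceString")
            {
                // Optimization defeated: the blob must stay in the output.
                keepBlob = true;
                rewritten.Add("call SR.GetResourceString");
            }
            else if (resourceStrings.TryGetValue(member, out string? value))
            {
                // Accessor call replaced by the looked-up literal.
                rewritten.Add($"ldstr \"{value}\"");
            }
            else
            {
                // Assumption: unknown accessor, leave the call and keep the blob.
                keepBlob = true;
                rewritten.Add($"call SR.get_{member}");
            }
        }
        return (rewritten, keepBlob);
    }

    static void Main()
    {
        var strings = new Dictionary<string, string> { ["Arg_Example"] = "Example message." };
        var (code, keep) = Process(new[] { "Arg_Example" }, strings);
        Console.WriteLine(string.Join(Environment.NewLine, code));
        Console.WriteLine($"Keep resource blob: {keep}");
    }
}
```

The blob is dropped only when every access went through an accessor that could be resolved, which mirrors the fallback behavior described in the list.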

Co-authored-by: Jan Kotas <jkotas@microsoft.com>
// Do not attempt to inline resource strings if we only want to use resource keys.
// The optimizations are not compatible.
bool shouldInlineResourceStrings =
!_hashtable._switchValues.TryGetValue("System.Resources.UseSystemResourceKeys", out bool useResourceKeys) || !useResourceKeys;
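The condition above is compact, so a standalone sketch of the same boolean logic may help. The dictionary stands in for the compiler's `_switchValues`; the class and method names are hypothetical. Inlining stays enabled unless the switch is both present and set to true.

```csharp
using System;
using System.Collections.Generic;

static class FeatureSwitchSketch
{
    // Mirrors the quoted condition: inline unless UseSystemResourceKeys
    // is explicitly set to true.
    public static bool ShouldInlineResourceStrings(IReadOnlyDictionary<string, bool> switchValues)
    {
        return !switchValues.TryGetValue(
                   "System.Resources.UseSystemResourceKeys", out bool useResourceKeys)
               || !useResourceKeys;
    }

    static void Main()
    {
        var unset = new Dictionary<string, bool>();
        var enabled = new Dictionary<string, bool>
        {
            ["System.Resources.UseSystemResourceKeys"] = true
        };
        Console.WriteLine(ShouldInlineResourceStrings(unset));
        Console.WriteLine(ShouldInlineResourceStrings(enabled));
    }
}
```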
Contributor

Any estimate of the additional savings we would get if this feature also worked with the UseSystemResourceKeys feature switch (some of the messages are very long)?

Member Author

Any estimate of the additional savings we would get if this feature also worked with the UseSystemResourceKeys feature switch (some of the messages are very long)?

This will always be inlined - for the below app:

Console.WriteLine(new NullReferenceException().Message);
throw null;

If I compile this with .NET 7 as dotnet publish -c Release /p:PublishAot=true /p:UseSystemResourceKeys=true and step into the ctor, I see this (the first lea is loading the string literal of the resource key - inlined from the accessor):

00007FF787207950 56                   push        rsi  
00007FF787207951 48 83 EC 20          sub         rsp,20h  
00007FF787207955 48 8B F1             mov         rsi,rcx  
00007FF787207958 48 8D 0D 11 7B 17 00 lea         rcx,[__Str_Arg_NullReferenceException (07FF78737F470h)]  
00007FF78720795F E8 0C C2 00 00       call        S_P_CoreLib_System_SR__GetResourceString (07FF787213B70h)  
00007FF787207964 C7 46 48 00 15 13 80 mov         dword ptr [rsi+48h],80131500h  
00007FF78720796B 48 8D 4E 08          lea         rcx,[rsi+8]  
00007FF78720796F 48 8B D0             mov         rdx,rax  
00007FF787207972 E8 89 BE F4 FF       call        RhpAssignRefAVLocation (07FF787153800h)  
00007FF787207977 C7 46 48 01 15 13 80 mov         dword ptr [rsi+48h],80131501h  
00007FF78720797E C7 46 48 03 40 00 80 mov         dword ptr [rsi+48h],80004003h  
00007FF787207985 48 83 C4 20          add         rsp,20h  
00007FF787207989 5E                   pop         rsi  
00007FF78720798A C3                   ret  

SR__GetResourceString didn't get inlined, probably because the IL is cluttered with too many nops and looks large even though it isn't. It looks like this and there's only one copy of it, so it's not a huge deal size-wise, but we could leave fewer nops when substitutions happen so as not to throw off the inlining heuristics:

00007FF787213B70 48 8B C1             mov         rax,rcx  
00007FF787213B73 C3                   ret  

@marek-safar
Contributor

We also keep track of calls to SR.GetResourceString. Seeing this in the graph means that the optimization was defeated

It might be worth enabling BannedApiAnalyzers for this.

@MichalStrehovsky
Member Author

We also keep track of calls to SR.GetResourceString. Seeing this in the graph means that the optimization was defeated

It might be worth enabling BannedApiAnalyzers for this.

Yes, we definitely have a couple calls to these for no good reason!

@MichalStrehovsky
Member Author

I've filed #81338 for the BannedApis and #81339 for SR.GetResourceString not being inlined.

@dotnet/ilc-contrib is there any other feedback? I would like this to make it into Preview 1.

Member

@vitek-karas vitek-karas left a comment

Looks good

@MichalStrehovsky MichalStrehovsky merged commit 759fecb into dotnet:main Jan 30, 2023
@MichalStrehovsky MichalStrehovsky deleted the resstring branch January 30, 2023 23:53
MichalStrehovsky added a commit to MichalStrehovsky/runtime that referenced this pull request Feb 1, 2023
@ghost ghost locked as resolved and limited conversation to collaborators Mar 2, 2023
6 participants