NativeAOT status for Android #106748

Open
vyacheslav-volkov opened this issue Aug 21, 2024 · 55 comments
Comments

@vyacheslav-volkov

I had previously raised this topic in another issue #101135, but I want to create a separate discussion as I couldn't find a place to track the progress on this matter.

One of the most serious and long-standing issues with Xamarin.Android is the slow startup time of applications. If you search the internet for "Xamarin.Android slow startup," you'll find hundreds of discussions on this topic. Even with all possible optimizations, including MonoAOT compilation, the startup time remains slow, and MonoAOT itself works incorrectly on Android (#101135). This problem is particularly noticeable with UI frameworks such as Avalonia, UNO, and MAUI. Developers simply cannot solve this problem on their own, as it is rooted in fundamental aspects of how the platform operates, and a significant amount of time is spent on JIT compilation. In the end, to write a "fast application" for Android that still lags behind native applications in startup speed, you need to perform a whole range of additional operations, which not every developer can manage, just to make the application start somewhat faster. I believe this is where the main problem lies: a developer expects the release build to work as it should right away, but instead encounters performance issues where they don't expect them.

When NativeAOT was introduced, I thought it would be the solution to the slow startup problem for Android. Starting with .NET 8.0, it became stable for iOS, and I began actively using it. The results are impressive: a fairly large application on an iPhone X launches as quickly as any native application and even faster than a similar application on a Samsung Galaxy S22 Ultra, despite all possible optimizations for Android. These devices were released five years apart, and I dread to imagine the startup time on a five-year-old Android device. Yes, there are still limitations on using dynamic code, but they are not that difficult to overcome, and the result is an application that performs as fast as a native one. Isn't that what we want for a cross-platform application? Moreover, I'm almost 100% sure that no one ships Android applications without ProfiledAOT or FullAOT because, in that case, you can forget about startup performance. This also means they are already using trimming, so transitioning to NativeAOT wouldn't require much additional effort. Over time, more libraries and frameworks will become fully compatible with NativeAOT, making integration seamless for developers.

However, observing the discussions about NativeAOT and the activity around this topic, I get the impression that the team does not give this problem enough priority, and no specific timelines have been set for its resolution. For example, in one of the discussions on GitHub, the following is mentioned:

These will likely work under Mono, but will need to be fixed one day in .NET 10 or some future release that supports NativeAOT.
dotnet/android#8724

This gives the impression that allocating resources for NativeAOT on Android is not a priority, and instead, new releases include optimizations that only provide marginal improvements (e.g., -10% startup time for test cases). However, in real-world conditions, such improvements do not solve the problem. If an application takes 2000ms to start, even reducing it to 1800ms makes little difference, and at best, such optimizations are noticeable only under ideal conditions.

It seems to me that the team does not fully grasp the depth of this issue. Many of my colleagues have already switched to Flutter specifically because of the slow startup times on Android. When their clients or customers ask why the Android application launches so slowly, developers are forced to reply that it is a limitation of the technology they are using. They may also suggest switching to iOS, where there are no such problems, but this is not an option.

In my opinion, the implementation of NativeAOT support for Android should be considered critically important. I would like to hear the team's thoughts on this matter: what should we expect? Will NativeAOT support for Android be added in the near future, or should we expect only small, incremental performance improvements that don't solve anything while everyone switches to Flutter?

@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Aug 21, 2024
@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Aug 21, 2024
@EgorBo EgorBo added os-android area-NativeAOT-coreclr and removed needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Aug 21, 2024
@jkotas
Member

jkotas commented Aug 21, 2024

cc @jonathanpeppers @jonpryor

@jonathanpeppers
Member

Maybe just to break down the work involved slightly:

  • Runtime team:
    • GC bridge (of some form) to support Java interop
    • NativeAOT runtime packs for Android: this is somewhat working with linux-bionic-arm64 packages, but we probably want actual android-arm64, etc. packages.
  • Android team:
    • Android workload (at build time), things like setting up the build for ILC, etc. We need to support multiple RIDs and join them into a single app during a build.
    • Android workload (at runtime) uses Mono embedding APIs for startup, loading assemblies, etc. We'd probably throw out this code (at least partially) and rewrite for NativeAOT.
  • MAUI team:
    • We'd probably want to run all their test suites on NativeAOT as well as Mono. They have some of this for iOS already.

We did some of the basic groundwork in .NET 9, such as:

  • Basic experiments with NativeAOT, testing Java interop can work.
  • Java.Interop and the Android .NET assemblies now have 0 trimmer warnings.

This seems like a multi-month effort involving multiple teams. I don't actually know when we'd start on this, as it's quite above my pay grade.

@agocke
Member

agocke commented Aug 21, 2024

NativeAOT runtime packs for Android: this is somewhat working with linux-bionic-arm64 packages, but we probably want actual android-arm64, etc. packages.

I'm somewhat skeptical of this. We've increasingly stopped doing things more specific than kernel-libc-arch in the runtime. It seems unlikely that Android needs more than what's already in our bionic packages.

GC bridge (of some form) to support Java interop

Agreed that this is necessary, but somewhat ill-defined, I think. It's not clear what functionality is available in Mono that isn't available in Core CLR.

@jkotas
Member

jkotas commented Aug 21, 2024

It seems unlikely that Android needs more than what's already in our bionic packages.

There are a number of special cases for Android in the higher-level runtime libraries. For example:

#if TARGET_ANDROID
private const string NativeHandlerType = "Xamarin.Android.Net.AndroidMessageHandler, Mono.Android";
private const string GetHttpMessageHandlerType = "Android.Runtime.AndroidEnvironment, Mono.Android";
.

These special-cases are unnecessary to get ordinary Linux-targeting code running on Android, but they are necessary for compatibility with Xamarin Android behaviors that exist today.

Agreed that this is necessary, but somewhat ill-defined, I think. It's not clear what functionality is available in Mono that isn't available in Core CLR.

Yes, the first step would be to extract the required functionality into an API proposal. The APIs that we have introduced for GC integration with ObjectiveC show the general shape to follow.

@alexyakunin

alexyakunin commented Aug 21, 2024

So... The ETA is prob not .NET 10, right?

@filipnavara
Member

filipnavara commented Aug 21, 2024

It seems unlikely that Android needs more than what's already in our bionic packages.

Aside from the things mentioned earlier, the whole Android crypto interop is currently not part of the linux-bionic packages.

Agreed that this is necessary, but somewhat ill-defined, I think. It's not clear what functionality is available in Mono that isn't available in Core CLR.

I had an idea to implement it in a way similar to Objective-C interop that I discussed informally with some of the stakeholders.

Here's the rough version copied from communication logs:

If you are familiar with the MonoVM bridge, skip this part:

The Java bridged objects have a marker. At the end of GC you find all the marked objects that were collected and reconstruct an object graph of them. Then you switch the Java strong GC refs to weak GC refs, reconstruct the edges from the GC graph on the Java side (when possible, so only for certain bridged objects that have a List<object> on the Java peer side), and run Java GC. Once both the .NET and Java GC are finished, you switch the Java GC handles back to strong ones, and collect everything that didn't survive either GC.

The idea is to decompose the process into two phases and reuse the same logic that ObjC GC interop (reference counting) and COM interop do.

  • When a marked Java peer object is found unused by GC:
    • If you have strong Java GC handle, convert it to weak GC handle. Return "ref count" == 1.
    • If you have weak Java GC handle, return "ref count" == JavaGCHandle.IsAlive
  • If you have a WeakReference pointing to Java peer object:
    • If you access Target, convert the Java GC handle to strong one (if Target != null)
  • Interop that converts a Java object to its .NET Java peer object looks it up in the internal dictionary. If it is found and has a weak handle, convert it to a strong handle first.

Downside: You need to do .NET GC, Java GC, .NET GC to completely clean up peer objects, i.e. one more GC on the .NET side than MonoVM does... but MonoVM actually does part of it too, just in hidden steps. Upside: You don't block the .NET GC on the Java GC, and the number of long-term surviving peer objects affects the GC much less.

Notably, I had some feedback on it and there may be additional problems with the approach that I didn't originally foresee (#104272 (comment)). We also didn't get anywhere near to implementing it, even as a rough prototype.
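
The two-phase idea above can be sketched as a small state machine. This is only an illustration under stated assumptions: `PeerHandle`, `HandleKind`, `JavaSideAlive`, and both method names are hypothetical, no JNI or runtime API is called, and Java-side liveness is simulated with a plain boolean. It is not the proposed implementation, which (as noted) was never prototyped.

```csharp
using System;

// Hypothetical model of one Java peer's handle; not a real runtime or JNI type.
enum HandleKind { Strong, Weak }

sealed class PeerHandle
{
    public HandleKind Kind { get; private set; } = HandleKind.Strong;

    // Simulated answer to "did the Java object survive the last Java GC?"
    public bool JavaSideAlive { get; set; } = true;

    // Called when the .NET GC finds the marked peer otherwise unused.
    // Returns the "ref count" that decides whether the peer is kept.
    public int OnFoundUnusedByDotNetGc()
    {
        if (Kind == HandleKind.Strong)
        {
            // First pass: downgrade to weak so the Java GC may collect the object.
            Kind = HandleKind.Weak;
            return 1;
        }
        // Later passes: keep the peer only while the Java object is still alive.
        return JavaSideAlive ? 1 : 0;
    }

    // Called when managed code resolves the peer again
    // (WeakReference.Target access or the interop dictionary lookup).
    public bool TryResurrect()
    {
        if (Kind == HandleKind.Weak && !JavaSideAlive)
            return false;           // Java object is gone; nothing to hand out.
        Kind = HandleKind.Strong;   // Promote before handing the peer out.
        return true;
    }
}

static class Program
{
    static void Main()
    {
        var peer = new PeerHandle();
        Console.WriteLine(peer.OnFoundUnusedByDotNetGc()); // first .NET GC: prints 1
        peer.JavaSideAlive = false;                        // simulated Java GC collected it
        Console.WriteLine(peer.OnFoundUnusedByDotNetGc()); // second .NET GC: prints 0
    }
}
```

The extra .NET GC mentioned as the downside shows up here as the second call: the peer can only be released on a pass after the handle has already been weakened.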

@agocke
Member

agocke commented Aug 21, 2024

Aside from the things mentioned earlier, the whole Android crypto interop is currently not part of the linux-bionic packages.

I stand corrected. I find this factoring pretty unfortunate, though.

@srxqds
Contributor

srxqds commented Aug 22, 2024

Why doesn't Microsoft continue to invest more manpower in optimization on monovm?

@huoyaoyuan
Member

Why doesn't Microsoft continue to invest more manpower in optimization on monovm?

Optimization can involve non-trivial duplicated work, and sometimes even a total rework from scratch to make sure the architecture is optimal. NativeAOT was built from scratch to make everything AOT friendly. RyuJIT was built from scratch to replace the old JIT, which originated from MSVC.
Being small doesn't mean being friendly to optimization; it's often the opposite due to lack of layering.

@srxqds
Contributor

srxqds commented Aug 22, 2024

Why doesn't Microsoft continue to invest more manpower in optimization on monovm?

Optimization can involve non-trivial duplicated work, and sometimes even a total rework from scratch to make sure the architecture is optimal. NativeAOT was built from scratch to make everything AOT friendly. RyuJIT was built from scratch to replace the old JIT, which originated from MSVC. Being small doesn't mean being friendly to optimization; it's often the opposite due to lack of layering.

Yes, you are right, but I hope the development team can pay more attention to it. MonoVM features and optimizations are always delayed or even ignored, and we are always told they're not important. I hope MonoVM gets the same level of attention as CoreCLR.

@srxqds
Contributor

srxqds commented Aug 22, 2024

I have opened so many issues (https://github.com/dotnet/runtime/issues/created_by/srxqds), and most of them are ignored.

@GerardSmit

GerardSmit commented Aug 22, 2024

Additional information:

NativeAOT for Android was experimented here: https://github.com/dotnet/runtimelab/tree/feature/nativeaot-android
And the write-up can be found here: https://github.com/dotnet/runtimelab/blob/feature/nativeaot-android/src/mono/sample/Android-NativeAOT/README.md

When you look at the section "Performance measurements", take it with a grain of salt. In Discord the following was mentioned when this document was released:

they measured the Debug version of NativeAOT
..
They also didn't strip debugging symbols, so that doubly explains the size.

The size and performance numbers for the "Pixel 7a" and "Emulator" devices were updated after this message, but I'm not sure about the "Samsung Galaxy S10 Lite", "Samsung Galaxy S23" and "Pixel 5". Those numbers didn't change after the initial commit (see Git Blame).


Why doesn't Microsoft continue to invest more manpower in optimization on monovm?

CoreCLR was built from the ground up. Mono has to adapt to CoreCLR, which can make things harder. For example, generics are currently a problem in Mono, while more and more generics are being used in libraries:

Those methods are all generic methods which are not AOT'ed. Mono doesn't have tiered JIT.
#104076 (comment) in MonoAOT Perf_Single and Perf_Double Regressions on 6/3/2024 6:35:27 PM

Mono has multiple backends; MonoJIT, MonoInterpreter, MonoLLVM (maybe I'm missing more) so implementing performance improvements is quite the task.

@vyacheslav-volkov
Author

@GerardSmit do you know what the Avalonia team used for this video https://www.reddit.com/r/dotnet/comments/13lvih2/nativeaot_ndk_vs_xamarinandroid_performance/? In their video the performance is as fast as what I get on iOS with NativeAOT.

@jonathanpeppers
Member

I also have a sample here, that should have been testing Release mode:

@GerardSmit

@vyacheslav-volkov I'm not sure what they used. I'm also not sure if they released any tools or source.
In the Reddit comments they commented the following:

We may commercialise it, as a way to generate revenue to support our continued OSS work.

Which may be the reason they never open-sourced/released this experiment.

@jonpryor
Member

At the risk of completely sidetracking this discussion, @vyacheslav-volkov wrote:

One of the most serious and long-standing issues with Xamarin.Android is the slow startup time of applications.

There are many parts of the stack, and the .NET for Android part of the stack is not that slow. On a Pixel 6 Pro:

  • Java app (Android Studio > New Project > Empty Views Activity), built via gradlew assembleRelease:

    I ActivityTaskManager: Displayed com.example.helloworld/.MainActivity for user 0: +141ms
    
  • dotnet new android for .NET 8, built via dotnet build -c Release:

    I ActivityTaskManager: Displayed com.companyname.android_net8_hw/crc64b62cfbcfada02d88.MainActivity for user 0: +234ms
    

This is the first time I launched these apps (no averaging or anything), and .NET for Android is 93ms slower.

I don't consider that to be a lot of overhead.

The problems you're observing are not solely in MonoVM or JIT or runtime or .NET for Android (or everything built atop of them). I would not expect NativeAOT to be a "silver bullet" either.

@steveisok
Member

Aside from the things mentioned earlier, the whole Android crypto interop is currently not part of the linux-bionic packages.

I stand corrected. I find this factoring pretty unfortunate, though.

The reason for this is pretty clear-cut. Most or all of the crypto APIs are Java APIs, and since linux-bionic does not include any of that (analogous to targeting the NDK), there's not much we can do short of re-opening the discussion of shipping OpenSSL as part of the runtime.

@vyacheslav-volkov
Author

@jonpryor I agree that an empty application starts up fairly quickly. However, once you start adding code, the startup time begins to grow dramatically. At this point, the startup time doesn't depend on actual code optimizations anymore; it all comes down to the JIT compilation speed and efforts to reduce it.

For example, some advice suggests using only classes because it supposedly reduces compilation time #101135 (comment). But I can't say this is great advice, considering that everything in .NET is moving towards reducing allocations in the heap, and the framework itself is actively shifting towards using struct everywhere. Prohibiting the use of struct just for the sake of a faster startup sounds unreasonable.

Or take a simple MAUI application: its startup time will be around 600 ms. This means that an empty application already starts 2-3 times slower than a native application. As the developer adds their code, the startup time ranges from 1500 to 5000+ ms. In this case, traditional code optimization doesn't work — the developer must understand that optimization here is about easing the JIT compilation process rather than improving the code itself.

Here's a real example: my framework doesn't use any complex features, but it has a lot of structs and generics. The actual startup time with FullAOT on Android is about 1000 ms on a Galaxy Note 10. The same code on Xamarin.iOS with NativeAOT starts instantly on an iPhone X.

Here's the link to the repository https://github.com/vyacheslav-volkov/PerfAndroidTest/tree/main, where you can find two projects — Android (FullAOT) and iOS (NativeAOT). I added .speedscope.json files for the Android project to the trace folder. If you have time, please take a look and give me some advice on how I can improve the startup time without changing the runtime or using NativeAOT for Android. Also, if you check this issue #101135 you will find that in the current state of Xamarin.Android, even when using FullAOT, it is not possible to AOT generics and structs.

@jonpryor
Member

Here is another sample which uses NativeAOT on Android, and unlike @jonathanpeppers's sample it has the benefit of looking like .NET for Android, with a C# subclass of Android.App.Activity: https://github.com/dotnet/java-interop/tree/main/samples/Hello-NativeAOTFromAndroid

For comparison to the previous Android times:

I ActivityTaskManager: Displayed net.dot.jni.helloandroid/my.MainActivity for user 0: +300ms

Compare 300ms to Java (+141ms) and .NET for Android (+234ms). This also contains additional debug prints, so isn't directly comparable, but should further emphasize that NativeAOT in and of itself will not be a silver bullet to all of your startup woes. A lot depends upon code higher up the stack.

At present, one of the primary blockers keeping us from dedicating more effort to NativeAOT support within .NET for Android is the lack of a decent GC story. (The current story is "everything leaks, lol".)

I do not foresee dedicating significant effort to support NativeAOT on the .NET for Android side until after the GC story is complete, and I'd further guesstimate that we'd want at least one .NET release after the GC exists before we'd support it.

@alexyakunin wrote:

So... The ETA is prob not .NET 10, right?

If NativeAOT has a GC story for .NET 10, I'd tentatively hope for preview support in .NET for Android by .NET 11. Maybe. (There are a number of unknown unknowns, and would not want to get anyone's hopes up.) Increase numbers as appropriate.

@agocke
Member

agocke commented Aug 22, 2024

A lot depends upon code higher up the stack.

This sounds right to me. For reference, the console app startup for native aot on a Linux desktop machine is measured in microseconds so the runtime overhead in native aot is ~0. All of the startup impact is the cost of the code running in the startup path.

@vyacheslav-volkov
Author

vyacheslav-volkov commented Aug 22, 2024

@jonpryor I just conducted a quick and rough test, but I think the point will be clear. I measured the initialization time of services in my test application twice, meaning the same code was executed twice. The first time it needed time for JIT compilation, and the second time it ran without JIT compilation. The second run was 27 times faster. As a developer, there’s nothing I can do to affect JIT compilation, and traditional code optimizations won’t work. I would need to rewrite the entire codebase just to make it easier for Android’s JIT compiler to handle it. And this is just a small portion of the code needed to launch the application. In this code, nothing is being called other than the creation and registration of services.

        // First run: the measured time includes compiling every method on this path.
        var stopwatch = Stopwatch.StartNew();
        MugenApplicationConfiguration.Configure()
                                     .AndroidConfigurationGeneratedBindings<MainViewModel, MainActivity>(true, null, this)
                                     .PerfAndroidGeneratedBindingConfiguration()
                                     .CompositeUIConfiguration(new ShellHandlerProvider())
                                     .WithComponent(new MainSectionManager());
        stopwatch.Stop();
        Log.Wtf("STARTUP1", stopwatch.Elapsed.ToString());

        // Second run: identical calls, but the code has already been compiled.
        stopwatch.Restart();
        MugenApplicationConfiguration.Configure()
                                     .AndroidConfigurationGeneratedBindings<MainViewModel, MainActivity>(true, null, this)
                                     .PerfAndroidGeneratedBindingConfiguration()
                                     .CompositeUIConfiguration(new ShellHandlerProvider())
                                     .WithComponent(new MainSectionManager());
        stopwatch.Stop();
        Log.Wtf("STARTUP2", stopwatch.Elapsed.ToString());

Phone: Galaxy Note 10, Release build + FullAOT
STARTUP1    00:00:00.1165980
STARTUP2    00:00:00.0043421

If NativeAOT can make any user code execute in 50-100 ms (based on this example, this is more than enough if we don't need JIT compilation), plus an additional runtime execution time of 250-300 ms, we would achieve a total startup time of 350-400 ms for any application. This is comparable to the startup time of native applications.
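
For reference, the "27 times" figure follows directly from the two logged values. A minimal sketch of that arithmetic, where `StartupRatio` and `ColdToWarm` are illustrative names rather than anything from the test project:

```csharp
using System;
using System.Globalization;

static class StartupRatio
{
    // Ratio of the first (cold) run to the second (warm) run.
    public static double ColdToWarm(string cold, string warm)
    {
        TimeSpan first = TimeSpan.Parse(cold, CultureInfo.InvariantCulture);
        TimeSpan second = TimeSpan.Parse(warm, CultureInfo.InvariantCulture);
        return first.TotalMilliseconds / second.TotalMilliseconds;
    }

    static void Main()
    {
        // Values reported above for the Galaxy Note 10, Release build + FullAOT.
        double ratio = ColdToWarm("00:00:00.1165980", "00:00:00.0043421");
        Console.WriteLine(ratio.ToString("F1", CultureInfo.InvariantCulture));
    }
}
```

The ratio only compares the two measured runs; it says nothing about how much of the first run NativeAOT would actually remove.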

@jonathanpeppers
Member

jonathanpeppers commented Aug 22, 2024

As a developer, there’s nothing I can do to affect JIT compilation

@vyacheslav-volkov for your example above, have you tried either "AOT everything" with -p:AndroidEnableProfiledAot=false, or recording a custom AOT profile?

By default, we use a built-in AOT profile that won't include most of your code. It is a reasonable tradeoff for app size vs startup time.

If you can use AOT for the code above, STARTUP1 should be much quicker.

(Note that this is using Mono's AOT in the current product, and is completely unrelated to NativeAOT.)

@vyacheslav-volkov
Author

vyacheslav-volkov commented Aug 22, 2024

@jonathanpeppers I've used this test project with FullAOT (Mono's AOT), you can check the config, it should be good:
https://github.com/vyacheslav-volkov/PerfAndroidTest/blob/main/PerfAndroid/PerfAndroid.csproj#L12-L14

Out of curiosity, I ran the same code without AOT, and here’s the result:

STARTUP1  00:00:00.2220397
STARTUP2  00:00:00.0038740

@jonathanpeppers
Member

@vyacheslav-volkov can you check it's actually using AOT? It seems odd AOT would make the first run worse than JIT.

adb shell setprop debug.mono.log default,mono_log_level=debug,mono_log_mask=aot

This should make Mono print out a log message for each method like:

10401 10401 D Mono    : AOT: FOUND method Microsoft.AspNetCore.Components.WebView.Maui.BlazorWebView:.ctor () [0x6f9efd0150 - 0x6f9efd0340 0x6f9efd260c]

Note it's expected some methods will say:

10401 10401 D Mono    : AOT NOT FOUND: (wrapper runtime-invoke) object:runtime_invoke_void (object,intptr,intptr,intptr).
10401 10401 D Mono    : AOT NOT FOUND: (wrapper managed-to-native) System.Diagnostics.Debugger:IsAttached_internal ().
10401 10401 D Mono    : AOT NOT FOUND: (wrapper native-to-managed) Android.Runtime.JNINativeWrapper:Wrap_JniMarshal_PPL_V (intptr,intptr,intptr).

Clear debug.mono.log after testing (as it will slow down apps). You can reboot the device or use adb shell setprop debug.mono.log "''"
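
Once the log is captured, the two message shapes above can be tallied to see how much of the startup path was precompiled. A rough sketch, assuming the log was saved to text first (for example from `adb logcat -d`) and piped in on stdin; `AotLogSummary` and `Count` are illustrative names, and the match strings are taken only from the sample lines shown above:

```csharp
using System;
using System.Collections.Generic;

static class AotLogSummary
{
    // Counts Mono AOT lookup results in logcat lines, using the two message
    // shapes shown above: "AOT: FOUND method" and "AOT NOT FOUND:".
    public static (int Found, int NotFound) Count(IEnumerable<string> lines)
    {
        int found = 0, notFound = 0;
        foreach (string line in lines)
        {
            if (line.Contains("AOT NOT FOUND:", StringComparison.Ordinal)) notFound++;
            else if (line.Contains("AOT: FOUND method", StringComparison.Ordinal)) found++;
        }
        return (found, notFound);
    }

    static void Main()
    {
        // Reads the saved log from stdin, one line at a time.
        var lines = new List<string>();
        for (var line = Console.ReadLine(); line != null; line = Console.ReadLine())
            lines.Add(line);

        var (found, notFound) = Count(lines);
        Console.WriteLine($"AOT found: {found}, not found: {notFound}");
    }
}
```

A few "not found" wrapper methods are expected, as noted above, so the count is only meaningful when it is large relative to the "found" count.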

@charlesroddie

My observations and suggestion as an end user:

Android is 2+ years behind other targets for AOT compilation in dotnet.

This stems from using the Android SDK which relies on Java interop, which makes things much more complicated than any other platform. The possible plans described above (#106748 (comment), #106748 (comment)) suggest tackling these issues which may take a long time.

Surely the NDK is a better target

Flutter uses the NDK and fully AOT-compiles everything, and you can access relevant Android-specific APIs from Dart. This is where dotnet should be:

https://docs.flutter.dev/resources/faq#run-android The engine's C and C++ code are compiled with Android's NDK. The Dart code (both the SDK's and yours) are ahead-of-time (AOT) compiled into native, ARM, and x86-64 libraries. Those libraries are included in a "runner" Android project, and the whole thing is built into an .apk. When launched, the app loads the Flutter library. Any rendering, input, or event handling, and so on, is delegated to the compiled Flutter and app code.

In dotnet, there are some POCs as mentioned above (https://www.reddit.com/r/dotnet/comments/13lvih2/nativeaot_ndk_vs_xamarinandroid_performance/ and more recently https://github.com/jonathanpeppers/Android-NativeAOT ), both using SkiaSharp. But we would need the NDK to be callable from dotnet, similar to calling native code from dotnet in SkiaSharp, dotnet-ios, WinUI, etc.

@alexyakunin

alexyakunin commented Aug 25, 2024

Just read: https://docs.avaloniaui.net/docs/basics/user-interface/controls/creating-controls/defining-properties

AvaloniaProperty.Register<MyCustomButton, int>(...)

Almost crying... As far as I can reason, that's nearly the worst case scenario, assuming it's not int, but any type declared outside of mscorlib. It's going to skip AOT for all <AnyClass, struct> params just because there is a struct.

There must be > 1K properties in such a demo. Assuming Register is not the only generic method called at least once per property registration, and this happens for every property on every component type, it might easily add up to a few thousand generic method instances with missing AOT code.

In our case we see ~4K of AOT_NOT_FOUND methods in Mono debug log, and JIT alone eats up ~ 1.5s of time on Galaxy S23 Ultra.

One other notable case is caching: if you use the ConcurrentDictionary<TKey, TValue>.GetOrAdd<TArg>() overload with one of the type arguments being a value type, all such calls will also require JIT.

NativeAOT handles all these cases, and this might at least partially explain such a dramatic difference.
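
To make the caching case concrete, here is a minimal sketch of the call shape being described. `ScaleContext`, `CacheExample`, and `GetScaledLength` are hypothetical names for illustration; `GetOrAdd<TArg>(key, valueFactory, factoryArgument)` is the real overload. Whether this particular instantiation ends up without AOT code under Mono is the claim made above, not something the sketch verifies.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical user-defined value type passed as the factory argument.
readonly struct ScaleContext
{
    public ScaleContext(int factor) => Factor = factor;
    public int Factor { get; }
}

static class CacheExample
{
    static readonly ConcurrentDictionary<string, int> Cache = new();

    // GetOrAdd<TArg> is instantiated here with TArg = ScaleContext, a struct
    // declared outside the core library, so the runtime needs code specific
    // to that value type.
    public static int GetScaledLength(string key, ScaleContext context) =>
        Cache.GetOrAdd(key, static (k, ctx) => k.Length * ctx.Factor, context);

    static void Main()
    {
        Console.WriteLine(GetScaledLength("startup", new ScaleContext(3))); // prints 21
    }
}
```

Passing state through `TArg` is usually done to avoid a closure allocation, which is why this pattern tends to appear on hot paths that also run during startup.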

@alexyakunin

alexyakunin commented Aug 25, 2024

@emmauss If you guys can try running the demo with the Mono debug log enabled and share the number of AOT_NOT_FOUND methods at startup, it would be great... Maybe it will help MS folks seriously think about the priority of this issue. Especially in light of "you'll be lucky to have NativeAOT for Android in .NET 11".

adb shell setprop debug.mono.log default,assembly,mono_log_level=debug,mono_log_mask=all

@agocke agocke added this to the Future milestone Aug 29, 2024
@agocke agocke removed the untriaged New issue has not been triaged by the area owner label Aug 29, 2024
@agocke
Member

agocke commented Aug 29, 2024

One note: a lot of the above costing implicitly assumes MAUI is on top, meaning that the system needs tight JVM integration. I don't know if platforms like Avalonia actually require that. If not, and they can compile against the Android NDK, the cost and schedules may change. I'll let someone from Avalonia speak on how they used Native AOT in the past.

@emmauss

emmauss commented Aug 29, 2024

One note: a lot of the above costing implicitly assumes MAUI is on top, meaning that the system needs tight JVM integration. I don't know if platforms like Avalonia actually require that. If not, and they can compile against the Android NDK, the cost and schedules may change. I'll let someone from Avalonia speak on how they used Native AOT in the past.

We tested Native AOT with the Android NDK, using NativeActivity for our activity. This cut support for SDK APIs that modern Android apps use, such as the Storage Access Framework, window insets, text and input composition, and embedding native Android views in-app. These are APIs we need for storage, window customization and text prediction support. Also, we couldn't use any dotnet android libraries. These do not make it appealing to end users as they will be cut off from the rich dotnet android library ecosystem, and also need to set up a lot of build scripts just to build and sign their app.

@jonathanpeppers
Member

jonathanpeppers commented Aug 29, 2024

One note: a lot of the above costing implicitly assumes MAUI is on top, meaning that the system needs tight JVM integration. I don't know if platforms like Avalonia actually require that. If not, and they can compile against the Android NDK, the cost and schedules may change. I'll let someone from Avalonia speak on how they used Native AOT in the past.

Generally, I don't know how you would make a "real" Android application without calling Java APIs. Even Unity3d games would use their Java interop support for things like in-app purchases, push notifications, etc. There are a lot of random OS features you have to access from Java, so I would think most Avalonia apps would also need to use these.

@agocke
Member

agocke commented Aug 29, 2024

Generally, I don't know how you would make a "real" Android application without calling Java APIs.

I believe you could still call Java APIs through JNI, it would just be significantly more effort than the current implementations.

do not make it appealing to end users as they will be cut off from the rich dotnet android library ecosystem, and also need to set up a lot of build scripts just to build and sign their app

Agreed, the downside of this approach would be none of the existing Android/Java interop would work.

@vyacheslav-volkov
Author

@agocke If someone could fix this issue #101135, we could use it as a workaround until full support for NativeAOT is available. Could someone from the team assess how difficult this task might be and estimate how long it might take to fix? Currently, we do not have a truly working solution to the slow startup problem.

@agocke
Member

agocke commented Aug 30, 2024

That issue has 86 comments, so let me see if I can summarize. That's not one issue but really a blanket issue for: we've seen a variety of methods that must be JITed in our sample apps, which causes slow startup. Is that right? If so, I would expect that issue to verge on impossible to fix. Rearchitecting Mono to AOT everything is more expensive than just using Native AOT.

@alexyakunin

alexyakunin commented Aug 31, 2024

That issue has 86 comments, so let me see if I can summarize. That's not one issue but really a blanket issue for: we've seen a variety of methods that must be JITed in our sample apps, which causes slow startup. Is that right? If so, I would expect that issue to verge on impossible to fix. Rearchitecting Mono to AOT everything is more expensive than just using Native AOT.

No, it's not quite right: there is a very specific scenario where AOT code isn't generated for a generic method instance:

  • It's an instance with a ValueType parameter
  • Which isn't a primitive type - this was relaxed to a type from mscorlib in a partial fix
  • And if I am not mistaken, the method itself has to be declared in the same assembly as its parameter.
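To make the scenario in the list above concrete, here is a minimal sketch of the kind of code that would match all three conditions. The names `Point`, `Helpers.Echo`, and `Demo.Run` are hypothetical and not taken from the linked issue; whether a given instance actually ends up being JITed depends on the Mono AOT compiler version and the checks described above.

```csharp
using System;

// Hypothetical user-defined value type: a struct that is neither a primitive
// nor a type from the core library.
public readonly struct Point
{
    public readonly int X;
    public readonly int Y;

    public Point(int x, int y)
    {
        X = x;
        Y = y;
    }
}

public static class Helpers
{
    // Generic method declared in the same assembly as Point.
    public static T Echo<T>(T value) => value;
}

public static class Demo
{
    public static int Run()
    {
        // Instantiation over a primitive type: not the case described above.
        int a = Helpers.Echo(42);

        // Instantiation over a custom struct: per the description above, this is
        // the kind of generic instance for which AOT code may be missing, so the
        // runtime would have to JIT it at startup (assumption based on the thread).
        Point p = Helpers.Echo(new Point(1, 2));

        return a + p.X + p.Y;
    }
}
```

The code itself behaves the same under JIT and AOT; the difference discussed here is only in when the native code for `Echo<Point>` is produced.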

I'll find the link to the specific piece of code making all these checks a bit later (already shared it here).

Overall, my impression is: yes, some extra work is probably necessary to eliminate some of these constraints (e.g. modifying AOT code lookup logic, etc.), but this isn't as complicated as a full overhaul of Mono AOT.

Moreover, I suspect some of these constraints were originally added to decrease the number of generic implementations AOT generates in Full mode, and it happened at the time when generics weren't so widespread + there was no profile-based AOT mode.

@alexyakunin

alexyakunin commented Aug 31, 2024

That issue has 86 comments

I also wish people responding to it had taken it seriously right from the beginning instead of saying ~ "well, you guys are fine - the app is at least starting, right? as for the startup time, it's sad, but please wait for a few more years."

I haven't written publicly about this bizarre issue with AOT because I genuinely love .NET and believe you guys are doing a great job making it better. So even though this discovery means Mono AOT is 90% fake, and profile-guided AOT deserves this name only formally, I'd rather wait for MS to address it.

And somehow... Somehow I discover that "it's fine" feels like an acceptable answer for MS here. But seriously, how can it be fine, if a single post about this would decrease the chance of MAUI being used by a given company by maybe 50%? Isn't an elevated ANR rate and Play Store penalization one of the worst things you can face, assuming you can't fix this?

@alexyakunin

alexyakunin commented Aug 31, 2024

My point is: if you guys ran MAUI as a startup, this issue would be instantly classified as "existential":

  • If people start mentioning it as the number one concern, all of our other efforts to promote MAUI are doomed.
  • Thus, no matter how much we invest in other features of MAUI or Blazor Hybrid, this single thing deserves more attention.

@agocke
Member

agocke commented Sep 3, 2024

No, it's not quite right: there is a very specific scenario where AOT code isn't generated for a generic method instance:

It's an instance with a ValueType parameter
Which isn't a primitive type - this was relaxed to a type from mscorlib in a partial fix
And if I am not mistaken, the method itself has to be declared in the same assembly as its parameter.

Assuming the above are true, my understanding is that this is not quite as expensive as guaranteeing Mono can AOT any code, but it's close. My understanding is that specialization of value types is one of the main limitations in the current architecture and implementing it would be a very large work item.

It's certainly possible that there is a simpler implementation I'm not aware of, but I would start by estimating the cost as very expensive.

@charlesroddie

If a Mono architecture limitation prevents full AOT on Android, why does full Mono AOT work on iOS?

@vyacheslav-volkov
Author

The problem is that there is a huge gap emerging between iOS and Android in terms of performance for .NET applications. With each new release, the .NET team adds more and more value types (ValueTuple, ValueTask, Span, Memory, etc.), aiming to reduce heap allocations by using more value types. Developers naturally follow this trend, and various libraries increasingly use value types. However, when trying to create an application on Android, they encounter very slow startup times, whereas other platforms do not face this issue.

When developers ask why it is so slow on Android, they are told that using value types is detrimental because it makes JIT compilation harder, and they should avoid using them. But developers who have already spent a lot of time writing their code with value types are unlikely to create a special version just for Android that avoids them. This situation reveals a contradiction: the entire .NET ecosystem is moving towards optimizations by reducing allocations, but if you want to write for Android, you are advised to forget about value types and generics and use only classes.

I am currently working on a large project, and for iOS, I am using NativeAOT. My library heavily utilizes generics and value types, and I see no issues with this. The installation size from the App Store is 72MB, which is smaller than many similar native applications written in Objective-C/Swift (~100MB), and the performance is comparable to these native apps. The situation is entirely different with Android. As soon as I add simple initialization that does nothing but call managed code, I experience a significant performance hit. For instance, JIT compilation slows down initialization by a factor of 27 (see my comment #106748 (comment)). I've also shared a repository where you can check it out: https://github.com/vyacheslav-volkov/PerfAndroidTest. I have used all possible optimization methods, including FullAOT for Android, but this only slightly improves the result. The only option I see is to rewrite all the code specifically for Android, but even that may not help, as .NET itself uses many generics.

I don’t understand the .NET team's stance, which denies this issue by citing empty applications where everything works fine. I have provided my examples and asked for optimization assistance (@jonathanpeppers @jonpryor) but have not received any real advice that could help address the situation. The most frustrating part is that there seems to be no hope for this issue to be resolved in the near future. Now, think about it: if someone is starting a project for mobile platforms today and is choosing a framework, and they find this issue, will they choose .NET if even for such a basic thing as startup time there is no solution?

I am confident that if resources were allocated to address this issue, it wouldn't necessarily require immediate implementation of .NET Native AOT. It would be enough for someone with knowledge of Mono to try to solve #101135 and provide an answer: either why it really cannot be done, or how it can be done and within what timeframe. As far as I understand, MonoAOT works for some internal generics but doesn't work for custom generics, so maybe fixing it isn't that hard. Currently, all discussions point out that Native AOT for Android is difficult, fixing MonoAOT for Android is difficult, and that the problem lies with developers because everything works fine in tests with an empty application.

@jkotas
Member

jkotas commented Sep 4, 2024

I am currently working on a large project, and for iOS, I am using NativeAOT. My library heavily utilizes generics and value types, and I see no issues with this. The installation size from the App Store is 72MB,

Do you happen to have size and startup performance numbers for Mono AOT on iOS for this project? Would the Mono AOT size and startup performance be acceptable on iOS if native AOT did not exist?

@alexyakunin

alexyakunin commented Sep 4, 2024

I am currently working on a large project, and for iOS, I am using NativeAOT. My library heavily utilizes generics and value types, and I see no issues with this.

If a Mono architecture limitation prevents full AOT on Android, why does full Mono AOT work on iOS?

I'm also curious how it's even possible to run Mono AOT apps on iOS w/o interpreter enabled, assuming it works the same way for both iOS and Android (and our findings for Android are correct).

For the record, we use interpreter-only builds for iOS. That's because our initial attempts with full AOT failed there, but that was over a year ago, when we knew far fewer details of how it works and hadn't done anything to address the explosion of generic instances in the ArgumentList<T0, ... T8> type, etc. That is addressed now, so we'll definitely retry AOT builds for iOS quite soon.

@alexyakunin

@jkotas

Would the Mono AOT size and startup performance be acceptable on iOS if native AOT did not exist?

I know I'm not the one you're asking here, but in our case even interpreter-only mode startup performance is acceptable on iOS, and that's exactly what we use now. I shared the numbers: it's about 1.1s on iPhone 13 (interpreter-only) vs 1.8s on Galaxy S23 Ultra (both profiled and full AOT).

@vyacheslav-volkov
Author

@jkotas MonoAOT performs almost as fast as NativeAOT, but the application size becomes larger. I just checked, and for this project, it's around 108 MB for MonoAOT. This is a reasonable size, as I mentioned earlier; similar applications written in Objective-C/Swift occupy about the same amount of space.

@josephmoresena
Contributor

  • GC bridge (of some form) to support Java interop
  • NativeAOT runtime packs for Android: this is somewhat working with linux-bionic-arm64 packages, but we probably want actual android-arm64, etc. packages.

In fact, a GC bridge for Java interop, .NET (runtime) interop and .NET (NativeAOT) interop would be great.

For some time now, I have been working on a JNI framework for .NET that is fully compatible with NativeAOT, and there is even an example of its use on Android.

However, from what I have observed, native Android applications go beyond JNI; it even seems that all of Java's functionality is encapsulated in Android's own native libraries.
For NativeAOT, what might be feasible is to build a framework on top of these native libraries using DirectPInvoke and NativeLibrary.

@juepiezhongren

It would be better to deprecate the Mono VM in the future, while keeping the Mono interpreter as a preserved component for the CLR would be a decent result.

@jonpryor
Member

jonpryor commented Oct 9, 2024

@josephmoresena wrote:

I have been working on a JNI framework for .NET that is fully compatible with NativeAOT

Out of curiosity, why? dotnet/java-interop has a couple of samples running on NativeAOT; the current JNI underpinnings of .NET for Android can work with NativeAOT. The major problem is the GC, as mentioned elsewhere on this issue.

@josephmoresena
Contributor

@josephmoresena wrote:

I have been working on a JNI framework for .NET that is fully compatible with NativeAOT

Out of curiosity, why? dotnet/java-interop has a couple of samples running on NativeAOT; the current JNI underpinnings of .NET for Android can work with NativeAOT. The major problem is the GC, as mentioned elsewhere on this issue.

When I thought about this, it was December 2021. I also made several static approaches to handling the API, but due to the .NET version at the time, I didn’t like the final result because it depended 100% on instances rather than types.

It wasn’t until the advent of generic math that I was able to make sense of what I wanted. And that’s all; I believe no one would actually use it because, even though it’s friendly to use, it's overloaded, precisely because I tried to make it friendly to use.

I think the most notable aspect of that approach is how I avoided the use of reflection (or even how I used it) through generic math to bring in the definitions (methods, fields, constructors, and functions) and data types (primitives, arrays, classes, and interfaces), always focused on a reflection-free scenario.

It doesn't use source generators, and everything is compiled statically. It has switches to trim down scenarios, and in general, it is usable (perhaps with many errors because only I maintain it) in any scenario that uses JNI or the invocation interface.

@winkmichael

Any update on this? Without AOT, app startup performance is pretty horrible.

@charlesroddie

@jonpryor The major problem is the GC, as mentioned elsewhere on this issue.

@jkotas Yes, the first step would be to extract the required functionality into an API proposal.

How far away is the dotnet Android team from being able to create a proposal, as iOS did in "[NativeAOT] Low level API support for Objective-C scenarios" (with the GC API described in #44659)?
