Container image built using source-built SDK may fail to run #34860

Closed
tmds opened this issue Aug 23, 2023 · 18 comments · Fixed by #34863
Labels: Area-Containers, untriaged

Comments

@tmds
Member

tmds commented Aug 23, 2023

By default applications are published using an apphost, the container tooling targets Microsoft base images, and the tooling uses the apphost to start the app.

However, the source-built apphost may be incompatible with the Microsoft base images.

When I run this using the Fedora .NET 8 preview 7 SDK:

dotnet new web -o web
cd web
dotnet publish /p:PublishProfile=DefaultContainer

The resulting image does not work:

$ podman run web
You must install .NET to run this application.

App: /app/web
Architecture: x64
App host version: 8.0.0-preview.7.23375.6
.NET location: Not found

Learn more:
https://aka.ms/dotnet/app-launch-failed

Download the .NET runtime:
https://aka.ms/dotnet-core-applaunch?missing_runtime=true&arch=x64&rid=fedora.37-x64&os=debian.12&apphost_version=8.0.0-preview.7.23375.6

The source-built apphost is unable to find the .NET root because in our builds we've patched the sources so that the default .NET root matches its location on Fedora. I don't think we actually still need to patch this location, but the issue remains:

The source-built apphost can be binary incompatible with the Microsoft base images: it may have a glibc version issue, and on a musl-based distro (like Alpine) the apphost isn't going to run on the glibc Microsoft base images either.

Can we update the container logic for .NET 8 so that it uses the apphost command only for SelfContained?
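
The proposal above can be sketched as a small decision rule. This is only an illustration, not the SDK's actual build targets: the function name `container_app_command` is hypothetical, and the `/app` default is assumed from the `App: /app/web` line in the error output.

```shell
#!/bin/sh
# Hypothetical helper (not part of the SDK): prints the command an image
# would use to start the app under the proposed rule.
#   $1 = SelfContained ("true" or "false")
#   $2 = assembly name
#   $3 = working directory in the image (assumed default: /app)
container_app_command() {
    self_contained="${1:-false}"
    assembly_name="$2"
    workdir="${3:-/app}"
    if [ "$self_contained" = "true" ]; then
        # Self-contained: start the app through its apphost.
        printf '%s\n' "$workdir/$assembly_name"
    else
        # Framework-dependent: start through the `dotnet` launcher on PATH.
        printf '%s\n' "dotnet $workdir/$assembly_name.dll"
    fi
}

container_app_command false web   # prints: dotnet /app/web.dll
container_app_command true web    # prints: /app/web
```

With this rule, a framework-dependent publish never depends on the apphost of the SDK that built it.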

cc @baronfel @richlander @MichaelSimons @omajid

dotnet-issue-labeler bot added the untriaged label Aug 23, 2023
@dotnet-issue-labeler

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@tmds
Member Author

tmds commented Aug 23, 2023

> on a musl-based distro (like Alpine) the apphost isn't going to run on the glibc Microsoft base images either

This problem also occurs when using a non source-built SDK on Alpine.

> Can we update the container logic for .NET 8 so that it uses the apphost command only for SelfContained?

I've implemented this in #34863 as a proposed fix.

baronfel added the Area-Containers label Aug 23, 2023
@baronfel
Member

I'm a bit confused about what's shown here - I'd expect if the apphost needed to be swapped out that AppHost package selection in the build targets would handle that?

@tmds
Member Author

tmds commented Aug 23, 2023

> I'd expect if the apphost needed to be swapped out that AppHost package selection in the build targets would handle that

I'm not sure what you mean by swapped out.

By default the SDK includes an apphost in the published output. This apphost comes as part of the SDK.

The SDK apphost may not be compatible with the base image.

Consider the default Microsoft base images. These use Debian (glibc-based).

Depending on the SDK you use to containerize the app, the resulting image may not work.

  • The portable apphost that comes as part of the Microsoft linux-x64 SDK will work.
  • A source-built apphost that was built on a glibc-based distro (like Fedora) may or may not work, due to the way glibc versions symbols.
  • And an apphost from a musl-based SDK (Microsoft built, or source-built) won't work.

To put it differently: the apphost is a rid-specific asset, and its rid may be incompatible with the base image.
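
The three cases above can be written as a rough classifier keyed on the apphost's RID. This is a sketch under assumptions: the function name is hypothetical, the RID patterns are illustrative rather than a complete catalog, and the result only describes the default Debian (glibc-based) Microsoft base images discussed here.

```shell
#!/bin/sh
# Hypothetical helper: is an apphost built for RID $1 expected to start on
# the default (Debian, glibc-based) Microsoft base images?
apphost_on_default_base_image() {
    case "$1" in
        linux-musl-*|alpine*) echo "no" ;;     # musl apphost on a glibc image
        linux-*)              echo "yes" ;;    # portable glibc apphost
        win-*|osx-*)          echo "no" ;;     # not a Linux binary
        *-*)                  echo "maybe" ;;  # distro-specific build, e.g. fedora.37-x64:
                                               # depends on glibc symbol versions
        *)                    echo "unknown" ;;
    esac
}

apphost_on_default_base_image linux-x64        # prints: yes
apphost_on_default_base_image fedora.37-x64    # prints: maybe
apphost_on_default_base_image linux-musl-x64   # prints: no
```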

@richlander
Member

I understand. The basic point is that the dotnet launcher works in more cases, because it has to, and is easy to locate since it is in PATH.

The apphost is the tip of the iceberg on RID problems. Publishing to a musl container w/o specializing the build for musl is a bad pattern to encourage.

The pathing issue is different, but I'd ask why one of the standard locator mechanisms isn't being used (/etc/dotnet/install_location or DOTNET_ROOT).
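
For reference, the two locator mechanisms named above can be modeled as a lookup order. This is a simplified sketch, not the host's actual algorithm, which checks more locations (architecture-specific variants, for example). The function name, the precedence shown, and the `/usr/share/dotnet` fallback are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical, simplified model of locating the .NET root.
#   $1 = install location file (assumed default: /etc/dotnet/install_location)
#   $2 = fallback directory    (assumed default: /usr/share/dotnet)
find_dotnet_root() {
    install_location_file="${1:-/etc/dotnet/install_location}"
    default_root="${2:-/usr/share/dotnet}"
    if [ -n "${DOTNET_ROOT:-}" ]; then
        # An explicit environment variable wins in this sketch.
        printf '%s\n' "$DOTNET_ROOT"
    elif [ -r "$install_location_file" ]; then
        # The first line of the file holds the install path.
        root=""
        IFS= read -r root < "$install_location_file" || true
        printf '%s\n' "$root"
    else
        printf '%s\n' "$default_root"
    fi
}

find_dotnet_root
```

A distro build that installs elsewhere could then register that path in the install location file instead of patching the apphost sources.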

That aside, there are a few options here:

  • Honor the UseAppHost property, which would result in an apphost being used by default. Similar to status quo.
  • Same, but set UseAppHost to false if not set.
  • Invent a new ContainerUseAppHost setting with the default being false

I propose that the options are listed in order of desirability. We can start with one and move on to the next if it isn't sufficient.
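
The three options differ only in which property is consulted and what it defaults to. The sketch below models that difference. The function name is hypothetical, `ContainerUseAppHost` is the proposed (not existing) setting from the list above, and an empty argument stands for "property not set".

```shell
#!/bin/sh
# Hypothetical model of the three options.
#   $1 = option number (1, 2 or 3)
#   $2 = value of UseAppHost, empty if not set
#   $3 = value of the proposed ContainerUseAppHost, empty if not set
container_uses_apphost() {
    option="$1"
    use_app_host="${2:-}"
    container_use_app_host="${3:-}"
    case "$option" in
        1) printf '%s\n' "${use_app_host:-true}" ;;             # honor UseAppHost, default true
        2) printf '%s\n' "${use_app_host:-false}" ;;            # honor UseAppHost, default false
        3) printf '%s\n' "${container_use_app_host:-false}" ;;  # separate setting, default false
        *) return 1 ;;
    esac
}

container_uses_apphost 1 "" ""     # prints: true
container_uses_apphost 2 "" ""     # prints: false
container_uses_apphost 3 true ""   # prints: false
```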

@tmds
Member Author

tmds commented Aug 23, 2023

The proposed fix implemented in #34863 is the option:

Ignore the apphost and start the app in the most portable way.

@tmds
Member Author

tmds commented Aug 23, 2023

I don't see much of an advantage of starting the app through the apphost in the container image. It's more like an opportunity for failure.

@richlander
Member

That's true. However, I have seen SDK container builds as building on publish with all that means and using the apphost doesn't resolve all the problems.

I see this approach as a nice (as opposed to ugly) hack. It is targeted and not complete. I don't think it does harm, however. If that's what you folks want to do, go ahead.

@tmds
Member Author

tmds commented Aug 23, 2023

> Same, but set UseAppHost to false if not set.

I think it would be nice if the SDK would do this for all commands, that is: if there is nothing rid specific about the build (rid specific includes: UseAppHost set to true) then don't include an apphost.

That would be a more invasive change, for which we're late in the release cycle.

@baronfel
Member

baronfel commented Aug 23, 2023

Just a quick note that Option 1 (Honor the UseAppHost property, which would result in an apphost being used by default. Similar to status quo.) is what we were doing before that led to this issue, so I don't think it's on the table. We also don't really control UseAppHost: we don't know whether a container publish is going to occur as part of a publish operation, so we can't reliably set its value, and I think option 2 isn't workable either.

I'm ok starting the app via dotnet currently in all cases except when the runtime-deps images are chosen (which is #34863), just wanted to put that bit of info out there^

@richlander
Member

> if there is nothing rid specific about the build (rid specific includes: UseAppHost set to true) then don't include an apphost.

Agree on that. However, I'm still wanting to change the overall default to RID-specific, at which point we'd be back to square one.

@tmds
Member Author

tmds commented Aug 23, 2023

> I'm still wanting to change the overall default to RID-specific

I wouldn't like this as for most rid-specific features a source-built SDK falls back to prebuilt Microsoft binaries (dotnet/source-build#1215).

Also from a UX point of view, I think it's good the default output is portable, and the user expresses through some intent (like specifying a rid, or enabling self-contained) that he'd like something more specific.

@baronfel this makes me wonder about something. Afaik, when the SDK builds a container image on Windows and no rid is set, it still builds a Linux image. How does the SDK know it should publish the app with a Linux apphost?

@richlander
Member

I think you are conflating RID-specific and self-contained. Framework-dependent apps can be RID-specific. Which MS built binaries are you concerned about in that scenario?

Also, we should work on a better prebuilt scenario. I agree that it is lackluster.

@baronfel
Member

> @baronfel this makes me wonder about something. Afaik, when the SDK builds a container image on Windows and no rid is set, it still builds a Linux image. How does the SDK know it should publish the app with a Linux apphost?

What I expect to happen is that since no RID is set, we have a platform-independent build and so the apphost isn't used in the App command. Instead we should launch the app via the 'dotnet' method.

Having said that, I'll try to verify that behavior soon.

@baronfel
Member

Unfortunately that is not the behavior. UseAppHost defaults to true these days so we are currently broken here in the 'default' build scenario. If a Windows (or macOS!) user does specify a RID then they're all good, but the unspecified case falls into this pit. Your change would help this I believe, but not mitigate it entirely.

@tmds
Member Author

tmds commented Aug 24, 2023

> I think you are conflating RID-specific and self-contained. Framework-dependent apps can be RID-specific. Which MS built binaries are you concerned about in that scenario?

Mostly the linux-x64 apphost.

A default FDD publish shouldn't be limited by the SDK to only work on a subset of rids.

This applies to the source-built SDK since that apphost isn't portable like the ones built by Microsoft.

It also applies to Microsoft SDKs: a FDD publish on Alpine has an apphost that doesn't work on Ubuntu.

> Your change would help this I believe, but not mitigate it entirely.

Yes, it will work for the non-self-contained case now.

The same issue exists on Linux for self-contained. A self-contained app published on Alpine won't run in the glibc-based base image, and vice versa. To improve that, the default base image logic could take into account the RuntimeIdentifier.

On Windows, it's still an issue that this RuntimeIdentifier would be 'Windows' while the desired default behavior would be to build a 'Linux' container.

@baronfel
Member

> The same issue exists on Linux for self-contained. A self-contained app published on Alpine won't run in the glibc-based base image, and vice versa. To improve that, the default base image logic could take into account the RuntimeIdentifier.

Totally agree here - we've got an issue tracking this that proposes this enhancement. We'd feed the RID (either explicitly specified or inferred from the current SDK's default RID) into tag selection and set fields like ContainerFamily to specific values like alpine if necessary to support the RID chosen.
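
As an illustration of feeding the RID into tag selection, the sketch below maps a RID to a `ContainerFamily` value. The function name and the RID patterns are assumptions, not the SDK's implementation. An empty result stands for "keep the default (Debian-based) image family".

```shell
#!/bin/sh
# Hypothetical mapping from a RuntimeIdentifier to a ContainerFamily value.
container_family_for_rid() {
    case "$1" in
        linux-musl-*|alpine*) echo "alpine" ;;  # musl RIDs need a musl-based image
        *)                    echo "" ;;        # otherwise keep the default family
    esac
}

container_family_for_rid linux-musl-x64   # prints: alpine
container_family_for_rid linux-x64        # prints an empty line
```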

@tmds
Member Author

tmds commented Sep 3, 2023

Fixed by #34863

@tmds tmds closed this as completed Sep 3, 2023