Updated RID Plan #83246

Closed
richlander opened this issue Mar 10, 2023 · 79 comments

@richlander
Member

richlander commented Mar 10, 2023

RID Plan

We worked on a plan to tame the RID graph last year. It got close to resolution, but not quite far enough. We also didn't have time to fund it as part of .NET 7. We started talking about it again earlier this year. We have defined targeted updates to the earlier plan that we think will get across the finish line.

The goal remains the same, which is to stop updating the RID graph and to stop reading it in all scenarios, by default. Eventually, we want to delete it. The RID graph represents a poor design point of the product, particularly in terms of how it supports Linux, aiming to act as a persisted database of historical and current operating system releases. We're also not doing a great job of keeping it updated to satisfy its intended function.

Quick context

The RID graph is used in two primary spots:

  • By dotnet build or dotnet publish to select the best-match RID-specific assets (primarily from NuGet packages) to generate a RID-specific app.
  • By the host to select RID-specific assets from a portable app layout.

Put another way, given RID-specific assets in an app NuGet graph, those assets are either selected during build or deferred to runtime.

In both cases, the target RID of the app or the environment often doesn't match the RID of the assets. That's intentional and fine; however, resolving that difference requires a best-match algorithm. Today, the RID graph serves as an authoritative database for that algorithm to use.

For example, a portable app might be running on a Linux Mint 19 machine, yet carry assets (for a given dependency) that separately target Ubuntu 18.04 and portable Linux. The host knows (via the RID graph) that the Ubuntu 18.04 asset is compatible with Linux Mint 19 and a better match than the portable Linux asset. It chooses the Ubuntu 18.04 asset.
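As a rough illustration of that best-match step, here is a minimal Python sketch. It is not the host's actual code: `FALLBACKS` and `pick_asset` are hypothetical names, the fallback list is copied from the linuxmint.19 entry quoted later in this issue, and architecture is ignored to keep the example short.

```python
# Hypothetical sketch of best-match asset selection (not the real host logic).
# The fallback order is the linuxmint.19 entry from the existing RID graph.
FALLBACKS = {
    "linuxmint.19": [
        "linuxmint.19", "ubuntu.18.04", "ubuntu", "debian",
        "linux", "unix", "any", "base",
    ],
}

def pick_asset(machine_rid, available_asset_rids):
    """Return the first RID in the machine's fallback order that has an asset."""
    for candidate in FALLBACKS.get(machine_rid, [machine_rid]):
        if candidate in available_asset_rids:
            return candidate
    return None  # no compatible asset was found

# The app carries an Ubuntu 18.04 asset and a portable Linux asset.
print(pick_asset("linuxmint.19", {"ubuntu.18.04", "linux"}))  # ubuntu.18.04
```

The Ubuntu 18.04 asset wins only because it appears earlier in the fallback list than linux, which is the "better match" behavior described above.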

Open issues

The following issues are the primary ones blocking progress:

  • We planned to have an algorithmic approach. Should it include concrete RIDs, like ubuntu.22.04?
  • Source build users tend to include concrete RIDs, like rhel.8. Should we ensure that distro-specific source-build builds and the portable Linux build have parity in that respect?
  • How do we handle compatible distros, like Oracle Linux and RHEL or Linux Mint and Ubuntu?
  • Can we use the same solution for the apphost and the SDK?

Probing Algorithm

The original plan proposed the following probing model for the portable build:

  • linux-x64
  • unix

Note: Architecture-specific RIDs are intended for native and ready-to-run code. Architecture-agnostic RIDs are intended for IL code (often related to P/Invokes).

That's probably artificially limited. Source-build users offer a more expansive set of RIDs to probe, which might err toward too many.

We concluded that the following model should satisfy the vast majority of needs.

  • ubuntu.22.04-x64
  • ubuntu.22.04
  • linux-x64
  • unix

This model would offer symmetry (or symmetry enough) between the portable build and the typical distro-specific source-build build.
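A minimal sketch of how such a probe list could be computed, assuming the distro name, version, and architecture are already known. The function name `probe_rids` is made up for illustration and is not an existing API.

```python
def probe_rids(os_id, version_id, arch):
    """Build the proposed probe order, most specific RID first."""
    distro = f"{os_id}.{version_id}"
    return [
        f"{distro}-{arch}",  # distro-specific, architecture-specific
        distro,              # distro-specific, architecture-agnostic
        f"linux-{arch}",     # portable Linux, architecture-specific
        "unix",              # portable, architecture-agnostic
    ]

print(probe_rids("ubuntu", "22.04", "x64"))
# ['ubuntu.22.04-x64', 'ubuntu.22.04', 'linux-x64', 'unix']
```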

Distro compatibility

Expressing distro-compatibility adds a lot of nodes to the centrally-managed RID graph. We also expect that they don't benefit a lot of users. Instead, we propose to move to a model where compatibility relationships are described in app project files for the relatively few apps that need them. We propose roughly similar support as what source-build users have for adding RID information.

We don't want to invent a new format. We have three formats already:

We propose to use the runtime.compatibility.json format, like the following real example:

  "linuxmint.19": [
    "linuxmint.19",
    "ubuntu.18.04",
    "ubuntu",
    "debian",
    "linux",
    "unix",
    "any",
    "base"
  ]

This compatibility description is notable since it doesn't include architecture information. Given a desire for apps to run on multiple architectures, we can algorithmically add architecture when apps (or tools) are running (in a given environment).

We'd recommend a more terse description (to describe just the compatibility relationship):

  "linuxmint.19": [
    "linuxmint.19",
    "ubuntu.18.04"
  ]

Multiple compatibility relationships can be described.

{
    "linuxmint.19": [
        "linuxmint.19",
        "ubuntu.18.04"
    ],
    "ol.8": [
        "ol.8",
        "rhel.8"
    ]
}
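To make the "algorithmically add architecture" idea concrete, here is a hedged Python sketch. The `expand` function and the exact ordering of candidates are assumptions for illustration; the proposal does not specify how architecture-specific and architecture-agnostic entries interleave.

```python
import json

# Terse compatibility description in the proposed shape (no architecture).
COMPAT_JSON = """
{
    "linuxmint.19": ["linuxmint.19", "ubuntu.18.04"],
    "ol.8": ["ol.8", "rhel.8"]
}
"""

def expand(compat, running_rid, arch):
    """Return probe candidates for running_rid, adding the architecture at run time."""
    distros = compat.get(running_rid, [running_rid])
    candidates = []
    for distro in distros:
        # Assumed order: architecture-specific before architecture-agnostic.
        candidates += [f"{distro}-{arch}", distro]
    return candidates + [f"linux-{arch}", "unix"]

compat = json.loads(COMPAT_JSON)
print(expand(compat, "ol.8", "arm64"))
# ['ol.8-arm64', 'ol.8', 'rhel.8-arm64', 'rhel.8', 'linux-arm64', 'unix']
```

The same description works unchanged on x64 and arm64, which is the benefit of leaving architecture out of the stored data.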

We propose that this compatibility information be included in app project files in some new property, like the following:

<PropertyGroup>
    <RuntimeCompatibility>
      {
          "linuxmint.19": [
              "linuxmint.19",
              "ubuntu.18.04"
          ],
          "ol.8": [
              "ol.8",
              "rhel.8"
          ]
      }
    </RuntimeCompatibility>
</PropertyGroup>

The following is an alternative format we will consider:

<ItemGroup>
  <RuntimeCompatibility Include="linuxmint.19.2" Fallbacks="linuxmint.19.1;linuxmint.19;ubuntu.18.04;ubuntu;debian" />
  <RuntimeCompatibility Include="ol.8" Fallbacks="rhel.8" />
</ItemGroup>

This information would be copied into app.deps.json in a new section.
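As a sketch of what that copy step might involve for the item-based format, the following Python converts the items into the JSON-shaped mapping used earlier. This is illustrative only: `to_compat_map` is a hypothetical helper, prepending the RID to its own fallback list is an assumption based on the runtime.compatibility.json examples, and the name of the new app.deps.json section is not defined here.

```python
import json
import xml.etree.ElementTree as ET

ITEMS = """
<ItemGroup>
  <RuntimeCompatibility Include="linuxmint.19.2" Fallbacks="linuxmint.19.1;linuxmint.19;ubuntu.18.04;ubuntu;debian" />
  <RuntimeCompatibility Include="ol.8" Fallbacks="rhel.8" />
</ItemGroup>
"""

def to_compat_map(xml_text):
    """Convert RuntimeCompatibility items into a RID -> fallback list mapping."""
    compat = {}
    root = ET.fromstring(xml_text.strip())
    for item in root.iter("RuntimeCompatibility"):
        rid = item.get("Include")
        fallbacks = [f for f in item.get("Fallbacks", "").split(";") if f]
        # Assumption: each list starts with the RID itself, as in the JSON examples.
        compat[rid] = [rid] + fallbacks
    return compat

print(json.dumps(to_compat_map(ITEMS), indent=2))
```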

Embedded JSON is a bit ugly. We could use XML, in the runtimeGroups.props format. That would be workable. We'd need to productize the task that converts that XML to JSON. Given how targeted we think the scenario is, we are proposing the simpler JSON-based approach.

The RID graph includes multiple variations for each distro, like ol.8 and ol.8.0. For Linux Mint, there are linuxmint.19.0, linuxmint.19.1, and linuxmint.19.2. For Ubuntu, there are ubuntu.18.04 and ubuntu.18.10. Note the relationships between Linux Mint and Ubuntu releases.

We should reason about distro versions in terms of VERSION_ID in /etc/os-release and then rely on stated compatibility relationships for any differences, including between Linux Mint dot versions (for example).

$ docker run --rm linuxmintd/mint21-amd64 cat /etc/os-release | grep VERSION
VERSION="21 (Vanessa)"
VERSION_ID="21"
VERSION_CODENAME=vanessa
$ docker run --rm linuxmintd/mint21.1-amd64 cat /etc/os-release | grep VERSION
VERSION="21.1 (Vera)"
VERSION_ID="21.1"
VERSION_CODENAME=vera

This means that linuxmint.21 and linuxmint.21.1 would be valid RIDs, but linuxmint.21.0 would not. A compatibility relationship would need to be specified (as described above) to enable an app running on Linux Mint 21.1 to use a linuxmint.21 asset.
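A minimal sketch of deriving a RID from os-release contents, under stated assumptions: `rid_from_os_release` is a hypothetical helper, the `ID=linuxmint` line is assumed (the grep output above only shows the VERSION fields), and falling back to "linux" when ID is missing follows the default that the os-release format documents.

```python
def rid_from_os_release(text):
    """Derive '<ID>.<VERSION_ID>' from the contents of an os-release file."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be quoted; this simple sketch only strips the outer quotes.
        fields[key] = value.strip().strip('"').strip("'")
    os_id = fields.get("ID")
    version_id = fields.get("VERSION_ID")
    if os_id is None:
        return "linux"  # assumed default when ID is not set
    return f"{os_id}.{version_id}" if version_id else os_id

sample = 'ID=linuxmint\nVERSION="21.1 (Vera)"\nVERSION_ID="21.1"\nVERSION_CODENAME=vera\n'
print(rid_from_os_release(sample))  # linuxmint.21.1
```

Because the RID comes straight from VERSION_ID, nothing produces linuxmint.21.0, which is why that RID would not be valid under this scheme.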

The following examples demonstrate the differing compatibility relationships offered by Linux Mint and Ubuntu.

Linux Mint 19.2:

  "linuxmint.19.2": [
    "linuxmint.19.2",
    "linuxmint.19.1",
    "linuxmint.19",
    "ubuntu.18.04",
    "ubuntu",
    "debian",
    "linux",
    "unix",
    "any",
    "base"
  ],

Ubuntu 18.10:

  "ubuntu.18.10": [
    "ubuntu.18.10",
    "ubuntu",
    "debian",
    "linux",
    "unix",
    "any",
    "base"
  ],

The key point is that there is no compatibility relationship offered between Ubuntu releases. Ubuntu 18.04 assets are not used when running on Ubuntu 18.10. However, both Linux Mint and Ubuntu RIDs offer compatibility to Debian. We will not offer compatibility for any unversioned distro RIDs, like ubuntu and debian.

Note: This doc uses a mix of Linux Mint 19 and 21 since v19.x is the most recent version recorded in the RID graph, while v21.x is the latest stable version. That alone provides evidence of the operational drawbacks of the current model and a benefit that the proposed system would deliver.

Backwards compatibility

The transition to an algorithmic approach to RIDs is a significant change. We will need a backwards compatibility mode.

We propose:

  1. The apphost uses the algorithmic mode by default.
  2. There is a property that enables using the existing RID graph approach instead.
  3. App developers can specify distro compatibility relationships that they want to use.

The expectation is that this combination of modes will enable us to freeze, or very nearly freeze, the RID graph. Assuming we freeze the RID graph, there may be scenarios where users need to use option 3 in combination with either option 1 or 2.

SDK experience

We need to work through the SDK/tools experience a bit more.

We need an experience that notifies developers that they will not be well-served by the pure algorithmic mode. This will be the case when dotnet restore detects a non-portable RID-specific asset.

Portable RIDs:

  • linux
  • unix
  • win
  • osx
  • wasm

Non-portable RID examples:

  • linuxmint.19.2
  • ubuntu.18.04
  • rhel.8

If a non-portable RID is discovered in the package graph, the SDK should produce a warning, similar to the following:

"Distribution-specific assets were identified in one or more package references. You may want to add distribution-compatibility relationships to your project file. See http://aka.ms/dotnet/rid/compatibility to learn more."

We would offer some pragma disable option to quiet the warning.

For portable apps, this warning would give developers an opportunity to define compatibility relationships so that the correct assets get used on Linux Mint and Ubuntu, for example.

RID-specific apps are in some ways the easier case. The developer will specify a specific RID and there is an opportunity for the developer to inspect the results and for the SDK to present an error (if there was no best-match asset).

We have a couple of options for identifying non-portable RIDs. We can use the discrete list above, or we can rely on the existing RID graph to determine what the non-portable RIDs are. Neither is perfect.
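The discrete-list option could look roughly like the following Python sketch. The names `is_portable` and `non_portable_rids` are hypothetical, and the architecture set is an assumed, incomplete sample. The portable list is taken verbatim from above, so anything not on it (for example a libc-qualified RID) is reported as non-portable, which shows one way this option is imperfect.

```python
# Portable base RIDs, taken from the list above.
PORTABLE_BASES = {"linux", "unix", "win", "osx", "wasm"}
# Assumed sample of architecture suffixes; not an exhaustive list.
KNOWN_ARCHES = {"x64", "x86", "arm", "arm64"}

def is_portable(rid):
    """True if the RID, with any architecture suffix removed, is on the portable list."""
    base = rid
    head, sep, tail = rid.rpartition("-")
    if sep and tail in KNOWN_ARCHES:
        base = head
    return base in PORTABLE_BASES

def non_portable_rids(asset_rids):
    """Return the RIDs that should trigger the SDK warning, sorted for stable output."""
    return sorted(r for r in asset_rids if not is_portable(r))

print(non_portable_rids(["linux-x64", "ubuntu.18.04", "win", "rhel.8-x64"]))
# ['rhel.8-x64', 'ubuntu.18.04']
```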

Regression or improvement?

This proposal is simultaneously a regression and an improvement.

How it is a regression:

  • Distro compatibility relationships become an app dev concern.
  • .NET doesn't come with distro relationship information, and there is no central place for recording that.

How it is an improvement:

  • Distro compatibility relationships can be specified at the point of need.
  • RID graph updates are not necessary and no longer need to rely on the expensive and lengthy .NET servicing process.
  • The RID graph is inconsistently updated today. With this proposal, compatibility information is as up to date as the app developer needs it to be.

A related problem is that we've never provided a way to correctly build portable Linux assets. Because of that, it is currently safer to build distro-specific assets. We are considering offering guidance on how to build Linux assets. We're hoping that encourages producing more portable assets where there might be distro-specific ones today.

Impact on Source-Build

Today, source-build users rely on augmenting the RID graph via MSBuild. Assuming we can lock the RID graph, that will no longer make much sense. Instead, we'd need to move source-build users to this plan. That will likely mean making changes in host code instead of to the RID graph.

Deletion of RID graph

We should delete the RID graph with the .NET 9 or 10 releases, ideally the earlier of the two. The .NET 8 release should give us insight into which scenarios have a strong dependency on this data.

Next Steps

Once we (hopefully) get to agreement, I'll update the original RID issue on the updated plan.

It's possible we can still get this plan into .NET 8.

Thanks for all the feedback on the designs PR and TIA for feedback on this one.

@elinor-fung @tmds @baronfel @jkotas @omajid @agocke

@agocke
Member

agocke commented Mar 10, 2023

Unfortunately, I think this is still too complicated and provides support for scenarios that we should drop entirely.

As mentioned above, I believe that RIDs serve two purposes:

  • To select platform-specific assets from NuGet
  • To select platform-specific assets from the runtime itself

By encoding a very complex description of "platform" in RIDs, it's allowed NuGet packages and the runtime a very large amount of flexibility in micro-targeting exactly which assets to provide different platforms.

I think this flexibility was a mistake. The vast majority of NuGet packages do not make use of this flexibility, and in fact, for many of them, attempting to target many OS variants has proven to be a headache rather than a boon.

Instead, I think RIDs should express a basic, widely used definition of "binary compatibility." Most commonly this refers to "ABI", or C-language compatibility. By using this definition we would not make it possible to deliver arbitrary assets for arbitrary operating systems, but we would allow apps and libraries to provide assets which can communicate over the generally understood C ABI for that system.

Practically, this would shrink the RID system to exactly these components:

  • Operating system family (with optional version)
  • Libc (if multiple are available for the platform)
  • Processor architecture

This system would imply that you would not be able to deliver app dependencies that have a stronger-than-C-ABI requirement using the .NET tooling. I believe this is a viable and preferable system. For operating systems like Windows and Mac, this is rarely a problem as there are few variations and the C-ABI is the dominant interop system.

For Linux distributions, many libraries and apps would not find this particularly constraining, as many of them want to provide maximal portability regardless. For others, which would like to do more specific distro targeting, historically Linux distributions have provided system package managers for exactly this reason.

Our new recommendations would be:

  • For library builders, either provide ABI-compatible assets, or arrange for dependencies to be delivered out of band.
  • For app builders, possibly bundle extra dependencies if your app is designed to be distro-specific
  • For distros, you must provide dotnet dependencies at least as high as the ABI requirements of the .NET release you are building. You will not be able to provide a "custom" RID for specialty packages. If you would like to build and provide the redistributable assets (e.g., the apphost), you will need to build and provide an ABI-compatible asset for that .NET release.

@agocke
Member

agocke commented Mar 10, 2023

  • @dotnet/distro-maintainers for feedback

@richlander
Member Author

I hear you saying that you liked the original proposal better, which focused on the Linux portable RID and not specific distros like Ubuntu 22.04 let alone its relationship with Linux Mint.

We started this expanded proposal since developers on our team publish distro-specific RID-split packages.

I feel like you are missing the key point that this proposal agrees with your feedback. It is intended to significantly curtail the descriptive capability that we provide by default. It also enables a compatible bridge to a simpler system.

I see you saying that you are willing to accept the break that your proposal entails. I agree that distro-specific packages are not common; however, I don't see your proposed break as necessary to get us on a better path.

@tannergooding

@agocke
Member

agocke commented Mar 10, 2023

I feel like you are missing the key point that this proposal agrees with your feedback. It is intended to significantly curtail the descriptive capability that we provide by default. It also enables a compatible bridge to a simpler system.

That's a fair critique. My understanding is that this system still defines a public API which has to be understood by NuGet authors and distro maintainers. I am worried that even having these capabilities available substantially complexifies the ecosystem. Is there something I'm overlooking?

@richlander
Member Author

Can you define the public API you see?

@agocke
Member

agocke commented Mar 10, 2023

I see this as moving from an explicit contract to an implicit contract. On the package side, it would be the responsibility of package authors to survey the space of distros and provide specific assets. On the distro side, they would be pressured to provide unique identifiable info in /etc/os-release.

In the best case, it seems like there would be significant package churn as distros lobby various packages to add them to the list. Or, distros won't lobby at all and users of those distros will be stuck without binaries.

In the worst case, it seems like this could end up like user-agent, with everyone lying about their name and the packages looking for the right lies.

On the other hand, if we don't have a system like this at all, packages will be pressured to either find some way to provide true portable binaries for maximum compat with no configuration, or level the distro playing field by moving the whole system to external package managers.

@vitek-karas
Member

In the worst case, it seems like this could end up like user-agent, with everyone lying about their name and the packages looking for the right lies.

This assumes that .NET will be a driving force to change what distros return in their /etc/os-release. I'd love to think .NET will be that popular, but I highly doubt it. I think it's safer to assume that basically nobody in the .NET ecosystem will have any say in what is returned there, so expect the unexpected (chaos).

Ability to specify custom RID fallbacks in a project file

I like this idea. Personally I would go with the MSBuild native syntax - embedding JSON into msbuild looks really weird. And embedding XML will make things probably even worse. (Note: The current json format is mostly invisible to developers, with the exception of distro authors, so I think there's little value in trying to reuse it)

One additional consideration - if I'm shipping a NuGet which has distro specific assets I might want to augment the RID graph in every app the NuGet is referenced. This is obviously doable as NuGets can inject targets into the msbuild, but I think we should consider this as another aspect of the solution. (And yes, this comes with the danger that someone will publish a NuGet which adds back the entire RID graph as we know it today... I don't know how I feel about it... but we won't do that ourselves).

Desire to force more code to be distro agnostic

I think that is sort of orthogonal: the above proposal doesn't make it any easier to maintain distro-specific packages, it probably makes it actually harder. So the incentive is there already. It's just not a "you must migrate" statement. I also think that we can't try to force packages to be distro agnostic unless we have a good solution for how to produce such packages (like the manylinux target and docker images to use to build assets on). On that topic, are we even the right group to try to do that? Why not piggyback on some other platform which already solves this to some extent (like Python)? We're bound to create something sufficiently different that it won't necessarily make this easier for developers.

@tmds
Member

tmds commented Mar 10, 2023

We concluded that the following model should satisfy the vast majority of needs.
ubuntu.22.04-x64
ubuntu.22.04
linux-x64
unix

I would add ubuntu.22-x64, ubuntu.22 to this. It doesn't mean anything for Ubuntu, but for distros that have a <major>.<minor> scheme where the major indicates binary compatibility (like RHEL) it makes things work.

We'd recommend a more terse description (to describe just the compatibility relationship):

+1 for going from rid graph model to rid compatibility model.

That is a lot less to express.

Distro compatibility relationships become an app dev concern.

It would be nice if this compatibility could be expressed in a NuGet package, and Microsoft would still maintain such a package.

Dealing with the compatibility relationships should be a matter of referencing this package, unless there are rids involved not yet known by the package.

"Distribution-specific assets were identified in one or more package references. You may want to add distribution-compatibility relationships to your project file. See http://aka.ms/dotnet/rid/compatibility to learn more."

As @vitek-karas suggested, the package maintainer can include this information for the non-portable rids he includes.
He could do so by referencing the compatibility package, if that includes these non-portable rids already.

For the published app, the compatibility information can be trimmed down based on the non-portable rids for which there are actual artifacts (include those, and their derivative distros).

@janvorli
Member

In the portable list above, I don't see linux-musl / linux (with glibc) distinction. Is that intentional? Executable assets like shared libraries are not compatible between those two, so it seems we will need both.

@agocke
Member

agocke commented Mar 10, 2023

@janvorli I believe in this proposal all the "portable" rids would stick around, so linux-musl-x64 would still be there.

@elinor-fung
Member

Yeah, portable would still have linux-musl vs. linux.

if I'm shipping a NuGet which has distro specific assets I might want to augment the RID graph in every app the NuGet is referenced

Definitely a good consideration. This would lean towards the MSBuild-y syntax too.

I would add ubuntu.22-x64, ubuntu.22 to this. It doesn't mean anything for Ubuntu, but for distros that have a <major>.<minor> scheme where the major indicates binary compatibility (like RHEL) it makes things work.

As in you think that should be part of the algorithmic model? Since we're trying to get out of .NET knowing/maintaining compatibility relationships, I'd see that as something that can be specified by the app via the proposed distro compatibility mechanism.

@agocke
Member

agocke commented Mar 10, 2023

Yeah, I'm curious why we need any ubuntu- RIDs. I think one of the main benefits of this model would be that distro maintainers, including ubuntu, would stop publishing with a custom RID.

@tmds
Member

tmds commented Mar 10, 2023

As in you think that should be part of the algorithmic model?

Yes. On RHEL, /etc/os-release contains a VERSION_ID of for example 8.7.
We haven't added these minors to the rid graph, but instead the host has some code specifically for RHEL that strips the minor in the normalize_linux_rid function.

There is similar code for Rocky Linux, which is a RHEL derivative.

The algorithmic model can derive the minor to major relationships.

The compatibility between Rocky and RHEL can be expressed at the major version level: rhel.8 <- rocky.8.
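As a rough sketch of that idea: parse `/etc/os-release`, drop the minor version for distros where the major indicates binary compatibility, and apply a major-level compatibility entry such as rhel.8 <- rocky.8. The `MAJOR_ONLY` set, the `COMPATIBLE_WITH` map, and the function names are assumptions for illustration; this is not the host's `normalize_linux_rid` code.

```python
def parse_os_release(text: str) -> dict[str, str]:
    """Parse KEY=value lines from os-release content, removing quotes."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip('"').strip("'")
    return fields


# Hypothetical: distros whose major version indicates binary compatibility.
MAJOR_ONLY = {"rhel", "rocky"}

# Hypothetical: derivative distro -> distro whose assets it can also use.
COMPATIBLE_WITH = {"rocky": "rhel"}


def distro_rids(os_release_text: str, arch: str) -> list[str]:
    """Return the non-portable RIDs to probe for this distro, best match first."""
    fields = parse_os_release(os_release_text)
    distro = fields["ID"]
    version = fields["VERSION_ID"]
    if distro in MAJOR_ONLY:
        version = version.split(".")[0]  # "8.7" -> "8"

    rids = [f"{distro}.{version}-{arch}"]
    parent = COMPATIBLE_WITH.get(distro)
    if parent:
        rids.append(f"{parent}.{version}-{arch}")
    return rids


if __name__ == "__main__":
    print(distro_rids('ID="rocky"\nVERSION_ID="8.7"\n', "x64"))
```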

Yeah, I'm curious why we need any ubuntu- RIDs. I think one of the main benefits of this model would be that distro maintainers, including ubuntu, would stop publishing with a custom RID.

For source-build we need non-portable rids. These rids act as a 'namespace' for identifying source-build artifacts on the distro.

They make it possible to distinguish these from the portable assets, which Microsoft builds and which work across a range of distros.

See dotnet/source-build#2932 (comment).

@agocke
Member

agocke commented Mar 10, 2023

For source-build we need non-portable rids. These rids act as a 'namespace' for identifying source-build artifacts on the distro.

How does this work for Java? Is there a special signifier in Maven for OpenJDK assets vs. Oracle assets?

@tmds
Member

tmds commented Mar 10, 2023

How does this work for Java? Is there a special signifier in Maven for OpenJDK assets vs. Oracle assets?

It is not about the vendor.

Portable, like linux-{arch}, means the native binaries run across a range of glibc based distros.

Is that a thing with Java?

Because portable means something for .NET, there is the need to distinguish with what is not portable.

Go avoids the problem by not taking a dependency on libc. It directly makes Linux syscalls.

@agocke
Member

agocke commented Mar 10, 2023

Is that a thing with Java?

Looks like Maven has no built-in support and people generally use an external "Native Maven Plugin" library, https://www.mojohaus.org/maven-native/native-maven-plugin/. It looks like that library intends for you to distribute C++ source and compile it locally.

Go avoids the problem by not taking a dependency on libc. It directly makes Linux syscalls.

No, cgo does have a libc dependency. They avoid the problem by directing people to bundle source code and build on the local machine.

Looking at a number of languages, it looks like distributing source code is the most common solution. Python stands out by distributing binaries.

@tmds
Member

tmds commented Mar 10, 2023

Looks like Maven has no built-in support and people generally use an external "Native Maven Plugin" library, https://www.mojohaus.org/maven-native/native-maven-plugin/. It looks like that library intends for you to distribute C++ source and compile it locally.

So, portable doesn't exist here.

No, cgo does have a libc dependency. They avoid the problem by directing people to bundle source code and build on the local machine.

Go itself doesn't take a libc dependency like .NET, it makes direct syscalls.
This makes it possible to create portable binaries.

cgo is specific for Go interacting with C.
If that requires building on the local machine, then when using cgo, the binaries are not portable.

@tannergooding
Member

I'd really not like to have the primary solution be "have people build from source". As an escape hatch, it seems "ok", but in general some native dependencies can take hours to build or fail to build entirely on lower end machines. Requiring complete and correct toolchains to exist on the machine is also itself problematic.

@agocke
Member

agocke commented Mar 10, 2023

I'm still not understanding the source build problem.

I would prefer that we simply not provide support for distributing non-portable assets through NuGet.

But this proposal does allow that. But this would be almost entirely a package author decision to tag with the value from /etc/os-release.

So I don't see what we would use source-build RIDs for. It seems extraneous.

@omajid
Member

omajid commented Mar 10, 2023

How does this work for Java? Is there a special signifier in Maven for OpenJDK assets vs. Oracle assets?

Maven doesn't have a signifier for it, but the general problem also affects Java.

If you build a JDK (analogue of the .NET SDK, so it does have the equivalent of a host) it is also portable or non-portable, depending on the build configuration that was used to build the JDK. The RID-equivalent is not encoded in the JDK.

Maven artifacts distributed on Maven Central (or third-party Maven repositories) are independent from the JDK. They can contain native libraries (.so or .dll). The native libraries can be using native interop with Java (JNI libraries) or just plain C-ABI libraries. There's no logic in the JDK on how to do any resolution.

The burden is on application/library developers to package this and do the resolution at their library/application's runtime.

Here's an example of a native-library using package (tensorflow):

https://central.sonatype.com/artifact/org.tensorflow/libtensorflow_jni/1.15.0

They have to resolve the OS, architecture at runtime: https://github.com/tensorflow/tensorflow/blob/84eaae4f4833eb0dfdc3b0a20bb00c1de3ad0f7c/tensorflow/java/src/main/java/org/tensorflow/NativeLibrary.java#L212-L228

Then use that to load the appropriate native library from their package:

$ unzip -l libtensorflow_jni-1.15.0.jar 
...
        0  10-22-2019 17:09   org/tensorflow/
        0  10-22-2019 17:09   org/tensorflow/native/
        0  10-22-2019 17:09   org/tensorflow/native/linux-x86_64/
        0  10-22-2019 17:09   org/tensorflow/native/windows-x86_64/
        0  10-22-2019 17:09   org/tensorflow/native/darwin-x86_64/
   422253  10-22-2019 17:09   org/tensorflow/native/linux-x86_64/THIRD_PARTY_TF_JNI_LICENSES
154073736  10-22-2019 17:09   org/tensorflow/native/linux-x86_64/libtensorflow_jni.so
 35226832  10-22-2019 17:09   org/tensorflow/native/linux-x86_64/libtensorflow_framework.so.1
    11419  10-22-2019 17:09   org/tensorflow/native/linux-x86_64/LICENSE
 78973440  10-22-2019 17:09   org/tensorflow/native/windows-x86_64/tensorflow_jni.dll
    11419  10-22-2019 17:09   org/tensorflow/native/windows-x86_64/LICENSE
   422253  10-22-2019 17:09   org/tensorflow/native/darwin-x86_64/THIRD_PARTY_TF_JNI_LICENSES
 28642820  10-22-2019 17:09   org/tensorflow/native/darwin-x86_64/libtensorflow_framework.1.dylib
    11419  10-22-2019 17:09   org/tensorflow/native/darwin-x86_64/LICENSE
284042344  10-22-2019 17:09   org/tensorflow/native/darwin-x86_64/libtensorflow_jni.dylib
...

In practice, this isn't so different from what .NET packages end up doing. .NET packages claim to support linux-x64, but that can mean anything from "We support the same breadth of platforms that .NET 7 means when it says linux-x64" to "We support whatever GitHub Actions' ubuntu-latest maps to"
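A minimal sketch of that kind of runtime resolution, mapping the detected OS and architecture to a resource path in the layout listed above. The function name, the `ARCH_ALIASES` table, and the error handling are assumptions for illustration; this is not the TensorFlow loader.

```python
import platform

# Library file names per OS, taken from the jar listing above.
LIBRARY_NAMES = {
    "linux": "libtensorflow_jni.so",
    "windows": "tensorflow_jni.dll",
    "darwin": "libtensorflow_jni.dylib",
}

# Hypothetical normalization of the architecture names reported by the OS.
ARCH_ALIASES = {"x86_64": "x86_64", "amd64": "x86_64"}


def native_resource_path(os_name=None, machine=None):
    """Return the in-package path of the native library for this platform."""
    os_name = (os_name or platform.system()).lower()
    machine = (machine or platform.machine()).lower()
    arch = ARCH_ALIASES.get(machine)
    if os_name not in LIBRARY_NAMES or arch is None:
        # The package only ships the OS/arch combinations it was built for.
        raise RuntimeError(f"no bundled native library for {os_name}-{machine}")
    return f"org/tensorflow/native/{os_name}-{arch}/{LIBRARY_NAMES[os_name]}"


if __name__ == "__main__":
    print(native_resource_path("Linux", "x86_64"))
```

Note that nothing here distinguishes between Linux distros or libc flavors: `linux-x86_64` means whatever the package author built and tested against.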

@omajid
Member

omajid commented Mar 10, 2023

@tmds, do you have a link to the previous issue where we were discussing getting rid of portable vs non-portable mode completely? (IIRC @am11 was there too) That was a long thread and had some great context around portable-ness and its consequences, but I can't find the issue anymore.

@ayakael
Contributor

ayakael commented Mar 10, 2023

@omajid I think that you are referring to this one: #62942

@tmds
Member

tmds commented Mar 11, 2023

I'm still not understanding the source build problem.

There is no source build problem.

Source-build needs non-portable rids because portable rids exist.
That is a solved problem.

For example, with our source-build .NET, we build a Microsoft.NETCore.App.Host.fedora.37-x64. This is an apphost that is not portable.
If the user needs something that runs across a wide range of distros, he needs the portable one: Microsoft.NETCore.App.Host.linux-x64.
(The fedora.37-x64 vs linux-x64 semantics are the same for the other packages.)

The non-portable rids provide a namespace for the distro.
They allow the user to pick the portable version vs. the source-build version when needed.

That is the primary use-case for non-portable rids.
The non-portable rid doesn't need to be understood elsewhere for this.

It seems extraneous.

All this complexity is for making non-portable assets consumable by portable .NET applications.

According to @richlander this is a use-case that needs to be supported:

We started this expanded proposal since developers on our team publish distro-specific RID-split packages.

@agocke
Member

agocke commented Mar 11, 2023

Alright, so specifically, non-portable RIDs are needed for redistributable components that do not meet the .NET portability requirement.

According to @richlander this is a use-case that needs to be supported:

I'm not starting from the premise that anything needs to be supported. I want to know exactly which scenarios are enabled by which features.

@agocke
Member

agocke commented Mar 11, 2023

Alright, so specifically, non-portable RIDs are needed for redistributable components that do not meet the .NET portability requirement.

Actually, let me be even more specific. Non-portable RIDs are used by asset selection to give redistributable components identifiers that differ from the Microsoft-published components, which happen to be portable.

Notably, I don't think anyone is using non-native non-portable RIDs, i.e. if you're on ubuntu and you want to publish for a non-portable build you would always say -r ubuntu-x64 not -r centos-x64. If you did say -r centos-x64 I would expect that to fail as there are no centos-x64 redistributable packages on NuGet and you almost certainly don't have them locally. Yes, in theory someone could host a NuGet package server that contains redistributable components for each of the linux distributions, but I don't think that exists.

So in practice we're actually talking about 1 bit of information -- portable vs. non-portable. The RID system is massively overcomplicated for that data.

@am11
Member

am11 commented Mar 11, 2023

Some real data points would be interesting for this discussion, e.g. if < 0.001% of packages targeting .NET 6+ on nuget.org are using non-portable RIDs, then deprecation should be on the cards. For the remaining few use cases, users are free to devise a custom mechanism without making the entire ecosystem distro-aware and distro-specific.

With modern APIs like NativeLibrary.SetDllImportResolver(), users are free to perform system introspection, compile from source on first run, or "detect platform" for target library selection before loading it into the process.

@richlander
Member Author

I think the best path forward is to get the community to maintain these compat lists. They should be easy to merge (if necessary) and to use in either project files or Directory.Build.props. It's not perfect, but I think we've identified that there is no perfect answer. I propose we go with this plan and see what the user feedback looks like.

@agocke
Member

agocke commented Apr 6, 2023

That works for me. As long as we don't put the MajorVersionCompat property definition into the SDK itself, I see no problems.

@tmds
Member

tmds commented Apr 6, 2023

The goal of the rid is to express binary compatibility.

For RHEL, the compatibility is across the major version, like RHEL 8.

In /etc/os-release, VERSION_ID includes the minor, like 8.7.

For me the goal is, to make this just work.

There should be no hoops the users must go through.

@agocke
Member

agocke commented Apr 6, 2023

Bundling native components as linux-x64 with a minimal reference set will have everything just work for the user, and it will work not just for "major" distros, but many "minor" ones as well.

@richlander
Member Author

As long as we don't put the MajorVersionCompat property definition into the SDK itself

Can you elaborate on what you mean by this?

It could be useful if a source-build distro could add its own definitions so that users didn't need to.

@agocke
Member

agocke commented Apr 6, 2023

I don't consider custom source-builds to be part of the SDK, but instead patches that are applied by the author when building.

@richlander
Member Author

I don't consider custom source-builds to be part of the SDK

I don't think that's a helpful framing. We work with folks like Red Hat in partnership. For example, we made it possible to inject RIDs into the RID graph as part of source-build. First-class source-build scenarios are SDK scenarios.

In any case, if we're going to offer MajorVersionCompat as a feature, then the SDK would need to know about it.

@tmds
Member

tmds commented Apr 6, 2023

source-build

Note that we're not trying to solve an issue with the source-built runtimes.
When we source-build .NET we are in control of the rid that is associated with that build.
For example, we can name it: rhel.8-x64.

The issue is that when Microsoft portable binaries run on RHEL8, they don't know they should use rhel.8-x64 assets,
because the algorithmic model doesn't suggest that rid based on /etc/os-release.

@agocke
Member

agocke commented Apr 6, 2023

In any case, if we're going to offer MajorVersionCompat as a feature, then the SDK would need to know about it.

Yeah, I think my statement was unclear: the SDK would use and respect whatever comes through, the list would just be empty by-default. Distributions would be responsible for encoding their preferred values during build. This could be like the RID system, where the configuration is passed on the command line, or it could be something like adding or modifying a well-known props file to a particular place in the SDK.

The important bit is just that the list of distros doesn't appear in the upstream build.

The issue is that when Microsoft portable binaries run on RHEL8, they don't know they should use rhel.8-x64 assets,
because the algorithmic model doesn't suggest that rid based on /etc/os-release.

I think this is intentional. Microsoft portable distributions should be portable by-definition. If distros want users to be able to take advantage of distro-specific assets, users have to use the copy in the package manager.

This is analogous to python's wheel system, where distributions are responsible for encoding the wheels they support in the python build.

@tmds
Member

tmds commented Apr 6, 2023

The rids are meant for expressing binary compatibility.

The host derives the rid from /etc/os-release.
For some distros, what is in /etc/os-release is not expressing binary compatibility.
That is: the implementation that is used to derive the rid is not appropriate for the rid semantics.

the list would just be empty by-default.

The user shouldn't have to compensate for distros where the implementation is known to fail.

I think this is intentional. Microsoft portable distributions should be portable by-definition. If distros want users to be able to take advantage of distro-specific assets, users have to use the copy in the package manager.

My understanding is that the complexity comes from a requirement to make Microsoft portable builds capable of picking up non-portable assets.

When we source-build .NET on a specific distro, the rids to probe for are known at build time.

When a Microsoft portable binary runs on a distro, it needs to figure out what to probe. This is the complex part.

@jkotas
Member

jkotas commented Apr 7, 2023

This discussion suggests that a declarative system seems to be too inflexible to satisfy RID compatibility relationships. Would it make sense to allow describing the RID compatibility relationships as code instead? Something like:

  • The default host would probe for linux-x64 and unix assets only, like in the original plan.

  • Applications or libraries that wish to include distro-specific assets will be required to include code to resolve the distro-specific assets, e.g. by subscribing to existing AssemblyLoadContext.Default.ResolvingUnmanagedDll event. This code can be customized by application or library author to implement arbitrary RID compatibility relationship.
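The probing logic such a handler could implement is sketched below in Python, purely to illustrate the decision order; in a .NET app the equivalent logic would live in the ResolvingUnmanagedDll handler. The `runtimes/<rid>/native/` layout follows the NuGet package convention, and the function name and parameters are hypothetical.

```python
from pathlib import Path


def resolve_native_asset(app_dir, library_file, distro_rids, portable_rid="linux-x64"):
    """Return the first existing asset path, or None to defer to default probing.

    distro_rids is whatever list the app or library author decides is
    compatible with the current machine, most specific first.
    """
    for rid in [*distro_rids, portable_rid]:
        candidate = Path(app_dir) / "runtimes" / rid / "native" / library_file
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    # Example: prefer RHEL 8 assets, then fall back to the portable ones.
    print(resolve_native_asset(".", "libfoo.so", ["rhel.8-x64"]))
```

Because the candidate list is ordinary code, an author can encode any compatibility relationship they need without the host knowing about it.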

@tmds
Member

tmds commented Apr 7, 2023

Yes. We've challenged a couple of times the need to automatically pick up non-portable assets at runtime.

The only thing said about that (in #83246 (comment)) is:

We started this expanded proposal since developers on our team publish distro-specific RID-split packages.

@agocke
Member

agocke commented Apr 10, 2023

Applications or libraries that wish to include distro-specific assets will be required to include code to resolve the distro-specific assets, e.g. by subscribing to existing AssemblyLoadContext.Default.ResolvingUnmanagedDll event. This code can be customized by application or library author to implement arbitrary RID compatibility relationship.

Seems like it would be pretty powerful and useful. Two questions:

  1. Is there any scenario where we'll need to remap before managed code loading?
  2. How would this integrate with source build? Would platforms build an additional plugin and drop it in the SDK somewhere?

@elinor-fung
Member

The default host would probe for linux-x64 and unix assets only, like in the original plan.

I would be concerned that not probing the current RID would turn this into significantly more of a breaking change for non-linux, particularly Windows. I know there are a lot of packages that have win10 assets (maybe some of them aren't actually win10 specific, but I did see some high downloads that were intended to only support win10).

As various folks have mentioned before, some packages do already use NativeLibrary, DllImportResolver, ResolvingUnmanagedDll, and friends to do their own resolution. I know there was pushback from package owners around the complexity, hence the proposal for a declarative system. If the declarative system still seems to be a partial solution, maybe we just commit to the stronger break, create guidance/examples, and push the complexity to code in packages that would be affected. From what data I was able to query on public nuget, I found ~700 packages that have assets under runtimes/.

How would this integrate with source build? Would platforms build an additional plugin and drop it in the SDK somewhere?

I think I'm missing the scenario here. Is this for using a source-built SDK to build/publish a RID-specific application that uses distro-specific assets that aren't an exact match to the RID specified on build/publish?

@agocke
Member

agocke commented Apr 11, 2023

I think I'm missing the scenario here. Is this for using a source-built SDK to build/publish a RID-specific application that uses distro-specific assets that aren't an exact match to the RID specified on build/publish?

That was my thought, yeah. I'm the maintainer for BananaLinux and I make a source-built copy of .NET. I want to express compatibility for all Ubuntu packages >= 16.04 and <= 22.04, and all minor versions of CarrotLinux 4.x.
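Expressed as code, that policy might look like the sketch below. The RID spellings (`ubuntu.<version>-<arch>`, `carrotlinux.<version>-<arch>`) and the function names are assumptions; versions are assumed to be purely numeric.

```python
def parse_rid(rid: str):
    """Split "ubuntu.20.04-x64" into ("ubuntu", (20, 4), "x64")."""
    base, _, arch = rid.rpartition("-")
    name, _, version = base.partition(".")
    numbers = tuple(int(part) for part in version.split(".")) if version else ()
    return name, numbers, arch


def bananalinux_can_use(asset_rid: str) -> bool:
    """Hypothetical policy: which distro-specific assets BananaLinux accepts."""
    name, version, _arch = parse_rid(asset_rid)
    if name == "ubuntu":
        # All Ubuntu packages >= 16.04 and <= 22.04.
        return (16, 4) <= version <= (22, 4)
    if name == "carrotlinux":
        # All minor versions of CarrotLinux 4.x.
        return version[:1] == (4,)
    return False


if __name__ == "__main__":
    for rid in ("ubuntu.20.04-x64", "ubuntu.22.10-x64", "carrotlinux.4.2-x64"):
        print(rid, bananalinux_can_use(rid))
```

A range check like this is awkward to state in a purely declarative list, since every release in the range would have to be enumerated.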

@tmds
Member

tmds commented Apr 13, 2023

I'm the maintainer for BananaLinux and I make a source-built copy of .NET. I want to express compatibility for all Ubuntu packages >= 16.04 and <= 22.04, and all minor versions of CarrotLinux 4.x.

My understanding is that, under this proposal, the compatibility information is no longer the responsibility of the .NET installation.

The app developer is responsible for including it (when their app comes with non-portable assets).

This works the same for Microsoft built .NET and source-built .NET.

@agocke
Member

agocke commented Apr 15, 2023

That makes more sense -- so this would go in the user's app and they would use this resolution policy there.

@elinor-fung
Member

Based on the suggestion (re)raised in #83246 (comment) around not trying to add a declarative system:

This is basically the last plan, without the declarative system (so 'opt-in: distro compatibility' and ‘major <- major.minor’ removed).

We looked at the data available to us for packages listed on NuGet. Based on paths of items in the package, we identified ~700 distinct packages out of 490k packages that had non-portable RID assets in any of their versions. Only 250 of those used linux RIDs (most were windows). A number of the commonly downloaded ones (for example, Libuv, SkiaSharp and LibGit2Sharp) no longer had non-portable RID assets in their latest versions.
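For scale, a quick calculation using the approximate counts above (the figures are rounded, so the results are rough): about 0.14% of listed packages ever shipped non-portable RID assets, and about 0.05% used Linux RIDs.

```python
def share(count: int, total: int) -> float:
    """Fraction of packages in the given category."""
    return count / total


TOTAL_PACKAGES = 490_000      # "490k packages"
NON_PORTABLE = 700            # "~700 distinct packages"
LINUX_NON_PORTABLE = 250      # "Only 250 of those used linux RIDs"

if __name__ == "__main__":
    print(f"non-portable RID assets: {share(NON_PORTABLE, TOTAL_PACKAGES):.3%}")
    print(f"of which Linux RIDs:     {share(LINUX_NON_PORTABLE, TOTAL_PACKAGES):.3%}")
```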

This seems to support the position of avoiding the complexity of adding a declarative system and pushing it to applications/libraries to add a custom mechanism using existing APIs.

@tmds
Member

tmds commented May 8, 2023

The host will use the algorithmic probing by default.

Is this right: the algorithmic probing doesn't probe for any non-portable rids (for portable builds)?

I think the algorithmic probing is good if it:

  • either doesn't probe for any non-portable rids,
  • or it (also) probes for the proper non-portable rids on RHEL, Rocky, Oracle Linux, etc (that is: without the minor).

@elinor-fung
Member

I think it should still probe for the RID of the host itself - so the fallback RID passed at build-time (with the same version stripping or not as in the build scripts), which I believe would include the non-portable RID for non-portable builds of the host.

@tmds
Member

tmds commented May 8, 2023

My question is about the algorithmic model at runtime: whether it adds a non-portable rid based on the /etc/os-release from the host it is running on, or whether the list is fixed at build time?

From your reply, I assume it's the latter? And this fixed list could include a non-portable rid.

@elinor-fung
Member

From your reply, I assume it's the latter? And this fixed list could include a non-portable rid.

Yes, I was thinking it would be the fixed list from build time. I'm hoping that would help move us away from the (often confusing) determination at run time. Happy to hear any other opinions though.

@ericstj
Member

ericstj commented Oct 24, 2023

@richlander - are you using this issue to track any more remaining work in 8.0 or can it be closed/milestone updated?

@richlander
Member Author

This was implemented in .NET 8, so closing.

@ghost ghost locked as resolved and limited conversation to collaborators Nov 24, 2023