Package validation #196 (Draft: wants to merge 4 commits into `main`)
`INDEX.md` (1 addition, 0 deletions), hunk `@@ -74,6 +74,7 @@ Use update-index to regenerate it:`
| 2021 | [Requirements](accepted/2021/statics-in-interfaces/requirements/README.md) | [Immo Landwerth](https://github.com/terrajobst) |
| 2021 | [Scenarios](accepted/2021/statics-in-interfaces/scenarios/README.md) | [Immo Landwerth](https://github.com/terrajobst) |
| 2021 | [Statics in Interfaces](accepted/2021/statics-in-interfaces/README.md) | [Immo Landwerth](https://github.com/terrajobst) |
| 2021 | [Package Validation](accepted/2021/package-validation/package-validation.md) | [Immo Landwerth](https://github.com/terrajobst) |

# Proposed

`accepted/2021/package-validation/package-validation.md` (new file, 190 additions):
# Package Validation
**Review comment (Member):** On my first read, I confused "Package Validation" with NuGet package verification.

**Review comment (Member):** Perhaps another name makes sense here, since many users already confuse the verification feature with validation.


**Owner** [Immo Landwerth](https://github.com/terrajobst)

With .NET Core & Xamarin we have made cross-platform a mainstream requirement
for library authors. However, we lack validation to ensure that library
developers produce well-formed packages, which can result in packages that
don't work well, which in turn hurts our ecosystem. This is especially
problematic for emerging platforms where adoption isn't high enough to warrant
special attention by library authors.

The tooling we provide as part of the SDK has close to zero validation that
multi-targeted packages are well-formed. For example, a package that
multi-targets for .NET Core 3.1 and .NET Standard 2.0 needs to ensure that code
compiled against the .NET Standard 2.0 binary can run against the binary that is
produced for .NET Core 3.1. In the dotnet/runtime repo we have tooling that
ensures that this is the case, but this can't easily be used by our customers.
We have seen this issue in the wild, even with first parties, for example, the
Azure AD libraries.

Another common issue in the .NET ecosystem is that binary breaking changes
aren't well understood. Developers tend to think of "does the source still
compile" as the bar to determine whether an API is compatible. However, at the
ABI level certain changes that are fine in C# aren't compatible at the level of
IL, for example, adding a defaulted parameter or changing the value of a
constant. In the dotnet/runtime repo we have tooling that allows us to validate
that our current API shape is backwards compatible with the API we shipped
before.
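To make the constant case concrete, here is a minimal sketch; the `Limits` type and its values are illustrative, not from any real library:

```csharp
// Sketch: why changing the value of a const is a binary breaking change.
// The C# compiler copies const values into each consumer's IL at compile
// time, so a consumer compiled against an old version keeps observing the
// old value until it is recompiled.
public static class Limits
{
    // If v2 of the library changes this to 5, consumers compiled against
    // v1 still carry the literal 3 embedded in their own IL.
    public const int MaxRetries = 3;

    // A static readonly field is resolved at run time instead, so existing
    // consumers pick up a new value automatically -- the compatible choice
    // when a value may need to change between versions.
    public static readonly int MaxDelaySeconds = 30;
}
```

The defaulted-parameter case is analogous: default argument values are substituted by the caller's compiler, so they are baked into consumers the same way constants are.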

This document outlines a set of features that will allow library developers to
validate that their packages are well-formed and that they didn't make
unintentional breaking changes. This is achieved by productizing our internal
tooling that we're planning to use ourselves as well.

## Scenarios and User Experience
**Review comment (Member):** For clarity, am I understanding correctly that this tool runs on exactly one package at a time? E.g., it doesn't need to check referenced packages, nor does it attempt to compare two versions of the same package?

**Review comment (Member):** Oh, never mind, I see the package-to-package scenario. Still, it would be good to note whether it ever looks at any packages other than the one or two supplied.

**Review comment:**

> does it attempt to compare two versions of the same package

It will have the functionality to compare with a previous version of the package.

> it doesn't need to check referenced packages

We will run a few checks on the package dependencies.

**Review comment:** Should we add a use case for validation across different RIDs?

**terrajobst (Member, Author):** Happy to, but could you provide a rough summary of what we'd flag?


### Validation between .NET Standard and .NET Core

Finley is working on a client library for a cloud service. They need to support
both .NET Framework and .NET Core customers. They started with just targeting
.NET Standard 2.0 but now they realize they want to take advantage of the new
UTF8 string feature in .NET 7. In order to do that, they are multi-targeting for
`netstandard2.0` and `net7.0`.

Finley has written the following code:

```C#
#if NET7_0_OR_GREATER
public void DownloadLogs(Utf8String area)
{
    DownloadLogs(area.AsSpan());
}
#else
public void DownloadLogs(string area)
{
    var utf8Bytes = Encoding.UTF8.GetBytes(area);
    DownloadLogs(utf8Bytes);
}
#endif

private void DownloadLogs(ReadOnlySpan<byte> areaUtf8)
{
    // Do network call
}
```
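Without validation, this `#if`/`#else` split only fails at run time. A reflection-based sketch shows the mismatch a .NET Standard 2.0 consumer would hit on .NET 7; the asset types below are illustrative stand-ins, not real tooling output:

```csharp
using System;
using System.Reflection;

// A consumer that targets .NET Standard 2.0 compiles against the
// netstandard2.0 asset, so its IL references DownloadLogs(string).
var compileTimeBinding =
    typeof(NetStandardAsset).GetMethod("DownloadLogs", new[] { typeof(string) });

// When the app runs on .NET 7, the runtime loads the net7.0 asset, where
// that exact signature no longer exists; the call site fails with
// MissingMethodException.
var runtimeBinding =
    typeof(Net7Asset).GetMethod("DownloadLogs", new[] { typeof(string) });

Console.WriteLine(compileTimeBinding != null); // signature consumers bound to
Console.WriteLine(runtimeBinding == null);     // missing at run time

// Illustrative stand-ins for the two compiled assets in the package.
static class NetStandardAsset
{
    public static void DownloadLogs(string area) { }
}

static class Net7Asset
{
    public static void DownloadLogs(ReadOnlySpan<byte> areaUtf8) { }
}
```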

When Finley builds the project it fails with the following error:

> error SYSLIB1234: The method 'DownloadLogs(string)' exists for .NET Standard
> 2.0 but does not exist for .NET 7.0. This prevents consumers who compiled
> against .NET Standard 2.0 from running on .NET 7.0.

Finley understands that they shouldn't exclude `DownloadLogs(string)` but
instead just provide an additional `DownloadLogs(Utf8String)` method for .NET
7.0, and changes the code accordingly:

```C#
#if NET7_0_OR_GREATER
public void DownloadLogs(Utf8String area)
{
    // ...
}
#endif

public void DownloadLogs(string area)
{
    // ...
}
```

### Validation against previous version

Skylar is tasked with adding support for a connection timeout to their library.
The `Connect` method looks like this right now:

```C#
public static Client Connect(string url)
{
    // ...
}
```

Since a connection timeout is an advanced configuration setting they reckon they
can just add an optional parameter:

```C#
public static Client Connect(string url, TimeSpan? timeout = default)
{
    // ...
}
```

However, when they rebuild the package they are getting an error:

> error SYSLIB1234: The method 'Connect(string)' exists in the previous version
> of the NuGet package (AdventureWorks.Client 1.2.3) but no longer exists in the
> current version being built (1.2.4). This is a breaking change.

Skylar realizes that while this is not a source breaking change, it's a binary
breaking change. They solve this problem by adding an overload instead:

```C#
public static Client Connect(string url)
{
    return Connect(url, Timeout.InfiniteTimeSpan);
}

public static Client Connect(string url, TimeSpan timeout)
{
    // ...
}
```
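The reason the original change is binary breaking can be sketched with reflection; the `ClientV124` stand-in and its shape are hypothetical:

```csharp
using System;
using System.Reflection;

// Callers compiled against 1.2.3 have a reference to Connect(string) in
// their IL. Optional-parameter defaults are substituted by the *caller's*
// compiler, so shipping 1.2.4 with only Connect(string, TimeSpan?) leaves
// already-compiled callers pointing at a signature that no longer exists,
// which surfaces as MissingMethodException at run time.
var oldSignature =
    typeof(ClientV124).GetMethod("Connect", new[] { typeof(string) });

Console.WriteLine(oldSignature == null); // the 1.2.3 signature is gone

// Illustrative stand-in for the 1.2.4 assembly after the breaking change.
static class ClientV124
{
    public static object Connect(string url, TimeSpan? timeout = default) =>
        new object();
}
```

Adding the explicit `Connect(string)` overload, as above, keeps both signatures alive in metadata.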

### Making an intentional breaking change

Armani is working on version 0.7 of their library. While they don't want to make
accidental breaking changes, they do want to make changes based on user feedback.

In version 0.6 they had a `Connect()` method that accepted a timeout as an
`int`. Users complained that timeouts in .NET are nowadays generally modeled
using `TimeSpan` instead. After making the necessary code change, rebuilding the
library results in an error:

> error SYSLIB1234: The method 'Connect(string, int)' exists in the previous
> version of the NuGet package (Fabrikam.Client 0.6.3) but no longer exists in
> the current version being built (0.7.0). This is a breaking change.

**Review comment (Member):** Does the analyzer look at the version number? I assume not.

**Review comment (Member):** I assume this would be a configuration that a user would set in their project, as this feature would be opt-in.

**Review comment (Member):** Oh, perhaps it will not report breaking changes if they are between two packages with different major versions? If so, is there a way to force it to do so anyway (as we would for our own packages)?

**Review comment (Member):** I think I would always report errors and then let users suppress messages for intentional breaks. I would be scared of a mode that runs by default and swallows errors that might help the user avoid a breaking change; I'd rather have them be intentional about suppressing that warning.

**Review comment (Member):** That makes sense. I know that for Microsoft, we may increment the major version for a variety of reasons, such as marketing, not because it's time to break stuff.

**terrajobst (Member, Author), Apr 5, 2021:** I thought about that. Unless I'm missing something, I don't believe we need configuration regarding the version number, but we will need an opt-in for previous-version validation and a way to configure feed(s).

* The tool knows which version is being built.
* For each feed, the tool can request the existing versions. The previous version is the one that immediately precedes the current version in a combined sorted list.

Originally I thought about a setting to "ignore breaking changes across major version boundaries", but it seems that even if you do cross a SemVer boundary you still only want to make intentional breaking changes. The tool only knows which ones are intentional if the developer suppressed them, so I don't believe we need a SemVer setting at all. I've added a section to the doc to cover this.

Since Armani's library is still a 0.x, which, according to SemVer, doesn't
promise a stable API, they briefly consider turning off the breaking change
validation entirely. However, Armani feels that even though it's 0.x they still
don't want to make accidental breaking changes, so they decide to just suppress
this specific instance of the breaking change:

```C#
[SuppressMessage("Interoperability", "SYSLIB1234")]
public static Client Connect(string url, TimeSpan timeout)
{
    // ...
}
```

**Review comment (Member):** Do we care about the size footprint this may add to customer binaries in metadata? I agree we should support disabling errors this way; however, should we also provide a way to disable errors via an input text file or an MSBuild item group? This could be modeled as tuples:

```
SYSLIB1234, Connect(System.String, System.TimeSpan)
```

or

```
<PackageValidation Include="SYSLIB1234" Member="Connect(System.String, System.TimeSpan)" />
```

**Review comment (Member):** Do we have precedent for designating types/members in MSBuild like this? I wonder whether it would be fragile...

**Review comment (Member):** I'm not sure, but I would imagine that adding these attributes for all the APICompat errors we baseline in dotnet/runtime, for example, which is a pretty large codebase, would not be a good option because of the metadata bloat they would cause.

**Review comment (Member):** cc: @ericstj for opinions/ideas as well.

**Review comment (Member):** Another option would be `#pragma`s that could be read by the tool, either through symbol-to-source mapping or read by an analyzer and shuffled over to the API compat tool through a file on the side, perhaps even the PDB (the same could be done through `Conditional` attributes). Those could mitigate the size impact on the assembly. I think authoring in source is significantly better than the MSBuild project file, which is better than a random XML/text file.

In this particular case I see the attribute is applied to the new method. That could be problematic. When analyzing metadata we wouldn't necessarily see the removal of one overload and the addition of a new one as the same thing. In the case that many overloads were added, which one should get the suppression for the removed overload? When members are completely removed, where do you put the suppression? API Compat solved this with a crude method: suppress by the full log string.

Another concern: we'd want to make sure folks aren't unnecessarily suppressing warnings. Folks might also want to have many runs of the tool (e.g., once against the previous major version, once against the previous minor). Where would they put the different suppressions? Also, if they were only running against the previous version, would we ask them to remove all the suppressions once they passed the single version that introduced the breaking change?

**Review comment (Member):**

> Another option would be `#pragma`s that could be read by the tool [...]

Wouldn't that mean that if we wanted to run against the previous version of the package, which had its own APICompat suppressions, we would need to run against the symbols package if the PDBs are not part of the package that we downloaded?

Also, we would need to come up with our own `#pragma` format that included RID/TFM/SemVer if you wanted to suppress for a specific major version but not for others, or if you wanted to have a suppression for just one RID but not the others, etc.?

**Review comment (Member):**

> Wouldn't that mean that [...] we would need to run against the symbols package if the PDBs are not part of the package that we downloaded?

Understanding whether these suppressions need to travel with the binary itself is an important distinction. That's not clear to me today: in all scenarios I've seen, the right-hand side is being built. If we want to support otherwise, then I agree we should have a reliable persistent storage location.

> Also we would need to come up with our own `#pragma` format [...]

Source already has conditional syntax (conditions in the project file, `#if` in source). For applying suppressions to the right-hand side I think we should use that. Constraining the suppression based on what's on the left-hand side could be interesting. We've mentioned a few interesting variables: version, TFM, RID. What do we think these constraints might help us with?

For version, my thought was that it would allow us to "permit" old suppressions to remain if the version on the left-hand side was newer than the one being analyzed. This way we wouldn't force people to delete suppressions the release after they made the breaking change (they still could if they wanted).

I'm not sure it's so interesting for TFM/RID: if you are going to permit a break, you are typically OK with that break from any source on the left-hand side. To handle the "unused" case here, we could look at all compat runs in a batch and only raise an unused-suppression warning if none of them required the suppression (and the version rule didn't apply).

**safern (Member), Apr 16, 2021:** Yeah, I can't think of a scenario where the left-hand side would need to carry suppressions over, but maybe there is one.

Also, something that might be interesting: we are running APICompat after pack (I don't know whether the opt-in against a previous version would run after build or after pack; I would guess after pack, so that we can do package validation against it as well, not just API Compat). When running after pack it would be trickier to get the PDBs, as that happens in the outer build where there is no RID/TFM context. I know we could figure out the PDB paths with some targets, but how ideal would that be? What I'm trying to say is: how ideal is depending on PDBs?

Also something to consider: how are we going to baseline errors for Package Validation itself, like package dependency errors and those kinds of errors? Or I guess those shouldn't be allowed to be disabled.

**Review comment (Member):** Just brainstorming existing files that might already flow through the process where we could stash metadata. PDBs definitely have some downsides. Assembly metadata is, of course, easier. I was just reading our docs on `SuppressMessageAttribute` and found this, which addresses the metadata bloat:

> The SuppressMessageAttribute attribute is a conditional attribute, which is included in the IL metadata of your managed code assembly only if the CODE_ANALYSIS compilation symbol is defined at compile time.
>
> You should not use in-source suppressions on release builds, to prevent shipping the in-source suppression metadata accidentally. Additionally, because of the processing cost of in-source suppression, the performance of your application can be degraded.

Maybe if we use it that way it makes it OK? Folks would need to pack Debug builds to have it work.

> Also something to consider: how are we going to baseline errors for Package Validation itself [...]

We should also have a path for suppressions that isn't exclusively assembly metadata: `NoWarn`, but also more granular suppressions. We'd need to think about all the encoding we'd want to do on those.

**Review comment (Member):**

> Maybe if we use it that way it makes it OK? Folks would need to pack Debug builds to have it work.

That would make me a little nervous, though, as not a lot of PR/CI validation builds run `dotnet pack` on Debug builds; I don't know how common that practice is. And I would guess release pipelines always pack the Release configuration, so would this be a behavior where people can only suppress messages in one configuration but not the other? That could lose some fidelity, as our tooling would only run on Debug builds that run pack.

Also, if we are going to do suppressions in source as our first-class mode to suppress errors, I think we should provide a fixer in the near future...

## Requirements

### Goals

* Performance should be good so that basic package validation can be on by
default (when the project generates a package), but it must be possible to
turn it off as well
* Comparing to the previous version should be opt-in because it requires
  downloading additional files

**Review comment:** How would the tool work without comparing to a previous version?

**safern (Member), Apr 19, 2021:** It would run as part of the `dotnet pack` step and would make sure that the produced package is API compatible (between the compile-time and runtime assets), well-formed, closure complete, etc.

**Review comment:** @safern Will there be any flag or metadata by which one can assess that the `<ValidatePackageAgainstPreviousVersion>` property was used at pack time in the resulting package?

* Developers need to be able to make intentional breaking changes while still
  validating that they didn't make other unintentional breaking changes

**JonDouglas, Apr 19, 2021:** Maybe if SemVer is not detected, intentional breaking changes can be made (i.e., 0.Y.Z)? Otherwise, if SemVer is detected, they can choose to ignore/suppress the warnings/errors (i.e., X.Y.Z where the version is >= 1.0.0)?

### Non-Goals

* Having zero impact on performance (as that's not feasible)

## Stakeholders and Reviewers

* Libraries team
* NuGet team
* SDK team

## Design

### API Compat Tooling

The API compat tooling is described in a separate [spec][api-compat-spec].

## Q & A

[api-compat-spec]: https://github.com/dotnet/designs/pull/177