Add dates when a package became deprecated #9638

Open
noqcks opened this issue Aug 25, 2023 · 3 comments
Labels
Area: V3 Feed feature-request Customer feature request

Comments

@noqcks

noqcks commented Aug 25, 2023

NuGet Product(s) Involved

Other/NA

The Elevator Pitch

For the NuGet API, it would be wonderful to have a way to see when packages were marked as deprecated. We're building an open-source scanner for EOL packages/software in container images at https://github.com/xeol-io/xeol and we'd like to know when packages were marked as EOL, so that users can see how long a package has been deprecated and make an informed decision about what to do in response.

Additional Context and Details

Related Issue: dotnet/core#7420

@zivkan
Member

zivkan commented Aug 28, 2023

I really like the idea! I started an internal conversation to see whether this proposal needs to go through the full feature process, specifically whether it needs a design spec, since decisions need to be made about whether to back-fill all existing deprecations on nuget.org or only collect dates going forward, and how the date will be represented in dotnet list package --deprecated.

I'm hoping that just adding a field is simple enough that we can have a comment in this issue and avoid the more formal design spec process, but I'll write back when we get agreement internally.

@ghost ghost added the WaitingForCustomer label Aug 28, 2023
@joelverhagen joelverhagen transferred this issue from NuGet/Home Aug 30, 2023
@joelverhagen
Member

Hey @noqcks, I've transferred this issue to NuGet/NuGetGallery (the issue tracker for NuGet.org) because at the heart of it is an update to the data model made available on the V3 protocol -- of which NuGet.org is a leading implementation.

Currently, some deprecation information and vulnerability information (which we can consider similarly in many contexts) is available on the Package Metadata resource (a.k.a. the registration base URL) and the Search resource. In neither case, as you state, do we have the date that the package transitioned from non-deprecated/non-vulnerable to deprecated/vulnerable. We do store the timestamp internally for deprecation (DeprecatedOn -- a miracle really) but we don't seem to use it anywhere except for our security auditing. This could theoretically be surfaced in our API responses and then shown in Visual Studio UI experiences or CLI experiences. So, there would be at least three steps in this work: decide whether a "vulnerable on" timestamp is in scope, populate the dates into API responses (a protocol change), and then enhance the client experiences to use this new value when available.

This would be a non-trivial amount of work: there may be a data backfill for "vulnerable on", and changes to the NuGet.org V3 protocol are costly because our infrastructure is based on blob storage. We would rewrite thousands or millions of blobs in Azure Blob Storage to set the property everywhere. And then, finally, we would need to enhance the client so that it does not require the value but uses it when it is there. Oh, and we would need protocol doc updates. So, it's a righteous effort and feasible, just not a small thing. If you only need the deprecated date, and only in the API rather than in the client experiences, then perhaps as @zivkan says we can go with a lightweight design process (full details in this issue perhaps).

Whether you need the lightweight implementation or the full implementation with client experience changes, I want to be frank about when (or even if) our team could implement this. We have a deep backlog of long-awaited customer and partner asks which means it's hard for us to pick up new changes like this without a clear indication from the community that this is very important and needed for a lot of people. Unfortunately, there are a lot of enhancements in this "good idea, but we can't prioritize it" bucket. In short, we'd need to prioritize this work along with the rest of our asks. If you want to help with the proposal and the implementation process, we can explore that together here or on the proposal PR. That could help but is also not guaranteed since in the end our team would need to perform the deployment and data update operations to make it live.

Anyways, back to the issue. There is a workaround available to you right now.

It's not clear to me how https://github.com/xeol-io/xeol operates, but if it just uses NuGet's V3 HTTP API directly, then there is a way right now to determine the approximate "deprecated on" date (+/- 5 minutes) by reading the NuGet.org catalog API.

The catalog is an append-only log of most package metadata changes on NuGet.org. When a package is marked as deprecated, an item per version is appended to the catalog with a "deprecation" property set. If you read the catalog forward in time, you can build an index of what packages were deprecated at any point in time because each catalog item has its own timestamp. The catalog is sorted in chronological order after all.
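
A minimal Python sketch of reading the catalog forward in time. It assumes the documented catalog layout (an index listing pages, each page listing items with `@id`, `@type`, `commitTimeStamp`, `nuget:id`, and `nuget:version`) and that a deprecated leaf carries a `deprecation` property. The `fetch_json` parameter and the function names are illustrative and not part of any NuGet library; in practice `fetch_json` could wrap an HTTP GET, starting from https://api.nuget.org/v3/catalog0/index.json.

```python
from datetime import datetime, timezone
from typing import Any, Callable, Dict, Iterator, Tuple

Json = Dict[str, Any]


def parse_commit_timestamp(value: str) -> datetime:
    """Parse a catalog timestamp such as 2023-05-04T02:43:18.1316853Z (assumed UTC)."""
    main, _, fraction = value.rstrip("Z").partition(".")
    microseconds = int((fraction + "000000")[:6])  # keep at most 6 fractional digits
    parsed = datetime.strptime(main, "%Y-%m-%dT%H:%M:%S")
    return parsed.replace(microsecond=microseconds, tzinfo=timezone.utc)


def commit_time(entry: Json) -> datetime:
    return parse_commit_timestamp(entry["commitTimeStamp"])


def iter_deprecation_states(
    index_url: str, fetch_json: Callable[[str], Json]
) -> Iterator[Tuple[datetime, str, str, bool]]:
    """Yield (commit time, package id, version, has_deprecation), oldest first."""
    index = fetch_json(index_url)
    for page_ref in sorted(index["items"], key=commit_time):
        page = fetch_json(page_ref["@id"])
        for item in sorted(page["items"], key=commit_time):
            if item["@type"] != "nuget:PackageDetails":
                continue  # skip package delete entries
            leaf = fetch_json(item["@id"])  # one request per leaf
            yield (
                commit_time(item),
                item["nuget:id"],
                item["nuget:version"],
                "deprecation" in leaf,
            )
```

This sketch fetches every leaf one at a time, which is a very large number of requests on the full catalog, so a real scanner would likely parallelize the leaf downloads and persist its progress.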

Consider an example. Microsoft.AspNetCore.Cors 2.2.0 is marked as deprecated right now.

If we scan the catalog, we find two leaves for this package version:

| CommitTimestamp | HasDeprecation | Url | PageUrl |
| --- | --- | --- | --- |
| 2018-12-03 23:15:21.0089785 | False | https://api.nuget.org/v3/catalog0/data/2018.12.03.23.15.21/microsoft.aspnetcore.cors.2.2.0.json | https://api.nuget.org/v3/catalog0/page6121.json |
| 2023-05-04 02:43:18.1316853 | True | https://api.nuget.org/v3/catalog0/data/2023.05.04.02.43.18/microsoft.aspnetcore.cors.2.2.0.json | https://api.nuget.org/v3/catalog0/page19114.json |

From this we know that the package was deprecated on 2023-05-04 at about 02:43 UTC, because that is the commit timestamp of the first catalog leaf in which the package appears as deprecated. It's possible there will be more leaves in the future, for example if the deprecation details change slightly or if the package is unlisted. Or the package can become undeprecated later!
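
To make the transition logic concrete, here is a small Python sketch that takes the leaves of one package version and returns when the current deprecation began. The function name and the input shape (pairs of commit time and a has-deprecation flag) are illustrative assumptions, not part of the NuGet API. It treats a later non-deprecated leaf as an undeprecation and ignores later deprecated leaves that only change details.

```python
from datetime import datetime
from typing import Iterable, Optional, Tuple


def current_deprecated_on(
    leaves: Iterable[Tuple[datetime, bool]]
) -> Optional[datetime]:
    """Return when the current deprecation began, or None if not deprecated now.

    `leaves` holds (commit_time, has_deprecation) pairs for ONE package version.
    """
    deprecated_since = None
    for commit_time, has_deprecation in sorted(leaves, key=lambda leaf: leaf[0]):
        if not has_deprecation:
            deprecated_since = None  # undeprecated (or never deprecated) at this point
        elif deprecated_since is None:
            deprecated_since = commit_time  # first leaf of the current deprecated run
    return deprecated_since


# The two leaves of Microsoft.AspNetCore.Cors 2.2.0 from the table above,
# with the fractional seconds truncated to microseconds.
cors_leaves = [
    (datetime(2018, 12, 3, 23, 15, 21, 8978), False),
    (datetime(2023, 5, 4, 2, 43, 18, 131685), True),
]
print(current_deprecated_on(cors_leaves))
```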

But hopefully this demonstrates the feasibility.

For your scanner, you could build a local index of each package that is deprecated by regularly polling the catalog with a persistent cursor. After you initially scan the full catalog, it would be cheap to keep your index up to date. See my guide Query for all packages published to nuget.org for more details.
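
A rough Python sketch of polling with a persistent cursor follows. It assumes the cursor is the last processed `commitTimeStamp`, stored in a local JSON file; the file format, the function names, and the `fetch_json` and `handle_item` callables are hypothetical choices for illustration. Pages whose timestamp is not newer than the cursor are skipped without being downloaded, which is what keeps later polls cheap.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Callable, Dict, List, Optional

Json = Dict[str, Any]


def parse_commit_timestamp(value: str) -> datetime:
    """Parse a catalog timestamp such as 2023-05-04T02:43:18.1316853Z (assumed UTC)."""
    main, _, fraction = value.rstrip("Z").partition(".")
    parsed = datetime.strptime(main, "%Y-%m-%dT%H:%M:%S")
    return parsed.replace(microsecond=int((fraction + "000000")[:6]), tzinfo=timezone.utc)


def load_cursor(path: Path) -> Optional[str]:
    """Return the saved cursor, or None before the first full scan."""
    return json.loads(path.read_text())["cursor"] if path.exists() else None


def save_cursor(path: Path, value: str) -> None:
    path.write_text(json.dumps({"cursor": value}))


def new_items(index: Json, fetch_json: Callable[[str], Json], cursor: Optional[str]) -> List[Json]:
    """Return catalog items committed after the cursor, oldest first."""
    after = parse_commit_timestamp(cursor) if cursor is not None else None
    found = []
    for page_ref in index["items"]:
        if after is not None and parse_commit_timestamp(page_ref["commitTimeStamp"]) <= after:
            continue  # nothing new in this page, so do not download it
        for item in fetch_json(page_ref["@id"])["items"]:
            if after is None or parse_commit_timestamp(item["commitTimeStamp"]) > after:
                found.append(item)
    found.sort(key=lambda item: parse_commit_timestamp(item["commitTimeStamp"]))
    return found


def poll_once(index_url: str, fetch_json: Callable[[str], Json],
              cursor_path: Path, handle_item: Callable[[Json], None]) -> int:
    """Process everything newer than the cursor, then advance the cursor."""
    items = new_items(fetch_json(index_url), fetch_json, load_cursor(cursor_path))
    for item in items:
        handle_item(item)  # e.g. fetch the leaf and update the local deprecation index
    if items:
        save_cursor(cursor_path, items[-1]["commitTimeStamp"])
    return len(items)
```

The cursor is only saved after every new item has been handled, so a crash mid-poll leads to reprocessing rather than to missed items; `handle_item` should therefore be safe to run twice on the same item.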

@joelverhagen joelverhagen self-assigned this Aug 30, 2023
@joelverhagen joelverhagen added Area: V3 Feed feature-request Customer feature request and removed Type:Feature labels Aug 30, 2023
@noqcks
Author

noqcks commented Aug 30, 2023

Hi @joelverhagen, thank you for the detailed response, I really appreciate it.

The "vulnerable on" timestamp is out of scope for what we want to do. I'm not familiar with how NuGet tracks vulnerabilities, but from a cursory glance, it seems like they're published to an external vuln database in all instances, in which case a user can fetch information from the vuln database if they need more information around publish dates.

For our use case, the only change we'd need is a date associated with a deprecation, returned by the API, so the "lightweight design" mentioned. I will look into your catalog workaround in the meantime.


4 participants