-
Notifications
You must be signed in to change notification settings - Fork 572
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: use package id from cyclonedx when provided #1872
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: James Neate <jamesmneate@gmail.com>
The alternate is that the entire BOM-Ref field becomes the ID instead of the package-id suffix? Happy to make said change if needed. |
syftID := extractSyftPacakgeID(component.BOMRef) | ||
if syftID != "" { | ||
idMap[syftID] = p | ||
p.OverrideID(artifact.ID(syftID)) | ||
} else { | ||
// TODO there must be a better way than needing to call this manually: | ||
p.SetID() | ||
} | ||
// TODO there must be a better way than needing to call this manually: | ||
p.SetID() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only way this p.OverrideID
gets called is if the component.BOMRef
is a PURL that contains the package-id
qualifier (which is probably only set when Syft generated the SBOM in the first place, using a PURL as the BOMRef). Additionally, this syftID
might not even match the ID generated by the current version of Syft, so the way it's being used is probably wrong.
It was originally a conscious decision to re-generate the IDs for components (by calling the p.SetID()
instead of reusing whatever ID is provided by the format in order to get reasonably consistent output. For example: if we used the BOMRef as the Syft package ID, then re-encoded CycloneDX, we'd have something like: pkg:thing/name@version?package-id=pkg:thing/name@version?package-id=2a7443b435345b2
(ignore the lack of URL encoding). Obviously we can handle this during encoding, for example, by comparing the ID to what SetID
would set.
A few things here:
- we should probably get rid of
extractSyftPacakgeID
. I don't think this is providing any value -- at best it's extracting the same ID thatSetID()
would generate, but more likely it is setting some different ID that is not useful in any way - we should adjust output formats to be aware when an ID is explicitly set vs generated based on the data (this would probably look like 1 or 2 utility methods on the package -- maybe a
GenerateID
that could be used for comparison and/or anIsIDGenerated()
)- We would also need to update the CycloneDX encoder and possibly other encoders to handle cases with explicitly set IDs
If we want to support reading IDs from formats instead of always generating them based on the data (I'm not sure we always do -- this seems like a configurable option), I think your observation that we should be using the BOMRef
is accurate, maybe something like this:
if component.BOMRef != "" {
p.OverrideID(artifact.ID(component.BOMRef))
} else {
p.SetID()
}
idMap[string(p.ID())] = p
PackageURL: "pkg:maven/org.springframework.boot/spring-boot-starter-test", | ||
} | ||
|
||
packageID := extractSyftPacakgeID(packageWithId.BOMRef) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a typo here extractSyftPacakgeID
packageID := extractSyftPacakgeID(packageWithId.BOMRef) | ||
|
||
packageWithoutId := cyclonedx.Component{ | ||
BOMRef: "pkg:maven/org.springframework.boot/spring-boot-starter-webflux", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: Why don't we reuse the BomRef value ? I assume sboms use the BomRef as an unique reference, therefore I am wondering why this initial unique ID for this SBOM needs to be adapted as done on sbom.Artifacts.Packages.PackagesByName(packageWithoutId.Name)
or sbom.Artifacts.Packages.Package(artifact.ID(packageID)))
.
Hi @jneate -- I had a discussion with the team about this and there are some definite issues with directly using IDs from SBOMs as Syft internal object IDs. Although this PR is specific to CycloneDX, if we were to do this with SPDX, without more intelligent behavior, we would output IDs that increasingly grow in size. A directly related thing we are interested in is maintaining additional information from SBOM formats that Syft doesn't support -- for example VEX information in CycloneDX currently will just be dropped if we simply converted CycloneDX -> CycloneDX. I think we'd like to try to do a couple things on this front: Unfortunately, for this PR, we probably don't want to accept as-is where we're just setting the Syft ID directly, but instead with some sort of additional field like |
This is further progress on anchore/grype#1265
Currently when cyclone-dx has a package-id specified it's ignored and syft generates a new ID, this change ensures the cyclone-dx ID is re-used.
The fix was mainly for grype but it also fixes
syft convert
when switching between cyclonedx -> syft-json, the package-id is re-used in the new output.Left the TODO in place as it was there when I started.