Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: use package id from cyclonedx when provided #1872

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 4 additions & 2 deletions syft/formats/common/cyclonedxhelpers/decoder.go
Original file line number Diff line number Diff line change
Expand Up @@ -92,9 +92,11 @@ func collectPackages(component *cyclonedx.Component, s *sbom.SBOM, idMap map[str
syftID := extractSyftPacakgeID(component.BOMRef)
if syftID != "" {
idMap[syftID] = p
p.OverrideID(artifact.ID(syftID))
} else {
// TODO there must be a better way than needing to call this manually:
p.SetID()
}
// TODO there must be a better way than needing to call this manually:
p.SetID()
Comment on lines 92 to -97
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only way this p.OverrideID gets called is if the component.BOMRef is a PURL that contains the package-id qualifier (which is probably only set when Syft generated the SBOM in the first place, using a PURL as the BOMRef). Additionally, this syftID might not even match the ID generated by the current version of Syft, so the way it's being used is probably wrong.

It was originally a conscious decision to re-generate the IDs for components (by calling the p.SetID() instead of reusing whatever ID is provided by the format in order to get reasonably consistent output. For example: if we used the BOMRef as the Syft package ID, then re-encoded CycloneDX, we'd have something like: pkg:thing/name@version?package-id=pkg:thing/name@version?package-id=2a7443b435345b2 (ignore the lack of URL encoding). Obviously we can handle this during encoding, for example, by comparing the ID to what SetID would set.

A few things here:

  • we should probably get rid of extractSyftPacakgeID. I don't think this is providing any value -- at best it's extracting the same ID that SetID() would generate, but more likely it is setting some different ID that is not useful in any way
  • we should adjust output formats to be aware when an ID is explicitly set vs generated based on the data (this would probably look like 1 or 2 utility methods on the package -- maybe a GenerateID that could be used for comparison and/or an IsIDGenerated())
    • We would also need to update the CycloneDX encoder and possibly other encoders to handle cases with explicitly set IDs

If we want to support reading IDs from formats instead of always generating them based on the data (I'm not sure we always do -- this seems like a configurable option), I think your observation that we should be using the BOMRef is accurate, maybe something like this:

		if component.BOMRef != "" {
			p.OverrideID(artifact.ID(component.BOMRef))
		} else {
			p.SetID()
		}
		idMap[string(p.ID())] = p

s.Artifacts.Packages.Add(*p)
}

Expand Down
38 changes: 38 additions & 0 deletions syft/formats/common/cyclonedxhelpers/decoder_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -336,3 +336,41 @@ func Test_missingComponentsDecode(t *testing.T) {

assert.NoError(t, err)
}

func Test_useSyftIDWhenProvided(t *testing.T) {

packageWithId := cyclonedx.Component{
BOMRef: "pkg:maven/org.springframework.boot/spring-boot-starter-test?package-id=646a5a71a4abeee0",
Type: cyclonedx.ComponentTypeLibrary,
Name: "spring-boot-starter-test",
Version: "",
PackageURL: "pkg:maven/org.springframework.boot/spring-boot-starter-test",
}

packageID := extractSyftPacakgeID(packageWithId.BOMRef)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a typo here extractSyftPacakgeID


packageWithoutId := cyclonedx.Component{
BOMRef: "pkg:maven/org.springframework.boot/spring-boot-starter-webflux",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: Why don't we reuse the BomRef value ? I assume sboms use the BomRef as an unique reference, therefore I am wondering why this initial unique ID for this SBOM needs to be adapted as done on sbom.Artifacts.Packages.PackagesByName(packageWithoutId.Name) or sbom.Artifacts.Packages.Package(artifact.ID(packageID))).

Type: cyclonedx.ComponentTypeLibrary,
Name: "spring-boot-starter-webflux",
Version: "",
PackageURL: "pkg:maven/org.springframework.boot/spring-boot-starter-webflux",
}

bom := cyclonedx.BOM{Metadata: nil,
Components: &[]cyclonedx.Component{
packageWithId,
packageWithoutId,
}}

sbom, err := ToSyftModel(&bom)

assert.Nil(t, err)
assert.NotNil(t, sbom.Artifacts.Packages.Package(artifact.ID(packageID)))

pkgsWithoutID := sbom.Artifacts.Packages.PackagesByName(packageWithoutId.Name)

assert.Len(t, pkgsWithoutID, 1)
assert.NotEqual(t, "", pkgsWithoutID[0].ID())

}