SPDX element/relationships should be defined for the container image being described #1241

Closed
lumjjb opened this issue Oct 4, 2022 · 6 comments · Fixed by #1934
Labels
enhancement New feature or request

lumjjb commented Oct 4, 2022

What would you like to be added:

At the moment, when scanning a container image, SPDX documents contain the list of packages that the target contains. However, there are no elements/relationships that link back to the container that is being described.

Why is this needed:

Without container image elements/relationships linking to the packages, ingesting SPDX documents results in detached packages, with no way to query for the containers that use them. It would be useful to link them up so that queries can find the entities that use certain packages.
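
To illustrate the kind of query this would enable, here is a minimal sketch in Python. The relationship dictionaries use the SPDX JSON field names (`spdxElementId`, `relationshipType`, `relatedSpdxElement`); the SPDX IDs and the helper name `containers_using` are made up for illustration, and expressing the link as `CONTAINS` edges from the image to its packages is an assumption, not something Syft emits today.

```python
from collections import defaultdict

# Hypothetical relationships, shaped like the "relationships" array of an
# SPDX JSON document. The IDs are invented for this example.
relationships = [
    {"spdxElementId": "SPDXRef-Image-A", "relationshipType": "CONTAINS",
     "relatedSpdxElement": "SPDXRef-Package-openssl"},
    {"spdxElementId": "SPDXRef-Image-B", "relationshipType": "CONTAINS",
     "relatedSpdxElement": "SPDXRef-Package-openssl"},
    {"spdxElementId": "SPDXRef-Image-B", "relationshipType": "CONTAINS",
     "relatedSpdxElement": "SPDXRef-Package-zlib"},
]


def containers_using(relationships, package_id):
    """Return the sorted IDs of elements that CONTAIN the given package."""
    parents = defaultdict(set)
    for rel in relationships:
        if rel["relationshipType"] == "CONTAINS":
            parents[rel["relatedSpdxElement"]].add(rel["spdxElementId"])
    return sorted(parents.get(package_id, set()))
```

Without the container element, the `CONTAINS` edges have no parent to point from, so a lookup like this has nothing to return.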

Additional context:

Example graph without links to entity:

[image: example graph]

The SPDX element to be defined can reference the OCI PURL (https://github.com/package-url/purl-spec/blob/master/PURL-TYPES.rst#oci), and should include the sha256 checksum of the container image when possible.
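
As a rough sketch of what that reference could look like, the Python below builds an OCI purl and an SPDX-style checksum entry from an image digest. The helper names `oci_purl` and `spdx_checksum` are hypothetical, and treating the manifest digest as "the checksum of the container image" is an assumption. The purl layout (last path segment as the name, digest as the version with `:` percent-encoded, `repository_url` and `tag` as qualifiers) follows my reading of the purl spec linked above.

```python
from typing import Optional
from urllib.parse import quote


def oci_purl(repository: str, digest: str, tag: Optional[str] = None) -> str:
    """Build a pkg:oci purl, e.g. from 'docker.io/library/debian' and 'sha256:...'."""
    repository = repository.lower()
    name = repository.rsplit("/", 1)[-1]  # last path segment is the purl name
    qualifiers = {"repository_url": repository}
    if tag:
        qualifiers["tag"] = tag
    # Qualifiers are emitted in sorted key order.
    query = "&".join(
        f"{key}={quote(value, safe='/:')}" for key, value in sorted(qualifiers.items())
    )
    # The ':' in the digest is percent-encoded in the version component.
    return f"pkg:oci/{name}@{quote(digest, safe='')}?{query}"


def spdx_checksum(digest: str) -> dict:
    """Split 'sha256:<hex>' into an SPDX JSON checksum entry."""
    algorithm, _, value = digest.partition(":")
    return {"algorithm": algorithm.upper(), "checksumValue": value}
```

For example, `oci_purl("docker.io/library/debian", "sha256:244fd47e07d10", tag="latest")` gives `pkg:oci/debian@sha256%3A244fd47e07d10?repository_url=docker.io/library/debian&tag=latest`.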

@lumjjb lumjjb added the enhancement New feature or request label Oct 4, 2022
@spiffcs spiffcs self-assigned this Oct 4, 2022
@kzantow
Contributor

kzantow commented Oct 4, 2022

I think we need to get SPDX 2.3 support for this first, no? spdx/tools-golang#156

Reason being, 2.3 added the "primary package purpose" which allows us to add a container as a package in the graph
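
A minimal sketch of what this could mean for the output, in Python for brevity: the container becomes an SPDX package whose `primaryPackagePurpose` is `CONTAINER`, and the field is simply left out when targeting 2.2. The function name, the `SPDXRef-Container` ID, and the `NOASSERTION` download location are placeholders for illustration, not what Syft emits.

```python
def container_package(name, version, spdx_version="SPDX-2.3"):
    """Return an SPDX JSON package entry describing a container image."""
    package = {
        "SPDXID": "SPDXRef-Container",  # hypothetical ID
        "name": name,
        "versionInfo": version,
        "downloadLocation": "NOASSERTION",  # placeholder value
    }
    # primaryPackagePurpose only exists from SPDX 2.3 onward.
    if spdx_version == "SPDX-2.3":
        package["primaryPackagePurpose"] = "CONTAINER"
    return package
```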

@spiffcs
Contributor

spiffcs commented Oct 4, 2022

👋 Nice find @lumjjb!

As far as the core syft data model goes, do you think this would be a new field for the packages?

"Package": {
  "required": [
    "id",
    "name",
    "version",
    "type",
    "foundBy",
    "locations",
    "licenses",
    "language",
    "cpes",
    "purl"
  ],
  "properties": {
    "id": {
      "type": "string"
    },
    "name": {
      "type": "string"
    },
    "version": {
      "type": "string"
    },
    "type": {
      "type": "string"
    },
    "foundBy": {
      "type": "string"
    },
    "locations": {
      "items": {
        "$schema": "http://json-schema.org/draft-04/schema#",
        "$ref": "#/definitions/Coordinates"
      },
      "type": "array"
    },
    "licenses": {
      "items": {
        "type": "string"
      },
      "type": "array"
    },
    "language": {
      "type": "string"
    },
    "cpes": {
      "items": {
        "type": "string"
      },
      "type": "array"
    },
    "purl": {
      "type": "string"
    },
    "metadataType": {
      "type": "string"
    },
    "metadata": {
      "anyOf": [
        {
          "type": "null"
        },
        {
          "$ref": "#/definitions/AlpmMetadata"
        },
        {
          "$ref": "#/definitions/ApkMetadata"
        },
        {
          "$ref": "#/definitions/CargoPackageMetadata"
        },
        {
          "$ref": "#/definitions/ConanLockMetadata"
        },
        {
          "$ref": "#/definitions/ConanMetadata"
        },
        {
          "$ref": "#/definitions/DartPubMetadata"
        },
        {
          "$ref": "#/definitions/DotnetDepsMetadata"
        },
        {
          "$ref": "#/definitions/DpkgMetadata"
        },
        {
          "$ref": "#/definitions/GemMetadata"
        },
        {
          "$ref": "#/definitions/GolangBinMetadata"
        },
        {
          "$ref": "#/definitions/HackageMetadata"
        },
        {
          "$ref": "#/definitions/JavaMetadata"
        },
        {
          "$ref": "#/definitions/KbPackageMetadata"
        },
        {
          "$ref": "#/definitions/NpmPackageJSONMetadata"
        },
        {
          "$ref": "#/definitions/PhpComposerJSONMetadata"
        },
        {
          "$ref": "#/definitions/PortageMetadata"
        },
        {
          "$ref": "#/definitions/PythonPackageMetadata"
        },
        {
          "$ref": "#/definitions/RpmMetadata"
        }
      ]
    }
  },
  "additionalProperties": true,
  "type": "object"
},

Right now we have a purl field that describes the package, but we could add a field that also describes the distro it's related to and then transpose that into the SPDX relationship [1].

We could also codify this as a relationship (syft-json relationship), but I'm not sure what the parent would be in this case that all packages could be linked back to, since we identify the distro as its own separate field outside of the package list [2].

Maybe we need to add the sha256 checksum of the container image as part of another "new" element that goes into the core data model.

Footnotes

  1. https://github.com/anchore/syft/blob/91eece47ffc5de009dccb916458b9ba734864319/schema/json/schema-4.0.0.json#L1368-L1390

  2. https://github.com/anchore/syft/blob/91eece47ffc5de009dccb916458b9ba734864319/schema/json/schema-4.0.0.json#L755-L814

@spiffcs
Contributor

spiffcs commented Oct 4, 2022

Note: I think adding an ID field to the source object might be the best bet here.

Candidate for the ID field could be:
https://github.com/opencontainers/image-spec/blob/main/config.md#layer-chainid
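
For reference, a small Python sketch of how a ChainID is derived, based on my reading of the linked spec section: the ChainID of a single layer is its DiffID, and each further layer hashes the previous ChainID, a space, and the next DiffID. The function name `chain_id` is made up, and the sketch assumes sha256 digests throughout.

```python
import hashlib


def chain_id(diff_ids):
    """Compute the ChainID for an ordered list of layer DiffIDs (bottom layer first)."""
    if not diff_ids:
        raise ValueError("at least one DiffID is required")
    chain = diff_ids[0]  # ChainID(L0) = DiffID(L0)
    for diff_id in diff_ids[1:]:
        # ChainID(L0|...|Ln) = Digest(ChainID(L0|...|Ln-1) + " " + DiffID(Ln))
        digest = hashlib.sha256(f"{chain} {diff_id}".encode("utf-8")).hexdigest()
        chain = f"sha256:{digest}"
    return chain
```

The ChainID of the top layer depends on every layer beneath it, which is what would make it usable as an identifier for the whole image filesystem.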

We could also potentially update relationships to be id:container:["id", "id", ...] so we don't have to generate every single new relationship.
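
A quick sketch of that idea in Python: keep one grouped entry per container and only expand it into pairwise relationships when a format needs them. The `parent`/`child`/`type` keys are meant to mirror the syft-json relationship shape; the grouped form, the `contains` type, and the function name are assumptions for illustration.

```python
def expand_grouped(grouped, rel_type="contains"):
    """Expand {container_id: [package_id, ...]} into pairwise relationships."""
    return [
        {"parent": container_id, "child": package_id, "type": rel_type}
        for container_id, package_ids in grouped.items()
        for package_id in package_ids
    ]


# Hypothetical grouped form: one container ID mapped to the IDs it contains.
grouped = {"container-1": ["pkg-a", "pkg-b"]}
pairwise = expand_grouped(grouped)
```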

This would give us a good basis for then incorporating SPDX 2.3 when it's ready.

@spiffcs
Contributor

spiffcs commented Oct 4, 2022

@kzantow I think we can also use this issue to discuss changes to the core model that need to be made before we're ready to adopt this for spdx 2.3

@eliaslevy

This issue is a special case of #1661. In general, Syft should create an SPDX Package as the root of the SPDX Relationships tree for the artifact the SBOM describes. Otherwise there is no location to describe artifact metadata, such as author, version, checksums, download location, etc.
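
A minimal Python sketch of the relationship tree being described: the document `DESCRIBES` one root package, and the root links to every discovered package. `SPDXRef-DOCUMENT` is the standard SPDX document ID; the other IDs, the function name, and the choice of `CONTAINS` for the root-to-package edges are assumptions for illustration.

```python
def root_relationships(root_id, package_ids):
    """Build SPDX JSON relationships rooted at the described artifact."""
    relationships = [{
        "spdxElementId": "SPDXRef-DOCUMENT",
        "relationshipType": "DESCRIBES",
        "relatedSpdxElement": root_id,
    }]
    # One edge from the root artifact to each package found inside it.
    relationships += [
        {
            "spdxElementId": root_id,
            "relationshipType": "CONTAINS",
            "relatedSpdxElement": package_id,
        }
        for package_id in package_ids
    ]
    return relationships
```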

@lumjjb
Author

lumjjb commented May 18, 2023

Hi - pinging back on this issue! Wondering if this is something that is being worked on. We are still special-casing this in GUAC and will likely add a filter for SBOMs that don't have this soon, as the heuristics are unreliable (sometimes files become containers and vice versa).
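
To make the idea of such a filter concrete, here is a hedged Python sketch of the kind of check a consumer could run instead of relying on heuristics. This is not GUAC's actual implementation; the function name is made up, and it assumes the container is marked with `primaryPackagePurpose: CONTAINER` and linked from the document by a `DESCRIBES` relationship.

```python
def describes_container(doc):
    """Return True if the SPDX JSON document DESCRIBES a CONTAINER package."""
    purposes = {
        package["SPDXID"]: package.get("primaryPackagePurpose")
        for package in doc.get("packages", [])
    }
    return any(
        rel["spdxElementId"] == "SPDXRef-DOCUMENT"
        and rel["relationshipType"] == "DESCRIBES"
        and purposes.get(rel["relatedSpdxElement"]) == "CONTAINER"
        for rel in doc.get("relationships", [])
    )
```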
