Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add p2-aware model converter for CycloneDX SBOM generation #3258

Merged
merged 1 commit into from
Feb 15, 2024

Conversation

ptziegler
Copy link
Contributor

@ptziegler ptziegler commented Dec 14, 2023

Initial draft based on the discussions here and here.

The big obstacles right now are:

  • Convert the Maven artifact back into a p2 artifact.
  • Find the download URL for a given p2 artifact.
  • Handle artifacts from the current reactor build.
  • Handle group name of the SBOM component.

The reactor artifacts are especially tricky, because none of the artifacts have been built when the SBOM is generated.

For illustration, this is the SBOM for org.eclipse.gef-feature:
bom.zip

Copy link

github-actions bot commented Dec 14, 2023

Test Results

  588 files  + 3    588 suites  +3   3h 25m 16s ⏱️ - 26m 28s
  398 tests + 4    391 ✅ + 4   7 💤 ±0  0 ❌ ±0 
1 194 runs  +12  1 172 ✅ +12  22 💤 ±0  0 ❌ ±0 

Results for commit 46c8028. ± Comparison against base commit 96ae5d9.

♻️ This comment has been updated with latest results.

@laeubi
Copy link
Member

laeubi commented Dec 15, 2023

Convert the Maven artifact back into a p2 artifact.

You can take a look at TychoWorkspaceReader that has a method File findP2Artifact(final Artifact artifact) (feel free to make it public or extract into an own P2Mapper component)

Find the download URL for a given p2 artifact.

As described in the discussions around this I think this is somehow impossible to solve without configuration in all cases, but you can do the following, given you have already found the P2 coordinates:

  1. Collect all remote repositories from the maven project and look for layout p2 then use P2RepositoryManager and try to find your artifact there, if it is found, it can at least be downloaded from there.
  2. Get the TargetPlatformConfiguration for the project and then use getTargets() to get the list of used target files for the project, afterwards you should look for org.eclipse.tycho.targetplatform.TargetDefinition.RepositoryLocations and do the same here and even want to consider org.eclipse.tycho.targetplatform.TargetDefinition.TargetReferenceLocations

Handle artifacts from the current reactor build.

I think this would require a configuration somewhere to what URL the reactor should be mapped, maybe just as simple as start with a user-property like -Dp2.sbom.reactor=<desired url>

Handle group name of the SBOM component.

What do you mean by this?

@laeubi
Copy link
Member

laeubi commented Dec 15, 2023

By the way please don't use tycho-extras, this is an old historic construct and should be avoided for new designs, simply put it in the root of the project.

@ptziegler
Copy link
Contributor Author

You can take a look at TychoWorkspaceReader that has a method File findP2Artifact(final Artifact artifact) (feel free to make it public or extract into an own P2Mapper component)

Thanks, I'll have a look.

As described in the discussions around this I think this is somehow impossible to solve without configuration in all cases,

And for the most part, I agree. But I also believe we should at least attempt to find the location of an artifact. Otherwise the package URL is not much of a locator...

I also don't believe it has to be a link to the jar file. I would also consider the base URL of the update site, a link to the Git repository or anything similar to be valid locators.

There are of course difficult/impossible edge-cases, but I'd rather have something that works for 95% of all projects and go from there.

Problems that come to mind are e.g. with file-based repositories or builds that directly assemble a product and therefore don't upload their artifacts. But as said, I first want to get this working for simple projects.

maybe just as simple as start with a user-property like -Dp2.sbom.reactor=<desired url>

That's a good point. I assume most projects already have a property in their pom, describing the repository the build artifacts are uploaded to. One might as well simply use the same property for both.

Handle group name of the SBOM component.

What do you mean by this?

The SBOM contains components of the form:

<component type="library" bom-ref="...">
	<group>p2.eclipse.plugin</group>
	<name>org.eclipse.orbit.xml-apis-ext</name>
	<version>1.0.0.v20230923-0644</version>
	<scope>required</scope>
	...
</component>

Where the group is simply the artificial group id created by Tycho. Given that there is no real counterpart in p2, I'm not exactly sure what to do with it or if it even matters. But that's simply something I have to ask the CycloneDX developers about.

By the way please don't use tycho-extras, this is an old historic construct and should be avoided for new designs, simply put it in the root of the project.

Noted.

@laeubi
Copy link
Member

laeubi commented Dec 15, 2023

also don't believe it has to be a link to the jar file. I would also consider the base URL of the update site

As described above you can try to guess at least what URLs can supply the artifact with the described methods that works in a good 80% of cases.

@ptziegler
Copy link
Contributor Author

I made some good progress and a good chunk seems to be working already.

A general problems seems to be that the SBOM is generate really early, at a point where Tycho is not fully initialized yet.

You can take a look at TychoWorkspaceReader that has a method File findP2Artifact(final Artifact artifact) (feel free to make it public or extract into an own P2Mapper component)

I didn't get that to work yet, which is why I'm still constructing the ArtifactKey "by hand".

I think this would require a configuration somewhere to what URL the reactor should be mapped, maybe just as simple as start with a user-property like -Dp2.sbom.reactor=

That is currently done via a Maven property but in the end should function exactly the same way.

I also believe I need to also create a CycloneDX dependency model for p2 artifacts. Because dependencies are usually flat.

Example:
Features not only depend on the plugins they contain, but also on the plugins of the plugins.

@laeubi
Copy link
Member

laeubi commented Jan 3, 2024

A general problems seems to be that the SBOM is generate really early, at a point where Tycho is not fully initialized yet.

This shouldn't be an issue, Tycho itself is initialized far earlier than most (if not all) plugins in a regular maven build, so maybe you can give an example where it causes issues?

I didn't get that to work yet, which is why I'm still constructing the ArtifactKey "by hand".

Can you explain what was the problem?

I also believe I need to also create a CycloneDX dependency model for p2 artifacts. Because dependencies are usually flat.

Do you mean to construct a tree of items? There is already a dependency-tree mojo for P2 model

Features not only depend on the plugins they contain, but also on the plugins of the plugins.

DependencyArtifacts should contain the full dependency model.

@ptziegler
Copy link
Contributor Author

This shouldn't be an issue, Tycho itself is initialized far earlier than most (if not all) plugins in a regular maven build, so maybe you can give an example where it causes issues?

The makeAggregatePom goal currently crashes with the following error:

Execution default of goal org.cyclonedx:cyclonedx-maven-plugin:2.7.9:makeAggregateBom failed: Project tycho-demo:example.plugin:eclipse-plugin:1.0.0.20240107 does not have an expanded version

The expanded version is set by the BuildQualifierMojo during the validate phase.

But this goal is executed before it has a chance to run and so far, I haven't managed to bind it to a later phase.

I didn't get that to work yet, which is why I'm still constructing the ArtifactKey "by hand".

Can you explain what was the problem?

Initially, I was calling TychoProjectManager.getArtifactKey() to get the ArtifactKey of a given Artifact.

This initially wouldn't work, because Tycho didn't recognize them as part of the reactor build, so I had to calculate the corresponding project first.

The next problem is that the version of the returned ArtifactKey isn't expanded, so inside the SBOM, it would still be .qualifier instead of the timestamp.

Which is why I do all of this by hand.

I also believe I need to also create a CycloneDX dependency model for p2 artifacts. Because dependencies are usually flat.

Do you mean to construct a tree of items? There is already a dependency-tree mojo for P2 model

Features not only depend on the plugins they contain, but also on the plugins of the plugins.

DependencyArtifacts should contain the full dependency model.

I'm pretty sure CycloneDX doesn't use the dependency-tree mojo but rather works on Artifact.getDependencies() directly.

I could fix this by also writing a custom ProjectDependenciesConverter which uses DependencyArtifacts instead, but that all depends on how much effort it is...

@laeubi
Copy link
Member

laeubi commented Jan 8, 2024

The makeAggregatePom goal currently crashes with the following error

I think you probably want to use makeBom goal instead, aggregate goals unusually don't work well with Tycho as there is literally no documentation/specification how aggregate mojo are supposed to work so each implementation is doing different things some run before each phase some run at the end and so on, it seems even not that easy in a regular maven build:

What I can think of is to make Tycho compute the qualifier on demand at first call as we now do for almost everything, but as you said its done by the mojo actually and it is configurable so could be different from the default, especially if you choose different timestamp providers so one defiantly has to take the project config into account.

This initially wouldn't work, because Tycho didn't recognize them as part of the reactor build, so I had to calculate the corresponding project first.

What do you mean by "not part of reactor build" are you calling the mojo directly on the commandline?

I'm pretty sure CycloneDX doesn't use the dependency-tree mojo but rather works on Artifact.getDependencies() directly.

That's for sure, I just wanted to give a hint how you can reconstruct a tree from a flat list, but I'm not sure if you like the other way round?

@ptziegler
Copy link
Contributor Author

I think you probably want to use makeBom goal instead

That's what I do ;D

What do you mean by "not part of reactor build" are you calling the mojo directly on the commandline?

The artifacts Tycho gets from CycloneDX are not of type ProjectArtifact even if they match a reactor project. Even the fallback mechanism via the bundle reader doesn't work here, because at this point, the artifact files are still set to the temporary file not yet available file created by the MavenDependencyInjector.

public ArtifactKey getArtifactKey(Artifact artifact) {
if (artifact instanceof ProjectArtifact projectArtifact) {
Optional<ArtifactKey> key = getArtifactKey(projectArtifact.getProject());
if (key.isPresent()) {
return key.get();
}
}
try {
OsgiManifest loadManifest = bundleReader.loadManifest(artifact.getFile());
return loadManifest.toArtifactKey();
} catch (OsgiManifestParserException e) {
// not an bundle then...
}
return new DefaultArtifactKey("maven", artifact.getGroupId() + ":" + artifact.getArtifactId(),
artifact.getVersion());
}

That's for sure, I just wanted to give a hint how you can reconstruct a tree from a flat list, but I'm not sure if you like the other way round?

Hm... That looks really usefull, actually. If I can reuse this dependency tree, then it should hopefully be pretty straightforward to synchronize this with the SBOM tree. Thanks!

@laeubi
Copy link
Member

laeubi commented Jan 8, 2024

The artifacts Tycho gets from CycloneDX are not of type ProjectArtifact even if they match a reactor project.

Yes usually you need to first map an artifact to a reactor project by building a map from the session, I think mostly because CycloneDX will likely use dependencies instead of real project artifacts, you can for example use TychoReactorReader and make getTychoReactorProject(...) public.

@ptziegler
Copy link
Contributor Author

ptziegler commented Jan 24, 2024

I finally got some time to work on this stuff again... I ended up reusing the algorithm that is used by the dependencies-tree mojo and moved parts of it to tycho-core.

While it's not finished yet, it's (for the most part) functional. I still need to extend the documentation and do some cleanup, though.
The only open issue I don't know how to resolve is the fact that IInstallableUnit.getArtifacts() is always empty when assembling a product, for whatever reason... Because of this, the filtering doesn't work here.

@laeubi
Copy link
Member

laeubi commented Jan 25, 2024

I ended up reusing the algorithm that is used by the dependencies-tree mojo and moved parts of it to tycho-core.

Great, if you like you can split this into a separate PR then we can merge the refactoring already.

The only open issue I don't know how to resolve is the fact that IInstallableUnit.getArtifacts() is always empty when assembling a product, for whatever reason...

Products don't have artifacts (in P2) as they are metadata only. You can take the metadata IU and perform a Planner operation on it or use the dependency artifacts itself.

@ptziegler
Copy link
Contributor Author

Got a little stuck with creating the CycloneDX dependency tree, but the result should now match the output of the dependency-tree mojo. I think this is now a usable proof-of-concept.

@ptziegler ptziegler marked this pull request as ready for review February 9, 2024 05:03
This change consists of two parts. The first part consists of generating
a valid package URL for a given p2 artifact. The PURL contains the
symbolic name, version and classifier of the artifact key, as well as
the repository it is located in.
The second part handles the dependencies between two PURLs. For the sake
of consistency, this dependency tree matches the tree calculated by the
dependency-tree mojo.
@laeubi
Copy link
Member

laeubi commented Feb 15, 2024

@ptziegler do you want me to merge (and backport?) your change.

@laeubi
Copy link
Member

laeubi commented Feb 15, 2024

/request-license-review

Copy link

/request-license-review

License review requests:

After all reviews have concluded, re-run the license-vetting check from the Github Actions web-interface to update its status.

Workflow run (with attached summary files):
https://github.com/eclipse-tycho/tycho/actions/runs/7914057047

@ptziegler
Copy link
Contributor Author

Sorry, the past few days have been a little busy on my side...

I'd like to first update the specification in package-url/purl-spec#272 because I now use the base url of the repository as location, instead of the full path to the file.

There are also a few instances where the SBOM is using the group id. I'd also like to ask the CycloneDX developers, whether it's fine to keep using the artifical p2.* entry or whether there's a better alternative.

I'll try to work on this evening, but I don't believe they require any significant changes to the current implementation. So I think it would be good if the current state is merged, so that other people can try it on some larger projects.

@laeubi laeubi merged commit c06787c into eclipse-tycho:main Feb 15, 2024
11 checks passed
@laeubi
Copy link
Member

laeubi commented Feb 15, 2024

@ptziegler If you like please cherry-pick / backport this to Tycho 4 then it could be included in the upcomming bugfix release!

@laeubi laeubi added the backport-to-tycho-4.0.x Can be added to a PR to trigger an automatic backport of the change label Feb 15, 2024
Copy link

💔 All backports failed

Status Branch Result
tycho-4.0.x Backport failed because of merge conflicts

You might need to backport the following PRs to tycho-4.0.x:
- Skip CHECK_TRUST phase and allow parallel execution of product assembly
- Add support to specify target locations in the pom configuration
- Support javac as a compiler for Tycho
- Add support for API Tools Annotations to Tycho
- Support repository references

Manual backport

To create the backport manually run:

backport --pr 3258

Questions ?

Please refer to the Backport tool documentation and see the Github Action logs for details

@ptziegler
Copy link
Contributor Author

Ah, drat. I just tried to generate the SBOM for GEF-classic and an error is thrown because the expanded version isn't set for artifacts of type eclipse-test-bundle.

I don't really see the use-case for having the bill-of-materials for test fragments, so it probably makes more sense to restrict this converter to bundles and features, rather than to add the BuildQualifierMojo to the default-lifecycle-mapping for this artifact type.

Until then, the cyclonedx.skip property has to be explicitly set (or alternatively, the mojo has to be explicitly invoked). :/

private ArtifactKey getQualifiedArtifactKey(Artifact artifact) {
String expandedVersion = artifact.getVersion();
MavenProject mavenProject = reactorReader.getTychoReactorProject(artifact).orElse(null);
if (mavenProject != null) {
ReactorProject reactorProject = DefaultReactorProject.adapt(mavenProject);
if (reactorProject != null) {
expandedVersion = reactorProject.getExpandedVersion();
} else {
LOG.error(mavenProject + " is not a Tycho project.");
}
}
String type = reactorReader.getPackagingType(artifact);
return new DefaultArtifactKey(type, artifact.getArtifactId(), expandedVersion);

@laeubi
Copy link
Member

laeubi commented Feb 19, 2024

@ptziegler actually the build-qualifier is part of the default-lifecycle-mapping already for the eclipse-test-bundle on the validate phase already, maybe GEF is configured to skip this (for unknown reasons).

Beside that it is common for mojos to have a configuration parameter that controls the packaging types a mojo can run, but as far as I understand you don't have a mojo here but just extending the cyclone-dx one?

In any case there is likely a next bugfix release after 15.03.2024 when Eclipse become GA, so any improvements can go in there and as you said its good to test it out in the wild what will likely uncover some improvements we can make here.

@ptziegler
Copy link
Contributor Author

ptziegler commented Feb 20, 2024

actually the build-qualifier is part of the default-lifecycle-mapping already for the eclipse-test-bundle on the validate phase already, maybe GEF is configured to skip this

After debugging, the build-qualifier is indeed calculated, also for the test bundles. To be precise, the problem shows up, the moment you have more than one test fragment.
In general, there seems to be a very weird chain of dependencies that's causing problem.

Example:
In GEF-classic, we have the following test bundles (which are also built in this order):

  • org.eclipse.draw2d.tests <-- Error
  • org.eclipse.zest.tests
  • org.eclipse.gef.tests

Looking at the dependency tree, the Draw2D tests have a dependency on an artifact with IU eclipse-test-plugin-packaging, which seems a little weird but should be fine, in principle.

Now in order to calculate the version string in the PURL, I need to resolve the build qualifiers for all reactor projects, which is done via the ProjectDependencyClosure:

public Optional<MavenProject> getProject(IInstallableUnit installableUnit) {
return Optional.ofNullable(iuProjectMap.get(installableUnit));
}

I would assume that this method returns Optional.empty() for the eclipse-test-plugin-packaging IU mentioned above but instead, it returns the org.eclipse.zest.tests project. This project hasn't been built yet and therefore it doesn't have a build qualifier, leading to this error.

@ptziegler
Copy link
Contributor Author

@return the project that provides the given {@link IInstallableUnit} or an empty Optional if no project in this closure provides this unit.

In hindsight, I'm probably simply using the wrong method...

@laeubi
Copy link
Member

laeubi commented Feb 20, 2024

You can try to use -Dtycho.target.eager=true in that case projects are eagerly resolved sadly test project sometimes use dependencies indirectly they never declare (there should be a warning in the log).

Also I always enable -Dtycho.localArtifacts=ignore in .mvn/maven.config in all my projects because that option causes too much trouble.

@laeubi
Copy link
Member

laeubi commented Feb 20, 2024

In hindsight, I'm probably simply using the wrong method...

It depends, as obviously it gives you the MavenProject but this does not mean it is build, I have already refactored Tycho in almost all places to behave well defined even if project is not yet build but build qualifier is the only place where this currently not happens.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-to-tycho-4.0.x Can be added to a PR to trigger an automatic backport of the change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants