Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suppress addition of unnecessary repo-refereces when assembling p2-repos #2744

Merged
merged 2 commits into from
Sep 29, 2023

Conversation

HannesWell
Copy link
Member

The main goal of this PR is to prevent the addition of repository references that don't provide any relevant artifact or are full duplicates (relating the relevant artifacts) to other repositories and thus reducing the number of references which speeds up querying the repo since less other repositories have to be queried.

I think this becomes more relevant since one can now add repository references automatically and for example not all repos referenced in the TP are actually relevant for a p2-repository build for a project.
While with one can explicitly filter references with #2742, the user should not need to carry the burden to check manually which repos are relevant for the assembled repository. Instead Tycho should be smart enough to check that (and with this PR it can).

In detail his PR consists of three commits

  1. Minor code enhancements in p2-repository assembling code (will move that to a separate PR once settled)
  2. Suppress addition of unnecessary repo-refereces when assembling p2-repos
  3. Print added repository-references as info, when assembly a p2-repository

Adding some information about added repo-references in the log seems to be reasonable since it is now not obvious anymore with auto-addition, explicit and implicit filtering which references actually end up in the repo.

Similar to #2742 (comment) we could consider to activate this auto-filtering only for those references that are added automatically, because one might have added an effectively empty repo on purpose.
The logging could also only activated if automatic addition is enabled.

Btw. for M2E this reduces the list of added references from 9 to 6:

https://download.eclipse.org/eclipse/updates/4.28/
https://download.eclipse.org/egit/updates-6.6/
https://download.eclipse.org/modeling/emf/emf/builds/release/2.34.0
https://download.eclipse.org/webtools/downloads/drops/R3.28.0/R-3.28.0-20221120050827/repository/
https://download.eclipse.org/wildwebdeveloper/releases/1.3.0/
https://download.eclipse.org/tm4e/releases/0.8.1/
https://download.eclipse.org/lsp4e/releases/0.23.0/
https://download.eclipse.org/lsp4j/updates/releases/0.21.0/
https://download.eclipse.org/cbi/updates/license/

to

https://download.eclipse.org/lsp4j/updates/releases/0.21.0/
https://download.eclipse.org/eclipse/updates/4.28/
https://download.eclipse.org/webtools/downloads/drops/R3.28.0/R-3.28.0-20221120050827/repository/
https://download.eclipse.org/wildwebdeveloper/releases/1.3.0/
https://download.eclipse.org/modeling/emf/emf/builds/release/2.34.0
https://download.eclipse.org/tm4e/releases/0.8.1/

Currently this includes work-arounds for the following pending P2 PRs that are not available before December.

@HannesWell HannesWell added the backport-to-tycho-4.0.x Can be added to a PR to trigger an automatic backport of the change label Sep 3, 2023
@github-actions
Copy link

github-actions bot commented Sep 4, 2023

Test Results

   561 files  ±0     561 suites  ±0   4h 13m 17s ⏱️ - 10m 10s
   364 tests +1     358 ✔️ +1    6 💤 ±0  0 ±0 
1 092 runs  +3  1 073 ✔️ +3  19 💤 ±0  0 ±0 

Results for commit a58e8b7. ± Comparison against base commit 05367ee.

♻️ This comment has been updated with latest results.

Copy link
Member

@laeubi laeubi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this must then be configurable (at best with the ongoing <filter> configuration) because we can't decide on the rationale if/how a user wants to add additional repositories, even though PDE currently not support this fully (see eclipse-pde/eclipse.pde#720 ) Tycho does so "unnecessary" depends on the usecase, and if the updatesites are really not required one probably should never add them to the target platform anyways ...

@laeubi
Copy link
Member

laeubi commented Sep 4, 2023

What do you think about having

<repositoryReferenceFilters>
    ...
   <strictlyRequired>true/false (default)</strictlyRequired>
</repositoryReferenceFilters>

that do this filtering with strictlyRequired=true?

@HannesWell
Copy link
Member Author

HannesWell commented Sep 4, 2023

What do you think about having

<repositoryReferenceFilters>
    ...
   <strictlyRequired>true/false (default)</strictlyRequired>
</repositoryReferenceFilters>

that do this filtering with strictlyRequired=true?

The schema sounds good and I think this fits well to #2742.
But I think the name should be more expressive, e.g. excludeNotStrictlyRequired?
I assume that in most cases excludeNotStrictlyRequired=true is what users want and therefore would suggest to make that the default.

and if the updatesites are really not required one probably should never add them to the target platform anyways ...

Sometimes yes, but I think there are scenarios where it is not so obvious. For example in the case of m2e (and probably many other eclipse projects) the license repo is not required since the license is inlined during the build. Or one could have a p2-repo from which test dependencies are fetched. That should probably then also not be added to a 'product' code repo.

I'll update this PR tomorrow when #2742 is merged.

@HannesWell
Copy link
Member Author

Furthermore I think that instead of pruning the references based on the content of the artifact repository, it actually needs to be done based on the metadata repository. Otherwise the content provided by transitive references is not considered.

Which in general brings me to another idea I had:
Wouldn't it also be reasonable variant to keep the metadata (potentially with the full closure depending on the plugin-config) as it is and only remove the artifacts and reference the artifact repositories (and not the corresponding metadata repo).
I would assume that this would speed up the resolution process since then probably less IUs are available,
the closure of the assembled repo is a sub-set of the closure of all references and probably in most cases even a real sub-set, and less metadata repos would have to be queried.

@laeubi
Copy link
Member

laeubi commented Sep 5, 2023

But I think the name should be more expressive, e.g. excludeNotStrictlyRequired?

better add an exhaustive documentation to the parameter that trying to express to much with the name

I assume that in most cases excludeNotStrictlyRequired=true is what users want and therefore would suggest to make that the default.

Its hard to guess what "most users" want, but this can't be the default for Tycho 4 at least and for 5 then needs a documentation in the migration guide.

For example in the case of m2e (and probably many other eclipse projects) the license repo is not required since the license is inlined during the build.

This is a very special case and one can either filter that, or simply put the license repo in the pom and exclude pom repositories in the first place.

Furthermore I think that instead of pruning the references based on the content of the artifact repository, it actually needs to be done based on the metadata repository.

Yes the metadata is the "key" in fact not every requirement results in an artifact.

Wouldn't it also be reasonable variant to keep the metadata (potentially with the full closure depending on the plugin-config) as it is and only remove the artifacts and reference the artifact repositories (and not the corresponding metadata repo).

You can add a new option <includeMetadata> but be aware that if one references a snapshot/latest repository the metadata might change or is probably not valid anymore then.

@HannesWell HannesWell force-pushed the filterUnusedReferences branch 2 times, most recently from bd7d483 to 2bcceac Compare September 5, 2023 22:29
@HannesWell
Copy link
Member Author

But I think the name should be more expressive, e.g. excludeNotStrictlyRequired?

better add an exhaustive documentation to the parameter that trying to express to much with the name

I think it is best to have both. An expressive name and a good description.

And since we have the exclude filter for locations excludeNotStrictlyRequired fits IMHO quite well.

I assume that in most cases excludeNotStrictlyRequired=true is what users want and therefore would suggest to make that the default.

Its hard to guess what "most users" want, but this can't be the default for Tycho 4 at least and for 5 then needs a documentation in the migration guide.

Yes that's right. Sounds fair. Lets do it that way.

Wouldn't it also be reasonable variant to keep the metadata (potentially with the full closure depending on the plugin-config) as it is and only remove the artifacts and reference the artifact repositories (and not the corresponding metadata repo).

You can add a new option <includeMetadata> but be aware that if one references a snapshot/latest repository the metadata might change or is probably not valid anymore then.

Yes that's right. But I think this is a general 'issue' of filterProvided respectively relying on references. You should only do that with stable repositories.

@laeubi
Copy link
Member

laeubi commented Sep 6, 2023

An expressive name and a good description.

Longer names are not necessarily more expressive (but harder to remember / type), e.g. many mojos have a skip flag even though in many cases it could have been skipTest, skipCompile, skipExecutionOfMyMojo, ...

Especially for boolean "flags" using the negation is often confusing, e.g. excludeNotStrictlyRequired is unclear to me what it means (without reading the docs) if I set it to true because it contains two "negations", exclude and not so does this means a double negation so I better should choose false here?!? Also for me filtering always implies exclusion already (we are inside a <repositoryReferenceFilters> tag) , so I would assume an exclusion for any option here by default (the opposite of filtering would be selection as in SQL where you first SELECT some set and later on filter it with WHERE)

@HannesWell
Copy link
Member Author

An expressive name and a good description.

Longer names are not necessarily more expressive (but harder to remember / type), e.g. many mojos have a skip flag even though in many cases it could have been skipTest, skipCompile, skipExecutionOfMyMojo, ...

It is of course reasonable to strip of the context, but I didn't suggest excludeNotStrictlyRequiredRepositoryReferencesWhenAssemblingARepository :P

Especially for boolean "flags" using the negation is often confusing, e.g. excludeNotStrictlyRequired is unclear to me what it means (without reading the docs) if I set it to true because it contains two "negations", exclude and not so does this means a double negation so I better should choose false here?!? Also for me filtering always implies exclusion already (we are inside a <repositoryReferenceFilters> tag) , so I would assume an exclusion for any option here by default (the opposite of filtering would be selection as in SQL where you first SELECT some set and later on filter it with WHERE)

That's a points. I choose exclude as it fits the underlying process since the references are first added and then removed. But of course this does not have to be reflected here.
So maybe the opposite includeOnlyStrictlyRequired?

@laeubi
Copy link
Member

laeubi commented Sep 18, 2023

@HannesWell would be good if we can finish this I want to perform a new release end of month so this is included then.

@HannesWell
Copy link
Member Author

HannesWell commented Sep 28, 2023

This should now be almost ready. The only missing thing are to mention the new flag in the doc of repositoryReferenceFilters and the release notes.

I now changed the name of the property to addOnlyProviding, a full configuration then looks like:

<plugin>
	<groupId>org.eclipse.tycho</groupId>
	<artifactId>tycho-p2-repository-plugin</artifactId>
	<configuration>
		<repositoryReferenceFilters>
			<addOnlyProviding>true</addOnlyProviding>
		</repositoryReferenceFilters>
	</configuration>
</plugin>

tycho-core/pom.xml Outdated Show resolved Hide resolved
@HannesWell
Copy link
Member Author

This is now actually complete.
But at my final local testing with eclipse-m2e/m2e-core#1222, I noticed that the two fields of RepositoryReferenceFilter where not properly populated for the mojo if both are used together.
So although I have used the following configuration the exclude list was empty when the mojo was executed.

<plugin>
	<groupId>org.eclipse.tycho</groupId>
	<artifactId>tycho-p2-repository-plugin</artifactId>
	<configuration>
		<includeAllDependencies>true</includeAllDependencies>
		<includeAllSources>true</includeAllSources>
		<filterProvided>true</filterProvided>
		<addIUTargetRepositoryReferences>true</addIUTargetRepositoryReferences>
		<repositoryReferenceFilters>
			<addOnlyProviding>true</addOnlyProviding>
			<exclude>
				<location>https://download.eclipse.org/tools/orbit/**</location>
				<location>![%regex[http(s)?:\/\/download\.eclipse\.org\/.*]]</location>
			</exclude>
		</repositoryReferenceFilters>
	</configuration>

So either I'm doing something wrong or addOnlyProviding needs to be a separate attribute. Then the list of exclusions could also be a simple List.

@HannesWell
Copy link
Member Author

So either I'm doing something wrong or addOnlyProviding needs to be a separate attribute. Then the list of exclusions could also be a simple List.

I was doing something wrong. I used the old configuration where filters was used as plural.
The parameter is named repositoryReferenceFilter now without trailing s.

@laeubi
Copy link
Member

laeubi commented Sep 29, 2023

Good you find that out, so is this ready to be merged?

@HannesWell HannesWell merged commit 2a90039 into eclipse-tycho:master Sep 29, 2023
10 checks passed
@HannesWell
Copy link
Member Author

Good you find that out, so is this ready to be merged?

Yes it finally is.

@HannesWell HannesWell deleted the filterUnusedReferences branch September 29, 2023 06:07
@HannesWell HannesWell restored the filterUnusedReferences branch September 29, 2023 06:11
@HannesWell HannesWell deleted the filterUnusedReferences branch September 29, 2023 06:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-to-tycho-4.0.x Can be added to a PR to trigger an automatic backport of the change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants