Reduce the effect of different base package configuration on component scanning performance and make the effect more intuitive [SPR-16649] #21190

Closed
spring-projects-issues opened this issue Mar 27, 2018 · 1 comment
Labels: in: core (issues in core modules: aop, beans, core, context, expression), type: enhancement (a general enhancement)

@spring-projects-issues

Andy Wilkinson opened SPR-16649 and commented

The base package configuration used for component scanning can have a significant impact on how long the scan takes. In some cases the effect of narrowing the scan is counter-intuitive: scanning fewer packages actually takes longer. I've attached a small sample that reproduces the behaviour that I'll describe below.

In a large application (200 packages, each with 50 classes), scanning all 200 packages takes 601ms when those packages are available directly on the file system:

$ ./gradlew run -Ppackages=single

> Task :run
Scanning single took 601ms


BUILD SUCCESSFUL in 10s
2 actionable tasks: 2 executed

If the scanning is narrowed to the 100 packages that are of interest, the time taken decreases to 403ms:

$ ./gradlew run -Ppackages=multi

> Task :run
Scanning multi took 403ms


BUILD SUCCESSFUL in 1s
2 actionable tasks: 1 executed, 1 up-to-date

Halving the number of packages that are scanned has reduced the time taken by roughly a third.

If the application is packaged as a jar file, the time taken to scan all 200 packages increases slightly to 657ms:

$ ./gradlew distZip && unzip build/distributions/component-scanning-performance.zip -d build/distributions && build/distributions/component-scanning-performance/bin/component-scanning-performance single

BUILD SUCCESSFUL in 2s
4 actionable tasks: 3 executed, 1 up-to-date
Archive:  build/distributions/component-scanning-performance.zip
   creating: build/distributions/component-scanning-performance/
   creating: build/distributions/component-scanning-performance/lib/
  inflating: build/distributions/component-scanning-performance/lib/component-scanning-performance.jar
  inflating: build/distributions/component-scanning-performance/lib/spring-context-5.0.4.RELEASE.jar
  inflating: build/distributions/component-scanning-performance/lib/spring-aop-5.0.4.RELEASE.jar
  inflating: build/distributions/component-scanning-performance/lib/spring-beans-5.0.4.RELEASE.jar
  inflating: build/distributions/component-scanning-performance/lib/spring-expression-5.0.4.RELEASE.jar
  inflating: build/distributions/component-scanning-performance/lib/spring-core-5.0.4.RELEASE.jar
  inflating: build/distributions/component-scanning-performance/lib/spring-jcl-5.0.4.RELEASE.jar
   creating: build/distributions/component-scanning-performance/bin/
  inflating: build/distributions/component-scanning-performance/bin/component-scanning-performance
  inflating: build/distributions/component-scanning-performance/bin/component-scanning-performance.bat
Scanning single took 657ms

If we then narrow the scan to focus on the 100 packages of interest, the time taken for the scan increases significantly to 1084ms:

$ build/distributions/component-scanning-performance/bin/component-scanning-performance multi
Scanning multi took 1084ms

On the surface, I find it unintuitive that narrowing the set of packages to scan makes the scan take longer when the packages are in a jar file. The problem is compounded by the opposite behaviour on the file system, where narrowing the scan makes it faster. This means that, for optimal scanning performance, you may need one configuration during development and test and another in production.

The scan is slower in the jar file case because, when a base package resides in a jar file, the whole jar is scanned. When the scan is narrowed by providing 100 sub-packages rather than a single parent package, the whole jar is therefore scanned 100 times rather than once. Would it be possible to provide an entry point to scanning that takes multiple base packages? Then, if multiple base packages resolve to the same jar, the jar could be scanned once to find matches across all the base packages.
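
A minimal sketch of what such an entry point might look like, using only the JDK's `java.util.jar` API. The class `MultiPackageJarScan` and its methods `group` and `scan` are hypothetical names chosen for illustration; none of this is Spring API, and it assumes that matching a class to a base package is a simple path-prefix check on the jar entry name.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper, not part of Spring: a single pass over a jar's
// entries serves every requested base package.
final class MultiPackageJarScan {

    // Groups .class entry names by base package. An entry is recorded under
    // every base package that contains it.
    static Map<String, List<String>> group(Iterable<String> entryNames, List<String> basePackages) {
        Map<String, List<String>> matches = new LinkedHashMap<>();
        for (String basePackage : basePackages) {
            matches.put(basePackage, new ArrayList<>());
        }
        for (String name : entryNames) {
            if (!name.endsWith(".class")) {
                continue;
            }
            for (Map.Entry<String, List<String>> match : matches.entrySet()) {
                // "com.example.a" -> "com/example/a/"
                String prefix = match.getKey().replace('.', '/') + "/";
                if (name.startsWith(prefix)) {
                    match.getValue().add(name);
                }
            }
        }
        return matches;
    }

    // Reads the jar's entry names once, however many base packages are given.
    static Map<String, List<String>> scan(String jarPath, List<String> basePackages) throws IOException {
        List<String> names = new ArrayList<>();
        try (JarFile jar = new JarFile(jarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
        }
        return group(names, basePackages);
    }
}
```

With this shape, the cost of walking the jar is paid once per jar rather than once per base package; only the comparatively cheap prefix check is repeated for each package.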


Affects: 4.3.14, 5.0.4

Attachments:

0 votes, 5 watchers

@spring-projects-issues spring-projects-issues added the type: enhancement A general enhancement label Jan 11, 2019
@spring-projects-issues spring-projects-issues added this to the 5.x Backlog milestone Jan 11, 2019
@rstoyanchev rstoyanchev added the in: core Issues in core modules (aop, beans, core, context, expression) label Jul 26, 2021
@jhoeller jhoeller self-assigned this Mar 7, 2024
@jhoeller jhoeller modified the milestones: 6.x Backlog, 6.2.x, 6.2.0-M1 Mar 7, 2024
@jhoeller

jhoeller commented Mar 8, 2024

I've introduced custom root directory and jar caching in PathMatchingResourcePatternResolver now, bringing scanning performance for all individual subpackages to the same level as a single scan for the root package. No new API necessary (aside from a clearCache() method for the application context to call on refresh completion), and the caching applies to any individual scanning attempts against the same PathMatchingResourcePatternResolver instance (usually the shared one in the context).
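
To illustrate the general idea of per-instance jar caching, here is a rough sketch using only the JDK. It is not the code in `PathMatchingResourcePatternResolver`; the class `CachingJarIndex`, its `entriesUnder` and `jarReads` methods, and the prefix-based matching are assumptions made for this example. Only the `clearCache()` name mirrors the method mentioned above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Path;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.stream.Collectors;

// Illustration only, not Spring's implementation: entry names are read
// once per jar and reused by every later lookup on the same instance.
final class CachingJarIndex {

    private final Map<Path, List<String>> entriesByJar = new ConcurrentHashMap<>();

    // Counts actual jar reads so that the effect of the cache is observable.
    private final AtomicInteger jarReads = new AtomicInteger();

    // Returns the entry names under the given base package, reading the jar
    // only if it has not been indexed yet.
    List<String> entriesUnder(Path jar, String basePackage) {
        String prefix = basePackage.replace('.', '/') + "/";
        List<String> all = entriesByJar.computeIfAbsent(
                jar.toAbsolutePath().normalize(), this::readEntryNames);
        return all.stream()
                .filter(name -> name.startsWith(prefix))
                .collect(Collectors.toList());
    }

    // Drops all cached jar contents, e.g. once scanning has completed.
    void clearCache() {
        entriesByJar.clear();
    }

    int jarReads() {
        return jarReads.get();
    }

    private List<String> readEntryNames(Path jar) {
        jarReads.incrementAndGet();
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            return jarFile.stream().map(JarEntry::getName).collect(Collectors.toList());
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }
}
```

In this sketch, 100 calls to `entriesUnder` for 100 sub-packages of the same jar would read the jar once, which is the behaviour the comment above describes for repeated scans against a shared resolver instance.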
