Import tree #2751

Merged
merged 73 commits into from
May 14, 2024

@janmasrovira janmasrovira commented Apr 22, 2024

New commands:

  1. dev import-tree scan FILE. Scans a single file and lists all the imports in it.
  2. dev import-tree print. Scans all files in the package and its dependencies, builds an import dependency tree, and prints it to stdout. If the --stats flag is given, it also reports the number of scanned modules, the number of unique imports, and the length of the longest import chain.

Example: this is the truncated output of juvix dev import-tree print --stats in the juvix-stdlib directory.

[...]
Stdlib/Trait/Partial.juvix imports Stdlib/Data/String/Base.juvix
Stdlib/Trait/Partial.juvix imports Stdlib/Debug/Fail.juvix
Stdlib/Trait/Show.juvix imports Stdlib/Data/String/Base.juvix
index.juvix imports Stdlib/Cairo/Poseidon.juvix
index.juvix imports Stdlib/Data/Int/Ord.juvix
index.juvix imports Stdlib/Data/Nat/Ord.juvix
index.juvix imports Stdlib/Data/String/Ord.juvix
index.juvix imports Stdlib/Prelude.juvix

Import Tree Statistics:
=======================
• Total number of modules: 56
• Total number of edges: 193
• Height (longest chain of imports): 15
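The three statistics can be recomputed from the printed edge list. The following is a minimal Python sketch, not the Juvix implementation (which is written in Haskell). The function names are hypothetical, and it assumes that "height" counts the modules on the longest chain and that the import graph is acyclic. Modules that neither import nor are imported do not appear in the edge list, so this sketch would not count them.

```python
from collections import defaultdict
from functools import lru_cache


def parse_edges(lines):
    """Parse lines of the form 'A.juvix imports B.juvix' into (A, B) pairs."""
    edges = []
    for line in lines:
        if " imports " in line:
            importer, imported = line.strip().split(" imports ", 1)
            edges.append((importer, imported))
    return edges


def import_tree_stats(edges):
    """Return module count, unique edge count and height of an import graph."""
    graph = defaultdict(set)  # importer -> set of imported modules
    modules = set()
    for importer, imported in edges:
        graph[importer].add(imported)
        modules.update((importer, imported))

    @lru_cache(maxsize=None)
    def height(module):
        # Assumption: height = number of modules on the longest chain
        # starting at `module`. Recursion terminates only on acyclic graphs.
        return 1 + max((height(dep) for dep in graph.get(module, ())), default=0)

    return {
        "modules": len(modules),
        "edges": sum(len(deps) for deps in graph.values()),
        "height": max((height(m) for m in modules), default=0),
    }


sample = [
    "Stdlib/Trait/Partial.juvix imports Stdlib/Data/String/Base.juvix",
    "Stdlib/Trait/Partial.juvix imports Stdlib/Debug/Fail.juvix",
    "Stdlib/Trait/Show.juvix imports Stdlib/Data/String/Base.juvix",
    "index.juvix imports Stdlib/Prelude.juvix",
]
stats = import_tree_stats(parse_edges(sample))
```

On these four sample lines the sketch sees 6 modules, 4 edges, and a height of 2; the full stdlib output above would need the untruncated edge list.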

Both commands support the --scan-strategy flag, which determines which parser is used to scan the imports. The possible values are:

  1. flatparse. It uses the low-level FlatParse parsing library. This parser is written specifically to parse imports and ignores everything else, so we expect it to perform much better. It does not produce error messages.
  2. megaparsec. It uses the regular Juvix parser and simply collects the imports from its result.
  3. flatparse-megaparsec (default). It uses the flatparse backend and falls back to megaparsec if flatparse fails.
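The default strategy is a "fast path, then full parser" pattern. The Python sketch below only illustrates that control flow. Both scanners are toy stand-ins (they are not FlatParse or Megaparsec), the import syntax they recognise is a simplifying assumption, and every function name is hypothetical. Only the three strategy names come from the list above.

```python
import re

# Assumption: an import is a line starting with `import Some.Module.Name`.
IMPORT_RE = re.compile(r"^\s*import\s+([A-Za-z_][\w.]*)")


class ScanError(Exception):
    """Raised when no scanner could process the source."""


def fast_scan(source):
    """Toy fast scanner: returns None (no error message) on anything unusual."""
    if "{-" in source:  # this toy scanner gives up on block comments
        return None
    return [m.group(1) for line in source.splitlines()
            if (m := IMPORT_RE.match(line))]


def full_scan(source):
    """Toy full parser: handles comments and reports a descriptive error."""
    stripped = re.sub(r"\{-.*?-\}", "", source, flags=re.S)
    if "{-" in stripped:
        raise ScanError("unterminated block comment")
    imports = []
    for line in stripped.splitlines():
        m = IMPORT_RE.match(line.split("--", 1)[0])  # drop line comments
        if m:
            imports.append(m.group(1))
    return imports


def scan_imports(source, strategy="flatparse-megaparsec"):
    if strategy == "megaparsec":
        return full_scan(source)
    result = fast_scan(source)
    if result is not None:
        return result
    if strategy == "flatparse":
        raise ScanError("fast scanner failed (no further details available)")
    return full_scan(source)  # fall back to the slower, more complete parser
```

The design trade-off is that the common case pays only for the fast scanner, while files it cannot handle still get a proper error message from the full parser.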

Internal changes

Megaparsec Parser (Concrete.FromSource)

In order to be able to run the parser during the scanning phase, I've adjusted some of the effects used in the parser:

  1. I've removed the NameIdGen and Files constraints, which were unused.
  2. I've removed Reader EntryPoint. It was used to get the ModuleId. Now the ModuleId is generated during scoping.
  3. I've replaced PathResolver by the TopModuleNameChecker effect. This new effect, as the name suggests, only checks the name of the module (same rules as we had in the PathResolver before). It is also possible to ignore the effect, which is needed if we want to use this parser without an entrypoint.
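As an illustration of what a name-only check might look like, here is a small Python sketch. It assumes the rule is that the declared top module name must equal the file path relative to the package root, with directory separators turned into dots. The exact rules are not spelled out in this PR, and the function name and the `enabled` switch are hypothetical stand-ins for running or ignoring the effect.

```python
from pathlib import PurePosixPath

# Both extensions are mentioned in this PR; longest first so ".juvix.md" wins.
EXTENSIONS = (".juvix.md", ".juvix")


def check_top_module_name(declared_name, file_path, package_root, enabled=True):
    """Raise ValueError if the declared module name does not match the path.

    With enabled=False the check is skipped, mirroring the case where the
    parser is used without an entry point.
    """
    if not enabled:
        return
    rel = PurePosixPath(file_path).relative_to(package_root)
    stem = rel.name
    for ext in EXTENSIONS:
        if stem.endswith(ext):
            stem = stem[: -len(ext)]
            break
    expected = ".".join(rel.parts[:-1] + (stem,))
    if declared_name != expected:
        raise ValueError(
            f"module is declared as {declared_name!r} but its path implies {expected!r}"
        )
```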

PathResolver effect refactor

  1. The WithPath command has been removed.
  2. New command ResolvePath :: ImportScan -> PathResolver m (PackageInfo, FileExt). Useful for resolving imports during scanning phase.
  3. New command WithResolverRoot :: Path Abs Dir -> m a -> PathResolver m a. Useful for switching package context.
  4. New command GetPackageInfos :: PathResolver m (HashMap (Path Abs Dir) PackageInfo), which returns a table with all packages. Useful to scan all dependencies.
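To show how these three commands could work together during scanning, here is a rough Python analogy. It is not the Haskell effect, and the class, its methods, and the lookup order (current root first, then every other known package) are assumptions made for illustration only.

```python
from contextlib import contextmanager


class ToyResolver:
    """Python stand-in for a path resolver with a switchable package root."""

    def __init__(self, packages):
        # packages: {package_root: set of file paths relative to that root}
        self.packages = packages
        self.root = None

    def get_package_infos(self):
        """Analogue of GetPackageInfos: the table of all known packages."""
        return dict(self.packages)

    @contextmanager
    def with_root(self, root):
        """Analogue of WithResolverRoot: switch context, restore on exit."""
        previous, self.root = self.root, root
        try:
            yield self
        finally:
            self.root = previous

    def resolve(self, module_name):
        """Analogue of ResolvePath: return (package_root, file_extension)."""
        rel = module_name.replace(".", "/")
        roots = [self.root] if self.root is not None else []
        roots += [r for r in self.packages if r != self.root]
        for root in roots:
            for ext in (".juvix", ".juvix.md"):
                if rel + ext in self.packages[root]:
                    return root, ext
        raise LookupError(f"cannot resolve import {module_name!r}")


resolver = ToyResolver({
    "/app": {"Main.juvix"},
    "/deps/stdlib": {"Stdlib/Prelude.juvix", "Doc.juvix.md"},
})

# Visit every package, switching the root so imports resolve in its context.
resolved = {}
for package_root in resolver.get_package_infos():
    with resolver.with_root(package_root):
        resolved[package_root] = resolver.resolve("Stdlib.Prelude")
```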

The Package.PathResolver has been refactored to be more like the normal PathResolver. I've discussed with @paulcadman the possibility of unifying both implementations in the near future.

Misc

  1. Package.juvix no longer ends up in PackageInfo.packageRelativeFiles.
  2. I've introduced string definitions for --, {- and -}.
  3. I've fixed a bug where .juvix.md was detected as an invalid extension.
  4. I've added LazyHashMap to the prelude. I've also added ordSet to create ordered Sets, ordMap for ordered maps, etc.

Benchmarks

I've profiled juvix dev import-tree --scan-strategy [megaparsec | flatparse] --stats with optimization enabled.
In the images below we see that in the megaparsec case, the scanning takes 54.8% of the total time, whereas in the flatparse case it only takes 9.6% of the total time.

  • Megaparsec: [profile image]

  • Flatparse: [profile image]

Hyperfine

hyperfine --warmup 1 'juvix dev import-tree print --scan-strategy flatparse --stats' 'juvix dev import-tree print --scan-strategy megaparsec --stats' --min-runs 20
Benchmark 1: juvix dev import-tree print --scan-strategy flatparse --stats
  Time (mean ± σ):      82.0 ms ±   4.5 ms    [User: 64.8 ms, System: 17.3 ms]
  Range (min … max):    77.0 ms … 102.4 ms    37 runs

Benchmark 2: juvix dev import-tree print --scan-strategy megaparsec --stats
  Time (mean ± σ):     174.1 ms ±   2.7 ms    [User: 157.5 ms, System: 16.8 ms]
  Range (min … max):   169.7 ms … 181.5 ms    20 runs

Summary
  juvix dev import-tree print --scan-strategy flatparse --stats ran
    2.12 ± 0.12 times faster than juvix dev import-tree print --scan-strategy megaparsec --stats

In order to compare (almost) only the parsing, I've forced the scanning of each file to be performed 50 times, so that the cost of the other parts becomes negligible by comparison. Here are the results:

hyperfine --warmup 1 'juvix dev import-tree print --scan-strategy flatparse --stats' 'juvix dev import-tree print --scan-strategy megaparsec --stats' --min-runs 10
Benchmark 1: juvix dev import-tree print --scan-strategy flatparse --stats
  Time (mean ± σ):     189.5 ms ±   3.6 ms    [User: 161.7 ms, System: 27.6 ms]
  Range (min … max):   185.1 ms … 197.1 ms    15 runs

Benchmark 2: juvix dev import-tree print --scan-strategy megaparsec --stats
  Time (mean ± σ):      5.113 s ±  0.023 s    [User: 5.084 s, System: 0.035 s]
  Range (min … max):    5.085 s …  5.148 s    10 runs

Summary
  juvix dev import-tree print --scan-strategy flatparse --stats ran
   26.99 ± 0.52 times faster than juvix dev import-tree print --scan-strategy megaparsec --stats
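The two summary lines can be sanity-checked from the reported means and standard deviations. The Python sketch below uses standard first-order error propagation for a ratio; this is assumed here to be close to what hyperfine does, not taken from its source. Because the inputs are the rounded values printed above, the last digit can differ slightly from the reported figures.

```python
import math


def speedup(fast_mean, fast_sd, slow_mean, slow_sd):
    """Return (ratio, uncertainty) for slow_mean / fast_mean.

    The uncertainty combines the two relative standard deviations in
    quadrature and scales the result by the ratio.
    """
    ratio = slow_mean / fast_mean
    rel_err = math.sqrt((fast_sd / fast_mean) ** 2 + (slow_sd / slow_mean) ** 2)
    return ratio, ratio * rel_err


# All times in milliseconds, copied from the hyperfine output above.
single_ratio, single_err = speedup(82.0, 4.5, 174.1, 2.7)       # one scan per file
repeat_ratio, repeat_err = speedup(189.5, 3.6, 5113.0, 23.0)    # 50 scans per file

print(f"single scan: {single_ratio:.2f} ± {single_err:.2f}")
print(f"50x scan:    {repeat_ratio:.2f} ± {repeat_err:.2f}")
```

The first ratio comes out at roughly 2.12 and the second at roughly 27, which agrees with the summaries within rounding.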

@janmasrovira janmasrovira self-assigned this Apr 22, 2024
@janmasrovira janmasrovira force-pushed the module-dependencies branch 10 times, most recently from e9fdf36 to e4edd58 Compare April 29, 2024 15:18
@janmasrovira janmasrovira force-pushed the module-dependencies branch 2 times, most recently from 5f578fb to 6bf20bf Compare April 30, 2024 14:22
@janmasrovira janmasrovira marked this pull request as ready for review May 2, 2024 21:23
@paulcadman paulcadman force-pushed the module-dependencies branch from bbf0a29 to ab431ba Compare May 3, 2024 09:05
@paulcadman paulcadman self-requested a review May 3, 2024 09:05
@janmasrovira janmasrovira force-pushed the module-dependencies branch from 04761f7 to 066a83c Compare May 3, 2024 11:44
@janmasrovira janmasrovira force-pushed the module-dependencies branch from 4642763 to 512f972 Compare May 3, 2024 16:42
@janmasrovira janmasrovira force-pushed the module-dependencies branch from 512f972 to e88d3ab Compare May 3, 2024 17:00
paulcadman
paulcadman previously approved these changes May 13, 2024
@janmasrovira janmasrovira merged commit 7c59e2a into main May 14, 2024
4 checks passed
@janmasrovira janmasrovira deleted the module-dependencies branch May 14, 2024 08:53