-
Notifications
You must be signed in to change notification settings - Fork 80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prune unused dependencies from manifests generated from spago.dhall files #667
Conversation
@@ -16,6 +16,7 @@ | |||
"package.json" | |||
], | |||
"dependencies": { | |||
"purescript-prelude": "^6.0.0" | |||
"purescript-prelude": "^6.0.0", | |||
"purescript-type-equality": "^4.0.0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an unused dependency; since we solve for it, though, we need to add it to the local registry, registry-index, and storage fixtures (hence the other added files).
Right string -> Env.askResourceEnv >>= \{ dhallTypes } -> Run.liftAff (jsonToDhallManifest dhallTypes string) >>= case _ of | ||
Left error -> do | ||
Log.error $ "Manifest does not typecheck: " <> error | ||
Except.throw $ "Found a valid purs.json file in the package source, but it does not typecheck." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We used to always write a purs.json file, and then read it back and verify the dhall types. Now, we only verify the dhall types when a purs.json file already exists. That's because we already have tests in place to ensure that our manifest format complies with the dhall types, and in all cases except this one we're using our manifest codec to write the file.
Otherwise, this is the same as how we've always read the purs.json file out of the package source code.
Log.debug $ "Read a valid purs.json manifest from the package source:\n" <> stringifyJson Manifest.codec manifest | ||
pure manifest | ||
|
||
else if hasSpagoYaml then do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This section is unchanged other than returning rather than writing the manifest.
Log.debug "Successfully converted a spago.yaml into a purs.json manifest" | ||
pure manifest | ||
else do | ||
Comment.comment $ "Package source does not have a purs.json file. Creating one from your bower.json and/or spago.dhall files..." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This section is unchanged other than returning rather than writing the manifest. That allows us to verify and modify it before writing it later on.
-- Now that we've verified the package we can write the manifest to the source | ||
-- directory and then publish it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is where the divergence between the old and new implementations begins.
let depGlobs = Path.concat [ tmpDepsDir, "*", "src", "**", "*.purs" ] | ||
let command = Purs.Graph { globs: [ srcGlobs, depGlobs ] } | ||
-- We need to use the minimum compiler version that supports 'purs graph' | ||
let minGraphCompiler = unsafeFromRight (Version.parse "0.13.8") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
purs graph
was introduced in 0.13.8, though that implementation had a bug in which it listed all transitive dependencies instead of direct dependencies only. In our case that doesn't matter: we want the transitive dependencies anyway.
We need to do our best to associate the call to 'purs graph' with a compiler version that can parse its source, so a package on 0.14.1 should use a purs graph call from the 0.14.0 series.
At this moment I'm allowing failed purs graph calls to just let the package through anyway if it's via the legacy importer since we can't accurately determine what compilers the source code might be valid with. However, when we implement #255 then we will be able to choose a correct compiler and can remove that escape hatch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! A small code-location note, but it's otherwise good to go
let | ||
-- We need access to a graph that _doesn't_ include the package | ||
-- source, because we only care about dependencies of the package. | ||
noSrcGraph = Map.filter (isNothing <<< String.stripPrefix (String.Pattern packageDirectory) <<< _.path) graph | ||
|
||
pathParser = map _.name <<< parseInstalledModulePath <<< { prefix: tmpDepsDir, path: _ } | ||
|
||
case PursGraph.associateModules pathParser noSrcGraph of | ||
Left errs -> | ||
Except.throw $ String.joinWith "\n" | ||
[ "Failed to associate modules with packages while finding unused dependencies:" | ||
, flip NonEmptyArray.foldMap1 errs \{ error, module: ModuleName moduleName, path } -> | ||
" - " <> moduleName <> " (" <> path <> "): " <> error <> "\n" | ||
] | ||
Right modulePackageMap -> do | ||
Log.debug "Associated modules with their package names. Finding all modules used in package source..." | ||
-- The modules used in the package source code are any that have | ||
-- a path beginning with the package source directory. We only | ||
-- care about dependents of these modules. | ||
let sourceModules = Map.keys $ Map.filter (isJust <<< String.stripPrefix (String.Pattern packageDirectory) <<< _.path) graph | ||
|
||
Log.debug "Found all modules used in package source. Finding all modules used by those modules..." | ||
-- We find all dependencies of each module used in the source, | ||
-- which results in a set containing every module name reachable | ||
-- by the source code. | ||
let allReachableModules = Set.fromFoldable $ Array.fold $ Array.mapMaybe (flip PursGraph.allDependencies graph) $ Set.toUnfoldable sourceModules | ||
|
||
-- Then we can associate each reachable module with its package | ||
-- name to get the full set of used package names. | ||
let allUsedPackages = Set.mapMaybe (flip Map.lookup modulePackageMap) allReachableModules |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This chunk could probably sit in the Graph module - it's pretty close to what Spago does for unused packages, and could probably be reused?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't add it there because we're relying on specific directory names and path formats to pull out the source vs. non-source code, so the only part that feels "shareable" is determining the reachable modules and used packages (lines 631-635); if you think that's usable by Spago I could move it to the Graph module but I'll fully defer to you on that. If there's no active plan to reuse it then I'll keep it where it is; it's easy to copy the implementation.
Fixes #662. The registry API will now detect unused dependencies in manifests generated from spago.dhall files. Those dependencies will be removed from the manifest before it is written to disk and we kick off the publishing process.
The general problem is that spago.dhall files frequently contain test dependencies and we have no way to distinguish a test from a non-test dependency from the config file alone. The main reason why these dependencies are mixed together is that spago.dhall files have historically been internal-only: the package set repository contained the "correct" list of source-only dependencies (though this is frequently incorrect, too). When we import a legacy package and it contains a spago.dhall file then we just take its dependencies at face value, but the result is over-constrained package dependencies as seen in #662.
This PR prunes unused dependencies from the manifests generated from legacy packages. Specifically, we prune unused dependencies any time we use the
fetchLegacyManifest
function, because that creates a manifest from the aggregate of bower.json, spago.dhall, and package-sets files.The pruning process is as follows. As soon as we have the project source code and its generated manifest, we:
purs graph
(only usable from 0.13.8 onward, FYI) to determine, for any given module name, its path and the modules it directly depends on.path
segment — we can do this because packages are installed in a standard form, ie.<tmp>/<package-name>-<version>/...
The result is a
Set PackageName
representing packages that are actually in use. We can then walk through the generated manifest checking each dependency to see if it is in the "used packages" set; if not, then it is removed from the manifest.Finally, we write that updated manifest to disk and proceed with publishing the package. Publishing the package involves re-solving the manifest, downloading dependencies, and compiling, so we can be sure that the adjusted package indeed works. (This doesn't affect packages with a purs.json or spago.yaml file — they are solved only once, since we don't have to prune unused dependencies from them).
Internal Notes
PursGraph
module that Spago can reuse if it wants, as well as acheckUnusedDependencies
validation function, but I don't see any need to couple the code bases together over the use ofpurs graph
as a utility. Spago has a more varied set of needs than the registry does. I would prefer not to try and share something likecheckImports
between the code bases.type-equality
in theeffect@4.0.0
fixture and added a step to verify that the packaged tarball contains the correct, adjusted manifest.