This repository has been archived by the owner on Apr 1, 2022. It is now read-only.

Add support for Yarn v2 projects #244

Merged Jun 4, 2021
32 commits
1cdef17
WIP - add parsers for locator and descriptor
cnr May 23, 2021
2c8311f
WIP - define json/yaml parsers
cnr May 25, 2021
5abb9cd
Use the best parser
cnr May 25, 2021
12a3b26
Add LockfileV2 and Resolvers modules
cnr May 25, 2021
342b503
Add resolvers and partial analysis logic
cnr May 27, 2021
2bc3d06
Use real diagnostics
cnr May 27, 2021
41ac122
Implement graph-building logic
cnr May 27, 2021
653e59e
Add tar resolver
cnr May 27, 2021
3bcac7b
use better message for unsupported locator
cnr May 27, 2021
3d75f50
Add remaining unhandled resolvers
cnr May 28, 2021
96f0118
format cabal file
cnr May 28, 2021
23031f7
Add devdocs for yarn v2
cnr Jun 1, 2021
2ba18c6
Add a couple of links to yarnv2 devdocs
cnr Jun 1, 2021
aa4070d
Add docs to LockfileV2
cnr Jun 1, 2021
eac91a2
Add docs to Resolvers
cnr Jun 1, 2021
4c928cf
Mention hack for npm: protocol lookup
cnr Jun 1, 2021
72dc874
Reorganize yarn strategy module layout
cnr Jun 1, 2021
b329091
Strategy.Npm: ignore directories and subdirectories when yarn.lock is…
cnr Jun 1, 2021
bff5f1b
Yarn: try V1 lockfile then V2 lockfile
cnr Jun 1, 2021
915dcf3
Factor out duplication in unsupported resolvers
cnr Jun 1, 2021
41c61f3
Copy editing
cnr Jun 1, 2021
4e17b1a
Add tests for resolvers
cnr Jun 1, 2021
316f5ee
Kill redundant resolver names
cnr Jun 1, 2021
ef4c432
Add end-to-end test for the example lockfile
cnr Jun 1, 2021
8a78159
Update changelog
cnr Jun 1, 2021
9195ffc
Add more links to yarnv2 devdocs
cnr Jun 4, 2021
0b8c7fd
Kill scary comment at the top of test yarn.lock
cnr Jun 4, 2021
e3e7303
Add tests for Data.Text.Extra.dropPrefix
cnr Jun 4, 2021
f858a6b
Allow dependency versions as numbers
cnr Jun 4, 2021
052d18d
Strip selectors from npm locators
cnr Jun 4, 2021
878fc2e
Add workaround for semver range coalescing
cnr Jun 4, 2021
5a0340e
Merge remote-tracking branch 'origin/master' into yarn-v2
cnr Jun 4, 2021
9 changes: 7 additions & 2 deletions Changelog.md
@@ -1,15 +1,20 @@
# Unreleased
# 2.7.1

- Adds support for Yarn v2 lockfiles ([#244](https://github.com/fossas/spectrometer/pull/244))
- Fixes the dependency version parser for `.csproj`, `.vbproj`, and similar .NET files ([#247](https://github.com/fossas/spectrometer/pull/247))
- Re-enables status messages for commands like `fossa test` in CI environments ([#248](https://github.com/fossas/spectrometer/pull/248))

# v2.7.0

- Adds support for the Conda package manager ([#226](https://github.com/fossas/spectrometer/pull/226))

# v2.6.1

- Adds --follow to the vps analyze subcommand, which allows for following symbolic links during VPS scans. ([#243](https://github.com/fossas/spectrometer/pull/243))

# v2.6.0

- Improves the output of `fossa analyze` by displaying the status of ongoing Project Discovery and Project Analysis tasks ([#241](https://github.com/fossas/spectrometer/pull/241))
- Improves the output of `fossa analyze` by displaying the status of ongoing Project Discovery and Project Analysis tasks ([#239](https://github.com/fossas/spectrometer/pull/239))

# v2.5.18

189 changes: 189 additions & 0 deletions devdocs/buildtools/yarnv2.org
@@ -0,0 +1,189 @@

Contributor Author

This file is currently formatted as an Emacs org-mode document, because it's substantially easier for me to edit that way. I'll make sure to convert it to markdown before merging.

* Overview
Yarn is a buildtool primarily used for building and managing JavaScript projects. Functionally, it's a superset of the =npm= CLI.

Yarn uses the same package manifest file as =npm= -- =package.json= -- but uses a novel lockfile format to pin dependencies, saved as =yarn.lock=.

For dependency analysis, we focus exclusively on the lockfile.

* Lockfile (=yarn.lock=)
As part of the update from Yarn v1 to Yarn v2, some major changes were made to the lockfile. Most notably:

+ The lockfile [[https://dev.to/arcanis/introducing-yarn-2-4eh1#new-lockfile-format][is now real YAML]]. Yarn v1 used an almost-but-not-quite pseudo-YAML format.
+ While Yarn v1's lockfile contained information only about dependencies of user projects, Yarn v2's lockfile is much more information-rich. It contains information about first-party user projects ("workspaces"), the version ranges specified in =package.json= for dependencies ("descriptors"), and the resolved version for each dependency ("locators").

** Concepts
*** Workspaces
Workspaces are first-party package directories (directories that contain =package.json=). Workspaces are always available locally on disk, and are specified by a relative reference to a directory (e.g., =.= or =./foo/bar= or =../baz=).

A yarn project can have several workspaces, where workspaces may (but *are not required to*) depend on each other in a DAG. This is similar to "multi-module projects" in other buildtools like Maven or Go modules.

Every yarn project will contain at least one workspace.

See: [[https://yarnpkg.com/features/workspaces]]

*** Locators
/Somewhat/ similar to FOSSA locators, a yarn locator is an unambiguous reference to a specific version of a package and where to find it.

Locators have three components:
+ Package scope (optional) -- like =@babel= -- a scope on npm.
+ Package name -- like =underscore=.
+ Package reference -- which can vary in shape depending on where the package is coming from. For example, this could be a pointer to a specific package version on the npm registry, a pointer to a git repo at a specific commit, or a link to a tarball.

For the purposes of dependency resolution, the package scope and name in a locator *are unused*. Only the package reference matters.

Yarn supports a handful of reference types by default, and plugins can be added to support new reference types. See the =Resolvers= section below.

See: https://yarnpkg.com/advanced/lexicon#locator
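
A minimal sketch of how the three locator components might be modeled and parsed in Haskell. The =Locator= type, its field names, and =parseLocator= are hypothetical names chosen for illustration; they are not the definitions used in this PR. The sketch assumes the scope is stored without its leading =@= and that the reference is everything after the first =@= that follows the package name.

```haskell
-- Hypothetical type for illustration only (not the PR's definition).
data Locator = Locator
  { locatorScope :: Maybe String -- ^ scope without the leading '@', if any
  , locatorName :: String
  , locatorReference :: String -- ^ everything after the separating '@'
  }
  deriving (Eq, Show)

-- | Parse strings shaped like "name@reference" or "@scope/name@reference".
parseLocator :: String -> Maybe Locator
parseLocator ('@' : rest) =
  case break (== '/') rest of
    (scope, '/' : nameAndRef)
      | not (null scope) -> do
          (name, ref) <- splitNameAndReference nameAndRef
          Just (Locator (Just scope) name ref)
    _ -> Nothing
parseLocator input = do
  (name, ref) <- splitNameAndReference input
  Just (Locator Nothing name ref)

-- | Split on the first '@'; both halves must be non-empty.
splitNameAndReference :: String -> Maybe (String, String)
splitNameAndReference s =
  case break (== '@') s of
    (name, '@' : ref)
      | not (null name), not (null ref) -> Just (name, ref)
    _ -> Nothing
```

With these assumptions, =parseLocator "underscore@npm:1.13.1"= yields a locator with no scope, the name =underscore=, and the reference =npm:1.13.1=. Splitting on the first =@= keeps references that themselves contain =@= (such as some git URLs) intact.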

*** Descriptors
Descriptors are similar to locators, but may point to a /range/ of package versions. For the purposes of dependency analysis, we don't care much about the shape and content of descriptors.

Descriptors have three components:
+ Package scope (optional) -- like =@babel= -- a scope on npm.
+ Package name -- like =underscore=.
+ Package range -- which can vary in shape depending on where the package is coming from. For example, this could be a semver range for a package on the npm registry, a pointer to a git repo on a branch, or a link to a tarball.

All locators are valid descriptors; not all descriptors are valid locators.

See: https://yarnpkg.com/advanced/lexicon#descriptor

*** Resolvers
Plugins are the yarn v2 mechanism used to add support for, among other things, new types of locators.

A plugin can export zero or more "Resolvers", each of which can add support for new types of locator. Yarn itself implements support for "built-in" locator types (npm dependencies, git dependencies, etc) as resolvers in bundled plugins.

For dependency analysis, we support locators produced by [[https://github.com/yarnpkg/berry/blob/8afcaa2a954e196d6cd997f8ba506f776df83b1f/packages/yarnpkg-cli/package.json#L68-L82][all of the built-in plugins]].

See: https://yarnpkg.com/advanced/lexicon#resolver
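
As a rough illustration of what "supporting a locator type" can mean on the analysis side, the sketch below classifies a locator's reference by its shape. Everything here is an assumption made for the example: the =Resolved= type, =resolveReference=, and the three recognized shapes (=npm:=, =workspace:=, and a URL ending in =#commit==) are hypothetical and much simpler than yarn's resolver interface or the resolvers added in this PR.

```haskell
import Data.List (stripPrefix)

-- Hypothetical result type for illustration only.
data Resolved
  = NpmPackage String -- ^ version on the npm registry
  | WorkspaceDir String -- ^ relative path to a workspace directory
  | GitCommit String String -- ^ repository URL and commit hash
  | Unsupported String -- ^ reference we do not know how to handle
  deriving (Eq, Show)

-- | Classify a locator reference by trying each known shape in turn.
resolveReference :: String -> Resolved
resolveReference ref
  | Just version <- stripPrefix "npm:" ref = NpmPackage version
  | Just dir <- stripPrefix "workspace:" ref = WorkspaceDir dir
  | Just (url, commit) <- splitOnMarker "#commit=" ref = GitCommit url commit
  | otherwise = Unsupported ref

-- | Split a string around the first occurrence of a marker substring.
splitOnMarker :: String -> String -> Maybe (String, String)
splitOnMarker marker = go ""
  where
    go _ [] = Nothing
    go acc s@(c : cs) =
      case stripPrefix marker s of
        Just rest -> Just (reverse acc, rest)
        Nothing -> go (c : acc) cs
```

Returning an explicit =Unsupported= value, rather than failing, leaves room for reporting an unsupported locator with a useful message.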
** Format
#+BEGIN_SRC yaml
# This file is generated by running "yarn install" inside your project.
# Manual changes might be lost - proceed with caution!

__metadata:
  version: 4
  cacheKey: 7

"bar@workspace:bar":
  version: 0.0.0-use.local
  resolution: "bar@workspace:bar"
  dependencies:
    underscore: 1.13.1
  languageName: unknown
  linkType: soft

"foo@workspace:foo":
  version: 0.0.0-use.local
  resolution: "foo@workspace:foo"
  dependencies:
    underscore: ^1.13.0
  languageName: unknown
  linkType: soft

"quux@workspace:quux":
  version: 0.0.0-use.local
  resolution: "quux@workspace:quux"
  dependencies:
    underscore: "jashkenas/underscore#tag=1.13.1"
  languageName: unknown
  linkType: soft

"toplevel@workspace:.":
  version: 0.0.0-use.local
  resolution: "toplevel@workspace:."
  languageName: unknown
  linkType: soft

"underscore@jashkenas/underscore#tag=1.13.1":
  version: 1.13.1
  resolution: "underscore@https://github.com/jashkenas/underscore.git#commit=cbb48b79fc1205aa04feb03dbc055cdd28a12652"
  checksum: 560609fdb4ba2c30e79db95ea37269982d1a2788d49b78f0de4f391da711bc2495d5fbddd6d24e7716fccf69959e445916af83eb5de1ad137b215777e2d32e4d
  languageName: node
  linkType: hard

"underscore@npm:1.13.1, underscore@npm:^1.13.0":
  version: 1.13.1
  resolution: "underscore@npm:1.13.1"
  checksum: 19527b2db3d34f783c3f2db9716a2c1221fef2958866925545697c46f430f59d1b384b8105cc7e7c809bdf0dc9075f2bfff90b8fb270b9d3a6c58347de2dd79d
  languageName: node
  linkType: hard

#+END_SRC

Ignoring the =__metadata= field, the yarn lockfile is a mapping from =a comma-separated list of descriptors= to a =package description=.
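
Because one top-level key can list several descriptors, it may help to expand the mapping so that every descriptor is its own key. The following is a minimal sketch of that step, assuming the lockfile has already been decoded into a =Map= keyed by the raw key strings and that no descriptor contains a comma. The names =splitDescriptors= and =expandKeys= are hypothetical.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)
import qualified Data.Map.Strict as Map

-- | Split a raw lockfile key such as
-- "underscore@npm:1.13.1, underscore@npm:^1.13.0" into its descriptors.
-- Assumes descriptors never contain a comma themselves.
splitDescriptors :: String -> [String]
splitDescriptors key = filter (not . null) (map trim (splitOn ',' key))

-- | Split a string on every occurrence of a separator character.
splitOn :: Char -> String -> [String]
splitOn sep s =
  case break (== sep) s of
    (chunk, []) -> [chunk]
    (chunk, _ : rest) -> chunk : splitOn sep rest

-- | Drop leading and trailing whitespace.
trim :: String -> String
trim = dropWhileEnd isSpace . dropWhile isSpace

-- | Re-key the lockfile so each descriptor maps to its package description.
expandKeys :: Map.Map String v -> Map.Map String v
expandKeys lockfile =
  Map.fromList
    [ (descriptor, pkg)
    | (key, pkg) <- Map.toList lockfile
    , descriptor <- splitDescriptors key
    ]
```

After expansion, both =underscore@npm:1.13.1= and =underscore@npm:^1.13.0= point at the same package description, so later lookups only need exact key matches.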

*** Package description fields

Of a package's fields, we only care about =resolution= and =dependencies=.

**** =resolution=
The locator used for this package.

**** =dependencies=
An optional field containing =package: descriptor-range= mappings for each dependency of the package. *This includes dev dependencies* if they were included when running =yarn install=.

This field is a functionally identical copy of a package's =dependencies= and =devDependencies= fields in =package.json=. The code that parses a =Package description= [[https://github.com/yarnpkg/berry/blob/0d9834036d6a3747d6c0dbb5c11e27568f7194dc/packages/yarnpkg-core/sources/Project.ts#L284][is the same code]] that parses dependencies in a =package.json= file.

Full dependency descriptors [[https://github.com/yarnpkg/berry/blob/0d9834036d6a3747d6c0dbb5c11e27568f7194dc/packages/yarnpkg-core/sources/Manifest.ts#L307-L326][can be reconstructed]] by joining key-value pairs on =@=: =underscore: ^1.13.0= is =underscore@^1.13.0=. Each dependency's descriptor is a key for a package at the top level of the yarn lockfile.

#+BEGIN_QUOTE
*NOTE*: a fun note about dependency descriptors

A keen eye may notice that in the lockfile above, some top-level descriptor keys contain =npm:=. For example, there's =underscore@npm:1.13.1= -- but that descriptor isn't used anywhere as a dependency. The closest is =underscore@1.13.1=, a dependency of the =bar= workspace.

In an interesting design decision, yarn makes the default resolver for packages configurable. When a user provides a raw version (e.g., =1.13.1=) or semver (=^1.13.1=) for a dependency in =package.json=, a "default protocol" string is prepended to the descriptor range. This option [[https://next.yarnpkg.com/configuration/yarnrc#defaultProtocol][is configured]] as =defaultProtocol=, which defaults to =npm:=.

As a workaround, when using a descriptor =name@range= to look up a package in the lockfile, we must also try =name@npm:range=.
#+END_QUOTE
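
The lookup described in the note can be sketched as follows. This is an illustration under stated assumptions, not the code in this PR: the lockfile is simplified to a =Map= from a single descriptor to its =resolution= string, and =Lockfile=, =mkDescriptor=, =lookupResolution=, and =exampleLockfile= are hypothetical names. The sketch hard-codes =npm:= as the fallback protocol.

```haskell
import Control.Applicative ((<|>))
import qualified Data.Map.Strict as Map

-- | Simplified lockfile: one descriptor per key, mapped to its resolution.
type Lockfile = Map.Map String String

-- | Rebuild a full descriptor from a "name: range" dependency entry.
mkDescriptor :: String -> String -> String
mkDescriptor name range = name ++ "@" ++ range

-- | Look up "name@range" first, then fall back to "name@npm:range".
lookupResolution :: Lockfile -> String -> String -> Maybe String
lookupResolution lockfile name range =
  Map.lookup (mkDescriptor name range) lockfile
    <|> Map.lookup (mkDescriptor name ("npm:" ++ range)) lockfile

-- | Entries taken from the example lockfile above, one descriptor per key.
exampleLockfile :: Lockfile
exampleLockfile =
  Map.fromList
    [ ("underscore@npm:1.13.1", "underscore@npm:1.13.1")
    , ("underscore@npm:^1.13.0", "underscore@npm:1.13.1")
    , ( "underscore@jashkenas/underscore#tag=1.13.1"
      , "underscore@https://github.com/jashkenas/underscore.git#commit=cbb48b79fc1205aa04feb03dbc055cdd28a12652"
      )
    ]
```

With this table, the =bar= workspace's dependency =underscore: 1.13.1= misses on =underscore@1.13.1= and then resolves through =underscore@npm:1.13.1=, while the =quux= dependency matches its key directly on the first attempt.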

*** Lockfile sources
The above lockfile was generated from the following files:

=package.json=
#+BEGIN_SRC json
{
  "name": "toplevel",
  "private": true,
  "workspaces": [
    "foo",
    "bar",
    "quux"
  ]
}
#+END_SRC

=foo/package.json=
#+BEGIN_SRC json
{
  "name": "foo",
  "version": "1.0.0",
  "dependencies": {
    "underscore": "^1.13.0"
  }
}
#+END_SRC

=bar/package.json=
#+BEGIN_SRC json
{
  "name": "bar",
  "version": "1.0.0",
  "dependencies": {
    "underscore": "1.13.1"
  }
}
#+END_SRC

=quux/package.json=

Note that =name/repo= is implicitly treated as a GitHub repo reference.
#+BEGIN_SRC json
{
  "name": "quux",
  "version": "1.0.0",
  "dependencies": {
    "underscore": "jashkenas/underscore#tag=1.13.1"
  }
}
#+END_SRC
40 changes: 24 additions & 16 deletions spectrometer.cabal
@@ -33,7 +33,6 @@ common lang
GADTSyntax
GeneralizedNewtypeDeriving
HexFloatLiterals
ImportQualifiedPost
InstanceSigs
KindSignatures
MultiParamTypeClasses
@@ -46,12 +45,13 @@ common lang
RankNTypes
ScopedTypeVariables
StandaloneDeriving
StandaloneKindSignatures
StrictData
TupleSections
TypeApplications
TypeOperators
TypeSynonymInstances
ImportQualifiedPost
StandaloneKindSignatures
Contributor

[nit]: did cabal-fmt put these out-of-order?

Contributor Author

yeah, I'm not sure why it's doing that 😕


ghc-options:
-Wall -Wincomplete-uni-patterns -Wcompat
@@ -123,14 +123,15 @@ library

-- cabal-fmt: expand src
exposed-modules:
Algebra.Graph.AdjacencyMap.Extra
App.Fossa.API.BuildLink
App.Fossa.API.BuildWait
App.Fossa.Analyze
App.Fossa.Analyze.Graph
App.Fossa.Analyze.GraphBuilder
App.Fossa.Analyze.GraphMangler
App.Fossa.Analyze.Project
App.Fossa.Analyze.Record
App.Fossa.API.BuildLink
App.Fossa.API.BuildWait
App.Fossa.Compatibility
App.Fossa.Configuration
App.Fossa.Container
@@ -161,20 +162,20 @@ library
App.Version
App.Version.TH
Console.Sticky
Control.Carrier.AtomicCounter
Control.Carrier.AtomicState
Control.Carrier.Diagnostics
Control.Carrier.Diagnostics.StickyContext
Control.Carrier.Finally
Control.Carrier.AtomicCounter
Control.Carrier.Output.IO
Control.Carrier.StickyLogger
Control.Carrier.TaskPool
Control.Carrier.Threaded
Control.Effect.AtomicCounter
Control.Effect.AtomicState
Control.Effect.ConsoleRegion
Control.Effect.Diagnostics
Control.Effect.Finally
Control.Effect.AtomicCounter
Control.Effect.Output
Control.Effect.Path
Control.Effect.Record
@@ -184,6 +185,7 @@ library
Control.Effect.StickyLogger
Control.Effect.TaskPool
Control.Exception.Extra
Data.Aeson.Extra
Data.FileEmbed.Extra
Data.Flag
Data.Functor.Extra
@@ -242,7 +244,6 @@ library
Strategy.Node.NpmList
Strategy.Node.NpmLock
Strategy.Node.PackageJson
Strategy.Node.YarnLock
Strategy.Npm
Strategy.NuGet.Nuspec
Strategy.NuGet.PackageReference
@@ -255,13 +256,17 @@ library
Strategy.Python.SetupPy
Strategy.Python.Setuptools
Strategy.Python.Util
Strategy.Rebar3
Strategy.RPM
Strategy.Rebar3
Strategy.Ruby.BundleShow
Strategy.Ruby.GemfileLock
Strategy.Scala
Strategy.Yarn
Strategy.UserSpecified.YamlDependencies
Strategy.Yarn
Strategy.Yarn.V1.YarnLock
Strategy.Yarn.V2.Lockfile
Strategy.Yarn.V2.Resolvers
Strategy.Yarn.V2.YarnLock
Text.URI.Builder
Types
VCS.Git
@@ -292,9 +297,9 @@ test-suite unit-tests
-- cabal-fmt: expand test
other-modules:
App.Fossa.API.BuildLinkSpec
App.Fossa.Configuration.ConfigurationSpec
App.Fossa.Report.AttributionSpec
App.Fossa.VPS.NinjaGraphSpec
App.Fossa.Configuration.ConfigurationSpec
Cargo.MetadataSpec
Carthage.CarthageSpec
Clojure.ClojureSpec
@@ -317,15 +322,14 @@ test-suite unit-tests
Go.TransitiveSpec
Googlesource.RepoManifestSpec
Gradle.GradleSpec
GraphingSpec
GraphUtil
GraphingSpec
Haskell.CabalSpec
Haskell.StackSpec
Maven.PluginStrategySpec
Maven.PomStrategySpec
Node.NpmLockSpec
Node.PackageJsonSpec
Node.YarnLockSpec
NuGet.NuspecSpec
NuGet.PackageReferenceSpec
NuGet.PackagesConfigSpec
@@ -340,11 +344,15 @@ test-suite unit-tests
Ruby.BundleShowSpec
Ruby.GemfileLockSpec
UserSpecified.YamlDependenciesSpec
Yarn.V2.LockfileSpec
Yarn.V2.ResolversSpec
Yarn.YarnLockV1Spec

build-tool-depends: hspec-discover:hspec-discover ^>=2.7.1
build-depends:
, hedgehog ^>=1.0.2
, hspec ^>=2.7.1
, hspec-hedgehog ^>=0.0.1.2
, hspec-megaparsec ^>=2.1
, hedgehog ^>=1.0.2
, hspec ^>=2.7.1
, hspec-expectations-pretty-diff ^>=0.7.2.5
, hspec-hedgehog ^>=0.0.1.2
, hspec-megaparsec ^>=2.1
, spectrometer
19 changes: 19 additions & 0 deletions src/Algebra/Graph/AdjacencyMap/Extra.hs
@@ -0,0 +1,19 @@
module Algebra.Graph.AdjacencyMap.Extra (
  gtraverse,
) where

import Algebra.Graph.AdjacencyMap qualified as AM
import Data.Set qualified as S

-- | It's 'traverse', but for graphs
--
-- It's also unlawful. 'f' might be called several times for each node in the graph
gtraverse ::
  (Applicative f, Ord b) =>
  (a -> f b) ->
  AM.AdjacencyMap a ->
  f (AM.AdjacencyMap b)
gtraverse f = fmap mkAdjacencyMap . traverse (\(a, xs) -> (,) <$> f a <*> traverse f xs) . AM.adjacencyList
  where
    mkAdjacencyMap :: Ord c => [(c, [c])] -> AM.AdjacencyMap c
    mkAdjacencyMap = AM.fromAdjacencySets . fmap (fmap S.fromList)
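
To make the "called several times for each node" caveat concrete, here is a small sketch that uses only `base`. It swaps the adjacency map for a plain adjacency list, so `gtraverseList`, `visit`, and `example` are hypothetical stand-ins rather than part of this module. The traversal has the same shape as `gtraverse`, and the pair applicative (`Monoid w => Applicative ((,) w)`) acts as a simple log of every call to the visiting function.

```haskell
-- | List-based stand-in for 'gtraverse': apply the function to each node
-- and to every entry of that node's successor list.
gtraverseList :: Applicative f => (a -> f b) -> [(a, [a])] -> f [(b, [b])]
gtraverseList f = traverse (\(a, xs) -> (,) <$> f a <*> traverse f xs)

-- | Log the visited node and double it. The log is the first component
-- of the pair, combined with list append by the pair applicative.
visit :: Int -> ([Int], Int)
visit n = ([n], n * 2)

-- | A three-node graph: 1 -> 2, 1 -> 3, 2 -> 3.
example :: ([Int], [(Int, [Int])])
example = gtraverseList visit [(1, [2, 3]), (2, [3]), (3, [])]
```

Evaluating `example` gives the log `[1,2,3,2,3,3]` alongside the relabelled list `[(2,[4,6]),(4,[6]),(6,[])]`: node 3 is visited three times, once as a vertex and once for each incoming edge. If the effect must run once per node, the results would need to be cached, for example by traversing the vertex set first and then mapping the graph with the collected results.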
Comment on lines +16 to +19
Contributor

[optional, readability]: I hate the (,) constructor so much, it's really annoying to read. I usually use tuplify, since names are easier to read in prefix position than operators in parentheses.

Suggested change
- gtraverse f = fmap mkAdjacencyMap . traverse (\(a, xs) -> (,) <$> f a <*> traverse f xs) . AM.adjacencyList
-   where
-     mkAdjacencyMap :: Ord c => [(c, [c])] -> AM.AdjacencyMap c
-     mkAdjacencyMap = AM.fromAdjacencySets . fmap (fmap S.fromList)
+ gtraverse f = fmap mkAdjacencyMap . traverse (\(a, xs) -> tuplify <$> f a <*> traverse f xs) . AM.adjacencyList
+   where
+     tuplify a b = (a, b)
+     mkAdjacencyMap :: Ord c => [(c, [c])] -> AM.AdjacencyMap c
+     mkAdjacencyMap = AM.fromAdjacencySets . fmap (fmap S.fromList)
