Getting Started

This tutorial gives a brief introduction to how the ORT tools work together, using the mime-types NPM package as an example. It guides you through the main steps for running ORT:

  • Install ORT.
  • Analyze the dependencies of mime-types using the analyzer.
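  • Scan the source code of mime-types and its dependencies using the scanner.
  • Evaluate the scan result against policy rules using the evaluator.
  • Generate reports from the results using the reporter.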

1. Prerequisites

ORT is tested to run on Linux, macOS, and Windows. This tutorial assumes that you are running on Linux, but it should be easy to adapt the commands to macOS or Windows.

In addition to Java (version >= 11), some of the supported package managers and Version Control Systems require additional tools to be installed. This tutorial requires the following tools:

  • Git (any recent version will do)
  • Node.js 10.* or higher
  • NPM 6.4.* or higher
  • Yarn 1.9.* or higher

For the full list of supported package managers and Version Control Systems see the README.
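
Before building ORT it can help to confirm that the tools listed above are available on the PATH. The following is a minimal sketch and not part of ORT. The executable names (`java`, `git`, `node`, `npm`, `yarn`) are assumptions about a typical installation, and the check does not verify version numbers.

```python
import shutil

# Executable names are assumptions; adjust them to your installation.
REQUIRED_TOOLS = ["java", "git", "node", "npm", "yarn"]


def find_missing_tools(tools, which=shutil.which):
    """Return the subset of `tools` that cannot be located on the PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Missing tools: " + ", ".join(missing))
    else:
        print("All prerequisite tools were found on the PATH.")
```

The `which` parameter only exists so the lookup can be replaced in tests; in normal use the default `shutil.which` is sufficient.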

2. Download & Install ORT

In the future we will provide binaries of the ORT tools, but currently you have to build them yourself. First, download the source code (including Git submodules) from GitHub:

git clone --recurse-submodules https://github.com/oss-review-toolkit/ort.git

To build the command line interface run:

cd ort
./gradlew installDist

This will create the script to run ORT at cli/build/install/ort/bin/ort. To get the general command line help run it with the --help option:

cli/build/install/ort/bin/ort --help

3. Download the mime-types source code

Before scanning mime-types its source code has to be downloaded. For reliable results we use version 2.1.18 (replace [mime-types-dir] with the directory you want to clone mime-types to):

git clone https://github.com/jshttp/mime-types.git [mime-types-dir]
cd [mime-types-dir]
git checkout 2.1.18

4. Run the analyzer on mime-types

The next step is to run the analyzer. It will create a JSON or YAML output file containing the full dependency tree of mime-types including the metadata of mime-types and its dependencies.

# Command line help specific to the analyzer.
cli/build/install/ort/bin/ort analyze --help

# The easiest way to run the analyzer. Be aware that the [analyzer-output-dir] directory must not exist.
cli/build/install/ort/bin/ort analyze -i [mime-types-dir] -o [analyzer-output-dir]

# The command above will create the default YAML output. If you prefer JSON run:
cli/build/install/ort/bin/ort analyze -i [mime-types-dir] -o [analyzer-output-dir] -f JSON

# To get the maximum log output run:
cli/build/install/ort/bin/ort --debug --stacktrace analyze -i [mime-types-dir] -o [analyzer-output-dir]

The analyzer searches for build files of all supported package managers. In the case of mime-types, it finds the package.json file and writes the results of the dependency analysis to the output file analyzer-result.yml. The first attempt to run the analyzer on the mime-types package fails with an error message:

The following package managers are activated:
        Bower, Bundler, Cargo, Composer, DotNet, GoDep, Gradle, Maven, NPM, NuGet, PIP, SBT, Stack, Yarn
Analyzing project path:
        [mime-types-dir]
ERROR - Resolving dependencies for 'package.json' failed with: No lockfile found in '[mime-types-dir]'. This potentially results in unstable versions of dependencies. To allow this, enable support for dynamic versions.
Writing analyzer result to '[analyzer-output-dir]/analyzer-result.yml'.

This happens because mime-types does not have a package-lock.json file. Without this file, the versions of (transitive) dependencies that are defined with version ranges could change at any time, leading to different analyzer results. To override this check, use the global -P ort.analyzer.allowDynamicVersions=true option:

$ cli/build/install/ort/bin/ort -P ort.analyzer.allowDynamicVersions=true analyze -i [mime-types-dir] -o [analyzer-output-dir]
The following package managers are activated:
        Bundler, Composer, GoDep, Gradle, Maven, NPM, PIP, SBT, Stack, Yarn
Analyzing project path:
        [mime-types-dir]
Writing analyzer result to '[analyzer-output-dir]/analyzer-result.yml'.
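
The effect of a missing lockfile can be illustrated with a simplified model of a caret version range. This is a sketch only: the version lists are made up, the helper names are hypothetical, and the matching logic covers just plain `major.minor.patch` versions with a major version of 1 or higher. It is not how NPM or ORT resolve versions.

```python
def parse_version(text):
    """Turn '1.33.0' into the comparable tuple (1, 33, 0)."""
    return tuple(int(part) for part in text.split("."))


def satisfies_caret(version, base):
    """Simplified caret range: same major version and not older than `base`."""
    candidate, minimum = parse_version(version), parse_version(base)
    return candidate[0] == minimum[0] and candidate >= minimum


def resolve_caret(base, published):
    """Pick the highest published version inside the caret range, or None."""
    matching = [v for v in published if satisfies_caret(v, base)]
    return max(matching, key=parse_version, default=None)


# Hypothetical registry contents at two points in time.
registry_earlier = ["1.33.0", "1.34.0"]
registry_later = registry_earlier + ["1.35.0", "2.0.0"]

print(resolve_caret("1.33.0", registry_earlier))
print(resolve_caret("1.33.0", registry_later))
```

The same range resolves to a different version once a newer release is published, which is the instability the analyzer warns about. A lockfile pins the resolved version so that repeated runs see the same dependency tree.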

The result file contains information about the mime-types package itself, the dependency tree for each scope, and information about each dependency. The scope names come from the package managers: for NPM packages these are usually dependencies and devDependencies; for Maven packages they would be compile, runtime, test, and so on.

Note that analyzer-result.yml is supposed to capture all known information about a project, which can then be "filtered" in later steps. For example, scopes that are not relevant for the distribution are still listed, but can be configured as excluded so that, for instance, the scanner step does not download and scan them. To specify which scopes should be excluded, add an .ort.yml configuration file to the input directory of the analyzer. For more details see Configuration File.

For this guide, [mime-types-dir]/.ort.yml can be created with the following content:

excludes:
  scopes:
  - pattern: "devDependencies"
    reason: "DEV_DEPENDENCY_OF"
    comment: "Packages for development only."
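
To get a feeling for which scopes a given pattern would exclude, the matching can be modelled in a few lines. This sketch assumes that each pattern is a regular expression that has to match the whole scope name. That is an assumption made for illustration, the function name is hypothetical, and the code is not ORT's implementation.

```python
import re


def select_excluded_scopes(scope_names, exclude_patterns):
    """Return the scope names fully matched by at least one exclude pattern."""
    compiled = [re.compile(pattern) for pattern in exclude_patterns]
    return [
        name for name in scope_names
        if any(regex.fullmatch(name) for regex in compiled)
    ]


scopes = ["dependencies", "devDependencies"]

# Mirrors the single pattern from the .ort.yml above.
print(select_excluded_scopes(scopes, ["devDependencies"]))

# A broader pattern would exclude every scope.
print(select_excluded_scopes(scopes, [".*"]))
```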

The following is an overview of the structure of the analyzer-result.yml file (comments were added for clarity and are not part of a real result file):

# VCS information about the input directory.
repository:
  vcs:
    type: "Git"
    url: "https://github.com/jshttp/mime-types.git"
    revision: "7c4ce23d7354fbf64c69d7b7be8413c4ba2add78"
    path: ""
  vcs_processed:
    type: "Git"
    url: "https://github.com/jshttp/mime-types.git"
    revision: "7c4ce23d7354fbf64c69d7b7be8413c4ba2add78"
    path: ""
  # Will only be present if an '.ort.yml' configuration file with scope excludes was provided. Otherwise this is an empty object.
  config:
    excludes:
      scopes:
      - pattern: "devDependencies"
        reason: "DEV_DEPENDENCY_OF"
        comment: "Packages for development only."
# The analyzer result.
analyzer:
  # The time when the analyzer was executed.
  start_time: "2019-02-19T10:03:07.269Z"
  end_time: "2019-02-19T10:03:19.932Z"
  # Information about the environment the analyzer was run in.
  environment:
    ort_version: "331c32d"
    os: "Linux"
    variables:
      SHELL: "/bin/bash"
      TERM: "xterm-256color"
      JAVA_HOME: "/usr/lib/jvm/java-8-oracle"
    tool_versions: {}
  # Configuration options of the analyzer.
  config:
    ignore_tool_versions: false
    allow_dynamic_versions: true
  # The result of the dependency analysis.
  result:
    # Metadata about all found projects, in this case only the mime-types package defined by the package.json file.
    projects:
    - id: "NPM::mime-types:2.1.18"
      purl: "pkg://NPM//mime-types@2.1.18"
      definition_file_path: "package.json"
      declared_licenses:
      - "MIT"
      declared_licenses_processed:
        spdx_expression: "MIT"
      vcs:
        type: ""
        url: "https://github.com/jshttp/mime-types.git"
        revision: ""
        path: ""
      vcs_processed:
        type: "Git"
        url: "https://github.com/jshttp/mime-types.git"
        revision: "076f7902e3a730970ea96cd0b9c09bb6110f1127"
        path: ""
      homepage_url: ""
      # The dependency trees by scope.
      scopes:
      - name: "dependencies"
        dependencies:
        - id: "NPM::mime-db:1.33.0"
      - name: "devDependencies"
        dependencies:
        - id: "NPM::eslint-config-standard:10.2.1"
        - id: "NPM::eslint-plugin-import:2.8.0"
          dependencies:
          - id: "NPM::builtin-modules:1.1.1"
          - id: "NPM::contains-path:0.1.0"
            # If an issue occurred during the dependency analysis of this package there would be an additional "issues"
            # array.
# ...
# Detailed metadata about each package from the dependency trees.
    packages:
    - package:
        id: "NPM::abbrev:1.0.9"
        purl: "pkg://NPM//abbrev@1.0.9"
        declared_licenses:
        - "ISC"
        declared_licenses_processed:
          spdx_expression: "ISC"
        description: "Like ruby's abbrev module, but in js"
        homepage_url: "https://github.com/isaacs/abbrev-js#readme"
        binary_artifact:
          url: ""
          hash: ""
          hash_algorithm: ""
        source_artifact:
          url: "https://registry.npmjs.org/abbrev/-/abbrev-1.0.9.tgz"
          hash: "91b4792588a7738c25f35dd6f63752a2f8776135"
          hash_algorithm: "SHA-1"
        vcs:
          type: "Git"
          url: "git+ssh://git@github.com/isaacs/abbrev-js.git"
          revision: "c386cd9dbb1d8d7581718c54d4ba944cc9298d6f"
          path: ""
        vcs_processed:
          type: "Git"
          url: "ssh://git@github.com/isaacs/abbrev-js.git"
          revision: "c386cd9dbb1d8d7581718c54d4ba944cc9298d6f"
          path: ""
      curations: []
# ...
# Finally a list of project related issues that happened during dependency analysis. Fortunately empty in this case.
    issues: {}
# A field to quickly check if the analyzer result contains any issues.
    has_issues: false
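
Because the dependency trees are nested, a small recursive helper is a convenient way to list everything a scope pulls in. The sketch below assumes the analyzer was run with `-f JSON` so that the result can be read with the Python standard library. The sample data is a trimmed-down copy of the excerpt above, and the function names are hypothetical.

```python
import json


def collect_ids(dependencies):
    """Recursively collect the ids of all nodes in a dependency tree."""
    ids = []
    for dependency in dependencies:
        ids.append(dependency["id"])
        ids.extend(collect_ids(dependency.get("dependencies", [])))
    return ids


def ids_by_scope(ort_result):
    """Map each (project id, scope name) pair to the ids found in that scope."""
    scopes = {}
    for project in ort_result["analyzer"]["result"]["projects"]:
        for scope in project.get("scopes", []):
            key = (project["id"], scope["name"])
            scopes[key] = collect_ids(scope.get("dependencies", []))
    return scopes


# Trimmed-down sample mirroring the structure shown above.
SAMPLE = json.loads("""
{
  "analyzer": {
    "result": {
      "projects": [
        {
          "id": "NPM::mime-types:2.1.18",
          "scopes": [
            {"name": "dependencies",
             "dependencies": [{"id": "NPM::mime-db:1.33.0"}]},
            {"name": "devDependencies",
             "dependencies": [
               {"id": "NPM::eslint-config-standard:10.2.1"},
               {"id": "NPM::eslint-plugin-import:2.8.0",
                "dependencies": [
                  {"id": "NPM::builtin-modules:1.1.1"},
                  {"id": "NPM::contains-path:0.1.0"}
                ]}
             ]}
          ]
        }
      ]
    }
  }
}
""")

for (project_id, scope_name), ids in ids_by_scope(SAMPLE).items():
    print(project_id, scope_name, len(ids))
```

To apply this to a real result, replace `SAMPLE` with the dictionary returned by `json.load` for the JSON file the analyzer wrote.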

5. Run the scanner

To scan mime-types and its dependencies, their source code needs to be downloaded. The downloader tool could be used for this, but it is also integrated in the scanner tool, so the scanner automatically downloads the source code if the required VCS metadata could be obtained.

Note that if the downloader is unable to download the source code, for example because the package metadata lacks a source code location, you can use curations as a workaround.

To use curations, create a curations.yml and pass it to the --package-curations-file option of the analyzer:

cli/build/install/ort/bin/ort analyze \
  -i [mime-types-dir] \
  -o [analyzer-output-dir] \
  --package-curations-file $ORT_CONFIG_DIR/curations.yml

ORT is designed to integrate many different scanners and is not limited to license scanners: technically, any tool that explores the source code of a software package could be integrated. The actual scanner does not have to run on the same machine; for example, we will soon integrate the ClearlyDefined scanner backend, which will perform the actual scanning remotely.

For this tutorial we will use ScanCode. You do not have to install the tool manually; the scanner bootstraps it automatically.

As with the analyzer, you can get the command line options for the scanner using the --help option:

cli/build/install/ort/bin/ort scan --help

The mime-types package has only one dependency in the dependencies scope, but many dependencies in the devDependencies scope. Scanning all of the devDependencies would take a lot of time, so in this tutorial we only run the scanner on the dependencies scope. If you also want to scan the devDependencies, it is strongly advised to configure a scan storage for the scan results to speed up repeated scans.

Because an .ort.yml configuration file excluding devDependencies was provided during the analyzer step, the --skip-excluded option can be used to avoid downloading and scanning that scope.

$ cli/build/install/ort/bin/ort scan -i [analyzer-output-dir]/analyzer-result.yml -o [scanner-output-dir] --skip-excluded
Using scanner 'ScanCode'.
Limiting scan to scopes: [dependencies]
Bootstrapping scanner 'ScanCode' as required version 2.9.2 was not found in PATH.
Using processed VcsInfo(type=git, url=https://github.com/jshttp/mime-db.git, revision=482cd6a25bbd6177de04a686d0e2a0c2465bf445, resolvedRevision=null, path=).
Original was VcsInfo(type=git, url=git+https://github.com/jshttp/mime-db.git, revision=482cd6a25bbd6177de04a686d0e2a0c2465bf445, resolvedRevision=null, path=).
Running ScanCode version 2.9.2 on directory '[scanner-output-dir]/downloads/NPM/unknown/mime-db/1.35.0'.
Using processed VcsInfo(type=git, url=https://github.com/jshttp/mime-types.git, revision=7c4ce23d7354fbf64c69d7b7be8413c4ba2add78, resolvedRevision=null, path=).
Original was VcsInfo(type=, url=https://github.com/jshttp/mime-types.git, revision=, resolvedRevision=null, path=).
Running ScanCode version 2.9.2 on directory '[scanner-output-dir]/downloads/NPM/unknown/mime-types/2.1.18'.
Writing scan result to '[scanner-output-dir]/scan-result.yml'.

The scanner writes a new ORT result file to [scanner-output-dir]/scan-result.yml that contains the scan results in addition to the analyzer result from the input. This way, results that belong together are stored in the same place for traceability. If the input file already contained scan results, they are replaced by the new scan results in the output.

As you can see when checking the scan-result.yml file, the licenses detected by ScanCode match the licenses declared by the packages. This is because we scanned a small and well-maintained package in this example, but if you run the scan on a bigger project you will see that ScanCode often finds more licenses than are declared by the packages.

6. Run the evaluator

The evaluator can apply a set of rules against the scan result created above. ORT provides examples for the policy rules file (rules.kts), user-defined categorization of licenses (license-classifications.yml) and user-defined package curations (curations.yml) that can be used for testing the evaluator.

To run the example rules use:

cli/build/install/ort/bin/ort evaluate \
  --package-curations-file curations.yml \
  --rules-file rules.kts \
  --license-classifications-file license-classifications.yml \
  -i [scanner-output-dir]/scan-result.yml \
  -o [evaluator-output-dir]

See the curations.yml documentation to learn more about using curations to correct invalid or missing package metadata and the license-classifications.yml documentation on how you can classify licenses to simplify writing the policy rules.

It is possible to write your own evaluator rules as a Kotlin script and pass it to the evaluator using --rules-file. Note that detailed documentation for writing custom rules is not yet available.

7. Generate a report

The evaluation-result.yml file can now be used as input for the reporter to generate human-readable reports and open source notices.

For example, to generate a static HTML report, WebApp report and an open source notice by package, use:

$ cli/build/install/ort/bin/ort report \
  -f NoticeTemplate,StaticHtml,WebApp \
  -i [evaluator-output-dir]/evaluation-result.yml \
  -o [reporter-output-dir]
Created 'StaticHtml' report: [reporter-output-dir]/scan-report.html
Created 'WebApp' report: [reporter-output-dir]/scan-report-web-app.html
Created 'NoticeTemplate' report: [reporter-output-dir]/NOTICE_default

If you do not want to run the evaluator, you can pass the scanner result, e.g. [scanner-output-dir]/scan-result.yml, to the reporter instead. To learn how you can customize generated notices, see notice-templates.md. To learn how to customize the how-to-fix texts for scanner and analyzer issues, see how-to-fix-text-provider-kts.md.
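
The individual steps can also be chained from a script. The following sketch only assembles the command lines used in this tutorial and leaves running them to a separate function. The directory layout below `work_dir` and the function names are assumptions made for illustration, and the rules, curations and license classification files are expected in the current directory as in the evaluator example.

```python
import subprocess
from pathlib import Path

ORT = "cli/build/install/ort/bin/ort"


def build_pipeline(project_dir, work_dir, evaluate=True, allow_dynamic_versions=False):
    """Return the ORT command lines for analyze, scan, (evaluate,) and report."""
    work = Path(work_dir)
    analyzer_out = work / "analyzer"    # hypothetical layout below work_dir
    scanner_out = work / "scanner"
    evaluator_out = work / "evaluator"
    reporter_out = work / "reporter"

    # Global options such as -P go before the subcommand.
    global_options = []
    if allow_dynamic_versions:
        global_options = ["-P", "ort.analyzer.allowDynamicVersions=true"]

    commands = [
        [ORT, *global_options, "analyze", "-i", str(project_dir), "-o", str(analyzer_out)],
        [ORT, "scan", "-i", str(analyzer_out / "analyzer-result.yml"),
         "-o", str(scanner_out), "--skip-excluded"],
    ]

    # Without the evaluator, the reporter reads the scan result directly.
    report_input = scanner_out / "scan-result.yml"
    if evaluate:
        commands.append([
            ORT, "evaluate",
            "--package-curations-file", "curations.yml",
            "--rules-file", "rules.kts",
            "--license-classifications-file", "license-classifications.yml",
            "-i", str(report_input),
            "-o", str(evaluator_out),
        ])
        report_input = evaluator_out / "evaluation-result.yml"

    commands.append([
        ORT, "report", "-f", "NoticeTemplate,StaticHtml,WebApp",
        "-i", str(report_input), "-o", str(reporter_out),
    ])
    return commands


def run_pipeline(commands):
    """Run the commands in order and stop at the first failing step."""
    for command in commands:
        subprocess.run(command, check=True)
```

Calling `run_pipeline(build_pipeline("[mime-types-dir]", "[work-dir]", allow_dynamic_versions=True))` from the ORT checkout would then execute the steps one after another. Keeping command construction separate from execution makes it easy to print and inspect the commands first.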

8. Curating Package Metadata or License Findings

In the example above everything went well because the VCS information provided by the packages was correct, but this is not always the case. Often the metadata of packages has no VCS information, points to outdated repositories, or the repositories are not correctly tagged.

ORT provides several mechanisms to fix such issues. For details, see:

  • The .ort.yml file - project-specific license finding curations, exclusions and resolutions to address issues found within a project's code repository.
  • The package configuration file - package (dependency) and provenance specific license finding curations and exclusions to address issues found within a scan result for a package.
  • The curations.yml file - curations correct invalid or missing package metadata and set the concluded license for packages.
  • The resolutions.yml file - resolutions allow resolving any issues or policy rule violations by providing a reason why they are acceptable and can be ignored.