Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python purl URLs seem incorrect, some examples don't work #627

Closed
unchris opened this issue Nov 30, 2023 · 7 comments
Closed

Python purl URLs seem incorrect, some examples don't work #627

unchris opened this issue Nov 30, 2023 · 7 comments
Assignees

Comments

@unchris
Copy link

unchris commented Nov 30, 2023

Across this repo, the examples for Python PURLs uses pkg:pip but the purl-spec says it should be pkg:pypi.

Here's a code search showing all the instances of this.

I've been trying to use this action in a new workflow to deny, for example, pycrypto, which is one of the examples in the action.yml among other places:

      - uses: actions/checkout@v4
      - name: "Dependency Review"
        uses: actions/dependency-review-action@v3
        with:
          fail-on-severity: high
          comment-summary-in-pr: "on-failure"
          deny-packages: "pkg:pip/pycrypto, pkg:pypi/pycrypto, pkg:PYPI/pycrypto, pkg:PIP/pycrypto, pycrypto"

I've tried a number of variants but none of them blocks pycrypto. After adding debug logging though, I found this is because the pycrypto purl is blank in the run:

{"change_type":"added",
"manifest":"pyproject.toml",
"ecosystem":"pip",
"name":"pycrypto",
"version":"",
"package_url":"",
"license":null,
"source_repository_url":"https://github.com/pycrypto/pycrypto",
"scope":"runtime",
"vulnerabilities":[]},

here's one it gets right:

{"change_type":"added",
"manifest":"pyproject.toml",
"ecosystem":"pip",
"name":"howso-engine",
"version":"10.0.0",
"package_url":"pkg:pypi/howso-engine@10.0.0",
"license":null,
"source_repository_url":"https://github.com/howsoai/howso-engine-py",
"scope":"runtime",
"vulnerabilities":[]},

even weirder is what it comes up with for python itself:

{"change_type":"added",
"manifest":"pyproject.toml",
"ecosystem":"pip",
"name":"python",
"version":">= 3.8,< 3.12",
"package_url":"",
"license":null,
"source_repository_url":"https://github.com/mathspp/building-a-python-compiler-and-interpreter",
"scope":"runtime",
"vulnerabilities":[]},

This is in a pyproject.toml-based project using poetry. Here's the dependency section for completeness:

[tool.poetry.dependencies]
python = ">=3.8,<3.12"
cowsay = "~=5.0"
requests = "*"
opentelemetry-instrumentation = "0.39b0"
howso-engine = "10.0.0"
pycrypto = "*"

(the weird set of dependencies was chosen specifically to trigger warnings, this project exists to test Github advanced security and related tooling like this dependency review action 😂 )

@febuiles
Copy link
Contributor

febuiles commented Dec 1, 2023

@unchris do you see pycrypto when you go to github.com///network/dependencies if you merge the PR? If so, can you share a screenshot of the row? Looking at pycrypto's project's setup.cfg the name is set dynamically, and I'm wondering Dependency Graph is not able to properly parse the package name.

@unchris
Copy link
Author

unchris commented Dec 1, 2023

I created and merged a smaller PR that only had the following dependencies:

[tool.poetry.dependencies]
python = ">=3.9,<3.12"
cowsay = "~=5.0"
pycrypto = "*"

This is what I see in the Dependency Graph for python deps:

Screenshot 2023-12-01 at 10 22 28 AM

Here's the output from the PR:

Run actions/dependency-review-action@v3
  with:
    fail-on-severity: high
    comment-summary-in-pr: on-failure
    deny-packages: pkg:pypi/pycrypto
    deny-licenses: AGPL-3.0
    repo-token: ***
    retry-on-snapshot-warnings: false
    retry-on-snapshot-warnings-timeout: 120
Dependency review did not detect any denied packages
Vulnerabilities
  Dependency review did not detect any vulnerable packages with severity level "high" or higher.
Licenses
  
  We could not detect a license for the following dependencies:
  pyproject.toml » pycrypto@
Denied
  Config: pkg:pypi/pycrypto
Dependency Changes
  File: pyproject.toml
  + pycrypto@

@febuiles
Copy link
Contributor

febuiles commented Dec 5, 2023

@unchris thanks for sharing more details on your side.

@jovel & co. Dependency Graph detects the package pycrypto as being defined in https://github.com/pycrypto/pycrypto/blob/master/setup.py (I'm not sure how since it's dynamically assigned). I think two questions need to be answered:

  1. Is the API returning the wrong PURL for Python packages? (pip when it should be pypi?)
  2. How should Dependency Graph treat archived repositories when it comes to providing packages. Is the expectation that archived repositories will still be queryable through the API for use-cases like this action?

(Unassigning myself, do not have cycles atm, might be able to pick up later this week)

@febuiles febuiles removed their assignment Dec 5, 2023
@febuiles
Copy link
Contributor

febuiles commented Dec 11, 2023

@unchris Upon further investigation I realized that the version must be specified if we want to get a PURL back from the API. I created a sample PR here: future-funk/bug-free-adventure#1

~/w/dr-action (main)
$ gh api repos/future-funk/bug-free-adventure/dependency-graph/compare/main...febuiles-patch-1
[
  {
    "change_type": "added",
    "manifest": "pyproject.toml",
    "ecosystem": "pip",
    "name": "pycrypto",
    "version": "2.6.1",
    "package_url": "pkg:pypi/pycrypto@2.6.1",
    "license": null,
    "source_repository_url": "https://github.com/pycrypto/pycrypto",
    "scope": "runtime",
    "vulnerabilities": [
      {
        "severity": "critical",
        "advisory_ghsa_id": "GHSA-cq27-v7xp-c356",
        "advisory_summary": "Buffer Overflow in pycrypto",
        "advisory_url": "https://github.com/advisories/GHSA-cq27-v7xp-c356"
      },
      {
        "severity": "high",
        "advisory_ghsa_id": "GHSA-6528-wvf6-f6qg",
        "advisory_summary": "Pycrypto generates weak key parameters",
        "advisory_url": "https://github.com/advisories/GHSA-6528-wvf6-f6qg"
      }
    ]
  },

As you can see, the returned PURL is properly scoped to pypi too, which means you can use pkg:pypi/pycrypto in your denylist to block all [explicitly defined] versions.

Without a version number it's impossible to know if a set of vulnerabilities applies. I would recommend including the poetry.lock file to the repository, this should yield a specific version number every time if you're worried asterisks are used in your pyproject.toml manifest file.

Please re-open if I missed anything!

@febuiles febuiles self-assigned this Dec 11, 2023
@unchris
Copy link
Author

unchris commented Dec 11, 2023

@febuiles Thanks for this, when I use a version number in the pyproject.toml file it works now.

The original PR that I used did also submit the poetry.lock file which had pycrypto pinned at 2.6.1. I'm not sure what part of the dependency graph comparison is missing the lockfile. It does appear to be supported as you're saying. 🤷🏻‍♂️

As for what's missed: basically just doc and test changes I'd guess? The code search in my initial comment shows a number of places where the PURL is showing pkg:pip when it should be pkg:pypi - the most important reference probably being in the action.yml here.

Thanks again for your help!

@febuiles
Copy link
Contributor

The code search in my initial comment shows a number of places where the PURL is showing pkg:pip when it should be pkg:pypi

Ooof, thanks for this callout, I misinterpreted the original comment. I've merged this PR to fix this issue.

The original PR that I used did also submit the poetry.lock file which had pycrypto pinned at 2.6.1

If you can reproduce this in a public repo I'd be happy to take a look into it.

@unchris
Copy link
Author

unchris commented Dec 11, 2023

If you can reproduce this in a public repo I'd be happy to take a look into it.

No need - it looks like among the many variations I tested, every one of them had a mistake:

  • first attempt used private registries unknown to the dependency graph
  • second attempt used pkg:pip when I was trying to change the PURL in case this project's documentation had it right
  • third and later attempts corrected the pip -> pypi mistake, but also included additional deps in case pycrypto was the issue
    • e.g., I used something like pkg:pypi/wheel pkg:pypi/pycrypto on those attempts instead of pkg:pypi/wheel,pkg:pypi/pycrypto

When I use pkg:pypi/pycrypto and the poetry.lock and pyproject.toml are committed with pycrypto = "*" it all works fine. Not sure how I never managed to test a "perfect" configuration, but such is life.

Thanks again for all your help!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants