- "Lock" all stdlib names that are currently unallocated on PyPI, so they can't be claimed in the future. (Create test suite stdlib-list#28 makes this easy.)
- Delete the junkiest of the names listed in this issue: pprint, dis, select, time and trace.
- Lock most of the listed packages at their current state, and work with any maintainers to determine whether they are happy to voluntarily have the PyPI package removed, especially renamed packages where the old name is unlikely to be used any more, such as AST, calendar and chunk.
- Create a register of stdlib names which are currently populated with either
  - a backport, prior art, etc.;
  - an unrelated package that overlaps the stdlib name, if the maintainers have a good reason to keep publishing under the current name and have reasonable protections in place to prevent and detect abuse, e.g. DateTime, maybe Wave, etc.; or
  - a dummy package, if this is the approach taken to "lock" packages.
- Create a process for naming stdlib packages which requires that the name on PyPI is either unused and can be locked, or is a backport/prior art/etc. added to the register.
- (maybe) Also include a process for unlocking stdlib package names, probably resembling the process for reallocating a PyPI name to a new maintainer in PEP 541.
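A rough sketch of how the unallocated names in the first suggestion could be found. It assumes PyPI's JSON endpoint `https://pypi.org/pypi/<name>/json`, which answers 404 when no project exists under that name. The helper names `exists_on_pypi` and `unallocated_stdlib_names` are hypothetical, and skipping underscore-prefixed modules is an assumption made here, not part of the proposal.

```python
import urllib.error
import urllib.request


def exists_on_pypi(name: str) -> bool:
    """Return True if PyPI's JSON API serves a project page for `name`."""
    url = f"https://pypi.org/pypi/{name}/json"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            # No project registered under this name.
            return False
        raise


def unallocated_stdlib_names(names, exists=exists_on_pypi):
    """Return the sorted names that have no PyPI project.

    `exists` is injectable so the logic can be exercised without network access.
    Private modules (leading underscore) are skipped; that is an assumption.
    """
    return sorted(n for n in names if not n.startswith("_") and not exists(n))
```

On Python 3.10+ this could be driven with `unallocated_stdlib_names(sys.stdlib_module_names)`, which makes one HTTP request per name. A 404 only shows that no project page exists; it does not prove that the name can still be registered.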
Thanks for putting this together @jayvdb. I think we can transfer most of these to https://pypi.org/user/admin, and once #5838 lands, yank all the old releases.
> Create a process for naming of stdlib packages which requires the name on PyPI is either unused and can be locked, or is a backport/prior art/etc added to the register.

I think this part doesn't really belong in this issue tracker (in the sense that we can't fix it with a PR to this repo). I'd encourage you to create an issue at https://github.com/pypa/packaging-problems/issues and maybe start a discussion at https://discuss.python.org/c/core-workflow/ so that the folks who actually make new stdlib modules have an opportunity to weigh in and understand the issue here.
I've done a bit of analysis of all stdlib module names registered on PyPI as part of https://github.com/jayvdb/pypidb, using master of pypi/stdlib-list#28. This issue is a bit like #1506, focused on a narrow set.
An important consideration is that many of these packages appear in the list of the top 4000 packages maintained by @hugovk. See hugovk/drop-python#29. My guess is that these high rankings are rarely due to the popularity of these projects. Instead, I guess other dev tools (code editors, doc generators, etc.), or, more scarily, import hooks, are automatically fetching stdlib names from PyPI because they can't differentiate between stdlib and non-stdlib modules when processing the imports in a Python source file. If that is the case, at best these packages are a nuisance to those tools and their users, but at worst they are a potential security problem if preventative action isn't taken. (I have checked, and I am confident there are no current security problems.)
Especially concerning are:
- AST: created 2017; URL https://github.com/yijunyu/needle/ is 404. Almost certainly renamed to needles -> https://github.com/yijunyu/needles
- dis: code prints lol many times; 404 http://www.wisehandy.com/
- formatter: 404 https://github.com/WoLpH/formatter
- mailbox: renamed to imbox
- modulefinder: 404 https://github.com/bioasp/modulefinder
- pprint: "The funniest joke in the world"; the tarball on PyPI is not a security problem, but it also isn't useful. https://github.com/hamadi15/test is a dummy repo.
- numbers: no files; 404 https://github.com/jeanpimentel/numbers
- secrets: 404 http://tuohela.net/packages/secrets
- select: no files; 404 https://github.com/Jaymon/select. I guess it was renamed to https://github.com/Jaymon/que https://pypi.org/project/que/#files
- signal: no files; 404 https://github.com/privatwolke/signal
- time: no urls or files
- token: no urls
- turtle: http://adroll.com/labs is 404
- trace: no files; http://billionuploads.com/ka79h2t4jpi1, which is unavailable atm

Those suggest there is not an active maintainer, and they could be abused or fall into the wrong hands without anyone noticing.
Less concerning except for the clash with stdlib are:
- calendar: republished to https://pypi.org/project/python-calendrical/ a long time ago
- chunk: forked/renamed to bunch and then munch; the PyPI URL points directly to the latter, https://github.com/Infinidat/munch
- DateTime - 6396758 - not the same. Owned by Zope so in safe hands. Installs to DateTime rather than datetime.
- functools: http://www.trit.org/~dima/ - no problems, except a different API to stdlib
- parser: no files; https://github.com/tehmaze/parser
- resource: https://github.com/RussellLuo/resource
- shelve: no files; https://github.com/ton/stash
- wave: https://pythonhosted.org/Wave/

Suggestions:
I am well aware that this only addresses one set of problems; packages can install code that replaces stdlib names without needing to use the stdlib name in the package name.
Deeper analysis of PyPI artifacts is also needed to find those more nefarious cases, but the simpler and more legitimate-appearing cases of package name overloads above are the easy pickings, and they seem like a logical first step.