Handle runs of delimiters in filenames #14172
(This also removes one of our usages of |
I think this probably works, but I don't know precisely what rules packages generating wheels (principally setuptools) have been using for the filename, e.g. if they might have
The name normalization spec has a one-line function to fully normalize a name from any valid starting point. Is there any reason to do it differently for the filename prefix?
    import re

    def normalize(name):
        # Collapse each run of "-", "_" or "." into a single "-", then lowercase.
        return re.sub(r"[-_.]+", "-", name).lower()
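To make the question above concrete, here is a minimal sketch of what the spec's one-line function does to a few inputs. The sample names are hypothetical and chosen only for illustration; the function body is the one quoted in the comment.

```python
import re


def normalize(name: str) -> str:
    # Collapse any run of "-", "_" or "." into a single "-" and lowercase the result.
    return re.sub(r"[-_.]+", "-", name).lower()


# Hypothetical sample names (not taken from PyPI).
samples = ["Foo.Bar", "foo__bar", "foo-_.bar", "FOO-bar"]
normalized = {raw: normalize(raw) for raw in samples}

for raw, norm in normalized.items():
    print(f"{raw!r} -> {norm!r}")
```

Every sample maps to the same normalized name, which is the property that makes runs of delimiters indistinguishable after full normalization.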
We don't actually want to normalize the filename prefix, as it should already be normalized. But to handle backwards compatibility with tools that might not be replacing - and . with _, or lowercasing the prefix, we have to do a minimal normalization here, which we'll eventually want to remove via #14156:
    filename_prefix = filename_prefix.lower().replace(".", "_").replace("-", "_")
If we remove consecutive runs when normalizing the prefix, any filename containing a run of consecutive delimiters would become valid, because the comparison between the prefix and the project name would no longer take repeated delimiters into account. Since these filenames are currently invalid, we haven't had an upload with a ._, _., or .. in the filename for quite some time.
So TL;DR: I'm not sure we want to become less restrictive by normalizing out character runs in the filename before comparison to the project name.
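The trade-off described above can be sketched as follows. This is an illustration under stated assumptions, not Warehouse's actual validation code: `minimal_prefix` mirrors the one-liner quoted in the comment, while `collapsing_prefix`, `expected_prefix`, `accepted`, and the project name `hydrotools-restclient` are hypothetical names introduced here.

```python
import re


def minimal_prefix(prefix: str) -> str:
    # The minimal normalization quoted in the thread:
    # lowercase, then map "." and "-" to "_" one character at a time.
    return prefix.lower().replace(".", "_").replace("-", "_")


def collapsing_prefix(prefix: str) -> str:
    # Hypothetical alternative: also collapse runs of delimiters into one "_".
    return re.sub(r"[-_.]+", "_", prefix).lower()


def expected_prefix(project_name: str) -> str:
    # Assumption: the project name is fully normalized, then "-" becomes "_"
    # so it can be compared against a wheel filename prefix.
    return re.sub(r"[-_.]+", "-", project_name).lower().replace("-", "_")


def accepted(prefix: str, project_name: str, normalizer) -> bool:
    # A prefix is accepted when its normalized form equals the expected prefix.
    return normalizer(prefix) == expected_prefix(project_name)


project = "hydrotools-restclient"  # hypothetical project name
for prefix in ["hydrotools_restclient", "hydrotools._restclient"]:
    print(
        prefix,
        "minimal:", accepted(prefix, project, minimal_prefix),
        "collapsing:", accepted(prefix, project, collapsing_prefix),
    )
```

Under these assumptions the pre-normalized prefix passes both checks, while the prefix containing `._` passes only the run-collapsing check, which is the loosening the comment is wary of.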
Thanks, that makes sense. In any case, my main interest in the original
issue was for pre-normalised names, which should work now.
…On Fri, 21 Jul 2023, 18:05 Dustin Ingram, ***@***.***> wrote:
We don't actually *want* to normalize the filename prefix, as it should
already be normalized. But to handle backwards compatibility with tools
that might not be replacing - and . with _, or lowercasing the prefix, we
have to do a minimal normalization here, which we'll eventually want to
remove via #14156:
filename_prefix = filename_prefix.lower().replace(".", "_").replace("-", "_")
If we remove consecutive runs when normalizing the prefix, this would mean
any filenames that have any amount of consecutive runs of delimiters in the
filename would become valid, because the comparison between the prefix and
the project name would not take repeating delimiters into account. Since
these filenames are currently invalid, we haven't had an upload with a ._,
_., or .. in the filename for quite some time:
warehouse=> select upload_time, filename from release_files where filename ilike '%\.\_%.whl' order by upload_time desc limit 1;
        upload_time         |                   filename
----------------------------+-----------------------------------------------
 2021-11-03 20:54:45.340789 | hydrotools._restclient-3.0.5-py3-none-any.whl
(1 row)

warehouse=> select upload_time, filename from release_files where filename ilike '%\_\.%.whl' order by upload_time desc limit 1;
        upload_time         |                     filename
----------------------------+---------------------------------------------------
 2021-05-03 00:44:10.904378 | cmc_csci046_.yilinli_trees-1.0.1-py3-none-any.whl
(1 row)

warehouse=> select upload_time, filename from release_files where filename ilike '%\.\.%.whl' order by upload_time desc limit 1;
        upload_time         |                       filename
----------------------------+-------------------------------------------------------
 2020-11-06 20:00:06.720522 | example_..._pkg_megankuooo1234-0.0.1-py3-none-any.whl
(1 row)
So TL;DR: I'm not sure we want to become less restrictive by normalizing
out character runs in the filename before comparison to the project name.
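For readers who want to try the same check outside the database, the three `ilike` patterns above can be approximated in Python. This is a rough sketch: the helper name `has_delimiter_run` is hypothetical, the three run-containing filenames are the ones returned by the queries, and the last filename is made up as a counterexample.

```python
import re

# Matches the delimiter pairs searched for in the queries above: "._", "_." and "..".
DELIMITER_RUN = re.compile(r"\._|_\.|\.\.")


def has_delimiter_run(filename: str) -> bool:
    # Case-insensitive ".whl" suffix check, roughly mirroring ilike '%...%.whl'.
    return filename.lower().endswith(".whl") and DELIMITER_RUN.search(filename) is not None


filenames = [
    "hydrotools._restclient-3.0.5-py3-none-any.whl",
    "cmc_csci046_.yilinli_trees-1.0.1-py3-none-any.whl",
    "example_..._pkg_megankuooo1234-0.0.1-py3-none-any.whl",
    "foo_bar-1.0-py3-none-any.whl",  # hypothetical filename with no delimiter run
]
flags = [has_delimiter_run(name) for name in filenames]

for name, flag in zip(filenames, flags):
    print(f"{flag!s:5} {name}")
```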
Good to go. Accept the suggestion if you like, it's okay if you don't!
Co-authored-by: Mike Fiedler <miketheman@gmail.com>
Fixes the issue raised here: #14155 (comment)
Fixes #13991.