Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add type annotations to twine.commands.check #608

Merged
merged 3 commits into from
Apr 27, 2020

Conversation

bhrutledge
Copy link
Contributor

@bhrutledge bhrutledge commented Apr 26, 2020

Towards #231

I live-coded and talked my way through the first commit at a Boston Python Lightning Talk last night: https://youtu.be/BGSMwSkWphE?t=4694 (warning: lots of "umms" and face-touching).

Other changes:

  • Pass the correct type to check in the tests
  • Use textwrap.indent instead of home-rolled solution
  • Simplify overly-specific typing for PackageFile.metadata_dictionary()

@@ -46,8 +44,6 @@
".zip": "sdist",
}

MetadataValue = Union[str, Sequence[str], Tuple[str, IO, str]]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we getting less specific here?

Copy link
Contributor Author

@bhrutledge bhrutledge Apr 26, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fa5ba3b, I used a cast on one of the dictionary values:

    metadata = package.metadata_dictionary()
    description = metadata["description"]
    description_content_type = metadata["description_content_type"]

    # ...

    content_type, params = cgi.parse_header(cast(str, description_content_type))

Because cgi.parse_header expects a str, not the Union[...]. This led me to investigate how the MetadataValue alias is used. One place is Repository._convert_data_to_list_of_tuples:

twine/twine/repository.py

Lines 105 to 108 in 7111240

@staticmethod
def _convert_data_to_list_of_tuples(
data: Dict[str, package_file.MetadataValue]
) -> List[Tuple[str, package_file.MetadataValue]]:

However, to my eye, nothing about that code requires the strictness of MetadataValue.

Also, I've seen the Dict[str, Any] idiom recommended by Guido for dictionaries with mixed types(e.g. JSON) and I've used it in at least one other place:

# Working around https://github.com/python/typing/issues/182
self._releases_json_data: Dict[str, Dict[str, Any]] = {}

Ultimately, this change feels more readable/comprehensible to me. I think the ideal solution might be TypedDict in Py 3.8 and/or type hints for pkginfo.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That issue seems to be more tied to JSON than anything (by my skimming). I'd argue that Any is a barely valuable annotation and its main value is in "I haven't figured out what this should be right now". I don't understand how casting description_content_type to a string led us down this road, because that's exactly one of the values.

I would also think that pkginfo is popular enough that there should already be some stubs for it, if not, it still seems like a valuable thing to have

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand how casting description_content_type to a string led us down this road, because that's exactly one of the values.

I've been trying to avoid cast as a smell, though I have used it in a few places where mypy required it, e.g. when we know an Optional[str] is actually a str:

repository_url = cast(str, upload_settings.repository_config["repository"])

I'd argue that Any is a barely valuable annotation and its main value is in "I haven't figured out what this should be right now".

I think it provides value to the reader indicating that "this value could be anything, so be careful". I also think how widely it's used is a matter of style, and it's more common when applying annotations to existing codebases.

https://mypy.readthedocs.io/en/stable/kinds_of_types.html#the-any-type
https://mypy.readthedocs.io/en/stable/casts.html

Do you feel strongly about this? I don't see much value in being so specific here, because it doesn't tell the reader or mypy enough to know what the runtime value of these dictionary keys will actually be. To me, that makes it just another thing that the reader has to look up.

I would also think that pkginfo is popular enough that there should already be some stubs for it, if not, it still seems like a valuable thing to have

Looking into this (and other missing imports) is on the roadmap for subsequent PR's: #231 (comment)

Copy link
Contributor

@deveshks deveshks Apr 26, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also have used cast in a similar way in https://github.com/pypa/twine/blob/master/twine/commands/register.py#L25 while adding type annotations.

I was wondering if assigning a value to repository_url if it is None might be a good idea rather than casting it. (I have assigned a value 'any', but I think any other standard value might be applicable here.

 repository_url = register_settings.repository_config["repository"]
    if repository_url is None:
        repository_url = 'any'

I also saw a similar approach taking place in twine/wininst.py

twine/twine/wininst.py

Lines 19 to 25 in d0ccb0c

@property
def py_version(self) -> str:
m = wininst_file_re.match(self.filename)
if m is None:
return "any"
else:
return m.group("pyver")

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't feel strongly enough to block the PR, but I would like us to converge, eventually, on types that are of use to us and anyone else even if they seem to not convey they actual value of a dictionary. In general, I'd say something is better than nothing.

I'm also worried that we've released MetadataValue and people are using twine.package (which I think is a documented part of the API) and thus we're causing problems for others usin gmypy and our bare-bones annotations

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(One of these days I'll anticipate the API argument)

Just pushed ebd2ada, which feels like a good compromise to me. I kept the relaxed signature of _convert_data_to_list_of_tuples:

def _convert_data_to_list_of_tuples(data: Dict[str, Any]) -> List[Tuple[str, Any]]:

That removed the need for the awkward Tuple[str, IO, str], which is never actually stored in metadata_dictionary, but rather appended to the list of tuples:

twine/twine/repository.py

Lines 152 to 159 in ebd2ada

data_to_send = self._convert_data_to_list_of_tuples(data)
print(f"Uploading {package.basefilename}")
with open(package.filename, "rb") as fp:
data_to_send.append(
("content", (package.basefilename, fp, "application/octet-stream"),)
)

Leaving us with:

MetadataValue = Union[str, Sequence[str]]

So, yeah, glad I took a closer look at this. Thanks. 😉

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering if assigning a value to repository_url if it is None might be a good idea rather than casting it. (I have assigned a value 'any', but I think any other standard value might be applicable here.

So we've had complaints from people that they've uploaded things they didn't intend to as well as to places they didn't intend to. Placing a standard value in repository_url to satisfy mypy doesn't seem safe to me. Sorry I didn't see your comment earlier, @deveshks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@deveshks I think your use of cast is appropriate, and I've used it that way as well.

Basically, given my_dict: Dict[str, Optional[str]], with cast(str, my_dict["key"]), we're saying: "at this point in the code, I'm confident that a) "key" is in my_dict and b) it's value is a str and not None". Having that confidence requires us to reason about the logic up to that point, but in the cases I've seen so far, it's seemed safe (and sometimes obvious from the context).

@bhrutledge bhrutledge merged commit 3e2793f into pypa:master Apr 27, 2020
@bhrutledge bhrutledge deleted the 231-annotate-check branch April 27, 2020 12:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants