Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean up Dims type annotation #8606

Merged
merged 6 commits into from
Jan 16, 2024
Merged

Clean up Dims type annotation #8606

merged 6 commits into from
Jan 16, 2024

Conversation

crusaderky
Copy link
Contributor

No description provided.

@crusaderky crusaderky self-assigned this Jan 12, 2024
@@ -182,8 +182,7 @@ def copy(
DsCompatible = Union["Dataset", "DaCompatible"]
GroupByCompatible = Union["Dataset", "DataArray"]

Dims = Union[str, Iterable[Hashable], "ellipsis", None]
OrderedDims = Union[str, Sequence[Union[Hashable, "ellipsis"]], "ellipsis", None]
Dims = Union[Hashable, Sequence[Hashable], None]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ellipsis is a subclass of Hashable

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am on phone but this has been the subject of much discussion, worth reviewing that...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#6142

So there are good reasons to have str | Iterable[Hashable] rather than Hashable | Iterable[Hashable]...

Is there a cost of leaving ellipsis in? It's nice to have it be explicit, even if it doesn't affect the type resolution?

(+1 to much of the PR though, updating definitions to use Dims...)

@crusaderky crusaderky marked this pull request as draft January 12, 2024 15:11
@crusaderky crusaderky changed the title [type annotations] Dims can be a single Hashable Clean up Dims type annotation Jan 15, 2024
@crusaderky crusaderky closed this Jan 15, 2024
@crusaderky crusaderky reopened this Jan 15, 2024
@crusaderky crusaderky marked this pull request as ready for review January 15, 2024 17:25
@crusaderky
Copy link
Contributor Author

Ready for review

@@ -297,7 +298,7 @@ def test_parse_dims_replace_none(dim: None | ellipsis) -> None:
pytest.param(["x", 2], id="list_missing_all"),
],
)
def test_parse_dims_raises(dim: str | Iterable[Hashable]) -> None:
def test_parse_dims_raises(dim):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why remove these annotations? IIUC these get mypy to use the tests to test our annotations?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only thing that they validate is that the signature on the test matches the signature declared for the function being tested - quite pointless IMHO. Notably, they don't validate the parameters being passed from the lines above.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the signature on the test matches the signature declared for the function being tested - quite pointless IMHO

I'm not completely sure what you mean by this. But without check_untyped_defs, having the -> None is the only way we test whether our annotations are correct (or am I wrong there?)

I think you should revert removing the annotations, at least the -> None

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I was trying to say is that the signature in the unit test causes mypy to verify that the signature declared in the unit test, dim: str | Iterable[Hashable] is indeed compatible with the signature declared in the parse_dims function, dim: Dims. Which IMHO is pointless.

It would have been useful if it verified that the values with which dim is actually populated in the test are legal for the Dims type, but it does not do that.

For example, in the future someone may short-sightedly change the definition of Dims from str | Collection[Hashable] to str | Sequence[Hashable]. One of the parameters in the test is

        pytest.param({"a", 1}, tuple({"a", 1}), id="non_sequence_collection"),

Nothing will trip, with or without annotations in the test signature, unless someone changes the actual implementation of parse_dims to crash if you pass a set to it.

Annotating the test signature would become useful if we rewrote it without parametrization:

def test_parse_dims() -> None:
    all_dims = ("a", "b", 1, ("b", "c"))  # selection of different Hashables

    # non-sequence collection
    actual = utils.parse_dims({"a", 1}, all_dims, replace_none=False)
    assert actual == tuple({"a", 1})

    ... # repeat for all other use cases

which would be a valid choice, but it would fail on the first bad use case instead of moving on and it would be less immediately obvious what broke. You win some, you lose some.

Would you like me to open a new PR where I rewrite the unit test without parametrization and with annotations?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I was trying to say is that the signature in the unit test causes mypy to verify that the signature declared in the unit test, dim: str | Iterable[Hashable] is indeed compatible with the signature declared in the parse_dims function, dim: Dims. Which IMHO is pointless.

Totally agree!

Annotating the test signature would become useful if we rewrote it without parametrization:

Yes. In this specific case it's still slightly useful to have annotations on this function.

  • The dim typing check isn't that useful, because we supply it atm
  • all_dims will be checked

Would you like me to open a new PR where I rewrite the unit test without parametrization and with annotations?

Sorry, no. (and to the extent you're interpreting my comment as arguably bad suggestions, I apologize, I wasn't suggesting doing this)

The thing that I do think we should do is get to a point where tests checking as many annotations as possible. There are two ways to do this:

  • check_untyped_defs=True
  • -> None on test functions

So even though in this case it's only slightly useful:

  • There's no downside
  • The principle of having -> None is helpful, and gets us closer to having a blanket check_typed_defs

...so I think we should restore -> None (and nothing else)


Just to put this in perspective, I'm not trying to be difficult / antagonistic. We previously had lots of incorrect annotations! And so I have done a decent amount of work moving xarray on this — you can see the progress we've made at converting the library to test annotations here, and notably this file is currently excluded. I started by adding -> None to lots of functions, so hence my protest that we're now undoing them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

Copy link
Collaborator

@max-sixty max-sixty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall great!

@crusaderky crusaderky merged commit 1580c2c into pydata:main Jan 16, 2024
25 of 26 checks passed
@crusaderky crusaderky deleted the DIms_type branch January 16, 2024 10:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants