Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(typing): #983 return type of StringMethods.match #990

Merged
merged 4 commits into from
Sep 4, 2024

Conversation

cmp0xff
Copy link
Contributor

@cmp0xff cmp0xff commented Sep 1, 2024

@cmp0xff
Copy link
Contributor Author

cmp0xff commented Sep 1, 2024

Hi @Dr-Irv thank you for the comment. In #983 (comment) you mentioned:

The declaration of match() in core/strings.pyi is incorrect.

I partially fixed it.

How can I capture this expectation?

But to fix it, the StringMethods class will need an additional argument to pass in the expected result of match(), similar to what is done with str.split().

I do not get this suggestion. Could you elaborate?

PR with tests welcome

#990

@cmp0xff cmp0xff marked this pull request as ready for review September 1, 2024 21:13
@Dr-Irv
Copy link
Collaborator

Dr-Irv commented Sep 3, 2024

How can I capture this expectation?

But to fix it, the StringMethods class will need an additional argument to pass in the expected result of match(), similar to what is done with str.split().

I do not get this suggestion. Could you elaborate?

In core/strings.pyi, we have this:

# The _TS type is what is used for the result of str.split with expand=True
_TS = TypeVar("_TS", DataFrame, MultiIndex)

class StringMethods(NoNewAttributesMixin, Generic[T, _TS]):

You need to create an additional TypeVar, let's call it _MR (for match result). Then the Generic part of StringMethods adds _MR to the parameters of Generic.

Then the match() method would return _MR.

Then you change the references to StringMethods in core/series.pyi and core/indexes/base.pyi to pass in the correct type of result you want match to return as the 3rd argument.

See the pattern used with split() - that's how we control the return type of that particular method.

I'll let you address that change before firing off the CI.

@cmp0xff
Copy link
Contributor Author

cmp0xff commented Sep 4, 2024

Hi @Dr-Irv , thanks for the explanations. I have made the changes.

Copy link
Collaborator

@Dr-Irv Dr-Irv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the stubs, instead of using npt.NDArray[np.bool_], can you use np_ndarray_bool which you can import from pandas._typing .

@cmp0xff cmp0xff requested a review from Dr-Irv September 4, 2024 16:03
Copy link
Collaborator

@Dr-Irv Dr-Irv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @cmp0xff

@Dr-Irv Dr-Irv merged commit 532fc66 into pandas-dev:main Sep 4, 2024
10 checks passed
@cmp0xff cmp0xff deleted the hotfix/cmp0xff/983-typing-str-match branch September 4, 2024 17:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

df.columns.str.match actually gives npt.NDArray[np.bool_], but mypy thinks it is pd.Index[str]
2 participants