-
-
Notifications
You must be signed in to change notification settings - Fork 17.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Good first issue] TST: Disallow bare pytest.raises #30999
Comments
I'll take:
|
I will begin working on: pandas/tests/arithmetic/test_numeric.py |
@gdex1 I hope this will help you :) (The numbers represents line number)
|
@MomIsBestFriend I will help with : |
I can take care of these: pandas/tests/io/test_html.py |
@MomIsBestFriend there was quite some discussion in #23922 about to go about this. Because to repeat as I said there: I don't think we should "blindly" assert all error messages. Some things that were said in that thread: limit it to internal error messages, limit the match to a few key words of the message, avoid complicated patterns. Also, I think asserting error messages should go hand in hand with actually checking if it is a good, clear error message, and potentially improving this. It might be good to distill a list of attention points from the discussion in the other issue to put here. |
I completely agree, but the problem is that newcomers don't know what error messages to assert and what error messages not to assert, if we somehow define rules on what error messages to assert and what not to assert, and at the same time keeping this issue "beginner friendly", it will be great (IMO). Also, if we plan to enforce this in the CI we need to somehow mark what
I don't see why we wouldn't want to test internal error messages, can you please elaborate even more? I see the point that you pointed out in #23922 (comment), and I'm
Absolutely agree.
I have read the conversation at #23922, but I didn't saw anything that IMO is worth putting as a "note" in the issue's body, can you please point out things I missed? |
I don't see much else to add from that issue either.
These are reasons in part why picking and choosing which to test and which not to test is not the direction I would prefer. I would also add that we do sometimes check error message strings in Also, if these "internal" messages aren't that important, why do we have an error message in the first place? I would then just create a helper that then asserts the message is empty. |
So I said "limit to internal error messages", while "internal" can be a bit ambiguous... I meant: error messages that originate from pandas itself, and of course we want to test those. But so I meant that we (IMO) shouldn't test too much external error messages, meaning: messages coming from eg numpy or other libraries. Numpy can change those, and then our tests start failing due to a cosmetic change in numpy (and this is not hypothetical, it happened just last week I think). Now, I used "internal" in a different context in #30998 (comment). There, I meant as an internal, developer oriented error message that should never be raised to the user. IMO, those are not important to test exactly with the error message.
Let's put @simonjayhawkins's comment that you link to here:
That are all useful things, I fully agree. But that is not simple, and if we want to get those things out of this issue, then this issue is not for beginners. Of course, beginners don't need to do all of those things at once, but I still have the feeling that those PRs adding asserts are often rather close to "blindly adding the current error message to the pytest.raises call" without going any further (the above points). Also, if the above points is what makes this exercise useful, it are more concrete instructions about this that is useful to put at the top of this issue, I think. To be clear, I am all for better error messages and better tests asserting we have and keep those error messages good. But we also have limited time, and each PR requires time and effort to do and to review, and the question is where effort is best spent. |
It might make sense to create a larger issue to track these (other issues worth including in such an issue are #19159 and #21575). This part in itself is self-contained and is very approachable for beginners. |
@gfyoung how are those issues you link related to this discussion? |
They relate to the comment that you explicitly introduced from @simonjayhawkins |
#31072 contains the one missing match in |
As saying here #31091 (comment) I'm with @jorisvandenbossche's idea, that we won't test error messages from external packages, Any ideas on how to mark those? |
If we really don't want to test certain error messages (I could go either way on external ones to be fair), I think we should just create a helper function like this: def external_error_raised(expected_exception):
return pytest.raises(expected_exception, match=None) This will make it clear to our future selves that this is a non-pandas error, and the |
I really like that idea, can we make it a convention for our tests? That if a function is testing if a function/method is raising an error, and the error is an external error, we simply put
By that I mean putting a section on in the Contributing guide. |
I will take: pandas/tests/reshape/test_get_dummies.py |
I will take: |
I will take: pandas/tests/dtypes/test_inference.py |
Done: |
Done: |
I will take pandas/tests/indexing/multiindex/test_partial.py |
I will take |
I will make an attempt at: pandas/tests/series/apply/test_series_apply.py and see what people think |
Done: |
I will take |
I think I have handled all the easy-enough-to-handle instances of bare
I would say that this is no longer a "good first issue" because there are only complex instances left. |
@moink you could submit a draft PR and use |
Ok thanks @simonjayhawkins I will do that. I thought about abusing the CI for this - if it's ok to do that then I will. I also didn't know that CI ran on draft PRs. |
I opened #38876 with the |
@moink This should probably be handled on a case-by-case basis. I'd recommend searching for issues on these tests and creating them if none exist, then post references here. |
End users rely on error messages for their debugging purposes. Thus, it is important that we make sure that the correct error messages are surfaced depending on the error triggered.
The core idea is to convert this:
To this:
You can read more about
pytest.raises
here.Side note:
In case that the raised error message is an external error message (meaning that's not pandas specific), you should use the
external_error_raised
instead ofpytest.raises
.the usage of
external_error_raised
is exactly likepytest.raises
the only difference is that you don't pass in thematch
argument.For example:
Keynotes:
in your PR.
Please comment what you are planning to work on, so we won't do double the work (no need to mention me, you can just declare what you are planning to work on, just remember to check if something is already taken).
If a file/files that should be marked as "done" (as if there is no more work to do), isn't marked as "done", please comment letting me know about that (And mentioning me by putting
@MomIsBestFriend
at the comment's body, so I'll know where to look).To generate the full list yourself, you can add:
to
.pre-commit-config.yaml
in the- repo: local
section, and then runThe current list is:
NOTE:
The list may change as files are moved/renamed constantly.
Took pretty much everything from #23922, that was originally opened by @gfyoung.
The text was updated successfully, but these errors were encountered: