-
-
Notifications
You must be signed in to change notification settings - Fork 17.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TST: insert 'match' to bare pytest raises in pandas/tests/internals/ #30998
TST: insert 'match' to bare pytest raises in pandas/tests/internals/ #30998
Conversation
Looks like this is breaking a test:
|
i think something like this would help identify what needs to be done to make the error messages consistent... diff --git a/pandas/tests/internals/test_internals.py b/pandas/tests/internals/test_internals.py
index 3f23486cd..9efc95fdb 100644
--- a/pandas/tests/internals/test_internals.py
+++ b/pandas/tests/internals/test_internals.py
@@ -818,14 +818,11 @@ class TestBlockManager:
invalid_values = [1, "True", [1, 2, 3], 5.0]
bm1 = create_mgr("a,b,c: i8-1; d,e,f: i8-2")
- msg = (
- r'For argument "inplace" expected type bool, received type int.|'
- r'For argument "inplace" expected type bool, received type str.|'
- r'For argument "inplace" expected type bool, received type list.|'
- r'For argument "inplace" expected type bool, received type float.'
- )
-
for value in invalid_values:
+ msg = (
+ 'For argument "inplace" expected type bool,'
+ f" received type {type(value).__name__}."
+ )
with pytest.raises(ValueError, match=msg):
bm1.replace_list([1], [2], inplace=value)
@@ -1238,16 +1235,22 @@ class TestCanHoldElement:
}
if (op, dtype) in invalid:
- msg = (
- r"cannot perform __pow\_\_ with this index type: DatetimeArray|"
- r"cannot perform __mod\_\_ with this index type: DatetimeArray|"
- r"cannot perform __truediv\_\_ with this index type: DatetimeArray|"
- r"cannot perform __mul\_\_ with this index type: DatetimeArray|"
- r"cannot perform __pow\_\_ with this index type: TimedeltaArray|"
- "ufunc 'multiply' cannot use operands with types dtype"
- r"\('<m8\[ns\]'\) and dtype\('<m8\[ns\]'\)|"
- "cannot add DatetimeArray and Timestamp"
- )
+ if op is operator.add:
+ msg = f"cannot add DatetimeArray and Timestamp"
+ elif op is operator.mul:
+ msg = (
+ re.escape(
+ "ufunc 'multiply' cannot use operands with types"
+ " dtype('<m8[ns]') and dtype('<m8[ns]')"
+ )
+ + "|cannot perform __mul__ with this index type: DatetimeArray"
+ )
+ else:
+ msg = (
+ f"cannot perform __{op.__name__}__ with this index type:"
+ r" (DatetimeArray|TimedeltaArray)"
+ )
+
with pytest.raises(TypeError, match=msg):
op(s, e.value)
else: |
@@ -350,7 +358,10 @@ def test_duplicate_ref_loc_failure(self): | |||
blocks[1].mgr_locs = np.array([0]) | |||
|
|||
# test trying to create block manager with overlapping ref locs | |||
with pytest.raises(AssertionError): | |||
|
|||
msg = "Gaps in blk ref_locs" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should test this, this is an internal error message
(and it's also not a very good error message)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(and it's also not a very good error message)
Isn't that exactly the reason why we should test them in the first place? Puts more light on the quality of the error messages that we raise, internal or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't that exactly the reason why we should test them in the first place
If we, in one go, improve the error message. Yes, that is fine, but otherwise I don't think it is worth the effort.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that is fine, but otherwise I don't think it is worth the effort.
Debatable. If you're someone who has just started working on this codebase, I wouldn't want them to also modify the error message if they don't fully understand what is going. You can easily spend another PR figuring out what the proper error message is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @jorisvandenbossche here I don't think worth testing against error messages for AssertionError
, since that is purely internal
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since that is purely internal
I still don't find that to be valid justification. Internal or not, we use error messages for a reason: because they provide some utility in development. If these error messages (not the Exception
class itself) had no use and won't see the light of day simply because they're internal, we should just get rid of them. To borrow from @jorisvandenbossche, they aren't worth the effort then.
Error messages can also serve to differentiate between which error is raised. The managers.py
file is littered with AssertionError
lines. It would be good to know that the correct one is being raised.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am actually ok with testing these asserts; they are the current behavior. If things change then this test should fail. That said I wouldn; spend too much effort on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because they provide some utility in development.
Yes, they are useful for that. But also because they are only used in development, they can also easily change when refactoring some internals (without that there are user-facing consequences).
But every test (and certainly a test with a match) takes also effort to write, review and maintain (eg update if you update something internally). It's a trade-off. So I am not saying those internal error messages are not useful at all, but I still think they are not worth the effort to properly test with full coverage.
r'For argument "inplace" expected type bool, received type int.|' | ||
r'For argument "inplace" expected type bool, received type str.|' | ||
r'For argument "inplace" expected type bool, received type list.|' | ||
r'For argument "inplace" expected type bool, received type float.' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe leave out the last word to have this less repetitive?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At a second look, I think I will just have msg
to be equal to:
msg = "For argument "inplace" expected type bool, received type (int|str|list|float)."
Will this be better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that should work. Give it a shot.
Hello @MomIsBestFriend! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2020-01-17 16:00:13 UTC |
) | ||
+ "(DatetimeArray|TimedeltaArray)" | ||
) | ||
with pytest.raises(TypeError, match=msg): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So in light of the discussion in #30999. Personally, I find that the above code change makes the test much more complicated and more difficult to read and interpret.
There is always a trade-off between testing every exact details vs practicality/readabiity/effort to do this, and personally for a case like the above, this is not worth it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks hard to read because it's so monolithic.
This test could benefit from a TON of refactoring, including pytest
parameterization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test could benefit from a TON of refactoring, including
pytest
parameterization.
@gfyoung I am looking into refactoring this test from the ground up, what would you put as the pytest parameterization(s) ?
I hope I can build the test from the ground up, depends on the given parameterization(s).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am looking into refactoring this test from the ground up, what would you put as the pytest parameterization(s) ?
I might have jumped the gun there in saying we should do pytest
parameterization. If there is a way to parameterize so we can remove the logic for the skip
tests (and put it as part of the decorator), that would be good.
A similar point could be made about the combinations we expect to fail because I'm uncertain as to feasability. If we could somehow parameterize properly, we could easily break that part out as a separate test.
I have reverted the failing test, I will open a separate PR. |
@@ -297,7 +297,8 @@ def test_delete(self): | |||
assert (newb.values[1] == 1).all() | |||
|
|||
newb = self.fblock.copy() | |||
with pytest.raises(Exception): | |||
|
|||
with pytest.raises(IndexError, match=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the None
here for? Can this not be made more specific?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be more specific as soon as #31130 gets merged.
@@ -350,7 +358,10 @@ def test_duplicate_ref_loc_failure(self): | |||
blocks[1].mgr_locs = np.array([0]) | |||
|
|||
# test trying to create block manager with overlapping ref locs | |||
with pytest.raises(AssertionError): | |||
|
|||
msg = "Gaps in blk ref_locs" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @jorisvandenbossche here I don't think worth testing against error messages for AssertionError
, since that is purely internal
thanks @MomIsBestFriend I think we are agreed generally that testing for internal Asserts is not worth here. This is adding a single test for this, which is ok. But in general we don't want to add more (and hopefully refactor the internal error messages). |
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff