TST: GH30999 Add match=msg to all pytest.raises in tests/indexes #38697

moink · 2020-12-25T14:51:32Z

This pull request partially addresses xref #30999 to remove bare pytest.raises by adding match=msg. It doesn't close that issue as I have only addressed modules in pandas/tests/indexes

I made a few of the raised errors more specific types. There are two other changes that are a bit more complex but I will put specific comments on the code lines to explain those.

…ctivate half a test in pandas/tests/indexes/interval/test_astype.py

moink · 2020-12-25T14:53:57Z

pandas/tests/indexes/multi/test_setops.py

    idx = pd.MultiIndex.from_product([[1, pd.Timestamp("2000"), 2], ["a", "b"]])
    other = pd.MultiIndex.from_product([[3, pd.Timestamp("2000"), 4], ["c", "d"]])

-    with pytest.raises(TypeError):
+    msg = "The 'sort' keyword only takes the values of None or False; True was passed."
+    with pytest.raises(ValueError, match=msg):


This test was marked as failing. But when I switched it to checking that a ValueError was raised, it passed. I think this works as intended now.

this is fine, a lot of things have been cleaned up recently I guess this was missed. cc @jbrockmendel

moink · 2020-12-25T14:57:48Z

pandas/tests/indexes/interval/test_astype.py

+            "Cannot convert interval[float64] to interval[uint64]; subtypes are "
+            "incompatible"
+        )
+        with pytest.raises(TypeError, match=msg):
            index.astype(dtype)


This is a test that was marked as failing, with the reason being closed issue #15832 . I put a bit of discussion about this over in the issue.

I removed the xfail and I also removed the last two pytest.raises. The result of the coercion is: IntervalIndex([(0, 0], (0, 0], (0, 0], (0, 1], (1, 1] ... (8, 9], (9, 9], (9, 9], (9, 9], (9, 10]], closed='right', dtype='interval[int64]') which doesn't look particularly useful but isn't exactly wrong either. If I understand the discussion at #15832 correctly this is intended behaviour for it not to raise the error.

Looks like the test for

index = interval_range(0.0, 10.0, freq=0.25) dtype = IntervalDtype("int64")

is lost here?

Yes, I removed it deliberately, as well as the equivalent one for uint64. I tried to explain that in my comment above but I guess I wasn't clear.

This test was marked as failing before. Now it is not. The test, as written before, was saying that

index = interval_range(0.0, 10.0, freq=0.25) dtype = IntervalDtype("int64")

should raise a ValueError. It does not raise any error. The question here is: Should casting an interval range with non-integer float boundaries raise an error? This test said it should, but it was marked as failing, because right now it does not raise an error. If I understood the discussion at #15832 correctly, and it is possible that I did not, than I think that it should not raise an error, so this test is not required.

The question is, as a user of pandas, if you had an interval range such as [(0.0, 0.25], (0.25, 0.5], (0.5, 0.75], (0.75, 1.0], (1.0, 1.25], (1.25, 1.5], (1.5, 1.75], (1.75, 2.0], (2.0, 2.25], (2.25, 2.5]], and you cast it to int, what would you want or expect? The old (failing) version of the test said you expect a ValueError. I am saying you expect [(0, 0], (0, 0], (0, 0], (0, 1], (1, 1], (1, 1], (1, 1], (1, 2], (2, 2], (2, 2]] which, as I said, looks not that useful as an IntervalIndex, but is not wrong either.

Would love to hear your thoughts.

I could also add another test for the expected behaviour of casting from float to int intervals if that's what we want.

Thanks for explaining, sorry should've read your comment on that linked issue. Your logic makes sense to me, but I'd defer to someone with more experience on how we handle IntervalIndex.

If being removed from this test, we definitely want to make sure this specific casting behavior is being tested somewhere. To keep this pr simple and easier to merge, I'd suggest reverting modification of this test, then making a followup pr to clean up this test and add tests for any untested casting behavior.

Ok, but I am not quite sure how to handle it. I want to leave the code in a good state even if I plan to do another PR soon, because it's possible that one will take a while or not be accepted ever or whatever. The way I see it, these are our options:

Revert it to exactly what it was before I modified it. That would make the file fail the new linting rule we want to add as part of [Good first issue] TST: Disallow bare pytest.raises #30999 , so don't add the linting rule and leave that issue open. I'm not particularly happy with this option because I have gotten the number of bare pytest.raises down from 250ish to 64 in just a few days and I was hoping to finish in a few more days. And the longer we leave the rule out of the pre-commit script the more likely people are to add new bare pytest.raises and the issue will stay open forever.

Revert it, and simultaneously add to the linting script a check that the bare pytest.raises is not in a test decorated with pytest xfail. Generally, addressing [Good first issue] TST: Disallow bare pytest.raises #30999 has been a pretty mechanical process, except when it comes to tests marked as failures. Then it takes a lot of digging through the code and historical issues to understand what would be a the correct failure message to add. It's the one thing that maybe makes [Good first issue] TST: Disallow bare pytest.raises #30999 not as good as a "good first issue" as it appears. I expect to run into more similar cases as I try to finish all the modules, because there are 26 files that only have 1 or 2 bare pytest raises and I suspect it's because they are the hard ones that people didn't know what message to put in. Adding the check to the script seems like kinda a cool project to figure out the ast module better than I currently do, but probably just that and not really worth our time. Unless we really think that we want the linter to do that.

Come up with, out of whole cloth, the error message that I think should be included in such a ValueError, if in the future pandas were to raise one. This is the strategy I took in previous pull requests. (see TST: GH30999 Add match=msg to all "with pytest.raises" in pandas/tests/io/pytables/test_store.py #38609 for example, where I added a new completely invented error message to test_multiple_open_close). This is also what I was trying to do that led me to read the code and the past issues and decide that I didn't think it deserved an error message at all and therefore that the test didn't make sense and I would remove it.

Put in an obviously completely bogus error message, so that future developers know that it's not a real one, and whoever in the future decides to address that test either by making it pass or by removing it can write a real message in the future.

Anyway, after thinking it through and writing it out, I think I will try for strategy 3, and you can decide if you like whatever message I come up with, but let me know if there's a different strategy that you prefer.

Apologies for the wall of text; I am new to contributing to pandas and obviously spent a lot of time thinking about this.

xref #38718

New PR #38719

If that's been done in past PR's, probably fine here too. Appreciate the thoughtfulness on your approach here!

yeah generally addressing on a new PR is the correct way, will look at that soon.

mzeitlin11 · 2020-12-25T19:48:27Z

pandas/tests/indexes/test_numpy_compat.py

+                r"loop of ufunc does not support argument 0 of type .* which "
+                f"has no callable {func.__name__} method"
+            )
+            with pytest.raises(TypeError, match=msg):


Changes in this file look to have caused some failures (https://dev.azure.com/pandas-dev/pandas/_build/results?buildId=50879&view=logs&j=b516280a-8c75-541b-8ba0-ae13561718a7&t=88a240e0-0f0f-5fd5-6f10-d82e025ed5d5)

Looks like AttributeError possible too. Since these errors are raised externally from numpy, might be more robust to also just use external_error_raised and not match on message in case of error message changes.

Ok, it wasn't clear to me whether the error was raised in pandas or numpy. Is that also true of the other error in the same test? i.e. should I change both to tm.external_error_raised?

Yep exactly, that's also an external error. I think both should use tm.external_error_raised.

ok done. let's hope I fixed the tests.

…ternal_error_raised

…also be an AttributeError

moink · 2020-12-26T11:31:52Z

I am confused by the failing tests:
FAILED pandas/tests/arrays/categorical/test_warnings.py::TestCategoricalWarnings::test_tab_complete_warning
FAILED pandas/tests/frame/test_api.py::TestDataFrameMisc::test_tab_complete_warning[DataFrame]
FAILED pandas/tests/frame/test_api.py::TestDataFrameMisc::test_tab_complete_warning[Series]
FAILED pandas/tests/indexes/test_base.py::TestIndex::test_tab_complete_warning
FAILED pandas/tests/resample/test_resampler_grouper.py::test_tab_complete_ipython6_warning

None of which I touched? Any idea what is going on here?

moink · 2020-12-26T11:35:45Z

Ah, it's not just this PR. Several recent PRs have those same failures.

moink · 2020-12-26T14:09:32Z

Jedi update causing problems with the CI raised in #38703

mzeitlin11 · 2020-12-26T19:37:39Z

pandas/tests/indexes/test_numpy_compat.py

+        f"ufunc '{func.__name__}' not supported for the input types, and the "
+        "inputs could not be safely coerced to any supported types according "
+        "to the casting rule ''safe''"
+    )


This one's also an external error from numpy. Even though this works, probably better to be consistent and use tm.external_error_raised.

…to tm.external_error_raised in test_numpy_ufuncs_other

mzeitlin11

Looks good to me, thanks @moink!

jreback · 2020-12-27T22:14:25Z

pandas/tests/indexes/test_numpy_compat.py

@@ -96,13 +94,11 @@ def test_numpy_ufuncs_other(index, func, request):
            result = func(index)
            assert isinstance(result, np.ndarray)
        else:
-            # raise TypeError or ValueError (PeriodIndex)
-            with pytest.raises(Exception):
+            with tm.external_error_raised(TypeError):


guess these are older comments, cc @jbrockmendel

i think so yah

jreback · 2020-12-27T22:16:08Z

pandas/tests/indexes/interval/test_astype.py

+            "Cannot convert interval[float64] to interval[uint64]; subtypes are "
+            "incompatible"
+        )
+        with pytest.raises(TypeError, match=msg):
            index.astype(dtype)


yeah generally addressing on a new PR is the correct way, will look at that soon.

jreback · 2020-12-27T22:16:40Z

pandas/tests/indexes/multi/test_setops.py

    idx = pd.MultiIndex.from_product([[1, pd.Timestamp("2000"), 2], ["a", "b"]])
    other = pd.MultiIndex.from_product([[3, pd.Timestamp("2000"), 4], ["c", "d"]])

-    with pytest.raises(TypeError):
+    msg = "The 'sort' keyword only takes the values of None or False; True was passed."
+    with pytest.raises(ValueError, match=msg):


this is fine, a lot of things have been cleaned up recently I guess this was missed. cc @jbrockmendel

jreback · 2020-12-27T22:17:20Z

cc @jbrockmendel if any comments, go ahead and merge if good

MarcoGorelli · 2020-12-30T08:36:25Z

Hi @moink - can you fetch and merge upstream/master please and resolve the conflict?

# Conflicts: # pandas/tests/indexes/interval/test_astype.py

moink · 2020-12-30T10:45:48Z

Hi @moink - can you fetch and merge upstream/master please and resolve the conflict?

@MarcoGorelli ok done and green except travis

moink · 2020-12-31T10:47:13Z

I am not sure why I closed this before - must have been accidental.

jreback · 2020-12-31T16:25:27Z

thanks @moink

…das-dev#38697)

TST: GH30999 Add match=msg to all pytest.raises in tests/indexes. Rea…

4cc2668

…ctivate half a test in pandas/tests/indexes/interval/test_astype.py

moink commented Dec 25, 2020

View reviewed changes

mzeitlin11 requested changes Dec 25, 2020

View reviewed changes

moink added 3 commits December 25, 2020 20:49

TST: GH30999 Attempt to fix failing test test_numpy_ufuncs_basic

02cbc37

TST: GH30999 Change pytest.raises in test_numpy_ufuncs_basic to tm.ex…

d968db1

…ternal_error_raised

TST: GH30999 Allow first pytest.raises in test_numpy_ufuncs_basic to …

4fd5b91

…also be an AttributeError

mzeitlin11 reviewed Dec 26, 2020

View reviewed changes

TST: GH30999 Revert change to test_subtype_integer_errors and change …

5937097

…to tm.external_error_raised in test_numpy_ufuncs_other

MarcoGorelli self-requested a review December 27, 2020 12:55

This was referenced Dec 27, 2020

Change tests of casting an interval range index from float to int to reflect current behaviour #38718

Closed

TST: GH38718 Tests for casting an interval range from float to int #38719

Merged

mzeitlin11 approved these changes Dec 27, 2020

View reviewed changes

jreback approved these changes Dec 27, 2020

View reviewed changes

jreback added Error Reporting Incorrect or improved errors from pandas Testing pandas testing functions or related to the test suite labels Dec 27, 2020

jreback added this to the 1.3 milestone Dec 27, 2020

Merge branch 'master' into add_match_msg_to_pytest_raises_indexes

32eca9a

# Conflicts: # pandas/tests/indexes/interval/test_astype.py

moink closed this Dec 30, 2020

moink reopened this Dec 31, 2020

jreback merged commit fd5422f into pandas-dev:master Dec 31, 2020

luckyvs1 pushed a commit to luckyvs1/pandas that referenced this pull request Jan 20, 2021

TST: GH30999 Add match=msg to all pytest.raises in tests/indexes (pan…

356a0d8

…das-dev#38697)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

TST: GH30999 Add match=msg to all pytest.raises in tests/indexes #38697

TST: GH30999 Add match=msg to all pytest.raises in tests/indexes #38697

moink commented Dec 25, 2020

moink Dec 25, 2020

jreback Dec 27, 2020

moink Dec 25, 2020 •

edited

Loading

mzeitlin11 Dec 26, 2020

moink Dec 26, 2020

mzeitlin11 Dec 26, 2020

moink Dec 27, 2020 •

edited

Loading

moink Dec 27, 2020

moink Dec 27, 2020

moink Dec 27, 2020

mzeitlin11 Dec 27, 2020

jreback Dec 27, 2020

mzeitlin11 Dec 25, 2020

moink Dec 25, 2020

mzeitlin11 Dec 26, 2020

moink Dec 26, 2020

moink commented Dec 26, 2020

moink commented Dec 26, 2020

moink commented Dec 26, 2020

mzeitlin11 Dec 26, 2020

moink Dec 27, 2020

mzeitlin11 left a comment

jreback Dec 27, 2020

jbrockmendel Dec 30, 2020

jreback Dec 27, 2020

jreback Dec 27, 2020

jreback commented Dec 27, 2020

MarcoGorelli commented Dec 30, 2020

moink commented Dec 30, 2020

moink commented Dec 31, 2020

jreback commented Dec 31, 2020

TST: GH30999 Add match=msg to all pytest.raises in tests/indexes #38697

TST: GH30999 Add match=msg to all pytest.raises in tests/indexes #38697

Conversation

moink commented Dec 25, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

moink Dec 25, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

moink Dec 27, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

moink commented Dec 26, 2020

moink commented Dec 26, 2020

moink commented Dec 26, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mzeitlin11 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jreback commented Dec 27, 2020

MarcoGorelli commented Dec 30, 2020

moink commented Dec 30, 2020

moink commented Dec 31, 2020

jreback commented Dec 31, 2020

moink Dec 25, 2020 •

edited

Loading

moink Dec 27, 2020 •

edited

Loading