TST: tests for maybe_promote (precursor to #23982) #25637
Conversation
Hello @h-vetinari! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2019-06-21 06:15:45 UTC
Force-pushed from 7801d0f to ca8f96b
Codecov Report

```
@@            Coverage Diff             @@
##           master   #25637      +/-   ##
==========================================
+ Coverage   91.26%   91.27%    +<.01%
==========================================
  Files         173      173
  Lines       52968    52968
==========================================
+ Hits        48339    48344        +5
+ Misses       4629     4624        -5
```

Continue to review full report at Codecov.
Codecov Report

```
@@            Coverage Diff             @@
##           master   #25637      +/-   ##
==========================================
- Coverage   91.97%   91.97%    -0.01%
==========================================
  Files         180      180
  Lines       50756    50756
==========================================
- Hits        46685    46683        -2
- Misses       4071     4073        +2
```

Continue to review full report at Codecov.
just started glancing at this. I find the box & box_dtype parametrization args very confusing. This is testing a scalar and an array, but when / where can this actually be an object dtype of a scalar? I would like to keep the scope of this way, way down. This means: don't over-parametrize.
```python
# try/except as numpy dtypes (i.e. if result_dtype is np.object_) do not
# know some expected dtypes like DatetimeTZDtype, and hence raise TypeError
try:
```
I find this very error prone and would rather make this explicit.
I don't understand what you mean here. PandasDtypes (which will not be known by a numpy dtype) can come in on either side of the comparison. If you have a way to write this better, let me know.
By more explicit, I think you need to check the type of `result_dtype` and `expected_dtype`. That way, you know which way to compare.
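A minimal sketch of what such an explicit check could look like (the helper name and exact assertions are illustrative, not taken from the PR):

```python
import numpy as np

def _assert_dtypes_match(result_dtype, expected_dtype):
    # hypothetical helper: branch on the concrete type instead of
    # catching the TypeError that numpy raises for dtypes it does not know
    if isinstance(expected_dtype, np.dtype):
        # plain numpy dtype on the expected side: numpy can compare itself
        assert result_dtype == expected_dtype
    else:
        # pandas extension dtype (e.g. DatetimeTZDtype): compare from the
        # pandas side, whose __eq__ knows how to handle numpy dtypes
        assert expected_dtype == result_dtype
```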
```python
@pytest.mark.parametrize('boxed, box_dtype', [
    (True, None),    # fill_value wrapped in array with auto-dtype
    (True, object),  # fill_value wrapped in array with object dtype
    (False, None)    # fill_value directly
```
what does box_dtype of None even mean?
I'm trying to explain just that with the comment next to it.
Side note here: None takes on no extra meaning, but is passed directly to the constructor as `np.array(some_value, dtype=None)`, which corresponds to the default behaviour of numpy.
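Concretely, the three parametrizations correspond to the following (a sketch based on the comments above; the value is arbitrary):

```python
import numpy as np

fill_value = 1.5

# (True, None): fill_value wrapped in an array, numpy infers the dtype
np.array(fill_value, dtype=None).dtype    # dtype('float64')

# (True, object): fill_value wrapped in an array with forced object dtype
np.array(fill_value, dtype=object).dtype  # dtype('O')

# (False, None): fill_value is passed on directly as a bare scalar
```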
Incorporate this side-note as a comment. That will help to explain the parameterized values better.
@jreback: just started glancing at this. I find the box & box_dtype parametrization args very confusing. This is testing a scalar and an array, but when / where can this actually be an object dtype of a scalar? I would like to keep the scope of this way, way down. This means: don't over-parametrize.
Thanks for starting to take a look at this. I intentionally kept these tests together, as the actual tests are the same in 95% of the cases, and it would lead to a huge amount of duplication to test the scalar case and the array case separately.
I've tried to annotate at every step what `boxed` and `box_dtype` mean; if we can find a better way to write this, I'm all ears, but I don't think breaking up the tests is a good approach.
though a good idea, closing as stale
I've worked an inordinate amount of time on #25425, and although I prepared two subsets of it (#23982 and this one), it never got any review. I understand the realities of how scarce reviewing resources are, but it's still disheartening after putting in so much work. Long story short: all three of those PRs (#23982, #25425, #25637) should be reopened, please. I'm ready to keep working on them.
@h-vetinari: I'll re-open this one first. I agree with @jreback that the scope of what you're trying to do is massive. Thus, it would be great if you focused on this PR alone. We can re-open other ones as the work progresses.
In a separate commit, separate these out as requested. The best way to make the point here is to show the code side by side. Are you sure you can't abstract out setup methods for some of the functionality that would be de-duplicated, by any chance? Also, could you address the merge conflict?
@gfyoung
I can do this, but essentially every test would be duplicated. I'm testing the same logic in both cases, and actually more than that: that the same promotion logic applies in both the scalar and the array case. If I start splitting this, it would make the most sense to have one test for scalars and one for arrays, but from that point on, it would be very easy for those tests to diverge. Does this sound like a reasonable trade-off to you? While trivial, it will be a fair bit of work to actually duplicate these, so I'd rather be sure you understand the consequences beforehand.
That's exactly what I was aiming for with `_check_promote`.
Why don't you try duplicating for one test only? Then we can evaluate before and after on that one test, and see whether it makes sense for you to continue forward with the duplication.
Hmm... I see. It sounds, though (from the comments, IIUC), like that might have impacted readability, because there were so many different cases to check. Is it possible to maybe have two or three different check-promote functions that each handle a subset of cases?
Finally got around to coming back to this:

I added a docstring and some comments to make it clearer. The function essentially only contains one branch, depending on whether `boxed` is set. All of those tests have a very simple format:

```python
def test_maybe_promote_XXX_with_YYY(fixtures..., boxed, box_dtype):
    dtype = some_dtype(XXX)
    fill_dtype = some_dtype(YYY)

    if some_condition:
        pytest.xfail('reason')

    # define fill_value (usually implicitly through fill_dtype)
    fill_value = some_scalar(fill_dtype)

    # define expected values
    expected_dtype = some_dtype
    exp_val_for_scalar = some_scalar_value
    exp_val_for_array = some_missing_value_marker

    _check_promote(dtype, fill_value, boxed, box_dtype, expected_dtype,
                   exp_val_for_scalar, exp_val_for_array)
```

This is about as simple as I can imagine it to be (while unifying the treatment of the scalar and array cases). Short of a new idea (and the docstring), I don't know how to make this clearer. (Speaking of new ideas: I incorporated your typecheck instead of the try/except for the dtype comparison.)
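For concreteness, a hypothetical instantiation of that template could look like the following (the dtype choices and expected values are illustrative, not taken from the PR):

```python
import numpy as np

def test_maybe_promote_int_with_float(any_int_dtype, boxed, box_dtype):
    dtype = np.dtype(any_int_dtype)

    # filling an integer array with a float has to upcast to float64
    fill_value = np.float64(1.5)
    expected_dtype = np.dtype('float64')
    exp_val_for_scalar = fill_value
    exp_val_for_array = np.nan  # missing-value marker for float arrays

    _check_promote(dtype, fill_value, boxed, box_dtype, expected_dtype,
                   exp_val_for_scalar, exp_val_for_array)
```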
pandas/conftest.py
```diff
-DATETIME_DTYPES = ['datetime64[ns]', 'M8[ns]']
-TIMEDELTA_DTYPES = ['timedelta64[ns]', 'm8[ns]']
+DATETIME64_DTYPES = ['datetime64[ns]', 'M8[ns]']
+TIMEDELTA64_DTYPES = ['timedelta64[ns]', 'm8[ns]']
```
Rationale for renames?
Is this necessary given how massive the diff is already?
I can split off the conftest-related things, of course.

Regarding your specific point: the rename is really important IMO, because the `DATETIME_DTYPES` (as-is) have nothing to do with `datetime.datetime` or other such things. However, when reading a test that is being fed by a `datetime_dtype` fixture, one could easily think that it tests `datetime.datetime`. The current state in conftest specifically contains only `DATETIME64_DTYPES`, and I believe this distinction is used in many parts of the code base as well.
Got it. I'm thinking that we should separate this out. The more we can condense this PR, the better. But hold off on making that PR for the moment (just one more question re: some of the other things you've changed for conftest.py).
I can change the `DATETIME_DTYPES` -> `DATETIME64_DTYPES` in a precursor, of course, but the actual change that precipitates the necessity is the introduction of `datetime64_dtype`, which wouldn't be used before this PR. Would you want me to introduce the fixtures with the conftest-precursor, or in this PR?
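A sketch of what the fixture in question would presumably look like in pandas/conftest.py (the body is an assumption, following the pattern of the existing parametrized dtype fixtures):

```python
import pytest

DATETIME64_DTYPES = ['datetime64[ns]', 'M8[ns]']

@pytest.fixture(params=DATETIME64_DTYPES)
def datetime64_dtype(request):
    """Parametrized fixture returning each datetime64 dtype string."""
    return request.param
```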
@jreback: Could you have another look at this at some point?
So why are you having every int type against a timedelta64, for example? This is just way overkill. These should all react the same, but having this expanse of tests is just too confusing. Please limit this. Obviously the same holds true for datetime, for example.
The simple answer is: fixtures. They do exactly what I wanted here: to test, as exhaustively as possible, every type against (inputs of) every other type. I get the argument that some tests can be overkill, but I can't see why it would be confusing...? Are you opposed specifically to the number of xfails, I'm guessing? Otherwise: once one starts introducing special cases in this module, it becomes less understandable, more fragile, and more complex.
@h-vetinari this adds quite a bit of duplicative testing for many types. Please use more selective fixtures. If it's obvious that something is doing the exact same cast for int, int8, int16, ..., then a single int is enough for a fixture, with a nice comment.
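A minimal sketch of such a more selective fixture (the name and dtype choices are illustrative, not from the PR):

```python
import pytest

# int8/int16/int32/... all promote identically, so one representative
# dtype per behaviour class keeps the test matrix small
@pytest.fixture(params=['int64', 'float64', 'complex128'])
def representative_numeric_dtype(request):
    return request.param
```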
@jreback
OK @h-vetinari, what's 3000+ new tests, lol. Merge master once again and ping on green.
@jreback, this is green. :)
k, thanks @h-vetinari
@jreback @gfyoung
Can you please reopen #25425? I'd also suggest that we reopen #23982 anyway - I don't think it has a chance of getting merged (because it basically just adds a bunch of failing tests), but once rebased it would provide good insight into the value of the refactor #25425 - as in: all the things that are currently broken and would be fixed by that PR.
First step towards #23833, and a precursor to #23982.

TL;DR: `maybe_promote` is quite broken. #23982 tries to come up with tests that it should pass, and #25425 tries to fix the implementation. However, #23982 is quite big, so @jreback asked for a smaller version that (mostly) just tests existing behaviour.
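For context, a quick sketch of what `maybe_promote` does: given an array's dtype and a fill value, it returns a dtype that can hold both, plus the fill value cast to that dtype. The import path below is the internal location of the function at the time of this PR:

```python
import numpy as np
from pandas.core.dtypes.cast import maybe_promote

# int64 cannot hold NaN, so the dtype gets promoted to float64
dtype, fill_value = maybe_promote(np.dtype('int64'), fill_value=np.nan)
print(dtype, fill_value)  # float64 nan
```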