-
-
Notifications
You must be signed in to change notification settings - Fork 17.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DEPR: Deprecate generic timestamp dtypes #15987
Conversation
910bcbc
to
3ec69e8
Compare
dtype = np.timedelta64 | ||
s = Series([], dtype=dtype) | ||
|
||
self.assertTrue(s.empty) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use new pytest when you can, e.g.
assert s.empty
assert s.dtype == 'm8[ns]'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure thing. Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a side note, not really a fan of having to rewrite every single test comparison just to be more "idiomatic" with the existing test standard.
Why not write our own in pandas.util.testing
that we can implement according to idiom and call from everywhere else?
e.g.
def assert_true(val):
assert val
def assert_equal(a, b):
assert a == b
The idiom is thus abstracted away from our tests, and we don't have to worry about it when writing test cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, as I write this, I see that we do have one of these functions already:
assert_equal(a, b, msg="")
I think from now on, we should advocate that people use our static util.testing
functions (and avoid using self.assert...
or assert
keyword).
pandas/tests/series/test_dtypes.py
Outdated
def test_astype_empty_constructor_equality(self): | ||
# see gh-15524 | ||
|
||
for dtype in np.typecodes['All']: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
paramerize when you can on new tests (you may need to create a new class that inherits from object and/or write as bare functions)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unittest
classes and pytest
decorators do not mesh well together based on experience. I can't find anything so far in documentation on the pytest
to do so. Unless you know of a straightforward way to do this, I'm leaving as is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Barring some native pytest
way to implement, a test decorator is the only way to do this (similar to what was done with capturing stdout
and stderr
). This is probably preferable since it would be independent of testing platform.
However, I would prefer to leave that as a follow-up so I can experiment separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those docs don't answer my question. Does not explain how to parameterize a test method within tm.TestCase
. As I already mentioned, unittest
and pytest
don't mesh well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not saying that I don't disagree about parameterizing. However, I think it is preferable and easier to implement our own parameterizaton decorator (just as we did with capture), that is all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i am trying to shrink the codebase
implementing parameteize is not trivial and would need a whole test suite - it diesnt add any value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes but neither is incorporating pytest's builtin paramterize without adding new code (i.e. an entirely new class). The only way to avoid that is to write this is as a bare function (not a test method of tm.TestCase), but I prefer the test class organization if possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can just make another class inherit from object AS i show in the example
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There actually is no example where you create a class that inherits from object
. However, that being said, having been able to sit down and re-examine the class
suggestion, I do see what you mean now.
Several things can come from this discussion:
-
Document how you can instantiate test classes that inherit from
object
(and nottm.TestCase
). I still prefer writing test classes instead of raw functions for organizational reasons, though it will still followpytest
idiom. -
Make a new PR that modifies the changes in TST: Add test decorators for redirecting stdout and stderr #15952 to use
pytest
fixtures for redirectingstdout
andstdin
(that will take substantial work though). -
Modify the existing test class in this file to inherit only from
object
(new commit).
pandas/core/internals.py
Outdated
# numpy arrays don't handle generic timestamp dtypes well. Since | ||
# we only use the nanosecond frequency, interpret generic timestamp | ||
# dtypes as nanosecond frequency. | ||
if dtype.name == "datetime64": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is way too hacky to put this here, this should be trapped in maybe_cast_to_datetime
, or if not, then let's create a generic way (via function there to do this).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would agree, except that when we call .astype
, maybe_cast_to_datetime
does not get called. But certainly, there are other places that casting can go.
3ec69e8
to
d581e3e
Compare
Codecov Report
@@ Coverage Diff @@
## master #15987 +/- ##
==========================================
- Coverage 91.03% 91.01% -0.02%
==========================================
Files 145 145
Lines 49587 49585 -2
==========================================
- Hits 45141 45132 -9
- Misses 4446 4453 +7
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #15987 +/- ##
==========================================
+ Coverage 91% 91.01% +<.01%
==========================================
Files 145 145
Lines 49639 49649 +10
==========================================
+ Hits 45175 45188 +13
+ Misses 4464 4461 -3
Continue to review full report at Codecov.
|
172f94d
to
7cfa4d5
Compare
@jreback : All comments have been addressed, and everything is green. |
In the issue, the discussion was about construction of empty series objects, while here you also added constructing/astyping actual data with non-parameterized np.datetime64. As I said in the issue (#15524 (comment), so my comment was relevant after all :-)), I don't see a strong reason to allow this. |
@gfyoung I see now you replied in the issue:
There is maybe not a strong reason that they shouldn't work, but I also don't see a good reason that they should work. It just expands what users can do in an unnecessary way. Why is it needed that people can do |
@jorisvandenbossche : Well, it has to go one way or the other. Either you can I don't have any major issues going the other way and preventing anyone from casting to the generic |
I am ok with this (still have to review the change again) as we already support this.
this should be consistent with direct construction (via |
On second though, we could deprecate accepting direct So the principal of construction / conversion should work with the same type will be preserved. I think I do like being more explicit here (IOW doing the deprecation). @gfyoung can you see what this breaks? IIRC this was originally an issue to fix something in csv parsing. But this was an orthogonal change (I don't even really understand how this change was related, maybe you can show that?) |
And for
so it is only If possible, I would be in favor of not allowing bare |
@jorisvandenbossche yep its the |
@jreback , @jorisvandenbossche : That sounds like a good plan. Shouldn't be too bad I imagine. @jorisvandenbossche BTW, the reason for that weird error message that you saw in the issue here is because at some point, we call |
7cfa4d5
to
5a5c1e6
Compare
5a5c1e6
to
7453b63
Compare
Deprecation worked out reasonably well, but I can't really figure out why AppVeyor and Circle are failing my tests. The Travis logs are clean (and passing), so I don't think that there's any hidden warning being issued that's causing AppVeyor and Circle to miss them. Suggestions on how to fix this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't directly see what could be going on with circle ci and appveyor.
pandas/types/cast.py
Outdated
"deprecated and will raise in a future version. " | ||
"Please pass in '{dtype}[ns]' instead.") | ||
warnings.warn(msg.format(dtype=dtype.name), | ||
FutureWarning, stacklevel=2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this stacklevel should be 8
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you meant 5? (same as the one beneath). Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I think this should be 8 (at least, that's what I counted). But you didn't remove the stacklevel=False
, so this does not fail the tests currently
pandas/types/cast.py
Outdated
if dtype.name in ('datetime64', 'datetime64[ns]'): | ||
if dtype.name == 'datetime64': | ||
warnings.warn(msg.format(dtype=dtype.name), | ||
FutureWarning, stacklevel=2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and this one 5
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
pandas/types/cast.py
Outdated
if dtype.name in ('timedelta64', 'timedelta64[ns]'): | ||
if dtype.name == 'timedelta64': | ||
warnings.warn(msg.format(dtype=dtype.name), | ||
FutureWarning, stacklevel=2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and this one also 5
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
data = [1] | ||
|
||
with tm.assert_produces_warning(FutureWarning, | ||
check_stacklevel=False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you try to remove the check_stacklevel here? (btw, easiest to try out what it should be is the set warnings to raise an error, and then you can count the number of stacks in the traceback)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, that's fair. Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't look like you removed this?
pandas/tests/series/test_dtypes.py
Outdated
warning_type = FutureWarning | ||
|
||
with tm.assert_produces_warning(warning_type, | ||
check_stacklevel=False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it may help, I don't think it is necessarily needed to assert the warning here (as you already have an explicit test for that above), but you can also just catch warnings so they do not appear in the log (with warnings.catch_warnings(record=True):
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. Done.
282fd39
to
6445606
Compare
lgtm. I see you 'fixed' the passing ping on green. |
Sounds good...once I can figure out these warnings 😁 |
We only use the nanosecond frequency, and numpy doesn't even handle generic timestamp dtypes well. xref pandas-devgh-15524 (comment).
6445606
to
eb4df87
Compare
@jreback : Everything is green again! |
thanks! |
Previously deprecated for Series constructor and the `.astype` method. Now being enforced. xref pandas-devgh-15987.
Previously deprecated for Series constructor and the `.astype` method. Now being enforced. xref pandas-devgh-15987.
Previously deprecated for Series constructor and the `.astype` method. Now being enforced. xref pandas-devgh-15987.
Previously deprecated for Series constructor and the `.astype` method. Now being enforced. xref gh-15987.
Previously deprecated for Series constructor and the `.astype` method. Now being enforced. xref pandas-devgh-15987.
Previously deprecated for Series constructor and the `.astype` method. Now being enforced. xref pandas-devgh-15987.
Previously deprecated for Series constructor and the `.astype` method. Now being enforced. xref pandas-devgh-15987.
We only use the nanosecond frequency, and
numpy
doesn't even handle generic timestampdtype
well internally with itsndarray
object.xref #15524 (comment).