CLN: Remove/deprecate unused/misleading dtype functions #23917

jbrockmendel · 2018-11-26T03:11:25Z

For "misleading", the leading case is is_int_or_datetime_dtype, which includes timedelta64 but excludes datetime64tz.

Several branches in lib.infer_dtype are unreachable and are removed.

is_datetimetz is redundant with is_datetime64tz_dtype; it is deprecated, and internal uses of it are changed.

is_period is both redundant and misleading; it is deprecated (no internal usages outside of tests).

maybe_convert_string_to_object and maybe_convert_scalar are never used outside of tests and are removed.

cc: @h-vetinari IIRC you've been looking at other functions in core.dtypes.cast with an eye towards cleanup or de-duplication. Suggestions welcome.
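
For readers less familiar with the deprecation pattern behind the is_datetimetz and is_period changes, here is a minimal sketch using only the standard library. The names is_tz_aware and is_tz_aware_dtype are hypothetical stand-ins, not pandas functions, and the string check is just a placeholder for a real dtype test.

```python
import warnings


def is_tz_aware_dtype(name):
    # Hypothetical replacement predicate, used only for illustration.
    return name.startswith("datetime64[ns, ")


def is_tz_aware(name):
    """Deprecated alias (hypothetical) that forwards to is_tz_aware_dtype."""
    warnings.warn(
        "is_tz_aware is deprecated; use is_tz_aware_dtype instead",
        FutureWarning,
        stacklevel=2,  # attribute the warning to the caller's line
    )
    return is_tz_aware_dtype(name)


# Record warnings so the deprecation can be checked explicitly.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = is_tz_aware("datetime64[ns, UTC]")
```

The old name keeps working and returns the same answer as the replacement, while callers get a FutureWarning that points at their own code rather than at the wrapper.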

pep8speaks · 2018-11-26T03:11:41Z

Hello @jbrockmendel! Thanks for submitting the PR.

There are no PEP8 issues in the file pandas/core/algorithms.py !
There are no PEP8 issues in the file pandas/core/dtypes/cast.py !
There are no PEP8 issues in the file pandas/core/dtypes/common.py !
There are no PEP8 issues in the file pandas/core/dtypes/concat.py !
There are no PEP8 issues in the file pandas/core/frame.py !
There are no PEP8 issues in the file pandas/core/indexes/datetimes.py !
There are no PEP8 issues in the file pandas/core/internals/blocks.py !
There are no PEP8 issues in the file pandas/core/internals/concat.py !
There are no PEP8 issues in the file pandas/core/nanops.py !
There are no PEP8 issues in the file pandas/core/reshape/merge.py !
There are no PEP8 issues in the file pandas/io/formats/format.py !
There are no PEP8 issues in the file pandas/tests/api/test_types.py !
There are no PEP8 issues in the file pandas/tests/dtypes/test_cast.py !
There are no PEP8 issues in the file pandas/tests/dtypes/test_common.py !
There are no PEP8 issues in the file pandas/tests/dtypes/test_dtypes.py !
There are no PEP8 issues in the file pandas/tests/dtypes/test_inference.py !
There are no PEP8 issues in the file pandas/tests/test_base.py !

gfyoung · 2018-11-26T04:15:51Z

@jbrockmendel @h-vetinari : Have a look at #16242 as well. This issue might be resolvable based on what you guys are doing here...

h-vetinari · 2018-11-26T04:25:21Z

pandas/tests/dtypes/test_dtypes.py

-        assert is_datetimetz(s.dtype)
-        assert not is_datetimetz(np.dtype('float64'))
-        assert not is_datetimetz(1.0)
+        with tm.assert_produces_warning(FutureWarning):


If each call is supposed to generate a warning, this will need more context managers.
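
To illustrate the point, here is a small standard-library sketch. It assumes a checker that passes as soon as at least one matching warning is seen inside the block; deprecated_check, silent_check and count_future_warnings are hypothetical helpers, not pandas test utilities.

```python
import warnings


def deprecated_check(obj):
    # Hypothetical stand-in for a deprecated predicate; always warns.
    warnings.warn("deprecated_check is deprecated", FutureWarning, stacklevel=2)
    return isinstance(obj, str)


def silent_check(obj):
    # Hypothetical stand-in for a call that (wrongly) does not warn.
    return isinstance(obj, str)


def count_future_warnings(*calls):
    """Run zero-argument callables in one block and count FutureWarnings."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        for call in calls:
            call()
    return sum(issubclass(w.category, FutureWarning) for w in caught)


# One shared block: "at least one warning" holds even though a call is silent.
shared = count_future_warnings(lambda: deprecated_check("a"),
                               lambda: silent_check(1.0))

# One block per call: the silent call is exposed on its own.
per_call = [count_future_warnings(lambda: deprecated_check("a")),
            count_future_warnings(lambda: silent_check(1.0))]
```

With a shared block the count is nonzero, so an "at least one" assertion would pass; only the per-call counts show that one of the calls never warned.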

h-vetinari · 2018-11-26T04:36:05Z

Thanks for tagging me.

There's a couple of things I encountered in #23796, mainly:

- astype_intsafe from _libs.lib is only used once, in dtypes.cast.astype_nansafe on the object branch, where there's some confusing/contradictory/redundant branches (related to your integer/datetime/timedelta point):
  - the branch for if np.issubdtype(dtype.type, np.integer): catches all the timedeltas (but not the datetimes), both of which would have a dedicated branch further down. However, letting the timedeltas actually hit their dedicated branch means re-calling astype_nansafe after explicitly casting object dtype to timedelta, hitting a dedicated timedelta-dtype-branch, which casts to float. This then contradicts/fails the TestMerge::test_other_timedelta_unit test, because it expects timedelta output.
  - furthermore, that integer-condition is actually relevant for some IntegerArray cases, because there's no dedicated branch for extension array types (on the side of the caller) yet.
- maybe_promote and maybe_upcast_putmask have some issues, see #23823 (BUG/Internals: maybe_upcast_putmask) and #23833 (BUG/Internals: maybe_promote); there are actually many more than listed in those issues. Both are also completely untested directly, AFAICT. I'm currently in the process of writing tests for the former, and then hoping to fix the behaviour.
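
The integer/timedelta overlap described above can be checked directly against NumPy's scalar type hierarchy. This is only a small sketch of that one observation, not a reproduction of astype_nansafe; the variable names are made up for the example.

```python
import numpy as np

td = np.dtype("timedelta64[ns]")
dt = np.dtype("datetime64[ns]")

# timedelta64 sits under signedinteger in NumPy's scalar hierarchy,
# so an "is this an integer?" test also matches timedeltas.
td_is_integer = np.issubdtype(td.type, np.integer)

# datetime64 does not inherit from the integer types.
dt_is_integer = np.issubdtype(dt.type, np.integer)

# dtype.kind keeps the three apart: 'i' integer, 'm' timedelta, 'M' datetime.
kinds = (np.dtype("int64").kind, td.kind, dt.kind)
```

Checking dtype.kind (or a dedicated predicate) before the generic integer test is one way to keep timedeltas from falling into an integer branch by accident.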

jreback · 2018-11-26T13:52:02Z

doc/source/whatsnew/v0.24.0.rst

@@ -1125,6 +1127,7 @@ Removal of prior version deprecations/changes
 - :meth:`SparseSeries.to_dense` has dropped the ``sparse_only`` parameter (:issue:`14686`)
 - :meth:`DataFrame.astype` and :meth:`Series.astype` have renamed the ``raise_on_error`` argument to ``errors`` (:issue:`14967`)
 - ``is_sequence``, ``is_any_int_dtype``, and ``is_floating_dtype`` have been removed from ``pandas.api.types`` (:issue:`16163`, :issue:`16189`)
+- ``is_floating_dtype`` has been removed (:issue:`????`)


this looks like a dupe of the line above

The existing whatsnew makes it look like this has already been removed, but it is still present in master, and not listed (as deprecated or removed) in #6581.

The whatsnew also says is_sequence and is_any_int_dtype have been removed, but each of these is still used in a few places internally. Get rid of those usages? (is_any_int_dtype only has two usages, so that one will be easy.)

pandas/_libs/lib.pyx

pandas/core/dtypes/common.py

jreback · 2018-11-26T13:55:14Z

pandas/core/reshape/merge.py

@@ -1604,7 +1604,10 @@ def _factorize_keys(lk, rk, sort=True):

        lk = ensure_int64(lk.codes)
        rk = ensure_int64(rk)
-    elif is_int_or_datetime_dtype(lk) and is_int_or_datetime_dtype(rk):
+    elif (issubclass(lk.dtype.type, (np.integer, np.timedelta64,


this is not right, these should be in separate branches here, the datetime / timedeltas can be in a single elif clause.

If the behavior here is wrong then we should also add a test for it. I'll open a separate issue to address this, since I'm not sure off the top of my head what the correct behavior is.

opened #23929

right, actually this should be 2 clauses:

elif is_integer_dtype(....) .... elif needs_i8_conversion(...) ...

i don't really like the way you are changing this here.

i was objecting to the object conversion below; not really sure why that is done at all. if you can separate this into 2 elifs, that would be better here

OK. I'd really rather these be separated in a dedicated PR, but I guess #23929 can be a Needs Test Issue... will change.

yeah that's fine. I think there is another PR about adding EA types here as well (though of course orthogonal). see if this passes and can add tests / perf checking later.

had to revert use of needs_i8_conversion since it breaks on period_dtype
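
As a rough sketch of the "2 elifs" shape discussed in this thread (not the code that was merged), the two cases can be separated by dtype.kind. pick_branch is a hypothetical helper, and the "object" fallback label is an assumption standing in for whatever the remaining branch does.

```python
import numpy as np


def pick_branch(lk, rk):
    """Hypothetical helper: report which branch a pair of key arrays takes."""
    if lk.dtype.kind in "iu" and rk.dtype.kind in "iu":
        # both keys are signed/unsigned integers
        return "integer"
    elif lk.dtype.kind in "mM" and rk.dtype.kind in "mM":
        # both keys are timedelta64 ('m') or datetime64 ('M')
        return "datetimelike"
    # anything else, including mixed integer/datetime keys
    return "object"


ints = np.array([1, 2, 3], dtype="int64")
stamps = np.array(["2018-11-26", "2018-11-27"], dtype="datetime64[ns]")
deltas = np.array([1, 2], dtype="timedelta64[ns]")

same_int = pick_branch(ints, ints)
both_datetimelike = pick_branch(stamps, deltas)
mixed = pick_branch(ints, stamps)
```

Keeping the integer and datetime-like checks in separate clauses makes the mixed case (integer on one side, datetime on the other) visible as its own path, which is the behavior #23929 asks to have tested.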

pandas/tests/dtypes/test_common.py

codecov · 2018-11-26T23:11:18Z

Codecov Report

Merging #23917 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #23917      +/-   ##
==========================================
+ Coverage   92.31%   92.31%   +<.01%     
==========================================
  Files         161      161              
  Lines       51488    51471      -17     
==========================================
- Hits        47530    47515      -15     
+ Misses       3958     3956       -2

| Flag | Coverage Δ | |
|---|---|---|
| #multiple | `90.7% <100%> (ø)` | ⬆️ |
| #single | `42.43% <66.66%> (ø)` | ⬆️ |

| Impacted Files | Coverage Δ | |
|---|---|---|
| pandas/core/arrays/datetimes.py | `98.37% <ø> (ø)` | ⬆️ |
| pandas/io/formats/format.py | `97.76% <100%> (ø)` | ⬆️ |
| pandas/core/indexes/datetimes.py | `96.47% <100%> (ø)` | ⬆️ |
| pandas/core/dtypes/common.py | `94.77% <100%> (+0.39%)` | ⬆️ |
| pandas/core/frame.py | `97.02% <100%> (ø)` | ⬆️ |
| pandas/core/algorithms.py | `95.11% <100%> (ø)` | ⬆️ |
| pandas/core/dtypes/concat.py | `96.66% <100%> (ø)` | ⬆️ |
| pandas/core/nanops.py | `95.05% <100%> (ø)` | ⬆️ |
| pandas/core/internals/blocks.py | `93.71% <100%> (ø)` | ⬆️ |
| pandas/core/reshape/merge.py | `94.27% <100%> (+0.03%)` | ⬆️ |

... and 2 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update d53a4cc...5bdc157.

jreback · 2018-11-27T01:24:16Z

pandas/core/reshape/merge.py

@@ -1604,7 +1604,10 @@ def _factorize_keys(lk, rk, sort=True):

        lk = ensure_int64(lk.codes)
        rk = ensure_int64(rk)
-    elif is_int_or_datetime_dtype(lk) and is_int_or_datetime_dtype(rk):
+    elif (issubclass(lk.dtype.type, (np.integer, np.timedelta64,


right, actually this should be 2 clauses:

elif is_integer_dtype(....) .... elif needs_i8_conversion(...) ...

jreback · 2018-11-27T01:25:12Z

pandas/core/reshape/merge.py

@@ -1604,7 +1604,10 @@ def _factorize_keys(lk, rk, sort=True):

        lk = ensure_int64(lk.codes)
        rk = ensure_int64(rk)
-    elif is_int_or_datetime_dtype(lk) and is_int_or_datetime_dtype(rk):
+    elif (issubclass(lk.dtype.type, (np.integer, np.timedelta64,


i don't really like the way you are changing this here.

i was objecting to the object conversion below; not really sure why that is done at all. if you can separate this into 2 elifs, that would be better here

jreback

lgtm. merge on green.

jbrockmendel · 2018-11-27T01:59:45Z

rebased, also fixed a docstring mistake I made in #23937.

jreback · 2018-11-27T02:05:23Z

ok! ping on green.

jbrockmendel · 2018-11-27T17:53:18Z

Ping

jreback · 2018-11-27T18:01:57Z

thanks!

jbrockmendel added 9 commits November 25, 2018 15:06

remove unused maybe_convert_scalar

607f853

remove unused maybe_convert_string_to_object

1e95d48

remove deprecated is_floating_dtype

714c287

deprecate is_period

ba6ef3d

deprecate is_datetimetz

a983755

set stacklevel

d7944b6

Merge branch 'master' of https://github.com/pandas-dev/pandas into dt…

2c9b406

…ypes

remove unused is_timedelta_array and is_timedelta64_array

34c4e91

remove is_int_or_datetime_dtype

8e86599

jbrockmendel added 2 commits November 25, 2018 19:11

GH reference

0cf6239

fixup typecheck

8adda17

gfyoung added the "Dtype Conversions" label (Unexpected or buggy dtype conversions) Nov 26, 2018

gfyoung added the Clean label Nov 26, 2018

h-vetinari reviewed Nov 26, 2018

typo fixu[

fe02240

jreback requested changes Nov 26, 2018

jbrockmendel added 2 commits November 26, 2018 08:48

Merge branch 'master' of https://github.com/pandas-dev/pandas into dt…

53d1de4

…ypes

Address comments

e2f8669

jbrockmendel mentioned this pull request Nov 26, 2018

Dtype checking in reshape.merge #23929

Closed

jsexauer mentioned this pull request Nov 26, 2018

DEPR: Clean up list of deprecations from prior versions #6581

Closed

1 task

jbrockmendel added 2 commits November 26, 2018 09:56

isort

3daaabf

dummy commit to force CI

2566a18

Merge branch 'master' of https://github.com/pandas-dev/pandas into dt…

8342b97

…ypes

jreback requested changes Nov 27, 2018

jbrockmendel added 2 commits November 26, 2018 17:32

Merge branch 'master' of https://github.com/pandas-dev/pandas into dt…

a7f7916

…ypes

separated conditions

6dda1be

jreback approved these changes Nov 27, 2018

jreback added this to the 0.24.0 milestone Nov 27, 2018

jbrockmendel added 2 commits November 26, 2018 17:58

Merge branch 'master' of https://github.com/pandas-dev/pandas into dt…

9c64633

…ypes

docstring fixup

3d8752f

jbrockmendel added 4 commits November 26, 2018 19:12

Merge branch 'master' of https://github.com/pandas-dev/pandas into dt…

bb8a814

…ypes

revert usage of needs_i8_conversion

71483b5

dummy commit to force CI

e0593bb

Merge branch 'master' of https://github.com/pandas-dev/pandas into dt…

5bdc157

…ypes

jreback merged commit ee283fa into pandas-dev:master Nov 27, 2018

jbrockmendel deleted the dtypes branch November 27, 2018 18:48

Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019

CLN: Remove/deprecate unused/misleading dtype functions (pandas-dev#2…

6e653b3

…3917)

Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019

CLN: Remove/deprecate unused/misleading dtype functions (pandas-dev#2…

dc9a81a

…3917)

jreback mentioned this pull request Nov 21, 2019

DEPR: deprecations log for removed issues #13777

Closed
