Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Correct type inference for UInt64Index during access #29420

Merged
merged 13 commits into from
Nov 27, 2019
17 changes: 10 additions & 7 deletions pandas/core/indexes/numeric.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@

import numpy as np

from pandas._libs import index as libindex
from pandas._libs import index as libindex, lib
from pandas.util._decorators import Appender, cache_readonly

from pandas.core.dtypes.cast import astype_nansafe
Expand Down Expand Up @@ -302,12 +302,15 @@ def _convert_scalar_indexer(self, key, kind=None):

@Appender(_index_shared_docs["_convert_arr_indexer"])
def _convert_arr_indexer(self, keyarr):
# Cast the indexer to uint64 if possible so
# that the values returned from indexing are
# also uint64.
keyarr = com.asarray_tuplesafe(keyarr)
if is_integer_dtype(keyarr):
return com.asarray_tuplesafe(keyarr, dtype=np.uint64)
# Cast the indexer to uint64 if possible so that the values returned
# from indexing are also uint64.
if is_integer_dtype(keyarr) or (
lib.infer_dtype(keyarr, skipna=False) == "integer"
):
keyarr = com.asarray_tuplesafe(keyarr, dtype=np.uint64)
else:
keyarr = com.asarray_tuplesafe(keyarr)
oguzhanogreden marked this conversation as resolved.
Show resolved Hide resolved

return keyarr

@Appender(_index_shared_docs["_convert_index_indexer"])
Expand Down
18 changes: 18 additions & 0 deletions pandas/tests/indexes/test_numeric.py
Original file line number Diff line number Diff line change
Expand Up @@ -1187,3 +1187,21 @@ def test_range_float_union_dtype():

result = other.union(index)
tm.assert_index_equal(result, expected)


def test_uint_index_does_not_convert_to_float64():
# https://github.com/pandas-dev/pandas/issues/28279
bug = pd.Series(
oguzhanogreden marked this conversation as resolved.
Show resolved Hide resolved
[0, 1, 2, 3, 4],
index=[
7606741985629028552,
17876870360202815256,
13106359306506049338,
8991270399732411471,
8991270399732411472,
],
)

assert isinstance(
bug.loc[[7606741985629028552, 17876870360202815256]].index, UInt64Index
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explicitly construct the expected index and use tm.assert_index_equal to verify they're the same:

result = s.loc[[7606741985629028552, 17876870360202815256]].index
expected = UInt64Index([7606741985629028552, 17876870360202815256])
tm.assert_index_equal(result, expected)

I'd rather not do a simple isinstance check here because it doesn't guard against potential precision loss with the values in the index, e.g. if someone makes a change where there's an intermediate coercion to Float64Index:

In [2]: idx = pd.UInt64Index([2**53, 2**53 + 1])

In [3]: idx
Out[3]: UInt64Index([9007199254740992, 9007199254740993], dtype='uint64')

In [4]: pd.UInt64Index(pd.Float64Index(idx))
Out[4]: UInt64Index([9007199254740992, 9007199254740992], dtype='uint64')