BUG: .nlargest with unsigned integers #21426
Comments
How strange! Let's figure out what's going on there...
I suspect the issue is with this block of code: pandas/pandas/core/algorithms.py Lines 1136 to 1138 in 4807905
Specifically for uint data, I don't think
In [2]: arr = np.array([0, 0, 0, 100, 1000, 10000, 100], dtype='uint64')
In [3]: -arr
Out[3]:
array([ 0, 0, 0,
18446744073709551516, 18446744073709550616, 18446744073709541616,
18446744073709551516], dtype=uint64)
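To make the consequence of that wrap-around concrete, here is a minimal sketch using only NumPy. It assumes, as a simplification, that "n largest" is computed by negating the values and taking the n smallest; `np.argsort` stands in for whatever selection routine the referenced block in algorithms.py actually uses, and the names `neg`, `order` and `top_n` are illustrative only.

```python
import numpy as np

# Same array as in the session above.
arr = np.array([0, 0, 0, 100, 1000, 10000, 100], dtype="uint64")

# Unsigned negation wraps modulo 2**64: 0 stays 0, while every
# positive value becomes a very large number.
neg = -arr

# Simplified stand-in for "take the n smallest of the negated values".
n = 3
order = np.argsort(neg, kind="stable")
top_n = arr[order[:n]]

print(top_n)  # the three zeros come out ahead of 10000, 1000 and 100
```

Because 0 negates to 0 and everything else wraps to a huge value, the zeros always look "smallest" after negation, which matches the reported behaviour of nlargest favouring 0.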
@jschendel : That would make sense. Hacky solution is to cast to
I'm not all that well-versed regarding uint operations, but does simply doing
In [2]: a = np.array([0, 1, 18446744073709551615], dtype='uint64')
In [3]: a
Out[3]: array([ 0, 1, 18446744073709551615], dtype=uint64)
In [4]: -a
Out[4]: array([ 0, 18446744073709551615, 1], dtype=uint64)
In [5]: -a - 1
Out[5]: array([18446744073709551615, 18446744073709551614, 0], dtype=uint64)
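A short sketch of why `-a - 1` behaves better than plain `-a`: modulo 2**64 it equals `(2**64 - 1) - a`, which is the bitwise complement `~a`, so the order is reversed without 0 colliding with itself. This only checks the arithmetic on the array above; whether the eventual fix uses this exact expression is not stated in this thread, and the names `flipped`, `order` and `top_2` are illustrative.

```python
import numpy as np

a = np.array([0, 1, 18446744073709551615], dtype="uint64")

# -a - 1 wraps modulo 2**64 and is the same as max_uint64 - a,
# i.e. the bitwise complement of a.
flipped = -a - np.uint64(1)
same_as_complement = np.array_equal(flipped, ~a)

# Taking the smallest flipped values now selects the largest originals.
order = np.argsort(flipped, kind="stable")
top_2 = a[order[:2]]

print(same_as_complement)
print(top_2)
```

The mapping is a strict order reversal over the whole uint64 range, so the n smallest flipped values correspond to the n largest original values.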
Hmm...that actually might work, both for the
Marking this for
Hi! I would like to pick this up if this is a good first issue? (Feel free to assign to me)
@alimcmaster1 : Go for it!
I'm actually right about to put a fix in for this
@jschendel : Thanks for letting us know! @alimcmaster1 : Sorry about that. 😞 But you're more than welcome to review the PR that @jschendel puts up soon-ish.
No worries! Thanks, can do :)
Created the PR. As another example, note that this fails for
In [2]: pd.__version__
Out[2]: '0.23.0'
In [3]: s = pd.Series([-9223372036854775808, 0, 9223372036854775807])
In [4]: s
Out[4]:
0 -9223372036854775808
1 0
2 9223372036854775807
dtype: int64
In [5]: s.nlargest(2)
Out[5]:
0 -9223372036854775808
2 9223372036854775807
dtype: int64
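The signed case can be sketched the same way, again with plain NumPy and `np.argsort` as a simplified stand-in for the real selection code. Negating the int64 minimum overflows back to itself, so it still sorts first; the bitwise complement (equal to `-s - 1`) has no such collision. Variable names here are illustrative only.

```python
import numpy as np

s = np.array([-9223372036854775808, 0, 9223372036854775807], dtype="int64")

# Negating the int64 minimum overflows and yields the minimum again.
neg = -s
top_2_neg = s[np.argsort(neg, kind="stable")[:2]]

# The complement reverses the order across the full int64 range.
flipped = ~s
top_2_flip = s[np.argsort(flipped, kind="stable")[:2]]

print(top_2_neg)   # includes the int64 minimum, as in the output above
print(top_2_flip)  # the two largest values
```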
Sigh...that's symptomatic of the same overflow issue presented with
nlargest bug with unsigned integers
Code Sample, a copy-pastable example if possible
Problem description
nlargest favours 0 above positive values. Common to both uint32 and uint64 types and possibly others.
Expected Output
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.6.2.final.0
python-bits: 64
OS: Linux
OS-release: 3.10.0-327.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: en_GB.UTF-8
pandas: 0.20.3
pytest: 3.2.1
pip: 9.0.1
setuptools: 36.4.0
Cython: None
numpy: 1.13.1
scipy: 0.19.1
xarray: None
IPython: 6.1.0
sphinx: None
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: None
tables: 3.4.2
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: 2.4.8
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.9999999
sqlalchemy: 1.1.13
pymysql: 0.7.9.None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None