BUG: pd.unique treats 0 and False as equivalent #18111
Comments
if you change https://github.com/pandas-dev/pandas/blob/master/pandas/_libs/hashtable_class_helper.pxi.in#L840
I replaced I found one issue in
Including However that didn't sufficiently fix the issue. I don't entirely understand what's happening at https://github.com/pandas-dev/pandas/blob/master/pandas/_libs/hashtable_class_helper.pxi.in#L841-L844, which may be where the issue lies. Is it also problematic that False and 0 have the same hash?
The first issue can be side-stepped with an explicit cast to numpy-array:
I'm not sure the result is unexpected. The result is similar for
So there are probably two alternatives:
To me, the first option seems totally OK.
Yeah, given 0 and False are equivalent in Python, agreed that the current behavior is probably okay. Closing.
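The equivalence of 0 and False mentioned in the comments can be checked in plain Python, independent of pandas. The following is only a minimal sketch of the builtin behaviour; it does not show what the pandas hashtable does internally, only what any uniqueness check built on hash() and == would inherit.

```python
# bool is a subclass of int, so False and 0 are interchangeable
# for any container that relies on hash() plus ==.
values = [0, False, 1, True]

same_hash = hash(False) == hash(0)           # both hash to the same value
same_value = False == 0                      # equality holds across the two types
as_set = set(values)                         # collapses each equal pair to one element
as_dict_keys = list(dict.fromkeys(values))   # order-preserving dedup, first-seen key wins

print(same_hash, same_value, len(as_set), as_dict_keys)
```

Because the first-seen key wins, the order of the input decides whether the surviving element is the int or the bool.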
Problem description
Currently a testing blocker (test_datetime_bool) for PR #17077.
I am guessing False is getting coerced to 0 when determining uniqueness; however, there may be cases when the user wants False to be distinct from 0.
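For cases where False should stay distinct from 0, one possible workaround is to deduplicate on the pair (type, value) rather than on the value alone. The sketch below is plain Python, not a pandas API; the helper name unique_by_type is hypothetical, and it assumes every value is hashable.

```python
def unique_by_type(values):
    """Return values in first-seen order, keeping equal values of different types apart.

    Hypothetical helper for illustration only; it is not part of pandas.
    """
    seen = set()
    result = []
    for v in values:
        # The type is part of the key, so (int, 0) and (bool, False) do not collide.
        key = (type(v), v)
        if key not in seen:
            seen.add(key)
            result.append(v)
    return result


deduped = unique_by_type([0, False, 1, True, 0, False])
print(deduped)
```

Note that this also separates values such as 1 and 1.0, which may or may not be wanted depending on the data.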
Expected Output
Output of pd.show_versions()
INSTALLED VERSIONS
commit: 86e9dcc
python: 2.7.13.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-45-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: None.None
pandas: 0.22.0.dev0+50.g86e9dcc
pytest: 3.2.1
pip: 9.0.1
setuptools: 27.2.0
Cython: 0.26
numpy: 1.13.1
scipy: 0.19.1
pyarrow: None
xarray: None
IPython: 5.3.0
sphinx: 1.5.6
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: 1.2.1
tables: 3.4.2
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: 2.4.8
xlrd: 1.0.0
xlwt: 1.2.0
xlsxwriter: 0.9.8
lxml: 3.8.0
bs4: 4.3.2
html5lib: 0.9999999
sqlalchemy: 1.1.13
pymysql: 0.7.9.None
psycopg2: 2.7.1 (dt dec pq3 ext lo64)
jinja2: 2.9.6
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None