BUG: CategoricalIndex can't be a boolean mask #22665

Closed
TomAugspurger opened this issue Sep 11, 2018 · 2 comments · Fixed by #22667
Labels: Categorical (Categorical Data Type), Dtype Conversions (Unexpected or buggy dtype conversions), Indexing (Related to indexing on series/frames, not to indexes themselves)
@TomAugspurger

In [1]: import pandas as pd

In [2]: idx = pd.CategoricalIndex([True, False, True])

In [3]: pd.Series(range(3))[idx]
Out[3]:
True    NaN
False   NaN
True    NaN
dtype: float64

Expected:

In [4]: pd.Series(range(3))[idx.astype(object)]
Out[4]:
0    0
2    2
dtype: int64

Fix coming shortly.
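
Until a fix lands, one possible workaround is to turn the categorical index into a plain boolean NumPy array before using it as a mask. This is only a sketch, assuming the categories really are booleans; the names `s` and `mask` are illustrative and not part of the report above.

```python
import numpy as np
import pandas as pd

s = pd.Series(range(3))
idx = pd.CategoricalIndex([True, False, True])

# Materialize the categorical values as a plain boolean ndarray so that
# indexing treats them as a mask rather than as labels to look up.
mask = np.asarray(idx, dtype=bool)

result = s[mask]
print(result)
```

This should select the rows at positions 0 and 2, matching the output shown for `idx.astype(object)`.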

TomAugspurger added the Categorical label Sep 11, 2018
TomAugspurger added this to the 0.24.0 milestone Sep 11, 2018
TomAugspurger added the Indexing and Dtype Conversions labels Sep 11, 2018
TomAugspurger added a commit to TomAugspurger/pandas that referenced this issue Sep 11, 2018
@pganssle

Is this another manifestation of this bug, or a separate bug?

>>> import pandas as pd
>>> pd.__version__
'0.23.3'
>>> pd.CategoricalIndex(categories=[True, False])
ValueError                                Traceback (most recent call last)
...
~/.virtualenvs/pandas/lib/python3.7/site-packages/pandas/core/arrays/categorical.py in _get_codes_for_values(values, categories)
   2431     (_, _), cats = _get_data_algo(categories, _hashtables)
   2432     t = hash_klass(len(cats))
-> 2433     t.map_locations(cats)
   2434     return coerce_indexer_dtype(t.lookup(vals), cats)
   2435 

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.map_locations()

ValueError: Buffer dtype mismatch, expected 'Python object' but got 'unsigned long'

Full traceback under details tag:

<ipython-input-4-c58e7bc5bb0f> in <module>()
----> 1 pd.CategoricalIndex(categories=[True])

~/.virtualenvs/pandas/lib/python3.7/site-packages/pandas/core/indexes/category.py in __new__(cls, data, categories, ordered, dtype, copy, name, fastpath)
    100                 data = []
    101             data = cls._create_categorical(cls, data, categories, ordered,
--> 102                                            dtype)
    103 
    104         if copy:

~/.virtualenvs/pandas/lib/python3.7/site-packages/pandas/core/indexes/category.py in _create_categorical(self, data, categories, ordered, dtype)
    165             from pandas.core.arrays import Categorical
    166             data = Categorical(data, categories=categories, ordered=ordered,
--> 167                                dtype=dtype)
    168         else:
    169             if categories is not None:

~/.virtualenvs/pandas/lib/python3.7/site-packages/pandas/core/arrays/categorical.py in __init__(self, values, categories, ordered, dtype, fastpath)
    368 
    369         else:
--> 370             codes = _get_codes_for_values(values, dtype.categories)
    371 
    372         if null_mask.any():

I'll note that my version of it is a regression that occurred between 0.18.0 and 0.23.0.

@TomAugspurger

Hmm that looks different.

Two things:

  1. Index constructors require a data argument, so the proper error here would be something like TypeError: Index(...) must be called with a collection of some kind, None was passed
  2. Something is up with booleans. See #22702 (ValueError in Categorical Constructor with empty data and boolean categories) for that.
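
To make the first point concrete, here is a minimal sketch of the same construction with an explicit (empty) data argument. It assumes a pandas version in which the boolean-categories problem tracked in #22702 is already fixed; on the 0.23.x release from the report above it may still raise.

```python
import pandas as pd

# Pass data explicitly (here an empty list) instead of omitting it,
# and supply the boolean categories separately.
cat = pd.Categorical([], categories=[True, False])
idx = pd.CategoricalIndex([], categories=[True, False])

print(len(idx), list(idx.categories))
```

Both objects are empty but still carry the two boolean categories.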

TomAugspurger added a commit that referenced this issue Sep 20, 2018
Sup3rGeo pushed a commit to Sup3rGeo/pandas that referenced this issue Oct 1, 2018