DataFrame[np.nan] raises TypeError with non-unique columns #21428

Closed
toobaz opened this issue Jun 11, 2018 · 0 comments · Fixed by #21313

toobaz commented Jun 11, 2018

Code Sample, a copy-pastable example if possible

In [2]: pd.DataFrame(index=range(3), columns=[1, 2, float('nan'), 2])[float('nan')]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-2-86ff82017f2a> in <module>()
----> 1 pd.DataFrame(index=range(3), columns=[1, 2, float('nan'), 2])[float('nan')]

/home/nobackup/repo/pandas/pandas/core/frame.py in __getitem__(self, key)
   2685             return self._getitem_multilevel(key)
   2686         else:
-> 2687             return self._getitem_column(key)
   2688 
   2689     def _getitem_column(self, key):

/home/nobackup/repo/pandas/pandas/core/frame.py in _getitem_column(self, key)
   2695 
   2696         # duplicate columns & possible reduce dimensionality
-> 2697         result = self._constructor(self._data.get(key))
   2698         if result.columns.is_unique:
   2699             result = result[key]

/home/nobackup/repo/pandas/pandas/core/internals.py in get(self, item, fastpath)
   4128 
   4129             if isna(item):
-> 4130                 raise TypeError("cannot label index with a null key")
   4131 
   4132             indexer = self.items.get_indexer_for([item])

TypeError: cannot label index with a null key

In [3]: pd.DataFrame(index=range(3), columns=[1, 2, float('nan'), 4])[float('nan')]
Out[3]: 
0    NaN
1    NaN
2    NaN
Name: nan, dtype: object

In [4]: pd.Series(index=[1, 2, float('nan'), 2])[float('nan')]
Out[4]: nan

Problem description

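With non-unique columns, DataFrame[np.nan] raises a TypeError even though the NaN label itself appears only once. The same lookup succeeds when the columns are unique (Out[3]) and on a Series with a non-unique index (Out[4]), so the DataFrame behavior is inconsistent with Series[np.nan].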

This is mistakenly tested here:

# multiple nans should fail

I will push a fix in a few minutes.

Expected Output

Same as Out[3].
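
For illustration only, here is a minimal sketch of how the NaN-labelled column could be reached without going through DataFrame.__getitem__, assuming a boolean mask over the columns is acceptable. The names nan_mask, selected and nan_column are hypothetical and not part of the report; the sketch is not the fix referenced in this issue, just a way to see the values that Out[3] shows.

```python
import pandas as pd

# Same frame as In [2]: the label 2 is duplicated, NaN appears once.
df = pd.DataFrame(index=range(3), columns=[1, 2, float("nan"), 2])

# Boolean mask marking the column(s) whose label is NaN.
nan_mask = df.columns.isna()

# Boolean selection is positional, so it does not depend on label
# lookup for the null key.
selected = df.loc[:, nan_mask]

# Exactly one column matches here; reduce it to a Series.
nan_column = selected.iloc[:, 0]
print(nan_column)
```

The mask-based selection returns a one-column DataFrame first, so the final .iloc step is what reduces it to a Series comparable to Out[3].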

Output of pd.show_versions()

INSTALLED VERSIONS

commit: 415012f
python: 3.5.3.final.0
python-bits: 64
OS: Linux
OS-release: 4.9.0-6-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: it_IT.UTF-8
LOCALE: it_IT.UTF-8

pandas: 0.24.0.dev0+83.g415012f4f
pytest: 3.5.0
pip: 9.0.1
setuptools: 39.2.0
Cython: 0.25.2
numpy: 1.14.3
scipy: 0.19.0
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: 1.5.6
patsy: 0.5.0
dateutil: 2.7.3
pytz: 2018.4
blosc: None
bottleneck: 1.2.0dev
tables: 3.3.0
numexpr: 2.6.1
feather: 0.3.1
matplotlib: 2.2.2.post1153+gff6786446
openpyxl: 2.3.0
xlrd: 1.0.0
xlwt: 1.3.0
xlsxwriter: 0.9.6
lxml: 4.1.1
bs4: 4.5.3
html5lib: 0.999999999
sqlalchemy: 1.0.15
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: 0.2.1

@toobaz toobaz added Indexing Related to indexing on series/frames, not to indexes themselves Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate labels Jun 11, 2018
@jreback jreback added this to the 0.24.0 milestone Jun 12, 2018
jreback pushed a commit that referenced this issue Jul 7, 2018
)

* BUG: fix DataFrame.__getitem__ and .loc with non-list listlikes

close #21294
close #21428
Sup3rGeo pushed a commit to Sup3rGeo/pandas that referenced this issue Oct 1, 2018
…das-dev#21313)

* BUG: fix DataFrame.__getitem__ and .loc with non-list listlikes

close pandas-dev#21294
close pandas-dev#21428