jreback changed the title from "drop_duplicates() throws exception if DF has duplicate column names" to "BUG: drop_duplicates() throws exception if DF has duplicate column names" on Oct 10, 2017
Code Sample, a copy-pastable example if possible
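The snippet itself is missing from this copy of the report. A minimal reproduction consistent with the frame shown under Expected Output might look like the sketch below; the constructor call and the variable names `df` and `result` are assumptions, not the reporter's original code.

```python
import pandas as pd

# Assumed reconstruction: two columns that share the name "a",
# matching the frame shown under "Expected Output".
df = pd.DataFrame([[1, 2], [3, 4]], columns=["a", "a"])

try:
    # Reported to raise ValueError on pandas 0.18.1.
    result = df.drop_duplicates()
    print(result)
except ValueError as exc:
    result = None
    print("drop_duplicates() failed:", exc)
```

On a pandas version where this is fixed, the call should simply return the frame unchanged, since no rows are duplicated.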
Problem description
May be related to #10161.
The code above raises the following exception:
[...]
/opt/virtualenvs/luigi/local/lib/python2.7/dist-packages/pandas/core/frame.pyc in f(vals)
3095 labels, shape = algos.factorize(vals,
3096 size_hint=min(len(self),
-> 3097 _SIZE_HINT_LIMIT))
3098 return labels.astype('i8', copy=False), len(shape)
3099
/opt/virtualenvs/luigi/local/lib/python2.7/dist-packages/pandas/core/algorithms.pyc in factorize(values, sort, order, na_sentinel, size_hint)
183 table = hash_klass(size_hint or len(vals))
184 uniques = vec_klass()
--> 185 labels = table.get_labels(vals, uniques, 0, na_sentinel, True)
186
187 labels = com._ensure_platform_int(labels)
pandas/hashtable.pyx in pandas.hashtable.Int64HashTable.get_labels (pandas/hashtable.c:7941)()
ValueError: Buffer has wrong number of dimensions (expected 1, got 2)
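The traceback is consistent with a per-column label lookup returning 2-D data: with duplicate column names, selecting `df["a"]` yields a DataFrame, not a Series, so `factorize` would be handed a 2-D array. As a possible workaround on affected versions, the duplicate mask can be computed on a copy with positional column labels and then applied to the original frame. This is only a sketch; the helper name `drop_duplicates_positional` is made up for illustration and is not part of pandas.

```python
import pandas as pd


def drop_duplicates_positional(df):
    """Drop duplicate rows without looking columns up by (non-unique) name.

    Hypothetical helper, not a pandas API.
    """
    tmp = df.copy()
    # Replace the column labels with unique positions 0..n-1.
    tmp.columns = range(tmp.shape[1])
    # True for every row that repeats an earlier row.
    mask = tmp.duplicated()
    # Filter the original frame by position so its column names are kept.
    return df.iloc[~mask.values]


df = pd.DataFrame([[1, 2], [3, 4], [1, 2]], columns=["a", "a"])

# Selecting by the duplicated label gives a 2-D object.
print(df["a"].ndim)

deduped = drop_duplicates_positional(df)
print(deduped)
```

The helper keeps the first occurrence of each row, which matches the default `keep='first'` behaviour of `drop_duplicates()`.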
Expected Output
I was expecting the DataFrame to be returned unchanged, as it contains no duplicate rows:
   a  a
0  1  2
1  3  4
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 2.7.12.final.0
python-bits: 64
OS: Linux
OS-release: 4.9.32-15.41.amzn1.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
pandas: 0.18.1
nose: None
pip: 9.0.1
setuptools: 36.4.0
Cython: None
numpy: 1.11.0
scipy: 0.17.1
statsmodels: 0.8.0
xarray: None
IPython: 5.4.1
sphinx: None
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 2.0.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.999999999
httplib2: None
apiclient: None
sqlalchemy: 1.1.14
pymysql: 0.7.11.None
psycopg2: None
jinja2: 2.7.3
boto: 2.48.0
pandas_datareader: None