Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sheet_name argument from read_excel is always giving the first sheet #17107

Closed
rubennj opened this issue Jul 28, 2017 · 10 comments
Closed

sheet_name argument from read_excel is always giving the first sheet #17107

rubennj opened this issue Jul 28, 2017 · 10 comments
Labels
IO Excel read_excel, to_excel

Comments

@rubennj
Copy link

rubennj commented Jul 28, 2017

Code Sample, a copy-pastable example if possible

pd.read_excel(filename, sheet_name='sheet1')
pd.read_excel(filename, sheet_name='sheet2')

Problem description

First of all, right now there is PR #16442 to homogenize arguments names, but in 0.20.3 both arguments are accepted although only sheetname is documented.

The problem is that sheet_name is silently failing and returning always the first sheet.

sheetname is OK, but with PR #16442 this issue could be already fixed or even more messy.

Expected Output

The chosen sheet.

Output of pd.show_versions()

commit: None
python: 3.6.2.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 61 Stepping 4, GenuineIntel
byteorder: little
LC_ALL: None
LANG: es
LOCALE: None.None

pandas: 0.20.3
pytest: None
pip: 9.0.1
setuptools: 27.2.0
Cython: None
numpy: 1.13.1
scipy: 0.19.1
xarray: None
IPython: 6.1.0
sphinx: 1.6.2
patsy: None
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: 0.4.0
matplotlib: 2.0.2
openpyxl: 2.4.7
xlrd: 1.0.0
xlwt: None
xlsxwriter: 0.9.6
lxml: 3.7.3
bs4: 4.6.0
html5lib: 0.999
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None

@gfyoung gfyoung added the IO Excel read_excel, to_excel label Jul 28, 2017
@gfyoung
Copy link
Member

gfyoung commented Jul 28, 2017

@rubennj : Thanks for the report! We have this catch-all **kwds argument in read_excel that is accepting anything and everything as parameters, even if they don't do anything. Thus, the parameter silently fails on 0.20.3 because sheet_name doesn't do anything in that release. It won't fail like that in 0.21.0.

I hope this clarifies your confusion!

@gfyoung gfyoung closed this as completed Jul 28, 2017
@gfyoung gfyoung added this to the No action milestone Jul 28, 2017
@praveern
Copy link

praveern commented Dec 19, 2017

Hello, I'm facing the above mentioned issue of "sheet_name" returning the first sheet always, with pandas 0.21.1

I tried switching to older versions but it didn't work out.

@JackZL
Copy link

JackZL commented Dec 26, 2017

Hi, I used sheet_name = ['Sheet1', 'Sheet2'], it's supposed to return a dict of dataframe, but it only returns the first sheet

@jorisvandenbossche
Copy link
Member

@PraveerNarwelkar @JackZL Can you show the result of pd.show_versions ? (this should work on 0.21.0)

@JackZL
Copy link

JackZL commented Dec 27, 2017

@jorisvandenbossche Sorry, my mistake. it works fine in 0.21.0. my previous version is 0.20.3.

@waltonereed
Copy link

I can also confirm this is resolved after upgrading from 0.20.1 to 0.23.4

@sergodeeva
Copy link

Thank you! Upgrading to 0.23.4 helped.

@dantp-ai
Copy link

Upgrading did not help in my case

INSTALLED VERSIONS
------------------
commit: None

pandas: 0.23.4
pytest: 3.10.1
pip: 10.0.1
setuptools: 39.0.1
Cython: None
numpy: 1.15.4
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 7.2.0
sphinx: None
patsy: None
dateutil: 2.7.5
pytz: 2018.7
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 3.0.2
openpyxl: None
xlrd: 1.1.0
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

@KodzghlyCZ
Copy link

I can confirm this issue is still happening. I suspect there is some issue with the encoding of the text.

INSTALLED VERSIONS

commit : 0f43794
python : 3.11.8.final.0
python-bits : 64
OS : Linux
OS-release : 6.8.4-lqx1-2-lqx
Version : #1 ZEN SMP PREEMPT Wed, 10 Apr 2024 06:15:33 +0000
machine : x86_64
processor :
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 2.0.3
numpy : 1.23.5
pytz : 2023.3.post1
dateutil : 2.8.2
setuptools : 68.2.2
pip : 23.2.1
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : None
brotli : None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : 3.6.3
numba : None
numexpr : None
odfpy : None
openpyxl : 3.1.2
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : None
snappy : None
sqlalchemy : 2.0.21
tables : None
tabulate : None
xarray : None
xlrd : None
zstandard : None
tzdata : 2023.3
qtpy : None
pyqt5 : None

@mstrayer88
Copy link

This issue is still happening. Running Linux Mint 21.3, xfce 4.18.1

Trying to read an Excel xlsx file with 4 sheets from a PyQt6 app using Python 3.11
The call:
self.df = pd.read_excel(self.startdir, sheet_name=1)

sheet_name=0 works fine. Otherwise the call fails. Doesn't matter if I use the index or the sheet name.

Traceback (most recent call last):
File "/home/jms/JMS_Code/PyQt6/5-Status1/Table1.py", line 85, in funcGo
self.df = pd.read_excel(self.startdir, sheet_name=1, usecols="A:B", skiprows=0)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jms/JMS_Code/PyQt6/5-Status1/.venv/lib/python3.11/site-packages/pandas/io/excel/_base.py", line 508, in read_excel
data = io.parse(
^^^^^^^^^
File "/home/jms/JMS_Code/PyQt6/5-Status1/.venv/lib/python3.11/site-packages/pandas/io/excel/_base.py", line 1616, in parse
return self._reader.parse(
^^^^^^^^^^^^^^^^^^^
File "/home/jms/JMS_Code/PyQt6/5-Status1/.venv/lib/python3.11/site-packages/pandas/io/excel/_base.py", line 778, in parse
data = self.get_sheet_data(sheet, file_rows_needed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jms/JMS_Code/PyQt6/5-Status1/.venv/lib/python3.11/site-packages/pandas/io/excel/_openpyxl.py", line 615, in get_sheet_data
for row_number, row in enumerate(sheet.rows):
File "/home/jms/JMS_Code/PyQt6/5-Status1/.venv/lib/python3.11/site-packages/openpyxl/worksheet/_read_only.py", line 85, in _cells_by_row
for idx, row in parser.parse():
File "/home/jms/JMS_Code/PyQt6/5-Status1/.venv/lib/python3.11/site-packages/openpyxl/worksheet/_reader.py", line 159, in parse
dispatchertag_name
File "/home/jms/JMS_Code/PyQt6/5-Status1/.venv/lib/python3.11/site-packages/openpyxl/worksheet/_reader.py", line 309, in parse_formatting
cf = ConditionalFormatting.from_tree(element)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jms/JMS_Code/PyQt6/5-Status1/.venv/lib/python3.11/site-packages/openpyxl/descriptors/serialisable.py", line 87, in from_tree
obj = desc.expected_type.from_tree(el)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jms/JMS_Code/PyQt6/5-Status1/.venv/lib/python3.11/site-packages/openpyxl/descriptors/serialisable.py", line 103, in from_tree
return cls(**attrib)
^^^^^^^^^^^^^
File "/home/jms/JMS_Code/PyQt6/5-Status1/.venv/lib/python3.11/site-packages/openpyxl/formatting/rule.py", line 201, in init
self.operator = operator
^^^^^^^^^^^^^
File "/home/jms/JMS_Code/PyQt6/5-Status1/.venv/lib/python3.11/site-packages/openpyxl/descriptors/base.py", line 147, in set
super().set(instance, value)
File "/home/jms/JMS_Code/PyQt6/5-Status1/.venv/lib/python3.11/site-packages/openpyxl/descriptors/base.py", line 132, in set
raise ValueError(self.doc)
ValueError: Value must be one of {'greaterThan', 'greaterThanOrEqual', 'containsText', 'endsWith', 'notBetween', 'beginsWith', 'lessThan', 'between', 'lessThanOrEqual', 'notEqual', 'notContains', 'equal'}
Aborted (core dumped)

from pip list in my venv
Package Version


et_xmlfile 2.0.0
numpy 2.1.3
openpyxl 3.1.5
pandas 2.2.3
pip 24.3.1
PyQt6 6.7.1
PyQt6-Qt6 6.7.3
PyQt6_sip 13.8.0
python-dateutil 2.9.0.post0
pytz 2024.2
setuptools 75.3.0
six 1.16.0
tzdata 2024.2
wheel 0.44.0

from pandas.show_versions()
INSTALLED VERSIONS

commit : 0691c5c
python : 3.11.10
python-bits : 64
OS : Linux
OS-release : 6.8.0-48-generic
Version : #48~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Mon Oct 7 11:24:13 UTC 2
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 2.2.3
numpy : 2.1.3
pytz : 2024.2
dateutil : 2.9.0.post0
pip : 24.3.1
Cython : None
sphinx : None
IPython : None
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : None
blosc : None
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : None
html5lib : None
hypothesis : None
gcsfs : None
jinja2 : None
lxml.etree : None
matplotlib : None
numba : None
numexpr : None
odfpy : None
openpyxl : 3.1.5
pandas_gbq : None
psycopg2 : None
pymysql : None
pyarrow : None
pyreadstat : None
pytest : None
python-calamine : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlsxwriter : None
zstandard : None
tzdata : 2024.2
qtpy : None
pyqt5 : None

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
IO Excel read_excel, to_excel
Projects
None yet
Development

No branches or pull requests

10 participants