Unable to read file (failed to read OLE block) #76

Closed
smsaladi opened this issue Jun 6, 2020 · 2 comments

smsaladi commented Jun 6, 2020

I have an Excel file generated by an application that reads data off an instrument. When tested with xls2csv, libxls does not appear to parse Excel files exported by this application: the output consists only of blank lines, as many as the file is long.

Running xls2csv 2020-06-02_02-39-48_Quantitation_Summary.xls -v prints the following to stderr:

Error: fread wanted 1 got 0 loc=8192
Error: Unable to read sector #15
Error: failed to read OLE block

stdout: output.txt

GitHub doesn't accept .xls attachments, so I've zipped the file:
2020-06-02_02-39-48_Quantitation_Summary.xls.zip

For reference, I compiled libxls from the 1.5.2 release using gcc-9 on macOS 10.14.6:

(base) ➜  2020-06-02_02-39-48 gcc-9 -v
Using built-in specs.
COLLECT_GCC=gcc-9
COLLECT_LTO_WRAPPER=/usr/local/Cellar/gcc/9.2.0_2/libexec/gcc/x86_64-apple-darwin18/9.2.0/lto-wrapper
Target: x86_64-apple-darwin18
Configured with: ../configure --build=x86_64-apple-darwin18 --prefix=/usr/local/Cellar/gcc/9.2.0_2 --libdir=/usr/local/Cellar/gcc/9.2.0_2/lib/gcc/9 --disable-nls --enable-checking=release --enable-languages=c,c++,objc,obj-c++,fortran --program-suffix=-9 --with-gmp=/usr/local/opt/gmp --with-mpfr=/usr/local/opt/mpfr --with-mpc=/usr/local/opt/libmpc --with-isl=/usr/local/opt/isl --with-system-zlib --with-pkgversion='Homebrew GCC 9.2.0_2' --with-bugurl=https://github.com/Homebrew/homebrew-core/issues --disable-multilib --with-native-system-header-dir=/usr/include --with-sysroot=/Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk
Thread model: posix
gcc version 9.2.0 (Homebrew GCC 9.2.0_2)

In case it's helpful, pandas (which uses xlrd under the hood) is able to process the file, but with a warning:

In [3]: df = pd.read_excel("2020-06-02_02-39-48_Quantitation_Summary.xls")
WARNING *** file size (8461) not 512 + multiple of sector size (512)

In [4]: df.head()
Out[4]:
   Unnamed: 0 Well Fluor  Content Sample        C(t)  SQ
0         NaN  A03  SYBR  Unkn-01    H2O   62.048927 NaN
1         NaN  A04  SYBR  Unkn-05    H2O   68.577469 NaN
2         NaN  A09  SYBR   NTC-09    H2O   60.147350 NaN
3         NaN  A10  SYBR   NTC-13    H2O   85.389522 NaN
4         NaN  B03  SYBR  Unkn-02    CVS  106.360012 NaN
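
The numbers in the two error reports appear to line up, and a little arithmetic shows how. The sketch below assumes the usual OLE2 layout of a 512-byte header followed by 512-byte sectors (the sector size named in the warning above); the function names are hypothetical and are not part of libxls or xlrd.

```python
HEADER_SIZE = 512  # assumption: the OLE2 header occupies the first 512 bytes


def sector_offset(sector, sector_size=512):
    """Byte offset at which the given sector starts."""
    return HEADER_SIZE + sector * sector_size


def describe_truncation(file_size, sector_size=512):
    """Return (number of complete sectors, bytes left in a trailing partial sector)."""
    return divmod(file_size - HEADER_SIZE, sector_size)


if __name__ == "__main__":
    full, leftover = describe_truncation(8461)  # file size from the warning
    print(f"complete sectors: {full}, trailing bytes: {leftover}")
    print(f"sector #{full} starts at byte {sector_offset(full)}")
```

Under these assumptions the 8461-byte file holds 15 complete sectors (#0 to #14) plus 269 bytes of sector #15, which starts at byte 8192. That matches the `loc=8192` and `sector #15` in the stderr output, and is consistent with a read of one whole 512-byte sector failing at the end of the file.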

smsaladi changed the title from "Unable to read file" to "Unable to read file (failed to read OLE block)" on Jun 6, 2020

evanmiller added a commit that referenced this issue on Jun 6, 2020

evanmiller (Collaborator) commented

Hi, thank you for the bug report. It looks like this file ends abruptly for some reason; I guess the instrument in question isn't producing fully conformant XLS files. Try this:

beebd66
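
For anyone who cannot rebuild libxls from that commit, one possible workaround is to zero-pad a copy of the file up to the next sector boundary before parsing it. This is only a sketch and is untested against the attached file; it is not a description of what the commit does. It assumes a 512-byte header and 512-byte sectors, and `padded_size` and `pad_copy` are hypothetical helper names.

```python
import shutil


def padded_size(size, sector_size=512, header_size=512):
    """Smallest size >= `size` equal to header_size plus a whole number of sectors."""
    remainder = (size - header_size) % sector_size
    return size if remainder == 0 else size + (sector_size - remainder)


def pad_copy(src, dst, sector_size=512):
    """Copy src to dst, append zero bytes so dst ends on a sector boundary,
    and return the new size. The source file is left untouched."""
    shutil.copyfile(src, dst)
    with open(dst, "r+b") as f:
        f.seek(0, 2)  # move to the end of the file
        size = f.tell()
        target = padded_size(size, sector_size)
        f.write(b"\0" * (target - size))
    return target
```

For the 8461-byte file in this report, `padded_size(8461)` gives 8704, that is, 243 zero bytes appended. Whether a parser then reads the padded copy correctly depends on what the truncated sector was supposed to contain, so compare the result against the pandas output.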

smsaladi commented Jun 6, 2020

That did it -- thank you!
