Floating point exception when trying to OCR an image #3991

Open
MerlijnWajer opened this issue Jan 13, 2023 · 12 comments
@MerlijnWajer
Contributor

Basic Information

tesseract 5.3.0-1-gd3a4
 leptonica-1.80.0
  libgif 5.2.1 : libjpeg 6b (libjpeg-turbo 2.0.6) : libpng 1.6.37+apng : libtiff 4.3.0 : zlib 1.2.11 : libopenjp2 2.4.0
 Found AVX2
 Found AVX
 Found FMA
 Found SSE4.1
 Found OpenMP 201511
 Found libarchive 3.6.1 zlib/1.2.11 liblzma/5.2.5 bz2lib/1.0.8
 Found libcurl/7.79.1 GnuTLS/3.7.3 (OpenSSL/1.1.1n) zlib/1.2.11 nghttp2/1.45.1

Operating System

No response

Other Operating System

Gentoo Linux, but also Ubuntu 18.04, 20.04, etc

uname -a

Linux gentoo-x13 5.11.7-gentoo-dist #1 SMP Wed Mar 17 21:03:41 -00 2021 x86_64 AMD Ryzen 7 PRO 4750U with Radeon Graphics AuthenticAMD GNU/Linux

Compiler

GCC 11.2.1

Virtualization / Containers

No response

CPU

AMD Ryzen 7 PRO 4750U with Radeon Graphics

Current Behavior

tesseract -c hocr_char_boxes=1 --dpi 400 ~/Downloads/UNI_1918030101_0003.jp2 - hocr
Floating point exception

Expected Behavior

I would expect Tesseract not to receive SIGFPE.

Suggested Fix

No response

Other Information

The problem doesn't occur if I decompress the JPEG2000 to a TIFF, so perhaps there is a problem with the JPEG2000 handling.

The image is here: https://archive.org/~merlijn/UNI_1918030101_0003.jp2

@MerlijnWajer
Contributor Author

I should add that this also occurs with Tesseract 5.1.0

@MerlijnWajer
Contributor Author

I ran the program through gdb and it seems to fail in libopenjp2 (I just upgraded to 2.5.0, my bug report mentions 2.4.0):

Program received signal SIGFPE, Arithmetic exception.
0x00007ffff7143da2 in opj_v8dwt_decode () from /usr/lib64/libopenjp2.so.7
(gdb) bt
#0  0x00007ffff7143da2 in opj_v8dwt_decode () from /usr/lib64/libopenjp2.so.7
#1  0x00007ffff7148ee4 in opj_dwt_decode_real () from /usr/lib64/libopenjp2.so.7
#2  0x00007ffff7183675 in opj_tcd_decode_tile () from /usr/lib64/libopenjp2.so.7
#3  0x00007ffff715d507 in opj_j2k_decode_tile () from /usr/lib64/libopenjp2.so.7
#4  0x00007ffff715d834 in opj_j2k_decode_tiles () from /usr/lib64/libopenjp2.so.7
#5  0x00007ffff7153f5b in opj_j2k_exec () from /usr/lib64/libopenjp2.so.7
#6  0x00007ffff715f38b in opj_j2k_decode () from /usr/lib64/libopenjp2.so.7
#7  0x00007ffff7164154 in opj_jp2_decode () from /usr/lib64/libopenjp2.so.7
#8  0x00007ffff7a987c6 in pixReadStreamJp2k () from /usr/lib64/liblept.so.5
#9  0x00007ffff7b462b4 in pixReadStream () from /usr/lib64/liblept.so.5
#10 0x00007ffff7b464f6 in pixRead () from /usr/lib64/liblept.so.5
#11 0x0000555555558090 in main (argc=<optimized out>, argv=<optimized out>) at src/tesseract.cpp:830

@stweil
Contributor

stweil commented Jan 13, 2023

Running tesseract https://archive.org/~merlijn/UNI_1918030101_0003.jp2 - -c hocr_char_boxes=1 --dpi 400 hocr with Tesseract from Homebrew on macOS (M1) works fine.

On Debian GNU Linux (x86_64) I can reproduce the crash.

@zdenop
Contributor

zdenop commented Jan 13, 2023

It works for me on Windows too:

tesseract 5.3.0-9-g8a263
 leptonica-1.83.0 (Dec 21 2022, 14:00:53) [MSC v.1929 LIB Release x64]
  libgif 5.2.1 : libjpeg 6b (libjpeg-turbo 2.0.91) : libpng 1.6.37 : libtiff 4.4.0 : zlib 1.2.12 : libwebp 1.2.2 : libopenjp2 2.5.0
 Found AVX2
 Found AVX
 Found FMA
 Found SSE4.1
 Found OpenMP 200203
 Found libarchive 3.5.1 zlib/1.2.11 liblzma/5.2.4 bz2lib/1.0.6 libzstd/1.4.9
 Found libcurl/7.75.0 zlib/1.2.12 libssh2/1.10.1_DEV

@stweil
Contributor

stweil commented Jan 13, 2023

I have now run a test with the latest code for openjpeg and leptonica. The crash occurs in opj_v8dwt_decode_step1_sse, which tries to execute _mm_mul_ps in openjpeg/src/lib/openjp2/dwt.c:3032. That statement should multiply {1.57201982e+19, 4.50785901e-29, -2.76601138e-12, 2.98650024e+38} with {1.62573242, 1.62573242, 1.62573242, 1.62573242}. With float values, 2.98650024e+38 * 1.62573242 overflows: the product is about 4.86e+38, while the largest finite float is about 3.40e+38. The Tesseract code enables FPU exceptions by default to detect such overflow when it is built with the GNU compiler.
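A minimal sketch of the overflow described above, using the last lane of the vector quoted in this comment. It only inspects the FE_OVERFLOW status flag through the standard `<cfenv>` functions and enables no trap, so it should not abort. `MulResult` and `checked_mul` are names invented for this illustration; they are not part of openjpeg or Tesseract.

```cpp
#include <cfenv>

// Result of a single-precision multiply plus the state of the overflow flag.
// Both names here are hypothetical helpers for this illustration.
struct MulResult {
  float value;
  bool overflowed;
};

MulResult checked_mul(float a, float b) {
  // volatile keeps the compiler from folding the product at compile time,
  // so the multiplication really happens at run time.
  volatile float x = a;
  volatile float y = b;
  std::feclearexcept(FE_ALL_EXCEPT);  // start from a clean set of flags
  volatile float product = x * y;
  bool overflowed = std::fetestexcept(FE_OVERFLOW) != 0;
  return {product, overflowed};
}
```

Called as `checked_mul(2.98650024e+38f, 1.62573242f)`, this should return an infinite value with `overflowed` set, because the exact product lies outside the float range. With the overflow trap enabled, the same multiplication would deliver SIGFPE instead.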

If you build Tesseract with clang, only an FP division by zero raises an exception, so a clang build should not abort.

It would also be possible to disable the code which enables FPU exceptions, either by removing it or by adding a condition which skips it.
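One way such a condition could look, as a sketch only: the set of trapped exceptions is chosen from an environment variable at startup. The variable name `EXAMPLE_FP_TRAPS`, its values, and both function names are assumptions made up for this illustration; Tesseract has no such option as far as this thread shows. `feenableexcept` and `fedisableexcept` are glibc extensions, so the sketch guards them with `__GLIBC__`.

```cpp
#include <cfenv>
#include <cstdlib>
#include <cstring>

// Hypothetical policy: map a setting string to the FP exceptions to trap.
// An unset or unrecognized setting keeps the full set named in this thread.
int trap_mask_for(const char* setting) {
  if (setting != nullptr) {
    if (std::strcmp(setting, "none") == 0) return 0;
    if (std::strcmp(setting, "divbyzero") == 0) return FE_DIVBYZERO;
  }
  return FE_DIVBYZERO | FE_OVERFLOW | FE_INVALID;
}

// Hypothetical startup hook. EXAMPLE_FP_TRAPS is an invented variable name.
void configure_fp_traps() {
  int mask = trap_mask_for(std::getenv("EXAMPLE_FP_TRAPS"));
#if defined(__GLIBC__)
  fedisableexcept(FE_ALL_EXCEPT);        // drop whatever was enabled before
  if (mask != 0) feenableexcept(mask);   // then trap only the chosen set
#else
  (void)mask;  // no portable way to enable traps; leave the defaults alone
#endif
}
```

Keeping the default at the full set preserves the current abort-on-overflow behaviour, so only users who opt out would lose the early warning.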

@stweil
Contributor

stweil commented Jan 13, 2023

> It works for me on Windows too:

Then the code which enables FPU exceptions is not enabled for Windows builds.

@stweil
Contributor

stweil commented Jan 13, 2023

I just made a test with feenableexcept(FE_DIVBYZERO | FE_OVERFLOW | FE_INVALID) replaced by feenableexcept(FE_DIVBYZERO) in the Tesseract code. That avoids the FPU exception and produces OCR output.
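The effect of the narrower mask can be sketched in isolation. The helper below, whose name `mul_with_traps` is invented for this illustration, enables a given set of traps with glibc's `feenableexcept`, performs the overflowing multiplication from this issue, and restores the previous trap state. It assumes Linux with glibc on a CPU that supports FP traps, and it is not the Tesseract code itself.

```cpp
#include <cfenv>

// Hypothetical helper: multiply a * b while only `trap_mask` exceptions trap.
float mul_with_traps(int trap_mask, float a, float b) {
  volatile float x = a;  // volatile prevents compile-time folding
  volatile float y = b;
  int previous = fedisableexcept(FE_ALL_EXCEPT);  // remember enabled traps
  std::feclearexcept(FE_ALL_EXCEPT);
  if (trap_mask != 0) feenableexcept(trap_mask);
  // If FE_OVERFLOW were part of trap_mask, an overflowing product would
  // deliver SIGFPE on the next line instead of returning infinity.
  volatile float product = x * y;
  std::feclearexcept(FE_ALL_EXCEPT);  // clear flags before restoring traps
  fedisableexcept(FE_ALL_EXCEPT);
  if (previous > 0) feenableexcept(previous);
  return product;
}
```

With `mul_with_traps(FE_DIVBYZERO, 2.98650024e+38f, 1.62573242f)` the call should simply return infinity, which matches the observation that the reduced mask lets OCR proceed. Passing a mask that includes FE_OVERFLOW is expected to terminate the process with a floating point exception.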

@MerlijnWajer
Contributor Author

Thanks for figuring this out. I wonder if we should consider this to be a bug in openjp2 ultimately?

Another option might be to enable some of these exceptions only for debug builds and not for release builds, but I don't know if that makes sense.

@stweil
Contributor

stweil commented Jan 14, 2023

Yes, I also think that is a bug in openjp2. Ignoring FP exceptions and calculating with NAN does not look like normal behaviour.

I assume that you process a huge number of JPEG 2000 images every day. How many of those trigger this FPE?

I enabled the exceptions in all builds because I want to know when the code does something unexpected which might result in unexpected OCR results. Debug builds won't find such cases because they are not used to process large quantities of different images. It would be possible to change that strategy and only report floating point exceptions without aborting, but I am afraid that people will simply ignore a warning message and not report it as an issue here.
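The report-without-aborting strategy could be sketched with the standard `<cfenv>` status flags: clear them, run the work, then print a warning if any of the three exceptions named in this thread was raised. `describe_fp_flags` and `run_and_report` are hypothetical names for this illustration, and the sketch assumes that no FP traps are enabled while the work runs.

```cpp
#include <cfenv>
#include <cstdio>
#include <string>

// Lists which of the three exceptions discussed here are currently flagged.
std::string describe_fp_flags() {
  std::string out;
  if (std::fetestexcept(FE_DIVBYZERO)) out += "divbyzero ";
  if (std::fetestexcept(FE_OVERFLOW)) out += "overflow ";
  if (std::fetestexcept(FE_INVALID)) out += "invalid ";
  return out;
}

// Runs `work`, then warns on stderr instead of aborting.
// Returns true when none of the checked exceptions was raised.
template <typename F>
bool run_and_report(const char* what, F&& work) {
  std::feclearexcept(FE_ALL_EXCEPT);
  work();
  std::string flags = describe_fp_flags();
  if (!flags.empty()) {
    std::fprintf(stderr, "warning: %s raised FP exception(s): %s\n", what,
                 flags.c_str());
  }
  return flags.empty();
}
```

The trade-off is the one raised above: a warning on stderr is easy to overlook, whereas an abort is not.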

@stweil added the bug label Jan 14, 2023
@MerlijnWajer
Contributor Author

> Yes, I also think that is a bug in openjp2. Ignoring FP exceptions and calculating with NAN does not look like normal behaviour.
>
> I assume that you process a huge number of JPEG 2000 images every day. How many of those trigger this FPE?

Not many, I believe, but it's hard to know for sure. There is a percentage of images where we continue with an empty page if Tesseract exits abnormally. Unrelated to this issue, but worth mentioning: we do log this. It is described here: https://archive.org/developers/ocr.html#adaptive-ocr and this search query finds any items where at least one page caused Tesseract to crash:
https://archive.org/search.php?query=ocr_degraded%3Atesseract-crash&sin= - at the time of writing there are 2972, which isn't all that much given that we've processed over 12 million books/magazines with Tesseract. (~0.02%?)
I could go through some of these to help find other potential problems.

> I enabled the exceptions in all builds because I want to know when the code does something unexpected which might result in unexpected OCR results. Debug builds won't find such cases because they are not used to process large quantities of different images. It would be possible to change that strategy and only report floating point exceptions without aborting, but I am afraid that people will simply ignore a warning message and not report it as an issue here.

I personally think aborting makes sense here. It could perhaps be optionally disabled with a flag, but I think there's great value in finding bugs like these, so we can keep it on. One thing to note though is that opj_decompress doesn't raise an error, presumably since it doesn't set this FP exceptions flag.

@stweil
Contributor

stweil commented Jan 14, 2023

> One thing to note though is that opj_decompress doesn't raise an error, presumably since it doesn't set this FP exceptions flag.

That was my first test, and yes, that's because it does not enable any FP exceptions.
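For comparison, a caller that does enable traps could get the same non-trapping behaviour for one third-party call by masking the traps just around it. The sketch below is an assumption about how that might be done, not something Tesseract or leptonica provides: `ScopedFpTrapsDisabled` and `call_without_fp_traps` are invented names, and `fedisableexcept` and `feenableexcept` are glibc extensions.

```cpp
#include <cfenv>

// Hypothetical RAII guard: disables all FP traps on construction and
// restores the previously enabled set on destruction.
class ScopedFpTrapsDisabled {
 public:
  ScopedFpTrapsDisabled() : previous_(fedisableexcept(FE_ALL_EXCEPT)) {}
  ~ScopedFpTrapsDisabled() {
    // Clear flags raised while traps were off before re-enabling traps.
    std::feclearexcept(FE_ALL_EXCEPT);
    if (previous_ > 0) feenableexcept(previous_);
  }
  ScopedFpTrapsDisabled(const ScopedFpTrapsDisabled&) = delete;
  ScopedFpTrapsDisabled& operator=(const ScopedFpTrapsDisabled&) = delete;

 private:
  int previous_;  // trap set that was enabled before the guard
};

// Runs `decode` (for example, a lambda that reads an image) with traps off.
template <typename F>
auto call_without_fp_traps(F&& decode) -> decltype(decode()) {
  ScopedFpTrapsDisabled guard;
  return decode();
}
```

This would hide the openjp2 overflow from the caller while keeping traps active for the rest of the program, at the cost of no longer detecting problems inside the wrapped call.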

@amitdo
Collaborator

amitdo commented Jan 14, 2023

[off-topic]

In the cases that Tesseract prints 'Empty page!!', you can re-run Tesseract with Sauvola.
