SetRectangle maybe has an odd thing #845
Comments
Hello, SetRectangle appears to be broken in v4; cf. sirfz/tesserocr#26. In the meantime, you are probably better off creating a sub-image yourself and performing OCR on it.
@bpotard
A few more details: I made a minor change to the base API so that it calls SetRectangle just after loading an image:
Then run tesseract on testing/phototest.tif. With branch 3.04:
With master branch:
leptonica-1.74.1 is used in both cases, on clean debian/jessie64 VMs with identical configurations.
@bpotard So you mean version 4.0 has a bug? Can we use the "box" portion of another version instead of 4.0?
@bpotard And how do I free the memory? With delete [] utf8text? I run tesseract on more than 500 images, but it tells me it is out of memory.
Yes, you can use version 3.x instead of version 4.0 if you really need the SetRectangle call. Alternatively, you can create an image corresponding to the rectangle you want to recognise and recognise that instead. It won't be exactly equivalent, because the bounding boxes will be shifted compared to the ones in the original image, but they are easy to correct.
@bpotard Can you elaborate on why the bounding boxes would be shifted?
@abieler Because the bounding boxes of each element (paragraph, word, etc.) would be relative to the "new" sub-image you have created rather than to the original image, whereas SetRectangle would normally return bounding boxes relative to the original image. So if you need to know precisely where the recognised text comes from in the original image, you need one additional step: shift the returned bounding boxes into the original coordinate space. That is admittedly not very hard to do: just add the coordinates of the top-left corner of your extracted sub-image to all bounding boxes.
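The shift described above can be sketched as follows. This is only an illustration: `Box` and `ShiftToOriginal` are hypothetical names, not part of the Tesseract API, and the sketch assumes the boxes are given as left/top/right/bottom pixel coordinates measured on the sub-image.

```cpp
#include <vector>

// Hypothetical box type for illustration; not a Tesseract type.
struct Box {
    int left;
    int top;
    int right;
    int bottom;
};

// Translate boxes measured on a cropped sub-image back into the coordinate
// space of the original image. (crop_left, crop_top) is the top-left corner
// of the sub-image within the original image.
std::vector<Box> ShiftToOriginal(const std::vector<Box>& boxes,
                                 int crop_left, int crop_top) {
    std::vector<Box> shifted;
    shifted.reserve(boxes.size());
    for (const Box& b : boxes) {
        shifted.push_back({b.left + crop_left, b.top + crop_top,
                           b.right + crop_left, b.bottom + crop_top});
    }
    return shifted;
}
```

For a sub-image cut out at (126, 40), a word box reported at (3, 4)-(7, 9) in the sub-image would map to (129, 44)-(133, 49) in the original image.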
Thanks @bpotard ! I just started using the API and |
If you are using the master branch, SetRectangle is probably still broken and will not work; the bug has not been fixed as far as I know. If you really need the functionality, either use the 3.x branch of tesseract, or create your own sub-images and process them as whole images using the normal API. Do not use SetRectangle in tesseract 4.x. Alternatively, you can try to figure out where the bug in SetRectangle comes from and fix it :-)
Sorry, that was my mistake, I actually am using sub-images and not |
Can you please provide a test case that demonstrates your problem? I cannot reproduce it with:
and the result is:
Which is exactly what e.g. gimp shows for this area. Or am I missing something?
Closed as not reproducible with current code.
https://groups.google.com/g/tesseract-ocr/c/PMHq6YSpRRE In the linked thread you confirmed that |
Notes:
Here is my test code:
#include <leptonica/allheaders.h>
#include <tesseract/baseapi.h>
int main() {
    // Show version info
    char *versionStrP;
    printf("tesseract %s\n", tesseract::TessBaseAPI::Version());
    versionStrP = getLeptonicaVersion();
    printf(" %s\n", versionStrP);
    lept_free(versionStrP);
    versionStrP = getImagelibVersions();
    printf(" %s\n", versionStrP);
    lept_free(versionStrP);
    tesseract::TessBaseAPI *api = new tesseract::TessBaseAPI();
    tesseract::OcrEngineMode enginemode = tesseract::OEM_DEFAULT;
    // tesseract::OcrEngineMode enginemode = tesseract::OEM_TESSERACT_ONLY;
    api->Init(NULL, "eng", enginemode);
    Pix *image = pixRead("SetRectangle_test.png");
    api->SetImage(image);
    int w = pixGetWidth(image);
    int h = pixGetHeight(image);
    int h_adj = h * .3;
    api->SetRectangle(0, 0, w, h_adj);
    char *outTextSR = api->GetUTF8Text();
    printf("********\tOCR output after SetRectangle:\n%s", outTextSR);
    Pix *rect_pix = api->GetThresholdedImage();
    pixWrite("ocred_pix.png", rect_pix, IFF_PNG);
    api->SetImage(rect_pix);
    char *outTextSI = api->GetUTF8Text();
    printf("\n********\tOCR output SetImage:\n%s", outTextSI);
    api->End();
    pixDestroy(&image);
    pixDestroy(&rect_pix);
    delete[] outTextSR;
    delete[] outTextSI;
    delete api;
    return 0;
}
And here is the testing image (SetRectangle_test.png): With 8. testing row
9, testing row With 1. testing row
2. testing row
I also tried to use SetRectangle and ran into problems. Some calls resulted in a memory access violation.
@AdmiralPellaeon: Is the problem reproducible with the legacy engine?
@zdenop I tested the following:
Is this what you meant by the legacy engine (OEM_TESSERACT_ONLY)?
@AdmiralPellaeon: Yes, I mean |
@zdenop In my opinion it fits this issue: "SetRectangle maybe has an odd thing". That the source image format is the cause is only a guess of mine. But yes, I can open a new issue if this helps.
No, a "memory access violation" for an 8-bit image does not fit this issue.
Today I wanted to restrict recognition to a rectangular area, but the parameters do not seem to behave as described in baseapi.h:
void SetRectangle(int left, int top, int width, int height);
When I call SetRectangle(126, 40, 1152, 28), it recognizes (0, 0, 1152, 28) instead, and I don't know why.
I look forward to your reply. Thank you!
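Going by the signature quoted above, the last two arguments are a width and a height, not a second corner. A minimal sketch of converting corner coordinates into that form and clipping them to the image is shown below. `Rect` and `RectFromCorners` are hypothetical helpers for illustration, not part of the Tesseract API, and the sketch does not explain the behaviour reported in this issue.

```cpp
#include <algorithm>

// Hypothetical helper type for illustration; not part of the Tesseract API.
struct Rect {
    int left;
    int top;
    int width;
    int height;
};

// Limit v to the closed range [lo, hi].
inline int ClampTo(int v, int lo, int hi) {
    return std::max(lo, std::min(v, hi));
}

// Convert two corner points (x1, y1) and (x2, y2) into left/top/width/height,
// clipping the result to an image of size image_width x image_height.
Rect RectFromCorners(int x1, int y1, int x2, int y2,
                     int image_width, int image_height) {
    int left = ClampTo(std::min(x1, x2), 0, image_width);
    int top = ClampTo(std::min(y1, y2), 0, image_height);
    int right = ClampTo(std::max(x1, x2), 0, image_width);
    int bottom = ClampTo(std::max(y1, y2), 0, image_height);
    return {left, top, right - left, bottom - top};
}
```

The four fields of the result could then be passed, in order, as the left, top, width and height arguments of SetRectangle.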