working with uzn file not working #66
Comments
Hi michi729,
Alternatively, I could overload the Process method so that it takes the input name as an optional parameter. Do you have any preference? Also, could you kindly provide an example image and the corresponding uzn file, with a brief description of the output you expect, so that I can write a test case to verify the implementation? Note that this should of course not contain any confidential or copyrighted material.
Hi Charles, thanks for the quick response!
Thanks, just what I needed.
Hi Charles, I am not sure whether this should be added as a parameter. Tesseract itself just replaces the suffix of the current picture's name, i.e. you could get the picture name from the path passed to LoadFromFile. What do you think?
In theory yes, but this would only work if the image was loaded from a file. Tesseract doesn't actually work this way: according to my analysis of the source, it relies on the image name being passed in as an additional parameter to its ProcessPage routine. It's a pretty simple fix really, so I should have it done sometime tomorrow, assuming no unforeseen issues arise.
You are right :-) And thanks for taking the time!
I just released an updated NuGet package (1.10) that supports uzn files through an optional parameter on Process, as previously discussed. Please note that a PSM of SingleColumn (4) does NOT work due to a bug in Tesseract 3.02 (https://code.google.com/p/tesseract-ocr/issues/detail?id=653); the other options do work. This issue will be resolved once Tesseract 3.03 has been released.
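For anyone landing here later, a minimal sketch of how the call might look with the input name passed in. This is an illustration under stated assumptions, not the confirmed API: it assumes the new parameter is a string placed directly after the image, that Tesseract derives the zone file name by swapping the image extension for .uzn, and that SingleBlock is one of the modes that works. `UznExample` and `ExpectedUznPath` are hypothetical names used only here, and both paths are taken from the command line.

```csharp
using System;
using System.IO;
using Tesseract;

static class UznExample
{
    // Hypothetical helper: the zone file Tesseract is assumed to look for,
    // i.e. the image path with its extension replaced by ".uzn".
    public static string ExpectedUznPath(string imagePath)
    {
        return Path.ChangeExtension(imagePath, ".uzn");
    }

    static void Main(string[] args)
    {
        string tessdataDir = args[0]; // tessdata directory
        string imagePath = args[1];   // e.g. pic1.bmp, with pic1.uzn beside it

        if (!File.Exists(ExpectedUznPath(imagePath)))
        {
            Console.Error.WriteLine("No uzn file found next to the image.");
            return;
        }

        using (var engine = new TesseractEngine(tessdataDir, "eng", EngineMode.Default))
        using (var picture = Pix.LoadFromFile(imagePath))
        // Assumption: the second argument is the input name added in 1.10.
        // SingleBlock is used because SingleColumn is reported broken in 3.02.
        using (var page = engine.Process(picture, imagePath, PageSegMode.SingleBlock))
        {
            Console.WriteLine(page.GetText());
        }
    }
}
```

Checking for the uzn file up front gives a clearer message than the exception thrown later by GetText.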
Hi Charles, thank you very much for your fast support :-)
On the command line you would use "tesseract.exe pic1.bmp pic1.txt -psm 4" and put a pic1.uzn file in the current directory.
When I try
Tesseract.TesseractEngine tesseract = new Tesseract.TesseractEngine("....path... tessdata", "eng", Tesseract.EngineMode.Default);
Tesseract.Pix picture = Tesseract.Pix.LoadFromFile(@"...path... pic1.bmp");
Tesseract.Page page = tesseract.Process(picture, Tesseract.PageSegMode.SingleColumn); //PSM -4
...
string text = page.GetText();
the call to GetText throws an exception (just as tesseract.exe would fail if there were no uzn file).
Therefore I assume that the .NET wrapper does not find (or does not search for) the uzn file.
Could you please tell me what to do or if this is a bug?