Chinese characters passed to SetVariable have no effect #68

Closed
andyddd opened this issue Jan 29, 2014 · 4 comments

@andyddd

andyddd commented Jan 29, 2014

I tried using your library in an ASP.NET project. With the following code, the engine does not work:
engine.SetVariable("tessedit_char_whitelist", "一些中文");
It seems that the Tesseract DLL only accepts UTF-8 strings, while the .NET wrapper can only pass Unicode strings.
I tried adding a "ReadConfigFile" function to the .NET wrapper, and it works, since the config file is UTF-8 encoded.
So I suggest either adding a function like "ReadConfigFile" or fixing the bug in SetVariable so that it passes UTF-8 characters correctly (which seems a little harder).

BTW: Thanks for your lib.
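
For readers hitting the same problem: the mismatch described above comes down to how the string's bytes are laid out when they reach native code. The sketch below is not the wrapper's actual implementation. It only shows, using standard .NET APIs, one way to produce the NUL-terminated UTF-8 bytes that a native `const char*` parameter expecting UTF-8 would need. The class and method names (`Utf8Marshal`, `ToNullTerminatedUtf8`) are hypothetical.

```csharp
using System;
using System.Text;

public static class Utf8Marshal
{
    // Encode a managed string as a NUL-terminated UTF-8 byte array,
    // the layout a native function taking UTF-8 `const char*` text expects.
    // Hypothetical helper for illustration; not part of the Tesseract wrapper.
    public static byte[] ToNullTerminatedUtf8(string value)
    {
        if (value == null) throw new ArgumentNullException(nameof(value));

        int byteCount = Encoding.UTF8.GetByteCount(value);
        byte[] buffer = new byte[byteCount + 1]; // final byte stays 0 (terminator)
        Encoding.UTF8.GetBytes(value, 0, value.Length, buffer, 0);
        return buffer;
    }
}
```

For the whitelist above, `Utf8Marshal.ToNullTerminatedUtf8("一些中文")` yields 13 bytes: three per character plus the terminator. A P/Invoke signature would then have to accept `byte[]` (or a pointer) for that argument rather than `string`. Whether the wrapper does it this way is not stated in this thread.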

@ghost ghost assigned charlesw Jan 29, 2014
@charlesw
Owner

Thanks for the bug report. I'll look into this shortly.

@charlesw
Owner

charlesw commented Mar 5, 2014

I've fixed this in the 3.03 branch for the upcoming 1.1 release. Note that I haven't added a thorough test case yet; I only made sure that you can send a UTF-8 string to Tesseract and that you get back the same value when calling TryGetStringVariable.
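
The check described here (send a UTF-8 string in, read the same value back) can be approximated without Tesseract at all. The following is a rough sketch that uses unmanaged memory from `Marshal.AllocHGlobal` as a stand-in for the native side. It does not call `SetVariable` or `TryGetStringVariable`, and the `Utf8RoundTrip` class and its members are hypothetical names.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

public static class Utf8RoundTrip
{
    // Copy a managed string into unmanaged memory as NUL-terminated UTF-8.
    // The caller owns the returned pointer and must free it.
    public static IntPtr ToNative(string value)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(value);
        IntPtr ptr = Marshal.AllocHGlobal(bytes.Length + 1);
        Marshal.Copy(bytes, 0, ptr, bytes.Length);
        Marshal.WriteByte(ptr, bytes.Length, 0); // terminator
        return ptr;
    }

    // Read a NUL-terminated UTF-8 string back from unmanaged memory.
    public static string FromNative(IntPtr ptr)
    {
        int length = 0;
        while (Marshal.ReadByte(ptr, length) != 0) length++;

        byte[] bytes = new byte[length];
        Marshal.Copy(ptr, bytes, 0, length);
        return Encoding.UTF8.GetString(bytes);
    }

    // True when the value survives the managed -> native -> managed trip.
    public static bool RoundTrips(string value)
    {
        IntPtr ptr = ToNative(value);
        try { return FromNative(ptr) == value; }
        finally { Marshal.FreeHGlobal(ptr); }
    }
}
```

Calling `Utf8RoundTrip.RoundTrips("一些中文")` exercises both directions. A string containing an embedded NUL character would be cut short on the way back, which is inherent to NUL-terminated strings rather than specific to this sketch.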

@charlesw
Owner

Fixed in the latest release, 2.0.0, which will be available on NuGet shortly.

@AndreyAkinshin
Contributor

Great news! Thank you very much!
