encoding of arabic characters in the confusion file is wrong #307

Open
Tailor2019 opened this issue Feb 27, 2022 · 2 comments

Tailor2019 commented Feb 27, 2022

Hello!
I'm using version 2.1.1 of Calamari, which I trained on my Arabic dataset.
For validation, I ran:
!calamari-eval --gt.texts .gt.txt --pred File --pred.texts 'dirto/.pred.txt' --n_confusions=-1 --xlsx_output dirto/XLSX_OUTPUT
The resulting confusion file looks like this:
[screenshot of the confusion file]

The "GT" and "PRED" values in this screenshot of the confusion file do not match the true text of the corresponding image.
In fact, this line of the confusion file corresponds to this image:
[image: 998]
Attached file: 998.gt.txt
How can I obtain a correct confusion file in which the GT and PRED fields show the text as it appears in the image?
Thanks a lot in advance!

andbue (Member) commented Feb 27, 2022

Hi, thanks for your report! Could you be a bit more specific about what happens? Is the GT text content somehow put in a different line in the table, are GT and Pred swapped or is there just a problem with right to left ordering of characters?
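
To help tell these cases apart, here is a minimal sketch that works on plain strings only and does not use any Calamari API. It assumes you copy the text from a cell of the confusion file and the text from the matching `.gt.txt` file into Python yourself. The helper names `describe` and `compare` are hypothetical, chosen only for illustration. It lists the Unicode code points in stored order and checks whether two strings are identical, differ only in Arabic presentation forms, or are reversed copies of each other.

```python
import unicodedata


def describe(text):
    """Return (code point, Unicode name) pairs in stored (logical) order."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")) for ch in text]


def compare(expected, observed):
    """Roughly classify how two strings relate to each other."""
    if expected == observed:
        return "identical"
    # NFKC folds Arabic presentation forms back to the base letters.
    norm_expected = unicodedata.normalize("NFKC", expected)
    norm_observed = unicodedata.normalize("NFKC", observed)
    if norm_expected == norm_observed:
        return "same letters, different presentation forms"
    # A reversed copy hints at visual versus logical character order.
    if norm_expected == norm_observed[::-1]:
        return "same letters, reversed order"
    return "different text"
```

If `compare` reports "different text" for a line whose error rate is low, the row is more likely misplaced than mis-rendered. The other two outcomes point to a display or ordering issue rather than wrong content.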

Tailor2019 (Author) commented

Thanks for your reply!
The GT shown in the confusion matrix bears no relation to the real GT, and the same is true of the prediction, even though the error rate is very low.
Thanks for helping me!
