Languages/Scripts supported in different versions of Tesseract
LangCode | Language | 3.02 | 3.04 | 4.00 (Nov. 2016) | 4.0.0 tessdata | 4.0.0 tessdata_best | 4.0.0 tessdata_fast
afr
Afrikaans
x
x
x
x
x
x
amh
Amharic
x
x
x
x
x
ara
Arabic
x
x
x
x
x
x
asm
Assamese
x
x
x
x
x
aze
Azerbaijani
x
x
x
x
x
aze_cyrl
Azerbaijani - Cyrillic
x
x
x
x
x
x
bel
Belarusian
x
x
x
x
x
x
ben
Bengali
x
x
x
x
x
x
bod
Tibetan
x
x
x
x
x
bos
Bosnian
x
x
x
x
x
bre
Breton
x
x
x
x
bul
Bulgarian
x
x
x
x
x
x
cat
Catalan; Valencian
x
x
x
x
x
x
ceb
Cebuano
x
x
x
x
x
ces
Czech
x
x
x
x
x
x
chi_sim
Chinese - Simplified
x
x
x
x
x
x
chi_tra
Chinese - Traditional
x
x
x
x
x
x
chr
Cherokee
x
x
x
x
x
x
cos
Corsican
x
x
x
cym
Welsh
x
x
x
x
x
dan
Danish
x
x
x
x
x
x
dan_frak
Danish - Fraktur (contrib)
x
x
deu
German
x
x
x
x
x
x
deu_frak
German - Fraktur (contrib)
x
x
deu_latf
German (Fraktur Latin)
x
x
x
x
dzo
Dzongkha
x
x
x
x
x
ell
Greek, Modern (1453-)
x
x
x
x
x
x
eng
English
x
x
x
x
x
x
enm
English, Middle (1100-1500)
x
x
x
x
x
x
epo
Esperanto
x
x
x
x
x
x
equ
Math / equation detection module
x
x
x
x
x
est
Estonian
x
x
x
x
x
x
eus
Basque
x
x
x
x
x
x
fao
Faroese
x
x
x
fas
Persian
x
x
x
x
x
fil
Filipino (old - Tagalog)
x
x
x
fin
Finnish
x
x
x
x
x
x
fra
French
x
x
x
x
x
x
frk
German - Fraktur (now deu_latf)
x
x
x
x
x
x
frm
French, Middle (ca.1400-1600)
x
x
x
x
x
x
fry
Western Frisian
x
x
x
gla
Scottish Gaelic
x
x
x
gle
Irish
x
x
x
x
x
glg
Galician
x
x
x
x
x
x
grc
Greek, Ancient (to 1453) (contrib)
x
x
x
x
x
x
guj
Gujarati
x
x
x
x
x
hat
Haitian; Haitian Creole
x
x
x
x
x
heb
Hebrew
x
x
x
x
x
x
hin
Hindi
x
x
x
x
x
x
hrv
Croatian
x
x
x
x
x
x
hun
Hungarian
x
x
x
x
x
x
hye
Armenian
x
x
x
iku
Inuktitut
x
x
x
x
x
ind
Indonesian
x
x
x
x
x
x
isl
Icelandic
x
x
x
x
x
x
ita
Italian
x
x
x
x
x
x
ita_old
Italian - Old
x
x
x
x
x
x
jav
Javanese
x
x
x
x
x
jpn
Japanese
x
x
x
x
x
x
kan
Kannada
x
x
x
x
x
x
kat
Georgian
x
x
x
x
x
kat_old
Georgian - Old
x
x
x
x
x
kaz
Kazakh
x
x
x
x
x
khm
Central Khmer
x
x
x
x
x
kir
Kirghiz; Kyrgyz
x
x
x
x
x
kmr
Kurmanji (Kurdish - Latin Script)
x
x
x
x
kor
Korean
x
x
x
x
x
x
kor_vert
Korean (vertical)
x
x
x
x
kur
Kurdish (Arabic Script)
x
lao
Lao
x
x
x
x
x
lat
Latin
x
x
x
x
x
lav
Latvian
x
x
x
x
x
x
lit
Lithuanian
x
x
x
x
x
x
ltz
Luxembourgish
x
x
x
x
mal
Malayalam
x
x
x
x
x
x
mar
Marathi
x
x
x
x
x
mkd
Macedonian
x
x
x
x
x
x
mlt
Maltese
x
x
x
x
x
x
mon
Mongolian
x
x
x
x
mri
Maori
x
x
x
x
msa
Malay
x
x
x
x
x
x
mya
Burmese
x
x
x
x
x
nep
Nepali
x
x
x
x
x
nld
Dutch; Flemish
x
x
x
x
x
x
nor
Norwegian
x
x
x
x
x
oci
Occitan (post 1500)
x
x
x
x
x
ori
Oriya
x
x
x
x
x
osd
Orientation and script detection module
x
x
x
x
x
x
pan
Panjabi; Punjabi
x
x
x
x
x
pol
Polish
x
x
x
x
x
x
por
Portuguese
x
x
x
x
x
x
pus
Pushto; Pashto
x
x
x
x
x
que
Quechua
x
x
x
x
ron
Romanian; Moldavian; Moldovan
x
x
x
x
x
x
rus
Russian
x
x
x
x
x
x
san
Sanskrit
x
x
x
x
x
sin
Sinhala; Sinhalese
x
x
x
x
x
slk
Slovak
x
x
x
x
x
x
slk_frak
Slovak - Fraktur (contrib)
x
x
slv
Slovenian
x
x
x
x
x
x
snd
Sindhi
x
x
x
x
spa
Spanish; Castilian
x
x
x
x
x
x
spa_old
Spanish; Castilian - Old
x
x
x
x
x
x
sqi
Albanian
x
x
x
x
x
x
srp
Serbian
x
x
x
x
x
x
srp_latn
Serbian - Latin
x
x
x
x
x
sun
Sundanese
x
x
x
x
swa
Swahili
x
x
x
x
x
x
swe
Swedish
x
x
x
x
x
x
syr
Syriac
x
x
x
x
x
tam
Tamil
x
x
x
x
x
x
tat
Tatar
x
x
x
x
tel
Telugu
x
x
x
x
x
x
tgk
Tajik
x
x
x
x
x
tgl
Tagalog (new - Filipino)
x
x
x
tha
Thai
x
x
x
x
x
x
tir
Tigrinya
x
x
x
x
x
ton
Tonga
x
x
x
x
tur
Turkish
x
x
x
x
x
x
uig
Uighur; Uyghur
x
x
x
x
x
ukr
Ukrainian
x
x
x
x
x
x
urd
Urdu
x
x
x
x
x
uzb
Uzbek
x
x
x
x
x
uzb_cyrl
Uzbek - Cyrillic
x
x
x
x
x
vie
Vietnamese
x
x
x
x
x
x
yid
Yiddish
x
x
x
x
x
yor
Yoruba
x
x
x
x
Script | 3.02 | 3.04 | 4.00 (Nov. 2016) | 4.0.0 tessdata | 4.0.0 tessdata_best | 4.0.0 tessdata_fast
arab
Arabic
x
x
x
armn
Armenian
x
x
x
beng
Bengali
x
x
x
cans
Canadian_Aboriginal
x
x
x
cher
Cherokee
x
x
x
cyrl
Cyrillic
x
x
x
deva
Devanagari
x
x
x
ethi
Ethiopic
x
x
x
frak
Fraktur
x
x
x
geor
Georgian
x
x
x
grek
Greek
x
x
x
gujr
Gujarati
x
x
x
guru
Gurmukhi
x
x
x
hans
HanS (Han simplified)
x
x
x
hans-vert
HanS_vert (Han simplified vertical)
x
x
x
hant
HanT (Han traditional)
x
x
x
hant-vert
HanT_vert (Han traditional vertical)
x
x
x
hang
Hangul
x
x
x
hang-vert
Hangul_vert (Hangul vertical)
x
x
x
hebr
Hebrew
x
x
x
jpan
Japanese
x
x
x
jpan-vert
Japanese_vert (Japanese vertical)
x
x
x
knda
Kannada
x
x
x
khmr
Khmer
x
x
x
laoo
Lao
x
x
x
latn
Latin
x
x
x
mlym
Malayalam
x
x
x
mymr
Myanmar
x
x
x
orya
Oriya(Odia)
x
x
x
sinh
Sinhala
x
x
x
syrc
Syriac
x
x
x
taml
Tamil
x
x
x
telu
Telugu
x
x
x
thaa
Thaana
x
x
x
thai
Thai
x
x
x
tibt
Tibetan
x
x
x
viet
Vietnamese
x
x
x
For details about the languages that each Script.traineddata file supports, see the files that end with langs.txt (e.g. Latin.langs.txt).
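To check which of the language and script models listed above are actually present in a local data directory, a small script can scan it for `.traineddata` files. The following is a minimal sketch, not part of Tesseract itself: the function name `list_traineddata` is hypothetical, and it assumes that language models sit directly in the data directory (e.g. `eng.traineddata`) while script models sit in a `script/` subdirectory (e.g. `script/Latin.traineddata`). Adjust the paths if your installation is laid out differently.

```python
from pathlib import Path


def list_traineddata(tessdata_dir):
    """Return (language_codes, script_names) found under tessdata_dir.

    Hypothetical helper. Assumes language models are stored directly in
    tessdata_dir and script models in a script/ subdirectory.
    """
    root = Path(tessdata_dir)

    # Language models, e.g. eng.traineddata -> "eng"
    languages = sorted(p.stem for p in root.glob("*.traineddata"))

    # Script models, e.g. script/Latin.traineddata -> "Latin"
    script_dir = root / "script"
    scripts = []
    if script_dir.is_dir():
        scripts = sorted(p.stem for p in script_dir.glob("*.traineddata"))

    return languages, scripts
```

Calling `list_traineddata("/path/to/tessdata")` returns two sorted lists that can be compared against the LangCode and Script columns of the tables. The names returned for script models are file names such as `Latin`, which may differ in form from the short codes shown in the Script column.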