Turkish
Zemberek-NLP
Source: https://github.com/ahmetaa/zemberek-nlp
UIMA: https://github.com/texttechnologylab/textimager-uima
Available annotators:
- TokenizerDefault
- TokenizerAll
- Sentence Boundary Detection
- Lemmatizer
- Stemmer
- Part of Speech
- Deasciifier
- Spellchecker
- Disambiguator
License: [Apache License](https://github.com/ahmetaa/zemberek-nlp/blob/master/LICENSE)
Polyglot
Source: http://polyglot.readthedocs.io/en/latest/index.html
UIMA: https://github.com/texttechnologylab/textimager-uima
Available annotators:
- Tokenization (165 Languages)
- Language detection (196 Languages)
- Named Entity Recognition (40 Languages)
- Part of Speech Tagging (16 Languages)
- Sentiment Analysis (136 Languages)
- Word Embeddings (137 Languages)
- Morphological analysis (135 Languages)
- Transliteration (69 Languages)
License: [GPLv3](http://polyglot.readthedocs.io/en/latest/)
Resha Turkish Stemmer
Source: https://github.com/hrzafer/resha-turkish-stemmer
UIMA: https://github.com/texttechnologylab/textimager-uima
Available annotators:
- Stemmer
License: [MIT License](https://github.com/hrzafer/resha-turkish-stemmer/blob/master/LICENSE)
Turkish Deasciifier
Source: https://github.com/ahmetb/turkish-deasciifier-java
UIMA: https://github.com/texttechnologylab/textimager-uima
Available annotators:
- Deasciifier
License: [Apache License](https://github.com/ahmetb/turkish-deasciifier-java/blob/master/LICENSE)
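For readers unfamiliar with the term, a deasciifier restores Turkish diacritics (ç, ğ, ı, ö, ş, ü) in text that was typed with plain ASCII letters. The sketch below is only a toy illustration of that input/output behaviour, written in Python. It is not the algorithm used by turkish-deasciifier-java or Zemberek; the `LEXICON` table and the `deasciify` function are hypothetical names introduced here, and a real tool would rely on context patterns or morphological analysis instead of a fixed word list.

```python
# Toy deasciifier: a hand-written lookup table, NOT the method of any tool
# listed on this page. LEXICON and deasciify are hypothetical names.
LEXICON = {
    "turkce": "türkçe",
    "cok": "çok",
    "guzel": "güzel",
    "ogrenci": "öğrenci",
}


def _capitalize_tr(word: str) -> str:
    """Uppercase the first letter using Turkish rules for dotted/dotless i."""
    first = {"i": "İ", "ı": "I"}.get(word[0], word[0].upper())
    return first + word[1:]


def deasciify(text: str) -> str:
    """Replace known ASCII-only spellings with their Turkish forms.

    Words are split on whitespace; punctuation attached to a word is not
    handled, and unknown words are returned unchanged.
    """
    result = []
    for word in text.split():
        # str.lower() is adequate for the lookup key because the input is ASCII.
        fixed = LEXICON.get(word.lower(), word)
        if fixed != word and word[0].isupper():
            fixed = _capitalize_tr(fixed)
        result.append(fixed)
    return " ".join(result)


if __name__ == "__main__":
    print(deasciify("Turkce cok guzel"))
```

A lookup table cannot resolve ambiguous words whose ASCII spelling matches more than one Turkish word, which is why the tools listed here take surrounding context into account.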
Tool Turkish Natural Language Toolkit (https://github.com/aliok/trnltk-java)
Merged with Zemberek-NLP
TRmorph (https://github.com/coltekin/TRmorph)
- Tokenizer
- Segmenter
- Lemmatizer
- Stemmer
- Part of Speech
- Unknown word guesser
- Hyphenation tool
License: [MIT License](https://github.com/coltekin/TRmorph/blob/master/LICENSE)
Requires foma and a C preprocessor
ITU Turkish Natural Language Processing Pipeline (http://tools.nlp.itu.edu.tr/)
Web API (Token Needed)
- Tokenizer
- Normalization
- Deasciifier
- Vowelizer
- Spelling Corrector
- isTurkish
- Morphological Analyzer
TSCorpus (https://tscorpus.com/)
Web API
Lucene-Solr-Analysis Turkish (https://github.com/iorixxx/lucene-solr-analysis-turkish)
Use of Zemberek StemFilter and TRmorph StemFilter
Turkish-Pos-Tagger (https://github.com/onuryilmaz/turkish-pos-tagger)
Requires Python and NLTK
UIMA: https://github.com/dkpro/dkpro-core/tree/master/dkpro-core-snowball-asl