
NLP for the Greek language: declension, conjugation, and part-of-speech tagging, using a parser for el.wiktionary.org monthly dumps.


polyvios/el-wiktionary-parser


el-wiktionary-parser

A parser for el.wiktionary.org monthly dumps that builds a word conjugator and part-of-speech (POS) tagger for Greek words.

To use:

#1. Download one of the el.wiktionary.org monthly dumps.

wget https://dumps.wikimedia.org/elwiktionary/latest/elwiktionary-latest-pages-articles-multistream.xml.bz2

#2. To create the word_graph.json file with all words, run:

greekdict.py elwiktionary-latest-pages-articles-multistream.xml.bz2
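For readers curious about what reading such a dump involves, here is a minimal sketch that streams (title, wikitext) pairs out of a bz2-compressed MediaWiki XML export using only the Python standard library. It is an illustration under stated assumptions, not the code in greekdict.py: the function name `iter_pages` is hypothetical, and the sketch does not build word_graph.json or interpret any Wiktionary templates.

```python
import bz2
import xml.etree.ElementTree as ET


def iter_pages(source):
    """Yield (title, wikitext) pairs from a bz2-compressed MediaWiki XML dump.

    `source` may be a filename or a binary file-like object.
    NOTE: hypothetical helper for illustration; not part of greekdict.py.
    """
    with bz2.open(source, "rb") as stream:
        title, text = None, None
        # iterparse reads incrementally, so the dump is never fully in memory.
        for _event, elem in ET.iterparse(stream, events=("end",)):
            # Dump elements carry an XML namespace; keep only the local name.
            tag = elem.tag.rsplit("}", 1)[-1]
            if tag == "title":
                title = elem.text
            elif tag == "text":
                text = elem.text or ""
            elif tag == "page":
                yield title, text
                title, text = None, None
                elem.clear()  # drop the finished page's children
```

Calling `iter_pages('elwiktionary-latest-pages-articles-multistream.xml.bz2')` on the file from step 1 should yield one pair per page; turning the wikitext into inflection tables is the part that greekdict.py handles.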

#3. Use it from your Python code:

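from greekdict import WikiWordGraph
# Load the word graph created in step 2
word_graph = WikiWordGraph('word_graph.json')
# Look up the nominative form of an inflected word
nominative = word_graph[u'νερών']
# Look up the part of speech of the same word
pos = word_graph.get_pos(u'νερών')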
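The two lookups above can be wrapped in a small helper that annotates a list of tokens. This is a sketch only. It assumes that a word missing from the graph raises `KeyError`, which the README does not state. The `annotate` function and the `FakeGraph` class are hypothetical. `FakeGraph` is a stand-in with invented entries so the example runs without word_graph.json. It is not the real `WikiWordGraph`, and its "noun" tag is a placeholder rather than the library's actual tag format.

```python
def annotate(word_graph, tokens):
    """Pair each token with the graph's lookup result and POS.

    ASSUMPTION: unknown words raise KeyError; the real class may differ.
    """
    rows = []
    for token in tokens:
        try:
            base = word_graph[token]
            pos = word_graph.get_pos(token)
        except KeyError:
            base, pos = None, None
        rows.append((token, base, pos))
    return rows


class FakeGraph:
    """Hypothetical stand-in for WikiWordGraph, used only to run the sketch."""

    def __init__(self, entries):
        self._entries = entries  # word -> (base form, POS placeholder)

    def __getitem__(self, word):
        return self._entries[word][0]

    def get_pos(self, word):
        return self._entries[word][1]


# Invented demo data; real results come from word_graph.json.
demo = FakeGraph({"νερών": ("νερό", "noun")})
rows = annotate(demo, ["νερών", "xyz"])
```

With the real library, passing the `word_graph` object from step 3 in place of `demo` would be the intended use, provided the `KeyError` assumption holds.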
