
The Wikinflection Corpus, from the paper "Wikinflection Corpus: A (Better) Multilingual, Morpheme-Annotated Inflectional Corpus" (Metheniti and Neumann, 2020)


lenakmeth/Wikinflection-Corpus


Wikinflection Corpus

An inflectional corpus with inflectional morpheme annotations in 68 languages, comprising 216K lemmas and 5.4M words. It is based on the English Wiktionary (en.wiktionary.org), generated by Wikinflection (Metheniti and Neumann, 2018), and evaluated with UniMorph 2.0 (Kirov et al., 2018).

The list of languages and their sizes can be found in corpus_size.csv.
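As a minimal sketch of how the size table might be inspected, the snippet below reads corpus_size.csv with Python's standard csv module. The column layout of the file is not described here, so the code assumes only that the file is comma-separated, UTF-8 encoded, and that its first row is a header; the function name `load_corpus_sizes` is hypothetical and not part of this repository.

```python
import csv
from pathlib import Path


def load_corpus_sizes(path="corpus_size.csv"):
    """Return (header, rows) from the size table.

    Assumptions (not confirmed by the repository): the file is
    comma-separated, UTF-8 encoded, and its first row is a header.
    No column names are assumed.
    """
    with Path(path).open(newline="", encoding="utf-8") as handle:
        # Skip completely empty lines, which csv.reader yields as [].
        rows = [row for row in csv.reader(handle) if row]
    if not rows:
        return [], []
    return rows[0], rows[1:]
```

Calling `header, rows = load_corpus_sizes()` from the repository root and printing `header` shows which columns are actually available before any further processing.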

Paper

Metheniti, E. and Neumann, G. (2020). Wikinflection Corpus: A (Better) Multilingual, Morpheme-Annotated Inflectional Corpus. In Proceedings of the Twelfth International Conference on Language Resources and Evaluation (LREC2020), Marseille, France, May. European Language Resources Association (ELRA). [link] [BibTeX]

References

Kirov, C., Cotterell, R., Sylak-Glassman, J., Walther, G., Vylomova, E., Xia, P., Faruqui, M., Mielke, S., McCarthy, A., Kübler, S., Yarowsky, D., Eisner, J., and Hulden, M. (2018). UniMorph 2.0: Universal Morphology. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC2018), Miyazaki, Japan, May. European Language Resources Association (ELRA).

Metheniti, E. and Neumann, G. (2018). Wikinflection: Massive semi-supervised generation of multilingual inflectional corpus from Wiktionary. In Proceedings of the 17th International Workshop on Treebanks and Linguistic Theories (TLT 2018), December 13–14, 2018, Oslo University, Norway, number 155, pages 147–161. Linköping University Electronic Press.
