The SVLM Hebrew Wikipedia Corpus

The SVLM Hebrew Wikipedia Corpus is a corpus of 50,000 Hebrew sentences drawn from the Hebrew Wikipedia, chosen to ensure phoneme coverage for a sentence recording project.

The corpus was built by Dr. Vered Silber-Varod and Prof. Ami Moyal as part of their work on [Varod17].

Links

Corpus: https://github.com/NLPH/SVLM-Hebrew-Wikipedia-Corpus/blob/master/SVLM_Hebrew_Wikipedia_Corpus.txt

Paper: https://github.com/NLPH/SVLM-Hebrew-Wikipedia-Corpus/blob/master/Phonemes_freqency_Silber-Varod-Latin-Moyal.pdf
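
For readers who want to inspect the corpus file programmatically, here is a minimal Python sketch. It rests on assumptions that this README does not confirm: that a copy of SVLM_Hebrew_Wikipedia_Corpus.txt has been downloaded locally, that it is UTF-8 encoded, and that it holds one sentence per line. The function names `load_sentences` and `hebrew_letter_counts` are hypothetical helpers made up for illustration. Note that the second helper counts written Hebrew letters, not phonemes, so it does not reproduce the analysis in the paper.

```python
from collections import Counter
from pathlib import Path


def load_sentences(path):
    """Read a local copy of the corpus.

    Assumption (not confirmed by the README): UTF-8 text, one sentence per line.
    "utf-8-sig" also tolerates a leading byte-order mark if one is present.
    """
    text = Path(path).read_text(encoding="utf-8-sig")
    # Drop surrounding whitespace and skip empty lines.
    return [line.strip() for line in text.splitlines() if line.strip()]


def hebrew_letter_counts(sentences):
    """Count Hebrew letters (U+05D0 to U+05EA) across all sentences.

    This is an orthographic count only; it is not a phoneme count.
    """
    counts = Counter()
    for sentence in sentences:
        counts.update(ch for ch in sentence if "\u05d0" <= ch <= "\u05ea")
    return counts
```

With a local copy in place, `load_sentences("SVLM_Hebrew_Wikipedia_Corpus.txt")` returns the sentences as a list of strings, and `hebrew_letter_counts(...).most_common(5)` lists the five most frequent letters.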

License

Because the corpus was generated from Hebrew Wikipedia sources, which are licensed under CC-BY-SA 3.0, it is necessarily licensed under the same license.

References

[Varod17] Silber-Varod, V., Latin, M., & Moyal, A. (2017). "Frequency of Hebrew phonemes and phoneme clusters in a data-driven approach" (in Hebrew). Literacy and Language (Oryanut Ve-Safa), 6, 22-36.
