BibleNLP/awesome-bible-nlp

Awesome Bible NLP

A curated list of resources dedicated to Biblical Natural Language Processing

Contribute your favorite Biblical NLP resource by opening a pull request! Please read the contribution guidelines first.

Machine Translation

Audio

  • Snow Mountain Dataset: Open-licensed and formatted dataset of audio recordings of the Bible in low-resource Indian languages.

Original Languages

  • Macula Hebrew | Greek: Open-licensed, curated datasets of the Bible in Hebrew and Greek, with linked annotation layers (e.g. syntax trees, glosses, semantic roles).

Tokenizers

  • utoken: Universal tokenizer with a Python API and a command-line interface, also tested on Biblical text.

Romanizers

  • uroman: Universal romanizer that converts text in any Unicode script to the Roman (Latin) script.

Toolkits

  • SIL Machine | Python version | JavaScript version: Toolkit for various NLP operations on Biblical content, with particular support for Paratext projects.
  • Wildebeest: Investigates, repairs, and normalizes text affected by a wide range of character-level issues. Tested especially on Biblical content.
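
To give a feel for what "character-level issues" can mean in practice, here is a minimal sketch using only Python's standard `unicodedata` module. It is not Wildebeest's API; the helper names `find_suspicious_chars` and `normalize_text` are hypothetical and exist only for this illustration. It assumes that invisible format, private-use, and unassigned characters are worth flagging, and that NFC composition is wanted. That is a reasonable default for the Greek sample shown, but verify it per script before applying it to a corpus.

```python
import unicodedata

# Hypothetical helpers for illustration only; this is NOT Wildebeest's API.
# Unicode general categories that often signal encoding damage:
# Cc = control, Cf = format (invisible), Co = private use, Cn = unassigned.
SUSPICIOUS_CATEGORIES = {"Cc", "Cf", "Co", "Cn"}
ALLOWED_CONTROLS = {"\n", "\t"}


def find_suspicious_chars(text):
    """Return (index, code point, name) for each flagged character in text."""
    hits = []
    for i, ch in enumerate(text):
        if ch in ALLOWED_CONTROLS:
            continue
        if unicodedata.category(ch) in SUSPICIOUS_CATEGORIES:
            name = unicodedata.name(ch, "<unnamed>")
            hits.append((i, f"U+{ord(ch):04X}", name))
    return hits


def normalize_text(text):
    """Apply NFC composition, then collapse whitespace runs
    (including newlines) into single spaces."""
    composed = unicodedata.normalize("NFC", text)
    return " ".join(composed.split())


if __name__ == "__main__":
    # A decomposed accent (U+0301), a zero-width space (U+200B),
    # and a doubled space, all easy to miss by eye.
    sample = "λο\u0301γος\u200b  ἦν"
    print(find_suspicious_chars(sample))
    print(normalize_text(sample))
```

The sketch only reports format characters and does not delete them, because some of them (such as zero-width joiners in Indic scripts) are meaningful. Whether to remove them is a per-language decision.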
