Add README
Michael Hansen committed Oct 9, 2020
1 parent c37494c commit 987e22b
Showing 1 changed file with 20 additions and 0 deletions.
20 changes: 20 additions & 0 deletions README.md
@@ -0,0 +1,20 @@
# Czech Kaldi Profile

A [Rhasspy](https://github.com/rhasspy/rhasspy) profile for Czech (`cs`).

Includes:

* A [Kaldi nnet3](https://kaldi-asr.org/doc/dnn3.html) speech to text model
    * See files in `acoustic_model/`
    * Recipe created with [ipa2kaldi](https://github.com/rhasspy/ipa2kaldi)
    * Trained on:
        * [Vystadial VOIP](http://www.openslr.org/6/) (18.2 hours)
        * [Common Voice](https://commonvoice.mozilla.org) (26.6 hours)
* An [IPA](https://en.wikipedia.org/wiki/International_Phonetic_Alphabet) pronunciation lexicon pulled from [Wiktionary](https://www.wiktionary.org/)
    * See `base_dictionary.txt.gz`
* A [phonetisaurus](https://github.com/AdolfVonKleist/Phonetisaurus) grapheme to phoneme model for predicting word pronunciations
    * See `g2p.fst.gz`
    * Trained on `base_dictionary.txt.gz`
* A tri-gram [ARPA language model](https://cmusphinx.github.io/wiki/arpaformat/)
    * See `base_language_model.txt.gz`
    * Text from audio transcriptions and [Universal Dependencies](https://universaldependencies.org/)
