PersianSpeech

This repository contains a Persian speech dataset along with the corresponding text.

At this link, I have published a 3-hour dataset for the Persian ASR (automatic speech recognition) task. Each audio file is labeled with one sentence and is about 10 seconds long.
This dataset is not copied from anywhere; it is my personal project, which I publish freely. You can use it in your projects.

Also, if you want an 86-hour dataset like this one, you can contact me at hubare.ra[at]gmail.com [not free].

myaudio_tiny is a tiny dataset with a duration of 3 hours.
myaudio_full is a big dataset with a duration of 30 hours.
persian_v2 is a big dataset with a duration of 56 hours.
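
As a rough sanity check on the sizes above, the number of clips in each set can be estimated from its total duration and the clip length of about 10 seconds. This is only a sketch: the 10-second figure is stated for the 3-hour set, and applying it to the other two sets is an assumption. The function name `estimate_clip_count` is hypothetical and not part of this repository.

```python
def estimate_clip_count(total_hours: float, clip_seconds: float = 10.0) -> int:
    """Approximate number of clips for a given total duration.

    clip_seconds defaults to 10, the approximate clip length given for
    the 3-hour set; using it for the other sets is an assumption.
    """
    if clip_seconds <= 0:
        raise ValueError("clip_seconds must be positive")
    return round(total_hours * 3600 / clip_seconds)


# Durations in hours, as listed above.
DATASET_HOURS = {"myaudio_tiny": 3, "myaudio_full": 30, "persian_v2": 56}

for name, hours in DATASET_HOURS.items():
    print(f"{name}: about {estimate_clip_count(hours)} clips")
```

The real counts will differ, since clip lengths vary from file to file.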

Other sources:

Mozilla dataset:
Mozilla has started to produce a large Persian dataset. In version 7, it published 293 hours of transcribed Persian audio for free at this link. The audio clips in this collection are usually short.
persianspeechcorpus:
You can also use this site. This ~2.5-hour single-speaker speech corpus was developed using the same methodologies as the PhD work carried out by Nawar Halabi at the University of Southampton.

Donation

I try to publish free Persian datasets on GitHub. Your financial support will encourage me.
Donation link: https://www.patreon.com/persiandataset

