Assalamu alaikum.
Would it be possible to do what the title says? I've made a script that can be run on Google Colab to get the start and end timestamps of each word, given a recording and the text file of a surah. It uses AI to generate a transcript of the recording with word-level timestamps. The transcript will have occasional mistakes, like combining multiple words into one or splitting one word into several, so the script then algorithmically aligns the transcript with the text of the surah to find the best match. I used Whisper large-v2 as the model, and the timestamps are pretty accurate.
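To give a rough idea of the alignment step, here is a minimal sketch. It is not the actual script. It assumes the transcript is already available as `(word, start, end)` tuples, and the function name `align_words` is made up for illustration. It uses `difflib.SequenceMatcher` from the standard library as a stand-in for the matching algorithm: exactly matching words keep their own timestamps, and where the transcript merged or split words, the time span of the mismatched block is divided among the surah's words in proportion to their length.

```python
from difflib import SequenceMatcher


def align_words(reference, transcript):
    """Map transcript timestamps onto the words of the reference text.

    reference:  list of words from the surah text file.
    transcript: list of (word, start, end) tuples from the speech model.
    Returns a list of (reference_word, start, end); start and end are None
    for reference words that have no transcript counterpart.
    """
    hyp_words = [word for word, _, _ in transcript]
    matcher = SequenceMatcher(a=reference, b=hyp_words, autojunk=False)
    aligned = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "insert":
            # Extra transcript words with no counterpart in the text: skip.
            continue
        if tag == "equal":
            # Exact matches keep the transcript's own timestamps.
            for offset in range(i2 - i1):
                _, start, end = transcript[j1 + offset]
                aligned.append((reference[i1 + offset], start, end))
            continue
        ref_block = reference[i1:i2]
        if j1 == j2:
            # "delete": text words the model skipped, so no timing is known.
            aligned.extend((word, None, None) for word in ref_block)
            continue
        # "replace": merged or split words. Spread the block's time span over
        # the reference words in proportion to their character length.
        block_start = transcript[j1][1]
        block_end = transcript[j2 - 1][2]
        total_chars = sum(len(word) for word in ref_block) or 1
        cursor = block_start
        for word in ref_block:
            duration = (block_end - block_start) * len(word) / total_chars
            aligned.append((word, cursor, cursor + duration))
            cursor += duration
    return aligned


# Placeholder words: the model merged "a" and "bc" into a single word "abc".
example = align_words(
    ["a", "bc", "d"],
    [("abc", 0.0, 3.0), ("d", 3.0, 4.0)],
)
for word, start, end in example:
    print(word, start, end)
```

With real recitations, both the transcript and the surah text would probably need the same normalization (for example, stripping diacritics) before comparing words. That step is left out here.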
This would be really helpful if you want to imitate a certain recitation, especially for long surahs, since getting to a specific ayah or repeating a section is extremely tedious on YouTube. It's practically impossible for the app to have recordings of every single reciter, and a lot of reciters have multiple styles as well. This feature would give the user the option of loading custom recordings on their device. I'm willing to implement it if you could explain how and point me to the right section of the codebase.