Training with custom dataset #14

Open
LinuxBeginner opened this issue May 19, 2021 · 0 comments
LinuxBeginner commented May 19, 2021

Hi, thank you for providing the repository.

Could you please guide me, how should I prepare my dataset, so that I can run the experiment?

Current dataset structure is as follows:

Source language:
source1.wav
source1.txt (transcript of source1.wav)
source2.wav
source2.txt
....

Target language:
target1.txt (translation of source1.txt)
target2.txt
....

I have also gone through the tutorial Getting Started with End-to-End Speech Translation, but I could not understand how to prepare or arrange my dataset to meet the FBK-Fairseq-ST requirements. Should I create a CSV file with the wav file names (source language) in the first column and the text (target language) in the second column, or is there some other JSON/CSV file that maps each audio file to its text file?
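For concreteness, this is the kind of pairing script I have in mind. The column names `wav_path` and `target_text` are just my guess, not necessarily what FBK-Fairseq-ST actually expects:

```python
import csv
import os


def build_manifest(source_dir, target_dir, out_csv):
    """Pair each sourceN.wav with the matching targetN.txt translation
    and write a two-column CSV manifest.

    Assumes the naming scheme from my dataset (sourceN.wav / targetN.txt);
    the CSV column names are hypothetical, not the FBK-Fairseq-ST spec.
    """
    rows = []
    for name in sorted(os.listdir(source_dir)):
        if not name.endswith(".wav"):
            continue
        # Extract the index N from "sourceN.wav" to find "targetN.txt".
        index = name[len("source"):-len(".wav")]
        target_path = os.path.join(target_dir, f"target{index}.txt")
        with open(target_path, encoding="utf-8") as f:
            translation = f.read().strip()
        rows.append((os.path.join(source_dir, name), translation))

    with open(out_csv, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["wav_path", "target_text"])
        writer.writerows(rows)
    return rows
```

Is a mapping file like this the right direction, or does the toolkit expect a different layout?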

As per the tutorial, I have to prepare a pre-trained ASR model first for FBK-Fairseq-ST.

I am new to this field, and I would be thankful for any guidance.

Thank you.
