I am currently following this blog post: https://jaimeleal.github.io/how-to-speech-synthesis#step-5-download-scripts-from-deeplearningexamples. The transcription step is done, but I am now stuck at creating metadata.csv and the filelists (step 4 in the post linked above). My goal is to train a Tacotron 2 model and a WaveGlow model, which I would like to use to generate custom speech. Are there any major concerns I should be aware of before continuing?
The blog uses Amazon Transcribe. Unfortunately, I wasn't able to reach the author, which is why I am asking here. The code for the extraction looks like this:
```python
import os
import json
import pandas as pd
from sklearn.model_selection import train_test_split
import numpy as np

path = '/dataset/transcripts/'
files = os.listdir(path)

# Extract the transcript text from the output of Amazon Transcribe
rows = []
for index in range(1, len(files) + 1):
    name = '{:04d}'.format(index)
    with open(os.path.join(path, "{}.json".format(name))) as json_file:
        data = json.load(json_file)
        transcript = data["results"]["transcripts"][0]["transcript"]
        rows.append([name, transcript])

data = pd.DataFrame(rows, columns = ["name", "transcript"])
```
Right after the line `data = json.load(json_file)`, it fails with:
`in transcript = data["results"]["transcripts"][0]["transcript"] KeyError: 'results'`
As far as I understand, it tries to fetch the data from a URL, which I don't have, of course. How can I make it work in my case?
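For reference, the loop above reads local files with `open()` and does not fetch anything over the network. The `KeyError: 'results'` means the top-level object in the loaded JSON file has no `"results"` key, so the file does not have the layout the blog's code expects. Below is a minimal diagnostic sketch, not the blog author's method: it walks the `.json` files that are actually in the directory and reports the top-level keys of any file that does not match that layout. The helper names `extract_transcript` and `load_transcripts` are hypothetical, introduced here only for illustration.

```python
import json
import os


def extract_transcript(data):
    """Return the transcript if `data` has the layout the blog's code expects, else None."""
    try:
        return data["results"]["transcripts"][0]["transcript"]
    except (KeyError, IndexError, TypeError):
        return None


def load_transcripts(path):
    """Collect [name, transcript] rows and a list of files that did not match."""
    rows, skipped = [], []
    for filename in sorted(os.listdir(path)):
        if not filename.endswith(".json"):
            continue  # ignore anything that is not a JSON file
        with open(os.path.join(path, filename)) as json_file:
            data = json.load(json_file)
        name = os.path.splitext(filename)[0]
        transcript = extract_transcript(data)
        if transcript is None:
            # Record what the file actually contains so the layout can be inspected.
            keys = list(data) if isinstance(data, dict) else type(data).__name__
            skipped.append((name, keys))
        else:
            rows.append([name, transcript])
    return rows, skipped
```

Calling `rows, skipped = load_transcripts('/dataset/transcripts/')` and printing `skipped` should show which files lack the expected structure and which keys they contain instead. If the transcripts came from a different tool, the lookup inside `extract_transcript` would need to be adapted to that tool's output format.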