Replies: 1 comment
-
I can confirm we are seeing the same thing: words are regularly returned with zero duration. There is no consistent pattern as to whether a zero-duration word belongs to the prior word's timing or the next word's. This happens even with relatively short audio files, approximately 20-40 seconds long.
-
I'm using the transcriptions API, and I'm noticing that some words in my responses have effectively zero duration.
I'm using these parameters in the request:
For the most part the transcription is accurate, but sometimes a word comes back with its start and end separated by less than a millisecond. That timing is clearly wrong when I listen to the mp3 (which was also generated by OpenAI).
Has anyone else run into this, and what can I do about it?
Example (I'm using java), see the word stairs.
After looking at the average duration per letter, I suspect that the words with zero duration 'belong' to the prior word.
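If that hypothesis holds, one possible client-side workaround is to split the prior word's time span between the two words in proportion to their letter counts. The following is only a minimal sketch under that assumption: the `Word` class, the `repair` method, and the `epsilon` threshold are hypothetical names, not part of any OpenAI SDK, and timestamps are assumed to be in seconds. Since the reply above reports that collapsed words do not consistently belong to the prior word, treat the result as an approximation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: redistributes timing for words returned with ~0 duration.
final class WordTimingRepair {

    /** One transcribed word; start and end are assumed to be in seconds. */
    static final class Word {
        final String text;
        final double start;
        final double end;

        Word(String text, double start, double end) {
            this.text = text;
            this.start = start;
            this.end = end;
        }

        double duration() {
            return end - start;
        }
    }

    /**
     * Returns a new list in which every word whose duration is at most
     * epsilon shares the time span of the word before it, split in
     * proportion to letter counts. Words that cannot be repaired (for
     * example a collapsed first word) are passed through unchanged.
     */
    static List<Word> repair(List<Word> words, double epsilon) {
        List<Word> out = new ArrayList<>();
        for (Word w : words) {
            Word prev = out.isEmpty() ? null : out.get(out.size() - 1);
            boolean collapsed = w.duration() <= epsilon;
            if (collapsed && prev != null && prev.duration() > epsilon) {
                // Guard against empty strings so the ratio is always defined.
                int prevLen = Math.max(1, prev.text.length());
                int curLen = Math.max(1, w.text.length());
                // Point inside the prior word's span where the new word starts.
                double split = prev.start
                        + prev.duration() * prevLen / (prevLen + curLen);
                out.set(out.size() - 1, new Word(prev.text, prev.start, split));
                out.add(new Word(w.text, split, prev.end));
            } else {
                out.add(w);
            }
        }
        return out;
    }
}
```

For example, calling `WordTimingRepair.repair(words, 0.001)` treats anything shorter than one millisecond as collapsed. Several collapsed words in a row are each split off from the word immediately before them, so the shares become uneven; grouping such runs and dividing the span once would be a reasonable refinement.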