Replies: 4 comments 15 replies
- Yes, and I think a public version is underway. If I don't send an update in two weeks, feel free to remind me.
- Hi all, you may check this updated implementation using TensorRT-LLM Whisper with the NVIDIA Triton Python backend: https://github.com/k2-fsa/sherpa/tree/master/triton/whisper. It is about 7x faster than the previous ONNX implementation pointed to by @IbrahimAmin1.
- Great idea. Subscribing to this thread.
- Has anyone been able to serve the Whisper model on NVIDIA's Triton Inference Server?