The text to speech avatar system is a text to speech feature with vision capabilities that allows customers to create synthetic videos of a 2D photorealistic avatar speaking. The neural text to speech avatar models are trained by deep neural networks on human video recording samples, and the voice of the avatar .

bibinkunjumon2020/Azure-Avatar-AI

Azure Text to Speech AI Avatar Video Synthesis

This Streamlit application allows you to synthesize text into avatar videos using Microsoft Azure's Text to Speech with Avatar service.

Generated Video-Demo

0001.mp4

Getting Started

Prerequisites

Before running the application, ensure you have the following:

  • Azure Subscription: You need an active Azure subscription.
  • Azure Speech API Subscription Key: Obtain this key from your Azure portal.
  • Service Region: Select one of the supported regions: West US 2, West Europe, or Southeast Asia.

Installation

  1. Clone the repository:

       git clone https://github.com/bibinkunjumon2020/Azure-Avatar-AI.git
       cd Azure-Avatar-AI
  2. Install the required Python packages:

       pip install streamlit
    
  3. Run the application:

       streamlit run main.py
    
    

Usage

Fill in Subscription Key and Region: Enter your Speech API subscription key and select the service region from the dropdown.
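As a rough illustration of this step, the sketch below validates the two inputs and builds the authentication header. It is not taken from `main.py`: the function name `build_auth_headers` and the `SUPPORTED_REGIONS` mapping are hypothetical, the region display names come from the Prerequisites list, and the short identifiers are the usual Azure region names.

```python
from typing import Dict, Tuple

# Display names are from the Prerequisites section; the short identifiers
# are assumed to be the standard Azure region names.
SUPPORTED_REGIONS = {
    "West US 2": "westus2",
    "West Europe": "westeurope",
    "Southeast Asia": "southeastasia",
}


def build_auth_headers(subscription_key: str, region_name: str) -> Tuple[str, Dict[str, str]]:
    """Return (region_id, headers), or raise ValueError on invalid input."""
    key = subscription_key.strip()
    if not key:
        raise ValueError("Subscription key must not be empty")
    if region_name not in SUPPORTED_REGIONS:
        raise ValueError(f"Unsupported region: {region_name!r}")
    # Azure AI services accept the key in this request header.
    headers = {"Ocp-Apim-Subscription-Key": key}
    return SUPPORTED_REGIONS[region_name], headers
```

Rejecting an unsupported region before any request is sent gives the user a clearer error than a failed HTTP call would.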

Enter Text: Input the text that you want to synthesize into an avatar video.

Submit: Click on the "Submit" button to initiate the avatar synthesis job.
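Behind the button, submitting a job amounts to one authenticated HTTP request. The sketch below only builds that request without sending it. The URL pattern, the `api-version` value, the payload field names, and the voice and avatar choices are assumptions modelled on Azure's batch avatar synthesis REST API and may differ from what `main.py` actually sends, so check them against the current Azure documentation.

```python
import json
import urllib.request
import uuid
from typing import Optional

# Assumption: API version and URL layout; verify against the Azure docs.
API_VERSION = "2024-08-01"


def build_submit_request(region: str, subscription_key: str, text: str,
                         job_id: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a request that creates an avatar synthesis job."""
    if not text.strip():
        raise ValueError("Text to synthesize must not be empty")
    job_id = job_id or str(uuid.uuid4())
    url = (
        f"https://{region}.api.cognitive.microsoft.com"
        f"/avatar/batchsyntheses/{job_id}?api-version={API_VERSION}"
    )
    # Assumed payload shape; voice and avatar values are illustrative only.
    payload = {
        "inputKind": "PlainText",
        "inputs": [{"content": text}],
        "synthesisConfig": {"voice": "en-US-JennyNeural"},
        "avatarConfig": {
            "talkingAvatarCharacter": "lisa",
            "talkingAvatarStyle": "graceful-sitting",
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": subscription_key,
            "Content-Type": "application/json",
        },
        method="PUT",
    )
```

Passing the result to `urllib.request.urlopen` would send it. Keeping construction separate from sending makes the request easy to inspect before it costs anything.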

Monitor Job Status: Once submitted, the application will display the job status (running, succeeded, or failed) and provide a download link for the video upon successful completion.
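Status monitoring of this kind is typically a polling loop. The following is a minimal sketch, not the application's actual code: `wait_for_job` and its `fetch_job` callback are hypothetical, and the status strings and the `outputs.result` field holding the video URL are assumptions about the service's job response.

```python
import time
from typing import Callable, Dict


def wait_for_job(fetch_job: Callable[[], Dict],
                 poll_seconds: float = 5.0,
                 max_polls: int = 120,
                 sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll until the job finishes; return the video download URL.

    fetch_job is any callable returning the job's JSON as a dict, for
    example a function that issues a GET request for the job.
    """
    for _ in range(max_polls):
        job = fetch_job()
        status = job.get("status")
        if status == "Succeeded":
            # Assumed location of the download link in the response.
            return job["outputs"]["result"]
        if status == "Failed":
            raise RuntimeError("Avatar synthesis job failed")
        # Any other status (not started, running) means keep waiting.
        sleep(poll_seconds)
    raise TimeoutError("Job did not finish within the polling limit")
```

Injecting `fetch_job` and `sleep` keeps the loop independent of the HTTP client and lets it be exercised without network access or real delays.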

List Jobs: You can also view a list of previously submitted batch synthesis jobs and their details.
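A job list like this can be reduced to a compact table before display. The sketch below assumes each job is a dict with `id`, `status`, and an ISO 8601 `createdDateTime` field; those field names and the helper `summarize_jobs` are illustrative assumptions, not confirmed details of the service response or of this application.

```python
from typing import Dict, List


def summarize_jobs(jobs: List[Dict]) -> List[Dict[str, str]]:
    """Return one small row per job, newest first."""
    # ISO 8601 timestamps in the same time zone sort correctly as strings.
    ordered = sorted(jobs, key=lambda j: j.get("createdDateTime", ""), reverse=True)
    return [
        {
            "id": job.get("id", "?"),
            "status": job.get("status", "Unknown"),
            "created": job.get("createdDateTime", ""),
        }
        for job in ordered
    ]
```

The resulting list of rows can be handed directly to a table widget such as Streamlit's `st.table`.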

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • Microsoft Azure for providing the Text to Speech with Avatar service.
  • Streamlit for the user-friendly web application framework.

Author

BIBIN KUNJUMON bibinkunjumon2020@gmail.com

Screenshots of App

image
