EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning
The original project never mentions support for AMD ROCm GPUs. However, it depends on PyTorch, and ROCm is already part of PyTorch, which means you can run it on AMD ROCm GPUs (MI-series and Radeon-series). The steps for running it with ROCm are shown below.
```shell
git clone https://github.com/BadToBest/EchoMimic
cd EchoMimic
```
- Tested System Environment: Ubuntu 22.04, ROCm >= 6.0
- Tested GPUs: Radeon Pro W7900 / MI300X
- Tested Python Version: 3.10
```shell
conda create -n echomimic python=3.10
conda activate echomimic
```
Comment out the top three lines of `requirements.txt` and save (to avoid installing the CUDA build of torch):

```
#torch>=2.0.1,<=2.2.2
#torchvision>=0.15.2,<=0.17.2
#torchaudio>=2.0.2,<=2.2.2
```
Install the ROCm build of PyTorch:

```shell
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.1
```
Install the remaining packages with pip:

```shell
pip install -r requirements.txt
```
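To confirm that the ROCm build of PyTorch was actually installed (and not a CUDA or CPU-only build), a quick sanity check is possible: on ROCm builds, `torch.version.hip` is a version string rather than `None`, and AMD GPUs are exposed through the regular `torch.cuda` API via HIP. A minimal sketch:

```python
# Sanity check for a ROCm build of PyTorch.
import torch

print("torch:", torch.__version__)
# torch.version.hip is a ROCm/HIP version string on ROCm builds,
# and None on CUDA or CPU-only builds.
print("HIP:", torch.version.hip)
# On ROCm, AMD GPUs are reported through the torch.cuda API.
print("GPU available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))
```

If `HIP: None` is printed, the CUDA/CPU wheel was installed instead; redo the install with the ROCm index URL above.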
**The remaining steps are the same as for CUDA, as described in the original repo's README.md.**
Download and decompress ffmpeg-static, then set:

```shell
export FFMPEG_PATH=/path/to/ffmpeg-4.4-amd64-static
```
```shell
git lfs install
git clone https://huggingface.co/BadToBest/EchoMimic pretrained_weights
```
The `pretrained_weights` directory is organized as follows:
```
./pretrained_weights/
├── denoising_unet.pth
├── reference_unet.pth
├── motion_module.pth
├── face_locator.pth
├── sd-vae-ft-mse
│   └── ...
├── sd-image-variations-diffusers
│   └── ...
└── audio_processor
    └── whisper_tiny.pt
```
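Before running inference, it can help to verify that the main checkpoints listed above are actually in place (a failed or partial `git lfs` pull is a common cause of errors). A minimal sketch; the file names come from the tree above, and the `check_weights` helper is just for illustration:

```python
from pathlib import Path

# Main EchoMimic checkpoints, as listed in the tree above.
REQUIRED = [
    "denoising_unet.pth",
    "reference_unet.pth",
    "motion_module.pth",
    "face_locator.pth",
    "audio_processor/whisper_tiny.pt",
]

def check_weights(root: str) -> list[str]:
    """Return the list of required files missing under `root`."""
    base = Path(root)
    return [name for name in REQUIRED if not (base / name).exists()]

missing = check_weights("./pretrained_weights")
if missing:
    print("Missing files:", ", ".join(missing))
else:
    print("All main checkpoints found.")
```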
Here, denoising_unet.pth / reference_unet.pth / motion_module.pth / face_locator.pth are the main checkpoints of EchoMimic. The other models in this hub can also be downloaded from their original hubs, thanks to their brilliant work.
Run the Python inference scripts:

```shell
python -u infer_audio2vid.py
python -u infer_audio2vid_pose.py
```
Edit the inference config file `./configs/prompts/animation.yaml` and add your own case:

```yaml
test_cases:
  "path/to/your/image":
    - "path/to/your/audio"
```
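The `test_cases` mapping associates each reference image with a list of driving audio files. A quick way to check that your edit parses as intended (assuming PyYAML is installed; it commonly comes in via the repo's dependencies, but this is an assumption):

```python
import yaml  # PyYAML; assumed available in the environment

# The same structure as in animation.yaml: each image path
# maps to a list of audio paths.
snippet = """
test_cases:
  "path/to/your/image":
    - "path/to/your/audio"
"""

cfg = yaml.safe_load(snippet)
for image, audios in cfg["test_cases"].items():
    # Each value must be a list, even for a single audio file.
    assert isinstance(audios, list)
    print(image, "->", audios)
```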
Then run the Python inference script:

```shell
python -u infer_audio2vid.py
```
(First, download the checkpoints with the `_pose.pth` postfix from Hugging Face.)
Edit `driver_video` and `ref_image` to your paths in `demo_motion_sync.py`, then run:

```shell
python -u demo_motion_sync.py
```
Edit `./configs/prompts/animation_pose.yaml`, then run:

```shell
python -u infer_audio2vid_pose.py
```
Set `draw_mouse=True` at line 135 of `infer_audio2vid_pose.py`. Edit `./configs/prompts/animation_pose.yaml`, then run:

```shell
python -u infer_audio2vid_pose.py
```
Thanks to the contribution from @Robin021, a web UI is also available:

```shell
python -u webgui.py --server_port=3000
```