Terminal Technology Department, Alipay, Ant Group.
<a href='https://arxiv.org/abs/2411.10061'><img src='https://img.shields.io/badge/Paper-Arxiv-red'></a>
<a href='https://github.com/antgroup/echomimic_v2/blob/main/assets/halfbody_demo/wechat_group.png'><img src='https://badges.aleen42.com/src/wechat.svg'></a>
</div>
<div align='center'>
<a href='https://github.com/antgroup/echomimic_v2/discussions/53'><img src='https://img.shields.io/badge/English-Common Problems-orange'></a>
<a href='https://github.com/antgroup/echomimic_v2/discussions/40'><img src='https://img.shields.io/badge/中文版-常见问题汇总-orange'></a>
</div>

## 🚀 EchoMimic Series
* EchoMimicV1: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning. [GitHub](https://github.com/antgroup/echomimic)
* EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation. [GitHub](https://github.com/antgroup/echomimic_v2)

## 📣 Updates
* [2024.11.27] 🔥 Thanks [AiMotionStudio](https://www.youtube.com/@AiMotionStudio) for the [installation tutorial](https://www.youtube.com/watch?v=2ab6U1-nVTQ).
* [2024.11.22] 🔥 [GradioUI](https://github.com/antgroup/echomimic_v2/blob/main/app.py) is now available. Thanks @gluttony-10 for the contribution.
* [2024.11.22] 🔥 [ComfyUI](https://github.com/smthemex/ComfyUI_EchoMimic) is now available. Thanks @smthemex for the contribution.
* [2024.11.21] 🔥 We release the EMTD dataset list and processing scripts.
* [2024.11.21] 🔥 We release our [EchoMimicV2](https://github.com/antgroup/echomimic_v2) code and models.
* [2024.11.15] 🔥 Our [paper](https://arxiv.org/abs/2411.10061) is now publicly available on arXiv.

## 🌅 Gallery

Install packages with `pip`:
```bash
pip install pip -U
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 xformers==0.0.28.post3 --index-url https://download.pytorch.org/whl/cu124
pip install torchao --index-url https://download.pytorch.org/whl/nightly/cu124
pip install -r requirements.txt
pip install --no-deps facenet_pytorch==2.6.0
```
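After installing, a quick sanity check of the pinned Torch stack can save debugging time later. The helper below is not part of the repository; it is a minimal sketch that compares installed versions against the pins in the commands above:

```python
import importlib

def check_versions(expected):
    """Return {package: status} for each pinned package."""
    status = {}
    for name, version in expected.items():
        try:
            mod = importlib.import_module(name)
            installed = getattr(mod, "__version__", "unknown")
            status[name] = "ok" if installed.startswith(version) else f"found {installed}"
        except ImportError:
            status[name] = "not installed"
    return status

if __name__ == "__main__":
    # Versions pinned by the pip commands above.
    print(check_versions({"torch": "2.5.1", "torchvision": "0.20.1", "torchaudio": "2.5.1"}))
```

If any package reports `not installed` or a mismatched version, rerun the corresponding `pip install` line before proceeding.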

### Download ffmpeg-static
├── sd-image-variations-diffusers
│   └── ...
└── audio_processor
    └── tiny.pt
```

In which **denoising_unet.pth** / **reference_unet.pth** / **motion_module.pth** / **pose_encoder.pth** are the main checkpoints of **EchoMimic**. Other models in this hub can also be downloaded from their original hubs, thanks to their authors' brilliant work:

- [audio_processor(whisper)](https://openaipublic.azureedge.net/main/whisper/models/65147644a518d12f04e32d6f3b26facc3f8dd46e5390956a9424a650c0ce22b9/tiny.pt)

### Inference on Demo
Run the Gradio demo:
```bash
python app.py
```
Run the Python inference script:
```bash
python infer.py --config='./configs/prompts/infer.yaml'
```
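To batch several runs of the inference script, a thin wrapper like the sketch below works; it simply shells out to `infer.py` with the same `--config` flag shown above. Any config paths beyond the shipped `infer.yaml` would be your own additions:

```python
import subprocess
import sys

def run_inference(config_path, script="infer.py"):
    """Launch the inference script for one config; return its exit code."""
    cmd = [sys.executable, script, f"--config={config_path}"]
    print("running:", " ".join(cmd))
    return subprocess.run(cmd).returncode

if __name__ == "__main__":
    # Only infer.yaml ships with the repo; list extra configs here as needed.
    for cfg in ["./configs/prompts/infer.yaml"]:
        run_inference(cfg)
```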

## 🌟 Star History
[![Star History Chart](https://api.star-history.com/svg?repos=antgroup/echomimic_v2&type=Date)](https://star-history.com/#antgroup/echomimic_v2&Date)