tags:
- text-to-image
- stable-diffusion
- audio-to-video
language:
- en
library_name: diffusers
V-Express Model Card
Project Page | Paper | Code
Introduction
Models
Audio Encoder
- model_ckpts/wav2vec2-base-960h. (It is also available from the original model card facebook/wav2vec2-base-960h)
Face Analysis
- model_ckpts/insightface_models/models/buffalo_l. (It is also available from the original repository insightface/buffalo_l)
V-Express
- model_ckpts/sd-vae-ft-mse. VAE encoder. (original model card stabilityai/sd-vae-ft-mse)
- model_ckpts/stable-diffusion-v1-5. Only the model configuration file for the UNet is needed here. (original model card stabilityai/stable-diffusion-v1-5 is not implied; see runwayml/stable-diffusion-v1-5)
- model_ckpts/v-express. The video generation model conditioned on audio and V-kps, which we call V-Express.
- Download all .bin model files and put them in the model_ckpts/v-express directory. The files are audio_projection.bin, denoising_unet.bin, motion_module.bin, reference_net.bin, and v_kps_guider.bin.
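The layout above is easy to get wrong when files are downloaded by hand, so a quick check before running inference may help. The following is a minimal sketch, not part of the V-Express code: the helper name find_missing_checkpoints and its root argument are hypothetical, and it assumes the directory and file names are exactly those listed above. It only checks that the paths exist, not that the files are valid.

```python
from pathlib import Path

# Sub-directories of model_ckpts named in the list above.
EXPECTED_SUBDIRS = (
    "wav2vec2-base-960h",
    "insightface_models/models/buffalo_l",
    "sd-vae-ft-mse",
    "stable-diffusion-v1-5",
    "v-express",
)

# The .bin files that the list above says belong in model_ckpts/v-express.
EXPECTED_V_EXPRESS_FILES = (
    "audio_projection.bin",
    "denoising_unet.bin",
    "motion_module.bin",
    "reference_net.bin",
    "v_kps_guider.bin",
)


def find_missing_checkpoints(root="model_ckpts"):
    """Return the expected paths (relative to root) that do not exist.

    Hypothetical helper for illustration; it checks presence only.
    """
    root = Path(root)
    # Directories that are absent.
    missing = [d for d in EXPECTED_SUBDIRS if not (root / d).is_dir()]
    # V-Express weight files that are absent.
    missing += [
        f"v-express/{name}"
        for name in EXPECTED_V_EXPRESS_FILES
        if not (root / "v-express" / name).is_file()
    ]
    return missing


if __name__ == "__main__":
    missing = find_missing_checkpoints()
    if missing:
        print("Missing under model_ckpts:")
        for path in missing:
            print(f"  {path}")
    else:
        print("All expected checkpoint paths are present.")
```

Running the script from the directory that contains model_ckpts prints any expected path that is absent. Pass a different root to find_missing_checkpoints if the checkpoints live elsewhere.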
License
The code of V-Express is released for both academic and commercial use. However, models downloaded from V-Express, whether manually or automatically, are for non-commercial research purposes only. Our released checkpoints are likewise for research purposes only. Users are free to create videos with this tool, but they must comply with local laws and use it responsibly. The developers assume no responsibility for potential misuse by users.