tags:
  - text-to-image
  - stable-diffusion
  - audio-to-video
language:
  - en
library_name: diffusers

V-Express Model Card

Project Page | Paper | Code

Introduction

Models

Audio Encoder

model_ckpts/wav2vec2-base-960h. (It is also available from the original model card facebook/wav2vec2-base-960h)

Face Analysis

model_ckpts/insightface_models/models/buffalo_l. (It is also available from the original repository insightface/buffalo_l)

V-Express

model_ckpts/sd-vae-ft-mse. VAE encoder. (original model card stabilityai/sd-vae-ft-mse)
model_ckpts/stable-diffusion-v1-5. Only the UNet model configuration file is needed here. (original model card runwayml/stable-diffusion-v1-5)
model_ckpts/v-express. The video generation model, conditioned on audio and V-Kps, which we call V-Express.
Download all .bin model files and place them in the model_ckpts/v-express directory. It should contain audio_projection.bin, denoising_unet.bin, motion_module.bin, reference_net.bin, and v_kps_guider.bin.
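
The layout described above can be checked before running inference. The following is a minimal sketch, not part of the V-Express code: the helper name `find_missing_checkpoints` is hypothetical, and it assumes `model_ckpts` sits in the current working directory. The directory and file names are taken from the list above.

```python
from pathlib import Path

# Sub-directories listed in this model card, relative to model_ckpts/.
EXPECTED_DIRS = [
    "wav2vec2-base-960h",
    "insightface_models/models/buffalo_l",
    "sd-vae-ft-mse",
    "stable-diffusion-v1-5",
    "v-express",
]

# Weight files that this model card says belong in model_ckpts/v-express.
V_EXPRESS_WEIGHTS = [
    "audio_projection.bin",
    "denoising_unet.bin",
    "motion_module.bin",
    "reference_net.bin",
    "v_kps_guider.bin",
]


def find_missing_checkpoints(root="model_ckpts"):
    """Return the expected directories and weight files that do not exist.

    Hypothetical helper for illustration; it only checks for presence and
    does not validate file contents.
    """
    root = Path(root)
    missing = [str(root / d) for d in EXPECTED_DIRS if not (root / d).is_dir()]
    weights_dir = root / "v-express"
    missing += [
        str(weights_dir / name)
        for name in V_EXPRESS_WEIGHTS
        if not (weights_dir / name).is_file()
    ]
    return missing


if __name__ == "__main__":
    for path in find_missing_checkpoints():
        print(f"missing: {path}")
```

An empty result means every directory and weight file named in this section is present. The check does not confirm that the files are complete or uncorrupted.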

License

The code of V-Express is released for both academic and commercial use. However, models downloaded from V-Express, whether manually or automatically, are for non-commercial research purposes only. Our released checkpoints are likewise for research purposes only. Users are free to create videos with this tool, but they must comply with local laws and use it responsibly. The developers assume no responsibility for potential misuse by users.