Image-to-Video
Diffusers
Safetensors
ti2v
How to use from the Diffusers library
pip install -U diffusers transformers accelerate
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import load_image, export_to_video

# Use "mps" instead of "cuda" on Apple silicon devices
pipe = DiffusionPipeline.from_pretrained("MCG-NJU/UniAVGen", torch_dtype=torch.bfloat16)
pipe.to("cuda")

prompt = "A man with short gray hair plays a red electric guitar."
image = load_image(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/guitar-man.png"
)

output = pipe(image=image, prompt=prompt).frames[0]
export_to_video(output, "output.mp4")
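
The snippet above hard-codes the device string. As a minimal sketch, assuming PyTorch is the backend, the choice between "cuda", "mps", and "cpu" could be made at runtime instead. `pick_device` is a hypothetical helper name introduced here for illustration; it is not part of Diffusers or of this repository.

```python
import importlib.util


def pick_device() -> str:
    """Return "cuda", "mps", or "cpu" depending on what PyTorch reports as available."""
    # Fall back to CPU when PyTorch is not installed at all.
    if importlib.util.find_spec("torch") is None:
        return "cpu"
    import torch

    if torch.cuda.is_available():
        return "cuda"
    # torch.backends.mps may be absent in older PyTorch builds, so probe defensively.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


device = pick_device()
print(f"Selected device: {device}")
```

The returned string can then be passed to `pipe.to(device)` in place of the literal `"cuda"`. Whether bfloat16 inference runs well on "mps" or "cpu" depends on the hardware and the PyTorch version, so a different dtype may be needed there.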

UniAVGen: Unified Audio and Video Generation with
Asymmetric Cross-Modal Interactions

Guozhen Zhang · Zixiang Zhou · Teng Hu · Ziqiao Peng · Youliang Zhang
Yi Chen · Yuan Zhou · Qinglin Lu · Limin Wang
MCG-NJU   |   Tencent Hunyuan

Paper PDF   |   Project Page

This repository hosts the checkpoint for the paper "UniAVGen: Unified Audio and Video Generation with Asymmetric Cross-Modal Interactions". UniAVGen is a unified framework for high-fidelity joint audio-video generation that addresses key limitations of existing methods, such as poor lip synchronization, insufficient semantic consistency, and limited task generalization.

Citation

If you find this project helpful for your research or applications, please consider leaving a star ⭐️ and citing our paper:

@misc{zhang2025uniavgenunifiedaudiovideo,
      title={UniAVGen: Unified Audio and Video Generation with Asymmetric Cross-Modal Interactions}, 
      author={Guozhen Zhang and Zixiang Zhou and Teng Hu and Ziqiao Peng and Youliang Zhang and Yi Chen and Yuan Zhou and Qinglin Lu and Limin Wang},
      year={2025},
      eprint={2511.03334},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2511.03334}, 
}