# CogVideoX-2B-Img2Vid

Fine-tuned on 10 million videos for high-quality image-to-video generation, with side-by-side (SBS) quality comparable to CogVideoX-5B.

```python
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import load_image, export_to_video

pipe = DiffusionPipeline.from_pretrained(
    "NimVideo/cogvideox-2b-img2vid", torch_dtype=torch.bfloat16
)
pipe.to("cuda")  # switch to "mps" for Apple devices

prompt = "A man with short gray hair plays a red electric guitar."
image = load_image(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/guitar-man.png"
)

output = pipe(image=image, prompt=prompt).frames[0]
export_to_video(output, "output.mp4")
```
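The pipeline call also accepts the usual Diffusers generation knobs. The wrapper below is an illustrative sketch; the step and guidance values are assumptions for demonstration, not tuned defaults for this checkpoint:

```python
def generate_clip(pipe, image, prompt, steps=50, guidance=6.0):
    """Run one image-to-video generation with explicit quality knobs."""
    result = pipe(
        image=image,
        prompt=prompt,
        num_inference_steps=steps,  # more steps: higher quality, slower
        guidance_scale=guidance,    # classifier-free guidance strength
    )
    return result.frames[0]
```

For example, `export_to_video(generate_clip(pipe, image, prompt), "output.mp4")` reproduces the snippet above with the knobs made explicit.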
## Model Highlights
- Fine-tuned on 10 million videos for exceptional image-to-video generation quality.
- Benchmarked with side-by-side (SBS) comparisons to match CogVideoX-5B image-to-video quality.
## Usage Examples
Try it for free on nim.video
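If the pipeline does not fit in GPU memory, the standard Diffusers memory savers apply. This is a generic sketch: these helpers are general Diffusers pipeline methods, not settings documented for this specific checkpoint, and the exact savings depend on your hardware:

```python
def enable_low_vram(pipe):
    """Apply common Diffusers memory savers to a loaded pipeline."""
    # Keep only the active sub-model on the GPU at any one time.
    pipe.enable_model_cpu_offload()
    # Decode the VAE output in tiles to bound peak memory.
    pipe.vae.enable_tiling()
    return pipe
```

Call `enable_low_vram(pipe)` right after `from_pretrained`; with CPU offload enabled, skip the explicit `pipe.to("cuda")`.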
### CLI Inference

```shell
python -m inference.cli_demo \
    --video_path "resources/truck.jpg" \
    --prompt "A truck is driving through a dirt road, showcasing its capability for off-roading." \
    --model_path NimVideo/cogvideox-2b-img2vid
```
### Gradio Web Demo

```shell
python -m inference.gradio_web_demo \
    --model_path NimVideo/cogvideox-2b-img2vid
```
### ComfyUI Example

Find the custom ComfyUI node here.
## Quick Start
1. Clone the repository:

   ```shell
   git clone https://github.com/Nim-Video/cogvideox-2b-img2vid.git
   cd cogvideox-2b-img2vid
   ```

2. Set up a virtual environment:

   ```shell
   python -m venv venv
   source venv/bin/activate
   ```

3. Install the requirements:

   ```shell
   pip install -r requirements.txt
   ```
## Acknowledgements

This project builds on the foundational work of CogVideoX.