Image-to-Video
Diffusers
Safetensors
CogVideoXImageToVideoPipeline
cogvideox
video-generation
nf4
quantized
How to use princeDjoumessi/CogVideoX-5b-nf4 with Diffusers:
pip install -U diffusers transformers accelerate bitsandbytes
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import load_image, export_to_video
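# Switch "cuda" to "mps" on Apple devices. device_map already places the
# pipeline on the target device, so no separate pipe.to("cuda") call is needed.
pipe = DiffusionPipeline.from_pretrained("princeDjoumessi/CogVideoX-5b-nf4", dtype=torch.bfloat16, device_map="cuda")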
prompt = "A man with short gray hair plays a red electric guitar."
image = load_image(
"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/guitar-man.png"
)
output = pipe(image=image, prompt=prompt).frames[0]
export_to_video(output, "output.mp4")

CogVideoX-5b NF4 (4-bit quantized)
NF4-quantized version of THUDM/CogVideoX-5b-I2V, intended to reduce the VRAM footprint (~50% vs bfloat16).
Usage
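To put the VRAM figure in context, here is a rough back-of-the-envelope estimate of weight storage only. It assumes the usual NF4 layout (4 bits per weight, block size 64, double-quantized scaling constants), and the parameter counts are illustrative placeholders, not values from this card. Activations, the bfloat16 VAE, and any layers left unquantized are not counted, so the end-to-end saving will be smaller than the per-weight ratio computed here.

```python
def weight_gib(n_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in GiB for n_params parameters."""
    return n_params * bits_per_param / 8 / 2**30


BF16_BITS = 16.0
# Assumption: 4-bit values, one 8-bit constant per 64 weights, plus one
# 32-bit constant per 256 of those blocks (double quantization).
NF4_DQ_BITS = 4.0 + 8 / 64 + 32 / (64 * 256)

# Hypothetical parameter counts, for illustration only.
components = {"transformer": 5.0e9, "text_encoder": 4.8e9}

for name, n_params in components.items():
    bf16 = weight_gib(n_params, BF16_BITS)
    nf4 = weight_gib(n_params, NF4_DQ_BITS)
    print(f"{name}: bf16 ~ {bf16:.1f} GiB, nf4 ~ {nf4:.1f} GiB "
          f"({nf4 / bf16:.0%} of bf16)")
```

Under these assumptions the quantized weights take roughly a quarter of their bfloat16 size. Measured peak VRAM depends on resolution, frame count, and offloading settings.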
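from diffusers import CogVideoXImageToVideoPipeline, CogVideoXTransformer3DModel
from transformers import T5EncoderModel
import torch

# Load the NF4-quantized transformer from this repository
# (loading bitsandbytes checkpoints requires bitsandbytes to be installed).
transformer = CogVideoXTransformer3DModel.from_pretrained(
    'princeDjoumessi/CogVideoX-5b-nf4',
    subfolder='transformer',
    torch_dtype=torch.bfloat16,
)
# Load the NF4-quantized T5 text encoder from this repository.
text_encoder = T5EncoderModel.from_pretrained(
    'princeDjoumessi/CogVideoX-5b-nf4',
    subfolder='text_encoder',
    torch_dtype=torch.bfloat16,
)
# Take the remaining components (VAE, scheduler, tokenizer) from the base
# model and plug in the quantized transformer and text encoder.
pipe = CogVideoXImageToVideoPipeline.from_pretrained(
    'THUDM/CogVideoX-5b-I2V',
    transformer=transformer,
    text_encoder=text_encoder,
    torch_dtype=torch.bfloat16,
)
# Offload idle components to the CPU and decode the VAE in slices
# to lower peak VRAM.
pipe.enable_model_cpu_offload()
pipe.vae.enable_slicing()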
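The snippet above stops once the pipeline is assembled. A minimal sketch of the generation step might look like the following. The helper name `generate_video`, its default values, and the seed handling are illustrative assumptions rather than part of this model card. It expects an already-built `CogVideoXImageToVideoPipeline` as `pipe`.

```python
def generate_video(pipe, image_url, prompt, out_path="output.mp4",
                   num_frames=49, fps=8, seed=0):
    """Run image-to-video generation with an existing pipeline (hypothetical helper)."""
    # Imported lazily so that defining the helper needs no GPU libraries.
    import torch
    from diffusers.utils import export_to_video, load_image

    # Conditioning image: the first frame the video is generated from.
    image = load_image(image_url)
    # A fixed seed makes runs repeatable; the value itself is arbitrary.
    generator = torch.Generator(device="cpu").manual_seed(seed)

    frames = pipe(
        image=image,
        prompt=prompt,
        num_frames=num_frames,
        num_inference_steps=50,  # assumed setting, tune as needed
        guidance_scale=6.0,      # assumed setting, tune as needed
        generator=generator,
    ).frames[0]

    export_to_video(frames, out_path, fps=fps)
    return out_path
```

For example, `generate_video(pipe, "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/guitar-man.png", "A man with short gray hair plays a red electric guitar.")` would write `output.mp4` to the working directory, reusing the image and prompt from the Diffusers snippet on this page.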
Quantized components
| Component | Quantization |
|---|---|
| Transformer3D | NF4 (double quant) |
| Text Encoder (T5) | NF4 (double quant) |
| VAE | bfloat16 (unchanged) |
| Scheduler | unchanged |
Generated on a Kaggle T4 GPU with bitsandbytes.
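The card does not include the script used to produce these weights. As a hedged sketch, NF4 checkpoints with double quantization could be created along these lines using the bitsandbytes integrations in diffusers and transformers. The function name `quantize_components`, the output directory, and the choice of bfloat16 as compute dtype are assumptions for illustration, not the author's confirmed procedure.

```python
# NF4 settings matching the table above (4-bit NF4 with double quantization).
NF4_SETTINGS = {
    "load_in_4bit": True,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_use_double_quant": True,
}


def quantize_components(base_repo="THUDM/CogVideoX-5b-I2V",
                        out_dir="CogVideoX-5b-nf4"):
    """Quantize the transformer and text encoder to NF4 and save them (hypothetical helper)."""
    # Imported lazily: these calls need a CUDA GPU and bitsandbytes installed.
    import torch
    from diffusers import BitsAndBytesConfig as DiffusersBnbConfig
    from diffusers import CogVideoXTransformer3DModel
    from transformers import BitsAndBytesConfig as TransformersBnbConfig
    from transformers import T5EncoderModel

    # Assumed compute dtype; the card only states that weights are NF4.
    kwargs = dict(NF4_SETTINGS, bnb_4bit_compute_dtype=torch.bfloat16)

    transformer = CogVideoXTransformer3DModel.from_pretrained(
        base_repo,
        subfolder="transformer",
        quantization_config=DiffusersBnbConfig(**kwargs),
        torch_dtype=torch.bfloat16,
    )
    text_encoder = T5EncoderModel.from_pretrained(
        base_repo,
        subfolder="text_encoder",
        quantization_config=TransformersBnbConfig(**kwargs),
        torch_dtype=torch.bfloat16,
    )

    # Save each component into its own subfolder, mirroring the layout
    # that the usage snippet loads from.
    transformer.save_pretrained(f"{out_dir}/transformer")
    text_encoder.save_pretrained(f"{out_dir}/text_encoder")
    return out_dir
```

The VAE and scheduler are left untouched here, consistent with the table, which is why the usage snippet pulls them from the base repository.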
Model tree for princeDjoumessi/CogVideoX-5b-nf4
Base model
zai-org/CogVideoX-5b-I2V