How to use from the Diffusers library
pip install -U diffusers transformers accelerate bitsandbytes
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import load_image, export_to_video

# switch to "mps" for Apple devices
# device_map already places the pipeline on the GPU, so no separate pipe.to("cuda") call is needed
pipe = DiffusionPipeline.from_pretrained("princeDjoumessi/CogVideoX-5b-nf4", torch_dtype=torch.bfloat16, device_map="cuda")

prompt = "A man with short gray hair plays a red electric guitar."
image = load_image(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/guitar-man.png"
)

output = pipe(image=image, prompt=prompt).frames[0]
export_to_video(output, "output.mp4")

CogVideoX-5b NF4 (4-bit quantized)

NF4-quantized version of THUDM/CogVideoX-5b-I2V, intended to reduce the VRAM footprint (roughly 50% lower than bfloat16).
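To make the memory claim concrete, here is a rough back-of-the-envelope sketch of weight storage only. It assumes about 5 billion parameters for the transformer (inferred from the "5b" in the model name, not stated in this card), a quantization block size of 64, and double quantization with 8-bit constants in groups of 256. Activations, the VAE, and any layers left unquantized are ignored, so the result is not directly comparable to the overall figure quoted above.

```python
def weight_gib(num_params: float, bits_per_param: float) -> float:
    """Storage needed for the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 2**30


NUM_PARAMS = 5e9  # assumption: inferred from the "5b" in the model name

BF16_BITS = 16.0
# NF4 stores 4 bits per weight plus one scale constant per block of 64 weights.
# With double quantization the scale is kept in 8 bits, and a second-level
# 32-bit constant is shared by 256 blocks (assumed block sizes).
NF4_DQ_BITS = 4 + 8 / 64 + 32 / (64 * 256)

bf16_gib = weight_gib(NUM_PARAMS, BF16_BITS)
nf4_gib = weight_gib(NUM_PARAMS, NF4_DQ_BITS)
ratio = nf4_gib / bf16_gib

print(f"bfloat16 weights: {bf16_gib:.2f} GiB")
print(f"NF4 weights:      {nf4_gib:.2f} GiB")
print(f"NF4 / bfloat16:   {ratio:.2%}")
```

The quantized weights alone come out at roughly a quarter of their bfloat16 size under these assumptions. Actual VRAM savings will be smaller because some components stay in bfloat16 and activations are not quantized.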

Usage

from diffusers import CogVideoXImageToVideoPipeline, CogVideoXTransformer3DModel
from transformers import T5EncoderModel
import torch

# Load the pre-quantized NF4 transformer from this repository
transformer = CogVideoXTransformer3DModel.from_pretrained(
    'princeDjoumessi/CogVideoX-5b-nf4',
    subfolder='transformer',
    torch_dtype=torch.bfloat16,
)
# Load the pre-quantized NF4 T5 text encoder from this repository
text_encoder = T5EncoderModel.from_pretrained(
    'princeDjoumessi/CogVideoX-5b-nf4',
    subfolder='text_encoder',
    torch_dtype=torch.bfloat16,
)
# Take the remaining components (VAE, scheduler, tokenizer) from the original model
pipe = CogVideoXImageToVideoPipeline.from_pretrained(
    'THUDM/CogVideoX-5b-I2V',
    transformer=transformer,
    text_encoder=text_encoder,
    torch_dtype=torch.bfloat16,
)
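# Keep each sub-model on the CPU until it is needed, and decode the VAE in slices,
# to lower peak VRAM usage
pipe.enable_model_cpu_offload()
pipe.vae.enable_slicing()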

Quantized components

| Component | Quantization |
|---|---|
| Transformer3D | NF4 (double quantization) |
| Text Encoder (T5) | NF4 (double quantization) |
| VAE | bfloat16 (unchanged) |
| Scheduler | unchanged |
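For readers unfamiliar with what NF4 means for the components listed above, the following is a minimal pure-Python sketch of block-wise NF4 quantization. It is illustrative only and is not the bitsandbytes implementation: the 16 level values are reproduced from the published NF4 code book and should be checked against the library before being relied on, the function names are hypothetical, and double quantization (which additionally compresses the per-block scale constants) is left out.

```python
# NF4 code book: 16 levels in [-1, 1]. Values reproduced from the published
# table; verify against bitsandbytes before relying on them.
NF4_LEVELS = [
    -1.0, -0.6961928009986877, -0.5250730514526367, -0.39491748809814453,
    -0.28444138169288635, -0.18477343022823334, -0.09105003625154495, 0.0,
    0.07958029955625534, 0.16093020141124725, 0.24611230194568634,
    0.33791524171829224, 0.44070982933044434, 0.5626170039176941,
    0.7229568362236023, 1.0,
]


def quantize_block(block):
    """Return (absmax, codes): one scale plus a 4-bit index per weight."""
    absmax = max(abs(w) for w in block) or 1.0  # avoid dividing by zero
    codes = [
        min(range(16), key=lambda i: abs(NF4_LEVELS[i] - w / absmax))
        for w in block
    ]
    return absmax, codes


def dequantize_block(absmax, codes):
    """Map each 4-bit index back to an approximate weight."""
    return [NF4_LEVELS[c] * absmax for c in codes]


# Hypothetical block of weights, chosen only for illustration
weights = [0.02, -0.15, 0.31, -0.40, 0.0, 0.27, -0.08, 0.11]
absmax, codes = quantize_block(weights)
restored = dequantize_block(absmax, codes)

for original, approx in zip(weights, restored):
    print(f"{original:+.3f} -> {approx:+.3f}")
```

Each weight is stored as a 4-bit index, and the only full-precision value kept per block is the scale `absmax`. Double quantization reduces the cost of those scales further by quantizing them as well.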

Generated on a Kaggle T4 GPU with bitsandbytes.
