CogVideoX-5b NF4 (4-bit quantized)

NF4-quantized version of THUDM/CogVideoX-5b-I2V, intended to reduce the VRAM footprint (~50% vs bfloat16).
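To give a rough sense of where the saving comes from, the sketch below estimates the storage cost per weight under NF4. It assumes the block sizes described for QLoRA-style double quantization (64 weights per block, 256 first-level constants per second-level block) and a nominal, hypothetical count of 5 billion parameters; neither figure is stated in this model card. It covers weights only, so activations, the VAE, and other bfloat16 buffers are not counted, and the overall VRAM reduction will be smaller than the weight-only ratio.

```python
def nf4_bits_per_param(block_size=64, double_quant=True, second_block=256):
    """Approximate bits stored per weight with NF4 (assumed block sizes)."""
    if double_quant:
        # 8-bit constant per block, plus a 32-bit constant per group of blocks
        overhead = 8 / block_size + 32 / (block_size * second_block)
    else:
        # one 32-bit scaling constant per block
        overhead = 32 / block_size
    return 4 + overhead


def weight_gib(n_params, bits_per_param):
    """Weight storage in GiB for a given parameter count."""
    return n_params * bits_per_param / 8 / 2**30


N_PARAMS = 5_000_000_000  # hypothetical nominal size, not a measured count

bf16_gib = weight_gib(N_PARAMS, 16)
nf4_gib = weight_gib(N_PARAMS, nf4_bits_per_param())
ratio = nf4_gib / bf16_gib  # weight-only ratio, independent of N_PARAMS

print(f"bf16: {bf16_gib:.2f} GiB, NF4: {nf4_gib:.2f} GiB, ratio: {ratio:.3f}")
```

Under these assumptions the quantized weights take roughly a quarter of their bfloat16 size; the unquantized parts of the pipeline account for the rest of the footprint.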

Usage

from diffusers import CogVideoXImageToVideoPipeline, CogVideoXTransformer3DModel
from transformers import T5EncoderModel
import torch

# Load the pre-quantized NF4 transformer from this repository
transformer = CogVideoXTransformer3DModel.from_pretrained(
    'princeDjoumessi/CogVideoX-5b-nf4',
    subfolder='transformer',
    torch_dtype=torch.bfloat16,
)
# Load the pre-quantized NF4 T5 text encoder from this repository
text_encoder = T5EncoderModel.from_pretrained(
    'princeDjoumessi/CogVideoX-5b-nf4',
    subfolder='text_encoder',
    torch_dtype=torch.bfloat16,
)
# Take the remaining components (VAE, scheduler, tokenizer) from the base model
pipe = CogVideoXImageToVideoPipeline.from_pretrained(
    'THUDM/CogVideoX-5b-I2V',
    transformer=transformer,
    text_encoder=text_encoder,
    torch_dtype=torch.bfloat16,
)
# Keep idle components on the CPU and decode the VAE in slices to lower peak VRAM
pipe.enable_model_cpu_offload()
pipe.vae.enable_slicing()
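Once the pipeline is built, generating a clip could look like the sketch below. The helper name `generate_video`, the file paths, and the sampling values in `GENERATION_DEFAULTS` are illustrative assumptions (the values mirror common settings in upstream CogVideoX examples) and are not specified by this model card.

```python
# Assumed sampling settings; tune for your hardware and quality needs.
GENERATION_DEFAULTS = {
    "num_frames": 49,
    "num_inference_steps": 50,
    "guidance_scale": 6.0,
}


def generate_video(pipe, image_path, prompt, out_path="output.mp4", fps=8):
    """Run image-to-video generation with an already constructed pipeline."""
    # Imported lazily so this helper can be defined without diffusers loaded
    from diffusers.utils import export_to_video, load_image

    image = load_image(image_path)  # accepts a local path or a URL
    result = pipe(image=image, prompt=prompt, **GENERATION_DEFAULTS)
    frames = result.frames[0]  # frames of the first (and only) video
    export_to_video(frames, out_path, fps=fps)
    return out_path
```

A call such as `generate_video(pipe, "input.jpg", "A boat drifting on a calm lake")` would write `output.mp4` to the working directory; both arguments here are placeholders.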

Quantized components

| Component | Quantization |
|---|---|
| Transformer3D | NF4 (double quant) |
| Text Encoder (T5) | NF4 (double quant) |
| VAE | bfloat16 (unchanged) |
| Scheduler | unchanged |

Generated on a Kaggle T4 GPU with bitsandbytes.
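For readers who want to reproduce a similar checkpoint, the following is a minimal sketch of how NF4 with double quantization can be requested through bitsandbytes when loading the base model. It is not the author's actual script: the helper name `load_nf4_components`, the choice of bfloat16 as compute dtype, and the loading order are assumptions. It needs a CUDA GPU with `bitsandbytes` installed and a diffusers release that ships `BitsAndBytesConfig`.

```python
# Settings matching "NF4 (double quant)" in the components list above
NF4_KWARGS = {
    "load_in_4bit": True,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_use_double_quant": True,
}


def load_nf4_components(base_repo="THUDM/CogVideoX-5b-I2V"):
    """Quantize the transformer and T5 text encoder to NF4 while loading."""
    # Heavy imports are kept local so the settings can be inspected without a GPU
    import torch
    from diffusers import BitsAndBytesConfig as DiffusersBnbConfig
    from diffusers import CogVideoXTransformer3DModel
    from transformers import BitsAndBytesConfig as TransformersBnbConfig
    from transformers import T5EncoderModel

    transformer = CogVideoXTransformer3DModel.from_pretrained(
        base_repo,
        subfolder="transformer",
        quantization_config=DiffusersBnbConfig(
            bnb_4bit_compute_dtype=torch.bfloat16,  # assumed compute dtype
            **NF4_KWARGS,
        ),
        torch_dtype=torch.bfloat16,
    )
    text_encoder = T5EncoderModel.from_pretrained(
        base_repo,
        subfolder="text_encoder",
        quantization_config=TransformersBnbConfig(
            bnb_4bit_compute_dtype=torch.bfloat16,  # assumed compute dtype
            **NF4_KWARGS,
        ),
        torch_dtype=torch.bfloat16,
    )
    return transformer, text_encoder
```

The returned modules can then be saved with `save_pretrained` into `transformer` and `text_encoder` subfolders, which is the layout the usage snippet expects.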
