DuoNeural/Cosmos3-Nano-GPTQ-4bit

Text-to-Video
Diffusers
English
Cosmos3OmniPipeline
qwen3_vl_text
cosmos3
gptq
4bit
quantization
video-generation
nvidia
mixture-of-transformers
4-bit precision

Instructions to use DuoNeural/Cosmos3-Nano-GPTQ-4bit with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

  • Libraries
  • Diffusers

    How to use DuoNeural/Cosmos3-Nano-GPTQ-4bit with Diffusers:

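    # Install dependencies first (shell command, not Python):
    #   pip install -U diffusers transformers accelerate
    import torch
    from diffusers import DiffusionPipeline
    
    # switch to "mps" for Apple devices
    pipe = DiffusionPipeline.from_pretrained("DuoNeural/Cosmos3-Nano-GPTQ-4bit", dtype=torch.bfloat16, device_map="cuda")
    
    prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
    # Generic auto-generated call. Text-to-video pipelines in Diffusers usually
    # expose their output as `.frames` rather than `.images`; check the model card
    # (README.md) for the loading and output handling this repository expects.
    image = pipe(prompt).images[0]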
Cosmos3-Nano-GPTQ-4bit
11.1 GB
  • 1 contributor
History: 7 commits
DuoNeural
Add Kilonova UMA inference test: 14.4GB VRAM, int32 nibble format, path key fix, AMD ROCm flags
68569a0 verified about 13 hours ago
  • .gitattributes
    1.52 kB
    initial commit about 20 hours ago
  • README.md
    12.3 kB
    Add Kilonova UMA inference test: 14.4GB VRAM, int32 nibble format, path key fix, AMD ROCm flags about 13 hours ago
  • config.json
    1.8 kB
    DuoNeural Cosmos3-Nano GPTQ int4: custom nibble-packed, 2.74x compression (30.3GB→11.1GB), 330 linear layers quantized about 20 hours ago
  • model-00001-packed.safetensors
    5.36 GB
    xet
    DuoNeural Cosmos3-Nano GPTQ int4: custom nibble-packed, 2.74x compression (30.3GB→11.1GB), 330 linear layers quantized about 20 hours ago
  • model-00002-packed.safetensors
    5.36 GB
    xet
    DuoNeural Cosmos3-Nano GPTQ int4: custom nibble-packed, 2.74x compression (30.3GB→11.1GB), 330 linear layers quantized about 20 hours ago
  • model-00003-packed.safetensors
    335 MB
    xet
    DuoNeural Cosmos3-Nano GPTQ int4: custom nibble-packed, 2.74x compression (30.3GB→11.1GB), 330 linear layers quantized about 20 hours ago
  • model_index.json
    514 Bytes
    Add model_index.json (transformer-only quantized release) about 20 hours ago
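
The commit messages above mention a "custom nibble-packed" int4 layout and an "int32 nibble format". The listing does not document the actual layout, so the following is only a minimal sketch of the general idea: eight unsigned 4-bit codes stored in one 32-bit word. The function names, the lowest-nibble-first ordering, and the unsigned 0..15 range are all assumptions made for illustration and may not match the format used in the `.safetensors` shards.

```python
from typing import List

NIBBLES_PER_WORD = 8  # eight 4-bit values fit in one 32-bit word


def pack_nibbles(values: List[int]) -> List[int]:
    """Pack unsigned 4-bit codes (0..15) into 32-bit words.

    Assumption: the first value goes into the lowest 4 bits of each word.
    The real on-disk ordering for this repository is not documented here.
    """
    if len(values) % NIBBLES_PER_WORD != 0:
        raise ValueError("length must be a multiple of 8")
    words = []
    for start in range(0, len(values), NIBBLES_PER_WORD):
        word = 0
        for slot, value in enumerate(values[start:start + NIBBLES_PER_WORD]):
            if not 0 <= value <= 15:
                raise ValueError(f"value {value} does not fit in 4 bits")
            word |= value << (4 * slot)
        words.append(word)
    return words


def unpack_nibbles(words: List[int]) -> List[int]:
    """Inverse of pack_nibbles: recover eight 4-bit codes from each word."""
    values = []
    for word in words:
        for slot in range(NIBBLES_PER_WORD):
            values.append((word >> (4 * slot)) & 0xF)
    return values


codes = [1, 2, 3, 4, 5, 6, 7, 8]
packed = pack_nibbles(codes)
assert unpack_nibbles(packed) == codes  # packing is lossless for 4-bit codes
```

Packing of this kind stores each weight code in 4 bits instead of 16, so the quantized linear layers shrink by roughly 4x before per-group scales and any unquantized tensors are added back. That is consistent with an overall ratio below 4x, such as the 2.74x reported in the commit messages.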