VibeVoice-1.5B - NF4 Quantized
4-bit NF4 quantization of microsoft/VibeVoice-1.5B.
Strategy
Backbone extraction approach:
- Downloaded the raw safetensors files directly, bypassing from_pretrained
- Separated the Qwen2.5-1.5B backbone weights from the audio heads
- Quantized the backbone as a standard Qwen2ForCausalLM using NF4 with double quantization
- Packaged the quantized backbone together with the audio heads, which stay in BF16
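
The separation step can be sketched as a split over tensor names. This is a minimal illustration, not the script used for this repository: the `model.language_model.` prefix, the `model.` target prefix, and the `split_state_dict` helper are all assumptions, and the toy dictionary stands in for tensors read from the safetensors files.

```python
def split_state_dict(state_dict,
                     backbone_prefix="model.language_model.",  # assumed wrapper prefix
                     target_prefix="model."):                  # assumed standalone prefix
    """Split a flat name -> tensor mapping into backbone and audio-head parts."""
    backbone, heads = {}, {}
    for name, tensor in state_dict.items():
        if name.startswith(backbone_prefix):
            # Rename so the key matches what a standalone causal LM would expect.
            new_name = target_prefix + name[len(backbone_prefix):]
            backbone[new_name] = tensor
        else:
            # Everything else (tokenizers, diffusion head, connectors) is kept as-is.
            heads[name] = tensor
    return backbone, heads


# Toy example: strings stand in for real tensors, and the key names are hypothetical.
toy = {
    "model.language_model.layers.0.self_attn.q_proj.weight": "W_q",
    "model.language_model.embed_tokens.weight": "E",
    "model.prediction_head.proj.weight": "W_head",
}
backbone, heads = split_state_dict(toy)
print(sorted(backbone))
print(sorted(heads))
```

Keeping the split as a pure function over names makes it easy to check which tensors end up on each side before any quantization is run.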
| Component | Method | Size |
|---|---|---|
| LLM backbone (Qwen2.5-1.5B) | NF4 + double quant | ~0.8-1.0 GB |
| Audio heads (tokenizers, diffusion, connectors) | BF16 | ~1.8 GB |
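
The backbone figure in the table can be sanity-checked with simple arithmetic. The sketch below assumes the block sizes described in the QLoRA paper (64 weights per block, with the per-block constants themselves quantized to 8 bits in groups of 256) and a round 1.5 billion quantized parameters. Neither number is stated in this card, and the function names are hypothetical.

```python
def nf4_bits_per_param(block_size=64, double_quant=True, dq_block_size=256):
    """Approximate storage cost per weight for NF4, including scaling constants."""
    weight_bits = 4.0
    if double_quant:
        # 8-bit constant per block, plus one FP32 constant per group of blocks.
        const_bits = 8.0 / block_size + 32.0 / (block_size * dq_block_size)
    else:
        # One FP32 constant per block.
        const_bits = 32.0 / block_size
    return weight_bits + const_bits


def estimate_size_gb(num_params, bits_per_param):
    """Convert a parameter count and bits-per-parameter into gigabytes (10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Assumption: about 1.5e9 parameters are quantized.
bits = nf4_bits_per_param()
print(f"{bits:.3f} bits per parameter")
print(f"{estimate_size_gb(1.5e9, bits):.2f} GB")
```

Under these assumptions the estimate comes to roughly 0.77 GB. Any tensors left unquantized, such as embeddings, would add to that figure.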
Source
Quantized from microsoft/VibeVoice-1.5B (MIT license).