tantk/vibevoice-1.5b-bnb-4bit

VibeVoice-1.5B - NF4 Quantized

4-bit NF4 quantization of microsoft/VibeVoice-1.5B.

Strategy

Backbone extraction approach:

1. Downloaded raw safetensors (bypassed from_pretrained)
2. Separated Qwen2.5-1.5B backbone from audio heads
3. Quantized backbone as standard Qwen2ForCausalLM with NF4 + double quant
4. Packaged quantized backbone + BF16 audio heads
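
The separation and quantization steps above can be sketched roughly as follows. This is a minimal illustration, not the script used to build this repository: the key prefix `model.language_model.`, the target prefix `model.`, the helper names `split_backbone` and `make_nf4_config`, and the BF16 compute dtype are all assumptions. The `BitsAndBytesConfig` arguments are the standard transformers options for NF4 with double quantization.

```python
def split_backbone(state_dict, backbone_prefix="model.language_model.",
                   target_prefix="model."):
    """Split a flat name->tensor mapping into backbone and audio-head parts.

    Both prefixes are assumptions about the checkpoint layout. Backbone keys
    are renamed so that they would match a plain Qwen2ForCausalLM state dict.
    """
    backbone, heads = {}, {}
    for name, tensor in state_dict.items():
        if name.startswith(backbone_prefix):
            # Re-root the key under the prefix the standalone LLM expects.
            backbone[target_prefix + name[len(backbone_prefix):]] = tensor
        else:
            # Everything else (tokenizers, diffusion head, connectors) is kept
            # untouched so it can be stored in BF16.
            heads[name] = tensor
    return backbone, heads


def make_nf4_config():
    """Build an NF4 + double-quant config (needs torch, transformers, bitsandbytes)."""
    import torch
    from transformers import BitsAndBytesConfig

    return BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_use_double_quant=True,
        bnb_4bit_compute_dtype=torch.bfloat16,  # assumed, not stated in the card
    )
```

The extracted backbone weights would then be saved as a standalone Qwen2 checkpoint and reloaded with `quantization_config=make_nf4_config()`, while the remaining tensors are written out unchanged.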

Component	Method	Size
LLM backbone (Qwen2.5-1.5B)	NF4 + double quant	~0.8-1.0 GB
Audio heads (tokenizers, diffusion, connectors)	BF16	~1.8 GB
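
For a rough sense of where the NF4 size comes from, the sketch below estimates the storage of block-quantized weights. It assumes the usual bitsandbytes layout (two 4-bit codes per byte, one scale per 64-weight block, and with double quantization an 8-bit scale per block plus one FP32 value per 256 scales) and ignores small constant overheads. The function names are illustrative, and any parameter count passed in is the caller's assumption, since the card does not state how many backbone parameters were quantized.

```python
import math


def nf4_storage_bytes(n_params, block_size=64, double_quant=True,
                      dq_block_size=256):
    """Approximate bytes needed to store n_params weights in NF4."""
    packed = math.ceil(n_params / 2)            # two 4-bit codes per byte
    n_blocks = math.ceil(n_params / block_size)  # one scale per block
    if double_quant:
        # 8-bit first-level scales plus FP32 second-level scales.
        scales = n_blocks + 4 * math.ceil(n_blocks / dq_block_size)
    else:
        scales = 4 * n_blocks                    # FP32 scale per block
    return packed + scales


def bits_per_param(n_params, **kwargs):
    """Effective bits per weight, including scale overhead."""
    return 8 * nf4_storage_bytes(n_params, **kwargs) / n_params
```

Under these assumptions double quantization brings the effective cost down from 4.5 to about 4.13 bits per weight. Layers that are left unquantized, such as embeddings, add to the total at 16 bits per weight.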

Source

Quantized from microsoft/VibeVoice-1.5B (MIT license).

Safetensors

Model size: 2B params
Tensor types: F32, BF16, U8
