# VibeVoice F16 Model
This model has been converted to float16 (f16) precision for reduced memory usage.
## Conversion Details
- Original model: microsoft/VibeVoice-1.5B
- Mixed precision: True
- Memory savings: ~45.7%
- Original size: 10.07 GB
- Converted size: 5.47 GB
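A pure f16 cast halves parameter storage (4 bytes per weight down to 2), so the ~45.7% figure above reflects mixed precision keeping a subset of layers in f32. The sizing arithmetic can be sketched on a toy module (the `nn.Linear` here is a stand-in, not the actual VibeVoice model):

```python
import torch.nn as nn


def param_bytes(model: nn.Module) -> int:
    """Total bytes occupied by a module's parameters."""
    return sum(p.numel() * p.element_size() for p in model.parameters())


# Toy stand-in for the full model; a real conversion would cast
# the pretrained checkpoint instead.
model = nn.Linear(1024, 1024)
f32_size = param_bytes(model)

model = model.half()  # cast all parameters to float16
f16_size = param_bytes(model)

savings = 1 - f16_size / f32_size
print(f"savings: {savings:.1%}")  # 50.0% for a pure f16 cast
```

Keeping some layers in f32 (as this conversion does) lowers the savings below the 50% ceiling, which is why the card reports ~45.7% rather than exactly half.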
## Usage
```python
import torch

from vibevoice.modular.modeling_vibevoice_inference import VibeVoiceForConditionalGenerationInference
from vibevoice.processor.vibevoice_processor import VibeVoiceProcessor

# Load with f16 precision
model = VibeVoiceForConditionalGenerationInference.from_pretrained(
    "./VibeVoice-1.5B-f16",
    torch_dtype=torch.float16,
    device_map="cpu",  # or "cuda" for GPU
)

processor = VibeVoiceProcessor.from_pretrained("./VibeVoice-1.5B-f16")
```
Alternatively, pass the `--use_f16` flag to the demo scripts:

```bash
python demo/inference_from_file.py --model_path ./VibeVoice-1.5B-f16 --use_f16 --device cpu
```
## Notes
- F16 precision may produce minor quality differences compared to the original f32 weights
- Some operations automatically upcast to f32 for numerical stability
- Optimized for CPU inference, but also works on CUDA GPUs
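The upcast-for-stability pattern mentioned above can be illustrated with a minimal sketch (this is a generic example of the technique, not code from the VibeVoice implementation): reductions such as softmax are computed in f32, then the result is cast back to the input dtype.

```python
import torch


def stable_softmax(x: torch.Tensor) -> torch.Tensor:
    # Upcast to f32 for the exponentiation/sum, which overflows
    # easily in f16, then cast back to the caller's dtype.
    return torch.softmax(x.float(), dim=-1).to(x.dtype)


x = torch.randn(4, 8, dtype=torch.float16)
y = stable_softmax(x)
print(y.dtype)  # torch.float16
```

The activations stay f16 end to end from the caller's point of view; only the numerically sensitive reduction runs at higher precision.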