# VibeVoice 1.5B - Intel iGPU Optimized

## 🚀 Microsoft VibeVoice Optimized for Intel iGPU

This is the INT8-quantized version of Microsoft's VibeVoice 1.5B model, optimized for Intel integrated GPUs.
### Features

- **Multi-speaker synthesis** (up to 4 speakers)
- **Up to 90 minutes** of continuous generation
- **2-3x faster** inference than CPU
- **55% smaller** than the original model
- **Intel iGPU optimized** via OpenVINO
### Model Details

- **Base Model**: microsoft/VibeVoice-1.5B
- **Parameters**: 2.7B total (the "1.5B" refers to the LLM backbone)
- **Quantization**: INT8 dynamic
- **Size**: ~2.3 GB (down from 5.4 GB)
- **Sample Rate**: 24 kHz
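To illustrate what "INT8 dynamic" means for the size reduction above, here is a minimal per-tensor sketch of the quantize/dequantize round trip. This is illustrative only; the actual OpenVINO/NNCC tooling quantizes per-channel with calibrated ranges, and the function names here are hypothetical.

```python
def quantize_int8(values):
    """Map floats to int8 with a scale derived from the max magnitude."""
    scale = max(abs(v) for v in values) / 127.0 or 1.0
    q = [max(-128, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats: each weight costs 1 byte instead of 4."""
    return [v * scale for v in q]

weights = [0.02, -1.27, 0.5, 0.9]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)  # close to the original weights
```

Each FP32 weight (4 bytes) is stored as one INT8 byte plus a shared scale, which is where the roughly 55% on-disk saving comes from once non-weight data is included.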
### Usage

```python
import torch
from vibevoice_intel import VibeVoiceIntelOptimized

# Load the INT8-quantized model from the Hugging Face Hub
model = VibeVoiceIntelOptimized.from_pretrained(
    "magicunicorn/vibevoice-intel-igpu"
)

# Generate multi-speaker dialogue from a "Speaker N:" script
script = '''
Speaker 1: Hello, welcome to our podcast!
Speaker 2: Thanks for having me.
'''
audio = model.synthesize(script)
```
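The `Speaker N:` script format above is simple enough to validate before synthesis. The following standalone parser is a hypothetical helper (not part of the `vibevoice_intel` package) that checks the 4-speaker limit and splits a script into turns:

```python
import re

def parse_script(script, max_speakers=4):
    """Split a 'Speaker N: text' script into (speaker, line) pairs."""
    turns = []
    for raw in script.strip().splitlines():
        m = re.match(r"Speaker (\d+):\s*(.+)", raw.strip())
        if not m:
            continue  # skip blank or non-dialogue lines
        speaker = int(m.group(1))
        if speaker > max_speakers:
            raise ValueError(f"only up to {max_speakers} speakers are supported")
        turns.append((speaker, m.group(2)))
    return turns

script = """
Speaker 1: Hello, welcome to our podcast!
Speaker 2: Thanks for having me.
"""
print(parse_script(script))
# → [(1, 'Hello, welcome to our podcast!'), (2, 'Thanks for having me.')]
```

Validating the script up front gives a clear error instead of a failed (or silently truncated) synthesis run.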
### Hardware Requirements

- Intel Iris Xe, Arc iGPU, or UHD Graphics
- 8 GB+ system RAM
- OpenVINO Runtime
### Performance

- **Inference**: 2-3x faster than CPU
- **Power**: ~15 W (vs. 35 W+ on CPU)
- **Memory**: ~4 GB peak usage
### License

MIT

### Citation

- Original model: Microsoft VibeVoice
- Optimization: Magic Unicorn Inc.