---
title: Voxtral Realtime 4B
emoji: 🎙️
colorFrom: yellow
colorTo: red
sdk: static
pinned: false
license: apache-2.0
short_description: Speech-to-Text in the browser with transformers.js + WebGPU
language:
  - en
  - fr
  - es
  - de
  - ru
  - zh
  - ja
  - it
  - pt
  - nl
  - ar
  - hi
  - ko
---

# Voxtral Realtime 4B — Live Speech-to-Text

Real-time speech transcription running entirely in your browser using [Voxtral-Mini-4B-Realtime](https://huggingface.co/mistralai/Voxtral-Mini-4B-Realtime-2602) via [transformers.js](https://github.com/huggingface/transformers.js) + WebGPU.

- Click the mic to start listening
- VAD automatically detects speech segments
- Words appear as the model generates them
- All processing happens locally — no server needed
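
This README does not say which voice activity detector (VAD) the demo uses, so the TypeScript sketch below only illustrates what "detecting speech segments" means. It is a simplified energy-threshold detector, not the demo's implementation. The function name `detectSpeechSegments` and the parameters `frameSize`, `threshold`, and `hangoverFrames` (with their default values) are hypothetical choices made for this example.

```typescript
// Hypothetical, simplified energy-based VAD for illustration only.
// The demo's actual VAD is not described in this README.

interface Segment {
  start: number; // index of the first sample in the segment
  end: number;   // index one past the last speech sample
}

// Root-mean-square energy of one frame of audio samples.
function rms(frame: Float32Array): number {
  let sum = 0;
  for (let i = 0; i < frame.length; i++) sum += frame[i] * frame[i];
  return frame.length > 0 ? Math.sqrt(sum / frame.length) : 0;
}

// Split mono PCM samples (range -1..1) into speech segments.
// A segment opens on the first frame whose RMS reaches `threshold`
// and closes after `hangoverFrames` consecutive quiet frames.
function detectSpeechSegments(
  samples: Float32Array,
  frameSize = 512,      // assumed frame length in samples
  threshold = 0.02,     // assumed RMS level that counts as speech
  hangoverFrames = 3,   // assumed number of quiet frames that end a segment
): Segment[] {
  const segments: Segment[] = [];
  let start = -1;        // -1 means no segment is currently open
  let lastSpeechEnd = 0; // end of the most recent speech frame
  let silentFrames = 0;

  for (let offset = 0; offset < samples.length; offset += frameSize) {
    const end = Math.min(offset + frameSize, samples.length);
    const frame = samples.subarray(offset, end);

    if (rms(frame) >= threshold) {
      if (start < 0) start = offset;
      lastSpeechEnd = end;
      silentFrames = 0;
    } else if (start >= 0) {
      silentFrames++;
      if (silentFrames >= hangoverFrames) {
        segments.push({ start, end: lastSpeechEnd });
        start = -1;
      }
    }
  }

  // Close a segment that is still open when the audio ends.
  if (start >= 0) segments.push({ start, end: lastSpeechEnd });
  return segments;
}
```

In a pipeline like the one described above, each returned segment would be the slice of microphone audio handed to the speech model. A production VAD is typically a small neural model rather than a fixed energy threshold, which tends to misfire on background noise.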