---
license: apache-2.0
tags:
- text-to-speech
- tts
- voice-cloning
- omnivoice
- safetensors
- maestraea
language:
- multilingual
pipeline_tag: text-to-speech
base_model: k2-fsa/OmniVoice
---

# OmniVoice (Mæstræa Mirror)

**Multi-Lingual TTS & Voice Cloning: 600+ Languages**

[Original Model](https://huggingface.co/k2-fsa/OmniVoice) by [k2-fsa (Next-gen Kaldi)](https://github.com/k2-fsa) · Apache 2.0

> This is a mirror of the OmniVoice model weights for use with [Mæstræa AI Workstation](https://github.com/AEmotionStudio/Maestraea). All credits go to the original authors.

## What's in This Repo

| Path | Description | Size |
|------|-------------|------|
| `model.safetensors` | Main OmniVoice model | ~3 GB |
| `audio_tokenizer/model.safetensors` | Audio tokenizer | ~260 MB |
| `tokenizer.json` | Text tokenizer | ~17 MB |
| `config.json` | Model configuration | < 1 KB |

## What OmniVoice Does

OmniVoice is a multi-lingual TTS and voice cloning model supporting 600+ languages with near real-time inference (RTF ~0.025). It supports three modes:

- **Auto Voice**: Generate speech from text with a default voice
- **Voice Cloning**: Clone any voice from a 3–15s reference audio sample
- **Voice Design**: Describe the desired voice characteristics in text

### Key Features

- 600+ language support
- Near real-time inference
- Long-form text auto-chunking for constant VRAM usage
- ~3–8 GB VRAM depending on mode

## Usage with Mæstræa

These models are automatically downloaded by the Mæstræa AI Workstation backend. They can also be loaded manually:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("AEmotionStudio/omnivoice-models")
tokenizer = AutoTokenizer.from_pretrained("AEmotionStudio/omnivoice-models")
```

## License

Apache 2.0, same as the original OmniVoice release.

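
## Worked Example: RTF and Chunking

The RTF figure and the auto-chunking feature listed under "What OmniVoice Does" can be made concrete with a small sketch. It assumes the usual definition of real-time factor (compute time divided by the duration of the generated audio) and uses a deliberately simple sentence-packing splitter. The function names `synthesis_seconds` and `split_into_chunks`, and the 200-character limit, are hypothetical choices for illustration; they are not part of OmniVoice or Mæstræa, and the actual chunking logic may differ.

```python
import re


def synthesis_seconds(audio_seconds: float, rtf: float = 0.025) -> float:
    """Estimate compute time for `audio_seconds` of speech at a given RTF.

    Assumes RTF = compute time / audio duration (the usual definition).
    """
    return audio_seconds * rtf


def split_into_chunks(text: str, max_chars: int = 200) -> list[str]:
    """Greedily pack whole sentences into chunks of at most `max_chars`.

    Illustrative only: this is NOT OmniVoice's real chunker. A single
    sentence longer than `max_chars` is kept as its own oversized chunk.
    """
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}" if current else sentence
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow: close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    # One minute of audio at RTF 0.025 needs about 1.5 s of compute.
    print(synthesis_seconds(60.0))
    print(split_into_chunks("One. Two. Three.", max_chars=9))
```

Under these assumptions, an RTF of 0.025 corresponds to roughly 40 seconds of audio per second of compute on whatever hardware the figure was measured on. Bounding the size of each chunk and synthesizing chunks one at a time is one plausible way to keep memory use roughly independent of total text length.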
## Credits

- **Model**: [k2-fsa/OmniVoice](https://github.com/k2-fsa/OmniVoice)
- **Paper**: See original repo for citation
- **Mirror by**: [AEmotionStudio](https://huggingface.co/AEmotionStudio)