---
license: apache-2.0
tags:
- text-to-speech
- tts
- voice-cloning
- omnivoice
- safetensors
- maestraea
language:
- multilingual
pipeline_tag: text-to-speech
base_model: k2-fsa/OmniVoice
---
# OmniVoice (Mæstræa Mirror)
**Multi-Lingual TTS & Voice Cloning — 600+ Languages**

[Original Model](https://huggingface.co/k2-fsa/OmniVoice) by [k2-fsa (Next-gen Kaldi)](https://github.com/k2-fsa) · Apache 2.0
> This is a mirror of the OmniVoice model weights for use with [Mæstræa AI Workstation](https://github.com/AEmotionStudio/Maestraea). All credits go to the original authors.
## What's in This Repo
| Path | Description | Size |
|------|-------------|------|
| `model.safetensors` | Main OmniVoice model | ~3 GB |
| `audio_tokenizer/model.safetensors` | Audio tokenizer | ~260 MB |
| `tokenizer.json` | Text tokenizer | ~17 MB |
| `config.json` | Model configuration | < 1 KB |
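
The files listed above can be fetched and checked with a short script. The following is a minimal sketch, assuming the `huggingface_hub` package is installed and that the mirror's repo ID is `AEmotionStudio/omnivoice-models` (the ID used in the usage example of this card). `download_mirror` and `missing_files` are hypothetical helper names for illustration, not part of Mæstræa or OmniVoice.

```python
import os

# Repo ID taken from the usage example in this card.
REPO_ID = "AEmotionStudio/omnivoice-models"

# Paths from the "What's in This Repo" table.
EXPECTED_FILES = [
    "model.safetensors",
    "audio_tokenizer/model.safetensors",
    "tokenizer.json",
    "config.json",
]


def download_mirror(local_dir: str) -> str:
    """Download only the listed files into local_dir and return the folder path."""
    # Imported lazily so the rest of this sketch works without the package.
    from huggingface_hub import snapshot_download  # pip install huggingface_hub

    return snapshot_download(
        repo_id=REPO_ID,
        local_dir=local_dir,
        allow_patterns=EXPECTED_FILES,
    )


def missing_files(local_dir: str) -> list[str]:
    """Return the expected files that are not present under local_dir."""
    return [
        name
        for name in EXPECTED_FILES
        if not os.path.isfile(os.path.join(local_dir, name))
    ]
```

Calling `download_mirror("./omnivoice")` followed by `missing_files("./omnivoice")` should return an empty list once the download has completed.
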
## What OmniVoice Does
OmniVoice is a multi-lingual TTS and voice cloning model that supports 600+ languages with near real-time inference (real-time factor, RTF, of about 0.025, i.e. roughly 0.025 s of compute per second of generated audio). It offers three modes:
- **Auto Voice** — Generate speech from text with a default voice
- **Voice Cloning** — Clone any voice from a 3–15s reference audio sample
- **Voice Design** — Describe the desired voice characteristics in text
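
For voice cloning, it can help to check the length of the reference clip before passing it to the model. Below is a minimal sketch using only Python's standard `wave` module, assuming the reference is an uncompressed WAV file. The helper names and the use of 3 s and 15 s as hard limits are illustrative assumptions and are not part of the OmniVoice API.

```python
import wave

# Bounds taken from the "3-15s reference audio" guidance in this card.
MIN_REF_SECONDS = 3.0
MAX_REF_SECONDS = 15.0


def wav_duration_seconds(path: str) -> float:
    """Return the duration of an uncompressed WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def is_valid_reference(path: str) -> bool:
    """Check whether the clip length falls inside the suggested range."""
    return MIN_REF_SECONDS <= wav_duration_seconds(path) <= MAX_REF_SECONDS
```

Other formats such as MP3 or FLAC would need a different reader; this sketch only covers WAV.
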
### Key Features
- 600+ language support
- Near real-time inference
- Long-form text auto-chunking for constant VRAM usage
- ~3–8 GB VRAM depending on mode
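
Two of the features above can be illustrated with a small sketch. Real-time factor (RTF) is synthesis time divided by the duration of the generated audio, and chunking keeps the amount of text handled per synthesis call bounded. The code below is an illustration under stated assumptions, not OmniVoice's actual chunking logic: it splits only on `.`, `!` and `?` (which will not suit every language), and `max_chars` is a hypothetical limit.

```python
import re


def estimate_synthesis_seconds(audio_seconds: float, rtf: float = 0.025) -> float:
    """Estimate compute time from RTF = synthesis time / audio duration."""
    return audio_seconds * rtf


def chunk_text(text: str, max_chars: int = 200) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_chars.

    A single sentence longer than max_chars is kept as its own chunk
    rather than being split mid-sentence.
    """
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow: close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

At an RTF of 0.025, `estimate_synthesis_seconds(60.0)` gives about 1.5 seconds for one minute of audio; actual speed depends on hardware and mode.
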
## Usage with Mæstræa
These models are automatically downloaded by the Mæstræa AI Workstation backend. They can also be loaded manually:
```python
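from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the mirrored weights and text tokenizer from the Hugging Face Hub.
# This only works if transformers recognizes the architecture in config.json.
model = AutoModelForCausalLM.from_pretrained("AEmotionStudio/omnivoice-models")
tokenizer = AutoTokenizer.from_pretrained("AEmotionStudio/omnivoice-models")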
```
## License
Apache 2.0 — same as the original OmniVoice release.
## Credits
- **Model**: [k2-fsa/OmniVoice](https://github.com/k2-fsa/OmniVoice)
- **Paper**: See original repo for citation
- **Mirror by**: [AEmotionStudio](https://huggingface.co/AEmotionStudio)