	---
	license: apache-2.0
	tags:
	- text-to-speech
	- tts
	- voice-cloning
	- omnivoice
	- safetensors
	- maestraea
	language:
	- multilingual
	pipeline_tag: text-to-speech
	base_model: k2-fsa/OmniVoice
	---

	# OmniVoice (Mæstræa Mirror)

	Multi-Lingual TTS & Voice Cloning — 600+ Languages

	[Original Model](https://huggingface.co/k2-fsa/OmniVoice) by [k2-fsa (Next-gen Kaldi)](https://github.com/k2-fsa) · Apache 2.0

	> This is a mirror of the OmniVoice model weights for use with [Mæstræa AI Workstation](https://github.com/AEmotionStudio/Maestraea). All credit goes to the original authors.

	## What's in This Repo

	| Path | Description | Size |
	|------|-------------|------|
	| `model.safetensors` | Main OmniVoice model | ~3 GB |
	| `audio_tokenizer/model.safetensors` | Audio tokenizer | ~260 MB |
	| `tokenizer.json` | Text tokenizer | ~17 MB |
	| `config.json` | Model configuration | < 1 KB |

	## What OmniVoice Does

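	OmniVoice is a multilingual TTS and voice cloning model that supports 600+ languages and runs well faster than real time: its real-time factor (RTF) is about 0.025, i.e. roughly 1 second of compute per 40 seconds of generated audio. It supports three modes: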

	- Auto Voice — Generate speech from text with a default voice
	- Voice Cloning — Clone any voice from a 3–15s reference audio sample
	- Voice Design — Describe the desired voice characteristics in text
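
For reference, the RTF figure quoted above is the ratio of generation time to the duration of the generated audio, so lower is faster. The helper names below are hypothetical and only illustrate the arithmetic; the actual speed depends on hardware and mode, which this card does not specify.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent generating / duration of the generated audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def estimated_processing_seconds(audio_seconds: float, rtf: float = 0.025) -> float:
    """Rough compute-time estimate for a clip, assuming the RTF holds on your hardware."""
    return audio_seconds * rtf


if __name__ == "__main__":
    # 1 s of compute for 40 s of audio corresponds to the RTF of 0.025 quoted in this card.
    print(real_time_factor(1.0, 40.0))
    # Estimated compute time for one minute of speech at that RTF.
    print(estimated_processing_seconds(60.0))
```

At an RTF of 0.025, one minute of speech would take about 1.5 seconds to generate, assuming the quoted figure applies to your setup.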

	### Key Features

	- 600+ language support
	- Near real-time inference
	- Long-form text auto-chunking for constant VRAM usage
	- ~3–8 GB VRAM depending on mode
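
To illustrate why chunking keeps memory bounded, here is a minimal sketch of sentence-based chunking. It is not OmniVoice's actual algorithm: `chunk_text` and `max_chars` are hypothetical names, and the punctuation-based sentence split is a simplification that would not suit every language the model supports. The idea is only that each synthesis call sees a bounded amount of text, so peak VRAM does not grow with document length.

```python
import re
from typing import List


def chunk_text(text: str, max_chars: int = 200) -> List[str]:
    """Greedily pack whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept as its own oversized chunk
    rather than being split mid-sentence.
    """
    # Naive split after '.', '!' or '?' followed by whitespace (assumption: Latin-style punctuation).
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.strip()]
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would exceed the budget: close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    for i, chunk in enumerate(chunk_text("One. Two. Three.", max_chars=9)):
        print(i, chunk)
```

Each chunk would then be synthesized separately and the resulting audio concatenated.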

	## Usage with Mæstræa

	These models are automatically downloaded by the Mæstræa AI Workstation backend. They can also be loaded manually:

	```python
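	# Assumes the OmniVoice architecture can be resolved through the transformers Auto classes;
	# if it cannot, use the loading code from the original k2-fsa/OmniVoice repo instead.
	from transformers import AutoModelForCausalLM, AutoTokenizer

	repo_id = "AEmotionStudio/omnivoice-models"
	model = AutoModelForCausalLM.from_pretrained(repo_id)  # main model weights (model.safetensors)
	tokenizer = AutoTokenizer.from_pretrained(repo_id)  # text tokenizer (tokenizer.json)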
	```
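
If you only want the files on disk, for example to pre-populate a cache before going offline, the following sketch downloads the repository with `huggingface_hub` and checks that the files listed in the table above are present. It assumes `huggingface_hub` is installed; `download_mirror`, `missing_files`, and the target directory are hypothetical names chosen for illustration, not part of Mæstræa.

```python
from pathlib import Path
from typing import List

REPO_ID = "AEmotionStudio/omnivoice-models"

# Files listed in the "What's in This Repo" table; the repo may contain others.
EXPECTED_FILES = [
    "model.safetensors",
    "audio_tokenizer/model.safetensors",
    "tokenizer.json",
    "config.json",
]


def missing_files(local_dir: str) -> List[str]:
    """Return the expected files that are not present under local_dir."""
    root = Path(local_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


def download_mirror(local_dir: str) -> str:
    """Download the whole repository into local_dir and return the local path."""
    # Imported lazily so the check above can be used without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)


if __name__ == "__main__":
    target = "./omnivoice-models"  # hypothetical target directory
    download_mirror(target)
    print("Missing files:", missing_files(target))
```

The whole repository is downloaded rather than a filtered subset, since the loader may need files beyond the four listed in the table.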

	## License

	Apache 2.0 — same as the original OmniVoice release.

	## Credits

	- Model: [k2-fsa/OmniVoice](https://github.com/k2-fsa/OmniVoice)
	- Paper: See original repo for citation
	- Mirror by: [AEmotionStudio](https://huggingface.co/AEmotionStudio)