OronTTS — F5-TTS for Mongolian & Kazakh
OronTTS is a non-autoregressive text-to-speech model based on F5-TTS (Flow Matching + Diffusion Transformer) for Mongolian (Khalkha, Cyrillic script) and Kazakh (Cyrillic script).
Model Details
| Parameter | Value |
|---|---|
| Architecture | F5-TTS (OT-CFM + DiT + Vocos) |
| dim | 1024 |
| depth | 22 |
| heads | 16 |
| vocab_size | 65 |
| sample_rate | 24000 Hz |
| mel_bins | 100 |
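The hyperparameters in the table can be gathered into a single object for reference. The following is a minimal sketch, assuming a hypothetical `OronTTSConfig` dataclass; it is not the repository's actual config class, and it is not guaranteed to be accepted by `F5TTS.from_config`. The derived `head_dim` is simply `dim` divided by `heads`.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class OronTTSConfig:
    """Hypothetical container mirroring the Model Details table."""

    dim: int = 1024          # transformer hidden size
    depth: int = 22          # number of DiT blocks
    heads: int = 16          # attention heads per block
    vocab_size: int = 65     # size of the text vocabulary
    sample_rate: int = 24000 # output audio sample rate in Hz
    mel_bins: int = 100      # mel-spectrogram channels

    @property
    def head_dim(self) -> int:
        # Per-head width implied by dim and heads.
        return self.dim // self.heads


table_config = OronTTSConfig()
print(asdict(table_config))
print("head_dim:", table_config.head_dim)
```

Check the repository files for the real configuration format before relying on these field names.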
Usage
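from src.models.f5tts import F5TTS
from src.utils.checkpoint import CheckpointManager

# `config` must be defined beforehand and match the values under Model Details.
model = F5TTS.from_config(config)

# Load the trained weights through the checkpoint manager.
cm = CheckpointManager("checkpoints")
cm.load(model, path="f5tts_best.pt", device="cuda")

# Synthesize Mongolian speech, using ref.wav as the reference audio.
wav = model.synthesize(
    text="Сайн байна уу",  # "Hello" in Mongolian
    lang="mn",
    ref_audio_path="ref.wav",
)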
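The snippet above stops once `wav` is returned. One way to write the result to disk is sketched below, under the assumption that the output can be treated as a one-dimensional sequence of mono float samples in [-1, 1] at 24000 Hz; the actual return type of `synthesize` is not documented here, so a tensor may first need converting to a plain list or array. The `save_wav` helper is hypothetical and uses only the Python standard library, with a generated sine tone standing in for model output.

```python
import math
import struct
import wave

SAMPLE_RATE = 24000  # matches sample_rate under Model Details


def save_wav(samples, path, sample_rate=SAMPLE_RATE):
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    pcm = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to the valid range
        pcm += struct.pack("<h", int(round(s * 32767)))
    with wave.open(path, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 2 bytes per sample (16-bit)
        f.setframerate(sample_rate)
        f.writeframes(bytes(pcm))


# Stand-in for model output: one second of a 440 Hz sine tone.
demo = [0.5 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE)
        for n in range(SAMPLE_RATE)]
save_wav(demo, "out.wav")
```

With a real model output, replace `demo` with the samples returned by `synthesize`.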
Training
Trained on btsee/mbspeech_mn (3,846 Mongolian speech samples).
License
MIT