---
license: apache-2.0
base_model: hexgrad/Kokoro-82M
pipeline_tag: text-to-speech
language:
- en
- es
- fr
- hi
- it
- ja
- pt
- zh
tags:
- text-to-speech
- tts
- kokoro
- onnx
- safetensors
library_name: nobodywho
---
# Kokoro v1
## Model Capabilities
- **Text-to-speech** — 24 kHz mono output
- **Multilingual** — American/British English, Spanish, French, Hindi, Italian, Japanese, Brazilian Portuguese, Mandarin Chinese
- **54 voices** across 9 language variants (American and British English count separately), named `<lang><gender>_<name>` (e.g. `af_heart`, `bm_george`, `jf_alpha`)
The full description can be found at [the original model page](https://huggingface.co/hexgrad/Kokoro-82M).
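The voice naming convention above can be split into its parts with a few lines of standard-library Rust. This is a minimal sketch, not part of NobodyWho: `VoiceId` and `parse_voice_id` are hypothetical names, and it assumes the language and gender codes are exactly one character each, as in the listed examples.

```rust
/// Parsed form of a voice ID such as `af_heart` (hypothetical helper type).
#[derive(Debug, PartialEq)]
pub struct VoiceId<'a> {
    /// First character of the prefix: the language code.
    pub lang: char,
    /// Second character of the prefix: the gender code.
    pub gender: char,
    /// Voice name after the underscore.
    pub name: &'a str,
}

/// Splits `<lang><gender>_<name>` into its parts.
/// Returns `None` if the ID does not match that shape.
/// Assumption: both codes are single characters.
pub fn parse_voice_id(id: &str) -> Option<VoiceId<'_>> {
    // Split at the first underscore into prefix and name.
    let (prefix, name) = id.split_once('_')?;
    let mut chars = prefix.chars();
    let lang = chars.next()?;
    let gender = chars.next()?;
    // The prefix must be exactly two characters and the name non-empty.
    if chars.next().is_some() || name.is_empty() {
        return None;
    }
    Some(VoiceId { lang, gender, name })
}
```

For example, `parse_voice_id("bm_george")` yields a `VoiceId` with `lang` `'b'`, `gender` `'m'`, and `name` `"george"`, while an ID without an underscore yields `None`.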
## Getting Started
Run with NobodyWho (the model is fetched and cached on first use):
```rust
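use nobodywho::tts::{Tts, TtsConfig};
// `?` needs a function that returns `Result`, so the calls are wrapped in `main`.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let tts = Tts::new(TtsConfig::kokoro("NobodyWho/kokoro-v1"))?;
    let wav = tts.synthesize("Hello from NobodyWho!")?;
    std::fs::write("hello.wav", wav)?;
    Ok(())
}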
```
## Benchmarks
Measured with `nobodywho` on **Apple M4 Pro, CPU**. The real-time factor is audio duration divided by wall-clock time, so higher is faster:
| Input    | Audio duration | Wall-clock time | Real-time factor |
|----------|----------------|-----------------|------------------|
| 10 words | 3.3 s          | ~0.48 s         | **6.9×**         |
| 30 words | 10.9 s         | ~1.35 s         | **8.1×**         |
| 70 words | 18.7 s         | ~2.26 s         | **8.3×**         |
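The real-time factors in the table are consistent with dividing the audio duration by the wall-clock time. The sketch below reproduces that arithmetic; `real_time_factor` and `round_to_tenth` are hypothetical helper names, not part of `nobodywho`.

```rust
/// Seconds of audio produced per second of wall-clock time.
/// Values above 1.0 mean synthesis runs faster than playback.
pub fn real_time_factor(audio_secs: f64, wallclock_secs: f64) -> f64 {
    audio_secs / wallclock_secs
}

/// Rounds to one decimal place, matching the precision used in the table.
pub fn round_to_tenth(x: f64) -> f64 {
    (x * 10.0).round() / 10.0
}
```

Applied to the first row, `real_time_factor(3.3, 0.48)` is 6.875, which rounds to the 6.9× shown.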
## Credits
Original model and training by [@hexgrad](https://huggingface.co/hexgrad). Thanks!