POST /tts
Synthesize text to audio. Returns raw audio bytes.
// Request body
{
  "text": "Hello world!",
  "voice": "af_heart",
  "speed": 1.0,
  "output_format": "wav"
}

// Response: audio/wav or audio/mpeg stream
// Headers:
//   X-Duration-Seconds: 3.45
//   Content-Disposition: attachment; filename="kokoro_af_heart.wav"
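The two response headers listed above can be read without touching the audio bytes. The following is a minimal sketch, not part of the service itself: `parse_tts_headers` is a hypothetical helper name, and it assumes the headers arrive as a plain mapping with exactly the capitalization shown above (the `headers` object from `requests` is case-insensitive, a plain `dict` is not).

```python
from email.message import Message
from typing import Mapping, Optional, Tuple


def parse_tts_headers(
    headers: Mapping[str, str],
) -> Tuple[Optional[str], Optional[float]]:
    """Return (filename, duration_seconds) from a /tts response's headers.

    Hypothetical helper: either value is None when its header is
    missing or cannot be parsed.
    """
    filename = None
    disposition = headers.get("Content-Disposition")
    if disposition:
        # Let the standard library parse the header parameters
        # (handles the quoting around the filename).
        msg = Message()
        msg["Content-Disposition"] = disposition
        filename = msg.get_filename()

    duration = None
    raw = headers.get("X-Duration-Seconds")
    if raw is not None:
        try:
            duration = float(raw)
        except ValueError:
            duration = None  # header present but not numeric

    return filename, duration
```

With a `requests` response this could be called as `parse_tts_headers(resp.headers)`, for example to save the audio under the server-suggested filename instead of a hard-coded one.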
GET /voices
List all available voices.
// Response
{
  "voices": {
    "af_heart": { "label": "Heart", "lang": "en-US", "gender": "female", "flag": "🇺🇸" },
    ...
  },
  "total": 42
}
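Because each voice entry carries a `lang` field, the listing can be narrowed to one language on the client side. This is a sketch under stated assumptions: `fetch_voices` and `voices_for_lang` are hypothetical names, the base URL is taken from the Quick Start, and the standard library's `urllib` is used here only so that the example needs no extra packages.

```python
import json
from typing import List
from urllib.request import urlopen

# Assumption: same host and port as the Quick Start example.
BASE_URL = "http://localhost:7860"


def fetch_voices(base_url: str = BASE_URL) -> dict:
    """GET /voices and return the parsed JSON body."""
    with urlopen(f"{base_url}/voices", timeout=10) as resp:
        return json.load(resp)


def voices_for_lang(payload: dict, lang: str) -> List[str]:
    """Return the sorted voice IDs whose "lang" field equals `lang`."""
    return sorted(
        voice_id
        for voice_id, info in payload.get("voices", {}).items()
        if info.get("lang") == lang
    )
```

Against a running server, `voices_for_lang(fetch_voices(), "en-US")` would return the IDs that can be passed as `voice` in a `/tts` request.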
GET /health
Model and device status.
// Response
{
  "status": "ok",
  "model_loaded": true,
  "device": "cuda",
  "cuda": true,
  "pipelines": ["a", "b", "e", "f", "h", "i", "j", "p", "z"]
}
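A client may want to confirm the model is loaded before sending synthesis requests. The sketch below interprets the response shown above; `fetch_health` and `is_ready` are hypothetical names, the base URL is assumed from the Quick Start, and treating `"status": "ok"` together with `"model_loaded": true` as "ready" is an assumption about how the fields are meant to be read.

```python
import json
from urllib.request import urlopen


def fetch_health(base_url: str = "http://localhost:7860") -> dict:
    """GET /health and return the parsed JSON body."""
    with urlopen(f"{base_url}/health", timeout=5) as resp:
        return json.load(resp)


def is_ready(health: dict) -> bool:
    """True when the server reports status "ok" and a loaded model.

    Assumption: these two fields together indicate readiness.
    """
    return health.get("status") == "ok" and health.get("model_loaded") is True
```

Calling `is_ready(fetch_health())` before the first `/tts` request is one way to avoid sending text to a server that is still starting up.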
Quick Start (Python)
# pip install requests
import requests

resp = requests.post(
    "http://localhost:7860/tts",
    json={
        "text": "Hello from Kokoro TTS!",
        "voice": "af_heart",
        "speed": 1.0,
        "output_format": "wav",
    },
)
# Stop on an HTTP error instead of saving the error body as audio.
resp.raise_for_status()

# The response body is the raw audio.
with open("output.wav", "wb") as f:
    f.write(resp.content)

duration = resp.headers.get("X-Duration-Seconds")
print(f"Duration: {duration}s")