---
title: Kokoro TTS API - Simple & Fast
emoji: 🎤
colorFrom: blue
colorTo: purple
sdk: docker
app_file: app.py
pinned: false
---
Kokoro TTS API - Simple & Fast
High-speed text-to-speech service powered by Kokoro (82M parameters). No authentication, no database - just pure TTS!
Features
- Lightning Fast: 10x faster than XTTS
- Emotional Voices: 6 expressive voices
- CPU Optimized: Runs smoothly on CPU
- No Auth Required: Simple HTTP API
- Long Audio: Up to 5 minutes per generation
Available Voices
| Voice | Description | Best For |
|---|---|---|
| `bf_isabella` | British Female | Storytelling, audiobooks |
| `af_heart` | American Female | Warm, conversational |
| `af_bella` | American Female | Professional narration |
| `bf_emma` | British Female | Elegant, formal |
| `am_adam` | American Male | Confident, clear |
| `am_michael` | American Male | Friendly, casual |
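The voice IDs in the table appear to follow a two-letter prefix pattern: the first letter for the accent (`a` for American, `b` for British) and the second for the gender (`f` or `m`). That pattern is inferred from the six rows above, not from any official specification, so the sketch below treats it as an assumption. `describe_voice` is a hypothetical helper name.

```python
# Prefix meanings inferred from the voice table; this is an assumption,
# not a documented naming scheme.
ACCENTS = {"a": "American", "b": "British"}
GENDERS = {"f": "Female", "m": "Male"}


def describe_voice(voice_id: str) -> str:
    """Turn an ID such as 'bf_isabella' into 'British Female'."""
    prefix, sep, name = voice_id.partition("_")
    if not sep or len(prefix) != 2 or not name:
        raise ValueError(f"unexpected voice ID format: {voice_id!r}")
    try:
        return f"{ACCENTS[prefix[0]]} {GENDERS[prefix[1]]}"
    except KeyError:
        raise ValueError(f"unknown voice prefix: {prefix!r}") from None


for voice in ("bf_isabella", "af_heart", "am_adam"):
    print(voice, "->", describe_voice(voice))
```

A helper like this can label voices in a picker without hard-coding the description column, as long as the prefix pattern keeps holding.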
Quick Start
Using curl:
curl -X POST https://your-space.hf.space/api/generate \
  -F "text=Once upon a time, in a distant kingdom..." \
  -F "voice=bf_isabella" \
  -F "speed=1.0" \
  --output story.wav
Using Python:
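import requests

response = requests.post(
    "https://your-space.hf.space/api/generate",
    data={
        "text": "Hello world! This is Kokoro TTS.",
        "voice": "bf_isabella",
        "speed": 1.0,
    },
    timeout=120,  # generation can take 20-30 seconds on CPU
)
response.raise_for_status()  # do not save an error body as audio

with open("audio.wav", "wb") as f:
    f.write(response.content)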
Using Flutter/Dart:
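import 'dart:io'; // File, Directory, HttpException

import 'package:http/http.dart' as http;

Future<File> generateTTS(String text) async {
  final response = await http.post(
    Uri.parse('https://your-space.hf.space/api/generate'),
    body: {
      'text': text,
      'voice': 'bf_isabella',
      'speed': '1.0',
    },
  );
  // Do not write an error body to disk as if it were audio.
  if (response.statusCode != 200) {
    throw HttpException('TTS request failed: ${response.statusCode}');
  }
  final file = File('${Directory.systemTemp.path}/tts_${DateTime.now().millisecondsSinceEpoch}.wav');
  await file.writeAsBytes(response.bodyBytes);
  return file;
}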
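Text longer than the 4500-character limit documented under API Endpoints has to be sent as several requests. The following is a minimal sketch of one way to split it on sentence boundaries; `split_text` is a hypothetical helper, and the regular expression is a simple heuristic that will mis-split abbreviations such as "Dr." in some inputs.

```python
import re

MAX_CHARS = 4500  # per-request limit stated in this README


def split_text(text: str, max_chars: int = MAX_CHARS) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_chars."""
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut at the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


print(split_text("One. Two. Three.", max_chars=10))
```

Each chunk can then be posted to `/api/generate` in turn and the resulting WAV files played or concatenated in order. A trailing chunk shorter than the 5-character minimum would need to be merged into its neighbour before sending.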
API Endpoints
POST /api/generate
Generate TTS audio from text.
Parameters:
- `text` (required): Text to convert (5-4500 characters)
- `voice` (optional): Voice to use (default: `bf_isabella`)
- `speed` (optional): Speech speed, 0.5-2.0 (default: 1.0)
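These limits can be checked on the client before a request is sent, which avoids a round trip for obviously invalid input. A minimal sketch, assuming the bounds are inclusive and that the server counts characters the way Python's `len` does; `validate_request` is a hypothetical helper, not part of the service.

```python
# Voice IDs copied from the Available Voices table.
VOICES = {
    "bf_isabella", "af_heart", "af_bella",
    "bf_emma", "am_adam", "am_michael",
}
MIN_CHARS, MAX_CHARS = 5, 4500
MIN_SPEED, MAX_SPEED = 0.5, 2.0


def validate_request(text: str, voice: str = "bf_isabella", speed: float = 1.0) -> dict:
    """Return form fields for POST /api/generate, or raise ValueError."""
    if not MIN_CHARS <= len(text) <= MAX_CHARS:
        raise ValueError(
            f"text must be {MIN_CHARS}-{MAX_CHARS} characters, got {len(text)}"
        )
    if voice not in VOICES:
        raise ValueError(f"unknown voice: {voice!r}")
    if not MIN_SPEED <= speed <= MAX_SPEED:
        raise ValueError(f"speed must be {MIN_SPEED}-{MAX_SPEED}, got {speed}")
    return {"text": text, "voice": voice, "speed": speed}


print(validate_request("Hello world! This is Kokoro TTS."))
```

The returned dictionary has the same shape as the `data` argument in the Python quick-start example, so it can be passed to `requests.post` directly.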
Response: WAV audio file
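To confirm that a response body really is WAV audio rather than an error message, its header can be read with Python's standard `wave` module. This is a sketch under one assumption: the service returns integer PCM data, which is all `wave` can parse. If the service emits floating-point WAV instead, `wave.open` raises `wave.Error`. `describe_wav` and `make_silence` are hypothetical helper names, and the silent clip only stands in for `response.content` so that the example runs offline.

```python
import io
import wave


def describe_wav(payload: bytes) -> dict:
    """Read basic header fields from WAV bytes held in memory."""
    with wave.open(io.BytesIO(payload), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "sample_width_bytes": wav.getsampwidth(),
            "duration_seconds": frames / rate,
        }


def make_silence(seconds: float = 1.0, rate: int = 24000) -> bytes:
    """Build a silent 16-bit mono WAV clip as a stand-in for a real response."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))
    return buffer.getvalue()


print(describe_wav(make_silence()))
```

With a live service, `describe_wav(response.content)` should report a sample rate of 24000 if the output matches the format listed under Technical Specs.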
GET /health
Check service health and available voices.
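A client can call this endpoint before submitting work. The response schema is not documented here, so the sketch below only assumes that the body is JSON and returns it as parsed; `fetch_health` is a hypothetical helper, and the `opener` parameter exists so the function can be exercised without a network connection.

```python
import json
import urllib.request


def fetch_health(base_url: str, timeout: float = 10.0, opener=urllib.request.urlopen):
    """GET <base_url>/health and return the decoded JSON body.

    Assumption: the endpoint responds with JSON; its fields are not
    specified in this README, so none are interpreted here.
    """
    url = base_url.rstrip("/") + "/health"
    with opener(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (requires a running Space; the URL is a placeholder):
# print(fetch_health("https://your-space.hf.space"))
```

Interactive exploration of the actual response fields is available through the Swagger UI at `/docs`.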
GET /docs
Interactive API documentation (Swagger UI).
Technical Specs
- Model: Kokoro-82M (ONNX)
- Max Characters: 4500 (~5 minutes audio)
- Generation Time: ~20-30 seconds (CPU)
- Speech Rate: ~900 characters/minute
- Output Format: WAV, 24kHz
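The figures above are enough for a rough estimate of clip length and download size before sending a request. The sketch below uses the listed speech rate and sample rate; the 16-bit mono encoding, the 44-byte header, and the inverse scaling of duration with `speed` are assumptions rather than documented behaviour, and both function names are hypothetical.

```python
CHARS_PER_MINUTE = 900   # speech rate from the spec list
SAMPLE_RATE_HZ = 24_000  # output sample rate from the spec list
BYTES_PER_SAMPLE = 2     # assumption: 16-bit mono PCM
WAV_HEADER_BYTES = 44    # assumption: canonical PCM WAV header


def estimate_duration_seconds(char_count: int, speed: float = 1.0) -> float:
    """Approximate audio length; assumes duration scales as 1 / speed."""
    return char_count / CHARS_PER_MINUTE * 60 / speed


def estimate_wav_bytes(duration_seconds: float) -> int:
    """Approximate file size for uncompressed PCM at the listed sample rate."""
    samples = int(duration_seconds * SAMPLE_RATE_HZ)
    return WAV_HEADER_BYTES + samples * BYTES_PER_SAMPLE


seconds = estimate_duration_seconds(4500)
print(seconds)                      # 300.0, matching the ~5 minute figure
print(estimate_wav_bytes(seconds))  # 14400044, roughly 14.4 MB
```

Under these assumptions a maximum-length request yields a file of about 14 MB, which is worth keeping in mind for mobile clients.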
Deployment
This Space runs on Docker with automatic model download on startup.
No environment variables needed!
License
Model: Kokoro-82M by hexgrad, run through the kokoro-onnx package by thewh1teagle
Service: MIT License