---
title: Kokoro TTS API - Simple & Fast
emoji: 🎀
colorFrom: blue
colorTo: purple
sdk: docker
app_file: app.py
pinned: false
---
# 🎀 Kokoro TTS API - Simple & Fast
High-speed text-to-speech service powered by Kokoro (82M parameters). No authentication, no database - just pure TTS!
## ⚡ Features
- **Lightning Fast**: 10x faster than XTTS
- **Emotional Voices**: 6 expressive voices
- **CPU Optimized**: Runs smoothly on CPU
- **No Auth Required**: Simple HTTP API
- **Long Audio**: Up to 5 minutes per generation
## 🎵 Available Voices
| Voice | Description | Best For |
|-------|-------------|----------|
| `bf_isabella` | British Female | ⭐ Storytelling, audiobooks |
| `af_heart` | American Female | Warm, conversational |
| `af_bella` | American Female | Professional narration |
| `bf_emma` | British Female | Elegant, formal |
| `am_adam` | American Male | Confident, clear |
| `am_michael` | American Male | Friendly, casual |
## 🚀 Quick Start
### Using curl:
```bash
curl -X POST https://your-space.hf.space/api/generate \
  -F "text=Once upon a time, in a distant kingdom..." \
  -F "voice=bf_isabella" \
  -F "speed=1.0" \
  --output story.wav
```
### Using Python:
```python
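import requests

# Send the text as form fields; the service responds with WAV bytes.
response = requests.post(
    "https://your-space.hf.space/api/generate",
    data={
        "text": "Hello world! This is Kokoro TTS.",
        "voice": "bf_isabella",
        "speed": 1.0,
    },
    timeout=120,
)
response.raise_for_status()  # fail loudly instead of saving an error body as audio

with open("audio.wav", "wb") as f:
    f.write(response.content)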
```
### Using Flutter/Dart:
```dart
import 'dart:io';

import 'package:http/http.dart' as http;

/// Requests speech for [text] and saves the WAV response to a temp file.
Future<File> generateTTS(String text) async {
  final response = await http.post(
    Uri.parse('https://your-space.hf.space/api/generate'),
    body: {
      'text': text,
      'voice': 'bf_isabella',
      'speed': '1.0',
    },
  );
  if (response.statusCode != 200) {
    throw HttpException('TTS request failed: ${response.statusCode}');
  }
  final file = File(
    '${Directory.systemTemp.path}/tts_${DateTime.now().millisecondsSinceEpoch}.wav',
  );
  await file.writeAsBytes(response.bodyBytes);
  return file;
}
```
## 📊 API Endpoints
### `POST /api/generate`
Generate TTS audio from text.
**Parameters:**
- `text` (required): Text to convert (5-4500 characters)
- `voice` (optional): Voice to use (default: `bf_isabella`)
- `speed` (optional): Speech speed 0.5-2.0 (default: 1.0)

**Response:** WAV audio file
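
The documented limits can be checked on the client before a request is sent. The sketch below shows one way to do that, assuming the limits and voice list in this README match what the server actually enforces. `build_form` and the constant names are hypothetical helpers, not part of the service.

```python
# Limits and voice names copied from this README; the server remains the
# source of truth and may enforce them differently.
VOICES = {"bf_isabella", "af_heart", "af_bella", "bf_emma", "am_adam", "am_michael"}
MIN_CHARS, MAX_CHARS = 5, 4500
MIN_SPEED, MAX_SPEED = 0.5, 2.0


def build_form(text, voice="bf_isabella", speed=1.0):
    """Return form fields for POST /api/generate, or raise ValueError."""
    if not MIN_CHARS <= len(text) <= MAX_CHARS:
        raise ValueError(
            f"text must be {MIN_CHARS}-{MAX_CHARS} characters, got {len(text)}"
        )
    if voice not in VOICES:
        raise ValueError(f"unknown voice {voice!r}")
    if not MIN_SPEED <= speed <= MAX_SPEED:
        raise ValueError(f"speed must be {MIN_SPEED}-{MAX_SPEED}, got {speed}")
    # Form values travel as strings, as in the curl example.
    return {"text": text, "voice": voice, "speed": str(speed)}
```

The returned dictionary can be passed as `data=` to `requests.post`. Whether the server counts characters the same way as `len(text)` (for example, with or without surrounding whitespace) is an assumption.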
### `GET /health`
Check service health and available voices.
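
A client or deployment script might call this endpoint before sending generation requests. The following is a minimal sketch using only the Python standard library. The response schema is not documented in this README, so the body is returned as parsed JSON when it parses and as raw text otherwise. `check_health` and its `opener` parameter (there only so the HTTP call can be stubbed) are hypothetical names.

```python
import json
import urllib.request


def check_health(base_url, opener=urllib.request.urlopen, timeout=10):
    """GET {base_url}/health and return (ok, payload).

    urllib raises HTTPError for 4xx/5xx responses, so a returned tuple
    means the server answered; `ok` is True only for status 200.
    """
    url = base_url.rstrip("/") + "/health"
    with opener(url, timeout=timeout) as resp:
        body = resp.read().decode("utf-8", errors="replace")
        ok = resp.status == 200
    try:
        payload = json.loads(body)  # schema unknown; keep whatever parses
    except ValueError:
        payload = body              # fall back to the raw text
    return ok, payload
```

For example, `ok, info = check_health("https://your-space.hf.space")` returns the parsed body in `info`, which can then be inspected for the voice list.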
### `GET /docs`
Interactive API documentation (Swagger UI).
## ⚙️ Technical Specs
- **Model**: Kokoro-82M (ONNX)
- **Max Characters**: 4500 (~5 minutes audio)
- **Generation Time**: ~20-30 seconds (CPU)
- **Speech Rate**: ~900 characters/minute
- **Output Format**: WAV, 24 kHz
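
The figures above imply some simple arithmetic: 4500 characters at about 900 characters per minute is roughly 5 minutes of audio. The sketch below estimates audio length and splits longer texts into chunks that fit the limit. It assumes duration scales inversely with `speed`, which this README does not state. `estimate_seconds` and `split_text` are hypothetical helpers.

```python
import re

MAX_CHARS = 4500         # per-request limit from this README
CHARS_PER_MINUTE = 900   # approximate speech rate from this README


def estimate_seconds(text, speed=1.0):
    """Rough audio length, assuming duration scales as 1/speed."""
    return len(text) / CHARS_PER_MINUTE * 60 / speed


def split_text(text, limit=MAX_CHARS):
    """Split text into chunks of at most `limit` characters.

    Breaks after sentence-ending punctuation where possible; a single
    sentence longer than `limit` is cut hard at the limit.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        while len(sentence) > limit:       # oversized sentence: hard cut
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate            # sentence still fits in this chunk
        else:
            chunks.append(current)         # close the chunk, start a new one
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can be sent as its own request and the resulting WAV files concatenated afterwards. The splitter does not enforce the 5-character minimum, so a very short trailing chunk would need to be merged with its neighbour before sending.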
## 🛠️ Deployment
This Space runs on Docker with automatic model download on startup.
**No environment variables needed!**
## 📝 License
- **Model**: Kokoro-82M, run via kokoro-onnx (by thewh1teagle)
- **Service**: MIT License
## 🔗 Links
- [Kokoro GitHub](https://github.com/thewh1teagle/kokoro-onnx)
- [API Docs](/docs)
- [Health Check](/health)