
title: Kokoro TTS API - Simple & Fast
emoji: 🎤
colorFrom: blue
colorTo: purple
sdk: docker
app_file: app.py
pinned: false

🎤 Kokoro TTS API - Simple & Fast

High-speed text-to-speech service powered by Kokoro (82M parameters). No authentication, no database - just pure TTS!

⚡ Features

Lightning Fast: 10x faster than XTTS
Emotional Voices: 6 expressive voices
CPU Optimized: Runs smoothly on CPU
No Auth Required: Simple HTTP API
Long Audio: Up to 5 minutes per generation

🎵 Available Voices

| Voice | Description | Best For |
|-------|-------------|----------|
| `bf_isabella` | British Female | ⭐ Storytelling, audiobooks |
| `af_heart` | American Female | Warm, conversational |
| `af_bella` | American Female | Professional narration |
| `bf_emma` | British Female | Elegant, formal |
| `am_adam` | American Male | Confident, clear |
| `am_michael` | American Male | Friendly, casual |

🚀 Quick Start

Using curl:

curl -X POST https://your-space.hf.space/api/generate \
  -F "text=Once upon a time, in a distant kingdom..." \
  -F "voice=bf_isabella" \
  -F "speed=1.0" \
  --output story.wav

Using Python:

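import requests

response = requests.post(
    "https://your-space.hf.space/api/generate",
    data={
        "text": "Hello world! This is Kokoro TTS.",
        "voice": "bf_isabella",
        "speed": 1.0,
    },
    timeout=120,  # CPU generation can take ~20-30 seconds
)
response.raise_for_status()  # fail loudly instead of saving an error body as audio

# The response body is the WAV file itself
with open("audio.wav", "wb") as f:
    f.write(response.content)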

Using Flutter/Dart:

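import 'dart:io'; // File, Directory, HttpException

import 'package:http/http.dart' as http;

Future<File> generateTTS(String text) async {
  final response = await http.post(
    Uri.parse('https://your-space.hf.space/api/generate'),
    body: {
      'text': text,
      'voice': 'bf_isabella',
      'speed': '1.0',
    },
  );

  // Don't write an error response to disk as if it were audio
  if (response.statusCode != 200) {
    throw HttpException('TTS request failed with status ${response.statusCode}');
  }

  final file = File('${Directory.systemTemp.path}/tts_${DateTime.now().millisecondsSinceEpoch}.wav');
  await file.writeAsBytes(response.bodyBytes);
  return file;
}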

📊 API Endpoints

`POST /api/generate`

Generate TTS audio from text.

Parameters:

text (required): Text to convert (5-4500 characters)
voice (optional): Voice to use (default: bf_isabella)
speed (optional): Speech speed 0.5-2.0 (default: 1.0)
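
The limits above can be checked on the client before a request is sent, which avoids a round trip for input the service would reject. The following is a minimal sketch, not the service's own validation: `build_payload` is a hypothetical helper, the voice set is copied from the Available Voices table, and it assumes both ranges are inclusive and that the server counts characters the same way Python's `len` does.

```python
# Voice names copied from the Available Voices table.
VOICES = {"bf_isabella", "af_heart", "af_bella", "bf_emma", "am_adam", "am_michael"}
MIN_CHARS, MAX_CHARS = 5, 4500    # assumed inclusive
MIN_SPEED, MAX_SPEED = 0.5, 2.0   # assumed inclusive


def build_payload(text, voice="bf_isabella", speed=1.0):
    """Return form fields for POST /api/generate, or raise ValueError.

    Hypothetical client-side helper; the server remains the authority
    on what it accepts.
    """
    if not MIN_CHARS <= len(text) <= MAX_CHARS:
        raise ValueError(
            f"text must be {MIN_CHARS}-{MAX_CHARS} characters, got {len(text)}"
        )
    if voice not in VOICES:
        raise ValueError(f"unknown voice: {voice!r}")
    if not MIN_SPEED <= speed <= MAX_SPEED:
        raise ValueError(f"speed must be {MIN_SPEED}-{MAX_SPEED}, got {speed}")
    # Form fields are sent as strings, as in the curl example.
    return {"text": text, "voice": voice, "speed": str(speed)}
```

The returned dictionary can be passed as `data=` to `requests.post`, as in the Python Quick Start example.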

Response: WAV audio file
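
Since the response body is a WAV file, its header can be read with Python's standard `wave` module to confirm that the download is real audio and to get its length. This is a sketch under one assumption: the service writes integer PCM WAV. The `wave` module raises `wave.Error` for other encodings, such as 32-bit float. `describe_wav` is a hypothetical helper name.

```python
import io
import wave


def describe_wav(data: bytes) -> dict:
    """Read header fields from WAV bytes, for example `response.content`.

    Assumes integer PCM encoding; wave.open raises wave.Error otherwise.
    """
    with wave.open(io.BytesIO(data), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "sample_rate": rate,
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "duration_seconds": frames / rate,
        }
```

A `sample_rate` other than 24000, or a `wave.Error`, suggests the body is not the audio this README describes (for instance, an error message saved as `.wav`).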

`GET /health`

Check service health and available voices.
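
Because the model is downloaded on startup, the service may not answer immediately after a restart. A client can poll `/health` before sending work. The sketch below is one possible approach: `wait_until_healthy` and its parameters are hypothetical, and since this README does not document the response body of `/health`, only the HTTP status code is checked.

```python
import time
import urllib.error
import urllib.request


def _fetch_status(url):
    """Return the HTTP status code for a GET, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code          # server answered with an error status
    except OSError:
        return None              # connection refused, DNS failure, timeout


def wait_until_healthy(base_url, attempts=10, delay=5.0, fetch=_fetch_status):
    """Poll GET {base_url}/health until it returns 200.

    `fetch` maps a URL to a status code and can be replaced in tests.
    Returns True once healthy, False after `attempts` tries.
    """
    url = f"{base_url.rstrip('/')}/health"
    for attempt in range(attempts):
        if fetch(url) == 200:
            return True
        if attempt < attempts - 1:
            time.sleep(delay)    # no sleep after the final attempt
    return False
```

For example, `wait_until_healthy("https://your-space.hf.space")` would block for up to roughly `attempts * delay` seconds plus request time.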

`GET /docs`

Interactive API documentation (Swagger UI).

⚙️ Technical Specs

Model: Kokoro-82M (ONNX)
Max Characters: 4500 (~5 minutes audio)
Generation Time: ~20-30 seconds (CPU)
Speech Rate: ~900 characters/minute
Output Format: WAV, 24kHz
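
The figures above are consistent with each other: 4500 characters at ~900 characters/minute is about 5 minutes of audio. Text longer than the limit has to be split across several requests. The sketch below estimates duration and splits text at sentence boundaries. Both function names are hypothetical, the estimate assumes duration scales inversely with `speed`, and the splitter does not enforce the 5-character minimum, so a very short final chunk may need to be merged by the caller.

```python
import re

CHARS_PER_MINUTE = 900   # speech rate from the spec list
MAX_CHARS = 4500         # per-request limit from the spec list


def estimate_seconds(text, speed=1.0):
    """Rough audio length; assumes duration scales inversely with speed."""
    return len(text) / CHARS_PER_MINUTE * 60 / speed


def split_text(text, limit=MAX_CHARS):
    """Greedily pack sentences into chunks of at most `limit` characters."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # A single sentence longer than the limit is hard-wrapped.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent to `/api/generate` separately and the resulting WAV files concatenated on the client.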

🛠️ Deployment

This Space runs on Docker with automatic model download on startup.

No environment variables needed!

📝 License

Model: Kokoro-82M by hexgrad (ONNX runtime package kokoro-onnx by thewh1teagle)
Service: MIT License