---
title: TTS API
emoji: 🏆
colorFrom: green
colorTo: purple
sdk: docker
pinned: false
---

# Text-to-Speech API 🎤

A public Text-to-Speech API built with FastAPI and Microsoft Edge TTS, optimized for Hugging Face Spaces deployment.

## 🚀 Features

- **Convert text to natural-sounding speech** using Microsoft Edge TTS
- **Multiple voice options** with different languages and accents
- **Customizable speech parameters** (pitch and rate adjustment)
- **RESTful API** with automatic OpenAPI documentation
- **Public access** with CORS enabled
- **Real-time audio generation** and streaming

## 📖 API Documentation

Once deployed, visit the root URL to access the interactive API documentation (Swagger UI).

## 🔧 API Endpoints

### Core Endpoints

- `GET /` - API information and documentation links
- `GET /health` - Health check endpoint
- `GET /voices` - List all available voices
- `POST /synthesize` - Convert text to speech (JSON)
- `POST /synthesize-form` - Convert text to speech (Form data)

### Example Usage

#### Using cURL with JSON:

```bash
curl -X POST 'https://your-space-url/synthesize' \
  -H 'Content-Type: application/json' \
  -d '{
    "text": "Hello from Hugging Face Spaces!",
    "voice": "en-GB-SoniaNeural",
    "pitch": "-10Hz",
    "rate": "+15%"
  }' \
  --output speech.mp3
```

#### Using cURL with Form Data:

```bash
curl -X POST 'https://your-space-url/synthesize-form' \
  -F 'text=Hello World!' \
  -F 'voice=en-US-AriaNeural' \
  -F 'pitch=+5Hz' \
  -F 'rate=+10%' \
  --output speech.mp3
```

#### Using Python requests:

```python
import requests

response = requests.post(
    'https://your-space-url/synthesize',
    json={
        'text': 'Hello from Python!',
        'voice': 'en-US-AriaNeural',
        'pitch': '+0Hz',
        'rate': '+0%'
    }
)

with open('speech.mp3', 'wb') as f:
    f.write(response.content)
```

## 📝 Parameters

### Request Parameters

| Parameter | Type | Default | Description | Example |
|-----------|------|---------|-------------|---------|
| `text` | string | required | Text to convert to speech | "Hello World!" |
| `voice` | string | "en-US-AriaNeural" | Voice identifier | "en-GB-SoniaNeural" |
| `pitch` | string | "+0Hz" | Pitch adjustment | "+10Hz", "-15Hz" |
| `rate` | string | "+0%" | Rate adjustment | "+20%", "-10%" |

### Voice Examples

- `en-US-AriaNeural` - US English, Female
- `en-GB-SoniaNeural` - UK English, Female
- `en-AU-NatashaNeural` - Australian English, Female
- `de-DE-KatjaNeural` - German, Female
- `fr-FR-DeniseNeural` - French, Female
- `es-ES-ElviraNeural` - Spanish, Female

*Use the `/voices` endpoint to get the complete list of available voices.*

### Parameter Ranges

- **Pitch**: -50Hz to +50Hz (e.g., "-25Hz", "+0Hz", "+30Hz")
- **Rate**: -50% to +50% (e.g., "-20%", "+0%", "+25%")

## 🛠️ Local Development

### Installation

1. Clone the repository
2. Install dependencies:
   ```bash
   pip install -r requirements.txt
   ```
3. Run the server:
   ```bash
   python app.py
   ```
4. Open http://localhost:7860 for API documentation

### Docker Deployment

```bash
# Build the image
docker build -t tts-api .

# Run the container
docker run -p 7860:7860 tts-api
```

## 🌐 Hugging Face Spaces Deployment

1. Create a new Space on Hugging Face
2. Choose "Docker" as the SDK
3. Upload the following files:
   - `app.py` (main application)
   - `requirements.txt` (dependencies)
   - `Dockerfile` (container configuration)
   - `README.md` (this file)
4. Your API will be publicly accessible once deployed!

## 📋 Response Format

### Successful Response

- **Content-Type**: `audio/mpeg`
- **Body**: MP3 audio file

### Error Response

```json
{
  "detail": "Error description"
}
```

## 🔒 Rate Limiting & Usage

This is a public API, but please use it responsibly:

- Maximum text length: 5,000 characters
- Recommended: Don't exceed 100 requests per minute
- For production use, consider implementing authentication

## 🐛 Troubleshooting

### Common Issues

1. **Voice not found**: Use the `/voices` endpoint to check available voices
2. **Invalid parameters**: Check pitch/rate format (must include Hz/% suffix)
3. **Text too long**: Maximum 5,000 characters per request
4. **Network timeout**: Large texts may take longer to process

## 📄 License

This project uses Microsoft Edge TTS service. Please review Microsoft's terms of service for usage guidelines.

## 🤝 Contributing

Feel free to open issues or submit pull requests to improve this API!

---

**Made with ❤️ for the Hugging Face community**
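
The limits listed in this README (5,000 characters of text, pitch between -50Hz and +50Hz, rate between -50% and +50%, each with its Hz or % suffix) can be checked on the client before a request is sent. The following is a minimal sketch of such a check, not the server's own validation. The names `validate_request` and `_check_offset` are hypothetical, and the requirement of an explicit leading `+` or `-` sign is an assumption inferred from the examples in this README.

```python
import re

# Limits taken from this README; the server may enforce them differently.
MAX_TEXT_LENGTH = 5000
MAX_OFFSET = 50

# Assumption: values carry an explicit sign, as in "+0Hz" and "-10%".
_PITCH_RE = re.compile(r"([+-])(\d+)Hz")
_RATE_RE = re.compile(r"([+-])(\d+)%")


def _check_offset(value, pattern, name):
    """Raise ValueError unless value matches pattern and is within range."""
    match = pattern.fullmatch(value)
    if match is None:
        raise ValueError(f"{name} must look like '+10' or '-10' plus its unit suffix: {value!r}")
    if int(match.group(2)) > MAX_OFFSET:
        raise ValueError(f"{name} must be within -{MAX_OFFSET} and +{MAX_OFFSET}: {value!r}")


def validate_request(text, voice="en-US-AriaNeural", pitch="+0Hz", rate="+0%"):
    """Return a JSON-ready payload for POST /synthesize, or raise ValueError."""
    if not text:
        raise ValueError("text is required")
    if len(text) > MAX_TEXT_LENGTH:
        raise ValueError(f"text exceeds {MAX_TEXT_LENGTH} characters")
    _check_offset(pitch, _PITCH_RE, "pitch")
    _check_offset(rate, _RATE_RE, "rate")
    # The voice name is passed through unchecked; use GET /voices to verify it.
    return {"text": text, "voice": voice, "pitch": pitch, "rate": rate}
```

The returned dictionary can be passed as the `json=` argument of `requests.post`, as in the Python example in this README. Voice names are deliberately not validated here, because the authoritative list comes from the `/voices` endpoint.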