# XTTS-v2 API Guide (Hugging Face Spaces)

This API provides Text-to-Speech (TTS) using Coqui XTTS-v2, deployed on Hugging Face Spaces. It supports both streaming (low-latency) and full audio generation.

## ⚠️ Critical Setup Note (PyTorch 2.6+)

If you are deploying this locally or in a new environment, either use a compatible PyTorch version or apply the monkeypatch for `torch.load`.

- **PyTorch 2.6+** sets `weights_only=True` by default in `torch.load`, which breaks loading of older Coqui TTS checkpoints.
- **Fix:** Pin `torch==2.4.0` OR use the monkeypatch included in `app.py`.

## Base URL

`https://loomisgitarrist-xtts-multilingual.hf.space`

---

## 1. Streaming Endpoint (`/stream`)

**Best for:** Real-time applications, chatbots, assistants.

**Method:** `POST`

**URL:** `/stream`

### Request Body (JSON)

| Field | Type | Required | Description |
| :--- | :--- | :--- | :--- |
| `text` | string | Yes | The text to convert to speech. |
| `language` | string | Yes | Language code (e.g., `en`, `es`, `de`, `fr`, `it`, `pt`, `pl`, `tr`, `ru`, `nl`, `cs`, `ar`, `zh-cn`, `ja`, `hu`, `ko`). |
| `speaker_id` | string | Yes | Filename of the speaker WAV in `speakers/` (e.g., `dave.wav`, `robert.wav`). |
| `stream_chunk_size` | int | No | Chunk size for processing (default: `20`). Lower = faster start, higher = better context. |

### Python Example (Streaming)

```python
import requests

API_URL = "https://loomisgitarrist-xtts-multilingual.hf.space/stream"

payload = {
    "text": "Hello, I am streaming this audio directly from the API!",
    "language": "en",
    "speaker_id": "dave.wav",
}

print("Streaming audio...")
# Send the payload as JSON (the endpoint expects a JSON body) and
# keep the connection open so chunks can be read as they arrive.
response = requests.post(API_URL, json=payload, stream=True)
response.raise_for_status()  # stop on errors such as 503 instead of saving them

# Write audio chunks to disk as they are received.
with open("stream_output.wav", "wb") as f:
    for chunk in response.iter_content(chunk_size=1024):
        if chunk:
            f.write(chunk)

print("Stream saved to stream_output.wav")
```

---

## 2. Generate Endpoint (`/generate`)

**Best for:** High-quality, non-real-time generation.

**Method:** `POST`

**URL:** `/generate`

### Request Body (JSON)

Same as `/stream`.

### Python Example (Generate)

```python
import requests

API_URL = "https://loomisgitarrist-xtts-multilingual.hf.space/generate"

payload = {
    "text": "This is a full quality generation test.",
    "language": "en",
    "speaker_id": "dave.wav",
}

# Send the payload as JSON (the endpoint expects a JSON body).
response = requests.post(API_URL, json=payload)

if response.status_code == 200:
    # The response body is the complete audio file.
    with open("output.wav", "wb") as f:
        f.write(response.content)
    print("Audio saved to output.wav")
else:
    print("Error:", response.status_code, response.text)
```

---

## Troubleshooting

- **503 Model not loaded:** The Space is starting up (cold start). Wait 1-2 minutes and retry.
- **Empty Audio (44 bytes):** The file contains only a WAV header and no audio data, which usually indicates a streaming error on the server. Check the Space logs.
- **Connection Error:** Check your internet connection and whether the Space is paused.
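
The 503 and 44-byte cases from the troubleshooting list can also be handled in client code. The following is a minimal sketch, not part of the API itself: `synthesize_with_retry`, `is_header_only_wav`, and `send_generate_request` are hypothetical helper names, and the retry count, wait time, and request timeout are assumptions you should tune for your Space.

```python
import time
from typing import Callable, Tuple

WAV_HEADER_BYTES = 44  # size of a canonical PCM WAV header with no samples


def is_header_only_wav(audio: bytes) -> bool:
    """Return True when the payload is no larger than a bare WAV header."""
    return len(audio) <= WAV_HEADER_BYTES


def synthesize_with_retry(
    send: Callable[[], Tuple[int, bytes]],
    attempts: int = 5,            # assumed value
    wait_seconds: float = 30.0,   # assumed value
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Call send() until it stops returning 503, then validate the audio.

    send is a zero-argument callable returning (status_code, body).
    """
    for attempt in range(1, attempts + 1):
        status, body = send()
        if status == 503:
            # The Space is probably still cold-starting; wait, then retry.
            if attempt < attempts:
                sleep(wait_seconds)
            continue
        if status != 200:
            raise RuntimeError(f"Request failed with HTTP {status}")
        if is_header_only_wav(body):
            raise RuntimeError("Header-only audio received; check the Space logs")
        return body
    raise TimeoutError(f"Model still not loaded after {attempts} attempts")


def send_generate_request() -> Tuple[int, bytes]:
    """Issue one POST to /generate (requests is imported lazily)."""
    import requests

    response = requests.post(
        "https://loomisgitarrist-xtts-multilingual.hf.space/generate",
        json={"text": "Retry test.", "language": "en", "speaker_id": "dave.wav"},
        timeout=120,  # assumed value
    )
    return response.status_code, response.content
```

Calling `audio = synthesize_with_retry(send_generate_request)` returns the audio bytes once the model is loaded. Passing the request as a callable keeps the retry logic independent of the HTTP library, so the same helper could wrap a `/stream` call as well.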