Loomis Green
# XTTS-v2 API Guide (Hugging Face Spaces)

This API provides Text-to-Speech (TTS) capabilities using Coqui XTTS-v2, deployed on Hugging Face Spaces. It supports both streaming (low latency) and full audio generation.

## ⚠️ Critical Setup Note (PyTorch 2.6+)

If you deploy this locally or in a new environment, either use a compatible PyTorch version or apply the monkeypatch for `torch.load`.

- **PyTorch 2.6+** makes `weights_only=True` the default for `torch.load`, which breaks loading of older Coqui TTS checkpoints.
- **Fix:** Pin `torch==2.4.0` OR use the monkeypatch included in `app.py`.
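
The monkeypatch in `app.py` is not reproduced in this guide. The sketch below shows one common way such a patch can be written, assuming it only needs to restore the pre-2.6 default; the helper name `with_legacy_weights_default` is hypothetical and may differ from what `app.py` actually does. Keep in mind that `weights_only=False` allows a checkpoint to unpickle arbitrary Python objects, so use it only with checkpoints you trust.

```python
import functools


def with_legacy_weights_default(load_fn):
    """Wrap a torch.load-style callable so weights_only defaults to False.

    Hypothetical helper: an explicit weights_only=... from the caller is
    left untouched; only the default changes.
    """
    @functools.wraps(load_fn)
    def patched(*args, **kwargs):
        kwargs.setdefault("weights_only", False)
        return load_fn(*args, **kwargs)
    return patched


try:
    import torch
except ImportError:
    torch = None  # Nothing to patch when PyTorch is not installed.

if torch is not None:
    # Apply the patch before Coqui TTS loads any checkpoint.
    torch.load = with_legacy_weights_default(torch.load)
```

The patch has to run before the TTS model is loaded, so it would typically sit at the top of the application entry point.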

## Base URL

`https://loomisgitarrist-xtts-multilingual.hf.space`

---

## 1. Streaming Endpoint (`/stream`)

**Best for:** Real-time applications, chatbots, assistants.

**Method:** `POST`

**URL:** `/stream`

### Request Body (JSON)

| Field | Type | Required | Description |
| :--- | :--- | :--- | :--- |
| `text` | string | Yes | The text to convert to speech. |
| `language` | string | Yes | Language code (e.g., `en`, `es`, `de`, `fr`, `it`, `pt`, `pl`, `tr`, `ru`, `nl`, `cs`, `ar`, `zh-cn`, `ja`, `hu`, `ko`). |
| `speaker_id` | string | Yes | Filename of the speaker WAV in `speakers/` (e.g., `dave.wav`, `robert.wav`). |
| `stream_chunk_size` | int | No | Chunk size for processing (default: `20`). Lower = faster start, higher = better context. |
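
The fields above can be assembled and sanity-checked on the client before any request is sent. The following is a minimal sketch, not part of the API: `build_payload` and `DOCUMENTED_LANGUAGES` are hypothetical names, and the language set is copied from the examples in the table, so the server may accept codes that this check rejects.

```python
# Language codes listed as examples in the table above (assumed, possibly
# not exhaustive).
DOCUMENTED_LANGUAGES = {
    "en", "es", "de", "fr", "it", "pt", "pl", "tr",
    "ru", "nl", "cs", "ar", "zh-cn", "ja", "hu", "ko",
}


def build_payload(text, language, speaker_id, stream_chunk_size=None):
    """Return a JSON-serializable request body for /stream or /generate."""
    if not text.strip():
        raise ValueError("text must not be empty")
    if language not in DOCUMENTED_LANGUAGES:
        raise ValueError(f"undocumented language code: {language!r}")
    payload = {"text": text, "language": language, "speaker_id": speaker_id}
    # Only send the optional field when the caller sets it, so the server
    # default applies otherwise.
    if stream_chunk_size is not None:
        payload["stream_chunk_size"] = int(stream_chunk_size)
    return payload
```

For example, `build_payload("Hello", "en", "dave.wav", stream_chunk_size=10)` returns a dictionary with all four fields, while omitting `stream_chunk_size` leaves that key out.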

### Python Example (Streaming)

```python
import requests

API_URL = "https://loomisgitarrist-xtts-multilingual.hf.space/stream"

payload = {
    "text": "Hello, I am streaming this audio directly from the API!",
    "language": "en",
    "speaker_id": "dave.wav",
}

print("Streaming audio...")
# The endpoint documents a JSON body, so send it with json= (data= would
# form-encode it). stream=True defers the download until we iterate.
response = requests.post(API_URL, json=payload, stream=True)
response.raise_for_status()  # Stop early on 4xx/5xx, e.g. 503 on cold start.

# Write each chunk to disk as soon as it arrives.
with open("stream_output.wav", "wb") as f:
    for chunk in response.iter_content(chunk_size=1024):
        if chunk:
            f.write(chunk)

print("Stream saved to stream_output.wav")
```

---

## 2. Generate Endpoint (`/generate`)

**Best for:** High-quality, non-real-time generation.

**Method:** `POST`

**URL:** `/generate`

### Request Body (JSON)

Same as `/stream`.

### Python Example (Generate)

```python
import requests

API_URL = "https://loomisgitarrist-xtts-multilingual.hf.space/generate"

payload = {
    "text": "This is a full quality generation test.",
    "language": "en",
    "speaker_id": "dave.wav",
}

# The endpoint documents a JSON body, so send it with json= (data= would
# form-encode it).
response = requests.post(API_URL, json=payload)

if response.status_code == 200:
    # The response body is the complete WAV file.
    with open("output.wav", "wb") as f:
        f.write(response.content)
    print("Audio saved to output.wav")
else:
    print("Error:", response.text)
```

---

## Troubleshooting

- **503 Model not loaded:** The Space is starting up (cold start). Wait 1-2 minutes, then retry.
- **Empty Audio (44 bytes):** The file contains only a WAV header and no audio data, which usually indicates a streaming error on the server. Check the Space logs.
- **Connection Error:** Check your internet connection and whether the Space is paused.
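
The first two cases can also be handled in client code. The sketch below is one possible approach under stated assumptions: `post_with_retry` and `is_header_only` are hypothetical helpers, the retry count and delay are arbitrary values chosen to cover roughly the cold-start window mentioned above, and the 44-byte threshold is the size of a canonical PCM WAV header.

```python
import time

WAV_HEADER_SIZE = 44  # Size in bytes of a canonical PCM WAV header.


def is_header_only(audio_bytes):
    """Return True when the response holds at most a bare WAV header."""
    return len(audio_bytes) <= WAV_HEADER_SIZE


def post_with_retry(send, attempts=6, delay_seconds=20.0, sleep=time.sleep):
    """Call send() until it stops returning HTTP 503 or attempts run out.

    send is any zero-argument callable returning an object with a
    status_code attribute, e.g. lambda: requests.post(url, json=payload).
    """
    response = send()
    for _ in range(attempts - 1):
        if response.status_code != 503:
            break
        sleep(delay_seconds)  # Give the Space time to finish its cold start.
        response = send()
    return response
```

With `requests`, this could be used as `response = post_with_retry(lambda: requests.post(API_URL, json=payload))`, followed by `is_header_only(response.content)` to detect an empty result before saving it.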