Loomis Green
XTTS-v2 API Guide (Hugging Face Spaces)
This API provides Text-to-Speech (TTS) capabilities using Coqui XTTS-v2, deployed on Hugging Face Spaces. It supports both streaming (low latency) and full audio generation.
⚠️ Critical Setup Note (PyTorch 2.6+)
If you deploy this locally or in a new environment, either use a compatible PyTorch version or apply the monkeypatch for torch.load.
- PyTorch 2.6+ enforces weights_only=True by default, which breaks loading of older Coqui TTS checkpoints.
- Fix: Pin torch==2.4.0 OR use the monkeypatch included in app.py.
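The monkeypatch in app.py is not reproduced in this guide. As a minimal sketch of what such a patch could look like, assuming it only needs to restore the pre-2.6 default of weights_only=False for torch.load: the helper name patch_torch_load is hypothetical and is not taken from app.py. Loading with weights_only=False unpickles arbitrary Python objects, so this is only reasonable for checkpoints you trust.

```python
import functools


def patch_torch_load(torch_module):
    """Make torch_module.load default to weights_only=False (hypothetical helper).

    Callers that pass weights_only explicitly keep their own value.
    """
    original_load = torch_module.load
    if getattr(original_load, "_weights_only_patched", False):
        return  # already patched, do not wrap twice

    @functools.wraps(original_load)
    def patched_load(*args, **kwargs):
        # Only fill in the default; never override an explicit choice.
        kwargs.setdefault("weights_only", False)
        return original_load(*args, **kwargs)

    patched_load._weights_only_patched = True
    torch_module.load = patched_load


# Usage sketch: apply the patch before the TTS model is loaded.
#   import torch
#   patch_torch_load(torch)
```

Passing the module in as an argument keeps the helper importable without PyTorch installed, which also makes it easy to check in isolation.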
Base URL
https://loomisgitarrist-xtts-multilingual.hf.space
1. Streaming Endpoint (/stream)
Best for: Real-time applications, chatbots, assistants.
Method: POST
URL: /stream
Request Body (JSON)
| Field | Type | Required | Description |
|---|---|---|---|
| text | string | Yes | The text to convert to speech. |
| language | string | Yes | Language code (e.g., en, es, de, fr, it, pt, pl, tr, ru, nl, cs, ar, zh-cn, ja, hu, ko). |
| speaker_id | string | Yes | Filename of the speaker WAV in speakers/ (e.g., dave.wav, robert.wav). |
| stream_chunk_size | int | No | Chunk size for processing (default: 20). Lower = faster start, higher = better context. |
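To catch mistakes before a request leaves the client, the fields in the table can be assembled by a small helper. This is a sketch only: build_payload and KNOWN_LANGUAGES are hypothetical names, and treating the language codes from the table as an exhaustive list is an assumption, since the table presents them as examples.

```python
from typing import Any, Dict, Optional

# Codes copied from the table; assumed (not confirmed) to be the full set.
KNOWN_LANGUAGES = {
    "en", "es", "de", "fr", "it", "pt", "pl", "tr",
    "ru", "nl", "cs", "ar", "zh-cn", "ja", "hu", "ko",
}


def build_payload(
    text: str,
    language: str,
    speaker_id: str,
    stream_chunk_size: Optional[int] = None,
) -> Dict[str, Any]:
    """Build the JSON body for /stream or /generate (hypothetical helper)."""
    if not text.strip():
        raise ValueError("text must not be empty")
    if language not in KNOWN_LANGUAGES:
        raise ValueError(f"unknown language code: {language!r}")

    payload: Dict[str, Any] = {
        "text": text,
        "language": language,
        "speaker_id": speaker_id,
    }
    # Optional field: omit it so the server applies its default (20).
    if stream_chunk_size is not None:
        if stream_chunk_size <= 0:
            raise ValueError("stream_chunk_size must be positive")
        payload["stream_chunk_size"] = stream_chunk_size
    return payload
```

Leaving stream_chunk_size out of the body when it is not set lets the server apply its documented default instead of hard-coding it on the client.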
Python Example (Streaming)
import requests

API_URL = "https://loomisgitarrist-xtts-multilingual.hf.space/stream"

payload = {
    "text": "Hello, I am streaming this audio directly from the API!",
    "language": "en",
    "speaker_id": "dave.wav",
}

print("Streaming audio...")
# Send the payload as a JSON body and keep the connection open for streaming.
response = requests.post(API_URL, json=payload, stream=True)
response.raise_for_status()

# Write each chunk to disk as it arrives.
with open("stream_output.wav", "wb") as f:
    for chunk in response.iter_content(chunk_size=1024):
        if chunk:
            f.write(chunk)
print("Stream saved to stream_output.wav")
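Because stream_chunk_size trades start-up speed against context, it can help to measure how long the first audio chunk takes to arrive. The following is a rough sketch, not part of the API: save_stream is a hypothetical helper that writes chunks to any binary file object and reports the total size and the delay before the first non-empty chunk.

```python
import time
from typing import BinaryIO, Callable, Iterable, Optional, Tuple


def save_stream(
    chunks: Iterable[bytes],
    out: BinaryIO,
    started_at: Optional[float] = None,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[int, Optional[float]]:
    """Write chunks to out; return (bytes_written, seconds_to_first_chunk).

    started_at should be a clock() reading taken just before the request
    was sent. If omitted, timing starts when this function is called.
    """
    start = clock() if started_at is None else started_at
    first_chunk_delay = None
    total = 0
    for chunk in chunks:
        if not chunk:
            continue  # skip keep-alive / empty chunks
        if first_chunk_delay is None:
            first_chunk_delay = clock() - start
        out.write(chunk)
        total += len(chunk)
    return total, first_chunk_delay


# Usage sketch with the streaming request from the example above:
#   started = time.monotonic()
#   response = requests.post(API_URL, json=payload, stream=True)
#   with open("stream_output.wav", "wb") as f:
#       size, delay = save_stream(response.iter_content(1024), f, started)
```

Running this with a few different stream_chunk_size values gives a direct comparison of start-up latency for your own text and network conditions.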
2. Generate Endpoint (/generate)
Best for: High-quality, non-real-time generation.
Method: POST
URL: /generate
Request Body (JSON)
Same as /stream.
Python Example (Generate)
import requests

API_URL = "https://loomisgitarrist-xtts-multilingual.hf.space/generate"

payload = {
    "text": "This is a full quality generation test.",
    "language": "en",
    "speaker_id": "dave.wav",
}

# Send the payload as a JSON body; the response body is the generated audio.
response = requests.post(API_URL, json=payload)

if response.status_code == 200:
    with open("output.wav", "wb") as f:
        f.write(response.content)
    print("Audio saved to output.wav")
else:
    print("Error:", response.text)
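A quick sanity check on the result is to read the audio length back from the returned bytes. This sketch uses only the standard library and assumes the /generate response is an ordinary PCM WAV file; wav_duration_seconds is a hypothetical helper name. Streamed output may carry a placeholder length in its header, so the value may be unreliable for files saved from /stream.

```python
import io
import wave


def wav_duration_seconds(data: bytes) -> float:
    """Return the duration of in-memory PCM WAV data in seconds.

    Raises wave.Error if the bytes are not a WAV file the wave module can read.
    """
    with wave.open(io.BytesIO(data), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


# Usage sketch with the response from the example above:
#   print("Duration:", wav_duration_seconds(response.content), "seconds")
```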
Troubleshooting
- 503 Model not loaded: The Space is starting up (Cold Start). Wait 1-2 minutes.
- Empty Audio (44 bytes): The file contains only a WAV header and no audio data, which usually indicates a streaming error on the server. Check the Space logs.
- Connection Error: Check your internet connection and whether the Space is paused.
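The first two items can be handled in client code. The sketch below is one possible approach under stated assumptions: post_with_retry and is_header_only are hypothetical helpers, the retry count and wait time are illustrative values rather than documented limits, and the 44-byte threshold is the size of a standard PCM WAV header.

```python
import time
from typing import Any, Callable

WAV_HEADER_BYTES = 44  # size of a standard PCM WAV header


def post_with_retry(
    send: Callable[[], Any],
    max_attempts: int = 5,
    wait_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call send() until it returns a non-503 response or attempts run out.

    send is any zero-argument callable returning an object with a
    status_code attribute, e.g. a lambda wrapping requests.post.
    """
    response = None
    for attempt in range(1, max_attempts + 1):
        response = send()
        if response.status_code != 503:
            return response
        if attempt < max_attempts:
            sleep(wait_seconds)  # give the Space time to finish its cold start
    return response  # still 503 after the final attempt


def is_header_only(audio: bytes) -> bool:
    """True if the audio is no larger than a bare WAV header."""
    return len(audio) <= WAV_HEADER_BYTES


# Usage sketch:
#   import requests
#   response = post_with_retry(lambda: requests.post(API_URL, json=payload))
#   if response.status_code == 200 and is_header_only(response.content):
#       print("Got an empty WAV; check the Space logs.")
```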