xtts-multilingual / XTTS_API_GUIDE.md

Loomis Green

Docs: Update API guide with PyTorch 2.6+ compatibility notes and streaming examples

f34249a about 1 month ago

XTTS-v2 API Guide (Hugging Face Spaces)

This API provides Text-to-Speech (TTS) capabilities using Coqui XTTS-v2, deployed on Hugging Face Spaces. It supports both streaming (low latency) and full audio generation.

⚠️ Critical Setup Note (PyTorch 2.6+)

If you are deploying this locally or in a new environment, either use a compatible PyTorch version or apply the monkeypatch for `torch.load`.

PyTorch 2.6+ makes `torch.load` default to `weights_only=True`, which breaks loading of older Coqui TTS checkpoints.
Fix: Pin `torch==2.4.0` OR use the monkeypatch included in `app.py`.
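The monkeypatch itself is not reproduced in this guide. As a minimal sketch, assuming the patch simply restores the pre-2.6 default, it could look like the wrapper below. The helper name `allow_full_unpickling` is hypothetical, and the code in `app.py` may differ. Note that `weights_only=False` lets a checkpoint run arbitrary code while it is unpickled, so use it only with checkpoints you trust.

```python
import functools


def allow_full_unpickling(load_fn):
    """Wrap a torch.load-style callable so weights_only defaults to False.

    Hypothetical helper: an explicit weights_only argument from the caller
    is left untouched; only the default is changed.
    """
    @functools.wraps(load_fn)
    def patched(*args, **kwargs):
        kwargs.setdefault("weights_only", False)
        return load_fn(*args, **kwargs)

    return patched


# Assumed usage, applied once before the TTS model is loaded:
#   import torch
#   torch.load = allow_full_unpickling(torch.load)
```

Pinning `torch==2.4.0` avoids the patch entirely and is the simpler option when the environment allows it.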

Base URL

https://loomisgitarrist-xtts-multilingual.hf.space

1. Streaming Endpoint (`/stream`)

Best for: Real-time applications, chatbots, assistants.
Method: POST
URL: `/stream`

Request Body (JSON)

Field	Type	Required	Description
`text`	string	Yes	The text to convert to speech.
`language`	string	Yes	Language code (e.g., `en`, `es`, `de`, `fr`, `it`, `pt`, `pl`, `tr`, `ru`, `nl`, `cs`, `ar`, `zh-cn`, `ja`, `hu`, `ko`).
`speaker_id`	string	Yes	Filename of the speaker WAV in `speakers/` (e.g., `dave.wav`, `robert.wav`).
`stream_chunk_size`	int	No	Chunk size for processing (default: `20`). Lower = faster start, higher = better context.
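The fields above can be assembled and checked on the client before any request is sent. The sketch below is only an illustration: `build_payload` and `LISTED_LANGUAGES` are hypothetical names, the language set contains only the codes listed in the table (which says "e.g.", so the Space may accept others), and rejecting empty text is a client-side choice rather than documented server behaviour.

```python
import json

# Codes copied from the table; the list may be incomplete.
LISTED_LANGUAGES = {
    "en", "es", "de", "fr", "it", "pt", "pl", "tr",
    "ru", "nl", "cs", "ar", "zh-cn", "ja", "hu", "ko",
}


def build_payload(text, language, speaker_id, stream_chunk_size=None):
    """Return a dict with the documented request-body fields (hypothetical helper)."""
    if not text.strip():
        raise ValueError("text must not be empty")
    if language not in LISTED_LANGUAGES:
        raise ValueError(f"language {language!r} is not in the documented list")
    payload = {"text": text, "language": language, "speaker_id": speaker_id}
    # The optional field is only sent when the caller overrides the default.
    if stream_chunk_size is not None:
        payload["stream_chunk_size"] = int(stream_chunk_size)
    return payload


# Show the JSON body that would be sent for a Spanish request.
print(json.dumps(build_payload("Hola, ¿qué tal?", "es", "dave.wav", 20),
                 ensure_ascii=False))
```

Leaving `stream_chunk_size` out of the body when it is not set keeps the server default of `20` in effect.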

Python Example (Streaming)

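import requests

API_URL = "https://loomisgitarrist-xtts-multilingual.hf.space/stream"
payload = {
    "text": "Hello, I am streaming this audio directly from the API!",
    "language": "en",
    "speaker_id": "dave.wav",
}

print("Streaming audio...")
# Send the payload as a JSON body (json=, not data=, which would form-encode it).
# stream=True lets us read audio chunks as the server produces them.
with requests.post(API_URL, json=payload, stream=True, timeout=120) as response:
    response.raise_for_status()  # surface 503 (cold start) and other HTTP errors
    with open("stream_output.wav", "wb") as f:
        for chunk in response.iter_content(chunk_size=1024):
            if chunk:  # skip empty keep-alive chunks
                f.write(chunk)
print("Stream saved to stream_output.wav")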

2. Generate Endpoint (`/generate`)

Best for: High-quality, non-real-time generation.
Method: POST
URL: `/generate`

Request Body (JSON)

Same as /stream.

Python Example (Generate)

import requests

API_URL = "https://loomisgitarrist-xtts-multilingual.hf.space/generate"
payload = {
    "text": "This is a full quality generation test.",
    "language": "en",
    "speaker_id": "dave.wav",
}

# Send the payload as a JSON body (json=, not data=, which would form-encode it).
response = requests.post(API_URL, json=payload, timeout=300)

if response.status_code == 200:
    # The whole WAV file arrives in a single response body.
    with open("output.wav", "wb") as f:
        f.write(response.content)
    print("Audio saved to output.wav")
else:
    print("Error:", response.status_code, response.text)

Troubleshooting

503 Model not loaded: The Space is still starting up (cold start). Wait 1-2 minutes, then retry.
Empty audio (44 bytes): The file contains only a WAV header and no samples, which usually indicates a streaming error on the server. Check the Space logs.
Connection error: Check your internet connection and whether the Space is paused.
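The first two cases can be detected in client code. The sketch below is a rough illustration under stated assumptions: `retry_on_503` and `is_header_only` are hypothetical helpers, the retry count and delay are arbitrary values chosen to cover the 1-2 minute cold start, and `send` is any zero-argument callable that returns an object with a `status_code` attribute (for example a lambda around `requests.post`).

```python
import io
import time
import wave

WAV_HEADER_BYTES = 44  # size of a canonical PCM WAV header


def is_header_only(audio: bytes) -> bool:
    """True if the data is too short to hold any samples after the header."""
    return len(audio) <= WAV_HEADER_BYTES


def retry_on_503(send, attempts=6, delay_s=20.0, sleep=time.sleep):
    """Call send() until the response is not a 503 or attempts run out.

    Returns the last response either way, so the caller can inspect it.
    """
    response = None
    for attempt in range(attempts):
        response = send()
        if response.status_code != 503:
            return response
        if attempt < attempts - 1:
            sleep(delay_s)  # give the Space time to finish loading the model
    return response


# Demo: an in-memory WAV with no frames is exactly the "empty audio" case.
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(1)      # arbitrary demo values
    w.setsampwidth(2)
    w.setframerate(24000)
print(len(buf.getvalue()), is_header_only(buf.getvalue()))

# Assumed usage with the generate endpoint:
#   response = retry_on_503(lambda: requests.post(API_URL, json=payload, timeout=300))
#   if response.status_code == 200 and is_header_only(response.content):
#       print("Server returned a WAV header with no audio; check the Space logs.")
```

Passing `sleep` as a parameter keeps the retry loop easy to test without real delays.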
