gavanduffy

Add HF Space YAML metadata

052919a about 1 month ago


	---
	title: SuperTonic3 TTS API
	emoji: 🎤
	colorFrom: blue
	colorTo: purple
	sdk: docker
	pinned: false
	license: mit
	---

	# SuperTonic3 TTS API

	OpenAI-compatible Text-to-Speech API server powered by [supertonic3](https://github.com/nicegram/nicegram-android) (ONNX-based). Drop-in replacement for OpenAI's TTS API with 31 languages and 10 built-in voices.

	Key Features:

	- OpenAI API Compatible - Works with any OpenAI TTS client
	- 31 Languages - Single model supports 31 languages
	- 10 Built-in Voices - M1-M5 (male), F1-F5 (female)
	- CPU Optimized - No GPU required (ONNX runtime)
	- 44100 Hz Output - High-quality audio
	- Docker Ready - One-command deployment
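
The 44100 Hz output rate listed above can be checked on a file requested with `response_format` set to `wav`. The sketch below reads the header with Python's standard `wave` module; `describe_wav` is a hypothetical helper name, and it assumes the server returns standard PCM WAV, which this README does not state.

```python
import wave


def describe_wav(path):
    """Return channel count, sample rate and duration read from a PCM WAV header."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "duration_s": frames / rate,
        }
```

For a file saved from the API, `describe_wav("speech.wav")["sample_rate"]` would be expected to equal `44100` if the feature list holds for WAV output.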

	## Quick Start

	### Docker

	```bash
	docker compose up -d
	```

	The server is then available at `http://localhost:7860`.

	### Python

	```bash
	python -m venv venv
	source venv/bin/activate
	pip install -r requirements.txt
	python server.py
	```

	## API Usage

	### Generate Speech

	```bash
	curl -X POST http://localhost:7860/v1/audio/speech \
	  -H "Content-Type: application/json" \
	  -d '{
	    "model": "tts-1",
	    "input": "Hello world!",
	    "voice": "M1"
	  }' \
	  --output speech.mp3
	```

	### Python Client

	```python
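	from openai import OpenAI

	# Point the official client at the local server; the key is a placeholder.
	client = OpenAI(
	    base_url="http://localhost:7860/v1",
	    api_key="not-needed",
	)

	# Use the streaming-response helper: calling stream_to_file() on the plain
	# response object is deprecated in recent versions of the openai package.
	with client.audio.speech.with_streaming_response.create(
	    model="tts-1",
	    voice="M1",
	    input="Hello world!",
	) as response:
	    response.stream_to_file("output.mp3")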
	```

	### Speech Parameters

	| Parameter         | Type    | Required | Default | Description                       |
	| ----------------- | ------- | -------- | ------- | --------------------------------- |
	| `input`           | string  | Yes      | -       | Text to synthesize                |
	| `voice`           | string  | No       | `M1`    | Voice: M1-M5, F1-F5               |
	| `response_format` | string  | No       | `mp3`   | Output: `mp3`, `wav`, `flac`      |
	| `lang`            | string  | No       | `en`    | Language code (31 supported)      |
	| `stream`          | boolean | No       | `false` | Enable streaming                  |
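
As an illustration of how the parameters in the table fit together, the following sketch builds a request body and posts it using only the standard library. `build_speech_payload` and `synthesize` are hypothetical helper names, the client-side validation is an assumption (the README does not say how the server treats unknown values), and `"model": "tts-1"` is copied from the curl example above.

```python
import json
import urllib.request

# Values documented in the parameter table.
VOICES = {f"{gender}{n}" for gender in ("M", "F") for n in range(1, 6)}
FORMATS = {"mp3", "wav", "flac"}


def build_speech_payload(text, voice="M1", response_format="mp3",
                         lang="en", stream=False):
    """Build the JSON body for POST /v1/audio/speech with the documented defaults."""
    if not text:
        raise ValueError("input is required and must be non-empty")
    if voice not in VOICES:
        raise ValueError(f"unknown voice: {voice!r}")
    if response_format not in FORMATS:
        raise ValueError(f"unknown response_format: {response_format!r}")
    return {
        "model": "tts-1",
        "input": text,
        "voice": voice,
        "response_format": response_format,
        "lang": lang,
        "stream": stream,
    }


def synthesize(payload, out_path, base_url="http://localhost:7860"):
    """Send the payload to a running server and write the audio bytes to out_path."""
    request = urllib.request.Request(
        f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as f:
        f.write(response.read())
```

With the server running, `synthesize(build_speech_payload("Bonjour", voice="F2", response_format="wav", lang="fr"), "bonjour.wav")` would request French speech as a WAV file.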

	### Languages

	31 supported languages: en, zh, ja, ko, fr, de, es, it, pt, ru, ar, hi, bn, id, ms, th, vi, tl, tr, fa, pl, nl, sv, da, fi, cs, ro, hu, el, he, uk
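
Applications often hold locale tags such as `pt-BR` instead of bare language codes. A minimal sketch of mapping such a tag onto the list above follows; `resolve_lang` is a hypothetical helper, and falling back to `en` is an assumption chosen to match the documented default for `lang`.

```python
# The 31 codes listed in this README.
SUPPORTED_LANGS = (
    "en", "zh", "ja", "ko", "fr", "de", "es", "it", "pt", "ru",
    "ar", "hi", "bn", "id", "ms", "th", "vi", "tl", "tr", "fa",
    "pl", "nl", "sv", "da", "fi", "cs", "ro", "hu", "el", "he",
    "uk",
)


def resolve_lang(tag, default="en"):
    """Reduce a locale tag like 'pt-BR' or 'zh_CN' to a supported code, else default."""
    primary = tag.replace("_", "-").split("-")[0].lower()
    return primary if primary in SUPPORTED_LANGS else default
```

Because `lang` is not part of OpenAI's own speech API, the `openai` Python client has no named argument for it; its generic `extra_body` argument (for example `extra_body={"lang": resolve_lang("pt-BR")}`) is one way to send it, assuming the server reads `lang` from the JSON body.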

	## Configuration

	| Environment Variable    | Default   | Description   |
	| ----------------------- | --------- | ------------- |
	| `SUPERTONIC3_HOST`      | `0.0.0.0` | Bind address  |
	| `SUPERTONIC3_PORT`      | `7860`    | Port          |
	| `SUPERTONIC3_VOICE`     | `M1`      | Default voice |
	| `SUPERTONIC3_LOG_LEVEL` | `INFO`    | Log level     |
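
To make the defaults concrete, here is a sketch of how these variables could be read in Python. It is not taken from `server.py`: `ServerConfig` and `load_config` are hypothetical names, and only the variable names and default values come from the table above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerConfig:
    host: str = "0.0.0.0"
    port: int = 7860
    voice: str = "M1"
    log_level: str = "INFO"


def load_config(env=None):
    """Read the SUPERTONIC3_* variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return ServerConfig(
        host=env.get("SUPERTONIC3_HOST", "0.0.0.0"),
        port=int(env.get("SUPERTONIC3_PORT", "7860")),  # env values are strings
        voice=env.get("SUPERTONIC3_VOICE", "M1"),
        log_level=env.get("SUPERTONIC3_LOG_LEVEL", "INFO").upper(),
    )
```

Passing a plain dict, as in `load_config({"SUPERTONIC3_PORT": "8080"})`, makes it possible to try overrides without touching the real environment.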