BoomConnex Voice Studio: Backend
Single FastAPI app that hosts three TTS endpoints and serves the React SPA on the same origin. Designed for one HuggingFace Space on a dedicated GPU.
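Serving the SPA from the same origin as the API usually comes down to one rule: API paths go to the backend, real built files are served as-is, and every other path falls back to index.html so that client-side routes such as /emotion still load. The sketch below shows that rule as a framework-independent function. The helper name `resolve_spa_path` and its exact behaviour are assumptions for illustration, not a copy of what main.py does.

```python
from pathlib import Path
from typing import Optional


def resolve_spa_path(static_dir: Path, request_path: str) -> Optional[Path]:
    """Pick the file to serve for a GET request, or None for API paths.

    Hypothetical helper: the name and rules are assumptions, not main.py's code.
    """
    if request_path == "/api" or request_path.startswith("/api/"):
        return None  # API routes are handled by the backend, not the SPA

    root = static_dir.resolve()
    candidate = (root / request_path.lstrip("/")).resolve()

    # Refuse anything that escapes the static directory (e.g. "/../secret").
    inside_root = candidate == root or root in candidate.parents
    if inside_root and candidate.is_file():
        return candidate  # a real built asset such as /assets/app.js

    # Anything else (/, /emotion, /design, ...) is a client-side route.
    return root / "index.html"
```

In a FastAPI app, a rule like this would typically sit behind a catch-all GET route registered after the /api routes.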
Endpoints
| Method | Path | Purpose |
|---|---|---|
| POST | /api/voice-clone | OmniVoice voice clone (LavaSR pre-enhances ref) |
| POST | /api/voice-design | OmniVoice voice design (LavaSR post-enhance) |
| POST | /api/emotion-tts | Chatterbox emotional TTS |
| GET | /api/health | Which models are loaded |
| GET | /api/languages | OmniVoice language list |
| GET | /api/voice-design/options | Voice Design dropdown taxonomy |
| GET | /, /emotion, /design | React SPA (with client-side routing) |
All generation endpoints accept multipart/form-data and return
audio/wav (PCM 16-bit). See main.py for the full form-field list.
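A minimal client sketch for the request/response contract described above: send a multipart form, get WAV bytes back, and confirm they are 16-bit PCM. The `text` form field is a placeholder assumption (main.py holds the real field list), the function names are hypothetical, and `requests` is a third-party package that may need installing.

```python
import io
import wave


def wav_info(body: bytes) -> dict:
    """Parse a WAV response body and confirm it holds 16-bit PCM samples."""
    with wave.open(io.BytesIO(body), "rb") as wav:
        if wav.getsampwidth() != 2:
            bits = 8 * wav.getsampwidth()
            raise ValueError(f"expected 16-bit samples, got {bits}-bit")
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "seconds": wav.getnframes() / rate,
        }


def synthesize(base_url: str, text: str) -> bytes:
    """POST a multipart form to /api/emotion-tts and return the WAV bytes.

    ASSUMPTION: "text" is a placeholder field name; check main.py for the
    fields the endpoint really expects.
    """
    import requests  # third-party; imported lazily so wav_info works without it

    response = requests.post(
        f"{base_url}/api/emotion-tts",
        # A (None, value) tuple makes requests send a plain multipart form field.
        files={"text": (None, text)},
        timeout=300,
    )
    response.raise_for_status()
    return response.content
```

With the backend running locally, `wav_info(synthesize("http://localhost:7860", "Hello"))` would report the channel count, sample rate, and duration of the returned clip.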
Local development
Run two processes, the backend on :7860 and the frontend on :8080 with a Vite proxy:
# Terminal 1: backend
cd backend
pip install -r requirements.txt
python main.py
# Terminal 2: frontend (sibling repo)
cd remix-of-voicecraft-studio-main
npm install
npm run dev
The Vite config proxies /api/* to http://localhost:7860, so the
frontend talks to the real backend with no CORS gymnastics.
To skip loading models locally (so the server starts fast for UI work):
LOAD_OMNIVOICE=0 LOAD_LAVASR=0 LOAD_CHATTERBOX=0 python main.py
Production / HuggingFace Space
The Space repo layout must be:
.
├── main.py
├── omnivoice/          # vendored from Voice-Cloning/omnivoice
├── frontend/           # = remix-of-voicecraft-studio-main copied here
├── requirements.txt
├── Dockerfile
└── README.md
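Before pushing to the Space, the layout above can be checked with a few lines of Python. This is only a sketch: the function name `missing_entries` is hypothetical, and it checks for the presence of each entry, not its contents.

```python
from pathlib import Path

# Entries taken from the layout above; a trailing slash marks a directory.
REQUIRED_ENTRIES = (
    "main.py",
    "omnivoice/",
    "frontend/",
    "requirements.txt",
    "Dockerfile",
    "README.md",
)


def missing_entries(repo_root: Path) -> list[str]:
    """Return the required entries that are absent from repo_root."""
    missing = []
    for entry in REQUIRED_ENTRIES:
        path = repo_root / entry.rstrip("/")
        present = path.is_dir() if entry.endswith("/") else path.is_file()
        if not present:
            missing.append(entry)
    return missing
```

Calling `missing_entries(Path("."))` from the repo root returns an empty list when everything is in place.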
Build and run locally to mirror the Space:
# from backend/, with frontend/ next to it
docker build -t voice-studio .
docker run --rm --gpus all -p 7860:7860 voice-studio
Configuration (env vars)
| Variable | Default | Notes |
|---|---|---|
| OMNIVOICE_CHECKPOINT | k2-fsa/OmniVoice | HF repo id or local path |
| LAVASR_CHECKPOINT | YatharthS/LavaSR | HF repo id or local path |
| LOAD_OMNIVOICE | 1 | 0 to skip |
| LOAD_LAVASR | 1 | 0 to skip |
| LOAD_CHATTERBOX | 1 | 0 to skip |
| LOAD_ASR | 1 | OmniVoice's Whisper for ref-text auto-transcription |
| STATIC_DIR | static | Where the built SPA lives |
| HOST / PORT | 0.0.0.0 / 7860 | |
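The table above maps naturally onto a small settings object read once at startup. The sketch below uses the variable names and defaults from the table. The `Settings` class, the `env_flag` helper, and the rule that only the literal value `0` disables a flag are assumptions; main.py may parse these differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping


def env_flag(env: Mapping[str, str], name: str, default: bool = True) -> bool:
    """Read an on/off flag. ASSUMPTION: only the literal "0" turns it off."""
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip() != "0"


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables listed in the table."""

    omnivoice_checkpoint: str
    lavasr_checkpoint: str
    load_omnivoice: bool
    load_lavasr: bool
    load_chatterbox: bool
    load_asr: bool
    static_dir: str
    host: str
    port: int

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ) -> "Settings":
        return cls(
            omnivoice_checkpoint=env.get("OMNIVOICE_CHECKPOINT", "k2-fsa/OmniVoice"),
            lavasr_checkpoint=env.get("LAVASR_CHECKPOINT", "YatharthS/LavaSR"),
            load_omnivoice=env_flag(env, "LOAD_OMNIVOICE"),
            load_lavasr=env_flag(env, "LOAD_LAVASR"),
            load_chatterbox=env_flag(env, "LOAD_CHATTERBOX"),
            load_asr=env_flag(env, "LOAD_ASR"),
            static_dir=env.get("STATIC_DIR", "static"),
            host=env.get("HOST", "0.0.0.0"),
            port=int(env.get("PORT", "7860")),
        )
```

Passing a plain dict instead of `os.environ`, for example `Settings.from_env({"LOAD_CHATTERBOX": "0"})`, makes the parsing easy to exercise without touching the real environment.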