Qwen3-TTS CustomVoice: HF Inference Endpoint

Custom handler.py that serves Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice on a dedicated HF Inference Endpoint via the official qwen-tts package.

The handler loads the 1.9B model + the 12Hz speech tokenizer (vocoder) at cold start and exposes one POST route returning base64 WAV (24kHz mono).
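As a rough illustration of the output format described above, the sketch below packs mono float samples into a 24 kHz WAV and base64-encodes it with the standard library only. The helper name `encode_wav_b64`, the 16-bit PCM sample width, and the float input range of [-1, 1] are assumptions made for this example; the actual handler.py may encode audio differently.

```python
import base64
import io
import math
import struct
import wave

SAMPLE_RATE = 24_000  # 24 kHz mono, as stated above


def encode_wav_b64(samples, sample_rate=SAMPLE_RATE):
    """Pack float samples in [-1.0, 1.0] into a mono WAV and base64-encode it.

    Hypothetical helper: 16-bit PCM is an assumption, not confirmed by the handler.
    """
    # Clip to the valid range, then scale to signed 16-bit integers.
    pcm = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)            # mono
        wav.setsampwidth(2)            # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(struct.pack(f"<{len(pcm)}h", *pcm))
    return base64.b64encode(buf.getvalue()).decode("ascii")


# Half a second of a 440 Hz tone stands in for model output.
tone = [0.3 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE)
        for n in range(SAMPLE_RATE // 2)]
audio_b64 = encode_wav_b64(tone)
duration_s = round(len(tone) / SAMPLE_RATE, 2)
```

The resulting string is what would go into the "audio" field of the response, with `duration_s` computed from the sample count and sample rate.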

Request

{
  "inputs": "She watched the rain trace lines down the window.",
  "parameters": {
    "speaker": "Ryan",
    "language": "English",
    "instruct": "calm, observational"
  }
}

Zero-shot clone (best-effort on CustomVoice; the -Base model is better for this):

{
  "inputs": "text in the cloned voice",
  "parameters": {
    "language": "English",
    "ref_audio_b64": "<base64 wav/mp3>",
    "ref_text": "exact transcript of the reference clip"
  }
}
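A clone request needs the reference clip as base64 text. The sketch below shows one way to build that payload from a local audio file; `build_clone_payload` is a hypothetical helper name, and the field names are taken from the request above.

```python
import base64
from pathlib import Path


def build_clone_payload(text, ref_audio_path, ref_text, language="English"):
    """Build a zero-shot clone request body from a local reference clip.

    Hypothetical helper; the keys mirror the clone request shown above.
    """
    ref_bytes = Path(ref_audio_path).read_bytes()  # wav or mp3, sent as-is
    return {
        "inputs": text,
        "parameters": {
            "language": language,
            "ref_audio_b64": base64.b64encode(ref_bytes).decode("ascii"),
            # Should match what is actually spoken in the reference clip.
            "ref_text": ref_text,
        },
    }
```

The returned dict can be serialized with `json.dumps` and sent as the POST body.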

Response: [{"audio": "<base64 wav>", "format": "wav", "sample_rate": 24000, "duration_s": 3.1, "speaker": "Ryan", "language": "English"}]
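A minimal client sketch, using only the standard library, that posts a built-in-speaker request and writes the returned audio to disk. The function names, the endpoint URL argument, and the Bearer-token header are assumptions for illustration (a protected endpoint normally expects an HF token); adjust them to match how the endpoint is deployed.

```python
import base64
import json
import urllib.request


def synthesize(endpoint_url, token, text, speaker="Ryan",
               language="English", instruct=None):
    """POST one request to the endpoint and return the parsed JSON response."""
    parameters = {"speaker": speaker, "language": language}
    if instruct:
        parameters["instruct"] = instruct
    body = json.dumps({"inputs": text, "parameters": parameters}).encode("utf-8")
    request = urllib.request.Request(
        endpoint_url,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


def save_first_clip(response, out_path):
    """Decode the first item's base64 audio, write it, and return its duration."""
    item = response[0]  # the response is a list with one result object
    with open(out_path, "wb") as f:
        f.write(base64.b64decode(item["audio"]))
    return item["duration_s"]


# Example (requires a running endpoint and a valid token):
# result = synthesize(url, token, "She watched the rain trace lines down the window.",
#                     instruct="calm, observational")
# save_first_clip(result, "out.wav")
```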

Built-in speakers

| Speaker | Voice | Native language |
|---|---|---|
| Vivian | bright, slightly edgy young female | Chinese |
| Serena | warm, gentle young female | Chinese |
| Uncle_Fu | seasoned male, low mellow timbre | Chinese |
| Dylan | youthful Beijing male | Chinese (Beijing) |
| Eric | lively Chengdu male | Chinese (Sichuan) |
| Ryan | dynamic male, strong rhythm | English |
| Aiden | sunny American male, clear midrange | English |
| Ono_Anna | playful Japanese female | Japanese |
| Sohee | warm Korean female, rich emotion | Korean |

Languages: Chinese, English, Japanese, Korean, German, French, Russian, Portuguese, Spanish, Italian (or Auto).
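Checking a request on the client side before sending it can save a round trip. The sketch below validates the speaker and language against the lists above; `check_parameters` is a hypothetical helper, and treating the names as exact, case-sensitive matches is an assumption, since the handler may be more lenient.

```python
SPEAKERS = {
    "Vivian", "Serena", "Uncle_Fu", "Dylan", "Eric",
    "Ryan", "Aiden", "Ono_Anna", "Sohee",
}
LANGUAGES = {
    "Chinese", "English", "Japanese", "Korean", "German",
    "French", "Russian", "Portuguese", "Spanish", "Italian", "Auto",
}


def check_parameters(parameters):
    """Return a list of problems found in a request's parameters (empty if none)."""
    problems = []
    speaker = parameters.get("speaker")
    # Clone requests carry no speaker, so only check it when present.
    if speaker is not None and speaker not in SPEAKERS:
        problems.append(f"unknown speaker: {speaker!r}")
    language = parameters.get("language")
    if language is not None and language not in LANGUAGES:
        problems.append(f"unknown language: {language!r}")
    return problems
```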

Recommended instance

nvidia-l4 x1 (24 GB) is comfortable: the model is ~4 GB in bf16. Cold start ~3-6 min (pip build + model + vocoder download + load). flash-attn is deliberately omitted; the handler uses sdpa.
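The ~4 GB figure follows from the parameter count: bf16 stores each weight in 2 bytes. The back-of-the-envelope sketch below assumes the 1.9B count quoted earlier covers all weights and ignores the vocoder, activations, and CUDA overhead, so real usage will be somewhat higher.

```python
PARAMS = 1_900_000_000   # 1.9B parameters, as quoted above
BYTES_PER_PARAM = 2      # bf16 is 16 bits per weight
GPU_MEMORY_GB = 24       # nvidia-l4 x1

weights_gb = PARAMS * BYTES_PER_PARAM / 1e9   # about 3.8 GB of weights
headroom_gb = GPU_MEMORY_GB - weights_gb      # roughly 20 GB left for everything else

print(f"weights: {weights_gb:.1f} GB, headroom: {headroom_gb:.1f} GB")
```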

Lifecycle (from the video_lab repo)

python scripts/_run_qwen_tts_endpoint.py --start    # create + wait for ready
python scripts/_run_qwen_tts_endpoint.py --status   # show state + URL
python scripts/_run_qwen_tts_endpoint.py --stop     # delete (full teardown)
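For readers without the video_lab repo, a script with the same three flags could be sketched with the `huggingface_hub` Inference Endpoints API. This is not the repo's actual script: the endpoint name, the repository id, the vendor, the region, and the `framework`/`task` values are assumptions for illustration, and only the nvidia-l4 x1 instance comes from the text above.

```python
import argparse

ENDPOINT_NAME = "qwen3-tts-customvoice"      # hypothetical endpoint name
REPOSITORY = "macso250/qwen3-tts-endpoint"   # assumed to be the repo holding handler.py


def build_parser():
    parser = argparse.ArgumentParser(description="Manage the Qwen3-TTS endpoint")
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("--start", action="store_true", help="create + wait for ready")
    group.add_argument("--status", action="store_true", help="show state + URL")
    group.add_argument("--stop", action="store_true", help="delete (full teardown)")
    return parser


def run(args):
    # Imported lazily so the parser works without huggingface_hub installed.
    from huggingface_hub import create_inference_endpoint, get_inference_endpoint

    if args.start:
        endpoint = create_inference_endpoint(
            ENDPOINT_NAME,
            repository=REPOSITORY,
            framework="pytorch",       # assumption
            task="custom",             # assumption: custom handler task
            accelerator="gpu",
            instance_type="nvidia-l4",
            instance_size="x1",
            vendor="aws",              # assumption
            region="us-east-1",        # assumption
        )
        endpoint.wait()                # block until the endpoint is running
        print(endpoint.status, endpoint.url)
    elif args.status:
        endpoint = get_inference_endpoint(ENDPOINT_NAME)
        print(endpoint.status, endpoint.url)
    else:
        get_inference_endpoint(ENDPOINT_NAME).delete()


# Example: run(build_parser().parse_args(["--status"]))
```

Deleting the endpoint on --stop, rather than pausing it, matches the "full teardown" note above and avoids leaving billable resources behind.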