trainer-01 / README.md
metadata
license: cc-by-nc-sa-4.0
base_model: Qwen/Qwen3-TTS
pipeline_tag: text-to-speech
library_name: transformers
language:
  - en
tags:
  - tts
  - prompttts
  - qwen3-tts
  - voice-design
  - vocence

Qwen3-TTS

A fine-tuned Qwen3-TTS model trained by might2901 for prompt-driven text-to-speech synthesis.

The model produces 24 kHz mono WAV output in a single forward call and requires no reference audio.

Usage

Install the dependencies:

pip install qwen-tts transformers torch soundfile

Then load the model and synthesize speech from a text prompt:

from qwen_tts import Qwen3TTSModel
import soundfile as sf

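# "might2901/model-name" is a placeholder; substitute the actual repository ID.
model = Qwen3TTSModel.from_pretrained("might2901/model-name")

# `instruct` describes the desired voice; no reference audio is passed.
wavs, sr = model.generate_voice_design(
    text="Hello, this is a test of the text to speech system.",
    instruct="A clear, natural voice speaking calmly.",
    language="english",
)
# Write the first returned waveform at the returned sample rate.
sf.write("output.wav", wavs[0], sr)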
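
To confirm that a generated file matches the 24 kHz mono format stated above, its header can be inspected with Python's standard `wave` module. This is a minimal sketch, not part of the model's API: `describe_wav` and `is_24khz_mono` are hypothetical helper names, and the code assumes the file is PCM-encoded WAV, since the `wave` module cannot read float-encoded files.

```python
import wave


def describe_wav(path):
    """Return (sample_rate_hz, channels, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wf:
        sample_rate = wf.getframerate()
        channels = wf.getnchannels()
        duration = wf.getnframes() / sample_rate
    return sample_rate, channels, duration


def is_24khz_mono(path):
    """Check the header against the format this README states (24 kHz, mono)."""
    sample_rate, channels, _ = describe_wav(path)
    return sample_rate == 24000 and channels == 1
```

Calling `is_24khz_mono("output.wav")` after the snippet above would report whether the written file has the expected sample rate and channel count.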

Prompt Guide

Layer    Examples
Gender   a man, a woman
Mood     speaking warmly, calm, natural, softly
Pace     unhurried, steady, measured
Style    conversational, professional, neutral

Example prompts:

A man speaks calmly and naturally.
A woman with a clear, conversational tone.
A professional voice, neutral and steady.
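
The layers above can also be combined programmatically. The sketch below uses a hypothetical helper, `build_instruct`, which is not part of `qwen_tts`. It simply joins whichever layers are supplied into one sentence. The comma-separated phrasing is an assumption chosen for simplicity, and the hand-written example prompts above may well read more naturally to the model.

```python
def build_instruct(gender=None, mood=None, pace=None, style=None):
    """Join the supplied prompt layers into a single instruct sentence.

    Hypothetical helper: any layer left as None is skipped.
    """
    subject = gender if gender else "a voice"
    traits = [t for t in (mood, pace, style) if t]
    # Capitalize only the first character so the rest of the subject is untouched.
    sentence = subject[0].upper() + subject[1:]
    if traits:
        sentence += ", " + ", ".join(traits)
    return sentence + "."


prompt = build_instruct("a woman", mood="speaking warmly",
                        pace="unhurried", style="conversational")
print(prompt)
```

The resulting string can be passed as the `instruct` argument shown in the Usage section.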

Files

model.safetensors            # model weights
speech_tokenizer/            # Qwen3 audio codec
tokenizer.json + ...         # text tokenizer
config.json                  # model config
generation_config.json       # generation settings
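
After downloading the repository to a local directory, a quick check that the listed entries are present can catch an incomplete download before loading the model. This is a minimal sketch: `missing_entries` is a hypothetical helper, and it checks only the names spelled out above, so any additional text tokenizer files implied by the `...` are not covered.

```python
from pathlib import Path

# Names taken from the file listing above; "speech_tokenizer" is a directory.
EXPECTED_ENTRIES = (
    "model.safetensors",
    "speech_tokenizer",
    "tokenizer.json",
    "config.json",
    "generation_config.json",
)


def missing_entries(model_dir):
    """Return the expected entries that do not exist under model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]
```

An empty list from `missing_entries("path/to/local/copy")` means every listed entry was found; the path here is a placeholder.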

License

CC BY-NC-SA 4.0: research and non-commercial use only.