---
license: cc-by-nc-sa-4.0
base_model: Qwen/Qwen3-TTS
pipeline_tag: text-to-speech
library_name: transformers
language:
- en
tags:
- tts
- prompttts
- qwen3-tts
- voice-design
- vocence
---

# Qwen3-TTS

A fine-tuned Qwen3-TTS model trained by might2901 for prompt-driven text-to-speech synthesis. 24 kHz mono WAV output, single forward call, no reference audio required.

## Usage

```bash
pip install qwen-tts transformers torch soundfile
```

```python
from qwen_tts import Qwen3TTSModel
import soundfile as sf

model = Qwen3TTSModel.from_pretrained("might2901/model-name")

wavs, sr = model.generate_voice_design(
    text="Hello, this is a test of the text to speech system.",
    instruct="A clear, natural voice speaking calmly.",
    language="english",
)
sf.write("output.wav", wavs[0], sr)
```

## Prompt Guide

| Layer | Examples |
|-------|----------|
| Gender | *a man*, *a woman* |
| Mood | *speaking warmly*, *calm*, *natural*, *softly* |
| Pace | *unhurried*, *steady*, *measured* |
| Style | *conversational*, *professional*, *neutral* |

Example prompts:

```
A man speaks calmly and naturally.
A woman with a clear, conversational tone.
A professional voice, neutral and steady.
```

## Files

```
model.safetensors        # model weights
speech_tokenizer/        # Qwen3 audio codec
tokenizer.json + ...     # text tokenizer
config.json              # model config
generation_config.json   # generation settings
```

## License

CC BY-NC-SA 4.0: research and non-commercial use only.
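
## Example: composing a prompt from the layers

The four layers in the Prompt Guide (gender, mood, pace, style) can be combined into a single instruct string. The sketch below shows one possible way to do that. `build_instruct` is a hypothetical helper written for illustration; it is not part of `qwen_tts`, and the comma-joined format is an assumption, not a format the model is known to require.

```python
from typing import Optional


def build_instruct(
    gender: str = "a voice",
    mood: Optional[str] = None,
    pace: Optional[str] = None,
    style: Optional[str] = None,
) -> str:
    """Join Prompt Guide layers into one instruct sentence.

    Hypothetical helper (not part of qwen_tts). Layers left as None
    or given as blank strings are skipped.
    """
    # Keep only the layers that were actually provided.
    parts = [p.strip() for p in (gender, mood, pace, style) if p and p.strip()]
    if not parts:
        raise ValueError("at least one prompt layer is required")

    # Comma-join the layers, capitalize the first letter, end with a period.
    sentence = ", ".join(parts)
    return sentence[0].upper() + sentence[1:] + "."


print(build_instruct("a woman", mood="speaking warmly",
                     pace="unhurried", style="conversational"))
# → A woman, speaking warmly, unhurried, conversational.
```

The resulting string can be passed as the `instruct` argument of `model.generate_voice_design(...)` shown under Usage.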