---
license: cc-by-nc-sa-4.0
base_model: Qwen/Qwen3-TTS
pipeline_tag: text-to-speech
library_name: transformers
language:
- en
tags:
- tts
- prompttts
- qwen3-tts
- voice-design
- vocence
---
# Qwen3-TTS
A fine-tuned Qwen3-TTS model, trained by might2901, for prompt-driven text-to-speech synthesis.
It produces 24 kHz mono WAV output from a single forward call and requires no reference audio.
## Usage
```bash
pip install qwen-tts transformers torch soundfile
```
```python
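import soundfile as sf
from qwen_tts import Qwen3TTSModel

# Load the fine-tuned checkpoint.
model = Qwen3TTSModel.from_pretrained("might2901/model-name")

# `instruct` describes the desired voice; no reference audio is passed.
wavs, sr = model.generate_voice_design(
    text="Hello, this is a test of the text to speech system.",
    instruct="A clear, natural voice speaking calmly.",
    language="english",
)

# Save the first generated waveform at the returned sample rate.
sf.write("output.wav", wavs[0], sr)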
```
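To confirm that a generated file matches the 24 kHz mono format described above, the WAV header can be inspected with Python's standard `wave` module. This is a minimal sketch and not part of the model's API: `describe_wav` and `is_24k_mono` are hypothetical helper names, and the code assumes the file is PCM-encoded, since `wave` does not read floating-point WAV data.

```python
import wave


def describe_wav(path):
    """Return (sample_rate_hz, channels, duration_seconds) for a PCM WAV file."""
    with wave.open(str(path), "rb") as wf:
        sample_rate = wf.getframerate()
        channels = wf.getnchannels()
        duration = wf.getnframes() / sample_rate
    return sample_rate, channels, duration


def is_24k_mono(path):
    """Check the header against the 24 kHz mono format stated in this README."""
    sample_rate, channels, _ = describe_wav(path)
    return sample_rate == 24000 and channels == 1
```

Calling `is_24k_mono("output.wav")` after running the usage snippet should return `True` if the file was written as described.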
## Prompt Guide
| Layer | Examples |
|-------|----------|
| Gender | *a man*, *a woman* |
| Mood | *speaking warmly*, *calm*, *natural*, *softly* |
| Pace | *unhurried*, *steady*, *measured* |
| Style | *conversational*, *professional*, *neutral* |
Example prompts:
```
A man speaks calmly and naturally.
A woman with a clear, conversational tone.
A professional voice, neutral and steady.
```
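The layers in the table can also be combined programmatically. The sketch below is one possible way to do that: `build_instruct` is a hypothetical helper, not part of `qwen_tts`, and the sentence template it uses is an assumption. Whether templated prompts perform as well as hand-written ones has not been checked here.

```python
def build_instruct(gender=None, mood=None, pace=None, style=None):
    """Compose an instruct prompt from the Gender, Mood, Pace, and Style layers.

    Each argument is a short phrase from the table (e.g. gender="woman",
    mood="calm"); any layer left as None is omitted from the prompt.
    """
    # Fall back to a generic subject when no gender is given.
    subject = f"A {gender}" if gender else "A voice"
    traits = [t for t in (mood, pace, style) if t]
    if not traits:
        return f"{subject} speaking."
    return f"{subject} speaking, " + ", ".join(traits) + "."


prompt = build_instruct(gender="woman", mood="calm", pace="steady", style="conversational")
print(prompt)  # A woman speaking, calm, steady, conversational.
```

The resulting string can be passed as the `instruct` argument of `generate_voice_design`.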
## Files
```
model.safetensors # model weights
speech_tokenizer/ # Qwen3 audio codec
tokenizer.json + ... # text tokenizer
config.json # model config
generation_config.json # generation settings
```
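When working from a local copy of the repository, a quick check that the listed files are present can catch an incomplete download before loading the model. This is a minimal sketch: `missing_files` is a hypothetical helper, and because the listing elides the remaining tokenizer files (`tokenizer.json + ...`), only `tokenizer.json` itself is checked.

```python
from pathlib import Path

# Entries named in the Files section of this README.
EXPECTED = [
    "model.safetensors",
    "speech_tokenizer",
    "tokenizer.json",
    "config.json",
    "generation_config.json",
]


def missing_files(model_dir):
    """Return the expected entries that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED if not (root / name).exists()]
```

An empty list from `missing_files("path/to/local/copy")` means every listed entry was found; the path here is illustrative.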
## License
CC BY-NC-SA 4.0: research and non-commercial use only.