might2901 / trainer-01

	---
	license: cc-by-nc-sa-4.0
	base_model: Qwen/Qwen3-TTS
	pipeline_tag: text-to-speech
	library_name: transformers
	language:
	- en
	tags:
	- tts
	- prompttts
	- qwen3-tts
	- voice-design
	- vocence
	---

	# Qwen3-TTS

	A fine-tuned Qwen3-TTS model trained by might2901 for prompt-driven text-to-speech synthesis.

	It produces 24 kHz mono WAV output from a single forward call and requires no reference audio.

	## Usage

	```bash
	pip install qwen-tts transformers torch soundfile
	```

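	```python
	from qwen_tts import Qwen3TTSModel
	import soundfile as sf

	# Load the fine-tuned checkpoint from the Hugging Face Hub.
	model = Qwen3TTSModel.from_pretrained("might2901/model-name")

	# Describe the target voice in `instruct`; no reference audio is passed.
	wavs, sr = model.generate_voice_design(
	    text="Hello, this is a test of the text to speech system.",
	    instruct="A clear, natural voice speaking calmly.",
	    language="english",
	)

	# Save the first generated waveform at the returned sample rate.
	sf.write("output.wav", wavs[0], sr)
	```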
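
To confirm that a written file matches the stated 24 kHz mono format, a minimal sketch using Python's standard `wave` module might look like the following. The helper name `describe_wav` is hypothetical, and the sketch assumes the file is PCM-encoded WAV, which the `wave` module requires.

```python
import wave


def describe_wav(path):
    """Return (sample_rate_hz, channels, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        sample_rate = wav.getframerate()
        channels = wav.getnchannels()
        # Frame count divided by frame rate gives the length in seconds.
        duration = wav.getnframes() / sample_rate
    return sample_rate, channels, duration
```

For a file matching the description above, `describe_wav("output.wav")` should report a sample rate of 24000 and a channel count of 1.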

	## Prompt Guide

	| Layer | Examples |
	|-------|----------|
	| Gender | a man, a woman |
	| Mood | speaking warmly, calm, natural, softly |
	| Pace | unhurried, steady, measured |
	| Style | conversational, professional, neutral |

	Example prompts:

	```
	A man speaks calmly and naturally.
	A woman with a clear, conversational tone.
	A professional voice, neutral and steady.
	```
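
The layers in the table can also be combined programmatically. The sketch below is one possible way to do it: the helper `build_instruct` and its sentence template are assumptions made for illustration, not a format prescribed by this model card.

```python
def build_instruct(gender="", mood="", pace="", style=""):
    """Combine the non-empty prompt layers into one instruct sentence."""
    # Fall back to a generic subject when no gender layer is given.
    subject = gender.strip() or "A voice"
    subject = subject[0].upper() + subject[1:]
    # Keep only the descriptive layers that were actually supplied.
    traits = [t.strip() for t in (mood, pace, style) if t.strip()]
    if not traits:
        return f"{subject} speaks."
    return f"{subject} speaks in a {', '.join(traits)} manner."


prompt = build_instruct("a man", "calm", "steady", "conversational")
print(prompt)  # A man speaks in a calm, steady, conversational manner.
```

The resulting string can be passed as the `instruct` argument shown in the Usage section. The template works best with single-word descriptors such as "calm" or "steady".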

	## Files

	```
	model.safetensors # model weights
	speech_tokenizer/ # Qwen3 audio codec
	tokenizer.json + ... # text tokenizer
	config.json # model config
	generation_config.json # generation settings
	```
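
After downloading the repository, a quick check that the listed files are present can catch an incomplete copy. This is a minimal sketch: the helper `missing_files` is hypothetical, and it checks only the names spelled out in the listing above, ignoring the tokenizer files that the listing elides.

```python
from pathlib import Path

# Names taken from the file listing; elided tokenizer files are not checked.
EXPECTED = (
    "model.safetensors",
    "speech_tokenizer",
    "tokenizer.json",
    "config.json",
    "generation_config.json",
)


def missing_files(model_dir):
    """Return the expected entries that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED if not (root / name).exists()]
```

An empty list means every listed entry was found in the directory.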

	## License

	CC BY-NC-SA 4.0: research and non-commercial use only.