Qwen3 TTS 12Hz 0.6B CustomVoice — MLX 4-bit
MLX 4-bit quantized conversion of Qwen/Qwen3-TTS-12Hz-0.6B-CustomVoice for Apple Silicon inference.
Usage
Used by the Qwen3TTS module in qwen3-asr-swift:
let model = try await Qwen3TTSModel.fromPretrained(
    modelId: TTSModelVariant.customVoice.rawValue
)
let audio = try model.synthesize("Hello!", speaker: "Chelsie")

Command line:
audio speak "Hello!" --model custom-voice --speaker Chelsie -o output.wav
Model Details
- Architecture: Qwen3-TTS (Talker transformer + Code Predictor + Mimi speech tokenizer decoder)
- Parameters: 0.6B
- Quantization: 4-bit (MLX, talker only)
- Size: ~964 MB
- Sample rate: 24 kHz
- Codec rate: 12.5 Hz
- Voices: 9 preset voices + instruction-based style control
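
The sample rate and codec rate listed above determine how many audio samples each codec frame covers. The following is a minimal sketch of that arithmetic only. The names samplesPerFrame, codecFrames(forSeconds:), and duration(ofFrames:) are hypothetical helpers for illustration and are not part of the qwen3-asr-swift API.

```swift
// Figures taken from the Model Details list above.
let sampleRate = 24_000.0   // output audio, samples per second
let codecRate = 12.5        // codec frames per second

// Each codec frame corresponds to a fixed number of output samples.
let samplesPerFrame = Int(sampleRate / codecRate)

/// Hypothetical helper: codec frames needed to cover `seconds` of audio, rounded up.
func codecFrames(forSeconds seconds: Double) -> Int {
    Int((seconds * codecRate).rounded(.up))
}

/// Hypothetical helper: duration in seconds of `frames` codec frames.
func duration(ofFrames frames: Int) -> Double {
    Double(frames) / codecRate
}

print(samplesPerFrame)               // 1920
print(codecFrames(forSeconds: 3.0))  // 38
print(duration(ofFrames: 125))       // 10.0
```

Under these figures, one second of speech corresponds to 12.5 codec frames, so the number of frames the model has to generate grows linearly with the length of the utterance.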