Qwen3-ASR-0.6B-chatllm-quantized
This repository contains pre-quantized binaries for Qwen/Qwen3-ASR-0.6B optimized for use with chatllm.cpp and other GGML-compatible backends.
Available Models
- qwen3-asr-0.6b-q4_0.bin: 4-bit quantization (decent accuracy, fastest inference). Recommended for free-tier CPU instances.
- qwen3-asr-0.6b-q8_0.bin: 8-bit quantization (high accuracy, slightly slower than Q4).
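To get a rough feel for the memory trade-off between the two files, the sketch below estimates weight storage from the block layout. It assumes the binaries use the standard GGML `q4_0` and `q8_0` block formats (32 weights per block plus one 16-bit scale) and takes the nominal 0.6B parameter count from the model name; real files will differ because some tensors are usually kept at higher precision and the file carries extra metadata. The function names are illustrative only.

```python
# Rough size estimate for GGML-style block quantization.
# Assumption: standard GGML layouts, 32 weights per block with one fp16 scale.
BLOCK_SIZE = 32
BYTES_PER_BLOCK = {
    "q4_0": 2 + 16,  # 2-byte scale + 32 weights packed at 4 bits each
    "q8_0": 2 + 32,  # 2-byte scale + 32 weights at 8 bits each
}


def bits_per_weight(fmt: str) -> float:
    """Effective bits per weight, including the per-block scale."""
    return BYTES_PER_BLOCK[fmt] * 8 / BLOCK_SIZE


def approx_size_mb(n_params: int, fmt: str) -> float:
    """Approximate weight storage in megabytes (10^6 bytes)."""
    return n_params * bits_per_weight(fmt) / 8 / 1e6


# Nominal parameter count taken from the model name; the true count may differ.
NOMINAL_PARAMS = 600_000_000

for fmt in ("q4_0", "q8_0"):
    print(f"{fmt}: {bits_per_weight(fmt)} bits/weight, "
          f"about {approx_size_mb(NOMINAL_PARAMS, fmt):.0f} MB")
```

Under these assumptions the Q8 file needs a little under twice the memory of the Q4 file, which is the main reason to prefer Q4 on small CPU instances.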
Usage with chatllm.cpp
- Clone or download the binaries.
- Run with the following command (requires chatllm-main):
./chatllm-main -m qwen3-asr-0.6b-q4_0.bin -p audio.wav -n 2
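If you want to call the binary from a script, a minimal Python wrapper might look like the sketch below. It only reuses the flags shown in the command above (`-m`, `-p`, `-n`) and passes the `-n` value through unchanged; the helper names `build_command` and `transcribe` are hypothetical, and the assumption that the transcript is written to standard output has not been verified against chatllm.cpp.

```python
import subprocess


def build_command(model, audio, binary="./chatllm-main", n_value=2):
    """Build the argument list for the command shown in this README.

    Only the flags from the README example are used; n_value is passed
    to -n exactly as in that example.
    """
    return [str(binary), "-m", str(model), "-p", str(audio), "-n", str(n_value)]


def transcribe(model, audio, binary="./chatllm-main", n_value=2):
    """Run chatllm-main on one audio file and return whatever it prints.

    Assumption: the transcript appears on stdout. check=True raises
    CalledProcessError if the process exits with a non-zero status.
    """
    cmd = build_command(model, audio, binary, n_value)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

For example, `transcribe("qwen3-asr-0.6b-q4_0.bin", "audio.wav")` runs the same command as above. Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.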
Credits
- Original Model: Qwen Team (Alibaba Cloud)
- C++ Backend: foldl/chatllm.cpp
- Quantization: Pre-quantized using chatllm.cpp conversion scripts.
License
Please refer to the original Qwen3-ASR License.