Qwen3-ASR-1.7B - MLX 4-bit

MLX 4-bit quantized conversion of Qwen/Qwen3-ASR-1.7B for Apple Silicon inference.

Model Details

| Detail | Value |
|---|---|
| Architecture | Whisper-style audio encoder + Qwen3 text decoder |
| Parameters | 1.7B |
| Quantization | 4-bit (group_size=64, text decoder only) |
| Audio encoder | float16 (24 layers, 1024 dim, 16 heads) |
| Size | ~2.1 GB |
| Languages | Multilingual (EN, ZH, JA, KO, FR, DE, ES, and more) |
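
To make the "4-bit, group_size=64" entry concrete, here is a minimal sketch of group-wise affine quantization in plain Swift. It is a generic illustration, not MLX's actual kernel: the type `QuantizedGroup` and the functions below are hypothetical names, codes are stored unpacked (one per byte) for readability, and the 16-bit scale and 16-bit bias per group used in `effectiveBitsPerWeight` are assumptions.

```swift
// Illustrative group-wise affine quantization. Not the MLX implementation.
struct QuantizedGroup {
    let codes: [UInt8]   // one code in 0...15 per weight (unpacked for clarity)
    let scale: Float     // step between adjacent quantization levels
    let bias: Float      // value represented by code 0 (the group minimum)
}

// Quantizes one group of weights to `bits` bits using its own min/max range.
func quantizeGroup(_ weights: [Float], bits: Int = 4) -> QuantizedGroup {
    precondition(!weights.isEmpty, "group must not be empty")
    let levels = Float((1 << bits) - 1)   // 15 for 4-bit
    let lo = weights.min()!
    let hi = weights.max()!
    // A constant group has hi == lo; use scale 1 to avoid dividing by zero.
    let scale: Float = hi > lo ? (hi - lo) / levels : 1
    let codes = weights.map { w -> UInt8 in
        let q = ((w - lo) / scale).rounded()
        return UInt8(min(max(q, 0), levels))
    }
    return QuantizedGroup(codes: codes, scale: scale, bias: lo)
}

// Reconstructs approximate weights from codes, scale, and bias.
func dequantize(_ group: QuantizedGroup) -> [Float] {
    group.codes.map { Float($0) * group.scale + group.bias }
}

// Splits a flat weight array into groups and quantizes each independently.
func quantize(_ weights: [Float], groupSize: Int = 64) -> [QuantizedGroup] {
    stride(from: 0, to: weights.count, by: groupSize).map { start -> QuantizedGroup in
        let end = min(start + groupSize, weights.count)
        return quantizeGroup(Array(weights[start..<end]))
    }
}

// Storage cost per weight: the code itself plus the group's shared metadata.
// metadataBits = 32 assumes a 16-bit scale and a 16-bit bias per group.
func effectiveBitsPerWeight(bits: Int, groupSize: Int, metadataBits: Int = 32) -> Double {
    Double(bits) + Double(metadataBits) / Double(groupSize)
}
```

Under those assumptions, each quantized weight costs 4 + 32/64 = 4.5 bits rather than exactly 4. A smaller group size tracks the local weight range more closely but spends more bits on metadata.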

Usage

Swift:

let model = try await Qwen3ASRModel.fromPretrained(
    modelId: "aufklarer/Qwen3-ASR-1.7B-MLX-4bit"
)
let text = model.transcribe(audio: samples, sampleRate: 16000)

Command line:

audio transcribe audio.wav --model aufklarer/Qwen3-ASR-1.7B-MLX-4bit
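
The Swift call above passes `sampleRate: 16000`, so audio recorded at another rate or with several channels needs converting first. The sketch below shows one way to do that in plain Swift. It assumes `samples` is a mono `[Float]` buffer, which the model card does not state, and `downmixToMono` and `resampleLinear` are hypothetical helpers, not part of the library.

```swift
// Averages interleaved multi-channel samples (L, R, L, R, ...) down to mono.
func downmixToMono(_ interleaved: [Float], channels: Int) -> [Float] {
    precondition(channels > 0, "channels must be positive")
    guard channels > 1 else { return interleaved }
    let frames = interleaved.count / channels
    return (0..<frames).map { frame -> Float in
        var sum: Float = 0
        for ch in 0..<channels {
            sum += interleaved[frame * channels + ch]
        }
        return sum / Float(channels)
    }
}

// Resamples mono audio by linear interpolation.
// No anti-aliasing filter is applied, so this is a rough sketch only;
// a production pipeline would low-pass filter before downsampling.
func resampleLinear(_ input: [Float],
                    from sourceRate: Double,
                    to targetRate: Double = 16_000) -> [Float] {
    precondition(sourceRate > 0 && targetRate > 0, "rates must be positive")
    guard !input.isEmpty, sourceRate != targetRate else { return input }
    let outputCount = Int((Double(input.count) * targetRate / sourceRate).rounded(.down))
    let step = sourceRate / targetRate   // input samples per output sample
    return (0..<outputCount).map { i -> Float in
        let position = Double(i) * step
        let left = min(Int(position), input.count - 1)
        let right = min(left + 1, input.count - 1)
        let fraction = Float(position - Double(left))
        return input[left] + (input[right] - input[left]) * fraction
    }
}
```

For a 48 kHz stereo recording, `resampleLinear(downmixToMono(raw, channels: 2), from: 48_000)` yields a buffer that could then be passed as `samples`, provided the assumption about the buffer type holds.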

Variants

| Variant | Size | Model ID |
|---|---|---|
| 4-bit | ~2.1 GB | aufklarer/Qwen3-ASR-1.7B-MLX-4bit |
| 8-bit | ~3.2 GB | aufklarer/Qwen3-ASR-1.7B-MLX-8bit |
| 0.6B 4-bit | ~680 MB | aufklarer/Qwen3-ASR-0.6B-MLX-4bit |
| 0.6B 8-bit | ~1.0 GB | aufklarer/Qwen3-ASR-0.6B-MLX-8bit |
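
One possible way to choose among these variants in code is to pick the largest one that fits a size budget. The sketch below encodes the table as data. `ModelVariant` and `largestVariant` are hypothetical names, and the listed sizes are approximate download sizes, so they are only a lower bound on runtime memory use.

```swift
// One row of the variants table.
struct ModelVariant {
    let name: String
    let modelId: String
    let approxSizeGB: Double   // approximate size on disk, from the table
}

let variants = [
    ModelVariant(name: "1.7B 4-bit", modelId: "aufklarer/Qwen3-ASR-1.7B-MLX-4bit", approxSizeGB: 2.1),
    ModelVariant(name: "1.7B 8-bit", modelId: "aufklarer/Qwen3-ASR-1.7B-MLX-8bit", approxSizeGB: 3.2),
    ModelVariant(name: "0.6B 4-bit", modelId: "aufklarer/Qwen3-ASR-0.6B-MLX-4bit", approxSizeGB: 0.68),
    ModelVariant(name: "0.6B 8-bit", modelId: "aufklarer/Qwen3-ASR-0.6B-MLX-8bit", approxSizeGB: 1.0),
]

// Returns the largest variant whose size does not exceed the budget,
// or nil when none fits. Using size as a proxy for quality is a heuristic.
func largestVariant(fittingGB budget: Double, in candidates: [ModelVariant]) -> ModelVariant? {
    candidates
        .filter { $0.approxSizeGB <= budget }
        .max { $0.approxSizeGB < $1.approxSizeGB }
}
```

The returned `modelId` can be passed to `fromPretrained(modelId:)` as in the Usage section.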
