Model Card for Goraint/QwQ-32b-MLX-AWQ-4bit

This model is a quantized version of QwQ-32B, converted to 4-bit AWQ format with the MLX library for efficient inference. It retains the core capabilities of QwQ-32B while reducing memory and compute requirements.
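The card does not include usage instructions. A minimal sketch of loading the quantized weights with the mlx-lm package might look like the following (assumptions: an Apple Silicon machine, mlx-lm installed via `pip install mlx-lm`, and the repository ID `Goraint/QwQ-32b-MLX-AWQ-4bit` shown in the model tree below):

```python
# Minimal sketch: load and run the 4-bit MLX weights with mlx-lm.
# Requires Apple Silicon; downloads the model from the Hugging Face Hub.
from mlx_lm import load, generate

# Repo ID taken from this card's model tree; adjust if the repo moves.
model, tokenizer = load("Goraint/QwQ-32b-MLX-AWQ-4bit")

# QwQ is a chat/reasoning model, so apply the chat template before generating.
messages = [{"role": "user", "content": "Explain AWQ quantization in one sentence."}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

text = generate(model, tokenizer, prompt=prompt, max_tokens=512, verbose=True)
print(text)
```

The same package also ships a CLI (`mlx_lm.generate --model Goraint/QwQ-32b-MLX-AWQ-4bit --prompt "..."`) if a script is not needed.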

Downloads last month: 7
Model size (Safetensors): 5B params
Tensor types: BF16 · U32
Library: MLX


Model tree for Goraint/QwQ-32b-MLX-AWQ-4bit

Base model: Qwen/Qwen2.5-32B
Finetuned: Qwen/QwQ-32B (86 finetunes)
Quantized: this model