Qwen2.5-7B-Instruct MLX 8bit

MLX 8-bit conversion of Qwen/Qwen2.5-7B-Instruct. Default group size (64).

Converted directly from the original HF bf16 safetensors.

The full ladder + group-size sweep

Variant	Repo	Disk	~Min unified RAM	Role
MLX bf16	`Qwen2.5-7B-Instruct-MLX-bf16`	15.24 GB	~18 GB	Reference
MLX 8bit (this repo)	this	8.1 GB	~10 GB	Near-lossless
MLX 6bit	`Qwen2.5-7B-Instruct-MLX-6bit`	6.2 GB	~8 GB	Quality / size middle
MLX 4bit-gs32	`Qwen2.5-7B-Instruct-MLX-4bit-gs32`	4.77 GB	~7 GB	4-bit, group size 32
MLX 4bit-gs64	`Qwen2.5-7B-Instruct-MLX-4bit-gs64`	4.3 GB	~6 GB	4-bit, group size 64 (mlx-lm default)
MLX 4bit-gs128	`Qwen2.5-7B-Instruct-MLX-4bit-gs128`	4.06 GB	~6 GB	4-bit, group size 128
MLX 3bit	`Qwen2.5-7B-Instruct-MLX-3bit`	3.34 GB	~5 GB	Smaller, expect quality drop
MLX 2bit	`Qwen2.5-7B-Instruct-MLX-2bit`	2.39 GB	~4 GB	Aggressive — verify on workload

Collection: Qwen2.5-7B-Instruct MLX ladder + group-size sweep

Use

pip install mlx-lm
mlx_lm.generate --model zaydiscold/Qwen2.5-7B-Instruct-MLX-8bit \
  --prompt "Explain quantum entanglement in one paragraph" --max-tokens 200

Conversion

python -m mlx_lm convert \
  --hf-path Qwen/Qwen2.5-7B-Instruct \
  --mlx-path ./Qwen2.5-7B-Instruct-MLX-8bit \
  -q --q-bits 8

Notes

GGUF Q4_K_M is llama.cpp; MLX has no literal Q4_K_M.
See the sibling repos for other bit budgets / group sizes.

Credits

Source: Qwen/Qwen2.5-7B-Instruct
MLX conversion: zaydiscold

Downloads last month: 14

Safetensors

Model size

8B params

Tensor type

BF16

U32

MLX

Hardware compatibility

8-bit

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for zaydiscold/Qwen2.5-7B-Instruct-MLX-8bit

Base model

Qwen/Qwen2.5-7B

Finetuned

Qwen/Qwen2.5-7B-Instruct

Quantized

(358)

this model

Collection including zaydiscold/Qwen2.5-7B-Instruct-MLX-8bit

Qwen2.5-7B-Instruct MLX ladder + group-size sweep

Collection

Complete MLX grid for Qwen2.5-7B-Instruct — full bit ladder (bf16/8/6/4/3/2-bit) + 4-bit group-size sweep at gs=32/64/128. • 8 items • Updated May 11