Gemma-4-31B-it-heretic MLX 4-Bit

This is a selectively quantized 4-bit MLX version of coder3101/gemma-4-31B-it-heretic.

Why this variant?

Other "4-bit" heretic conversions currently circulating were poorly vibe-coded, producing bloated 28.5 GB files that work out to roughly 7.3 bits per parameter. This version uses selective quantization to achieve a true 4-bit footprint of about 17 GB, substantially reducing memory overhead on Apple Silicon with no practical quality degradation.
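The figures above are easy to verify: effective bits per parameter is just the file size in bits divided by the parameter count. A minimal sketch, assuming a 31B-parameter model and decimal gigabytes (1 GB = 1e9 bytes):

```python
# Effective bits per parameter = (file size in bits) / (parameter count).
# Assumes 31e9 parameters; exact counts vary slightly per checkpoint.
def effective_bits(file_size_gb: float, n_params: float = 31e9) -> float:
    return file_size_gb * 1e9 * 8 / n_params

print(f"{effective_bits(28.5):.2f}")  # bloated conversion: ~7.35 bits/param
print(f"{effective_bits(17.0):.2f}")  # this model: ~4.39 bits/param
```

The 17 GB file comes out slightly above 4.0 bits per parameter because selective quantization leaves a few sensitive tensors (such as embeddings) at higher precision.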

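To run the model on Apple Silicon, the standard mlx-lm tooling should work out of the box (a sketch of a typical invocation; flags follow the current mlx-lm CLI):

```shell
pip install mlx-lm
python -m mlx_lm.generate \
  --model IvanSmit05/gemma-4-31B-it-heretic-MLX-4bit \
  --prompt "Hello"
```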