4bit-MAD

Mixed-precision Alternating-least-squares DWQ export for the local Gemma 4 E4B instruction model.

This is a full standalone MLX model export, not a scale/bias-only checkpoint. It was produced from the dynamic 4b128/6b64 mixed quant (4-bit with group size 128 and 6-bit with group size 64) at approximately 4.5 bits per weight, initialized with 30 iterations of alternating rounding and least-squares scale fitting, then trained with the n1024 cosine DWQ recipe.
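The alternating round + least-squares initialization can be sketched as follows. This is a minimal per-group illustration of the technique, not the actual export code: it alternates between rounding the weights against the current scale and solving the closed-form least-squares scale that minimizes the reconstruction error for the current integer codes.

```python
import numpy as np

def als_quantize_group(w, bits=4, iters=30):
    """Alternating round / least-squares fit of a uniform symmetric scale
    for one weight group. Illustrative sketch only."""
    qmax = 2 ** (bits - 1) - 1            # e.g. codes in [-8, 7] at 4 bits
    s = max(np.max(np.abs(w)) / qmax, 1e-8)  # absmax initial scale
    q = np.zeros_like(w)
    for _ in range(iters):
        # Round step: quantize against the current scale.
        q = np.clip(np.round(w / s), -qmax - 1, qmax)
        # Least-squares step: scale minimizing ||w - s * q||^2 in closed form.
        denom = np.dot(q, q)
        if denom == 0:
            break
        s = np.dot(w, q) / denom
    return q, s
```

Each iteration is non-increasing in reconstruction error, so the result is at least as good as plain absmax rounding before the DWQ training stage refines it further.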

Canonical old-heldout KL against /Users/natebreslow/Documents/gemmaOpt/gemma-4-E4B-it:

avg_kl=0.095879
total_kl=514.967500
total_tokens=5371

The eval log is included as oldheldout_kl.log.
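The reported metrics relate as avg_kl = total_kl / total_tokens (514.9675 / 5371 ≈ 0.095879). A minimal sketch of how a token-averaged KL(teacher ‖ student) can be computed from raw logits; this is an illustration of the metric, not the eval harness that produced the log:

```python
import numpy as np

def total_token_kl(teacher_logits, student_logits):
    """Sum of per-token KL(teacher || student) over a [tokens, vocab] batch
    of logits, plus the token count. Illustrative sketch only."""
    def log_softmax(x):
        x = x - x.max(axis=-1, keepdims=True)
        return x - np.log(np.exp(x).sum(axis=-1, keepdims=True))
    lt = log_softmax(np.asarray(teacher_logits, dtype=np.float64))
    ls = log_softmax(np.asarray(student_logits, dtype=np.float64))
    p = np.exp(lt)
    kl_per_token = (p * (lt - ls)).sum(axis=-1)  # one KL value per token
    return kl_per_token.sum(), kl_per_token.shape[0]

# avg_kl = total_kl / total_tokens
```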
