# gemma-4-31b-it-abliterated-4bit-mlx
A 4-bit MLX quantization of null-space/gemma-4-31b-it-abliterated, tuned for fast on-device inference on Apple Silicon.
- Base model: null-space/gemma-4-31b-it-abliterated (BF16, ~62 GB)
- Quantization: 4-bit affine, group size 64
- Format: MLX safetensors
- Footprint: ~16 GB on disk, runs comfortably on a 32 GB Mac and flies on 64 GB+
- Throughput: ~15 tok/s on an M4 Max (measured with mlx-lm 0.21+)
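The ~16 GB figure is consistent with back-of-the-envelope arithmetic for group-size-64 affine quantization. A rough sketch (the per-group FP16 scale and bias, i.e. 0.5 extra bits per weight, is an assumption about the storage layout; real MLX layouts may differ slightly):

```python
# Rough on-disk footprint for 4-bit affine quantization, group size 64.
# Assumed layout: each group of 64 weights stores one FP16 scale and one
# FP16 bias, i.e. 2 * 16 bits per 64 weights = 0.5 extra bits per weight.
params = 31e9                       # ~31B parameters
bits_per_weight = 4 + 2 * 16 / 64   # 4-bit values + per-group scale/bias
size_gib = params * bits_per_weight / 8 / 2**30
print(f"{size_gib:.1f} GiB")        # ~16.2 GiB, matching the ~16 GB on disk
```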
## Why this exists
The default model in nicedreamzapp/claude-code-local used to point at an `mlx-community/...` repo that never actually existed. This is a real, working 4-bit quant, usable as a drop-in replacement.
## Usage
```python
from mlx_lm import load, generate

model, tokenizer = load("divinetribe/gemma-4-31b-it-abliterated-4bit-mlx")
print(generate(model, tokenizer, prompt="Hello", max_tokens=200))
```
Or via the launcher / setup script in claude-code-local:

```bash
MLX_MODEL=divinetribe/gemma-4-31b-it-abliterated-4bit-mlx \
bash scripts/start-mlx-server.sh
```
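Assuming `start-mlx-server.sh` wraps `mlx_lm.server`, which exposes an OpenAI-compatible API (the port and endpoint below are assumptions about that setup, not guarantees from the script), a minimal client looks roughly like:

```python
import json
import urllib.request

def chat_payload(prompt,
                 model="divinetribe/gemma-4-31b-it-abliterated-4bit-mlx",
                 max_tokens=200):
    """Build an OpenAI-style chat-completions request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def ask(prompt, url="http://localhost:8080/v1/chat/completions"):
    # Port 8080 is mlx_lm.server's default; adjust if the launcher overrides it.
    req = urllib.request.Request(
        url,
        data=json.dumps(chat_payload(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```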
## Abliteration
Refusals are removed via refusal-direction projection per Arditi et al. (2024). Use responsibly: you are now the moderator.
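The technique orthogonalizes weight matrices against an extracted "refusal direction" so the model can no longer write activations along it. A toy NumPy sketch of that projection, with random data standing in for real weights and the real extracted direction:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
W = rng.normal(size=(d, d))   # stand-in for a real output-projection matrix
r = rng.normal(size=d)
r /= np.linalg.norm(r)        # unit-norm "refusal direction" (stand-in)

# Remove the component along r from every column of W:
W_abl = W - np.outer(r, r) @ W

# W_abl can no longer produce output along r: r @ W_abl is ~0.
print(np.abs(r @ W_abl).max())
```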
## Model lineage

- google/gemma-4-31B-it (base)
- null-space/gemma-4-31b-it-abliterated (abliterated finetune, quantized here)