🖤 Dirty-Calla-4B — MLX builds for Apple Silicon

Dirty-Calla-4B-mlx provides Apple Silicon–optimized versions of Daizee/Dirty-Calla-4B, a fine-tuned Gemma 3 (4B) model developed by Daizee for expressive, humanlike, and emotionally textured responses.

This conversion uses Apple’s MLX framework for local inference on M1, M2, and M3 Macs.
Each variant trades memory footprint and speed against precision, so you can choose what fits your workflow.

🧩 Note on vocab padding:
The tokenizer and embedding matrix were padded up to the next multiple of 64, giving 262,208 entries in total.
The added tokens are labeled <pad_ex_*>; they will not appear in normal generations.
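
The padding arithmetic can be sketched as below. This is a minimal illustration, not the conversion script that was actually used: the pre-padding vocabulary size (262,145) is a hypothetical value chosen for the example, and the numeric suffix on the `<pad_ex_*>` names is an assumption about how the placeholders are numbered.

```python
def pad_to_multiple(size: int, multiple: int = 64) -> int:
    """Round size up to the nearest multiple (unchanged if already aligned)."""
    return ((size + multiple - 1) // multiple) * multiple


# Hypothetical pre-padding vocabulary size, used for illustration only.
original_vocab = 262_145

padded_vocab = pad_to_multiple(original_vocab, 64)
extra_tokens = padded_vocab - original_vocab

# Placeholder names follow the <pad_ex_*> pattern; the index suffix is assumed.
pad_names = [f"<pad_ex_{i}>" for i in range(extra_tokens)]

print(padded_vocab, extra_tokens, pad_names[:2])
```

Under that assumed starting size, the padded total matches the 262,208 entries stated above.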

⚙️ Variants

Folder	Bits	Group Size	Description
`mlx/g128/`	int4	128	Smallest & fastest (lightest memory use)
`mlx/g64/`	int4	64	Balanced: slightly slower, more stable
`mlx/int8/`	int8	—	Closest to fp16 precision, best coherence
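
To make the size trade-off in the table concrete, the rough estimate below computes effective bits per weight for each variant. It is a back-of-the-envelope sketch under stated assumptions: each quantization group stores one fp16 scale and one fp16 bias (32 extra bits per group), "4B" is taken as 4e9 quantized weights, and the int8 variant is assumed to use a group size of 64 because the table does not list one. Real file sizes will differ, since some layers may be left unquantized.

```python
def bits_per_weight(bits: int, group_size: int, meta_bits: int = 32) -> float:
    """Quantized bits plus per-group metadata (assumed fp16 scale + fp16 bias)."""
    return bits + meta_bits / group_size


def approx_size_gb(n_params: float, bits: int, group_size: int) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight(bits, group_size) / 8 / 1e9


N_PARAMS = 4e9  # assumption: "4B" treated as 4e9 quantized weights

variants = {
    "mlx/g128/": (4, 128),
    "mlx/g64/": (4, 64),
    "mlx/int8/": (8, 64),  # assumption: group size not listed in the table
}

for folder, (bits, group) in variants.items():
    bpw = bits_per_weight(bits, group)
    size = approx_size_gb(N_PARAMS, bits, group)
    print(f"{folder:<10} {bpw:.2f} bits/weight  ~{size:.2f} GB")
```

A smaller group size spends more bits on scales and biases, which is why `g64` is slightly larger than `g128` at the same bit width.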

🚀 Quickstart

Run directly from Hugging Face

python -m mlx_lm.generate \
  --model hf://Daizee/Dirty-Calla-4B-mlx/mlx/g64 \
  --prompt "Describe a rainy city from the perspective of a poet." \
  --max-tokens 150 --temp 0.4
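
The same run can also be driven from Python. The sketch below assumes the repository layout from the variants table (`mlx/g128`, `mlx/g64`, `mlx/int8`) and that the `huggingface_hub` and `mlx_lm` packages are installed; `variant_subdir` and `run_variant` are hypothetical helper names introduced here. It downloads only the chosen subfolder and then loads it from the local path. Temperature is left at the library default, because the way sampling options are passed differs between `mlx_lm` versions.

```python
import os

REPO_ID = "Daizee/Dirty-Calla-4B-mlx"
VARIANTS = {"g128", "g64", "int8"}  # folder names taken from the variants table


def variant_subdir(variant: str) -> str:
    """Map a variant name to its subfolder inside the repository."""
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant {variant!r}; expected one of {sorted(VARIANTS)}")
    return f"mlx/{variant}"


def run_variant(variant: str, prompt: str, max_tokens: int = 150) -> str:
    """Download one variant and generate a completion with it."""
    # Imported lazily so variant_subdir works without these packages installed.
    from huggingface_hub import snapshot_download
    from mlx_lm import load, generate

    subdir = variant_subdir(variant)
    # Fetch only the files under the chosen variant folder.
    root = snapshot_download(repo_id=REPO_ID, allow_patterns=[f"{subdir}/*"])
    model, tokenizer = load(os.path.join(root, *subdir.split("/")))
    return generate(model, tokenizer, prompt=prompt, max_tokens=max_tokens)
```

For example, `run_variant("g64", "Describe a rainy city from the perspective of a poet.")` would mirror the command above, minus the temperature setting.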
