---
tags:
- mlx
- apple-silicon
- text-generation
- gemma3
- instruction-tuned
library_name: mlx-lm
pipeline_tag: text-generation
base_model: Daizee/Dirty-Calla-4B
license: apache-2.0
---
# 🖤 Dirty-Calla-4B — **MLX** builds for Apple Silicon
**Dirty-Calla-4B-mlx** provides Apple Silicon–optimized versions of **Daizee/Dirty-Calla-4B**, a fine-tuned **Gemma 3 (4B)** model developed by **Daizee** for expressive, humanlike, and emotionally textured responses.
This conversion uses Apple’s **MLX** framework for local inference on **M1, M2, and M3 Macs**.
Each variant trades size for speed or precision, so you can choose what fits your workflow.
> 🧩 **Note on vocab padding:**
> The tokenizer and embedding matrix were padded to the next multiple of 64 tokens (262,208 total).
> Added tokens are labeled `<pad_ex_*>` — they will not appear in normal generations.
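For reference, the padding simply rounds the vocabulary up to the next multiple of 64. A minimal illustrative sketch (the helper name is ours, not part of the released tooling):

```python
# Illustrative sketch: round a vocabulary size up to the next multiple of 64.
# The padded total shipped with this model is 262,208 tokens (262,208 % 64 == 0).
def pad_to_multiple(vocab_size: int, multiple: int = 64) -> int:
    return ((vocab_size + multiple - 1) // multiple) * multiple
```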
---
## ⚙️ Variants
| Folder | Bits | Group Size | Description |
|----------------|------|------------|--------------|
| `mlx/g128/` | int4 | 128 | Smallest & fastest (lightest memory use) |
| `mlx/g64/` | int4 | 64 | Balanced: slightly slower, more stable |
| `mlx/int8/` | int8 | — | Closest to fp16 precision, best coherence |
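To pull just one variant locally, a sketch using `huggingface_hub` (the folder pattern matches the table above; adjust it to the build you want):

```python
from huggingface_hub import snapshot_download

# Sketch: download only the int4, group-size-64 build from the repo.
local_dir = snapshot_download(
    repo_id="Daizee/Dirty-Calla-4B-mlx",
    allow_patterns=["mlx/g64/*"],
)
print(local_dir)  # root of the snapshot; the weights sit under mlx/g64/
```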
---
## 🚀 Quickstart
### Run directly from Hugging Face
```bash
python -m mlx_lm.generate \
  --model hf://Daizee/Dirty-Calla-4B-mlx/mlx/g64 \
  --prompt "Describe a rainy city from the perspective of a poet." \
  --max-tokens 150 --temp 0.4
```
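### Run from Python with the `mlx_lm` API
A minimal sketch of the same call from Python. The local path below is a placeholder: it assumes you have already downloaded a variant folder (for example with the `snapshot_download` snippet in the Variants section); depending on your `mlx-lm` version you may also be able to pass the Hugging Face repo directly.
```python
from mlx_lm import load, generate

# Sketch: load a locally downloaded variant (path is a placeholder).
model, tokenizer = load("path/to/Dirty-Calla-4B-mlx/mlx/g64")

prompt = "Describe a rainy city from the perspective of a poet."
if tokenizer.chat_template is not None:
    # Instruction-tuned Gemma models expect their chat template.
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

text = generate(model, tokenizer, prompt=prompt, max_tokens=150)
print(text)
```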