---
tags:
- trl
- sft
- mlx
- apple-silicon
- on-device
- tiny-llm
- smollm
- quantized
datasets:
- Magpie-Align/Magpie-Pro-300K-Filtered
- bigcode/self-oss-instruct-sc2-exec-filter-50k
language:
- en
pipeline_tag: text-generation
---

# SmolLM-360M-Instruct (MLX 3-bit)

A **3-bit MLX quantized** build of `HuggingFaceTB/SmolLM-360M-Instruct` for ultra-low memory usage on Apple Silicon.

## Benchmark Environment

- Device: MacBook Pro (M3 Pro)
- Runtime: MLX
- Quantization: ~3.5 effective bits per weight (3-bit weights plus per-group scales and biases)

## Tiny Footprint (Measured)

- Disk size: ~155 MB
- Peak memory: ~0.20 GB
- Generation speed: ~458 tokens/sec (short generations)

> These numbers were measured on macOS (M3 Pro).
> This is an **extreme compression** build and may reduce output quality versus the 4-bit and 5-bit builds.

## Usage

```bash
mlx_lm.generate \
  --model Irfanuruchi/SmolLM-360M-Instruct-MLX-3bit \
  --prompt "Reply with exactly 3 bullet points, 4-8 words each: what can you do offline?" \
  --max-tokens 80
```

## License

Upstream SmolLM is released under **Apache-2.0**. Preserve attribution and the original license terms.