---
tags:
- trl
- sft
- mlx
- apple-silicon
- on-device
- tiny-llm
- smollm
- quantized
datasets:
- Magpie-Align/Magpie-Pro-300K-Filtered
- bigcode/self-oss-instruct-sc2-exec-filter-50k
language:
- en
pipeline_tag: text-generation
---

# SmolLM-1.7B-Instruct (MLX 4-bit)

A **4-bit MLX quantized** build of `HuggingFaceTB/SmolLM-1.7B-Instruct`, optimized for local inference on Apple Silicon.

## Benchmark Environment

- Device: MacBook Pro (M3 Pro)
- Runtime: MLX
- Quantization: ~4.5 bits per weight

## Performance (Measured)

- Disk size: ~922 MB
- Peak memory: ~1.08 GB
- Generation speed: ~110 tokens/sec

> Benchmarks were collected on macOS (M3 Pro).
> Performance on iPhone / iPad will vary based on hardware and available memory.
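
The disk figure can be sanity-checked from the parameter count. A back-of-envelope sketch, assuming ~1.7B weights and that the ~4.5 effective bits/weight already include quantization scales:

```python
# Estimate on-disk size for a ~1.7B-parameter model at ~4.5 bits/weight.
params = 1.7e9          # approximate parameter count of SmolLM-1.7B
bits_per_weight = 4.5   # effective bits/weight, incl. quantization metadata
size_mb = params * bits_per_weight / 8 / 1e6  # bits -> bytes -> MB
print(f"~{size_mb:.0f} MB")  # ~956 MB, in the same ballpark as the measured ~922 MB
```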

## Usage

```bash
mlx_lm.generate \
  --model Irfanuruchi/SmolLM-1.7B-Instruct-MLX-4bit \
  --prompt "In 5 sentences, explain the Pomodoro technique and how to start today." \
  --max-tokens 140
```
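
The model can also be driven from Python via the `mlx-lm` package's `load`/`generate` API; a minimal sketch, assuming `pip install mlx-lm` on an Apple Silicon Mac:

```python
from mlx_lm import load, generate

# Downloads the quantized weights on first use, then loads from the local cache.
model, tokenizer = load("Irfanuruchi/SmolLM-1.7B-Instruct-MLX-4bit")

prompt = "In 5 sentences, explain the Pomodoro technique and how to start today."
text = generate(model, tokenizer, prompt=prompt, max_tokens=140)
print(text)
```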

## License

Upstream SmolLM is released under **Apache-2.0**. Preserve attribution and the original license terms.