Irfanuruchi committed · Commit 94fdc1f · verified · 1 Parent(s): f8c00c9
Files changed (1): README.md +34 -0
README.md CHANGED
@@ -6,6 +6,11 @@ tags:
  - trl
  - sft
  - mlx
+ - apple-silicon
+ - on-device
+ - tiny-llm
+ - smollm
+ - quantized
  datasets:
  - Magpie-Align/Magpie-Pro-300K-Filtered
  - bigcode/self-oss-instruct-sc2-exec-filter-50k
@@ -16,3 +21,32 @@ language:
  - en
  pipeline_tag: text-generation
  ---
+
+ # SmolLM-360M-Instruct (MLX 3-bit)
+
+ A **3-bit MLX quantized** build of `HuggingFaceTB/SmolLM-360M-Instruct` for ultra-low memory usage on Apple Silicon.
+
+ ## Benchmark Environment
+ - Device: MacBook Pro (M3 Pro)
+ - Runtime: MLX
+ - Quantization: ~3.5 effective bits per weight
+
+ ## Tiny Footprint (Measured)
+ - Disk size: ~155 MB
+ - Peak memory: ~0.20 GB
+ - Generation speed: ~458 tokens/sec (short generation)
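The ~3.5 effective bits per weight and the ~155 MB disk size are mutually consistent. A back-of-the-envelope sketch (assuming grouped quantization with MLX's default group size of 64 and an fp16 scale and bias per group — a detail not stated in this card):

```python
# 3-bit grouped quantization: each group of 64 weights also stores
# one fp16 scale and one fp16 bias, adding 2*16/64 = 0.5 bits/weight.
bits = 3
group_size = 64
effective_bpw = bits + 2 * 16 / group_size  # 3.5 effective bits per weight

# Disk estimate for ~360M parameters at that density.
params = 360e6
size_mb = params * effective_bpw / 8 / 1e6  # 157.5 MB, near the ~155 MB measured
print(effective_bpw, size_mb)
```

The small gap to the measured ~155 MB is plausible given that embeddings and norms may be stored at different precision.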
+
+ > These numbers were measured on macOS (M3 Pro).
+ > This is an **extreme compression** build and may reduce output quality vs 4/5-bit.
+
+ ## Usage
+ ```bash
+ mlx_lm.generate \
+   --model Irfanuruchi/SmolLM-360M-Instruct-MLX-3bit \
+   --prompt "Reply with exactly 3 bullet points, 4-8 words each: what can you do offline?" \
+   --max-tokens 80
+ ```
+
+ ## License
+ Upstream SmolLM is released under **Apache-2.0**. Preserve attribution and the original license terms.
+