Irfanuruchi committed
Commit 5ed6e1c · verified · 1 Parent(s): f6c0b20

Update README.md

Files changed (1): README.md (+33 -0)
README.md CHANGED
@@ -6,6 +6,11 @@ tags:
  - trl
  - sft
  - mlx
+ - apple-silicon
+ - on-device
+ - tiny-llm
+ - smollm
+ - quantized
  datasets:
  - Magpie-Align/Magpie-Pro-300K-Filtered
  - bigcode/self-oss-instruct-sc2-exec-filter-50k
@@ -16,3 +21,31 @@ language:
  - en
  pipeline_tag: text-generation
  ---
+
+ # SmolLM-1.7B-Instruct (MLX 4-bit)
+
+ A **4-bit MLX quantized** build of `HuggingFaceTB/SmolLM-1.7B-Instruct`, optimized for local inference on Apple Silicon.
+
+ ## Benchmark Environment
+ - Device: MacBook Pro (M3 Pro)
+ - Runtime: MLX
+ - Quantization: ~4.5 bits per weight
+
+ ## Performance (Measured)
+ - Disk size: ~922 MB
+ - Peak memory: ~1.08 GB
+ - Generation speed: ~110 tokens/sec
+
+ > Benchmarks were collected on macOS (M3 Pro).
+ > Performance on iPhone / iPad will vary based on hardware and available memory.
+
+ ## Usage
+ ```bash
+ mlx_lm.generate \
+   --model Irfanuruchi/SmolLM-1.7B-Instruct-MLX-4bit \
+   --prompt "In 5 sentences, explain the Pomodoro technique and how to start today." \
+   --max-tokens 140
+ ```
+
+ ## License
+ Upstream SmolLM is released under **Apache-2.0**. Preserve attribution and the original license terms.
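The `mlx_lm.generate` CLI call in the Usage section above can also be driven from Python via the `mlx-lm` package's `load`/`generate` API. A minimal sketch, assuming `mlx-lm` is installed (`pip install mlx-lm`) and running on Apple Silicon:

```python
# Python equivalent of the CLI invocation in the Usage section.
# Assumes the mlx-lm package (Apple Silicon only); the model is
# downloaded from the Hugging Face Hub on first use (~922 MB).
from mlx_lm import load, generate

model, tokenizer = load("Irfanuruchi/SmolLM-1.7B-Instruct-MLX-4bit")

prompt = "In 5 sentences, explain the Pomodoro technique and how to start today."
text = generate(model, tokenizer, prompt=prompt, max_tokens=140)
print(text)
```

This mirrors the CLI flags one-to-one: `--model` maps to the `load()` argument and `--max-tokens` to `max_tokens`.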