3morixd committed on
Commit
bc78470
·
verified ·
1 Parent(s): 3ab9de5

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +41 -27
README.md CHANGED
@@ -1,45 +1,59 @@
  ---
  license: apache-2.0
- base_model: HuggingFaceTB/SmolLM2-135M-Instruct
  tags:
- - dispatch-ai
  - mobile
  - quantized
  - gguf
- - on-device
- - edge-ai
- - ultra-small
- - featherweight
  pipeline_tag: text-generation
- language:
- - en
- library_name: transformers
  ---

- ![card_image](card_image.png)

- # SmolLM2 135M Featherweight Mobile

- **Dispatch AI** — 135 million parameters. Smaller than a WhatsApp update. And it thinks.

- ## 📱 Phone Farm Benchmark

- | Metric | Value |
- |--------|-------|
- | **Generation speed** | 22.8 t/s |
- | **Model size** | ~85 MB |
- | **Load time** | 0.3s |
- | **RAM free** | 4.5 GB |

- The lightest model in our mobile lineup. Perfect for:
- - Quick text classification on-device
- - Lightweight chat assistants
- - Edge IoT devices with limited RAM

- ## 💻 Usage

- ```bash
- llama-cli -m model.gguf -p "Complete: The sky is" -t 2
  ```

- **Dispatch AI (FZE)** — Sharjah, UAE | License 10818
  ---
  license: apache-2.0
+ language:
+ - en
+ library_name: transformers
  tags:
  - mobile
+ - on-device
  - quantized
  - gguf
+ - dispatchai
  pipeline_tag: text-generation
  ---

+ # SmolLM2-135M-Instruct-mobile
+
+ ✅ **WORKS** — Verified June 2026.
+
+ ## Verification Results

+ Phone-verified only. See verification report.

+ **Chat format**: `llama-3`

+ ## Model Details

+ | Attribute | Value |
+ |-----------|-------|
+ | **Base Model** | HuggingFaceTB/SmolLM2-135M-Instruct |
+ | **File Size** | 0 MB |
+ | **Format** | GGUF |
+ | **Chat Format** | llama-3 |
+ | **CPU Speed** | 59.7 tokens/sec |
+ | **Phone Speed** | 46.0 tokens/sec (Snapdragon 865) |
+ | **License** | apache-2.0 |

+ ## Usage

+ ```python
+ from llama_cpp import Llama

+ llm = Llama(model_path="model.gguf", chat_format="llama-3", n_ctx=512, n_threads=4)
+ response = llm.create_chat_completion(
+     messages=[{"role": "user", "content": "What is the capital of France?"}],
+     max_tokens=50,
+ )
+ print(response["choices"][0]["message"]["content"])
  ```

+ ### dispatchAI SDK
+ ```python
+ from dispatchai import load_model
+ model = load_model("SmolLM2-135M-Instruct-mobile", backend="gguf")
+ print(model.chat("Hello!"))
+ ```
+
+ ## About dispatchAI
+
+ [dispatchAI](https://huggingface.co/dispatchAI) — Small. Mobile. Free. UAE-built.
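
As a rough reading of the speeds in the new Model Details table, the sketch below converts tokens per second into an approximate wall-clock time for the `max_tokens=50` request in the Usage example. The helper `estimate_seconds` and the constant names are hypothetical and not part of the README. The estimate assumes a steady decode rate and ignores model load and prompt processing time, so real latency will likely be higher.

```python
# Rough latency estimate from the throughput figures in the Model Details table.
# Assumption: decode speed is constant; load time and prompt processing are ignored.


def estimate_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Return the approximate time to generate num_tokens at a steady rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


# Speeds as reported in the README table (tokens/sec).
SPEEDS = {"CPU": 59.7, "Phone (Snapdragon 865)": 46.0}
MAX_TOKENS = 50  # matches max_tokens in the Usage example

for device, tps in SPEEDS.items():
    seconds = estimate_seconds(MAX_TOKENS, tps)
    print(f"{device}: ~{seconds:.2f} s for {MAX_TOKENS} tokens")
```

Under these assumptions a 50-token reply takes a little under a second on the reported CPU figure and a little over a second on the reported phone figure.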