Alibrown commited on
Commit
f10fa78
Β·
verified Β·
1 Parent(s): 6ccdae5

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +141 -5
README.md CHANGED
@@ -1,11 +1,147 @@
1
  ---
2
  title: SmolLM2 Customs
3
- emoji: πŸ“‰
4
- colorFrom: blue
5
  colorTo: blue
6
  sdk: docker
7
- pinned: false
8
- short_description: Lightwight LLM on CPU to use as a assi
9
  ---
10
 
11
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
  ---
2
  title: SmolLM2 Customs
3
+ emoji: πŸ€–
4
+ colorFrom: indigo
5
  colorTo: blue
6
  sdk: docker
7
+ pinned: true
8
+ short_description: Showcase β€” Build your own free LLM service
9
  ---
10
 
11
+ # SmolLM2 Customs β€” Build Your Own LLM Service
12
+
13
+ > A showcase: how to build a free, private, OpenAI-compatible LLM service on HuggingFace Spaces and plug it into any hub or application β€” no GPU, no money, no drama.
14
+
15
+ > [!IMPORTANT]
16
+ > This project is under active development β€” always use the latest release from [Codey Lab](https://github.com/Codey-LAB/SmolLM2-customs) *(more stable builds land there first)*.
17
+ > This repo ([DEV-STATUS](https://github.com/VolkanSah/SmolLM2-custom)) is where the chaos happens. πŸ”¬ A ⭐ on the repos would be cool πŸ˜™
18
+
19
+ ---
20
+
21
+ ## What is this?
22
+
23
+ A minimal but production-ready LLM service built on:
24
+
25
+ - **SmolLM2-360M-Instruct** β€” 269MB, Apache 2.0, runs on 2 CPUs for free
26
+ - **FastAPI** β€” OpenAI-compatible `/v1/chat/completions` endpoint
27
+ - **ADI** (Anti-Dump Index) β€” filters low-quality requests before they hit the model
28
+ - **HF Dataset** β€” logs every request for later analysis and finetuning
29
+
30
+ The point is not the model β€” the point is the pattern. Fork it, swap SmolLM2 for any model you want, and you have your own private LLM API running for free.
31
+
32
+ ---
33
+
34
+ ## How it works
35
+
36
+ ```
37
+ Request
38
+ ↓
39
+ ADI Score (is this request worth answering?)
40
+ ↓
41
+ REJECT β†’ returns improvement suggestions, logs to dataset
42
+ MEDIUM/HIGH β†’ SmolLM2 answers, logs to dataset
43
+ SmolLM2 fails β†’ returns 503 β†’ hub fallback chain kicks in
44
+ ```
45
+
46
+ ---
47
+
48
+ ## Endpoints
49
+
50
+ ```
51
+ GET / β†’ status
52
+ GET /v1/health β†’ health check
53
+ POST /v1/chat/completions β†’ OpenAI-compatible inference
54
+ ```
55
+
56
+ ---
57
+
58
+ ## Plug into any Hub (one config block)
59
+
60
+ Works out of the box with [Multi-LLM-API-Gateway](https://github.com/VolkanSah/Multi-LLM-API-Gateway): Hub Screenshot for this [SmolLM2](SmolLM2.jpg)
61
+
62
+ ```ini
63
+ [LLM_PROVIDER.smollm]
64
+ active = "true"
65
+ base_url = "https://YOUR-USERNAME-smollm2-customs.hf.space/v1"
66
+ env_key = "SMOLLM_API_KEY"
67
+ default_model = "smollm2-360m"
68
+ models = "smollm2-360m, YOUR-USERNAME/your-finetuned-model"
69
+ fallback_to = "gemini"
70
+ [LLM_PROVIDER.smollm_END]
71
+ ```
72
+
73
+ Any OpenAI-compatible client works the same way.
74
+
75
+
76
+ ---
77
+
78
+ ## Secrets (HF Space Settings)
79
+
80
+ | Secret | Required | Description |
81
+ |--------|----------|-------------|
82
+ | `SMOLLM_API_KEY` | recommended | Locks the endpoint β€” set same value in your hub |
83
+ | `HF_TOKEN` or `TEST_TOKEN` | optional | HF auth for dataset + model repo access |
84
+ | `MODEL_REPO` | optional | Base model override (default: `HuggingFaceTB/SmolLM2-360M-Instruct`) |
85
+ | `DATASET_REPO` | optional | Your private HF dataset for logging |
86
+ | `PRIVATE_MODEL_REPO` | optional | Your private model repo for finetuned weights |
87
+
88
+ **Auth modes:**
89
+ ```
90
+ SMOLLM_API_KEY not set β†’ open access (demo/showcase mode)
91
+ SMOLLM_API_KEY set β†’ protected (production mode)
92
+ Space private β†’ double protection (HF gate + your key)
93
+ ```
94
+
95
+ ---
96
+
97
+ ## ADI Routing
98
+
99
+ | Decision | Action |
100
+ |----------|--------|
101
+ | `HIGH_PRIORITY` | SmolLM2 handles it |
102
+ | `MEDIUM_PRIORITY` | SmolLM2 handles it |
103
+ | `REJECT` | Returns suggestions, logs to dataset |
104
+ | SmolLM2 fails | 503 β†’ hub fallback chain |
105
+
106
+ ---
107
+
108
+ ## Training Utilities
109
+
110
+ Every request is logged to your private HF dataset. Use it to improve over time:
111
+
112
+ ```bash
113
+ python train.py --mode export # export dataset β†’ JSONL
114
+ python train.py --mode validate # validate ADI weights against labeled data
115
+ python train.py --mode finetune # finetune SmolLM2 on your data (coming soon)
116
+ ```
117
+
118
+ Once you have enough data β†’ finetune β†’ push to your private model repo β†’ Space loads it automatically next restart.
119
+
120
+ ---
121
+
122
+ ## Stack
123
+
124
+ | Component | What it does |
125
+ |-----------|-------------|
126
+ | `main.py` | FastAPI, auth, routing |
127
+ | `smollm.py` | Inference engine, lazy loading |
128
+ | `model.py` | HF token resolution, dataset + model repo access |
129
+ | `adi.py` | Request quality scoring |
130
+ | `train.py` | Dataset export, ADI validation, finetuning |
131
+
132
+ ---
133
+
134
+ ## Part of
135
+
136
+ - [Multi-LLM-API-Gateway](https://github.com/VolkanSah/Multi-LLM-API-Gateway) β€” the hub this was built for
137
+ - [Anti-Dump-Index](https://github.com/VolkanSah/Anti-Dump-Index) β€” the ADI algorithm idea
138
+
139
+
140
+ ## License
141
+
142
+ Dual-licensed:
143
+
144
+ - [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0)
145
+ - [Ethical Security Operations License v1.1 (ESOL)](ESOL) β€” mandatory, non-severable
146
+
147
+ By using this software you agree to all ethical constraints defined in ESOL v1.1.