Spaces:
Sleeping
Sleeping
| title: SmolLM2 Customs ADI | |
| emoji: π€ | |
| colorFrom: indigo | |
| colorTo: blue | |
| sdk: docker | |
| pinned: true | |
| short_description: DEMO β Build your own free LLM service | |
| # SmolLM2 Customs β Build Your Own LLM Service | |
| > A showcase: how to build a free, private, OpenAI-compatible LLM service on HuggingFace Spaces and plug it into any hub or application β no GPU, no money, no drama. | |
| > [!IMPORTANT] | |
| > This project is under active development β always use the latest release from [Codey Lab](https://github.com/Codey-LAB/SmolLM2-customs) *(more stable builds land there first)*. | |
| > This repo ([DEV-STATUS](https://github.com/VolkanSah/SmolLM2-ADI)) is where the chaos happens. π¬ A β on the repos would be cool π | |
| --- | |
| ## What is this? | |
| A minimal but production-ready LLM service built on: | |
| - **SmolLM2-360M-Instruct** β 269MB, Apache 2.0, runs on 2 CPUs for free | |
| - **FastAPI** β OpenAI-compatible `/v1/chat/completions` endpoint | |
| - **ADI** (Anti-Dump Index) β filters low-quality requests before they hit the model | |
| - **HF Dataset** β logs every request for later analysis and finetuning | |
| The point is not the model β the point is the pattern. Fork it, swap SmolLM2 for any model you want, and you have your own private LLM API running for free. | |
| --- | |
| ## How it works | |
| ``` | |
| Request | |
| β | |
| ADI Score (is this request worth answering?) | |
| β | |
| REJECT β returns improvement suggestions, logs to dataset | |
| MEDIUM/HIGH β SmolLM2 answers, logs to dataset | |
| SmolLM2 fails β returns 503 β hub fallback chain kicks in | |
| ``` | |
| --- | |
| ## Endpoints | |
| ``` | |
| GET / β status | |
| GET /v1/health β health check | |
| POST /v1/chat/completions β OpenAI-compatible inference | |
| ``` | |
| --- | |
| ## Plug into any Hub (one config block) | |
| Works out of the box with [Multi-LLM-API-Gateway](https://github.com/VolkanSah/Multi-LLM-API-Gateway): Hub Screenshot for this [SmolLM2](SmolLM2.jpg) | |
| ```ini | |
| [LLM_PROVIDER.smollm] | |
| active = "true" | |
| base_url = "https://YOUR-USERNAME-smollm2-customs.hf.space/v1" | |
| env_key = "SMOLLM_API_KEY" | |
| default_model = "smollm2-360m" | |
| models = "smollm2-360m, YOUR-USERNAME/your-finetuned-model" | |
| fallback_to = "gemini" | |
| [LLM_PROVIDER.smollm_END] | |
| ``` | |
| Any OpenAI-compatible client works the same way. | |
| --- | |
| ## Secrets (HF Space Settings) | |
| | Secret | Required | Description | | |
| |--------|----------|-------------| | |
| | `SMOLLM_API_KEY` | recommended | Locks the endpoint β set same value in your hub | | |
| | `HF_TOKEN` or `TEST_TOKEN` | optional | HF auth for dataset + model repo access | | |
| | `MODEL_REPO` | optional | Base model override (default: `HuggingFaceTB/SmolLM2-360M-Instruct`) | | |
| | `DATASET_REPO` | optional | Your private HF dataset for logging | | |
| | `PRIVATE_MODEL_REPO` | optional | Your private model repo for finetuned weights | | |
| **Auth modes:** | |
| ``` | |
| SMOLLM_API_KEY not set β open access (demo/showcase mode) | |
| SMOLLM_API_KEY set β protected (production mode) | |
| Space private β double protection (HF gate + your key) | |
| ``` | |
| --- | |
| ## ADI Routing | |
| | Decision | Action | | |
| |----------|--------| | |
| | `HIGH_PRIORITY` | SmolLM2 handles it | | |
| | `MEDIUM_PRIORITY` | SmolLM2 handles it | | |
| | `REJECT` | Returns suggestions, logs to dataset | | |
| | SmolLM2 fails | 503 β hub fallback chain | | |
| --- | |
| ## Training Utilities | |
| Every request is logged to your private HF dataset. Use it to improve over time: | |
| ```bash | |
| python train.py --mode export # export dataset β JSONL | |
| python train.py --mode validate # validate ADI weights against labeled data | |
| python train.py --mode finetune # finetune SmolLM2 on your data (coming soon) | |
| ``` | |
| Once you have enough data β finetune β push to your private model repo β Space loads it automatically next restart. | |
| --- | |
| ## Stack | |
| | Component | What it does | | |
| |-----------|-------------| | |
| | `main.py` | FastAPI, auth, routing | | |
| | `smollm.py` | Inference engine, lazy loading | | |
| | `model.py` | HF token resolution, dataset + model repo access | | |
| | `adi.py` | Request quality scoring | | |
| | `train.py` | Dataset export, ADI validation, finetuning | | |
| --- | |
| ## Part of | |
| - [Multi-LLM-API-Gateway](https://github.com/VolkanSah/Multi-LLM-API-Gateway) β the hub this was built for | |
| - [Anti-Dump-Index](https://github.com/VolkanSah/Anti-Dump-Index) β the ADI algorithm idea | |
| ## License | |
| Dual-licensed: | |
| - [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0) | |
| - [Ethical Security Operations License v1.1 (ESOL)](ESOL) β mandatory, non-severable | |
| By using this software you agree to all ethical constraints defined in ESOL v1.1. |