dispatchAI/dispatchAI-SDK

3morixd committed about 5 hours ago

Commit 2b9cf4a (verified) · 1 Parent(s): 46e9ad9

Upload folder using huggingface_hub

Files changed (1)

README.md +44 -40

@@ -2,12 +2,12 @@
 **Small. Mobile. Free. UAE-built.**
-`pip install dispatchai` — Run mobile-optimized LLMs on your phone, edge device, or laptop. 39 models, all tested on real Snapdragon hardware, all free.
 ## Quick Start
 ```bash
-pip install dispatchai
 ```
 ### Chat with a model
@@ -15,18 +15,43 @@ pip install dispatchai
 ```python
 from dispatchai import load_model
-model = load_model("SmolLM2-135M-Instruct-mobile")
 response = model.chat("What is the capital of France?")
 print(response)
 ```
-### Use GGUF/llama.cpp backend
 ```python
-model = load_model("Llama-3.2-1B-Instruct-Q4-mobile", backend="gguf")
-print(model.chat("Write a haiku about the desert."))
 ```
 ### Find the best model for your phone
 ```python
@@ -34,8 +59,6 @@ from dispatchai import recommend
 rec = recommend(ram_mb=2048, task="chat")
 print(f"Best model: {rec['recommended']['name']}")
-print(f"Size: {rec['recommended']['size_mb']}MB")
-print(f"Speed: {rec['recommended']['speed_tps']} tokens/sec")
 ```
 ### List all models
@@ -53,7 +76,7 @@ for m in list_models(task="chat"):
 from dispatchai import estimate_latency
 lat = estimate_latency("1B", "Q4_K_M")
-print(f"{lat['tokens_per_sec']} tokens/sec on Snapdragon 865")
 ```
 ### Calculate cost savings
@@ -71,46 +94,27 @@ print(f"Annual savings: ${result['savings']}")
 pip install dispatchai                    # Core (model catalog, recommendations)
 pip install dispatchai[torch]             # + transformers/torch backend
 pip install dispatchai[gguf]              # + llama.cpp GGUF backend
-pip install dispatchai[full]              # + everything (torch, gguf, sentence-transformers)
 ```
-## Available Models
-| Model | Params | Size | Speed | Task |
-|-------|--------|------|-------|------|
-| SmolLM2-135M-Instruct-mobile | 135M | 270MB | 25.5 t/s | Chat |
-| SmolLM2-360M-Instruct-mobile | 360M | 720MB | 21.0 t/s | Chat |
-| Qwen2.5-0.5B-Instruct-mobile-int4 | 500M | 350MB | 20.0 t/s | Chat |
-| Llama-3.2-1B-Instruct-Q4-mobile | 1B | 700MB | 18.2 t/s | Chat |
-| Llama-3.2-1B-FunctionCall-mobile | 1B | 2.5GB | 12.0 t/s | Function Call |
-| Qwen2.5-Coder-1.5B-mobile | 1.5B | 3.0GB | 10.5 t/s | Code |
-| Gemma-2B-Arabic-mobile | 2B | 5.0GB | 8.0 t/s | Arabic |
-| Llama-3.2-3B-Instruct-Q5-mobile | 3B | 2.1GB | 8.5 t/s | Chat |
-[Browse all 39 models →](https://huggingface.co/dispatchAI)
-## Hardware Targets
-All benchmarks measured on **Snapdragon 865 (Samsung S20 FE, 8GB RAM)** using llama.cpp.
-The `estimate_latency()` function supports:
-- Snapdragon 865 (baseline)
-- Snapdragon 8 Gen 2 (1.8x)
-- Snapdragon 8 Gen 3 (2.2x)
-- Apple A17 Pro (2.5x)
-- Apple M2 (3.0x)
-- Snapdragon 778G mid-range (0.7x)
-## The Thesis
-> *The best model is the one that runs.*
-We're building the AI layer for a billion phones that can't afford cloud inference. Every model is free, open-source, and tested on real hardware.
 ## About
 Dispatch AI (FZE) — Sharjah Free Zone, UAE. License No. 10818.
-🌐 [dispatchai.ai](https://www.dispatchai.ai) | 🤗 [huggingface.co/dispatchAI](https://huggingface.co/dispatchAI) | 𝕏 [@DispatchAIdev](https://twitter.com/DispatchAIdev)
 *I think, therefore I ship.*

 **Small. Mobile. Free. UAE-built.**
+`pip install dispatchai` — Run mobile-optimized LLMs on your phone, edge device, or laptop. 31 verified models, all tested on real Snapdragon hardware, all free.
 ## Quick Start
 ```bash
+pip install dispatchai[gguf]
 ```
 ### Chat with a model
 ```python
 from dispatchai import load_model
+model = load_model("SmolLM2-135M-Instruct-mobile", backend="gguf")
 response = model.chat("What is the capital of France?")
 print(response)
+# → "The capital of France is Paris."
 ```
+## 🌐 Inference API
+Use dispatchAI models via REST API (OpenAI-compatible):
 ```python
+import openai
+client = openai.OpenAI(
+    base_url="https://api.dispatchai.ai/v1",
+    api_key="da-demo-key-0001"
+)
+response = client.chat.completions.create(
+    model="dispatchAI/SmolLM2-135M-Instruct-mobile",
+    messages=[{"role": "user", "content": "What is the capital of France?"}]
+)
+print(response.choices[0].message.content)
+# → "The capital of France is Paris."
 ```
+**Pricing:** $0.001/1K input tokens, $0.002/1K output tokens (10x cheaper than OpenAI)
+**Endpoint:** `https://api.dispatchai.ai/v1`
+**Available Models:**
+- dispatchAI/SmolLM2-135M-Instruct-mobile (101MB, 46 t/s on phone)
+- dispatchAI/Qwen2.5-0.5B-Instruct-mobile-int4 (469MB, 23 t/s on phone)
+- dispatchAI/Llama-3.2-1B-Instruct-Q4-mobile (770MB, 5.4 t/s on phone)
+## Local Inference
 ### Find the best model for your phone
 ```python
 rec = recommend(ram_mb=2048, task="chat")
 print(f"Best model: {rec['recommended']['name']}")
 ```
 ### List all models
 from dispatchai import estimate_latency
 lat = estimate_latency("1B", "Q4_K_M")
+print(f"{lat['tokens_per_sec']} t/s on Snapdragon 865")
 ```
 ### Calculate cost savings
 pip install dispatchai                    # Core (model catalog, recommendations)
 pip install dispatchai[torch]             # + transformers/torch backend
 pip install dispatchai[gguf]              # + llama.cpp GGUF backend
+pip install dispatchai[full]              # + everything
 ```
+## Verified Models (June 2026)
+- ✅ 31 models fully working (0 broken, 0 partial)
+- 📱 24 models phone-verified on Snapdragon 865
+- All have correct chat formats documented
+## Top 3 Models
+| Model | Size | Phone Speed | Use Case |
+|-------|------|-------------|----------|
+| SmolLM2-135M | 101MB | 46.0 t/s | Ultra-fast, budget phones |
+| Qwen2.5-0.5B-int4 | 469MB | 23.2 t/s | Best balance for mobile |
+| Llama-3.2-1B-Q4 | 770MB | 5.4 t/s | Best quality under 1GB |
 ## About
 Dispatch AI (FZE) — Sharjah Free Zone, UAE. License No. 10818.
+🌐 [dispatchai.ai](https://www.dispatchai.ai) | 🤗 [huggingface.co/dispatchAI](https://huggingface.co/dispatchAI) | API: [api.dispatchai.ai](https://api.dispatchai.ai)
 *I think, therefore I ship.*
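
The per-token pricing added in this commit ($0.001 per 1K input tokens, $0.002 per 1K output tokens) can be turned into a quick per-request estimate. The sketch below is only an illustration of that arithmetic: `estimate_cost` and the two rate constants are hypothetical names, not part of the `dispatchai` package, and the rates are copied from the README line as quoted, so they may not match actual billing.

```python
# Rates quoted in the updated README (USD per 1,000 tokens).
# Assumption: billing is linear in token count with no minimum charge.
INPUT_RATE_PER_1K = 0.001
OUTPUT_RATE_PER_1K = 0.002


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = (input_tokens / 1000) * INPUT_RATE_PER_1K
    output_cost = (output_tokens / 1000) * OUTPUT_RATE_PER_1K
    return input_cost + output_cost


# Example: a 12,000-token prompt that produces a 3,000-token reply.
cost = estimate_cost(12_000, 3_000)
print(f"Estimated cost: ${cost:.4f}")
```

With these rates, output tokens cost twice as much as input tokens, so the reply length affects the total more than a prompt of the same size.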