# dispatchAI SDK
**Small. Mobile. Free. UAE-built.**
`pip install dispatchai`: run mobile-optimized LLMs on your phone, edge device, or laptop. 31 verified models, all tested on real Snapdragon hardware, all free.
## Quick Start
```bash
pip install dispatchai[gguf]
```
### Chat with a model
```python
from dispatchai import load_model
model = load_model("SmolLM2-135M-Instruct-mobile", backend="gguf")
response = model.chat("What is the capital of France?")
print(response)
# → "The capital of France is Paris."
```
## 🌐 Inference API
Use dispatchAI models via REST API (OpenAI-compatible):
```python
import openai
client = openai.OpenAI(
    base_url="https://api.dispatchai.ai/v1",
    api_key="da-demo-key-0001",
)
response = client.chat.completions.create(
    model="dispatchAI/SmolLM2-135M-Instruct-mobile",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print(response.choices[0].message.content)
# → "The capital of France is Paris."
```
**Pricing:** $0.001/1K input tokens, $0.002/1K output tokens (10x cheaper than OpenAI)
**Endpoint:** `https://api.dispatchai.ai/v1`
**Available Models:**
- dispatchAI/SmolLM2-135M-Instruct-mobile (101MB, 46 t/s on phone)
- dispatchAI/Qwen2.5-0.5B-Instruct-mobile-int4 (469MB, 23 t/s on phone)
- dispatchAI/Llama-3.2-1B-Instruct-Q4-mobile (770MB, 5.4 t/s on phone)
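To make the pricing concrete, the cost of a single request can be worked out from its token counts. This is a minimal sketch: `estimate_api_cost` is a hypothetical helper, not part of the `dispatchai` package, and the two rates are simply copied from the pricing line above.

```python
# Hypothetical helper for illustration; not part of the dispatchai package.
# Rates are copied from the pricing listed above (USD per 1K tokens).
INPUT_RATE_PER_1K = 0.001
OUTPUT_RATE_PER_1K = 0.002


def estimate_api_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one request."""
    input_cost = input_tokens / 1000 * INPUT_RATE_PER_1K
    output_cost = output_tokens / 1000 * OUTPUT_RATE_PER_1K
    return input_cost + output_cost


# Example: 500 prompt tokens and 200 completion tokens.
cost = estimate_api_cost(500, 200)
print(f"${cost:.4f}")
# → $0.0009
```

Actual billing may round or count tokens differently, so treat the result as an estimate.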
## Local Inference
### Find the best model for your phone
```python
from dispatchai import recommend
rec = recommend(ram_mb=2048, task="chat")
print(f"Best model: {rec['recommended']['name']}")
```
### List all models
```python
from dispatchai import list_models
for m in list_models(task="chat"):
    print(f" {m['name']}: {m['size_mb']}MB, {m['speed_tps']} t/s")
```
### Estimate latency
```python
from dispatchai import estimate_latency
lat = estimate_latency("1B", "Q4_K_M")
print(f"{lat['tokens_per_sec']} t/s on Snapdragon 865")
```
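A tokens-per-second figure becomes a rough wall-clock estimate by dividing the expected reply length by the throughput. The sketch below assumes a hypothetical helper, `generation_seconds`, which is not part of the SDK. It ignores model load and prompt-processing time, and the 5.4 t/s value is the Llama-3.2-1B-Q4 phone speed quoted in this README.

```python
# Hypothetical helper for illustration; not part of the dispatchai package.
def generation_seconds(output_tokens: int, tokens_per_sec: float) -> float:
    """Estimate decode time only (no model load or prompt processing)."""
    if tokens_per_sec <= 0:
        raise ValueError("tokens_per_sec must be positive")
    return output_tokens / tokens_per_sec


# A 200-token reply at 5.4 t/s.
print(f"{generation_seconds(200, 5.4):.1f} s")
# → 37.0 s
```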
### Calculate cost savings
```python
from dispatchai import calculate_cost
result = calculate_cost(daily_queries=10000, cloud_cost_per_1k=0.50)
print(f"Annual savings: ${result['savings']}")
```
## Installation Options
```bash
pip install dispatchai # Core (model catalog, recommendations)
pip install dispatchai[torch] # + transformers/torch backend
pip install dispatchai[gguf] # + llama.cpp GGUF backend
pip install dispatchai[full] # + everything
```
## Verified Models (June 2026)
- ✅ 31 models fully working (0 broken, 0 partial)
- 📱 24 models phone-verified on Snapdragon 865
- All have correct chat formats documented
## Top 3 Models
| Model | Size | Phone Speed | Use Case |
|-------|------|-------------|----------|
| SmolLM2-135M | 101MB | 46.0 t/s | Ultra-fast, budget phones |
| Qwen2.5-0.5B-int4 | 469MB | 23.2 t/s | Best balance for mobile |
| Llama-3.2-1B-Q4 | 770MB | 5.4 t/s | Best quality under 1GB |
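The trade-off in the table can be sketched as a simple filter: pick the largest model that fits a size budget and meets a minimum speed. This is only an illustration. `pick_model` and `TOP_MODELS` are hypothetical names, not the SDK's `recommend` logic, and treating size as a proxy for quality is an assumption.

```python
from typing import Optional

# Figures copied from the table above.
TOP_MODELS = [
    {"name": "SmolLM2-135M", "size_mb": 101, "speed_tps": 46.0},
    {"name": "Qwen2.5-0.5B-int4", "size_mb": 469, "speed_tps": 23.2},
    {"name": "Llama-3.2-1B-Q4", "size_mb": 770, "speed_tps": 5.4},
]


def pick_model(max_size_mb: int, min_speed_tps: float = 0.0) -> Optional[dict]:
    """Return the largest listed model within the size budget and speed floor.

    Hypothetical helper; size is used as a rough proxy for quality.
    """
    candidates = [
        m for m in TOP_MODELS
        if m["size_mb"] <= max_size_mb and m["speed_tps"] >= min_speed_tps
    ]
    return max(candidates, key=lambda m: m["size_mb"], default=None)


choice = pick_model(max_size_mb=500)
print(choice["name"] if choice else "no model fits")
# → Qwen2.5-0.5B-int4
```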
## About
Dispatch AI (FZE), Sharjah Free Zone, UAE. License No. 10818.
🌐 [dispatchai.ai](https://www.dispatchai.ai) | 🤗 [huggingface.co/dispatchAI](https://huggingface.co/dispatchAI) | API: [api.dispatchai.ai](https://api.dispatchai.ai)
*I think, therefore I ship.*