# dispatchAI SDK

**Small. Mobile. Free. UAE-built.**

`pip install dispatchai`: run mobile-optimized LLMs on your phone, edge device, or laptop. 31 verified models, 24 of them phone-verified on Snapdragon 865, all free to run locally.

## Quick Start

```bash
pip install "dispatchai[gguf]"   # quotes keep shells such as zsh from expanding the brackets
```

### Chat with a model

```python
from dispatchai import load_model

model = load_model("SmolLM2-135M-Instruct-mobile", backend="gguf")
response = model.chat("What is the capital of France?")
print(response)
# → "The capital of France is Paris."
```

## 🌐 Inference API

Use dispatchAI models through an OpenAI-compatible REST API:

```python
import openai

client = openai.OpenAI(
    base_url="https://api.dispatchai.ai/v1",
    api_key="da-demo-key-0001"
)

response = client.chat.completions.create(
    model="dispatchAI/SmolLM2-135M-Instruct-mobile",
    messages=[{"role": "user", "content": "What is the capital of France?"}]
)
print(response.choices[0].message.content)
# → "The capital of France is Paris."
```

**Pricing:** $0.001/1K input tokens, $0.002/1K output tokens (10x cheaper than OpenAI)

**Endpoint:** `https://api.dispatchai.ai/v1`

**Available Models:**
- dispatchAI/SmolLM2-135M-Instruct-mobile (101MB, 46 t/s on phone)
- dispatchAI/Qwen2.5-0.5B-Instruct-mobile-int4 (469MB, 23 t/s on phone)
- dispatchAI/Llama-3.2-1B-Instruct-Q4-mobile (770MB, 5.4 t/s on phone)
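
To get a feel for the pricing listed above, here is a minimal sketch that estimates the cost of one request from its token counts. `estimate_api_cost` is a hypothetical helper, not part of the SDK; the two rates are copied from the pricing line in this section.

```python
# Rates from the pricing line in this section, in USD per 1,000 tokens.
INPUT_USD_PER_1K = 0.001
OUTPUT_USD_PER_1K = 0.002


def estimate_api_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = (input_tokens / 1000) * INPUT_USD_PER_1K
    output_cost = (output_tokens / 1000) * OUTPUT_USD_PER_1K
    return input_cost + output_cost


if __name__ == "__main__":
    cost = estimate_api_cost(input_tokens=500, output_tokens=200)
    print(f"Estimated cost: ${cost:.6f}")
```

With the `openai` client, token counts are normally exposed as `response.usage.prompt_tokens` and `response.usage.completion_tokens`, assuming the endpoint returns a usage object.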

## Local Inference

### Find the best model for your phone

```python
from dispatchai import recommend

rec = recommend(ram_mb=2048, task="chat")
print(f"Best model: {rec['recommended']['name']}")
```

### List all models

```python
from dispatchai import list_models

for m in list_models(task="chat"):
    print(f"  {m['name']}: {m['size_mb']}MB, {m['speed_tps']} t/s")
```

### Estimate latency

```python
from dispatchai import estimate_latency

lat = estimate_latency("1B", "Q4_K_M")
print(f"{lat['tokens_per_sec']} t/s on Snapdragon 865")
```
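
A tokens-per-second figure is easier to reason about as a wait time. The sketch below converts a decode rate into an approximate generation time. `generation_seconds` is a hypothetical helper, not an SDK function, and it ignores prompt processing time, so real latency will be somewhat higher.

```python
def generation_seconds(num_tokens: int, tokens_per_sec: float) -> float:
    """Approximate time to generate num_tokens at a steady decode rate."""
    if tokens_per_sec <= 0:
        raise ValueError("tokens_per_sec must be positive")
    return num_tokens / tokens_per_sec


# Phone speeds quoted in this README, in tokens per second.
README_SPEEDS = [
    ("SmolLM2-135M", 46.0),
    ("Qwen2.5-0.5B-int4", 23.2),
    ("Llama-3.2-1B-Q4", 5.4),
]

for name, tps in README_SPEEDS:
    print(f"{name}: ~{generation_seconds(100, tps):.1f} s for 100 tokens")
```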

### Calculate cost savings

```python
from dispatchai import calculate_cost

result = calculate_cost(daily_queries=10000, cloud_cost_per_1k=0.50)
print(f"Annual savings: ${result['savings']}")
```
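
For intuition, the cloud side of that comparison can be worked out by hand. This sketch is not the SDK's `calculate_cost` implementation: it assumes `cloud_cost_per_1k` means USD per 1,000 queries and that on-device inference has no per-query cost, so the full cloud bill counts as savings. `annual_cloud_cost` is a hypothetical name.

```python
def annual_cloud_cost(daily_queries: int, cloud_cost_per_1k: float, days: int = 365) -> float:
    """Yearly cloud spend, ASSUMING the rate is USD per 1,000 queries."""
    return daily_queries / 1000 * cloud_cost_per_1k * days


# 10,000 queries/day at $0.50 per 1,000 queries is $5/day, or $1,825/year.
print(annual_cloud_cost(10_000, 0.50))
```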

## Installation Options

```bash
pip install dispatchai                    # Core (model catalog, recommendations)
pip install "dispatchai[torch]"           # + transformers/torch backend
pip install "dispatchai[gguf]"            # + llama.cpp GGUF backend
pip install "dispatchai[full]"            # + everything
```
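
If you are not sure which extras ended up in your environment, a quick check along these lines can help. The import names below (`torch`, `transformers`, `llama_cpp`) are assumptions about what the extras install, and `available_backends` is a hypothetical helper, not part of the SDK.

```python
from importlib.util import find_spec

# ASSUMED import names for each optional extra; the packages that the
# dispatchai extras actually install may differ.
ASSUMED_BACKEND_MODULES = {
    "torch": ["torch", "transformers"],
    "gguf": ["llama_cpp"],
}


def available_backends(modules=ASSUMED_BACKEND_MODULES) -> dict:
    """Map each backend name to True if all of its modules can be found."""
    return {
        backend: all(find_spec(name) is not None for name in names)
        for backend, names in modules.items()
    }


print(available_backends())
```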

## Verified Models (June 2026)

- ✅ 31 models fully working (0 broken, 0 partial)
- 📱 24 models phone-verified on Snapdragon 865
- The correct chat format is documented for every model

## Top 3 Models

| Model | Size | Phone Speed | Use Case |
|-------|------|-------------|----------|
| SmolLM2-135M | 101MB | 46.0 t/s | Ultra-fast, budget phones |
| Qwen2.5-0.5B-int4 | 469MB | 23.2 t/s | Best balance for mobile |
| Llama-3.2-1B-Q4 | 770MB | 5.4 t/s | Best quality under 1GB |
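
As a rough illustration of how the table can guide a choice, the sketch below picks the largest of these three models that fits a size budget and a minimum speed. It is not the SDK's `recommend` logic. `pick_largest_fitting` is a hypothetical helper that only compares file size against a budget and ignores runtime overhead such as the KV cache.

```python
from typing import Optional

# Sizes and phone speeds copied from the table above.
TOP_MODELS = [
    {"name": "SmolLM2-135M", "size_mb": 101, "speed_tps": 46.0},
    {"name": "Qwen2.5-0.5B-int4", "size_mb": 469, "speed_tps": 23.2},
    {"name": "Llama-3.2-1B-Q4", "size_mb": 770, "speed_tps": 5.4},
]


def pick_largest_fitting(budget_mb: float, min_tps: float = 0.0) -> Optional[dict]:
    """Return the largest listed model within the size budget and speed floor."""
    candidates = [
        m for m in TOP_MODELS
        if m["size_mb"] <= budget_mb and m["speed_tps"] >= min_tps
    ]
    # Larger files are treated as higher quality here, which is a simplification.
    return max(candidates, key=lambda m: m["size_mb"]) if candidates else None


choice = pick_largest_fitting(budget_mb=1000, min_tps=10)
print(choice["name"] if choice else "No listed model fits")
```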

## About

Dispatch AI (FZE) — Sharjah Free Zone, UAE. License No. 10818.

🌐 [dispatchai.ai](https://www.dispatchai.ai) | 🤗 [huggingface.co/dispatchAI](https://huggingface.co/dispatchAI) | API: [api.dispatchai.ai](https://api.dispatchai.ai)

*I think, therefore I ship.*