# dispatchAI SDK

**Small. Mobile. Free. UAE-built.**

`pip install dispatchai`: run mobile-optimized LLMs on your phone, edge device, or laptop. 31 verified models, all tested on real Snapdragon hardware, all free.

## Quick Start

```bash
# Quote the extra so shells such as zsh do not treat the brackets as a glob
pip install "dispatchai[gguf]"
```

### Chat with a model

```python
from dispatchai import load_model

model = load_model("SmolLM2-135M-Instruct-mobile", backend="gguf")
response = model.chat("What is the capital of France?")
print(response)
# → "The capital of France is Paris."
```

## 🌐 Inference API

Use dispatchAI models through an OpenAI-compatible REST API:

```python
import openai

client = openai.OpenAI(
    base_url="https://api.dispatchai.ai/v1",
    api_key="da-demo-key-0001"
)

response = client.chat.completions.create(
    model="dispatchAI/SmolLM2-135M-Instruct-mobile",
    messages=[{"role": "user", "content": "What is the capital of France?"}]
)
print(response.choices[0].message.content)
# → "The capital of France is Paris."
```

**Pricing:** $0.001/1K input tokens, $0.002/1K output tokens (10x cheaper than OpenAI)

**Endpoint:** `https://api.dispatchai.ai/v1`

**Available Models:**

- dispatchAI/SmolLM2-135M-Instruct-mobile (101MB, 46 t/s on phone)
- dispatchAI/Qwen2.5-0.5B-Instruct-mobile-int4 (469MB, 23 t/s on phone)
- dispatchAI/Llama-3.2-1B-Instruct-Q4-mobile (770MB, 5.4 t/s on phone)

## Local Inference

### Find the best model for your phone

```python
from dispatchai import recommend

rec = recommend(ram_mb=2048, task="chat")
print(f"Best model: {rec['recommended']['name']}")
```

### List all models

```python
from dispatchai import list_models

for m in list_models(task="chat"):
    print(f"  {m['name']}: {m['size_mb']}MB, {m['speed_tps']} t/s")
```

### Estimate latency

```python
from dispatchai import estimate_latency

lat = estimate_latency("1B", "Q4_K_M")
print(f"{lat['tokens_per_sec']} t/s on Snapdragon 865")
```

### Calculate cost savings

```python
from dispatchai import calculate_cost

result = calculate_cost(daily_queries=10000, cloud_cost_per_1k=0.50)
print(f"Annual savings: ${result['savings']}")
```

## Installation Options

```bash
pip install dispatchai            # Core (model catalog, recommendations)
pip install "dispatchai[torch]"   # + transformers/torch backend
pip install "dispatchai[gguf]"    # + llama.cpp GGUF backend
pip install "dispatchai[full]"    # + everything
```

## Verified Models (June 2026)

- ✅ 31 models fully working (0 broken, 0 partial)
- 📱 24 models phone-verified on Snapdragon 865
- All have correct chat formats documented

## Top 3 Models

| Model | Size | Phone Speed | Use Case |
|-------|------|-------------|----------|
| SmolLM2-135M | 101MB | 46.0 t/s | Ultra-fast, budget phones |
| Qwen2.5-0.5B-int4 | 469MB | 23.2 t/s | Best balance for mobile |
| Llama-3.2-1B-Q4 | 770MB | 5.4 t/s | Best quality under 1GB |

## About

Dispatch AI (FZE), Sharjah Free Zone, UAE. License No. 10818.

🌐 [dispatchai.ai](https://www.dispatchai.ai) | 🤗 [huggingface.co/dispatchAI](https://huggingface.co/dispatchAI) | API: [api.dispatchai.ai](https://api.dispatchai.ai)

*I think, therefore I ship.*
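
## Example: Estimating Inference API Cost

The per-token prices listed under **Pricing** can be turned into a rough budget with plain arithmetic. The sketch below is not part of the SDK: `estimate_api_cost` and `estimate_monthly_cost` are hypothetical helpers written for illustration, and the 30-day month and the example traffic figures are assumptions.

```python
# Prices copied from the Pricing line of this README (USD per 1K tokens).
INPUT_PRICE_PER_1K = 0.001
OUTPUT_PRICE_PER_1K = 0.002


def estimate_api_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of a single request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = (input_tokens / 1000) * INPUT_PRICE_PER_1K
    output_cost = (output_tokens / 1000) * OUTPUT_PRICE_PER_1K
    return input_cost + output_cost


def estimate_monthly_cost(
    daily_requests: int,
    avg_input_tokens: int,
    avg_output_tokens: int,
    days: int = 30,  # assumption: a 30-day billing month
) -> float:
    """Estimated USD cost per month for a steady request volume."""
    per_request = estimate_api_cost(avg_input_tokens, avg_output_tokens)
    return daily_requests * days * per_request


if __name__ == "__main__":
    # Assumed workload: 10,000 requests/day, 200 input and 300 output tokens each.
    per_request = estimate_api_cost(200, 300)
    monthly = estimate_monthly_cost(10_000, 200, 300)
    print(f"Per request: ${per_request:.6f}")
    print(f"Per month:   ${monthly:.2f}")
```

With these assumed numbers, one request costs about $0.0008 and a month of traffic about $240. Actual billing may round or count tokens differently, so treat the result as an estimate.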