# SimpleTool Skill: Real-Time AI Application Development

> **This is a skill file.** Feed it to any AI coding assistant (Claude, Gemini, GPT, Cursor, etc.) as context, then describe the app you want. The AI will generate a working SimpleTool-powered application.
>
> Example prompt: *"Read the attached SimpleTool skill, then build me a Pong game where AI controls one paddle in real-time."*

---

## 1. What is SimpleTool?

SimpleTool is a **multi-head parallel decoding** server for real-time LLM function calling. It runs on vLLM and decodes the function name and its arguments simultaneously instead of sequentially.

```
Traditional:  function → arg1 → arg2 → ...   (sequential, ~200-500ms)
SimpleTool:   [function, arg1, arg2, ...]     (parallel, ~25-60ms)
```

**Application domains**: game AI, robotic arm control, digital human animation, IoT automation, and anything else that needs < 100ms LLM decision-making.

## 2. Server API

Server default: `http://localhost:8899`

### Endpoints

| Method | Path | Description |
|--------|------|-------------|
| GET | `/health` | Health check, returns `{status, version, model}` |
| POST | `/v1/function_call` | Multi-head parallel function call |

### Request Format (v2)

```javascript
{
  messages: [{role: 'user', content: 'your query'}],
  tools: [...],                  // OpenAI-format tool definitions
  system: "domain prompt",       // Domain-specific system prompt (v2)
  environment: [...],            // Current state info (string array, optional)
  history: [...],                // Action history (string array, max 6)
  include_content_head: false    // Whether to generate the content head
}
```

The `system` field lets you inject a domain-specific system prompt (e.g., "You are a robotic arm controller"). If omitted, the server uses a generic default. The `environment` field is optional context that the server folds into the user message.
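The request fields above can be assembled with a small helper. The following is a minimal sketch, not part of SimpleTool itself: `buildRequest` and `MAX_HISTORY` are hypothetical names, and the sketch assumes that the most recent actions sit at the end of the `history` array. It leaves out optional fields that were not supplied, so the server can apply its defaults.

```javascript
// Hypothetical helper (not a SimpleTool API): builds a v2 request body.
const MAX_HISTORY = 6; // the request format lists "max 6" history entries

function buildRequest({ query, tools, system, environment, history, includeContent = false }) {
  const body = {
    messages: [{ role: 'user', content: query }],
    tools,
    include_content_head: includeContent
  };
  // Optional fields are added only when provided.
  if (system) body.system = system;
  if (environment && environment.length > 0) {
    body.environment = environment.map(String); // the field is a string array
  }
  if (history && history.length > 0) {
    // Assumption: newest actions are last, so keep the final MAX_HISTORY entries.
    body.history = history.slice(-MAX_HISTORY).map(String);
  }
  return body;
}

// Example: a Pong request with one single-parameter tool.
const exampleTools = [{
  type: 'function',
  function: {
    name: 'move',
    description: 'Move the paddle',
    parameters: {
      type: 'object',
      properties: { direction: { type: 'string', enum: ['up', 'down', 'stay'] } },
      required: ['direction']
    }
  }
}];

const exampleRequest = buildRequest({
  query: 'Ball 50px BELOW paddle. Move DOWN to intercept. Choose: up/down/stay',
  tools: exampleTools,
  system: 'You are the AI controller for a Pong game.',
  environment: ['ball_y=250', 'paddle_y=200'],
  history: ['move(up)', 'move(down)']
});
console.log(JSON.stringify(exampleRequest, null, 2));
```

Trimming `history` on the client is a convenience that keeps the payload short; it does not replace whatever limit the server enforces.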
### Response Format

```javascript
{
  success: true,
  function: "move",
  args: {direction: "up", speed: "fast"},   // Named args (param names from tool def)
  heads: {                                   // Raw per-head output
    function: "move",
    arg1: "up",
    arg2: "fast",
    arg3: "<|null|>"
  },
  content: null,                             // Only if include_content_head was true
  latency_ms: 35.2
}
```

## 3. Dynamic Head Count (Critical for Latency!)

**The server automatically prunes unused heads.** If your tools have at most 2 parameters, only 3 heads are spawned (`function`, `arg1`, `arg2`), not 8. This saves ~40% latency.

```
Active heads = [function] + [arg1 ... argN]
where N = max parameter count across all tool definitions
```

**Design tip**: Keep your tools to 1–3 parameters when possible. Fewer params = fewer heads = lower latency.

## 4. Tool Definition

### Constraints

- Maximum **6 arguments** per function (arg1–arg6)
- Arguments map to `arg1, arg2, ...` in the order defined in `properties`
- Server auto-converts types: numeric strings → int/float, otherwise lowercase string
- Use `enum` to constrain options; this dramatically improves accuracy

### Template

```javascript
const TOOLS = [{
  type: "function",
  function: {
    name: "action_name",
    description: "Clear, concise: what this action does and when to use it",
    parameters: {
      type: "object",
      properties: {
        param1: {
          type: "string",
          enum: ["opt_a", "opt_b", "opt_c"],   // Constrain! Improves accuracy
          description: "What this param controls"
        },
        param2: {
          type: "number",
          description: "Numeric value with unit, e.g. 'Force in Newtons'"
        }
      },
      required: ["param1"]
    }
  }
}];
```

### Multi-Tool Example (Game)

```javascript
const TOOLS = [
  {type:"function", function:{name:"move", description:"Move unit to position", parameters:{type:"object", properties:{unit:{type:"string"}, target:{type:"string", enum:["north","south","east","west"]}}}}},
  {type:"function", function:{name:"attack", description:"Attack enemy", parameters:{type:"object", properties:{unit:{type:"string"}, target:{type:"string"}}}}},
  {type:"function", function:{name:"retreat", description:"Pull back unit", parameters:{type:"object", properties:{unit:{type:"string"}}}}},
  {type:"function", function:{name:"pass", description:"Do nothing this turn", parameters:{type:"object", properties:{}}}}
];
// Max params = 2 → only 3 heads spawned
```

## 5. Query Design

### Principles

1. **Be imperative**: tell the model what to decide, not just describe state
2. **Include decision context**: "Ball is BELOW paddle, intercept it" not "Ball y=250"
3. **List valid options**: "Choose: up/down/stay"
4. **Keep it short**: shorter query = faster prefill

### Good vs Bad

```
✅ "Ball 50px BELOW paddle, approaching fast. Move DOWN to intercept. Choose: up/down/stay"
❌ "Ball position: 250, Paddle position: 200. What should I do?"

✅ "Red gear at (300,150,50). Move arm there slowly for pickup."
❌ "There is a gear somewhere on the table. The arm needs to go to it."

✅ "Stream starting, viewers saying hello. Greet them warmly."
❌ "Viewers are in the chat. Do something appropriate."
```

### Environment & History

```javascript
// Environment: current state as key=value strings
const env = [
  `ball_y=${ballY}`,
  `paddle_y=${paddleY}`,
  `gap=${gap}`,
  `approaching=true`
];

// History: recent actions (max 6, server trims automatically)
const history = [
  "move(up)",
  "move(up)",
  "stay()"
];
```

### Domain System Prompts (v2)

For a v2 server, set a domain-specific system prompt. Pick the one that matches your application:

```javascript
// Game AI
const SYSTEM_GAME = "You are the AI controller for a Pong game. Move the paddle to intercept the ball. React quickly.";

// Robotic arm
const SYSTEM_ARM = "You are the voice controller for a 6-axis robotic arm. Convert commands to precise function calls. Coordinates in mm.";

// Digital human
const SYSTEM_AVATAR = "You are the animation controller for a virtual streamer. Convert director instructions to expression and speech calls.";
```

## 6. Frontend Code Standards

### Required: Type-Safe Value Extraction

```javascript
// Values in args may be int, not string, so always coerce
function safeStr(v) {
  if (v === null || v === undefined) return '';
  return String(v).trim().toLowerCase();
}

// d is the parsed server response.
// Extract with args (named) first, heads (positional) as fallback
let direction = safeStr(d.args?.direction) || safeStr(d.heads?.arg1);
```

### Required: Validate Return Values

```javascript
const VALID = ['up', 'down', 'stay'];
if (!VALID.includes(direction)) {
  console.warn(`Invalid: "${direction}", fallback to stay`);
  direction = 'stay';
}
```

### Required: Error Handling with Fallback

```javascript
// SERVER_URL and request are defined elsewhere in your app
async function callAI() {
  try {
    const r = await fetch(SERVER_URL + '/v1/function_call', {
      method: 'POST',
      headers: {'Content-Type': 'application/json'},
      body: JSON.stringify(request)
    });
    if (!r.ok) throw new Error(`HTTP ${r.status}`);   // Non-2xx responses are failures too
    const data = await r.json();
    if (!data.success) throw new Error(data.error);
    applyAction(data);
  } catch (e) {
    console.error('[AI] Failed:', e);
    applyFallbackAI();   // MUST have fallback: never freeze the app
  }
}
```

### Required: Logging

```javascript
console.log(`[Game] Query: ${query}`);
console.log(`[Game] → ${data.function}(${JSON.stringify(data.args)}) ${data.latency_ms.toFixed(0)}ms`);
```

### Recommended: Debug UI Overlay

Show in a corner of your app: current query, raw response, latency (current + rolling average).

## 7. Game Loop Pattern

**Decouple AI from rendering.** The AI loop runs at 10–16 Hz; the render loop runs at 60 fps.
```javascript
const AI_INTERVAL = 100;   // 100ms = 10 Hz
let aiPending = false;

// Render loop (60fps): never blocks on AI
function gameLoop() {
  update();
  render();
  requestAnimationFrame(gameLoop);
}

// AI loop (async, non-blocking): skips a tick if the previous call is still in flight
async function aiLoop() {
  if (aiPending) return;
  aiPending = true;
  try {
    await callAI();
  } finally {
    aiPending = false;   // Always release the flag, even if callAI throws
  }
}

setInterval(aiLoop, AI_INTERVAL);
gameLoop();
```

## 8. FCClient Template

Drop-in client class for any HTML/JS application:

```javascript
class FCClient {
  constructor(url = 'http://localhost:8899') {
    this.url = url.replace(/\/$/, '');   // Strip a trailing slash
  }

  // Returns {ok, version}; never throws
  async health() {
    try {
      const r = await fetch(`${this.url}/health`, {signal: AbortSignal.timeout(3000)});
      const d = await r.json();
      return {ok: d.loaded === true || d.status === 'ok', version: d.version};
    } catch (e) {
      return {ok: false};
    }
  }

  // Returns the server response plus wall_ms (client-side round-trip time); never throws
  async call({query, tools, system, env, history, includeContent = false}) {
    const t0 = performance.now();
    try {
      const r = await fetch(`${this.url}/v1/function_call`, {
        method: 'POST',
        headers: {'Content-Type': 'application/json'},
        body: JSON.stringify({
          messages: [{role: 'user', content: query}],
          tools,
          system,                // v2: domain system prompt
          environment: env,
          history,
          include_content_head: includeContent
        })
      });
      const d = await r.json();
      return {...d, wall_ms: performance.now() - t0};
    } catch (e) {
      return {success: false, error: e.message, wall_ms: performance.now() - t0};
    }
  }
}
```

Usage:

```javascript
const ai = new FCClient('http://localhost:8899');

const result = await ai.call({
  query: "Ball is BELOW. Move down. Choose: up/down/stay",
  tools: TOOLS,
  system: "You are a Pong AI. Move paddle to intercept ball.",
  env: ["ball_y=300", "paddle_y=200", "gap=100"],
  history: ["move(down)", "move(down)"]
});

if (result.success) {
  console.log(`${result.function}(${JSON.stringify(result.args)}) in ${result.latency_ms}ms`);
}
```

## 9. Troubleshooting

| Symptom | Cause | Fix |
|---------|-------|-----|
| AI stuck / no movement | Query too vague | Add decision hints: "Move DOWN to intercept" |
| `.trim is not a function` | `args` values may be int | Use `String(v)` before `.trim()` |
| High latency (>100ms) | Too many heads / long query | Reduce tool params, shorten query/env |
| Wrong function called | Ambiguous tool descriptions | Add `enum`, improve `description` fields |
| `<\|null\|>` in all args | Model confused | Check tool param order matches expectations |

---

**Skill Version**: 2.0 (supports v1/v2 server; multi-domain: game, robotics, avatar)

**Last Updated**: 2026-03