# SimpleTool Skill — Real-Time AI Application Development
> **This is a skill file.** Feed it to any AI coding assistant (Claude, Gemini, GPT, Cursor, etc.) as context, then describe the app you want. The AI will generate a working SimpleTool-powered application.
>
> Example prompt: *"Read the attached SimpleTool skill, then build me a Pong game where AI controls one paddle in real-time."*
---
## 1. What is SimpleTool?
SimpleTool is a **multi-head parallel decoding** server for real-time LLM function calling. It runs on vLLM and decodes function name + arguments simultaneously instead of sequentially.
```
Traditional: function → arg1 → arg2 → ... (sequential, ~200-500ms)
SimpleTool: [function, arg1, arg2, ...] (parallel, ~25-60ms)
```
**Application domains**: game AI, robotic arm control, digital human animation, IoT automation, and any other task that needs an LLM decision in under 100 ms.
## 2. Server API
Server default: `http://localhost:8899`
### Endpoints
| Method | Path | Description |
|--------|------|-------------|
| GET | `/health` | Health check, returns `{status, version, model}` |
| POST | `/v1/function_call` | Multi-head parallel function call |
### Request Format (v2)
```javascript
{
  messages: [{role: 'user', content: 'your query'}],
  tools: [...],                 // OpenAI-format tool definitions
  system: "domain prompt",      // Domain-specific system prompt (v2)
  environment: [...],           // Current state info (string array, optional)
  history: [...],               // Action history (string array, max 6)
  include_content_head: false   // Whether to generate <content> head
}
```
The `system` field lets you inject a domain-specific system prompt (e.g., "You are a robotic arm controller"). If omitted, the server uses a generic default. The `environment` field is optional context folded into the user message.
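As a minimal sketch of how the fields above fit together, the helper below assembles a request body. The name `buildRequest`, the `includeContent` option, and the client-side history trim are illustrative assumptions, not part of the SimpleTool API; only the field names come from the request format.

```javascript
// Hypothetical helper: assembles a body for POST /v1/function_call.
// Field names follow the request format above; the helper itself is an assumption.
const MAX_HISTORY = 6; // the request format lists history as "max 6"

function buildRequest({ query, tools, system, environment, history, includeContent = false }) {
  const body = {
    messages: [{ role: 'user', content: query }],
    tools,
    include_content_head: includeContent
  };
  // Leave optional fields out when unset so the server falls back to its defaults.
  if (system) body.system = system;
  if (environment && environment.length > 0) body.environment = environment;
  // Keep only the most recent actions (a client-side trim, in addition to the server's).
  if (history && history.length > 0) body.history = history.slice(-MAX_HISTORY);
  return body;
}

const exampleRequest = buildRequest({
  query: 'Ball 50px BELOW paddle. Choose: up/down/stay',
  tools: [],
  history: ['a1', 'a2', 'a3', 'a4', 'a5', 'a6', 'a7']
});
console.log(JSON.stringify(exampleRequest, null, 2));
```

Omitting `system` here leaves the key out of the body entirely, which is how the sketch relies on the server's generic default prompt.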
### Response Format
```javascript
{
  success: true,
  function: "move",
  args: {direction: "up", speed: "fast"},  // Named args (param names from tool def)
  heads: {                                 // Raw per-head output
    function: "move",
    arg1: "up",
    arg2: "fast",
    arg3: "<|null|>"
  },
  content: null,                           // Only if include_content_head was true
  latency_ms: 35.2
}
```
## 3. Dynamic Head Count (Critical for Latency!)
**The server automatically prunes unused heads.** If your tools have at most 2 parameters, only 3 heads are spawned (`<function>`, `<arg1>`, `<arg2>`), not 8. This saves ~40% latency.
```
Active heads = [<function>] + [<arg1>...<argN>]
where N = max parameter count across all tool definitions
```
**Design tip**: Keep your tools to 1–3 parameters when possible. Fewer params = fewer heads = lower latency.
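To check a tool set against the formula above before sending it, a client can count the heads itself. This is a sketch under the stated rule only: `countActiveHeads` is a hypothetical helper, and it ignores any extra head that `include_content_head` may add.

```javascript
// Estimates the number of heads for a tool set:
// one <function> head plus one <argN> head per parameter of the widest tool.
function countActiveHeads(tools) {
  const maxParams = tools.reduce((max, tool) => {
    const props = tool.function?.parameters?.properties ?? {};
    return Math.max(max, Object.keys(props).length);
  }, 0);
  return 1 + maxParams;
}

const sampleTools = [
  { type: 'function', function: { name: 'move', parameters: { type: 'object', properties: { unit: { type: 'string' }, target: { type: 'string' } } } } },
  { type: 'function', function: { name: 'pass', parameters: { type: 'object', properties: {} } } }
];
console.log(countActiveHeads(sampleTools)); // 3
```

Logging this number at startup makes it easy to notice when a newly added parameter widens the head set.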
## 4. Tool Definition
### Constraints
- Maximum **6 arguments** per function (arg1–arg6)
- Arguments map to `arg1, arg2, ...` in the order defined in `properties`
- Server auto-converts types: numeric strings → int/float, otherwise lowercase string
- Use `enum` to constrain options — this dramatically improves accuracy
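The constraints above can be mirrored on the client, for example to rebuild named arguments from the raw `heads` object. The sketch below is an approximation written from the rules listed here, not the server's actual code; `coerceValue` and `mapHeadsToArgs` are hypothetical names.

```javascript
const NULL_TOKEN = '<|null|>'; // placeholder the response format shows for unused heads

// Approximates the described conversion: numeric strings become numbers,
// anything else becomes a lowercase string.
function coerceValue(raw) {
  const text = String(raw).trim();
  if (text !== '' && Number.isFinite(Number(text))) return Number(text);
  return text.toLowerCase();
}

// Maps positional heads (arg1, arg2, ...) to parameter names in `properties` order.
function mapHeadsToArgs(heads, toolDef) {
  const names = Object.keys(toolDef.function.parameters.properties);
  if (names.length > 6) throw new Error(`${toolDef.function.name}: more than 6 parameters`);
  const args = {};
  names.forEach((name, i) => {
    const raw = heads[`arg${i + 1}`];
    if (raw === undefined || raw === null || raw === NULL_TOKEN) return; // unused head
    args[name] = coerceValue(raw);
  });
  return args;
}

const moveTool = {
  type: 'function',
  function: {
    name: 'move',
    parameters: {
      type: 'object',
      properties: { direction: { type: 'string' }, speed: { type: 'string' }, steps: { type: 'number' } }
    }
  }
};
console.log(mapHeadsToArgs({ function: 'move', arg1: 'UP', arg2: 'fast', arg3: '3' }, moveTool));
// { direction: 'up', speed: 'fast', steps: 3 }
```

Because the mapping depends on key order, parameter names that look like integers (such as `"1"`) should be avoided: JavaScript lists those keys first regardless of definition order.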
### Template
```javascript
const TOOLS = [{
  type: "function",
  function: {
    name: "action_name",
    description: "Clear, concise — what this action does and when to use it",
    parameters: {
      type: "object",
      properties: {
        param1: {
          type: "string",
          enum: ["opt_a", "opt_b", "opt_c"],  // Constrain! Improves accuracy
          description: "What this param controls"
        },
        param2: {
          type: "number",
          description: "Numeric value with unit, e.g. 'Force in Newtons'"
        }
      },
      required: ["param1"]
    }
  }
}];
```
### Multi-Tool Example (Game)
```javascript
const TOOLS = [
  {type:"function", function:{name:"move", description:"Move unit in a direction", parameters:{type:"object", properties:{unit:{type:"string"}, target:{type:"string", enum:["north","south","east","west"]}}}}},
  {type:"function", function:{name:"attack", description:"Attack enemy", parameters:{type:"object", properties:{unit:{type:"string"}, target:{type:"string"}}}}},
  {type:"function", function:{name:"retreat", description:"Pull back unit", parameters:{type:"object", properties:{unit:{type:"string"}}}}},
  {type:"function", function:{name:"pass", description:"Do nothing this turn", parameters:{type:"object", properties:{}}}}
];
// Max params = 2 → only 3 heads spawned
```
## 5. Query Design
### Principles
1. **Be imperative** — tell the model what to decide, not just describe state
2. **Include decision context** — "Ball is BELOW paddle, intercept it" not "Ball y=250"
3. **List valid options** — "Choose: up/down/stay"
4. **Keep it short** — shorter query = faster prefill
### Good vs Bad
```
✅ "Ball 50px BELOW paddle, approaching fast. Move DOWN to intercept. Choose: up/down/stay"
❌ "Ball position: 250, Paddle position: 200. What should I do?"
✅ "Red gear at (300,150,50). Move arm there slowly for pickup."
❌ "There is a gear somewhere on the table. The arm needs to go to it."
✅ "Stream starting, viewers saying hello. Greet them warmly."
❌ "Viewers are in the chat. Do something appropriate."
```
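The principles above can be applied in code by turning raw state into an imperative query before each call. The following is a sketch for a Pong paddle; `buildPongQuery` and the 10px dead zone are illustrative assumptions, and it assumes canvas coordinates where y grows downward.

```javascript
// Hypothetical query builder: converts positions into a short, imperative query
// that states the decision context and lists the valid options.
function buildPongQuery(ballY, paddleY, deadZone = 10) {
  const gap = Math.round(ballY - paddleY); // positive: ball is lower on screen
  const options = 'Choose: up/down/stay';
  if (Math.abs(gap) <= deadZone) {
    return `Ball aligned with paddle. Hold position. ${options}`;
  }
  const where = gap > 0 ? 'BELOW' : 'ABOVE';
  const move = gap > 0 ? 'DOWN' : 'UP';
  return `Ball ${Math.abs(gap)}px ${where} paddle. Move ${move} to intercept. ${options}`;
}

console.log(buildPongQuery(250, 200));
// Ball 50px BELOW paddle. Move DOWN to intercept. Choose: up/down/stay
```

Doing the comparison in code keeps the query short and leaves the model with a decision rather than arithmetic.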
### Environment & History
```javascript
// Environment: current state as key=value strings
const env = [
`ball_y=${ballY}`,
`paddle_y=${paddleY}`,
`gap=${gap}`,
`approaching=true`
];
// History: recent actions (max 6, server trims automatically)
const history = [
"move(up)", "move(up)", "stay()"
];
```
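A small sketch of maintaining these two arrays between calls. All three helper names are hypothetical, and the client-side cap simply mirrors the limit of 6 so the request stays small even before the server trims it.

```javascript
const HISTORY_LIMIT = 6;

// Formats an action as "name(arg, ...)", matching the history entries above.
function formatAction(fn, args = {}) {
  return `${fn}(${Object.values(args).join(', ')})`;
}

// Returns a new array with the action appended, keeping only the newest entries.
function pushHistory(actionHistory, action) {
  return [...actionHistory, action].slice(-HISTORY_LIMIT);
}

// Flattens a state object into key=value strings.
function buildEnv(state) {
  return Object.entries(state).map(([key, value]) => `${key}=${value}`);
}

let actionHistory = [];
for (let i = 0; i < 8; i++) {
  const direction = i % 2 === 0 ? 'up' : 'down';
  actionHistory = pushHistory(actionHistory, formatAction('move', { direction }));
}
console.log(actionHistory.length); // 6
console.log(buildEnv({ ball_y: 300, paddle_y: 200, approaching: true }));
// [ 'ball_y=300', 'paddle_y=200', 'approaching=true' ]
```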
### Domain System Prompts (v2)
On a v2 server, pass a domain-specific system prompt in the `system` field. Examples for three domains:
```javascript
// Use the one prompt that matches your app (distinct names so the snippet parses as a whole)
// Game AI
const SYSTEM_GAME = "You are the AI controller for a Pong game. Move the paddle to intercept the ball. React quickly.";
// Robotic arm
const SYSTEM_ARM = "You are the voice controller for a 6-axis robotic arm. Convert commands to precise function calls. Coordinates in mm.";
// Digital human
const SYSTEM_AVATAR = "You are the animation controller for a virtual streamer. Convert director instructions to expression and speech calls.";
```
## 6. Frontend Code Standards
### Required: Type-Safe Value Extraction
```javascript
// Values in args may be numbers, not strings, so always coerce
function safeStr(v) {
  if (v === null || v === undefined) return '';
  return String(v).trim().toLowerCase();
}

// `d` is the parsed response from /v1/function_call.
// Extract with args (named) first, heads (positional) as fallback
let direction = safeStr(d.args?.direction) || safeStr(d.heads?.arg1);
```
### Required: Validate Return Values
```javascript
const VALID = ['up', 'down', 'stay'];
if (!VALID.includes(direction)) {
  console.warn(`Invalid: "${direction}", fallback to stay`);
  direction = 'stay';
}
```
### Required: Error Handling with Fallback
```javascript
// SERVER_URL, request, applyAction and applyFallbackAI are defined by your app.
async function callAI() {
  try {
    const r = await fetch(SERVER_URL + '/v1/function_call', {
      method: 'POST',
      headers: {'Content-Type': 'application/json'},
      body: JSON.stringify(request)
    });
    const data = await r.json();
    if (!data.success) throw new Error(data.error);
    applyAction(data);
  } catch (e) {
    console.error('[AI] Failed:', e);
    applyFallbackAI();  // MUST have a fallback: never freeze the app
  }
}
```
### Required: Logging
```javascript
console.log(`[Game] Query: ${query}`);
console.log(`[Game] → ${data.function}(${JSON.stringify(data.args)}) ${data.latency_ms.toFixed(0)}ms`);
```
### Recommended: Debug UI Overlay
Show in a corner of your app: current query, raw response, latency (current + rolling average).
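One possible way to produce the latency figures for such an overlay is a small rolling-average tracker. `LatencyTracker` and its window size are illustrative assumptions; feed it `latency_ms` from each response and write `summary()` into whatever overlay element your app uses.

```javascript
// Rolling average over the last `size` latency samples.
class LatencyTracker {
  constructor(size = 20) {
    this.size = size;
    this.samples = [];
  }

  add(ms) {
    this.samples.push(ms);
    if (this.samples.length > this.size) this.samples.shift(); // drop the oldest sample
  }

  get current() {
    return this.samples.length ? this.samples[this.samples.length - 1] : 0;
  }

  get average() {
    if (this.samples.length === 0) return 0;
    return this.samples.reduce((sum, ms) => sum + ms, 0) / this.samples.length;
  }

  // Text for the overlay, e.g. assigned to an element's textContent.
  summary() {
    return `latency ${this.current.toFixed(0)}ms (avg ${this.average.toFixed(0)}ms)`;
  }
}

const tracker = new LatencyTracker(3);
[30, 40, 50, 60].forEach(ms => tracker.add(ms));
console.log(tracker.summary()); // latency 60ms (avg 50ms)
```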
## 7. Game Loop Pattern
**Decouple AI from rendering.** The AI loop runs at 10–16 Hz; the render loop runs at 60 fps.
```javascript
const AI_INTERVAL = 100; // 100ms = 10 Hz
let aiPending = false;

// Render loop (60fps), never blocks on AI
function gameLoop() {
  update();
  render();
  requestAnimationFrame(gameLoop);
}

// AI loop (async, non-blocking); skips a tick while a call is still in flight
async function aiLoop() {
  if (aiPending) return;
  aiPending = true;
  try {
    await callAI();
  } finally {
    aiPending = false; // always release the guard, even if callAI throws
  }
}

setInterval(aiLoop, AI_INTERVAL);
gameLoop();
```
## 8. FCClient Template
Drop-in client class for any HTML/JS application:
```javascript
class FCClient {
  constructor(url = 'http://localhost:8899') {
    this.url = url.replace(/\/$/, '');
  }

  async health() {
    try {
      const r = await fetch(`${this.url}/health`, {signal: AbortSignal.timeout(3000)});
      const d = await r.json();
      return {ok: d.loaded === true || d.status === 'ok', version: d.version};
    } catch (e) {
      return {ok: false};
    }
  }

  async call({query, tools, system, env, history, includeContent = false}) {
    const t0 = performance.now();
    try {
      const r = await fetch(`${this.url}/v1/function_call`, {
        method: 'POST',
        headers: {'Content-Type': 'application/json'},
        body: JSON.stringify({
          messages: [{role: 'user', content: query}],
          tools,
          system,  // v2: domain system prompt
          environment: env,
          history,
          include_content_head: includeContent
        })
      });
      const d = await r.json();
      return {...d, wall_ms: performance.now() - t0};
    } catch (e) {
      return {success: false, error: e.message, wall_ms: performance.now() - t0};
    }
  }
}
```
Usage:
```javascript
const ai = new FCClient('http://localhost:8899');
const result = await ai.call({
  query: "Ball is BELOW. Move down. Choose: up/down/stay",
  tools: TOOLS,
  system: "You are a Pong AI. Move paddle to intercept ball.",
  env: ["ball_y=300", "paddle_y=200", "gap=100"],
  history: ["move(down)", "move(down)"]
});
if (result.success) {
  console.log(`${result.function}(${JSON.stringify(result.args)}) in ${result.latency_ms}ms`);
}
```
## 9. Troubleshooting
| Symptom | Cause | Fix |
|---------|-------|-----|
| AI stuck / no movement | Query too vague | Add decision hints: "Move DOWN to intercept" |
| `.trim is not a function` | `args` values may be int | Use `String(v)` before `.trim()` |
| High latency (>100ms) | Too many heads / long query | Reduce tool params, shorten query/env |
| Wrong function called | Ambiguous tool descriptions | Add `enum`, improve `description` fields |
| `<\|null\|>` in all args | Model confused | Check tool param order matches expectations |
---
**Skill Version**: 2.0 — Supports v1/v2 server, multi-domain (game, robotics, avatar)
**Last Updated**: 2026-03