	# SimpleTool Skill — Real-Time AI Application Development

	> This is a skill file. Feed it to any AI coding assistant (Claude, Gemini, GPT, Cursor, etc.) as context, then describe the app you want. The AI will generate a working SimpleTool-powered application.
	>
	> Example prompt: "Read the attached SimpleTool skill, then build me a Pong game where AI controls one paddle in real-time."

	---

	## 1. What is SimpleTool?

	SimpleTool is a multi-head parallel decoding server for real-time LLM function calling. It runs on vLLM and decodes function name + arguments simultaneously instead of sequentially.

	```
	Traditional: function → arg1 → arg2 → ... (sequential, ~200-500ms)
	SimpleTool: [function, arg1, arg2, ...] (parallel, ~25-60ms)
	```

	Application domains: game AI, robotic arm control, digital human animation, IoT automation, and any other task that needs LLM decisions in under 100 ms.

	## 2. Server API

	Server default: `http://localhost:8899`

	### Endpoints
	| Method | Path | Description |
	|--------|------|-------------|
	| GET | `/health` | Health check, returns `{status, version, model}` |
	| POST | `/v1/function_call` | Multi-head parallel function call |

	### Request Format (v2)
	```javascript
	{
	  messages: [{role: 'user', content: 'your query'}],
	  tools: [...],                 // OpenAI-format tool definitions
	  system: "domain prompt",      // Domain-specific system prompt (v2)
	  environment: [...],           // Current state info (string array, optional)
	  history: [...],               // Action history (string array, max 6)
	  include_content_head: false   // Whether to generate <content> head
	}
	```

	The `system` field lets you inject a domain-specific system prompt (e.g., "You are a robotic arm controller"). If omitted, the server uses a generic default. The `environment` field is optional context folded into the user message.
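
As a rough sketch of how these fields fit together, the helper below assembles a request body. `buildRequest` is a hypothetical name, not part of SimpleTool. It leaves out `system` and `environment` when they are not supplied, and it keeps only the six most recent history entries as a client-side convenience.

```javascript
// Hypothetical helper (not part of SimpleTool): builds a /v1/function_call body.
function buildRequest({query, tools, system, environment = [], history = [], includeContent = false}) {
  const body = {
    messages: [{role: 'user', content: query}],
    tools,
    history: history.slice(-6),            // keep the 6 most recent actions
    include_content_head: includeContent
  };
  if (system) body.system = system;        // omitted: server uses its generic default
  if (environment.length > 0) body.environment = environment; // optional field
  return body;
}

const exampleBody = buildRequest({
  query: 'Ball is BELOW paddle. Choose: up/down/stay',
  tools: [],
  history: ['a1', 'a2', 'a3', 'a4', 'a5', 'a6', 'a7']
});
console.log(exampleBody.history.length); // 6
```

The result can be passed to `JSON.stringify` and sent as the POST body.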

	### Response Format
	```javascript
	{
	  success: true,
	  function: "move",
	  args: {direction: "up", speed: "fast"}, // Named args (param names from tool def)
	  heads: {                                // Raw per-head output
	    function: "move",
	    arg1: "up",
	    arg2: "fast",
	    arg3: "<|null|>"
	  },
	  content: null,                          // Only if include_content_head was true
	  latency_ms: 35.2
	}
	```

	## 3. Dynamic Head Count (Critical for Latency!)

	The server automatically prunes unused heads. If your tools have at most 2 parameters, only 3 heads are spawned (`<function>`, `<arg1>`, `<arg2>`), not 8. This saves ~40% latency.

	```
	Active heads = [<function>] + [<arg1>...<argN>]
	where N = max parameter count across all tool definitions
	```

	Design tip: Keep your tools to 1–3 parameters when possible. Fewer params = fewer heads = lower latency.
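
To check how many heads a tool set should produce, the formula above can be mirrored on the client. This is an illustrative sketch only: `countActiveHeads` is a hypothetical helper, and the server does its own pruning regardless.

```javascript
// Hypothetical helper: applies "1 + max parameter count" to a tools array.
function countActiveHeads(tools) {
  const maxParams = tools.reduce((max, tool) => {
    const props = tool.function?.parameters?.properties ?? {};
    return Math.max(max, Object.keys(props).length);
  }, 0);
  return 1 + maxParams; // <function> head plus <arg1>..<argN>
}

const demoTools = [
  {type: 'function', function: {name: 'move', parameters: {type: 'object', properties: {unit: {type: 'string'}, target: {type: 'string'}}}}},
  {type: 'function', function: {name: 'pass', parameters: {type: 'object', properties: {}}}}
];
console.log(countActiveHeads(demoTools)); // 3
```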

	## 4. Tool Definition

	### Constraints
	- Maximum 6 arguments per function (arg1–arg6)
	- Arguments map to `arg1, arg2, ...` in the order defined in `properties`
	- Server auto-converts types: numeric strings → int/float, otherwise lowercase string
	- Use `enum` to constrain options — this dramatically improves accuracy
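
The constraints above can be checked before any request is sent. The sketch below uses two hypothetical helpers: `checkTools` flags tools with more than six parameters, and `headMapping` shows which head each parameter is expected to land in, assuming the order of `Object.keys` matches the definition order (true for ordinary, non-numeric parameter names).

```javascript
const MAX_ARGS = 6; // arg1..arg6, per the constraint above

// Hypothetical pre-flight check: returns a list of problems (empty means OK).
function checkTools(tools) {
  const problems = [];
  for (const tool of tools) {
    const name = tool.function?.name ?? '(unnamed)';
    const params = Object.keys(tool.function?.parameters?.properties ?? {});
    if (params.length > MAX_ARGS) {
      problems.push(`${name}: ${params.length} parameters, maximum is ${MAX_ARGS}`);
    }
  }
  return problems;
}

// Maps each parameter name to its positional head, in definition order.
function headMapping(tool) {
  const params = Object.keys(tool.function?.parameters?.properties ?? {});
  return Object.fromEntries(params.map((p, i) => [p, `arg${i + 1}`]));
}

const moveTool = {type: 'function', function: {name: 'move', parameters: {type: 'object',
  properties: {direction: {type: 'string'}, speed: {type: 'string'}}}}};
console.log(headMapping(moveTool)); // { direction: 'arg1', speed: 'arg2' }
console.log(checkTools([moveTool])); // []
```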

	### Template
	```javascript
	const TOOLS = [{
	  type: "function",
	  function: {
	    name: "action_name",
	    description: "Clear, concise: what this action does and when to use it",
	    parameters: {
	      type: "object",
	      properties: {
	        param1: {
	          type: "string",
	          enum: ["opt_a", "opt_b", "opt_c"], // Constrain! Improves accuracy
	          description: "What this param controls"
	        },
	        param2: {
	          type: "number",
	          description: "Numeric value with unit, e.g. 'Force in Newtons'"
	        }
	      },
	      required: ["param1"]
	    }
	  }
	}];
	```

	### Multi-Tool Example (Game)
	```javascript
	const TOOLS = [
	{type:"function", function:{name:"move", description:"Move unit to position", parameters:{type:"object", properties:{unit:{type:"string"}, target:{type:"string", enum:["north","south","east","west"]}}}}},
	{type:"function", function:{name:"attack", description:"Attack enemy", parameters:{type:"object", properties:{unit:{type:"string"}, target:{type:"string"}}}}},
	{type:"function", function:{name:"retreat", description:"Pull back unit", parameters:{type:"object", properties:{unit:{type:"string"}}}}},
	{type:"function", function:{name:"pass", description:"Do nothing this turn", parameters:{type:"object", properties:{}}}}
	];
	// Max params = 2 → only 3 heads spawned
	```

	## 5. Query Design

	### Principles
	1. Be imperative — tell the model what to decide, not just describe state
	2. Include decision context — "Ball is BELOW paddle, intercept it" not "Ball y=250"
	3. List valid options — "Choose: up/down/stay"
	4. Keep it short — shorter query = faster prefill

	### Good vs Bad
	```
	✅ "Ball 50px BELOW paddle, approaching fast. Move DOWN to intercept. Choose: up/down/stay"
	❌ "Ball position: 250, Paddle position: 200. What should I do?"

	✅ "Red gear at (300,150,50). Move arm there slowly for pickup."
	❌ "There is a gear somewhere on the table. The arm needs to go to it."

	✅ "Stream starting, viewers saying hello. Greet them warmly."
	❌ "Viewers are in the chat. Do something appropriate."
	```
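
One way to apply these principles in a Pong game is to turn raw coordinates into an imperative query before sending it. `buildPongQuery` and its `deadZone` parameter are hypothetical, and the sketch assumes canvas coordinates where a larger y is lower on screen.

```javascript
// Hypothetical query builder: converts raw state into a short, imperative query.
function buildPongQuery(ballY, paddleY, deadZone = 10) {
  const gap = Math.round(ballY - paddleY);
  if (Math.abs(gap) <= deadZone) {
    return 'Ball is LEVEL with paddle. STAY in place. Choose: up/down/stay';
  }
  const where = gap > 0 ? 'BELOW' : 'ABOVE';
  const move = gap > 0 ? 'DOWN' : 'UP';
  return `Ball ${Math.abs(gap)}px ${where} paddle. Move ${move} to intercept. Choose: up/down/stay`;
}

console.log(buildPongQuery(250, 200));
// "Ball 50px BELOW paddle. Move DOWN to intercept. Choose: up/down/stay"
```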

	### Environment & History
	```javascript
	// Environment: current state as key=value strings
	const env = [
	  `ball_y=${ballY}`,
	  `paddle_y=${paddleY}`,
	  `gap=${gap}`,
	  `approaching=true`
	];

	// History: recent actions (max 6, server trims automatically)
	const history = [
	  "move(up)", "move(up)", "stay()"
	];
	```
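
A small sketch of how these two arrays might be maintained between AI ticks. The helper names are hypothetical, and the comma-separated form for multi-argument actions is an assumption, since the examples above only show zero or one argument.

```javascript
const MAX_HISTORY = 6; // matches the limit described above

// Formats an action like the history strings above, e.g. "move(up)" or "stay()".
// Assumption: multiple args are joined with commas.
function formatAction(fn, args = {}) {
  return `${fn}(${Object.values(args).join(',')})`;
}

// Returns a new history array with the action appended, capped at MAX_HISTORY.
function pushHistory(history, fn, args) {
  return [...history, formatAction(fn, args)].slice(-MAX_HISTORY);
}

// Builds key=value environment strings from a plain state object.
function toEnv(state) {
  return Object.entries(state).map(([k, v]) => `${k}=${v}`);
}

let recent = [];
recent = pushHistory(recent, 'move', {direction: 'up'});
recent = pushHistory(recent, 'stay');
console.log(recent);                                   // [ 'move(up)', 'stay()' ]
console.log(toEnv({ball_y: 300, approaching: true}));  // [ 'ball_y=300', 'approaching=true' ]
```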

	### Domain System Prompts (v2)
	For v2 server, set a domain-specific system prompt:
	```javascript
	// Pick one per application (distinct names so the snippet runs as written)
	// Game AI
	const SYSTEM_GAME = "You are the AI controller for a Pong game. Move the paddle to intercept the ball. React quickly.";

	// Robotic arm
	const SYSTEM_ARM = "You are the voice controller for a 6-axis robotic arm. Convert commands to precise function calls. Coordinates in mm.";

	// Digital human
	const SYSTEM_AVATAR = "You are the animation controller for a virtual streamer. Convert director instructions to expression and speech calls.";
	```

	## 6. Frontend Code Standards

	### Required: Type-Safe Value Extraction
	```javascript
	// Values in args may be int, not string: always coerce
	function safeStr(v) {
	  if (v === null || v === undefined) return '';
	  return String(v).trim().toLowerCase();
	}

	// Extract with args (named) first, heads (positional) as fallback
	let direction = safeStr(d.args?.direction) || safeStr(d.heads?.arg1);
	```

	### Required: Validate Return Values
	```javascript
	const VALID = ['up', 'down', 'stay'];
	if (!VALID.includes(direction)) {
	  console.warn(`Invalid: "${direction}", fallback to stay`);
	  direction = 'stay';
	}
	```
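
Coercion, the named-then-positional lookup, and validation can be folded into one helper. The sketch below is one possible shape, with hypothetical names (`normalize`, `pickEnum`). It also treats the `<|null|>` placeholder as empty, which is an assumption about how an unused head should be handled.

```javascript
const NULL_TOKEN = '<|null|>'; // placeholder seen in unused heads

// Coerces any value to a trimmed, lowercase string; null-like values become ''.
function normalize(v) {
  if (v === null || v === undefined) return '';
  const s = String(v).trim().toLowerCase();
  return s === NULL_TOKEN ? '' : s;
}

// Reads a named arg, falls back to a positional head, then checks an allow-list.
function pickEnum(response, name, head, valid, fallback) {
  const value = normalize(response?.args?.[name]) || normalize(response?.heads?.[head]);
  return valid.includes(value) ? value : fallback;
}

const DIRECTIONS = ['up', 'down', 'stay'];
console.log(pickEnum({args: {direction: ' UP '}}, 'direction', 'arg1', DIRECTIONS, 'stay')); // "up"
console.log(pickEnum({heads: {arg1: '<|null|>'}}, 'direction', 'arg1', DIRECTIONS, 'stay')); // "stay"
```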

	### Required: Error Handling with Fallback
	```javascript
	async function callAI() {
	  try {
	    const r = await fetch(SERVER_URL + '/v1/function_call', {
	      method: 'POST',
	      headers: {'Content-Type': 'application/json'},
	      body: JSON.stringify(request)
	    });
	    const data = await r.json();
	    if (!data.success) throw new Error(data.error);
	    applyAction(data);
	  } catch (e) {
	    console.error('[AI] Failed:', e);
	    applyFallbackAI(); // MUST have a fallback: never freeze the app
	  }
	}
	```
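
The snippet above calls `applyFallbackAI` without defining it. For a Pong paddle, one plausible rule-based fallback simply follows the ball. `fallbackDirection` and its `deadZone` parameter are hypothetical, and canvas coordinates (larger y is lower) are assumed.

```javascript
// Hypothetical non-LLM fallback: follow the ball, with a small dead zone to avoid jitter.
function fallbackDirection(ballY, paddleY, deadZone = 8) {
  const gap = ballY - paddleY; // positive: ball is below the paddle
  if (Math.abs(gap) <= deadZone) return 'stay';
  return gap > 0 ? 'down' : 'up';
}

console.log(fallbackDirection(300, 200)); // "down"
console.log(fallbackDirection(203, 200)); // "stay"
```

An `applyFallbackAI` implementation could feed this result into the same code path that handles a validated AI response.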

	### Required: Logging
	```javascript
	console.log(`[Game] Query: ${query}`);
	console.log(`[Game] → ${data.function}(${JSON.stringify(data.args)}) ${data.latency_ms.toFixed(0)}ms`);
	```

	### Recommended: Debug UI Overlay
	Show in a corner of your app: current query, raw response, latency (current + rolling average).
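
For the rolling average, a fixed-size window of recent samples is enough. The class below is a minimal sketch. `LatencyTracker` and the window size are assumptions, and writing the label into the DOM is left to the application.

```javascript
// Hypothetical helper: tracks the latest latency and a rolling average.
class LatencyTracker {
  constructor(windowSize = 20) {
    this.windowSize = windowSize;
    this.samples = [];
  }

  add(ms) {
    this.samples.push(ms);
    if (this.samples.length > this.windowSize) this.samples.shift(); // drop the oldest
  }

  get current() {
    return this.samples.length ? this.samples[this.samples.length - 1] : null;
  }

  get average() {
    if (this.samples.length === 0) return null;
    return this.samples.reduce((a, b) => a + b, 0) / this.samples.length;
  }

  label() {
    if (this.current === null) return 'latency: n/a';
    return `latency: ${this.current.toFixed(0)}ms (avg ${this.average.toFixed(0)}ms)`;
  }
}

const tracker = new LatencyTracker(3);
[30, 40, 50, 60].forEach(ms => tracker.add(ms));
console.log(tracker.label()); // "latency: 60ms (avg 50ms)"
```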

	## 7. Game Loop Pattern

	Decouple AI from rendering. The AI loop runs at 10–16 Hz; the render loop runs at 60 fps.

	```javascript
	const AI_INTERVAL = 100; // 100ms = 10 Hz
	let aiPending = false;

	// Render loop (60fps), never blocks on AI
	function gameLoop() {
	  update();
	  render();
	  requestAnimationFrame(gameLoop);
	}

	// AI loop (async, non-blocking); skips a tick while a request is in flight
	async function aiLoop() {
	  if (aiPending) return;
	  aiPending = true;
	  try {
	    await callAI();
	  } finally {
	    aiPending = false; // reset even if callAI throws, so the loop never stalls
	  }
	}

	setInterval(aiLoop, AI_INTERVAL);
	gameLoop();
	```

	## 8. FCClient Template

	Drop-in client class for any HTML/JS application:

	```javascript
	class FCClient {
	  constructor(url = 'http://localhost:8899') {
	    this.url = url.replace(/\/$/, '');
	  }

	  async health() {
	    try {
	      const r = await fetch(`${this.url}/health`, {signal: AbortSignal.timeout(3000)});
	      const d = await r.json();
	      return {ok: d.loaded === true || d.status === 'ok', version: d.version};
	    } catch (e) {
	      return {ok: false};
	    }
	  }

	  async call({query, tools, system, env, history, includeContent = false}) {
	    const t0 = performance.now();
	    try {
	      const r = await fetch(`${this.url}/v1/function_call`, {
	        method: 'POST',
	        headers: {'Content-Type': 'application/json'},
	        body: JSON.stringify({
	          messages: [{role: 'user', content: query}],
	          tools,
	          system, // v2: domain system prompt
	          environment: env,
	          history,
	          include_content_head: includeContent
	        })
	      });
	      const d = await r.json();
	      return {...d, wall_ms: performance.now() - t0};
	    } catch (e) {
	      return {success: false, error: e.message, wall_ms: performance.now() - t0};
	    }
	  }
	}
	```

	Usage:
	```javascript
	const ai = new FCClient('http://localhost:8899');

	const result = await ai.call({
	  query: "Ball is BELOW. Move down. Choose: up/down/stay",
	  tools: TOOLS,
	  system: "You are a Pong AI. Move paddle to intercept ball.",
	  env: ["ball_y=300", "paddle_y=200", "gap=100"],
	  history: ["move(down)", "move(down)"]
	});

	if (result.success) {
	  console.log(`${result.function}(${JSON.stringify(result.args)}) in ${result.latency_ms}ms`);
	}
	```

	## 9. Troubleshooting

	| Symptom | Cause | Fix |
	|---------|-------|-----|
	| AI stuck / no movement | Query too vague | Add decision hints: "Move DOWN to intercept" |
	| `.trim is not a function` | `args` values may be int | Use `String(v)` before `.trim()` |
	| High latency (>100ms) | Too many heads / long query | Reduce tool params, shorten query/env |
	| Wrong function called | Ambiguous tool descriptions | Add `enum`, improve `description` fields |
	| `<\|null\|>` in all args | Model confused | Check tool param order matches expectations |
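
The last symptom can be detected in code and logged. The check below is a sketch with a hypothetical name, `allArgsNull`. Note that a tool with no parameters (such as a `pass` action) would legitimately produce the same pattern, so treat a hit as a hint rather than an error.

```javascript
// Returns true when every <argN> head came back as the null placeholder.
function allArgsNull(heads) {
  const argKeys = Object.keys(heads ?? {}).filter(k => /^arg\d+$/.test(k));
  return argKeys.length > 0 && argKeys.every(k => heads[k] === '<|null|>');
}

console.log(allArgsNull({function: 'move', arg1: '<|null|>', arg2: '<|null|>'})); // true
console.log(allArgsNull({function: 'move', arg1: 'up', arg2: '<|null|>'}));       // false
```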

	---

	Skill Version: 2.0 — Supports v1/v2 server, multi-domain (game, robotics, avatar)
	Last Updated: 2026-03