denhit10 Claude Sonnet 4.6 committed on
Commit dc89ddf · 0 Parent(s)

Initial release — agent-visibility dashboard

Real-time debug dashboard for multi-agent AI systems:
- Topology canvas with clickable agent nodes that expand into
  per-kind operation sub-nodes (generate, embed, retrieve, tool)
- Full LLM turn inspector: messages in, response out, thinking
- Tool call traces with full input/output (not truncated)
- Embeddings, retrievals, memory panel, plan tab, event log
- Chronological sequence numbers and collapsible dropdowns
- Canvas overlay panel tied to Tools tab selection
- Three built-in demo scenarios with realistic prompt/response data
- HTTP POST API + optional MCP bridge (agentscope)
- Zero dependencies, Node ≥ 18

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Files changed (7)
  1. .gitignore +6 -0
  2. README.md +133 -0
  3. agentscope/agentscope.js +115 -0
  4. bin/visibility.js +70 -0
  5. package.json +19 -0
  6. src/dashboard.html +806 -0
  7. src/server.js +615 -0
.gitignore ADDED
@@ -0,0 +1,6 @@
+ node_modules/
+ .wrangler/
+ .claude/
+ *.log
+ .env
+ .DS_Store
README.md ADDED
@@ -0,0 +1,133 @@
+ # agent-visibility
+
+ Real-time debug dashboard for multi-agent AI systems.
+
+ Plug it into any agent framework via HTTP or MCP and get an instant view of:
+
+ - **Topology graph** — live agent nodes, hierarchy lines, message arrows; click any node to expand into operation sub-nodes
+ - **LLM turn inspector** — full prompt messages, model response, and optional thinking/scratchpad for every generation
+ - **Tool call traces** — full input/output for every tool call, with success/error status and latency
+ - **Embeddings & retrievals** — query text, top results, similarity scores
+ - **Memory panel** — key/value store with read/write flash animations
+ - **Plan & event log** — task plan with completion state, timestamped event stream
+
+ ![screenshot placeholder](docs/screenshot.png)
+
+ ---
+
+ ## Quick start
+
+ ```bash
+ # no install needed — zero dependencies
+ node bin/visibility.js
+ # → Dashboard at http://localhost:4242
+ ```
+
+ Click one of the built-in demo scenarios (Research + code, Critic retry loop, Memory overflow) to see a full run with realistic LLM prompts and responses.
+
+ ---
+
+ ## Send data from your agent
+
+ ### Option A — HTTP POST (any language)
+
+ ```bash
+ curl -X POST http://localhost:4242/tool \
+   -H 'Content-Type: application/json' \
+   -d '{"tool":"register_agent","args":{"id":"my-agent","label":"My Agent","role":"worker","model":"claude-sonnet-4-5"}}'
+ ```
+
+ ### Option B — MCP bridge
+
+ ```bash
+ node bin/visibility.js --mcp
+ # → MCP SSE endpoint at http://localhost:4243/sse
+ ```
+
+ Add to your agent's MCP config:
+
+ ```json
+ {
+   "mcpServers": {
+     "agentscope": { "url": "http://localhost:4243/sse" }
+   }
+ }
+ ```
+
+ ---
+
+ ## Available tools
+
+ | Tool | Purpose |
+ |---|---|
+ | `register_agent` | Register an agent (id, label, role, model, hierarchy) |
+ | `set_goal` | Set the run goal and start the timer |
+ | `set_agent_state` | Update agent status (`running`, `done`, `error`, …) |
+ | `log_event` | Log a timestamped event to the event stream |
+ | `log_llm_turn` | **Full LLM turn** — messages in, response out, optional thinking |
+ | `log_generation` | Token-count-only generation (lightweight alternative) |
+ | `log_tool_call` | Tool call with full input/output |
+ | `log_embedding` | Embedding call (text, model, dims) |
+ | `log_retrieval` | Retrieval call (query, results with scores) |
+ | `trace_step` | Draw an arrow between two agents on the graph |
+ | `set_memory` | Write/read a value in the memory panel |
+ | `set_plan` | Publish the task plan |
+ | `finish_run` | Mark the run as done or errored |
+
+ ### Logging a full LLM turn
+
+ ```bash
+ curl -X POST http://localhost:4242/tool \
+   -H 'Content-Type: application/json' \
+   -d '{
+     "tool": "log_llm_turn",
+     "args": {
+       "agent": "researcher",
+       "model": "claude-haiku-4-5",
+       "prompt_tokens": 1840,
+       "completion_tokens": 620,
+       "latency_ms": 1320,
+       "stop_reason": "end_turn",
+       "messages": [
+         {"role": "system", "content": "You are a researcher agent…"},
+         {"role": "user", "content": "Explain quicksort."}
+       ],
+       "response": "Quicksort is a divide-and-conquer algorithm…"
+     }
+   }'
+ ```
+
+ ---
+
+ ## Canvas interaction
+
+ - **Click an agent node** → expands into operation-type sub-nodes (generate, embed, retrieve, tool) with counts and stats
+ - **Click a tool dropdown** → highlights the agent node on the canvas and shows an info overlay
+
+ ---
+
+ ## Ports
+
+ | Port | Service |
+ |---|---|
+ | `4242` | Dashboard HTTP server + SSE stream |
+ | `4243` | MCP bridge (only with `--mcp`) |
+
+ Override with `--port` / `--mcp-port` flags or `VISIBILITY_PORT` / `VISIBILITY_MCP_PORT` env vars.
+
+ ---
+
+ ## File layout
+
+ ```
+ bin/visibility.js          CLI entry point
+ src/server.js              HTTP + SSE dashboard server
+ src/dashboard.html         Dark-theme UI (served by the node server)
+ agentscope/agentscope.js   MCP bridge (forwards tool calls to the dashboard)
+ ```
+
+ ---
+
+ ## License
+
+ MIT
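
The HTTP path described in the README can be exercised from Node as well as from curl. The following is a minimal sketch, not part of this commit: the `/tool` endpoint and the `{ tool, args }` body shape come from the README, and the argument names follow the tool schemas in `agentscope/agentscope.js`. The helper names `toolCall`, `minimalRun`, and `sendRun`, the `DASHBOARD_URL` variable, and the `--send` switch are hypothetical and exist only for illustration. It assumes Node ≥ 18, where `fetch` is a global.

```javascript
// Hypothetical helpers; only the /tool endpoint and the { tool, args } shape come from the README.
const DASHBOARD_URL = process.env.DASHBOARD_URL || 'http://localhost:4242';

// Build the JSON body for one dashboard tool call.
function toolCall(tool, args = {}) {
  return JSON.stringify({ tool, args });
}

// The ordered request bodies for a minimal one-agent run.
function minimalRun(agentId, goal) {
  return [
    toolCall('register_agent', { id: agentId, label: agentId, role: 'worker' }),
    toolCall('set_goal', { goal }),
    toolCall('set_agent_state', { agent_id: agentId, status: 'running' }),
    toolCall('log_event', { agent: agentId, event_type: 'start', message: 'run started' }),
    toolCall('set_agent_state', { agent_id: agentId, status: 'done' }),
    toolCall('finish_run', { status: 'done' }),
  ];
}

// POST each body in order, stopping at the first non-2xx response.
async function sendRun(bodies, baseUrl = DASHBOARD_URL) {
  for (const body of bodies) {
    const res = await fetch(`${baseUrl}/tool`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body,
    });
    if (!res.ok) throw new Error(`Dashboard returned HTTP ${res.status}`);
  }
}

// Nothing is sent unless --send is passed, so the builders can be checked offline.
if (process.argv.includes('--send')) {
  sendRun(minimalRun('my-agent', 'Explain quicksort')).catch(err => {
    console.error(err.message);
    process.exitCode = 1;
  });
}
```

Keeping the payload builders separate from the network call is a design choice of this sketch: the bodies can be inspected or unit-tested without a running dashboard.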
agentscope/agentscope.js ADDED
@@ -0,0 +1,115 @@
+ #!/usr/bin/env node
+ /**
+  * agentscope — MCP bridge for agent-visibility
+  *
+  * Agents connect here via MCP (SSE transport). Tool calls are forwarded to
+  * the dashboard server at DASHBOARD (http://localhost:VISIBILITY_PORT, default 4242).
+  *
+  * Usage:
+  *   node agentscope/agentscope.js
+  *
+  * MCP config for your agent:
+  *   { "mcpServers": { "agentscope": { "url": "http://localhost:4243/sse" } } }
+  */
+ 'use strict';
+ const http = require('http');
+
+ const MCP_PORT = parseInt(process.env.VISIBILITY_MCP_PORT || '4243', 10);
+ const DASHBOARD = `http://localhost:${process.env.VISIBILITY_PORT || '4242'}`;
+
+ // ── Tool definitions ──────────────────────────────────────────────────────────
+ const TOOLS = [
+   { name: 'register_agent', description: 'Register an agent with the visibility dashboard.', inputSchema: { type:'object', required:['id','label','role'], properties: { id:{type:'string'}, label:{type:'string'}, role:{type:'string',enum:['orchestrator','worker','researcher','coder','critic','synthesiser']}, model:{type:'string'}, reports_to:{type:'string'}, token_budget:{type:'number'}, color:{type:'string'} } } },
+   { name: 'log_event', description: 'Log an agent event to the dashboard.', inputSchema: { type:'object', required:['agent','event_type','message'], properties: { agent:{type:'string'}, event_type:{type:'string',enum:['start','plan','route','reply','tool','result','pass','fail','retry','warn','error','done']}, message:{type:'string'}, tokens:{type:'number'}, latency_ms:{type:'number'}, metadata:{type:'object'} } } },
+   { name: 'log_llm_turn', description: 'Log a full LLM conversation turn (messages in + response out + optional thinking). Use this to expose the exact context sent to and received from the model.',
+     inputSchema: { type:'object', required:['agent'], properties: { agent:{type:'string'}, model:{type:'string'}, prompt_tokens:{type:'number'}, completion_tokens:{type:'number'}, latency_ms:{type:'number'}, stop_reason:{type:'string'}, messages:{type:'array',items:{type:'object',properties:{role:{type:'string'},content:{type:'string'}}}}, response:{type:'string'}, thinking:{type:'string'} } } },
+   { name: 'trace_step', description: 'Draw an arrow between two agents on the canvas.', inputSchema: { type:'object', required:['from_agent','to_agent'], properties: { from_agent:{type:'string'}, to_agent:{type:'string'}, label:{type:'string'}, arrow_type:{type:'string',enum:['msg','result','retry','tool']} } } },
+   { name: 'set_memory', description: 'Write a value to the shared memory panel.', inputSchema: { type:'object', required:['key','value'], properties: { key:{type:'string'}, value:{type:'string'}, op:{type:'string',enum:['write','read']} } } },
+   { name: 'set_agent_state', description: 'Update an agent status on the dashboard.', inputSchema: { type:'object', required:['agent_id','status'], properties: { agent_id:{type:'string'}, status:{type:'string',enum:['idle','running','active','done','error']} } } },
+   { name: 'set_goal', description: 'Set the run goal and mark the run as started.', inputSchema: { type:'object', required:['goal'], properties: { goal:{type:'string'}, run_id:{type:'string'} } } },
+   { name: 'set_plan', description: 'Publish the task plan to the Plan tab.', inputSchema: { type:'object', required:['tasks'], properties: { tasks:{type:'array'} } } },
+   { name: 'finish_run', description: 'Mark the current run as complete.', inputSchema: { type:'object', properties: { status:{type:'string',enum:['done','error']} } } },
+ ];
+
+ // ── Forward tool call to dashboard ────────────────────────────────────────────
+ function forward(tool, args) {
+   return new Promise(resolve => {
+     const body = JSON.stringify({ tool, args });
+     const req = http.request(DASHBOARD + '/tool', {
+       method: 'POST',
+       headers: { 'Content-Type': 'application/json', 'Content-Length': Buffer.byteLength(body) },
+     }, res => {
+       let data = '';
+       res.on('data', c => data += c);
+       res.on('end', () => { try { resolve(JSON.parse(data)); } catch (_) { resolve({ ok: true }); } });
+     });
+     req.on('error', err => resolve({ ok: false, error: `Dashboard unreachable: ${err.message}` }));
+     req.write(body); req.end();
+   });
+ }
+
+ // ── MCP message handling ──────────────────────────────────────────────────────
+ async function handleMsg(msg, send) {
+   const { id, method, params } = msg;
+   if (method === 'initialize') {
+     send({ jsonrpc:'2.0', id, result: { protocolVersion:'2024-11-05', capabilities:{ tools:{} }, serverInfo:{ name:'agentscope', version:'1.0.0' } } });
+   } else if (method === 'tools/list') {
+     send({ jsonrpc:'2.0', id, result: { tools: TOOLS } });
+   } else if (method === 'tools/call') {
+     const { name, arguments: args } = params || {};
+     const found = TOOLS.find(t => t.name === name);
+     if (!found) { send({ jsonrpc:'2.0', id, error:{ code:-32601, message:`Unknown tool: ${name}` } }); return; }
+     const result = await forward(name, args || {});
+     send({ jsonrpc:'2.0', id, result:{ content:[{ type:'text', text:JSON.stringify(result) }], isError: result.ok === false } });
+   } else if (method === 'notifications/initialized') {
+     // no response
+   } else if (id !== undefined) {
+     send({ jsonrpc:'2.0', id, error:{ code:-32601, message:`Method not found: ${method}` } });
+   }
+ }
+
+ // ── HTTP server (SSE transport) ────────────────────────────────────────────────
+ const sessions = new Map();
+ const CORS = { 'Access-Control-Allow-Origin':'*', 'Access-Control-Allow-Methods':'GET, POST, OPTIONS', 'Access-Control-Allow-Headers':'Content-Type, Accept' };
+ function readBody(req, cb) { let d=''; req.on('data', c=>d+=c); req.on('end', ()=>cb(d)); }
+
+ http.createServer((req, res) => {
+   if (req.method === 'OPTIONS') { res.writeHead(204, CORS); res.end(); return; }
+
+   if (req.method === 'GET' && req.url === '/sse') {
+     const sid = `s_${Date.now()}_${Math.random().toString(36).slice(2)}`;
+     res.writeHead(200, { ...CORS, 'Content-Type':'text/event-stream', 'Cache-Control':'no-cache', 'Connection':'keep-alive' });
+     const send = obj => { try { res.write(`data: ${JSON.stringify(obj)}\n\n`); } catch (_) {} };
+     res.write(`event: endpoint\ndata: /message?sessionId=${sid}\n\n`);
+     sessions.set(sid, { send });
+     req.on('close', () => sessions.delete(sid));
+     return;
+   }
+
+   if (req.method === 'POST' && req.url.startsWith('/message')) {
+     const sid = new URL(req.url, 'http://localhost').searchParams.get('sessionId');
+     const session = sessions.get(sid);
+     if (!session) { res.writeHead(404, { ...CORS, 'Content-Type':'application/json' }); res.end(JSON.stringify({ error:'Session not found' })); return; }
+     readBody(req, async data => {
+       let msg;
+       try { msg = JSON.parse(data); } catch (_) { res.writeHead(400, { ...CORS }); res.end('{}'); return; }
+       res.writeHead(202, { ...CORS, 'Content-Type':'application/json' }); res.end('{"ok":true}');
+       await handleMsg(msg, session.send);
+     });
+     return;
+   }
+
+   if (req.url === '/health') {
+     res.writeHead(200, { ...CORS, 'Content-Type':'application/json' });
+     res.end(JSON.stringify({ ok:true, tools: TOOLS.map(t => t.name), dashboard: DASHBOARD }));
+     return;
+   }
+
+   res.writeHead(404, { ...CORS, 'Content-Type':'application/json' }); res.end('{"error":"Not found"}');
+ }).listen(MCP_PORT, () => {
+   console.log(`\n agentscope — MCP bridge\n`);
+   console.log(` SSE: http://localhost:${MCP_PORT}/sse`);
+   console.log(` Dashboard: ${DASHBOARD}\n`);
+   console.log(` Add to your agent config:`);
+   console.log(` { "mcpServers": { "agentscope": { "url": "http://localhost:${MCP_PORT}/sse" } } }\n`);
+ });
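
From the client side, the bridge above implies a two-step exchange: a `GET /sse` stream whose first frame is an `endpoint` event carrying the `/message?sessionId=…` path, followed by JSON-RPC requests POSTed to that path, with responses arriving back on the stream as plain `data:` frames. The sketch below illustrates that wire format offline and is not part of this commit. The helper names `parseSseFrames`, `rpcRequest`, and `toolsCall` and the sample session id are hypothetical, and the parser is deliberately simplified (it trims field values and ignores `id:`, `retry:`, and comment lines).

```javascript
// Split raw SSE text into frames of { event, data }.
// Frames are separated by a blank line; a frame with no "event:" field is a "message".
// Simplified: values are trimmed, and id:/retry:/comment lines are ignored.
function parseSseFrames(text) {
  return text
    .split('\n\n')
    .filter(block => block.trim() !== '')
    .map(block => {
      const frame = { event: 'message', data: '' };
      const dataLines = [];
      for (const line of block.split('\n')) {
        if (line.startsWith('event:')) frame.event = line.slice(6).trim();
        else if (line.startsWith('data:')) dataLines.push(line.slice(5).trim());
      }
      frame.data = dataLines.join('\n');
      return frame;
    });
}

// Build a JSON-RPC 2.0 request object.
function rpcRequest(id, method, params) {
  return { jsonrpc: '2.0', id, method, params };
}

// Build the tools/call request that handleMsg forwards to the dashboard.
function toolsCall(id, name, args) {
  return rpcRequest(id, 'tools/call', { name, arguments: args });
}

// Sample stream: the endpoint frame the bridge writes on connect,
// followed by one JSON-RPC response (a made-up tools/list reply).
const sample =
  'event: endpoint\ndata: /message?sessionId=s_1_abc\n\n' +
  'data: {"jsonrpc":"2.0","id":1,"result":{"tools":[]}}\n\n';

const frames = parseSseFrames(sample);
const endpoint = frames.find(f => f.event === 'endpoint').data;

console.log(endpoint);
console.log(JSON.stringify(toolsCall(2, 'set_goal', { goal: 'Explain quicksort' })));
```

A real client would POST the `toolsCall(...)` object to `endpoint` on port 4243 and match the response on the stream by its `id`.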
bin/visibility.js ADDED
@@ -0,0 +1,70 @@
+ #!/usr/bin/env node
+ /**
+  * visibility — agent-visibility CLI
+  *
+  * Usage:
+  *   visibility                  dashboard on :4242, opens browser
+  *   visibility --mcp            dashboard + MCP bridge on :4243
+  *   visibility --port 5000      custom dashboard port
+  *   visibility --mcp-port 5001  custom MCP port
+  *   visibility --no-open        don't auto-open browser
+  *   visibility --help
+  */
+ 'use strict';
+ const path = require('path');
+ const { execSync, spawn } = require('child_process');
+
+ const argv = process.argv.slice(2);
+ const flags = { mcp:false, noOpen:false, help:false, port:parseInt(process.env.VISIBILITY_PORT || '4242', 10), mcpPort:parseInt(process.env.VISIBILITY_MCP_PORT || '4243', 10) }; // env vars set the defaults; flags override them
+ for (let i = 0; i < argv.length; i++) {
+   if (argv[i] === '--mcp') flags.mcp = true;
+   if (argv[i] === '--no-open') flags.noOpen = true;
+   if (argv[i] === '--help' || argv[i] === '-h') flags.help = true;
+   if (argv[i] === '--port' && argv[i+1]) flags.port = parseInt(argv[++i], 10);
+   if (argv[i] === '--mcp-port' && argv[i+1]) flags.mcpPort = parseInt(argv[++i], 10);
+ }
+
+ if (flags.help) {
+   console.log(`
+   agent-visibility
+
+   Commands:
+     visibility                  dashboard on :4242, opens browser
+     visibility --mcp            dashboard + MCP bridge on :4243
+     visibility --port 5000      custom dashboard port
+     visibility --mcp-port 5001  custom MCP port
+     visibility --no-open        suppress auto browser open
+     visibility --help
+
+   MCP config (after running with --mcp):
+     { "mcpServers": { "agentscope": { "url": "http://localhost:4243/sse" } } }
+ `);
+   process.exit(0);
+ }
+
+ const env = { ...process.env, VISIBILITY_PORT: String(flags.port), VISIBILITY_MCP_PORT: String(flags.mcpPort) };
+ const children = [];
+
+ function spawn_(script) {
+   const child = spawn(process.execPath, [script], { stdio:'inherit', env });
+   children.push(child);
+   child.on('exit', code => { if (code) process.exit(code); });
+ }
+
+ function shutdown() { children.forEach(c => { try { c.kill('SIGTERM'); } catch (_) {} }); }
+ process.on('SIGINT', shutdown);
+ process.on('SIGTERM', shutdown);
+
+ spawn_(path.join(__dirname, '..', 'src', 'server.js'));
+
+ if (flags.mcp) {
+   setTimeout(() => spawn_(path.join(__dirname, '..', 'agentscope', 'agentscope.js')), 400);
+ }
+
+ if (!flags.noOpen) {
+   setTimeout(() => {
+     const url = `http://localhost:${flags.port}`;
+     const cmd = process.platform === 'win32' ? `start "" "${url}"` : process.platform === 'darwin' ? `open "${url}"` : `xdg-open "${url}"`;
+     try { execSync(cmd, { stdio:'ignore' }); } catch (_) {}
+   }, 900);
+ }
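
The flag loop in `bin/visibility.js` accepts whatever `parseInt` returns, so a value such as `--port abc` yields `NaN` and unknown options are silently ignored. One possible stricter variant is sketched below; it is an illustration under stated assumptions, not the commit's implementation. The function names `parsePort` and `parseFlags` and the `{ flags, errors }` return shape are hypothetical, while the flag names and the `VISIBILITY_PORT` / `VISIBILITY_MCP_PORT` variables are taken from the file above.

```javascript
// Parse a port value; returns null unless it is a plain integer in 1..65535.
function parsePort(value) {
  const n = Number.parseInt(value, 10);
  const isPlainInteger = Number.isInteger(n) && String(n) === String(value).trim();
  return isPlainInteger && n >= 1 && n <= 65535 ? n : null;
}

// Parse CLI arguments into flags, collecting problems instead of ignoring them.
// Environment variables provide defaults; explicit flags override them.
function parseFlags(argv, env = {}) {
  const flags = {
    mcp: false,
    noOpen: false,
    help: false,
    port: parsePort(env.VISIBILITY_PORT ?? '4242') ?? 4242,
    mcpPort: parsePort(env.VISIBILITY_MCP_PORT ?? '4243') ?? 4243,
  };
  const errors = [];
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (arg === '--mcp') flags.mcp = true;
    else if (arg === '--no-open') flags.noOpen = true;
    else if (arg === '--help' || arg === '-h') flags.help = true;
    else if (arg === '--port' || arg === '--mcp-port') {
      const port = parsePort(argv[++i]); // consume the value that follows the flag
      if (port === null) errors.push(`${arg} expects a port between 1 and 65535`);
      else if (arg === '--port') flags.port = port;
      else flags.mcpPort = port;
    } else {
      errors.push(`Unknown option: ${arg}`);
    }
  }
  return { flags, errors };
}

const { flags, errors } = parseFlags(process.argv.slice(2), process.env);
if (errors.length > 0) {
  console.error(errors.join('\n'));
  process.exitCode = 1;
} else {
  console.log(JSON.stringify(flags));
}
```

Returning the errors rather than exiting inside the parser keeps the function pure, which makes it straightforward to test with plain arrays.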
package.json ADDED
@@ -0,0 +1,19 @@
+ {
+   "name": "agent-visibility",
+   "version": "0.1.0",
+   "description": "Real-time debug dashboard for multi-agent AI systems — topology graph, LLM turn inspector, tool call traces, memory panel",
+   "main": "src/server.js",
+   "bin": {
+     "visibility": "bin/visibility.js"
+   },
+   "scripts": {
+     "start": "node bin/visibility.js",
+     "start:mcp": "node bin/visibility.js --mcp",
+     "dev": "node bin/visibility.js --no-open"
+   },
+   "keywords": ["agent", "llm", "debug", "visibility", "observability", "mcp", "multi-agent"],
+   "license": "MIT",
+   "engines": {
+     "node": ">=18"
+   }
+ }
src/dashboard.html ADDED
@@ -0,0 +1,806 @@
+ <!DOCTYPE html>
+ <html lang="en">
+ <head>
+ <meta charset="UTF-8"><meta name="viewport" content="width=device-width,initial-scale=1.0">
+ <title>Agent Visibility</title>
+ <link href="https://fonts.googleapis.com/css2?family=IBM+Plex+Mono:wght@400;500&family=Inter:wght@300;400;500&display=swap" rel="stylesheet">
+ <style>
+ :root{
+   --bg:#0d0f12;--bg2:#161820;--bg3:#1e2028;--border:rgba(255,255,255,.08);--border2:rgba(255,255,255,.14);
+   --text:#e2e4e8;--muted:#5a6070;--purple:#8b7cf8;--teal:#2dd4b0;--amber:#f59e0b;--coral:#f87171;--blue:#60a5fa;--green:#4ade80;
+ }
+ *,*::before,*::after{box-sizing:border-box;margin:0;padding:0}
+ html,body{height:100%;background:var(--bg);color:var(--text);font-family:'Inter',sans-serif;font-size:13px;overflow:hidden}
+
+ /* ── Shell ── */
+ .shell{display:grid;grid-template-rows:48px 1fr;height:100vh}
+
+ /* ── Top bar ── */
+ .bar{display:flex;align-items:center;gap:12px;padding:0 20px;border-bottom:1px solid var(--border);background:var(--bg2)}
+ .logo{font-family:'IBM Plex Mono',monospace;font-size:13px;font-weight:500;color:var(--purple);letter-spacing:-.01em}
+ .goal{flex:1;font-size:12px;color:var(--muted);white-space:nowrap;overflow:hidden;text-overflow:ellipsis;padding:0 12px}
+ .badge{font-size:11px;padding:3px 9px;border-radius:12px;border:1px solid var(--border);color:var(--muted);font-weight:500}
+ .badge.live{border-color:var(--green);color:var(--green)}.badge.dead{border-color:var(--coral);color:var(--coral)}
+ .badge.running{border-color:var(--teal);color:var(--teal)}.badge.done{border-color:var(--green);color:var(--green)}.badge.error{border-color:var(--coral);color:var(--coral)}
+ .btn-reset{background:var(--bg3);color:var(--muted);border:1px solid var(--border2);border-radius:7px;padding:5px 12px;font-size:12px;font-family:'Inter',sans-serif;cursor:pointer}
+ .btn-reset:hover{color:var(--text);border-color:var(--border2)}
+
+ /* ── Main grid ── */
+ .grid{display:grid;grid-template-columns:200px 1fr;height:100%;overflow:hidden}
+
+ /* ── Left sidebar ── */
+ .sidebar{display:flex;flex-direction:column;border-right:1px solid var(--border);overflow:hidden}
+ .section{padding:12px 14px;border-bottom:1px solid var(--border)}
+ .section-label{font-size:10px;font-weight:600;letter-spacing:.08em;text-transform:uppercase;color:var(--muted);margin-bottom:8px}
+
+ /* scenario buttons */
+ .demo-btn{display:block;width:100%;text-align:left;background:transparent;border:1px solid var(--border);border-radius:8px;padding:8px 10px;margin-bottom:6px;cursor:pointer;color:var(--muted);font-family:'Inter',sans-serif;font-size:12px;transition:border-color .15s,color .15s}
+ .demo-btn:last-child{margin-bottom:0}
+ .demo-btn:hover{border-color:var(--purple);color:var(--text)}
+ .demo-btn strong{display:block;font-size:12px;font-weight:500;color:var(--text);margin-bottom:1px}
+ .demo-btn span{font-size:10px}
+
+ /* agents list */
+ .agents-scroll{flex:1;overflow-y:auto;padding:10px 14px}
+ .agents-scroll::-webkit-scrollbar{width:2px}.agents-scroll::-webkit-scrollbar-thumb{background:var(--bg3)}
+ .agent-row{display:flex;align-items:center;gap:8px;padding:7px 9px;border-radius:8px;margin-bottom:4px;border:1px solid transparent;transition:border-color .2s,background .2s}
+ .agent-row.idle{border-color:var(--border)}.agent-row.registered{border-color:rgba(139,124,248,.25)}
+ .agent-row.running{border-color:var(--teal);background:rgba(45,212,176,.04)}.agent-row.active{border-color:var(--purple);background:rgba(139,124,248,.04)}
+ .agent-row.done{border-color:rgba(74,222,128,.3)}.agent-row.error{border-color:var(--coral)}
+ .agent-dot{width:7px;height:7px;border-radius:50%;flex-shrink:0;background:var(--muted)}
+ .agent-row.running .agent-dot{background:var(--teal)}.agent-row.active .agent-dot{background:var(--purple)}
+ .agent-row.done .agent-dot{background:var(--green)}.agent-row.error .agent-dot{background:var(--coral)}
+ .agent-row.registered .agent-dot{background:rgba(139,124,248,.6)}
+ .agent-name{font-size:12px;font-weight:500;flex:1;overflow:hidden;text-overflow:ellipsis;white-space:nowrap}
+ .agent-role{font-size:10px;padding:1px 5px;border-radius:4px;white-space:nowrap;flex-shrink:0}
+
+ /* ── Right panel ── */
+ .main{display:flex;flex-direction:column;overflow:hidden}
+
+ /* canvas */
+ .canvas-wrap{position:relative;flex-shrink:0;border-bottom:1px solid var(--border)}
+ canvas{display:block;width:100%}
+ .tool-overlay{position:absolute;background:var(--bg2);border:1px solid var(--border2);border-radius:9px;min-width:210px;max-width:270px;pointer-events:none;z-index:10;box-shadow:0 6px 20px rgba(0,0,0,.5);overflow:hidden}
+ .tool-overlay .tool-blocks{padding:6px 6px 4px 6px}
+ .tool-overlay-hdr{display:flex;align-items:center;gap:6px;padding:6px 8px 5px;border-bottom:1px solid var(--border)}
+ .tool-overlay-hdr .tool-kind{font-size:10px}
+ .tool-overlay-hdr .tool-seq{font-size:10px;color:var(--muted);font-family:'IBM Plex Mono',monospace}
+ .tool-overlay-hdr .tool-agent-name{font-size:11px;font-weight:600;flex:1}
+
+ /* tabs */
+ .tabs{display:flex;border-bottom:1px solid var(--border);flex-shrink:0;background:var(--bg2)}
+ .tab{font-size:12px;font-weight:500;padding:9px 16px;cursor:pointer;color:var(--muted);border-bottom:2px solid transparent;margin-bottom:-1px;transition:color .15s}
+ .tab:hover{color:var(--text)}.tab.active{color:var(--text);border-bottom-color:var(--purple)}
+ .tab-panel{display:none;flex:1;overflow-y:auto;padding:12px 14px}
+ .tab-panel::-webkit-scrollbar{width:2px}.tab-panel::-webkit-scrollbar-thumb{background:var(--bg3)}
+ .tab-panel.active{display:block}
+ .empty{color:var(--muted);font-size:12px;font-style:italic;padding:8px 0}
+
+ /* ── Log tab ── */
+ .log-row{display:flex;gap:8px;align-items:flex-start;padding:5px 0;border-bottom:1px solid rgba(255,255,255,.04)}
+ .log-row:last-child{border-bottom:none}
+ .log-tag{font-size:10px;padding:2px 6px;border-radius:4px;white-space:nowrap;flex-shrink:0;margin-top:1px;font-weight:500}
+ .log-msg{font-size:12px;line-height:1.45;color:#c8cad0;flex:1;word-break:break-word}
+ .log-time{font-family:'IBM Plex Mono',monospace;font-size:10px;color:var(--muted);white-space:nowrap;flex-shrink:0}
+
+ /* ── Tools tab ── */
+ .tool-item{border-bottom:1px solid rgba(255,255,255,.04)}
+ .tool-item:last-child{border-bottom:none}
+ .tool-item summary{display:flex;gap:8px;align-items:center;padding:6px 2px;cursor:pointer;list-style:none;user-select:none;outline:none}
+ .tool-item summary::-webkit-details-marker{display:none}
+ .tool-chevron{font-size:8px;color:var(--muted);transition:transform .15s;flex-shrink:0}
+ .tool-item[open] .tool-chevron{transform:rotate(90deg)}
+ .tool-seq{font-family:'IBM Plex Mono',monospace;font-size:10px;color:var(--muted);flex-shrink:0;width:22px;text-align:right}
+ .tool-kind{font-size:10px;padding:2px 6px;border-radius:4px;font-weight:600;white-space:nowrap;flex-shrink:0}
+ .tool-agent-name{font-size:12px;font-weight:500;white-space:nowrap;flex-shrink:0}
+ .tool-preview{font-size:12px;color:var(--muted);flex:1;overflow:hidden;text-overflow:ellipsis;white-space:nowrap}
+ .tool-lat{font-family:'IBM Plex Mono',monospace;font-size:10px;color:var(--muted);white-space:nowrap;flex-shrink:0}
+ .tool-body{border-top:1px solid rgba(255,255,255,.04);padding:6px 4px 8px 0}
+ .tool-blocks{display:flex;flex-wrap:wrap;gap:5px;padding:4px 4px 2px 36px}
+ .tool-block{background:var(--bg3);border:1px solid var(--border);border-radius:6px;padding:5px 9px;min-width:90px}
+ .tool-block.full{flex-basis:100%;min-width:0}
+ .tb-label{font-size:9px;font-weight:600;letter-spacing:.08em;text-transform:uppercase;color:var(--muted);display:block;margin-bottom:2px}
+ .tb-val{font-size:12px;color:var(--text);line-height:1.45;word-break:break-word}
+ .tb-val b{font-weight:600}
+
+ /* ── LLM conversation thread ── */
+ .llm-thread{display:flex;flex-direction:column;gap:3px;padding:4px 4px 2px 36px}
+ .llm-turn{border-radius:5px;padding:5px 8px;border:1px solid rgba(255,255,255,.05)}
+ .llm-turn.system{background:rgba(255,255,255,.02)}
+ .llm-turn.user{background:rgba(96,165,250,.05);border-color:rgba(96,165,250,.12)}
+ .llm-turn.assistant{background:rgba(139,124,248,.05);border-color:rgba(139,124,248,.12)}
+ .llm-turn.tool{background:rgba(45,212,176,.04);border-color:rgba(45,212,176,.10)}
+ .llm-role{font-size:9px;font-weight:600;letter-spacing:.08em;text-transform:uppercase;margin-bottom:2px}
+ .llm-turn.system .llm-role{color:var(--muted)}
+ .llm-turn.user .llm-role{color:var(--blue)}
+ .llm-turn.assistant .llm-role{color:var(--purple)}
+ .llm-turn.tool .llm-role{color:var(--teal)}
+ .llm-content{font-size:11px;line-height:1.5;color:var(--text);word-break:break-word;white-space:pre-wrap}
+ .llm-response{border-top:1px solid rgba(255,255,255,.06);padding:6px 8px 6px 36px}
+ .llm-section-label{font-size:9px;font-weight:600;letter-spacing:.08em;text-transform:uppercase;margin-bottom:4px}
+ .llm-response .llm-section-label{color:var(--purple)}
+ .llm-thinking .llm-section-label{color:var(--amber)}
+ .llm-response-text,.llm-thinking-text{font-size:11px;line-height:1.5;color:var(--text);white-space:pre-wrap;word-break:break-word}
+ .llm-thinking{border-top:1px solid rgba(255,255,255,.06);padding:6px 8px 6px 36px}
+ .llm-thinking-text{color:var(--muted);font-style:italic}
+
+ /* ── Memory tab ── */
+ .mem-card{border:1px solid var(--border);border-radius:8px;padding:8px 10px;margin-bottom:6px;transition:border-color .3s,background .3s}
+ @keyframes fw{0%{border-color:var(--teal);background:rgba(45,212,176,.07)}100%{border-color:var(--border);background:transparent}}
+ @keyframes fr{0%{border-color:var(--blue);background:rgba(96,165,250,.07)}100%{border-color:var(--border);background:transparent}}
+ .mem-card.fw{animation:fw .9s ease-out forwards}.mem-card.fr{animation:fr .6s ease-out forwards}
+ .mem-key{font-family:'IBM Plex Mono',monospace;font-size:10px;font-weight:500;color:var(--muted);margin-bottom:3px;text-transform:uppercase;letter-spacing:.04em}
+ .mem-val{font-size:12px;line-height:1.5;color:var(--text)}
+ .mem-val b{font-weight:600}
+
+ /* ── Plan tab ── */
+ .plan-row{display:flex;align-items:center;gap:8px;padding:6px 0;border-bottom:1px solid rgba(255,255,255,.04)}
+ .plan-row:last-child{border-bottom:none}
+ .plan-dot{width:6px;height:6px;border-radius:50%;flex-shrink:0}
+ .plan-agent{font-size:11px;font-weight:600;width:80px;flex-shrink:0;overflow:hidden;text-overflow:ellipsis;white-space:nowrap}
+ .plan-task{font-size:12px;color:var(--muted);flex:1;overflow:hidden;text-overflow:ellipsis;white-space:nowrap}
+ .plan-done{font-size:10px;padding:1px 6px;border-radius:4px;background:rgba(74,222,128,.1);color:var(--green);white-space:nowrap}
+
+ /* ── Metrics bar ── */
+ .metrics{display:flex;border-top:1px solid var(--border);flex-shrink:0;background:var(--bg2)}
+ .metric{flex:1;padding:8px 14px;border-right:1px solid var(--border)}.metric:last-child{border-right:none}
+ .metric-label{font-size:10px;font-weight:600;letter-spacing:.07em;text-transform:uppercase;color:var(--muted);margin-bottom:2px}
+ .metric-val{font-family:'IBM Plex Mono',monospace;font-size:16px;font-weight:500}
+
+ @keyframes spin{to{transform:rotate(360deg)}}
+ .spinner{width:12px;height:12px;border:1.5px solid var(--border2);border-top-color:var(--purple);border-radius:50%;animation:spin .6s linear infinite;display:none;flex-shrink:0}
+ .spinner.on{display:inline-block}
+ </style>
+ </head>
+ <body>
+ <div class="shell">
+
+   <!-- top bar -->
+   <div class="bar">
+     <div class="logo">agent.visibility</div>
+     <div class="goal" id="goal-text">waiting for agents…</div>
+     <span class="spinner" id="spinner"></span>
+     <span class="badge" id="run-status">idle</span>
+     <span class="badge dead" id="conn-badge">connecting…</span>
+     <button class="btn-reset" onclick="doReset()">Reset</button>
+   </div>
+
+   <div class="grid">
+
+     <!-- sidebar -->
+     <div class="sidebar">
+       <div class="section">
+         <div class="section-label">Demo scenarios</div>
+         <button class="demo-btn" onclick="emulate('research_code')">
+           <strong>Research + code</strong>
+           <span>4 agents · clean run</span>
+         </button>
+         <button class="demo-btn" onclick="emulate('critic_retry')">
+           <strong>Critic retry loop</strong>
+           <span>3 agents · fail → retry → pass</span>
+         </button>
+         <button class="demo-btn" onclick="emulate('memory_overflow')">
+           <strong>Memory overflow</strong>
+           <span>4 agents · context truncation</span>
+         </button>
+       </div>
+       <div class="section" style="padding-bottom:6px">
+         <div class="section-label">Agents</div>
+       </div>
+       <div class="agents-scroll" id="agents-list">
+         <div class="empty">Agents appear after registration</div>
+       </div>
+     </div>
+
+     <!-- main panel -->
+     <div class="main">
+       <div class="canvas-wrap"><canvas id="fc" height="260"></canvas></div>
+       <div class="tabs">
+         <div class="tab active" onclick="switchTab(event,'log')">Log</div>
+         <div class="tab" onclick="switchTab(event,'tools')">Tools</div>
+         <div class="tab" onclick="switchTab(event,'mem')">Memory</div>
+         <div class="tab" onclick="switchTab(event,'plan')">Plan</div>
+       </div>
+       <div class="tab-panel active" id="tp-log"><div class="empty">Events stream here during a run</div></div>
+       <div class="tab-panel" id="tp-tools"><div class="empty">Embeddings, retrievals, tool calls and LLM generations appear here</div></div>
+       <div class="tab-panel" id="tp-mem"><div id="mem-grid"></div></div>
+       <div class="tab-panel" id="tp-plan"><div id="plan-list"><div class="empty">Plan appears after orchestrator runs</div></div></div>
+       <div class="metrics">
+         <div class="metric"><div class="metric-label">Steps</div><div class="metric-val" id="m-steps">0</div></div>
+         <div class="metric"><div class="metric-label">Tokens</div><div class="metric-val" id="m-tokens">0</div></div>
+         <div class="metric"><div class="metric-label">Elapsed</div><div class="metric-val" id="m-elapsed">—</div></div>
+         <div class="metric"><div class="metric-label">Retries</div><div class="metric-val" id="m-retries">0</div></div>
+       </div>
+     </div>
+   </div>
+ </div>
+
+ <script>
+ const SERVER = location.origin;
220
+
221
+ const ROLE_COLORS = {
222
+ orchestrator:'#8b7cf8', researcher:'#2dd4b0', coder:'#60a5fa',
223
+ critic:'#f59e0b', synthesiser:'#60a5fa', worker:'#2dd4b0',
224
+ };
225
+
226
+ let S = { registry:{}, agents:{}, memory:{}, events:[], plan:[], internals:[], metrics:{steps:0,tokens:0,retries:0}, status:'idle', goal:'', startedAt:null, lastArrow:null };
227
+ let es = null, elapsedTimer = null, toolSeq = 0, selectedToolItem = null, flowPos = {}, expandedAgent = null;
228
+
229
+ // ── Canvas ─────────────────────────────────────────────────────────────────────
230
+ const fc = document.getElementById('fc');
231
+ const ctx = fc.getContext('2d');
232
+ function initCanvas(){ fc.width = fc.parentElement.clientWidth; drawFlow(); }
233
+ window.addEventListener('resize', initCanvas);
234
+
235
+ // click agent node → expand / collapse sub-nodes
236
+ fc.addEventListener('click', e => {
237
+ if (!Object.keys(S.registry).length) return;
238
+ const rect = fc.getBoundingClientRect();
239
+ const scaleX = fc.width / fc.clientWidth, scaleY = fc.height / fc.clientHeight;
240
+ const mx = (e.clientX - rect.left) * scaleX, my = (e.clientY - rect.top) * scaleY;
241
+ let hit = null;
242
+ Object.keys(S.registry).forEach(id => {
243
+ const p = flowPos[id]; if (!p) return;
244
+ if (mx >= p.x-46 && mx <= p.x+46 && my >= p.y-17 && my <= p.y+17) hit = id;
245
+ });
246
+ if (hit !== null) { expandedAgent = (expandedAgent === hit) ? null : hit; drawFlow(); }
247
+ });
248
+
249
+ fc.addEventListener('mousemove', e => {
250
+ if (!Object.keys(S.registry).length) { fc.style.cursor='default'; return; }
251
+ const rect = fc.getBoundingClientRect();
252
+ const scaleX = fc.width / fc.clientWidth, scaleY = fc.height / fc.clientHeight;
253
+ const mx = (e.clientX - rect.left) * scaleX, my = (e.clientY - rect.top) * scaleY;
254
+ let hit = false;
255
+ Object.keys(S.registry).forEach(id => {
256
+ const p = flowPos[id]; if (!p) return;
257
+ if (mx >= p.x-46 && mx <= p.x+46 && my >= p.y-17 && my <= p.y+17) hit = true;
258
+ });
259
+ fc.style.cursor = hit ? 'pointer' : 'default';
260
+ });
261
+
262
+ function hexA(h, a){
263
+ const r=parseInt(h.slice(1,3),16), g=parseInt(h.slice(3,5),16), b=parseInt(h.slice(5,7),16);
264
+ return `rgba(${r},${g},${b},${a})`;
265
+ }
266
+
267
+ function layout(reg){
268
+ const ids = Object.keys(reg);
269
+ if (!ids.length) return {};
270
+ const tierOf = {};
271
+ function depth(id){
272
+ if (tierOf[id] !== undefined) return tierOf[id];
273
+ const parent = reg[id]?.reports_to; tierOf[id] = 0; // provisional tier so a reports_to cycle cannot recurse forever
274
+ tierOf[id] = (parent && reg[parent]) ? depth(parent)+1 : 0;
275
+ return tierOf[id];
276
+ }
277
+ ids.forEach(id => depth(id));
278
+ const maxTier = Math.max(...Object.values(tierOf));
279
+ const rows = Array.from({length: maxTier+1}, () => []);
280
+ ids.forEach(id => rows[tierOf[id]].push(id));
281
+ const filled = rows.filter(r => r.length > 0);
282
+ const W = fc.width || 600, H = fc.height;
283
+ const rowY = filled.length === 1
284
+ ? [H*.45]
285
+ : filled.map((_,i) => H*(.14 + i*(.72/Math.max(filled.length-1,1))));
286
+ const pos = {};
287
+ filled.forEach((row, ri) => {
288
+ const step = W/(row.length+1);
289
+ row.forEach((id, ci) => { pos[id] = { x: step*(ci+1), y: rowY[ri], color: reg[id].color || '#6b7280' }; });
290
+ });
291
+ return pos;
292
+ }
293
+
294
+ function drawArrow(x1,y1,x2,y2,color,label,dashed){
295
+ const dx=x2-x1, dy=y2-y1, len=Math.sqrt(dx*dx+dy*dy);
296
+ if (len < 5) return;
297
+ const u={x:dx/len,y:dy/len}, pad=18;
298
+ const s={x:x1+u.x*pad,y:y1+u.y*pad}, e={x:x2-u.x*pad,y:y2-u.y*pad};
299
+ const cx=(s.x+e.x)/2-u.y*24, cy=(s.y+e.y)/2+u.x*24;
300
+ ctx.beginPath(); ctx.moveTo(s.x,s.y); ctx.quadraticCurveTo(cx,cy,e.x,e.y);
301
+ ctx.strokeStyle=color; ctx.lineWidth=1.5; ctx.setLineDash(dashed?[4,3]:[]); ctx.stroke(); ctx.setLineDash([]);
302
+ const ang=Math.atan2(e.y-cy,e.x-cx);
303
+ ctx.beginPath(); ctx.moveTo(e.x,e.y);
304
+ ctx.lineTo(e.x-7*Math.cos(ang-.4),e.y-7*Math.sin(ang-.4));
305
+ ctx.lineTo(e.x-7*Math.cos(ang+.4),e.y-7*Math.sin(ang+.4));
306
+ ctx.closePath(); ctx.fillStyle=color; ctx.fill();
307
+ if (label){
308
+ ctx.fillStyle=color; ctx.font='9px "IBM Plex Mono",monospace';
309
+ ctx.textAlign='center'; ctx.textBaseline='middle';
310
+ ctx.fillText(label.length>16?label.slice(0,16)+'…':label, cx, cy-10);
311
+ }
312
+ }
313
+
314
+ function drawFlow(){
315
+ if (!fc.width) return;
316
+ const targetH = (expandedAgent && Object.keys(S.registry).length) ? 390 : 260;
317
+ if (fc.height !== targetH) fc.height = targetH;
318
+ ctx.clearRect(0,0,fc.width,fc.height);
319
+ const reg = S.registry;
320
+ if (!Object.keys(reg).length){
321
+ ctx.fillStyle='#2a2d35'; ctx.font='12px "Inter",sans-serif';
322
+ ctx.textAlign='center'; ctx.textBaseline='middle';
323
+ ctx.fillText('Agent topology renders here after a scenario runs', fc.width/2, fc.height/2);
324
+ return;
325
+ }
326
+ const pos = layout(reg);
327
+ flowPos = pos;
328
+ // hierarchy lines
329
+ Object.values(reg).forEach(agent => {
330
+ if (!agent.reports_to || !pos[agent.id] || !pos[agent.reports_to]) return;
331
+ const fp=pos[agent.reports_to], tp=pos[agent.id];
332
+ ctx.beginPath(); ctx.moveTo(fp.x,fp.y+16); ctx.lineTo(tp.x,tp.y-16);
333
+ ctx.strokeStyle='rgba(255,255,255,.08)'; ctx.lineWidth=1; ctx.setLineDash([4,3]); ctx.stroke(); ctx.setLineDash([]);
334
+ });
335
+ // last arrow
336
+ if (S.lastArrow && pos[S.lastArrow.from] && pos[S.lastArrow.to]){
337
+ const a=S.lastArrow, fp=pos[a.from], tp=pos[a.to];
338
+ const col = a.arrow_type==='retry'?'#f59e0b' : a.arrow_type==='result'?'#4ade80' : (reg[a.from]?.color||'#888');
339
+ drawArrow(fp.x,fp.y,tp.x,tp.y,col,a.label,a.arrow_type==='retry');
340
+ }
341
+ // nodes
342
+ Object.keys(reg).forEach(id => {
343
+ const p=pos[id], ag=S.agents[id], r=reg[id];
344
+ if (!p) return;
345
+ const st=ag?.status||'idle', nw=88, nh=30, nx=p.x-nw/2, ny=p.y-nh/2, active=st!=='idle';
346
+ if (st==='running'){ ctx.shadowColor=p.color; ctx.shadowBlur=12; }
347
+ ctx.beginPath(); ctx.roundRect(nx,ny,nw,nh,7);
348
+ ctx.fillStyle=active?hexA(p.color,.12):'#1e2028'; ctx.fill();
349
+ ctx.strokeStyle=active?p.color:'rgba(255,255,255,.1)'; ctx.lineWidth=active?1.5:.7; ctx.stroke();
350
+ ctx.shadowBlur=0;
351
+ ctx.fillStyle=active?p.color:'#5a6070';
352
+ ctx.font=(active?'500 ':'')+'11px "Inter",sans-serif';
353
+ ctx.textAlign='center'; ctx.textBaseline='middle'; ctx.fillText(r.label,p.x,p.y-4);
354
+ ctx.fillStyle=hexA(p.color,active?.6:.35); ctx.font='9px "IBM Plex Mono",monospace';
355
+ ctx.fillText(r.role,p.x,p.y+8);
356
+ });
357
+ // ── Expanded agent sub-nodes ─────────────────────────────────────────────────
358
+ if (expandedAgent && pos[expandedAgent]) {
359
+ const parent = pos[expandedAgent];
360
+ // show ▼ on node
361
+ ctx.fillStyle = hexA(parent.color, .8);
362
+ ctx.font = 'bold 8px "Inter",sans-serif';
363
+ ctx.textAlign = 'center'; ctx.textBaseline = 'top';
364
+ ctx.fillText('▼', parent.x, parent.y + 17);
365
+ // gather ops by kind
366
+ const byKind = {};
367
+ S.internals.filter(it => it.agent === expandedAgent).forEach(it => {
368
+ (byKind[it.kind] = byKind[it.kind] || []).push(it);
369
+ });
370
+ const kinds = Object.keys(byKind);
371
+ if (kinds.length) {
372
+ const subW = 108, subH = 38, gap = 10;
373
+ const subY = parent.y + 76;
374
+ const totalW = kinds.length * (subW + gap) - gap;
375
+ let startX = parent.x - totalW / 2;
376
+ // clamp to canvas
377
+ if (startX < 8) startX = 8;
378
+ if (startX + totalW > fc.width - 8) startX = fc.width - 8 - totalW;
379
+ kinds.forEach((kind, i) => {
380
+ const items = byKind[kind];
381
+ const [bg, col, kindLabel] = (KIND[kind]||'rgba(107,114,128,.15)|#6b7280|?').split('|');
382
+ const sx = startX + i * (subW + gap) + subW / 2;
383
+ const sy = subY;
384
+ flowPos[expandedAgent + ':' + kind] = { x: sx, y: sy, color: col };
385
+ // connector
386
+ ctx.beginPath(); ctx.moveTo(parent.x, parent.y + 16); ctx.lineTo(sx, sy - subH/2 - 1);
387
+ ctx.strokeStyle = hexA(col, .2); ctx.lineWidth = 1; ctx.setLineDash([3, 3]); ctx.stroke(); ctx.setLineDash([]);
388
+ // box
389
+ ctx.beginPath(); ctx.roundRect(sx - subW/2, sy - subH/2, subW, subH, 7);
390
+ ctx.fillStyle = bg; ctx.fill();
391
+ ctx.strokeStyle = col; ctx.lineWidth = 1; ctx.stroke();
392
+ // kind label
393
+ ctx.fillStyle = col; ctx.font = '600 9px "IBM Plex Mono",monospace';
394
+ ctx.textAlign = 'center'; ctx.textBaseline = 'middle';
395
+ ctx.fillText(kindLabel.toUpperCase(), sx, sy - 9);
396
+ // detail
397
+ let detail = '×' + items.length;
398
+ if (kind === 'generation') {
399
+ const tok = items.reduce((s, it) => s + (it.prompt_tokens||0) + (it.completion_tokens||0), 0);
400
+ const model = trunc(items[0]?.model||'', 10);
401
+ detail = items.length + '× · ' + (tok > 999 ? (tok/1000).toFixed(1)+'k' : tok) + ' tok';
402
+ ctx.fillStyle = hexA(col, .55); ctx.font = '8px "IBM Plex Mono",monospace';
403
+ ctx.fillText(trunc(model, 14), sx, sy + 3);
404
+ } else if (kind === 'tool_call') {
405
+ const tools = [...new Set(items.map(it => it.tool_name))].slice(0, 2);
406
+ ctx.fillStyle = hexA(col, .55); ctx.font = '8px "IBM Plex Mono",monospace';
407
+ ctx.fillText(trunc(tools.join(', '), 16), sx, sy + 3);
408
+ } else if (kind === 'embedding') {
409
+ const model = items[0]?.model || '—';
410
+ ctx.fillStyle = hexA(col, .55); ctx.font = '8px "IBM Plex Mono",monospace';
411
+ ctx.fillText(trunc(model, 14), sx, sy + 3);
412
+ } else if (kind === 'retrieval') {
413
+ ctx.fillStyle = hexA(col, .55); ctx.font = '8px "IBM Plex Mono",monospace';
414
+ ctx.fillText((items[0]?.results?.length || 0) + ' results ea.', sx, sy + 3);
415
+ }
416
+ ctx.fillStyle = hexA(col, .85); ctx.font = '500 9px "Inter",sans-serif';
417
+ ctx.textAlign = 'center'; ctx.textBaseline = 'middle';
418
+ ctx.fillText(detail, sx, sy + 12);
419
+ });
420
+ } else {
421
+ // agent has no internals yet
422
+ ctx.fillStyle = hexA(parent.color, .35); ctx.font = '10px "Inter",sans-serif';
423
+ ctx.textAlign = 'center'; ctx.textBaseline = 'middle';
424
+ ctx.fillText('no operations recorded', parent.x, parent.y + 65);
425
+ }
426
+ }
427
+
428
+ // selection ring from Tools tab
429
+ if (selectedToolItem && pos[selectedToolItem.agent]){
430
+ const p=pos[selectedToolItem.agent];
431
+ const col=(KIND[selectedToolItem.kind]||'|||').split('|')[1]||'#888';
432
+ const [,,kindLabel]=(KIND[selectedToolItem.kind]||'||?').split('|');
433
+ ctx.shadowColor=col; ctx.shadowBlur=22;
434
+ ctx.beginPath(); ctx.roundRect(p.x-48,p.y-18,96,36,9);
435
+ ctx.strokeStyle=col; ctx.lineWidth=2; ctx.stroke(); ctx.shadowBlur=0;
436
+ ctx.fillStyle=col; ctx.font='bold 9px "IBM Plex Mono",monospace';
437
+ ctx.textAlign='center'; ctx.textBaseline='bottom';
438
+ ctx.fillText('▶ '+kindLabel, p.x, p.y-21);
439
+ }
440
+ }
441
+
442
+ // ── Tool overlay on canvas ─────────────────────────────────────────────────────
443
+ function updateToolOverlay(item, show, seq){
444
+ let ov = document.getElementById('tool-overlay');
445
+ if (!show){
446
+ if (ov) ov.style.display='none';
447
+ return;
448
+ }
449
+ const p = flowPos[item?.agent];
450
+ if (!p){ if (ov) ov.style.display='none'; return; }
451
+
452
+ const [bg,col,kindLabel]=(KIND[item.kind]||'rgba(107,114,128,.15)|#6b7280|?').split('|');
453
+ const agentLabel=(S.registry[item.agent]?.label)||item.agent;
454
+ const agentColor=S.registry[item.agent]?.color||'#6b7280';
455
+
456
+ if (!ov){
457
+ ov=document.createElement('div'); ov.id='tool-overlay'; ov.className='tool-overlay';
458
+ fc.parentElement.appendChild(ov);
459
+ }
460
+
461
+ ov.innerHTML=`<div class="tool-overlay-hdr">
462
+ <span class="tool-seq">#${seq}</span>
463
+ <span class="tool-kind" style="background:${bg};color:${col}">${kindLabel}</span>
464
+ <span class="tool-agent-name" style="color:${agentColor}">${agentLabel}</span>
465
+ </div>
466
+ ${toolBody(item)}`;
467
+ ov.style.display='block';
468
+
469
+ // Map canvas coords → CSS px (canvas may be scaled via CSS width:100%)
470
+ const scaleX = fc.clientWidth / fc.width;
471
+ const scaleY = fc.clientHeight / fc.height;
472
+ const cx = p.x * scaleX, cy = p.y * scaleY;
473
+ const ovW = 260, ovH = ov.offsetHeight || 160;
474
+ // prefer right side, fall back to left
475
+ let left = cx + 54;
476
+ if (left + ovW > fc.clientWidth - 4) left = cx - 54 - ovW;
477
+ left = Math.max(4, left);
478
+ let top = cy - ovH / 2;
479
+ top = Math.max(4, Math.min(fc.clientHeight - ovH - 4, top));
480
+ ov.style.left = left+'px';
481
+ ov.style.top = top+'px';
482
+ }
483
+
484
+ // ── Agents list ────────────────────────────────────────────────────────────────
485
+ function renderAgents(){
486
+ const el = document.getElementById('agents-list');
487
+ const reg = S.registry, agents = S.agents;
488
+ const ids = Object.keys(reg);
489
+ if (!ids.length){ el.innerHTML = '<div class="empty">Agents appear after registration</div>'; return; }
490
+ el.innerHTML = '';
491
+ ids.forEach(id => {
492
+ const r=reg[id], a=agents[id]||{}, col=r.color||ROLE_COLORS[r.role]||'#6b7280';
493
+ const st=a.status||'idle', display=st==='idle'&&reg[id]?'registered':st;
494
+ const row=document.createElement('div'); row.className='agent-row '+display;
495
+ row.innerHTML = `<div class="agent-dot"></div>
496
+ <span class="agent-name" style="color:${st==='idle'?'var(--text)':col}">${r.label}</span>
497
+ <span class="agent-role" style="background:${hexA(col,.14)};color:${col}">${r.role}</span>`;
498
+ el.appendChild(row);
499
+ });
500
+ }
501
+
502
+ // ── Memory tab ────────────────────────────────────────────────────────────────
503
+ function memSentence(k, m){
504
+ if (!m) return `Nothing has been stored under <b>${k}</b> yet.`;
505
+ const v = String(m.value);
506
+ const short = v.slice(0,90)+(v.length>90?'…':'');
507
+ return `<b>${k}</b> was ${m.op==='read'?'read as':'set to'}: "${short}"`;
508
+ }
509
+
510
+ function renderMem(){
511
+ const g=document.getElementById('mem-grid');
512
+ const keys=Object.keys(S.memory);
513
+ if (!keys.length){ g.innerHTML='<div class="empty">No memory entries yet</div>'; return; }
514
+ g.innerHTML='';
515
+ keys.forEach(k => {
516
+ const m=S.memory[k], card=document.createElement('div'); card.className='mem-card'; card.id='mc-'+k;
517
+ card.innerHTML=`<div class="mem-val set">${memSentence(k,m)}</div>`;
518
+ g.appendChild(card);
519
+ });
520
+ }
521
+
522
+ function flashMem(key,op){
523
+ const c=document.getElementById('mc-'+key);
524
+ if (!c){ renderMem(); return; }
525
+ c.classList.remove('fw','fr'); void c.offsetWidth; c.classList.add(op==='write'?'fw':'fr');
526
+ }
527
+
528
+ // ── Log tab ───────────────────────────────────────────────────────────────────
529
+ const TAG = {
530
+ start:'rgba(139,124,248,.15)|#8b7cf8', plan:'rgba(139,124,248,.15)|#8b7cf8', route:'rgba(139,124,248,.15)|#8b7cf8',
531
+ registered:'rgba(139,124,248,.12)|#7f77dd', reply:'rgba(74,222,128,.12)|#4ade80', pass:'rgba(74,222,128,.12)|#4ade80',
532
+ done:'rgba(74,222,128,.12)|#4ade80', fail:'rgba(248,113,113,.12)|#f87171', error:'rgba(248,113,113,.12)|#f87171',
533
+ retry:'rgba(245,158,11,.12)|#f59e0b', warn:'rgba(245,158,11,.12)|#f59e0b',
534
+ };
535
+
536
+ const LOG_VERB = {
537
+ start:'started', plan:'planned', route:'routed a task', registered:'joined',
538
+ reply:'replied', pass:'passed', done:'finished', fail:'failed',
539
+ error:'hit an error', retry:'is retrying', warn:'warned', tool:'called a tool', result:'got a result',
540
+ };
541
+
542
+ function logSentence(ev){
543
+ const label = (S.registry[ev.agent]?.label) || ev.agent;
544
+ const verb = LOG_VERB[ev.event_type] || ev.event_type;
545
+ return `<b>${label}</b> ${verb} — ${ev.message}`;
546
+ }
547
+
548
+ function addLog(ev, prepend=true){
549
+ const log=document.getElementById('tp-log');
550
+ const empty=log.querySelector('.empty'); if (empty) empty.remove();
551
+ const [bg,col]=(TAG[ev.event_type]||'rgba(255,255,255,.06)|#9ca3af').split('|');
552
+ const d=new Date(ev.ts||Date.now()), ts=`${String(d.getMinutes()).padStart(2,'0')}:${String(d.getSeconds()).padStart(2,'0')}`;
553
+ const row=document.createElement('div'); row.className='log-row';
554
+ row.innerHTML=`<span class="log-tag" style="background:${bg};color:${col}">${ev.event_type}</span><span class="log-msg">${logSentence(ev)}</span><span class="log-time">${ts}</span>`;
555
+ if (prepend) log.insertBefore(row,log.firstChild); else log.appendChild(row);
556
+ if (log.children.length>80) log.removeChild(log.lastChild);
557
+ }
558
+
559
+ // ── Tools tab ─────────────────────────────────────────────────────────────────
560
+ const KIND = {
561
+ embedding: 'rgba(139,124,248,.15)|#8b7cf8|embed',
562
+ retrieval: 'rgba(45,212,176,.15)|#2dd4b0|retrieve',
563
+ tool_call: 'rgba(96,165,250,.15)|#60a5fa|tool',
564
+ generation:'rgba(245,158,11,.15)|#f59e0b|generate',
565
+ };
566
+
567
+ const trunc = (s, n) => (s && s.length > n) ? s.slice(0, n) + '…' : (s||'');
568
+
569
+ function toolPreview(item){
570
+ switch(item.kind){
571
+ case 'embedding': return `"${trunc(item.text, 45)}" → ${item.dims}d (${item.model})`;
572
+ case 'retrieval': return `search: "${trunc(item.query, 45)}"`;
573
+ case 'tool_call': return item.error ? `✗ ${item.tool_name}: ${trunc(item.error, 50)}` : `${item.tool_name} — ${trunc(item.output, 55)}`;
574
+ case 'generation': return `${item.model||'model'} · ${((item.prompt_tokens||0)+(item.completion_tokens||0)).toLocaleString()} tokens`;
575
+ default: return trunc(JSON.stringify(item), 60);
576
+ }
577
+ }
578
+
579
+ function toolBody(item){
580
+ const agentLabel = (S.registry[item.agent]?.label) || item.agent;
581
+ const agentColor = S.registry[item.agent]?.color || '#6b7280';
582
+ const blk = (label, val, full=false) =>
583
+ `<div class="tool-block${full?' full':''}"><span class="tb-label">${label}</span><div class="tb-val">${val}</div></div>`;
584
+ const agent = `<span style="color:${agentColor};font-weight:600">${agentLabel}</span>`;
585
+ switch(item.kind){
586
+ case 'embedding':
587
+ return `<div class="tool-blocks">
588
+ ${blk('Agent', agent)}
589
+ ${blk('Model', item.model||'—')}
590
+ ${blk('Dimensions', item.dims ? item.dims+'d' : '—')}
591
+ ${blk('Input text', trunc(item.text, 300), true)}
592
+ </div>`;
593
+ case 'retrieval': {
594
+ const n=(item.results||[]).length, top=item.results?.[0];
595
+ return `<div class="tool-blocks">
596
+ ${blk('Agent', agent)}
597
+ ${blk('Results found', String(n))}
598
+ ${blk('Query', trunc(item.query, 300), true)}
599
+ ${top ? blk('Best match', `<span style="color:var(--teal)">score ${Number(top.score ?? 0).toFixed(2)}</span> — ${trunc(top.text, 200)}`, true) : ''}
600
+ </div>`;
601
+ }
602
+ case 'tool_call': {
603
+ const ok=!item.error;
604
+ const esc = s => String(s||'').replace(/&/g,'&amp;').replace(/</g,'&lt;').replace(/>/g,'&gt;');
605
+ return `<div class="tool-blocks">
606
+ ${blk('Agent', agent)}
607
+ ${blk('Tool', `<b>${item.tool_name}</b>`)}
608
+ ${blk('Status', ok ? '<span style="color:var(--green)">✓ success</span>' : '<span style="color:var(--coral)">✗ failed</span>')}
609
+ ${item.latency_ms ? blk('Latency', item.latency_ms+'ms') : ''}
610
+ </div>
611
+ ${item.input ? `<div class="llm-response" style="border-top:1px solid rgba(255,255,255,.06)"><div class="llm-section-label" style="color:var(--blue)">↑ input</div><div class="llm-response-text" style="color:var(--muted)">${esc(item.input)}</div></div>` : ''}
612
+ ${ok && item.output ? `<div class="llm-response"><div class="llm-section-label" style="color:var(--green)">↓ output</div><div class="llm-response-text">${esc(item.output)}</div></div>` : ''}
613
+ ${!ok && item.error ? `<div class="llm-response"><div class="llm-section-label" style="color:var(--coral)">✗ error</div><div class="llm-response-text" style="color:var(--coral)">${esc(item.error)}</div></div>` : ''}`;
614
+ }
615
+ case 'generation': {
616
+ const total = (item.prompt_tokens||0) + (item.completion_tokens||0);
617
+ const esc = s => String(s||'').replace(/&/g,'&amp;').replace(/</g,'&lt;').replace(/>/g,'&gt;');
618
+ const msgs = (item.messages||[]).map(m =>
619
+ `<div class="llm-turn ${m.role}"><div class="llm-role">${m.role}</div><div class="llm-content">${esc(m.content)}</div></div>`
620
+ ).join('');
621
+ return `<div class="tool-blocks">
622
+ ${blk('Agent', agent)}
623
+ ${blk('Model', item.model||'—')}
624
+ ${blk('Prompt tokens', (item.prompt_tokens||0).toLocaleString())}
625
+ ${blk('Completion tokens', (item.completion_tokens||0).toLocaleString())}
626
+ ${blk('Total tokens', total.toLocaleString())}
627
+ ${item.latency_ms ? blk('Latency', item.latency_ms+'ms') : ''}
628
+ ${item.stop_reason ? blk('Stop reason', item.stop_reason) : ''}
629
+ </div>
630
+ ${msgs ? `<div class="llm-thread">${msgs}</div>` : ''}
631
+ ${item.thinking ? `<div class="llm-thinking"><div class="llm-section-label">◎ thinking</div><div class="llm-thinking-text">${esc(item.thinking)}</div></div>` : ''}
632
+ ${item.response ? `<div class="llm-response"><div class="llm-section-label">↩ response</div><div class="llm-response-text">${esc(item.response)}</div></div>` : ''}`;
633
+ }
634
+ default:
635
+ return `<div class="tool-blocks">${blk('Raw', `<pre style="font-size:10px;white-space:pre-wrap">${JSON.stringify(item,null,2)}</pre>`, true)}</div>`;
636
+ }
637
+ }
638
+
639
+ function addInternal(item, prepend=true, seqOverride=null){
640
+ const panel=document.getElementById('tp-tools');
641
+ const empty=panel.querySelector('.empty'); if (empty) empty.remove();
642
+ const [bg,col,kindLabel]=(KIND[item.kind]||'rgba(107,114,128,.15)|#6b7280|?').split('|');
643
+ const agentLabel = (S.registry[item.agent]?.label) || item.agent;
644
+ const agentColor = S.registry[item.agent]?.color || '#6b7280';
645
+
646
+ if (seqOverride === null) toolSeq++;
647
+ const seq = seqOverride !== null ? seqOverride : toolSeq;
648
+
649
+ const det = document.createElement('details'); det.className='tool-item';
650
+ det.innerHTML=`<summary>
651
+ <span class="tool-chevron">▶</span>
652
+ <span class="tool-seq">#${seq}</span>
653
+ <span class="tool-kind" style="background:${bg};color:${col}">${kindLabel}</span>
654
+ <span class="tool-agent-name" style="color:${agentColor}">${agentLabel}</span>
655
+ <span class="tool-preview">${toolPreview(item)}</span>
656
+ ${item.latency_ms?`<span class="tool-lat">${item.latency_ms}ms</span>`:''}
657
+ </summary>
658
+ <div class="tool-body">${toolBody(item)}</div>`;
659
+
660
+ det.addEventListener('toggle', () => {
661
+ selectedToolItem = det.open ? item : null;
662
+ drawFlow();
663
+ updateToolOverlay(item, det.open, seq);
664
+ });
665
+
666
+ if (prepend) panel.insertBefore(det, panel.firstChild); else panel.appendChild(det);
667
+ if (panel.children.length>100) panel.removeChild(panel.lastChild);
668
+ }
669
+
670
+ // ── Plan tab ──────────────────────────────────────────────────────────────────
671
+ function renderPlan(){
672
+ const pl=document.getElementById('plan-list');
673
+ if (!S.plan.length) return;
674
+ pl.innerHTML='';
675
+ S.plan.forEach((t,i) => {
676
+ const r=S.registry[t.agent]||{}, col=r.color||'#6b7280';
677
+ const done=(S.agents[t.agent]?.status==='done')||S.metrics.steps>i+2;
678
+ const row=document.createElement('div'); row.className='plan-row';
679
+ row.innerHTML=`<div class="plan-dot" style="background:${done?'#4ade80':col}"></div><span class="plan-agent" style="color:${col}">${t.agent}</span><span class="plan-task">${t.task}</span>${done?'<span class="plan-done">done</span>':''}`;
680
+ pl.appendChild(row);
681
+ });
682
+ }
683
+
684
+ // ── Status / elapsed ─────────────────────────────────────────────────────────
685
+ function setStatus(s){
686
+ S.status=s;
687
+ const p=document.getElementById('run-status');
688
+ p.textContent=s; p.className='badge '+s;
689
+ document.getElementById('spinner').className='spinner'+(s==='running'?' on':'');
690
+ if (s!=='running'){ clearInterval(elapsedTimer); elapsedTimer=null; }
691
+ }
692
+ function startElapsed(){
693
+ if (elapsedTimer) clearInterval(elapsedTimer);
694
+ elapsedTimer=setInterval(() => {
695
+ if (!S.startedAt) return;
696
+ document.getElementById('m-elapsed').textContent=(Math.round((Date.now()-S.startedAt)/100)/10)+'s';
697
+ },200);
698
+ }
699
+
700
+ // ── Full state apply ──────────────────────────────────────────────────────────
701
+ function applyFull(st){
702
+ if (!st){
703
+ S={registry:{},agents:{},memory:{},events:[],plan:[],internals:[],metrics:{steps:0,tokens:0,retries:0},status:'idle',goal:'',startedAt:null,lastArrow:null};
704
+ toolSeq=0; selectedToolItem=null; flowPos={}; expandedAgent=null;
705
+ const _ov=document.getElementById('tool-overlay'); if(_ov) _ov.style.display='none';
706
+ document.getElementById('goal-text').textContent='waiting for agents…';
707
+ document.getElementById('tp-log').innerHTML='<div class="empty">Events stream here during a run</div>';
708
+ document.getElementById('tp-tools').innerHTML='<div class="empty">Embeddings, retrievals, tool calls and LLM generations appear here</div>';
709
+ document.getElementById('mem-grid').innerHTML='';
710
+ document.getElementById('plan-list').innerHTML='<div class="empty">Plan appears after orchestrator runs</div>';
711
+ ['m-steps','m-tokens','m-retries'].forEach(id=>document.getElementById(id).textContent='0');
712
+ document.getElementById('m-elapsed').textContent='—';
713
+ setStatus('idle'); renderAgents(); drawFlow();
714
+ return;
715
+ }
716
+ S.registry=st.registry||{};
717
+ S.agents=st.agents||{};
718
+ S.memory={};
719
+ Object.entries(st.memory||{}).forEach(([k,v])=>S.memory[k]=v);
720
+ S.events=st.events||[];
721
+ S.plan=st.plan||[];
722
+ S.internals=st.internals||[];
723
+ S.metrics=st.metrics||{steps:0,tokens:0,retries:0};
724
+ S.status=st.status||'idle';
725
+ S.goal=st.goal||'';
726
+ S.startedAt=st.startedAt||null;
727
+ S.lastArrow=(st.arrows||[])[0]||null;
728
+
729
+ document.getElementById('goal-text').textContent=S.goal||'waiting for agents…';
730
+ document.getElementById('m-steps').textContent=S.metrics.steps;
731
+ document.getElementById('m-tokens').textContent=S.metrics.tokens>999?(S.metrics.tokens/1000).toFixed(1)+'k':S.metrics.tokens;
732
+ document.getElementById('m-retries').textContent=S.metrics.retries;
733
+ setStatus(S.status);
734
+ renderAgents(); renderMem(); renderPlan(); drawFlow();
735
+
736
+ const logEl=document.getElementById('tp-log'); logEl.innerHTML='';
737
+ S.events.slice().reverse().forEach(ev=>addLog(ev,false));
738
+ if (!S.events.length) logEl.innerHTML='<div class="empty">Events stream here during a run</div>';
739
+
740
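+ toolSeq = S.internals.length; // continue numbering after the replayed items so live events do not restart at #1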
741
+ const toolsEl=document.getElementById('tp-tools'); toolsEl.innerHTML='';
742
+ if (S.internals.length) S.internals.forEach((it,i)=>addInternal(it,false,i+1));
743
+ else toolsEl.innerHTML='<div class="empty">Embeddings, retrievals, tool calls and LLM generations appear here</div>';
744
+
745
+ if (S.startedAt&&S.status==='running') startElapsed();
746
+ }
747
+
748
+ // ── SSE handler ───────────────────────────────────────────────────────────────
749
+ function handle(type,p){
750
+ switch(type){
751
+ case 'init': applyFull(p.state); break;
752
+ case 'reset': applyFull(null); break;
753
+ case 'registry': S.registry=p; renderAgents(); drawFlow(); break;
754
+ case 'agents': S.agents=p; renderAgents(); drawFlow(); break;
755
+ case 'goal': S.goal=p.goal; S.startedAt=Date.now(); document.getElementById('goal-text').textContent=p.goal; startElapsed(); break;
756
+ case 'status': setStatus(p); break;
757
+ case 'event': addLog(p); break;
758
+ case 'memory': S.memory[p.key]=p; renderMem(); flashMem(p.key,p.op); break;
759
+ case 'arrow': S.lastArrow=p; drawFlow(); break;
760
+ case 'plan': S.plan=p; renderPlan(); break;
761
+ case 'metrics':
762
+ S.metrics=p;
763
+ document.getElementById('m-steps').textContent=p.steps;
764
+ document.getElementById('m-tokens').textContent=p.tokens>999?(p.tokens/1000).toFixed(1)+'k':p.tokens;
765
+ document.getElementById('m-retries').textContent=p.retries;
766
+ break;
767
+ case 'internal': addInternal(p); break;
768
+ }
769
+ }
770
+
771
+ // ── Connection ────────────────────────────────────────────────────────────────
772
+ function setBadge(live){
773
+ const b=document.getElementById('conn-badge');
774
+ b.textContent=live?'live':'reconnecting…';
775
+ b.className='badge '+(live?'live':'dead');
776
+ }
777
+ function connect(){
778
+ if (es){ es.close(); es=null; }
779
+ es=new EventSource(SERVER+'/events');
780
+ es.onopen=()=>setBadge(true);
781
+ es.onerror=()=>{ setBadge(false); es.close(); es=null; setTimeout(connect,2000); };
782
+ es.onmessage=e=>{ setBadge(true); const msg=JSON.parse(e.data); handle(msg.type,msg.payload); };
783
+ }
784
+
785
+ // ── UI actions ────────────────────────────────────────────────────────────────
786
+ function switchTab(e,name){
787
+ document.querySelectorAll('.tab').forEach(t=>t.classList.remove('active'));
788
+ document.querySelectorAll('.tab-panel').forEach(p=>p.classList.remove('active'));
789
+ e.currentTarget.classList.add('active');
790
+ document.getElementById('tp-'+name).classList.add('active');
791
+ }
792
+ async function emulate(scenario){
793
+ try{
794
+ const r=await fetch(SERVER+'/emulate',{method:'POST',headers:{'Content-Type':'application/json'},body:JSON.stringify({scenario})});
795
+ if (!r.ok) alert('Server error: '+r.status);
796
+ }catch(_){alert('Cannot reach server at '+SERVER);}
797
+ }
798
+ async function doReset(){
799
+ try{ await fetch(SERVER+'/reset',{method:'POST'}); }catch(_){ applyFull(null); }
800
+ }
801
+
802
+ // ── Boot ──────────────────────────────────────────────────────────────────────
803
+ setTimeout(()=>{ initCanvas(); connect(); },80);
804
+ </script>
805
+ </body>
806
+ </html>
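Review note on `layout()` in src/dashboard.html: the tier assignment walks each agent's `reports_to` chain, so the row an agent lands in is its distance from a root. The sketch below isolates that step so it can be run outside the browser. It is an illustration under stated assumptions, not the code in this commit: `computeTiers` and `groupRows` are hypothetical names, the registry shape (`{ id: { reports_to } }`) is inferred from how `S.registry` is read, and the provisional `tierOf[id] = 0` is one possible way to keep a cyclic `reports_to` from recursing forever.

```javascript
// Hypothetical helper: map each agent id to a tier (0 = root row).
function computeTiers(registry) {
  const tierOf = {};
  function depth(id) {
    if (tierOf[id] !== undefined) return tierOf[id];
    tierOf[id] = 0; // provisional value; breaks reports_to cycles
    const parent = registry[id]?.reports_to;
    // A missing or unregistered parent leaves the agent on the root row.
    tierOf[id] = (parent && registry[parent]) ? depth(parent) + 1 : 0;
    return tierOf[id];
  }
  Object.keys(registry).forEach(id => depth(id));
  return tierOf;
}

// Hypothetical helper: group ids into rows by tier, dropping empty rows.
function groupRows(tierOf) {
  const rows = [];
  for (const [id, tier] of Object.entries(tierOf)) {
    (rows[tier] = rows[tier] || []).push(id);
  }
  return rows.filter(row => row && row.length > 0);
}

// Usage with an assumed three-level registry.
const demoRegistry = {
  orchestrator: {},
  researcher: { reports_to: 'orchestrator' },
  coder: { reports_to: 'orchestrator' },
  critic: { reports_to: 'coder' },
};
console.log(groupRows(computeTiers(demoRegistry)));
```

Keeping the tier computation separate from the canvas maths (row Y positions, horizontal step) would also make it straightforward to unit-test the hierarchy logic without a DOM.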
src/server.js ADDED
@@ -0,0 +1,615 @@
1
+ #!/usr/bin/env node
2
+ 'use strict';
3
+ const http = require('http');
4
+ const fs = require('fs');
5
+ const path = require('path');
6
+
7
+ const PORT = parseInt(process.env.VISIBILITY_PORT || '4242', 10);
8
+
9
+ // ── State ─────────────────────────────────────────────────────────────────────
10
+ let state = fresh();
11
+ function fresh() {
12
+ return {
13
+ agents: {}, registry: {}, memory: {}, events: [],
14
+ arrows: [], plan: [], internals: [],
15
+ metrics: { steps: 0, tokens: 0, retries: 0 },
16
+ goal: '', runId: null, status: 'idle', startedAt: null,
17
+ clients: [],
18
+ };
19
+ }
20
+
21
+ // ── SSE broadcast ─────────────────────────────────────────────────────────────
22
+ function broadcast(type, payload) {
23
+ const msg = `data: ${JSON.stringify({ type, payload, ts: Date.now() })}\n\n`;
24
+ state.clients.forEach(r => { try { r.write(msg); } catch (_) {} });
25
+ }
26
+
27
+ // ── Role colours ──────────────────────────────────────────────────────────────
28
+ const COLORS = {
29
+ orchestrator: '#8b7cf8', researcher: '#2dd4b0', coder: '#60a5fa',
30
+ critic: '#f59e0b', synthesiser: '#60a5fa', worker: '#2dd4b0',
31
+ };
32
+
33
+ // ── Helpers ───────────────────────────────────────────────────────────────────
34
+ function ensureAgent(id) {
35
+ if (!state.agents[id]) {
36
+ const r = state.registry[id] || {};
37
+ state.agents[id] = {
38
+ id, label: r.label || id, role: r.role || 'worker', model: r.model || '',
39
+ reports_to: r.reports_to || null, token_budget: r.token_budget || 8192,
40
+ color: r.color || COLORS[r.role] || '#6b7280', status: 'idle', tokens: 0, calls: 0,
41
+ };
42
+ }
43
+ }
44
+ function safeAgents() {
45
+ const out = {};
46
+ for (const [k, v] of Object.entries(state.agents)) {
47
+ out[k] = { id: v.id, label: v.label, role: v.role, model: v.model,
48
+ reports_to: v.reports_to, token_budget: v.token_budget, color: v.color,
49
+ status: v.status, tokens: v.tokens, calls: v.calls };
50
+ }
51
+ return out;
52
+ }
53
+ function snapshot() {
54
+ return {
55
+ registry: state.registry, runId: state.runId, goal: state.goal,
56
+ status: state.status, startedAt: state.startedAt, agents: safeAgents(),
57
+ memory: state.memory, events: state.events.slice(0, 80),
58
+ arrows: state.arrows.slice(0, 20), plan: state.plan, metrics: state.metrics,
59
+ internals: state.internals.slice(0, 60),
60
+ scenarios: Object.keys(SCENARIOS),
61
+ };
62
+ }
63
+
64
+ // ── Tools ─────────────────────────────────────────────────────────────────────
65
+ const TOOLS = {
66
+ register_agent({ id, label, role = 'worker', model = '', reports_to = null, token_budget = 8192, color = null }) {
67
+ const c = color || COLORS[role] || '#6b7280';
68
+ state.registry[id] = { id, label, role, model, reports_to, token_budget, color: c };
69
+ state.agents[id] = { ...state.registry[id], status: 'idle', tokens: 0, calls: 0 };
70
+ broadcast('registry', state.registry);
71
+ broadcast('agents', safeAgents());
72
+ broadcast('event', { agent: id, event_type: 'registered',
73
+ message: `${label} registered — role:${role}, model:${model || 'unset'}`,
74
+ tokens: 0, latency_ms: 0, ts: Date.now() });
75
+ return { ok: true };
76
+ },
77
+ log_event({ agent, event_type, message, tokens = 0, latency_ms = 0, metadata = {} }) {
78
+ ensureAgent(agent);
79
+ const item = { agent, event_type, message, tokens, latency_ms, metadata, ts: Date.now() };
80
+ state.events.unshift(item);
81
+ if (state.events.length > 200) state.events.pop();
82
+ if (tokens) {
83
+ state.agents[agent].tokens += tokens;
84
+ state.agents[agent].calls += 1;
85
+ state.metrics.tokens += tokens;
86
+ }
87
+ state.metrics.steps++;
88
+ broadcast('event', item);
89
+ broadcast('metrics', state.metrics);
90
+ broadcast('agents', safeAgents());
91
+ return { ok: true };
92
+ },
93
+ set_memory({ key, value, op = 'write' }) {
94
+ state.memory[key] = { value, op, ts: Date.now() };
95
+ broadcast('memory', { key, value, op, ts: Date.now() });
96
+ return { ok: true };
97
+ },
98
+ set_agent_state({ agent_id, status }) {
99
+ ensureAgent(agent_id);
100
+ state.agents[agent_id].status = status;
101
+ broadcast('agents', safeAgents());
102
+ return { ok: true };
103
+ },
104
+ trace_step({ from_agent, to_agent, label = '', arrow_type = 'msg' }) {
105
+ ensureAgent(from_agent); ensureAgent(to_agent);
106
+ const arrow = { from: from_agent, to: to_agent, label, arrow_type, ts: Date.now() };
107
+ state.arrows.unshift(arrow);
108
+ if (state.arrows.length > 50) state.arrows.pop();
109
+ broadcast('arrow', arrow);
110
+ return { ok: true };
111
+ },
112
+ set_plan({ tasks }) { state.plan = tasks; broadcast('plan', tasks); return { ok: true }; },
113
+ set_goal({ goal, run_id }) {
114
+ state.goal = goal; state.runId = run_id || String(Date.now());
115
+ state.status = 'running'; state.startedAt = Date.now();
116
+ broadcast('goal', { goal, runId: state.runId });
117
+ broadcast('status', 'running');
118
+ return { ok: true };
119
+ },
120
+ finish_run({ status = 'done' }) {
121
+ state.status = status; broadcast('status', status); return { ok: true };
122
+ },
123
+
124
+ // ── Internal observability tools ──────────────────────────────────────────
125
+ log_embedding({ agent, text, model = 'text-embedding-3-small', dims = 1536, latency_ms = 0 }) {
126
+ ensureAgent(agent);
127
+ const item = { kind: 'embedding', agent, text: String(text).slice(0, 90), model, dims, latency_ms, ts: Date.now() };
128
+ state.internals.unshift(item);
129
+ if (state.internals.length > 200) state.internals.pop();
130
+ broadcast('internal', item);
131
+ return { ok: true };
132
+ },
133
+ log_retrieval({ agent, query, results = [], latency_ms = 0 }) {
134
+ ensureAgent(agent);
135
+ const item = {
136
+ kind: 'retrieval', agent,
137
+ query: String(query).slice(0, 90),
138
+ results: results.slice(0, 6).map(r => ({ text: String(r.text || '').slice(0, 70), score: r.score ?? 0 })),
139
+ latency_ms, ts: Date.now(),
140
+ };
141
+ state.internals.unshift(item);
142
+ if (state.internals.length > 200) state.internals.pop();
143
+ broadcast('internal', item);
144
+ return { ok: true };
145
+ },
146
+ log_tool_call({ agent, tool_name, input = '', output = '', latency_ms = 0, error = null }) {
147
+ ensureAgent(agent);
148
+ const item = {
149
+ kind: 'tool_call', agent, tool_name,
150
+ input: String(input).slice(0, 4000),
151
+ output: String(output).slice(0, 4000),
152
+ latency_ms, error, ts: Date.now(),
153
+ };
154
+ state.internals.unshift(item);
155
+ if (state.internals.length > 200) state.internals.pop();
156
+ broadcast('internal', item);
157
+ return { ok: true };
158
+ },
159
+ log_generation({ agent, prompt_tokens = 0, completion_tokens = 0, model = '', latency_ms = 0, stop_reason = 'stop', messages = [], response = null, thinking = null }) {
160
+ ensureAgent(agent);
161
+ const total = prompt_tokens + completion_tokens;
162
+ const item = {
163
+ kind: 'generation', agent, prompt_tokens, completion_tokens, total, model, latency_ms, stop_reason,
164
+ messages: (messages||[]).slice(0,30).map(m => ({ role: String(m.role||'user'), content: String(m.content||'').slice(0,2000) })),
165
+ response: response ? String(response).slice(0,4000) : null,
166
+ thinking: thinking ? String(thinking).slice(0,3000) : null,
167
+ ts: Date.now(),
168
+ };
169
+ state.internals.unshift(item);
170
+ if (state.internals.length > 200) state.internals.pop();
171
+ if (total) {
172
+ state.agents[agent].tokens += total;
173
+ state.agents[agent].calls += 1;
174
+ state.metrics.tokens += total;
175
+ }
176
+ broadcast('internal', item);
177
+ broadcast('agents', safeAgents());
178
+ broadcast('metrics', state.metrics);
179
+ return { ok: true };
180
+ },
181
+ };
182
+ // alias: log_llm_turn → log_generation (richer name exposed in MCP)
183
+ TOOLS.log_llm_turn = TOOLS.log_generation;
184
+
185
+ // ── Demo scenarios ─────────────────────────────────────────────────────────────
186
+ const SCENARIOS = {
187
+ research_code: {
188
+ goal: 'Explain quicksort and write a Python implementation',
189
+ steps: [
190
+ { delay: 0, fn: () => {
191
+ TOOLS.register_agent({ id: 'orchestrator', label: 'Orchestrator', role: 'orchestrator', model: 'claude-sonnet-4-20250514', token_budget: 16384 });
192
+ TOOLS.register_agent({ id: 'researcher', label: 'Researcher', role: 'researcher', model: 'claude-haiku-4-5-20251001', reports_to: 'orchestrator', token_budget: 8192 });
193
+ TOOLS.register_agent({ id: 'coder', label: 'Coder', role: 'coder', model: 'claude-sonnet-4-20250514', reports_to: 'orchestrator', token_budget: 8192 });
194
+ TOOLS.register_agent({ id: 'critic', label: 'Critic', role: 'critic', model: 'claude-haiku-4-5-20251001', reports_to: 'orchestrator', token_budget: 4096 });
195
+ }},
196
+ { delay: 800, fn: () => {
197
+ TOOLS.set_goal({ goal: SCENARIOS.research_code.goal });
198
+ TOOLS.set_agent_state({ agent_id: 'orchestrator', status: 'running' });
199
+ TOOLS.log_generation({ agent: 'orchestrator', prompt_tokens: 280, completion_tokens: 95, model: 'claude-sonnet-4-20250514', latency_ms: 620, stop_reason: 'end_turn',
200
+ messages: [
201
+ { role: 'system', content: 'You are an orchestrator agent. Break the user goal into subtasks and delegate to specialist agents: Researcher (theory/research), Coder (implementation), Critic (validation). Always plan before routing.' },
202
+ { role: 'user', content: 'Explain quicksort and write a Python implementation' },
203
+ ],
204
+ response: "I'll break this into 3 sequential tasks:\n1. **Researcher** — explain quicksort: theory, O(n log n) complexity, partition schemes (Lomuto/Hoare)\n2. **Coder** — write a clean Python implementation with type hints, docstrings, and edge-case handling\n3. **Critic** — review code quality, correctness, and style\n\nRouting to Researcher first.",
205
+ });
206
+ TOOLS.log_event({ agent: 'orchestrator', event_type: 'start', message: 'Planning tasks…' });
207
+ }},
208
+ { delay: 900, fn: () => {
209
+ TOOLS.set_plan({ tasks: [{ agent: 'researcher', task: 'Explain quicksort', depends_on: [] }, { agent: 'coder', task: 'Write Python implementation', depends_on: [0] }, { agent: 'critic', task: 'Validate code quality', depends_on: [1] }] });
210
+ TOOLS.trace_step({ from_agent: 'orchestrator', to_agent: 'researcher', label: 'explain', arrow_type: 'msg' });
211
+ TOOLS.set_agent_state({ agent_id: 'researcher', status: 'running' });
212
+ TOOLS.set_memory({ key: 'goal', value: SCENARIOS.research_code.goal });
213
+ }},
214
+ // Researcher — embed query, web search, generate
215
+ { delay: 400, fn: () => {
216
+ TOOLS.log_embedding({ agent: 'researcher', text: 'quicksort algorithm explanation divide conquer', model: 'text-embedding-3-small', dims: 1536, latency_ms: 48 });
217
+ }},
218
+ { delay: 300, fn: () => {
219
+ TOOLS.log_retrieval({ agent: 'researcher', query: 'quicksort algorithm complexity analysis', latency_ms: 92,
220
+ results: [
221
+ { text: 'Quicksort uses divide-and-conquer: pick a pivot, partition into <, =, > subarrays.', score: 0.94 },
222
+ { text: 'Average-case O(n log n); worst-case O(n²) with bad pivot selection.', score: 0.91 },
223
+ { text: 'Lomuto vs Hoare partition schemes differ in swap count and cache behaviour.', score: 0.87 },
224
+ { text: 'Introsort (used in STL) falls back to heapsort to avoid O(n²) worst case.', score: 0.82 },
225
+ ],
226
+ });
227
+ }},
228
+ { delay: 500, fn: () => {
229
+ TOOLS.log_tool_call({ agent: 'researcher', tool_name: 'web_search', input: 'quicksort algorithm detailed explanation', output: '6 results — Wikipedia, CS Visualizer, CLRS excerpt', latency_ms: 340 });
230
+ }},
231
+ { delay: 900, fn: () => {
232
+ TOOLS.log_generation({ agent: 'researcher', prompt_tokens: 1840, completion_tokens: 620, model: 'claude-haiku-4-5-20251001', latency_ms: 1320, stop_reason: 'end_turn',
233
+ messages: [
234
+ { role: 'system', content: 'You are a researcher agent. Synthesise accurate technical information from retrieved documents. Be precise, cite complexity bounds, note tradeoffs.' },
235
+ { role: 'user', content: 'Task from orchestrator: Explain the quicksort algorithm in detail — theory, complexity, partition schemes.' },
236
+ { role: 'assistant', content: '[embedding query and retrieving relevant documents…]' },
237
+ { role: 'tool', content: 'Retrieved 4 chunks:\n• Quicksort uses divide-and-conquer: pick a pivot, partition into <, =, > subarrays. (score 0.94)\n• Average O(n log n); worst O(n²) with bad pivot selection. (score 0.91)\n• Lomuto vs Hoare partition differ in swap count and cache behaviour. (score 0.87)\n• Introsort falls back to heapsort to avoid O(n²) worst case. (score 0.82)' },
238
+ ],
239
+ response: "**Quicksort** is a divide-and-conquer sorting algorithm:\n\n**Core strategy**: Choose a pivot element, partition the array into ≤ pivot and > pivot halves, then recursively sort each half in-place.\n\n**Complexity**:\n- Average: O(n log n) — balanced splits with good pivot choice\n- Worst: O(n²) — degenerate pivot on already-sorted input\n- Space: O(log n) stack depth average\n\n**Partition schemes**:\n- *Lomuto*: simpler code, last element as pivot, O(n) comparisons\n- *Hoare*: ~3× fewer swaps, two converging pointers\n\n**Practical optimisations**:\n- Median-of-3 pivot selection to avoid worst case\n- Switch to insertion sort for subarrays smaller than ~10 elements\n- Introsort (as used in the C++ STL) adds heapsort fallback for guaranteed O(n log n)",
240
+ });
241
+ TOOLS.log_event({ agent: 'researcher', event_type: 'reply', message: 'Quicksort: divide-and-conquer. Pivot splits into <, =, > partitions. Avg O(n log n), worst O(n²) with sorted input.' });
242
+ TOOLS.set_memory({ key: 'research', value: 'Quicksort: O(n log n) avg, O(n²) worst. Lomuto/Hoare partition.' });
243
+ TOOLS.trace_step({ from_agent: 'researcher', to_agent: 'orchestrator', label: 'done', arrow_type: 'result' });
244
+ TOOLS.set_agent_state({ agent_id: 'researcher', status: 'done' });
245
+ }},
246
+ { delay: 500, fn: () => {
247
+ TOOLS.trace_step({ from_agent: 'orchestrator', to_agent: 'coder', label: 'implement', arrow_type: 'msg' });
248
+ TOOLS.set_agent_state({ agent_id: 'coder', status: 'running' });
249
+ }},
250
+ // Coder — retrieve code examples, execute sandbox, generate
251
+ { delay: 400, fn: () => {
252
+ TOOLS.log_embedding({ agent: 'coder', text: 'Python quicksort implementation with type hints', model: 'text-embedding-3-small', dims: 1536, latency_ms: 51 });
253
+ }},
254
+ { delay: 300, fn: () => {
255
+ TOOLS.log_retrieval({ agent: 'coder', query: 'Python quicksort recursive + in-place implementation', latency_ms: 88,
256
+ results: [
257
+ { text: 'def quicksort(arr): return [] if not arr else quicksort([x for x in arr[1:] if x<=arr[0]])+[arr[0]]+...', score: 0.96 },
258
+ { text: 'def partition(arr, lo, hi): pivot=arr[hi]; i=lo-1; ...', score: 0.93 },
259
+ { text: 'Lomuto partition: O(n) extra calls, simpler code. Hoare: fewer swaps.', score: 0.88 },
260
+ ],
261
+ });
262
+ }},
263
+ { delay: 800, fn: () => {
264
+ TOOLS.log_generation({ agent: 'coder', prompt_tokens: 3200, completion_tokens: 980, model: 'claude-sonnet-4-20250514', latency_ms: 2100, stop_reason: 'end_turn',
265
+ messages: [
266
+ { role: 'system', content: 'You are a coder agent. Write production-quality Python code with type hints, docstrings, and edge-case handling. Use retrieved snippets as reference, not copy-paste.' },
267
+ { role: 'user', content: 'Task from orchestrator: Implement quicksort in Python. Context from researcher:\n- Quicksort: divide-and-conquer, pick pivot, partition into ≤ and > halves\n- Avg O(n log n), worst O(n²)\n- Lomuto partition is simpler; Hoare has fewer swaps' },
268
+ { role: 'assistant', content: '[retrieving code examples and planning implementation…]' },
269
+ { role: 'tool', content: 'Retrieved snippets:\n• def quicksort(arr): return [] if not arr else quicksort([x for x in arr[1:] if x<=arr[0]])+[arr[0]]+... (functional, not in-place)\n• def partition(arr, lo, hi): pivot=arr[hi]; i=lo-1; ... (Lomuto, in-place)' },
270
+ ],
271
+ response: "```python\nfrom typing import TypeVar\n\nT = TypeVar('T')\n\n\ndef quicksort(arr: list[T]) -> list[T]:\n    \"\"\"Return a new sorted list using quicksort (functional style).\n\n    Average O(n log n), worst-case O(n²). Not in-place.\n    \"\"\"\n    if len(arr) <= 1:\n        return arr[:]\n    pivot = arr[len(arr) // 2]  # middle element as pivot (simple)\n    left = [x for x in arr if x < pivot]\n    mid = [x for x in arr if x == pivot]\n    right = [x for x in arr if x > pivot]\n    return quicksort(left) + mid + quicksort(right)\n\n\ndef quicksort_inplace(arr: list[T], lo: int = 0, hi: int | None = None) -> None:\n    \"\"\"Sort arr in-place using Lomuto partition scheme.\"\"\"\n    if hi is None:\n        hi = len(arr) - 1\n    if lo < hi:\n        p = _partition(arr, lo, hi)\n        quicksort_inplace(arr, lo, p - 1)\n        quicksort_inplace(arr, p + 1, hi)\n\n\ndef _partition(arr: list[T], lo: int, hi: int) -> int:\n    pivot, i = arr[hi], lo - 1\n    for j in range(lo, hi):\n        if arr[j] <= pivot:\n            i += 1\n            arr[i], arr[j] = arr[j], arr[i]\n    arr[i + 1], arr[hi] = arr[hi], arr[i + 1]\n    return i + 1\n```\n\nBoth variants pass edge cases: empty list, single element, all-equal, reverse-sorted.",
272
+ });
273
+ }},
274
+ { delay: 600, fn: () => {
275
+ TOOLS.log_tool_call({ agent: 'coder', tool_name: 'code_execute', input: 'quicksort([3,1,4,1,5,9,2,6]) — smoke test', output: '[1, 1, 2, 3, 4, 5, 6, 9] ✓ (12ms)', latency_ms: 112 });
276
+ TOOLS.log_tool_call({ agent: 'coder', tool_name: 'code_execute', input: 'quicksort([]) — edge case empty list', output: '[] ✓', latency_ms: 8 });
277
+ TOOLS.log_tool_call({ agent: 'coder', tool_name: 'code_execute', input: 'quicksort([1]) — single element', output: '[1] ✓', latency_ms: 6 });
278
+ }},
279
+ { delay: 900, fn: () => {
280
+ TOOLS.log_event({ agent: 'coder', event_type: 'reply', message: 'quicksort() + quicksort_inplace() — full docstrings, Lomuto partition, all edge cases pass.' });
281
+ TOOLS.set_memory({ key: 'code', value: 'def quicksort(arr: list) -> list: ...\ndef quicksort_inplace(arr, lo, hi): ...' });
282
+ TOOLS.trace_step({ from_agent: 'coder', to_agent: 'orchestrator', label: 'ready', arrow_type: 'result' });
283
+ TOOLS.set_agent_state({ agent_id: 'coder', status: 'done' });
284
+ }},
285
+ { delay: 500, fn: () => {
286
+ TOOLS.trace_step({ from_agent: 'orchestrator', to_agent: 'critic', label: 'validate', arrow_type: 'msg' });
287
+ TOOLS.set_agent_state({ agent_id: 'critic', status: 'running' });
288
+ }},
289
+ // Critic — embed code, lint, generate review
290
+ { delay: 400, fn: () => {
291
+ TOOLS.log_embedding({ agent: 'critic', text: 'def quicksort(arr: list) -> list: ...', model: 'text-embedding-3-small', dims: 1536, latency_ms: 44 });
292
+ }},
293
+ { delay: 300, fn: () => {
294
+ TOOLS.log_tool_call({ agent: 'critic', tool_name: 'lint_check', input: 'quicksort.py', output: 'pylint 9.8/10 — 0 errors, 1 convention (missing module docstring)', latency_ms: 180 });
295
+ TOOLS.log_tool_call({ agent: 'critic', tool_name: 'type_check', input: 'mypy quicksort.py --strict', output: 'Success: no issues found in 1 source file', latency_ms: 95 });
296
+ }},
297
+ { delay: 700, fn: () => {
298
+ TOOLS.log_generation({ agent: 'critic', prompt_tokens: 2100, completion_tokens: 480, model: 'claude-haiku-4-5-20251001', latency_ms: 980, stop_reason: 'end_turn',
299
+ messages: [
300
+ { role: 'system', content: 'You are a critic agent. Review code for correctness, style, type safety, and edge-case coverage. Output a score /10 with justification.' },
301
+ { role: 'user', content: 'Review this Python quicksort implementation:\n\ndef quicksort(arr: list[T]) -> list[T]: ...\ndef quicksort_inplace(arr, lo, hi): ...\n\nTool results: pylint 9.8/10, mypy strict: no issues.' },
302
+ ],
303
+ response: "**PASS — 9/10**\n\n✓ Type hints on public API (TypeVar T for generics)\n✓ Docstrings explain complexity and behaviour\n✓ Both functional and in-place variants provided\n✓ Edge cases: empty list, single element return correctly\n✓ mypy strict passes — no type errors\n✓ pylint 9.8/10\n\n**Minor issues**:\n- Missing module-level docstring (-0.5)\n- `quicksort_inplace` docstring doesn't document `lo`/`hi` params (-0.5)\n- Pivot selection is not median-of-3 — can hit O(n²) on nearly-sorted input (acceptable for demo)\n\nRecommendation: **approve for merge**. Add module docstring before production use.",
304
+ });
305
+ TOOLS.log_event({ agent: 'critic', event_type: 'pass', message: 'PASS 9/10 — clean API, type-safe, edge cases covered. Minor: missing module docstring.' });
306
+ TOOLS.trace_step({ from_agent: 'critic', to_agent: 'orchestrator', label: 'pass 9/10', arrow_type: 'result' });
307
+ TOOLS.set_agent_state({ agent_id: 'critic', status: 'done' });
308
+ }},
309
+ { delay: 400, fn: () => {
310
+ TOOLS.set_memory({ key: 'output', value: 'quicksort.py — approved 9/10' });
311
+ TOOLS.log_event({ agent: 'orchestrator', event_type: 'done', message: 'Run complete — 18 steps' });
312
+ TOOLS.set_agent_state({ agent_id: 'orchestrator', status: 'done' });
313
+ TOOLS.finish_run({ status: 'done' });
314
+ }},
315
+ ],
316
+ },
317
+
318
+ critic_retry: {
319
+ goal: 'Write an RFC-5321 compliant email regex validator',
320
+ steps: [
321
+ { delay: 0, fn: () => {
322
+ TOOLS.register_agent({ id: 'orchestrator', label: 'Orchestrator', role: 'orchestrator', model: 'claude-sonnet-4-20250514', token_budget: 16384 });
323
+ TOOLS.register_agent({ id: 'coder', label: 'Coder', role: 'coder', model: 'claude-sonnet-4-20250514', reports_to: 'orchestrator', token_budget: 8192 });
324
+ TOOLS.register_agent({ id: 'critic', label: 'Critic', role: 'critic', model: 'claude-haiku-4-5-20251001', reports_to: 'orchestrator', token_budget: 4096 });
325
+ }},
326
+ { delay: 700, fn: () => {
327
+ TOOLS.set_goal({ goal: SCENARIOS.critic_retry.goal });
328
+ TOOLS.set_agent_state({ agent_id: 'orchestrator', status: 'running' });
329
+ TOOLS.log_generation({ agent: 'orchestrator', prompt_tokens: 240, completion_tokens: 80, model: 'claude-sonnet-4-20250514', latency_ms: 580 });
330
+ TOOLS.log_event({ agent: 'orchestrator', event_type: 'start', message: 'Planning…' });
331
+ }},
332
+ { delay: 800, fn: () => {
333
+ TOOLS.set_plan({ tasks: [{ agent: 'coder', task: 'Write RFC-5321 email regex', depends_on: [] }, { agent: 'critic', task: 'Validate regex correctness', depends_on: [0] }] });
334
+ TOOLS.trace_step({ from_agent: 'orchestrator', to_agent: 'coder', label: 'write', arrow_type: 'msg' });
335
+ TOOLS.set_agent_state({ agent_id: 'coder', status: 'running' });
336
+ }},
337
+ // Coder v1 — minimal attempt
338
+ { delay: 400, fn: () => {
339
+ TOOLS.log_embedding({ agent: 'coder', text: 'RFC-5321 email address validation regex Python', model: 'text-embedding-3-small', dims: 1536, latency_ms: 49 });
340
+ }},
341
+ { delay: 300, fn: () => {
342
+ TOOLS.log_retrieval({ agent: 'coder', query: 'email regex RFC 5321 compliant Python', latency_ms: 84,
343
+ results: [
344
+ { text: 'Simple: r"[^@]+@[^@]+\\.[^@]+" — catches most but misses edge cases.', score: 0.89 },
345
+ { text: 'RFC-5321 allows quoted strings, IP literals, special chars in local part.', score: 0.85 },
346
+ ],
347
+ });
348
+ }},
349
+ { delay: 900, fn: () => {
350
+ TOOLS.log_generation({ agent: 'coder', prompt_tokens: 920, completion_tokens: 240, model: 'claude-sonnet-4-20250514', latency_ms: 1800, stop_reason: 'end_turn' });
351
+ TOOLS.log_tool_call({ agent: 'coder', tool_name: 'code_execute', input: 'test_email("user@example.com")', output: 'True ✓', latency_ms: 14 });
352
+ TOOLS.log_event({ agent: 'coder', event_type: 'reply', message: 'Draft v1: r"[^@]+" — covers basic cases.' });
353
+ TOOLS.set_memory({ key: 'code', value: 'r"[^@]+"' });
354
+ TOOLS.trace_step({ from_agent: 'coder', to_agent: 'orchestrator', label: 'v1', arrow_type: 'result' });
355
+ TOOLS.set_agent_state({ agent_id: 'coder', status: 'active' });
356
+ }},
357
+ // Critic v1 review — fail
358
+ { delay: 500, fn: () => {
359
+ TOOLS.trace_step({ from_agent: 'orchestrator', to_agent: 'critic', label: 'review v1', arrow_type: 'msg' });
360
+ TOOLS.set_agent_state({ agent_id: 'critic', status: 'running' });
361
+ }},
362
+ { delay: 400, fn: () => {
363
+ TOOLS.log_embedding({ agent: 'critic', text: 'r"[^@]+" email regex RFC-5321 compliance', model: 'text-embedding-3-small', dims: 1536, latency_ms: 46 });
364
+ TOOLS.log_tool_call({ agent: 'critic', tool_name: 'regex_test_suite', input: 'RFC-5321 test vectors (120 cases)', output: '67/120 pass — missing TLDs, quoted strings, IP literals, consecutive dot check', latency_ms: 220 });
365
+ }},
366
+ { delay: 700, fn: () => {
367
+ TOOLS.log_generation({ agent: 'critic', prompt_tokens: 1400, completion_tokens: 360, model: 'claude-haiku-4-5-20251001', latency_ms: 980, stop_reason: 'end_turn' });
368
+ TOOLS.log_event({ agent: 'critic', event_type: 'fail', message: 'FAIL 4/10 — 67/120 test vectors pass. Missing: TLDs, quoted strings, IP literals, consecutive-dot rule.' });
369
+ TOOLS.set_memory({ key: 'critique', value: 'fail 4/10 — missing TLDs, quoted strings, IP literals' });
370
+ TOOLS.trace_step({ from_agent: 'critic', to_agent: 'orchestrator', label: 'fail 4/10', arrow_type: 'result' });
371
+ TOOLS.set_agent_state({ agent_id: 'critic', status: 'active' });
372
+ state.metrics.retries++; broadcast('metrics', state.metrics);
373
+ }},
374
+ // Orchestrator retries coder
375
+ { delay: 500, fn: () => {
376
+ TOOLS.log_generation({ agent: 'orchestrator', prompt_tokens: 480, completion_tokens: 120, model: 'claude-sonnet-4-20250514', latency_ms: 640 });
377
+ TOOLS.log_event({ agent: 'orchestrator', event_type: 'retry', message: 'Critic FAIL — retrying Coder with full critique attached' });
378
+ TOOLS.trace_step({ from_agent: 'orchestrator', to_agent: 'coder', label: 'retry', arrow_type: 'retry' });
379
+ TOOLS.set_agent_state({ agent_id: 'coder', status: 'running' });
380
+ }},
381
+ // Coder v2 — thorough attempt
382
+ { delay: 400, fn: () => {
383
+ TOOLS.log_embedding({ agent: 'coder', text: 'RFC-5321 quoted strings IP literal TLD validation', model: 'text-embedding-3-small', dims: 1536, latency_ms: 52 });
384
+ TOOLS.log_retrieval({ agent: 'coder', query: 'RFC 5321 email local-part quoted string IP literal syntax', latency_ms: 96,
385
+ results: [
386
+ { text: 'Local part: atom or quoted-string. Quoted allows spaces, special chars within double quotes.', score: 0.95 },
387
+ { text: 'Domain: hostname or IP literal [n.n.n.n]. TLD must be 2+ alpha chars.', score: 0.93 },
388
+ { text: 'No consecutive dots in local or domain part. No leading/trailing dot.', score: 0.91 },
389
+ ],
390
+ });
391
+ }},
392
+ { delay: 1200, fn: () => {
393
+ TOOLS.log_generation({ agent: 'coder', prompt_tokens: 2800, completion_tokens: 780, model: 'claude-sonnet-4-20250514', latency_ms: 2600, stop_reason: 'end_turn' });
394
+ }},
395
+ { delay: 600, fn: () => {
396
+ TOOLS.log_tool_call({ agent: 'coder', tool_name: 'code_execute', input: 'RFC-5321 test suite — 120 vectors', output: '118/120 pass (2 obscure IPv6 edge cases)', latency_ms: 340 });
397
+ TOOLS.log_event({ agent: 'coder', event_type: 'reply', message: 'Draft v2: RFC-5321 compliant — TLD check, quoted strings, IP literals, consecutive-dot guard.' });
398
+ TOOLS.set_memory({ key: 'code', value: 'RFC5321_RE = re.compile(r\'...\') # 118/120 RFC vectors pass' });
399
+ TOOLS.trace_step({ from_agent: 'coder', to_agent: 'orchestrator', label: 'v2', arrow_type: 'result' });
400
+ TOOLS.set_agent_state({ agent_id: 'coder', status: 'done' });
401
+ }},
402
+ // Critic v2 review — pass
403
+ { delay: 500, fn: () => {
404
+ TOOLS.trace_step({ from_agent: 'orchestrator', to_agent: 'critic', label: 'review v2', arrow_type: 'msg' });
405
+ TOOLS.set_agent_state({ agent_id: 'critic', status: 'running' });
406
+ }},
407
+ { delay: 400, fn: () => {
408
+ TOOLS.log_tool_call({ agent: 'critic', tool_name: 'regex_test_suite', input: 'RFC-5321 test vectors (120 cases)', output: '118/120 pass — 2 obscure IPv6 literals; acceptable for prod use', latency_ms: 215 });
409
+ }},
410
+ { delay: 700, fn: () => {
411
+ TOOLS.log_generation({ agent: 'critic', prompt_tokens: 1600, completion_tokens: 320, model: 'claude-haiku-4-5-20251001', latency_ms: 860, stop_reason: 'end_turn' });
412
+ TOOLS.log_event({ agent: 'critic', event_type: 'pass', message: 'PASS 9/10 — 118/120 RFC vectors pass, production-ready.' });
413
+ TOOLS.trace_step({ from_agent: 'critic', to_agent: 'orchestrator', label: 'pass 9/10', arrow_type: 'result' });
414
+ TOOLS.set_agent_state({ agent_id: 'critic', status: 'done' });
415
+ }},
416
+ { delay: 400, fn: () => {
417
+ TOOLS.log_event({ agent: 'orchestrator', event_type: 'done', message: 'Complete after 1 retry — 20 steps' });
418
+ TOOLS.set_agent_state({ agent_id: 'orchestrator', status: 'done' });
419
+ TOOLS.finish_run({ status: 'done' });
420
+ }},
421
+ ],
422
+ },
423
+
424
+ memory_overflow: {
425
+ goal: 'Summarise 3 ML papers and synthesise into a report',
426
+ steps: [
427
+ { delay: 0, fn: () => {
428
+ TOOLS.register_agent({ id: 'orchestrator', label: 'Orchestrator', role: 'orchestrator', model: 'claude-sonnet-4-20250514', token_budget: 16384 });
429
+ TOOLS.register_agent({ id: 'researcher', label: 'Researcher', role: 'researcher', model: 'claude-haiku-4-5-20251001', reports_to: 'orchestrator', token_budget: 8192 });
430
+ TOOLS.register_agent({ id: 'synthesiser', label: 'Synthesiser', role: 'synthesiser', model: 'claude-sonnet-4-20250514', reports_to: 'orchestrator', token_budget: 8192 });
431
+ TOOLS.register_agent({ id: 'critic', label: 'Critic', role: 'critic', model: 'claude-haiku-4-5-20251001', reports_to: 'orchestrator', token_budget: 4096 });
432
+ }},
433
+ { delay: 700, fn: () => {
434
+ TOOLS.set_goal({ goal: SCENARIOS.memory_overflow.goal });
435
+ TOOLS.set_agent_state({ agent_id: 'orchestrator', status: 'running' });
436
+ TOOLS.log_generation({ agent: 'orchestrator', prompt_tokens: 260, completion_tokens: 88, model: 'claude-sonnet-4-20250514', latency_ms: 600 });
437
+ TOOLS.log_event({ agent: 'orchestrator', event_type: 'start', message: 'Planning 3-paper synthesis…' });
438
+ }},
439
+ { delay: 900, fn: () => {
440
+ TOOLS.set_plan({ tasks: [{ agent: 'researcher', task: 'Summarise paper A — scaling laws', depends_on: [] }, { agent: 'researcher', task: 'Summarise paper B — MoE routing', depends_on: [] }, { agent: 'researcher', task: 'Summarise paper C — RLHF hacking', depends_on: [] }, { agent: 'synthesiser', task: 'Synthesise into report', depends_on: [0,1,2] }] });
441
+ TOOLS.trace_step({ from_agent: 'orchestrator', to_agent: 'researcher', label: 'paper A', arrow_type: 'msg' });
442
+ TOOLS.set_agent_state({ agent_id: 'researcher', status: 'running' });
443
+ }},
444
+ // Paper A
445
+ { delay: 400, fn: () => {
446
+ TOOLS.log_tool_call({ agent: 'researcher', tool_name: 'pdf_extract', input: 'scaling_laws_2020.pdf', output: '18,400 tokens extracted — 42 pages', latency_ms: 480 });
447
+ TOOLS.log_embedding({ agent: 'researcher', text: 'neural scaling laws loss compute data parameters', model: 'text-embedding-3-small', dims: 1536, latency_ms: 55 });
448
+ }},
449
+ { delay: 600, fn: () => {
450
+ TOOLS.log_retrieval({ agent: 'researcher', query: 'key findings scaling laws compute-optimal training', latency_ms: 104,
451
+ results: [
452
+ { text: 'Loss scales as power law with N (params), D (data), C (compute): L ∝ N^-0.076.', score: 0.97 },
453
+ { text: 'Compute-optimal: scale params and data proportionally. Chinchilla law.', score: 0.94 },
454
+ { text: 'Irreducible loss ≈ 1.69 nats; emergent capabilities at scale thresholds.', score: 0.88 },
455
+ ],
456
+ });
457
+ TOOLS.log_generation({ agent: 'researcher', prompt_tokens: 2400, completion_tokens: 520, model: 'claude-haiku-4-5-20251001', latency_ms: 1600, stop_reason: 'end_turn' });
458
+ TOOLS.log_event({ agent: 'researcher', event_type: 'reply', message: 'Paper A (scaling laws): loss ∝ N^-0.076. Compute-optimal: equal param/data scaling.' });
459
+ TOOLS.set_memory({ key: 'paper_a', value: 'Scaling laws: loss ∝ N^-0.076, Chinchilla-optimal' });
460
+ TOOLS.trace_step({ from_agent: 'researcher', to_agent: 'orchestrator', label: 'A done', arrow_type: 'result' });
461
+ }},
462
+ // Paper B
463
+ { delay: 400, fn: () => {
464
+ TOOLS.trace_step({ from_agent: 'orchestrator', to_agent: 'researcher', label: 'paper B', arrow_type: 'msg' });
465
+ TOOLS.log_tool_call({ agent: 'researcher', tool_name: 'pdf_extract', input: 'moe_routing_2023.pdf', output: '22,100 tokens extracted — 51 pages', latency_ms: 520 });
466
+ TOOLS.log_embedding({ agent: 'researcher', text: 'mixture of experts routing sparse transformer efficiency', model: 'text-embedding-3-small', dims: 1536, latency_ms: 53 });
467
+ }},
468
+ { delay: 600, fn: () => {
469
+ TOOLS.log_retrieval({ agent: 'researcher', query: 'MoE routing top-k expert selection load balancing', latency_ms: 98,
470
+ results: [
471
+ { text: 'Top-2 routing: each token sent to 2 of N experts. 60% active-param reduction vs dense.', score: 0.96 },
472
+ { text: 'Load balancing loss prevents expert collapse. Jitter noise aids exploration.', score: 0.92 },
473
+ { text: 'Switch Transformer: top-1 routing, simpler but prone to collapse without aux loss.', score: 0.87 },
474
+ ],
475
+ });
476
+ TOOLS.log_generation({ agent: 'researcher', prompt_tokens: 2800, completion_tokens: 490, model: 'claude-haiku-4-5-20251001', latency_ms: 1500, stop_reason: 'end_turn' });
477
+ TOOLS.log_event({ agent: 'researcher', event_type: 'reply', message: 'Paper B: MoE top-2 routing, 60% active-param reduction. Load-balance aux loss prevents collapse.' });
478
+ TOOLS.set_memory({ key: 'paper_b', value: 'MoE: top-2 routing, 60% reduction, aux load-balance loss' });
479
+ TOOLS.trace_step({ from_agent: 'researcher', to_agent: 'orchestrator', label: 'B done', arrow_type: 'result' });
480
+ }},
481
+ // Paper C — triggers memory pressure
482
+ { delay: 400, fn: () => {
483
+ TOOLS.trace_step({ from_agent: 'orchestrator', to_agent: 'researcher', label: 'paper C', arrow_type: 'msg' });
484
+ TOOLS.log_tool_call({ agent: 'researcher', tool_name: 'pdf_extract', input: 'rlhf_reward_hacking_2024.pdf', output: '31,200 tokens extracted — 68 pages', latency_ms: 710 });
485
+ TOOLS.log_embedding({ agent: 'researcher', text: 'RLHF reward hacking overoptimisation KL penalty', model: 'text-embedding-3-small', dims: 1536, latency_ms: 58 });
486
+ }},
487
+ { delay: 600, fn: () => {
488
+ TOOLS.log_retrieval({ agent: 'researcher', query: 'reward hacking frequency mitigation strategies RLHF', latency_ms: 112,
489
+ results: [
490
+ { text: 'Reward hacking observed in 34% of runs beyond 3000 RL steps. KL alone insufficient.', score: 0.95 },
491
+ { text: 'Constitutional AI + process reward models reduce hacking to <8%.', score: 0.91 },
492
+ { text: 'Ensemble reward models provide more robust signal than single RM.', score: 0.88 },
493
+ ],
494
+ });
495
+ TOOLS.log_generation({ agent: 'researcher', prompt_tokens: 3200, completion_tokens: 560, model: 'claude-haiku-4-5-20251001', latency_ms: 1800, stop_reason: 'end_turn' });
496
+ TOOLS.log_event({ agent: 'researcher', event_type: 'reply', message: 'Paper C: RLHF reward hacking in 34% of runs. KL penalty alone insufficient; ensemble RMs help.' });
497
+ TOOLS.set_memory({ key: 'paper_c', value: 'RLHF: reward hacking 34%, use ensemble RMs + CAI' });
498
+ TOOLS.trace_step({ from_agent: 'researcher', to_agent: 'orchestrator', label: 'C done', arrow_type: 'result' });
499
+ TOOLS.set_agent_state({ agent_id: 'researcher', status: 'done' });
500
+ }},
501
+ // Synthesiser — context overflow
502
+ { delay: 600, fn: () => {
503
+ TOOLS.trace_step({ from_agent: 'orchestrator', to_agent: 'synthesiser', label: 'synthesise', arrow_type: 'msg' });
504
+ TOOLS.set_agent_state({ agent_id: 'synthesiser', status: 'running' });
505
+ }},
506
+ { delay: 400, fn: () => {
507
+ TOOLS.log_embedding({ agent: 'synthesiser', text: 'scaling laws MoE routing RLHF reward hacking synthesis', model: 'text-embedding-3-small', dims: 1536, latency_ms: 62 });
508
+ TOOLS.log_tool_call({ agent: 'synthesiser', tool_name: 'context_count', input: 'papers A+B+C combined tokens', output: '7,840 / 8,192 tokens used (95.7%) — paper C will be truncated', latency_ms: 12 });
509
+ TOOLS.log_event({ agent: 'synthesiser', event_type: 'warn', message: 'WARNING: context at 95.7% — paper C (RLHF) will be truncated to fit budget.' });
510
+ }},
511
+ { delay: 1200, fn: () => {
512
+ TOOLS.log_generation({ agent: 'synthesiser', prompt_tokens: 7840, completion_tokens: 980, model: 'claude-sonnet-4-20250514', latency_ms: 3200, stop_reason: 'max_tokens' });
513
+ TOOLS.log_event({ agent: 'synthesiser', event_type: 'reply', message: 'Report done (partial): scaling laws + MoE full coverage; RLHF section truncated — recommend re-running with chunked context.' });
514
+ TOOLS.set_memory({ key: 'output', value: 'Report: scaling (full) + MoE (full) + RLHF (truncated)' });
515
+ TOOLS.trace_step({ from_agent: 'synthesiser', to_agent: 'orchestrator', label: 'report', arrow_type: 'result' });
516
+ TOOLS.set_agent_state({ agent_id: 'synthesiser', status: 'done' });
517
+ }},
518
+ { delay: 400, fn: () => {
519
+ TOOLS.log_event({ agent: 'orchestrator', event_type: 'done', message: 'Complete — context overflow on paper C. Recommend chunked summarisation for large doc sets.' });
520
+ TOOLS.set_agent_state({ agent_id: 'orchestrator', status: 'done' });
521
+ TOOLS.finish_run({ status: 'done' });
522
+ }},
523
+ ],
524
+ },
525
+ };
526
+
527
+ function runScenario(name) {
528
+ const s = SCENARIOS[name];
529
+ if (!s) return false;
530
+ const clients = state.clients;
531
+ state = fresh();
532
+ state.clients = clients;
533
+ broadcast('reset', {});
534
+ let cum = 0;
535
+ s.steps.forEach(step => { cum += step.delay; setTimeout(() => { try { step.fn(); } catch (e) { console.error(e); } }, cum); });
536
+ return true;
537
+ }
538
+
539
+ // ── Dashboard HTML ─────────────────────────────────────────────────────────────
540
+ const HTML = fs.readFileSync(path.join(__dirname, 'dashboard.html'), 'utf8');
541
+
542
+ // ── HTTP helpers ──────────────────────────────────────────────────────────────
543
+ const CORS = {
544
+ 'Access-Control-Allow-Origin': '*',
545
+ 'Access-Control-Allow-Methods': 'GET, POST, OPTIONS',
546
+ 'Access-Control-Allow-Headers': 'Content-Type',
547
+ };
548
+ function body(req, cb) { let d = ''; req.on('data', c => d += c); req.on('end', () => cb(d)); }
549
+ function json(res, data, status = 200) {
550
+ res.writeHead(status, { ...CORS, 'Content-Type': 'application/json' });
551
+ res.end(JSON.stringify(data));
552
+ }
553
+
554
+ // ── HTTP server ────────────────────────────────────────────────────────────────
555
+ const server = http.createServer((req, res) => {
556
+ if (req.method === 'OPTIONS') { res.writeHead(204, CORS); res.end(); return; }
557
+
558
+ // Dashboard UI
559
+ if (req.method === 'GET' && (req.url === '/' || req.url === '/index.html')) {
560
+ res.writeHead(200, { 'Content-Type': 'text/html; charset=utf-8' });
561
+ res.end(HTML);
562
+ return;
563
+ }
564
+
565
+ // SSE stream
566
+ if (req.method === 'GET' && req.url === '/events') {
567
+ res.writeHead(200, { ...CORS, 'Content-Type': 'text/event-stream', 'Cache-Control': 'no-cache', 'Connection': 'keep-alive' });
568
+ res.write(`data: ${JSON.stringify({ type: 'init', payload: { state: snapshot() }, ts: Date.now() })}\n\n`);
569
+ state.clients.push(res);
570
+ req.on('close', () => { state.clients = state.clients.filter(c => c !== res); });
571
+ return;
572
+ }
573
+
574
+ // Current state snapshot
575
+ if (req.method === 'GET' && req.url === '/state') {
576
+ json(res, snapshot()); return;
577
+ }
578
+
579
+ // Tool call
580
+ if (req.method === 'POST' && req.url === '/tool') {
581
+ body(req, data => {
582
+ try {
583
+ const { tool, args } = JSON.parse(data);
584
+ const fn = Object.hasOwn(TOOLS, tool) ? TOOLS[tool] : undefined; // ignore inherited keys such as 'constructor'
585
+ json(res, fn ? fn(args || {}) : { error: `Unknown tool: ${tool}` });
586
+ } catch (e) { json(res, { error: e.message }, 400); }
587
+ }); return;
588
+ }
589
+
590
+ // Run a demo scenario
591
+ if (req.method === 'POST' && req.url === '/emulate') {
592
+ body(req, data => {
593
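+ let scenario; try { ({ scenario } = JSON.parse(data || '{}')); } catch (e) { json(res, { error: e.message }, 400); return; } // malformed JSON returns 400 instead of crashing the process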
594
+ const ok = runScenario(scenario || 'research_code');
595
+ json(res, { ok, scenario }, ok ? 200 : 400);
596
+ }); return;
597
+ }
598
+
599
+ // Reset state
600
+ if (req.method === 'POST' && req.url === '/reset') {
601
+ const clients = state.clients;
602
+ state = fresh(); state.clients = clients;
603
+ broadcast('reset', {});
604
+ json(res, { ok: true }); return;
605
+ }
606
+
607
+ json(res, { error: 'Not found' }, 404);
608
+ });
609
+
610
+ server.listen(PORT, () => {
611
+ console.log(`\n agent-visibility\n`);
612
+ console.log(` Dashboard → http://localhost:${PORT}`);
613
+ console.log(` Tool POST → http://localhost:${PORT}/tool`);
614
+ console.log(` Ctrl+C to stop\n`);
615
+ });
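
The routes above can be exercised from any Node ≥ 18 process. The following is a minimal client sketch, not part of the commit: `buildToolRequest`, `callTool`, `createSseParser`, and the `VISIBILITY_URL` variable are hypothetical names, and the default port is an assumption because `PORT` is defined outside this excerpt. Only the `/tool` route, the `{ tool, args }` body shape, and the `data: <json>` frame followed by a blank line are taken from the server code.

```javascript
// Hypothetical client helpers for the server above (names are illustrative).
// Assumption: the dashboard listens on this URL; adjust to your PORT.
const BASE_URL = process.env.VISIBILITY_URL || 'http://localhost:3000';

// Build the fetch options for POST /tool. The server expects { tool, args }.
function buildToolRequest(tool, args = {}) {
  return {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ tool, args }),
  };
}

// Send one tool call and return the parsed JSON reply.
// Unknown tools come back as { error } in the body, so check for that too.
async function callTool(tool, args = {}, baseUrl = BASE_URL) {
  const res = await fetch(`${baseUrl}/tool`, buildToolRequest(tool, args));
  const data = await res.json();
  if (!res.ok || data?.error) throw new Error(data?.error || `HTTP ${res.status}`);
  return data;
}

// Incremental parser for the SSE text stream served at GET /events.
// Frames are separated by a blank line; each "data:" line carries JSON.
function createSseParser(onEvent) {
  let buffer = '';
  return function feed(chunk) {
    buffer += chunk;
    let idx;
    while ((idx = buffer.indexOf('\n\n')) !== -1) {
      const frame = buffer.slice(0, idx);
      buffer = buffer.slice(idx + 2);
      const data = frame
        .split('\n')
        .filter(line => line.startsWith('data:'))
        .map(line => line.slice(5).trimStart())
        .join('\n');
      if (data) onEvent(JSON.parse(data));
    }
  };
}

// Stream events from a running server (not invoked here).
async function watchEvents(onEvent, baseUrl = BASE_URL) {
  const res = await fetch(`${baseUrl}/events`);
  const feed = createSseParser(onEvent);
  const decoder = new TextDecoder();
  for await (const chunk of res.body) feed(decoder.decode(chunk, { stream: true }));
}
```

With a server running, `await callTool('log_event', { agent: 'orchestrator', event_type: 'start', message: 'hello' })` mirrors the `TOOLS.log_event` calls used in the demo scenarios, and `watchEvents(e => console.log(e.type))` should log the `init` frame first. Keeping request construction and frame parsing separate from the network calls makes both pieces checkable without a live server.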