danelcsb committed on
Commit
e15d158
·
verified ·
1 Parent(s): 0d3185e

deploy generable-dispatch demo (scenarios-best)

Files changed (8)
  1. README.md +27 -37
  2. app.js +59 -52
  3. dispatch_heads.json +0 -0
  4. heads.json +0 -0
  5. index.html +7 -5
  6. meta.json +1 -1
  7. model.fp16.onnx +1 -1
  8. style.css +1 -0
README.md CHANGED
@@ -17,57 +17,47 @@ calling** and **multi-step planning** — running **entirely in your browser** o
17
  (WASM fallback when WebGPU is unavailable). No server, no API key; the model is downloaded once
18
  and cached.
19
 
20
- Model: [`SangbumChoi/localagent-tiny-30m-byte`](https://huggingface.co/SangbumChoi/localagent-tiny-30m-byte).
21
  Source: [LocalAgent](https://github.com/sangbumchoi/localagent).
22
 
23
- ## What it shows
24
 
25
- - **Tool selection** — the model's *real* `tool_head` decision (a linear head on the ONNX
26
- `hidden` output) over the 21-tool surface, with a confidence score, plus **abstention** when no
27
- tool fits.
28
- - **Grounded arguments** arguments copied from spans of your prompt, so the emitted call is
29
- schema-valid by construction.
30
- - **Multi-step plans** — the learned `plan_rollout`: pick a tool → ground it → feed back a
31
- simulated response → pick the next, until the model emits the *stop* (`text`) class.
 
 
32
 
33
  ## How it runs (honest version)
34
 
35
- The transformer forward pass runs on **WebGPU** via an exported ONNX graph that emits both `logits`
36
- and the last `hidden` state. The **tool head** (one matmul + argmax over `hidden`), the
37
- **argument grounding**, and the **planner loop** are light JavaScript on top — a faithful port of
38
- the Python `tool_head` / grounding / `plan_rollout`. Arg grounding in-browser covers the common
39
- formats (paths, URLs, quoted strings, names, numbers); the full Python grounder is the source of
40
- truth. First load fetches `model.fp16.onnx` (~tens of MB) and caches it.
41
 
42
  ## Files
43
  - `index.html` / `style.css` — the UI shell.
44
- - `app.js` — byte tokenizer, onnxruntime-web session (WebGPU + WASM fallback), tool selection,
45
  grounding, and the planner rollout.
46
- - `model.fp16.onnx`, `heads.json`, `meta.json` — the exported inference bundle (**not in the
47
- source repo**; they are deploy artifacts).
48
 
49
  ## Deploy
50
- The model bundle is produced from a trained checkpoint, separately from the source tree:
 
51
 
52
  ```bash
53
  python -c "from localagent.inference.export.to_onnx import export_web; \
54
- export_web('runs/tiny-30m-byte-best.pt', 'runs/web_export')"
55
  ```
56
 
57
- Then upload **the four static files + the three bundle files** into a Hugging Face Space repo
58
- (`sdk: static`), all at the repo root, using git-lfs for the large ones:
59
-
60
- ```bash
61
- huggingface-cli upload <user>/localagent-webgpu spaces/localagent-webgpu/ . --repo-type space
62
- huggingface-cli upload <user>/localagent-webgpu runs/web_export/model.fp16.onnx model.fp16.onnx --repo-type space
63
- huggingface-cli upload <user>/localagent-webgpu runs/web_export/heads.json heads.json --repo-type space
64
- huggingface-cli upload <user>/localagent-webgpu runs/web_export/meta.json meta.json --repo-type space
65
- ```
66
-
67
- `app.js` fetches `model.fp16.onnx` / `heads.json` / `meta.json` relative to the page, so they must
68
- sit next to `index.html`. The export was verified for onnxruntime-CPU↔PyTorch parity
69
- (max |Δlogits| 7.6e-6; fp16 drift 1.4e-3, same tool argmax) and the in-browser tool-selection +
70
- pointer-grounding math was checked against the bundle (`get_weather{city:"Paris"}`,
71
- `read_file{path:"tests/test_api.py"}`, …). The graph is all standard opset-17 ops; onnxruntime-web
72
- falls back per-op to WASM for any op without a WebGPU kernel (`Trilu`/`Tile`/`Expand`), with
73
- identical results.
 
17
  (WASM fallback when WebGPU is unavailable). No server, no API key; the model is downloaded once
18
  and cached.
19
 
20
+ Model: [`danelcsb/localagent-tiny-30m-byte`](https://huggingface.co/danelcsb/localagent-tiny-30m-byte).
21
  Source: [LocalAgent](https://github.com/sangbumchoi/localagent).
22
 
23
+ ## What it shows (generable dispatch — no fixed-N classifier)
24
 
25
+ - **Route gate** — a 5-way head (`web_search / computer_use / code / app_action / text`) on the ONNX
26
+ `hidden` output; the `text` route is **abstention** (answer directly / no tool).
27
+ - **Tool selection** — a **dense two-tower selector**: the query tower projects `hidden`, scored by
28
+ cosine against a precomputed per-tool description-embedding matrix over the **50-tool** surface
29
+ (`argmax_j q·tool_matrix[j]`). Adding/removing a tool is adding/removing a row — no retraining.
30
+ - **Grounded arguments** — copied from spans of your prompt via the learned pointer head, so the
31
+ emitted call is schema-valid by construction.
32
+ - **Multi-step plans** — the rollout: pick a tool → ground it → feed back a simulated response →
33
+ pick the next, until the route head emits `text`.
34
 
35
  ## How it runs (honest version)
36
 
37
+ The transformer forward pass runs on **WebGPU** via an exported ONNX graph that emits `logits` and
38
+ the last `hidden` state. The **route head**, the **dense selector** (matmul + normalize + argmax over
39
+ the precomputed tool matrix), the **pointer-copy** grounding, and the **planner loop** are light
40
+ JavaScript on top — a faithful port of the Python `routes` / `dense_selector` / `pointer_head`
41
+ pipeline (parity-checked at export: 100% argmax/top-1 agreement). First load fetches
42
+ `model.fp16.onnx` (~57 MB) and caches it.
43
 
44
  ## Files
45
  - `index.html` / `style.css` — the UI shell.
46
+ - `app.js` — byte tokenizer, onnxruntime-web session (WebGPU + WASM fallback), route+selector dispatch,
47
  grounding, and the planner rollout.
48
+ - `model.fp16.onnx`, `heads.json`, `meta.json`, `dispatch_heads.json` — the exported inference
49
+ bundle (**not in the source repo**; deploy artifacts). See `DEPLOY.md` for the exact commands.
50
 
51
  ## Deploy
52
+ See **`DEPLOY.md`** for copy-paste build + push commands. In short: export the bundle from the latest
53
+ checkpoint and upload the static app + the four bundle files into a `sdk: static` Space:
54
 
55
  ```bash
56
  python -c "from localagent.inference.export.to_onnx import export_web; \
57
+ export_web('runs/tiny-30m-scenarios-best.pt', 'build/web')"
58
  ```
59
 
60
+ `app.js` fetches `model.fp16.onnx` / `heads.json` / `meta.json` / `dispatch_heads.json` relative to
61
+ the page, so they must sit next to `index.html`. Export is parity-checked vs PyTorch (max |Δlogits|
62
+ 7.6e-6; route-head & dense-selector argmax/top-1 100% agreement). The graph is standard opset-17;
63
+ onnxruntime-web falls back per-op to WASM for any op without a WebGPU kernel, with identical results.
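
As a rough illustration of the dense two-tower selection the updated README describes, the sketch below scores a query vector against a small tool matrix by cosine similarity and shows that registering a tool amounts to appending one row. The function names `l2normalize` and `selectByCosine` and the 3-dimensional toy vectors are assumptions made for this example; they are not taken from the bundle, where the query is first projected from `hidden` through `q_proj`.

```javascript
// Minimal sketch of cosine-based tool selection (hypothetical names and toy data).
function l2normalize(v) {
  let n = 0;
  for (const x of v) n += x * x;
  n = Math.sqrt(n) || 1; // guard against a zero vector
  return v.map((x) => x / n);
}

// Returns the best-scoring tool name and its cosine score in [-1, 1].
function selectByCosine(query, toolMatrix, toolNames) {
  const q = l2normalize(query);
  let best = 0, bestScore = -Infinity;
  for (let j = 0; j < toolNames.length; j++) {
    const row = l2normalize(toolMatrix[j]);
    let s = 0;
    for (let i = 0; i < q.length; i++) s += row[i] * q[i];
    if (s > bestScore) { bestScore = s; best = j; }
  }
  return { name: toolNames[best], score: bestScore };
}

// Toy embeddings, assumed values for demonstration only.
const names = ["get_weather", "read_file"];
const matrix = [[1, 0, 0], [0, 1, 0]];
const query = [0.1, 0.2, 0.9];
const before = selectByCosine(query, matrix, names);

// Adding a tool: append one row and one name; the query side is untouched.
names.push("open_url");
matrix.push([0, 0, 1]);
const after = selectByCosine(query, matrix, names);
console.log(before.name, after.name);
```

With a fixed-size classifier head the same change would require a new output dimension and retraining, which is the trade-off the README points at.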
 
app.js CHANGED
@@ -1,25 +1,33 @@
1
  /* LocalAgent — in-browser tool calling on onnxruntime-web (WebGPU + WASM fallback).
2
  *
3
- * The transformer forward pass runs as an ONNX graph emitting `logits` and `hidden`.
4
- * The tool head, argument grounding, and the planner rollout are ported here from the Python
5
- * `tool_head` / grounding / `plan_rollout`. Bundle contract (see localagent.inference.export):
6
- * model.fp16.onnx inputs: input_ids[int64, 1xT] outputs: logits[1,T,256], hidden[1,T,d]
7
- * heads.json { tool_head:{weight:[C][d], bias:[C], classes:[C], stop_index:int}, ... }
8
- * meta.json { vocab_size, d_model, pad_id, markers:{...}, tools:[{name,args,schema}], tool_classes }
9
  */
10
 
11
  const MODEL_URL = "model.fp16.onnx";
12
  let SESSION = null;
13
  let HEADS = null;
14
  let META = null;
 
15
  let BACKEND = "wasm";
16
 
17
  // ---- bundle loading -------------------------------------------------------
18
  async function loadBundle() {
19
  ort.env.wasm.wasmPaths = "https://cdn.jsdelivr.net/npm/onnxruntime-web@1.20.1/dist/";
20
- [HEADS, META] = await Promise.all([
21
  fetch("heads.json").then((r) => r.json()),
22
  fetch("meta.json").then((r) => r.json()),
 
23
  ]);
24
  try {
25
  SESSION = await ort.InferenceSession.create(MODEL_URL, {
@@ -39,7 +47,7 @@ const enc = new TextEncoder();
39
  function bytesOf(s) { return Array.from(enc.encode(s)); }
40
  function mark(name) { return META.markers[name].text; } // markers carry { text, ids }
41
 
42
- // Render a user turn the way the model was trained / `plan_rollout` renders it.
43
  function renderContext(query, steps) {
44
  let s = mark("user") + query + mark("assistant");
45
  for (const st of steps || []) {
@@ -58,49 +66,47 @@ async function forward(ids) {
58
  return out; // { logits, hidden }
59
  }
60
 
61
- // ---- tool head (linear on the last hidden vector) -------------------------
62
- function softmaxArgmax(logits) {
63
- let m = -Infinity;
64
- for (const v of logits) m = Math.max(m, v);
65
- let z = 0;
66
- const p = logits.map((v) => { const e = Math.exp(v - m); z += e; return e; });
67
- let bi = 0;
68
- for (let i = 1; i < p.length; i++) if (p[i] > p[bi]) bi = i;
69
- return { index: bi, conf: p[bi] / z };
70
- }
71
-
72
- function selectTool(hiddenTensor, T) {
73
- const d = META.d_model;
74
- const H = hiddenTensor.data; // Float32Array length T*d
75
- const off = (T - 1) * d; // last position
76
- const last = H.subarray ? H.subarray(off, off + d) : Array.from(H).slice(off, off + d);
77
- const { weight, bias, classes, stop_index } = HEADS.tool_head;
78
- const logits = new Array(classes.length);
79
- for (let c = 0; c < classes.length; c++) {
80
- let acc = bias[c];
81
- const Wc = weight[c];
82
- for (let k = 0; k < d; k++) acc += Wc[k] * last[k];
83
- logits[c] = acc;
84
  }
85
- const { index, conf } = softmaxArgmax(logits);
86
- return { name: classes[index], index, conf, isStop: index === stop_index };
87
  }
88
 
89
  // ---- argument grounding via the learned pointer head (port of pointer_head) ----
90
- // For each copy-arg of the chosen tool, compute start/end span logits over the input positions
91
- // and slice the value out of the prompt bytes — identical math to the PyTorch pointer head:
92
  // q = arg_emb[arg_idx[arg]]; qs = start_W·q; qe = end_W·q
93
  // start = argmax_t hidden[t]·qs; end = argmax_{t>=start} hidden[t]·qe; value = bytes[start..end]
94
- function matvec(M, v) { // M [d][d] · v [d] -> [d]
95
  const d = v.length, out = new Float32Array(d);
96
  for (let i = 0; i < d; i++) { const Mi = M[i]; let a = 0; for (let j = 0; j < d; j++) a += Mi[j] * v[j]; out[i] = a; }
97
  return out;
98
  }
99
- function dotAt(H, t, d, q) { // hidden[t] · q
100
- const off = t * d; let a = 0;
101
- for (let k = 0; k < d; k++) a += H[off + k] * q[k];
102
- return a;
103
- }
104
  function pointerSpan(arg, ids, H, T) {
105
  const ph = HEADS.pointer_head, d = META.d_model;
106
  const ai = ph.arg_idx[arg];
@@ -129,31 +135,31 @@ async function callOnce(query) {
129
  const ids = renderContext(query, []);
130
  const t0 = performance.now();
131
  const out = await forward(ids);
132
- const sel = selectTool(out.hidden, ids.length);
133
  const ms = performance.now() - t0;
134
- if (sel.isStop) return { abstain: true, conf: sel.conf, ms };
135
- return { tool: sel.name, args: groundArgs(sel.name, ids, out.hidden, ids.length), conf: sel.conf, ms };
136
  }
137
 
138
- // ---- planner rollout (port of plan_rollout) -------------------------------
139
  async function planRollout(query, maxSteps = 4) {
140
  const steps = [];
141
  const t0 = performance.now();
142
  for (let i = 0; i < maxSteps; i++) {
143
  const ids = renderContext(query, steps);
144
  const out = await forward(ids);
145
- const sel = selectTool(out.hidden, ids.length);
146
  if (sel.isStop) break;
147
  const args = groundArgs(sel.name, ids, out.hidden, ids.length);
148
- steps.push({ tool: sel.name, args, conf: sel.conf, response: simResponse(sel.name, args) });
149
  }
150
  return { steps, ms: performance.now() - t0 };
151
  }
152
 
153
- // A compact simulated tool response so downstream steps have context (mirrors _sim_response).
154
  function simResponse(tool, args) {
155
- if (/read_file|grep/.test(tool)) return Object.values(args)[0] || "ok";
156
- if (/search|news/.test(tool)) return "result: " + (args.query || "");
157
  return "ok";
158
  }
159
 
@@ -172,11 +178,12 @@ function renderCall(step, idx) {
172
  const div = document.createElement("div");
173
  div.className = "call" + (step.abstain ? " abstain" : "");
174
  const conf = step.conf != null ? `<span class="conf">${(step.conf * 100).toFixed(0)}%</span>` : "";
 
175
  if (step.abstain) {
176
- div.innerHTML = `${conf}<span class="tool">— abstains (no tool needed)</span>`;
177
  } else {
178
  const ix = idx != null ? `<span class="step-index">${idx + 1}.</span>` : "";
179
- div.innerHTML = `${conf}${ix}<span class="tool">${step.tool}</span>` +
180
  `<pre>${JSON.stringify(step.args, null, 2)}</pre>`;
181
  }
182
  return div;
 
1
  /* LocalAgent — in-browser tool calling on onnxruntime-web (WebGPU + WASM fallback).
2
  *
3
+ * The transformer forward pass runs as an ONNX graph emitting `logits` and `hidden`. The GENERABLE
4
+ * dispatch (route head -> dense two-tower selector -> pointer-copy args) is ported here from the
5
+ * Python pipeline. Bundle contract (see localagent.inference.export):
6
+ * model.fp16.onnx inputs: input_ids[int64, 1xT] outputs: logits[1,T,256], hidden[1,T,d]
7
+ * dispatch_heads.json { route_head:{weight:[5][d],bias:[5],routes:[5],stop_index},
8
+ * dense_selector:{q_proj_weight:[p][d],q_proj_bias:[p],proj:p,
9
+ * tool_matrix:[N][p],tool_names:[N],normalize_query} }
10
+ * heads.json { pointer_head:{arg_idx,arg_emb,start_W,end_W}, ... } (args copy)
11
+ * meta.json { d_model, markers:{...}, tools:[{name,args,schema}] } (50 tools)
12
+ *
13
+ * Selection is NOT a fixed-N classifier: the dense selector scores every tool by its description
14
+ * embedding, so adding/removing a tool is adding/removing a tool_matrix row.
15
  */
16
 
17
  const MODEL_URL = "model.fp16.onnx";
18
  let SESSION = null;
19
  let HEADS = null;
20
  let META = null;
21
+ let DISPATCH = null;
22
  let BACKEND = "wasm";
23
 
24
  // ---- bundle loading -------------------------------------------------------
25
  async function loadBundle() {
26
  ort.env.wasm.wasmPaths = "https://cdn.jsdelivr.net/npm/onnxruntime-web@1.20.1/dist/";
27
+ [HEADS, META, DISPATCH] = await Promise.all([
28
  fetch("heads.json").then((r) => r.json()),
29
  fetch("meta.json").then((r) => r.json()),
30
+ fetch("dispatch_heads.json").then((r) => r.json()),
31
  ]);
32
  try {
33
  SESSION = await ort.InferenceSession.create(MODEL_URL, {
 
47
  function bytesOf(s) { return Array.from(enc.encode(s)); }
48
  function mark(name) { return META.markers[name].text; } // markers carry { text, ids }
49
 
50
+ // Render a user turn the way the model was trained.
51
  function renderContext(query, steps) {
52
  let s = mark("user") + query + mark("assistant");
53
  for (const st of steps || []) {
 
66
  return out; // { logits, hidden }
67
  }
68
 
69
+ // ---- generable dispatch: route head -> dense two-tower selector ------------
70
+ function lastHidden(hiddenTensor, T) {
71
+ const d = META.d_model, H = hiddenTensor.data, off = (T - 1) * d;
72
+ return H.subarray ? H.subarray(off, off + d) : Array.from(H).slice(off, off + d);
73
+ }
74
+ function linrow(W, b, x) { // W[o][d] · x[d] + b[o] -> [o]
75
+ const o = W.length, out = new Float32Array(o);
76
+ for (let i = 0; i < o; i++) { const Wi = W[i]; let a = b ? b[i] : 0; for (let k = 0; k < x.length; k++) a += Wi[k] * x[k]; out[i] = a; }
77
+ return out;
78
+ }
79
+ function argmax(v) { let bi = 0; for (let i = 1; i < v.length; i++) if (v[i] > v[bi]) bi = i; return bi; }
80
+ function softmaxAt(v, i) { let m = -Infinity; for (const x of v) m = Math.max(m, x); let z = 0; for (const x of v) z += Math.exp(x - m); return Math.exp(v[i] - m) / z; }
81
+
82
+ function dispatchSelect(hiddenTensor, T) {
83
+ const last = lastHidden(hiddenTensor, T);
84
+ // 1. route head (5-way modality gate); the `text` route (stop_index) = abstain / direct answer.
85
+ const R = DISPATCH.route_head;
86
+ const rl = linrow(R.weight, R.bias, last);
87
+ const ri = argmax(rl);
88
+ if (ri === R.stop_index) return { isStop: true, route: R.routes[ri], conf: softmaxAt(rl, ri) };
89
+ // 2. dense selector: q = normalize(q_proj(last)); score_j = q · tool_matrix[j]; argmax.
90
+ const S = DISPATCH.dense_selector;
91
+ const q = linrow(S.q_proj_weight, S.q_proj_bias, last);
92
+ if (S.normalize_query) { let n = 0; for (const x of q) n += x * x; n = Math.sqrt(n) || 1; for (let i = 0; i < q.length; i++) q[i] /= n; }
93
+ let bi = 0, bs = -Infinity;
94
+ for (let j = 0; j < S.tool_names.length; j++) {
95
+ const Tj = S.tool_matrix[j]; let a = 0; for (let i = 0; i < S.proj; i++) a += Tj[i] * q[i];
96
+ if (a > bs) { bs = a; bi = j; }
97
  }
98
+ return { name: S.tool_names[bi], route: R.routes[ri], conf: (bs + 1) / 2, isStop: false };
 
99
  }
100
 
101
  // ---- argument grounding via the learned pointer head (port of pointer_head) ----
 
 
102
  // q = arg_emb[arg_idx[arg]]; qs = start_W·q; qe = end_W·q
103
  // start = argmax_t hidden[t]·qs; end = argmax_{t>=start} hidden[t]·qe; value = bytes[start..end]
104
+ function matvec(M, v) {
105
  const d = v.length, out = new Float32Array(d);
106
  for (let i = 0; i < d; i++) { const Mi = M[i]; let a = 0; for (let j = 0; j < d; j++) a += Mi[j] * v[j]; out[i] = a; }
107
  return out;
108
  }
109
+ function dotAt(H, t, d, q) { const off = t * d; let a = 0; for (let k = 0; k < d; k++) a += H[off + k] * q[k]; return a; }
 
 
 
 
110
  function pointerSpan(arg, ids, H, T) {
111
  const ph = HEADS.pointer_head, d = META.d_model;
112
  const ai = ph.arg_idx[arg];
 
135
  const ids = renderContext(query, []);
136
  const t0 = performance.now();
137
  const out = await forward(ids);
138
+ const sel = dispatchSelect(out.hidden, ids.length);
139
  const ms = performance.now() - t0;
140
+ if (sel.isStop) return { abstain: true, route: sel.route, conf: sel.conf, ms };
141
+ return { tool: sel.name, route: sel.route, args: groundArgs(sel.name, ids, out.hidden, ids.length), conf: sel.conf, ms };
142
  }
143
 
144
+ // ---- planner rollout ------------------------------------------------------
145
  async function planRollout(query, maxSteps = 4) {
146
  const steps = [];
147
  const t0 = performance.now();
148
  for (let i = 0; i < maxSteps; i++) {
149
  const ids = renderContext(query, steps);
150
  const out = await forward(ids);
151
+ const sel = dispatchSelect(out.hidden, ids.length);
152
  if (sel.isStop) break;
153
  const args = groundArgs(sel.name, ids, out.hidden, ids.length);
154
+ steps.push({ tool: sel.name, route: sel.route, args, conf: sel.conf, response: simResponse(sel.name, args) });
155
  }
156
  return { steps, ms: performance.now() - t0 };
157
  }
158
 
159
+ // A compact simulated tool response so downstream steps have context.
160
  function simResponse(tool, args) {
161
+ if (/read_file|grep|list_dir|find/.test(tool)) return Object.values(args)[0] || "ok";
162
+ if (/search|news|http|open_url|define/.test(tool)) return "result: " + (Object.values(args)[0] || "");
163
  return "ok";
164
  }
165
 
 
178
  const div = document.createElement("div");
179
  div.className = "call" + (step.abstain ? " abstain" : "");
180
  const conf = step.conf != null ? `<span class="conf">${(step.conf * 100).toFixed(0)}%</span>` : "";
181
+ const route = step.route ? `<span class="route">${step.route}</span>` : "";
182
  if (step.abstain) {
183
+ div.innerHTML = `${conf}${route}<span class="tool">— abstains (no tool needed)</span>`;
184
  } else {
185
  const ix = idx != null ? `<span class="step-index">${idx + 1}.</span>` : "";
186
+ div.innerHTML = `${conf}${route}${ix}<span class="tool">${step.tool}</span>` +
187
  `<pre>${JSON.stringify(step.args, null, 2)}</pre>`;
188
  }
189
  return div;
dispatch_heads.json ADDED
The diff for this file is too large to render. See raw diff
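
Since this diff is not rendered, the shape `app.js` expects from `dispatch_heads.json` can only be read off the bundle contract in its header comment. The sketch below is a hypothetical sanity check for a loaded object against those documented dimensions. `checkDispatchHeads` and the toy values are assumptions for illustration and are not part of the repo.

```javascript
// Hypothetical shape check for dispatch_heads.json. Field names follow the bundle
// contract in the app.js header comment; the function itself is not in the repo.
function checkDispatchHeads(dispatch, dModel) {
  const R = dispatch.route_head, S = dispatch.dense_selector;
  if (!R || !S) return ["missing route_head or dense_selector"];
  const errors = [];
  const n = R.routes.length;
  if (R.weight.length !== n || R.bias.length !== n) errors.push("route_head rows/bias != routes");
  if (R.weight.some((row) => row.length !== dModel)) errors.push("route_head.weight cols != d_model");
  if (!Number.isInteger(R.stop_index) || R.stop_index < 0 || R.stop_index >= n) errors.push("stop_index out of range");
  if (S.q_proj_weight.length !== S.proj || S.q_proj_bias.length !== S.proj) errors.push("q_proj rows/bias != proj");
  if (S.q_proj_weight.some((row) => row.length !== dModel)) errors.push("q_proj_weight cols != d_model");
  if (S.tool_matrix.length !== S.tool_names.length) errors.push("tool_matrix rows != tool_names");
  if (S.tool_matrix.some((row) => row.length !== S.proj)) errors.push("tool_matrix cols != proj");
  return errors;
}

// Toy bundle with d_model = 2 and proj = 2 (made-up numbers, not real weights).
const toy = {
  route_head: {
    weight: [[0, 1], [1, 0], [1, 1], [0, 0], [1, -1]],
    bias: [0, 0, 0, 0, 0],
    routes: ["web_search", "computer_use", "code", "app_action", "text"],
    stop_index: 4,
  },
  dense_selector: {
    q_proj_weight: [[1, 0], [0, 1]],
    q_proj_bias: [0, 0],
    proj: 2,
    tool_matrix: [[1, 0], [0, 1]],
    tool_names: ["get_weather", "read_file"],
    normalize_query: true,
  },
};
const okErrors = checkDispatchHeads(toy, 2);

// A tool name without a matching tool_matrix row should be reported.
const broken = JSON.parse(JSON.stringify(toy));
broken.dense_selector.tool_names.push("open_url");
const badErrors = checkDispatchHeads(broken, 2);
console.log(okErrors, badErrors);
```

A check of this kind could run once after `loadBundle()` resolves, so a mismatched export fails loudly instead of producing silent `NaN` scores.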
 
heads.json CHANGED
The diff for this file is too large to render. See raw diff
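
The `pointer_head` weights stored in `heads.json` drive the span decoding summarized in the `app.js` comment (`qs = start_W·q`, `qe = end_W·q`, then an argmax for the start and a constrained argmax for the end). The sketch below walks through that arithmetic on toy data. It assumes the end index is inclusive and uses nested arrays rather than the flat `Float32Array` of the real code. `pointerSpanSketch` and all values are hypothetical.

```javascript
// Sketch of pointer-span decoding following the formulas in the app.js comment.
// Assumption: the span is inclusive of `end`. Names and data are illustrative.
function dot(a, b) { let s = 0; for (let i = 0; i < a.length; i++) s += a[i] * b[i]; return s; }
function matvec(M, v) { return M.map((row) => dot(row, v)); }

function pointerSpanSketch(hidden, argEmb, startW, endW) {
  const qs = matvec(startW, argEmb); // start query
  const qe = matvec(endW, argEmb);   // end query
  let start = 0;
  for (let t = 1; t < hidden.length; t++) {
    if (dot(hidden[t], qs) > dot(hidden[start], qs)) start = t;
  }
  let end = start; // the end position is searched only at t >= start
  for (let t = start + 1; t < hidden.length; t++) {
    if (dot(hidden[t], qe) > dot(hidden[end], qe)) end = t;
  }
  return { start, end };
}

// Toy prompt bytes and hand-built hidden states that light up at 'P' (t=3) and 's' (t=7).
const promptBytes = Array.from(new TextEncoder().encode("in Paris now"));
const toyHidden = promptBytes.map((_, t) => (t === 3 ? [1, 0] : t === 7 ? [0, 1] : [0, 0]));
const span = pointerSpanSketch(toyHidden, [1, 1], [[1, 0], [0, 0]], [[0, 0], [0, 1]]);
const value = new TextDecoder().decode(new Uint8Array(promptBytes.slice(span.start, span.end + 1)));
console.log(span, value);
```

Because the value is sliced directly out of the prompt bytes, the argument can never contain text the user did not type, which is what makes the emitted call grounded.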
 
index.html CHANGED
@@ -11,9 +11,10 @@
11
  <header>
12
  <h1>🛠️ LocalAgent <span class="sub">tool calling in your browser</span></h1>
13
  <p class="tagline">
14
- A <strong>28M-param, from-scratch</strong> byte-level agent. The transformer runs on
15
- <strong>WebGPU</strong> (<code>onnxruntime-web</code>); the tool head, grounding, and the
16
- planner loop are light JS. <a href="https://huggingface.co/SangbumChoi/localagent-tiny-30m-byte" target="_blank" rel="noopener">model</a> ·
 
17
  <a href="https://github.com/sangbumchoi/localagent" target="_blank" rel="noopener">code</a>
18
  </p>
19
  </header>
@@ -39,9 +40,10 @@
39
  <div class="examples">
40
  <span>Try:</span>
41
  <button class="chip" data-plan="0">What's the weather in Tokyo?</button>
42
- <button class="chip" data-plan="0">How tall is Mount Everest?</button>
 
43
  <button class="chip" data-plan="0">What does ephemeral mean?</button>
44
- <button class="chip" data-plan="0">Set a timer for 10 minutes.</button>
45
  <button class="chip" data-plan="1">Search the web for AI news, then save it to Notion.</button>
46
  <button class="chip" data-plan="1">Read config.yaml, run the tests, then commit.</button>
47
  </div>
 
11
  <header>
12
  <h1>🛠️ LocalAgent <span class="sub">tool calling in your browser</span></h1>
13
  <p class="tagline">
14
+ A <strong>28M-param, from-scratch</strong> byte-level agent over <strong>50 tools</strong>.
15
+ The transformer runs on <strong>WebGPU</strong> (<code>onnxruntime-web</code>); the
16
+ <strong>route head → dense selector → pointer-copy</strong> dispatch and the planner loop are
17
+ light JS. <a href="https://huggingface.co/danelcsb/localagent-tiny-30m-byte" target="_blank" rel="noopener">model</a> ·
18
  <a href="https://github.com/sangbumchoi/localagent" target="_blank" rel="noopener">code</a>
19
  </p>
20
  </header>
 
40
  <div class="examples">
41
  <span>Try:</span>
42
  <button class="chip" data-plan="0">What's the weather in Tokyo?</button>
43
+ <button class="chip" data-plan="0">List the files in the src directory</button>
44
+ <button class="chip" data-plan="0">Open https://github.com/pytorch/pytorch</button>
45
  <button class="chip" data-plan="0">What does ephemeral mean?</button>
46
+ <button class="chip" data-plan="0">Email Dana the quarterly report</button>
47
  <button class="chip" data-plan="1">Search the web for AI news, then save it to Notion.</button>
48
  <button class="chip" data-plan="1">Read config.yaml, run the tests, then commit.</button>
49
  </div>
meta.json CHANGED
@@ -1 +1 @@
1
- {"vocab_size": 256, "d_model": 512, "pad_id": 0, "eos_id": 0, "encoding": "utf-8-bytes", "markers": {"user": {"text": "<|user|>", "ids": [60, 124, 117, 115, 101, 114, 124, 62]}, "assistant": {"text": "<|assistant|>", "ids": [60, 124, 97, 115, 115, 105, 115, 116, 97, 110, 116, 124, 62]}, "tool": {"text": "<|tool|>", "ids": [60, 124, 116, 111, 111, 108, 124, 62]}, "tool_call_open": {"text": "<tool_call>", "ids": [60, 116, 111, 111, 108, 95, 99, 97, 108, 108, 62]}, "tool_call_close": {"text": "</tool_call>", "ids": [60, 47, 116, 111, 111, 108, 95, 99, 97, 108, 108, 62]}, "tool_response_open": {"text": "<tool_response>", "ids": [60, 116, 111, 111, 108, 95, 114, 101, 115, 112, 111, 110, 115, 101, 62]}, "tool_response_close": {"text": "</tool_response>", "ids": [60, 47, 116, 111, 111, 108, 95, 114, 101, 115, 112, 111, 110, 115, 101, 62]}}, "tools": [{"name": "get_weather", "description": "Get the current weather for a city.", "args": ["city", "unit"], "schema": {"type": "object", "properties": {"city": {"type": "string"}, "unit": {"type": "string", "enum": ["c", "f"]}}, "required": ["city"]}}, {"name": "calculator", "description": "Evaluate an arithmetic expression.", "args": ["expression"], "schema": {"type": "object", "properties": {"expression": {"type": "string", "format": "arithmetic"}}, "required": ["expression"]}}, {"name": "web_search", "description": "Search the web.", "args": ["query", "k"], "schema": {"type": "object", "properties": {"query": {"type": "string"}, "k": {"type": "integer"}}, "required": ["query"]}}, {"name": "planner", "description": "Make a plan to achieve a goal.", "args": ["goal"], "schema": {"type": "object", "properties": {"goal": {"type": "string"}}, "required": ["goal"]}}, {"name": "define", "description": "Define a term.", "args": ["term"], "schema": {"type": "object", "properties": {"term": {"type": "string"}}, "required": ["term"]}}, {"name": "play_music", "description": "Play a song.", "args": ["song"], "schema": {"type": "object", 
"properties": {"song": {"type": "string"}}, "required": ["song"]}}, {"name": "get_news", "description": "Get news on a topic.", "args": ["topic"], "schema": {"type": "object", "properties": {"topic": {"type": "string"}}, "required": ["topic"]}}, {"name": "read_file", "description": "Read a file.", "args": ["path"], "schema": {"type": "object", "properties": {"path": {"type": "string", "format": "path"}}, "required": ["path"]}}, {"name": "write_file", "description": "Create or write a file.", "args": ["path"], "schema": {"type": "object", "properties": {"path": {"type": "string", "format": "path"}}, "required": ["path"]}}, {"name": "grep_search", "description": "Search the codebase for a pattern.", "args": ["pattern"], "schema": {"type": "object", "properties": {"pattern": {"type": "string", "format": "quoted"}}, "required": ["pattern"]}}, {"name": "run_command", "description": "Run a shell command.", "args": ["command"], "schema": {"type": "object", "properties": {"command": {"type": "string", "format": "quoted"}}, "required": ["command"]}}, {"name": "git_commit", "description": "Make a git commit.", "args": ["message"], "schema": {"type": "object", "properties": {"message": {"type": "string", "format": "quoted"}}, "required": ["message"]}}, {"name": "run_tests", "description": "Run the test suite.", "args": [], "schema": {"type": "object", "properties": {}, "required": []}}, {"name": "set_reminder", "description": "Set a reminder for a task.", "args": ["task"], "schema": {"type": "object", "properties": {"task": {"type": "string"}}, "required": ["task"]}}, {"name": "set_timer", "description": "Set a timer for a duration.", "args": ["duration"], "schema": {"type": "object", "properties": {"duration": {"type": "string"}}, "required": ["duration"]}}, {"name": "calendar_event", "description": "Create a Google Calendar event.", "args": ["title"], "schema": {"type": "object", "properties": {"title": {"type": "string", "format": "quoted"}}, "required": ["title"]}}, 
{"name": "send_email", "description": "Send an email to someone.", "args": ["recipient"], "schema": {"type": "object", "properties": {"recipient": {"type": "string"}}, "required": ["recipient"]}}, {"name": "open_url", "description": "Open a URL in the web browser.", "args": ["url"], "schema": {"type": "object", "properties": {"url": {"type": "string", "format": "url"}}, "required": ["url"]}}, {"name": "notion_write", "description": "Write a note in Notion.", "args": ["content"], "schema": {"type": "object", "properties": {"content": {"type": "string", "format": "quoted"}}, "required": ["content"]}}, {"name": "slack_send", "description": "Send a Slack message.", "args": ["message"], "schema": {"type": "object", "properties": {"message": {"type": "string", "format": "quoted"}}, "required": ["message"]}}, {"name": "jira_issue", "description": "Create a Jira issue.", "args": ["summary"], "schema": {"type": "object", "properties": {"summary": {"type": "string", "format": "quoted"}}, "required": ["summary"]}}], "tool_classes": ["get_weather", "calculator", "web_search", "planner", "define", "play_music", "get_news", "read_file", "write_file", "grep_search", "run_command", "git_commit", "run_tests", "set_reminder", "set_timer", "calendar_event", "send_email", "open_url", "notion_write", "slack_send", "jira_issue", "text"]}
 
1
+ {"vocab_size": 256, "d_model": 512, "pad_id": 0, "eos_id": 0, "encoding": "utf-8-bytes", "markers": {"user": {"text": "<|user|>", "ids": [60, 124, 117, 115, 101, 114, 124, 62]}, "assistant": {"text": "<|assistant|>", "ids": [60, 124, 97, 115, 115, 105, 115, 116, 97, 110, 116, 124, 62]}, "tool": {"text": "<|tool|>", "ids": [60, 124, 116, 111, 111, 108, 124, 62]}, "tool_call_open": {"text": "<tool_call>", "ids": [60, 116, 111, 111, 108, 95, 99, 97, 108, 108, 62]}, "tool_call_close": {"text": "</tool_call>", "ids": [60, 47, 116, 111, 111, 108, 95, 99, 97, 108, 108, 62]}, "tool_response_open": {"text": "<tool_response>", "ids": [60, 116, 111, 111, 108, 95, 114, 101, 115, 112, 111, 110, 115, 101, 62]}, "tool_response_close": {"text": "</tool_response>", "ids": [60, 47, 116, 111, 111, 108, 95, 114, 101, 115, 112, 111, 110, 115, 101, 62]}}, "tools": [{"name": "get_weather", "description": "Get the current weather for a city.", "args": ["city", "unit"], "schema": {"type": "object", "properties": {"city": {"type": "string"}, "unit": {"type": "string", "enum": ["c", "f"]}}, "required": ["city"]}}, {"name": "calculator", "description": "Evaluate an arithmetic expression.", "args": ["expression"], "schema": {"type": "object", "properties": {"expression": {"type": "string", "format": "arithmetic"}}, "required": ["expression"]}}, {"name": "web_search", "description": "Search the web.", "args": ["query", "k"], "schema": {"type": "object", "properties": {"query": {"type": "string"}, "k": {"type": "integer"}}, "required": ["query"]}}, {"name": "planner", "description": "Make a plan to achieve a goal.", "args": ["goal"], "schema": {"type": "object", "properties": {"goal": {"type": "string"}}, "required": ["goal"]}}, {"name": "define", "description": "Define a term.", "args": ["term"], "schema": {"type": "object", "properties": {"term": {"type": "string"}}, "required": ["term"]}}, {"name": "play_music", "description": "Play a song.", "args": ["song"], "schema": {"type": "object", 
"properties": {"song": {"type": "string"}}, "required": ["song"]}}, {"name": "get_news", "description": "Get news on a topic.", "args": ["topic"], "schema": {"type": "object", "properties": {"topic": {"type": "string"}}, "required": ["topic"]}}, {"name": "read_file", "description": "Read a file.", "args": ["path"], "schema": {"type": "object", "properties": {"path": {"type": "string", "format": "path"}}, "required": ["path"]}}, {"name": "write_file", "description": "Create or write a file.", "args": ["path"], "schema": {"type": "object", "properties": {"path": {"type": "string", "format": "path"}}, "required": ["path"]}}, {"name": "grep_search", "description": "Search the codebase for a pattern.", "args": ["pattern"], "schema": {"type": "object", "properties": {"pattern": {"type": "string", "format": "quoted"}}, "required": ["pattern"]}}, {"name": "run_command", "description": "Run a shell command.", "args": ["command"], "schema": {"type": "object", "properties": {"command": {"type": "string", "format": "quoted"}}, "required": ["command"]}}, {"name": "git_commit", "description": "Make a git commit.", "args": ["message"], "schema": {"type": "object", "properties": {"message": {"type": "string", "format": "quoted"}}, "required": ["message"]}}, {"name": "run_tests", "description": "Run the test suite.", "args": [], "schema": {"type": "object", "properties": {}, "required": []}}, {"name": "set_reminder", "description": "Set a reminder for a task.", "args": ["task"], "schema": {"type": "object", "properties": {"task": {"type": "string"}}, "required": ["task"]}}, {"name": "set_timer", "description": "Set a timer for a duration.", "args": ["duration"], "schema": {"type": "object", "properties": {"duration": {"type": "string"}}, "required": ["duration"]}}, {"name": "calendar_event", "description": "Create a Google Calendar event.", "args": ["title"], "schema": {"type": "object", "properties": {"title": {"type": "string", "format": "quoted"}}, "required": ["title"]}}, 
{"name": "send_email", "description": "Send an email to someone.", "args": ["recipient"], "schema": {"type": "object", "properties": {"recipient": {"type": "string"}}, "required": ["recipient"]}}, {"name": "open_url", "description": "Open a URL in the web browser.", "args": ["url"], "schema": {"type": "object", "properties": {"url": {"type": "string", "format": "url"}}, "required": ["url"]}}, {"name": "notion_write", "description": "Write a note in Notion.", "args": ["content"], "schema": {"type": "object", "properties": {"content": {"type": "string", "format": "quoted"}}, "required": ["content"]}}, {"name": "slack_send", "description": "Send a Slack message.", "args": ["message"], "schema": {"type": "object", "properties": {"message": {"type": "string", "format": "quoted"}}, "required": ["message"]}}, {"name": "jira_issue", "description": "Create a Jira issue.", "args": ["summary"], "schema": {"type": "object", "properties": {"summary": {"type": "string", "format": "quoted"}}, "required": ["summary"]}}, {"name": "screenshot", "description": "Take a screenshot of the screen.", "args": [], "schema": {"type": "object", "properties": {}, "required": []}}, {"name": "click", "description": "Click a UI element described in text.", "args": ["target"], "schema": {"type": "object", "properties": {"target": {"type": "string", "format": "quoted"}}, "required": ["target"]}}, {"name": "double_click", "description": "Double-click a UI element described in text.", "args": ["target"], "schema": {"type": "object", "properties": {"target": {"type": "string", "format": "quoted"}}, "required": ["target"]}}, {"name": "type_text", "description": "Type text into the focused field.", "args": ["text"], "schema": {"type": "object", "properties": {"text": {"type": "string", "format": "quoted"}}, "required": ["text"]}}, {"name": "key_press", "description": "Press a keyboard key.", "args": ["key"], "schema": {"type": "object", "properties": {"key": {"type": "string", "enum": ["Enter", "Tab", 
"Escape", "Backspace", "Space", "Delete", "ArrowUp", "ArrowDown", "ArrowLeft", "ArrowRight"]}}, "required": ["key"]}}, {"name": "scroll", "description": "Scroll the view in a direction.", "args": ["direction"], "schema": {"type": "object", "properties": {"direction": {"type": "string", "enum": ["up", "down", "left", "right"]}}, "required": ["direction"]}}, {"name": "drag", "description": "Drag from a source element to a destination element.", "args": ["source", "dest"], "schema": {"type": "object", "properties": {"source": {"type": "string", "format": "quoted"}, "dest": {"type": "string", "format": "quoted"}}, "required": ["source", "dest"]}}, {"name": "wait", "description": "Wait for a number of seconds.", "args": ["seconds"], "schema": {"type": "object", "properties": {"seconds": {"type": "integer"}}, "required": ["seconds"]}}, {"name": "move_cursor", "description": "Move the cursor to a UI element described in text.", "args": ["target"], "schema": {"type": "object", "properties": {"target": {"type": "string", "format": "quoted"}}, "required": ["target"]}}, {"name": "open_app", "description": "Open a desktop application by name.", "args": ["name"], "schema": {"type": "object", "properties": {"name": {"type": "string", "format": "quoted"}}, "required": ["name"]}}, {"name": "run_python", "description": "Run a snippet of Python code.", "args": ["code"], "schema": {"type": "object", "properties": {"code": {"type": "string", "format": "quoted"}}, "required": ["code"]}}, {"name": "edit_file", "description": "Edit a file at a path.", "args": ["path"], "schema": {"type": "object", "properties": {"path": {"type": "string", "format": "path"}}, "required": ["path"]}}, {"name": "apply_patch", "description": "Apply a patch to a file at a path.", "args": ["path"], "schema": {"type": "object", "properties": {"path": {"type": "string", "format": "path"}}, "required": ["path"]}}, {"name": "http_request", "description": "Make an HTTP request to a URL.", "args": ["url", "method"], 
"schema": {"type": "object", "properties": {"url": {"type": "string", "format": "url"}, "method": {"type": "string", "enum": ["GET", "POST", "PUT", "DELETE", "PATCH"]}}, "required": ["url"]}}, {"name": "sql_query", "description": "Run a SQL query.", "args": ["query"], "schema": {"type": "object", "properties": {"query": {"type": "string", "format": "quoted"}}, "required": ["query"]}}, {"name": "list_dir", "description": "List the contents of a directory.", "args": ["path"], "schema": {"type": "object", "properties": {"path": {"type": "string", "format": "path"}}, "required": ["path"]}}, {"name": "find_files", "description": "Find files matching a glob pattern.", "args": ["pattern"], "schema": {"type": "object", "properties": {"pattern": {"type": "string", "format": "quoted"}}, "required": ["pattern"]}}, {"name": "git_diff", "description": "Show the git diff.", "args": [], "schema": {"type": "object", "properties": {}, "required": []}}, {"name": "git_status", "description": "Show the git status.", "args": [], "schema": {"type": "object", "properties": {}, "required": []}}, {"name": "install_package", "description": "Install a package by name.", "args": ["name"], "schema": {"type": "object", "properties": {"name": {"type": "string", "format": "quoted"}}, "required": ["name"]}}, {"name": "kill_process", "description": "Kill a running process by name.", "args": ["name"], "schema": {"type": "object", "properties": {"name": {"type": "string", "format": "quoted"}}, "required": ["name"]}}, {"name": "read_clipboard", "description": "Read the system clipboard.", "args": [], "schema": {"type": "object", "properties": {}, "required": []}}, {"name": "write_clipboard", "description": "Write text to the system clipboard.", "args": ["text"], "schema": {"type": "object", "properties": {"text": {"type": "string", "format": "quoted"}}, "required": ["text"]}}, {"name": "download_file", "description": "Download a file from a URL.", "args": ["url"], "schema": {"type": "object", 
"properties": {"url": {"type": "string", "format": "url"}}, "required": ["url"]}}, {"name": "unzip", "description": "Unzip an archive at a path.", "args": ["path"], "schema": {"type": "object", "properties": {"path": {"type": "string", "format": "path"}}, "required": ["path"]}}, {"name": "env_get", "description": "Read an environment variable by name.", "args": ["name"], "schema": {"type": "object", "properties": {"name": {"type": "string", "format": "quoted"}}, "required": ["name"]}}, {"name": "make_dir", "description": "Create a directory at a path.", "args": ["path"], "schema": {"type": "object", "properties": {"path": {"type": "string", "format": "path"}}, "required": ["path"]}}, {"name": "list_processes", "description": "List running processes.", "args": [], "schema": {"type": "object", "properties": {}, "required": []}}, {"name": "docker_run", "description": "Run a Docker container from an image.", "args": ["image"], "schema": {"type": "object", "properties": {"image": {"type": "string", "format": "quoted"}}, "required": ["image"]}}], "tool_classes": ["get_weather", "calculator", "web_search", "planner", "define", "play_music", "get_news", "read_file", "write_file", "grep_search", "run_command", "git_commit", "run_tests", "set_reminder", "set_timer", "calendar_event", "send_email", "open_url", "notion_write", "slack_send", "jira_issue", "screenshot", "click", "double_click", "type_text", "key_press", "scroll", "drag", "wait", "move_cursor", "open_app", "run_python", "edit_file", "apply_patch", "http_request", "sql_query", "list_dir", "find_files", "git_diff", "git_status", "install_package", "kill_process", "read_clipboard", "write_clipboard", "download_file", "unzip", "env_get", "make_dir", "list_processes", "docker_run", "text"]}
model.fp16.onnx CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:56e567de94cfe5b9c620931d30669e2292aa1f02a7868d2e9edb1545dfe02280
+ oid sha256:a902bba0acc2e216e0134454820d3ae7fbcbb2f07686d69151c3f3e6a697bfc6
  size 57468760
style.css CHANGED
@@ -83,6 +83,7 @@ button#run:not(:disabled):hover { filter: brightness(1.08); }
  .call.abstain { border-left-color: var(--warn); }
  .call.abstain .tool { color: var(--warn); }
  .step-index { color: var(--muted); font-family: var(--mono); font-size: 0.78rem; margin-right: 6px; }
+ .call .route { display: inline-block; margin-right: 8px; padding: 1px 7px; border-radius: 10px; background: var(--accent); color: #fff; font-family: var(--mono); font-size: 0.72rem; vertical-align: middle; }
  .timing { color: var(--muted); font-size: 0.78rem; margin-top: 6px; font-family: var(--mono); }
 
  footer { margin-top: 36px; color: var(--muted); font-size: 0.78rem; }