<!doctype html>
<html lang="en">
<head>
  <meta charset="utf-8" />
  <meta name="viewport" content="width=device-width, initial-scale=1" />
  <title>LocalAgent — Tool Calling (WebGPU)</title>
  <link rel="stylesheet" href="style.css" />
</head>
<body>
  <main>
    <header>
      <h1>🛠️ LocalAgent <span class="sub">tool calling in your browser</span></h1>
      <p class="tagline">
        A <strong>28M-parameter, from-scratch</strong> byte-level agent over <strong>50 tools</strong>.
        The transformer runs on <strong>WebGPU</strong> (<code>onnxruntime-web</code>); the
        <strong>route head → dense selector → pointer-copy</strong> dispatch and the planner loop run in
        lightweight JavaScript. <a href="https://huggingface.co/danelcsb/localagent-tiny-30m-byte" target="_blank" rel="noopener">model</a> ·
        <a href="https://github.com/sangbumchoi/localagent" target="_blank" rel="noopener">code</a>
      </p>
    </header>

    <section id="status" class="status loading">
      <span class="dot"></span>
      <span id="status-text">Loading model…</span>
      <span id="backend-badge" class="badge" hidden></span>
    </section>

    <section class="io">
      <label for="prompt">Your request</label>
      <textarea id="prompt" rows="2" placeholder="e.g. What's the weather in Paris?"></textarea>

      <div class="row">
        <label class="switch">
          <input type="checkbox" id="plan-mode" />
          <span>Multi-step plan (planner rollout)</span>
        </label>
        <button id="run" disabled>Run</button>
      </div>

      <div class="examples">
        <span>Try:</span>
        <button class="chip" data-plan="0">What's the weather in Tokyo?</button>
        <button class="chip" data-plan="0">List the files in the src directory</button>
        <button class="chip" data-plan="0">Open https://github.com/pytorch/pytorch</button>
        <button class="chip" data-plan="0">What does ephemeral mean?</button>
        <button class="chip" data-plan="0">Email Dana the quarterly report</button>
        <button class="chip" data-plan="1">Search the web for AI news, then save it to Notion.</button>
        <button class="chip" data-plan="1">Read config.yaml, run the tests, then commit.</button>
      </div>
    </section>

    <section id="result" class="result" hidden></section>

    <footer>
      <p>
        Runs fully client-side: the model is fetched once and cached. In-browser argument grounding
        covers common formats; the Python grounder is the source of truth. No data leaves your
        device.
      </p>
    </footer>
  </main>

  <!-- onnxruntime-web (WASM + WebGPU execution providers) -->
  <script src="https://cdn.jsdelivr.net/npm/onnxruntime-web@1.20.1/dist/ort.webgpu.min.js"></script>
  <script src="app.js"></script>
</body>
</html>