<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<title>LocalAgent — Tool Calling (WebGPU)</title>
<link rel="stylesheet" href="style.css" />
</head>
<body>
<main>
<header>
<h1>🛠️ LocalAgent <span class="sub">tool calling in your browser</span></h1>
<p class="tagline">
A <strong>28M-parameter, from-scratch</strong> byte-level agent over <strong>50 tools</strong>.
The transformer runs on <strong>WebGPU</strong> (<code>onnxruntime-web</code>); the
<strong>route head → dense selector → pointer-copy</strong> dispatch and the planner loop are
lightweight JavaScript. <a href="https://huggingface.co/danelcsb/localagent-tiny-30m-byte" target="_blank" rel="noopener">model</a> ·
<a href="https://github.com/sangbumchoi/localagent" target="_blank" rel="noopener">code</a>
</p>
</header>
<section id="status" class="status loading">
<span class="dot"></span>
<span id="status-text">Loading model…</span>
<span id="backend-badge" class="badge" hidden></span>
</section>
<section class="io">
<label for="prompt">Your request</label>
<textarea id="prompt" rows="2" placeholder="e.g. What's the weather in Paris?"></textarea>
<div class="row">
<label class="switch">
<input type="checkbox" id="plan-mode" />
<span>Multi-step plan (planner rollout)</span>
</label>
<button id="run" disabled>Run</button>
</div>
<div class="examples">
<span>Try:</span>
<button class="chip" data-plan="0">What's the weather in Tokyo?</button>
<button class="chip" data-plan="0">List the files in the src directory</button>
<button class="chip" data-plan="0">Open https://github.com/pytorch/pytorch</button>
<button class="chip" data-plan="0">What does ephemeral mean?</button>
<button class="chip" data-plan="0">Email Dana the quarterly report</button>
<button class="chip" data-plan="1">Search the web for AI news, then save it to Notion.</button>
<button class="chip" data-plan="1">Read config.yaml, run the tests, then commit.</button>
</div>
</section>
<section id="result" class="result" hidden></section>
<footer>
<p>
Runs fully client-side: the model is fetched once and cached, and no data leaves your
device. In-browser argument grounding covers common formats only; the Python grounder is
the source of truth.
</p>
</footer>
</main>
<!-- onnxruntime-web (WASM + WebGPU execution providers) -->
<script src="https://cdn.jsdelivr.net/npm/onnxruntime-web@1.20.1/dist/ort.webgpu.min.js"></script>
<script src="app.js"></script>
</body>
</html>