{% extends "base.html" %}
{% block title %}embodied-efficiency: run the VLA on the robot, not just in the demo{% endblock %}
{% block content %}
<!-- ============================ HERO ============================ -->
<section class="grid items-center gap-10 lg:grid-cols-[1.05fr_0.95fr]">
<div>
<span class="inline-flex items-center gap-2 rounded-full bg-signal-50 px-3 py-1 text-xs font-semibold text-signal-700 ring-1 ring-inset ring-signal-200/70 dark:bg-signal-500/10 dark:text-signal-300 dark:ring-signal-400/20">
<span class="ee-breathe h-1.5 w-1.5 rounded-full bg-signal-500"></span>
Kernels · quantization · a runtime trust layer · no API key
</span>
<h1 class="mt-4 text-4xl font-semibold leading-[1.08] tracking-tight text-slate-900 dark:text-slate-50 sm:text-5xl">
Run the VLA on the robot,<br class="hidden sm:block" />
<span class="ee-gradient-text">not just in the demo.</span>
</h1>
<p class="mt-4 max-w-xl text-lg leading-relaxed text-slate-600 dark:text-slate-300">
A vision-language-action model folds laundry in the lab. Put it on the actual
robot and it stalls, and not because it can't do the task. It can't do it fast
enough. The capability is there. What's left is engineering: get it inside a
latency budget, then keep it safe once it's running.
</p>
<div class="mt-7 flex flex-wrap items-center gap-3">
<a href="#compiler"
class="ee-focus inline-flex items-center gap-2 rounded-xl bg-signal-600 px-5 py-3 text-sm font-semibold text-white shadow-lg shadow-signal-700/25 transition hover:bg-signal-700">
<svg class="h-4 w-4" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.2" stroke-linecap="round" stroke-linejoin="round"><path d="M3 17l4-8 4 5 3-6 4 9"/></svg>
Set a deploy budget
</a>
<a href="{{ thesis_url }}" target="_blank" rel="noopener"
class="ee-focus inline-flex items-center gap-2 rounded-xl border border-slate-300 bg-white/70 px-5 py-3 text-sm font-semibold text-slate-700 transition hover:border-signal-300 hover:bg-white dark:border-white/15 dark:bg-white/5 dark:text-slate-200 dark:hover:bg-white/10">
Read the thesis
</a>
</div>
</div>
<!-- Hero proof card: the deploy gap, drawn to scale. -->
<div class="ee-card p-6 sm:p-7">
<div class="flex items-center justify-between">
<span class="ee-chip bg-signal-50 text-signal-700 ring-1 ring-inset ring-signal-200/70 dark:bg-signal-500/10 dark:text-signal-300 dark:ring-signal-400/20">
<svg class="h-3 w-3" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.5" stroke-linecap="round" stroke-linejoin="round"><path d="M3 17l4-8 4 5 3-6 4 9"/></svg>
The deploy gap
</span>
<span class="ee-mono text-xs font-medium text-slate-400">control rate, Hz</span>
</div>
<div class="mt-6 space-y-5">
<div>
<div class="flex items-baseline justify-between text-sm">
<span class="font-medium text-slate-600 dark:text-slate-300">End-to-end VLA today</span>
<span class="ee-mono font-semibold text-slate-700 dark:text-slate-200">3&ndash;5&nbsp;Hz</span>
</div>
<div class="ee-meter ee-meter--hold mt-1.5"><span style="width: 6%"></span></div>
</div>
<div>
<div class="flex items-baseline justify-between text-sm">
<span class="font-medium text-slate-600 dark:text-slate-300">A robot arm, to move smoothly</span>
<span class="ee-mono font-semibold text-slate-700 dark:text-slate-200">50&ndash;100&nbsp;Hz</span>
</div>
<div class="ee-meter ee-meter--go mt-1.5"><span style="width: 100%"></span></div>
</div>
</div>
<p class="mt-6 rounded-xl ee-panel px-4 py-3 text-sm leading-relaxed text-slate-600 dark:text-slate-300">
<span class="font-semibold text-slate-800 dark:text-slate-100">A 10&ndash;30&times; gap.</span>
This repo closes it with the levers that actually pay off at batch&nbsp;1, then
adds a supervisor so the fast policy is also one you can leave running.
</p>
</div>
</section>
<!-- ============================ STATS ============================ -->
<section class="mt-12 grid grid-cols-2 gap-4 sm:grid-cols-4">
{% set stats = [
("5.9&times;", "CUDA-graph speedup", "measured on a T4, beats torch.compile"),
("0.089", "ms / action, best case", "bf16 + graph + action-chunking"),
("4", "experiments on low-bit", "the win and the negative result, same rigor"),
("0", "API keys, 0 GPUs needed", "the console runs the real code, free")
] %}
{% for big, label, sub in stats %}
<div class="ee-card p-5">
<div class="ee-mono text-2xl font-bold tracking-tight text-slate-800 dark:text-slate-100 sm:text-3xl">{{ big|safe }}</div>
<div class="mt-1 text-sm font-semibold text-slate-600 dark:text-slate-300">{{ label }}</div>
<div class="mt-1 text-xs text-slate-400 dark:text-slate-500">{{ sub }}</div>
</div>
{% endfor %}
</section>
<!-- ========================= DEPLOY COMPILER ========================= -->
<section id="compiler" class="mt-16 scroll-mt-24">
<div class="flex items-center gap-3">
<span class="grid h-10 w-10 place-items-center rounded-xl bg-gradient-to-br from-signal-600 to-cyan2-400 text-white shadow-md shadow-signal-700/25">
<svg class="h-5 w-5" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><path d="M3 3v18h18"/><path d="m7 14 3-4 3 3 4-6"/></svg>
</span>
<div>
<h2 class="text-2xl font-semibold tracking-tight sm:text-3xl">Deploy-compiler</h2>
<p class="text-sm text-slate-500 dark:text-slate-400">Set a budget. It picks the best config off the real-L4 frontier, live.</p>
</div>
</div>
<div class="mt-6 grid gap-5 lg:grid-cols-[0.85fr_1.15fr]">
<!-- Budget knobs -->
<div class="ee-card p-6">
<div class="flex items-center justify-between">
<h3 class="font-semibold text-slate-800 dark:text-slate-100">Deployment budget</h3>
<button type="button" id="reset-budget" class="ee-focus rounded-lg px-2 py-1 text-xs font-medium text-slate-500 transition hover:bg-slate-200/60 dark:text-slate-400 dark:hover:bg-white/5">Reset</button>
</div>
<div class="mt-5">
<label class="text-xs font-semibold uppercase tracking-wide text-slate-500 dark:text-slate-400">Minimize</label>
<div class="mt-2 grid grid-cols-2 gap-2" id="objective">
<button type="button" data-obj="latency" class="ee-obj ee-focus rounded-xl border px-3 py-2.5 text-sm font-semibold transition">
Latency
<span class="block text-[11px] font-normal text-slate-400">fastest action</span>
</button>
<button type="button" data-obj="footprint" class="ee-obj ee-focus rounded-xl border px-3 py-2.5 text-sm font-semibold transition">
Footprint
<span class="block text-[11px] font-normal text-slate-400">smallest model</span>
</button>
</div>
</div>
<div class="mt-6 space-y-5">
<div>
<div class="flex items-center justify-between text-sm">
<label for="max_lat" class="font-medium text-slate-600 dark:text-slate-300">Max latency</label>
<span class="ee-mono text-xs font-semibold text-signal-700 dark:text-signal-300"><span id="v_lat">12.4</span> ms/action</span>
</div>
<input id="max_lat" type="range" min="0" max="1000" value="1000" class="ee-range mt-2" />
</div>
<div>
<div class="flex items-center justify-between text-sm">
<label for="max_mb" class="font-medium text-slate-600 dark:text-slate-300">Max footprint</label>
<span class="ee-mono text-xs font-semibold text-signal-700 dark:text-signal-300"><span id="v_mb">51</span> MB</span>
</div>
<input id="max_mb" type="range" min="13" max="51" step="0.1" value="51" class="ee-range mt-2" />
</div>
<div>
<div class="flex items-center justify-between text-sm">
<label for="max_rmse" class="font-medium text-slate-600 dark:text-slate-300">Max action error (fidelity)</label>
<span class="ee-mono text-xs font-semibold text-signal-700 dark:text-signal-300">rMSE &le; <span id="v_rmse">0.05</span></span>
</div>
<input id="max_rmse" type="range" min="0" max="0.30" step="0.005" value="0.05" class="ee-range mt-2" />
</div>
<div>
<div class="flex items-center justify-between text-sm">
<label for="max_stale" class="font-medium text-slate-600 dark:text-slate-300">Max staleness</label>
<span class="ee-mono text-xs font-semibold text-signal-700 dark:text-signal-300"><span id="v_stale">49</span> steps</span>
</div>
<input id="max_stale" type="range" min="0" max="49" value="49" class="ee-range mt-2" />
<p class="mt-1.5 text-[11px] leading-snug text-slate-400 dark:text-slate-500">Action-chunking emits several actions per sampler call: each action is cheaper, but the later actions in a chunk act on an older observation. This knob caps how many steps stale an action may be.</p>
</div>
</div>
</div>
<!-- Pick + frontier -->
<div class="ee-card p-6">
<div id="pick-result"><!-- filled by app.js --></div>
<div class="mt-5">
<div class="flex items-center justify-between text-xs text-slate-400 dark:text-slate-500">
<span>Real-L4 frontier · 27 configs</span>
<span class="inline-flex items-center gap-3">
<span class="inline-flex items-center gap-1"><span class="h-2 w-2 rounded-full" style="background:rgb(16 185 129)"></span>low error</span>
<span class="inline-flex items-center gap-1"><span class="h-2 w-2 rounded-full" style="background:rgb(244 100 110)"></span>high error</span>
</span>
</div>
<div id="pareto" class="mt-2 w-full"><!-- SVG injected by app.js --></div>
</div>
</div>
</div>
</section>
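{#
The budget knobs above amount to a constrained pick: drop every config that breaks a limit, then minimize the chosen objective over what is left. A minimal sketch of that idea follows. It is not the code in app.js; the field names (latMs, mb, rmse, stale), the function name pickConfig, and the three toy configs are assumptions made up for illustration, and the real keys in configs_json may differ.

```javascript
// Toy frontier: names and numbers are invented for illustration only,
// they are not measurements from the repo.
const toyConfigs = [
  { name: "A", latMs: 2.0, mb: 20, rmse: 0.01, stale: 0 },
  { name: "B", latMs: 0.5, mb: 40, rmse: 0.02, stale: 7 },
  { name: "C", latMs: 0.1, mb: 13, rmse: 0.2, stale: 49 },
];

// Keep the configs that satisfy every budget, then return the one with the
// smallest value of the objective ("latency" or "footprint").
function pickConfig(configs, budget, objective) {
  const key = objective === "footprint" ? "mb" : "latMs";
  const feasible = configs.filter(
    (c) =>
      c.latMs <= budget.maxLatMs &&
      c.mb <= budget.maxMb &&
      c.rmse <= budget.maxRmse &&
      c.stale <= budget.maxStale
  );
  if (feasible.length === 0) return null; // nothing fits the budget
  return feasible.reduce((best, c) => (c[key] < best[key] ? c : best));
}

// Budget values mirror the slider defaults shown on the page.
const budget = { maxLatMs: 12.4, maxMb: 51, maxRmse: 0.05, maxStale: 49 };
const fastest = pickConfig(toyConfigs, budget, "latency");
const smallest = pickConfig(toyConfigs, budget, "footprint");
console.log(fastest.name, smallest.name);
```

With the fidelity limit at 0.05, config C is filtered out even though it is both the fastest and the smallest, which is the point of having the error knob: the objective only ranks configs that already meet every constraint.
#}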
<!-- ========================= SAFETY SUPERVISOR ========================= -->
<section id="supervisor" class="mt-16 scroll-mt-24">
<div class="flex items-center gap-3">
<span class="grid h-10 w-10 place-items-center rounded-xl bg-gradient-to-br from-signal-600 to-cyan2-400 text-white shadow-md shadow-signal-700/25">
<svg class="h-5 w-5" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><path d="M12 22s8-4 8-10V5l-8-3-8 3v7c0 6 8 10 8 10z"/></svg>
</span>
<div>
<h2 class="text-2xl font-semibold tracking-tight sm:text-3xl">Safety supervisor</h2>
<p class="text-sm text-slate-500 dark:text-slate-400">The runtime trust layer. It vets every action before it reaches a motor, live, on this server.</p>
</div>
</div>
<div class="mt-4 flex flex-wrap items-center gap-2 text-xs">
<span class="ee-chip bg-emerald-50 text-emerald-700 ring-1 ring-inset ring-emerald-200/70 dark:bg-emerald-500/10 dark:text-emerald-300 dark:ring-emerald-400/20">
<span class="ee-light ee-light--go"></span> measured, not asserted
</span>
<span class="text-slate-500 dark:text-slate-400">
On <strong>real DROID robot actions</strong> with labelled faults, the drift detector scores
<span class="ee-mono font-semibold text-slate-700 dark:text-slate-200">AUC 0.99</span>; tuned to a 1% false-alarm budget, it catches
<span class="ee-mono font-semibold text-slate-700 dark:text-slate-200">91%</span> of faults. The eval is in the repo.
</span>
</div>
<div class="mt-6 grid gap-5 lg:grid-cols-[0.85fr_1.15fr]">
<!-- Scenario picker -->
<div class="ee-card p-6">
<h3 class="font-semibold text-slate-800 dark:text-slate-100">Send it an action to vet</h3>
<p class="mt-1 text-sm text-slate-500 dark:text-slate-400">The policy is calibrated on a normal posture. Pick what to throw at it.</p>
<form hx-post="/vet" hx-target="#verdict" hx-swap="innerHTML" class="mt-5 space-y-3">
<div class="space-y-2">
{% for key, label in scenarios %}
<label class="ee-scen flex cursor-pointer items-center gap-3 rounded-xl border border-slate-200 px-4 py-3 text-sm transition hover:border-signal-300 hover:bg-signal-50/40 dark:border-white/10 dark:hover:border-signal-400/40 dark:hover:bg-signal-500/5">
<input type="radio" name="scenario" value="{{ key }}" {% if loop.first %}checked{% endif %}
class="h-4 w-4 accent-signal-600" />
<span class="font-medium text-slate-700 dark:text-slate-200">{{ label }}</span>
</label>
{% endfor %}
</div>
<button type="submit"
class="ee-focus mt-2 inline-flex w-full items-center justify-center gap-2 rounded-xl bg-signal-600 px-5 py-3 text-sm font-semibold text-white shadow-lg shadow-signal-700/25 transition hover:bg-signal-700">
<span class="htmx-indicator inline-block h-4 w-4 animate-spin rounded-full border-2 border-white/40 border-t-white"></span>
Vet this action
</button>
</form>
</div>
<!-- Verdict + governance trail -->
<div class="ee-card p-6">
<div id="verdict">
<div class="flex h-full min-h-[18rem] flex-col items-center justify-center text-center">
<span class="grid h-14 w-14 place-items-center rounded-2xl bg-slate-100 text-slate-400 dark:bg-white/5 dark:text-slate-500">
<svg class="h-7 w-7" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="1.8" stroke-linecap="round" stroke-linejoin="round"><path d="M12 22s8-4 8-10V5l-8-3-8 3v7c0 6 8 10 8 10z"/></svg>
</span>
<p class="mt-4 text-sm font-medium text-slate-500 dark:text-slate-400">Pick an action and hit <span class="font-semibold text-slate-700 dark:text-slate-200">Vet this action</span>.</p>
<p class="mt-1 text-xs text-slate-400 dark:text-slate-500">The verdict, the action actually sent, and the running governance log show up here.</p>
</div>
</div>
</div>
</div>
</section>
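{#
"Tuned to a 1% false-alarm budget" can be read as: choose the alarm threshold from scores on normal actions so that at most 1% of them trip it, then report the share of fault scores that exceed it. A minimal sketch of that procedure follows. It is not the repo's supervisor or its eval; the function names, the alarm rule (score strictly above threshold), and the toy scores are assumptions for illustration.

```javascript
// Choose a threshold so that at most `falseAlarmBudget` of the normal
// scores lie strictly above it. Assumed alarm rule: score > threshold.
function thresholdForBudget(normalScores, falseAlarmBudget) {
  const sorted = [...normalScores].sort((a, b) => a - b);
  // Number of normal samples we are allowed to flag, clamped to a valid index.
  const allowed = Math.min(
    Math.floor(falseAlarmBudget * sorted.length),
    sorted.length - 1
  );
  // At most `allowed` scores can be strictly greater than this element.
  return sorted[sorted.length - 1 - allowed];
}

// Fraction of scores that would raise an alarm at the given threshold.
function fractionAbove(scores, threshold) {
  return scores.filter((s) => s > threshold).length / scores.length;
}

// Toy data, invented for illustration: 100 evenly spaced "normal" scores
// and four "fault" scores.
const normal = Array.from({ length: 100 }, (_, i) => i / 100);
const faults = [0.5, 0.985, 1.2, 3.0];

const threshold = thresholdForBudget(normal, 0.01);
const falseAlarmRate = fractionAbove(normal, threshold);
const catchRate = fractionAbove(faults, threshold);
console.log({ threshold, falseAlarmRate, catchRate });
```

Fixing the false-alarm rate first and reading off the catch rate second keeps the two numbers comparable across detectors, since each one is judged at the same cost in spurious stops.
#}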
<!-- ========================= CLOSING ========================= -->
<section class="mt-16">
<div class="ee-card relative overflow-hidden p-8 text-center sm:p-10">
<div class="pointer-events-none absolute inset-x-0 -top-24 mx-auto h-48 w-48 rounded-full bg-signal-400/20 blur-3xl"></div>
<div class="relative">
<h2 class="text-2xl font-semibold tracking-tight sm:text-3xl">Efficiency gets it onto the robot. The supervisor lets it stay.</h2>
<p class="mx-auto mt-2 max-w-2xl text-slate-600 dark:text-slate-300">
Everything here is measured and reproducible, on the hardware robots actually
carry. The code, the four-experiment low-bit study, and the full write-up are
public.
</p>
<div class="mt-6 flex flex-wrap justify-center gap-3">
<a href="{{ github_url }}" target="_blank" rel="noopener" class="ee-focus inline-flex items-center gap-2 rounded-xl bg-signal-600 px-5 py-3 text-sm font-semibold text-white shadow-lg shadow-signal-700/25 transition hover:bg-signal-700">
<svg class="h-4 w-4" viewBox="0 0 24 24" fill="currentColor"><path d="M12 .5C5.73.5.5 5.73.5 12c0 5.08 3.29 9.39 7.86 10.91.58.11.79-.25.79-.56v-2c-3.2.7-3.88-1.54-3.88-1.54-.52-1.33-1.28-1.69-1.28-1.69-1.05-.72.08-.7.08-.7 1.16.08 1.77 1.19 1.77 1.19 1.03 1.77 2.7 1.26 3.36.96.1-.75.4-1.26.73-1.55-2.55-.29-5.24-1.28-5.24-5.69 0-1.26.45-2.29 1.19-3.1-.12-.29-.52-1.46.11-3.05 0 0 .97-.31 3.18 1.18a11 11 0 0 1 5.8 0c2.2-1.49 3.17-1.18 3.17-1.18.63 1.59.23 2.76.11 3.05.74.81 1.19 1.84 1.19 3.1 0 4.42-2.69 5.39-5.25 5.68.41.36.78 1.06.78 2.14v3.17c0 .31.21.68.8.56A10.52 10.52 0 0 0 23.5 12C23.5 5.73 18.27.5 12 .5Z"/></svg>
View the repo
</a>
<a href="{{ thesis_url }}" target="_blank" rel="noopener" class="ee-focus inline-flex items-center gap-2 rounded-xl border border-slate-300 bg-white/70 px-5 py-3 text-sm font-semibold text-slate-700 transition hover:border-signal-300 hover:bg-white dark:border-white/15 dark:bg-white/5 dark:text-slate-200 dark:hover:bg-white/10">
Read the thesis
</a>
</div>
</div>
</div>
</section>
<!-- Real-L4 Pareto frontier, serialised for the client-side compiler. -->
<script id="configs-data" type="application/json">{{ configs_json|safe }}</script>
{% endblock %}