Empty PR for testing #9
by gneubig - opened
- app.py +4 -27
- main_page.py +41 -81
- ui_components.py +2 -2
app.py CHANGED
@@ -379,37 +379,14 @@ with demo.route("About", "/about"):
 logger.info("All routes configured")
 
 # Mount the REST API on /api
-from fastapi import FastAPI
-from fastapi.responses import RedirectResponse
-from starlette.middleware.base import BaseHTTPMiddleware
+from fastapi import FastAPI
 from api import api_app
 
-
-class RootRedirectMiddleware(BaseHTTPMiddleware):
-    """Middleware to redirect root path "/" to "/home".
-
-    This fixes the 307 trailing slash redirect issue (Gradio bug #11071) that
-    occurs when Gradio is mounted at "/" - FastAPI's default behavior redirects
-    "/" to "//", which breaks routing on HuggingFace Spaces.
-
-    See: https://github.com/gradio-app/gradio/issues/11071
-    """
-    async def dispatch(self, request: Request, call_next):
-        if request.url.path == "/":
-            return RedirectResponse(url="/home", status_code=302)
-        return await call_next(request)
-
-
-# Create a parent FastAPI app with redirect_slashes=False to prevent
-# automatic trailing slash redirects that cause issues with Gradio
-root_app = FastAPI(redirect_slashes=False)
-
-# Add middleware to handle root path redirect to /home
-root_app.add_middleware(RootRedirectMiddleware)
-
+# Create a parent FastAPI app that will host both the API and Gradio
+root_app = FastAPI()
 root_app.mount("/api", api_app)
 
-# Mount Gradio app
+# Mount Gradio app - root redirect is handled by the proxy
 app = gr.mount_gradio_app(root_app, demo, path="/")
 logger.info("REST API mounted at /api, Gradio app mounted at /")
 
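Since the root redirect now lives in the proxy rather than the app, the remaining behavior to check locally is the two mounts. A minimal smoke-test sketch, assuming the app object built above; the check_mounts helper and the probed paths are illustrative, not part of the PR, and /api/docs only responds if api_app keeps FastAPI's default docs enabled:

from fastapi.testclient import TestClient

def check_mounts(app):
    # Hypothetical helper (not part of the PR): probe both mounts locally.
    client = TestClient(app)
    # A FastAPI sub-app mounted at /api serves its own OpenAPI docs page.
    assert client.get("/api/docs").status_code == 200
    # Gradio owns "/"; with no proxy in front locally, "/" is served by
    # Gradio directly instead of being redirected to /home.
    assert client.get("/").status_code == 200
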
main_page.py CHANGED
@@ -1,7 +1,6 @@
 import matplotlib
 matplotlib.use('Agg')
 import gradio as gr
-import pandas as pd
 
 
 from ui_components import (
@@ -27,32 +26,6 @@ from constants import MARK_BY_DEFAULT
 CACHED_VIEWERS = {}
 CACHED_TAG_MAPS = {}
 
-
-def filter_complete_entries(df: pd.DataFrame) -> pd.DataFrame:
-    if df.empty:
-        return df.copy()
-
-    category_score_columns = [
-        'Issue Resolution Score',
-        'Frontend Score',
-        'Greenfield Score',
-        'Testing Score',
-        'Information Gathering Score',
-    ]
-
-    if all(column in df.columns for column in category_score_columns):
-        return df[df[category_score_columns].notna().all(axis=1)].copy()
-
-    if 'Categories Completed' in df.columns:
-        categories_completed = pd.to_numeric(df['Categories Completed'], errors='coerce')
-        return df[categories_completed >= 5].copy()
-
-    if 'Categories Attempted' in df.columns:
-        return df[df['Categories Attempted'] == '5/5'].copy()
-
-    return df.copy()
-
-
 def build_page():
     with gr.Row(elem_id="intro-row"):
         with gr.Column(scale=1):
@@ -65,91 +38,78 @@ def build_page():
 
     test_df, test_tag_map = get_full_leaderboard_data("test")
     if not test_df.empty:
-        create_leaderboard_display(
+        # Get the checkbox and dropdown returned from create_leaderboard_display
+        show_open_only_checkbox, mark_by_dropdown = create_leaderboard_display(
             full_df=test_df,
             tag_map=test_tag_map,
             category_name=CATEGORY_NAME,
             split_name="test"
         )
-
-
-        has_complete_entries = len(test_df_complete) > 0
-
+
+        # Prepare open-only filtered dataframe for Winners and Evolution
         if 'Openness' in test_df.columns:
             test_df_open = test_df[test_df['Openness'].str.lower() == 'open'].copy()
         else:
             test_df_open = test_df.copy()
-
-
-        initial_df = test_df_complete if has_complete_entries else test_df
-
+
         # --- Winners by Category Section ---
         gr.Markdown("---")
         gr.HTML('<h2>Winners by Category</h2>', elem_id="winners-header")
         gr.Markdown("Top 5 performing systems in each benchmark category.")
-
-
-
-
-
-
+
+        # Create both all and open-only versions of winners HTML
+        winners_html_all = create_winners_by_category_html(test_df, top_n=5)
+        winners_html_open = create_winners_by_category_html(test_df_open, top_n=5)
+
+        winners_component = gr.HTML(winners_html_all, elem_id="winners-by-category")
+
         # --- New Visualization Sections ---
         gr.Markdown("---")
-
+
         # Evolution Over Time Section
         gr.HTML('<h2>Evolution Over Time</h2>', elem_id="evolution-header")
         gr.Markdown("Track how model performance has improved over time based on release dates.")
-
-
-
-
-        )
-
+
+        # Create initial evolution chart with default mark_by
+        evolution_fig_all = create_evolution_over_time_chart(test_df, MARK_BY_DEFAULT)
+
+        evolution_component = gr.Plot(value=evolution_fig_all, elem_id="evolution-chart")
+
         gr.Markdown("---")
-
+
         # Open Model Accuracy by Size Section (always shows open models only by design)
         gr.HTML('<h2>Open Model Accuracy by Size</h2>', elem_id="size-accuracy-header")
         gr.Markdown("Compare open-weights model performance against their parameter count.")
-
-
-
-
-
-
-
-
-
-
-
-
-
-            evolution_fig = create_evolution_over_time_chart(
-            size_fig = create_accuracy_by_size_chart(
-
+
+        size_fig = create_accuracy_by_size_chart(test_df, MARK_BY_DEFAULT)
+        size_component = gr.Plot(value=size_fig, elem_id="size-accuracy-chart")
+
+        # Update function for Winners, Evolution, and Size charts based on filters
+        def update_extra_sections(show_open_only, mark_by):
+            # Select the appropriate dataframe based on open_only filter
+            df_to_use = test_df_open if show_open_only else test_df
+
+            # Winners HTML (not affected by mark_by, only open_only)
+            winners_html = winners_html_open if show_open_only else winners_html_all
+
+            # Regenerate charts with current mark_by setting
+            evolution_fig = create_evolution_over_time_chart(df_to_use, mark_by)
+            size_fig = create_accuracy_by_size_chart(test_df, mark_by)  # Size chart always uses full df (filters internally)
+
             return winners_html, evolution_fig, size_fig
-
-
-        show_open_only_input = show_open_only_checkbox if show_open_only_checkbox is not None else gr.State(value=False)
-        extra_section_inputs = [show_incomplete_input, show_open_only_input, mark_by_dropdown]
-
-        if show_incomplete_checkbox is not None:
-            show_incomplete_checkbox.change(
-                fn=update_extra_sections,
-                inputs=extra_section_inputs,
-                outputs=[winners_component, evolution_component, size_component]
-            )
-
+
+        # Connect both checkbox and dropdown to update all extra sections
        if show_open_only_checkbox is not None:
             show_open_only_checkbox.change(
                 fn=update_extra_sections,
-                inputs=extra_section_inputs,
+                inputs=[show_open_only_checkbox, mark_by_dropdown],
                 outputs=[winners_component, evolution_component, size_component]
             )
-
+
         if mark_by_dropdown is not None:
             mark_by_dropdown.change(
                 fn=update_extra_sections,
-                inputs=extra_section_inputs,
+                inputs=[show_open_only_checkbox if show_open_only_checkbox else gr.State(value=False), mark_by_dropdown],
                 outputs=[winners_component, evolution_component, size_component]
             )
 
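For reference, the removed filter_complete_entries helper kept only leaderboard rows with all five category scores present, falling back to the 'Categories Completed' and 'Categories Attempted' columns when the score columns were missing. A small sketch of its primary branch on toy data; the values are made up for illustration:

import pandas as pd

# Toy frame with the five category score columns the removed helper checked.
df = pd.DataFrame({
    'Issue Resolution Score':      [0.8, 0.6, None],
    'Frontend Score':              [0.7, None, 0.5],
    'Greenfield Score':            [0.9, 0.4, 0.6],
    'Testing Score':               [0.5, 0.3, 0.7],
    'Information Gathering Score': [0.6, 0.2, 0.8],
})

score_cols = list(df.columns)
# Keep rows where every category score is non-null, as the removed helper did.
complete = df[df[score_cols].notna().all(axis=1)].copy()
print(complete.index.tolist())  # [0] - only the first row has all five scores
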
ui_components.py CHANGED
@@ -1032,8 +1032,8 @@ def create_leaderboard_display(
         outputs=[dataframe_component, cost_plot_component, runtime_plot_component]
     )
 
-    # Return the
-    return
+    # Return the show_open_only_checkbox and mark_by_dropdown so they can be used to update other sections
+    return show_open_only_checkbox, mark_by_dropdown
 
 # # --- Detailed Benchmark Display ---
 def create_benchmark_details_display(
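The two returned controls are what main_page.py feeds into update_extra_sections. A stripped-down sketch of that wiring pattern; the labels and choices are stand-ins, since the real checkbox and dropdown are created inside create_leaderboard_display:

import gradio as gr

with gr.Blocks() as demo:
    # Stand-ins for the controls returned by create_leaderboard_display.
    open_only = gr.Checkbox(label="Open models only", value=False)
    mark_by = gr.Dropdown(label="Mark by", choices=["accuracy", "cost"], value="accuracy")
    out = gr.HTML()

    def update(show_open_only, mark_by_value):
        # In the PR this swaps precomputed all/open-only HTML and redraws charts.
        return f"open_only={show_open_only}, mark_by={mark_by_value}"

    # Both controls feed the same callback, as in main_page.py: each .change()
    # passes the current values of both inputs so the handler sees full state.
    open_only.change(fn=update, inputs=[open_only, mark_by], outputs=[out])
    mark_by.change(fn=update, inputs=[open_only, mark_by], outputs=[out])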