---
title: Agent UI
emoji: 🤖
colorFrom: green
colorTo: blue
sdk: docker
pinned: false
header: mini
---
# Agent UI

A multi-agent AI interface with code execution, web search, image generation, and deep research, all orchestrated from a single command center.
## Local Install

```bash
pip install .          # Install from pyproject.toml
python -m backend.main # Start server at http://localhost:8765
```

Or use Make shortcuts:

```bash
make install # pip install .
make dev     # Start dev server
```
Configure API keys in the Settings panel, or set environment variables:

| Variable | Purpose |
|----------|---------|
| `LLM_API_KEY` | Default LLM provider token (any OpenAI-compatible API) |
| `HF_TOKEN` | HuggingFace token (image generation, hosted models) |
| `E2B_API_KEY` | [E2B](https://e2b.dev) sandbox for code execution |
| `SERPER_API_KEY` | [Serper](https://serper.dev) for web search |
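For illustration, the backend might read these keys along the following lines (a sketch only; the dict layout and variable names below are hypothetical, not the actual config-loading code):

```python
import os

# Read the optional API keys; an empty value means the corresponding
# capability (LLM calls, image generation, sandboxing, search) is unavailable.
config = {
    "llm_api_key": os.environ.get("LLM_API_KEY", ""),
    "hf_token": os.environ.get("HF_TOKEN", ""),
    "e2b_api_key": os.environ.get("E2B_API_KEY", ""),
    "serper_api_key": os.environ.get("SERPER_API_KEY", ""),
}

# Which integrations are not configured yet
missing = [name for name, value in config.items() if not value]
```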
## Docker

```bash
docker build -t agent-ui .
docker run -p 7860:7860 -e LLM_API_KEY=... agent-ui
```

CLI options: `--port`, `--no-browser`, `--config-dir`, `--workspace-dir`, `--multi-user`.

For HuggingFace Spaces deployment, set the `HF_BUCKET` and `HF_BUCKET_TOKEN` secrets for workspace persistence across restarts.
## Architecture

```
backend/
├── agents.py    # Agent registry (single source of truth) + shared LLM utilities
├── main.py      # FastAPI routes, SSE streaming, file management
├── command.py   # Command center: tool routing, agent launching
├── code.py      # Code agent: E2B sandbox execution
├── agent.py     # Web agent: search + browse
├── research.py  # Research agent: multi-source deep analysis
├── image.py     # Image agent: generate/edit via HuggingFace
└── tools.py     # Direct tools (execute_code, web_search, show_html, etc.)
frontend/
├── index.html     # Entry point
├── utils.js       # Global state, shared helpers (setupInputListeners, closeAllPanels)
├── timeline.js    # Sidebar timeline data + rendering
├── sessions.js    # Session CRUD + panel
├── tabs.js        # Tab creation/switching, sendMessage
├── streaming.js   # SSE streaming, code cells, action widgets, markdown
├── workspace.js   # Workspace serialize/restore
├── settings.js    # Settings CRUD, themes, debug/files/sessions panels
├── app.js         # Initialization, event listeners, DOMContentLoaded
├── style.css      # All styles (CSS custom properties for theming)
└── research-ui.js # Research-specific UI components
```
### How It Works

1. The **command center** receives user messages and decides whether to answer directly or launch sub-agents.
2. Sub-agents (code, web, research, image) run in their own tabs with specialized tools.
3. All communication uses **SSE streaming**: agents yield JSON events with a `type` field.
4. Settings store providers, models, and agent-to-model assignments; any OpenAI-compatible API works.
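Concretely, each yielded event dict becomes one SSE frame on the wire. A minimal sketch of that framing (the helper name `to_sse_frame` is hypothetical, not the actual backend function):

```python
import json

def to_sse_frame(event: dict) -> str:
    # One SSE frame: a "data:" line carrying the JSON payload,
    # terminated by a blank line.
    return f"data: {json.dumps(event)}\n\n"

frame = to_sse_frame({"type": "thinking", "content": "planning..."})
# frame == 'data: {"type": "thinking", "content": "planning..."}\n\n'
```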
## Extending Agent UI

### Adding a New Agent

Only the backend needs changes; the frontend fetches the registry from `GET /api/agents` at startup.

**1. Backend registry**: add to `AGENT_REGISTRY` in `backend/agents.py`:

```python
"my_agent": {
    "label": "MY AGENT",
    "system_prompt": "You are a helpful assistant...",
    "tool": {
        "type": "function",
        "function": {
            "name": "launch_my_agent",
            "description": "Launch my agent for X tasks.",
            "parameters": {
                "type": "object",
                "properties": {
                    "task": {"type": "string", "description": "The task"},
                    "task_id": {"type": "string", "description": "2-3 word ID"}
                },
                "required": ["task", "task_id"]
            }
        }
    },
    "tool_arg": "task",
    "has_counter": True,
    "in_menu": True,
    "in_launcher": True,
    "placeholder": "Enter message...",
    "capabilities": "Short description of what this agent can do.",
},
```
**2. Backend streaming handler**: create `backend/my_agent.py`:

```python
from .agents import call_llm

def stream_my_agent(client, model, messages, extra_params=None, abort_event=None):
    """Generator yielding SSE event dicts."""
    debug_call_number = 0
    done = False
    while not done:
        # call_llm handles retries and emits debug events
        response = None
        for event in call_llm(client, model, messages, tools=MY_TOOLS,
                              extra_params=extra_params, abort_event=abort_event,
                              call_number=debug_call_number):
            if "_response" in event:
                response = event["_response"]
                debug_call_number = event["_call_number"]
            else:
                yield event
                if event.get("type") in ("error", "aborted"):
                    return
        # Process response, yield events, and set done = True when finished...
        yield {"type": "thinking", "content": "..."}
        done = True
    yield {"type": "result", "content": "Final answer"}
    yield {"type": "done"}
```
Required events: `done`, `error`. Common: `thinking`, `content`, `result`, `result_preview`.
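The smallest possible conforming generator, for illustration (the name `stream_echo_agent` and its behavior are hypothetical):

```python
def stream_echo_agent(task: str):
    """Toy agent: yields one of each common event shape, then the required done."""
    yield {"type": "thinking", "content": f"Handling: {task}"}
    yield {"type": "result", "content": task.upper()}
    yield {"type": "done"}

events = list(stream_echo_agent("hello"))
types = [e["type"] for e in events]
# types == ["thinking", "result", "done"]
```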
**3. Wire the route**: in `backend/main.py`, add to the streaming handler dispatch (search for `agent_type`):

```python
elif request.agent_type == "my_agent":
    return StreamingResponse(stream_my_agent_handler(...), ...)
```

**4. Frontend**: no changes needed. The frontend fetches the registry from `GET /api/agents` at startup.
### Adding a Direct Tool

Direct tools execute synchronously in the command center (no sub-agent spawned). Only two files need changes.

**1. Define the tool schema + execute function** in `backend/tools.py`:

```python
my_tool = {
    "type": "function",
    "function": {
        "name": "my_tool",
        "description": "Does something useful.",
        "parameters": {
            "type": "object",
            "properties": {
                "input": {"type": "string", "description": "The input"}
            },
            "required": ["input"]
        }
    }
}

def execute_my_tool(input: str, files_root: str = None) -> dict:
    return {"content": "Result text for the LLM", "extra_data": "..."}
```
**2. Register it** in `DIRECT_TOOL_REGISTRY` at the bottom of `backend/tools.py`:

```python
DIRECT_TOOL_REGISTRY = {
    "show_html": { ... },  # existing
    "my_tool": {
        "schema": my_tool,
        "execute": lambda args, ctx: execute_my_tool(
            args.get("input", ""), files_root=ctx.get("files_root")
        ),
    },
}
```

That's it: `command.py` automatically picks up tools from the registry.
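The dispatch `command.py` performs can be pictured roughly like this (a sketch with a toy registry standing in for the real one; `dispatch_tool` is a hypothetical name, not the actual function):

```python
# Toy stand-in for DIRECT_TOOL_REGISTRY: tool name -> {schema, execute}
registry = {
    "my_tool": {
        "schema": {"type": "function", "function": {"name": "my_tool"}},
        "execute": lambda args, ctx: {"content": args.get("input", "").upper()},
    },
}

def dispatch_tool(name: str, args: dict, ctx: dict) -> dict:
    # Look up the tool by name and run its execute callable with args + context.
    entry = registry.get(name)
    if entry is None:
        return {"content": f"Unknown tool: {name}"}
    return entry["execute"](args, ctx)

result = dispatch_tool("my_tool", {"input": "hi"}, {"files_root": "/tmp"})
# result == {"content": "HI"}
```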
### Modifying System Prompts

All system prompts live in `backend/agents.py` inside `AGENT_REGISTRY`. Edit the `"system_prompt"` field for any agent.

The `get_system_prompt()` function adds dynamic context automatically:

- `{tools_section}` is replaced with available agent descriptions (command center only)
- The current date is appended to all prompts
- The project file tree is appended (in the `main.py` wrapper)
- Theme/styling context is added for code agents
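The date-appending step, for instance, might look like this (a sketch; the real `get_system_prompt()` also handles the other substitutions listed above, and `build_prompt` is a hypothetical name):

```python
from datetime import date

def build_prompt(base_prompt: str) -> str:
    # Append today's date so the model has a notion of "now".
    return f"{base_prompt}\n\nCurrent date: {date.today().isoformat()}"

prompt = build_prompt("You are a helpful assistant.")
```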
### Adding a Model Provider

In the Settings panel, models are configured through **Providers** and **Models**:

1. **Add a provider**: name + OpenAI-compatible endpoint URL + API token
2. **Add a model**: name + provider + API model ID (e.g., `gpt-4o`, `claude-sonnet-4-20250514`)
3. **Assign models**: pick which model each agent type uses

Any OpenAI-compatible API works (OpenAI, Anthropic via proxy, Ollama, vLLM, etc.).

Settings are stored in `workspace/settings.json` and managed via the Settings panel in the UI.
### Creating a Theme

Themes are CSS custom property sets defined in `frontend/settings.js`.

**Add to the `themeColors` object** (search for `const themeColors`):

```javascript
myTheme: {
    border: '#8e24aa',
    bg: '#f3e5f5',
    hoverBg: '#e1bee7',
    accent: '#6a1b9a',
    accentRgb: '106, 27, 154',
    ...lightSurface // Use for light themes
},
```
For dark themes, override the surface colors instead of spreading `lightSurface`:

```javascript
myDarkTheme: {
    border: '#bb86fc',
    bg: '#1e1e2e',
    hoverBg: '#2a2a3e',
    accent: '#bb86fc',
    accentRgb: '187, 134, 252',
    bgPrimary: '#121218',
    bgSecondary: '#1e1e2e',
    bgTertiary: '#0e0e14',
    bgInput: '#0e0e14',
    bgHover: '#2a2a3e',
    bgCard: '#1e1e2e',
    textPrimary: '#e0e0e0',
    textSecondary: '#999999',
    textMuted: '#666666',
    borderPrimary: '#333344',
    borderSubtle: '#222233'
},
```
The theme automatically appears in the Settings theme picker; no other changes are needed. The `applyTheme()` function reads all properties from the object and sets the corresponding CSS variables.

**Available CSS variables:** `--theme-accent`, `--theme-accent-rgb`, `--theme-bg`, `--theme-hover-bg`, `--theme-border`, `--bg-primary`, `--bg-secondary`, `--bg-tertiary`, `--bg-input`, `--bg-hover`, `--bg-card`, `--text-primary`, `--text-secondary`, `--text-muted`, `--border-primary`, `--border-subtle`.
## SSE Event Protocol

All agents communicate via Server-Sent Events. Each event is a JSON object with a `type` field.

| Event | Description |
|-------|-------------|
| `done` | Stream complete (required) |
| `error` | `{content}`: error message (required) |
| `thinking` | `{content}`: reasoning text |
| `content` | `{content}`: streamed response tokens |
| `result` | `{content, figures?}`: final output for the command center |
| `result_preview` | Same as `result`, shown inline |
| `retry` | `{attempt, max_attempts, delay, message}`: retrying |
| `debug_call_input` | `{call_number, messages}`: LLM input (debug panel) |
| `debug_call_output` | `{call_number, response}`: LLM output (debug panel) |
| `launch` | `{agent_type, initial_message, task_id}`: spawn sub-agent |
| `tool_start` | `{tool, args}`: direct tool started |
| `tool_result` | `{tool, result}`: direct tool completed |
| `code_start` | `{code}`: code execution started |
| `code` | `{output, error, images}`: code execution result |
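On the consuming side, a client reassembles events from `data:` lines. A minimal parser sketch (assumes each event arrives as a single `data:` line followed by a blank line, matching the framing above; `parse_sse` is a hypothetical helper, not part of the codebase):

```python
import json

def parse_sse(raw: str):
    """Yield event dicts from an SSE stream already decoded to text."""
    for line in raw.split("\n"):
        if line.startswith("data: "):
            yield json.loads(line[len("data: "):])

stream = 'data: {"type": "content", "content": "Hi"}\n\ndata: {"type": "done"}\n\n'
events = list(parse_sse(stream))
# events == [{"type": "content", "content": "Hi"}, {"type": "done"}]
```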
## Key Patterns & Conventions

### Backend

- **Single source of truth**: `AGENT_REGISTRY` in `backend/agents.py` defines all agent types. The frontend fetches it via `GET /api/agents`; never duplicate agent definitions.
- **LLM calls**: Always use `call_llm()` from `agents.py`; it handles retries and abort checking, and emits `debug_call_input`/`debug_call_output` events for the debug panel.
- **Streaming pattern**: Agent handlers are sync generators yielding event dicts. `_stream_sync_generator()` in `main.py` wraps them for async SSE delivery; never duplicate the async queue boilerplate.
- **Direct tools**: `DIRECT_TOOL_REGISTRY` in `tools.py` maps tool name to `{schema, execute}`. `command.py` dispatches automatically.
- **Result nudging**: When an agent finishes without `<result>` tags, `nudge_for_result()` in `agents.py` asks the LLM for a final answer. It uses `call_llm` internally.

### Frontend

- **No build system**: Plain `<script>` tags in `index.html`, no bundler. Files share `window` scope.
- **Load order matters**: `utils.js` loads first (declares all globals), then the other files. Cross-file function calls are fine because they happen at runtime, not parse time.
- **Global state** lives in `utils.js`: `AGENT_REGISTRY`, `settings`, `activeTabId`, `tabCounter`, `timelineData`, `debugHistory`, `globalFigureRegistry`, etc.
- **Shared helpers** (also in `utils.js`):
  - `setupInputListeners(container, tabId)` wires textarea auto-resize, Enter-to-send, and the send button click
  - `setupCollapseToggle(cell, labelSelector)` wires click-to-collapse on tool/code cells
  - `closeAllPanels()` closes all right-side panels (settings, debug, files, sessions)
- **Markdown rendering**: `parseMarkdown()` in `streaming.js` is the single entry point (marked + KaTeX + Prism).
- **Panel toggle pattern**: Call `closeAllPanels()` first, then add `.active` to the panel being opened.
- **Workspace persistence**: Changes auto-save via `saveWorkspaceDebounced()`. Tab state is serialized to JSON and posted to `/api/workspace`.
- **Cache busting**: Bump the `?v=N` query params in `index.html` when changing JS/CSS files.
### Naming

- Backend: `stream_<agent>_execution()` for the sync generator, `_stream_<agent>_inner()` for the async wrapper in `main.py`
- Frontend: Agent types use short keys (`code`, `agent`, `research`, `image`, `command`)
- CSS: `--theme-*` for accent colors, `--bg-*` / `--text-*` / `--border-*` for surface colors

## Verification

Verify backend imports: `python -c "from backend.command import stream_command_center"`