lvwerra (HF Staff) and Claude Opus 4.6 committed
Commit 4da424f · 1 parent: c48a8e5

Fix tool message handling, parallel image refs, error display, and UX polish


- Preserve tool_call_id/tool_calls in Message model so command center history
doesn't break on subsequent LLM calls (400 error fix)
- Namespace image/figure refs with tab ID (image_1 -> image_T3_1) to avoid
collisions between parallel sub-agents
- Strip HTML error pages (e.g. HF 503) to short status messages
- Show progress widget when command center auto-continues after sub-agents finish
- Only reuse agent tabs when both task_id and agent type match
- Nudge command center and image agent to only generate images when explicitly asked
- Add GLM-5 and Qwen3.5-397B models, update default agent assignments
- Update README with install/docker/env docs, add privacy notice to login

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

README.md CHANGED
@@ -12,13 +12,40 @@ header: mini
 
 A multi-agent AI interface with code execution, web search, image generation, and deep research — all orchestrated from a single command center.
 
-## Quick Start
+## Local Install
 
 ```bash
-make install            # Install dependencies
-make dev                # Start server at http://localhost:8765
+pip install .           # Install from pyproject.toml
+python -m backend.main  # Start server at http://localhost:8765
 ```
 
+Or use Make shortcuts:
+
+```bash
+make install   # pip install .
+make dev       # Start dev server
+```
+
+Configure API keys in the Settings panel, or set environment variables:
+
+| Variable | Purpose |
+|----------|---------|
+| `LLM_API_KEY` | Default LLM provider token (any OpenAI-compatible API) |
+| `HF_TOKEN` | HuggingFace token (image generation, hosted models) |
+| `E2B_API_KEY` | [E2B](https://e2b.dev) sandbox for code execution |
+| `SERPER_API_KEY` | [Serper](https://serper.dev) for web search |
+
+## Docker
+
+```bash
+docker build -t agent-ui .
+docker run -p 7860:7860 -e LLM_API_KEY=... agent-ui
+```
+
+CLI options: `--port`, `--no-browser`, `--config-dir`, `--workspace-dir`, `--multi-user`.
+
+For HuggingFace Spaces deployment, set `HF_BUCKET` and `HF_BUCKET_TOKEN` secrets for workspace persistence across restarts.
+
 ## Architecture
 
 ```
@@ -292,14 +319,3 @@ All agents communicate via Server-Sent Events. Each event is a JSON object with
 ## Verification
 
 Verify backend imports: `python -c "from backend.command import stream_command_center"`
-
-## Deployment
-
-The app runs as a Docker container (designed for HuggingFace Spaces):
-
-```bash
-docker build -t agent-ui .
-docker run -p 7860:7860 agent-ui
-```
-
-Set API keys via environment variables: `OPENAI_API_KEY`, `E2B_API_KEY`, `SERPER_API_KEY`, `HF_TOKEN`.
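The environment variables in the README table above are all optional (the Settings panel is an alternative). As an illustration only, a small helper (hypothetical name `configured_keys`, not part of the app) can report which documented keys are set:

```python
import os

# Keys from the README table; any OpenAI-compatible token works for LLM_API_KEY.
API_KEYS = ["LLM_API_KEY", "HF_TOKEN", "E2B_API_KEY", "SERPER_API_KEY"]

def configured_keys(env=None):
    """Report which of the documented API keys are set (illustrative sketch)."""
    env = os.environ if env is None else env
    return {k: bool(env.get(k)) for k in API_KEYS}
```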
backend/agents.py CHANGED
@@ -52,7 +52,7 @@ AGENT_REGISTRY = {
     "- **Web agent**: searches, lookups, fact-checking, reading URLs\n"
     "- **Code agent**: data analysis, code execution, visualizations, debugging\n"
     "- **Research agent**: ONLY deep multi-source analysis, comparisons, reports\n"
-    "- **Image agent**: generating or editing images\n\n"
+    "- **Image agent**: generating or editing images (ONLY when the user explicitly asks to generate/create an image — never for finding/showing existing photos)\n\n"
     "When delegating, provide a clear objective, scope boundaries, and expected output format.\n\n"
     "## Task Decomposition — ALWAYS parallelize\n\n"
     "**RULE: When a request mentions multiple distinct entities or topics, "
@@ -239,8 +239,10 @@ AGENT_REGISTRY = {
     "SVG NOT supported. Returns image reference.\n\n"
     "## Strategy\n\n"
     "1. If user provides a URL/file, use read_image first to load it\n"
-    "2. Use generate_image for new images, edit_image to transform existing ones\n"
-    "3. Write detailed prompts. Describe what you see and iterate if needed.\n\n"
+    "2. Use generate_image ONLY when explicitly asked to generate/create an image — "
+    "never use it to \"find\" or \"show\" a photo of something\n"
+    "3. Use edit_image to transform existing ones\n"
+    "4. Write detailed prompts. Describe what you see and iterate if needed.\n\n"
     "## CRITICAL: You MUST provide a <result> tag\n\n"
     "Use <image_1> (self-closing) to embed images in your result.\n\n"
     "<result>\n"
@@ -340,6 +342,10 @@ def parse_llm_error(error: Exception) -> dict:
        pass
 
    retryable = any(x in error_str.lower() for x in ["429", "rate limit", "too many requests", "overloaded", "high traffic"])
+   # Strip HTML error pages (e.g. 503 from HuggingFace) to a short message
+   if "<html" in error_str.lower():
+       status_match = _re.search(r'(\d{3})', error_str)
+       error_str = f"Service error (HTTP {status_match.group(1)})" if status_match else "Service unavailable"
    return {"message": error_str, "type": "unknown_error", "retryable": retryable}
 
 
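The HTML-stripping branch added to `parse_llm_error` can be exercised in isolation. The sketch below mirrors the diff's logic as a standalone function (using `re` directly rather than the module's `_re` alias, and without the surrounding retry/JSON handling):

```python
import re

def strip_html_error(error_str: str) -> str:
    """Collapse an HTML error page (e.g. an HF 503 page) to a short status
    message; pass plain-text errors through unchanged. Illustrative sketch
    mirroring the commit's logic, not the actual backend.agents function."""
    if "<html" in error_str.lower():
        # First 3-digit run is treated as the HTTP status code
        status_match = re.search(r'(\d{3})', error_str)
        return (f"Service error (HTTP {status_match.group(1)})"
                if status_match else "Service unavailable")
    return error_str
```

Note the case-insensitive `"<html"` check, so both `<html>` and `<HTML>` pages are caught.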
backend/main.py CHANGED
@@ -306,6 +306,8 @@ app.add_middleware(
 class Message(BaseModel):
     role: str
     content: str
+    tool_call_id: Optional[str] = None  # Required for role="tool" messages
+    tool_calls: Optional[List[Dict]] = None  # Required for assistant messages with tool use
 
 
 class FrontendContext(BaseModel):
@@ -719,7 +721,14 @@ async def stream_chat_response(
         ) as response:
             if response.status_code != 200:
                 error_text = await response.aread()
-                error_detail = error_text.decode() if error_text else f"Status {response.status_code}"
+                error_detail = error_text.decode() if error_text else ""
+                # Try to extract JSON error message; fall back to short status text
+                try:
+                    error_detail = json.loads(error_detail).get("error", {}).get("message", error_detail)
+                except (json.JSONDecodeError, AttributeError):
+                    pass
+                if "<html" in error_detail.lower():
+                    error_detail = f"Status {response.status_code}"
                 error_message = f"LLM API error ({response.status_code}): {error_detail}"
                 logger.error(f"LLM API error: {error_message}")
                 yield f"data: {json.dumps({'type': 'error', 'content': error_message})}\n\n"
@@ -853,8 +862,15 @@ async def chat_stream(raw_request: Request, request: ChatRequest):
     user_id = get_user_id(raw_request)
     files_root = get_user_files_root(user_id)
 
-    # Convert Pydantic models to dicts
-    messages = [{"role": msg.role, "content": msg.content} for msg in request.messages]
+    # Convert Pydantic models to dicts, preserving tool call fields
+    messages = []
+    for msg in request.messages:
+        m = {"role": msg.role, "content": msg.content}
+        if msg.tool_call_id is not None:
+            m["tool_call_id"] = msg.tool_call_id
+        if msg.tool_calls is not None:
+            m["tool_calls"] = msg.tool_calls
+        messages.append(m)
 
     # Get tab_id for debugging (prefixed with user_id for dict isolation)
     tab_id = request.agent_id or "0"
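OpenAI-style chat APIs reject a `role="tool"` message whose `tool_call_id` has no matching `tool_calls` entry on a preceding assistant message, which is the 400 this commit fixes: the old conversion dropped both fields. A standalone sketch of the preserving conversion, with a dataclass standing in for the Pydantic model:

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass
class Msg:
    """Stand-in for the Pydantic Message model (illustrative only)."""
    role: str
    content: str
    tool_call_id: Optional[str] = None
    tool_calls: Optional[List[Dict]] = None

def to_api_messages(messages):
    """Convert messages to dicts, keeping tool-call fields so the API can
    pair each role="tool" reply with its originating assistant tool call."""
    out = []
    for msg in messages:
        m = {"role": msg.role, "content": msg.content}
        if msg.tool_call_id is not None:
            m["tool_call_id"] = msg.tool_call_id
        if msg.tool_calls is not None:
            m["tool_calls"] = msg.tool_calls
        out.append(m)
    return out
```

Omitting the keys entirely when unset (rather than sending `None`) keeps plain user/assistant messages byte-identical to before.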
frontend/index.html CHANGED
@@ -526,6 +526,7 @@
         <input type="search" id="usernameInput" name="display_nickname" placeholder="Your name" maxlength="30" autocomplete="off" autocorrect="off" autocapitalize="off" spellcheck="false" data-1p-ignore data-lpignore="true" data-bwignore data-form-type="other" role="presentation">
         <div id="usernameWarning" class="username-warning" style="display:none"></div>
         <button id="usernameSubmit">Start</button>
+        <p class="username-notice">All sessions are publicly stored. For private use, <a href="https://github.com/huggingface/agent-ui" target="_blank">clone the repo</a> and run locally.</p>
       </div>
     </div>
 
frontend/streaming.js CHANGED
@@ -173,24 +173,47 @@ async function streamChatResponse(messages, chatContainer, agentType, tabId) {
             // Still generating - no action needed
 
         } else if (data.type === 'result') {
-            // Populate global figure/image registry only for items referenced in result content
-            const resultText = data.content || '';
+            // Namespace figure/image references with tab ID to avoid collisions
+            // between parallel agents (e.g., image_1 -> image_T3_1)
+            const prefix = `T${tabId}_`;
+            let resultText = data.content || '';
+            const namespacedFigures = {};
+            const namespacedImages = {};
+
             if (data.figures) {
                 for (const [name, figData] of Object.entries(data.figures)) {
-                    if (new RegExp(`</?${name}>`, 'i').test(resultText)) {
-                        globalFigureRegistry[name] = figData;
-                    }
+                    const nsName = name.replace(/^(figure_)/, `$1${prefix}`);
+                    resultText = resultText.replace(new RegExp(`(</?)(${name})(>)`, 'gi'), `$1${nsName}$3`);
+                    namespacedFigures[nsName] = figData;
                 }
             }
             if (data.images) {
                 for (const [name, imgBase64] of Object.entries(data.images)) {
-                    if (new RegExp(`</?${name}>`, 'i').test(resultText)) {
-                        globalFigureRegistry[name] = { type: 'png', data: imgBase64 };
-                    }
+                    const nsName = name.replace(/^(image_)/, `$1${prefix}`);
+                    resultText = resultText.replace(new RegExp(`(</?)(${name})(>)`, 'gi'), `$1${nsName}$3`);
+                    namespacedImages[nsName] = imgBase64;
+                }
+            }
+
+            // Populate global registry with namespaced names
+            for (const [name, figData] of Object.entries(namespacedFigures)) {
+                if (new RegExp(`</?${name}>`, 'i').test(resultText)) {
+                    globalFigureRegistry[name] = figData;
                 }
             }
+            for (const [name, imgBase64] of Object.entries(namespacedImages)) {
+                if (new RegExp(`</?${name}>`, 'i').test(resultText)) {
+                    globalFigureRegistry[name] = { type: 'png', data: imgBase64 };
+                }
+            }
+
+            // Update data for downstream consumers with namespaced refs
+            data.content = resultText;
+            data.figures = namespacedFigures;
+            data.images = namespacedImages;
+
             // Agent result - update command center widget
-            updateActionWidgetWithResult(tabId, data.content, data.figures, data.images);
+            updateActionWidgetWithResult(tabId, resultText, namespacedFigures, namespacedImages);
 
         } else if (data.type === 'result_preview') {
             // Show result preview
@@ -1057,12 +1080,16 @@ function handleActionToken(action, message, callback, taskId = null, parentTabId
         const existingContent = document.querySelector(`[data-content-id="${existingTabId}"]`);
 
         if (existingContent) {
-            // Send the message to the existing agent
-            sendMessageToTab(existingTabId, message);
-            if (callback) {
-                callback(existingTabId);
+            // Only reuse if the agent type matches; a different type with the same task_id should create a new tab
+            const existingType = existingContent.querySelector('.chat-container')?.dataset?.agentType;
+            if (existingType === action) {
+                // Send the message to the existing agent
+                sendMessageToTab(existingTabId, message);
+                if (callback) {
+                    callback(existingTabId);
+                }
+                return;
             }
-            return;
         } else {
             // Tab no longer exists, clean up the mapping
             delete taskIdToTabId[taskId];
@@ -1229,7 +1256,7 @@ if (typeof marked !== 'undefined') {
 
     // Resolve <figure_N> and <image_N> references using the global registry
     function resolveGlobalFigureRefs(html) {
-        return html.replace(/<\/?(figure_\d+|image_\d+)>/gi, (match) => {
+        return html.replace(/<\/?(figure_(?:T\d+_)?\d+|image_(?:T\d+_)?\d+)>/gi, (match) => {
             // Extract the name (strip < > and /)
             const name = match.replace(/[<>/]/g, '');
             const data = globalFigureRegistry[name];
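The namespacing rewrite in streaming.js is regex-driven: prefix the ref name with `T<tabId>_` and rewrite every `<image_N>` tag in the result text to match. An equivalent Python sketch (illustrative only; the shipped logic is the JavaScript above) shows the transformation for image refs:

```python
import re

def namespace_refs(result_text: str, images: dict, tab_id: int):
    """Rename image_N refs from one tab to image_T<tab>_N, rewriting both
    the registry keys and the <image_N> tags embedded in the result text,
    so parallel sub-agents cannot clobber each other's entries."""
    prefix = f"T{tab_id}_"
    namespaced = {}
    for name, data in images.items():
        ns_name = re.sub(r"^(image_)", rf"\g<1>{prefix}", name)
        # Rewrite <image_N> and </image_N> tags to the namespaced form
        result_text = re.sub(rf"(</?){name}(>)", rf"\g<1>{ns_name}\g<2>",
                             result_text, flags=re.IGNORECASE)
        namespaced[ns_name] = data
    return result_text, namespaced
```

Because the pattern requires `>` right after the name, `image_1` does not accidentally match inside `image_10>`.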
frontend/style.css CHANGED
@@ -4308,6 +4308,22 @@ pre code [class*="token"] {
     text-align: left;
 }
 
+.username-notice {
+    margin: 16px 0 0;
+    font-size: 10px;
+    color: var(--text-muted);
+    line-height: 1.4;
+}
+
+.username-notice a {
+    color: var(--theme-accent);
+    text-decoration: none;
+}
+
+.username-notice a:hover {
+    text-decoration: underline;
+}
+
 .user-indicator-block {
     display: flex;
     align-items: stretch;
frontend/tabs.js CHANGED
@@ -418,6 +418,7 @@ async function continueCommandCenter() {
     if (!chatContainer) return;
 
     setTabGenerating(0, true);
+    showProgressWidget(chatContainer);
 
     const messages = getConversationHistory(chatContainer);
     await streamChatResponse(messages, chatContainer, 'command', 0);
settings.json CHANGED
@@ -49,14 +49,24 @@
       "name": "FLUX.1-Kontext-dev",
       "providerId": "provider_default",
       "modelId": "black-forest-labs/FLUX.1-Kontext-dev"
+    },
+    "model_1773742757137_25jano5d1": {
+      "name": "Qwen3.5-397B-A17B",
+      "providerId": "provider_default",
+      "modelId": "Qwen/Qwen3.5-397B-A17B"
+    },
+    "model_1773742815533_dzfd6vjze": {
+      "name": "GLM-5",
+      "providerId": "provider_default",
+      "modelId": "zai-org/GLM-5"
     }
   },
   "agents": {
-    "command": "model_1768317361934_c2w927ez4",
-    "agent": "model_1768317361934_c2w927ez4",
-    "code": "model_1768317361934_c2w927ez4",
-    "research": "model_1768317361934_c2w927ez4",
-    "image": "model_default"
+    "command": "model_1773742815533_dzfd6vjze",
+    "agent": "model_1773742815533_dzfd6vjze",
+    "code": "model_1773742815533_dzfd6vjze",
+    "research": "model_1773742815533_dzfd6vjze",
+    "image": "model_1773742757137_25jano5d1"
   },
   "e2bKey": "",
   "serperKey": "",
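The `agents` map above stores registry IDs rather than model names, so each agent resolves to a provider `modelId` through the model entries. A minimal resolver sketch (illustrative; the parent key of the model entries is not visible in the hunk, so the name `models` is an assumption):

```python
def resolve_agent_model(settings: dict, agent: str) -> str:
    """Look up an agent's assigned model entry and return its modelId.
    Assumes model entries live under a top-level "models" key (hypothetical)."""
    model_key = settings["agents"][agent]       # e.g. "model_1773742815533_dzfd6vjze"
    return settings["models"][model_key]["modelId"]
```

This indirection is what lets the commit retarget every agent to GLM-5 (and the image agent to Qwen3.5-397B-A17B) by editing only the `agents` block.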