lvwerra (HF Staff) and Claude Opus 4.6 committed
Commit 4da424f · 1 parent: c48a8e5

Fix tool message handling, parallel image refs, error display, and UX polish


- Preserve tool_call_id/tool_calls in Message model so command center history
doesn't break on subsequent LLM calls (400 error fix)
- Namespace image/figure refs with tab ID (image_1 -> image_T3_1) to avoid
collisions between parallel sub-agents
- Strip HTML error pages (e.g. HF 503) to short status messages
- Show progress widget when command center auto-continues after sub-agents finish
- Only reuse agent tabs when both task_id and agent type match
- Nudge command center and image agent to only generate images when explicitly asked
- Add GLM-5 and Qwen3.5-397B models, update default agent assignments
- Update README with install/docker/env docs, add privacy notice to login

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

README.md CHANGED
@@ -12,13 +12,40 @@ header: mini
 
 A multi-agent AI interface with code execution, web search, image generation, and deep research — all orchestrated from a single command center.
 
-## Quick Start
+## Local Install
 
 ```bash
-make install            # Install dependencies
-make dev                # Start server at http://localhost:8765
+pip install .           # Install from pyproject.toml
+python -m backend.main  # Start server at http://localhost:8765
 ```
 
+Or use Make shortcuts:
+
+```bash
+make install   # pip install .
+make dev       # Start dev server
+```
+
+Configure API keys in the Settings panel, or set environment variables:
+
+| Variable | Purpose |
+|----------|---------|
+| `LLM_API_KEY` | Default LLM provider token (any OpenAI-compatible API) |
+| `HF_TOKEN` | HuggingFace token (image generation, hosted models) |
+| `E2B_API_KEY` | [E2B](https://e2b.dev) sandbox for code execution |
+| `SERPER_API_KEY` | [Serper](https://serper.dev) for web search |
+
+## Docker
+
+```bash
+docker build -t agent-ui .
+docker run -p 7860:7860 -e LLM_API_KEY=... agent-ui
+```
+
+CLI options: `--port`, `--no-browser`, `--config-dir`, `--workspace-dir`, `--multi-user`.
+
+For HuggingFace Spaces deployment, set `HF_BUCKET` and `HF_BUCKET_TOKEN` secrets for workspace persistence across restarts.
+
 ## Architecture
 
 ```
@@ -292,14 +319,3 @@ All agents communicate via Server-Sent Events. Each event is a JSON object with
 ## Verification
 
 Verify backend imports: `python -c "from backend.command import stream_command_center"`
-
-## Deployment
-
-The app runs as a Docker container (designed for HuggingFace Spaces):
-
-```bash
-docker build -t agent-ui .
-docker run -p 7860:7860 agent-ui
-```
-
-Set API keys via environment variables: `OPENAI_API_KEY`, `E2B_API_KEY`, `SERPER_API_KEY`, `HF_TOKEN`.
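The environment variables in the README table above are all optional (the Settings panel is an alternative). As an illustration only, a small helper (hypothetical name `configured_keys`, not part of the app) can report which documented keys are set:

```python
import os

# Keys from the README table; any OpenAI-compatible token works for LLM_API_KEY.
API_KEYS = ["LLM_API_KEY", "HF_TOKEN", "E2B_API_KEY", "SERPER_API_KEY"]

def configured_keys(env=None):
    """Report which of the documented API keys are set (illustrative sketch)."""
    env = os.environ if env is None else env
    return {k: bool(env.get(k)) for k in API_KEYS}
```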
backend/agents.py CHANGED
@@ -52,7 +52,7 @@ AGENT_REGISTRY = {
     "- **Web agent**: searches, lookups, fact-checking, reading URLs\n"
     "- **Code agent**: data analysis, code execution, visualizations, debugging\n"
     "- **Research agent**: ONLY deep multi-source analysis, comparisons, reports\n"
-    "- **Image agent**: generating or editing images\n\n"
+    "- **Image agent**: generating or editing images (ONLY when the user explicitly asks to generate/create an image — never for finding/showing existing photos)\n\n"
     "When delegating, provide a clear objective, scope boundaries, and expected output format.\n\n"
     "## Task Decomposition — ALWAYS parallelize\n\n"
     "**RULE: When a request mentions multiple distinct entities or topics, "
@@ -239,8 +239,10 @@ AGENT_REGISTRY = {
     "SVG NOT supported. Returns image reference.\n\n"
     "## Strategy\n\n"
     "1. If user provides a URL/file, use read_image first to load it\n"
-    "2. Use generate_image for new images, edit_image to transform existing ones\n"
-    "3. Write detailed prompts. Describe what you see and iterate if needed.\n\n"
+    "2. Use generate_image ONLY when explicitly asked to generate/create an image — "
+    "never use it to \"find\" or \"show\" a photo of something\n"
+    "3. Use edit_image to transform existing ones\n"
+    "4. Write detailed prompts. Describe what you see and iterate if needed.\n\n"
     "## CRITICAL: You MUST provide a <result> tag\n\n"
     "Use <image_1> (self-closing) to embed images in your result.\n\n"
     "<result>\n"
@@ -340,6 +342,10 @@ def parse_llm_error(error: Exception) -> dict:
        pass
 
    retryable = any(x in error_str.lower() for x in ["429", "rate limit", "too many requests", "overloaded", "high traffic"])
+   # Strip HTML error pages (e.g. 503 from HuggingFace) to a short message
+   if "<html" in error_str.lower():
+       status_match = _re.search(r'(\d{3})', error_str)
+       error_str = f"Service error (HTTP {status_match.group(1)})" if status_match else "Service unavailable"
    return {"message": error_str, "type": "unknown_error", "retryable": retryable}
 
 
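The HTML-stripping branch added to `parse_llm_error` can be exercised in isolation. The sketch below mirrors the diff's logic as a standalone function (using `re` directly rather than the module's `_re` alias, and without the surrounding retry/JSON handling):

```python
import re

def strip_html_error(error_str: str) -> str:
    """Collapse an HTML error page (e.g. an HF 503 page) to a short status
    message; pass plain-text errors through unchanged. Illustrative sketch
    mirroring the commit's logic, not the actual backend.agents function."""
    if "<html" in error_str.lower():
        # First 3-digit run is treated as the HTTP status code
        status_match = re.search(r'(\d{3})', error_str)
        return (f"Service error (HTTP {status_match.group(1)})"
                if status_match else "Service unavailable")
    return error_str
```

Note the case-insensitive `"<html"` check, so both `<html>` and `<HTML>` pages are caught.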
backend/main.py CHANGED
@@ -306,6 +306,8 @@ app.add_middleware(
 class Message(BaseModel):
     role: str
     content: str
+    tool_call_id: Optional[str] = None  # Required for role="tool" messages
+    tool_calls: Optional[List[Dict]] = None  # Required for assistant messages with tool use
 
 
 class FrontendContext(BaseModel):
@@ -719,7 +721,14 @@ async def stream_chat_response(
         ) as response:
             if response.status_code != 200:
                 error_text = await response.aread()
-                error_detail = error_text.decode() if error_text else f"Status {response.status_code}"
+                error_detail = error_text.decode() if error_text else ""
+                # Try to extract JSON error message; fall back to short status text
+                try:
+                    error_detail = json.loads(error_detail).get("error", {}).get("message", error_detail)
+                except (json.JSONDecodeError, AttributeError):
+                    pass
+                if "<html" in error_detail.lower():
+                    error_detail = f"Status {response.status_code}"
                 error_message = f"LLM API error ({response.status_code}): {error_detail}"
                 logger.error(f"LLM API error: {error_message}")
                 yield f"data: {json.dumps({'type': 'error', 'content': error_message})}\n\n"
@@ -853,8 +862,15 @@ async def chat_stream(raw_request: Request, request: ChatRequest):
     user_id = get_user_id(raw_request)
     files_root = get_user_files_root(user_id)
 
-    # Convert Pydantic models to dicts
-    messages = [{"role": msg.role, "content": msg.content} for msg in request.messages]
+    # Convert Pydantic models to dicts, preserving tool call fields
+    messages = []
+    for msg in request.messages:
+        m = {"role": msg.role, "content": msg.content}
+        if msg.tool_call_id is not None:
+            m["tool_call_id"] = msg.tool_call_id
+        if msg.tool_calls is not None:
+            m["tool_calls"] = msg.tool_calls
+        messages.append(m)
 
     # Get tab_id for debugging (prefixed with user_id for dict isolation)
     tab_id = request.agent_id or "0"
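OpenAI-style chat APIs reject a `role="tool"` message whose `tool_call_id` has no matching `tool_calls` entry on a preceding assistant message, which is the 400 this commit fixes: the old conversion dropped both fields. A standalone sketch of the preserving conversion, with a dataclass standing in for the Pydantic model:

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass
class Msg:
    """Stand-in for the Pydantic Message model (illustrative only)."""
    role: str
    content: str
    tool_call_id: Optional[str] = None
    tool_calls: Optional[List[Dict]] = None

def to_api_messages(messages):
    """Convert messages to dicts, keeping tool-call fields so the API can
    pair each role="tool" reply with its originating assistant tool call."""
    out = []
    for msg in messages:
        m = {"role": msg.role, "content": msg.content}
        if msg.tool_call_id is not None:
            m["tool_call_id"] = msg.tool_call_id
        if msg.tool_calls is not None:
            m["tool_calls"] = msg.tool_calls
        out.append(m)
    return out
```

Omitting the keys entirely when unset (rather than sending `None`) keeps plain user/assistant messages byte-identical to before.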
frontend/index.html CHANGED
@@ -526,6 +526,7 @@
         <input type="search" id="usernameInput" name="display_nickname" placeholder="Your name" maxlength="30" autocomplete="off" autocorrect="off" autocapitalize="off" spellcheck="false" data-1p-ignore data-lpignore="true" data-bwignore data-form-type="other" role="presentation">
         <div id="usernameWarning" class="username-warning" style="display:none"></div>
         <button id="usernameSubmit">Start</button>
+        <p class="username-notice">All sessions are publicly stored. For private use, <a href="https://github.com/huggingface/agent-ui" target="_blank">clone the repo</a> and run locally.</p>
       </div>
     </div>
 
frontend/streaming.js CHANGED
@@ -173,24 +173,47 @@ async function streamChatResponse(messages, chatContainer, agentType, tabId) {
             // Still generating - no action needed
 
         } else if (data.type === 'result') {
-            // Populate global figure/image registry only for items referenced in result content
-            const resultText = data.content || '';
+            // Namespace figure/image references with tab ID to avoid collisions
+            // between parallel agents (e.g., image_1 -> image_T3_1)
+            const prefix = `T${tabId}_`;
+            let resultText = data.content || '';
+            const namespacedFigures = {};
+            const namespacedImages = {};
+
             if (data.figures) {
                 for (const [name, figData] of Object.entries(data.figures)) {
-                    if (new RegExp(`</?${name}>`, 'i').test(resultText)) {
-                        globalFigureRegistry[name] = figData;
-                    }
+                    const nsName = name.replace(/^(figure_)/, `$1${prefix}`);
+                    resultText = resultText.replace(new RegExp(`(</?)(${name})(>)`, 'gi'), `$1${nsName}$3`);
+                    namespacedFigures[nsName] = figData;
                 }
             }
             if (data.images) {
                 for (const [name, imgBase64] of Object.entries(data.images)) {
-                    if (new RegExp(`</?${name}>`, 'i').test(resultText)) {
-                        globalFigureRegistry[name] = { type: 'png', data: imgBase64 };
-                    }
+                    const nsName = name.replace(/^(image_)/, `$1${prefix}`);
+                    resultText = resultText.replace(new RegExp(`(</?)(${name})(>)`, 'gi'), `$1${nsName}$3`);
+                    namespacedImages[nsName] = imgBase64;
+                }
+            }
+
+            // Populate global registry with namespaced names
+            for (const [name, figData] of Object.entries(namespacedFigures)) {
+                if (new RegExp(`</?${name}>`, 'i').test(resultText)) {
+                    globalFigureRegistry[name] = figData;
                 }
             }
+            for (const [name, imgBase64] of Object.entries(namespacedImages)) {
+                if (new RegExp(`</?${name}>`, 'i').test(resultText)) {
+                    globalFigureRegistry[name] = { type: 'png', data: imgBase64 };
+                }
+            }
+
+            // Update data for downstream consumers with namespaced refs
+            data.content = resultText;
+            data.figures = namespacedFigures;
+            data.images = namespacedImages;
+
             // Agent result - update command center widget
-            updateActionWidgetWithResult(tabId, data.content, data.figures, data.images);
+            updateActionWidgetWithResult(tabId, resultText, namespacedFigures, namespacedImages);
 
         } else if (data.type === 'result_preview') {
             // Show result preview
@@ -1057,12 +1080,16 @@ function handleActionToken(action, message, callback, taskId = null, parentTabId
         const existingContent = document.querySelector(`[data-content-id="${existingTabId}"]`);
 
         if (existingContent) {
-            // Send the message to the existing agent
-            sendMessageToTab(existingTabId, message);
-            if (callback) {
-                callback(existingTabId);
+            // Only reuse if the agent type matches; a different type with the same task_id should create a new tab
+            const existingType = existingContent.querySelector('.chat-container')?.dataset?.agentType;
+            if (existingType === action) {
+                // Send the message to the existing agent
+                sendMessageToTab(existingTabId, message);
+                if (callback) {
+                    callback(existingTabId);
+                }
+                return;
             }
-            return;
         } else {
             // Tab no longer exists, clean up the mapping
             delete taskIdToTabId[taskId];
@@ -1229,7 +1256,7 @@ if (typeof marked !== 'undefined') {
 
     // Resolve <figure_N> and <image_N> references using the global registry
     function resolveGlobalFigureRefs(html) {
-        return html.replace(/<\/?(figure_\d+|image_\d+)>/gi, (match) => {
+        return html.replace(/<\/?(figure_(?:T\d+_)?\d+|image_(?:T\d+_)?\d+)>/gi, (match) => {
             // Extract the name (strip < > and /)
             const name = match.replace(/[<>/]/g, '');
             const data = globalFigureRegistry[name];
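The namespacing rewrite in streaming.js is regex-driven: prefix the ref name with `T<tabId>_` and rewrite every `<image_N>` tag in the result text to match. An equivalent Python sketch (illustrative only; the shipped logic is the JavaScript above) shows the transformation for image refs:

```python
import re

def namespace_refs(result_text: str, images: dict, tab_id: int):
    """Rename image_N refs from one tab to image_T<tab>_N, rewriting both
    the registry keys and the <image_N> tags embedded in the result text,
    so parallel sub-agents cannot clobber each other's entries."""
    prefix = f"T{tab_id}_"
    namespaced = {}
    for name, data in images.items():
        ns_name = re.sub(r"^(image_)", rf"\g<1>{prefix}", name)
        # Rewrite <image_N> and </image_N> tags to the namespaced form
        result_text = re.sub(rf"(</?){name}(>)", rf"\g<1>{ns_name}\g<2>",
                             result_text, flags=re.IGNORECASE)
        namespaced[ns_name] = data
    return result_text, namespaced
```

Because the pattern requires `>` right after the name, `image_1` does not accidentally match inside `image_10>`.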
frontend/style.css CHANGED
@@ -4308,6 +4308,22 @@ pre code [class*="token"] {
     text-align: left;
 }
 
+.username-notice {
+    margin: 16px 0 0;
+    font-size: 10px;
+    color: var(--text-muted);
+    line-height: 1.4;
+}
+
+.username-notice a {
+    color: var(--theme-accent);
+    text-decoration: none;
+}
+
+.username-notice a:hover {
+    text-decoration: underline;
+}
+
 .user-indicator-block {
     display: flex;
     align-items: stretch;
frontend/tabs.js CHANGED
@@ -418,6 +418,7 @@ async function continueCommandCenter() {
     if (!chatContainer) return;
 
     setTabGenerating(0, true);
+    showProgressWidget(chatContainer);
 
     const messages = getConversationHistory(chatContainer);
     await streamChatResponse(messages, chatContainer, 'command', 0);
settings.json CHANGED
@@ -49,14 +49,24 @@
       "name": "FLUX.1-Kontext-dev",
       "providerId": "provider_default",
       "modelId": "black-forest-labs/FLUX.1-Kontext-dev"
+    },
+    "model_1773742757137_25jano5d1": {
+      "name": "Qwen3.5-397B-A17B",
+      "providerId": "provider_default",
+      "modelId": "Qwen/Qwen3.5-397B-A17B"
+    },
+    "model_1773742815533_dzfd6vjze": {
+      "name": "GLM-5",
+      "providerId": "provider_default",
+      "modelId": "zai-org/GLM-5"
     }
   },
   "agents": {
-    "command": "model_1768317361934_c2w927ez4",
-    "agent": "model_1768317361934_c2w927ez4",
-    "code": "model_1768317361934_c2w927ez4",
-    "research": "model_1768317361934_c2w927ez4",
-    "image": "model_default"
+    "command": "model_1773742815533_dzfd6vjze",
+    "agent": "model_1773742815533_dzfd6vjze",
+    "code": "model_1773742815533_dzfd6vjze",
+    "research": "model_1773742815533_dzfd6vjze",
+    "image": "model_1773742757137_25jano5d1"
   },
   "e2bKey": "",
   "serperKey": "",
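The `agents` map above stores registry IDs rather than model names, so each agent resolves to a provider `modelId` through the model entries. A minimal resolver sketch (illustrative; the parent key of the model entries is not visible in the hunk, so the name `models` is an assumption):

```python
def resolve_agent_model(settings: dict, agent: str) -> str:
    """Look up an agent's assigned model entry and return its modelId.
    Assumes model entries live under a top-level "models" key (hypothetical)."""
    model_key = settings["agents"][agent]       # e.g. "model_1773742815533_dzfd6vjze"
    return settings["models"][model_key]["modelId"]
```

This indirection is what lets the commit retarget every agent to GLM-5 (and the image agent to Qwen3.5-397B-A17B) by editing only the `agents` block.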