VibecoderMcSwaggins committed on
Commit a3d157a · 1 Parent(s): 153a9c0

docs: remove redundant bug docs (P0/P1/P2 already fixed)

docs/bugs/P0_GRADIO_EXAMPLE_CACHING_CRASH.md DELETED
@@ -1,134 +0,0 @@
- # P0 Bug Report: Gradio Example Caching Crash

- ## Status

- - **Date:** 2025-11-29
- - **Priority:** P0 CRITICAL (Production Down)
- - **Component:** `src/app.py:131`
- - **Environment:** HuggingFace Spaces (Python 3.11, Gradio)

- ## Error Message

- ```text
- AttributeError: 'NoneType' object has no attribute 'strip'
- ```

- ## Full Stack Trace

- ```text
- File "/app/src/app.py", line 131, in research_agent
-     user_api_key = (api_key.strip() or api_key_state.strip()) or None
-                     ^^^^^^^^^^^^^
- AttributeError: 'NoneType' object has no attribute 'strip'
- ```

- ## Root Cause Analysis

- ### The Trigger
- Gradio's example caching mechanism runs the `research_agent` function during startup to pre-cache example outputs. This happens at:

- ```text
- File "/usr/local/lib/python3.11/site-packages/gradio/helpers.py", line 509, in _start_caching
-     await self.cache()
- ```

- ### The Problem
- Our examples only provide values for 2 of the 4 input components:

- ```python
- examples=[
-     ["What is the evidence for testosterone therapy in women with HSDD?", "simple"],
-     ["Promising drug candidates for endometriosis pain management", "simple"],
- ]
- ```

- These map to `[message, mode]` but **NOT** to `api_key` or `api_key_state`.

- When Gradio runs the function for caching, it passes `None` for the unprovided parameters:

- ```python
- async def research_agent(
-     message: str,            # ✅ Provided by example
-     history: list[...],      # ✅ Empty list default
-     mode: str = "simple",    # ✅ Provided by example
-     api_key: str = "",       # ❌ Becomes None during caching!
-     api_key_state: str = ""  # ❌ Becomes None during caching!
- ) -> AsyncGenerator[...]:
- ```

- ### The Crash
- Line 131 attempts to call `.strip()` on `None`:

- ```python
- user_api_key = (api_key.strip() or api_key_state.strip()) or None
- #               ^^^^^^^^^^^^^
- #               NoneType has no attribute 'strip'
- ```

- ## Gradio Warning (Ignored)

- Gradio actually warned us about this:

- ```text
- UserWarning: Examples will be cached but not all input components have
- example values. This may result in an exception being thrown by your function.
- ```

- ## Solution

- ### Option A: Defensive None Handling (Recommended)
- Add None guards before calling `.strip()`:

- ```python
- # Handle None values from Gradio example caching
- api_key_str = api_key or ""
- api_key_state_str = api_key_state or ""
- user_api_key = (api_key_str.strip() or api_key_state_str.strip()) or None
- ```

- ### Option B: Disable Example Caching
- Set `cache_examples=False` in ChatInterface:

- ```python
- gr.ChatInterface(
-     fn=research_agent,
-     examples=[...],
-     cache_examples=False,  # Disable caching
- )
- ```

- This avoids the crash but loses the UX benefit of pre-cached examples.

- ### Option C: Provide Full Example Values
- Include all 4 columns in examples:

- ```python
- examples=[
-     ["What is the evidence...", "simple", "", ""],  # [msg, mode, api_key, state]
- ]
- ```

- This is verbose and exposes internal state to users.

- ## Recommendation

- **Option A** is the cleanest fix. It:

- 1. Maintains cached examples for fast UX
- 2. Handles edge cases defensively
- 3. Doesn't expose internal state in examples

- ## Pre-Merge Checklist

- - [ ] Fix applied to `src/app.py`
- - [ ] Unit test added for None parameter handling
- - [ ] `make check` passes
- - [ ] Test locally with `uv run python -m src.app`
- - [ ] Verify example caching works without crash
- - [ ] Deploy to HuggingFace Spaces
- - [ ] Verify Space starts without error

- ## Lessons Learned

- 1. Always test Gradio apps with example caching enabled locally before deploying
- 2. Gradio's "partial examples" feature passes `None` for missing columns
- 3. Default parameter values (`str = ""`) are ignored when Gradio explicitly passes `None`
- 4. The Gradio warning about missing example values should be treated as an error
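
Lesson 3 is easy to verify in isolation. A minimal sketch, using hypothetical stand-in names (`resolve_api_key`, `resolve_api_key_unguarded`) for the logic at `src/app.py:131` and its Option A fix:

```python
def resolve_api_key_unguarded(api_key: str = "", api_key_state: str = ""):
    # Original line 131: crashes when a caller passes None explicitly
    return (api_key.strip() or api_key_state.strip()) or None


def resolve_api_key(api_key: str = "", api_key_state: str = ""):
    # Option A guard: coerce None to "" before calling .strip()
    api_key_str = api_key or ""
    api_key_state_str = api_key_state or ""
    return (api_key_str.strip() or api_key_state_str.strip()) or None


# Defaults apply only when arguments are omitted...
assert resolve_api_key_unguarded() is None

# ...but an explicit None (as Gradio's example cache passes) bypasses the defaults:
try:
    resolve_api_key_unguarded(None, None)
    raise AssertionError("expected AttributeError")
except AttributeError:
    pass  # 'NoneType' object has no attribute 'strip'

# The guarded version tolerates None, "", and whitespace:
assert resolve_api_key(None, None) is None
assert resolve_api_key("  sk-123  ", None) == "sk-123"
```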
 
docs/bugs/P1_MULTIPLE_UX_BUGS.md DELETED
@@ -1,49 +0,0 @@
- # P1 Bug Report: Multiple UX and Configuration Issues

- ## Status

- - **Date:** 2025-11-29
- - **Priority:** P1 (Multiple user-facing issues)
- - **Components:** `src/app.py`, `src/orchestrator_magentic.py`

- ## Resolved Issues (Fixed 2025-11-29)

- ### Bug 1: API Key Cleared When Clicking Examples
- **Fixed.** Updated `examples` in `app.py` to include explicit `None` values for the additional inputs. Gradio preserves a component's current value when the example value is `None`.
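
A sketch of what the fixed `examples` list could look like (row text taken from the app's examples; the column layout `[message, mode, api_key, api_key_state]` is assumed from the P0 report):

```python
# Each row supplies one value per input component. None means "leave this
# component's current value alone", so clicking an example no longer clears
# a pasted API key.
examples = [
    ["What is the evidence for testosterone therapy in women with HSDD "
     "(Hypoactive Sexual Desire Disorder)?", "simple", None, None],
    ["Promising drug candidates for endometriosis pain management",
     "simple", None, None],
]

# Passed to the interface as usual, e.g.:
# gr.ChatInterface(fn=research_agent, examples=examples, ...)
assert all(len(row) == 4 for row in examples)
```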

- ### Bug 2: No Loading/Processing Indicator
- **Fixed.** `research_agent` yields an immediate "⏳ Processing..." message before starting the orchestrator.
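
The pattern can be sketched as follows; the orchestrator call is replaced by a placeholder `asyncio.sleep`, and the function body is illustrative, not the actual `app.py` implementation:

```python
import asyncio
from collections.abc import AsyncGenerator


async def research_agent(message: str, history: list) -> AsyncGenerator[str, None]:
    # Yield immediately so Gradio renders feedback before any slow work begins
    yield "⏳ Processing..."
    await asyncio.sleep(0.01)  # placeholder for the real orchestrator run
    yield f"Results for: {message}"


async def _demo() -> list[str]:
    return [chunk async for chunk in research_agent("endometriosis", [])]


chunks = asyncio.run(_demo())
assert chunks[0] == "⏳ Processing..."
assert chunks[-1] == "Results for: endometriosis"
```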

- ### Bug 3: Advanced Mode Temperature Error
- **Fixed.** Explicitly set `temperature=1.0` for all Magentic agents in `src/agents/magentic_agents.py`. This is compatible with OpenAI reasoning models (o1/o3), which require `temperature=1` and were rejecting the default (likely 0.3 or `None`).
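
A sketch of the idea, with a hypothetical helper (`agent_llm_params` is not the real function in `magentic_agents.py`): centralizing the explicit value avoids any library default leaking through to a reasoning model.

```python
# o1/o3 reasoning models reject any temperature other than 1
REASONING_SAFE_TEMPERATURE = 1.0


def agent_llm_params(model: str) -> dict:
    # Always pass temperature explicitly rather than relying on a default
    # (e.g. 0.3 or None) that reasoning models would reject.
    return {"model": model, "temperature": REASONING_SAFE_TEMPERATURE}


assert agent_llm_params("o3-mini")["temperature"] == 1.0
```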

- ### Bug 4: HSDD Acronym Not Spelled Out
- **Fixed.** Updated example text in `app.py` to "HSDD (Hypoactive Sexual Desire Disorder)".

- ---

- ## Open / Deferred Issues

- ### Bug 5: Free Tier Quota Exhausted (UX Improvement)
- **Deferred.** Currently shows the standard error message. Improve if users report confusion.

- ### Bug 6: Asyncio File Descriptor Warnings
- **Won't Fix.** Cosmetic issue only.

- ---

- ## Priority Order (Completed)

- 1. **Bug 4 (HSDD)** - Fixed
- 2. **Bug 2 (Loading indicator)** - Fixed
- 3. **Bug 3 (Temperature)** - Fixed
- 4. **Bug 1 (API key)** - Fixed

- ---

- ## Test Plan

- - [x] Fix HSDD acronym
- - [x] Add loading indicator yield
- - [x] Test advanced mode with temperature fix (static analysis/code change)
- - [x] Research Gradio example behavior for API key (implemented `None` fix)
- - [ ] Run `make check`
- - [ ] Deploy and test on HuggingFace Spaces
 
docs/bugs/P2_MAGENTIC_THINKING_STATE.md DELETED
@@ -1,232 +0,0 @@
- # P2 Bug Report: Advanced Mode Missing "Thinking" State

- ## Status

- - **Date:** 2025-11-29
- - **Priority:** P2 (UX polish, not blocking functionality)
- - **Component:** `src/orchestrator_magentic.py`, `src/app.py`

- ---

- ## Symptoms

- User experience in **Advanced (Magentic) mode**:

- 1. Click example or submit query
- 2. See: `🚀 **STARTED**: Starting research (Magentic mode)...`
- 3. **2+ minutes of nothing** (no spinner, no progress, no indication work is happening)
- 4. Eventually see: `🧠 **JUDGING**: Manager (user_task)...`

- **User perception:** "Is it frozen? Did it crash?"

- ### Container Logs Confirm Work IS Happening

- ```text
- 14:54:22 [info] Starting Magentic orchestrator query='...'
- 14:54:22 [info] Embedding service enabled
- ... 2+ MINUTES OF SILENCE (agent-framework doing internal LLM calls) ...
- 14:56:38 [info] Creating orchestrator mode=advanced
- ```

- The silence is because `workflow.run_stream()` doesn't yield events during its setup phase.

- ---

- ## Root Cause Analysis

- ### Current Flow (`src/orchestrator_magentic.py`)

- ```python
- async def run(self, query: str) -> AsyncGenerator[AgentEvent, None]:
-     # 1. Immediately yields "started"
-     yield AgentEvent(type="started", message=f"Starting research (Magentic mode): {query}")

-     # 2. Setup (fast, no yield needed)
-     embedding_service = self._init_embedding_service()
-     init_magentic_state(embedding_service)
-     workflow = self._build_workflow()

-     # 3. GAP: workflow.run_stream() blocks for 2+ minutes before first event
-     async for event in workflow.run_stream(task):  # <-- THE BOTTLENECK
-         yield self._process_event(event)
- ```

- The `agent-framework`'s `workflow.run_stream()` is calling OpenAI's API, building the manager prompt, coordinating agents, etc. **It doesn't yield events during this setup phase**.

- ---

- ## Gold Standard UX (What We'd Want)

- ### Gradio's Native Thinking Support

- Per [Gradio Chatbot Docs](https://www.gradio.app/docs/gradio/chatbot):

- > "The Gradio Chatbot can natively display intermediate thoughts and tool usage in a collapsible accordion next to a chat message. This makes it perfect for creating UIs for LLM agents and chain-of-thought (CoT) or reasoning demos."

- **Features available:**

- - `gr.ChatMessage` with `metadata={"status": "pending"}` shows a spinner
- - `metadata={"title": "Thinking...", "status": "pending"}` creates a collapsible accordion
- - Nested thoughts via `id` and `parent_id`
- - `duration` metadata shows time spent

- **Example from Gradio docs:**

- ```python
- import gradio as gr

- def chat_fn(message, history):
-     # Yield thinking state with spinner
-     yield gr.ChatMessage(
-         role="assistant",
-         metadata={"title": "🧠 Thinking...", "status": "pending"}
-     )

-     # Do work...

-     # Update with completed thought
-     yield gr.ChatMessage(
-         role="assistant",
-         content="Analysis complete",
-         metadata={"title": "🧠 Thinking...", "status": "done", "duration": 5.2}
-     )

-     yield "Here's the final answer..."
- ```

- ---

- ## Why This is Complex for DeepBoner

- ### Constraint 1: ChatInterface Returns Strings
- Our `research_agent()` yields plain strings:

- ```python
- yield f"🧠 **Backend**: {backend_name}\n\n"
- yield "⏳ **Processing...** Searching PubMed...\n"
- yield "\n\n".join(response_parts)
- ```

- Converting to `gr.ChatMessage` objects would require refactoring the entire response pipeline.

- ### Constraint 2: Agent-Framework is the Bottleneck
- The 2-minute gap is inside `workflow.run_stream(task)`, which is the `agent-framework` library. We can't inject yields into a third-party library's blocking call.

- ### Constraint 3: ChatInterface vs Blocks
- `gr.ChatInterface` is a convenience wrapper. The full `gr.ChatMessage` metadata features work best with raw `gr.Blocks` + `gr.Chatbot` components.

- ---

- ## Options

- ### Option A: Yield "Thinking" Before Blocking Call (Recommended)
- **Effort:** 5 minutes
- **Impact:** Users see *something* while waiting

- ```python
- # In src/orchestrator_magentic.py
- async def run(self, query: str) -> AsyncGenerator[AgentEvent, None]:
-     yield AgentEvent(type="started", message=f"Starting research (Magentic mode): {query}")

-     # NEW: Yield thinking state before the blocking call
-     yield AgentEvent(
-         type="thinking",  # New event type
-         message="🧠 Agents are reasoning... This may take 2-5 minutes for complex queries.",
-         iteration=0,
-     )

-     # ... rest of setup ...

-     async for event in workflow.run_stream(task):
-         yield self._process_event(event)
- ```

- **Pros:**

- - Simple, doesn't require Gradio changes
- - Works with the current string-based approach
- - Sets user expectations ("2-5 minutes")

- **Cons:**

- - No spinner/animation (static text)
- - Doesn't show real-time progress during the gap

- ### Option B: Use `gr.ChatMessage` with Metadata (Major Refactor)
- **Effort:** 2-4 hours
- **Impact:** Full gold-standard UX

- Would require:

- 1. Changing `research_agent()` to yield `gr.ChatMessage` objects
- 2. Adding thinking states with `metadata={"status": "pending"}`
- 3. Updating all event handlers to produce proper ChatMessage objects

- ### Option C: Heartbeat/Polling (Over-Engineering)
- **Effort:** 4+ hours
- **Impact:** Spinner during blocking call

- Create a background task that yields "still working..." every 10 seconds while waiting for the agent-framework. Requires:

- - `asyncio.create_task()` for the heartbeat
- - Task cancellation when real events arrive
- - Proper cleanup

- **Verdict:** Over-engineering for a demo.
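
For reference, even a stripped-down heartbeat wrapper shows the moving parts involved. A sketch under assumed names, with the 2-minute gap and 10-second interval shortened for illustration:

```python
import asyncio
from collections.abc import AsyncGenerator, AsyncIterator


async def with_heartbeat(
    source: AsyncIterator[str], interval: float = 10.0
) -> AsyncGenerator[str, None]:
    """Forward items from `source`, yielding a heartbeat string whenever
    no item arrives within `interval` seconds."""
    it = source.__aiter__()
    while True:
        task = asyncio.ensure_future(it.__anext__())
        while True:
            try:
                # shield() so the timeout doesn't cancel the in-flight __anext__()
                item = await asyncio.wait_for(asyncio.shield(task), timeout=interval)
            except asyncio.TimeoutError:
                yield "⏳ still working..."
                continue  # keep waiting on the same pending item
            except StopAsyncIteration:
                return
            yield item
            break


async def _demo() -> list[str]:
    async def slow_workflow() -> AsyncGenerator[str, None]:
        await asyncio.sleep(0.25)  # stands in for the 2-minute setup gap
        yield "real event"

    return [e async for e in with_heartbeat(slow_workflow(), interval=0.1)]


events = asyncio.run(_demo())
assert "⏳ still working..." in events
assert events[-1] == "real event"
```

The shield/cancellation bookkeeping above is exactly the "proper cleanup" cost the verdict refers to.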

- ### Option D: Accept the Limitation (Document It)
- **Effort:** 0
- **Impact:** None (users still confused)

- Just document that Advanced mode takes 2-5 minutes and users should wait.

- ---

- ## Recommendation

- **Implement Option A** - Add a "thinking" yield before the blocking call.

- It:

- 1. Requires only a minimal code change (5 minutes)
- 2. Sets user expectations clearly
- 3. Doesn't require Gradio refactoring
- 4. Is better than silence

- ---

- ## Implementation Plan

- ### Step 1: Add "thinking" Event Type

- ```python
- # In src/utils/models.py
- class AgentEvent(BaseModel):
-     type: Literal[
-         "started", "thinking", "searching", ...  # Add "thinking"
-     ]
- ```

- ### Step 2: Yield Thinking Event in Magentic Orchestrator

- ```python
- # In src/orchestrator_magentic.py, run() method
- yield AgentEvent(
-     type="thinking",
-     message="🧠 Multi-agent reasoning in progress... This may take 2-5 minutes.",
-     iteration=0,
- )
- ```

- ### Step 3: Handle in App

- ```python
- # In src/app.py, research_agent()
- if event.type == "thinking":
-     yield f"⏳ {event.message}"
- ```

- ---

- ## Test Plan

- - [ ] Add `"thinking"` to AgentEvent type literals
- - [ ] Add yield before `workflow.run_stream()`
- - [ ] Handle in app.py
- - [ ] `make check` passes
- - [ ] Manual test: Advanced mode shows "reasoning in progress" message
- - [ ] Deploy to HuggingFace, verify UX improvement

- ---

- ## References

- - [Gradio ChatInterface Docs](https://www.gradio.app/docs/gradio/chatinterface)
- - [Gradio Chatbot Metadata](https://www.gradio.app/docs/gradio/chatbot)
- - [Agents and Tool Usage Guide](https://www.gradio.app/guides/agents-and-tool-usage)
- - [GitHub Issue: Streaming text not working](https://github.com/gradio-app/gradio/issues/11443)