Commit 5441526
1 Parent(s): 29a6844
fix: multiple UX improvements (P1 bugs)
1. HSDD acronym spelled out (Hypoactive Sexual Desire Disorder)
2. Added loading indicator ("Processing...") for immediate feedback
3. Removed temperature settings from magentic agents for reasoning
model compatibility (o3, o1 only support temperature=1)
4. Bug report documenting remaining issues (API key persistence)
140 tests passing.
- docs/bugs/P1_MULTIPLE_UX_BUGS.md +174 -0
- src/agents/magentic_agents.py +5 -4
- src/app.py +7 -1
docs/bugs/P1_MULTIPLE_UX_BUGS.md
ADDED
@@ -0,0 +1,174 @@
+# P1 Bug Report: Multiple UX and Configuration Issues
+
+## Status
+- **Date:** 2025-11-29
+- **Priority:** P1 (Multiple user-facing issues)
+- **Components:** `src/app.py`, `src/orchestrator_magentic.py`
+
+---
+
+## Bug 1: API Key Cleared When Clicking Examples
+
+### Symptoms
+- User enters API key in textbox
+- User clicks an example prompt
+- API key textbox is cleared/reset
+
+### Root Cause
+Despite examples only having 2 columns `[message, mode]`, Gradio's ChatInterface still resets `additional_inputs` that aren't in the examples list. The comment on lines 273-274 was incorrect:
+
+```python
+# API key persists because examples only include [message, mode] columns,
+# so Gradio doesn't overwrite the api_key textbox when examples are clicked.
+```
+
+This assumption is **wrong** - Gradio resets ALL additional_inputs, not just those with example values.
+
+### Potential Fix
+Option A: Include API key column in examples (set to empty string explicitly)
+```python
+examples=[
+    ["What drugs improve female libido?", "simple", ""],
+    ...
+]
+```
+
+Option B: Use JavaScript to preserve the value (hacky)
+
+Option C: Move API key outside ChatInterface into a separate Blocks layout
+
+### Research Needed
+- Gradio ChatInterface 2025 behavior with partial examples
+- Whether `cache_examples=False` affects this
+
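Option A under "Potential Fix" amounts to giving every example row one value per input. A minimal sketch of that padding step, written without Gradio so it can be run on its own: `pad_examples`, `n_inputs`, and `fill` are hypothetical names that appear in neither the app nor Gradio. Whether an explicit empty string actually keeps the textbox from being reset is still the open question listed under "Research Needed".

```python
from typing import Any


def pad_examples(examples: list[list[Any]], n_inputs: int, fill: Any = "") -> list[list[Any]]:
    """Return a copy of `examples` in which every row has exactly `n_inputs` values.

    Hypothetical helper: short rows are right-padded with `fill` (here an
    empty API key); rows longer than `n_inputs` are rejected.
    """
    padded = []
    for row in examples:
        if len(row) > n_inputs:
            raise ValueError(f"example has {len(row)} values, expected at most {n_inputs}")
        padded.append(list(row) + [fill] * (n_inputs - len(row)))
    return padded


# [message, mode] rows become [message, mode, api_key] rows.
examples = pad_examples([["What drugs improve female libido?", "simple"]], n_inputs=3)
```

Padding in one place keeps the example list readable and avoids repeating `""` in every row.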
+---
+
+## Bug 2: No Loading/Processing Indicator
+
+### Symptoms
+- User submits query
+- UI shows "🚀 STARTED:" message but nothing else
+- No spinner, no "thinking...", no indication work is happening
+- User thinks it's frozen
+
+### Container Logs Show
+Work IS happening:
+```
+[info] Creating orchestrator mode=advanced
+[info] Starting Magentic orchestrator query='...'
+[info] Embedding service enabled
+```
+
+But user sees nothing for 30+ seconds.
+
+### Root Cause
+The Gradio ChatInterface doesn't show intermediate yields quickly enough, and we don't yield a "⏳ Processing..." message immediately.
+
+### Proposed Fix
+Add immediate feedback in `research_agent()`:
+```python
+yield "⏳ **Processing...** Searching PubMed, ClinicalTrials.gov, Europe PMC..."
+```
+
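The proposed fix relies on an async generator emitting a status string before it awaits any slow work. The sketch below shows that pattern in isolation: `slow_search`, `research_agent_sketch`, and `collect` are stand-ins invented for illustration, not the app's real functions, and the sleep only simulates the orchestrator's latency.

```python
import asyncio
from collections.abc import AsyncIterator


async def slow_search(query: str) -> str:
    """Stand-in for the real orchestrator call (an assumption, not the app's code)."""
    await asyncio.sleep(0.05)
    return f"Results for {query!r}"


async def research_agent_sketch(query: str) -> AsyncIterator[str]:
    # Yield before awaiting anything so the consumer has something to render at once.
    yield "⏳ **Processing...**"
    result = await slow_search(query)
    yield result


async def collect(query: str) -> list[str]:
    """Drain the generator into a list, preserving the order of the yields."""
    return [chunk async for chunk in research_agent_sketch(query)]


messages = asyncio.run(collect("testosterone"))
```

The committed change in `src/app.py` repeats the backend line inside the new yield, which suggests each yield is treated as the full message to display. A status yield should therefore probably include whatever text needs to stay visible.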
+---
+
+## Bug 3: Advanced Mode Temperature Error
+
+### Error
+```
+Unsupported value: 'temperature' does not support 0.3 with this model.
+Only the default (1) value is supported.
+```
+
+### Root Cause
+The `agent_framework` (Magentic) is using `temperature=0.3`, but some OpenAI reasoning models (such as `o3` and `o1`) only support `temperature=1`.
+
+### Location
+Likely in `src/orchestrator_magentic.py` or agent-framework configuration.
+
+### Proposed Fix
+1. Detect model type and skip temperature for reasoning models
+2. Or: Remove explicit temperature setting, use model defaults
+3. Or: Catch this error and fall back to default temperature
+
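The first option under "Proposed Fix" (detect the model type and skip temperature) could look roughly like the sketch below. The prefix list comes from the models named in this report. `is_reasoning_model` and `sampling_kwargs` are hypothetical helpers, and matching on the model id prefix is a heuristic assumption, not something the agent framework provides.

```python
REASONING_MODEL_PREFIXES = ("o1", "o3")  # models named in the report; extend as needed


def is_reasoning_model(model_id: str) -> bool:
    """Heuristic: treat ids that start with a known prefix as reasoning models."""
    return model_id.lower().startswith(REASONING_MODEL_PREFIXES)


def sampling_kwargs(model_id: str, temperature: float = 0.3) -> dict[str, float]:
    """Return the keyword arguments to forward; omit temperature for reasoning models."""
    if is_reasoning_model(model_id):
        return {}
    return {"temperature": temperature}
```

Unlike the second option, this keeps the lower temperature for models that accept it, at the cost of maintaining the prefix list.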
+---
+
+## Bug 4: HSDD Acronym Not Spelled Out
+
+### Issue
+Example prompt says:
+```
+"Evidence for testosterone therapy in women with HSDD?"
+```
+
+**HSDD = Hypoactive Sexual Desire Disorder** (low libido condition)
+
+Most users (including doctors!) won't know this acronym.
+
+### Fix
+Change to:
+```
+"Evidence for testosterone therapy in women with HSDD (Hypoactive Sexual Desire Disorder)?"
+```
+
+Also update README if it uses this acronym.
+
+---
+
+## Bug 5: Free Tier Quota Exhausted (Expected Behavior)
+
+### Logs
+```
+[error] HF Quota Exhausted error='402 Client Error: Payment Required...'
+```
+
+### This is NOT a bug
+HuggingFace free tier has limited credits. When exhausted:
+- User should enter their own API key
+- The app correctly falls back to showing evidence without LLM analysis
+
+### UX Improvement
+Show clearer message to user when quota is exhausted:
+```
+⚠️ Free tier quota exceeded. Enter your OpenAI/Anthropic API key above for full analysis.
+```
+
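One way to surface the message above is to translate the quota error into user-facing text at the point where it is caught. This is a rough sketch only: `user_message_for_error` is a hypothetical helper, and matching on the substrings `402` and `Payment Required` is an assumption based on the log line in this report. Checking a structured status code would be more robust if the underlying exception exposes one.

```python
from typing import Optional

QUOTA_MESSAGE = (
    "⚠️ Free tier quota exceeded. "
    "Enter your OpenAI/Anthropic API key above for full analysis."
)


def user_message_for_error(error: Exception) -> Optional[str]:
    """Return a friendly message for quota errors, or None for anything else."""
    text = str(error)
    if "402" in text or "Payment Required" in text:
        return QUOTA_MESSAGE
    return None
```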
+---
+
+## Bug 6: Asyncio File Descriptor Warnings (Low Priority)
+
+### Error
+```
+ValueError: Invalid file descriptor: -1
+Exception ignored in: <function BaseEventLoop.__del__>
+```
+
+### Root Cause
+Event loop cleanup issue in async code. Common when mixing sync/async or when event loops are garbage collected.
+
+### Impact
+**Cosmetic only** - doesn't affect functionality. Just pollutes logs.
+
+### Fix (if desired)
+Close event loops explicitly, or use `asyncio.run()`, which creates and closes the loop itself.
+
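The two cleanup styles mentioned in the fix can be compared side by side. This is a generic illustration with a made-up coroutine (`fetch_answer`), not code from the app, and it is not a confirmed cure for the specific warning above.

```python
import asyncio


async def fetch_answer() -> int:
    """Placeholder coroutine standing in for real async work."""
    await asyncio.sleep(0)
    return 42


# asyncio.run() creates a fresh event loop, runs the coroutine, and closes the loop.
answer = asyncio.run(fetch_answer())

# When a loop has to be managed by hand, close it explicitly in a finally block
# so it is not left for the garbage collector to clean up.
loop = asyncio.new_event_loop()
try:
    same_answer = loop.run_until_complete(fetch_answer())
finally:
    loop.close()
```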
+---
+
+## Priority Order
+
+1. **Bug 4 (HSDD)** - 2 min fix, improves UX immediately
+2. **Bug 2 (Loading indicator)** - 5 min fix, critical for UX
+3. **Bug 3 (Temperature)** - Needs investigation, breaks advanced mode
+4. **Bug 1 (API key)** - Needs Gradio research, workaround exists (enter key after clicking example)
+5. **Bug 5 (Quota message)** - Nice to have
+6. **Bug 6 (Asyncio)** - Low priority, cosmetic
+
+---
+
+## Test Plan
+- [ ] Fix HSDD acronym
+- [ ] Add loading indicator yield
+- [ ] Test advanced mode with temperature fix
+- [ ] Research Gradio example behavior for API key
+- [ ] Run `make check`
+- [ ] Deploy and test on HuggingFace Spaces
src/agents/magentic_agents.py
CHANGED
@@ -46,7 +46,8 @@ Be thorough - search multiple databases when appropriate.
 Focus on finding: mechanisms of action, clinical evidence, and specific drug candidates.""",
         chat_client=client,
         tools=[search_pubmed, search_clinical_trials, search_preprints],
-
+        # Note: temperature removed for compatibility with reasoning models (o3, o1)
+        # which only support temperature=1
     )
 
 
@@ -85,7 +86,7 @@ Be rigorous but fair. Look for:
 - Safety data
 - Drug-drug interactions""",
         chat_client=client,
-
+        # Note: temperature removed for reasoning model compatibility
     )
 
 
@@ -122,7 +123,7 @@ def create_hypothesis_agent(chat_client: OpenAIChatClient | None = None) -> Chat
 
 Focus on mechanistic plausibility and existing evidence.""",
         chat_client=client,
-
+        # Note: temperature removed for reasoning model compatibility
     )
 
 
@@ -180,5 +181,5 @@ Format them as a numbered list.
 Be comprehensive but concise. Cite evidence for all claims.""",
         chat_client=client,
         tools=[get_bibliography],
-        temperature
+        # Note: temperature removed for reasoning model compatibility
     )
src/app.py
CHANGED
@@ -175,6 +175,12 @@ async def research_agent(
 
     yield f"🧠 **Backend**: {backend_name}\n\n"
 
+    # Immediate loading feedback so user knows something is happening
+    yield (
+        f"🧠 **Backend**: {backend_name}\n\n"
+        "⏳ **Processing...** Searching PubMed, ClinicalTrials.gov, Europe PMC...\n"
+    )
+
     async for event in orchestrator.run(message):
         # BUG FIX: Handle streaming events separately to avoid token-by-token spam
         if event.type == "streaming":
@@ -248,7 +254,7 @@ def create_demo() -> tuple[gr.ChatInterface, gr.Accordion]:
                 "advanced",
             ],
             [
-                "
+                "Testosterone therapy for HSDD (Hypoactive Sexual Desire Disorder)?",
                 "simple",
             ],
         ],