VibecoderMcSwaggins committed on
Commit a3d157a · 1 Parent(s): 153a9c0

docs: remove redundant bug docs (P0/P1/P2 already fixed)

docs/bugs/P0_GRADIO_EXAMPLE_CACHING_CRASH.md DELETED
@@ -1,134 +0,0 @@
- # P0 Bug Report: Gradio Example Caching Crash

- ## Status

- - **Date:** 2025-11-29
- - **Priority:** P0 CRITICAL (Production Down)
- - **Component:** `src/app.py:131`
- - **Environment:** HuggingFace Spaces (Python 3.11, Gradio)

- ## Error Message

- ```text
- AttributeError: 'NoneType' object has no attribute 'strip'
- ```

- ## Full Stack Trace

- ```text
- File "/app/src/app.py", line 131, in research_agent
-     user_api_key = (api_key.strip() or api_key_state.strip()) or None
-                     ^^^^^^^^^^^^^
- AttributeError: 'NoneType' object has no attribute 'strip'
- ```

- ## Root Cause Analysis

- ### The Trigger
- Gradio's example caching mechanism runs the `research_agent` function during startup to pre-cache example outputs. This happens at:

- ```text
- File "/usr/local/lib/python3.11/site-packages/gradio/helpers.py", line 509, in _start_caching
-     await self.cache()
- ```

- ### The Problem
- Our examples only provide values for 2 of the 4 input components:

- ```python
- examples=[
-     ["What is the evidence for testosterone therapy in women with HSDD?", "simple"],
-     ["Promising drug candidates for endometriosis pain management", "simple"],
- ]
- ```

- These map to `[message, mode]` but **NOT** to `api_key` or `api_key_state`.

- When Gradio runs the function for caching, it passes `None` for the unprovided parameters:

- ```python
- async def research_agent(
-     message: str,            # ✅ Provided by example
-     history: list[...],      # ✅ Empty list default
-     mode: str = "simple",    # ✅ Provided by example
-     api_key: str = "",       # ❌ Becomes None during caching!
-     api_key_state: str = ""  # ❌ Becomes None during caching!
- ) -> AsyncGenerator[...]:
- ```

- ### The Crash
- Line 131 attempts to call `.strip()` on `None`:

- ```python
- user_api_key = (api_key.strip() or api_key_state.strip()) or None
- #               ^^^^^^^^^^^^^
- #               NoneType has no attribute 'strip'
- ```

- ## Gradio Warning (Ignored)

- Gradio actually warned us about this:

- ```text
- UserWarning: Examples will be cached but not all input components have
- example values. This may result in an exception being thrown by your function.
- ```

- ## Solution

- ### Option A: Defensive None Handling (Recommended)
- Add None guards before calling `.strip()`:

- ```python
- # Handle None values from Gradio example caching
- api_key_str = api_key or ""
- api_key_state_str = api_key_state or ""
- user_api_key = (api_key_str.strip() or api_key_state_str.strip()) or None
- ```

- ### Option B: Disable Example Caching
- Set `cache_examples=False` in ChatInterface:

- ```python
- gr.ChatInterface(
-     fn=research_agent,
-     examples=[...],
-     cache_examples=False,  # Disable caching
- )
- ```

- This avoids the crash but loses the UX benefit of pre-cached examples.

- ### Option C: Provide Full Example Values
- Include all 4 columns in examples:

- ```python
- examples=[
-     ["What is the evidence...", "simple", "", ""],  # [msg, mode, api_key, state]
- ]
- ```

- This is verbose and exposes internal state to users.

- ## Recommendation

- **Option A** is the cleanest fix. It:

- 1. Maintains cached examples for fast UX
- 2. Handles edge cases defensively
- 3. Doesn't expose internal state in examples

- ## Pre-Merge Checklist

- - [ ] Fix applied to `src/app.py`
- - [ ] Unit test added for None parameter handling
- - [ ] `make check` passes
- - [ ] Test locally with `uv run python -m src.app`
- - [ ] Verify example caching works without crash
- - [ ] Deploy to HuggingFace Spaces
- - [ ] Verify Space starts without error

- ## Lessons Learned

- 1. Always test Gradio apps with example caching enabled locally before deploying
- 2. Gradio's "partial examples" feature passes `None` for missing columns
- 3. Default parameter values (`str = ""`) are ignored when Gradio explicitly passes `None`
- 4. The Gradio warning about missing example values should be treated as an error
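
Lesson 3 is easy to verify in isolation. A minimal sketch, using hypothetical stand-in names (`resolve_api_key`, `resolve_api_key_unguarded`) for the logic at `src/app.py:131` and its Option A fix:

```python
def resolve_api_key_unguarded(api_key: str = "", api_key_state: str = ""):
    # Original line 131: crashes when a caller passes None explicitly
    return (api_key.strip() or api_key_state.strip()) or None


def resolve_api_key(api_key: str = "", api_key_state: str = ""):
    # Option A guard: coerce None to "" before calling .strip()
    api_key_str = api_key or ""
    api_key_state_str = api_key_state or ""
    return (api_key_str.strip() or api_key_state_str.strip()) or None


# Defaults apply only when arguments are omitted...
assert resolve_api_key_unguarded() is None

# ...but an explicit None (as Gradio's example cache passes) bypasses the defaults:
try:
    resolve_api_key_unguarded(None, None)
    raise AssertionError("expected AttributeError")
except AttributeError:
    pass  # 'NoneType' object has no attribute 'strip'

# The guarded version tolerates None, "", and whitespace:
assert resolve_api_key(None, None) is None
assert resolve_api_key("  sk-123  ", None) == "sk-123"
```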
 
docs/bugs/P1_MULTIPLE_UX_BUGS.md DELETED
@@ -1,49 +0,0 @@
- # P1 Bug Report: Multiple UX and Configuration Issues

- ## Status

- - **Date:** 2025-11-29
- - **Priority:** P1 (Multiple user-facing issues)
- - **Components:** `src/app.py`, `src/orchestrator_magentic.py`

- ## Resolved Issues (Fixed 2025-11-29)

- ### Bug 1: API Key Cleared When Clicking Examples
- **Fixed.** Updated `examples` in `app.py` to include explicit `None` values for the additional inputs. Gradio preserves a component's current value when the example value is `None`.
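
A sketch of what the fixed `examples` list could look like (row text taken from the app's examples; the column layout `[message, mode, api_key, api_key_state]` is assumed from the P0 report):

```python
# Each row supplies one value per input component. None means "leave this
# component's current value alone", so clicking an example no longer clears
# a pasted API key.
examples = [
    ["What is the evidence for testosterone therapy in women with HSDD "
     "(Hypoactive Sexual Desire Disorder)?", "simple", None, None],
    ["Promising drug candidates for endometriosis pain management",
     "simple", None, None],
]

# Passed to the interface as usual, e.g.:
# gr.ChatInterface(fn=research_agent, examples=examples, ...)
assert all(len(row) == 4 for row in examples)
```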

- ### Bug 2: No Loading/Processing Indicator
- **Fixed.** `research_agent` yields an immediate "⏳ Processing..." message before starting the orchestrator.
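
The pattern can be sketched as follows; the orchestrator call is replaced by a placeholder `asyncio.sleep`, and the function body is illustrative, not the actual `app.py` implementation:

```python
import asyncio
from collections.abc import AsyncGenerator


async def research_agent(message: str, history: list) -> AsyncGenerator[str, None]:
    # Yield immediately so Gradio renders feedback before any slow work begins
    yield "⏳ Processing..."
    await asyncio.sleep(0.01)  # placeholder for the real orchestrator run
    yield f"Results for: {message}"


async def _demo() -> list[str]:
    return [chunk async for chunk in research_agent("endometriosis", [])]


chunks = asyncio.run(_demo())
assert chunks[0] == "⏳ Processing..."
assert chunks[-1] == "Results for: endometriosis"
```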

- ### Bug 3: Advanced Mode Temperature Error
- **Fixed.** Explicitly set `temperature=1.0` for all Magentic agents in `src/agents/magentic_agents.py`. This is compatible with OpenAI reasoning models (o1/o3), which require `temperature=1` and were rejecting the default (likely 0.3 or `None`).
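
A sketch of the idea, with a hypothetical helper (`agent_llm_params` is not the real function in `magentic_agents.py`): centralizing the explicit value avoids any library default leaking through to a reasoning model.

```python
# o1/o3 reasoning models reject any temperature other than 1
REASONING_SAFE_TEMPERATURE = 1.0


def agent_llm_params(model: str) -> dict:
    # Always pass temperature explicitly rather than relying on a default
    # (e.g. 0.3 or None) that reasoning models would reject.
    return {"model": model, "temperature": REASONING_SAFE_TEMPERATURE}


assert agent_llm_params("o3-mini")["temperature"] == 1.0
```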

- ### Bug 4: HSDD Acronym Not Spelled Out
- **Fixed.** Updated example text in `app.py` to "HSDD (Hypoactive Sexual Desire Disorder)".

- ---

- ## Open / Deferred Issues

- ### Bug 5: Free Tier Quota Exhausted (UX Improvement)
- **Deferred.** Currently shows the standard error message. Improve if users report confusion.

- ### Bug 6: Asyncio File Descriptor Warnings
- **Won't Fix.** Cosmetic issue only.

- ---

- ## Priority Order (Completed)

- 1. **Bug 4 (HSDD)** - Fixed
- 2. **Bug 2 (Loading indicator)** - Fixed
- 3. **Bug 3 (Temperature)** - Fixed
- 4. **Bug 1 (API key)** - Fixed

- ---

- ## Test Plan

- - [x] Fix HSDD acronym
- - [x] Add loading indicator yield
- - [x] Test advanced mode with temperature fix (static analysis/code change)
- - [x] Research Gradio example behavior for API key (implemented `None` fix)
- - [ ] Run `make check`
- - [ ] Deploy and test on HuggingFace Spaces
 
docs/bugs/P2_MAGENTIC_THINKING_STATE.md DELETED
@@ -1,232 +0,0 @@
- # P2 Bug Report: Advanced Mode Missing "Thinking" State

- ## Status

- - **Date:** 2025-11-29
- - **Priority:** P2 (UX polish, not blocking functionality)
- - **Component:** `src/orchestrator_magentic.py`, `src/app.py`

- ---

- ## Symptoms

- User experience in **Advanced (Magentic) mode**:

- 1. Click example or submit query
- 2. See: `🚀 **STARTED**: Starting research (Magentic mode)...`
- 3. **2+ minutes of nothing** (no spinner, no progress, no indication work is happening)
- 4. Eventually see: `🧠 **JUDGING**: Manager (user_task)...`

- **User perception:** "Is it frozen? Did it crash?"

- ### Container Logs Confirm Work IS Happening

- ```text
- 14:54:22 [info] Starting Magentic orchestrator query='...'
- 14:54:22 [info] Embedding service enabled
- ... 2+ MINUTES OF SILENCE (agent-framework doing internal LLM calls) ...
- 14:56:38 [info] Creating orchestrator mode=advanced
- ```

- The silence is because `workflow.run_stream()` doesn't yield events during its setup phase.

- ---

- ## Root Cause Analysis

- ### Current Flow (`src/orchestrator_magentic.py`)

- ```python
- async def run(self, query: str) -> AsyncGenerator[AgentEvent, None]:
-     # 1. Immediately yields "started"
-     yield AgentEvent(type="started", message=f"Starting research (Magentic mode): {query}")

-     # 2. Setup (fast, no yield needed)
-     embedding_service = self._init_embedding_service()
-     init_magentic_state(embedding_service)
-     workflow = self._build_workflow()

-     # 3. GAP: workflow.run_stream() blocks for 2+ minutes before first event
-     async for event in workflow.run_stream(task):  # <-- THE BOTTLENECK
-         yield self._process_event(event)
- ```

- The `agent-framework`'s `workflow.run_stream()` is calling OpenAI's API, building the manager prompt, coordinating agents, etc. **It doesn't yield events during this setup phase**.

- ---

- ## Gold Standard UX (What We'd Want)

- ### Gradio's Native Thinking Support

- Per [Gradio Chatbot Docs](https://www.gradio.app/docs/gradio/chatbot):

- > "The Gradio Chatbot can natively display intermediate thoughts and tool usage in a collapsible accordion next to a chat message. This makes it perfect for creating UIs for LLM agents and chain-of-thought (CoT) or reasoning demos."

- **Features available:**

- - `gr.ChatMessage` with `metadata={"status": "pending"}` shows a spinner
- - `metadata={"title": "Thinking...", "status": "pending"}` creates a collapsible accordion
- - Nested thoughts via `id` and `parent_id`
- - `duration` metadata shows time spent

- **Example from Gradio docs:**

- ```python
- import gradio as gr

- def chat_fn(message, history):
-     # Yield thinking state with spinner
-     yield gr.ChatMessage(
-         role="assistant",
-         metadata={"title": "🧠 Thinking...", "status": "pending"}
-     )

-     # Do work...

-     # Update with completed thought
-     yield gr.ChatMessage(
-         role="assistant",
-         content="Analysis complete",
-         metadata={"title": "🧠 Thinking...", "status": "done", "duration": 5.2}
-     )

-     yield "Here's the final answer..."
- ```

- ---

- ## Why This is Complex for DeepBoner

- ### Constraint 1: ChatInterface Returns Strings
- Our `research_agent()` yields plain strings:

- ```python
- yield f"🧠 **Backend**: {backend_name}\n\n"
- yield "⏳ **Processing...** Searching PubMed...\n"
- yield "\n\n".join(response_parts)
- ```

- Converting to `gr.ChatMessage` objects would require refactoring the entire response pipeline.

- ### Constraint 2: Agent-Framework is the Bottleneck
- The 2-minute gap is inside `workflow.run_stream(task)`, which is the `agent-framework` library. We can't inject yields into a third-party library's blocking call.

- ### Constraint 3: ChatInterface vs Blocks
- `gr.ChatInterface` is a convenience wrapper. The full `gr.ChatMessage` metadata features work best with raw `gr.Blocks` + `gr.Chatbot` components.

- ---

- ## Options

- ### Option A: Yield "Thinking" Before Blocking Call (Recommended)
- **Effort:** 5 minutes
- **Impact:** Users see *something* while waiting

- ```python
- # In src/orchestrator_magentic.py
- async def run(self, query: str) -> AsyncGenerator[AgentEvent, None]:
-     yield AgentEvent(type="started", message=f"Starting research (Magentic mode): {query}")

-     # NEW: Yield thinking state before the blocking call
-     yield AgentEvent(
-         type="thinking",  # New event type
-         message="🧠 Agents are reasoning... This may take 2-5 minutes for complex queries.",
-         iteration=0,
-     )

-     # ... rest of setup ...

-     async for event in workflow.run_stream(task):
-         yield self._process_event(event)
- ```

- **Pros:**

- - Simple, doesn't require Gradio changes
- - Works with the current string-based approach
- - Sets user expectations ("2-5 minutes")

- **Cons:**

- - No spinner/animation (static text)
- - Doesn't show real-time progress during the gap

- ### Option B: Use `gr.ChatMessage` with Metadata (Major Refactor)
- **Effort:** 2-4 hours
- **Impact:** Full gold-standard UX

- Would require:

- 1. Changing `research_agent()` to yield `gr.ChatMessage` objects
- 2. Adding thinking states with `metadata={"status": "pending"}`
- 3. Updating all event handlers to produce proper ChatMessage objects

- ### Option C: Heartbeat/Polling (Over-Engineering)
- **Effort:** 4+ hours
- **Impact:** Spinner during blocking call

- Create a background task that yields "still working..." every 10 seconds while waiting for the agent-framework. Requires:

- - `asyncio.create_task()` for the heartbeat
- - Task cancellation when real events arrive
- - Proper cleanup

- **Verdict:** Over-engineering for a demo.
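
For reference, even a stripped-down heartbeat wrapper shows the moving parts involved. A sketch under assumed names, with the 2-minute gap and 10-second interval shortened for illustration:

```python
import asyncio
from collections.abc import AsyncGenerator, AsyncIterator


async def with_heartbeat(
    source: AsyncIterator[str], interval: float = 10.0
) -> AsyncGenerator[str, None]:
    """Forward items from `source`, yielding a heartbeat string whenever
    no item arrives within `interval` seconds."""
    it = source.__aiter__()
    while True:
        task = asyncio.ensure_future(it.__anext__())
        while True:
            try:
                # shield() so the timeout doesn't cancel the in-flight __anext__()
                item = await asyncio.wait_for(asyncio.shield(task), timeout=interval)
            except asyncio.TimeoutError:
                yield "⏳ still working..."
                continue  # keep waiting on the same pending item
            except StopAsyncIteration:
                return
            yield item
            break


async def _demo() -> list[str]:
    async def slow_workflow() -> AsyncGenerator[str, None]:
        await asyncio.sleep(0.25)  # stands in for the 2-minute setup gap
        yield "real event"

    return [e async for e in with_heartbeat(slow_workflow(), interval=0.1)]


events = asyncio.run(_demo())
assert "⏳ still working..." in events
assert events[-1] == "real event"
```

The shield/cancellation bookkeeping above is exactly the "proper cleanup" cost the verdict refers to.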

- ### Option D: Accept the Limitation (Document It)
- **Effort:** 0
- **Impact:** None (users still confused)

- Just document that Advanced mode takes 2-5 minutes and users should wait.

- ---

- ## Recommendation

- **Implement Option A** - Add a "thinking" yield before the blocking call.

- It:

- 1. Requires only a minimal code change (5 minutes)
- 2. Sets user expectations clearly
- 3. Doesn't require Gradio refactoring
- 4. Is better than silence

- ---

- ## Implementation Plan

- ### Step 1: Add "thinking" Event Type

- ```python
- # In src/utils/models.py
- class AgentEvent(BaseModel):
-     type: Literal[
-         "started", "thinking", "searching", ...  # Add "thinking"
-     ]
- ```

- ### Step 2: Yield Thinking Event in Magentic Orchestrator

- ```python
- # In src/orchestrator_magentic.py, run() method
- yield AgentEvent(
-     type="thinking",
-     message="🧠 Multi-agent reasoning in progress... This may take 2-5 minutes.",
-     iteration=0,
- )
- ```

- ### Step 3: Handle in App

- ```python
- # In src/app.py, research_agent()
- if event.type == "thinking":
-     yield f"⏳ {event.message}"
- ```

- ---

- ## Test Plan

- - [ ] Add `"thinking"` to AgentEvent type literals
- - [ ] Add yield before `workflow.run_stream()`
- - [ ] Handle in app.py
- - [ ] `make check` passes
- - [ ] Manual test: Advanced mode shows "reasoning in progress" message
- - [ ] Deploy to HuggingFace, verify UX improvement

- ---

- ## References

- - [Gradio ChatInterface Docs](https://www.gradio.app/docs/gradio/chatinterface)
- - [Gradio Chatbot Metadata](https://www.gradio.app/docs/gradio/chatbot)
- - [Agents and Tool Usage Guide](https://www.gradio.app/guides/agents-and-tool-usage)
- - [GitHub Issue: Streaming text not working](https://github.com/gradio-app/gradio/issues/11443)