# Active Bugs

> Last updated: 2025-12-01 (07:30 PST)
>
> **Note:** Completed bug docs archived to `docs/bugs/archive/`
> **See also:** [Code Quality Audit Findings (2025-11-30)](AUDIT_FINDINGS_2025_11_30.md)

## P0 - Blocker

_No active P0 bugs._

---

## P2 - UX Friction

### P2 - Advanced Mode Cold Start Has No User Feedback (✅ FIXED)
**File:** `docs/bugs/P2_ADVANCED_MODE_COLD_START_NO_FEEDBACK.md`
**Issue:** [#108](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/issues/108)
**Found:** 2025-12-01 (Gradio Testing)

**Problem:** Three "dead zones" with no visual feedback during Advanced Mode startup:
1. **Dead Zone #1** (5-15s): Between STARTED → THINKING ✅ FIXED (granular events)
2. **Dead Zone #2** (10-30s): Between THINKING → PROGRESS (first LLM call) ✅ FIXED (Progress Bar)
3. **Dead Zone #3** (30-90s): After PROGRESS (SearchAgent executing) ✅ FIXED (Pre-warming + Progress Bar)

**Phase 1 Fix (commit dbf888c):**
- Added granular progress events during initialization
- Users now see "Loading embedding service...", "Initializing research memory...", "Building agent team..."
- Significantly improves perceived responsiveness

**Phase 2/3 Fix (Latest):**
- Implemented service pre-warming (`service_loader.warmup_services`)
- Added native Gradio progress bar (`gr.Progress`) to `research_agent`
- Visual feedback is now continuous throughout the entire lifecycle
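
The granular-event approach can be sketched roughly as follows. This is a minimal illustration, not the project's code: `STARTUP_STEPS` and `run_startup` are hypothetical names, and the `progress` argument is any callable that accepts a fraction and a `desc` keyword, which is the calling shape a `gr.Progress` instance supports.

```python
from typing import Callable, Iterator, List, Optional, Tuple

# Status messages quoted in the Phase 1 fix; the list itself is a hypothetical structure.
STARTUP_STEPS = [
    "Loading embedding service...",
    "Initializing research memory...",
    "Building agent team...",
]


def run_startup(progress: Callable[..., None]) -> Iterator[str]:
    """Yield one status message per startup step and advance the progress bar."""
    total = len(STARTUP_STEPS)
    for index, message in enumerate(STARTUP_STEPS, start=1):
        # A real implementation would perform the slow work for this step here.
        progress(index / total, desc=message)
        yield message


# Stand-in for a progress bar so the sketch runs without a UI.
calls: List[Tuple[float, Optional[str]]] = []


def fake_progress(fraction: float, desc: Optional[str] = None) -> None:
    calls.append((fraction, desc))


messages = list(run_startup(fake_progress))
```

Yielding a message per step is what closes the first dead zone: the UI has something to render before the first LLM call starts.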

---

## P1 - Important

### P1 - Memory Layer Not Integrated (Post-Hackathon)
**Issue:** [#73](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/issues/73)
**Spec:** [SPEC_08_INTEGRATE_MEMORY_LAYER.md](../specs/SPEC_08_INTEGRATE_MEMORY_LAYER.md)

**Problem:** Structured memory (hypotheses, conflicts) is isolated in "God Mode" only.
**Solution:** Extract memory into shared service, integrate into Simple and Advanced modes.
**Status:** Spec written. Blocked until post-hackathon.

---

## Resolved Bugs

### ~~P1 - Advanced Mode Exposes Uninterpretable Chain-of-Thought~~ FIXED
**File:** `docs/bugs/P1_ADVANCED_MODE_UNINTERPRETABLE_CHAIN_OF_THOUGHT.md`
**PR:** [#107](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/pull/107)
**Found:** 2025-12-01
**Resolved:** 2025-12-01

- Problem: Advanced mode exposed raw `task_ledger` and `instruction` events, truncated mid-word.
- Fix: Filtered internal events, transformed `user_task` to progress type, smart sentence-aware truncation.
- Tests: `tests/unit/orchestrators/test_advanced_events.py` (5 tests)
- CodeRabbit review addressed: test markers, edge case handling, truncation test coverage.
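
As an illustration of what "sentence-aware truncation" can mean, here is a minimal sketch. The function name, the default limit, and the fallback rules are assumptions made for this example; the actual implementation in the orchestrator may differ.

```python
def truncate_at_sentence(text: str, max_chars: int = 200) -> str:
    """Shorten text without cutting mid-word, preferring a sentence boundary."""
    text = text.strip()
    if len(text) <= max_chars:
        return text

    # Look one character past the limit so a sentence that ends exactly
    # at max_chars (followed by a space) is still detected.
    search = text[: max_chars + 1]
    cut = max(search.rfind(". "), search.rfind("! "), search.rfind("? "))
    if cut != -1:
        return text[: cut + 1]

    # No sentence boundary: fall back to the last word boundary.
    # The appended ellipsis may push the result slightly past max_chars.
    window = text[:max_chars]
    space = window.rfind(" ")
    if space > 0:
        return window[:space].rstrip() + "..."
    return window + "..."


short = truncate_at_sentence("First sentence. Second sentence is much longer than the limit.", 30)
fallback = truncate_at_sentence("alpha beta gamma delta", 12)
```

In this sketch `short` keeps only the first complete sentence, and `fallback` ends on a whole word followed by an ellipsis.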

### ~~P0 - Advanced Mode Timeout Yields No Synthesis~~ FIXED
**File:** `docs/bugs/P0_ADVANCED_MODE_TIMEOUT_NO_SYNTHESIS.md`
**Found:** 2025-11-30 (Manual Testing)
**Resolved:** 2025-12-01

- Problem: Advanced mode timed out and displayed "Synthesizing..." but no synthesis occurred.
- Root Causes:
  1. Timeout handler yielded misleading message without calling ReportAgent
  2. Factory used wrong setting (`max_iterations=10` instead of `advanced_max_rounds=5`)
  3. Missing `get_context_summary()` in ResearchMemory
- Fix:
  1. Implemented actual synthesis on timeout via ReportAgent invocation
  2. Factory now uses `settings.advanced_max_rounds` (5)
  3. Added `get_context_summary()` to ResearchMemory
- Tests: `tests/unit/orchestrators/test_advanced_timeout.py`
- Key files: `src/orchestrators/advanced.py`, `src/orchestrators/factory.py`, `src/services/research_memory.py`
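
The first fix (synthesizing on timeout instead of only printing a message) follows a common `asyncio` pattern, sketched below. Everything here is a stand-in: `MemoryStub`, `synthesize_report`, and `research_with_timeout` are hypothetical names, and only the method name `get_context_summary()` comes from the notes above.

```python
import asyncio
from typing import Awaitable, Callable, List


class MemoryStub:
    """Hypothetical stand-in for ResearchMemory."""

    def __init__(self) -> None:
        self.findings: List[str] = []

    def get_context_summary(self) -> str:
        return "; ".join(self.findings) or "no evidence collected"


async def synthesize_report(summary: str) -> str:
    # Placeholder for the real ReportAgent call.
    return f"Partial report (timed out): {summary}"


async def research_with_timeout(
    workflow: Callable[[MemoryStub], Awaitable[str]],
    memory: MemoryStub,
    timeout_s: float,
) -> str:
    try:
        return await asyncio.wait_for(workflow(memory), timeout=timeout_s)
    except asyncio.TimeoutError:
        # Actually synthesize from whatever was gathered before the timeout,
        # rather than emitting a "Synthesizing..." message and stopping.
        return await synthesize_report(memory.get_context_summary())


async def slow_workflow(memory: MemoryStub) -> str:
    memory.findings.append("finding A")
    await asyncio.sleep(1.0)  # simulates a workflow that exceeds the budget
    return "full report"


result = asyncio.run(research_with_timeout(slow_workflow, MemoryStub(), timeout_s=0.05))
```

The memory object lives outside the cancelled task, so evidence recorded before the timeout is still available to the fallback synthesis.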

### ~~P0 - Free Tier Synthesis Incorrectly Uses Server-Side API Keys~~ FIXED
**File:** `docs/bugs/P1_SYNTHESIS_BROKEN_KEY_FALLBACK.md`
**PR:** [#103](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/pull/103)
**Found:** 2025-11-30 (Testing)
**Resolved:** 2025-11-30
**Verified:** Free Tier now produces full LLM-synthesized research reports ✅

- Problem: Simple Mode crashed with "OpenAIError" on HuggingFace Spaces when user provided no key but admin key was invalid.
- Root Cause: Synthesis logic bypassed the Free Tier judge and incorrectly used server-side keys via `get_model()`.
- Fix: Implemented `synthesize()` in `HFInferenceJudgeHandler` to use free HuggingFace Inference, ensuring consistency with the judge phase.
- Key files: `src/agent_factory/judges.py`, `src/orchestrators/simple.py`

### ~~P0 - Synthesis Fails with OpenAIError in Free Mode~~ FIXED
**File:** `docs/bugs/P0_SYNTHESIS_PROVIDER_MISMATCH.md`
**Found:** 2025-11-30 (Code Audit)
**Resolved:** 2025-11-30

- Problem: "Simple Mode" (Free Tier) crashed with `OpenAIError`.
- Root Cause: `get_model()` defaulted to OpenAI regardless of available keys.
- Fix: Implemented auto-detection in `judges.py` (OpenAI > Anthropic > HuggingFace).
- Added extensive unit tests and regression tests.
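
The priority order (OpenAI > Anthropic > HuggingFace) could be expressed along these lines. This is a sketch under assumptions: `detect_provider` is a hypothetical name, and the environment variable names are the conventional ones for those providers, not confirmed project settings.

```python
from typing import Mapping


def detect_provider(env: Mapping[str, str]) -> str:
    """Pick a provider from available keys, falling back to the free tier."""
    # Assumed variable names; empty strings count as "no key".
    if env.get("OPENAI_API_KEY"):
        return "openai"
    if env.get("ANTHROPIC_API_KEY"):
        return "anthropic"
    # No paid key available: use free HuggingFace inference.
    return "huggingface"


free_tier = detect_provider({})
both_keys = detect_provider({"OPENAI_API_KEY": "sk-x", "ANTHROPIC_API_KEY": "sk-y"})
```

Passing the environment in as a mapping, rather than reading `os.environ` inside the function, keeps the selection logic easy to cover with regression tests.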

### ~~P0 - Simple Mode Never Synthesizes~~ FIXED
**PR:** [#71](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/pull/71) (SPEC_06)
**Commit**: `5cac97d` (2025-11-29)

- Root cause: LLM-as-Judge recommendations were being IGNORED
- Fix: Code-enforced termination criteria (`_should_synthesize()`)
- Added combined score thresholds, late-iteration logic, emergency fallback
- Simple mode now synthesizes instead of spinning forever
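
A code-enforced termination check of the kind described above might look like the sketch below. The thresholds, parameter names, and reason strings are all illustrative assumptions; the real `_should_synthesize()` is defined in SPEC_06 and may use different inputs and values.

```python
from typing import Tuple


def should_synthesize(
    judge_sufficient: bool,
    combined_score: int,
    iteration: int,
    max_iterations: int,
) -> Tuple[bool, str]:
    """Decide in code whether to stop searching, instead of trusting the judge alone."""
    # Hypothetical thresholds, chosen only to illustrate the three rules.
    if judge_sufficient and combined_score >= 10:
        return True, "judge_approved"
    if combined_score >= 14:
        return True, "high_scores"
    # Late-iteration logic: accept weaker evidence near the end of the budget.
    if iteration >= max_iterations - 1 and combined_score >= 8:
        return True, "late_iteration"
    # Emergency fallback: never loop past the budget.
    if iteration >= max_iterations:
        return True, "emergency_fallback"
    return False, "continue"


decision = should_synthesize(False, 5, iteration=10, max_iterations=10)
```

The last rule is what guarantees termination: even if every score stays low, the loop exits once the iteration budget is spent.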

### ~~P3 - Magentic Mode Missing Termination Guarantee~~ FIXED
**Commit**: `d36ce3c` (2025-11-29)

- Added `final_event_received` tracking in `orchestrator_magentic.py`
- Added fallback yield for "max iterations reached" scenario
- Verified with `test_magentic_termination.py`
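
The termination guarantee can be illustrated with a synchronous generator. This is a simplified sketch: the event shape (dicts with a `type` key), the `"complete"` type name, and the function name are assumptions, and the real orchestrator is presumably asynchronous.

```python
from typing import Dict, Iterable, Iterator

Event = Dict[str, str]


def stream_with_guarantee(events: Iterable[Event], max_iterations: int) -> Iterator[Event]:
    """Relay events and always end with exactly one final event."""
    final_event_received = False
    for index, event in enumerate(events):
        if index >= max_iterations:
            break
        yield event
        if event["type"] == "complete":
            final_event_received = True
            break
    if not final_event_received:
        # Fallback yield so the consumer never waits on a stream that just stops.
        yield {"type": "complete", "message": "Max iterations reached."}


stalled = list(stream_with_guarantee([{"type": "progress", "message": "round 1"}], 10))
```

Here `stalled` ends with the fallback event even though the upstream source never produced a final one.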

### ~~P0 - Magentic Mode Report Generation~~ FIXED
**Commit**: `9006d69` (2025-11-29)

- Fixed `_extract_text()` to handle various message object formats
- Increased `max_rounds=10` (was 3)
- Added `temperature=1.0` for reasoning model compatibility
- Advanced mode now produces full research reports

### ~~P1 - Streaming Spam + API Key Persistence~~ FIXED
**Commit**: `0c9be4a` (2025-11-29)

- Streaming events now buffered (not token-by-token spam)
- API key persists across example clicks via `gr.State`
- Examples use explicit `None` values to avoid overwriting keys

### ~~P2 - Missing "Thinking" State~~ FIXED
**Commit**: `9006d69` (2025-11-29)

- Added `"thinking"` event type with hourglass icon
- Yields "Multi-agent reasoning in progress..." before blocking workflow call
- Users now see feedback during 2-5 minute initial processing

### ~~P2 - Gradio Example Not Filling Chat Box~~ FIXED
**Commit**: `2ea01fd` (2025-11-29)

- Third example (HSDD) wasn't populating chat box when clicked
- Root cause: Parentheses in `HSDD (Hypoactive Sexual Desire Disorder)`
- Fix: Simplified to `Testosterone therapy for Hypoactive Sexual Desire Disorder?`

### ~~P1 - Gradio Settings Accordion~~ WONTFIX

Decision: Removed nested Blocks, using ChatInterface directly.
Accordion behavior is default Gradio - acceptable for demo.

---

## How to Report Bugs

1. Create `docs/bugs/P{N}_{SHORT_NAME}.md`
2. Include: Symptom, Root Cause, Fix Plan, Test Plan
3. Update this index
4. Priority: P0=blocker, P1=important, P2=UX, P3=edge case