VibecoderMcSwaggins committed on
Commit
631e5fc
·
1 Parent(s): 43cfea2

docs: reorganize documentation structure for clarity

DELETE (duplicates/obsolete):
- to_do/REFERENCE_GRADDIO_DEMO_ANALYSIS.md
- bugs/P0_MAGENTIC_MODE_BROKEN.md (superseded by FIX_PLAN)

CREATE:
- future-roadmap/ for planned phases 15-17
- decisions/architecture-2025-11/ for magentic-pydantic docs
- bugs/ACTIVE_BUGS.md index

MOVE:
- DEEP_RESEARCH_ROADMAP.md → future-roadmap/
- 04_OPENALEX_INTEGRATION.md → future-roadmap/
- brainstorming/implementation/*.md → future-roadmap/phases/
- brainstorming/magentic-pydantic/*.md → decisions/architecture-2025-11/

UPDATE:
- docs/index.md: Updated links, Europe PMC references, test count

docs/bugs/ACTIVE_BUGS.md ADDED
@@ -0,0 +1,39 @@
+ # Active Bugs
+
+ > Last updated: 2025-11-28
+
+ ## P0 - Critical
+
+ ### Magentic Mode Report Generation
+ **File**: [FIX_PLAN_MAGENTIC_MODE.md](./FIX_PLAN_MAGENTIC_MODE.md)
+
+ **Symptom**: Magentic mode returns `ChatMessage` object instead of synthesized report text.
+
+ **Root Cause**:
+ - `event.message.text` extraction fails in orchestrator
+ - `max_rounds=3` too low for SearchAgent + JudgeAgent + ReportAgent sequence
+
+ **Workaround**: Use Simple mode (default) - works correctly with all LLM providers.
+
+ **Status**: Fix plan documented, not yet implemented.
+
+ ---
+
+ ## P1 - Minor UX
+
+ ### Gradio Settings Accordion Won't Collapse
+ **File**: [P1_GRADIO_SETTINGS_CLEANUP.md](./P1_GRADIO_SETTINGS_CLEANUP.md)
+
+ **Symptom**: Settings accordion stays open after user interaction.
+
+ **Root Cause**: Nested `gr.Blocks` context prevents accordion state management.
+
+ **Impact**: UX only - all functionality works correctly.
+
+ **Status**: Solution documented, not yet implemented.
+
+ ---
+
+ ## Resolved Bugs
+
+ *None currently - bugs above are still open.*
docs/bugs/P0_MAGENTIC_MODE_BROKEN.md DELETED
@@ -1,116 +0,0 @@
- # P0 Bug: Magentic Mode Returns ChatMessage Object Instead of Report Text
-
- **Status**: OPEN
- **Priority**: P0 (Critical)
- **Date**: 2025-11-27
-
- ---
-
- ## Actual Bug Found (Not What We Thought)
-
- **The OpenAI key works fine.** The real bug is different:
-
- ### The Problem
-
- When Magentic mode completes, the final report returns a `ChatMessage` object instead of the actual text:
-
- ```
- FINAL REPORT:
- <agent_framework._types.ChatMessage object at 0x11db70310>
- ```
-
- ### Evidence
-
- Full test output shows:
- 1. Magentic orchestrator starts correctly
- 2. SearchAgent finds evidence
- 3. HypothesisAgent generates hypotheses
- 4. JudgeAgent evaluates
- 5. **BUT**: Final output is `ChatMessage` object, not text
-
- ### Root Cause
-
- In `src/orchestrator_magentic.py` line 193:
-
- ```python
- elif isinstance(event, MagenticFinalResultEvent):
-     text = event.message.text if event.message else "No result"
- ```
-
- The `event.message` is a `ChatMessage` object, and `.text` may not extract the content correctly, or the message structure changed in the agent-framework library.
-
- ---
-
- ## Secondary Issue: Max Rounds Reached
-
- The orchestrator hits max rounds before producing a report:
-
- ```
- [ERROR] Magentic Orchestrator: Max round count reached
- ```
-
- This means the workflow times out before the ReportAgent synthesizes the final output.
-
- ---
-
- ## What Works
-
- - OpenAI API key: **Works** (loaded from .env)
- - SearchAgent: **Works** (finds evidence from PubMed, ClinicalTrials, Europe PMC)
- - HypothesisAgent: **Works** (generates Drug -> Target -> Pathway chains)
- - JudgeAgent: **Partial** (evaluates but sometimes loses context)
-
- ---
-
- ## Files to Fix
-
- | File | Line | Issue |
- |------|------|-------|
- | `src/orchestrator_magentic.py` | 193 | `event.message.text` returns object, not string |
- | `src/orchestrator_magentic.py` | 97-99 | `max_round_count=3` too low for full pipeline |
-
- ---
-
- ## Suggested Fix
-
- ```python
- # In _process_event, line 192-199
- elif isinstance(event, MagenticFinalResultEvent):
-     # Handle ChatMessage object properly
-     if event.message:
-         if hasattr(event.message, 'content'):
-             text = event.message.content
-         elif hasattr(event.message, 'text'):
-             text = event.message.text
-         else:
-             text = str(event.message)
-     else:
-         text = "No result"
- ```
-
- And increase rounds:
-
- ```python
- # In _build_workflow, line 97
- max_round_count=self._max_rounds, # Use configured value, default 10
- ```
-
- ---
-
- ## Test Command
-
- ```bash
- set -a && source .env && set +a && uv run python examples/orchestrator_demo/run_magentic.py "metformin alzheimer"
- ```
-
- ---
-
- ## Simple Mode Works
-
- For reference, simple mode produces full reports:
-
- ```bash
- uv run python examples/orchestrator_demo/run_agent.py "metformin alzheimer"
- ```
-
- Output includes structured report with Drug Candidates, Key Findings, etc.
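
The fallback logic in the deleted report's suggested fix can be exercised in isolation. What follows is a minimal sketch, not the project's actual patch. It assumes only that the final message may expose its payload as `text` or `content`, and that either attribute may hold something other than a string. `extract_report_text` and `StubMessage` are hypothetical names introduced here for illustration; neither is part of agent-framework.

```python
from typing import Any, Optional


def extract_report_text(message: Optional[Any], default: str = "No result") -> str:
    """Return a plain string for a message-like object (hypothetical helper)."""
    if message is None:
        return default
    # Try the likely payload attributes in order; accept only non-empty strings.
    for attr in ("text", "content"):
        value = getattr(message, attr, None)
        if isinstance(value, str) and value.strip():
            return value
    # Last resort: stringify so the caller always receives a str.
    return str(message)


class StubMessage:
    """Stand-in for a framework message type; NOT the real ChatMessage."""

    def __init__(self, text: Any = None, content: Any = None) -> None:
        self.text = text
        self.content = content


if __name__ == "__main__":
    print(extract_report_text(StubMessage(text="Final report body")))
    print(extract_report_text(StubMessage(text=object(), content="From content")))
    print(extract_report_text(None))
```

Checking `isinstance(value, str)` rather than `hasattr` alone guards against the reported symptom, where an attribute exists but yields an object instead of text.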
docs/{brainstorming/magentic-pydantic → decisions/architecture-2025-11}/00_SITUATION_AND_PLAN.md RENAMED
File without changes
docs/{brainstorming/magentic-pydantic → decisions/architecture-2025-11}/01_ARCHITECTURE_SPEC.md RENAMED
File without changes
docs/{brainstorming/magentic-pydantic → decisions/architecture-2025-11}/02_IMPLEMENTATION_PHASES.md RENAMED
File without changes
docs/{brainstorming/magentic-pydantic → decisions/architecture-2025-11}/03_IMMEDIATE_ACTIONS.md RENAMED
File without changes
docs/{brainstorming/magentic-pydantic → decisions/architecture-2025-11}/04_FOLLOWUP_REVIEW_REQUEST.md RENAMED
File without changes
docs/{brainstorming/magentic-pydantic → decisions/architecture-2025-11}/REVIEW_PROMPT_FOR_SENIOR_AGENT.md RENAMED
File without changes
docs/{to_do → future-roadmap}/DEEP_RESEARCH_ROADMAP.md RENAMED
File without changes
docs/{brainstorming/04_OPENALEX_INTEGRATION.md → future-roadmap/OPENALEX_INTEGRATION.md} RENAMED
File without changes
docs/{brainstorming/implementation → future-roadmap/phases}/15_PHASE_OPENALEX.md RENAMED
File without changes
docs/{brainstorming/implementation → future-roadmap/phases}/16_PHASE_PUBMED_FULLTEXT.md RENAMED
File without changes
docs/{brainstorming/implementation → future-roadmap/phases}/17_PHASE_RATE_LIMITING.md RENAMED
File without changes
docs/{brainstorming/implementation → future-roadmap/phases}/README.md RENAMED
File without changes
docs/index.md CHANGED
@@ -1,8 +1,8 @@
  # DeepBoner Documentation

- ## Medical Drug Repurposing Research Agent

- AI-powered deep research system for accelerating drug repurposing discovery.

  ---

@@ -11,8 +11,9 @@ AI-powered deep research system for accelerating drug repurposing discovery.
  ### Architecture
  - **[Overview](architecture/overview.md)** - Project overview, use case, architecture
  - **[Design Patterns](architecture/design-patterns.md)** - Technical patterns, data models

- ### Implementation
  - **[Roadmap](implementation/roadmap.md)** - Phased execution plan with TDD
  - **[Phase 1: Foundation](implementation/01_phase_foundation.md)** ✅ - Tooling, config, first tests
  - **[Phase 2: Search](implementation/02_phase_search.md)** ✅ - PubMed search
@@ -24,25 +25,47 @@ AI-powered deep research system for accelerating drug repurposing discovery.
  - **[Phase 8: Report](implementation/08_phase_report.md)** ✅ - Structured scientific reports
  - **[Phase 9: Source Cleanup](implementation/09_phase_source_cleanup.md)** ✅ - Remove DuckDuckGo
  - **[Phase 10: ClinicalTrials](implementation/10_phase_clinicaltrials.md)** ✅ - Clinical trials API
- - **[Phase 11: bioRxiv](implementation/11_phase_biorxiv.md)** ✅ - Preprint search
  - **[Phase 12: MCP Server](implementation/12_phase_mcp_server.md)** ✅ - Claude Desktop integration
  - **[Phase 13: Modal Integration](implementation/13_phase_modal_integration.md)** ✅ - Secure code execution
  - **[Phase 14: Demo Submission](implementation/14_phase_demo_submission.md)** ✅ - Hackathon submission

  ### Guides
  - **[Deployment Guide](guides/deployment.md)** - Gradio, MCP, and Modal launch steps

  ### Development
  - **[Testing Strategy](development/testing.md)** - Unit, Integration, and E2E testing patterns

  ---

  ## What We're Building

- **One-liner**: AI agent that searches medical literature to find existing drugs that might treat new diseases.

- **Example Query**:
- > "What existing drugs might help treat long COVID fatigue?"

  **Output**: Research report with drug candidates, mechanisms, evidence quality, and citations.

@@ -54,7 +77,7 @@ AI-powered deep research system for accelerating drug repurposing discovery.
  User Question → Research Agent (Orchestrator)
  ↓
  Search Loop:
- → Tools (PubMed, ClinicalTrials, bioRxiv)
  → Judge (Quality + Budget)
  → Repeat or Synthesize
  ↓
@@ -70,15 +93,7 @@ User Question → Research Agent (Orchestrator)
  | **Gradio UI** | ✅ Complete | Streaming chat interface |
  | **MCP Server** | ✅ Complete | Tools accessible from Claude Desktop |
  | **Modal Sandbox** | ✅ Complete | Secure statistical analysis |
- | **Multi-Source Search** | ✅ Complete | PubMed, ClinicalTrials, bioRxiv |
-
- ---
-
- ## Team
-
- - The-Obstacle-Is-The-Way
- - MarioAderman
- - Josephrp

  ---

@@ -88,5 +103,5 @@ User Question → Research Agent (Orchestrator)
  |-------|--------|
  | Phases 1-14 | ✅ COMPLETE |

- **Test Coverage**: 65% (96 tests passing)
- **Architecture Review**: PASSED (98-99/100)
 
  # DeepBoner Documentation

+ ## Sexual Health Research Agent

+ AI-powered deep research system for sexual wellness, reproductive health, and hormone therapy research.

  ---

  ### Architecture
  - **[Overview](architecture/overview.md)** - Project overview, use case, architecture
  - **[Design Patterns](architecture/design-patterns.md)** - Technical patterns, data models
+ - **[Workflow Diagrams](workflow-diagrams.md)** - Visual architecture (Magentic v2.0)

+ ### Implementation (Phases 1-14 ✅ COMPLETE)
  - **[Roadmap](implementation/roadmap.md)** - Phased execution plan with TDD
  - **[Phase 1: Foundation](implementation/01_phase_foundation.md)** ✅ - Tooling, config, first tests
  - **[Phase 2: Search](implementation/02_phase_search.md)** ✅ - PubMed search

  - **[Phase 8: Report](implementation/08_phase_report.md)** ✅ - Structured scientific reports
  - **[Phase 9: Source Cleanup](implementation/09_phase_source_cleanup.md)** ✅ - Remove DuckDuckGo
  - **[Phase 10: ClinicalTrials](implementation/10_phase_clinicaltrials.md)** ✅ - Clinical trials API
+ - **[Phase 11: Europe PMC](implementation/11_phase_biorxiv.md)** ✅ - Preprint search
  - **[Phase 12: MCP Server](implementation/12_phase_mcp_server.md)** ✅ - Claude Desktop integration
  - **[Phase 13: Modal Integration](implementation/13_phase_modal_integration.md)** ✅ - Secure code execution
  - **[Phase 14: Demo Submission](implementation/14_phase_demo_submission.md)** ✅ - Hackathon submission

+ ### Future Roadmap
+ - **[Overview](future-roadmap/phases/README.md)** - Planned phases 15-17
+ - **[Phase 15: OpenAlex](future-roadmap/phases/15_PHASE_OPENALEX.md)** - Citation network integration
+ - **[Phase 16: PubMed Full-text](future-roadmap/phases/16_PHASE_PUBMED_FULLTEXT.md)** - BioC API
+ - **[Phase 17: Rate Limiting](future-roadmap/phases/17_PHASE_RATE_LIMITING.md)** - Production hardening
+ - **[Deep Research Mode](future-roadmap/DEEP_RESEARCH_ROADMAP.md)** - GPT-Researcher style enhancements
+
+ ### Bugs & Issues
+ - **[Active Bugs](bugs/ACTIVE_BUGS.md)** - Current issues and workarounds
+
+ ### Decisions
+ - **[PR #55 Evaluation](decisions/2025-11-27-pr55-evaluation.md)** - Architecture decision record
+ - **[Magentic + PydanticAI](decisions/architecture-2025-11/)** - Framework architecture decisions
+
  ### Guides
  - **[Deployment Guide](guides/deployment.md)** - Gradio, MCP, and Modal launch steps

  ### Development
  - **[Testing Strategy](development/testing.md)** - Unit, Integration, and E2E testing patterns

+ ### Brainstorming (Source Improvements)
+ - **[Roadmap Summary](brainstorming/00_ROADMAP_SUMMARY.md)** - Data source enhancement ideas
+ - **[PubMed Improvements](brainstorming/01_PUBMED_IMPROVEMENTS.md)**
+ - **[ClinicalTrials Improvements](brainstorming/02_CLINICALTRIALS_IMPROVEMENTS.md)**
+ - **[Europe PMC Improvements](brainstorming/03_EUROPEPMC_IMPROVEMENTS.md)**
+
  ---

  ## What We're Building

+ **One-liner**: AI agent that searches medical literature to find evidence for sexual health research questions.

+ **Example Queries**:
+ > "What drugs improve female libido post-menopause?"
+ > "Evidence for testosterone therapy in women with HSDD?"
+ > "Clinical trials for erectile dysfunction alternatives to PDE5 inhibitors?"

  **Output**: Research report with drug candidates, mechanisms, evidence quality, and citations.

  User Question → Research Agent (Orchestrator)
  ↓
  Search Loop:
+ → Tools (PubMed, ClinicalTrials, Europe PMC)
  → Judge (Quality + Budget)
  → Repeat or Synthesize
  ↓

  | **Gradio UI** | ✅ Complete | Streaming chat interface |
  | **MCP Server** | ✅ Complete | Tools accessible from Claude Desktop |
  | **Modal Sandbox** | ✅ Complete | Secure statistical analysis |
+ | **Multi-Source Search** | ✅ Complete | PubMed, ClinicalTrials, Europe PMC |

  ---

  |-------|--------|
  | Phases 1-14 | ✅ COMPLETE |

+ **Tests**: 127 passing, 0 warnings
+ **Known Issues**: See [Active Bugs](bugs/ACTIVE_BUGS.md)
docs/to_do/REFERENCE_GRADDIO_DEMO_ANALYSIS.md DELETED
@@ -1,229 +0,0 @@
- # Reference: GradioDemo Analysis
-
- > Analysis of code from https://github.com/DeepBoner/GradioDemo
- > Purpose: Extract good ideas, understand patterns, avoid mistakes
-
- ## Overview
-
- | Metric | Value |
- |--------|-------|
- | Total lines added | ~7,000 |
- | New Python files | +20 |
- | Test pass rate | 80% (62 errors due to missing mocks) |
- | Integration status | **NOT WIRED IN** |
-
- ## Component Catalog
-
- ### REDUNDANT (Already have equivalent)
-
- | Component | Lines | What We Have Instead |
- |-----------|-------|---------------------|
- | `orchestrator/graph_orchestrator.py` | 974 | MagenticBuilder |
- | `middleware/budget_tracker.py` | 391 | MagenticBuilder max_round_count |
- | `middleware/state_machine.py` | 130 | agents/state.py with contextvars |
- | `middleware/workflow_manager.py` | 300 | asyncio.gather() |
- | `orchestrator/research_flow.py` (IterativeResearchFlow) | 500 | MagenticOrchestrator |
- | HuggingFace integration | various | HFInferenceJudgeHandler |
-
- ### POTENTIALLY USEFUL (Ideas to cherry-pick)
-
- #### 1. InputParser (`agents/input_parser.py` - 179 lines)
-
- **Idea**: Detect research mode from query text.
-
- ```python
- # Key logic (simplified)
- research_mode: Literal["iterative", "deep"] = "iterative"
- if any(keyword in query.lower() for keyword in [
-     "comprehensive", "report", "sections", "analyze", "analysis", "overview", "market"
- ]):
-     research_mode = "deep"
- ```
-
- **Good pattern**: Heuristic fallback when LLM fails.
- **Our implementation**: See Phase 1 in DEEP_RESEARCH_ROADMAP.md
-
- #### 2. PlannerAgent (`orchestrator/planner_agent.py` - 184 lines)
-
- **Idea**: LLM creates section outline for report.
-
- ```python
- class ReportPlan(BaseModel):
-     title: str
-     sections: list[ReportSection]
-     estimated_time_minutes: int
-
- class ReportSection(BaseModel):
-     title: str
-     query: str
-     description: str
-     priority: int
- ```
-
- **Good pattern**: Structured output with Pydantic models.
- **Our implementation**: See Phase 2 in DEEP_RESEARCH_ROADMAP.md
-
- #### 3. DeepResearchFlow (`orchestrator/research_flow.py` - 500 lines)
-
- **Idea**: Run parallel research loops per section.
-
- ```python
- # Their pattern (simplified)
- async def run_parallel_loops(sections: list[ReportSection]):
-     tasks = [run_single_loop(s) for s in sections]
-     results = await asyncio.gather(*tasks, return_exceptions=True)
- ```
-
- **Problem**: They built new IterativeResearchFlow instead of reusing MagenticOrchestrator.
- **Our implementation**: Just run multiple MagenticOrchestrator instances.
-
- #### 4. LlamaIndex RAG (`services/llamaindex_rag.py` - 454 lines)
-
- **Idea**: Semantic search over collected evidence.
-
- ```python
- # Their approach
- class LlamaIndexRAGService:
-     def __init__(self):
-         # ChromaDB + LlamaIndex + HuggingFace embeddings
-         self.vector_store = ChromaVectorStore(...)
-         self.index = VectorStoreIndex(...)
-
-     def retrieve(self, query: str, top_k: int = 5) -> list[dict]:
-         retriever = VectorIndexRetriever(index=self.index, similarity_top_k=top_k)
-         return retriever.retrieve(query)
- ```
-
- **Good**: Full-featured RAG with multiple embedding providers.
- **Simpler alternative**: Direct ChromaDB + sentence-transformers (no LlamaIndex).
- **Our implementation**: See Phase 4 in DEEP_RESEARCH_ROADMAP.md
-
- #### 5. LongWriterAgent (`agents/long_writer.py` - ~300 lines)
-
- **Idea**: Write reports section-by-section to handle length.
-
- ```python
- class SectionOutput(BaseModel):
-     section_content: str
-     references: list[str]
-     next_section_context: str # What to avoid repeating
-
- async def write_next_section(
-     section_title: str,
-     findings: str,
-     previous_sections: str, # Avoid repetition
- ) -> SectionOutput:
- ```
-
- **Good pattern**: Passing context to avoid repetition.
- **Our implementation**: See Phase 5 in DEEP_RESEARCH_ROADMAP.md
-
- #### 6. ProofreaderAgent (`agents/proofreader.py` - ~200 lines)
-
- **Idea**: Final cleanup pass on report.
-
- ```python
- # Tasks:
- # 1. Remove duplicate information
- # 2. Fix citation numbering
- # 3. Add executive summary
- # 4. Ensure consistent formatting
- ```
-
- **Good pattern**: Separate concerns - writer writes, proofreader polishes.
- **Our implementation**: Optional Phase 6 if needed.
-
- ### Graph Architecture (Educational Reference)
-
- The graph system is well-designed in theory:
-
- ```python
- # Node types
- class AgentNode(GraphNode):
-     agent: Any # Pydantic AI agent
-     input_transformer: Callable # Transform input
-     output_transformer: Callable # Transform output
-
- class DecisionNode(GraphNode):
-     decision_function: Callable[[Any], str] # Returns next node ID
-     options: list[str]
-
- class ParallelNode(GraphNode):
-     parallel_nodes: list[str] # Run these in parallel
-     aggregator: Callable # Combine results
-
- # Graph structure
- class ResearchGraph:
-     nodes: dict[str, GraphNode]
-     edges: dict[str, list[GraphEdge]]
-     entry_node: str
-     exit_nodes: list[str]
- ```
-
- **Why we don't need it**: MagenticBuilder already provides:
- - Agent coordination via manager
- - Conditional routing (manager decides)
- - Multiple participants
-
- This is essentially reimplementing what `agent-framework` already does.
-
- ## Key Lessons
-
- ### What Went Wrong
-
- 1. **Parallel architecture** - Built new system instead of extending existing
- 2. **Horizontal sprawl** - All infrastructure, nothing wired in
- 3. **Test mocking** - Tests don't mock API clients properly
- 4. **No manual testing** - Code never ran end-to-end
-
- ### What To Learn From
-
- 1. **Pydantic models for structured output** - Good pattern
- 2. **Heuristic fallbacks** - When LLM fails, have a fallback
- 3. **Section-by-section writing** - For long reports
- 4. **RAG for evidence retrieval** - Useful for large evidence sets
-
- ### The 7,000 Line vs 500 Line Comparison
-
- **Their approach**:
- - New GraphOrchestrator (974 lines)
- - New ResearchFlow (999 lines)
- - New BudgetTracker (391 lines)
- - New StateMachine (130 lines)
- - New WorkflowManager (300 lines)
- - New agents (InputParser, Writer, LongWriter, Proofreader, etc.)
- - Total: ~7,000 lines, not integrated
-
- **Our approach**:
- - InputParser (50-100 lines) - extends existing
- - PlannerAgent (80-120 lines) - uses ChatAgent pattern
- - DeepOrchestrator (100-150 lines) - wraps MagenticOrchestrator
- - RAGService (100-150 lines) - simple ChromaDB
- - LongWriter (80-100 lines) - extends ReportAgent
- - Total: ~500 lines, each phase ships working
-
- ## File Locations (for reference)
-
- ```
- reference_repos/GradioDemo/src/
- ├── orchestrator/
- │   ├── graph_orchestrator.py # 974 lines - graph execution
- │   ├── research_flow.py # 999 lines - iterative/deep flows
- │   └── planner_agent.py # 184 lines - section planning
- ├── agents/
- │   ├── input_parser.py # 179 lines - query analysis
- │   ├── writer.py # 210 lines - report writing
- │   ├── long_writer.py # ~300 lines - section writing
- │   ├── proofreader.py # ~200 lines - cleanup
- │   └── knowledge_gap.py # gap detection
- ├── middleware/
- │   ├── budget_tracker.py # 391 lines - token/time tracking
- │   ├── state_machine.py # 130 lines - workflow state
- │   └── workflow_manager.py # 300 lines - parallel loop mgmt
- ├── services/
- │   └── llamaindex_rag.py # 454 lines - RAG service
- ├── tools/
- │   └── rag_tool.py # 191 lines - RAG as search tool
- └── agent_factory/
-     └── graph_builder.py # ~400 lines - graph construction
- ```
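
The parallel-loop pattern quoted in the deleted analysis calls `asyncio.gather(..., return_exceptions=True)`, which hands raised exceptions back as ordinary list items instead of propagating them, so the caller has to sort them out. Below is a minimal sketch of that sorting step, under stated assumptions: `run_single_loop` is a hypothetical stand-in for one research loop, and plain section titles replace the `ReportSection` model. It is not code from either repository.

```python
import asyncio


async def run_single_loop(section: str) -> str:
    """Hypothetical stand-in for one research loop over a report section."""
    await asyncio.sleep(0)  # yield control, as a real loop would while waiting on I/O
    if not section:
        raise ValueError("empty section title")
    return f"findings for {section}"


async def run_parallel_loops(
    sections: list[str],
) -> tuple[dict[str, str], dict[str, BaseException]]:
    """Run one loop per section concurrently and split results from errors."""
    tasks = [run_single_loop(s) for s in sections]
    # With return_exceptions=True a failing section comes back as an exception
    # object in the result list instead of being raised out of gather().
    outcomes = await asyncio.gather(*tasks, return_exceptions=True)
    results: dict[str, str] = {}
    errors: dict[str, BaseException] = {}
    for section, outcome in zip(sections, outcomes):
        if isinstance(outcome, BaseException):
            errors[section] = outcome
        else:
            results[section] = outcome
    return results, errors


if __name__ == "__main__":
    ok, failed = asyncio.run(run_parallel_loops(["Mechanisms", "", "Clinical trials"]))
    print(ok)
    print(failed)
```

Keying both dictionaries by section title lets a caller retry or report only the sections that failed; this relies on `gather` returning results in the same order as its inputs.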