VibecoderMcSwaggins committed on Nov 29, 2025

Commit 631e5fc (1 parent: 43cfea2)

docs: reorganize documentation structure for clarity

DELETE (duplicates/obsolete):
- to_do/REFERENCE_GRADDIO_DEMO_ANALYSIS.md
- bugs/P0_MAGENTIC_MODE_BROKEN.md (superseded by FIX_PLAN)

CREATE:
- future-roadmap/ for planned phases 15-17
- decisions/architecture-2025-11/ for magentic-pydantic docs
- bugs/ACTIVE_BUGS.md index

MOVE:
- DEEP_RESEARCH_ROADMAP.md → future-roadmap/
- 04_OPENALEX_INTEGRATION.md → future-roadmap/
- brainstorming/implementation/*.md → future-roadmap/phases/
- brainstorming/magentic-pydantic/*.md → decisions/architecture-2025-11/

UPDATE:
- docs/index.md: Updated links, Europe PMC references, test count

Files changed (16)

docs/bugs/ACTIVE_BUGS.md +39 -0
docs/bugs/P0_MAGENTIC_MODE_BROKEN.md +0 -116
docs/{brainstorming/magentic-pydantic → decisions/architecture-2025-11}/00_SITUATION_AND_PLAN.md +0 -0
docs/{brainstorming/magentic-pydantic → decisions/architecture-2025-11}/01_ARCHITECTURE_SPEC.md +0 -0
docs/{brainstorming/magentic-pydantic → decisions/architecture-2025-11}/02_IMPLEMENTATION_PHASES.md +0 -0
docs/{brainstorming/magentic-pydantic → decisions/architecture-2025-11}/03_IMMEDIATE_ACTIONS.md +0 -0
docs/{brainstorming/magentic-pydantic → decisions/architecture-2025-11}/04_FOLLOWUP_REVIEW_REQUEST.md +0 -0
docs/{brainstorming/magentic-pydantic → decisions/architecture-2025-11}/REVIEW_PROMPT_FOR_SENIOR_AGENT.md +0 -0
docs/{to_do → future-roadmap}/DEEP_RESEARCH_ROADMAP.md +0 -0
docs/{brainstorming/04_OPENALEX_INTEGRATION.md → future-roadmap/OPENALEX_INTEGRATION.md} +0 -0
docs/{brainstorming/implementation → future-roadmap/phases}/15_PHASE_OPENALEX.md +0 -0
docs/{brainstorming/implementation → future-roadmap/phases}/16_PHASE_PUBMED_FULLTEXT.md +0 -0
docs/{brainstorming/implementation → future-roadmap/phases}/17_PHASE_RATE_LIMITING.md +0 -0
docs/{brainstorming/implementation → future-roadmap/phases}/README.md +0 -0
docs/index.md +34 -19
docs/to_do/REFERENCE_GRADDIO_DEMO_ANALYSIS.md +0 -229

docs/bugs/ACTIVE_BUGS.md ADDED

@@ -0,0 +1,39 @@

+# Active Bugs
+> Last updated: 2025-11-28
+## P0 - Critical
+### Magentic Mode Report Generation
+**File**: [FIX_PLAN_MAGENTIC_MODE.md](./FIX_PLAN_MAGENTIC_MODE.md)
+**Symptom**: Magentic mode returns `ChatMessage` object instead of synthesized report text.
+**Root Cause**:
+- `event.message.text` extraction fails in orchestrator
+- `max_rounds=3` too low for SearchAgent + JudgeAgent + ReportAgent sequence
+**Workaround**: Use Simple mode (default) - works correctly with all LLM providers.
+**Status**: Fix plan documented, not yet implemented.
+---
+## P1 - Minor UX
+### Gradio Settings Accordion Won't Collapse
+**File**: [P1_GRADIO_SETTINGS_CLEANUP.md](./P1_GRADIO_SETTINGS_CLEANUP.md)
+**Symptom**: Settings accordion stays open after user interaction.
+**Root Cause**: Nested `gr.Blocks` context prevents accordion state management.
+**Impact**: UX only - all functionality works correctly.
+**Status**: Solution documented, not yet implemented.
+---
+## Resolved Bugs
+*None currently - bugs above are still open.*
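The P0 root cause above (a message object surfacing where report text was expected) can be guarded against with a defensive extraction helper. The following is a minimal sketch, not the project's actual fix: `extract_text` and `FakeMessage` are hypothetical names introduced here for illustration, and the attribute names `text` and `content` are assumptions about the message shape rather than a confirmed agent-framework API.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class FakeMessage:
    """Hypothetical stand-in for a framework message; not the real ChatMessage."""

    text: Any = None


def extract_text(message: Any, default: str = "No result") -> str:
    """Return readable text from a message-like object, never the object itself."""
    if message is None:
        return default
    if isinstance(message, str):
        return message
    # Try assumed attribute names in order; accept only non-empty strings.
    for attr in ("text", "content"):
        value = getattr(message, attr, None)
        if isinstance(value, str) and value.strip():
            return value
    # Last resort: a string form is at least loggable, though it is not a report.
    return str(message)
```

Checking `isinstance(value, str)` rather than only `hasattr` is the point of the sketch: an attribute that exists but holds a non-string object would otherwise leak into the final report.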

docs/bugs/P0_MAGENTIC_MODE_BROKEN.md DELETED

@@ -1,116 +0,0 @@
-# P0 Bug: Magentic Mode Returns ChatMessage Object Instead of Report Text
-**Status**: OPEN
-**Priority**: P0 (Critical)
-**Date**: 2025-11-27
----
-## Actual Bug Found (Not What We Thought)
-**The OpenAI key works fine.** The real bug is different:
-### The Problem
-When Magentic mode completes, the final report returns a `ChatMessage` object instead of the actual text:
-```
-FINAL REPORT:
-<agent_framework._types.ChatMessage object at 0x11db70310>
-```
-### Evidence
-Full test output shows:
-1. Magentic orchestrator starts correctly
-2. SearchAgent finds evidence
-3. HypothesisAgent generates hypotheses
-4. JudgeAgent evaluates
-5. **BUT**: Final output is `ChatMessage` object, not text
-### Root Cause
-In `src/orchestrator_magentic.py` line 193:
-```python
-elif isinstance(event, MagenticFinalResultEvent):
-    text = event.message.text if event.message else "No result"
-```
-The `event.message` is a `ChatMessage` object, and `.text` may not extract the content correctly, or the message structure changed in the agent-framework library.
----
-## Secondary Issue: Max Rounds Reached
-The orchestrator hits max rounds before producing a report:
-```
-[ERROR] Magentic Orchestrator: Max round count reached
-```
-This means the workflow times out before the ReportAgent synthesizes the final output.
----
-## What Works
-- OpenAI API key: **Works** (loaded from .env)
-- SearchAgent: **Works** (finds evidence from PubMed, ClinicalTrials, Europe PMC)
-- HypothesisAgent: **Works** (generates Drug -> Target -> Pathway chains)
-- JudgeAgent: **Partial** (evaluates but sometimes loses context)
----
-## Files to Fix
-| File | Line | Issue |
-|------|------|-------|
-| `src/orchestrator_magentic.py` | 193 | `event.message.text` returns object, not string |
-| `src/orchestrator_magentic.py` | 97-99 | `max_round_count=3` too low for full pipeline |
----
-## Suggested Fix
-```python
-# In _process_event, line 192-199
-elif isinstance(event, MagenticFinalResultEvent):
-    # Handle ChatMessage object properly
-    if event.message:
-        if hasattr(event.message, 'content'):
-            text = event.message.content
-        elif hasattr(event.message, 'text'):
-            text = event.message.text
-        else:
-            text = str(event.message)
-    else:
-        text = "No result"
-```
-And increase rounds:
-```python
-# In _build_workflow, line 97
-max_round_count=self._max_rounds,  # Use configured value, default 10
-```
----
-## Test Command
-```bash
-set -a && source .env && set +a && uv run python examples/orchestrator_demo/run_magentic.py "metformin alzheimer"
-```
----
-## Simple Mode Works
-For reference, simple mode produces full reports:
-```bash
-uv run python examples/orchestrator_demo/run_agent.py "metformin alzheimer"
-```
-Output includes structured report with Drug Candidates, Key Findings, etc.

docs/{brainstorming/magentic-pydantic → decisions/architecture-2025-11}/00_SITUATION_AND_PLAN.md RENAMED

File without changes

docs/{brainstorming/magentic-pydantic → decisions/architecture-2025-11}/01_ARCHITECTURE_SPEC.md RENAMED

File without changes

docs/{brainstorming/magentic-pydantic → decisions/architecture-2025-11}/02_IMPLEMENTATION_PHASES.md RENAMED

File without changes

docs/{brainstorming/magentic-pydantic → decisions/architecture-2025-11}/03_IMMEDIATE_ACTIONS.md RENAMED

File without changes

docs/{brainstorming/magentic-pydantic → decisions/architecture-2025-11}/04_FOLLOWUP_REVIEW_REQUEST.md RENAMED

File without changes

docs/{brainstorming/magentic-pydantic → decisions/architecture-2025-11}/REVIEW_PROMPT_FOR_SENIOR_AGENT.md RENAMED

File without changes

docs/{to_do → future-roadmap}/DEEP_RESEARCH_ROADMAP.md RENAMED

File without changes

docs/{brainstorming/04_OPENALEX_INTEGRATION.md → future-roadmap/OPENALEX_INTEGRATION.md} RENAMED

File without changes

docs/{brainstorming/implementation → future-roadmap/phases}/15_PHASE_OPENALEX.md RENAMED

File without changes

docs/{brainstorming/implementation → future-roadmap/phases}/16_PHASE_PUBMED_FULLTEXT.md RENAMED

File without changes

docs/{brainstorming/implementation → future-roadmap/phases}/17_PHASE_RATE_LIMITING.md RENAMED

File without changes

docs/{brainstorming/implementation → future-roadmap/phases}/README.md RENAMED

File without changes

docs/index.md CHANGED

@@ -1,8 +1,8 @@
 # DeepBoner Documentation
-## Medical Drug Repurposing Research Agent
-AI-powered deep research system for accelerating drug repurposing discovery.
 ---
@@ -11,8 +11,9 @@ AI-powered deep research system for accelerating drug repurposing discovery.
 ### Architecture
 - **[Overview](architecture/overview.md)** - Project overview, use case, architecture
 - **[Design Patterns](architecture/design-patterns.md)** - Technical patterns, data models
-### Implementation
 - **[Roadmap](implementation/roadmap.md)** - Phased execution plan with TDD
 - **[Phase 1: Foundation](implementation/01_phase_foundation.md)** ✅ - Tooling, config, first tests
 - **[Phase 2: Search](implementation/02_phase_search.md)** ✅ - PubMed search
@@ -24,25 +25,47 @@ AI-powered deep research system for accelerating drug repurposing discovery.
 - **[Phase 8: Report](implementation/08_phase_report.md)** ✅ - Structured scientific reports
 - **[Phase 9: Source Cleanup](implementation/09_phase_source_cleanup.md)** ✅ - Remove DuckDuckGo
 - **[Phase 10: ClinicalTrials](implementation/10_phase_clinicaltrials.md)** ✅ - Clinical trials API
-- **[Phase 11: bioRxiv](implementation/11_phase_biorxiv.md)** ✅ - Preprint search
 - **[Phase 12: MCP Server](implementation/12_phase_mcp_server.md)** ✅ - Claude Desktop integration
 - **[Phase 13: Modal Integration](implementation/13_phase_modal_integration.md)** ✅ - Secure code execution
 - **[Phase 14: Demo Submission](implementation/14_phase_demo_submission.md)** ✅ - Hackathon submission
 ### Guides
 - **[Deployment Guide](guides/deployment.md)** - Gradio, MCP, and Modal launch steps
 ### Development
 - **[Testing Strategy](development/testing.md)** - Unit, Integration, and E2E testing patterns
 ---
 ## What We're Building
-**One-liner**: AI agent that searches medical literature to find existing drugs that might treat new diseases.
-**Example Query**:
-> "What existing drugs might help treat long COVID fatigue?"
 **Output**: Research report with drug candidates, mechanisms, evidence quality, and citations.
@@ -54,7 +77,7 @@ AI-powered deep research system for accelerating drug repurposing discovery.
 User Question → Research Agent (Orchestrator)
                       ↓
               Search Loop:
-                → Tools (PubMed, ClinicalTrials, bioRxiv)
                 → Judge (Quality + Budget)
                 → Repeat or Synthesize
                       ↓
@@ -70,15 +93,7 @@ User Question → Research Agent (Orchestrator)
 | **Gradio UI** | ✅ Complete | Streaming chat interface |
 | **MCP Server** | ✅ Complete | Tools accessible from Claude Desktop |
 | **Modal Sandbox** | ✅ Complete | Secure statistical analysis |
-| **Multi-Source Search** | ✅ Complete | PubMed, ClinicalTrials, bioRxiv |
----
-## Team
-- The-Obstacle-Is-The-Way
-- MarioAderman
-- Josephrp
 ---
@@ -88,5 +103,5 @@ User Question → Research Agent (Orchestrator)
 |-------|--------|
 | Phases 1-14 | ✅ COMPLETE |
-**Test Coverage**: 65% (96 tests passing)
-**Architecture Review**: PASSED (98-99/100)

 # DeepBoner Documentation
+## Sexual Health Research Agent
+AI-powered deep research system for sexual wellness, reproductive health, and hormone therapy research.
 ---
 ### Architecture
 - **[Overview](architecture/overview.md)** - Project overview, use case, architecture
 - **[Design Patterns](architecture/design-patterns.md)** - Technical patterns, data models
+- **[Workflow Diagrams](workflow-diagrams.md)** - Visual architecture (Magentic v2.0)
+### Implementation (Phases 1-14 ✅ COMPLETE)
 - **[Roadmap](implementation/roadmap.md)** - Phased execution plan with TDD
 - **[Phase 1: Foundation](implementation/01_phase_foundation.md)** ✅ - Tooling, config, first tests
 - **[Phase 2: Search](implementation/02_phase_search.md)** ✅ - PubMed search
 - **[Phase 8: Report](implementation/08_phase_report.md)** ✅ - Structured scientific reports
 - **[Phase 9: Source Cleanup](implementation/09_phase_source_cleanup.md)** ✅ - Remove DuckDuckGo
 - **[Phase 10: ClinicalTrials](implementation/10_phase_clinicaltrials.md)** ✅ - Clinical trials API
+- **[Phase 11: Europe PMC](implementation/11_phase_biorxiv.md)** ✅ - Preprint search
 - **[Phase 12: MCP Server](implementation/12_phase_mcp_server.md)** ✅ - Claude Desktop integration
 - **[Phase 13: Modal Integration](implementation/13_phase_modal_integration.md)** ✅ - Secure code execution
 - **[Phase 14: Demo Submission](implementation/14_phase_demo_submission.md)** ✅ - Hackathon submission
+### Future Roadmap
+- **[Overview](future-roadmap/phases/README.md)** - Planned phases 15-17
+- **[Phase 15: OpenAlex](future-roadmap/phases/15_PHASE_OPENALEX.md)** - Citation network integration
+- **[Phase 16: PubMed Full-text](future-roadmap/phases/16_PHASE_PUBMED_FULLTEXT.md)** - BioC API
+- **[Phase 17: Rate Limiting](future-roadmap/phases/17_PHASE_RATE_LIMITING.md)** - Production hardening
+- **[Deep Research Mode](future-roadmap/DEEP_RESEARCH_ROADMAP.md)** - GPT-Researcher style enhancements
+### Bugs & Issues
+- **[Active Bugs](bugs/ACTIVE_BUGS.md)** - Current issues and workarounds
+### Decisions
+- **[PR #55 Evaluation](decisions/2025-11-27-pr55-evaluation.md)** - Architecture decision record
+- **[Magentic + PydanticAI](decisions/architecture-2025-11/)** - Framework architecture decisions
 ### Guides
 - **[Deployment Guide](guides/deployment.md)** - Gradio, MCP, and Modal launch steps
 ### Development
 - **[Testing Strategy](development/testing.md)** - Unit, Integration, and E2E testing patterns
+### Brainstorming (Source Improvements)
+- **[Roadmap Summary](brainstorming/00_ROADMAP_SUMMARY.md)** - Data source enhancement ideas
+- **[PubMed Improvements](brainstorming/01_PUBMED_IMPROVEMENTS.md)**
+- **[ClinicalTrials Improvements](brainstorming/02_CLINICALTRIALS_IMPROVEMENTS.md)**
+- **[Europe PMC Improvements](brainstorming/03_EUROPEPMC_IMPROVEMENTS.md)**
 ---
 ## What We're Building
+**One-liner**: AI agent that searches medical literature to find evidence for sexual health research questions.
+**Example Queries**:
+> "What drugs improve female libido post-menopause?"
+> "Evidence for testosterone therapy in women with HSDD?"
+> "Clinical trials for erectile dysfunction alternatives to PDE5 inhibitors?"
 **Output**: Research report with drug candidates, mechanisms, evidence quality, and citations.
 User Question → Research Agent (Orchestrator)
                       ↓
               Search Loop:
+                → Tools (PubMed, ClinicalTrials, Europe PMC)
                 → Judge (Quality + Budget)
                 → Repeat or Synthesize
                       ↓
 | **Gradio UI** | ✅ Complete | Streaming chat interface |
 | **MCP Server** | ✅ Complete | Tools accessible from Claude Desktop |
 | **Modal Sandbox** | ✅ Complete | Secure statistical analysis |
+| **Multi-Source Search** | ✅ Complete | PubMed, ClinicalTrials, Europe PMC |
 ---
 |-------|--------|
 | Phases 1-14 | ✅ COMPLETE |
+**Tests**: 127 passing, 0 warnings
+**Known Issues**: See [Active Bugs](bugs/ACTIVE_BUGS.md)

docs/to_do/REFERENCE_GRADDIO_DEMO_ANALYSIS.md DELETED

@@ -1,229 +0,0 @@
-# Reference: GradioDemo Analysis
-> Analysis of code from https://github.com/DeepBoner/GradioDemo
-> Purpose: Extract good ideas, understand patterns, avoid mistakes
-## Overview
-| Metric | Value |
-|--------|-------|
-| Total lines added | ~7,000 |
-| New Python files | +20 |
-| Test pass rate | 80% (62 errors due to missing mocks) |
-| Integration status | **NOT WIRED IN** |
-## Component Catalog
-### REDUNDANT (Already have equivalent)
-| Component | Lines | What We Have Instead |
-|-----------|-------|---------------------|
-| `orchestrator/graph_orchestrator.py` | 974 | MagenticBuilder |
-| `middleware/budget_tracker.py` | 391 | MagenticBuilder max_round_count |
-| `middleware/state_machine.py` | 130 | agents/state.py with contextvars |
-| `middleware/workflow_manager.py` | 300 | asyncio.gather() |
-| `orchestrator/research_flow.py` (IterativeResearchFlow) | 500 | MagenticOrchestrator |
-| HuggingFace integration | various | HFInferenceJudgeHandler |
-### POTENTIALLY USEFUL (Ideas to cherry-pick)
-#### 1. InputParser (`agents/input_parser.py` - 179 lines)
-**Idea**: Detect research mode from query text.
-```python
-# Key logic (simplified)
-research_mode: Literal["iterative", "deep"] = "iterative"
-if any(keyword in query.lower() for keyword in [
-    "comprehensive", "report", "sections", "analyze", "analysis", "overview", "market"
-]):
-    research_mode = "deep"
-```
-**Good pattern**: Heuristic fallback when LLM fails.
-**Our implementation**: See Phase 1 in DEEP_RESEARCH_ROADMAP.md
-#### 2. PlannerAgent (`orchestrator/planner_agent.py` - 184 lines)
-**Idea**: LLM creates section outline for report.
-```python
-class ReportPlan(BaseModel):
-    title: str
-    sections: list[ReportSection]
-    estimated_time_minutes: int
-class ReportSection(BaseModel):
-    title: str
-    query: str
-    description: str
-    priority: int
-```
-**Good pattern**: Structured output with Pydantic models.
-**Our implementation**: See Phase 2 in DEEP_RESEARCH_ROADMAP.md
-#### 3. DeepResearchFlow (`orchestrator/research_flow.py` - 500 lines)
-**Idea**: Run parallel research loops per section.
-```python
-# Their pattern (simplified)
-async def run_parallel_loops(sections: list[ReportSection]):
-    tasks = [run_single_loop(s) for s in sections]
-    results = await asyncio.gather(*tasks, return_exceptions=True)
-```
-**Problem**: They built new IterativeResearchFlow instead of reusing MagenticOrchestrator.
-**Our implementation**: Just run multiple MagenticOrchestrator instances.
-#### 4. LlamaIndex RAG (`services/llamaindex_rag.py` - 454 lines)
-**Idea**: Semantic search over collected evidence.
-```python
-# Their approach
-class LlamaIndexRAGService:
-    def __init__(self):
-        # ChromaDB + LlamaIndex + HuggingFace embeddings
-        self.vector_store = ChromaVectorStore(...)
-        self.index = VectorStoreIndex(...)
-    def retrieve(self, query: str, top_k: int = 5) -> list[dict]:
-        retriever = VectorIndexRetriever(index=self.index, similarity_top_k=top_k)
-        return retriever.retrieve(query)
-```
-**Good**: Full-featured RAG with multiple embedding providers.
-**Simpler alternative**: Direct ChromaDB + sentence-transformers (no LlamaIndex).
-**Our implementation**: See Phase 4 in DEEP_RESEARCH_ROADMAP.md
-#### 5. LongWriterAgent (`agents/long_writer.py` - ~300 lines)
-**Idea**: Write reports section-by-section to handle length.
-```python
-class SectionOutput(BaseModel):
-    section_content: str
-    references: list[str]
-    next_section_context: str  # What to avoid repeating
-async def write_next_section(
-    section_title: str,
-    findings: str,
-    previous_sections: str,  # Avoid repetition
-) -> SectionOutput:
-```
-**Good pattern**: Passing context to avoid repetition.
-**Our implementation**: See Phase 5 in DEEP_RESEARCH_ROADMAP.md
-#### 6. ProofreaderAgent (`agents/proofreader.py` - ~200 lines)
-**Idea**: Final cleanup pass on report.
-```python
-# Tasks:
-# 1. Remove duplicate information
-# 2. Fix citation numbering
-# 3. Add executive summary
-# 4. Ensure consistent formatting
-```
-**Good pattern**: Separate concerns - writer writes, proofreader polishes.
-**Our implementation**: Optional Phase 6 if needed.
-### Graph Architecture (Educational Reference)
-The graph system is well-designed in theory:
-```python
-# Node types
-class AgentNode(GraphNode):
-    agent: Any  # Pydantic AI agent
-    input_transformer: Callable  # Transform input
-    output_transformer: Callable  # Transform output
-class DecisionNode(GraphNode):
-    decision_function: Callable[[Any], str]  # Returns next node ID
-    options: list[str]
-class ParallelNode(GraphNode):
-    parallel_nodes: list[str]  # Run these in parallel
-    aggregator: Callable  # Combine results
-# Graph structure
-class ResearchGraph:
-    nodes: dict[str, GraphNode]
-    edges: dict[str, list[GraphEdge]]
-    entry_node: str
-    exit_nodes: list[str]
-```
-**Why we don't need it**: MagenticBuilder already provides:
-- Agent coordination via manager
-- Conditional routing (manager decides)
-- Multiple participants
-This is essentially reimplementing what `agent-framework` already does.
-## Key Lessons
-### What Went Wrong
-1. **Parallel architecture** - Built new system instead of extending existing
-2. **Horizontal sprawl** - All infrastructure, nothing wired in
-3. **Test mocking** - Tests don't mock API clients properly
-4. **No manual testing** - Code never ran end-to-end
-### What To Learn From
-1. **Pydantic models for structured output** - Good pattern
-2. **Heuristic fallbacks** - When LLM fails, have a fallback
-3. **Section-by-section writing** - For long reports
-4. **RAG for evidence retrieval** - Useful for large evidence sets
-### The 7,000 Line vs 500 Line Comparison
-**Their approach**:
-- New GraphOrchestrator (974 lines)
-- New ResearchFlow (999 lines)
-- New BudgetTracker (391 lines)
-- New StateMachine (130 lines)
-- New WorkflowManager (300 lines)
-- New agents (InputParser, Writer, LongWriter, Proofreader, etc.)
-- Total: ~7,000 lines, not integrated
-**Our approach**:
-- InputParser (50-100 lines) - extends existing
-- PlannerAgent (80-120 lines) - uses ChatAgent pattern
-- DeepOrchestrator (100-150 lines) - wraps MagenticOrchestrator
-- RAGService (100-150 lines) - simple ChromaDB
-- LongWriter (80-100 lines) - extends ReportAgent
-- Total: ~500 lines, each phase ships working
-## File Locations (for reference)
-```
-reference_repos/GradioDemo/src/
-├── orchestrator/
-│   ├── graph_orchestrator.py    # 974 lines - graph execution
-│   ├── research_flow.py         # 999 lines - iterative/deep flows
-│   └── planner_agent.py         # 184 lines - section planning
-├── agents/
-│   ├── input_parser.py          # 179 lines - query analysis
-│   ├── writer.py                # 210 lines - report writing
-│   ├── long_writer.py           # ~300 lines - section writing
-│   ├── proofreader.py           # ~200 lines - cleanup
-│   └── knowledge_gap.py         # gap detection
-├── middleware/
-│   ├── budget_tracker.py        # 391 lines - token/time tracking
-│   ├── state_machine.py         # 130 lines - workflow state
-│   └── workflow_manager.py      # 300 lines - parallel loop mgmt
-├── services/
-│   └── llamaindex_rag.py        # 454 lines - RAG service
-├── tools/
-│   └── rag_tool.py              # 191 lines - RAG as search tool
-└── agent_factory/
-    └── graph_builder.py         # ~400 lines - graph construction
-```