DeepBoner Workflow - Magentic Architecture
Architecture Pattern: Microsoft Magentic Orchestration
Design Philosophy: Simple, dynamic, manager-driven coordination
Key Innovation: Intelligent manager replaces rigid sequential phases
Last Updated: 2025-12-06
Current Agent Inventory
| Agent | File | Status |
|---|---|---|
| Manager | AdvancedOrchestrator | ✅ Implemented |
| Hypothesis Agent | hypothesis_agent.py | ✅ Implemented |
| Search Agent | search_agent.py | ✅ Implemented |
| Judge Agent | judge_agent.py | ✅ Implemented |
| Report Agent | report_agent.py | ✅ Implemented |
| Retrieval Agent | retrieval_agent.py | ✅ Implemented (web search) |
| Analysis Agent | N/A | ❌ Not implemented (no code execution) |
Note: Some diagrams below show "AnalysisAgent" with code execution capabilities. This was planned but not implemented. Modal code execution was removed in PR #130.
1. High-Level Magentic Workflow
flowchart TD
Start([User Query]) --> Manager[Magentic Manager<br/>Plan • Select • Assess • Adapt]
Manager -->|Plans| Task1[Task Decomposition]
Task1 --> Manager
Manager -->|Selects & Executes| HypAgent[Hypothesis Agent]
Manager -->|Selects & Executes| SearchAgent[Search Agent]
Manager -->|Selects & Executes| AnalysisAgent[Analysis Agent]
Manager -->|Selects & Executes| ReportAgent[Report Agent]
HypAgent -->|Results| Manager
SearchAgent -->|Results| Manager
AnalysisAgent -->|Results| Manager
ReportAgent -->|Results| Manager
Manager -->|Assesses Quality| Decision{Good Enough?}
Decision -->|No - Refine| Manager
Decision -->|No - Different Agent| Manager
Decision -->|No - Stalled| Replan[Reset Plan]
Replan --> Manager
Decision -->|Yes| Synthesis[Synthesize Final Result]
Synthesis --> Output([Research Report])
style Start fill:#e1f5e1
style Manager fill:#ffe6e6
style HypAgent fill:#fff4e6
style SearchAgent fill:#fff4e6
style AnalysisAgent fill:#fff4e6
style ReportAgent fill:#fff4e6
style Decision fill:#ffd6d6
style Synthesis fill:#d4edda
style Output fill:#e1f5e1
2. Magentic Manager: The 6-Phase Cycle
flowchart LR
P1[1. Planning<br/>Analyze task<br/>Create strategy] --> P2[2. Agent Selection<br/>Pick best agent<br/>for subtask]
P2 --> P3[3. Execution<br/>Run selected<br/>agent with tools]
P3 --> P4[4. Assessment<br/>Evaluate quality<br/>Check progress]
P4 --> Decision{Quality OK?<br/>Progress made?}
Decision -->|Yes| P6[6. Synthesis<br/>Combine results<br/>Generate report]
Decision -->|No| P5[5. Iteration<br/>Adjust plan<br/>Try again]
P5 --> P2
P6 --> Done([Complete])
style P1 fill:#fff4e6
style P2 fill:#ffe6e6
style P3 fill:#e6f3ff
style P4 fill:#ffd6d6
style P5 fill:#fff3cd
style P6 fill:#d4edda
style Done fill:#e1f5e1
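The six phases above can be sketched as a small loop. This is a minimal illustration, not the AdvancedOrchestrator implementation: `run_cycle`, `CycleResult`, and the `plan` and `assess` callables are hypothetical names, agents are modeled as plain functions, and the default round limit simply reuses the `max_round_count` value of 15 quoted later in this document.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical stand-in: an agent is any function from task text to output text.
Agent = Callable[[str], str]


@dataclass
class CycleResult:
    report: str
    rounds: int
    history: List[str] = field(default_factory=list)


def run_cycle(
    task: str,
    agents: Dict[str, Agent],
    plan: Callable[[str], List[str]],
    assess: Callable[[str], bool],
    max_rounds: int = 15,
) -> CycleResult:
    """Run plan -> select -> execute -> assess -> iterate -> synthesize."""
    queue = plan(task)                   # 1. Planning: ordered list of agent names
    outputs: List[str] = []
    history: List[str] = []
    rounds = 0
    while queue and rounds < max_rounds:
        name = queue[0]                  # 2. Agent selection
        output = agents[name](task)      # 3. Execution
        rounds += 1
        history.append(name)
        if assess(output):               # 4. Assessment
            outputs.append(output)
            queue.pop(0)
        # 5. Iteration: a failed assessment leaves the agent queued for a retry
    report = "\n".join(outputs)          # 6. Synthesis
    return CycleResult(report=report, rounds=rounds, history=history)
```

In this sketch a retry re-runs the same agent with the same task; a real manager would presumably pass feedback or adjust the plan before trying again.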
3. Simplified Agent Architecture
graph TB
subgraph "Orchestration Layer"
Manager[Magentic Manager<br/>• Plans workflow<br/>• Selects agents<br/>• Assesses quality<br/>• Adapts strategy]
SharedContext[(Shared Context<br/>• Hypotheses<br/>• Search Results<br/>• Analysis<br/>• Progress)]
Manager <--> SharedContext
end
subgraph "Specialist Agents"
HypAgent[Hypothesis Agent<br/>• Domain understanding<br/>• Hypothesis generation<br/>• Testability refinement]
SearchAgent[Search Agent<br/>• Multi-source search<br/>• RAG retrieval<br/>• Result ranking]
AnalysisAgent[Analysis Agent<br/>• Evidence extraction<br/>• Statistical analysis<br/>• Code execution]
ReportAgent[Report Agent<br/>• Report assembly<br/>• Visualization<br/>• Citation formatting]
end
subgraph "MCP Tools"
WebSearch[Web Search<br/>PubMed • ClinicalTrials • Europe PMC]
CodeExec[Code Execution<br/>Sandboxed Python]
RAG[RAG Retrieval<br/>Vector DB • Embeddings]
Viz[Visualization<br/>Charts • Graphs]
end
Manager -->|Selects & Directs| HypAgent
Manager -->|Selects & Directs| SearchAgent
Manager -->|Selects & Directs| AnalysisAgent
Manager -->|Selects & Directs| ReportAgent
HypAgent --> SharedContext
SearchAgent --> SharedContext
AnalysisAgent --> SharedContext
ReportAgent --> SharedContext
SearchAgent --> WebSearch
SearchAgent --> RAG
AnalysisAgent --> CodeExec
ReportAgent --> CodeExec
ReportAgent --> Viz
style Manager fill:#ffe6e6
style SharedContext fill:#ffe6f0
style HypAgent fill:#fff4e6
style SearchAgent fill:#fff4e6
style AnalysisAgent fill:#fff4e6
style ReportAgent fill:#fff4e6
style WebSearch fill:#e6f3ff
style CodeExec fill:#e6f3ff
style RAG fill:#e6f3ff
style Viz fill:#e6f3ff
4. Dynamic Workflow Example
sequenceDiagram
participant User
participant Manager
participant HypAgent
participant SearchAgent
participant AnalysisAgent
participant ReportAgent
User->>Manager: "Research protein folding in Alzheimer's"
Note over Manager: PLAN: Generate hypotheses → Search → Analyze → Report
Manager->>HypAgent: Generate 3 hypotheses
HypAgent-->>Manager: Returns 3 hypotheses
Note over Manager: ASSESS: Good quality, proceed
Manager->>SearchAgent: Search literature for hypothesis 1
SearchAgent-->>Manager: Returns 15 papers
Note over Manager: ASSESS: Good results, continue
Manager->>SearchAgent: Search for hypothesis 2
SearchAgent-->>Manager: Only 2 papers found
Note over Manager: ASSESS: Insufficient, refine search
Manager->>SearchAgent: Refined query for hypothesis 2
SearchAgent-->>Manager: Returns 12 papers
Note over Manager: ASSESS: Better, proceed
Manager->>AnalysisAgent: Analyze evidence for all hypotheses
AnalysisAgent-->>Manager: Returns analysis with code
Note over Manager: ASSESS: Complete, generate report
Manager->>ReportAgent: Create comprehensive report
ReportAgent-->>Manager: Returns formatted report
Note over Manager: SYNTHESIZE: Combine all results
Manager->>User: Final Research Report
5. Manager Decision Logic
flowchart TD
Start([Manager Receives Task]) --> Plan[Create Initial Plan]
Plan --> Select[Select Agent for Next Subtask]
Select --> Execute[Execute Agent]
Execute --> Collect[Collect Results]
Collect --> Assess[Assess Quality & Progress]
Assess --> Q1{Quality Sufficient?}
Q1 -->|No| Q2{Same Agent Can Fix?}
Q2 -->|Yes| Feedback[Provide Specific Feedback]
Feedback --> Execute
Q2 -->|No| Different[Try Different Agent]
Different --> Select
Q1 -->|Yes| Q3{Task Complete?}
Q3 -->|No| Q4{Making Progress?}
Q4 -->|Yes| Select
Q4 -->|No - Stalled| Replan[Reset Plan & Approach]
Replan --> Plan
Q3 -->|Yes| Synth[Synthesize Final Result]
Synth --> Done([Return Report])
style Start fill:#e1f5e1
style Plan fill:#fff4e6
style Select fill:#ffe6e6
style Execute fill:#e6f3ff
style Assess fill:#ffd6d6
style Q1 fill:#ffe6e6
style Q2 fill:#ffe6e6
style Q3 fill:#ffe6e6
style Q4 fill:#ffe6e6
style Synth fill:#d4edda
style Done fill:#e1f5e1
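The branching in the flowchart above reduces to a small pure function over four yes/no answers. The sketch below is illustrative only: `Assessment`, `Action`, and `decide` are hypothetical names, and in the real system these answers would come from the LLM-powered manager rather than from boolean fields.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    RETRY_WITH_FEEDBACK = "retry_with_feedback"
    TRY_DIFFERENT_AGENT = "try_different_agent"
    SELECT_NEXT_AGENT = "select_next_agent"
    REPLAN = "replan"
    SYNTHESIZE = "synthesize"


@dataclass(frozen=True)
class Assessment:
    quality_sufficient: bool   # Q1 in the flowchart
    same_agent_can_fix: bool   # Q2
    task_complete: bool        # Q3
    making_progress: bool      # Q4


def decide(a: Assessment) -> Action:
    """Map one assessment to the next manager action, following Q1 to Q4."""
    if not a.quality_sufficient:
        if a.same_agent_can_fix:
            return Action.RETRY_WITH_FEEDBACK
        return Action.TRY_DIFFERENT_AGENT
    if a.task_complete:
        return Action.SYNTHESIZE
    return Action.SELECT_NEXT_AGENT if a.making_progress else Action.REPLAN
```

Keeping the decision separate from agent execution makes each branch of the flowchart checkable in isolation.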
6. Hypothesis Agent Workflow
flowchart LR
Input[Research Query] --> Domain[Identify Domain<br/>& Key Concepts]
Domain --> Context[Retrieve Background<br/>Knowledge]
Context --> Generate[Generate 3-5<br/>Initial Hypotheses]
Generate --> Refine[Refine for<br/>Testability]
Refine --> Rank[Rank by<br/>Quality Score]
Rank --> Output[Return Top<br/>Hypotheses]
Output --> Struct[Hypothesis Structure:<br/>• Statement<br/>• Rationale<br/>• Testability Score<br/>• Data Requirements<br/>• Expected Outcomes]
style Input fill:#e1f5e1
style Output fill:#fff4e6
style Struct fill:#e6f3ff
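The hypothesis structure listed in the diagram could be modeled roughly as follows. This is a sketch under stated assumptions: the `Hypothesis` class and `top_hypotheses` helper are hypothetical, the testability score is assumed to be a float between 0.0 and 1.0, and it doubles here as the "quality score" used for ranking.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Hypothesis:
    statement: str
    rationale: str
    testability_score: float  # assumed range 0.0 to 1.0
    data_requirements: List[str] = field(default_factory=list)
    expected_outcomes: List[str] = field(default_factory=list)


def top_hypotheses(candidates: List[Hypothesis], limit: int = 3) -> List[Hypothesis]:
    """Rank candidates by score, highest first, and keep at most `limit`."""
    ranked = sorted(candidates, key=lambda h: h.testability_score, reverse=True)
    return ranked[:limit]
```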
7. Search Agent Workflow
flowchart TD
Input[Hypotheses] --> Strategy[Formulate Search<br/>Strategy per Hypothesis]
Strategy --> Multi[Multi-Source Search]
Multi --> PubMed[PubMed Search<br/>via MCP]
Multi --> Trials[ClinicalTrials Search<br/>via MCP]
Multi --> EuropePMC[Europe PMC Search<br/>via MCP]
PubMed --> Aggregate[Aggregate Results]
Trials --> Aggregate
EuropePMC --> Aggregate
Aggregate --> Filter[Filter & Rank<br/>by Relevance]
Filter --> Dedup[Deduplicate<br/>Cross-Reference]
Dedup --> Embed[Embed Documents<br/>via MCP]
Embed --> Vector[(Vector DB)]
Vector --> RAGRetrieval[RAG Retrieval<br/>Top-K per Hypothesis]
RAGRetrieval --> Output[Return Contextualized<br/>Search Results]
style Input fill:#fff4e6
style Multi fill:#ffe6e6
style Vector fill:#ffe6f0
style Output fill:#e6f3ff
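The aggregate, rank, and deduplicate steps above can be illustrated with a short sketch. It assumes each source returns records with a title and a relevance score; `Paper`, `aggregate`, and the title-based duplicate key are hypothetical choices for illustration, and the embedding and vector DB steps are left out.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Paper:
    title: str
    source: str       # e.g. "pubmed", "clinicaltrials", "europepmc"
    relevance: float  # hypothetical per-source relevance score


def _key(title: str) -> str:
    """Normalize case and whitespace so near-identical titles collide."""
    return " ".join(title.lower().split())


def aggregate(results_by_source: Iterable[List[Paper]], top_k: int = 10) -> List[Paper]:
    """Merge per-source results, keep the best copy of each title, rank by relevance."""
    best: Dict[str, Paper] = {}
    for results in results_by_source:
        for paper in results:
            key = _key(paper.title)
            if key not in best or paper.relevance > best[key].relevance:
                best[key] = paper
    return sorted(best.values(), key=lambda p: p.relevance, reverse=True)[:top_k]
```

Matching on normalized titles is a simplification; matching on an identifier such as a DOI or PMID would likely be more robust where one is available.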
8. Analysis Agent Workflow
flowchart TD
Input1[Hypotheses] --> Extract
Input2[Search Results] --> Extract[Extract Evidence<br/>per Hypothesis]
Extract --> Methods[Determine Analysis<br/>Methods Needed]
Methods --> Branch{Requires<br/>Computation?}
Branch -->|Yes| GenCode[Generate Python<br/>Analysis Code]
Branch -->|No| Qual[Qualitative<br/>Synthesis]
GenCode --> Execute[Execute Code<br/>via MCP Sandbox]
Execute --> Interpret1[Interpret<br/>Results]
Qual --> Interpret2[Interpret<br/>Findings]
Interpret1 --> Synthesize[Synthesize Evidence<br/>Across Sources]
Interpret2 --> Synthesize
Synthesize --> Verdict[Determine Verdict<br/>per Hypothesis]
Verdict --> Support[• Supported<br/>• Refuted<br/>• Inconclusive]
Support --> Gaps[Identify Knowledge<br/>Gaps & Limitations]
Gaps --> Output[Return Analysis<br/>Report]
style Input1 fill:#fff4e6
style Input2 fill:#e6f3ff
style Execute fill:#ffe6e6
style Output fill:#e6ffe6
9. Report Agent Workflow
flowchart TD
Input1[Query] --> Assemble
Input2[Hypotheses] --> Assemble
Input3[Search Results] --> Assemble
Input4[Analysis] --> Assemble[Assemble Report<br/>Sections]
Assemble --> Exec[Executive Summary]
Assemble --> Intro[Introduction]
Assemble --> Methods[Methods]
Assemble --> Results[Results per<br/>Hypothesis]
Assemble --> Discussion[Discussion]
Assemble --> Future[Future Directions]
Assemble --> Refs[References]
Results --> VizCheck{Needs<br/>Visualization?}
VizCheck -->|Yes| GenViz[Generate Viz Code]
GenViz --> ExecViz[Execute via MCP<br/>Create Charts]
ExecViz --> Combine
VizCheck -->|No| Combine[Combine All<br/>Sections]
Exec --> Combine
Intro --> Combine
Methods --> Combine
Discussion --> Combine
Future --> Combine
Refs --> Combine
Combine --> Format[Format Output]
Format --> MD[Markdown]
Format --> PDF[PDF]
Format --> JSON[JSON]
MD --> Output[Return Final<br/>Report]
PDF --> Output
JSON --> Output
style Input1 fill:#e1f5e1
style Input2 fill:#fff4e6
style Input3 fill:#e6f3ff
style Input4 fill:#e6ffe6
style Output fill:#d4edda
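As a rough illustration of the assemble, combine, and format steps above, the sketch below joins named sections in a fixed order and renders them as Markdown or JSON. The section names follow the diagram; `SECTION_ORDER`, `to_markdown`, and `to_json` are hypothetical helpers, and PDF output and chart generation are not covered.

```python
import json
from typing import Dict, List

# Section order taken from the diagram above.
SECTION_ORDER: List[str] = [
    "Executive Summary", "Introduction", "Methods", "Results",
    "Discussion", "Future Directions", "References",
]


def _present(sections: Dict[str, str]) -> Dict[str, str]:
    """Keep non-empty sections only, in the canonical order."""
    return {
        name: sections[name].strip()
        for name in SECTION_ORDER
        if sections.get(name, "").strip()
    }


def to_markdown(sections: Dict[str, str]) -> str:
    parts = [f"## {name}\n\n{body}" for name, body in _present(sections).items()]
    return "\n\n".join(parts) + "\n"


def to_json(sections: Dict[str, str]) -> str:
    return json.dumps(_present(sections), indent=2)
```

Both renderers share one ordering step, so the Markdown and JSON outputs stay consistent with each other.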
10. Data Flow & Event Streaming
flowchart TD
User[👤 User] -->|Research Query| UI[Gradio UI]
UI -->|Submit| Manager[Magentic Manager]
Manager -->|Event: Planning| UI
Manager -->|Select Agent| HypAgent[Hypothesis Agent]
HypAgent -->|Event: Delta/Message| UI
HypAgent -->|Hypotheses| Context[(Shared Context)]
Context -->|Retrieved by| Manager
Manager -->|Select Agent| SearchAgent[Search Agent]
SearchAgent -->|MCP Request| WebSearch[Web Search Tool]
WebSearch -->|Results| SearchAgent
SearchAgent -->|Event: Delta/Message| UI
SearchAgent -->|Documents| Context
SearchAgent -->|Embeddings| VectorDB[(Vector DB)]
Context -->|Retrieved by| Manager
Manager -->|Select Agent| AnalysisAgent[Analysis Agent]
AnalysisAgent -->|MCP Request| CodeExec[Code Execution Tool]
CodeExec -->|Results| AnalysisAgent
AnalysisAgent -->|Event: Delta/Message| UI
AnalysisAgent -->|Analysis| Context
Context -->|Retrieved by| Manager
Manager -->|Select Agent| ReportAgent[Report Agent]
ReportAgent -->|MCP Request| CodeExec
ReportAgent -->|Event: Delta/Message| UI
ReportAgent -->|Report| Context
Manager -->|Event: Final Result| UI
UI -->|Display| User
style User fill:#e1f5e1
style UI fill:#e6f3ff
style Manager fill:#ffe6e6
style Context fill:#ffe6f0
style VectorDB fill:#ffe6f0
style WebSearch fill:#f0f0f0
style CodeExec fill:#f0f0f0
11. Tool Architecture
graph TB
subgraph "Agent Layer"
Manager[Magentic Manager]
HypAgent[Hypothesis Agent]
SearchAgent[Search Agent]
JudgeAgent[Judge Agent]
ReportAgent[Report Agent]
end
subgraph "Tool Layer (Direct Calls)"
Tools[AI Functions<br/>@ai_function decorated<br/>• search_pubmed<br/>• search_clinical_trials<br/>• search_preprints<br/>• search_web<br/>• get_bibliography]
end
subgraph "Tool Wrappers"
PubMedTool[PubMedTool]
TrialsTool[ClinicalTrialsTool]
EuropePMCTool[EuropePMCTool]
WebSearchTool[WebSearchTool]
end
subgraph "External APIs"
PubMed[PubMed E-utilities]
Trials[ClinicalTrials.gov API]
EuropePMC[Europe PMC API]
DDG[DuckDuckGo]
end
SearchAgent -->|Calls| Tools
Tools --> PubMedTool
Tools --> TrialsTool
Tools --> EuropePMCTool
Tools --> WebSearchTool
PubMedTool --> PubMed
TrialsTool --> Trials
EuropePMCTool --> EuropePMC
WebSearchTool --> DDG
style Manager fill:#ffe6e6
style Tools fill:#fff4e6
style PubMedTool fill:#e6f3ff
style TrialsTool fill:#e6f3ff
style EuropePMCTool fill:#e6f3ff
style WebSearchTool fill:#e6f3ff
Note: MCP support is provided via Gradio's built-in mcp_server=True option in src/app.py. This exposes the Gradio interface as an MCP server for Claude Desktop integration.
12. Progress Tracking & Stall Detection
stateDiagram-v2
[*] --> Initialization: User Query
Initialization --> Planning: Manager starts
Planning --> AgentExecution: Select agent
AgentExecution --> Assessment: Collect results
Assessment --> QualityCheck: Evaluate output
QualityCheck --> AgentExecution: Poor quality<br/>(retry < max_rounds)
QualityCheck --> Planning: Poor quality<br/>(try different agent)
QualityCheck --> NextAgent: Good quality<br/>(task incomplete)
QualityCheck --> Synthesis: Good quality<br/>(task complete)
NextAgent --> AgentExecution: Select next agent
state StallDetection <<choice>>
Assessment --> StallDetection: Check progress
StallDetection --> Planning: No progress<br/>(stall count < max)
StallDetection --> ErrorRecovery: No progress<br/>(max stalls reached)
ErrorRecovery --> PartialReport: Generate partial results
PartialReport --> [*]
Synthesis --> FinalReport: Combine all outputs
FinalReport --> [*]
note right of QualityCheck
Manager assesses:
• Output completeness
• Quality metrics
• Progress made
end note
note right of StallDetection
Stall = no new progress
after agent execution
Triggers plan reset
end note
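The round and stall limits in the state diagram can be sketched as a small counter. The defaults reuse the `max_round_count` (15) and `max_stall_count` (3) values quoted in the design principles of this document. The `ProgressTracker` class is hypothetical, and two behaviors are assumptions rather than documented facts: the stall counter resets whenever progress is made, and hitting the round limit is treated like hitting the stall limit.

```python
from dataclasses import dataclass


@dataclass
class ProgressTracker:
    max_round_count: int = 15
    max_stall_count: int = 3
    round_count: int = 0
    stall_count: int = 0

    def record(self, made_progress: bool) -> str:
        """Record one agent execution and return the next state name."""
        self.round_count += 1
        if made_progress:
            self.stall_count = 0      # assumption: progress clears earlier stalls
        else:
            self.stall_count += 1
        out_of_budget = (
            self.stall_count >= self.max_stall_count
            or self.round_count >= self.max_round_count
        )
        if out_of_budget:
            return "ErrorRecovery"    # leads to a partial report
        return "AgentExecution" if made_progress else "Planning"
```

Three consecutive executions without progress move the workflow to error recovery, which matches the "max stalls reached" transition in the diagram.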
13. Gradio UI Integration
graph TD
App[Gradio App<br/>DeepBoner Research Agent]
App --> Input[Input Section]
App --> Status[Status Section]
App --> Output[Output Section]
Input --> Query[Research Question<br/>Text Area]
Input --> Controls[Controls]
Controls --> MaxHyp[Max Hypotheses: 1-10]
Controls --> MaxRounds[Max Rounds: 5-20]
Controls --> Submit[Start Research Button]
Status --> Log[Real-time Event Log<br/>• Manager planning<br/>• Agent selection<br/>• Execution updates<br/>• Quality assessment]
Status --> Progress[Progress Tracker<br/>• Current agent<br/>• Round count<br/>• Stall count]
Output --> Tabs[Tabbed Results]
Tabs --> Tab1[Hypotheses Tab<br/>Generated hypotheses with scores]
Tabs --> Tab2[Search Results Tab<br/>Papers & sources found]
Tabs --> Tab3[Analysis Tab<br/>Evidence & verdicts]
Tabs --> Tab4[Report Tab<br/>Final research report]
Tab4 --> Download[Download Report<br/>MD / PDF / JSON]
Submit -.->|Triggers| Workflow[Magentic Workflow]
Workflow -.->|MagenticOrchestratorMessageEvent| Log
Workflow -.->|AgentRunUpdateEvent| Log
Workflow -.->|ExecutorCompletedEvent| Log
Workflow -.->|WorkflowOutputEvent| Tab4
style App fill:#e1f5e1
style Input fill:#fff4e6
style Status fill:#e6f3ff
style Output fill:#e6ffe6
style Workflow fill:#ffe6e6
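The dotted edges in the diagram describe a routing rule: three event types feed the event log and one feeds the report tab. A minimal sketch of that rule follows. It treats events as `(type name, text)` pairs purely for illustration; `ROUTES` and `route_events` are hypothetical, and the sketch does not import or depend on the actual event classes.

```python
from typing import Dict, List, Tuple

# Event type names as they appear in the diagram, mapped to a UI target.
ROUTES: Dict[str, str] = {
    "MagenticOrchestratorMessageEvent": "log",
    "AgentRunUpdateEvent": "log",
    "ExecutorCompletedEvent": "log",
    "WorkflowOutputEvent": "report_tab",
}


def route_events(events: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Distribute (event_type, text) pairs to the log or the report tab."""
    ui: Dict[str, List[str]] = {"log": [], "report_tab": []}
    for event_type, text in events:
        target = ROUTES.get(event_type)
        if target is None:
            continue  # unknown event types are ignored in this sketch
        if target == "log":
            ui["log"].append(f"[{event_type}] {text}")
        else:
            ui["report_tab"].append(text)
    return ui
```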
14. Complete System Context
graph LR
User[👤 Researcher<br/>Asks research questions] -->|Submits query| DC[DeepBoner<br/>Magentic Workflow]
DC -->|Literature search| PubMed[PubMed API<br/>Medical papers]
DC -->|Clinical trials| Trials[ClinicalTrials.gov<br/>Trial data]
DC -->|Preprints| EuropePMC[Europe PMC API<br/>Preprints & papers]
DC -->|Web search| DDG[DuckDuckGo<br/>General web]
DC -->|Agent reasoning| LLM[LLM Backend<br/>OpenAI or HuggingFace]
DC -->|Embeddings| Embed[SentenceTransformers<br/>Local embeddings]
DC -->|Deployed on| HF[HuggingFace Spaces<br/>Gradio 5.x]
PubMed -->|Results| DC
Trials -->|Results| DC
EuropePMC -->|Results| DC
DDG -->|Results| DC
LLM -->|Responses| DC
Embed -->|Vectors| DC
DC -->|Research report| User
style User fill:#e1f5e1
style DC fill:#ffe6e6
style PubMed fill:#e6f3ff
style Trials fill:#e6f3ff
style EuropePMC fill:#e6f3ff
style DDG fill:#e6f3ff
style LLM fill:#ffd6d6
style Embed fill:#ffe6f0
style HF fill:#d4edda
15. Workflow Timeline (Simplified)
gantt
title DeepBoner Magentic Workflow - Typical Execution
dateFormat mm:ss
axisFormat %M:%S
section Manager Planning
Initial planning :p1, 00:00, 10s
section Hypothesis Agent
Generate hypotheses :h1, after p1, 30s
Manager assessment :h2, after h1, 5s
section Search Agent
Search hypothesis 1 :s1, after h2, 20s
Search hypothesis 2 :s2, after s1, 20s
Search hypothesis 3 :s3, after s2, 20s
RAG processing :s4, after s3, 15s
Manager assessment :s5, after s4, 5s
section Analysis Agent
Evidence extraction :a1, after s5, 15s
Code generation :a2, after a1, 20s
Code execution :a3, after a2, 25s
Synthesis :a4, after a3, 20s
Manager assessment :a5, after a4, 5s
section Report Agent
Report assembly :r1, after a5, 30s
Visualization :r2, after r1, 15s
Formatting :r3, after r2, 10s
section Manager Synthesis
Final synthesis :f1, after r3, 10s
Key Differences from Original Design
| Aspect | Original (Judge-in-Loop) | New (Magentic) |
|---|---|---|
| Control Flow | Fixed sequential phases | Dynamic agent selection |
| Quality Control | Separate Judge Agent | Manager assessment built-in |
| Retry Logic | Phase-level with feedback | Agent-level with adaptation |
| Flexibility | Rigid 4-phase pipeline | Adaptive workflow |
| Complexity | 5 agents (including Judge) | 4 agents (no Judge) |
| Progress Tracking | Manual state management | Built-in round/stall detection |
| Agent Coordination | Sequential handoff | Manager-driven dynamic selection |
| Error Recovery | Retry same phase | Try different agent or replan |
Simplified Design Principles
- Manager is Intelligent: LLM-powered manager handles planning, selection, and quality assessment
- No Separate Judge: Manager's assessment phase replaces dedicated Judge Agent
- Dynamic Workflow: Agents can be called multiple times in any order based on need
- Built-in Safety: max_round_count (15) and max_stall_count (3) prevent infinite loops
- Event-Driven UI: Real-time streaming updates to Gradio interface
- MCP-Powered Tools: All external capabilities via Model Context Protocol
- Shared Context: Centralized state accessible to all agents
- Progress Awareness: Manager tracks what's been done and what's needed
Legend
- 🔴 Red/Pink: Manager, orchestration, decision-making
- 🟡 Yellow/Orange: Specialist agents, processing
- 🔵 Blue: Data, tools, MCP services
- 🟣 Purple/Pink: Storage, databases, state
- 🟢 Green: User interactions, final outputs
- ⚪ Gray: External services, APIs
Implementation Highlights
Actual Agent Factory Pattern (from magentic_agents.py):
# Create agents via factory functions
search_agent = create_search_agent(chat_client, domain, api_key)
judge_agent = create_judge_agent(chat_client, domain, api_key)
hypothesis_agent = create_hypothesis_agent(chat_client, domain, api_key)
report_agent = create_report_agent(chat_client, domain, api_key)
# Each agent is a ChatAgent with specific tools:
# - SearchAgent: search_pubmed, search_clinical_trials, search_preprints
# - JudgeAgent: None (LLM-only evaluation)
# - HypothesisAgent: None (LLM-only generation)
# - ReportAgent: get_bibliography
Current Agent Capabilities:
- SearchAgent: Multi-source search (PubMed, ClinicalTrials, Europe PMC)
- JudgeAgent: Evaluates evidence quality, determines sufficiency
- HypothesisAgent: Generates research hypotheses
- ReportAgent: Generates final research report with bibliography
- RetrievalAgent: Web search via DuckDuckGo (⚠️ NOT wired in - see issue #134)
Manager (AdvancedOrchestrator) coordinates agent execution and workflow.
Document Version: 2.2 (Audited for accuracy)
Last Updated: 2025-12-06
Architecture: Microsoft Magentic Orchestration Pattern
Active Agents: 4 (Search, Judge, Hypothesis, Report) + Manager
Implemented but Not Wired: RetrievalAgent (see issue #134)
Planned but Not Implemented: AnalysisAgent (code execution removed in PR #130)
License: MIT