# Architecture Overview

> **Last Updated**: 2025-12-06

This document provides a comprehensive overview of DeepBoner's architecture.

## System Purpose

DeepBoner is an **AI-native sexual health research agent** that autonomously:

1. Searches biomedical databases (PubMed, ClinicalTrials.gov, Europe PMC, OpenAlex)
2. Evaluates evidence quality
3. Synthesizes research reports with citations

## High-Level Architecture

```
┌─────────────────────────────────────────────────────────────────────┐
│                           USER INTERFACE                            │
│                                                                     │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐           │
│  │  Gradio UI   │    │  MCP Server  │    │   Examples   │           │
│  │  (src/app)   │    │(mcp_tools.py)│    │  (scripts)   │           │
│  └──────┬───────┘    └──────┬───────┘    └──────┬───────┘           │
└─────────┼───────────────────┼───────────────────┼───────────────────┘
          │                   │                   │
          ▼                   ▼                   ▼
┌─────────────────────────────────────────────────────────────────────┐
│                         ORCHESTRATION LAYER                         │
│                                                                     │
│  ┌───────────────────────────────────────────────────────────────┐  │
│  │                     AdvancedOrchestrator                      │  │
│  │                  (Microsoft Agent Framework)                  │  │
│  │                                                               │  │
│  │    ┌─────────┐      ┌─────────┐      ┌─────────┐              │  │
│  │    │ Search  │  →   │  Judge  │  →   │ Report  │              │  │
│  │    │  Agent  │      │  Agent  │      │  Agent  │              │  │
│  │    └─────────┘      └─────────┘      └─────────┘              │  │
│  │                                                               │  │
│  └───────────────────────────────────────────────────────────────┘  │
│                                                                     │
│  ┌───────────────────────────────────────────────────────────────┐  │
│  │                    LangGraph Orchestrator                     │  │
│  │                        (Experimental)                         │  │
│  └───────────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────────┘
                                   │
                                   ▼
┌─────────────────────────────────────────────────────────────────────┐
│                            LLM BACKENDS                             │
│                                                                     │
│    ┌─────────────────────┐        ┌─────────────────────┐           │
│    │    OpenAI Client    │        │ HuggingFace Client  │           │
│    │       (GPT-5)       │        │    (Qwen 2.5 7B)    │           │
│    │    Premium Tier     │        │      Free Tier      │           │
│    └─────────────────────┘        └─────────────────────┘           │
│                                                                     │
│           Auto-selected by ClientFactory based on API key           │
└─────────────────────────────────────────────────────────────────────┘
                                   │
                                   ▼
┌─────────────────────────────────────────────────────────────────────┐
│                            SEARCH TOOLS                             │
│                                                                     │
│  ┌──────────┐   ┌──────────────┐   ┌──────────┐   ┌──────────┐      │
│  │  PubMed  │   │ClinicalTrials│   │EuropePMC │   │ OpenAlex │      │
│  └──────────┘   └──────────────┘   └──────────┘   └──────────┘      │
│                                                                     │
│               SearchHandler: Parallel scatter-gather                │
└─────────────────────────────────────────────────────────────────────┘
                                   │
                                   ▼
┌─────────────────────────────────────────────────────────────────────┐
│                              SERVICES                               │
│                                                                     │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐           │
│  │  Embeddings  │    │  LlamaIndex  │    │   Research   │           │
│  │   Service    │    │     RAG      │    │    Memory    │           │
│  │   (local)    │    │  (premium)   │    │   (shared)   │           │
│  └──────────────┘    └──────────────┘    └──────────────┘           │
│                                                                     │
│                        ChromaDB Vector Store                        │
└─────────────────────────────────────────────────────────────────────┘
```

## Core Research Loop

The system operates on a **search-and-judge loop**:

```
User Question
       │
       ▼
┌─────────────┐
│   SEARCH    │ ← Query PubMed, ClinicalTrials, Europe PMC, OpenAlex
└──────┬──────┘
       │
       ▼
┌─────────────┐
│   GATHER    │ ← Collect and deduplicate evidence (PMID/DOI)
└──────┬──────┘
       │
       ▼
┌─────────────┐     ┌────────────────────┐
│    JUDGE    │ ──► │ "Enough evidence?" │
└──────┬──────┘     └──────────┬─────────┘
       │                       │
       │    ┌──────────────────┴──────────────────┐
       │    │                                     │
       ▼    ▼                                     ▼
┌─────────────┐                            ┌─────────────┐
│   REFINE    │ ← NO: Expand query         │ SYNTHESIZE  │ ← YES: Generate report
│   & LOOP    │   and search again         └─────────────┘
└─────────────┘
```

**Break Conditions:**

- Judge approves evidence as sufficient
- Token budget exceeded (50K tokens max)
- Max iterations reached (default 10)

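The break conditions above can be sketched as a plain loop. This is a minimal illustration, not DeepBoner's actual orchestrator: `run_research_loop`, `LoopState`, and the `search`, `judge`, `refine`, and `count_tokens` callables are hypothetical names, and the order in which the three conditions are checked is an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable

MAX_TOKENS = 50_000   # token budget stated in the break conditions
MAX_ITERATIONS = 10   # default iteration cap stated in the break conditions


@dataclass
class LoopState:
    """Hypothetical container for what the loop accumulates."""
    evidence: list[str] = field(default_factory=list)
    tokens_used: int = 0
    iterations: int = 0
    stop_reason: str = ""


def run_research_loop(
    question: str,
    search: Callable[[str], list[str]],
    judge: Callable[[list[str]], bool],
    refine: Callable[[str, list[str]], str],
    count_tokens: Callable[[list[str]], int],
    max_tokens: int = MAX_TOKENS,
    max_iterations: int = MAX_ITERATIONS,
) -> LoopState:
    """Search, gather, judge, then either stop or refine the query and repeat."""
    state = LoopState()
    query = question
    while True:
        state.iterations += 1
        results = search(query)
        # Gather: keep only evidence that has not been seen before.
        for item in results:
            if item not in state.evidence:
                state.evidence.append(item)
        state.tokens_used += count_tokens(results)

        # Break conditions (the check order is an assumption).
        if judge(state.evidence):
            state.stop_reason = "sufficient_evidence"
            break
        if state.tokens_used > max_tokens:
            state.stop_reason = "token_budget_exceeded"
            break
        if state.iterations >= max_iterations:
            state.stop_reason = "max_iterations_reached"
            break
        query = refine(query, state.evidence)
    return state
```

Passing the search, judge, and refine steps in as callables keeps the loop testable with stubs, for example a judge that approves once three pieces of evidence have been collected.
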
## Framework Integration

DeepBoner combines two AI frameworks:

| Framework | Role | Usage |
|-----------|------|-------|
| **Microsoft Agent Framework** | Multi-agent orchestration | Manager → Agent coordination |
| **Pydantic AI** | Structured outputs | Evidence models, judge assessments |

The two are complementary: Microsoft Agent Framework handles the workflow, and Pydantic AI handles data validation.

## Dual-Backend Architecture

The system auto-selects the LLM backend based on whether an OpenAI API key is configured:

```python
# src/clients/factory.py
def get_chat_client():
    if settings.has_openai_key:
        return OpenAIChatClient(...)  # Premium
    else:
        return HuggingFaceChatClient(...)  # Free
```

| Tier | Backend | Model | Features |
|------|---------|-------|----------|
| Free | HuggingFace | Qwen 2.5 7B | Full functionality, slower |
| Premium | OpenAI | GPT-5 | Full functionality, faster |

**Same orchestration logic**: only the LLM differs.

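To make the selection rule concrete, here is a stdlib-only sketch of the tier decision. It is not the project's `ClientFactory`: `select_tier` is a hypothetical helper, the environment variable name `OPENAI_API_KEY` is an assumption, and treating a blank value as "no key" is a design choice made for this example.

```python
import os
from typing import Mapping, Optional


def select_tier(env: Optional[Mapping[str, str]] = None) -> str:
    """Return "premium" when an OpenAI key is present, otherwise "free".

    `env` defaults to the process environment; passing a dict makes the
    rule easy to exercise in tests. The variable name is an assumption.
    """
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    return "premium" if key else "free"
```

With this sketch, `select_tier({})` returns `"free"` and `select_tier({"OPENAI_API_KEY": "sk-test"})` returns `"premium"`.
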
## Key Components

### Orchestrators (`src/orchestrators/`)

| Component | File | Purpose |
|-----------|------|---------|
| AdvancedOrchestrator | `advanced.py` | Main multi-agent orchestrator |
| OrchestratorFactory | `factory.py` | Backend selection |
| LangGraphOrchestrator | `langgraph_orchestrator.py` | Experimental workflow engine |

### Agents (`src/agents/`)

| Agent | File | Role | Status |
|-------|------|------|--------|
| SearchAgent | `search_agent.py` | Evidence retrieval | ✅ Active |
| JudgeAgent | `judge_agent.py` | Evidence evaluation | ✅ Active |
| ReportAgent | `report_agent.py` | Report synthesis | ✅ Active |
| HypothesisAgent | `hypothesis_agent.py` | Mechanistic pathway analysis | ✅ Active |
| RetrievalAgent | `retrieval_agent.py` | Web search (DuckDuckGo) | ⚠️ Not wired (see #134) |

### Tools (`src/tools/`)

| Tool | File | API |
|------|------|-----|
| PubMed | `pubmed.py` | NCBI E-utilities |
| ClinicalTrials | `clinicaltrials.py` | ClinicalTrials.gov |
| EuropePMC | `europepmc.py` | Europe PMC API |
| OpenAlex | `openalex.py` | OpenAlex API |
| SearchHandler | `search_handler.py` | Parallel orchestration |

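The parallel scatter-gather pattern attributed to `SearchHandler` can be illustrated with `asyncio.gather`. This is a sketch under assumptions, not the contents of `search_handler.py`: `scatter_gather` and the stub sources below are hypothetical, and real tools would call the listed APIs instead of returning canned strings.

```python
import asyncio
from typing import Awaitable, Callable

Source = Callable[[str], Awaitable[list[str]]]


async def fake_pubmed(query: str) -> list[str]:
    """Stub standing in for a real PubMed call."""
    await asyncio.sleep(0)
    return [f"pubmed:{query}"]


async def fake_europepmc(query: str) -> list[str]:
    """Stub standing in for a real Europe PMC call."""
    await asyncio.sleep(0)
    return [f"europepmc:{query}"]


async def failing_source(query: str) -> list[str]:
    """Stub that simulates an upstream failure."""
    raise RuntimeError("upstream timeout")


async def scatter_gather(
    query: str, sources: dict[str, Source]
) -> tuple[list[str], list[str]]:
    """Query every source concurrently; collect results and per-source errors."""
    names = list(sources)
    # Scatter: start all searches at once. return_exceptions=True keeps one
    # failing source from cancelling the others.
    outcomes = await asyncio.gather(
        *(sources[name](query) for name in names),
        return_exceptions=True,
    )
    # Gather: outcomes come back in the same order as the inputs.
    results: list[str] = []
    errors: list[str] = []
    for name, outcome in zip(names, outcomes):
        if isinstance(outcome, Exception):
            errors.append(f"{name}: {outcome}")
        else:
            results.extend(outcome)
    return results, errors
```

Running `asyncio.run(scatter_gather("ed", {"pubmed": fake_pubmed, "broken": failing_source}))` returns the PubMed results together with one recorded error, so a single unavailable database degrades the search rather than failing it.
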
### Services (`src/services/`)

| Service | File | Purpose |
|---------|------|---------|
| EmbeddingService | `embeddings.py` | Local embeddings (sentence-transformers) |
| LlamaIndexRAG | `llamaindex_rag.py` | Premium RAG (OpenAI embeddings) |
| ResearchMemory | `research_memory.py` | Shared state across agents |

## Data Flow

1. **User Input** → Gradio UI / MCP Client
2. **Query** → AdvancedOrchestrator
3. **Search** → SearchHandler → [PubMed, ClinicalTrials, EuropePMC, OpenAlex]
4. **Evidence** → Deduplicated by PMID/DOI
5. **Judge** → LLM evaluates sufficiency
6. **Loop or Synthesize** → Based on judge decision
7. **Report** → Structured output with citations
8. **Response** → Back to user

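Step 4 matters because the same paper is often returned by several databases. A minimal sketch of deduplication by PMID/DOI follows; the `Evidence` dataclass and `deduplicate` function are hypothetical stand-ins for the project's Pydantic models, and keeping the first occurrence of each record is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Evidence:
    """Hypothetical evidence record; the real models may differ."""
    title: str
    pmid: Optional[str] = None
    doi: Optional[str] = None


def deduplicate(items: list[Evidence]) -> list[Evidence]:
    """Drop items whose PMID or DOI was already seen, keeping first occurrences."""
    seen: set[str] = set()
    unique: list[Evidence] = []
    for item in items:
        keys: set[str] = set()
        if item.pmid:
            keys.add(f"pmid:{item.pmid.strip()}")
        if item.doi:
            # DOIs are case-insensitive, so compare them in lowercase.
            keys.add(f"doi:{item.doi.strip().lower()}")
        if keys & seen:
            continue  # already collected under one of its identifiers
        seen |= keys
        unique.append(item)  # items with no identifier are always kept
    return unique
```

Matching on either identifier means a record that arrives from one source with only a DOI is still recognized as a duplicate of a record that carried both a PMID and that DOI.
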
## Configuration

Settings are loaded from the environment via Pydantic Settings:

```python
# src/utils/config.py
from pydantic_settings import BaseSettings


class Settings(BaseSettings):
    openai_api_key: str | None = None  # unset means the free tier is used
    huggingface_model: str = "Qwen/Qwen2.5-7B-Instruct"
    max_iterations: int = 10
    # ...
```

See [Configuration Reference](../reference/configuration.md) for all options.

## Related Documentation

- [Component Inventory](component-inventory.md) - Complete module catalog
- [Data Models](data-models.md) - Pydantic model reference
- [System Registry](system-registry.md) - Service wiring specification
- [Workflow Diagrams](workflow-diagrams.md) - Visual documentation

---

*"Architecturally rock solid."*