fix: complete DeepBoner rebrand with test fixes and CodeRabbit feedback
* docs: add deep research roadmap and GradioDemo analysis
- Introduced a comprehensive roadmap for implementing GPT-Researcher-style deep research in DeepCritical, detailing phases from input parsing to long-form writing.
- Added an analysis of the GradioDemo codebase, highlighting redundant components and useful patterns, while emphasizing a streamlined approach to integration and implementation.
* rebrand: DeepCritical → DeepBoner (sexual health research agent)
AI-Native Sexual Health Research Agent
- Package renamed: deepcritical → deepboner
- New focus: sexual wellness, ED, hormone therapy, libido, reproductive health
- Exception class: DeepCriticalError → DeepBonerError (with backwards compat)
- UI examples updated for sexual health queries
- Modal app renamed: deepboner-code-execution
- Fixed pre-existing mypy issue (Modal API: uv_pip_install → pip_install)
46 files updated across:
- Core: pyproject.toml, README.md, CLAUDE.md, AGENTS.md, GEMINI.md
- Source: app.py, mcp_tools.py, exceptions.py, code_execution.py
- Docs: ~25 markdown files
- Examples: 8 demo scripts
- Tests: exception tests updated
All tests passing (102 passed, 6 skipped)
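
The "with backwards compat" note on the exception rename can be pictured with a minimal sketch. It assumes the old name is kept as a plain alias of the new base class; the actual `src/utils/exceptions.py` is not shown in this commit view and may do it differently (for example with a deprecation warning). The subclass names follow the exception hierarchy documented in AGENTS.md and CLAUDE.md.

```python
class DeepBonerError(Exception):
    """Base exception for the renamed package."""


class SearchError(DeepBonerError):
    """Raised when a search tool fails."""


class RateLimitError(SearchError):
    """Raised when an upstream API throttles requests."""


class JudgeError(DeepBonerError):
    """Raised when evidence evaluation fails."""


# Assumed compatibility shim: the old name points at the new base class,
# so existing `except DeepCriticalError:` handlers keep working.
DeepCriticalError = DeepBonerError

try:
    raise RateLimitError("too many requests")
except DeepCriticalError as exc:
    print(f"caught via old name: {type(exc).__name__}")
```

Because the alias is the same class object, `isinstance` checks and `except` clauses written against either name behave identically.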
* fix: resolve all test warnings and complete DeepBoner rebrand
Code fixes:
- Update OpenAIModel → OpenAIChatModel (pydantic-ai deprecation)
- Add pytest.importorskip guards to 3 test files for optional deps
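
A minimal sketch of what a `pytest.importorskip` guard looks like at the top of a test module. The commit message does not say which three files or which optional dependencies are involved, so the standard-library `json` module stands in for the optional package here purely so the sketch runs anywhere.

```python
import pytest

# Module-level guard: when run under pytest, a missing module causes every test
# in this file to be reported as skipped instead of failing at import time.
# "json" is a stand-in for the real optional dependency (an assumption).
optional_dep = pytest.importorskip("json")


def test_optional_dep_roundtrip():
    # Only runs when the guarded import succeeded.
    payload = {"ok": True}
    assert optional_dep.loads(optional_dep.dumps(payload)) == payload
```

`importorskip` returns the imported module, so the guarded name can be used directly in the tests that follow.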
Rebrand fixes (missed in initial rename):
- src/services/llamaindex_rag.py: collection name
- src/tools/clinicaltrials.py: User-Agent header
- 7 doc files: package names, URLs, examples
Config:
- Add targeted pytest warning filters for Pydantic mock introspection
(known upstream issue: pydantic/pydantic#9927)
Result: 127 tests pass, 0 warnings (was 50 warnings)
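
The idea behind a "targeted" warning filter is to silence one known message while leaving every other warning visible. The sketch below shows this with the standard `warnings` module. The message text is hypothetical, since the actual filter strings added to the pytest configuration are not visible in this commit view. pytest's `filterwarnings` option uses the same action, message, and category fields, written as a colon-separated string.

```python
import warnings


def emit_noise():
    # Hypothetical stand-ins for the upstream noise and for a genuine warning.
    warnings.warn("mock introspection noise", DeprecationWarning)
    warnings.warn("something real", UserWarning)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Targeted filter: ignore only DeprecationWarnings whose message starts
    # with this text; it is added last, so it takes precedence over "always".
    warnings.filterwarnings(
        "ignore",
        message=r"mock introspection",
        category=DeprecationWarning,
    )
    emit_noise()

remaining = [str(w.message) for w in caught]
print(remaining)
```

Keeping the filter narrow means a new, unrelated warning still shows up in the test run.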
* fix: address CodeRabbit review feedback
Fixes from CodeRabbit analysis:
- docs/to_do/DEEP_RESEARCH_ROADMAP.md: Fix MagenticBuilder → MagenticOrchestrator
- docs/to_do/DEEP_RESEARCH_ROADMAP.md: Add note clarifying RAGService vs EmbeddingService
- src/services/llamaindex_rag.py: Add migration note for collection name change
- tests/unit/utils/test_exceptions.py: Add pytestmark = pytest.mark.unit
- docs/implementation/12_phase_mcp_server.md: Update all bioRxiv refs → Europe PMC
All 127 tests pass, 0 warnings.
- .gitignore +1 -0
- AGENTS.md +4 -4
- CLAUDE.md +4 -4
- Dockerfile +1 -1
- GEMINI.md +6 -6
- README.md +26 -16
- docs/architecture/design-patterns.md +3 -3
- docs/architecture/overview.md +3 -3
- docs/brainstorming/00_ROADMAP_SUMMARY.md +3 -3
- docs/brainstorming/01_PUBMED_IMPROVEMENTS.md +2 -2
- docs/brainstorming/03_EUROPEPMC_IMPROVEMENTS.md +1 -1
- docs/brainstorming/04_OPENALEX_INTEGRATION.md +4 -4
- docs/brainstorming/implementation/15_PHASE_OPENALEX.md +1 -1
- docs/brainstorming/magentic-pydantic/00_SITUATION_AND_PLAN.md +1 -1
- docs/brainstorming/magentic-pydantic/REVIEW_PROMPT_FOR_SENIOR_AGENT.md +10 -10
- docs/bugs/P1_GRADIO_SETTINGS_CLEANUP.md +2 -2
- docs/development/testing.md +1 -1
- docs/guides/deployment.md +5 -5
- docs/implementation/01_phase_foundation.md +10 -10
- docs/implementation/04_phase_ui.md +7 -7
- docs/implementation/10_phase_clinicaltrials.md +2 -2
- docs/implementation/11_phase_biorxiv.md +1 -1
- docs/implementation/12_phase_mcp_server.md +45 -45
- docs/implementation/13_phase_modal_integration.md +1 -1
- docs/implementation/14_phase_demo_submission.md +12 -12
- docs/implementation/roadmap.md +3 -3
- docs/index.md +1 -1
- docs/to_do/DEEP_RESEARCH_ROADMAP.md +337 -0
- docs/to_do/REFERENCE_GRADDIO_DEMO_ANALYSIS.md +229 -0
- docs/workflow-diagrams.md +4 -4
- examples/README.md +2 -2
- examples/embeddings_demo/run_embeddings.py +1 -1
- examples/full_stack_demo/run_full.py +4 -4
- examples/hypothesis_demo/run_hypothesis.py +1 -1
- examples/modal_demo/run_analysis.py +1 -1
- examples/orchestrator_demo/run_agent.py +3 -3
- examples/orchestrator_demo/run_magentic.py +3 -3
- examples/search_demo/run_search.py +1 -1
- main.py +1 -1
- pyproject.toml +14 -2
- src/agent_factory/judges.py +2 -2
- src/app.py +8 -6
- src/mcp_tools.py +1 -1
- src/services/__init__.py +1 -1
- src/services/llamaindex_rag.py +8 -3
- src/tools/clinicaltrials.py +1 -1
- src/tools/code_execution.py +2 -2
- src/utils/exceptions.py +10 -6
- tests/unit/agent_factory/test_judges_factory.py +3 -3
- tests/unit/agents/test_hypothesis_agent.py +12 -3
|
@@ -49,6 +49,7 @@ reference_repos/claude-agent-sdk/
|
|
| 49 |
reference_repos/pydanticai-research-agent/
|
| 50 |
reference_repos/pubmed-mcp-server/
|
| 51 |
reference_repos/DeepCritical/
|
|
|
|
| 52 |
|
| 53 |
# Keep the README in reference_repos
|
| 54 |
!reference_repos/README.md
|
|
|
|
| 49 |
reference_repos/pydanticai-research-agent/
|
| 50 |
reference_repos/pubmed-mcp-server/
|
| 51 |
reference_repos/DeepCritical/
|
| 52 |
+
reference_repos/GradioDemo/
|
| 53 |
|
| 54 |
# Keep the README in reference_repos
|
| 55 |
!reference_repos/README.md
|
|
@@ -4,7 +4,7 @@ This file provides guidance to AI agents when working with code in this reposito
|
|
| 4 |
|
| 5 |
## Project Overview
|
| 6 |
|
| 7 |
-
|
| 8 |
|
| 9 |
**Current Status:** Phases 1-13 COMPLETE (Foundation through Modal sandbox integration).
|
| 10 |
|
|
@@ -39,7 +39,7 @@ uv run pytest -m integration
|
|
| 39 |
User Question → Orchestrator
|
| 40 |
↓
|
| 41 |
Search Loop:
|
| 42 |
-
1. Query PubMed, ClinicalTrials.gov,
|
| 43 |
2. Gather evidence
|
| 44 |
3. Judge quality ("Do we have enough?")
|
| 45 |
4. If NO → Refine query, search more
|
|
@@ -53,7 +53,7 @@ Research Report with Citations
|
|
| 53 |
- `src/orchestrator.py` - Main agent loop
|
| 54 |
- `src/tools/pubmed.py` - PubMed E-utilities search
|
| 55 |
- `src/tools/clinicaltrials.py` - ClinicalTrials.gov API
|
| 56 |
-
- `src/tools/
|
| 57 |
- `src/tools/code_execution.py` - Modal sandbox execution
|
| 58 |
- `src/tools/search_handler.py` - Scatter-gather orchestration
|
| 59 |
- `src/services/embeddings.py` - Semantic search & deduplication (ChromaDB)
|
|
@@ -82,7 +82,7 @@ Settings via pydantic-settings from `.env`:
|
|
| 82 |
## Exception Hierarchy
|
| 83 |
|
| 84 |
```text
|
| 85 |
-
|
| 86 |
├── SearchError
|
| 87 |
│   └── RateLimitError
|
| 88 |
├── JudgeError
|
|
|
|
| 4 |
|
| 5 |
## Project Overview
|
| 6 |
|
| 7 |
+
DeepBoner is an AI-native sexual health research agent. It uses a search-and-judge loop to autonomously search biomedical databases (PubMed, ClinicalTrials.gov, Europe PMC) and synthesize evidence for queries like "What drugs improve female libido post-menopause?" or "Evidence for testosterone therapy in women with HSDD?".
|
| 8 |
|
| 9 |
**Current Status:** Phases 1-13 COMPLETE (Foundation through Modal sandbox integration).
|
| 10 |
|
|
|
|
| 39 |
User Question → Orchestrator
|
| 40 |
↓
|
| 41 |
Search Loop:
|
| 42 |
+
1. Query PubMed, ClinicalTrials.gov, Europe PMC
|
| 43 |
2. Gather evidence
|
| 44 |
3. Judge quality ("Do we have enough?")
|
| 45 |
4. If NO → Refine query, search more
|
|
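
The search loop in the hunk above (query the sources, gather evidence, judge, refine) can be sketched in a few lines. This is an illustration under stated assumptions, not the code in `src/orchestrator.py`: the `Evidence` type, the function names, and the stub tools are all hypothetical, and the real judge is an LLM call rather than a length check.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Evidence:
    source: str
    text: str


SearchFn = Callable[[str], List[Evidence]]


def research(
    question: str,
    tools: List[SearchFn],
    judge: Callable[[str, List[Evidence]], bool],
    refine: Callable[[str, List[Evidence]], str],
    max_iterations: int = 3,
) -> List[Evidence]:
    """Search every tool, ask the judge, and refine the query until satisfied."""
    query = question
    evidence: List[Evidence] = []
    for _ in range(max_iterations):
        for tool in tools:  # 1. query each source
            evidence.extend(tool(query))  # 2. gather evidence
        if judge(question, evidence):  # 3. "Do we have enough?"
            break
        query = refine(query, evidence)  # 4. if not, refine and search again
    return evidence


# Stub tools and judge, for demonstration only.
def fake_pubmed(query: str) -> List[Evidence]:
    return [Evidence("pubmed", f"result for {query}")]


def fake_europepmc(query: str) -> List[Evidence]:
    return [Evidence("europepmc", f"result for {query}")]


def enough(question: str, evidence: List[Evidence]) -> bool:
    return len(evidence) >= 4


def narrow(query: str, evidence: List[Evidence]) -> str:
    return query + " randomized trial"


found = research("flibanserin efficacy", [fake_pubmed, fake_europepmc], enough, narrow)
print(len(found), found[-1].text)
```

The iteration cap keeps the loop bounded when the judge is never satisfied.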
|
|
| 53 |
- `src/orchestrator.py` - Main agent loop
|
| 54 |
- `src/tools/pubmed.py` - PubMed E-utilities search
|
| 55 |
- `src/tools/clinicaltrials.py` - ClinicalTrials.gov API
|
| 56 |
+
- `src/tools/europepmc.py` - Europe PMC search
|
| 57 |
- `src/tools/code_execution.py` - Modal sandbox execution
|
| 58 |
- `src/tools/search_handler.py` - Scatter-gather orchestration
|
| 59 |
- `src/services/embeddings.py` - Semantic search & deduplication (ChromaDB)
|
|
|
|
| 82 |
## Exception Hierarchy
|
| 83 |
|
| 84 |
```text
|
| 85 |
+
DeepBonerError (base)
|
| 86 |
├── SearchError
|
| 87 |
│   └── RateLimitError
|
| 88 |
├── JudgeError
|
|
@@ -4,7 +4,7 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
|
|
| 4 |
|
| 5 |
## Project Overview
|
| 6 |
|
| 7 |
-
|
| 8 |
|
| 9 |
**Current Status:** Phases 1-13 COMPLETE (Foundation through Modal sandbox integration).
|
| 10 |
|
|
@@ -39,7 +39,7 @@ uv run pytest -m integration
|
|
| 39 |
User Question → Orchestrator
|
| 40 |
↓
|
| 41 |
Search Loop:
|
| 42 |
-
1. Query PubMed, ClinicalTrials.gov,
|
| 43 |
2. Gather evidence
|
| 44 |
3. Judge quality ("Do we have enough?")
|
| 45 |
4. If NO → Refine query, search more
|
|
@@ -53,7 +53,7 @@ Research Report with Citations
|
|
| 53 |
- `src/orchestrator.py` - Main agent loop
|
| 54 |
- `src/tools/pubmed.py` - PubMed E-utilities search
|
| 55 |
- `src/tools/clinicaltrials.py` - ClinicalTrials.gov API
|
| 56 |
-
- `src/tools/
|
| 57 |
- `src/tools/code_execution.py` - Modal sandbox execution
|
| 58 |
- `src/tools/search_handler.py` - Scatter-gather orchestration
|
| 59 |
- `src/services/embeddings.py` - Semantic search & deduplication (ChromaDB)
|
|
@@ -82,7 +82,7 @@ Settings via pydantic-settings from `.env`:
|
|
| 82 |
## Exception Hierarchy
|
| 83 |
|
| 84 |
```text
|
| 85 |
-
|
| 86 |
├── SearchError
|
| 87 |
│   └── RateLimitError
|
| 88 |
├── JudgeError
|
|
|
|
| 4 |
|
| 5 |
## Project Overview
|
| 6 |
|
| 7 |
+
DeepBoner is an AI-native sexual health research agent. It uses a search-and-judge loop to autonomously search biomedical databases (PubMed, ClinicalTrials.gov, Europe PMC) and synthesize evidence for queries like "What drugs improve female libido post-menopause?" or "Evidence for testosterone therapy in women with HSDD?".
|
| 8 |
|
| 9 |
**Current Status:** Phases 1-13 COMPLETE (Foundation through Modal sandbox integration).
|
| 10 |
|
|
|
|
| 39 |
User Question → Orchestrator
|
| 40 |
↓
|
| 41 |
Search Loop:
|
| 42 |
+
1. Query PubMed, ClinicalTrials.gov, Europe PMC
|
| 43 |
2. Gather evidence
|
| 44 |
3. Judge quality ("Do we have enough?")
|
| 45 |
4. If NO → Refine query, search more
|
|
|
|
| 53 |
- `src/orchestrator.py` - Main agent loop
|
| 54 |
- `src/tools/pubmed.py` - PubMed E-utilities search
|
| 55 |
- `src/tools/clinicaltrials.py` - ClinicalTrials.gov API
|
| 56 |
+
- `src/tools/europepmc.py` - Europe PMC search
|
| 57 |
- `src/tools/code_execution.py` - Modal sandbox execution
|
| 58 |
- `src/tools/search_handler.py` - Scatter-gather orchestration
|
| 59 |
- `src/services/embeddings.py` - Semantic search & deduplication (ChromaDB)
|
|
|
|
| 82 |
## Exception Hierarchy
|
| 83 |
|
| 84 |
```text
|
| 85 |
+
DeepBonerError (base)
|
| 86 |
├── SearchError
|
| 87 |
│   └── RateLimitError
|
| 88 |
├── JudgeError
|
|
@@ -1,4 +1,4 @@
|
|
| 1 |
-
# Dockerfile for
|
| 2 |
FROM python:3.11-slim
|
| 3 |
|
| 4 |
# Set working directory
|
|
|
|
| 1 |
+
# Dockerfile for DeepBoner
|
| 2 |
FROM python:3.11-slim
|
| 3 |
|
| 4 |
# Set working directory
|
|
@@ -1,9 +1,9 @@
|
|
| 1 |
-
#
|
| 2 |
|
| 3 |
## Project Overview
|
| 4 |
|
| 5 |
-
**
|
| 6 |
-
**Goal:** To accelerate
|
| 7 |
|
| 8 |
**Architecture:**
|
| 9 |
The project follows a **Vertical Slice Architecture** (Search -> Judge -> Orchestrator) and adheres to **Strict TDD** (Test-Driven Development).
|
|
@@ -11,7 +11,7 @@ The project follows a **Vertical Slice Architecture** (Search -> Judge -> Orches
|
|
| 11 |
**Current Status:**
|
| 12 |
|
| 13 |
- **Phases 1-9:** COMPLETE. Foundation, Search, Judge, UI, Orchestrator, Embeddings, Hypothesis, Report, Cleanup.
|
| 14 |
-
- **Phases 10-11:** COMPLETE. ClinicalTrials.gov and
|
| 15 |
- **Phase 12:** COMPLETE. MCP Server integration (Gradio MCP at `/gradio_api/mcp/`).
|
| 16 |
- **Phase 13:** COMPLETE. Modal sandbox for statistical analysis.
|
| 17 |
|
|
@@ -41,7 +41,7 @@ The project follows a **Vertical Slice Architecture** (Search -> Judge -> Orches
|
|
| 41 |
|
| 42 |
- `src/`: Source code
|
| 43 |
- `utils/`: Shared utilities (`config.py`, `exceptions.py`, `models.py`)
|
| 44 |
-
- `tools/`: Search tools (`pubmed.py`, `clinicaltrials.py`, `
|
| 45 |
- `services/`: Services (`embeddings.py`, `statistical_analyzer.py`)
|
| 46 |
- `agents/`: Magentic multi-agent mode agents
|
| 47 |
- `agent_factory/`: Agent definitions (judges, prompts)
|
|
@@ -58,7 +58,7 @@ The project follows a **Vertical Slice Architecture** (Search -> Judge -> Orches
|
|
| 58 |
- `src/orchestrator.py` - Main agent loop
|
| 59 |
- `src/tools/pubmed.py` - PubMed E-utilities search
|
| 60 |
- `src/tools/clinicaltrials.py` - ClinicalTrials.gov API
|
| 61 |
-
- `src/tools/
|
| 62 |
- `src/tools/code_execution.py` - Modal sandbox execution
|
| 63 |
- `src/services/statistical_analyzer.py` - Statistical analysis via Modal
|
| 64 |
- `src/mcp_tools.py` - MCP tool wrappers
|
|
|
|
| 1 |
+
# DeepBoner Context
|
| 2 |
|
| 3 |
## Project Overview
|
| 4 |
|
| 5 |
+
**DeepBoner** is an AI-native Sexual Health Research Agent.
|
| 6 |
+
**Goal:** To accelerate research into sexual health, wellness, and reproductive medicine by intelligently searching biomedical literature (PubMed, ClinicalTrials.gov, Europe PMC), evaluating evidence, and synthesizing findings.
|
| 7 |
|
| 8 |
**Architecture:**
|
| 9 |
The project follows a **Vertical Slice Architecture** (Search -> Judge -> Orchestrator) and adheres to **Strict TDD** (Test-Driven Development).
|
|
|
|
| 11 |
**Current Status:**
|
| 12 |
|
| 13 |
- **Phases 1-9:** COMPLETE. Foundation, Search, Judge, UI, Orchestrator, Embeddings, Hypothesis, Report, Cleanup.
|
| 14 |
+
- **Phases 10-11:** COMPLETE. ClinicalTrials.gov and Europe PMC integration.
|
| 15 |
- **Phase 12:** COMPLETE. MCP Server integration (Gradio MCP at `/gradio_api/mcp/`).
|
| 16 |
- **Phase 13:** COMPLETE. Modal sandbox for statistical analysis.
|
| 17 |
|
|
|
|
| 41 |
|
| 42 |
- `src/`: Source code
|
| 43 |
- `utils/`: Shared utilities (`config.py`, `exceptions.py`, `models.py`)
|
| 44 |
+
- `tools/`: Search tools (`pubmed.py`, `clinicaltrials.py`, `europepmc.py`, `code_execution.py`)
|
| 45 |
- `services/`: Services (`embeddings.py`, `statistical_analyzer.py`)
|
| 46 |
- `agents/`: Magentic multi-agent mode agents
|
| 47 |
- `agent_factory/`: Agent definitions (judges, prompts)
|
|
|
|
| 58 |
- `src/orchestrator.py` - Main agent loop
|
| 59 |
- `src/tools/pubmed.py` - PubMed E-utilities search
|
| 60 |
- `src/tools/clinicaltrials.py` - ClinicalTrials.gov API
|
| 61 |
+
- `src/tools/europepmc.py` - Europe PMC search
|
| 62 |
- `src/tools/code_execution.py` - Modal sandbox execution
|
| 63 |
- `src/services/statistical_analyzer.py` - Statistical analysis via Modal
|
| 64 |
- `src/mcp_tools.py` - MCP tool wrappers
|
|
@@ -1,7 +1,7 @@
|
|
| 1 |
---
|
| 2 |
-
title:
|
| 3 |
-
emoji:
|
| 4 |
-
colorFrom:
|
| 5 |
colorTo: purple
|
| 6 |
sdk: gradio
|
| 7 |
sdk_version: "6.0.1"
|
|
@@ -10,26 +10,37 @@ app_file: src/app.py
|
|
| 10 |
pinned: false
|
| 11 |
license: mit
|
| 12 |
tags:
|
| 13 |
-
-
|
|
|
|
|
|
|
|
|
|
| 14 |
- mcp-hackathon
|
| 15 |
-
- drug-repurposing
|
| 16 |
-
- biomedical-ai
|
| 17 |
- pydantic-ai
|
| 18 |
- llamaindex
|
| 19 |
- modal
|
| 20 |
---
|
| 21 |
|
| 22 |
-
#
|
| 23 |
|
| 24 |
-
AI-
|
|
|
|
|
|
|
| 25 |
|
| 26 |
## Features
|
| 27 |
|
| 28 |
-
- **Multi-Source Search**: PubMed, ClinicalTrials.gov,
|
| 29 |
- **MCP Integration**: Use our tools from Claude Desktop or any MCP client
|
| 30 |
- **Modal Sandbox**: Secure execution of AI-generated statistical code
|
| 31 |
- **LlamaIndex RAG**: Semantic search and evidence synthesis
|
| 32 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 33 |
## Quick Start
|
| 34 |
|
| 35 |
### 1. Environment Setup
|
|
@@ -62,7 +73,7 @@ Add this to your `claude_desktop_config.json`:
|
|
| 62 |
```json
|
| 63 |
{
|
| 64 |
"mcpServers": {
|
| 65 |
-
"
|
| 66 |
"url": "http://localhost:7860/gradio_api/mcp/"
|
| 67 |
}
|
| 68 |
}
|
|
@@ -72,7 +83,7 @@ Add this to your `claude_desktop_config.json`:
|
|
| 72 |
**Available Tools**:
|
| 73 |
- `search_pubmed`: Search peer-reviewed biomedical literature.
|
| 74 |
- `search_clinical_trials`: Search ClinicalTrials.gov.
|
| 75 |
-
- `
|
| 76 |
- `search_all`: Search all sources simultaneously.
|
| 77 |
- `analyze_hypothesis`: Secure statistical analysis using Modal sandboxes.
|
| 78 |
|
|
@@ -92,16 +103,16 @@ make check
|
|
| 92 |
|
| 93 |
## Architecture
|
| 94 |
|
| 95 |
-
|
| 96 |
|
| 97 |
-
1. **Search Slice**: Retrieving evidence from PubMed, ClinicalTrials.gov, and
|
| 98 |
2. **Judge Slice**: Evaluating evidence quality using LLMs.
|
| 99 |
3. **Orchestrator Slice**: Managing the research loop and UI.
|
| 100 |
|
| 101 |
Built with:
|
| 102 |
- **PydanticAI**: For robust agent interactions.
|
| 103 |
- **Gradio**: For the streaming user interface.
|
| 104 |
-
- **PubMed, ClinicalTrials.gov,
|
| 105 |
- **MCP**: For universal tool access.
|
| 106 |
- **Modal**: For secure code execution.
|
| 107 |
|
|
@@ -110,8 +121,7 @@ Built with:
|
|
| 110 |
- The-Obstacle-Is-The-Way
|
| 111 |
- MarioAderman
|
| 112 |
- EmployeeNo427
|
| 113 |
-
- Josephrp *(provided initial template)*
|
| 114 |
|
| 115 |
## Links
|
| 116 |
|
| 117 |
-
- [GitHub Repository](https://github.com/The-Obstacle-Is-The-Way/
|
|
|
|
| 1 |
---
|
| 2 |
+
title: DeepBoner
|
| 3 |
+
emoji: π
|
| 4 |
+
colorFrom: pink
|
| 5 |
colorTo: purple
|
| 6 |
sdk: gradio
|
| 7 |
sdk_version: "6.0.1"
|
|
|
|
| 10 |
pinned: false
|
| 11 |
license: mit
|
| 12 |
tags:
|
| 13 |
+
- sexual-health
|
| 14 |
+
- reproductive-medicine
|
| 15 |
+
- hormone-therapy
|
| 16 |
+
- wellness-research
|
| 17 |
- mcp-hackathon
|
|
|
|
|
|
|
| 18 |
- pydantic-ai
|
| 19 |
- llamaindex
|
| 20 |
- modal
|
| 21 |
---
|
| 22 |
|
| 23 |
+
# DeepBoner
|
| 24 |
|
| 25 |
+
AI-Native Sexual Health Research Agent
|
| 26 |
+
|
| 27 |
+
Deep research for sexual wellness, ED treatments, hormone therapy, libido, and reproductive health - for all genders.
|
| 28 |
|
| 29 |
## Features
|
| 30 |
|
| 31 |
+
- **Multi-Source Search**: PubMed, ClinicalTrials.gov, Europe PMC
|
| 32 |
- **MCP Integration**: Use our tools from Claude Desktop or any MCP client
|
| 33 |
- **Modal Sandbox**: Secure execution of AI-generated statistical code
|
| 34 |
- **LlamaIndex RAG**: Semantic search and evidence synthesis
|
| 35 |
|
| 36 |
+
## Example Queries
|
| 37 |
+
|
| 38 |
+
- "What drugs improve female libido post-menopause?"
|
| 39 |
+
- "Clinical trials for erectile dysfunction alternatives to PDE5 inhibitors?"
|
| 40 |
+
- "Evidence for testosterone therapy in women with HSDD?"
|
| 41 |
+
- "Drug interactions with sildenafil?"
|
| 42 |
+
- "What's the latest research on flibanserin efficacy?"
|
| 43 |
+
|
| 44 |
## Quick Start
|
| 45 |
|
| 46 |
### 1. Environment Setup
|
|
|
|
| 73 |
```json
|
| 74 |
{
|
| 75 |
"mcpServers": {
|
| 76 |
+
"deepboner": {
|
| 77 |
"url": "http://localhost:7860/gradio_api/mcp/"
|
| 78 |
}
|
| 79 |
}
|
|
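
When `claude_desktop_config.json` already lists other MCP servers, the new entry has to be merged in rather than pasted over the file. The helper below is a hypothetical sketch of that merge; the function name and the `other-server` entry are illustrative only, while the `deepboner` key and URL come from the README hunk above.

```python
import copy
import json


def add_mcp_server(config: dict, name: str, url: str) -> dict:
    """Return a copy of `config` with one more entry under "mcpServers"."""
    merged = copy.deepcopy(config)
    merged.setdefault("mcpServers", {})[name] = {"url": url}
    return merged


# Hypothetical existing configuration with an unrelated server.
existing = {"mcpServers": {"other-server": {"url": "http://localhost:9000/mcp/"}}}

updated = add_mcp_server(
    existing, "deepboner", "http://localhost:7860/gradio_api/mcp/"
)
print(json.dumps(updated, indent=2))
```

Working on a deep copy leaves the original dictionary untouched, which makes it easy to compare the two before writing the file back.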
|
|
| 83 |
**Available Tools**:
|
| 84 |
- `search_pubmed`: Search peer-reviewed biomedical literature.
|
| 85 |
- `search_clinical_trials`: Search ClinicalTrials.gov.
|
| 86 |
+
- `search_europepmc`: Search Europe PMC preprints and papers.
|
| 87 |
- `search_all`: Search all sources simultaneously.
|
| 88 |
- `analyze_hypothesis`: Secure statistical analysis using Modal sandboxes.
|
| 89 |
|
|
|
|
| 103 |
|
| 104 |
## Architecture
|
| 105 |
|
| 106 |
+
DeepBoner uses a Vertical Slice Architecture:
|
| 107 |
|
| 108 |
+
1. **Search Slice**: Retrieving evidence from PubMed, ClinicalTrials.gov, and Europe PMC.
|
| 109 |
2. **Judge Slice**: Evaluating evidence quality using LLMs.
|
| 110 |
3. **Orchestrator Slice**: Managing the research loop and UI.
|
| 111 |
|
| 112 |
Built with:
|
| 113 |
- **PydanticAI**: For robust agent interactions.
|
| 114 |
- **Gradio**: For the streaming user interface.
|
| 115 |
+
- **PubMed, ClinicalTrials.gov, Europe PMC**: For biomedical data.
|
| 116 |
- **MCP**: For universal tool access.
|
| 117 |
- **Modal**: For secure code execution.
|
| 118 |
|
|
|
|
| 121 |
- The-Obstacle-Is-The-Way
|
| 122 |
- MarioAderman
|
| 123 |
- EmployeeNo427
|
|
|
|
| 124 |
|
| 125 |
## Links
|
| 126 |
|
| 127 |
+
- [GitHub Repository](https://github.com/The-Obstacle-Is-The-Way/DeepBoner)
|
|
@@ -726,7 +726,7 @@ If evidence is weak, say so clearly."""
|
|
| 726 |
**Architecture**:
|
| 727 |
```
|
| 728 |
┌─────────────────────────────────────────────────┐
|
| 729 |
-
β
|
| 730 |
│        (uses tools directly OR via MCP)         │
|
| 731 |
└─────────────────────────────────────────────────┘
|
| 732 |
│
|
|
@@ -811,7 +811,7 @@ uvx fastmcp run src/mcp_servers/pubmed_server.py
|
|
| 811 |
"pubmed": {
|
| 812 |
"command": "python",
|
| 813 |
"args": ["-m", "src.mcp_servers.pubmed_server"],
|
| 814 |
-
"cwd": "/path/to/
|
| 815 |
}
|
| 816 |
}
|
| 817 |
}
|
|
@@ -865,7 +865,7 @@ def research_with_streaming(question: str) -> Generator[str, None, None]:
|
|
| 865 |
|
| 866 |
# Gradio 5 UI
|
| 867 |
with gr.Blocks(theme=gr.themes.Soft()) as demo:
|
| 868 |
-
gr.Markdown("# π¬
|
| 869 |
gr.Markdown("Ask a question about potential drug repurposing opportunities.")
|
| 870 |
|
| 871 |
with gr.Row():
|
|
|
|
| 726 |
**Architecture**:
|
| 727 |
```
|
| 728 |
┌─────────────────────────────────────────────────┐
|
| 729 |
+
│                 DeepBoner Agent                 │
|
| 730 |
│        (uses tools directly OR via MCP)         │
|
| 731 |
└─────────────────────────────────────────────────┘
|
| 732 |
│
|
|
|
|
| 811 |
"pubmed": {
|
| 812 |
"command": "python",
|
| 813 |
"args": ["-m", "src.mcp_servers.pubmed_server"],
|
| 814 |
+
"cwd": "/path/to/deepboner"
|
| 815 |
}
|
| 816 |
}
|
| 817 |
}
|
|
|
|
| 865 |
|
| 866 |
# Gradio 5 UI
|
| 867 |
with gr.Blocks(theme=gr.themes.Soft()) as demo:
|
| 868 |
+
gr.Markdown("# 🔬 DeepBoner: Drug Repurposing Research Agent")
|
| 869 |
gr.Markdown("Ask a question about potential drug repurposing opportunities.")
|
| 870 |
|
| 871 |
with gr.Row():
|
|
@@ -1,11 +1,11 @@
|
|
| 1 |
-
#
|
| 2 |
## Project Overview
|
| 3 |
|
| 4 |
---
|
| 5 |
|
| 6 |
## Executive Summary
|
| 7 |
|
| 8 |
-
**
|
| 9 |
|
| 10 |
### The Problem We Solve
|
| 11 |
|
|
@@ -16,7 +16,7 @@ Drug repurposing - finding new therapeutic uses for existing FDA-approved drugs
|
|
| 16 |
- Assess safety profiles
|
| 17 |
- Synthesize evidence into actionable insights
|
| 18 |
|
| 19 |
-
**
|
| 20 |
|
| 21 |
### What Is Drug Repurposing?
|
| 22 |
|
|
|
|
| 1 |
+
# DeepBoner: Medical Drug Repurposing Research Agent
|
| 2 |
## Project Overview
|
| 3 |
|
| 4 |
---
|
| 5 |
|
| 6 |
## Executive Summary
|
| 7 |
|
| 8 |
+
**DeepBoner** is a deep research agent designed to accelerate medical drug repurposing research by autonomously searching, analyzing, and synthesizing evidence from multiple biomedical databases.
|
| 9 |
|
| 10 |
### The Problem We Solve
|
| 11 |
|
|
|
|
| 16 |
- Assess safety profiles
|
| 17 |
- Synthesize evidence into actionable insights
|
| 18 |
|
| 19 |
+
**DeepBoner automates this process from hours to minutes.**
|
| 20 |
|
| 21 |
### What Is Drug Repurposing?
|
| 22 |
|
|
@@ -1,4 +1,4 @@
|
|
| 1 |
-
#
|
| 2 |
|
| 3 |
**Created**: 2024-11-27
|
| 4 |
**Purpose**: Future maintainability and hackathon continuation
|
|
@@ -131,7 +131,7 @@ Keep current architecture working, add OpenAlex incrementally.
|
|
| 131 |
```
|
| 132 |
|
| 133 |
2. **Copy OpenAlex tool from reference repo**
|
| 134 |
-
- File: `reference_repos/
|
| 135 |
- Adapt to our `SearchTool` base class
|
| 136 |
|
| 137 |
3. **Enable NCBI API Key**
|
|
@@ -189,6 +189,6 @@ If you're picking this up after the hackathon:
|
|
| 189 |
1. **Start with OpenAlex** - biggest bang for buck
|
| 190 |
2. **Add rate limiting** - prevents API blocks
|
| 191 |
3. **Don't bother with bioRxiv** - use Europe PMC instead
|
| 192 |
-
4. **Reference repo is gold** - `reference_repos/
|
| 193 |
|
| 194 |
Good luck!
|
|
|
|
| 1 |
+
# DeepBoner Data Sources: Roadmap Summary
|
| 2 |
|
| 3 |
**Created**: 2024-11-27
|
| 4 |
**Purpose**: Future maintainability and hackathon continuation
|
|
|
|
| 131 |
```
|
| 132 |
|
| 133 |
2. **Copy OpenAlex tool from reference repo**
|
| 134 |
+
- File: `reference_repos/DeepBoner/DeepResearch/src/tools/openalex_tools.py`
|
| 135 |
- Adapt to our `SearchTool` base class
|
| 136 |
|
| 137 |
3. **Enable NCBI API Key**
|
|
|
|
| 189 |
1. **Start with OpenAlex** - biggest bang for buck
|
| 190 |
2. **Add rate limiting** - prevents API blocks
|
| 191 |
3. **Don't bother with bioRxiv** - use Europe PMC instead
|
| 192 |
+
4. **Reference repo is gold** - `reference_repos/DeepBoner/` has working implementations
|
| 193 |
|
| 194 |
Good luck!
|
|
@@ -24,9 +24,9 @@
|
|
| 24 |
|
| 25 |
---
|
| 26 |
|
| 27 |
-
## Reference Implementation (
|
| 28 |
|
| 29 |
-
The reference repo at `reference_repos/
|
| 30 |
|
| 31 |
### Features We're Missing
|
| 32 |
|
|
|
|
| 24 |
|
| 25 |
---
|
| 26 |
|
| 27 |
+
## Reference Implementation (DeepBoner Reference Repo)
|
| 28 |
|
| 29 |
+
The reference repo at `reference_repos/DeepBoner/DeepResearch/src/tools/bioinformatics_tools.py` has a more sophisticated implementation:
|
| 30 |
|
| 31 |
### Features We're Missing
|
| 32 |
|
|
@@ -182,7 +182,7 @@ Europe PMC is more generous than NCBI:
|
|
| 182 |
# Recommend: 10-20 requests/second max
|
| 183 |
# Use email in User-Agent for polite pool
|
| 184 |
headers = {
|
| 185 |
-
"User-Agent": "
|
| 186 |
}
|
| 187 |
```
|
| 188 |
|
|
|
|
| 182 |
# Recommend: 10-20 requests/second max
|
| 183 |
# Use email in User-Agent for polite pool
|
| 184 |
headers = {
|
| 185 |
+
"User-Agent": "DeepBoner/1.0 (mailto:your@email.com)"
|
| 186 |
}
|
| 187 |
```
|
| 188 |
|
|
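
The header in the hunk above can be attached to a request as follows. This sketch only builds the request object and does not contact the network. The `build_request` helper and the query string are illustrative assumptions, and the endpoint is Europe PMC's public REST search URL rather than anything taken from this repository.

```python
import urllib.request


def build_request(url: str, contact_email: str) -> urllib.request.Request:
    """Build a GET request that identifies the client and a contact address."""
    headers = {"User-Agent": f"DeepBoner/1.0 (mailto:{contact_email})"}
    return urllib.request.Request(url, headers=headers)


req = build_request(
    "https://www.ebi.ac.uk/europepmc/webservices/rest/search?query=sildenafil&format=json",
    "your@email.com",
)
# urllib normalizes header names to "User-agent" internally.
print(req.get_header("User-agent"))
```

Passing the request to `urllib.request.urlopen` would perform the actual call; any rate limiting, such as the 10-20 requests per second suggested in the comment above, has to be enforced by the caller.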
@@ -2,7 +2,7 @@
|
|
| 2 |
|
| 3 |
**Status**: NOT Implemented (Candidate for Addition)
|
| 4 |
**Priority**: HIGH - Could Replace Multiple Tools
|
| 5 |
-
**Reference**: Already implemented in `reference_repos/
|
| 6 |
|
| 7 |
---
|
| 8 |
|
|
@@ -20,7 +20,7 @@ OpenAlex is a **fully open** index of the global research system:
|
|
| 20 |
|
| 21 |
---
|
| 22 |
|
| 23 |
-
## Why OpenAlex for
|
| 24 |
|
| 25 |
### Current Architecture
|
| 26 |
|
|
@@ -60,7 +60,7 @@ Orchestrator (enrich with CT.gov for trials)
|
|
| 60 |
|
| 61 |
## Reference Implementation
|
| 62 |
|
| 63 |
-
From `reference_repos/
|
| 64 |
|
| 65 |
```python
|
| 66 |
class OpenAlexFetchTool(ToolRunner):
|
|
@@ -212,7 +212,7 @@ class OpenAlexTool(SearchTool):
|
|
| 212 |
"filter": "type:article,is_oa:true",
|
| 213 |
"sort": "cited_by_count:desc",
|
| 214 |
"per_page": max_results,
|
| 215 |
-
"mailto": "
|
| 216 |
},
|
| 217 |
)
|
| 218 |
data = resp.json()
|
|
|
|
| 2 |
|
| 3 |
**Status**: NOT Implemented (Candidate for Addition)
|
| 4 |
**Priority**: HIGH - Could Replace Multiple Tools
|
| 5 |
+
**Reference**: Already implemented in `reference_repos/DeepBoner`
|
| 6 |
|
| 7 |
---
|
| 8 |
|
|
|
|
| 20 |
|
| 21 |
---
|
| 22 |
|
| 23 |
+
## Why OpenAlex for DeepBoner?
|
| 24 |
|
| 25 |
### Current Architecture
|
| 26 |
|
|
|
|
| 60 |
|
| 61 |
## Reference Implementation
|
| 62 |
|
| 63 |
+
From `reference_repos/DeepBoner/DeepResearch/src/tools/openalex_tools.py`:
|
| 64 |
|
| 65 |
```python
|
| 66 |
class OpenAlexFetchTool(ToolRunner):
|
|
|
|
| 212 |
"filter": "type:article,is_oa:true",
|
| 213 |
"sort": "cited_by_count:desc",
|
| 214 |
"per_page": max_results,
|
| 215 |
+
"mailto": "deepboner@example.com", # Polite pool
|
| 216 |
},
|
| 217 |
)
|
| 218 |
data = resp.json()
|
|
@@ -305,7 +305,7 @@ class OpenAlexTool:
|
|
| 305 |
Args:
|
| 306 |
email: Optional email for polite pool (faster responses)
|
| 307 |
"""
|
| 308 |
-
self.email = email or "
|
| 309 |
|
| 310 |
@property
|
| 311 |
def name(self) -> str:
|
|
|
|
| 305 |
Args:
|
| 306 |
email: Optional email for polite pool (faster responses)
|
| 307 |
"""
|
| 308 |
+
self.email = email or "deepboner@example.com"
|
| 309 |
|
| 310 |
@property
|
| 311 |
def name(self) -> str:
|
|
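
The two OpenAlex hunks above share one idea: fall back to a placeholder address when no email is supplied, then send it as the `mailto` query parameter. A minimal sketch of that pattern follows. The class name `OpenAlexQuery`, the `build_url` method, and the use of the `search` parameter are assumptions for illustration; the filter, sort, and default email values are copied from the diff.

```python
from typing import Optional
from urllib.parse import urlencode


class OpenAlexQuery:
    """Hypothetical helper that builds OpenAlex works-search URLs."""

    BASE_URL = "https://api.openalex.org/works"

    def __init__(self, email: Optional[str] = None) -> None:
        # Same fallback as in the diff: use a placeholder when no email is given.
        self.email = email or "deepboner@example.com"

    def build_url(self, query: str, max_results: int = 10) -> str:
        params = {
            "search": query,  # assumed parameter; not visible in the hunk
            "filter": "type:article,is_oa:true",
            "sort": "cited_by_count:desc",
            "per_page": max_results,
            "mailto": self.email,  # polite pool
        }
        return f"{self.BASE_URL}?{urlencode(params)}"


url = OpenAlexQuery().build_url("testosterone therapy HSDD", max_results=5)
print(url)
```

In practice a real contact address should replace the `example.com` placeholder, since the point of the polite pool is that the API operator can reach the client's maintainer.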
@@ -167,7 +167,7 @@ The refactor branch (`feat/pubmed-fulltext`) has some valuable improvements:
|
|
| 167 |
## 9. Questions to Answer Before Proceeding
|
| 168 |
|
| 169 |
1. **For the hackathon**: Do we need full multi-agent orchestration, or is single-agent sufficient?
|
| 170 |
-
2. **For
|
| 171 |
3. **Timeline**: How much time do we have to get this right?
|
| 172 |
|
| 173 |
---
|
|
|
|
| 167 |
## 9. Questions to Answer Before Proceeding
|
| 168 |
|
| 169 |
1. **For the hackathon**: Do we need full multi-agent orchestration, or is single-agent sufficient?
|
| 170 |
+
2. **For DeepBoner mainline**: Is the plan to use Microsoft Agent Framework for orchestration?
|
| 171 |
3. **Timeline**: How much time do we have to get this right?
|
| 172 |
|
| 173 |
---
|
|
@@ -6,7 +6,7 @@ Copy and paste everything below this line to a fresh Claude/AI session:
|
|
| 6 |
|
| 7 |
## Context
|
| 8 |
|
| 9 |
-
I am a junior developer working on a HuggingFace hackathon project called
|
| 10 |
|
| 11 |
## The Situation
|
| 12 |
|
|
@@ -62,28 +62,28 @@ Please perform a **deep, critical review** of:
|
|
| 62 |
|
| 63 |
Please read these files in order:
|
| 64 |
|
| 65 |
-
1. `/Users/ray/Desktop/CLARITY-DIGITAL-TWIN/
|
| 66 |
-
2. `/Users/ray/Desktop/CLARITY-DIGITAL-TWIN/
|
| 67 |
-
3. `/Users/ray/Desktop/CLARITY-DIGITAL-TWIN/
|
| 68 |
-
4. `/Users/ray/Desktop/CLARITY-DIGITAL-TWIN/
|
| 69 |
|
| 70 |
And the architecture diagram:
|
| 71 |
-
5. `/Users/ray/Desktop/CLARITY-DIGITAL-TWIN/
|
| 72 |
|
| 73 |
## Reference Repositories to Consult
|
| 74 |
|
| 75 |
We have local clones of the source-of-truth repositories:
|
| 76 |
|
| 77 |
-
- **Original
|
| 78 |
-
- **Microsoft Agent Framework:** `/Users/ray/Desktop/CLARITY-DIGITAL-TWIN/
|
| 79 |
-
- **Microsoft AutoGen:** `/Users/ray/Desktop/CLARITY-DIGITAL-TWIN/
|
| 80 |
|
| 81 |
Please cross-reference our hackathon fork against these to verify architectural alignment.
|
| 82 |
|
| 83 |
## Codebase to Analyze
|
| 84 |
|
| 85 |
Our hackathon fork is at:
|
| 86 |
-
`/Users/ray/Desktop/CLARITY-DIGITAL-TWIN/
|
| 87 |
|
| 88 |
Key files to examine:
|
| 89 |
- `src/agents/` - Agent framework integration
|
|
|
|
| 6 |
|
| 7 |
## Context
|
| 8 |
|
| 9 |
+
I am a junior developer working on a HuggingFace hackathon project called DeepBoner. We made a significant architectural mistake and are now trying to course-correct. I need you to act as a **senior staff engineer** and critically review our proposed solution.
|
| 10 |
|
| 11 |
## The Situation
|
| 12 |
|
|
|
|
| 62 |
|
| 63 |
Please read these files in order:
|
| 64 |
|
| 65 |
+
1. `/Users/ray/Desktop/CLARITY-DIGITAL-TWIN/DeepBoner-1/docs/brainstorming/magentic-pydantic/00_SITUATION_AND_PLAN.md`
|
| 66 |
+
2. `/Users/ray/Desktop/CLARITY-DIGITAL-TWIN/DeepBoner-1/docs/brainstorming/magentic-pydantic/01_ARCHITECTURE_SPEC.md`
|
| 67 |
+
3. `/Users/ray/Desktop/CLARITY-DIGITAL-TWIN/DeepBoner-1/docs/brainstorming/magentic-pydantic/02_IMPLEMENTATION_PHASES.md`
|
| 68 |
+
4. `/Users/ray/Desktop/CLARITY-DIGITAL-TWIN/DeepBoner-1/docs/brainstorming/magentic-pydantic/03_IMMEDIATE_ACTIONS.md`
|
| 69 |
|
| 70 |
And the architecture diagram:
|
| 71 |
+
5. `/Users/ray/Desktop/CLARITY-DIGITAL-TWIN/DeepBoner-1/assets/magentic-pydantic.png`
|
| 72 |
|
| 73 |
## Reference Repositories to Consult
|
| 74 |
|
| 75 |
We have local clones of the source-of-truth repositories:
|
| 76 |
|
| 77 |
+
- **Original DeepBoner:** `/Users/ray/Desktop/CLARITY-DIGITAL-TWIN/DeepBoner-1/reference_repos/DeepBoner/`
|
| 78 |
+
- **Microsoft Agent Framework:** `/Users/ray/Desktop/CLARITY-DIGITAL-TWIN/DeepBoner-1/reference_repos/agent-framework/`
|
| 79 |
+
- **Microsoft AutoGen:** `/Users/ray/Desktop/CLARITY-DIGITAL-TWIN/DeepBoner-1/reference_repos/autogen-microsoft/`
|
| 80 |
|
| 81 |
Please cross-reference our hackathon fork against these to verify architectural alignment.
|
| 82 |
|
| 83 |
## Codebase to Analyze
|
| 84 |
|
| 85 |
Our hackathon fork is at:
|
| 86 |
+
`/Users/ray/Desktop/CLARITY-DIGITAL-TWIN/DeepBoner-1/`
|
| 87 |
|
| 88 |
Key files to examine:
|
| 89 |
- `src/agents/` - Agent framework integration
|
|
@@ -55,7 +55,7 @@ def create_demo():
|
|
| 55 |
def create_demo():
|
| 56 |
return gr.ChatInterface( # <--- FIX: Top-level component
|
| 57 |
...,
|
| 58 |
-
title="🧬
|
| 59 |
description="*AI-Powered Drug Repurposing Agent...*\n\n---\n**MCP Server Active**...",
|
| 60 |
additional_inputs_accordion=gr.Accordion(label="⚙️ Settings", open=False)
|
| 61 |
)
|
|
@@ -69,7 +69,7 @@ def create_demo():
|
|
| 69 |
2. **Check**: Open `http://localhost:7860`
|
| 70 |
3. **Verify**:
|
| 71 |
* Settings accordion starts **COLLAPSED**.
|
| 72 |
-
* Header title ("
|
| 73 |
* Footer text ("MCP Server Active") is visible in the description area.
|
| 74 |
* Chat functionality works (Magentic/Simple modes).
|
| 75 |
|
|
|
|
| 55 |
def create_demo():
|
| 56 |
return gr.ChatInterface( # <--- FIX: Top-level component
|
| 57 |
...,
|
| 58 |
+
title="🧬 DeepBoner",
|
| 59 |
description="*AI-Powered Drug Repurposing Agent...*\n\n---\n**MCP Server Active**...",
|
| 60 |
additional_inputs_accordion=gr.Accordion(label="⚙️ Settings", open=False)
|
| 61 |
)
|
|
|
|
| 69 |
2. **Check**: Open `http://localhost:7860`
|
| 70 |
3. **Verify**:
|
| 71 |
* Settings accordion starts **COLLAPSED**.
|
| 72 |
+
* Header title ("DeepBoner") is visible.
|
| 73 |
* Footer text ("MCP Server Active") is visible in the description area.
|
| 74 |
* Chat functionality works (Magentic/Simple modes).
|
| 75 |
|
|
@@ -1,5 +1,5 @@
|
|
| 1 |
# Testing Strategy
|
| 2 |
-
## ensuring
|
| 3 |
|
| 4 |
---
|
| 5 |
|
|
|
|
| 1 |
# Testing Strategy
|
| 2 |
+
## ensuring DeepBoner is Ironclad
|
| 3 |
|
| 4 |
---
|
| 5 |
|
|
@@ -1,11 +1,11 @@
|
|
| 1 |
# Deployment Guide
|
| 2 |
-
## Launching
|
| 3 |
|
| 4 |
---
|
| 5 |
|
| 6 |
## Overview
|
| 7 |
|
| 8 |
-
|
| 9 |
|
| 10 |
1. **HuggingFace Spaces**: Host the Gradio UI (User Interface).
|
| 11 |
2. **MCP Server**: Expose research tools to Claude Desktop/Agents.
|
|
@@ -69,10 +69,10 @@ def predict(message, history):
|
|
| 69 |
```json
|
| 70 |
{
|
| 71 |
"mcpServers": {
|
| 72 |
-
"
|
| 73 |
"command": "uv",
|
| 74 |
"args": ["run", "fastmcp", "run", "src/mcp_servers/pubmed_server.py"],
|
| 75 |
-
"cwd": "/absolute/path/to/
|
| 76 |
}
|
| 77 |
}
|
| 78 |
}
|
|
@@ -111,7 +111,7 @@ Instead of calling Anthropic API, we call a Modal function:
|
|
| 111 |
# src/llm/modal_client.py
|
| 112 |
import modal
|
| 113 |
|
| 114 |
-
stub = modal.Stub("
|
| 115 |
|
| 116 |
@stub.function(gpu="A100")
|
| 117 |
def generate_text(prompt: str):
|
|
|
|
| 1 |
# Deployment Guide
|
| 2 |
+
## Launching DeepBoner: Gradio, MCP, & Modal
|
| 3 |
|
| 4 |
---
|
| 5 |
|
| 6 |
## Overview
|
| 7 |
|
| 8 |
+
DeepBoner is designed for a multi-platform deployment strategy to maximize hackathon impact:
|
| 9 |
|
| 10 |
1. **HuggingFace Spaces**: Host the Gradio UI (User Interface).
|
| 11 |
2. **MCP Server**: Expose research tools to Claude Desktop/Agents.
|
|
|
|
| 69 |
```json
|
| 70 |
{
|
| 71 |
"mcpServers": {
|
| 72 |
+
"deepboner": {
|
| 73 |
"command": "uv",
|
| 74 |
"args": ["run", "fastmcp", "run", "src/mcp_servers/pubmed_server.py"],
|
| 75 |
+
"cwd": "/absolute/path/to/DeepBoner"
|
| 76 |
}
|
| 77 |
}
|
| 78 |
}
|
|
|
|
| 111 |
# src/llm/modal_client.py
|
| 112 |
import modal
|
| 113 |
|
| 114 |
+
stub = modal.Stub("deepboner-inference")
|
| 115 |
|
| 116 |
@stub.function(gpu="A100")
|
| 117 |
def generate_text(prompt: str):
|
|
@@ -23,7 +23,7 @@ uv --version # Should be >= 0.4.0
|
|
| 23 |
|
| 24 |
```bash
|
| 25 |
# From project root
|
| 26 |
-
uv init --name
|
| 27 |
uv python install 3.11 # Pin Python version
|
| 28 |
```
|
| 29 |
|
|
@@ -35,9 +35,9 @@ uv python install 3.11 # Pin Python version
|
|
| 35 |
|
| 36 |
```toml
|
| 37 |
[project]
|
| 38 |
-
name = "
|
| 39 |
version = "0.1.0"
|
| 40 |
-
description = "AI-Native
|
| 41 |
readme = "README.md"
|
| 42 |
requires-python = ">=3.11"
|
| 43 |
dependencies = [
|
|
@@ -401,25 +401,25 @@ settings = get_settings()
|
|
| 401 |
### `src/utils/exceptions.py`
|
| 402 |
|
| 403 |
```python
|
| 404 |
-
"""Custom exceptions for
|
| 405 |
|
| 406 |
|
| 407 |
-
class
|
| 408 |
-
"""Base exception for all
|
| 409 |
pass
|
| 410 |
|
| 411 |
|
| 412 |
-
class SearchError(
|
| 413 |
"""Raised when a search operation fails."""
|
| 414 |
pass
|
| 415 |
|
| 416 |
|
| 417 |
-
class JudgeError(
|
| 418 |
"""Raised when the judge fails to assess evidence."""
|
| 419 |
pass
|
| 420 |
|
| 421 |
|
| 422 |
-
class ConfigurationError(
|
| 423 |
"""Raised when configuration is invalid."""
|
| 424 |
pass
|
| 425 |
|
|
@@ -558,7 +558,7 @@ uv run pre-commit install
|
|
| 558 |
## 10. Implementation Checklist
|
| 559 |
|
| 560 |
- [ ] Install `uv` and verify version
|
| 561 |
-
- [ ] Run `uv init --name
|
| 562 |
- [ ] Create `pyproject.toml` (copy from above)
|
| 563 |
- [ ] Create directory structure (run mkdir commands)
|
| 564 |
- [ ] Create `.env.example` and `.env`
|
|
|
|
| 23 |
|
| 24 |
```bash
|
| 25 |
# From project root
|
| 26 |
+
uv init --name deepboner
|
| 27 |
uv python install 3.11 # Pin Python version
|
| 28 |
```
|
| 29 |
|
|
|
|
| 35 |
|
| 36 |
```toml
|
| 37 |
[project]
|
| 38 |
+
name = "deepboner"
|
| 39 |
version = "0.1.0"
|
| 40 |
+
description = "AI-Native Sexual Health Research Agent"
|
| 41 |
readme = "README.md"
|
| 42 |
requires-python = ">=3.11"
|
| 43 |
dependencies = [
|
|
|
|
| 401 |
### `src/utils/exceptions.py`
|
| 402 |
|
| 403 |
```python
|
| 404 |
+
"""Custom exceptions for DeepBoner."""
|
| 405 |
|
| 406 |
|
| 407 |
+
class DeepBonerError(Exception):
|
| 408 |
+
"""Base exception for all DeepBoner errors."""
|
| 409 |
pass
|
| 410 |
|
| 411 |
|
| 412 |
+
class SearchError(DeepBonerError):
|
| 413 |
"""Raised when a search operation fails."""
|
| 414 |
pass
|
| 415 |
|
| 416 |
|
| 417 |
+
class JudgeError(DeepBonerError):
|
| 418 |
"""Raised when the judge fails to assess evidence."""
|
| 419 |
pass
|
| 420 |
|
| 421 |
|
| 422 |
+
class ConfigurationError(DeepBonerError):
|
| 423 |
"""Raised when configuration is invalid."""
|
| 424 |
pass
|
| 425 |
|
|
|
|
| 558 |
## 10. Implementation Checklist
|
| 559 |
|
| 560 |
- [ ] Install `uv` and verify version
|
| 561 |
+
- [ ] Run `uv init --name deepboner`
|
| 562 |
- [ ] Create `pyproject.toml` (copy from above)
|
| 563 |
- [ ] Create directory structure (run mkdir commands)
|
| 564 |
- [ ] Create `.env.example` and `.env`
|
|
@@ -401,7 +401,7 @@ Found {len(evidence)} sources. Consider refining your query for more specific re
|
|
| 401 |
Using Gradio 5 generator pattern for real-time streaming.
|
| 402 |
|
| 403 |
```python
|
| 404 |
-
"""Gradio UI for
|
| 405 |
import asyncio
|
| 406 |
import gradio as gr
|
| 407 |
from typing import AsyncGenerator
|
|
@@ -557,11 +557,11 @@ def create_demo() -> gr.Blocks:
|
|
| 557 |
Configured Gradio Blocks interface
|
| 558 |
"""
|
| 559 |
with gr.Blocks(
|
| 560 |
-
title="
|
| 561 |
theme=gr.themes.Soft(),
|
| 562 |
) as demo:
|
| 563 |
gr.Markdown("""
|
| 564 |
-
# 🧬
|
| 565 |
## AI-Powered Drug Repurposing Research Agent
|
| 566 |
|
| 567 |
Ask questions about potential drug repurposing opportunities.
|
|
@@ -935,7 +935,7 @@ class TestAgentEvent:
|
|
| 935 |
## 6. Dockerfile
|
| 936 |
|
| 937 |
```dockerfile
|
| 938 |
-
# Dockerfile for
|
| 939 |
FROM python:3.11-slim
|
| 940 |
|
| 941 |
# Set working directory
|
|
@@ -975,7 +975,7 @@ Create `README.md` header for HuggingFace Spaces:
|
|
| 975 |
|
| 976 |
```markdown
|
| 977 |
---
|
| 978 |
-
title:
|
| 979 |
emoji: 🧬
|
| 980 |
colorFrom: blue
|
| 981 |
colorTo: purple
|
|
@@ -986,7 +986,7 @@ pinned: false
|
|
| 986 |
license: mit
|
| 987 |
---
|
| 988 |
|
| 989 |
-
#
|
| 990 |
|
| 991 |
AI-Powered Drug Repurposing Research Agent
|
| 992 |
```
|
|
@@ -1088,7 +1088,7 @@ After deployment to HuggingFace Spaces:
|
|
| 1088 |
|
| 1089 |
## Project Complete! π
|
| 1090 |
|
| 1091 |
-
When Phase 4 is done, the
|
| 1092 |
|
| 1093 |
- **Phase 1**: Foundation (uv, pytest, config) β
|
| 1094 |
- **Phase 2**: Search Slice (PubMed, DuckDuckGo) β
|
|
|
|
| 401 |
Using Gradio 5 generator pattern for real-time streaming.
|
| 402 |
|
| 403 |
```python
|
| 404 |
+
"""Gradio UI for DeepBoner agent."""
|
| 405 |
import asyncio
|
| 406 |
import gradio as gr
|
| 407 |
from typing import AsyncGenerator
|
|
|
|
| 557 |
Configured Gradio Blocks interface
|
| 558 |
"""
|
| 559 |
with gr.Blocks(
|
| 560 |
+
title="DeepBoner - Drug Repurposing Research Agent",
|
| 561 |
theme=gr.themes.Soft(),
|
| 562 |
) as demo:
|
| 563 |
gr.Markdown("""
|
| 564 |
+
# 🧬 DeepBoner
|
| 565 |
## AI-Powered Drug Repurposing Research Agent
|
| 566 |
|
| 567 |
Ask questions about potential drug repurposing opportunities.
|
|
|
|
| 935 |
## 6. Dockerfile
|
| 936 |
|
| 937 |
```dockerfile
|
| 938 |
+
# Dockerfile for DeepBoner
|
| 939 |
FROM python:3.11-slim
|
| 940 |
|
| 941 |
# Set working directory
|
|
|
|
| 975 |
|
| 976 |
```markdown
|
| 977 |
---
|
| 978 |
+
title: DeepBoner
|
| 979 |
emoji: 🧬
|
| 980 |
colorFrom: blue
|
| 981 |
colorTo: purple
|
|
|
|
| 986 |
license: mit
|
| 987 |
---
|
| 988 |
|
| 989 |
+
# DeepBoner
|
| 990 |
|
| 991 |
AI-Powered Drug Repurposing Research Agent
|
| 992 |
```
|
|
|
|
| 1088 |
|
| 1089 |
## Project Complete! π
|
| 1090 |
|
| 1091 |
+
When Phase 4 is done, the DeepBoner MVP is complete:
|
| 1092 |
|
| 1093 |
- **Phase 1**: Foundation (uv, pytest, config) β
|
| 1094 |
- **Phase 2**: Search Slice (PubMed, DuckDuckGo) β
|
|
@@ -185,7 +185,7 @@ class ClinicalTrialsTool:
|
|
| 185 |
requests.get,
|
| 186 |
self.BASE_URL,
|
| 187 |
params=params,
|
| 188 |
-
headers={"User-Agent": "
|
| 189 |
timeout=30,
|
| 190 |
)
|
| 191 |
response.raise_for_status()
|
|
@@ -434,4 +434,4 @@ source .env && uv run python examples/search_demo/run_search.py "metformin alzhe
|
|
| 434 |
| No phase info | Phase I/II/III evidence strength |
|
| 435 |
|
| 436 |
**Demo pitch addition**:
|
| 437 |
-
> "
|
|
|
|
| 185 |
requests.get,
|
| 186 |
self.BASE_URL,
|
| 187 |
params=params,
|
| 188 |
+
headers={"User-Agent": "DeepBoner-Research-Agent/1.0"},
|
| 189 |
timeout=30,
|
| 190 |
)
|
| 191 |
response.raise_for_status()
|
|
|
|
| 434 |
| No phase info | Phase I/II/III evidence strength |
|
| 435 |
|
| 436 |
**Demo pitch addition**:
|
| 437 |
+
> "DeepBoner searches PubMed for peer-reviewed evidence AND ClinicalTrials.gov for 400,000+ clinical trials."
|
|
@@ -531,7 +531,7 @@ source .env && uv run python examples/search_demo/run_search.py "metformin diabe
|
|
| 531 |
| Miss cutting-edge | Catch breakthroughs early |
|
| 532 |
|
| 533 |
**Demo pitch (final)**:
|
| 534 |
-
> "
|
| 535 |
|
| 536 |
---
|
| 537 |
|
|
|
|
| 531 |
| Miss cutting-edge | Catch breakthroughs early |
|
| 532 |
|
| 533 |
**Demo pitch (final)**:
|
| 534 |
+
> "DeepBoner searches PubMed for peer-reviewed evidence, ClinicalTrials.gov for 400,000+ clinical trials, and bioRxiv/medRxiv for cutting-edge preprints - then uses LLMs to generate mechanistic hypotheses and synthesize findings into publication-quality reports."
|
| 535 |
|
| 536 |
---
|
| 537 |
|
|
@@ -1,6 +1,6 @@
|
|
| 1 |
# Phase 12 Implementation Spec: MCP Server Integration
|
| 2 |
|
| 3 |
-
**Goal**: Expose
|
| 4 |
**Philosophy**: "MCP is the bridge between tools and LLMs."
|
| 5 |
**Prerequisite**: Phase 11 complete (all search tools working)
|
| 6 |
**Priority**: P0 - REQUIRED FOR HACKATHON TRACK 2
|
|
@@ -121,7 +121,7 @@ https://[space-id].hf.space/gradio_api/mcp/
|
|
| 121 |
### 4.1 MCP Tool Wrappers (`src/mcp_tools.py`)
|
| 122 |
|
| 123 |
```python
|
| 124 |
-
"""MCP tool wrappers for
|
| 125 |
|
| 126 |
These functions expose our search tools via MCP protocol.
|
| 127 |
Each function follows the MCP tool contract:
|
|
@@ -130,15 +130,15 @@ Each function follows the MCP tool contract:
|
|
| 130 |
- Formatted string returns
|
| 131 |
"""
|
| 132 |
|
| 133 |
-
from src.tools.biorxiv import BioRxivTool
|
| 134 |
from src.tools.clinicaltrials import ClinicalTrialsTool
|
|
|
|
| 135 |
from src.tools.pubmed import PubMedTool
|
| 136 |
|
| 137 |
|
| 138 |
# Singleton instances (avoid recreating on each call)
|
| 139 |
_pubmed = PubMedTool()
|
| 140 |
_trials = ClinicalTrialsTool()
|
| 141 |
-
|
| 142 |
|
| 143 |
|
| 144 |
async def search_pubmed(query: str, max_results: int = 10) -> str:
|
|
@@ -202,10 +202,10 @@ async def search_clinical_trials(query: str, max_results: int = 10) -> str:
|
|
| 202 |
return "\n".join(formatted)
|
| 203 |
|
| 204 |
|
| 205 |
-
async def
|
| 206 |
-
"""Search
|
| 207 |
|
| 208 |
-
Searches
|
| 209 |
Note: Preprints are NOT peer-reviewed but contain the latest findings.
|
| 210 |
|
| 211 |
Args:
|
|
@@ -217,10 +217,10 @@ async def search_biorxiv(query: str, max_results: int = 10) -> str:
|
|
| 217 |
"""
|
| 218 |
max_results = max(1, min(50, max_results))
|
| 219 |
|
| 220 |
-
results = await
|
| 221 |
|
| 222 |
if not results:
|
| 223 |
-
return f"No
|
| 224 |
|
| 225 |
formatted = [f"## Preprint Results for: {query}\n"]
|
| 226 |
for i, evidence in enumerate(results, 1):
|
|
@@ -236,7 +236,7 @@ async def search_biorxiv(query: str, max_results: int = 10) -> str:
|
|
| 236 |
async def search_all_sources(query: str, max_per_source: int = 5) -> str:
|
| 237 |
"""Search all biomedical sources simultaneously.
|
| 238 |
|
| 239 |
-
Performs parallel search across PubMed, ClinicalTrials.gov, and
|
| 240 |
This is the most comprehensive search option for drug repurposing research.
|
| 241 |
|
| 242 |
Args:
|
|
@@ -253,10 +253,10 @@ async def search_all_sources(query: str, max_per_source: int = 5) -> str:
|
|
| 253 |
# Run all searches in parallel
|
| 254 |
pubmed_task = search_pubmed(query, max_per_source)
|
| 255 |
trials_task = search_clinical_trials(query, max_per_source)
|
| 256 |
-
|
| 257 |
|
| 258 |
-
pubmed_results, trials_results,
|
| 259 |
-
pubmed_task, trials_task,
|
| 260 |
)
|
| 261 |
|
| 262 |
formatted = [f"# Comprehensive Search: {query}\n"]
|
|
@@ -272,10 +272,10 @@ async def search_all_sources(query: str, max_per_source: int = 5) -> str:
|
|
| 272 |
else:
|
| 273 |
formatted.append(f"## Clinical Trials\n*Error: {trials_results}*\n")
|
| 274 |
|
| 275 |
-
if isinstance(
|
| 276 |
-
formatted.append(
|
| 277 |
else:
|
| 278 |
-
formatted.append(f"## Preprints\n*Error: {
|
| 279 |
|
| 280 |
return "\n---\n".join(formatted)
|
| 281 |
```
|
|
@@ -283,7 +283,7 @@ async def search_all_sources(query: str, max_per_source: int = 5) -> str:
|
|
| 283 |
### 4.2 Update Gradio App (`src/app.py`)
|
| 284 |
|
| 285 |
```python
|
| 286 |
-
"""Gradio UI for
|
| 287 |
|
| 288 |
import os
|
| 289 |
from collections.abc import AsyncGenerator
|
|
@@ -294,12 +294,12 @@ import gradio as gr
|
|
| 294 |
from src.agent_factory.judges import JudgeHandler, MockJudgeHandler
|
| 295 |
from src.mcp_tools import (
|
| 296 |
search_all_sources,
|
| 297 |
-
|
| 298 |
search_clinical_trials,
|
| 299 |
search_pubmed,
|
| 300 |
)
|
| 301 |
from src.orchestrator_factory import create_orchestrator
|
| 302 |
-
from src.tools.
|
| 303 |
from src.tools.clinicaltrials import ClinicalTrialsTool
|
| 304 |
from src.tools.pubmed import PubMedTool
|
| 305 |
from src.tools.search_handler import SearchHandler
|
|
@@ -317,15 +317,15 @@ def create_demo() -> Any:
|
|
| 317 |
Configured Gradio Blocks interface with MCP server enabled
|
| 318 |
"""
|
| 319 |
with gr.Blocks(
|
| 320 |
-
title="
|
| 321 |
theme=gr.themes.Soft(),
|
| 322 |
) as demo:
|
| 323 |
gr.Markdown("""
|
| 324 |
-
#
|
| 325 |
## AI-Powered Drug Repurposing Research Agent
|
| 326 |
|
| 327 |
Ask questions about potential drug repurposing opportunities.
|
| 328 |
-
The agent searches PubMed, ClinicalTrials.gov, and
|
| 329 |
|
| 330 |
**Example questions:**
|
| 331 |
- "What drugs could be repurposed for Alzheimer's disease?"
|
|
@@ -381,13 +381,13 @@ def create_demo() -> Any:
|
|
| 381 |
|
| 382 |
with gr.Tab("Preprints"):
|
| 383 |
gr.Interface(
|
| 384 |
-
fn=
|
| 385 |
inputs=[
|
| 386 |
gr.Textbox(label="Query", placeholder="long covid treatment"),
|
| 387 |
gr.Slider(1, 50, value=10, step=1, label="Max Results"),
|
| 388 |
],
|
| 389 |
outputs=gr.Markdown(label="Results"),
|
| 390 |
-
api_name="
|
| 391 |
)
|
| 392 |
|
| 393 |
with gr.Tab("Search All"):
|
|
@@ -406,7 +406,7 @@ def create_demo() -> Any:
|
|
| 406 |
**Note**: This is a research tool and should not be used for medical decisions.
|
| 407 |
Always consult healthcare professionals for medical advice.
|
| 408 |
|
| 409 |
-
Built with PydanticAI + PubMed, ClinicalTrials.gov &
|
| 410 |
|
| 411 |
**MCP Server**: Available at `/gradio_api/mcp/` for Claude Desktop integration
|
| 412 |
""")
|
|
@@ -444,7 +444,7 @@ import pytest
|
|
| 444 |
|
| 445 |
from src.mcp_tools import (
|
| 446 |
search_all_sources,
|
| 447 |
-
|
| 448 |
search_clinical_trials,
|
| 449 |
search_pubmed,
|
| 450 |
)
|
|
@@ -525,18 +525,18 @@ class TestSearchClinicalTrials:
|
|
| 525 |
assert "Clinical Trials" in result
|
| 526 |
|
| 527 |
|
| 528 |
-
class
|
| 529 |
-
"""Tests for
|
| 530 |
|
| 531 |
@pytest.mark.asyncio
|
| 532 |
async def test_returns_formatted_string(self, mock_evidence: Evidence) -> None:
|
| 533 |
"""Should return formatted markdown string."""
|
| 534 |
-
mock_evidence.citation.source = "
|
| 535 |
|
| 536 |
-
with patch("src.mcp_tools.
|
| 537 |
mock_tool.search = AsyncMock(return_value=[mock_evidence])
|
| 538 |
|
| 539 |
-
result = await
|
| 540 |
|
| 541 |
assert isinstance(result, str)
|
| 542 |
assert "Preprint Results" in result
|
|
@@ -550,11 +550,11 @@ class TestSearchAllSources:
|
|
| 550 |
"""Should combine results from all sources."""
|
| 551 |
with patch("src.mcp_tools.search_pubmed", new_callable=AsyncMock) as mock_pubmed, \
|
| 552 |
patch("src.mcp_tools.search_clinical_trials", new_callable=AsyncMock) as mock_trials, \
|
| 553 |
-
patch("src.mcp_tools.
|
| 554 |
|
| 555 |
mock_pubmed.return_value = "## PubMed Results"
|
| 556 |
mock_trials.return_value = "## Clinical Trials"
|
| 557 |
-
|
| 558 |
|
| 559 |
result = await search_all_sources("metformin", 5)
|
| 560 |
|
|
@@ -568,11 +568,11 @@ class TestSearchAllSources:
|
|
| 568 |
"""Should handle partial failures gracefully."""
|
| 569 |
with patch("src.mcp_tools.search_pubmed", new_callable=AsyncMock) as mock_pubmed, \
|
| 570 |
patch("src.mcp_tools.search_clinical_trials", new_callable=AsyncMock) as mock_trials, \
|
| 571 |
-
patch("src.mcp_tools.
|
| 572 |
|
| 573 |
mock_pubmed.return_value = "## PubMed Results"
|
| 574 |
mock_trials.side_effect = Exception("API Error")
|
| 575 |
-
|
| 576 |
|
| 577 |
result = await search_all_sources("metformin", 5)
|
| 578 |
|
|
@@ -599,10 +599,10 @@ class TestMCPDocstrings:
|
|
| 599 |
assert search_clinical_trials.__doc__ is not None
|
| 600 |
assert "Args:" in search_clinical_trials.__doc__
|
| 601 |
|
| 602 |
-
def
|
| 603 |
"""Docstring must have Args section for MCP schema generation."""
|
| 604 |
-
assert
|
| 605 |
-
assert "Args:" in
|
| 606 |
|
| 607 |
def test_search_all_sources_has_args_section(self) -> None:
|
| 608 |
"""Docstring must have Args section for MCP schema generation."""
|
|
@@ -672,7 +672,7 @@ class TestMCPServerIntegration:
|
|
| 672 |
// %APPDATA%\Claude\claude_desktop_config.json (Windows)
|
| 673 |
{
|
| 674 |
"mcpServers": {
|
| 675 |
-
"
|
| 676 |
"url": "http://localhost:7860/gradio_api/mcp/"
|
| 677 |
}
|
| 678 |
}
|
|
@@ -684,8 +684,8 @@ class TestMCPServerIntegration:
|
|
| 684 |
```json
|
| 685 |
{
|
| 686 |
"mcpServers": {
|
| 687 |
-
"
|
| 688 |
-
"url": "https://
|
| 689 |
}
|
| 690 |
}
|
| 691 |
}
|
|
@@ -696,7 +696,7 @@ class TestMCPServerIntegration:
|
|
| 696 |
```json
|
| 697 |
{
|
| 698 |
"mcpServers": {
|
| 699 |
-
"
|
| 700 |
"url": "https://your-space.hf.space/gradio_api/mcp/",
|
| 701 |
"headers": {
|
| 702 |
"Authorization": "Bearer hf_xxxxxxxxxxxxx"
|
|
@@ -761,7 +761,7 @@ Phase 12 is **COMPLETE** when:
|
|
| 761 |
```
|
| 762 |
|
| 763 |
2. **Show Claude Desktop using our tools**:
|
| 764 |
-
- Open Claude Desktop with
|
| 765 |
- Ask: "Search PubMed for metformin Alzheimer's"
|
| 766 |
- Show real results appearing
|
| 767 |
- Ask: "Now search clinical trials for the same"
|
|
@@ -817,14 +817,14 @@ Phase 12 is **COMPLETE** when:
|
|
| 817 |
│                       Gradio MCP Server                        │
|
| 818 |
│                        /gradio_api/mcp/                        │
|
| 819 |
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌─────────┐ │
|
| 820 |
-
│ │search_pubmed │ │search_trials │ │
|
| 821 |
│ │              │ │              │ │              │ │all      │ │
|
| 822 |
│ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ └────┬────┘ │
|
| 823 |
└────────┼────────────────┼────────────────┼──────────────┼──────┘
|
| 824 |
         │                │                │              │
|
| 825 |
         ▼                ▼                ▼              ▼
|
| 826 |
    ┌──────────┐     ┌──────────┐     ┌──────────┐   (calls all)
|
| 827 |
-
    │PubMedTool│     │Trials    │     │
|
| 828 |
    │          │     │Tool      │     │Tool      │
|
| 829 |
    └──────────┘     └──────────┘     └──────────┘
|
| 830 |
```
|
|
|
|
| 1 |
# Phase 12 Implementation Spec: MCP Server Integration
|
| 2 |
|
| 3 |
+
**Goal**: Expose DeepBoner search tools as MCP servers for Track 2 compliance.
|
| 4 |
**Philosophy**: "MCP is the bridge between tools and LLMs."
|
| 5 |
**Prerequisite**: Phase 11 complete (all search tools working)
|
| 6 |
**Priority**: P0 - REQUIRED FOR HACKATHON TRACK 2
|
|
|
|
| 121 |
### 4.1 MCP Tool Wrappers (`src/mcp_tools.py`)
|
| 122 |
|
| 123 |
```python
|
| 124 |
+
"""MCP tool wrappers for DeepBoner search tools.
|
| 125 |
|
| 126 |
These functions expose our search tools via MCP protocol.
|
| 127 |
Each function follows the MCP tool contract:
|
|
|
|
| 130 |
- Formatted string returns
|
| 131 |
"""
|
| 132 |
|
|
|
|
| 133 |
from src.tools.clinicaltrials import ClinicalTrialsTool
|
| 134 |
+
from src.tools.europepmc import EuropePMCTool
|
| 135 |
from src.tools.pubmed import PubMedTool
|
| 136 |
|
| 137 |
|
| 138 |
# Singleton instances (avoid recreating on each call)
|
| 139 |
_pubmed = PubMedTool()
|
| 140 |
_trials = ClinicalTrialsTool()
|
| 141 |
+
_europepmc = EuropePMCTool()
|
| 142 |
|
| 143 |
|
| 144 |
async def search_pubmed(query: str, max_results: int = 10) -> str:
|
|
|
|
| 202 |
return "\n".join(formatted)
|
| 203 |
|
| 204 |
|
| 205 |
+
async def search_europepmc(query: str, max_results: int = 10) -> str:
|
| 206 |
+
"""Search Europe PMC for preprint and open access research.
|
| 207 |
|
| 208 |
+
Searches Europe PMC for preprints and open access papers.
|
| 209 |
Note: Preprints are NOT peer-reviewed but contain the latest findings.
|
| 210 |
|
| 211 |
Args:
|
|
|
|
| 217 |
"""
|
| 218 |
max_results = max(1, min(50, max_results))
|
| 219 |
|
| 220 |
+
results = await _europepmc.search(query, max_results)
|
| 221 |
|
| 222 |
if not results:
|
| 223 |
+
return f"No Europe PMC results found for: {query}"
|
| 224 |
|
| 225 |
formatted = [f"## Preprint Results for: {query}\n"]
|
| 226 |
for i, evidence in enumerate(results, 1):
|
|
|
|
| 236 |
async def search_all_sources(query: str, max_per_source: int = 5) -> str:
|
| 237 |
"""Search all biomedical sources simultaneously.
|
| 238 |
|
| 239 |
+
Performs parallel search across PubMed, ClinicalTrials.gov, and Europe PMC.
|
| 240 |
This is the most comprehensive search option for drug repurposing research.
|
| 241 |
|
| 242 |
Args:
|
|
|
|
| 253 |
# Run all searches in parallel
|
| 254 |
pubmed_task = search_pubmed(query, max_per_source)
|
| 255 |
trials_task = search_clinical_trials(query, max_per_source)
|
| 256 |
+
europepmc_task = search_europepmc(query, max_per_source)
|
| 257 |
|
| 258 |
+
pubmed_results, trials_results, europepmc_results = await asyncio.gather(
|
| 259 |
+
pubmed_task, trials_task, europepmc_task, return_exceptions=True
|
| 260 |
)
|
| 261 |
|
| 262 |
formatted = [f"# Comprehensive Search: {query}\n"]
|
|
|
|
| 272 |
else:
|
| 273 |
formatted.append(f"## Clinical Trials\n*Error: {trials_results}*\n")
|
| 274 |
|
| 275 |
+
if isinstance(europepmc_results, str):
|
| 276 |
+
formatted.append(europepmc_results)
|
| 277 |
else:
|
| 278 |
+
formatted.append(f"## Preprints\n*Error: {europepmc_results}*\n")
|
| 279 |
|
| 280 |
return "\n---\n".join(formatted)
|
| 281 |
```
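The `search_all_sources` wrapper above relies on `asyncio.gather(..., return_exceptions=True)` so that one failing source does not sink the others. A minimal, self-contained sketch of that pattern is shown here; the `fake_pubmed`, `fake_trials`, and `search_all` names are hypothetical stand-ins for illustration, not the project's actual tools.

```python
import asyncio


async def fake_pubmed(query: str) -> str:
    """Hypothetical stand-in for a search wrapper that succeeds."""
    return f"## PubMed Results for: {query}"


async def fake_trials(query: str) -> str:
    """Hypothetical stand-in for a search wrapper that fails."""
    raise RuntimeError("API Error")


async def search_all(query: str) -> str:
    # return_exceptions=True makes gather hand back raised exceptions as
    # ordinary values instead of propagating the first failure.
    pubmed, trials = await asyncio.gather(
        fake_pubmed(query), fake_trials(query), return_exceptions=True
    )

    sections = [f"# Comprehensive Search: {query}\n"]
    for heading, outcome in (("PubMed", pubmed), ("Clinical Trials", trials)):
        if isinstance(outcome, str):
            # Successful wrappers already return formatted markdown.
            sections.append(outcome)
        else:
            # Failed wrappers come back as exception objects.
            sections.append(f"## {heading}\n*Error: {outcome}*\n")
    return "\n---\n".join(sections)


report = asyncio.run(search_all("metformin"))
print(report)
```

Checking `isinstance(outcome, str)` rather than `isinstance(outcome, Exception)` mirrors the spec's approach: anything that is not a formatted string is reported as an error section.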
|
|
|
|
| 283 |
### 4.2 Update Gradio App (`src/app.py`)
|
| 284 |
|
| 285 |
```python
|
| 286 |
+
"""Gradio UI for DeepBoner agent with MCP server support."""
|
| 287 |
|
| 288 |
import os
|
| 289 |
from collections.abc import AsyncGenerator
|
|
|
|
| 294 |
from src.agent_factory.judges import JudgeHandler, MockJudgeHandler
|
| 295 |
from src.mcp_tools import (
|
| 296 |
search_all_sources,
|
| 297 |
+
search_europepmc,
|
| 298 |
search_clinical_trials,
|
| 299 |
search_pubmed,
|
| 300 |
)
|
| 301 |
from src.orchestrator_factory import create_orchestrator
|
| 302 |
+
from src.tools.europepmc import EuropePMCTool
|
| 303 |
from src.tools.clinicaltrials import ClinicalTrialsTool
|
| 304 |
from src.tools.pubmed import PubMedTool
|
| 305 |
from src.tools.search_handler import SearchHandler
|
|
|
|
| 317 |
Configured Gradio Blocks interface with MCP server enabled
|
| 318 |
"""
|
| 319 |
with gr.Blocks(
|
| 320 |
+
title="DeepBoner - Drug Repurposing Research Agent",
|
| 321 |
theme=gr.themes.Soft(),
|
| 322 |
) as demo:
|
| 323 |
gr.Markdown("""
|
| 324 |
+
# DeepBoner
|
| 325 |
## AI-Powered Drug Repurposing Research Agent
|
| 326 |
|
| 327 |
Ask questions about potential drug repurposing opportunities.
|
| 328 |
+
The agent searches PubMed, ClinicalTrials.gov, and Europe PMC preprints.
|
| 329 |
|
| 330 |
**Example questions:**
|
| 331 |
- "What drugs could be repurposed for Alzheimer's disease?"
|
|
|
|
| 381 |
|
| 382 |
with gr.Tab("Preprints"):
|
| 383 |
gr.Interface(
|
| 384 |
+
fn=search_europepmc,
|
| 385 |
inputs=[
|
| 386 |
gr.Textbox(label="Query", placeholder="long covid treatment"),
|
| 387 |
gr.Slider(1, 50, value=10, step=1, label="Max Results"),
|
| 388 |
],
|
| 389 |
outputs=gr.Markdown(label="Results"),
|
| 390 |
+
api_name="search_europepmc",
|
| 391 |
)
|
| 392 |
|
| 393 |
with gr.Tab("Search All"):
|
|
|
|
| 406 |
**Note**: This is a research tool and should not be used for medical decisions.
|
| 407 |
Always consult healthcare professionals for medical advice.
|
| 408 |
|
| 409 |
+
Built with PydanticAI + PubMed, ClinicalTrials.gov & Europe PMC
|
| 410 |
|
| 411 |
**MCP Server**: Available at `/gradio_api/mcp/` for Claude Desktop integration
|
| 412 |
""")
|
|
|
|
| 444 |
|
| 445 |
from src.mcp_tools import (
|
| 446 |
search_all_sources,
|
| 447 |
+
search_europepmc,
|
| 448 |
search_clinical_trials,
|
| 449 |
search_pubmed,
|
| 450 |
)
|
|
|
|
| 525 |
assert "Clinical Trials" in result
|
| 526 |
|
| 527 |
|
| 528 |
+
class TestSearchEuropePMC:
|
| 529 |
+
"""Tests for search_europepmc MCP tool."""
|
| 530 |
|
| 531 |
@pytest.mark.asyncio
|
| 532 |
async def test_returns_formatted_string(self, mock_evidence: Evidence) -> None:
|
| 533 |
"""Should return formatted markdown string."""
|
| 534 |
+
mock_evidence.citation.source = "europepmc" # type: ignore
|
| 535 |
|
| 536 |
+
with patch("src.mcp_tools._europepmc") as mock_tool:
|
| 537 |
mock_tool.search = AsyncMock(return_value=[mock_evidence])
|
| 538 |
|
| 539 |
+
result = await search_europepmc("preprint search", 10)
|
| 540 |
|
| 541 |
assert isinstance(result, str)
|
| 542 |
assert "Preprint Results" in result
|
|
|
|
| 550 |
"""Should combine results from all sources."""
|
| 551 |
with patch("src.mcp_tools.search_pubmed", new_callable=AsyncMock) as mock_pubmed, \
|
| 552 |
patch("src.mcp_tools.search_clinical_trials", new_callable=AsyncMock) as mock_trials, \
|
| 553 |
+
patch("src.mcp_tools.search_europepmc", new_callable=AsyncMock) as mock_europepmc:
|
| 554 |
|
| 555 |
mock_pubmed.return_value = "## PubMed Results"
|
| 556 |
mock_trials.return_value = "## Clinical Trials"
|
| 557 |
+
mock_europepmc.return_value = "## Preprints"
|
| 558 |
|
| 559 |
result = await search_all_sources("metformin", 5)
|
| 560 |
|
|
|
|
| 568 |
"""Should handle partial failures gracefully."""
|
| 569 |
with patch("src.mcp_tools.search_pubmed", new_callable=AsyncMock) as mock_pubmed, \
|
| 570 |
patch("src.mcp_tools.search_clinical_trials", new_callable=AsyncMock) as mock_trials, \
|
| 571 |
+
patch("src.mcp_tools.search_europepmc", new_callable=AsyncMock) as mock_europepmc:
|
| 572 |
|
| 573 |
mock_pubmed.return_value = "## PubMed Results"
|
| 574 |
mock_trials.side_effect = Exception("API Error")
|
| 575 |
+
mock_europepmc.return_value = "## Preprints"
|
| 576 |
|
| 577 |
result = await search_all_sources("metformin", 5)
|
| 578 |
|
|
|
|
| 599 |
assert search_clinical_trials.__doc__ is not None
|
| 600 |
assert "Args:" in search_clinical_trials.__doc__
|
| 601 |
|
| 602 |
+
def test_search_europepmc_has_args_section(self) -> None:
|
| 603 |
"""Docstring must have Args section for MCP schema generation."""
|
| 604 |
+
assert search_europepmc.__doc__ is not None
|
| 605 |
+
assert "Args:" in search_europepmc.__doc__
|
| 606 |
|
| 607 |
def test_search_all_sources_has_args_section(self) -> None:
|
| 608 |
"""Docstring must have Args section for MCP schema generation."""
|
|
|
|
| 672 |
// %APPDATA%\Claude\claude_desktop_config.json (Windows)
|
| 673 |
{
|
| 674 |
"mcpServers": {
|
| 675 |
+
"deepboner": {
|
| 676 |
"url": "http://localhost:7860/gradio_api/mcp/"
|
| 677 |
}
|
| 678 |
}
|
|
|
|
| 684 |
```json
|
| 685 |
{
|
| 686 |
"mcpServers": {
|
| 687 |
+
"deepboner": {
|
| 688 |
+
"url": "https://your-space.hf.space/gradio_api/mcp/"
|
| 689 |
}
|
| 690 |
}
|
| 691 |
}
|
|
|
|
| 696 |
```json
|
| 697 |
{
|
| 698 |
"mcpServers": {
|
| 699 |
+
"deepboner": {
|
| 700 |
"url": "https://your-space.hf.space/gradio_api/mcp/",
|
| 701 |
"headers": {
|
| 702 |
"Authorization": "Bearer hf_xxxxxxxxxxxxx"
|
|
|
|
| 761 |
```
|
| 762 |
|
| 763 |
2. **Show Claude Desktop using our tools**:
|
| 764 |
+
- Open Claude Desktop with DeepBoner MCP configured
|
| 765 |
- Ask: "Search PubMed for metformin Alzheimer's"
|
| 766 |
- Show real results appearing
|
| 767 |
- Ask: "Now search clinical trials for the same"
|
|
|
|
| 817 |
│                       Gradio MCP Server                        │
|
| 818 |
│                        /gradio_api/mcp/                        │
|
| 819 |
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌─────────┐ │
|
| 820 |
+
│ │search_pubmed │ │search_trials │ │search_epmc   │ │search_  │ │
|
| 821 |
│ │              │ │              │ │              │ │all      │ │
|
| 822 |
│ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ └────┬────┘ │
|
| 823 |
└────────┼────────────────┼────────────────┼──────────────┼──────┘
|
| 824 |
         │                │                │              │
|
| 825 |
         ▼                ▼                ▼              ▼
|
| 826 |
    ┌──────────┐     ┌──────────┐     ┌──────────┐   (calls all)
|
| 827 |
+
    │PubMedTool│     │Trials    │     │EuropePMC │
|
| 828 |
    │          │     │Tool      │     │Tool      │
|
| 829 |
    └──────────┘     └──────────┘     └──────────┘
|
| 830 |
```
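Each MCP-facing function in the diagram above is a thin async wrapper around a single tool instance: it clamps `max_results`, awaits the tool's `search`, and returns formatted markdown. The sketch below illustrates that shape under stated assumptions; `Evidence`, `FakeTool`, and `search_example` are hypothetical simplifications, not the project's real classes.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Evidence:
    """Hypothetical, simplified stand-in for the project's evidence model."""

    title: str
    url: str


class FakeTool:
    """Hypothetical tool with an async search(query, max_results) method."""

    async def search(self, query: str, max_results: int) -> list[Evidence]:
        hits = [
            Evidence(f"{query} study {n}", f"https://example.org/{n}")
            for n in range(1, 100)
        ]
        return hits[:max_results]


# Module-level singleton, created once rather than on every call.
_tool = FakeTool()


async def search_example(query: str, max_results: int = 10) -> str:
    """Search a (fake) source and return markdown.

    Args:
        query: Free-text search query.
        max_results: Number of results to return (clamped to 1-50).

    Returns:
        Markdown-formatted results.
    """
    # Clamp caller input into the supported range before hitting the tool.
    max_results = max(1, min(50, max_results))
    results = await _tool.search(query, max_results)
    if not results:
        return f"No results found for: {query}"

    lines = [f"## Results for: {query}\n"]
    for i, evidence in enumerate(results, 1):
        lines.append(f"{i}. [{evidence.title}]({evidence.url})")
    return "\n".join(lines)


output = asyncio.run(search_example("metformin", max_results=500))
print(output.splitlines()[0])
```

The docstring keeps an `Args:` section because the spec's own tests assert on it for MCP schema generation.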
|
|
@@ -872,7 +872,7 @@ async def main() -> None:
|
|
| 872 |
sys.exit(1)
|
| 873 |
|
| 874 |
print(f"\n{'=' * 60}")
|
| 875 |
-
print("
|
| 876 |
print(f"Query: {args.query}")
|
| 877 |
print(f"{'=' * 60}\n")
|
| 878 |
|
|
|
|
| 872 |
sys.exit(1)
|
| 873 |
|
| 874 |
print(f"\n{'=' * 60}")
|
| 875 |
+
print("DeepBoner Modal Analysis Demo")
|
| 876 |
print(f"Query: {args.query}")
|
| 877 |
print(f"{'=' * 60}\n")
|
| 878 |
|
|
@@ -71,7 +71,7 @@ tags:
|
|
| 71 |
|
| 72 |
[Show Gradio UI]
|
| 73 |
|
| 74 |
-
"
|
| 75 |
It searches peer-reviewed literature, clinical trials, and cutting-edge preprints
|
| 76 |
to find new uses for existing drugs."
|
| 77 |
|
|
@@ -83,7 +83,7 @@ to find new uses for existing drugs."
|
|
| 83 |
|
| 84 |
[Type query: "Can metformin treat Alzheimer's disease?"]
|
| 85 |
|
| 86 |
-
"When I ask about metformin for Alzheimer's,
|
| 87 |
1. Searches PubMed for peer-reviewed papers
|
| 88 |
2. Queries ClinicalTrials.gov for active trials
|
| 89 |
3. Scans bioRxiv for the latest preprints"
|
|
@@ -101,10 +101,10 @@ synthesize findings into a structured research report."
|
|
| 101 |
|
| 102 |
[Switch to Claude Desktop]
|
| 103 |
|
| 104 |
-
"What makes
|
| 105 |
These same tools are available to any MCP client."
|
| 106 |
|
| 107 |
-
[Show Claude Desktop with
|
| 108 |
|
| 109 |
"I can ask Claude: 'Search PubMed for aspirin cancer prevention'"
|
| 110 |
|
|
@@ -140,7 +140,7 @@ returning verdicts like SUPPORTED, REFUTED, or INCONCLUSIVE."
|
|
| 140 |
|
| 141 |
[Return to Gradio UI]
|
| 142 |
|
| 143 |
-
"
|
| 144 |
- Three biomedical data sources
|
| 145 |
- MCP protocol for universal tool access
|
| 146 |
- Modal sandboxes for safe code execution
|
|
@@ -164,7 +164,7 @@ and let us know what you think."
|
|
| 164 |
|
| 165 |
```markdown
|
| 166 |
---
|
| 167 |
-
title:
|
| 168 |
emoji: 🧬
|
| 169 |
colorFrom: blue
|
| 170 |
colorTo: purple
|
|
@@ -183,7 +183,7 @@ tags:
|
|
| 183 |
- modal
|
| 184 |
---
|
| 185 |
|
| 186 |
-
#
|
| 187 |
|
| 188 |
AI-Powered Drug Repurposing Research Agent
|
| 189 |
|
|
@@ -198,7 +198,7 @@ AI-Powered Drug Repurposing Research Agent
|
|
| 198 |
|
| 199 |
Connect to our MCP server at:
|
| 200 |
```
|
| 201 |
-
https://
|
| 202 |
```
|
| 203 |
|
| 204 |
Available tools:
|
|
@@ -214,7 +214,7 @@ Available tools:
|
|
| 214 |
|
| 215 |
## Links
|
| 216 |
|
| 217 |
-
- [GitHub Repository](https://github.com/The-Obstacle-Is-The-Way/
|
| 218 |
- [Demo Video](link-to-video)
|
| 219 |
```
|
| 220 |
|
|
@@ -237,7 +237,7 @@ MODAL_TOKEN_SECRET=...
|
|
| 237 |
### Twitter/X Template
|
| 238 |
|
| 239 |
```
|
| 240 |
-
🧬 Excited to submit
|
| 241 |
|
| 242 |
An AI agent that:
|
| 243 |
✅ Searches PubMed, ClinicalTrials.gov & bioRxiv
|
|
@@ -254,10 +254,10 @@ Demo: [Video link]
|
|
| 254 |
### LinkedIn Template
|
| 255 |
|
| 256 |
```
|
| 257 |
-
Thrilled to share
|
| 258 |
|
| 259 |
🔬 What it does:
|
| 260 |
-
|
| 261 |
peer-reviewed literature, clinical trials, and preprints to find new uses
|
| 262 |
for existing drugs.
|
| 263 |
|
|
|
|
| 71 |
|
| 72 |
[Show Gradio UI]
|
| 73 |
|
| 74 |
+
"DeepBoner is an AI-powered drug repurposing research agent.
|
| 75 |
It searches peer-reviewed literature, clinical trials, and cutting-edge preprints
|
| 76 |
to find new uses for existing drugs."
|
| 77 |
|
|
|
|
| 83 |
|
| 84 |
[Type query: "Can metformin treat Alzheimer's disease?"]
|
| 85 |
|
| 86 |
+
"When I ask about metformin for Alzheimer's, DeepBoner:
|
| 87 |
1. Searches PubMed for peer-reviewed papers
|
| 88 |
2. Queries ClinicalTrials.gov for active trials
|
| 89 |
3. Scans bioRxiv for the latest preprints"
|
|
|
|
| 101 |
|
| 102 |
[Switch to Claude Desktop]
|
| 103 |
|
| 104 |
+
"What makes DeepBoner unique is full MCP integration.
|
| 105 |
These same tools are available to any MCP client."
|
| 106 |
|
| 107 |
+
[Show Claude Desktop with DeepBoner tools]
|
| 108 |
|
| 109 |
"I can ask Claude: 'Search PubMed for aspirin cancer prevention'"
|
| 110 |
|
|
|
|
| 140 |
|
| 141 |
[Return to Gradio UI]
|
| 142 |
|
| 143 |
+
"DeepBoner brings together:
|
| 144 |
- Three biomedical data sources
|
| 145 |
- MCP protocol for universal tool access
|
| 146 |
- Modal sandboxes for safe code execution
|
|
|
|
| 164 |
|
| 165 |
```markdown
|
| 166 |
---
|
| 167 |
+
title: DeepBoner
|
| 168 |
emoji: 🧬
|
| 169 |
colorFrom: blue
|
| 170 |
colorTo: purple
|
|
|
|
| 183 |
- modal
|
| 184 |
---
|
| 185 |
|
| 186 |
+
# DeepBoner
|
| 187 |
|
| 188 |
AI-Powered Drug Repurposing Research Agent
|
| 189 |
|
|
|
|
| 198 |
|
| 199 |
Connect to our MCP server at:
|
| 200 |
```
|
| 201 |
+
https://your-space.hf.space/gradio_api/mcp/
|
| 202 |
```
|
| 203 |
|
| 204 |
Available tools:
|
|
|
|
| 214 |
|
| 215 |
## Links
|
| 216 |
|
| 217 |
+
- [GitHub Repository](https://github.com/The-Obstacle-Is-The-Way/DeepBoner-1)
|
| 218 |
- [Demo Video](link-to-video)
|
| 219 |
```
|
| 220 |
|
|
|
|
| 237 |
### Twitter/X Template
|
| 238 |
|
| 239 |
```
|
| 240 |
+
🧬 Excited to submit DeepBoner to MCP's 1st Birthday Hackathon!
|
| 241 |
|
| 242 |
An AI agent that:
|
| 243 |
✅ Searches PubMed, ClinicalTrials.gov & bioRxiv
|
|
|
|
| 254 |
### LinkedIn Template
|
| 255 |
|
| 256 |
```
|
| 257 |
+
Thrilled to share DeepBoner - our submission to MCP's 1st Birthday Hackathon!
|
| 258 |
|
| 259 |
🔬 What it does:
|
| 260 |
+
DeepBoner is an AI-powered drug repurposing research agent that searches
|
| 261 |
peer-reviewed literature, clinical trials, and preprints to find new uses
|
| 262 |
for existing drugs.
|
| 263 |
|
|
@@ -1,8 +1,8 @@
|
|
| 1 |
-
# Implementation Roadmap:
|
| 2 |
|
| 3 |
**Philosophy:** AI-Native Engineering, Vertical Slice Architecture, TDD, Modern Tooling (2025).
|
| 4 |
|
| 5 |
-
This roadmap defines the execution strategy to deliver **
|
| 6 |
|
| 7 |
---
|
| 8 |
|
|
@@ -114,7 +114,7 @@ tests/
|
|
| 114 |
|
| 115 |
- [ ] Implement `src/orchestrator.py` (Connects Search + Judge loops).
|
| 116 |
- [ ] Build `src/app.py` (Gradio with Streaming).
|
| 117 |
-
- **Deliverable**: Working
|
| 118 |
|
| 119 |
---
|
| 120 |
|
|
|
|
| 1 |
+
# Implementation Roadmap: DeepBoner (Vertical Slices)
|
| 2 |
|
| 3 |
**Philosophy:** AI-Native Engineering, Vertical Slice Architecture, TDD, Modern Tooling (2025).
|
| 4 |
|
| 5 |
+
This roadmap defines the execution strategy to deliver **DeepBoner** effectively. We reject "overplanning" in favor of **ironclad, testable vertical slices**. Each phase delivers a fully functional slice of end-to-end value.
|
| 6 |
|
| 7 |
---
|
| 8 |
|
|
|
|
| 114 |
|
| 115 |
- [ ] Implement `src/orchestrator.py` (Connects Search + Judge loops).
|
| 116 |
- [ ] Build `src/app.py` (Gradio with Streaming).
|
| 117 |
+
- **Deliverable**: Working DeepBoner Agent on HuggingFace.
|
| 118 |
|
| 119 |
---
|
| 120 |
|
|
@@ -1,4 +1,4 @@
|
|
| 1 |
-
#
|
| 2 |
|
| 3 |
## Medical Drug Repurposing Research Agent
|
| 4 |
|
|
|
|
| 1 |
+
# DeepBoner Documentation
|
| 2 |
|
| 3 |
## Medical Drug Repurposing Research Agent
|
| 4 |
|
|
@@ -0,0 +1,337 @@
+# Deep Research Roadmap
+
+> How to properly add GPT-Researcher-style deep research to DeepBoner
+> using the EXISTING Magentic + Pydantic AI architecture.
+
+## Current State
+
+We already have:
+
+| Feature | Location | Status |
+|---------|----------|--------|
+| Multi-agent orchestration | `orchestrator_magentic.py` | Working |
+| SearchAgent, JudgeAgent, HypothesisAgent, ReportAgent | `agents/magentic_agents.py` | Working |
+| HuggingFace free tier | `agent_factory/judges.py` (HFInferenceJudgeHandler) | Working |
+| Budget constraints | MagenticOrchestrator (max_round_count, max_stall_count) | Built-in |
+| Simple mode (linear) | `orchestrator.py` | Working |
+
+## What Deep Research Adds
+
+GPT-Researcher style "deep research" means:
+
+1. **Query Analysis** - Detect if query needs simple lookup vs comprehensive report
+2. **Section Planning** - Break complex query into 3-7 parallel research sections
+3. **Parallel Research** - Run multiple research loops simultaneously
+4. **Long-form Writing** - Synthesize sections into cohesive report
+5. **RAG** - Semantic search over accumulated evidence
+
+## Implementation Plan (TDD, Vertical Slices)
+
+### Phase 1: Input Parser (Est. 50-100 lines)
+
+**Goal**: Detect research mode from query.
+
+```python
+# src/agents/input_parser.py
+
+from typing import Literal
+
+from pydantic import BaseModel
+
+
+class ParsedQuery(BaseModel):
+    original_query: str
+    improved_query: str
+    research_mode: Literal["iterative", "deep"]
+    key_entities: list[str]
+
+async def parse_query(query: str) -> ParsedQuery:
+    """
+    Detect if query needs deep research.
+
+    Deep indicators:
+    - "comprehensive", "report", "overview", "analysis"
+    - Multiple topics/drugs mentioned
+    - Requests for sections/structure
+
+    Iterative indicators:
+    - Single focused question
+    - "what is", "how does", "find"
+    """
+```
+
+**Test first**:
+```python
+async def test_parse_query_detects_deep_mode():
+    result = await parse_query("Write a comprehensive report on Alzheimer's treatments")
+    assert result.research_mode == "deep"
+
+async def test_parse_query_detects_iterative_mode():
+    result = await parse_query("What is the mechanism of metformin?")
+    assert result.research_mode == "iterative"
+```
+
+**Wire in**:
+```python
+# In app.py or orchestrator_factory.py
+parsed = await parse_query(user_query)
+if parsed.research_mode == "deep":
+    orchestrator = create_deep_orchestrator()
+else:
+    orchestrator = create_orchestrator()  # existing
+```
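
The mode detection described in Phase 1 could start life as a plain keyword heuristic. The following is a minimal sketch under stated assumptions, not the project's implementation: a stdlib dataclass stands in for the Pydantic `ParsedQuery` model, `improved_query` is just the stripped input, and the names `DEEP_KEYWORDS` and `detect_research_mode` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Literal

# Hypothetical keyword list, taken from the "deep indicators" in the docstring above.
DEEP_KEYWORDS = ("comprehensive", "report", "overview", "analysis")


@dataclass
class ParsedQuery:
    """Stdlib stand-in for the Pydantic model sketched in Phase 1."""

    original_query: str
    improved_query: str
    research_mode: Literal["iterative", "deep"]
    key_entities: list[str] = field(default_factory=list)


def detect_research_mode(query: str) -> Literal["iterative", "deep"]:
    """Return "deep" if any deep-research keyword appears, else "iterative"."""
    lowered = query.lower()
    return "deep" if any(keyword in lowered for keyword in DEEP_KEYWORDS) else "iterative"


async def parse_query(query: str) -> ParsedQuery:
    """Keyword-only parser; entity extraction is left out of this sketch."""
    cleaned = query.strip()
    return ParsedQuery(
        original_query=query,
        improved_query=cleaned,
        research_mode=detect_research_mode(cleaned),
    )
```

A substring check is crude (it cannot see "multiple topics mentioned"), so an LLM-backed classifier would likely sit in front of it, with this function as the fallback.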
+
+---
+
+### Phase 2: Section Planner (Est. 80-120 lines)
+
+**Goal**: Create report outline for deep research.
+
+```python
+# src/agents/planner.py
+
+from pydantic import BaseModel
+
+
+class ReportSection(BaseModel):
+    title: str
+    query: str  # Search query for this section
+    description: str
+
+class ReportPlan(BaseModel):
+    title: str
+    sections: list[ReportSection]
+
+# Use existing ChatAgent pattern from magentic_agents.py
+def create_planner_agent(chat_client: OpenAIChatClient | None = None) -> ChatAgent:
+    # Fall back to a default client when none is passed in
+    # (assumption: credentials and model are read from the environment).
+    client = chat_client or OpenAIChatClient()
+    return ChatAgent(
+        name="PlannerAgent",
+        description="Creates structured report outlines",
+        instructions="""Given a research query, create a report plan with 3-7 sections.
+Each section should have:
+- A clear title
+- A focused search query
+- Brief description of what to cover
+
+Example for "Alzheimer's drug repurposing":
+1. Current Treatment Landscape
+2. Mechanism-Based Candidates (targeting amyloid, tau, inflammation)
+3. Clinical Trial Evidence
+4. Safety Considerations
+5. Emerging Research Directions
+""",
+        chat_client=client,
+    )
+```
+
+**Test first**:
+```python
+async def test_planner_creates_sections():
+    plan = await planner.create_plan("Comprehensive Alzheimer's drug repurposing report")
+    assert len(plan.sections) >= 3
+    assert all(s.query for s in plan.sections)
+```
+
+**Wire in**: Used by Phase 3.
+
+---
+
+### Phase 3: Parallel Research Flow (Est. 100-150 lines)
+
+**Goal**: Run multiple MagenticOrchestrator instances in parallel.
+
+```python
+# src/orchestrator_deep.py
+
+import asyncio
+from collections.abc import AsyncGenerator
+
+
+class DeepResearchOrchestrator:
+    """
+    Runs parallel research loops using EXISTING MagenticOrchestrator.
+
+    NOT a new orchestration system - just a wrapper that:
+    1. Plans sections
+    2. Runs existing orchestrator per section (in parallel)
+    3. Aggregates results
+    """
+
+    def __init__(self, max_parallel: int = 5):
+        self.planner = create_planner_agent()
+        self.max_parallel = max_parallel
+
+    async def run(self, query: str) -> AsyncGenerator[AgentEvent, None]:
+        # 1. Create plan
+        plan = await self.planner.create_plan(query)
+        yield AgentEvent(type="planning", message=f"Created {len(plan.sections)} section plan")
+
+        # 2. Run parallel research (reuse existing orchestrator!)
+        from src.orchestrator_magentic import MagenticOrchestrator
+
+        async def research_section(section: ReportSection) -> str:
+            orchestrator = MagenticOrchestrator(max_rounds=5)  # Fewer rounds per section
+            result = ""
+            async for event in orchestrator.run(section.query):
+                if event.type == "complete":
+                    result = event.message
+            return result
+
+        # Run in parallel with semaphore
+        semaphore = asyncio.Semaphore(self.max_parallel)
+        async def bounded_research(section):
+            async with semaphore:
+                return await research_section(section)
+
+        results = await asyncio.gather(*[
+            bounded_research(s) for s in plan.sections
+        ])
+
+        # 3. Aggregate
+        yield AgentEvent(
+            type="complete",
+            message=self._aggregate_sections(plan, results)
+        )
+```
+
+**Key insight**: We're NOT replacing MagenticOrchestrator. We're running multiple instances of it.
+
+**Test first**:
+```python
+@pytest.mark.integration
+async def test_deep_orchestrator_runs_parallel():
+    orchestrator = DeepResearchOrchestrator(max_parallel=2)
+    events = [e async for e in orchestrator.run("Comprehensive Alzheimer's report")]
+    assert any(e.type == "planning" for e in events)
+    assert any(e.type == "complete" for e in events)
+```
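
The semaphore-plus-gather pattern in Phase 3 can be checked in isolation before any orchestrator is involved. The sketch below is an illustration under assumptions: `run_bounded` and `demo` are hypothetical names, and `fake_research` is a stub standing in for a per-section `MagenticOrchestrator` run. It records the peak number of concurrent workers to show that the cap holds.

```python
import asyncio


async def run_bounded(items, worker, max_parallel: int = 5):
    """Run worker(item) for every item, at most max_parallel at a time.

    Results come back in input order, as asyncio.gather guarantees.
    """
    semaphore = asyncio.Semaphore(max_parallel)

    async def bounded(item):
        async with semaphore:
            return await worker(item)

    return await asyncio.gather(*(bounded(item) for item in items))


async def demo():
    """Run four stubbed sections with a cap of two and report peak concurrency."""
    active = 0
    peak = 0

    async def fake_research(section: str) -> str:
        nonlocal active, peak
        active += 1
        peak = max(peak, active)
        await asyncio.sleep(0.01)  # stand-in for a real research loop
        active -= 1
        return f"findings for {section}"

    sections = ["Landscape", "Mechanisms", "Trials", "Safety"]
    results = await run_bounded(sections, fake_research, max_parallel=2)
    return results, peak
```

Keeping the bounding logic in a helper like this makes the concurrency limit unit-testable without network access, which fits the test-first approach above.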
+
+---
+
+### Phase 4: RAG Integration (Est. 100-150 lines)
+
+**Goal**: Semantic search over accumulated evidence.
+
+> **Note**: We already have `src/services/embeddings.py` (EmbeddingService) which provides
+> ChromaDB + sentence-transformers with `add_evidence()` and `search_similar()` methods.
+> The code below is illustrative - in practice, extend EmbeddingService or use it directly.
+> See also: `src/services/llamaindex_rag.py` for OpenAI-based RAG (different use case).
+
+```python
+# src/services/rag.py (illustrative - use EmbeddingService instead)
+
+class RAGService:
+    """
+    Simple RAG using ChromaDB + sentence-transformers.
+    No LlamaIndex dependency - keep it lightweight.
+    """
+
+    def __init__(self):
+        import chromadb
+        from sentence_transformers import SentenceTransformer
+
+        self.client = chromadb.Client()
+        self.collection = self.client.get_or_create_collection("evidence")
+        self.encoder = SentenceTransformer("all-MiniLM-L6-v2")
+
+    def add_evidence(self, evidence: list[Evidence]) -> int:
+        """Add evidence to vector store, return count added."""
+        # Dedupe by URL
+        existing = set(self.collection.get()["ids"])
+        new_evidence = [e for e in evidence if e.citation.url not in existing]
+
+        if not new_evidence:
+            return 0
+
+        self.collection.add(
+            ids=[e.citation.url for e in new_evidence],
+            documents=[e.content for e in new_evidence],
+            metadatas=[{"title": e.citation.title, "source": e.citation.source} for e in new_evidence],
+        )
+        return len(new_evidence)
+
+    def search(self, query: str, n_results: int = 5) -> list[Evidence]:
+        """Semantic search for relevant evidence."""
+        results = self.collection.query(query_texts=[query], n_results=n_results)
+        # Convert back to Evidence objects
+        ...
+```
+
+**Wire in as tool**:
+```python
+# Add to SearchAgent's tools
+def rag_search(query: str, n_results: int = 5) -> str:
+    """Search previously collected evidence for relevant information."""
+    service = get_rag_service()
+    results = service.search(query, n_results)
+    return format_evidence(results)
+
+# In magentic_agents.py
+ChatAgent(
+    tools=[search_pubmed, search_clinical_trials, search_preprints, rag_search],  # ADD RAG
+)
+```
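
The dedupe-by-URL and top-n retrieval contract of Phase 4 can be exercised without ChromaDB. The following is a rough in-memory sketch under loud assumptions: `EvidenceItem` and `InMemoryEvidenceStore` are hypothetical names, and ranking uses Jaccard token overlap purely as a stand-in for embedding similarity, so it is not semantic search. It may still be useful as a test double for code that depends on `add_evidence()` and `search()`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceItem:
    """Hypothetical, flattened stand-in for the project's Evidence model."""

    url: str
    title: str
    content: str


class InMemoryEvidenceStore:
    """Keeps one item per URL and ranks by token overlap (not embeddings)."""

    def __init__(self) -> None:
        self._items: dict[str, EvidenceItem] = {}

    def add_evidence(self, evidence: list[EvidenceItem]) -> int:
        """Add unseen items, deduplicating by URL; return how many were added."""
        added = 0
        for item in evidence:
            if item.url not in self._items:
                self._items[item.url] = item
                added += 1
        return added

    def search(self, query: str, n_results: int = 5) -> list[EvidenceItem]:
        """Return up to n_results items, highest Jaccard overlap first."""
        query_tokens = set(query.lower().split())

        def score(item: EvidenceItem) -> float:
            tokens = set(item.content.lower().split())
            union = query_tokens | tokens
            return len(query_tokens & tokens) / len(union) if union else 0.0

        ranked = sorted(self._items.values(), key=score, reverse=True)
        return ranked[:n_results]
```

Unlike the illustrative `RAGService.add_evidence()`, this version also drops duplicates that occur within a single batch, which is worth considering for the real store since vector databases commonly reject repeated ids in one insert call.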
+
+---
+
+### Phase 5: Long Writer (Est. 80-100 lines)
+
+**Goal**: Write longer reports section-by-section.
+
+```python
+# Extend existing ReportAgent or create LongWriterAgent
+
+def create_long_writer_agent() -> ChatAgent:
+    return ChatAgent(
+        name="LongWriterAgent",
+        description="Writes detailed report sections with proper citations",
+        instructions="""Write a detailed section for a research report.
+
+You will receive:
+- Section title
+- Relevant evidence/findings
+- What previous sections covered (to avoid repetition)
+
+Output:
+- 500-1000 words per section
+- Proper citations [1], [2], etc.
+- Smooth transitions
+- No repetition of earlier content
+""",
+        tools=[get_bibliography, rag_search],
+    )
+```
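
The driver loop around such a writer agent is small. The sketch below shows one possible shape, assuming a hypothetical `write_section(title, findings, previous)` coroutine that wraps the agent call; `write_report` and the `SectionWriter` alias are illustrative names, not existing code. Each call receives everything written so far, so the writer can avoid repeating it.

```python
from typing import Awaitable, Callable

# (section title, findings, text of previous sections) -> section body
SectionWriter = Callable[[str, str, str], Awaitable[str]]


async def write_report(
    sections: list[tuple[str, str]],
    write_section: SectionWriter,
) -> str:
    """Write sections in order, passing earlier output as context.

    `sections` holds (title, findings) pairs. Sections are written
    sequentially on purpose: each one depends on what came before.
    """
    written: list[str] = []
    for title, findings in sections:
        previous = "\n\n".join(written)
        body = await write_section(title, findings, previous)
        written.append(f"## {title}\n\n{body}")
    return "\n\n".join(written)
```

For long reports the `previous` argument would probably need to be a summary rather than the full text, to stay within the model's context window.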
+
+---
+
+## What NOT To Build
+
+These are REDUNDANT with existing Magentic system:
+
+| Component | Why Skip |
+|-----------|----------|
+| GraphOrchestrator | MagenticBuilder already handles agent coordination |
+| BudgetTracker | MagenticBuilder has max_round_count, max_stall_count |
+| WorkflowManager | asyncio.gather() + Semaphore is simpler |
+| StateMachine | contextvars already used in agents/state.py |
+| New agent primitives | ChatAgent pattern already works |
+
+## Implementation Order
+
+```
+Week 1: Phase 1 (InputParser) - Ship it working
+Week 2: Phase 2 (Planner) - Ship it working
+Week 3: Phase 3 (Parallel Flow) - Ship it working
+Week 4: Phase 4 (RAG) - Ship it working
+Week 5: Phase 5 (LongWriter) - Ship it working
+```
+
+Each phase:
+1. Write tests first
+2. Implement minimal code
+3. Wire into app.py
+4. Manual test
+5. PR with <200 lines
+6. Ship
+
+## References
+
+- GPT-Researcher: https://github.com/assafelovic/gpt-researcher
+- LangGraph patterns: https://python.langchain.com/docs/langgraph
+- Your existing Magentic setup: `src/orchestrator_magentic.py`
+
+## Why This Approach
+
+1. **Builds on existing working code** - Don't replace, extend
+2. **Each phase ships value** - User sees improvement after each PR
+3. **Tests prove it works** - Not "trust me it imports"
+4. **Minimal new abstractions** - Reuse ChatAgent, MagenticOrchestrator
+5. **~500 total lines** vs 7,000 lines of parallel infrastructure
|
@@ -0,0 +1,229 @@
+# Reference: GradioDemo Analysis
+
+> Analysis of code from https://github.com/DeepBoner/GradioDemo
+> Purpose: Extract good ideas, understand patterns, avoid mistakes
+
+## Overview
+
+| Metric | Value |
+|--------|-------|
+| Total lines added | ~7,000 |
+| New Python files | +20 |
+| Test pass rate | 80% (62 errors due to missing mocks) |
+| Integration status | **NOT WIRED IN** |
+
+## Component Catalog
+
+### REDUNDANT (Already have equivalent)
+
+| Component | Lines | What We Have Instead |
+|-----------|-------|---------------------|
+| `orchestrator/graph_orchestrator.py` | 974 | MagenticBuilder |
+| `middleware/budget_tracker.py` | 391 | MagenticBuilder max_round_count |
+| `middleware/state_machine.py` | 130 | agents/state.py with contextvars |
+| `middleware/workflow_manager.py` | 300 | asyncio.gather() |
+| `orchestrator/research_flow.py` (IterativeResearchFlow) | 500 | MagenticOrchestrator |
+| HuggingFace integration | various | HFInferenceJudgeHandler |
+
+### POTENTIALLY USEFUL (Ideas to cherry-pick)
+
+#### 1. InputParser (`agents/input_parser.py` - 179 lines)
+
+**Idea**: Detect research mode from query text.
+
+```python
+# Key logic (simplified)
+research_mode: Literal["iterative", "deep"] = "iterative"
+if any(keyword in query.lower() for keyword in [
+    "comprehensive", "report", "sections", "analyze", "analysis", "overview", "market"
+]):
+    research_mode = "deep"
+```
+
+**Good pattern**: Heuristic fallback when LLM fails.
+**Our implementation**: See Phase 1 in DEEP_RESEARCH_ROADMAP.md
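
The "heuristic fallback when LLM fails" pattern can be sketched as a thin wrapper. This is an assumed shape, not GradioDemo's code: `classify_with_fallback` and `heuristic_mode` are hypothetical names, `llm_classify` is any coroutine the caller supplies, and the keyword list is the one quoted above. The heuristic is used when the LLM call raises, times out, or returns an unexpected value.

```python
import asyncio
from typing import Awaitable, Callable, Literal

Mode = Literal["iterative", "deep"]

DEEP_KEYWORDS = (
    "comprehensive", "report", "sections", "analyze", "analysis", "overview", "market",
)


def heuristic_mode(query: str) -> Mode:
    """Keyword check used whenever the LLM path is unavailable."""
    lowered = query.lower()
    return "deep" if any(keyword in lowered for keyword in DEEP_KEYWORDS) else "iterative"


async def classify_with_fallback(
    query: str,
    llm_classify: Callable[[str], Awaitable[str]],
    timeout_s: float = 10.0,
) -> Mode:
    """Prefer the LLM's answer; fall back to the heuristic on any failure."""
    try:
        answer = await asyncio.wait_for(llm_classify(query), timeout=timeout_s)
    except Exception:  # network error, rate limit, timeout, bad credentials, ...
        return heuristic_mode(query)
    if answer in ("iterative", "deep"):
        return answer
    return heuristic_mode(query)  # the LLM replied with something unexpected
```

Catching broad `Exception` is deliberate here, since the goal is that query routing never fails; a production version would presumably log the error before falling back.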
+
+#### 2. PlannerAgent (`orchestrator/planner_agent.py` - 184 lines)
+
+**Idea**: LLM creates section outline for report.
+
+```python
+class ReportPlan(BaseModel):
+    title: str
+    sections: list[ReportSection]
+    estimated_time_minutes: int
+
+class ReportSection(BaseModel):
+    title: str
+    query: str
+    description: str
+    priority: int
+```
+
+**Good pattern**: Structured output with Pydantic models.
+**Our implementation**: See Phase 2 in DEEP_RESEARCH_ROADMAP.md
+
+#### 3. DeepResearchFlow (`orchestrator/research_flow.py` - 500 lines)
+
+**Idea**: Run parallel research loops per section.
+
+```python
+# Their pattern (simplified)
+async def run_parallel_loops(sections: list[ReportSection]):
+    tasks = [run_single_loop(s) for s in sections]
+    results = await asyncio.gather(*tasks, return_exceptions=True)
+```
+
+**Problem**: They built new IterativeResearchFlow instead of reusing MagenticOrchestrator.
+**Our implementation**: Just run multiple MagenticOrchestrator instances.
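
One detail of the `return_exceptions=True` pattern is easy to miss: failed sections come back as exception objects mixed into the results list, so they have to be separated before aggregation. A minimal sketch, assuming a hypothetical `gather_sections` helper and a caller-supplied `run_single_loop` coroutine:

```python
import asyncio


async def gather_sections(titles, run_single_loop):
    """Run one loop per section title; split successes from failures.

    With return_exceptions=True, gather never raises for a failed task.
    Instead, the exception object takes that task's slot in the results.
    """
    outcomes = await asyncio.gather(
        *(run_single_loop(title) for title in titles),
        return_exceptions=True,
    )
    succeeded: dict[str, str] = {}
    failed: dict[str, str] = {}
    for title, outcome in zip(titles, outcomes):
        if isinstance(outcome, BaseException):
            failed[title] = repr(outcome)
        else:
            succeeded[title] = outcome
    return succeeded, failed
```

This lets the final report note which sections could not be researched instead of silently embedding an exception's string form in the output.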
+
+#### 4. LlamaIndex RAG (`services/llamaindex_rag.py` - 454 lines)
+
+**Idea**: Semantic search over collected evidence.
+
+```python
+# Their approach
+class LlamaIndexRAGService:
+    def __init__(self):
+        # ChromaDB + LlamaIndex + HuggingFace embeddings
+        self.vector_store = ChromaVectorStore(...)
+        self.index = VectorStoreIndex(...)
+
+    def retrieve(self, query: str, top_k: int = 5) -> list[dict]:
+        retriever = VectorIndexRetriever(index=self.index, similarity_top_k=top_k)
+        return retriever.retrieve(query)
+```
+
+**Good**: Full-featured RAG with multiple embedding providers.
+**Simpler alternative**: Direct ChromaDB + sentence-transformers (no LlamaIndex).
+**Our implementation**: See Phase 4 in DEEP_RESEARCH_ROADMAP.md
+
+#### 5. LongWriterAgent (`agents/long_writer.py` - ~300 lines)
+
+**Idea**: Write reports section-by-section to handle length.
+
+```python
+class SectionOutput(BaseModel):
+    section_content: str
+    references: list[str]
+    next_section_context: str  # What to avoid repeating
+
+async def write_next_section(
+    section_title: str,
+    findings: str,
+    previous_sections: str,  # Avoid repetition
+) -> SectionOutput:
+    ...  # body omitted in this summary
+```
+
+**Good pattern**: Passing context to avoid repetition.
+**Our implementation**: See Phase 5 in DEEP_RESEARCH_ROADMAP.md
+
+#### 6. ProofreaderAgent (`agents/proofreader.py` - ~200 lines)
+
+**Idea**: Final cleanup pass on report.
+
+```python
+# Tasks:
+# 1. Remove duplicate information
+# 2. Fix citation numbering
+# 3. Add executive summary
+# 4. Ensure consistent formatting
+```
+
+**Good pattern**: Separate concerns - writer writes, proofreader polishes.
+**Our implementation**: Optional Phase 6 if needed.
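
Of the proofreader tasks listed above, "fix citation numbering" needs no LLM at all. The sketch below is one assumed approach, with `renumber_citations` as a hypothetical name: it renumbers `[n]` markers by order of first appearance and returns the old-to-new mapping so the bibliography can be reordered to match.

```python
import re

CITATION = re.compile(r"\[(\d+)\]")


def renumber_citations(text: str) -> tuple[str, dict[int, int]]:
    """Renumber [n] citations sequentially by first appearance.

    Returns the rewritten text and a mapping of old number -> new number.
    """
    mapping: dict[int, int] = {}

    def replace(match: re.Match) -> str:
        old = int(match.group(1))
        if old not in mapping:
            mapping[old] = len(mapping) + 1
        return f"[{mapping[old]}]"

    return CITATION.sub(replace, text), mapping
```

This only handles single-number markers; grouped forms such as `[1, 2]` or ranges would need extra handling.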
+
+### Graph Architecture (Educational Reference)
+
+The graph system is well-designed in theory:
+
+```python
+# Node types
+class AgentNode(GraphNode):
+    agent: Any  # Pydantic AI agent
+    input_transformer: Callable  # Transform input
+    output_transformer: Callable  # Transform output
+
+class DecisionNode(GraphNode):
+    decision_function: Callable[[Any], str]  # Returns next node ID
+    options: list[str]
+
+class ParallelNode(GraphNode):
+    parallel_nodes: list[str]  # Run these in parallel
+    aggregator: Callable  # Combine results
+
+# Graph structure
+class ResearchGraph:
+    nodes: dict[str, GraphNode]
+    edges: dict[str, list[GraphEdge]]
+    entry_node: str
+    exit_nodes: list[str]
+```
+
+**Why we don't need it**: MagenticBuilder already provides:
+- Agent coordination via manager
+- Conditional routing (manager decides)
+- Multiple participants
+
+This is essentially reimplementing what `agent-framework` already does.
+
+## Key Lessons
+
+### What Went Wrong
+
+1. **Parallel architecture** - Built new system instead of extending existing
+2. **Horizontal sprawl** - All infrastructure, nothing wired in
+3. **Test mocking** - Tests don't mock API clients properly
+4. **No manual testing** - Code never ran end-to-end
+
+### What To Learn From
+
+1. **Pydantic models for structured output** - Good pattern
+2. **Heuristic fallbacks** - When LLM fails, have a fallback
+3. **Section-by-section writing** - For long reports
+4. **RAG for evidence retrieval** - Useful for large evidence sets
+
+### The 7,000 Line vs 500 Line Comparison
+
+**Their approach**:
+- New GraphOrchestrator (974 lines)
+- New ResearchFlow (999 lines)
+- New BudgetTracker (391 lines)
+- New StateMachine (130 lines)
+- New WorkflowManager (300 lines)
+- New agents (InputParser, Writer, LongWriter, Proofreader, etc.)
+- Total: ~7,000 lines, not integrated
+
+**Our approach**:
+- InputParser (50-100 lines) - extends existing
+- PlannerAgent (80-120 lines) - uses ChatAgent pattern
+- DeepOrchestrator (100-150 lines) - wraps MagenticOrchestrator
+- RAGService (100-150 lines) - simple ChromaDB
+- LongWriter (80-100 lines) - extends ReportAgent
+- Total: ~500 lines, each phase ships working
+
+## File Locations (for reference)
+
+```
+reference_repos/GradioDemo/src/
+├── orchestrator/
+│   ├── graph_orchestrator.py     # 974 lines - graph execution
+│   ├── research_flow.py          # 999 lines - iterative/deep flows
+│   └── planner_agent.py          # 184 lines - section planning
+├── agents/
+│   ├── input_parser.py           # 179 lines - query analysis
+│   ├── writer.py                 # 210 lines - report writing
+│   ├── long_writer.py            # ~300 lines - section writing
+│   ├── proofreader.py            # ~200 lines - cleanup
+│   └── knowledge_gap.py          # gap detection
+├── middleware/
+│   ├── budget_tracker.py         # 391 lines - token/time tracking
+│   ├── state_machine.py          # 130 lines - workflow state
+│   └── workflow_manager.py       # 300 lines - parallel loop mgmt
+├── services/
+│   └── llamaindex_rag.py         # 454 lines - RAG service
+├── tools/
+│   └── rag_tool.py               # 191 lines - RAG as search tool
+└── agent_factory/
+    └── graph_builder.py          # ~400 lines - graph construction
+```
|
|
@@ -1,4 +1,4 @@
|
|
| 1 |
-
#
|
| 2 |
|
| 3 |
> **Architecture Pattern**: Microsoft Magentic Orchestration
|
| 4 |
> **Design Philosophy**: Simple, dynamic, manager-driven coordination
|
|
@@ -475,7 +475,7 @@ stateDiagram-v2
|
|
| 475 |
|
| 476 |
```mermaid
|
| 477 |
graph TD
|
| 478 |
-
App[Gradio App<br/>
|
| 479 |
|
| 480 |
App --> Input[Input Section]
|
| 481 |
App --> Status[Status Section]
|
|
@@ -514,7 +514,7 @@ graph TD
|
|
| 514 |
|
| 515 |
```mermaid
|
| 516 |
graph LR
|
| 517 |
-
User[👤 Researcher<br/>Asks research questions] -->|Submits query| DC[
|
| 518 |
|
| 519 |
DC -->|Literature search| PubMed[PubMed API<br/>Medical papers]
|
| 520 |
DC -->|Preprint search| ArXiv[arXiv API<br/>Scientific preprints]
|
|
@@ -549,7 +549,7 @@ graph LR
|
|
| 549 |
|
| 550 |
```mermaid
|
| 551 |
gantt
|
| 552 |
-
title
|
| 553 |
dateFormat mm:ss
|
| 554 |
axisFormat %M:%S
|
| 555 |
|
|
|
|
| 1 |
+
# DeepBoner Workflow - Simplified Magentic Architecture
|
| 2 |
|
| 3 |
> **Architecture Pattern**: Microsoft Magentic Orchestration
|
| 4 |
> **Design Philosophy**: Simple, dynamic, manager-driven coordination
|
|
|
|
| 475 |
|
| 476 |
```mermaid
|
| 477 |
graph TD
|
| 478 |
+
App[Gradio App<br/>DeepBoner Research Agent]
|
| 479 |
|
| 480 |
App --> Input[Input Section]
|
| 481 |
App --> Status[Status Section]
|
|
|
|
| 514 |
|
| 515 |
```mermaid
|
| 516 |
graph LR
|
| 517 |
+
User[👤 Researcher<br/>Asks research questions] -->|Submits query| DC[DeepBoner<br/>Magentic Workflow]
|
| 518 |
|
| 519 |
DC -->|Literature search| PubMed[PubMed API<br/>Medical papers]
|
| 520 |
DC -->|Preprint search| ArXiv[arXiv API<br/>Scientific preprints]
|
|
|
|
| 549 |
|
| 550 |
```mermaid
|
| 551 |
gantt
|
| 552 |
+
title DeepBoner Magentic Workflow - Typical Execution
|
| 553 |
dateFormat mm:ss
|
| 554 |
axisFormat %M:%S
|
| 555 |
|
|
@@ -1,4 +1,4 @@
|
|
| 1 |
-
#
|
| 2 |
|
| 3 |
**NO MOCKS. NO FAKE DATA. REAL SCIENCE.**
|
| 4 |
|
|
@@ -181,4 +181,4 @@ Mocks belong in `tests/unit/`, not in demos. When you run these examples, you se
|
|
| 181 |
- Real scientific hypotheses
|
| 182 |
- Real research reports
|
| 183 |
|
| 184 |
-
This is what
|
|
|
|
| 1 |
+
# DeepBoner Examples
|
| 2 |
|
| 3 |
**NO MOCKS. NO FAKE DATA. REAL SCIENCE.**
|
| 4 |
|
|
|
|
| 181 |
- Real scientific hypotheses
|
| 182 |
- Real research reports
|
| 183 |
|
| 184 |
+
This is what DeepBoner actually does. No fake data. No canned responses.
|
|
@@ -35,7 +35,7 @@ def create_fresh_service(name_suffix: str = "") -> EmbeddingService:
|
|
| 35 |
async def demo_real_pipeline() -> None:
|
| 36 |
"""Run the demo using REAL PubMed data."""
|
| 37 |
print("\n" + "=" * 60)
|
| 38 |
-
print("
|
| 39 |
print("=" * 60)
|
| 40 |
|
| 41 |
# 1. Fetch Real Data
|
|
|
|
| 35 |
async def demo_real_pipeline() -> None:
|
| 36 |
"""Run the demo using REAL PubMed data."""
|
| 37 |
print("\n" + "=" * 60)
|
| 38 |
+
print("DeepBoner Embeddings Demo (REAL DATA)")
|
| 39 |
print("=" * 60)
|
| 40 |
|
| 41 |
# 1. Fetch Real Data
|
|
@@ -1,6 +1,6 @@
|
|
| 1 |
#!/usr/bin/env python3
|
| 2 |
"""
|
| 3 |
-
Demo: Full Stack
|
| 4 |
|
| 5 |
This script demonstrates the COMPLETE REAL drug repurposing research pipeline:
|
| 6 |
- Phase 2: REAL Search (PubMed + ClinicalTrials + Europe PMC)
|
|
@@ -104,7 +104,7 @@ async def _handle_judge_step(
|
|
| 104 |
|
| 105 |
async def run_full_demo(query: str, max_iterations: int) -> None:
|
| 106 |
"""Run the REAL full stack pipeline."""
|
| 107 |
-
print_header("
|
| 108 |
print(f"Query: {query}")
|
| 109 |
print(f"Max iterations: {max_iterations}")
|
| 110 |
print("Mode: REAL (All live API calls - no mocks)\n")
|
|
@@ -172,7 +172,7 @@ async def run_full_demo(query: str, max_iterations: int) -> None:
|
|
| 172 |
async def main() -> None:
|
| 173 |
"""Entry point."""
|
| 174 |
parser = argparse.ArgumentParser(
|
| 175 |
-
description="
|
| 176 |
formatter_class=argparse.RawDescriptionHelpFormatter,
|
| 177 |
epilog="""
|
| 178 |
This demo runs the COMPLETE pipeline with REAL API calls:
|
|
@@ -222,7 +222,7 @@ Examples:
|
|
| 222 |
await run_full_demo(args.query, args.iterations)
|
| 223 |
|
| 224 |
print("\n" + "=" * 70)
|
| 225 |
-
print("
|
| 226 |
print(" ")
|
| 227 |
print(" Everything you just saw was REAL:")
|
| 228 |
print(" - Real PubMed + ClinicalTrials + Europe PMC searches")
|
|
|
|
| 1 |
#!/usr/bin/env python3
|
| 2 |
"""
|
| 3 |
+
Demo: Full Stack DeepBoner Agent (Phases 1-8).
|
| 4 |
|
| 5 |
This script demonstrates the COMPLETE REAL drug repurposing research pipeline:
|
| 6 |
- Phase 2: REAL Search (PubMed + ClinicalTrials + Europe PMC)
|
|
|
|
| 104 |
|
| 105 |
async def run_full_demo(query: str, max_iterations: int) -> None:
|
| 106 |
"""Run the REAL full stack pipeline."""
|
| 107 |
+
print_header("DeepBoner Full Stack Demo (REAL)")
|
| 108 |
print(f"Query: {query}")
|
| 109 |
print(f"Max iterations: {max_iterations}")
|
| 110 |
print("Mode: REAL (All live API calls - no mocks)\n")
|
|
|
|
| 172 |
async def main() -> None:
|
| 173 |
"""Entry point."""
|
| 174 |
parser = argparse.ArgumentParser(
|
| 175 |
+
description="DeepBoner Full Stack Demo - REAL, No Mocks",
|
| 176 |
formatter_class=argparse.RawDescriptionHelpFormatter,
|
| 177 |
epilog="""
|
| 178 |
This demo runs the COMPLETE pipeline with REAL API calls:
|
|
|
|
| 222 |
await run_full_demo(args.query, args.iterations)
|
| 223 |
|
| 224 |
print("\n" + "=" * 70)
|
| 225 |
+
print(" DeepBoner Full Stack Demo Complete!")
|
| 226 |
print(" ")
|
| 227 |
print(" Everything you just saw was REAL:")
|
| 228 |
print(" - Real PubMed + ClinicalTrials + Europe PMC searches")
|
|
@@ -31,7 +31,7 @@ async def run_hypothesis_demo(query: str) -> None:
|
|
| 31 |
"""Run the REAL hypothesis generation pipeline."""
|
| 32 |
try:
|
| 33 |
print(f"\n{'=' * 60}")
|
| 34 |
-
print("
|
| 35 |
print(f"Query: {query}")
|
| 36 |
print("Mode: REAL (Live API calls)")
|
| 37 |
print(f"{'=' * 60}\n")
|
|
|
|
| 31 |
"""Run the REAL hypothesis generation pipeline."""
|
| 32 |
try:
|
| 33 |
print(f"\n{'=' * 60}")
|
| 34 |
+
print("DeepBoner Hypothesis Agent Demo (Phase 7)")
|
| 35 |
print(f"Query: {query}")
|
| 36 |
print("Mode: REAL (Live API calls)")
|
| 37 |
print(f"{'=' * 60}\n")
|
|
@@ -32,7 +32,7 @@ async def main() -> None:
|
|
| 32 |
sys.exit(1)
|
| 33 |
|
| 34 |
print(f"\n{'=' * 60}")
|
| 35 |
-
print("
|
| 36 |
print(f"Query: {args.query}")
|
| 37 |
print(f"{'=' * 60}\n")
|
| 38 |
|
|
|
|
| 32 |
sys.exit(1)
|
| 33 |
|
| 34 |
print(f"\n{'=' * 60}")
|
| 35 |
+
print("DeepBoner Modal Analysis Demo")
|
| 36 |
print(f"Query: {args.query}")
|
| 37 |
print(f"{'=' * 60}\n")
|
| 38 |
|
|
@@ -1,6 +1,6 @@
|
|
| 1 |
#!/usr/bin/env python3
|
| 2 |
"""
|
| 3 |
-
Demo:
|
| 4 |
|
| 5 |
This script demonstrates the REAL Phase 4 orchestration:
|
| 6 |
- REAL Iterative Search (PubMed + ClinicalTrials + Europe PMC)
|
|
@@ -36,7 +36,7 @@ MAX_ITERATIONS = 10
|
|
| 36 |
async def main() -> None:
|
| 37 |
"""Run the REAL agent demo."""
|
| 38 |
parser = argparse.ArgumentParser(
|
| 39 |
-
description="
|
| 40 |
formatter_class=argparse.RawDescriptionHelpFormatter,
|
| 41 |
epilog="""
|
| 42 |
This demo runs the REAL search-judge-synthesize loop:
|
|
@@ -72,7 +72,7 @@ Examples:
|
|
| 72 |
sys.exit(1)
|
| 73 |
|
| 74 |
print(f"\n{'=' * 60}")
|
| 75 |
-
print("
|
| 76 |
print(f"Query: {args.query}")
|
| 77 |
print(f"Max Iterations: {args.iterations}")
|
| 78 |
print("Mode: REAL (All live API calls)")
|
|
|
|
| 1 |
#!/usr/bin/env python3
|
| 2 |
"""
|
| 3 |
+
Demo: DeepBoner Agent Loop (Search + Judge + Orchestrator).
|
| 4 |
|
| 5 |
This script demonstrates the REAL Phase 4 orchestration:
|
| 6 |
- REAL Iterative Search (PubMed + ClinicalTrials + Europe PMC)
|
|
|
|
| 36 |
async def main() -> None:
|
| 37 |
"""Run the REAL agent demo."""
|
| 38 |
parser = argparse.ArgumentParser(
|
| 39 |
+
description="DeepBoner Agent Demo - REAL, No Mocks",
|
| 40 |
formatter_class=argparse.RawDescriptionHelpFormatter,
|
| 41 |
epilog="""
|
| 42 |
This demo runs the REAL search-judge-synthesize loop:
|
|
|
|
| 72 |
sys.exit(1)
|
| 73 |
|
| 74 |
print(f"\n{'=' * 60}")
|
| 75 |
+
print("DeepBoner Agent Demo (REAL)")
|
| 76 |
print(f"Query: {args.query}")
|
| 77 |
print(f"Max Iterations: {args.iterations}")
|
| 78 |
print("Mode: REAL (All live API calls)")
|
|
@@ -1,6 +1,6 @@
|
|
| 1 |
#!/usr/bin/env python3
|
| 2 |
"""
|
| 3 |
-
Demo: Magentic-One Orchestrator for
|
| 4 |
|
| 5 |
This script demonstrates Phase 5 functionality:
|
| 6 |
- Multi-Agent Coordination (Searcher + Judge + Manager)
|
|
@@ -27,7 +27,7 @@ from src.utils.models import OrchestratorConfig
|
|
| 27 |
|
| 28 |
async def main() -> None:
|
| 29 |
"""Run the magentic agent demo."""
|
| 30 |
-
parser = argparse.ArgumentParser(description="Run
|
| 31 |
parser.add_argument("query", help="Research query (e.g., 'metformin cancer')")
|
| 32 |
parser.add_argument("--iterations", type=int, default=10, help="Max rounds")
|
| 33 |
args = parser.parse_args()
|
|
@@ -40,7 +40,7 @@ async def main() -> None:
|
|
| 40 |
sys.exit(1)
|
| 41 |
|
| 42 |
print(f"\n{'=' * 60}")
|
| 43 |
-
print("
|
| 44 |
print(f"Query: {args.query}")
|
| 45 |
print("Mode: MAGENTIC (Multi-Agent)")
|
| 46 |
print(f"{'=' * 60}\n")
|
|
|
|
| 1 |
#!/usr/bin/env python3
|
| 2 |
"""
|
| 3 |
+
Demo: Magentic-One Orchestrator for DeepBoner.
|
| 4 |
|
| 5 |
This script demonstrates Phase 5 functionality:
|
| 6 |
- Multi-Agent Coordination (Searcher + Judge + Manager)
|
|
|
|
| 27 |
|
| 28 |
async def main() -> None:
|
| 29 |
"""Run the magentic agent demo."""
|
| 30 |
+
parser = argparse.ArgumentParser(description="Run DeepBoner Magentic Agent")
|
| 31 |
parser.add_argument("query", help="Research query (e.g., 'metformin cancer')")
|
| 32 |
parser.add_argument("--iterations", type=int, default=10, help="Max rounds")
|
| 33 |
args = parser.parse_args()
|
|
|
|
| 40 |
sys.exit(1)
|
| 41 |
|
| 42 |
print(f"\n{'=' * 60}")
|
| 43 |
+
print("DeepBoner Magentic Agent Demo")
|
| 44 |
print(f"Query: {args.query}")
|
| 45 |
print("Mode: MAGENTIC (Multi-Agent)")
|
| 46 |
print(f"{'=' * 60}\n")
|
|
@@ -30,7 +30,7 @@ from src.tools.search_handler import SearchHandler
|
|
| 30 |
async def main(query: str) -> None:
|
| 31 |
"""Run search demo with the given query."""
|
| 32 |
print(f"\n{'=' * 60}")
|
| 33 |
-
print("
|
| 34 |
print(f"Query: {query}")
|
| 35 |
print(f"{'=' * 60}\n")
|
| 36 |
|
|
|
|
| 30 |
async def main(query: str) -> None:
|
| 31 |
"""Run search demo with the given query."""
|
| 32 |
print(f"\n{'=' * 60}")
|
| 33 |
+
print("DeepBoner Search Demo")
|
| 34 |
print(f"Query: {query}")
|
| 35 |
print(f"{'=' * 60}\n")
|
| 36 |
|
|
@@ -1,5 +1,5 @@
|
|
| 1 |
def main():
|
| 2 |
-
print("Hello from
|
| 3 |
|
| 4 |
|
| 5 |
if __name__ == "__main__":
|
|
|
|
| 1 |
def main():
|
| 2 |
+
print("Hello from deepboner!")
|
| 3 |
|
| 4 |
|
| 5 |
if __name__ == "__main__":
|
|
@@ -1,7 +1,7 @@
|
|
| 1 |
[project]
|
| 2 |
-
name = "
|
| 3 |
version = "0.1.0"
|
| 4 |
-
description = "AI-Native
|
| 5 |
readme = "README.md"
|
| 6 |
requires-python = ">=3.11"
|
| 7 |
dependencies = [
|
|
@@ -126,6 +126,18 @@ markers = [
|
|
| 126 |
"integration: Integration tests (real APIs)",
|
| 127 |
"slow: Slow tests",
|
| 128 |
]
|
| 129 |
|
| 130 |
# ============== COVERAGE CONFIG ==============
|
| 131 |
[tool.coverage.run]
|
|
|
|
| 1 |
[project]
|
| 2 |
+
name = "deepboner"
|
| 3 |
version = "0.1.0"
|
| 4 |
+
description = "AI-Native Sexual Health Research Agent"
|
| 5 |
readme = "README.md"
|
| 6 |
requires-python = ">=3.11"
|
| 7 |
dependencies = [
|
|
|
|
| 126 |
"integration: Integration tests (real APIs)",
|
| 127 |
"slow: Slow tests",
|
| 128 |
]
|
| 129 |
+
# Filter warnings from unittest.mock introspecting Pydantic models.
|
| 130 |
+
# This is a known upstream issue: https://github.com/pydantic/pydantic/issues/9927
|
| 131 |
+
# When autospec=True, mock.py accesses deprecated Pydantic attributes during introspection.
|
| 132 |
+
# We filter these specifically because it's NOT our code triggering deprecations.
|
| 133 |
+
filterwarnings = [
|
| 134 |
+
# Pydantic 2.0 deprecations triggered by mock introspection
|
| 135 |
+
"ignore:The `__fields__` attribute is deprecated:pydantic.warnings.PydanticDeprecatedSince20",
|
| 136 |
+
"ignore:The `__fields_set__` attribute is deprecated:pydantic.warnings.PydanticDeprecatedSince20",
|
| 137 |
+
# Pydantic 2.11 deprecations triggered by mock introspection
|
| 138 |
+
"ignore:Accessing the 'model_computed_fields' attribute on the instance is deprecated:pydantic.warnings.PydanticDeprecatedSince211",
|
| 139 |
+
"ignore:Accessing the 'model_fields' attribute on the instance is deprecated:pydantic.warnings.PydanticDeprecatedSince211",
|
| 140 |
+
]
|
| 141 |
|
| 142 |
# ============== COVERAGE CONFIG ==============
|
| 143 |
[tool.coverage.run]
|
|
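The `filterwarnings` entries added in this hunk use the `action:message:category` form. The sketch below uses only the standard-library `warnings` module to show the intended effect: a warning whose text starts with the configured message and whose class matches the category is dropped, while other warnings still surface. `FakePydanticDeprecation` and `collect` are hypothetical stand-ins for illustration; they are not part of Pydantic or pytest, and the way pytest itself applies the ini entries is not shown here.

```python
import re
import warnings


class FakePydanticDeprecation(DeprecationWarning):
    """Hypothetical stand-in for pydantic.warnings.PydanticDeprecatedSince20."""


def collect(messages):
    """Emit each message as a warning and return the ones that were not filtered."""
    with warnings.catch_warnings(record=True) as caught:
        # Record everything by default...
        warnings.simplefilter("always")
        # ...except the one targeted message/category pair (added last, so checked first).
        warnings.filterwarnings(
            "ignore",
            message=re.escape("The `__fields__` attribute is deprecated"),
            category=FakePydanticDeprecation,
        )
        for text in messages:
            warnings.warn(text, FakePydanticDeprecation)
    return [str(w.message) for w in caught]


remaining = collect(
    [
        "The `__fields__` attribute is deprecated, use `model_fields` instead.",
        "Some other deprecation we still want to see",
    ]
)
print(remaining)
```

Keeping the filter narrow (specific message plus specific category) is what lets deprecations raised by the project's own code keep showing up in the test run.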
@@ -9,7 +9,7 @@ from huggingface_hub import InferenceClient
|
|
| 9 |
from pydantic_ai import Agent
|
| 10 |
from pydantic_ai.models.anthropic import AnthropicModel
|
| 11 |
from pydantic_ai.models.huggingface import HuggingFaceModel
|
| 12 |
-
from pydantic_ai.models.openai import
|
| 13 |
from pydantic_ai.providers.anthropic import AnthropicProvider
|
| 14 |
from pydantic_ai.providers.huggingface import HuggingFaceProvider
|
| 15 |
from pydantic_ai.providers.openai import OpenAIProvider
|
|
@@ -48,7 +48,7 @@ def get_model() -> Any:
|
|
| 48 |
logger.warning("Unknown LLM provider, defaulting to OpenAI", provider=llm_provider)
|
| 49 |
|
| 50 |
openai_provider = OpenAIProvider(api_key=settings.openai_api_key)
|
| 51 |
-
return
|
| 52 |
|
| 53 |
|
| 54 |
class JudgeHandler:
|
|
|
|
| 9 |
from pydantic_ai import Agent
|
| 10 |
from pydantic_ai.models.anthropic import AnthropicModel
|
| 11 |
from pydantic_ai.models.huggingface import HuggingFaceModel
|
| 12 |
+
from pydantic_ai.models.openai import OpenAIChatModel
|
| 13 |
from pydantic_ai.providers.anthropic import AnthropicProvider
|
| 14 |
from pydantic_ai.providers.huggingface import HuggingFaceProvider
|
| 15 |
from pydantic_ai.providers.openai import OpenAIProvider
|
|
|
|
| 48 |
logger.warning("Unknown LLM provider, defaulting to OpenAI", provider=llm_provider)
|
| 49 |
|
| 50 |
openai_provider = OpenAIProvider(api_key=settings.openai_api_key)
|
| 51 |
+
return OpenAIChatModel(settings.openai_model, provider=openai_provider)
|
| 52 |
|
| 53 |
|
| 54 |
class JudgeHandler:
|
|
@@ -1,4 +1,4 @@
|
|
| 1 |
-
"""Gradio UI for
|
| 2 |
|
| 3 |
import os
|
| 4 |
from collections.abc import AsyncGenerator
|
|
@@ -197,29 +197,31 @@ def create_demo() -> gr.ChatInterface:
|
|
| 197 |
# 1. Unwrapped ChatInterface (Fixes Accordion Bug)
|
| 198 |
demo = gr.ChatInterface(
|
| 199 |
fn=research_agent,
|
| 200 |
-
title="
|
| 201 |
description=(
|
| 202 |
-
"*AI-Powered
|
| 203 |
"ClinicalTrials.gov & Europe PMC*\n\n"
|
| 204 |
"---\n"
|
| 205 |
"*Research tool only β not for medical advice.* \n"
|
| 206 |
"**MCP Server Active**: Connect Claude Desktop to `/gradio_api/mcp/`"
|
| 207 |
),
|
| 208 |
examples=[
|
| 209 |
[
|
| 210 |
-
"What drugs
|
| 211 |
"simple",
|
| 212 |
"",
|
| 213 |
"openai",
|
| 214 |
],
|
| 215 |
[
|
| 216 |
-
"
|
| 217 |
"simple",
|
| 218 |
"",
|
| 219 |
"openai",
|
| 220 |
],
|
| 221 |
[
|
| 222 |
-
"
|
| 223 |
"simple",
|
| 224 |
"",
|
| 225 |
"openai",
|
|
|
|
| 1 |
+
"""Gradio UI for DeepBoner agent with MCP server support."""
|
| 2 |
|
| 3 |
import os
|
| 4 |
from collections.abc import AsyncGenerator
|
|
|
|
| 197 |
# 1. Unwrapped ChatInterface (Fixes Accordion Bug)
|
| 198 |
demo = gr.ChatInterface(
|
| 199 |
fn=research_agent,
|
| 200 |
+
title="π DeepBoner",
|
| 201 |
description=(
|
| 202 |
+
"*AI-Powered Sexual Health Research Agent β searches PubMed, "
|
| 203 |
"ClinicalTrials.gov & Europe PMC*\n\n"
|
| 204 |
+
"Deep research for sexual wellness, ED treatments, hormone therapy, "
|
| 205 |
+
"libido, and reproductive health - for all genders.\n\n"
|
| 206 |
"---\n"
|
| 207 |
"*Research tool only β not for medical advice.* \n"
|
| 208 |
"**MCP Server Active**: Connect Claude Desktop to `/gradio_api/mcp/`"
|
| 209 |
),
|
| 210 |
examples=[
|
| 211 |
[
|
| 212 |
+
"What drugs improve female libido post-menopause?",
|
| 213 |
"simple",
|
| 214 |
"",
|
| 215 |
"openai",
|
| 216 |
],
|
| 217 |
[
|
| 218 |
+
"Clinical trials for erectile dysfunction alternatives to PDE5 inhibitors?",
|
| 219 |
"simple",
|
| 220 |
"",
|
| 221 |
"openai",
|
| 222 |
],
|
| 223 |
[
|
| 224 |
+
"Evidence for testosterone therapy in women with HSDD?",
|
| 225 |
"simple",
|
| 226 |
"",
|
| 227 |
"openai",
|
|
@@ -1,4 +1,4 @@
|
|
| 1 |
-
"""MCP tool wrappers for
|
| 2 |
|
| 3 |
These functions expose our search tools via MCP protocol.
|
| 4 |
Each function follows the MCP tool contract:
|
|
|
|
| 1 |
+
"""MCP tool wrappers for DeepBoner search tools.
|
| 2 |
|
| 3 |
These functions expose our search tools via MCP protocol.
|
| 4 |
Each function follows the MCP tool contract:
|
|
@@ -1 +1 @@
|
|
| 1 |
-
"""Services for
|
|
|
|
| 1 |
+
"""Services for DeepBoner."""
|
|
@@ -1,6 +1,10 @@
|
|
| 1 |
"""LlamaIndex RAG service for evidence retrieval and indexing.
|
| 2 |
|
| 3 |
Requires optional dependencies: uv sync --extra modal
|
| 4 |
"""
|
| 5 |
|
| 6 |
from typing import Any
|
|
@@ -25,7 +29,7 @@ class LlamaIndexRAGService:
|
|
| 25 |
|
| 26 |
def __init__(
|
| 27 |
self,
|
| 28 |
-
collection_name: str = "
|
| 29 |
persist_dir: str | None = None,
|
| 30 |
embedding_model: str | None = None,
|
| 31 |
similarity_top_k: int = 5,
|
|
@@ -34,7 +38,8 @@ class LlamaIndexRAGService:
|
|
| 34 |
Initialize LlamaIndex RAG service.
|
| 35 |
|
| 36 |
Args:
|
| 37 |
-
collection_name: Name of the ChromaDB collection
|
| 38 |
persist_dir: Directory to persist ChromaDB data
|
| 39 |
embedding_model: OpenAI embedding model (defaults to settings.openai_embedding_model)
|
| 40 |
similarity_top_k: Number of top results to retrieve
|
|
@@ -248,7 +253,7 @@ class LlamaIndexRAGService:
|
|
| 248 |
|
| 249 |
|
| 250 |
def get_rag_service(
|
| 251 |
-
collection_name: str = "
|
| 252 |
**kwargs: Any,
|
| 253 |
) -> LlamaIndexRAGService:
|
| 254 |
"""
|
|
|
|
| 1 |
"""LlamaIndex RAG service for evidence retrieval and indexing.
|
| 2 |
|
| 3 |
Requires optional dependencies: uv sync --extra modal
|
| 4 |
+
|
| 5 |
+
Migration Note (v1.0 rebrand):
|
| 6 |
+
Default collection_name changed from "deepcritical_evidence" to "deepboner_evidence".
|
| 7 |
+
To preserve existing data, explicitly pass collection_name="deepcritical_evidence".
|
| 8 |
"""
|
| 9 |
|
| 10 |
from typing import Any
|
|
|
|
| 29 |
|
| 30 |
def __init__(
|
| 31 |
self,
|
| 32 |
+
collection_name: str = "deepboner_evidence",
|
| 33 |
persist_dir: str | None = None,
|
| 34 |
embedding_model: str | None = None,
|
| 35 |
similarity_top_k: int = 5,
|
|
|
|
| 38 |
Initialize LlamaIndex RAG service.
|
| 39 |
|
| 40 |
Args:
|
| 41 |
+
collection_name: Name of the ChromaDB collection (default changed from
|
| 42 |
+
"deepcritical_evidence" to "deepboner_evidence" in v1.0 rebrand)
|
| 43 |
persist_dir: Directory to persist ChromaDB data
|
| 44 |
embedding_model: OpenAI embedding model (defaults to settings.openai_embedding_model)
|
| 45 |
similarity_top_k: Number of top results to retrieve
|
|
|
|
| 253 |
|
| 254 |
|
| 255 |
def get_rag_service(
|
| 256 |
+
collection_name: str = "deepboner_evidence",
|
| 257 |
**kwargs: Any,
|
| 258 |
) -> LlamaIndexRAGService:
|
| 259 |
"""
|
|
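The migration note in this hunk says the default collection name changed and that callers must pass the old name explicitly to keep existing data. One way a caller might decide which name to pass is sketched below. `resolve_collection_name` is a hypothetical helper and is not part of `LlamaIndexRAGService`; the set of existing collection names is assumed to come from wherever the caller lists its ChromaDB collections.

```python
DEFAULT_COLLECTION = "deepboner_evidence"      # new default after the rebrand
LEGACY_COLLECTION = "deepcritical_evidence"    # default before the rebrand


def resolve_collection_name(existing: set[str]) -> str:
    """Pick the legacy collection only when it exists and the new one does not.

    Hypothetical helper: `existing` is the set of collection names already
    present in the persisted store.
    """
    if LEGACY_COLLECTION in existing and DEFAULT_COLLECTION not in existing:
        return LEGACY_COLLECTION
    return DEFAULT_COLLECTION


# A store created before the rebrand keeps pointing at its old data.
print(resolve_collection_name({"deepcritical_evidence"}))
# A fresh store, or one that already migrated, uses the new default.
print(resolve_collection_name(set()))
```

The returned name would then be passed as `collection_name=...` to `get_rag_service`, matching the explicit override the docstring describes.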
@@ -75,7 +75,7 @@ class ClinicalTrialsTool:
|
|
| 75 |
requests.get,
|
| 76 |
self.BASE_URL,
|
| 77 |
params=params,
|
| 78 |
-
headers={"User-Agent": "
|
| 79 |
timeout=30,
|
| 80 |
)
|
| 81 |
response.raise_for_status()
|
|
|
|
| 75 |
requests.get,
|
| 76 |
self.BASE_URL,
|
| 77 |
params=params,
|
| 78 |
+
headers={"User-Agent": "DeepBoner-Research-Agent/1.0"},
|
| 79 |
timeout=30,
|
| 80 |
)
|
| 81 |
response.raise_for_status()
|
|
@@ -109,10 +109,10 @@ class ModalCodeExecutor:
|
|
| 109 |
|
| 110 |
try:
|
| 111 |
# Create or lookup Modal app
|
| 112 |
-
app = modal.App.lookup("
|
| 113 |
|
| 114 |
# Define scientific computing image with common libraries
|
| 115 |
-
scientific_image = modal.Image.debian_slim(python_version="3.11").
|
| 116 |
*get_sandbox_library_list()
|
| 117 |
)
|
| 118 |
|
|
|
|
| 109 |
|
| 110 |
try:
|
| 111 |
# Create or lookup Modal app
|
| 112 |
+
app = modal.App.lookup("deepboner-code-execution", create_if_missing=True)
|
| 113 |
|
| 114 |
# Define scientific computing image with common libraries
|
| 115 |
+
scientific_image = modal.Image.debian_slim(python_version="3.11").pip_install(
|
| 116 |
*get_sandbox_library_list()
|
| 117 |
)
|
| 118 |
|
|
@@ -1,25 +1,25 @@
|
|
| 1 |
-
"""Custom exceptions for
|
| 2 |
|
| 3 |
|
| 4 |
-
class
|
| 5 |
-
"""Base exception for all
|
| 6 |
|
| 7 |
pass
|
| 8 |
|
| 9 |
|
| 10 |
-
class SearchError(
|
| 11 |
"""Raised when a search operation fails."""
|
| 12 |
|
| 13 |
pass
|
| 14 |
|
| 15 |
|
| 16 |
-
class JudgeError(
|
| 17 |
"""Raised when the judge fails to assess evidence."""
|
| 18 |
|
| 19 |
pass
|
| 20 |
|
| 21 |
|
| 22 |
-
class ConfigurationError(
|
| 23 |
"""Raised when configuration is invalid."""
|
| 24 |
|
| 25 |
pass
|
|
@@ -29,3 +29,7 @@ class RateLimitError(SearchError):
|
|
| 29 |
"""Raised when we hit API rate limits."""
|
| 30 |
|
| 31 |
pass
|
|
|
| 1 |
+
"""Custom exceptions for DeepBoner."""
|
| 2 |
|
| 3 |
|
| 4 |
+
class DeepBonerError(Exception):
|
| 5 |
+
"""Base exception for all DeepBoner errors."""
|
| 6 |
|
| 7 |
pass
|
| 8 |
|
| 9 |
|
| 10 |
+
class SearchError(DeepBonerError):
|
| 11 |
"""Raised when a search operation fails."""
|
| 12 |
|
| 13 |
pass
|
| 14 |
|
| 15 |
|
| 16 |
+
class JudgeError(DeepBonerError):
|
| 17 |
"""Raised when the judge fails to assess evidence."""
|
| 18 |
|
| 19 |
pass
|
| 20 |
|
| 21 |
|
| 22 |
+
class ConfigurationError(DeepBonerError):
|
| 23 |
"""Raised when configuration is invalid."""
|
| 24 |
|
| 25 |
pass
|
|
|
|
| 29 |
"""Raised when we hit API rate limits."""
|
| 30 |
|
| 31 |
pass
|
| 32 |
+
|
| 33 |
+
|
| 34 |
+
# Backwards compatibility alias
|
| 35 |
+
DeepCriticalError = DeepBonerError
|
|
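The alias at the end of this hunk keeps old `except DeepCriticalError` handlers working after the rename. A minimal, self-contained sketch of that behaviour follows; the class names mirror the diff, while `legacy_caller` and the error message are made up for illustration.

```python
class DeepBonerError(Exception):
    """Base exception, as renamed in the diff."""


class SearchError(DeepBonerError):
    """Raised when a search operation fails."""


# Backwards compatibility alias: both names refer to the same class object.
DeepCriticalError = DeepBonerError


def legacy_caller():
    """Hypothetical pre-rebrand code that still catches the old name."""
    try:
        raise SearchError("PubMed timed out")
    except DeepCriticalError as exc:
        return type(exc).__name__, str(exc)


print(legacy_caller())
print(DeepCriticalError is DeepBonerError)
```

Because the alias is a plain name binding rather than a subclass, `isinstance` checks and `except` clauses behave identically under either name.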
@@ -10,7 +10,7 @@ from pydantic_ai.models.anthropic import AnthropicModel
|
|
| 10 |
# We expect this import to exist after we implement it, or we mock it if it's not there yet
|
| 11 |
# For TDD, we assume we will use the library class
|
| 12 |
from pydantic_ai.models.huggingface import HuggingFaceModel
|
| 13 |
-
from pydantic_ai.models.openai import
|
| 14 |
|
| 15 |
from src.agent_factory.judges import get_model
|
| 16 |
|
|
@@ -28,7 +28,7 @@ def test_get_model_openai(mock_settings):
|
|
| 28 |
mock_settings.openai_model = "gpt-5.1"
|
| 29 |
|
| 30 |
model = get_model()
|
| 31 |
-
assert isinstance(model,
|
| 32 |
assert model.model_name == "gpt-5.1"
|
| 33 |
|
| 34 |
|
|
@@ -61,4 +61,4 @@ def test_get_model_default_fallback(mock_settings):
|
|
| 61 |
mock_settings.openai_model = "gpt-5.1"
|
| 62 |
|
| 63 |
model = get_model()
|
| 64 |
-
assert isinstance(model,
|
|
|
|
| 10 |
# We expect this import to exist after we implement it, or we mock it if it's not there yet
|
| 11 |
# For TDD, we assume we will use the library class
|
| 12 |
from pydantic_ai.models.huggingface import HuggingFaceModel
|
| 13 |
+
from pydantic_ai.models.openai import OpenAIChatModel
|
| 14 |
|
| 15 |
from src.agent_factory.judges import get_model
|
| 16 |
|
|
|
|
| 28 |
mock_settings.openai_model = "gpt-5.1"
|
| 29 |
|
| 30 |
model = get_model()
|
| 31 |
+
assert isinstance(model, OpenAIChatModel)
|
| 32 |
assert model.model_name == "gpt-5.1"
|
| 33 |
|
| 34 |
|
|
|
|
| 61 |
mock_settings.openai_model = "gpt-5.1"
|
| 62 |
|
| 63 |
model = get_model()
|
| 64 |
+
assert isinstance(model, OpenAIChatModel)
|
|
@@ -3,10 +3,19 @@
|
|
| 3 |
from unittest.mock import AsyncMock, MagicMock, patch
|
| 4 |
|
| 5 |
import pytest
|
| 6 |
-
from agent_framework import AgentRunResponse
|
| 7 |
|
| 8 |
-
|
| 9 |
-
|
| 10 |
|
| 11 |
|
| 12 |
@pytest.fixture
|
|
|
|
| 3 |
from unittest.mock import AsyncMock, MagicMock, patch
|
| 4 |
|
| 5 |
import pytest
|
| 6 |
|
| 7 |
+
# Skip all tests if agent_framework not installed (optional dep)
|
| 8 |
+
pytest.importorskip("agent_framework")
|
| 9 |
+
|
| 10 |
+
from agent_framework import AgentRunResponse # noqa: E402
|
| 11 |
+
|
| 12 |
+
from src.agents.hypothesis_agent import HypothesisAgent # noqa: E402
|
| 13 |
+
from src.utils.models import ( # noqa: E402
|
| 14 |
+
Citation,
|
| 15 |
+
Evidence,
|
| 16 |
+
HypothesisAssessment,
|
| 17 |
+
MechanismHypothesis,
|
| 18 |
+
)
|
| 19 |
|
| 20 |
|
| 21 |
@pytest.fixture
|
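The `pytest.importorskip("agent_framework")` guard in this hunk has to run before the `from agent_framework import ...` line, which is why the later imports carry `# noqa: E402`. The underlying idea, attempting the import and backing off cleanly when the optional dependency is absent, can be sketched with the standard library alone. `import_or_none` is a hypothetical helper for illustration and is not how pytest implements the guard.

```python
import importlib
from types import ModuleType
from typing import Optional


def import_or_none(name: str) -> Optional[ModuleType]:
    """Return the imported module, or None when it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# A stdlib module is always available.
json_mod = import_or_none("json")
# A deliberately nonexistent package name stands in for a missing optional dep.
missing = import_or_none("surely_not_installed_pkg_xyz")

print(json_mod is not None, missing is None)
```

In the test module the same decision is made by pytest, which marks the whole file as skipped instead of returning `None`.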