VibecoderMcSwaggins committed · Commit e85636a · unverified · 1 Parent(s): 278a440

fix: complete DeepBoner rebrand with test fixes and CodeRabbit feedback

* docs: add deep research roadmap and GradioDemo analysis

- Introduced a comprehensive roadmap for implementing GPT-Researcher-style deep research in DeepCritical, detailing phases from input parsing to long-form writing.
- Added an analysis of the GradioDemo codebase, highlighting redundant components and useful patterns, while emphasizing a streamlined approach to integration and implementation.

* rebrand: DeepCritical → DeepBoner (sexual health research agent)

πŸ† AI-Native Sexual Health Research Agent

- Package renamed: deepcritical → deepboner
- New focus: sexual wellness, ED, hormone therapy, libido, reproductive health
- Exception class: DeepCriticalError → DeepBonerError (with backwards compat)
- UI examples updated for sexual health queries
- Modal app renamed: deepboner-code-execution
- Fixed pre-existing mypy issue (Modal API: uv_pip_install → pip_install)
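
The backwards-compatible exception rename could look roughly like the sketch below. The real change lives in src/utils/exceptions.py, which this view does not reproduce, so the alias approach and the docstrings are assumptions; the class names follow the Exception Hierarchy listed in AGENTS.md.

```python
class DeepBonerError(Exception):
    """Base class for all project errors (new name)."""


class SearchError(DeepBonerError):
    """Raised when a search tool fails."""


class RateLimitError(SearchError):
    """Raised when an upstream API throttles requests."""


class JudgeError(DeepBonerError):
    """Raised when evidence judging fails."""


# Assumed compatibility shim: keep the old name importable by aliasing it,
# so existing `except DeepCriticalError:` blocks keep working.
DeepCriticalError = DeepBonerError


def handle(exc: Exception) -> str:
    """Show that code written against the old name still catches new errors."""
    try:
        raise exc
    except DeepCriticalError as err:
        return f"caught {type(err).__name__}"
```

With an alias, both names refer to the same class object, so `isinstance` checks and `except` clauses behave identically whichever name a caller uses.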

46 files updated across:
- Core: pyproject.toml, README.md, CLAUDE.md, AGENTS.md, GEMINI.md
- Source: app.py, mcp_tools.py, exceptions.py, code_execution.py
- Docs: ~25 markdown files
- Examples: 8 demo scripts
- Tests: exception tests updated

All tests passing (102 passed, 6 skipped)

* fix: resolve all test warnings and complete DeepBoner rebrand

Code fixes:
- Update OpenAIModel → OpenAIChatModel (pydantic-ai deprecation)
- Add pytest.importorskip guards to 3 test files for optional deps

Rebrand fixes (missed in initial rename):
- src/services/llamaindex_rag.py: collection name
- src/tools/clinicaltrials.py: User-Agent header
- 7 doc files: package names, URLs, examples

Config:
- Add targeted pytest warning filters for Pydantic mock introspection
(known upstream issue: pydantic/pydantic#9927)

Result: 127 tests pass, 0 warnings (was 50 warnings)
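
To illustrate what "targeted" means here: pytest's `filterwarnings` entries use the same action/message/category format as Python's own warning filters. The sketch below uses only the standard `warnings` module; the message pattern is a made-up placeholder, not the filter string from pyproject.toml (that diff is not shown in this view).

```python
import warnings

# Hypothetical pattern for illustration only; the real filters are in pyproject.toml.
NOISY_PATTERN = r".*introspection.*"


def collect_warnings() -> list:
    """Emit two warnings and return the messages that survive the filter."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record everything by default
        # Targeted filter: drop only UserWarnings whose message matches the pattern.
        warnings.filterwarnings("ignore", message=NOISY_PATTERN, category=UserWarning)
        warnings.warn("mock introspection touched a deprecated attribute", UserWarning)
        warnings.warn("OpenAIModel is deprecated", DeprecationWarning)
    return [str(w.message) for w in caught]
```

The point of a narrow filter is that unrelated warnings, such as the deprecation notice in the sketch, still surface instead of being silenced wholesale.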

* fix: address CodeRabbit review feedback

Fixes from CodeRabbit analysis:
- docs/to_do/DEEP_RESEARCH_ROADMAP.md: Fix MagenticBuilder → MagenticOrchestrator
- docs/to_do/DEEP_RESEARCH_ROADMAP.md: Add note clarifying RAGService vs EmbeddingService
- src/services/llamaindex_rag.py: Add migration note for collection name change
- tests/unit/utils/test_exceptions.py: Add pytestmark = pytest.mark.unit
- docs/implementation/12_phase_mcp_server.md: Update all bioRxiv refs → Europe PMC

All 127 tests pass, 0 warnings.

This view is limited to 50 files because the commit contains too many changes.
Files changed (50)
  1. .gitignore +1 -0
  2. AGENTS.md +4 -4
  3. CLAUDE.md +4 -4
  4. Dockerfile +1 -1
  5. GEMINI.md +6 -6
  6. README.md +26 -16
  7. docs/architecture/design-patterns.md +3 -3
  8. docs/architecture/overview.md +3 -3
  9. docs/brainstorming/00_ROADMAP_SUMMARY.md +3 -3
  10. docs/brainstorming/01_PUBMED_IMPROVEMENTS.md +2 -2
  11. docs/brainstorming/03_EUROPEPMC_IMPROVEMENTS.md +1 -1
  12. docs/brainstorming/04_OPENALEX_INTEGRATION.md +4 -4
  13. docs/brainstorming/implementation/15_PHASE_OPENALEX.md +1 -1
  14. docs/brainstorming/magentic-pydantic/00_SITUATION_AND_PLAN.md +1 -1
  15. docs/brainstorming/magentic-pydantic/REVIEW_PROMPT_FOR_SENIOR_AGENT.md +10 -10
  16. docs/bugs/P1_GRADIO_SETTINGS_CLEANUP.md +2 -2
  17. docs/development/testing.md +1 -1
  18. docs/guides/deployment.md +5 -5
  19. docs/implementation/01_phase_foundation.md +10 -10
  20. docs/implementation/04_phase_ui.md +7 -7
  21. docs/implementation/10_phase_clinicaltrials.md +2 -2
  22. docs/implementation/11_phase_biorxiv.md +1 -1
  23. docs/implementation/12_phase_mcp_server.md +45 -45
  24. docs/implementation/13_phase_modal_integration.md +1 -1
  25. docs/implementation/14_phase_demo_submission.md +12 -12
  26. docs/implementation/roadmap.md +3 -3
  27. docs/index.md +1 -1
  28. docs/to_do/DEEP_RESEARCH_ROADMAP.md +337 -0
  29. docs/to_do/REFERENCE_GRADDIO_DEMO_ANALYSIS.md +229 -0
  30. docs/workflow-diagrams.md +4 -4
  31. examples/README.md +2 -2
  32. examples/embeddings_demo/run_embeddings.py +1 -1
  33. examples/full_stack_demo/run_full.py +4 -4
  34. examples/hypothesis_demo/run_hypothesis.py +1 -1
  35. examples/modal_demo/run_analysis.py +1 -1
  36. examples/orchestrator_demo/run_agent.py +3 -3
  37. examples/orchestrator_demo/run_magentic.py +3 -3
  38. examples/search_demo/run_search.py +1 -1
  39. main.py +1 -1
  40. pyproject.toml +14 -2
  41. src/agent_factory/judges.py +2 -2
  42. src/app.py +8 -6
  43. src/mcp_tools.py +1 -1
  44. src/services/__init__.py +1 -1
  45. src/services/llamaindex_rag.py +8 -3
  46. src/tools/clinicaltrials.py +1 -1
  47. src/tools/code_execution.py +2 -2
  48. src/utils/exceptions.py +10 -6
  49. tests/unit/agent_factory/test_judges_factory.py +3 -3
  50. tests/unit/agents/test_hypothesis_agent.py +12 -3
.gitignore CHANGED
@@ -49,6 +49,7 @@ reference_repos/claude-agent-sdk/
49
  reference_repos/pydanticai-research-agent/
50
  reference_repos/pubmed-mcp-server/
51
  reference_repos/DeepCritical/
 
52
 
53
  # Keep the README in reference_repos
54
  !reference_repos/README.md
 
49
  reference_repos/pydanticai-research-agent/
50
  reference_repos/pubmed-mcp-server/
51
  reference_repos/DeepCritical/
52
+ reference_repos/GradioDemo/
53
 
54
  # Keep the README in reference_repos
55
  !reference_repos/README.md
AGENTS.md CHANGED
@@ -4,7 +4,7 @@ This file provides guidance to AI agents when working with code in this reposito
4
 
5
  ## Project Overview
6
 
7
- DeepCritical is an AI-native drug repurposing research agent for a HuggingFace hackathon. It uses a search-and-judge loop to autonomously search biomedical databases (PubMed, ClinicalTrials.gov, bioRxiv) and synthesize evidence for queries like "What existing drugs might help treat long COVID fatigue?".
8
 
9
  **Current Status:** Phases 1-13 COMPLETE (Foundation through Modal sandbox integration).
10
 
@@ -39,7 +39,7 @@ uv run pytest -m integration
39
  User Question → Orchestrator
40
  ↓
41
  Search Loop:
42
- 1. Query PubMed, ClinicalTrials.gov, bioRxiv
43
  2. Gather evidence
44
  3. Judge quality ("Do we have enough?")
45
  4. If NO → Refine query, search more
@@ -53,7 +53,7 @@ Research Report with Citations
53
  - `src/orchestrator.py` - Main agent loop
54
  - `src/tools/pubmed.py` - PubMed E-utilities search
55
  - `src/tools/clinicaltrials.py` - ClinicalTrials.gov API
56
- - `src/tools/biorxiv.py` - bioRxiv/medRxiv preprint search
57
  - `src/tools/code_execution.py` - Modal sandbox execution
58
  - `src/tools/search_handler.py` - Scatter-gather orchestration
59
  - `src/services/embeddings.py` - Semantic search & deduplication (ChromaDB)
@@ -82,7 +82,7 @@ Settings via pydantic-settings from `.env`:
82
  ## Exception Hierarchy
83
 
84
  ```text
85
- DeepCriticalError (base)
86
  ├── SearchError
87
  │   └── RateLimitError
88
  ├── JudgeError
 
4
 
5
  ## Project Overview
6
 
7
+ DeepBoner is an AI-native sexual health research agent. It uses a search-and-judge loop to autonomously search biomedical databases (PubMed, ClinicalTrials.gov, Europe PMC) and synthesize evidence for queries like "What drugs improve female libido post-menopause?" or "Evidence for testosterone therapy in women with HSDD?".
8
 
9
  **Current Status:** Phases 1-13 COMPLETE (Foundation through Modal sandbox integration).
10
 
 
39
  User Question → Orchestrator
40
  ↓
41
  Search Loop:
42
+ 1. Query PubMed, ClinicalTrials.gov, Europe PMC
43
  2. Gather evidence
44
  3. Judge quality ("Do we have enough?")
45
  4. If NO → Refine query, search more
 
53
  - `src/orchestrator.py` - Main agent loop
54
  - `src/tools/pubmed.py` - PubMed E-utilities search
55
  - `src/tools/clinicaltrials.py` - ClinicalTrials.gov API
56
+ - `src/tools/europepmc.py` - Europe PMC search
57
  - `src/tools/code_execution.py` - Modal sandbox execution
58
  - `src/tools/search_handler.py` - Scatter-gather orchestration
59
  - `src/services/embeddings.py` - Semantic search & deduplication (ChromaDB)
 
82
  ## Exception Hierarchy
83
 
84
  ```text
85
+ DeepBonerError (base)
86
  ├── SearchError
87
  │   └── RateLimitError
88
  ├── JudgeError
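
The search-and-judge loop described in the AGENTS.md overview above can be sketched as follows. This is a minimal illustration, not the code in src/orchestrator.py: every function name, the signature of the judge callback, and the round limit are assumptions made for the example.

```python
from typing import Callable, List, Tuple

# Hypothetical type aliases; the project's real interfaces may differ.
SearchFn = Callable[[str], List[str]]
JudgeFn = Callable[[str, List[str]], Tuple[bool, str]]


def research_loop(question: str, searchers: List[SearchFn], judge: JudgeFn,
                  max_rounds: int = 3) -> List[str]:
    """Search every source, judge the evidence, refine and repeat if needed."""
    evidence: List[str] = []
    query = question
    for _ in range(max_rounds):
        # Steps 1-2: query each source and gather evidence (deduplicated).
        for search in searchers:
            for item in search(query):
                if item not in evidence:
                    evidence.append(item)
        # Step 3: judge quality ("Do we have enough?").
        enough, refined_query = judge(question, evidence)
        if enough:
            break
        # Step 4: if not, refine the query and search again.
        query = refined_query
    return evidence


# Stub sources and judge standing in for PubMed, ClinicalTrials.gov and an LLM.
def fake_pubmed(query: str) -> List[str]:
    return [f"pubmed:{query}"]


def fake_trials(query: str) -> List[str]:
    return [f"ctgov:{query}"]


def fake_judge(question: str, evidence: List[str]) -> Tuple[bool, str]:
    return (len(evidence) >= 4, question + " review")


results = research_loop("flibanserin efficacy", [fake_pubmed, fake_trials], fake_judge)
```

The `max_rounds` cap is there so that a judge which never reports "enough" cannot keep the loop running indefinitely.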
CLAUDE.md CHANGED
@@ -4,7 +4,7 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
4
 
5
  ## Project Overview
6
 
7
- DeepCritical is an AI-native drug repurposing research agent for a HuggingFace hackathon. It uses a search-and-judge loop to autonomously search biomedical databases (PubMed, ClinicalTrials.gov, bioRxiv) and synthesize evidence for queries like "What existing drugs might help treat long COVID fatigue?".
8
 
9
  **Current Status:** Phases 1-13 COMPLETE (Foundation through Modal sandbox integration).
10
 
@@ -39,7 +39,7 @@ uv run pytest -m integration
39
  User Question → Orchestrator
40
  ↓
41
  Search Loop:
42
- 1. Query PubMed, ClinicalTrials.gov, bioRxiv
43
  2. Gather evidence
44
  3. Judge quality ("Do we have enough?")
45
  4. If NO → Refine query, search more
@@ -53,7 +53,7 @@ Research Report with Citations
53
  - `src/orchestrator.py` - Main agent loop
54
  - `src/tools/pubmed.py` - PubMed E-utilities search
55
  - `src/tools/clinicaltrials.py` - ClinicalTrials.gov API
56
- - `src/tools/biorxiv.py` - bioRxiv/medRxiv preprint search
57
  - `src/tools/code_execution.py` - Modal sandbox execution
58
  - `src/tools/search_handler.py` - Scatter-gather orchestration
59
  - `src/services/embeddings.py` - Semantic search & deduplication (ChromaDB)
@@ -82,7 +82,7 @@ Settings via pydantic-settings from `.env`:
82
  ## Exception Hierarchy
83
 
84
  ```text
85
- DeepCriticalError (base)
86
  ├── SearchError
87
  │   └── RateLimitError
88
  ├── JudgeError
 
4
 
5
  ## Project Overview
6
 
7
+ DeepBoner is an AI-native sexual health research agent. It uses a search-and-judge loop to autonomously search biomedical databases (PubMed, ClinicalTrials.gov, Europe PMC) and synthesize evidence for queries like "What drugs improve female libido post-menopause?" or "Evidence for testosterone therapy in women with HSDD?".
8
 
9
  **Current Status:** Phases 1-13 COMPLETE (Foundation through Modal sandbox integration).
10
 
 
39
  User Question → Orchestrator
40
  ↓
41
  Search Loop:
42
+ 1. Query PubMed, ClinicalTrials.gov, Europe PMC
43
  2. Gather evidence
44
  3. Judge quality ("Do we have enough?")
45
  4. If NO → Refine query, search more
 
53
  - `src/orchestrator.py` - Main agent loop
54
  - `src/tools/pubmed.py` - PubMed E-utilities search
55
  - `src/tools/clinicaltrials.py` - ClinicalTrials.gov API
56
+ - `src/tools/europepmc.py` - Europe PMC search
57
  - `src/tools/code_execution.py` - Modal sandbox execution
58
  - `src/tools/search_handler.py` - Scatter-gather orchestration
59
  - `src/services/embeddings.py` - Semantic search & deduplication (ChromaDB)
 
82
  ## Exception Hierarchy
83
 
84
  ```text
85
+ DeepBonerError (base)
86
  ├── SearchError
87
  │   └── RateLimitError
88
  ├── JudgeError
Dockerfile CHANGED
@@ -1,4 +1,4 @@
1
- # Dockerfile for DeepCritical
2
  FROM python:3.11-slim
3
 
4
  # Set working directory
 
1
+ # Dockerfile for DeepBoner
2
  FROM python:3.11-slim
3
 
4
  # Set working directory
GEMINI.md CHANGED
@@ -1,9 +1,9 @@
1
- # DeepCritical Context
2
 
3
  ## Project Overview
4
 
5
- **DeepCritical** is an AI-native Medical Drug Repurposing Research Agent.
6
- **Goal:** To accelerate the discovery of new uses for existing drugs by intelligently searching biomedical literature (PubMed, ClinicalTrials.gov, bioRxiv), evaluating evidence, and hypothesizing potential applications.
7
 
8
  **Architecture:**
9
  The project follows a **Vertical Slice Architecture** (Search -> Judge -> Orchestrator) and adheres to **Strict TDD** (Test-Driven Development).
@@ -11,7 +11,7 @@ The project follows a **Vertical Slice Architecture** (Search -> Judge -> Orches
11
  **Current Status:**
12
 
13
  - **Phases 1-9:** COMPLETE. Foundation, Search, Judge, UI, Orchestrator, Embeddings, Hypothesis, Report, Cleanup.
14
- - **Phases 10-11:** COMPLETE. ClinicalTrials.gov and bioRxiv integration.
15
  - **Phase 12:** COMPLETE. MCP Server integration (Gradio MCP at `/gradio_api/mcp/`).
16
  - **Phase 13:** COMPLETE. Modal sandbox for statistical analysis.
17
 
@@ -41,7 +41,7 @@ The project follows a **Vertical Slice Architecture** (Search -> Judge -> Orches
41
 
42
  - `src/`: Source code
43
  - `utils/`: Shared utilities (`config.py`, `exceptions.py`, `models.py`)
44
- - `tools/`: Search tools (`pubmed.py`, `clinicaltrials.py`, `biorxiv.py`, `code_execution.py`)
45
  - `services/`: Services (`embeddings.py`, `statistical_analyzer.py`)
46
  - `agents/`: Magentic multi-agent mode agents
47
  - `agent_factory/`: Agent definitions (judges, prompts)
@@ -58,7 +58,7 @@ The project follows a **Vertical Slice Architecture** (Search -> Judge -> Orches
58
  - `src/orchestrator.py` - Main agent loop
59
  - `src/tools/pubmed.py` - PubMed E-utilities search
60
  - `src/tools/clinicaltrials.py` - ClinicalTrials.gov API
61
- - `src/tools/biorxiv.py` - bioRxiv/medRxiv preprint search
62
  - `src/tools/code_execution.py` - Modal sandbox execution
63
  - `src/services/statistical_analyzer.py` - Statistical analysis via Modal
64
  - `src/mcp_tools.py` - MCP tool wrappers
 
1
+ # DeepBoner Context
2
 
3
  ## Project Overview
4
 
5
+ **DeepBoner** is an AI-native Sexual Health Research Agent.
6
+ **Goal:** To accelerate research into sexual health, wellness, and reproductive medicine by intelligently searching biomedical literature (PubMed, ClinicalTrials.gov, Europe PMC), evaluating evidence, and synthesizing findings.
7
 
8
  **Architecture:**
9
  The project follows a **Vertical Slice Architecture** (Search -> Judge -> Orchestrator) and adheres to **Strict TDD** (Test-Driven Development).
 
11
  **Current Status:**
12
 
13
  - **Phases 1-9:** COMPLETE. Foundation, Search, Judge, UI, Orchestrator, Embeddings, Hypothesis, Report, Cleanup.
14
+ - **Phases 10-11:** COMPLETE. ClinicalTrials.gov and Europe PMC integration.
15
  - **Phase 12:** COMPLETE. MCP Server integration (Gradio MCP at `/gradio_api/mcp/`).
16
  - **Phase 13:** COMPLETE. Modal sandbox for statistical analysis.
17
 
 
41
 
42
  - `src/`: Source code
43
  - `utils/`: Shared utilities (`config.py`, `exceptions.py`, `models.py`)
44
+ - `tools/`: Search tools (`pubmed.py`, `clinicaltrials.py`, `europepmc.py`, `code_execution.py`)
45
  - `services/`: Services (`embeddings.py`, `statistical_analyzer.py`)
46
  - `agents/`: Magentic multi-agent mode agents
47
  - `agent_factory/`: Agent definitions (judges, prompts)
 
58
  - `src/orchestrator.py` - Main agent loop
59
  - `src/tools/pubmed.py` - PubMed E-utilities search
60
  - `src/tools/clinicaltrials.py` - ClinicalTrials.gov API
61
+ - `src/tools/europepmc.py` - Europe PMC search
62
  - `src/tools/code_execution.py` - Modal sandbox execution
63
  - `src/services/statistical_analyzer.py` - Statistical analysis via Modal
64
  - `src/mcp_tools.py` - MCP tool wrappers
README.md CHANGED
@@ -1,7 +1,7 @@
1
  ---
2
- title: DeepCritical
3
- emoji: 🧬
4
- colorFrom: blue
5
  colorTo: purple
6
  sdk: gradio
7
  sdk_version: "6.0.1"
@@ -10,26 +10,37 @@ app_file: src/app.py
10
  pinned: false
11
  license: mit
12
  tags:
13
- - mcp-in-action-track-enterprise
 
 
 
14
  - mcp-hackathon
15
- - drug-repurposing
16
- - biomedical-ai
17
  - pydantic-ai
18
  - llamaindex
19
  - modal
20
  ---
21
 
22
- # DeepCritical
23
 
24
- AI-Powered Drug Repurposing Research Agent
 
 
25
 
26
  ## Features
27
 
28
- - **Multi-Source Search**: PubMed, ClinicalTrials.gov, bioRxiv/medRxiv
29
  - **MCP Integration**: Use our tools from Claude Desktop or any MCP client
30
  - **Modal Sandbox**: Secure execution of AI-generated statistical code
31
  - **LlamaIndex RAG**: Semantic search and evidence synthesis
32
 
 
 
 
 
 
 
 
 
33
  ## Quick Start
34
 
35
  ### 1. Environment Setup
@@ -62,7 +73,7 @@ Add this to your `claude_desktop_config.json`:
62
  ```json
63
  {
64
  "mcpServers": {
65
- "deepcritical": {
66
  "url": "http://localhost:7860/gradio_api/mcp/"
67
  }
68
  }
@@ -72,7 +83,7 @@ Add this to your `claude_desktop_config.json`:
72
  **Available Tools**:
73
  - `search_pubmed`: Search peer-reviewed biomedical literature.
74
  - `search_clinical_trials`: Search ClinicalTrials.gov.
75
- - `search_biorxiv`: Search bioRxiv/medRxiv preprints.
76
  - `search_all`: Search all sources simultaneously.
77
  - `analyze_hypothesis`: Secure statistical analysis using Modal sandboxes.
78
 
@@ -92,16 +103,16 @@ make check
92
 
93
  ## Architecture
94
 
95
- DeepCritical uses a Vertical Slice Architecture:
96
 
97
- 1. **Search Slice**: Retrieving evidence from PubMed, ClinicalTrials.gov, and bioRxiv.
98
  2. **Judge Slice**: Evaluating evidence quality using LLMs.
99
  3. **Orchestrator Slice**: Managing the research loop and UI.
100
 
101
  Built with:
102
  - **PydanticAI**: For robust agent interactions.
103
  - **Gradio**: For the streaming user interface.
104
- - **PubMed, ClinicalTrials.gov, bioRxiv**: For biomedical data.
105
  - **MCP**: For universal tool access.
106
  - **Modal**: For secure code execution.
107
 
@@ -110,8 +121,7 @@ Built with:
110
  - The-Obstacle-Is-The-Way
111
  - MarioAderman
112
  - EmployeeNo427
113
- - Josephrp *(provided initial template)*
114
 
115
  ## Links
116
 
117
- - [GitHub Repository](https://github.com/The-Obstacle-Is-The-Way/DeepCritical-1)
 
1
  ---
2
+ title: DeepBoner
3
+ emoji: πŸ†
4
+ colorFrom: pink
5
  colorTo: purple
6
  sdk: gradio
7
  sdk_version: "6.0.1"
 
10
  pinned: false
11
  license: mit
12
  tags:
13
+ - sexual-health
14
+ - reproductive-medicine
15
+ - hormone-therapy
16
+ - wellness-research
17
  - mcp-hackathon
 
 
18
  - pydantic-ai
19
  - llamaindex
20
  - modal
21
  ---
22
 
23
+ # DeepBoner πŸ†
24
 
25
+ AI-Native Sexual Health Research Agent
26
+
27
+ Deep research for sexual wellness, ED treatments, hormone therapy, libido, and reproductive health - for all genders.
28
 
29
  ## Features
30
 
31
+ - **Multi-Source Search**: PubMed, ClinicalTrials.gov, Europe PMC
32
  - **MCP Integration**: Use our tools from Claude Desktop or any MCP client
33
  - **Modal Sandbox**: Secure execution of AI-generated statistical code
34
  - **LlamaIndex RAG**: Semantic search and evidence synthesis
35
 
36
+ ## Example Queries
37
+
38
+ - "What drugs improve female libido post-menopause?"
39
+ - "Clinical trials for erectile dysfunction alternatives to PDE5 inhibitors?"
40
+ - "Evidence for testosterone therapy in women with HSDD?"
41
+ - "Drug interactions with sildenafil?"
42
+ - "What's the latest research on flibanserin efficacy?"
43
+
44
  ## Quick Start
45
 
46
  ### 1. Environment Setup
 
73
  ```json
74
  {
75
  "mcpServers": {
76
+ "deepboner": {
77
  "url": "http://localhost:7860/gradio_api/mcp/"
78
  }
79
  }
 
83
  **Available Tools**:
84
  - `search_pubmed`: Search peer-reviewed biomedical literature.
85
  - `search_clinical_trials`: Search ClinicalTrials.gov.
86
+ - `search_europepmc`: Search Europe PMC preprints and papers.
87
  - `search_all`: Search all sources simultaneously.
88
  - `analyze_hypothesis`: Secure statistical analysis using Modal sandboxes.
89
 
 
103
 
104
  ## Architecture
105
 
106
+ DeepBoner uses a Vertical Slice Architecture:
107
 
108
+ 1. **Search Slice**: Retrieving evidence from PubMed, ClinicalTrials.gov, and Europe PMC.
109
  2. **Judge Slice**: Evaluating evidence quality using LLMs.
110
  3. **Orchestrator Slice**: Managing the research loop and UI.
111
 
112
  Built with:
113
  - **PydanticAI**: For robust agent interactions.
114
  - **Gradio**: For the streaming user interface.
115
+ - **PubMed, ClinicalTrials.gov, Europe PMC**: For biomedical data.
116
  - **MCP**: For universal tool access.
117
  - **Modal**: For secure code execution.
118
 
 
121
  - The-Obstacle-Is-The-Way
122
  - MarioAderman
123
  - EmployeeNo427
 
124
 
125
  ## Links
126
 
127
+ - [GitHub Repository](https://github.com/The-Obstacle-Is-The-Way/DeepBoner)
docs/architecture/design-patterns.md CHANGED
@@ -726,7 +726,7 @@ If evidence is weak, say so clearly."""
726
  **Architecture**:
727
  ```
728
  ┌─────────────────────────────────────────────────┐
729
- │               DeepCritical Agent                │
730
  │        (uses tools directly OR via MCP)         │
731
  └─────────────────────────────────────────────────┘
732
                           │
@@ -811,7 +811,7 @@ uvx fastmcp run src/mcp_servers/pubmed_server.py
811
  "pubmed": {
812
  "command": "python",
813
  "args": ["-m", "src.mcp_servers.pubmed_server"],
814
- "cwd": "/path/to/deepcritical"
815
  }
816
  }
817
  }
@@ -865,7 +865,7 @@ def research_with_streaming(question: str) -> Generator[str, None, None]:
865
 
866
  # Gradio 5 UI
867
  with gr.Blocks(theme=gr.themes.Soft()) as demo:
868
- gr.Markdown("# 🔬 DeepCritical: Drug Repurposing Research Agent")
869
  gr.Markdown("Ask a question about potential drug repurposing opportunities.")
870
 
871
  with gr.Row():
 
726
  **Architecture**:
727
  ```
728
  ┌─────────────────────────────────────────────────┐
729
+ │                 DeepBoner Agent                 │
730
  │        (uses tools directly OR via MCP)         │
731
  └─────────────────────────────────────────────────┘
732
                           │
 
811
  "pubmed": {
812
  "command": "python",
813
  "args": ["-m", "src.mcp_servers.pubmed_server"],
814
+ "cwd": "/path/to/deepboner"
815
  }
816
  }
817
  }
 
865
 
866
  # Gradio 5 UI
867
  with gr.Blocks(theme=gr.themes.Soft()) as demo:
868
+ gr.Markdown("# 🔬 DeepBoner: Drug Repurposing Research Agent")
869
  gr.Markdown("Ask a question about potential drug repurposing opportunities.")
870
 
871
  with gr.Row():
docs/architecture/overview.md CHANGED
@@ -1,11 +1,11 @@
1
- # DeepCritical: Medical Drug Repurposing Research Agent
2
  ## Project Overview
3
 
4
  ---
5
 
6
  ## Executive Summary
7
 
8
- **DeepCritical** is a deep research agent designed to accelerate medical drug repurposing research by autonomously searching, analyzing, and synthesizing evidence from multiple biomedical databases.
9
 
10
  ### The Problem We Solve
11
 
@@ -16,7 +16,7 @@ Drug repurposing - finding new therapeutic uses for existing FDA-approved drugs
16
  - Assess safety profiles
17
  - Synthesize evidence into actionable insights
18
 
19
- **DeepCritical automates this process from hours to minutes.**
20
 
21
  ### What Is Drug Repurposing?
22
 
 
1
+ # DeepBoner: Medical Drug Repurposing Research Agent
2
  ## Project Overview
3
 
4
  ---
5
 
6
  ## Executive Summary
7
 
8
+ **DeepBoner** is a deep research agent designed to accelerate medical drug repurposing research by autonomously searching, analyzing, and synthesizing evidence from multiple biomedical databases.
9
 
10
  ### The Problem We Solve
11
 
 
16
  - Assess safety profiles
17
  - Synthesize evidence into actionable insights
18
 
19
+ **DeepBoner automates this process from hours to minutes.**
20
 
21
  ### What Is Drug Repurposing?
22
 
docs/brainstorming/00_ROADMAP_SUMMARY.md CHANGED
@@ -1,4 +1,4 @@
1
- # DeepCritical Data Sources: Roadmap Summary
2
 
3
  **Created**: 2024-11-27
4
  **Purpose**: Future maintainability and hackathon continuation
@@ -131,7 +131,7 @@ Keep current architecture working, add OpenAlex incrementally.
131
  ```
132
 
133
  2. **Copy OpenAlex tool from reference repo**
134
- - File: `reference_repos/DeepCritical/DeepResearch/src/tools/openalex_tools.py`
135
  - Adapt to our `SearchTool` base class
136
 
137
  3. **Enable NCBI API Key**
@@ -189,6 +189,6 @@ If you're picking this up after the hackathon:
189
  1. **Start with OpenAlex** - biggest bang for buck
190
  2. **Add rate limiting** - prevents API blocks
191
  3. **Don't bother with bioRxiv** - use Europe PMC instead
192
- 4. **Reference repo is gold** - `reference_repos/DeepCritical/` has working implementations
193
 
194
  Good luck! 🚀
 
1
+ # DeepBoner Data Sources: Roadmap Summary
2
 
3
  **Created**: 2024-11-27
4
  **Purpose**: Future maintainability and hackathon continuation
 
131
  ```
132
 
133
  2. **Copy OpenAlex tool from reference repo**
134
+ - File: `reference_repos/DeepBoner/DeepResearch/src/tools/openalex_tools.py`
135
  - Adapt to our `SearchTool` base class
136
 
137
  3. **Enable NCBI API Key**
 
189
  1. **Start with OpenAlex** - biggest bang for buck
190
  2. **Add rate limiting** - prevents API blocks
191
  3. **Don't bother with bioRxiv** - use Europe PMC instead
192
+ 4. **Reference repo is gold** - `reference_repos/DeepBoner/` has working implementations
193
 
194
  Good luck! 🚀
docs/brainstorming/01_PUBMED_IMPROVEMENTS.md CHANGED
@@ -24,9 +24,9 @@
24
 
25
  ---
26
 
27
- ## Reference Implementation (DeepCritical Reference Repo)
28
 
29
- The reference repo at `reference_repos/DeepCritical/DeepResearch/src/tools/bioinformatics_tools.py` has a more sophisticated implementation:
30
 
31
  ### Features We're Missing
32
 
 
24
 
25
  ---
26
 
27
+ ## Reference Implementation (DeepBoner Reference Repo)
28
 
29
+ The reference repo at `reference_repos/DeepBoner/DeepResearch/src/tools/bioinformatics_tools.py` has a more sophisticated implementation:
30
 
31
  ### Features We're Missing
32
 
docs/brainstorming/03_EUROPEPMC_IMPROVEMENTS.md CHANGED
@@ -182,7 +182,7 @@ Europe PMC is more generous than NCBI:
182
  # Recommend: 10-20 requests/second max
183
  # Use email in User-Agent for polite pool
184
  headers = {
185
- "User-Agent": "DeepCritical/1.0 (mailto:your@email.com)"
186
  }
187
  ```
188
 
 
182
  # Recommend: 10-20 requests/second max
183
  # Use email in User-Agent for polite pool
184
  headers = {
185
+ "User-Agent": "DeepBoner/1.0 (mailto:your@email.com)"
186
  }
187
  ```
188
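
As a rough illustration of the User-Agent advice in the snippet above, the sketch below builds (but does not send) a Europe PMC search request with the standard library. The helper name and the choice of query parameters are assumptions for the example; the contact address is the same placeholder used in the doc and should be replaced with a real one.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder contact address, as in the snippet above.
USER_AGENT = "DeepBoner/1.0 (mailto:your@email.com)"
# Europe PMC REST search endpoint.
SEARCH_URL = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"


def build_search_request(query: str, page_size: int = 10) -> Request:
    """Build a search request that identifies the client via User-Agent."""
    params = urlencode({"query": query, "format": "json", "pageSize": page_size})
    return Request(f"{SEARCH_URL}?{params}", headers={"User-Agent": USER_AGENT})


request = build_search_request("sildenafil")
```

Passing `request` to `urllib.request.urlopen` would perform the actual call; the example stops short of that so it runs without network access.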
 
docs/brainstorming/04_OPENALEX_INTEGRATION.md CHANGED
@@ -2,7 +2,7 @@
2
 
3
  **Status**: NOT Implemented (Candidate for Addition)
4
  **Priority**: HIGH - Could Replace Multiple Tools
5
- **Reference**: Already implemented in `reference_repos/DeepCritical`
6
 
7
  ---
8
 
@@ -20,7 +20,7 @@ OpenAlex is a **fully open** index of the global research system:
20
 
21
  ---
22
 
23
- ## Why OpenAlex for DeepCritical?
24
 
25
  ### Current Architecture
26
 
@@ -60,7 +60,7 @@ Orchestrator (enrich with CT.gov for trials)
60
 
61
  ## Reference Implementation
62
 
63
- From `reference_repos/DeepCritical/DeepResearch/src/tools/openalex_tools.py`:
64
 
65
  ```python
66
  class OpenAlexFetchTool(ToolRunner):
@@ -212,7 +212,7 @@ class OpenAlexTool(SearchTool):
212
  "filter": "type:article,is_oa:true",
213
  "sort": "cited_by_count:desc",
214
  "per_page": max_results,
215
- "mailto": "deepcritical@example.com", # Polite pool
216
  },
217
  )
218
  data = resp.json()
 
2
 
3
  **Status**: NOT Implemented (Candidate for Addition)
4
  **Priority**: HIGH - Could Replace Multiple Tools
5
+ **Reference**: Already implemented in `reference_repos/DeepBoner`
6
 
7
  ---
8
 
 
20
 
21
  ---
22
 
23
+ ## Why OpenAlex for DeepBoner?
24
 
25
  ### Current Architecture
26
 
 
60
 
61
  ## Reference Implementation
62
 
63
+ From `reference_repos/DeepBoner/DeepResearch/src/tools/openalex_tools.py`:
64
 
65
  ```python
66
  class OpenAlexFetchTool(ToolRunner):
 
212
  "filter": "type:article,is_oa:true",
213
  "sort": "cited_by_count:desc",
214
  "per_page": max_results,
215
+ "mailto": "deepboner@example.com", # Polite pool
216
  },
217
  )
218
  data = resp.json()
docs/brainstorming/implementation/15_PHASE_OPENALEX.md CHANGED
@@ -305,7 +305,7 @@ class OpenAlexTool:
305
  Args:
306
  email: Optional email for polite pool (faster responses)
307
  """
308
- self.email = email or "deepcritical@example.com"
309
 
310
  @property
311
  def name(self) -> str:
 
305
  Args:
306
  email: Optional email for polite pool (faster responses)
307
  """
308
+ self.email = email or "deepboner@example.com"
309
 
310
  @property
311
  def name(self) -> str:
docs/brainstorming/magentic-pydantic/00_SITUATION_AND_PLAN.md CHANGED
@@ -167,7 +167,7 @@ The refactor branch (`feat/pubmed-fulltext`) has some valuable improvements:
167
  ## 9. Questions to Answer Before Proceeding
168
 
169
  1. **For the hackathon**: Do we need full multi-agent orchestration, or is single-agent sufficient?
170
- 2. **For DeepCritical mainline**: Is the plan to use Microsoft Agent Framework for orchestration?
171
  3. **Timeline**: How much time do we have to get this right?
172
 
173
  ---
 
167
  ## 9. Questions to Answer Before Proceeding
168
 
169
  1. **For the hackathon**: Do we need full multi-agent orchestration, or is single-agent sufficient?
170
+ 2. **For DeepBoner mainline**: Is the plan to use Microsoft Agent Framework for orchestration?
171
  3. **Timeline**: How much time do we have to get this right?
172
 
173
  ---
docs/brainstorming/magentic-pydantic/REVIEW_PROMPT_FOR_SENIOR_AGENT.md CHANGED
@@ -6,7 +6,7 @@ Copy and paste everything below this line to a fresh Claude/AI session:
6
 
7
  ## Context
8
 
9
- I am a junior developer working on a HuggingFace hackathon project called DeepCritical. We made a significant architectural mistake and are now trying to course-correct. I need you to act as a **senior staff engineer** and critically review our proposed solution.
10
 
11
  ## The Situation
12
 
@@ -62,28 +62,28 @@ Please perform a **deep, critical review** of:
62
 
63
  Please read these files in order:
64
 
65
- 1. `/Users/ray/Desktop/CLARITY-DIGITAL-TWIN/DeepCritical-1/docs/brainstorming/magentic-pydantic/00_SITUATION_AND_PLAN.md`
66
- 2. `/Users/ray/Desktop/CLARITY-DIGITAL-TWIN/DeepCritical-1/docs/brainstorming/magentic-pydantic/01_ARCHITECTURE_SPEC.md`
67
- 3. `/Users/ray/Desktop/CLARITY-DIGITAL-TWIN/DeepCritical-1/docs/brainstorming/magentic-pydantic/02_IMPLEMENTATION_PHASES.md`
68
- 4. `/Users/ray/Desktop/CLARITY-DIGITAL-TWIN/DeepCritical-1/docs/brainstorming/magentic-pydantic/03_IMMEDIATE_ACTIONS.md`
69
 
70
  And the architecture diagram:
71
- 5. `/Users/ray/Desktop/CLARITY-DIGITAL-TWIN/DeepCritical-1/assets/magentic-pydantic.png`
72
 
73
  ## Reference Repositories to Consult
74
 
75
  We have local clones of the source-of-truth repositories:
76
 
77
- - **Original DeepCritical:** `/Users/ray/Desktop/CLARITY-DIGITAL-TWIN/DeepCritical-1/reference_repos/DeepCritical/`
78
- - **Microsoft Agent Framework:** `/Users/ray/Desktop/CLARITY-DIGITAL-TWIN/DeepCritical-1/reference_repos/agent-framework/`
79
- - **Microsoft AutoGen:** `/Users/ray/Desktop/CLARITY-DIGITAL-TWIN/DeepCritical-1/reference_repos/autogen-microsoft/`
80
 
81
  Please cross-reference our hackathon fork against these to verify architectural alignment.
82
 
83
  ## Codebase to Analyze
84
 
85
  Our hackathon fork is at:
86
- `/Users/ray/Desktop/CLARITY-DIGITAL-TWIN/DeepCritical-1/`
87
 
88
  Key files to examine:
89
  - `src/agents/` - Agent framework integration
 
6
 
7
  ## Context
8
 
9
+ I am a junior developer working on a HuggingFace hackathon project called DeepBoner. We made a significant architectural mistake and are now trying to course-correct. I need you to act as a **senior staff engineer** and critically review our proposed solution.
10
 
11
  ## The Situation
12
 
 
62
 
63
  Please read these files in order:
64
 
65
+ 1. `/Users/ray/Desktop/CLARITY-DIGITAL-TWIN/DeepBoner-1/docs/brainstorming/magentic-pydantic/00_SITUATION_AND_PLAN.md`
66
+ 2. `/Users/ray/Desktop/CLARITY-DIGITAL-TWIN/DeepBoner-1/docs/brainstorming/magentic-pydantic/01_ARCHITECTURE_SPEC.md`
67
+ 3. `/Users/ray/Desktop/CLARITY-DIGITAL-TWIN/DeepBoner-1/docs/brainstorming/magentic-pydantic/02_IMPLEMENTATION_PHASES.md`
68
+ 4. `/Users/ray/Desktop/CLARITY-DIGITAL-TWIN/DeepBoner-1/docs/brainstorming/magentic-pydantic/03_IMMEDIATE_ACTIONS.md`
69
 
70
  And the architecture diagram:
71
+ 5. `/Users/ray/Desktop/CLARITY-DIGITAL-TWIN/DeepBoner-1/assets/magentic-pydantic.png`
72
 
73
  ## Reference Repositories to Consult
74
 
75
  We have local clones of the source-of-truth repositories:
76
 
77
+ - **Original DeepBoner:** `/Users/ray/Desktop/CLARITY-DIGITAL-TWIN/DeepBoner-1/reference_repos/DeepBoner/`
78
+ - **Microsoft Agent Framework:** `/Users/ray/Desktop/CLARITY-DIGITAL-TWIN/DeepBoner-1/reference_repos/agent-framework/`
79
+ - **Microsoft AutoGen:** `/Users/ray/Desktop/CLARITY-DIGITAL-TWIN/DeepBoner-1/reference_repos/autogen-microsoft/`
80
 
81
  Please cross-reference our hackathon fork against these to verify architectural alignment.
82
 
83
  ## Codebase to Analyze
84
 
85
  Our hackathon fork is at:
86
+ `/Users/ray/Desktop/CLARITY-DIGITAL-TWIN/DeepBoner-1/`
87
 
88
  Key files to examine:
89
  - `src/agents/` - Agent framework integration
docs/bugs/P1_GRADIO_SETTINGS_CLEANUP.md CHANGED
@@ -55,7 +55,7 @@ def create_demo():
55
  def create_demo():
56
  return gr.ChatInterface( # <--- FIX: Top-level component
57
  ...,
58
- title="🧬 DeepCritical",
59
  description="*AI-Powered Drug Repurposing Agent...*\n\n---\n**MCP Server Active**...",
60
  additional_inputs_accordion=gr.Accordion(label="⚙️ Settings", open=False)
61
  )
@@ -69,7 +69,7 @@ def create_demo():
69
  2. **Check**: Open `http://localhost:7860`
70
  3. **Verify**:
71
  * Settings accordion starts **COLLAPSED**.
72
- * Header title ("DeepCritical") is visible.
73
  * Footer text ("MCP Server Active") is visible in the description area.
74
  * Chat functionality works (Magentic/Simple modes).
75
 
 
55
  def create_demo():
56
  return gr.ChatInterface( # <--- FIX: Top-level component
57
  ...,
58
+ title="🧬 DeepBoner",
59
  description="*AI-Powered Drug Repurposing Agent...*\n\n---\n**MCP Server Active**...",
60
  additional_inputs_accordion=gr.Accordion(label="⚙️ Settings", open=False)
61
  )
 
69
  2. **Check**: Open `http://localhost:7860`
70
  3. **Verify**:
71
  * Settings accordion starts **COLLAPSED**.
72
+ * Header title ("DeepBoner") is visible.
73
  * Footer text ("MCP Server Active") is visible in the description area.
74
  * Chat functionality works (Magentic/Simple modes).
75
 
docs/development/testing.md CHANGED
@@ -1,5 +1,5 @@
1
  # Testing Strategy
2
- ## ensuring DeepCritical is Ironclad
3
 
4
  ---
5
 
 
1
  # Testing Strategy
2
+ ## ensuring DeepBoner is Ironclad
3
 
4
  ---
5
 
docs/guides/deployment.md CHANGED
@@ -1,11 +1,11 @@
1
  # Deployment Guide
2
- ## Launching DeepCritical: Gradio, MCP, & Modal
3
 
4
  ---
5
 
6
  ## Overview
7
 
8
- DeepCritical is designed for a multi-platform deployment strategy to maximize hackathon impact:
9
 
10
  1. **HuggingFace Spaces**: Host the Gradio UI (User Interface).
11
  2. **MCP Server**: Expose research tools to Claude Desktop/Agents.
@@ -69,10 +69,10 @@ def predict(message, history):
69
  ```json
70
  {
71
  "mcpServers": {
72
- "deepcritical": {
73
  "command": "uv",
74
  "args": ["run", "fastmcp", "run", "src/mcp_servers/pubmed_server.py"],
75
- "cwd": "/absolute/path/to/DeepCritical"
76
  }
77
  }
78
  }
@@ -111,7 +111,7 @@ Instead of calling Anthropic API, we call a Modal function:
111
  # src/llm/modal_client.py
112
  import modal
113
 
114
- stub = modal.Stub("deepcritical-inference")
115
 
116
  @stub.function(gpu="A100")
117
  def generate_text(prompt: str):
 
1
  # Deployment Guide
2
+ ## Launching DeepBoner: Gradio, MCP, & Modal
3
 
4
  ---
5
 
6
  ## Overview
7
 
8
+ DeepBoner is designed for a multi-platform deployment strategy to maximize hackathon impact:
9
 
10
  1. **HuggingFace Spaces**: Host the Gradio UI (User Interface).
11
  2. **MCP Server**: Expose research tools to Claude Desktop/Agents.
 
69
  ```json
70
  {
71
  "mcpServers": {
72
+ "deepboner": {
73
  "command": "uv",
74
  "args": ["run", "fastmcp", "run", "src/mcp_servers/pubmed_server.py"],
75
+ "cwd": "/absolute/path/to/DeepBoner"
76
  }
77
  }
78
  }
 
111
  # src/llm/modal_client.py
112
  import modal
113
 
114
+ stub = modal.Stub("deepboner-inference")
115
 
116
  @stub.function(gpu="A100")
117
  def generate_text(prompt: str):
docs/implementation/01_phase_foundation.md CHANGED
@@ -23,7 +23,7 @@ uv --version # Should be >= 0.4.0
23
 
24
  ```bash
25
  # From project root
26
- uv init --name deepcritical
27
  uv python install 3.11 # Pin Python version
28
  ```
29
 
@@ -35,9 +35,9 @@ uv python install 3.11 # Pin Python version
35
 
36
  ```toml
37
  [project]
38
- name = "deepcritical"
39
  version = "0.1.0"
40
- description = "AI-Native Drug Repurposing Research Agent"
41
  readme = "README.md"
42
  requires-python = ">=3.11"
43
  dependencies = [
@@ -401,25 +401,25 @@ settings = get_settings()
401
  ### `src/utils/exceptions.py`
402
 
403
  ```python
404
- """Custom exceptions for DeepCritical."""
405
 
406
 
407
- class DeepCriticalError(Exception):
408
- """Base exception for all DeepCritical errors."""
409
  pass
410
 
411
 
412
- class SearchError(DeepCriticalError):
413
  """Raised when a search operation fails."""
414
  pass
415
 
416
 
417
- class JudgeError(DeepCriticalError):
418
  """Raised when the judge fails to assess evidence."""
419
  pass
420
 
421
 
422
- class ConfigurationError(DeepCriticalError):
423
  """Raised when configuration is invalid."""
424
  pass
425
 
@@ -558,7 +558,7 @@ uv run pre-commit install
558
  ## 10. Implementation Checklist
559
 
560
  - [ ] Install `uv` and verify version
561
- - [ ] Run `uv init --name deepcritical`
562
  - [ ] Create `pyproject.toml` (copy from above)
563
  - [ ] Create directory structure (run mkdir commands)
564
  - [ ] Create `.env.example` and `.env`
 
23
 
24
  ```bash
25
  # From project root
26
+ uv init --name deepboner
27
  uv python install 3.11 # Pin Python version
28
  ```
29
 
 
35
 
36
  ```toml
37
  [project]
38
+ name = "deepboner"
39
  version = "0.1.0"
40
+ description = "AI-Native Sexual Health Research Agent"
41
  readme = "README.md"
42
  requires-python = ">=3.11"
43
  dependencies = [
 
401
  ### `src/utils/exceptions.py`
402
 
403
  ```python
404
+ """Custom exceptions for DeepBoner."""
405
 
406
 
407
+ class DeepBonerError(Exception):
408
+ """Base exception for all DeepBoner errors."""
409
  pass
410
 
411
 
412
+ class SearchError(DeepBonerError):
413
  """Raised when a search operation fails."""
414
  pass
415
 
416
 
417
+ class JudgeError(DeepBonerError):
418
  """Raised when the judge fails to assess evidence."""
419
  pass
420
 
421
 
422
+ class ConfigurationError(DeepBonerError):
423
  """Raised when configuration is invalid."""
424
  pass
425
 
 
558
  ## 10. Implementation Checklist
559
 
560
  - [ ] Install `uv` and verify version
561
+ - [ ] Run `uv init --name deepboner`
562
  - [ ] Create `pyproject.toml` (copy from above)
563
  - [ ] Create directory structure (run mkdir commands)
564
  - [ ] Create `.env.example` and `.env`
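
The commit message says the `DeepCriticalError` to `DeepBonerError` rename keeps backwards compatibility, but the hunk above shows only the renamed classes. A minimal sketch of one way to do that, assuming the compatibility layer is a plain module-level alias (the real `src/utils/exceptions.py` may differ; `run_search` and `safe_search` are hypothetical helpers for illustration):

```python
"""Sketch: renamed exception hierarchy with a legacy alias."""


class DeepBonerError(Exception):
    """Base exception for all DeepBoner errors."""


class SearchError(DeepBonerError):
    """Raised when a search operation fails."""


# Assumption: backwards compatibility is a simple alias, so code that still
# catches the old name keeps working after the rename.
DeepCriticalError = DeepBonerError


def run_search(query: str) -> str:
    """Hypothetical helper that rejects an empty query."""
    if not query.strip():
        raise SearchError("empty query")
    return f"results for {query}"


def safe_search(query: str) -> str:
    """Catch failures through the legacy name to show the alias still matches."""
    try:
        return run_search(query)
    except DeepCriticalError as exc:
        return f"search failed: {exc}"
```

Because the alias binds the old name to the same class object, existing `except DeepCriticalError` handlers continue to catch `SearchError`, `JudgeError`, and `ConfigurationError` without changes.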
docs/implementation/04_phase_ui.md CHANGED
@@ -401,7 +401,7 @@ Found {len(evidence)} sources. Consider refining your query for more specific re
401
  Using Gradio 5 generator pattern for real-time streaming.
402
 
403
  ```python
404
- """Gradio UI for DeepCritical agent."""
405
  import asyncio
406
  import gradio as gr
407
  from typing import AsyncGenerator
@@ -557,11 +557,11 @@ def create_demo() -> gr.Blocks:
557
  Configured Gradio Blocks interface
558
  """
559
  with gr.Blocks(
560
- title="DeepCritical - Drug Repurposing Research Agent",
561
  theme=gr.themes.Soft(),
562
  ) as demo:
563
  gr.Markdown("""
564
- # 🧬 DeepCritical
565
  ## AI-Powered Drug Repurposing Research Agent
566
 
567
  Ask questions about potential drug repurposing opportunities.
@@ -935,7 +935,7 @@ class TestAgentEvent:
935
  ## 6. Dockerfile
936
 
937
  ```dockerfile
938
- # Dockerfile for DeepCritical
939
  FROM python:3.11-slim
940
 
941
  # Set working directory
@@ -975,7 +975,7 @@ Create `README.md` header for HuggingFace Spaces:
975
 
976
  ```markdown
977
  ---
978
- title: DeepCritical
979
  emoji: 🧬
980
  colorFrom: blue
981
  colorTo: purple
@@ -986,7 +986,7 @@ pinned: false
986
  license: mit
987
  ---
988
 
989
- # DeepCritical
990
 
991
  AI-Powered Drug Repurposing Research Agent
992
  ```
@@ -1088,7 +1088,7 @@ After deployment to HuggingFace Spaces:
1088
 
1089
  ## Project Complete! 🎉
1090
 
1091
- When Phase 4 is done, the DeepCritical MVP is complete:
1092
 
1093
  - **Phase 1**: Foundation (uv, pytest, config) ✅
1094
  - **Phase 2**: Search Slice (PubMed, DuckDuckGo) ✅
 
401
  Using Gradio 5 generator pattern for real-time streaming.
402
 
403
  ```python
404
+ """Gradio UI for DeepBoner agent."""
405
  import asyncio
406
  import gradio as gr
407
  from typing import AsyncGenerator
 
557
  Configured Gradio Blocks interface
558
  """
559
  with gr.Blocks(
560
+ title="DeepBoner - Drug Repurposing Research Agent",
561
  theme=gr.themes.Soft(),
562
  ) as demo:
563
  gr.Markdown("""
564
+ # 🧬 DeepBoner
565
  ## AI-Powered Drug Repurposing Research Agent
566
 
567
  Ask questions about potential drug repurposing opportunities.
 
935
  ## 6. Dockerfile
936
 
937
  ```dockerfile
938
+ # Dockerfile for DeepBoner
939
  FROM python:3.11-slim
940
 
941
  # Set working directory
 
975
 
976
  ```markdown
977
  ---
978
+ title: DeepBoner
979
  emoji: 🧬
980
  colorFrom: blue
981
  colorTo: purple
 
986
  license: mit
987
  ---
988
 
989
+ # DeepBoner
990
 
991
  AI-Powered Drug Repurposing Research Agent
992
  ```
 
1088
 
1089
  ## Project Complete! 🎉
1090
 
1091
+ When Phase 4 is done, the DeepBoner MVP is complete:
1092
 
1093
  - **Phase 1**: Foundation (uv, pytest, config) ✅
1094
  - **Phase 2**: Search Slice (PubMed, DuckDuckGo) ✅
docs/implementation/10_phase_clinicaltrials.md CHANGED
@@ -185,7 +185,7 @@ class ClinicalTrialsTool:
185
  requests.get,
186
  self.BASE_URL,
187
  params=params,
188
- headers={"User-Agent": "DeepCritical-Research-Agent/1.0"},
189
  timeout=30,
190
  )
191
  response.raise_for_status()
@@ -434,4 +434,4 @@ source .env && uv run python examples/search_demo/run_search.py "metformin alzhe
434
  | No phase info | Phase I/II/III evidence strength |
435
 
436
  **Demo pitch addition**:
437
- > "DeepCritical searches PubMed for peer-reviewed evidence AND ClinicalTrials.gov for 400,000+ clinical trials."
 
185
  requests.get,
186
  self.BASE_URL,
187
  params=params,
188
+ headers={"User-Agent": "DeepBoner-Research-Agent/1.0"},
189
  timeout=30,
190
  )
191
  response.raise_for_status()
 
434
  | No phase info | Phase I/II/III evidence strength |
435
 
436
  **Demo pitch addition**:
437
+ > "DeepBoner searches PubMed for peer-reviewed evidence AND ClinicalTrials.gov for 400,000+ clinical trials."
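
The renamed `User-Agent` header in the hunk above is passed to `requests.get` together with `params` and `timeout=30`. The sketch below shows the same idea with the standard library only, so it runs without third-party packages and without sending a request. The endpoint and the `query.term` / `pageSize` parameter names are assumptions for illustration; the real tool reads its URL from `self.BASE_URL`, and `build_request` is a hypothetical helper.

```python
import urllib.parse
import urllib.request

# Assumption: illustrative endpoint; the real tool keeps its URL in BASE_URL.
BASE_URL = "https://clinicaltrials.gov/api/v2/studies"
USER_AGENT = "DeepBoner-Research-Agent/1.0"


def build_request(query: str, max_results: int = 10) -> urllib.request.Request:
    """Build (but do not send) a GET request carrying the rebranded User-Agent."""
    # Assumed parameter names; adjust to whatever the real tool sends.
    params = urllib.parse.urlencode({"query.term": query, "pageSize": max_results})
    return urllib.request.Request(
        f"{BASE_URL}?{params}",
        headers={"User-Agent": USER_AGENT},
    )


req = build_request("metformin alzheimer", max_results=5)
# urllib stores header names capitalized, hence "User-agent" on lookup.
print(req.get_header("User-agent"))
print(req.full_url)
```

Keeping the agent string in one constant means a later rename touches a single line rather than every call site.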
docs/implementation/11_phase_biorxiv.md CHANGED
@@ -531,7 +531,7 @@ source .env && uv run python examples/search_demo/run_search.py "metformin diabe
531
  | Miss cutting-edge | Catch breakthroughs early |
532
 
533
  **Demo pitch (final)**:
534
- > "DeepCritical searches PubMed for peer-reviewed evidence, ClinicalTrials.gov for 400,000+ clinical trials, and bioRxiv/medRxiv for cutting-edge preprints - then uses LLMs to generate mechanistic hypotheses and synthesize findings into publication-quality reports."
535
 
536
  ---
537
 
 
531
  | Miss cutting-edge | Catch breakthroughs early |
532
 
533
  **Demo pitch (final)**:
534
+ > "DeepBoner searches PubMed for peer-reviewed evidence, ClinicalTrials.gov for 400,000+ clinical trials, and bioRxiv/medRxiv for cutting-edge preprints - then uses LLMs to generate mechanistic hypotheses and synthesize findings into publication-quality reports."
535
 
536
  ---
537
 
docs/implementation/12_phase_mcp_server.md CHANGED
@@ -1,6 +1,6 @@
1
  # Phase 12 Implementation Spec: MCP Server Integration
2
 
3
- **Goal**: Expose DeepCritical search tools as MCP servers for Track 2 compliance.
4
  **Philosophy**: "MCP is the bridge between tools and LLMs."
5
  **Prerequisite**: Phase 11 complete (all search tools working)
6
  **Priority**: P0 - REQUIRED FOR HACKATHON TRACK 2
@@ -121,7 +121,7 @@ https://[space-id].hf.space/gradio_api/mcp/
121
  ### 4.1 MCP Tool Wrappers (`src/mcp_tools.py`)
122
 
123
  ```python
124
- """MCP tool wrappers for DeepCritical search tools.
125
 
126
  These functions expose our search tools via MCP protocol.
127
  Each function follows the MCP tool contract:
@@ -130,15 +130,15 @@ Each function follows the MCP tool contract:
130
  - Formatted string returns
131
  """
132
 
133
- from src.tools.biorxiv import BioRxivTool
134
  from src.tools.clinicaltrials import ClinicalTrialsTool
 
135
  from src.tools.pubmed import PubMedTool
136
 
137
 
138
  # Singleton instances (avoid recreating on each call)
139
  _pubmed = PubMedTool()
140
  _trials = ClinicalTrialsTool()
141
- _biorxiv = BioRxivTool()
142
 
143
 
144
  async def search_pubmed(query: str, max_results: int = 10) -> str:
@@ -202,10 +202,10 @@ async def search_clinical_trials(query: str, max_results: int = 10) -> str:
202
  return "\n".join(formatted)
203
 
204
 
205
- async def search_biorxiv(query: str, max_results: int = 10) -> str:
206
- """Search bioRxiv/medRxiv for preprint research.
207
 
208
- Searches bioRxiv and medRxiv preprint servers for cutting-edge research.
209
  Note: Preprints are NOT peer-reviewed but contain the latest findings.
210
 
211
  Args:
@@ -217,10 +217,10 @@ async def search_biorxiv(query: str, max_results: int = 10) -> str:
217
  """
218
  max_results = max(1, min(50, max_results))
219
 
220
- results = await _biorxiv.search(query, max_results)
221
 
222
  if not results:
223
- return f"No bioRxiv/medRxiv preprints found for: {query}"
224
 
225
  formatted = [f"## Preprint Results for: {query}\n"]
226
  for i, evidence in enumerate(results, 1):
@@ -236,7 +236,7 @@ async def search_biorxiv(query: str, max_results: int = 10) -> str:
236
  async def search_all_sources(query: str, max_per_source: int = 5) -> str:
237
  """Search all biomedical sources simultaneously.
238
 
239
- Performs parallel search across PubMed, ClinicalTrials.gov, and bioRxiv.
240
  This is the most comprehensive search option for drug repurposing research.
241
 
242
  Args:
@@ -253,10 +253,10 @@ async def search_all_sources(query: str, max_per_source: int = 5) -> str:
253
  # Run all searches in parallel
254
  pubmed_task = search_pubmed(query, max_per_source)
255
  trials_task = search_clinical_trials(query, max_per_source)
256
- biorxiv_task = search_biorxiv(query, max_per_source)
257
 
258
- pubmed_results, trials_results, biorxiv_results = await asyncio.gather(
259
- pubmed_task, trials_task, biorxiv_task, return_exceptions=True
260
  )
261
 
262
  formatted = [f"# Comprehensive Search: {query}\n"]
@@ -272,10 +272,10 @@ async def search_all_sources(query: str, max_per_source: int = 5) -> str:
272
  else:
273
  formatted.append(f"## Clinical Trials\n*Error: {trials_results}*\n")
274
 
275
- if isinstance(biorxiv_results, str):
276
- formatted.append(biorxiv_results)
277
  else:
278
- formatted.append(f"## Preprints\n*Error: {biorxiv_results}*\n")
279
 
280
  return "\n---\n".join(formatted)
281
  ```
@@ -283,7 +283,7 @@ async def search_all_sources(query: str, max_per_source: int = 5) -> str:
283
  ### 4.2 Update Gradio App (`src/app.py`)
284
 
285
  ```python
286
- """Gradio UI for DeepCritical agent with MCP server support."""
287
 
288
  import os
289
  from collections.abc import AsyncGenerator
@@ -294,12 +294,12 @@ import gradio as gr
294
  from src.agent_factory.judges import JudgeHandler, MockJudgeHandler
295
  from src.mcp_tools import (
296
  search_all_sources,
297
- search_biorxiv,
298
  search_clinical_trials,
299
  search_pubmed,
300
  )
301
  from src.orchestrator_factory import create_orchestrator
302
- from src.tools.biorxiv import BioRxivTool
303
  from src.tools.clinicaltrials import ClinicalTrialsTool
304
  from src.tools.pubmed import PubMedTool
305
  from src.tools.search_handler import SearchHandler
@@ -317,15 +317,15 @@ def create_demo() -> Any:
317
  Configured Gradio Blocks interface with MCP server enabled
318
  """
319
  with gr.Blocks(
320
- title="DeepCritical - Drug Repurposing Research Agent",
321
  theme=gr.themes.Soft(),
322
  ) as demo:
323
  gr.Markdown("""
324
- # DeepCritical
325
  ## AI-Powered Drug Repurposing Research Agent
326
 
327
  Ask questions about potential drug repurposing opportunities.
328
- The agent searches PubMed, ClinicalTrials.gov, and bioRxiv/medRxiv preprints.
329
 
330
  **Example questions:**
331
  - "What drugs could be repurposed for Alzheimer's disease?"
@@ -381,13 +381,13 @@ def create_demo() -> Any:
381
 
382
  with gr.Tab("Preprints"):
383
  gr.Interface(
384
- fn=search_biorxiv,
385
  inputs=[
386
  gr.Textbox(label="Query", placeholder="long covid treatment"),
387
  gr.Slider(1, 50, value=10, step=1, label="Max Results"),
388
  ],
389
  outputs=gr.Markdown(label="Results"),
390
- api_name="search_biorxiv",
391
  )
392
 
393
  with gr.Tab("Search All"):
@@ -406,7 +406,7 @@ def create_demo() -> Any:
406
  **Note**: This is a research tool and should not be used for medical decisions.
407
  Always consult healthcare professionals for medical advice.
408
 
409
- Built with PydanticAI + PubMed, ClinicalTrials.gov & bioRxiv
410
 
411
  **MCP Server**: Available at `/gradio_api/mcp/` for Claude Desktop integration
412
  """)
@@ -444,7 +444,7 @@ import pytest
444
 
445
  from src.mcp_tools import (
446
  search_all_sources,
447
- search_biorxiv,
448
  search_clinical_trials,
449
  search_pubmed,
450
  )
@@ -525,18 +525,18 @@ class TestSearchClinicalTrials:
525
  assert "Clinical Trials" in result
526
 
527
 
528
- class TestSearchBiorxiv:
529
- """Tests for search_biorxiv MCP tool."""
530
 
531
  @pytest.mark.asyncio
532
  async def test_returns_formatted_string(self, mock_evidence: Evidence) -> None:
533
  """Should return formatted markdown string."""
534
- mock_evidence.citation.source = "biorxiv" # type: ignore
535
 
536
- with patch("src.mcp_tools._biorxiv") as mock_tool:
537
  mock_tool.search = AsyncMock(return_value=[mock_evidence])
538
 
539
- result = await search_biorxiv("preprint search", 10)
540
 
541
  assert isinstance(result, str)
542
  assert "Preprint Results" in result
@@ -550,11 +550,11 @@ class TestSearchAllSources:
550
  """Should combine results from all sources."""
551
  with patch("src.mcp_tools.search_pubmed", new_callable=AsyncMock) as mock_pubmed, \
552
  patch("src.mcp_tools.search_clinical_trials", new_callable=AsyncMock) as mock_trials, \
553
- patch("src.mcp_tools.search_biorxiv", new_callable=AsyncMock) as mock_biorxiv:
554
 
555
  mock_pubmed.return_value = "## PubMed Results"
556
  mock_trials.return_value = "## Clinical Trials"
557
- mock_biorxiv.return_value = "## Preprints"
558
 
559
  result = await search_all_sources("metformin", 5)
560
 
@@ -568,11 +568,11 @@ class TestSearchAllSources:
568
  """Should handle partial failures gracefully."""
569
  with patch("src.mcp_tools.search_pubmed", new_callable=AsyncMock) as mock_pubmed, \
570
  patch("src.mcp_tools.search_clinical_trials", new_callable=AsyncMock) as mock_trials, \
571
- patch("src.mcp_tools.search_biorxiv", new_callable=AsyncMock) as mock_biorxiv:
572
 
573
  mock_pubmed.return_value = "## PubMed Results"
574
  mock_trials.side_effect = Exception("API Error")
575
- mock_biorxiv.return_value = "## Preprints"
576
 
577
  result = await search_all_sources("metformin", 5)
578
 
@@ -599,10 +599,10 @@ class TestMCPDocstrings:
599
  assert search_clinical_trials.__doc__ is not None
600
  assert "Args:" in search_clinical_trials.__doc__
601
 
602
- def test_search_biorxiv_has_args_section(self) -> None:
603
  """Docstring must have Args section for MCP schema generation."""
604
- assert search_biorxiv.__doc__ is not None
605
- assert "Args:" in search_biorxiv.__doc__
606
 
607
  def test_search_all_sources_has_args_section(self) -> None:
608
  """Docstring must have Args section for MCP schema generation."""
@@ -672,7 +672,7 @@ class TestMCPServerIntegration:
672
  // %APPDATA%\Claude\claude_desktop_config.json (Windows)
673
  {
674
  "mcpServers": {
675
- "deepcritical": {
676
  "url": "http://localhost:7860/gradio_api/mcp/"
677
  }
678
  }
@@ -684,8 +684,8 @@ class TestMCPServerIntegration:
684
  ```json
685
  {
686
  "mcpServers": {
687
- "deepcritical": {
688
- "url": "https://MCP-1st-Birthday-deepcritical.hf.space/gradio_api/mcp/"
689
  }
690
  }
691
  }
@@ -696,7 +696,7 @@ class TestMCPServerIntegration:
696
  ```json
697
  {
698
  "mcpServers": {
699
- "deepcritical": {
700
  "url": "https://your-space.hf.space/gradio_api/mcp/",
701
  "headers": {
702
  "Authorization": "Bearer hf_xxxxxxxxxxxxx"
@@ -761,7 +761,7 @@ Phase 12 is **COMPLETE** when:
761
  ```
762
 
763
  2. **Show Claude Desktop using our tools**:
764
- - Open Claude Desktop with DeepCritical MCP configured
765
  - Ask: "Search PubMed for metformin Alzheimer's"
766
  - Show real results appearing
767
  - Ask: "Now search clinical trials for the same"
@@ -817,14 +817,14 @@ Phase 12 is **COMPLETE** when:
  │                        Gradio MCP Server                        │
  │                        /gradio_api/mcp/                         │
  │  ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌─────────┐ │
- │  │search_pubmed │ │search_trials │ │search_biorxiv│ │search_  │ │
  │  │              │ │              │ │              │ │all      │ │
  │  └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ └────┬────┘ │
  └─────────┼────────────────┼────────────────┼──────────────┼──────┘
            │                │                │              │
            ▼                ▼                ▼              ▼
       ┌──────────┐     ┌──────────┐     ┌──────────┐   (calls all)
- │PubMedTool│     │Trials    │     │BioRxiv   │
       │          │     │Tool      │     │Tool      │
       └──────────┘     └──────────┘     └──────────┘
  ```
 
1
  # Phase 12 Implementation Spec: MCP Server Integration
2
 
3
+ **Goal**: Expose DeepBoner search tools as MCP servers for Track 2 compliance.
4
  **Philosophy**: "MCP is the bridge between tools and LLMs."
5
  **Prerequisite**: Phase 11 complete (all search tools working)
6
  **Priority**: P0 - REQUIRED FOR HACKATHON TRACK 2
 
121
  ### 4.1 MCP Tool Wrappers (`src/mcp_tools.py`)
122
 
123
  ```python
124
+ """MCP tool wrappers for DeepBoner search tools.
125
 
126
  These functions expose our search tools via MCP protocol.
127
  Each function follows the MCP tool contract:
 
130
  - Formatted string returns
131
  """
132
 
 
133
  from src.tools.clinicaltrials import ClinicalTrialsTool
134
+ from src.tools.europepmc import EuropePMCTool
135
  from src.tools.pubmed import PubMedTool
136
 
137
 
138
  # Singleton instances (avoid recreating on each call)
139
  _pubmed = PubMedTool()
140
  _trials = ClinicalTrialsTool()
141
+ _europepmc = EuropePMCTool()
142
 
143
 
144
  async def search_pubmed(query: str, max_results: int = 10) -> str:
 
202
  return "\n".join(formatted)
203
 
204
 
205
+ async def search_europepmc(query: str, max_results: int = 10) -> str:
206
+ """Search Europe PMC for preprint and open access research.
207
 
208
+ Searches Europe PMC for preprints and open access papers.
209
  Note: Preprints are NOT peer-reviewed but contain the latest findings.
210
 
211
  Args:
 
217
  """
218
  max_results = max(1, min(50, max_results))
219
 
220
+ results = await _europepmc.search(query, max_results)
221
 
222
  if not results:
223
+ return f"No Europe PMC results found for: {query}"
224
 
225
  formatted = [f"## Preprint Results for: {query}\n"]
226
  for i, evidence in enumerate(results, 1):
 
236
  async def search_all_sources(query: str, max_per_source: int = 5) -> str:
237
  """Search all biomedical sources simultaneously.
238
 
239
+ Performs parallel search across PubMed, ClinicalTrials.gov, and Europe PMC.
240
  This is the most comprehensive search option for drug repurposing research.
241
 
242
  Args:
 
253
  # Run all searches in parallel
254
  pubmed_task = search_pubmed(query, max_per_source)
255
  trials_task = search_clinical_trials(query, max_per_source)
256
+ europepmc_task = search_europepmc(query, max_per_source)
257
 
258
+ pubmed_results, trials_results, europepmc_results = await asyncio.gather(
259
+ pubmed_task, trials_task, europepmc_task, return_exceptions=True
260
  )
261
 
262
  formatted = [f"# Comprehensive Search: {query}\n"]
 
272
  else:
273
  formatted.append(f"## Clinical Trials\n*Error: {trials_results}*\n")
274
 
275
+ if isinstance(europepmc_results, str):
276
+ formatted.append(europepmc_results)
277
  else:
278
+ formatted.append(f"## Preprints\n*Error: {europepmc_results}*\n")
279
 
280
  return "\n---\n".join(formatted)
281
  ```
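
The partial-failure behaviour of `search_all_sources` above depends on `asyncio.gather(..., return_exceptions=True)`: a failing source comes back as an exception object instead of aborting the other searches. A minimal, self-contained sketch of that pattern follows; `search_ok`, `search_broken`, and `search_all` are hypothetical stand-ins, not the real `src/mcp_tools.py` functions.

```python
import asyncio


async def search_ok(name: str, query: str) -> str:
    """Stand-in for a source that answers normally."""
    return f"## {name} Results for: {query}"


async def search_broken(name: str, query: str) -> str:
    """Stand-in for a source whose API call fails."""
    raise RuntimeError("API Error")


async def search_all(query: str) -> str:
    """Run all searches concurrently and report failures per source."""
    labels = ["PubMed", "Clinical Trials", "Preprints"]
    # return_exceptions=True turns a raised exception into a returned value,
    # so one failing source does not cancel the others.
    results = await asyncio.gather(
        search_ok("PubMed", query),
        search_broken("Clinical Trials", query),
        search_ok("Preprint", query),
        return_exceptions=True,
    )
    formatted = [f"# Comprehensive Search: {query}\n"]
    for label, result in zip(labels, results):
        if isinstance(result, str):
            formatted.append(result)
        else:
            formatted.append(f"## {label}\n*Error: {result}*\n")
    return "\n---\n".join(formatted)


if __name__ == "__main__":
    print(asyncio.run(search_all("metformin")))
```

Checking `isinstance(result, str)` mirrors the hunk above: anything that is not a formatted string is treated as an error and rendered under that source's heading.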
 
283
  ### 4.2 Update Gradio App (`src/app.py`)
284
 
285
  ```python
286
+ """Gradio UI for DeepBoner agent with MCP server support."""
287
 
288
  import os
289
  from collections.abc import AsyncGenerator
 
294
  from src.agent_factory.judges import JudgeHandler, MockJudgeHandler
295
  from src.mcp_tools import (
296
  search_all_sources,
297
+ search_europepmc,
298
  search_clinical_trials,
299
  search_pubmed,
300
  )
301
  from src.orchestrator_factory import create_orchestrator
302
+ from src.tools.europepmc import EuropePMCTool
303
  from src.tools.clinicaltrials import ClinicalTrialsTool
304
  from src.tools.pubmed import PubMedTool
305
  from src.tools.search_handler import SearchHandler
 
317
  Configured Gradio Blocks interface with MCP server enabled
318
  """
319
  with gr.Blocks(
320
+ title="DeepBoner - Drug Repurposing Research Agent",
321
  theme=gr.themes.Soft(),
322
  ) as demo:
323
  gr.Markdown("""
324
+ # DeepBoner
325
  ## AI-Powered Drug Repurposing Research Agent
326
 
327
  Ask questions about potential drug repurposing opportunities.
328
+ The agent searches PubMed, ClinicalTrials.gov, and Europe PMC preprints.
329
 
330
  **Example questions:**
331
  - "What drugs could be repurposed for Alzheimer's disease?"
 
381
 
382
  with gr.Tab("Preprints"):
383
  gr.Interface(
384
+ fn=search_europepmc,
385
  inputs=[
386
  gr.Textbox(label="Query", placeholder="long covid treatment"),
387
  gr.Slider(1, 50, value=10, step=1, label="Max Results"),
388
  ],
389
  outputs=gr.Markdown(label="Results"),
390
+ api_name="search_europepmc",
391
  )
392
 
393
  with gr.Tab("Search All"):
 
406
  **Note**: This is a research tool and should not be used for medical decisions.
407
  Always consult healthcare professionals for medical advice.
408
 
409
+ Built with PydanticAI + PubMed, ClinicalTrials.gov & Europe PMC
410
 
411
  **MCP Server**: Available at `/gradio_api/mcp/` for Claude Desktop integration
412
  """)
 
444
 
445
  from src.mcp_tools import (
446
  search_all_sources,
447
+ search_europepmc,
448
  search_clinical_trials,
449
  search_pubmed,
450
  )
 
525
  assert "Clinical Trials" in result
526
 
527
 
528
+ class TestSearchEuropePMC:
529
+ """Tests for search_europepmc MCP tool."""
530
 
531
  @pytest.mark.asyncio
532
  async def test_returns_formatted_string(self, mock_evidence: Evidence) -> None:
533
  """Should return formatted markdown string."""
534
+ mock_evidence.citation.source = "europepmc" # type: ignore
535
 
536
+ with patch("src.mcp_tools._europepmc") as mock_tool:
537
  mock_tool.search = AsyncMock(return_value=[mock_evidence])
538
 
539
+ result = await search_europepmc("preprint search", 10)
540
 
541
  assert isinstance(result, str)
542
  assert "Preprint Results" in result
 
550
  """Should combine results from all sources."""
551
  with patch("src.mcp_tools.search_pubmed", new_callable=AsyncMock) as mock_pubmed, \
552
  patch("src.mcp_tools.search_clinical_trials", new_callable=AsyncMock) as mock_trials, \
553
+ patch("src.mcp_tools.search_europepmc", new_callable=AsyncMock) as mock_europepmc:
554
 
555
  mock_pubmed.return_value = "## PubMed Results"
556
  mock_trials.return_value = "## Clinical Trials"
557
+ mock_europepmc.return_value = "## Preprints"
558
 
559
  result = await search_all_sources("metformin", 5)
560
 
 
568
  """Should handle partial failures gracefully."""
569
  with patch("src.mcp_tools.search_pubmed", new_callable=AsyncMock) as mock_pubmed, \
570
  patch("src.mcp_tools.search_clinical_trials", new_callable=AsyncMock) as mock_trials, \
571
+ patch("src.mcp_tools.search_europepmc", new_callable=AsyncMock) as mock_europepmc:
572
 
573
  mock_pubmed.return_value = "## PubMed Results"
574
  mock_trials.side_effect = Exception("API Error")
575
+ mock_europepmc.return_value = "## Preprints"
576
 
577
  result = await search_all_sources("metformin", 5)
578
 
 
599
  assert search_clinical_trials.__doc__ is not None
600
  assert "Args:" in search_clinical_trials.__doc__
601
 
602
+ def test_search_europepmc_has_args_section(self) -> None:
603
  """Docstring must have Args section for MCP schema generation."""
604
+ assert search_europepmc.__doc__ is not None
605
+ assert "Args:" in search_europepmc.__doc__
606
 
607
  def test_search_all_sources_has_args_section(self) -> None:
608
  """Docstring must have Args section for MCP schema generation."""
 
672
  // %APPDATA%\Claude\claude_desktop_config.json (Windows)
673
  {
674
  "mcpServers": {
675
+ "deepboner": {
676
  "url": "http://localhost:7860/gradio_api/mcp/"
677
  }
678
  }
 
684
  ```json
685
  {
686
  "mcpServers": {
687
+ "deepboner": {
688
+ "url": "https://your-space.hf.space/gradio_api/mcp/"
689
  }
690
  }
691
  }
 
696
  ```json
697
  {
698
  "mcpServers": {
699
+ "deepboner": {
700
  "url": "https://your-space.hf.space/gradio_api/mcp/",
701
  "headers": {
702
  "Authorization": "Bearer hf_xxxxxxxxxxxxx"
 
761
  ```
762
 
763
  2. **Show Claude Desktop using our tools**:
764
+ - Open Claude Desktop with DeepBoner MCP configured
765
  - Ask: "Search PubMed for metformin Alzheimer's"
766
  - Show real results appearing
767
  - Ask: "Now search clinical trials for the same"
 
  │                        Gradio MCP Server                        │
  │                        /gradio_api/mcp/                         │
  │  ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌─────────┐ │
+ │  │search_pubmed │ │search_trials │ │search_epmc   │ │search_  │ │
  │  │              │ │              │ │              │ │all      │ │
  │  └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ └────┬────┘ │
  └─────────┼────────────────┼────────────────┼──────────────┼──────┘
            │                │                │              │
            ▼                ▼                ▼              ▼
       ┌──────────┐     ┌──────────┐     ┌──────────┐   (calls all)
+ │PubMedTool│     │Trials    │     │EuropePMC │
       │          │     │Tool      │     │Tool      │
       └──────────┘     └──────────┘     └──────────┘
  ```
docs/implementation/13_phase_modal_integration.md CHANGED
@@ -872,7 +872,7 @@ async def main() -> None:
872
  sys.exit(1)
873
 
874
  print(f"\n{'=' * 60}")
875
- print("DeepCritical Modal Analysis Demo")
876
  print(f"Query: {args.query}")
877
  print(f"{'=' * 60}\n")
878
 
 
872
  sys.exit(1)
873
 
874
  print(f"\n{'=' * 60}")
875
+ print("DeepBoner Modal Analysis Demo")
876
  print(f"Query: {args.query}")
877
  print(f"{'=' * 60}\n")
878
 
docs/implementation/14_phase_demo_submission.md CHANGED
@@ -71,7 +71,7 @@ tags:
71
 
72
  [Show Gradio UI]
73
 
74
- "DeepCritical is an AI-powered drug repurposing research agent.
75
  It searches peer-reviewed literature, clinical trials, and cutting-edge preprints
76
  to find new uses for existing drugs."
77
 
@@ -83,7 +83,7 @@ to find new uses for existing drugs."
83
 
84
  [Type query: "Can metformin treat Alzheimer's disease?"]
85
 
86
- "When I ask about metformin for Alzheimer's, DeepCritical:
87
  1. Searches PubMed for peer-reviewed papers
88
  2. Queries ClinicalTrials.gov for active trials
89
  3. Scans bioRxiv for the latest preprints"
@@ -101,10 +101,10 @@ synthesize findings into a structured research report."
101
 
102
  [Switch to Claude Desktop]
103
 
104
- "What makes DeepCritical unique is full MCP integration.
105
  These same tools are available to any MCP client."
106
 
107
- [Show Claude Desktop with DeepCritical tools]
108
 
109
  "I can ask Claude: 'Search PubMed for aspirin cancer prevention'"
110
 
@@ -140,7 +140,7 @@ returning verdicts like SUPPORTED, REFUTED, or INCONCLUSIVE."
140
 
141
  [Return to Gradio UI]
142
 
143
- "DeepCritical brings together:
144
  - Three biomedical data sources
145
  - MCP protocol for universal tool access
146
  - Modal sandboxes for safe code execution
@@ -164,7 +164,7 @@ and let us know what you think."
164
 
165
  ```markdown
166
  ---
167
- title: DeepCritical
168
  emoji: 🧬
169
  colorFrom: blue
170
  colorTo: purple
@@ -183,7 +183,7 @@ tags:
183
  - modal
184
  ---
185
 
186
- # DeepCritical
187
 
188
  AI-Powered Drug Repurposing Research Agent
189
 
@@ -198,7 +198,7 @@ AI-Powered Drug Repurposing Research Agent
198
 
199
  Connect to our MCP server at:
200
  ```
201
- https://MCP-1st-Birthday-deepcritical.hf.space/gradio_api/mcp/
202
  ```
203
 
204
  Available tools:
@@ -214,7 +214,7 @@ Available tools:
214
 
215
  ## Links
216
 
217
- - [GitHub Repository](https://github.com/The-Obstacle-Is-The-Way/DeepCritical-1)
218
  - [Demo Video](link-to-video)
219
  ```
220
 
@@ -237,7 +237,7 @@ MODAL_TOKEN_SECRET=...
237
  ### Twitter/X Template
238
 
239
  ```
240
- 🧬 Excited to submit DeepCritical to MCP's 1st Birthday Hackathon!
241
 
242
  An AI agent that:
243
  ✅ Searches PubMed, ClinicalTrials.gov & bioRxiv
@@ -254,10 +254,10 @@ Demo: [Video link]
254
  ### LinkedIn Template
255
 
256
  ```
257
- Thrilled to share DeepCritical - our submission to MCP's 1st Birthday Hackathon!
258
 
259
  🔬 What it does:
260
- DeepCritical is an AI-powered drug repurposing research agent that searches
261
  peer-reviewed literature, clinical trials, and preprints to find new uses
262
  for existing drugs.
263
 
 
71
 
72
  [Show Gradio UI]
73
 
74
+ "DeepBoner is an AI-powered drug repurposing research agent.
75
  It searches peer-reviewed literature, clinical trials, and cutting-edge preprints
76
  to find new uses for existing drugs."
77
 
 
83
 
84
  [Type query: "Can metformin treat Alzheimer's disease?"]
85
 
86
+ "When I ask about metformin for Alzheimer's, DeepBoner:
87
  1. Searches PubMed for peer-reviewed papers
88
  2. Queries ClinicalTrials.gov for active trials
89
  3. Scans bioRxiv for the latest preprints"
 
101
 
102
  [Switch to Claude Desktop]
103
 
104
+ "What makes DeepBoner unique is full MCP integration.
105
  These same tools are available to any MCP client."
106
 
107
+ [Show Claude Desktop with DeepBoner tools]
108
 
109
  "I can ask Claude: 'Search PubMed for aspirin cancer prevention'"
110
 
 
140
 
141
  [Return to Gradio UI]
142
 
143
+ "DeepBoner brings together:
144
  - Three biomedical data sources
145
  - MCP protocol for universal tool access
146
  - Modal sandboxes for safe code execution
 
164
 
165
  ```markdown
166
  ---
167
+ title: DeepBoner
168
  emoji: 🧬
169
  colorFrom: blue
170
  colorTo: purple
 
183
  - modal
184
  ---
185
 
186
+ # DeepBoner
187
 
188
  AI-Powered Drug Repurposing Research Agent
189
 
 
198
 
199
  Connect to our MCP server at:
200
  ```
201
+ https://your-space.hf.space/gradio_api/mcp/
202
  ```
203
 
204
  Available tools:
 
214
 
215
  ## Links
216
 
217
+ - [GitHub Repository](https://github.com/The-Obstacle-Is-The-Way/DeepBoner-1)
218
  - [Demo Video](link-to-video)
219
  ```
220
 
 
237
  ### Twitter/X Template
238
 
239
  ```
240
+ 🧬 Excited to submit DeepBoner to MCP's 1st Birthday Hackathon!
241
 
242
  An AI agent that:
243
  ✅ Searches PubMed, ClinicalTrials.gov & bioRxiv
 
254
  ### LinkedIn Template
255
 
256
  ```
257
+ Thrilled to share DeepBoner - our submission to MCP's 1st Birthday Hackathon!
258
 
259
  🔬 What it does:
260
+ DeepBoner is an AI-powered drug repurposing research agent that searches
261
  peer-reviewed literature, clinical trials, and preprints to find new uses
262
  for existing drugs.
263
 
docs/implementation/roadmap.md CHANGED
@@ -1,8 +1,8 @@
1
- # Implementation Roadmap: DeepCritical (Vertical Slices)
2
 
3
  **Philosophy:** AI-Native Engineering, Vertical Slice Architecture, TDD, Modern Tooling (2025).
4
 
5
- This roadmap defines the execution strategy to deliver **DeepCritical** effectively. We reject "overplanning" in favor of **ironclad, testable vertical slices**. Each phase delivers a fully functional slice of end-to-end value.
6
 
7
  ---
8
 
@@ -114,7 +114,7 @@ tests/
114
 
115
  - [ ] Implement `src/orchestrator.py` (Connects Search + Judge loops).
116
  - [ ] Build `src/app.py` (Gradio with Streaming).
117
- - **Deliverable**: Working DeepCritical Agent on HuggingFace.
118
 
119
  ---
120
 
 
1
+ # Implementation Roadmap: DeepBoner (Vertical Slices)
2
 
3
  **Philosophy:** AI-Native Engineering, Vertical Slice Architecture, TDD, Modern Tooling (2025).
4
 
5
+ This roadmap defines the execution strategy to deliver **DeepBoner** effectively. We reject "overplanning" in favor of **ironclad, testable vertical slices**. Each phase delivers a fully functional slice of end-to-end value.
6
 
7
  ---
8
 
 
114
 
115
  - [ ] Implement `src/orchestrator.py` (Connects Search + Judge loops).
116
  - [ ] Build `src/app.py` (Gradio with Streaming).
117
+ - **Deliverable**: Working DeepBoner Agent on HuggingFace.
118
 
119
  ---
120
 
docs/index.md CHANGED
@@ -1,4 +1,4 @@
1
- # DeepCritical Documentation
2
 
3
  ## Medical Drug Repurposing Research Agent
4
 
 
1
+ # DeepBoner Documentation
2
 
3
  ## Medical Drug Repurposing Research Agent
4
 
docs/to_do/DEEP_RESEARCH_ROADMAP.md ADDED
@@ -0,0 +1,337 @@
1
+ # Deep Research Roadmap
2
+
3
+ > How to properly add GPT-Researcher-style deep research to DeepBoner
4
+ > using the EXISTING Magentic + Pydantic AI architecture.
5
+
6
+ ## Current State
7
+
8
+ We already have:
9
+
10
+ | Feature | Location | Status |
11
+ |---------|----------|--------|
12
+ | Multi-agent orchestration | `orchestrator_magentic.py` | Working |
13
+ | SearchAgent, JudgeAgent, HypothesisAgent, ReportAgent | `agents/magentic_agents.py` | Working |
14
+ | HuggingFace free tier | `agent_factory/judges.py` (HFInferenceJudgeHandler) | Working |
15
+ | Budget constraints | MagenticOrchestrator (max_round_count, max_stall_count) | Built-in |
16
+ | Simple mode (linear) | `orchestrator.py` | Working |
17
+
18
+ ## What Deep Research Adds
19
+
20
+ GPT-Researcher style "deep research" means:
21
+
22
+ 1. **Query Analysis** - Detect if query needs simple lookup vs comprehensive report
23
+ 2. **Section Planning** - Break complex query into 3-7 parallel research sections
24
+ 3. **Parallel Research** - Run multiple research loops simultaneously
25
+ 4. **Long-form Writing** - Synthesize sections into cohesive report
26
+ 5. **RAG** - Semantic search over accumulated evidence
27
+
28
+ ## Implementation Plan (TDD, Vertical Slices)
29
+
30
+ ### Phase 1: Input Parser (Est. 50-100 lines)
31
+
32
+ **Goal**: Detect research mode from query.
33
+
34
+ ```python
35
+ # src/agents/input_parser.py
36
+
37
+ class ParsedQuery(BaseModel):
38
+ original_query: str
39
+ improved_query: str
40
+ research_mode: Literal["iterative", "deep"]
41
+ key_entities: list[str]
42
+
43
+ async def parse_query(query: str) -> ParsedQuery:
44
+ """
45
+ Detect if query needs deep research.
46
+
47
+ Deep indicators:
48
+ - "comprehensive", "report", "overview", "analysis"
49
+ - Multiple topics/drugs mentioned
50
+ - Requests for sections/structure
51
+
52
+ Iterative indicators:
53
+ - Single focused question
54
+ - "what is", "how does", "find"
55
+ """
56
+ ```
57
+
58
+ **Test first**:
59
+ ```python
+ async def test_parse_query_detects_deep_mode():
+     result = await parse_query("Write a comprehensive report on Alzheimer's treatments")
+     assert result.research_mode == "deep"
+
+ async def test_parse_query_detects_iterative_mode():
+     result = await parse_query("What is the mechanism of metformin?")
+     assert result.research_mode == "iterative"
67
+ ```
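The keyword indicators listed in the `parse_query` docstring can be sketched as a plain heuristic that needs no LLM call. This is a minimal sketch, not the project's implementation: `detect_mode`, `HeuristicResult`, and `DEEP_KEYWORDS` are hypothetical names, and a stdlib dataclass stands in for the Pydantic model so the block runs on its own.

```python
import re
from dataclasses import dataclass
from typing import Literal

# Hypothetical keyword list, taken from the "deep indicators" in the docstring.
DEEP_KEYWORDS = ("comprehensive", "report", "overview", "analysis")


@dataclass
class HeuristicResult:
    research_mode: Literal["iterative", "deep"]
    matched: list[str]


def detect_mode(query: str) -> HeuristicResult:
    """Classify a query by whole-word keyword match (no LLM involved)."""
    # Tokenize so that "reported" or "analysis-free" style substrings do not match.
    words = set(re.findall(r"[a-z']+", query.lower()))
    matched = [kw for kw in DEEP_KEYWORDS if kw in words]
    return HeuristicResult("deep" if matched else "iterative", matched)
```

A heuristic like this could serve as the fallback when the LLM-based parser fails; it only covers the keyword indicators, not the "multiple topics" or "requests for structure" cases.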
68
+
69
+ **Wire in**:
70
+ ```python
71
+ # In app.py or orchestrator_factory.py
72
+ parsed = await parse_query(user_query)
73
+ if parsed.research_mode == "deep":
74
+ orchestrator = create_deep_orchestrator()
75
+ else:
76
+ orchestrator = create_orchestrator() # existing
77
+ ```
78
+
79
+ ---
80
+
81
+ ### Phase 2: Section Planner (Est. 80-120 lines)
82
+
83
+ **Goal**: Create report outline for deep research.
84
+
85
+ ```python
86
+ # src/agents/planner.py
87
+
88
+ class ReportSection(BaseModel):
89
+ title: str
90
+ query: str # Search query for this section
91
+ description: str
92
+
93
+ class ReportPlan(BaseModel):
94
+ title: str
95
+ sections: list[ReportSection]
96
+
+ # Use existing ChatAgent pattern from magentic_agents.py
+ def create_planner_agent(chat_client: OpenAIChatClient | None = None) -> ChatAgent:
+     return ChatAgent(
+         name="PlannerAgent",
+         description="Creates structured report outlines",
+         instructions="""Given a research query, create a report plan with 3-7 sections.
+ Each section should have:
+ - A clear title
+ - A focused search query
+ - Brief description of what to cover
+
+ Example for "Alzheimer's drug repurposing":
+ 1. Current Treatment Landscape
+ 2. Mechanism-Based Candidates (targeting amyloid, tau, inflammation)
+ 3. Clinical Trial Evidence
+ 4. Safety Considerations
+ 5. Emerging Research Directions
+ """,
+         chat_client=chat_client,
+     )
117
+ ```
118
+
119
+ **Test first**:
120
+ ```python
+ async def test_planner_creates_sections():
+     plan = await planner.create_plan("Comprehensive Alzheimer's drug repurposing report")
+     assert len(plan.sections) >= 3
+     assert all(s.query for s in plan.sections)
125
+ ```
126
+
127
+ **Wire in**: Used by Phase 3.
128
+
129
+ ---
130
+
131
+ ### Phase 3: Parallel Research Flow (Est. 100-150 lines)
132
+
133
+ **Goal**: Run multiple MagenticOrchestrator instances in parallel.
134
+
135
+ ```python
+ # src/orchestrator_deep.py
+ import asyncio
+ from collections.abc import AsyncGenerator
+
+ class DeepResearchOrchestrator:
+     """
+     Runs parallel research loops using EXISTING MagenticOrchestrator.
+
+     NOT a new orchestration system - just a wrapper that:
+     1. Plans sections
+     2. Runs existing orchestrator per section (in parallel)
+     3. Aggregates results
+     """
+
+     def __init__(self, max_parallel: int = 5):
+         self.planner = create_planner_agent()
+         self.max_parallel = max_parallel
+
+     async def run(self, query: str) -> AsyncGenerator[AgentEvent, None]:
+         # 1. Create plan
+         plan = await self.planner.create_plan(query)
+         yield AgentEvent(type="planning", message=f"Created {len(plan.sections)} section plan")
+
+         # 2. Run parallel research (reuse existing orchestrator!)
+         from src.orchestrator_magentic import MagenticOrchestrator
+
+         async def research_section(section: ReportSection) -> str:
+             orchestrator = MagenticOrchestrator(max_rounds=5)  # Fewer rounds per section
+             result = ""
+             async for event in orchestrator.run(section.query):
+                 if event.type == "complete":
+                     result = event.message
+             return result
+
+         # Run in parallel with semaphore
+         semaphore = asyncio.Semaphore(self.max_parallel)
+
+         async def bounded_research(section: ReportSection) -> str:
+             async with semaphore:
+                 return await research_section(section)
+
+         results = await asyncio.gather(*[
+             bounded_research(s) for s in plan.sections
+         ])
+
+         # 3. Aggregate
+         yield AgentEvent(
+             type="complete",
+             message=self._aggregate_sections(plan, results)
+         )
183
+ ```
184
+
185
+ **Key insight**: We're NOT replacing MagenticOrchestrator. We're running multiple instances of it.
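The semaphore-plus-`asyncio.gather()` pattern can be checked in isolation before wiring it to real orchestrators. The sketch below is self-contained and makes two assumptions: `run_bounded` is a hypothetical helper, and `asyncio.sleep` stands in for a per-section `orchestrator.run(...)` loop. It also records the peak number of loops in flight, so the `max_parallel` bound can be asserted in a unit test.

```python
import asyncio


async def run_bounded(queries: list[str], max_parallel: int) -> tuple[list[str], int]:
    """Run one stub research loop per query, at most max_parallel at a time.

    Returns the results in input order plus the peak number of loops
    that were in flight simultaneously.
    """
    semaphore = asyncio.Semaphore(max_parallel)
    in_flight = 0
    peak = 0

    async def research(query: str) -> str:
        nonlocal in_flight, peak
        async with semaphore:
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # stand-in for the real research loop
            in_flight -= 1
            return f"findings for {query}"

    # gather() preserves the order of its arguments in the result list.
    results = await asyncio.gather(*(research(q) for q in queries))
    return list(results), peak
```

Calling `asyncio.run(run_bounded(["a", "b", "c"], max_parallel=2))` returns the three results in input order together with the observed peak, which should never exceed 2.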
186
+
187
+ **Test first**:
188
+ ```python
189
+ @pytest.mark.integration
190
+ async def test_deep_orchestrator_runs_parallel():
191
+ orchestrator = DeepResearchOrchestrator(max_parallel=2)
192
+ events = [e async for e in orchestrator.run("Comprehensive Alzheimer's report")]
193
+ assert any(e.type == "planning" for e in events)
194
+ assert any(e.type == "complete" for e in events)
195
+ ```
196
+
197
+ ---
198
+
199
+ ### Phase 4: RAG Integration (Est. 100-150 lines)
200
+
201
+ **Goal**: Semantic search over accumulated evidence.
202
+
203
+ > **Note**: We already have `src/services/embeddings.py` (EmbeddingService) which provides
204
+ > ChromaDB + sentence-transformers with `add_evidence()` and `search_similar()` methods.
205
+ > The code below is illustrative - in practice, extend EmbeddingService or use it directly.
206
+ > See also: `src/services/llamaindex_rag.py` for OpenAI-based RAG (different use case).
207
+
208
+ ```python
209
+ # src/services/rag.py (illustrative - use EmbeddingService instead)
210
+
211
+ class RAGService:
212
+ """
213
+ Simple RAG using ChromaDB + sentence-transformers.
214
+ No LlamaIndex dependency - keep it lightweight.
215
+ """
216
+
217
+ def __init__(self):
218
+ import chromadb
219
+ from sentence_transformers import SentenceTransformer
220
+
221
+ self.client = chromadb.Client()
222
+ self.collection = self.client.get_or_create_collection("evidence")
223
+ self.encoder = SentenceTransformer("all-MiniLM-L6-v2")
224
+
225
+ def add_evidence(self, evidence: list[Evidence]) -> int:
226
+ """Add evidence to vector store, return count added."""
227
+ # Dedupe by URL
228
+ existing = set(self.collection.get()["ids"])
229
+ new_evidence = [e for e in evidence if e.citation.url not in existing]
230
+
231
+ if not new_evidence:
232
+ return 0
233
+
234
+ self.collection.add(
235
+ ids=[e.citation.url for e in new_evidence],
236
+ documents=[e.content for e in new_evidence],
237
+ metadatas=[{"title": e.citation.title, "source": e.citation.source} for e in new_evidence],
238
+ )
239
+ return len(new_evidence)
240
+
241
+ def search(self, query: str, n_results: int = 5) -> list[Evidence]:
242
+ """Semantic search for relevant evidence."""
243
+ results = self.collection.query(query_texts=[query], n_results=n_results)
244
+ # Convert back to Evidence objects
245
+ ...
246
+ ```
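One detail worth a unit test: the `add_evidence` sketch filters against IDs already in the collection, but not against the same URL appearing twice within one batch. Whether the vector store rejects such a batch is an assumption to verify against the ChromaDB version in use. The pure-Python sketch below handles both cases; `EvidenceStub` and `select_new` are hypothetical names, not part of the existing `Evidence` model or `EmbeddingService`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceStub:
    """Hypothetical stand-in for Evidence, with only the fields needed here."""
    url: str
    content: str


def select_new(existing_ids: set[str], batch: list[EvidenceStub]) -> list[EvidenceStub]:
    """Keep the first occurrence of each URL that is not already stored."""
    seen = set(existing_ids)  # copy, so the caller's set is not mutated
    fresh: list[EvidenceStub] = []
    for item in batch:
        if item.url in seen:
            continue  # already stored, or duplicated earlier in this batch
        seen.add(item.url)
        fresh.append(item)
    return fresh
```

Keeping this step as a pure function means it can be tested without ChromaDB or an embedding model installed.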
247
+
248
+ **Wire in as tool**:
249
+ ```python
250
+ # Add to SearchAgent's tools
251
+ def rag_search(query: str, n_results: int = 5) -> str:
252
+ """Search previously collected evidence for relevant information."""
253
+ service = get_rag_service()
254
+ results = service.search(query, n_results)
255
+ return format_evidence(results)
256
+
257
+ # In magentic_agents.py
258
+ ChatAgent(
259
+ tools=[search_pubmed, search_clinical_trials, search_preprints, rag_search], # ADD RAG
260
+ )
261
+ ```
262
+
263
+ ---
264
+
265
+ ### Phase 5: Long Writer (Est. 80-100 lines)
266
+
267
+ **Goal**: Write longer reports section-by-section.
268
+
269
+ ```python
270
+ # Extend existing ReportAgent or create LongWriterAgent
271
+
272
+ def create_long_writer_agent() -> ChatAgent:
273
+ return ChatAgent(
274
+ name="LongWriterAgent",
275
+ description="Writes detailed report sections with proper citations",
276
+ instructions="""Write a detailed section for a research report.
277
+
278
+ You will receive:
279
+ - Section title
280
+ - Relevant evidence/findings
281
+ - What previous sections covered (to avoid repetition)
282
+
283
+ Output:
284
+ - 500-1000 words per section
285
+ - Proper citations [1], [2], etc.
286
+ - Smooth transitions
287
+ - No repetition of earlier content
288
+ """,
289
+ tools=[get_bibliography, rag_search],
290
+ )
291
+ ```
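The "what previous sections covered" input implies a sequential loop around the writer agent. A minimal sketch of that loop follows, assuming a hypothetical `write_sections` helper and a plain callable in place of the LLM-backed agent; the context passed along here is just the list of earlier titles, whereas a real implementation might pass summaries instead.

```python
from typing import Callable

# (title, findings, previous_context) -> section text.
# In practice this callable would wrap the LLM-backed writer agent.
SectionWriter = Callable[[str, str, str], str]


def write_sections(sections: list[tuple[str, str]], writer: SectionWriter) -> str:
    """Write sections in order, telling the writer which titles are already covered."""
    written: list[str] = []
    covered: list[str] = []
    for title, findings in sections:
        context = "; ".join(covered)  # empty string for the first section
        written.append(f"## {title}\n\n{writer(title, findings, context)}")
        covered.append(title)
    return "\n\n".join(written)
```

Because each section depends on what came before, this step is sequential even though the research that feeds it runs in parallel.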
292
+
293
+ ---
294
+
295
+ ## What NOT To Build
296
+
297
+ These are REDUNDANT with the existing Magentic system:
298
+
299
+ | Component | Why Skip |
300
+ |-----------|----------|
301
+ | GraphOrchestrator | MagenticBuilder already handles agent coordination |
302
+ | BudgetTracker | MagenticBuilder has max_round_count, max_stall_count |
303
+ | WorkflowManager | asyncio.gather() + Semaphore is simpler |
304
+ | StateMachine | contextvars already used in agents/state.py |
305
+ | New agent primitives | ChatAgent pattern already works |
306
+
307
+ ## Implementation Order
308
+
309
+ ```
310
+ Week 1: Phase 1 (InputParser) - Ship it working
311
+ Week 2: Phase 2 (Planner) - Ship it working
312
+ Week 3: Phase 3 (Parallel Flow) - Ship it working
313
+ Week 4: Phase 4 (RAG) - Ship it working
314
+ Week 5: Phase 5 (LongWriter) - Ship it working
315
+ ```
316
+
317
+ Each phase:
318
+ 1. Write tests first
319
+ 2. Implement minimal code
320
+ 3. Wire into app.py
321
+ 4. Manual test
322
+ 5. PR with <200 lines
323
+ 6. Ship
324
+
325
+ ## References
326
+
327
+ - GPT-Researcher: https://github.com/assafelovic/gpt-researcher
328
+ - LangGraph patterns: https://python.langchain.com/docs/langgraph
329
+ - Our existing Magentic setup: `src/orchestrator_magentic.py`
330
+
331
+ ## Why This Approach
332
+
333
+ 1. **Builds on existing working code** - Don't replace, extend
334
+ 2. **Each phase ships value** - User sees improvement after each PR
335
+ 3. **Tests prove it works** - Not "trust me it imports"
336
+ 4. **Minimal new abstractions** - Reuse ChatAgent, MagenticOrchestrator
337
+ 5. **~500 total lines** vs 7,000 lines of parallel infrastructure
docs/to_do/REFERENCE_GRADDIO_DEMO_ANALYSIS.md ADDED
@@ -0,0 +1,229 @@
1
+ # Reference: GradioDemo Analysis
2
+
3
+ > Analysis of code from https://github.com/DeepBoner/GradioDemo
4
+ > Purpose: Extract good ideas, understand patterns, avoid mistakes
5
+
6
+ ## Overview
7
+
8
+ | Metric | Value |
9
+ |--------|-------|
10
+ | Total lines added | ~7,000 |
11
+ | New Python files | +20 |
12
+ | Test pass rate | 80% (62 errors due to missing mocks) |
13
+ | Integration status | **NOT WIRED IN** |
14
+
15
+ ## Component Catalog
16
+
17
+ ### REDUNDANT (Already have equivalent)
18
+
19
+ | Component | Lines | What We Have Instead |
20
+ |-----------|-------|---------------------|
21
+ | `orchestrator/graph_orchestrator.py` | 974 | MagenticBuilder |
22
+ | `middleware/budget_tracker.py` | 391 | MagenticBuilder max_round_count |
23
+ | `middleware/state_machine.py` | 130 | agents/state.py with contextvars |
24
+ | `middleware/workflow_manager.py` | 300 | asyncio.gather() |
25
+ | `orchestrator/research_flow.py` (IterativeResearchFlow) | 500 | MagenticOrchestrator |
26
+ | HuggingFace integration | various | HFInferenceJudgeHandler |
27
+
28
+ ### POTENTIALLY USEFUL (Ideas to cherry-pick)
29
+
30
+ #### 1. InputParser (`agents/input_parser.py` - 179 lines)
31
+
32
+ **Idea**: Detect research mode from query text.
33
+
34
+ ```python
35
+ # Key logic (simplified)
36
+ research_mode: Literal["iterative", "deep"] = "iterative"
37
+ if any(keyword in query.lower() for keyword in [
38
+ "comprehensive", "report", "sections", "analyze", "analysis", "overview", "market"
39
+ ]):
40
+ research_mode = "deep"
41
+ ```
42
+
43
+ **Good pattern**: Heuristic fallback when LLM fails.
44
+ **Our implementation**: See Phase 1 in DEEP_RESEARCH_ROADMAP.md
45
+
46
+ #### 2. PlannerAgent (`orchestrator/planner_agent.py` - 184 lines)
47
+
48
+ **Idea**: LLM creates section outline for report.
49
+
50
+ ```python
51
+ class ReportPlan(BaseModel):
52
+ title: str
53
+ sections: list[ReportSection]
54
+ estimated_time_minutes: int
55
+
56
+ class ReportSection(BaseModel):
57
+ title: str
58
+ query: str
59
+ description: str
60
+ priority: int
61
+ ```
62
+
63
+ **Good pattern**: Structured output with Pydantic models.
64
+ **Our implementation**: See Phase 2 in DEEP_RESEARCH_ROADMAP.md
65
+
66
+ #### 3. DeepResearchFlow (`orchestrator/research_flow.py` - 500 lines)
67
+
68
+ **Idea**: Run parallel research loops per section.
69
+
70
+ ```python
71
+ # Their pattern (simplified)
72
+ async def run_parallel_loops(sections: list[ReportSection]):
73
+ tasks = [run_single_loop(s) for s in sections]
74
+ results = await asyncio.gather(*tasks, return_exceptions=True)
75
+ ```
76
+
77
+ **Problem**: They built new IterativeResearchFlow instead of reusing MagenticOrchestrator.
78
+ **Our implementation**: Just run multiple MagenticOrchestrator instances.
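When `return_exceptions=True` is used as in the snippet above, `asyncio.gather()` returns exception objects in place of results, so the aggregation step has to handle them. The sketch below shows one way to do that under stated assumptions: `research_section` is a stub that fails on purpose for one title, and `build_report` is a hypothetical aggregator, not an existing function in either codebase.

```python
import asyncio


async def research_section(title: str) -> str:
    """Stub for one per-section research loop; 'Safety' fails on purpose."""
    await asyncio.sleep(0)
    if title == "Safety":
        raise RuntimeError("search backend timed out")
    return f"Findings about {title.lower()}."


async def build_report(report_title: str, section_titles: list[str]) -> str:
    """Run all sections concurrently and assemble a Markdown report."""
    results = await asyncio.gather(
        *(research_section(t) for t in section_titles),
        return_exceptions=True,  # failures come back as values, not raises
    )
    parts = [f"# {report_title}"]
    for title, result in zip(section_titles, results):
        if isinstance(result, BaseException):
            body = f"_Section unavailable: {result}_"
        else:
            body = result
        parts.append(f"## {title}\n\n{body}")
    return "\n\n".join(parts)
```

With this shape, one failed section degrades the report instead of discarding the sections that did complete.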
79
+
80
+ #### 4. LlamaIndex RAG (`services/llamaindex_rag.py` - 454 lines)
81
+
82
+ **Idea**: Semantic search over collected evidence.
83
+
84
+ ```python
85
+ # Their approach
86
+ class LlamaIndexRAGService:
87
+ def __init__(self):
88
+ # ChromaDB + LlamaIndex + HuggingFace embeddings
89
+ self.vector_store = ChromaVectorStore(...)
90
+ self.index = VectorStoreIndex(...)
91
+
92
+ def retrieve(self, query: str, top_k: int = 5) -> list[dict]:
93
+ retriever = VectorIndexRetriever(index=self.index, similarity_top_k=top_k)
94
+ return retriever.retrieve(query)
95
+ ```
96
+
97
+ **Good**: Full-featured RAG with multiple embedding providers.
98
+ **Simpler alternative**: Direct ChromaDB + sentence-transformers (no LlamaIndex).
99
+ **Our implementation**: See Phase 4 in DEEP_RESEARCH_ROADMAP.md
100
+
101
+ #### 5. LongWriterAgent (`agents/long_writer.py` - ~300 lines)
102
+
103
+ **Idea**: Write reports section-by-section to handle length.
104
+
105
+ ```python
106
+ class SectionOutput(BaseModel):
107
+ section_content: str
108
+ references: list[str]
109
+ next_section_context: str # What to avoid repeating
110
+
111
+ async def write_next_section(
112
+ section_title: str,
113
+ findings: str,
114
+ previous_sections: str, # Avoid repetition
115
+ ) -> SectionOutput:
116
+ ```
117
+
118
+ **Good pattern**: Passing context to avoid repetition.
119
+ **Our implementation**: See Phase 5 in DEEP_RESEARCH_ROADMAP.md
120
+
121
+ #### 6. ProofreaderAgent (`agents/proofreader.py` - ~200 lines)
122
+
123
+ **Idea**: Final cleanup pass on report.
124
+
125
+ ```python
126
+ # Tasks:
127
+ # 1. Remove duplicate information
128
+ # 2. Fix citation numbering
129
+ # 3. Add executive summary
130
+ # 4. Ensure consistent formatting
131
+ ```
132
+
133
+ **Good pattern**: Separate concerns - writer writes, proofreader polishes.
134
+ **Our implementation**: Optional Phase 6 if needed.
135
+
136
+ ### Graph Architecture (Educational Reference)
137
+
138
+ The graph system is well-designed in theory:
139
+
140
+ ```python
141
+ # Node types
142
+ class AgentNode(GraphNode):
143
+ agent: Any # Pydantic AI agent
144
+ input_transformer: Callable # Transform input
145
+ output_transformer: Callable # Transform output
146
+
147
+ class DecisionNode(GraphNode):
148
+ decision_function: Callable[[Any], str] # Returns next node ID
149
+ options: list[str]
150
+
151
+ class ParallelNode(GraphNode):
152
+ parallel_nodes: list[str] # Run these in parallel
153
+ aggregator: Callable # Combine results
154
+
155
+ # Graph structure
156
+ class ResearchGraph:
157
+ nodes: dict[str, GraphNode]
158
+ edges: dict[str, list[GraphEdge]]
159
+ entry_node: str
160
+ exit_nodes: list[str]
161
+ ```
162
+
163
+ **Why we don't need it**: MagenticBuilder already provides:
164
+ - Agent coordination via manager
165
+ - Conditional routing (manager decides)
166
+ - Multiple participants
167
+
168
+ This is essentially reimplementing what `agent-framework` already does.
169
+
170
+ ## Key Lessons
171
+
172
+ ### What Went Wrong
173
+
174
+ 1. **Parallel architecture** - Built new system instead of extending existing
175
+ 2. **Horizontal sprawl** - All infrastructure, nothing wired in
176
+ 3. **Test mocking** - Tests don't mock API clients properly
177
+ 4. **No manual testing** - Code never ran end-to-end
178
+
179
+ ### What To Learn From
180
+
181
+ 1. **Pydantic models for structured output** - Good pattern
182
+ 2. **Heuristic fallbacks** - When LLM fails, have a fallback
183
+ 3. **Section-by-section writing** - For long reports
184
+ 4. **RAG for evidence retrieval** - Useful for large evidence sets
185
+
186
+ ### The 7,000 Line vs 500 Line Comparison
187
+
188
+ **Their approach**:
189
+ - New GraphOrchestrator (974 lines)
190
+ - New ResearchFlow (999 lines)
191
+ - New BudgetTracker (391 lines)
192
+ - New StateMachine (130 lines)
193
+ - New WorkflowManager (300 lines)
194
+ - New agents (InputParser, Writer, LongWriter, Proofreader, etc.)
195
+ - Total: ~7,000 lines, not integrated
196
+
197
+ **Our approach**:
198
+ - InputParser (50-100 lines) - extends existing
199
+ - PlannerAgent (80-120 lines) - uses ChatAgent pattern
200
+ - DeepOrchestrator (100-150 lines) - wraps MagenticOrchestrator
201
+ - RAGService (100-150 lines) - simple ChromaDB
202
+ - LongWriter (80-100 lines) - extends ReportAgent
203
+ - Total: ~500 lines, each phase ships working
204
+
205
+ ## File Locations (for reference)
206
+
207
+ ```
+ reference_repos/GradioDemo/src/
+ ├── orchestrator/
+ │   ├── graph_orchestrator.py   # 974 lines - graph execution
+ │   ├── research_flow.py        # 999 lines - iterative/deep flows
+ │   └── planner_agent.py        # 184 lines - section planning
+ ├── agents/
+ │   ├── input_parser.py         # 179 lines - query analysis
+ │   ├── writer.py               # 210 lines - report writing
+ │   ├── long_writer.py          # ~300 lines - section writing
+ │   ├── proofreader.py          # ~200 lines - cleanup
+ │   └── knowledge_gap.py        # gap detection
+ ├── middleware/
+ │   ├── budget_tracker.py       # 391 lines - token/time tracking
+ │   ├── state_machine.py        # 130 lines - workflow state
+ │   └── workflow_manager.py     # 300 lines - parallel loop mgmt
+ ├── services/
+ │   └── llamaindex_rag.py       # 454 lines - RAG service
+ ├── tools/
+ │   └── rag_tool.py             # 191 lines - RAG as search tool
+ └── agent_factory/
+     └── graph_builder.py        # ~400 lines - graph construction
229
+ ```
docs/workflow-diagrams.md CHANGED
@@ -1,4 +1,4 @@
1
- # DeepCritical Workflow - Simplified Magentic Architecture
2
 
3
  > **Architecture Pattern**: Microsoft Magentic Orchestration
4
  > **Design Philosophy**: Simple, dynamic, manager-driven coordination
@@ -475,7 +475,7 @@ stateDiagram-v2
475
 
476
  ```mermaid
477
  graph TD
478
- App[Gradio App<br/>DeepCritical Research Agent]
479
 
480
  App --> Input[Input Section]
481
  App --> Status[Status Section]
@@ -514,7 +514,7 @@ graph TD
514
 
515
  ```mermaid
516
  graph LR
517
- User[👀 Researcher<br/>Asks research questions] -->|Submits query| DC[DeepCritical<br/>Magentic Workflow]
518
 
519
  DC -->|Literature search| PubMed[PubMed API<br/>Medical papers]
520
  DC -->|Preprint search| ArXiv[arXiv API<br/>Scientific preprints]
@@ -549,7 +549,7 @@ graph LR
549
 
550
  ```mermaid
551
  gantt
552
- title DeepCritical Magentic Workflow - Typical Execution
553
  dateFormat mm:ss
554
  axisFormat %M:%S
555
 
 
1
+ # DeepBoner Workflow - Simplified Magentic Architecture
2
 
3
  > **Architecture Pattern**: Microsoft Magentic Orchestration
4
  > **Design Philosophy**: Simple, dynamic, manager-driven coordination
 
475
 
476
  ```mermaid
477
  graph TD
478
+ App[Gradio App<br/>DeepBoner Research Agent]
479
 
480
  App --> Input[Input Section]
481
  App --> Status[Status Section]
 
514
 
515
  ```mermaid
516
  graph LR
517
+ User[👀 Researcher<br/>Asks research questions] -->|Submits query| DC[DeepBoner<br/>Magentic Workflow]
518
 
519
  DC -->|Literature search| PubMed[PubMed API<br/>Medical papers]
520
  DC -->|Preprint search| ArXiv[arXiv API<br/>Scientific preprints]
 
549
 
550
  ```mermaid
551
  gantt
552
+ title DeepBoner Magentic Workflow - Typical Execution
553
  dateFormat mm:ss
554
  axisFormat %M:%S
555
 
examples/README.md CHANGED
@@ -1,4 +1,4 @@
1
- # DeepCritical Examples
2
 
3
  **NO MOCKS. NO FAKE DATA. REAL SCIENCE.**
4
 
@@ -181,4 +181,4 @@ Mocks belong in `tests/unit/`, not in demos. When you run these examples, you se
181
  - Real scientific hypotheses
182
  - Real research reports
183
 
184
- This is what DeepCritical actually does. No fake data. No canned responses.
 
1
+ # DeepBoner Examples
2
 
3
  **NO MOCKS. NO FAKE DATA. REAL SCIENCE.**
4
 
 
181
  - Real scientific hypotheses
182
  - Real research reports
183
 
184
+ This is what DeepBoner actually does. No fake data. No canned responses.
examples/embeddings_demo/run_embeddings.py CHANGED
@@ -35,7 +35,7 @@ def create_fresh_service(name_suffix: str = "") -> EmbeddingService:
35
  async def demo_real_pipeline() -> None:
36
  """Run the demo using REAL PubMed data."""
37
  print("\n" + "=" * 60)
38
- print("DeepCritical Embeddings Demo (REAL DATA)")
39
  print("=" * 60)
40
 
41
  # 1. Fetch Real Data
 
35
  async def demo_real_pipeline() -> None:
36
  """Run the demo using REAL PubMed data."""
37
  print("\n" + "=" * 60)
38
+ print("DeepBoner Embeddings Demo (REAL DATA)")
39
  print("=" * 60)
40
 
41
  # 1. Fetch Real Data
examples/full_stack_demo/run_full.py CHANGED
@@ -1,6 +1,6 @@
1
  #!/usr/bin/env python3
2
  """
3
- Demo: Full Stack DeepCritical Agent (Phases 1-8).
4
 
5
  This script demonstrates the COMPLETE REAL drug repurposing research pipeline:
6
  - Phase 2: REAL Search (PubMed + ClinicalTrials + Europe PMC)
@@ -104,7 +104,7 @@ async def _handle_judge_step(
104
 
105
  async def run_full_demo(query: str, max_iterations: int) -> None:
106
  """Run the REAL full stack pipeline."""
107
- print_header("DeepCritical Full Stack Demo (REAL)")
108
  print(f"Query: {query}")
109
  print(f"Max iterations: {max_iterations}")
110
  print("Mode: REAL (All live API calls - no mocks)\n")
@@ -172,7 +172,7 @@ async def run_full_demo(query: str, max_iterations: int) -> None:
172
  async def main() -> None:
173
  """Entry point."""
174
  parser = argparse.ArgumentParser(
175
- description="DeepCritical Full Stack Demo - REAL, No Mocks",
176
  formatter_class=argparse.RawDescriptionHelpFormatter,
177
  epilog="""
178
  This demo runs the COMPLETE pipeline with REAL API calls:
@@ -222,7 +222,7 @@ Examples:
222
  await run_full_demo(args.query, args.iterations)
223
 
224
  print("\n" + "=" * 70)
225
- print(" DeepCritical Full Stack Demo Complete!")
226
  print(" ")
227
  print(" Everything you just saw was REAL:")
228
  print(" - Real PubMed + ClinicalTrials + Europe PMC searches")
 
1
  #!/usr/bin/env python3
2
  """
3
+ Demo: Full Stack DeepBoner Agent (Phases 1-8).
4
 
5
  This script demonstrates the COMPLETE REAL drug repurposing research pipeline:
6
  - Phase 2: REAL Search (PubMed + ClinicalTrials + Europe PMC)
 
104
 
105
  async def run_full_demo(query: str, max_iterations: int) -> None:
106
  """Run the REAL full stack pipeline."""
107
+ print_header("DeepBoner Full Stack Demo (REAL)")
108
  print(f"Query: {query}")
109
  print(f"Max iterations: {max_iterations}")
110
  print("Mode: REAL (All live API calls - no mocks)\n")
 
172
  async def main() -> None:
173
  """Entry point."""
174
  parser = argparse.ArgumentParser(
175
+ description="DeepBoner Full Stack Demo - REAL, No Mocks",
176
  formatter_class=argparse.RawDescriptionHelpFormatter,
177
  epilog="""
178
  This demo runs the COMPLETE pipeline with REAL API calls:
 
222
  await run_full_demo(args.query, args.iterations)
223
 
224
  print("\n" + "=" * 70)
225
+ print(" DeepBoner Full Stack Demo Complete!")
226
  print(" ")
227
  print(" Everything you just saw was REAL:")
228
  print(" - Real PubMed + ClinicalTrials + Europe PMC searches")
examples/hypothesis_demo/run_hypothesis.py CHANGED
@@ -31,7 +31,7 @@ async def run_hypothesis_demo(query: str) -> None:
31
  """Run the REAL hypothesis generation pipeline."""
32
  try:
33
  print(f"\n{'=' * 60}")
34
- print("DeepCritical Hypothesis Agent Demo (Phase 7)")
35
  print(f"Query: {query}")
36
  print("Mode: REAL (Live API calls)")
37
  print(f"{'=' * 60}\n")
 
31
  """Run the REAL hypothesis generation pipeline."""
32
  try:
33
  print(f"\n{'=' * 60}")
34
+ print("DeepBoner Hypothesis Agent Demo (Phase 7)")
35
  print(f"Query: {query}")
36
  print("Mode: REAL (Live API calls)")
37
  print(f"{'=' * 60}\n")
examples/modal_demo/run_analysis.py CHANGED
@@ -32,7 +32,7 @@ async def main() -> None:
32
  sys.exit(1)
33
 
34
  print(f"\n{'=' * 60}")
35
- print("DeepCritical Modal Analysis Demo")
36
  print(f"Query: {args.query}")
37
  print(f"{'=' * 60}\n")
38
 
 
32
  sys.exit(1)
33
 
34
  print(f"\n{'=' * 60}")
35
+ print("DeepBoner Modal Analysis Demo")
36
  print(f"Query: {args.query}")
37
  print(f"{'=' * 60}\n")
38
 
examples/orchestrator_demo/run_agent.py CHANGED
@@ -1,6 +1,6 @@
1
  #!/usr/bin/env python3
2
  """
3
- Demo: DeepCritical Agent Loop (Search + Judge + Orchestrator).
4
 
5
  This script demonstrates the REAL Phase 4 orchestration:
6
  - REAL Iterative Search (PubMed + ClinicalTrials + Europe PMC)
@@ -36,7 +36,7 @@ MAX_ITERATIONS = 10
36
  async def main() -> None:
37
  """Run the REAL agent demo."""
38
  parser = argparse.ArgumentParser(
39
- description="DeepCritical Agent Demo - REAL, No Mocks",
40
  formatter_class=argparse.RawDescriptionHelpFormatter,
41
  epilog="""
42
  This demo runs the REAL search-judge-synthesize loop:
@@ -72,7 +72,7 @@ Examples:
72
  sys.exit(1)
73
 
74
  print(f"\n{'=' * 60}")
75
- print("DeepCritical Agent Demo (REAL)")
76
  print(f"Query: {args.query}")
77
  print(f"Max Iterations: {args.iterations}")
78
  print("Mode: REAL (All live API calls)")
 
1
  #!/usr/bin/env python3
2
  """
3
+ Demo: DeepBoner Agent Loop (Search + Judge + Orchestrator).
4
 
5
  This script demonstrates the REAL Phase 4 orchestration:
6
  - REAL Iterative Search (PubMed + ClinicalTrials + Europe PMC)
 
36
  async def main() -> None:
37
  """Run the REAL agent demo."""
38
  parser = argparse.ArgumentParser(
39
+ description="DeepBoner Agent Demo - REAL, No Mocks",
40
  formatter_class=argparse.RawDescriptionHelpFormatter,
41
  epilog="""
42
  This demo runs the REAL search-judge-synthesize loop:
 
72
  sys.exit(1)
73
 
74
  print(f"\n{'=' * 60}")
75
+ print("DeepBoner Agent Demo (REAL)")
76
  print(f"Query: {args.query}")
77
  print(f"Max Iterations: {args.iterations}")
78
  print("Mode: REAL (All live API calls)")
examples/orchestrator_demo/run_magentic.py CHANGED
@@ -1,6 +1,6 @@
1
  #!/usr/bin/env python3
2
  """
3
- Demo: Magentic-One Orchestrator for DeepCritical.
4
 
5
  This script demonstrates Phase 5 functionality:
6
  - Multi-Agent Coordination (Searcher + Judge + Manager)
@@ -27,7 +27,7 @@ from src.utils.models import OrchestratorConfig
27
 
28
  async def main() -> None:
29
  """Run the magentic agent demo."""
30
- parser = argparse.ArgumentParser(description="Run DeepCritical Magentic Agent")
31
  parser.add_argument("query", help="Research query (e.g., 'metformin cancer')")
32
  parser.add_argument("--iterations", type=int, default=10, help="Max rounds")
33
  args = parser.parse_args()
@@ -40,7 +40,7 @@ async def main() -> None:
40
  sys.exit(1)
41
 
42
  print(f"\n{'=' * 60}")
43
- print("DeepCritical Magentic Agent Demo")
44
  print(f"Query: {args.query}")
45
  print("Mode: MAGENTIC (Multi-Agent)")
46
  print(f"{'=' * 60}\n")
 
1
  #!/usr/bin/env python3
2
  """
3
+ Demo: Magentic-One Orchestrator for DeepBoner.
4
 
5
  This script demonstrates Phase 5 functionality:
6
  - Multi-Agent Coordination (Searcher + Judge + Manager)
 
27
 
28
  async def main() -> None:
29
  """Run the magentic agent demo."""
30
+ parser = argparse.ArgumentParser(description="Run DeepBoner Magentic Agent")
31
  parser.add_argument("query", help="Research query (e.g., 'metformin cancer')")
32
  parser.add_argument("--iterations", type=int, default=10, help="Max rounds")
33
  args = parser.parse_args()
 
40
  sys.exit(1)
41
 
42
  print(f"\n{'=' * 60}")
43
+ print("DeepBoner Magentic Agent Demo")
44
  print(f"Query: {args.query}")
45
  print("Mode: MAGENTIC (Multi-Agent)")
46
  print(f"{'=' * 60}\n")
examples/search_demo/run_search.py CHANGED
@@ -30,7 +30,7 @@ from src.tools.search_handler import SearchHandler
30
  async def main(query: str) -> None:
31
  """Run search demo with the given query."""
32
  print(f"\n{'=' * 60}")
33
- print("DeepCritical Search Demo")
34
  print(f"Query: {query}")
35
  print(f"{'=' * 60}\n")
36
 
 
30
  async def main(query: str) -> None:
31
  """Run search demo with the given query."""
32
  print(f"\n{'=' * 60}")
33
+ print("DeepBoner Search Demo")
34
  print(f"Query: {query}")
35
  print(f"{'=' * 60}\n")
36
 
main.py CHANGED
@@ -1,5 +1,5 @@
1
  def main():
2
- print("Hello from deepcritical!")
3
 
4
 
5
  if __name__ == "__main__":
 
1
  def main():
2
+ print("Hello from deepboner!")
3
 
4
 
5
  if __name__ == "__main__":
pyproject.toml CHANGED
@@ -1,7 +1,7 @@
1
  [project]
2
- name = "deepcritical"
3
  version = "0.1.0"
4
- description = "AI-Native Drug Repurposing Research Agent"
5
  readme = "README.md"
6
  requires-python = ">=3.11"
7
  dependencies = [
@@ -126,6 +126,18 @@ markers = [
126
  "integration: Integration tests (real APIs)",
127
  "slow: Slow tests",
128
  ]
 
 
 
 
 
 
 
 
 
 
 
 
129
 
130
  # ============== COVERAGE CONFIG ==============
131
  [tool.coverage.run]
 
1
  [project]
2
+ name = "deepboner"
3
  version = "0.1.0"
4
+ description = "AI-Native Sexual Health Research Agent"
5
  readme = "README.md"
6
  requires-python = ">=3.11"
7
  dependencies = [
 
126
  "integration: Integration tests (real APIs)",
127
  "slow: Slow tests",
128
  ]
129
+ # Filter warnings from unittest.mock introspecting Pydantic models.
130
+ # This is a known upstream issue: https://github.com/pydantic/pydantic/issues/9927
131
+ # When autospec=True, mock.py accesses deprecated Pydantic attributes during introspection.
132
+ # We filter these specifically because it's NOT our code triggering deprecations.
133
+ filterwarnings = [
134
+ # Pydantic 2.0 deprecations triggered by mock introspection
135
+ "ignore:The `__fields__` attribute is deprecated:pydantic.warnings.PydanticDeprecatedSince20",
136
+ "ignore:The `__fields_set__` attribute is deprecated:pydantic.warnings.PydanticDeprecatedSince20",
137
+ # Pydantic 2.11 deprecations triggered by mock introspection
138
+ "ignore:Accessing the 'model_computed_fields' attribute on the instance is deprecated:pydantic.warnings.PydanticDeprecatedSince211",
139
+ "ignore:Accessing the 'model_fields' attribute on the instance is deprecated:pydantic.warnings.PydanticDeprecatedSince211",
140
+ ]
141
 
142
  # ============== COVERAGE CONFIG ==============
143
  [tool.coverage.run]
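
Note on the `filterwarnings` entries added above: pytest reads each string as `action:message:category`, and in the ini option the message part is a regular expression that must match the start of the warning text, case-insensitively. The sketch below reproduces that matching with only the standard library `warnings` module. `FakePydanticDeprecation` and `collect_unfiltered` are hypothetical stand-ins for illustration, not Pydantic or project names.

```python
import warnings


class FakePydanticDeprecation(DeprecationWarning):
    """Hypothetical stand-in for pydantic.warnings.PydanticDeprecatedSince20."""


def collect_unfiltered(messages: list[str]) -> list[str]:
    """Emit each message as a warning and return the ones that survive the filter."""
    with warnings.catch_warnings(record=True) as caught:
        # Record everything by default...
        warnings.simplefilter("always")
        # ...except warnings whose text starts with this pattern. This mirrors the
        # ini entry "ignore:The `__fields__` attribute is deprecated:<category>".
        warnings.filterwarnings(
            "ignore",
            message=r"The `__fields__` attribute is deprecated",
            category=FakePydanticDeprecation,
        )
        for text in messages:
            warnings.warn(text, FakePydanticDeprecation, stacklevel=2)
    return [str(w.message) for w in caught]


remaining = collect_unfiltered(
    [
        "The `__fields__` attribute is deprecated, use `model_fields` instead.",
        "Some other deprecation that should still be reported.",
    ]
)
print(remaining)
```

Because the pattern is anchored to the start of the message and tied to one category, unrelated deprecations still surface, which is the point of filtering narrowly rather than ignoring `DeprecationWarning` wholesale.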
src/agent_factory/judges.py CHANGED
@@ -9,7 +9,7 @@ from huggingface_hub import InferenceClient
9
  from pydantic_ai import Agent
10
  from pydantic_ai.models.anthropic import AnthropicModel
11
  from pydantic_ai.models.huggingface import HuggingFaceModel
12
- from pydantic_ai.models.openai import OpenAIModel
13
  from pydantic_ai.providers.anthropic import AnthropicProvider
14
  from pydantic_ai.providers.huggingface import HuggingFaceProvider
15
  from pydantic_ai.providers.openai import OpenAIProvider
@@ -48,7 +48,7 @@ def get_model() -> Any:
48
  logger.warning("Unknown LLM provider, defaulting to OpenAI", provider=llm_provider)
49
 
50
  openai_provider = OpenAIProvider(api_key=settings.openai_api_key)
51
- return OpenAIModel(settings.openai_model, provider=openai_provider)
52
 
53
 
54
  class JudgeHandler:
 
9
  from pydantic_ai import Agent
10
  from pydantic_ai.models.anthropic import AnthropicModel
11
  from pydantic_ai.models.huggingface import HuggingFaceModel
12
+ from pydantic_ai.models.openai import OpenAIChatModel
13
  from pydantic_ai.providers.anthropic import AnthropicProvider
14
  from pydantic_ai.providers.huggingface import HuggingFaceProvider
15
  from pydantic_ai.providers.openai import OpenAIProvider
 
48
  logger.warning("Unknown LLM provider, defaulting to OpenAI", provider=llm_provider)
49
 
50
  openai_provider = OpenAIProvider(api_key=settings.openai_api_key)
51
+ return OpenAIChatModel(settings.openai_model, provider=openai_provider)
52
 
53
 
54
  class JudgeHandler:
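
Note on the `OpenAIModel` to `OpenAIChatModel` change above: the commit swaps the import outright. For an environment that cannot yet upgrade, one possible alternative is to resolve whichever name the installed module exports. The helper below is a hypothetical sketch, not part of this repository or of pydantic-ai, and it is demonstrated on the standard library so it runs without pydantic-ai installed.

```python
import importlib
from typing import Any


def import_first(module_name: str, candidates: list[str]) -> Any:
    """Return the first attribute named in `candidates` that `module_name` exports."""
    module = importlib.import_module(module_name)
    for name in candidates:
        # Preferred names come first, so a newer class wins over a deprecated alias.
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError(f"{module_name} exports none of {candidates}")


# Standard-library demonstration: the first candidate does not exist,
# so the helper falls through to the second.
ordered_dict_cls = import_first("collections", ["NoSuchClass", "OrderedDict"])
print(ordered_dict_cls.__name__)
```

With pydantic-ai installed, the corresponding call would presumably be `import_first("pydantic_ai.models.openai", ["OpenAIChatModel", "OpenAIModel"])`. Whether a given release still exports the older name is an assumption to check against that release.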
src/app.py CHANGED
@@ -1,4 +1,4 @@
1
- """Gradio UI for DeepCritical agent with MCP server support."""
2
 
3
  import os
4
  from collections.abc import AsyncGenerator
@@ -197,29 +197,31 @@ def create_demo() -> gr.ChatInterface:
197
  # 1. Unwrapped ChatInterface (Fixes Accordion Bug)
198
  demo = gr.ChatInterface(
199
  fn=research_agent,
200
- title="🧬 DeepCritical",
201
  description=(
202
- "*AI-Powered Drug Repurposing Agent β€” searches PubMed, "
203
  "ClinicalTrials.gov & Europe PMC*\n\n"
 
 
204
  "---\n"
205
  "*Research tool only β€” not for medical advice.* \n"
206
  "**MCP Server Active**: Connect Claude Desktop to `/gradio_api/mcp/`"
207
  ),
208
  examples=[
209
  [
210
- "What drugs could be repurposed for Alzheimer's disease?",
211
  "simple",
212
  "",
213
  "openai",
214
  ],
215
  [
216
- "Is metformin effective for treating cancer?",
217
  "simple",
218
  "",
219
  "openai",
220
  ],
221
  [
222
- "What medications show promise for Long COVID treatment?",
223
  "simple",
224
  "",
225
  "openai",
 
1
+ """Gradio UI for DeepBoner agent with MCP server support."""
2
 
3
  import os
4
  from collections.abc import AsyncGenerator
 
197
  # 1. Unwrapped ChatInterface (Fixes Accordion Bug)
198
  demo = gr.ChatInterface(
199
  fn=research_agent,
200
+ title="πŸ† DeepBoner",
201
  description=(
202
+ "*AI-Powered Sexual Health Research Agent β€” searches PubMed, "
203
  "ClinicalTrials.gov & Europe PMC*\n\n"
204
+ "Deep research for sexual wellness, ED treatments, hormone therapy, "
205
+ "libido, and reproductive health - for all genders.\n\n"
206
  "---\n"
207
  "*Research tool only β€” not for medical advice.* \n"
208
  "**MCP Server Active**: Connect Claude Desktop to `/gradio_api/mcp/`"
209
  ),
210
  examples=[
211
  [
212
+ "What drugs improve female libido post-menopause?",
213
  "simple",
214
  "",
215
  "openai",
216
  ],
217
  [
218
+ "Clinical trials for erectile dysfunction alternatives to PDE5 inhibitors?",
219
  "simple",
220
  "",
221
  "openai",
222
  ],
223
  [
224
+ "Evidence for testosterone therapy in women with HSDD?",
225
  "simple",
226
  "",
227
  "openai",
src/mcp_tools.py CHANGED
@@ -1,4 +1,4 @@
1
- """MCP tool wrappers for DeepCritical search tools.
2
 
3
  These functions expose our search tools via MCP protocol.
4
  Each function follows the MCP tool contract:
 
1
+ """MCP tool wrappers for DeepBoner search tools.
2
 
3
  These functions expose our search tools via MCP protocol.
4
  Each function follows the MCP tool contract:
src/services/__init__.py CHANGED
@@ -1 +1 @@
1
- """Services for DeepCritical."""
 
1
+ """Services for DeepBoner."""
src/services/llamaindex_rag.py CHANGED
@@ -1,6 +1,10 @@
1
  """LlamaIndex RAG service for evidence retrieval and indexing.
2
 
3
  Requires optional dependencies: uv sync --extra modal
 
 
 
 
4
  """
5
 
6
  from typing import Any
@@ -25,7 +29,7 @@ class LlamaIndexRAGService:
25
 
26
  def __init__(
27
  self,
28
- collection_name: str = "deepcritical_evidence",
29
  persist_dir: str | None = None,
30
  embedding_model: str | None = None,
31
  similarity_top_k: int = 5,
@@ -34,7 +38,8 @@ class LlamaIndexRAGService:
34
  Initialize LlamaIndex RAG service.
35
 
36
  Args:
37
- collection_name: Name of the ChromaDB collection
 
38
  persist_dir: Directory to persist ChromaDB data
39
  embedding_model: OpenAI embedding model (defaults to settings.openai_embedding_model)
40
  similarity_top_k: Number of top results to retrieve
@@ -248,7 +253,7 @@ class LlamaIndexRAGService:
248
 
249
 
250
  def get_rag_service(
251
- collection_name: str = "deepcritical_evidence",
252
  **kwargs: Any,
253
  ) -> LlamaIndexRAGService:
254
  """
 
1
  """LlamaIndex RAG service for evidence retrieval and indexing.
2
 
3
  Requires optional dependencies: uv sync --extra modal
4
+
5
+ Migration Note (v1.0 rebrand):
6
+ Default collection_name changed from "deepcritical_evidence" to "deepboner_evidence".
7
+ To preserve existing data, explicitly pass collection_name="deepcritical_evidence".
8
  """
9
 
10
  from typing import Any
 
29
 
30
  def __init__(
31
  self,
32
+ collection_name: str = "deepboner_evidence",
33
  persist_dir: str | None = None,
34
  embedding_model: str | None = None,
35
  similarity_top_k: int = 5,
 
38
  Initialize LlamaIndex RAG service.
39
 
40
  Args:
41
+ collection_name: Name of the ChromaDB collection (default changed from
42
+ "deepcritical_evidence" to "deepboner_evidence" in v1.0 rebrand)
43
  persist_dir: Directory to persist ChromaDB data
44
  embedding_model: OpenAI embedding model (defaults to settings.openai_embedding_model)
45
  similarity_top_k: Number of top results to retrieve
 
253
 
254
 
255
  def get_rag_service(
256
+ collection_name: str = "deepboner_evidence",
257
  **kwargs: Any,
258
  ) -> LlamaIndexRAGService:
259
  """
src/tools/clinicaltrials.py CHANGED
@@ -75,7 +75,7 @@ class ClinicalTrialsTool:
75
  requests.get,
76
  self.BASE_URL,
77
  params=params,
78
- headers={"User-Agent": "DeepCritical-Research-Agent/1.0"},
79
  timeout=30,
80
  )
81
  response.raise_for_status()
 
75
  requests.get,
76
  self.BASE_URL,
77
  params=params,
78
+ headers={"User-Agent": "DeepBoner-Research-Agent/1.0"},
79
  timeout=30,
80
  )
81
  response.raise_for_status()
src/tools/code_execution.py CHANGED
@@ -109,10 +109,10 @@ class ModalCodeExecutor:
109
 
110
  try:
111
  # Create or lookup Modal app
112
- app = modal.App.lookup("deepcritical-code-execution", create_if_missing=True)
113
 
114
  # Define scientific computing image with common libraries
115
- scientific_image = modal.Image.debian_slim(python_version="3.11").uv_pip_install(
116
  *get_sandbox_library_list()
117
  )
118
 
 
109
 
110
  try:
111
  # Create or lookup Modal app
112
+ app = modal.App.lookup("deepboner-code-execution", create_if_missing=True)
113
 
114
  # Define scientific computing image with common libraries
115
+ scientific_image = modal.Image.debian_slim(python_version="3.11").pip_install(
116
  *get_sandbox_library_list()
117
  )
118
 
src/utils/exceptions.py CHANGED
@@ -1,25 +1,25 @@
1
- """Custom exceptions for DeepCritical."""
2
 
3
 
4
- class DeepCriticalError(Exception):
5
- """Base exception for all DeepCritical errors."""
6
 
7
  pass
8
 
9
 
10
- class SearchError(DeepCriticalError):
11
  """Raised when a search operation fails."""
12
 
13
  pass
14
 
15
 
16
- class JudgeError(DeepCriticalError):
17
  """Raised when the judge fails to assess evidence."""
18
 
19
  pass
20
 
21
 
22
- class ConfigurationError(DeepCriticalError):
23
  """Raised when configuration is invalid."""
24
 
25
  pass
@@ -29,3 +29,7 @@ class RateLimitError(SearchError):
29
  """Raised when we hit API rate limits."""
30
 
31
  pass
 
 
 
 
 
1
+ """Custom exceptions for DeepBoner."""
2
 
3
 
4
+ class DeepBonerError(Exception):
5
+ """Base exception for all DeepBoner errors."""
6
 
7
  pass
8
 
9
 
10
+ class SearchError(DeepBonerError):
11
  """Raised when a search operation fails."""
12
 
13
  pass
14
 
15
 
16
+ class JudgeError(DeepBonerError):
17
  """Raised when the judge fails to assess evidence."""
18
 
19
  pass
20
 
21
 
22
+ class ConfigurationError(DeepBonerError):
23
  """Raised when configuration is invalid."""
24
 
25
  pass
 
29
  """Raised when we hit API rate limits."""
30
 
31
  pass
32
+
33
+
34
+ # Backwards compatibility alias
35
+ DeepCriticalError = DeepBonerError
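
Note on the backwards compatibility alias above: since `DeepCriticalError` is bound to the same class object as `DeepBonerError`, existing `except DeepCriticalError:` handlers keep catching every subclass. A minimal self-contained sketch of that behaviour follows. The class definitions mirror the diff, while `classify` is a hypothetical helper added only for the demonstration.

```python
class DeepBonerError(Exception):
    """Base exception, mirroring the definition in the diff."""


class SearchError(DeepBonerError):
    """Raised when a search operation fails."""


# Backwards compatibility alias: one class object, two names.
DeepCriticalError = DeepBonerError


def classify(exc_type: type[Exception]) -> str:
    """Raise `exc_type` and report whether a legacy handler catches it."""
    try:
        raise exc_type("boom")
    except DeepCriticalError:
        return "caught via legacy name"
    except Exception:
        return "not caught"


print(classify(SearchError))
print(classify(ValueError))
print(DeepCriticalError.__name__)
```

One consequence worth noting: the alias does not create a second class, so tracebacks and `__name__` report `DeepBonerError` even when the code raised `DeepCriticalError`.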
tests/unit/agent_factory/test_judges_factory.py CHANGED
@@ -10,7 +10,7 @@ from pydantic_ai.models.anthropic import AnthropicModel
10
  # We expect this import to exist after we implement it, or we mock it if it's not there yet
11
  # For TDD, we assume we will use the library class
12
  from pydantic_ai.models.huggingface import HuggingFaceModel
13
- from pydantic_ai.models.openai import OpenAIModel
14
 
15
  from src.agent_factory.judges import get_model
16
 
@@ -28,7 +28,7 @@ def test_get_model_openai(mock_settings):
28
  mock_settings.openai_model = "gpt-5.1"
29
 
30
  model = get_model()
31
- assert isinstance(model, OpenAIModel)
32
  assert model.model_name == "gpt-5.1"
33
 
34
 
@@ -61,4 +61,4 @@ def test_get_model_default_fallback(mock_settings):
61
  mock_settings.openai_model = "gpt-5.1"
62
 
63
  model = get_model()
64
- assert isinstance(model, OpenAIModel)
 
10
  # We expect this import to exist after we implement it, or we mock it if it's not there yet
11
  # For TDD, we assume we will use the library class
12
  from pydantic_ai.models.huggingface import HuggingFaceModel
13
+ from pydantic_ai.models.openai import OpenAIChatModel
14
 
15
  from src.agent_factory.judges import get_model
16
 
 
28
  mock_settings.openai_model = "gpt-5.1"
29
 
30
  model = get_model()
31
+ assert isinstance(model, OpenAIChatModel)
32
  assert model.model_name == "gpt-5.1"
33
 
34
 
 
61
  mock_settings.openai_model = "gpt-5.1"
62
 
63
  model = get_model()
64
+ assert isinstance(model, OpenAIChatModel)
tests/unit/agents/test_hypothesis_agent.py CHANGED
@@ -3,10 +3,19 @@
3
  from unittest.mock import AsyncMock, MagicMock, patch
4
 
5
  import pytest
6
- from agent_framework import AgentRunResponse
7
 
8
- from src.agents.hypothesis_agent import HypothesisAgent
9
- from src.utils.models import Citation, Evidence, HypothesisAssessment, MechanismHypothesis
 
 
 
 
 
 
 
 
 
 
10
 
11
 
12
  @pytest.fixture
 
3
  from unittest.mock import AsyncMock, MagicMock, patch
4
 
5
  import pytest
 
6
 
7
+ # Skip all tests if agent_framework not installed (optional dep)
8
+ pytest.importorskip("agent_framework")
9
+
10
+ from agent_framework import AgentRunResponse # noqa: E402
11
+
12
+ from src.agents.hypothesis_agent import HypothesisAgent # noqa: E402
13
+ from src.utils.models import ( # noqa: E402
14
+ Citation,
15
+ Evidence,
16
+ HypothesisAssessment,
17
+ MechanismHypothesis,
18
+ )
19
 
20
 
21
  @pytest.fixture