# DeepBoner Context

## Project Overview

DeepBoner is an AI-native Sexual Health Research Agent. Its goal is to accelerate research into sexual health, wellness, and reproductive medicine by intelligently searching biomedical literature (PubMed, ClinicalTrials.gov, Europe PMC), evaluating evidence, and synthesizing findings.

- **Architecture:** Vertical Slice Architecture (Search -> Judge -> Orchestrator), developed under strict TDD (Test-Driven Development).
- **Current Status:** Phases 1-14 COMPLETE (Foundation through Demo Submission).
## Tech Stack & Tooling

- **Language:** Python 3.11 (pinned)
- **Package Manager:** `uv` (Rust-based, extremely fast)
- **Frameworks:** `pydantic`, `pydantic-ai`, `httpx`, `gradio[mcp]`
- **Vector DB:** `chromadb` with `sentence-transformers` for semantic search
- **Testing:** `pytest`, `pytest-asyncio`, `respx` (for mocking)
- **Quality:** `ruff` (linting/formatting), `mypy` (strict type checking), `pre-commit`
## Building & Running

| Command | Description |
|---|---|
| `make install` | Install dependencies and pre-commit hooks. |
| `make test` | Run unit tests. |
| `make lint` | Run the Ruff linter. |
| `make format` | Run the Ruff formatter. |
| `make typecheck` | Run the Mypy static type checker. |
| `make check` | The Golden Gate: runs lint, typecheck, and test. Must pass before committing. |
| `make clean` | Clean up caches and build artifacts. |
## Directory Structure

- `src/`: Source code
  - `utils/`: Shared utilities (`config.py`, `exceptions.py`, `models.py`)
  - `tools/`: Search tools (`pubmed.py`, `clinicaltrials.py`, `europepmc.py`, `code_execution.py`)
  - `services/`: Services (`embeddings.py`, `statistical_analyzer.py`)
  - `agents/`: Magentic multi-agent mode agents
  - `agent_factory/`: Agent definitions (judges, prompts)
  - `mcp_tools.py`: MCP tool wrappers for Claude Desktop integration
  - `app.py`: Gradio UI with MCP server
- `tests/`: Test suite
  - `unit/`: Isolated unit tests (mocked)
  - `integration/`: Real API tests (marked as slow/integration)
- `docs/`: Documentation and implementation specs
- `examples/`: Working demos for each phase
## Key Components

- `src/orchestrators/`: Unified orchestrator package
  - `advanced.py`: Main orchestrator (handles both Free and Paid tiers)
  - `factory.py`: Auto-selects backend based on API key presence
  - `langgraph_orchestrator.py`: LangGraph-based workflow (experimental)
- `src/clients/`: LLM backend adapters
  - `factory.py`: Auto-selects OpenAI (if key present) or HuggingFace (free)
  - `huggingface.py`: HuggingFace adapter for the free tier
- `src/tools/pubmed.py`: PubMed E-utilities search
- `src/tools/clinicaltrials.py`: ClinicalTrials.gov API
- `src/tools/europepmc.py`: Europe PMC search
- `src/tools/search_handler.py`: Scatter-gather orchestration
- `src/services/embeddings.py`: Local embeddings (sentence-transformers, in-memory)
- `src/services/llamaindex_rag.py`: Premium embeddings (OpenAI, persistent ChromaDB)
- `src/services/embedding_protocol.py`: Protocol interface for embedding services
- `src/services/research_memory.py`: Shared memory layer for research state
- `src/utils/service_loader.py`: Tiered service selection (free vs. premium)
- `src/mcp_tools.py`: MCP tool wrappers
- `src/app.py`: Gradio UI (HuggingFace Spaces) with MCP server
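The scatter-gather orchestration in `search_handler.py` can be sketched roughly as follows. The per-source stubs and function names here are hypothetical stand-ins, not the project's actual code; the real tools call PubMed, ClinicalTrials.gov, and Europe PMC over HTTP.

```python
import asyncio

# Hypothetical per-source search stubs standing in for the real HTTP tools.
async def search_pubmed(query: str) -> list[str]:
    return [f"pubmed:{query}"]

async def search_clinicaltrials(query: str) -> list[str]:
    return [f"ctgov:{query}"]

async def search_europepmc(query: str) -> list[str]:
    return [f"epmc:{query}"]

async def scatter_gather(query: str) -> list[str]:
    """Fan the query out to all sources concurrently, then merge results,
    dropping any source that failed instead of failing the whole search."""
    tasks = [search_pubmed(query), search_clinicaltrials(query), search_europepmc(query)]
    results = await asyncio.gather(*tasks, return_exceptions=True)
    merged: list[str] = []
    for r in results:
        if not isinstance(r, BaseException):
            merged.extend(r)
    return merged
```

`asyncio.gather(..., return_exceptions=True)` keeps one flaky source (e.g. a 500 from an upstream API) from sinking the entire search, which matters when three independent services are queried per iteration.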
## Configuration

Settings are loaded via `pydantic-settings` from `.env`:

- `LLM_PROVIDER`: `"openai"` or `"huggingface"`
- `OPENAI_API_KEY`: OpenAI API key for LLM calls
- `NCBI_API_KEY`: Optional; grants higher PubMed rate limits
- `MAX_ITERATIONS`: 1-50, default 10
- `LOG_LEVEL`: DEBUG, INFO, WARNING, or ERROR
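For example, a free-tier `.env` might look like this (values illustrative):

```ini
LLM_PROVIDER=huggingface
# Optional: higher PubMed rate limits
NCBI_API_KEY=your-ncbi-key
MAX_ITERATIONS=10
LOG_LEVEL=INFO
```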
## LLM Model Defaults (December 2025)

Default models in `src/utils/config.py`:

- OpenAI: `gpt-5` (flagship model)
- HuggingFace (Free Tier): `Qwen/Qwen2.5-7B-Instruct` (see the critical note below)
## ⚠️ OpenAI API Keys

If you have a valid OpenAI API key, it will work. Period.

- BYOK (Bring Your Own Key) auto-detects the `sk-...` prefix and routes to OpenAI
- If you get errors, the key is invalid or expired - NOT an access tier issue
- NEVER suggest "access tier" or "upgrade your plan" - this is not how OpenAI works for API keys
- Valid keys work. Invalid keys don't. That's it.
## ⚠️ CRITICAL: HuggingFace Free Tier Architecture

THIS IS IMPORTANT - READ BEFORE CHANGING THE FREE TIER MODEL

HuggingFace has TWO execution paths for inference:

| Path | Host | Reliability | Model Size |
|---|---|---|---|
| Native Serverless | HuggingFace infrastructure | ✅ High | < 30B params |
| Inference Providers | Third-party (Novita, Hyperbolic) | ❌ Unreliable | 70B+ params |

**The Trap:** When you request a large model (70B+) without a paid API key, HuggingFace silently routes the request to third-party providers. These providers have shown:

- 500 Internal Server Errors (Novita - current)
- 401 "Staging Mode" auth failures (Hyperbolic - past)

**The Rule:** The Free Tier MUST use models < 30B to stay on native infrastructure.
Current Safe Models (Dec 2025):

| Model | Size | Status |
|---|---|---|
| `Qwen/Qwen2.5-7B-Instruct` | 7B | ✅ DEFAULT - Native, reliable |
| `mistralai/Mistral-Nemo-Instruct-2407` | 12B | ✅ Native, reliable |
| `Qwen/Qwen2.5-72B-Instruct` | 72B | ❌ Routed to Novita (500 errors) |
| `meta-llama/Llama-3.1-70B-Instruct` | 70B | ❌ Routed to Hyperbolic (401 errors) |

See `HF_FREE_TIER_ANALYSIS.md` for the full analysis.
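One way to enforce this rule is an allowlist guard checked before instantiating the free-tier client. This is a sketch with hypothetical names, mirroring the safe-models table; it is not the project's actual config:

```python
# Models known to stay on HuggingFace's native serverless path (< 30B).
# Hypothetical allowlist; keep it in sync with the safe-models table.
FREE_TIER_SAFE_MODELS = {
    "Qwen/Qwen2.5-7B-Instruct",              # 7B, default
    "mistralai/Mistral-Nemo-Instruct-2407",  # 12B
}

def validate_free_tier_model(model_id: str) -> str:
    """Reject models that would be silently routed to third-party
    providers (Novita/Hyperbolic) and fail with 500/401 errors."""
    if model_id not in FREE_TIER_SAFE_MODELS:
        raise ValueError(
            f"{model_id!r} is not free-tier safe: models over ~30B get "
            "routed off HuggingFace's native infrastructure."
        )
    return model_id
```

Failing fast at configuration time is preferable here because the third-party routing failure mode is silent and only surfaces as opaque 500/401 errors at inference time.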
## Development Conventions

- **Strict TDD:** Write failing tests in `tests/unit/` before implementing logic in `src/`.
- **Type Safety:** All code must pass `mypy --strict`. Use Pydantic models for data exchange.
- **Linting:** Zero tolerance for Ruff errors.
- **Mocking:** Use `respx` or `unittest.mock` for all external API calls in unit tests.
- **Vertical Slices:** Implement features end-to-end rather than layer-by-layer.
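As a minimal illustration of the mocking convention using stdlib `unittest.mock` (the suite also uses `respx` to mock `httpx` calls; the function and payload shape below are hypothetical, not the project's actual tool API):

```python
from unittest.mock import Mock

def search(query: str, fetch) -> list[str]:
    """Hypothetical search tool that takes its HTTP fetcher as a
    dependency so unit tests never touch the network."""
    payload = fetch(query)
    return payload["esearchresult"]["idlist"]

# Unit-test style: stub the fetcher, assert on behavior, verify the call.
fake_fetch = Mock(return_value={"esearchresult": {"idlist": ["123", "456"]}})
assert search("sildenafil", fetch=fake_fetch) == ["123", "456"]
fake_fetch.assert_called_once_with("sildenafil")
```

Under strict TDD, the assertions are written first (and fail) before `search` is implemented.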
## Git Workflow

- `main`: Production-ready (GitHub)
- `dev`: Development integration (GitHub)
- Remote `origin`: GitHub (source of truth for PRs/code review)
- Remote `huggingface-upstream`: HuggingFace Spaces (deployment target)

HuggingFace Spaces Collaboration:

- Each contributor should use their own dev branch: `yourname-dev` (e.g., `vcms-dev`, `mario-dev`)
- DO NOT push directly to `main` or `dev` on HuggingFace - these can easily be overwritten
- GitHub is the source of truth; HuggingFace is for deployment/demo
- Consider using git hooks to prevent accidental pushes to protected branches
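Since git hooks can be any executable, such a guard could be sketched in Python (install as `.git/hooks/pre-push` and mark executable). The remote and branch names come from the conventions above; the structure is illustrative, not an existing project script. Git invokes `pre-push` with the remote name and URL as arguments and one line per pushed ref on stdin:

```python
import sys

# Refs that must never be pushed to the HuggingFace remote directly.
PROTECTED_REFS = {"refs/heads/main", "refs/heads/dev"}
GUARDED_REMOTE = "huggingface-upstream"

def is_blocked(remote: str, remote_ref: str) -> bool:
    """True if this push should be rejected by the hook."""
    return remote == GUARDED_REMOTE and remote_ref in PROTECTED_REFS

def main(argv: list[str], lines: list[str]) -> int:
    """Each stdin line: '<local ref> <local sha> <remote ref> <remote sha>'."""
    remote = argv[1] if len(argv) > 1 else ""
    for line in lines:
        parts = line.split()
        if len(parts) == 4 and is_blocked(remote, parts[2]):
            print(f"Blocked: push to {parts[2]} on HuggingFace. "
                  "Use your personal yourname-dev branch instead.", file=sys.stderr)
            return 1
    return 0

# As an installed hook, finish the script with:
#   sys.exit(main(sys.argv, sys.stdin.readlines()))
```

A non-zero exit code from `pre-push` makes git abort the push, so accidental pushes to `main`/`dev` on the HuggingFace remote fail locally before anything is overwritten.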