# Embeddings Brainstorm - Conclusions **Date**: November 2025 **Status**: CLOSED - Conclusions reached, no action needed --- ## The Question Should DeepBoner implement: 1. Internal codebase embeddings/ingestion pipeline? 2. mGREP for internal tool selection? 3. Self-knowledge components for agents? ## The Answer: NO After research and first-principles analysis, the conclusion is clear: ### Why Not Internal Embeddings/Ingestion ```text DeepBoner's Core Task: ┌─────────────────────────────────────────────────────────┐ │ User Query: "Evidence for testosterone in HSDD?" │ │ ↓ │ │ 1. Search PubMed, ClinicalTrials, Europe PMC │ │ 2. Judge: Is evidence sufficient? │ │ 3. Synthesize: Generate report │ │ ↓ │ │ Output: Research report with citations │ └─────────────────────────────────────────────────────────┘ Does ANY step require self-knowledge of codebase? NO. ``` ### Why Not mGREP for Tool Selection | Approach | Complexity | Accuracy | |----------|------------|----------| | Embeddings + mGREP for tool selection | High | Medium (semantic similarity ≠ correct tool) | | Direct prompting with tool descriptions | Low | High (LLM reasons about applicability) | **No real agent system uses embeddings for tool selection.** All major frameworks (LangChain, OpenAI, Anthropic, Magentic) use prompt-based tool selection because: 1. LLMs are already doing semantic matching internally 2. Tool count is small (5-20) - fits easily in context 3. Prompts allow reasoning, not just similarity ### What We Already Have DeepBoner already uses embeddings for the **right thing**: research evidence retrieval. - `src/services/embeddings.py` - ChromaDB + sentence-transformers - `src/services/llamaindex_rag.py` - OpenAI embeddings for premium tier ### The Real Priority Instead of internal embeddings/mGREP, focus on: 1. **Deduplication** across PubMed/Europe PMC/OpenAlex 2. **Outcome measures** from ClinicalTrials.gov 3. **Citation graph traversal** via OpenAlex See: `TOOL_ANALYSIS_CRITICAL.md` for detailed improvement roadmap. --- ## Research Sources - [SICA Paper (ICLR 2025)](https://arxiv.org/abs/2504.15228) - Self-improving agents - [Gödel Agent (ACL 2025)](https://arxiv.org/abs/2410.04444) - Recursive self-modification - [Introspection Paradox (EMNLP 2025)](https://aclanthology.org/2025.emnlp-main.352/) - Self-knowledge can hurt performance - [Anthropic Introspection Research](https://www.anthropic.com/research/introspection) - ~20% accuracy on genuine introspection --- *This document is closed. The conclusion is: don't implement internal embeddings/mGREP for this use case.*