VibecoderMcSwaggins committed on
Commit
c6e9843
·
unverified ·
1 Parent(s): 4337145

fix(P0): Implement Accumulator Pattern to resolve Repr Bug (#117)


* docs: Add P1 bug doc for Simple Mode removal breaking Free Tier UX

SPEC-16 Unified Architecture removed Simple Mode, forcing all users
to Advanced Mode. When no API key is provided, Advanced Mode falls back
to HuggingFace Free Tier which triggers upstream agent-framework repr
bug (#2562).

Options documented:
A) Wait for upstream fix (PR #2566)
B) Restore Simple Mode for free tier
C) Current workaround in _extract_text()

* docs: Update P1 bug doc and SPEC-16 with rollback warning

CRITICAL: Simple Mode was deleted BEFORE verifying Advanced+HF worked.

Problem:
- Upstream agent-framework has repr bug (#2562)
- Advanced Mode + HuggingFace = garbage output
- Simple Mode (the working fallback) was deleted prematurely

Bug doc updates:
- Added "What Went Wrong" timeline
- Added Gradio UI confusion analysis (examples vs chat button)
- Recommendation: Restore Simple Mode as fallback

SPEC-16 updates:
- Status changed to "PARTIALLY IMPLEMENTED - ROLLBACK REQUIRED"
- Added critical warning about premature deletion
- Links to P1 bug doc for action items

* docs: CRITICAL - Simple Mode is NOT being deleted

This commit makes it CRYSTAL CLEAR across all documentation:

⛔ SIMPLE MODE IS NOT BEING DELETED - NON-NEGOTIABLE

What went wrong:
- SPEC-16 was supposed to INTEGRATE Simple Mode, not DELETE it
- simple.py was deleted BEFORE verifying Advanced+HF worked
- Upstream agent-framework has repr bug (#2562)
- Free tier users now have no working fallback

Required actions:
1. RESTORE simple.py from git history or MCP reference
2. KEEP Simple Mode as free-tier fallback indefinitely
3. Use Advanced Mode ONLY for paid API key users
4. Wait for upstream #2566 to merge before reconsidering

Updated files:
- SPEC_16: Status changed to "ON HOLD", added warning
- P0_SIMPLE_MODE_FORCED_SYNTHESIS_BYPASS: Changed to "Patch simple.py"
- ACTIVE_BUGS: Marked Simple Mode issues as OPEN

* docs: DO NOT use MCP reference repo - it's buggy

Updated all docs and GitHub issues to clarify:

1. DO NOT restore from MCP reference repo - has known bugs
2. Git revert in THIS repo MAY be possible - review for bugs first
3. Clean implementation preferred if old code is too buggy
4. Goal is WORKING Simple Mode, not blindly restored buggy code

Files updated:
- ACTIVE_BUGS.md
- SPEC_16_UNIFIED_CHAT_CLIENT_ARCHITECTURE.md
- P1_SIMPLE_MODE_REMOVED_BREAKS_FREE_TIER_UX.md

GitHub issues updated:
- #105: Added warning about reference repo
- #113: Added warning about reference repo

* docs: Clarify UNIFIED architecture with Simple Mode INTEGRATED

- NOT two parallel universes/orchestrators
- ONE codebase handles all tiers (free + paid)
- Simple Mode behavior INTEGRATED, not separate
- Blocked by upstream bug #2562, waiting for PR #2566

* docs: Add architecture documentation for unified system

- Current state: Advanced Mode only, simple.py deleted
- Goal: ONE unified architecture (not parallel universes)
- Simple Mode INTEGRATED via HuggingFaceChatClient
- Blocked by upstream #2562, waiting for PR #2566
- Includes path forward for all scenarios

* docs: Update all bug docs for unified architecture consistency

- ACTIVE_BUGS.md: Consolidated free tier issue as single P0 blocker
- P0_SIMPLE_MODE_FORCED_SYNTHESIS_BYPASS.md: Simplified - bug fixed by unification
- All docs now consistently say: ONE unified architecture, NOT parallel universes
- Simple Mode behavior INTEGRATED via HuggingFaceChatClient
- simple.py is DELETED, not being restored

* docs: FINAL - Clear terminology, framework integration documented

Architecture:
- No API Key (Free) → HuggingFace backend
- API Key (Paid) → OpenAI backend
- ONE codebase, different backends, no "modes"

Framework Stack:
- Microsoft Agent Framework = orchestration (routes agents)
- Pydantic AI = structured outputs (validates data)
- Both work TOGETHER, not mutually exclusive

Blocked by upstream #2562, waiting for PR #2566.

All docs and GitHub issues now use consistent terminology.

* docs: Fix root-level docs for unified architecture

- CLAUDE.md: Remove simple.py reference, update orchestrator description
- AGENTS.md: Same fix
- GEMINI.md: Same fix
- README.md: "Two Modes" → "Unified Architecture" + Free/Paid Tier

All root docs now consistent with unified architecture:
- ONE orchestrator (advanced.py) for all users
- Auto-selects backend: OpenAI (if key) or HuggingFace (free)
- No more "Simple Mode" vs "Advanced Mode" terminology

* fix(orchestrator): implement Accumulator Pattern to resolve Repr Bug (P0)

Implements SPEC-17 to fix the P0 'Repr Bug' where agent messages displayed raw Python object strings.

Changes:
- Implemented Accumulator Pattern in AdvancedOrchestrator to use streaming deltas as the source of truth for text content.
- Added fallback logic to handle tool-only turns safely without exposing internal object representations.
- Refactored to reduce complexity (PLR0915) by extracting helper methods.
- Added comprehensive unit tests (tests/unit/orchestrators/test_accumulator_pattern.py) verifying the fix against mocked upstream events.
- Updated documentation with SPEC-17 and Root Cause Analysis.
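Stripped of the framework, the Accumulator Pattern this commit implements reduces to: buffer delta text, consume the buffer on the completion signal, and never read the completed message object. A minimal self-contained sketch — the event classes below are stand-ins for the real `MagenticAgent*` events, not the framework's API:

```python
from dataclasses import dataclass


@dataclass
class DeltaEvent:
    """Stand-in for MagenticAgentDeltaEvent: carries streamed text."""
    text: str


@dataclass
class MessageEvent:
    """Stand-in for MagenticAgentMessageEvent: completion signal only."""
    message: object  # intentionally never read (reading it triggers the repr bug)


def accumulate(events) -> list[str]:
    """Return one completed message per MessageEvent, built only from deltas."""
    buffer, completed = "", []
    for event in events:
        if isinstance(event, DeltaEvent):
            buffer += event.text  # deltas are the sole source of truth
        elif isinstance(event, MessageEvent):
            # Tool-only turns produce no deltas; fall back to a safe label
            completed.append(buffer or "Action completed (Tool Call)")
            buffer = ""  # reset for the next agent turn
    return completed
```

For example, `accumulate([DeltaEvent("Hel"), DeltaEvent("lo"), MessageEvent(None)])` yields `["Hello"]`, while a tool-only turn `[MessageEvent(None)]` yields the fallback label instead of a repr string.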

* docs: Add analysis for Gradio Example vs Chat Arrow behavior

- Documented the analysis of user-reported discrepancies between Example Click and Chat Arrow outputs.
- Confirmed that both actions utilize the same code path, with differences attributed to timing rather than divergent code.
- Identified the root cause as an upstream representation issue, linking to related documentation for further context.
- Provided verification steps and next actions regarding the upstream bug fix.

* fix(tests): isolate accumulator pattern tests to prevent module pollution

Refactors tests/unit/orchestrators/test_accumulator_pattern.py to use scoped fixtures for patching sys.modules instead of global module-level patching. This prevents side effects on other tests (like test_advanced_events.py and test_chat_client_factory.py).

Changes:
- Moved mock setup into 'mock_agent_framework' fixture.
- Implemented module reloading logic for 'src.orchestrators.advanced' to ensure it picks up mocks during isolation tests and real modules afterwards.
- Updated MockOrchestratorMessageEvent signature to match real class (added 'message' arg).
- Verified all 20 related tests pass together.
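The scoped-fixture approach amounts to patching `sys.modules` inside a context that is torn down after each test, instead of mutating it at import time. A minimal sketch of the mechanism using `unittest.mock.patch.dict` (the module name here is hypothetical, not the project's real fixture):

```python
import sys
from types import ModuleType
from unittest.mock import patch


def run_with_mocked_module():
    """Patch a fake module into sys.modules only for the duration of the block."""
    fake = ModuleType("agent_framework_fake")
    fake.MARKER = "mocked"

    with patch.dict(sys.modules, {"agent_framework_fake": fake}):
        import agent_framework_fake  # resolves to the mock inside the block
        inside = agent_framework_fake.MARKER

    # After the block, patch.dict restores sys.modules -- later tests see no pollution
    outside = "agent_framework_fake" in sys.modules
    return inside, outside
```

Wrapped in a pytest fixture, the same `patch.dict` context gives each test a clean `sys.modules`, which is why the globally patched version leaked into `test_advanced_events.py` and this scoped version does not.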

* fix: Address CodeRabbit review feedback

- Add `text` language identifier to ASCII diagram code blocks (MD041)
- Fix broken URL typo: togithub.com → github.com
- Remove unreachable dead code for MagenticAgentMessageEvent and
MagenticAgentDeltaEvent handlers in _process_event() (handled by
Accumulator Pattern in run() loop with continue statements)

* fix: Address all CodeRabbit review feedback

- Use synthesis_result.text instead of str() for AgentRunResponse
- Add Literal return type to _get_event_type_for_agent (eliminates type: ignore)
- Add `@pytest.mark.unit` markers to accumulator tests
- Add `text` language identifier to code fence in P0_SIMPLE_MODE doc
- Update P0_REPR_BUG checklist to reflect completed dead code removal
- Fix test mock to return object with .text property (matches AgentRunResponse API)
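The `Literal` return-type change can be shown in isolation: annotating the return value as a closed set of strings lets the type checker verify every call site without a `type: ignore`. A sketch with hypothetical agent names (the real mapping lives in `_get_event_type_for_agent`):

```python
from typing import Literal

# Closed set of event types the orchestrator can emit
EventType = Literal["search", "judge", "report"]


def get_event_type_for_agent(agent_name: str) -> EventType:
    """Map an agent name to one of a fixed set of event types."""
    name = agent_name.lower()
    if "search" in name:
        return "search"
    if "judge" in name:
        return "judge"
    return "report"
```

Because every `return` is a member of the declared `Literal`, mypy/pyright can prove the result is always a valid event type, which is what eliminated the previous `type: ignore`.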

* docs: Fix markdown lint (blank line before code fence)

.gitignore CHANGED

```diff
@@ -50,6 +50,8 @@ reference_repos/pydanticai-research-agent/
 reference_repos/pubmed-mcp-server/
 reference_repos/DeepCritical/
 reference_repos/GradioDemo/
+reference_repos/deepboner-hf-space/
+reference_repos/microsoft-agent-framework/
 
 # Keep the README in reference_repos
 !reference_repos/README.md
```
AGENTS.md CHANGED

```diff
@@ -50,10 +50,13 @@ Research Report with Citations
 
 **Key Components**:
 
-- `src/orchestrators/` - Orchestrator package (simple, advanced, langgraph modes)
-  - `simple.py` - Main search-and-judge loop
-  - `advanced.py` - Multi-agent Magentic mode
-  - `langgraph_orchestrator.py` - LangGraph-based workflow
+- `src/orchestrators/` - Unified orchestrator package
+  - `advanced.py` - Main orchestrator (handles both Free and Paid tiers)
+  - `factory.py` - Auto-selects backend based on API key presence
+  - `langgraph_orchestrator.py` - LangGraph-based workflow (experimental)
+- `src/clients/` - LLM backend adapters
+  - `factory.py` - Auto-selects: OpenAI (if key) or HuggingFace (free)
+  - `huggingface.py` - HuggingFace adapter for free tier
 - `src/tools/pubmed.py` - PubMed E-utilities search
 - `src/tools/clinicaltrials.py` - ClinicalTrials.gov API
 - `src/tools/europepmc.py` - Europe PMC search
```
CLAUDE.md CHANGED

```diff
@@ -50,10 +50,13 @@ Research Report with Citations
 
 **Key Components**:
 
-- `src/orchestrators/` - Orchestrator package (simple, advanced, langgraph modes)
-  - `simple.py` - Main search-and-judge loop
-  - `advanced.py` - Multi-agent Magentic mode
-  - `langgraph_orchestrator.py` - LangGraph-based workflow
+- `src/orchestrators/` - Unified orchestrator package
+  - `advanced.py` - Main orchestrator (handles both Free and Paid tiers)
+  - `factory.py` - Auto-selects backend based on API key presence
+  - `langgraph_orchestrator.py` - LangGraph-based workflow (experimental)
+- `src/clients/` - LLM backend adapters
+  - `factory.py` - Auto-selects: OpenAI (if key) or HuggingFace (free)
+  - `huggingface.py` - HuggingFace adapter for free tier
 - `src/tools/pubmed.py` - PubMed E-utilities search
 - `src/tools/clinicaltrials.py` - ClinicalTrials.gov API
 - `src/tools/europepmc.py` - Europe PMC search
```
GEMINI.md CHANGED

```diff
@@ -50,10 +50,13 @@ The project follows a **Vertical Slice Architecture** (Search -> Judge -> Orches
 
 ## Key Components
 
-- `src/orchestrators/` - Orchestrator package (simple, advanced, langgraph modes)
-  - `simple.py` - Main search-and-judge loop
-  - `advanced.py` - Multi-agent Magentic mode
-  - `langgraph_orchestrator.py` - LangGraph-based workflow
+- `src/orchestrators/` - Unified orchestrator package
+  - `advanced.py` - Main orchestrator (handles both Free and Paid tiers)
+  - `factory.py` - Auto-selects backend based on API key presence
+  - `langgraph_orchestrator.py` - LangGraph-based workflow (experimental)
+- `src/clients/` - LLM backend adapters
+  - `factory.py` - Auto-selects: OpenAI (if key) or HuggingFace (free)
+  - `huggingface.py` - HuggingFace adapter for free tier
 - `src/tools/pubmed.py` - PubMed E-utilities search
 - `src/tools/clinicaltrials.py` - ClinicalTrials.gov API
 - `src/tools/europepmc.py` - Europe PMC search
```
P0_REPR_BUG_ROOT_CAUSE_ANALYSIS.md ADDED

# P0: Event Handling Implementation Spec

**Status**: FIXED
**Priority**: P0
**Source of Truth**: `reference_repos/microsoft-agent-framework/python/samples/autogen-migration/orchestrations/04_magentic_one.py`

---

## Root Cause (One Sentence)

We were extracting content from `MagenticAgentMessageEvent.message` — **the wrong event type** — instead of using `MagenticAgentDeltaEvent.text` as the sole source of streaming content.

---

## The Fix: Correct Event Handling Per Microsoft SSOT

| Event Type | Correct Usage | What We Were Doing (Wrong) |
|------------|---------------|----------------------------|
| `MagenticAgentDeltaEvent` | **Extract `.text`** - This is the ONLY source of content | Partially used, not accumulated |
| `MagenticAgentMessageEvent` | **Signal only** - Agent turn complete. IGNORE `.message` | Extracting `.message.text` (hits repr bug) |
| `MagenticFinalResultEvent` | **Extract `.message.text`** - Final synthesis result | Correct |

---

## Implementation: Accumulator Pattern

From Microsoft's `04_magentic_one.py` (lines 108-138):

```python
# Microsoft's Pattern
async for event in workflow.run_stream(task):
    if isinstance(event, MagenticAgentDeltaEvent):
        # STREAM CONTENT: Accumulate and display
        if event.text:
            print(event.text, end="", flush=True)

    elif isinstance(event, MagenticAgentMessageEvent):
        # SIGNAL ONLY: Agent done. Print newline. DO NOT read .message
        print()

    elif isinstance(event, MagenticFinalResultEvent):
        # FINAL RESULT: Safe to read .message.text
        print(event.message.text)
```

---

## Our Implementation (`src/orchestrators/advanced.py`)

**Status**: ✅ IMPLEMENTED (lines 241-308)

```python
# 1. Accumulate streaming content (ONLY source of truth)
if isinstance(event, MagenticAgentDeltaEvent):
    if event.text:
        current_message_buffer += event.text
        yield AgentEvent(type="streaming", message=event.text, ...)

# 2. Use buffer on completion signal (IGNORE event.message)
if isinstance(event, MagenticAgentMessageEvent):
    text_content = current_message_buffer or "Action completed (Tool Call)"
    yield AgentEvent(message=f"{agent_name}: {text_content[:200]}...", ...)
    current_message_buffer = ""  # Reset for next agent

# 3. Final result - safe to extract
if isinstance(event, MagenticFinalResultEvent):
    text = self._extract_text(event.message)
    yield AgentEvent(type="complete", message=text, ...)
```

---

## Why This Eliminates the Repr Bug

The repr bug occurs at `_magentic.py:1730`:

```python
text = last.text or str(last)  # Falls back to repr() for tool-only messages
```

By **never reading** `MagenticAgentMessageEvent.message.text`, we never hit this code path.

**The repr bug is eliminated by correct implementation — no upstream fix required.**

---

## Verification Checklist

- [x] `MagenticAgentDeltaEvent.text` used as sole content source
- [x] `MagenticAgentMessageEvent` used as signal only (buffer consumed, not `.message`)
- [x] `MagenticFinalResultEvent.message.text` extracted for final result
- [x] Buffer reset on agent switch and completion
- [x] Remove dead code path in `_process_event()` that still calls `_extract_text` on `MagenticAgentMessageEvent`

---

## Remaining Cleanup

✅ **DONE** - Dead code paths for `MagenticAgentMessageEvent` and `MagenticAgentDeltaEvent` have been removed from `_process_event()`. Comments now explain these events are handled by the Accumulator Pattern in `run()`.
README.md CHANGED

```diff
@@ -55,8 +55,9 @@ Sexual health is health. Period. Yet it remains one of the most under-researched
 - 🤖 **MCP Integration**: Use our tools from Claude Desktop or any MCP client
 - 🔒 **Modal Sandbox**: Secure execution of AI-generated statistical analysis
 - 🧠 **Smart Evidence Synthesis**: LLM-powered judge evaluates and synthesizes findings
-- ⚡ **Two Modes**: Simple (fast) or Advanced (multi-agent deep dive)
-- 🆓 **Free Tier Available**: Works without API keys (HuggingFace Inference)
+- ⚡ **Unified Architecture**: Same powerful multi-agent orchestration for everyone
+- 🆓 **Free Tier**: Works without API keys (HuggingFace Inference)
+- 🚀 **Paid Tier**: Unlocks GPT-5 automatically when OpenAI key is provided
 
 ## Example Queries
```
docs/ARCHITECTURE.md ADDED

# DeepBoner Architecture

> **Last Updated**: 2025-12-01

---

## How It Works (Simple Version)

```text
┌──────────────────────────────────────────────────────────────┐
│                     UNIFIED ARCHITECTURE                      │
│                                                               │
│  User provides API key?                                       │
│                                                               │
│  NO (Free Tier)              YES (Paid Tier)                  │
│  ──────────────              ───────────────                  │
│  HuggingFace backend         OpenAI backend                   │
│  Qwen 2.5 72B (free)         GPT-5 (paid)                     │
│                                                               │
│  SAME orchestration logic for both                            │
│  ONE codebase, different LLM backends                         │
└──────────────────────────────────────────────────────────────┘
```

**That's it.** No "modes." Just: do you have an API key or not?

---

## Current Status

**Free Tier is BLOCKED** by upstream bug #2562.

Once [PR #2566](https://github.com/microsoft/agent-framework/pull/2566) merges:
1. Update `agent-framework` dependency
2. Free tier works
3. Done

---

## Framework Stack

DeepBoner uses TWO frameworks that work TOGETHER:

| Framework | What It Does | Where Used |
|-----------|--------------|------------|
| **Microsoft Agent Framework** | Multi-agent orchestration | `src/orchestrators/advanced.py` |
| **Pydantic AI** | Structured outputs, validation | `src/agent_factory/judges.py`, `src/agents/*.py` |

**They are NOT mutually exclusive.** Microsoft AF handles the orchestration (Manager → Search → Judge → Report). Pydantic AI handles structured outputs within those agents.

---

## LLM Backend Selection

Auto-detected by `src/clients/factory.py`:

```python
def get_chat_client():
    if settings.has_openai_key:
        return OpenAIChatClient(...)  # Paid tier
    else:
        return HuggingFaceChatClient(...)  # Free tier
```

| Condition | Backend | Model |
|-----------|---------|-------|
| User provides OpenAI key | OpenAI | GPT-5 |
| No API key provided | HuggingFace | Qwen 2.5 72B (free) |

---

## Key Files

| File | Purpose |
|------|---------|
| `src/orchestrators/advanced.py` | Multi-agent orchestration (Microsoft AF) |
| `src/clients/factory.py` | Auto-selects LLM backend |
| `src/clients/huggingface.py` | HuggingFace adapter for free tier |
| `src/agent_factory/judges.py` | Judge logic (Pydantic AI) |
| `src/agents/*.py` | Individual agents (Pydantic AI) |

---

## What Was Deleted

`simple.py` (778 lines) was a SEPARATE orchestrator that created a "parallel universe." It's gone. Now there's ONE orchestrator with different backends.

---

## Upstream Blocker

**Bug:** Microsoft Agent Framework produces `repr()` garbage for tool-call-only messages.

**Fix:** [PR #2566](https://github.com/microsoft/agent-framework/pull/2566) - waiting to merge.

**Tracking:** [Issue #2562](https://github.com/microsoft/agent-framework/issues/2562)

---

## References

- [Pydantic AI](https://ai.pydantic.dev/) - Structured outputs framework
- [Microsoft Agent Framework](https://github.com/microsoft/agent-framework) - Multi-agent orchestration
- [AG-UI Protocol](https://www.copilotkit.ai/blog/introducing-pydantic-ai-integration-with-ag-ui) - How they integrate
docs/bugs/ACTIVE_BUGS.md CHANGED

```diff
@@ -1,21 +1,36 @@
 # Active Bugs
 
-> Last updated: 2025-12-01 (16:30 PST)
+> Last updated: 2025-12-01 (21:00 PST)
 >
 > **Note:** Completed bug docs archived to `docs/bugs/archive/`
 > **See also:** [Code Quality Audit Findings (2025-11-30)](AUDIT_FINDINGS_2025_11_30.md)
+> **See also:** [ARCHITECTURE.md](../ARCHITECTURE.md) for unified architecture plan
 
-## P0 - Critical
+## P0 - Critical (BLOCKED)
 
-(No active P0 bugs)
+### Free Tier Broken (Upstream #2562)
+
+**Issue:** [#105](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/issues/105), [#113](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/issues/113)
+**Status:** BLOCKED - Waiting for upstream PR #2566
+
+**Problem:** Free tier (Advanced Mode + HuggingFace) shows repr garbage output.
+
+**Cause:** Microsoft Agent Framework upstream bug #2562.
+
+**Fix:** Upstream PR #2566 will fix this. Once merged:
+1. Update `agent-framework` dependency
+2. Verify Advanced + HuggingFace works
+3. Unified architecture complete
+
+**Architecture Note:** We have ONE unified architecture. `simple.py` is deleted.
+Simple Mode behavior is INTEGRATED via `HuggingFaceChatClient`, not a parallel orchestrator.
 
 ---
 
-## P3 - UX Polish
-...
 ## Resolved Bugs
 
 ### ~~P0 - AIFunction Not JSON Serializable~~ FIXED
+
 **File:** `docs/bugs/P0_AIFUNCTION_NOT_JSON_SERIALIZABLE.md`
 **Found:** 2025-12-01
 **Resolved:** 2025-12-01
@@ -27,6 +42,7 @@
 - Result: Free Tier now supports full function calling capabilities with Qwen2.5-72B.
 
 ### ~~P1 - HuggingFace Router 401 Unauthorized~~ FIXED
+
 **File:** `docs/bugs/P1_HUGGINGFACE_ROUTER_401_HYPERBOLIC.md`
 **Found:** 2025-12-01
 **Resolved:** 2025-12-01
@@ -36,18 +52,8 @@
 - Fix: Generated new valid HF_TOKEN, updated `.env` and Spaces secrets
 - Also switched default model to `Qwen/Qwen2.5-72B-Instruct` for better reliability
 
-### ~~P0 - Simple Mode Ignores Forced Synthesis~~ FIXED
-**File:** `docs/bugs/P0_SIMPLE_MODE_FORCED_SYNTHESIS_BYPASS.md`
-**Issue:** [#113](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/issues/113)
-**PR:** [#115](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/pull/115) (SPEC-16)
-**Found:** 2025-12-01
-**Resolved:** 2025-12-01
-
-- Problem: Simple Mode ignored forced synthesis signals from Judge.
-- Fix: SPEC-16 unified architecture - removed Simple Mode entirely, integrated HuggingFace into Advanced Mode.
-- Simple Mode code deleted, capability preserved via `HuggingFaceChatClient` adapter.
-
 ### ~~P1 - Advanced Mode Exposes Uninterpretable Chain-of-Thought~~ FIXED
+
 **File:** `docs/bugs/P1_ADVANCED_MODE_UNINTERPRETABLE_CHAIN_OF_THOUGHT.md`
 **PR:** [#107](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/pull/107)
 **Found:** 2025-12-01
@@ -59,6 +65,7 @@
 - CodeRabbit review addressed: test markers, edge case handling, truncation test coverage.
 
 ### ~~P0 - Advanced Mode Timeout Yields No Synthesis~~ FIXED
+
 **File:** `docs/bugs/P0_ADVANCED_MODE_TIMEOUT_NO_SYNTHESIS.md`
 **Found:** 2025-11-30 (Manual Testing)
 **Resolved:** 2025-12-01
@@ -75,38 +82,35 @@
 - Tests: `tests/unit/orchestrators/test_advanced_timeout.py`
 - Key files: `src/orchestrators/advanced.py`, `src/orchestrators/factory.py`, `src/services/research_memory.py`
 
-### ~~P0 - Free Tier Synthesis Incorrectly Uses Server-Side API Keys~~ FIXED
+### ~~P0 - Free Tier Synthesis Incorrectly Uses Server-Side API Keys~~ FIXED (Historical)
+
 **File:** `docs/bugs/P1_SYNTHESIS_BROKEN_KEY_FALLBACK.md`
 **PR:** [#103](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/pull/103)
 **Found:** 2025-11-30 (Testing)
 **Resolved:** 2025-11-30
-**Verified:** Free Tier now produces full LLM-synthesized research reports ✅
 
-- Problem: Simple Mode crashed with "OpenAIError" on HuggingFace Spaces when user provided no key but admin key was invalid.
-- Root Cause: Synthesis logic bypassed the Free Tier judge and incorrectly used server-side keys via `get_model()`.
-- Fix: Implemented `synthesize()` in `HFInferenceJudgeHandler` to use free HuggingFace Inference, ensuring consistency with the judge phase.
-- Key files: `src/agent_factory/judges.py`, `src/orchestrators/simple.py`
+- Problem: Simple Mode crashed with "OpenAIError" on HuggingFace Spaces.
+- Note: This was in the OLD Simple Mode. Now we use Unified Architecture.
 
-### ~~P0 - Synthesis Fails with OpenAIError in Free Mode~~ FIXED
+### ~~P0 - Synthesis Fails with OpenAIError in Free Mode~~ FIXED (Historical)
+
 **File:** `docs/bugs/P0_SYNTHESIS_PROVIDER_MISMATCH.md`
 **Found:** 2025-11-30 (Code Audit)
 **Resolved:** 2025-11-30
 
 - Problem: "Simple Mode" (Free Tier) crashed with `OpenAIError`.
-- Root Cause: `get_model()` defaulted to OpenAI regardless of available keys.
-- Fix: Implemented auto-detection in `judges.py` (OpenAI > Anthropic > HuggingFace).
-- Added extensive unit tests and regression tests.
+- Note: This was in the OLD Simple Mode. Now we use Unified Architecture.
 
-### ~~P0 - Simple Mode Never Synthesizes~~ FIXED
+### ~~P0 - Simple Mode Never Synthesizes~~ FIXED (Historical)
+
 **PR:** [#71](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/pull/71) (SPEC_06)
 **Commit**: `5cac97d` (2025-11-29)
 
 - Root cause: LLM-as-Judge recommendations were being IGNORED
-- Fix: Code-enforced termination criteria (`_should_synthesize()`)
-- Added combined score thresholds, late-iteration logic, emergency fallback
-- Simple mode now synthesizes instead of spinning forever
+- Note: This was in the OLD Simple Mode. Now we use Unified Architecture.
 
 ### ~~P3 - Magentic Mode Missing Termination Guarantee~~ FIXED
+
 **Commit**: `d36ce3c` (2025-11-29)
 
 - Added `final_event_received` tracking in `orchestrator_magentic.py`
@@ -114,6 +118,7 @@
 - Verified with `test_magentic_termination.py`
 
 ### ~~P0 - Magentic Mode Report Generation~~ FIXED
+
 **Commit**: `9006d69` (2025-11-29)
 
 - Fixed `_extract_text()` to handle various message object formats
@@ -122,6 +127,7 @@
 - Advanced mode now produces full research reports
 
 ### ~~P1 - Streaming Spam + API Key Persistence~~ FIXED
+
 **Commit**: `0c9be4a` (2025-11-29)
 
 - Streaming events now buffered (not token-by-token spam)
@@ -129,6 +135,7 @@
 - Examples use explicit `None` values to avoid overwriting keys
 
 ### ~~P2 - Missing "Thinking" State~~ FIXED
+
 **Commit**: `9006d69` (2025-11-29)
 
 - Added `"thinking"` event type with hourglass icon
@@ -136,6 +143,7 @@
 - Users now see feedback during 2-5 minute initial processing
 
 ### ~~P2 - Gradio Example Not Filling Chat Box~~ FIXED
+
 **Commit**: `2ea01fd` (2025-11-29)
 
 - Third example (HSDD) wasn't populating chat box when clicked
```
docs/bugs/GRADIO_EXAMPLE_VS_CHAT_ARROW_ANALYSIS.md ADDED
@@ -0,0 +1,147 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
+ # Gradio Example Click vs Chat Arrow - Code Path Analysis
+
+ **Status**: ANALYZED - NOT A BUG (Same code path, different timing)
+ **Priority**: N/A (Symptom of upstream repr bug)
+ **Analyzed**: 2025-12-01
+ **Related**: P0_HUGGINGFACE_TOOL_CALLING_BROKEN.md
+
+ ---
+
+ ## Symptom Reported
+
+ User observed two different outputs when:
+ 1. **Clicking an Example** → Shows progress at 10%, "THINKING" message
+ 2. **Clicking Chat Arrow** → Shows full 5 rounds with repr garbage
+
+ User suspected divergent code paths from vestigial Simple Mode deletion.
+
+ ---
+
+ ## Analysis: NO DIVERGENT CODE PATHS
+
+ ### Code Trace
+
+ Both Example Click and Chat Arrow use **the exact same code path**:
+
+ ```text
+ User Action (Example OR Chat Arrow)
+     ↓
+ app.py:research_agent()              ← SAME FUNCTION
+     ↓
+ app.py:configure_orchestrator()      ← SAME FUNCTION (mode="advanced" always)
+     ↓
+ factory.py:create_orchestrator()     ← SAME FUNCTION
+     ↓
+ factory.py:_determine_mode()         ← ALWAYS returns "advanced"
+     ↓
+ AdvancedOrchestrator                 ← SAME CLASS
+     ↓
+ clients/factory.py:get_chat_client() ← SAME FUNCTION
+     ↓
+ HuggingFaceChatClient (no API key) OR OpenAIChatClient (with API key)
+ ```
+
+ ### Evidence from Code
+
+ **app.py:279-325 - ChatInterface Setup:**
+ ```python
+ demo = gr.ChatInterface(
+     fn=research_agent,  # ← SAME FUNCTION FOR BOTH
+     examples=[
+         ["What drugs improve female libido post-menopause?", "sexual_health", None, None],
+         # ...
+     ],
+     # ...
+ )
+ ```
+
+ **factory.py:76-90 - Mode Determination:**
+ ```python
+ def _determine_mode(explicit_mode: str | None) -> str:
+     if explicit_mode == "hierarchical":
+         return "hierarchical"
+     # "simple" is deprecated -> upgrade to "advanced"
+     # "magentic" is alias for "advanced"
+     return "advanced"  # ← ALWAYS ADVANCED
+ ```
+
+ ---
+
+ ## Explanation of Visual Difference
+
+ The difference the user observed is **timing**, not code paths:
+
+ | Screenshot | When Captured | Content |
+ |------------|---------------|---------|
+ | Example Click | Mid-execution | Progress bar at 10%, "THINKING" |
+ | Chat Arrow | After completion | Full 5 rounds with repr garbage |
+
+ **Both show the same process at different stages.**
+
+ The repr garbage (`<agent_framework._types.ChatMessage object at 0x...>`) appears in BOTH:
+ - Example Click: Would show repr garbage if captured after completion
+ - Chat Arrow: Shows repr garbage because it was captured after completion
+
+ ---
+
+ ## The Real Bug: Upstream repr Issue
+
+ The repr garbage is the **upstream Microsoft Agent Framework bug** documented in:
+ - `docs/bugs/P0_HUGGINGFACE_TOOL_CALLING_BROKEN.md`
+
+ **Root cause in upstream code:**
+ ```python
+ # agent_framework/_workflows/_magentic.py line ~1799
+ text = last.text or str(last)  # BUG: str(last) gives repr for tool-only messages
+ ```
+
+ **Our workaround in advanced.py:**
+ ```python
+ def _extract_text(self, message: Any) -> str:
+     # Filter out repr strings
+     if isinstance(message, str) and message.startswith("<") and "object at" in message:
+         return ""
+     # ...
+ ```
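In the same defensive spirit as the workaround above, here is a stand-alone sketch of repr-safe text extraction. The attribute names probed (`text`, `content`) are illustrative assumptions, not the actual agent-framework API; the real `_extract_text()` lives in `advanced.py`.

```python
from typing import Any

def extract_text(message: Any) -> str:
    """Best-effort text extraction that never falls back to repr().

    Sketch only: attribute names below are assumptions for illustration.
    """
    if isinstance(message, str):
        # Reject repr-style garbage such as
        # "<agent_framework._types.ChatMessage object at 0x7f...>"
        if message.startswith("<") and "object at" in message:
            return ""
        return message
    # Probe common text-bearing attributes on message objects.
    for attr in ("text", "content"):
        value = getattr(message, attr, None)
        if isinstance(value, str) and value:
            return value
    # Never str(message): that is exactly the upstream repr bug.
    return ""
```

The key design choice is the last line: returning an empty string (which the UI can skip) instead of `str(message)` keeps tool-call-only messages from leaking object reprs into the chat.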
+
+ ---
+
+ ## Verification
+
+ 1. **No vestigial Simple Mode code** - `simple.py` is deleted, not imported anywhere
+ 2. **Factory always returns AdvancedOrchestrator** - verified in `factory.py:66-73`
+ 3. **Same research_agent function** - Gradio routes both Example and Chat Arrow through it
+
+ ---
+
+ ## Conclusion
+
+ **There are NO divergent code paths.** The unified architecture is correctly implemented:
+
+ | Component | Status |
+ |-----------|--------|
+ | Simple Mode | ✅ DELETED (no vestigial code) |
+ | Factory Pattern | ✅ Always returns AdvancedOrchestrator |
+ | Chat Client Factory | ✅ Auto-selects HuggingFace (free) or OpenAI (paid) |
+ | Example Click | ✅ Uses same `research_agent()` function |
+ | Chat Arrow Click | ✅ Uses same `research_agent()` function |
+
+ **The only bug is the upstream repr display issue**, which affects BOTH paths equally.
+
+ ---
+
+ ## Next Steps
+
+ 1. **Wait for upstream fix** - [PR #2566](https://github.com/microsoft/agent-framework/pull/2566)
+ 2. **Once merged**: `uv add agent-framework@latest`
+ 3. **Test**: Verify both Example Click and Chat Arrow work identically
+
+ ---
+
+ ## References
+
+ - `src/app.py` - Lines 134-247 (`research_agent()`)
+ - `src/app.py` - Lines 279-325 (ChatInterface with examples)
+ - `src/orchestrators/factory.py` - Lines 43-73 (`create_orchestrator()`)
+ - `src/clients/factory.py` - Lines 15-76 (`get_chat_client()`)
+ - `docs/bugs/P0_HUGGINGFACE_TOOL_CALLING_BROKEN.md` - Upstream repr bug details
docs/bugs/P0_SIMPLE_MODE_FORCED_SYNTHESIS_BYPASS.md CHANGED
@@ -1,219 +1,59 @@
- # P0 BUG: Simple Mode Ignores Forced Synthesis from HF Inference Failures
 
- **Status**: Open → **Fix via SPEC_16 (Integration)**
  **Priority**: P0 (Demo-blocking)
  **Discovered**: 2025-12-01
- **Affected Component**: `src/orchestrators/simple.py`
- **Strategic Fix**: [SPEC_16: Unified Chat Client Architecture](../specs/SPEC_16_UNIFIED_CHAT_CLIENT_ARCHITECTURE.md)
  **GitHub Issue**: [#113](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/issues/113)
 
- > **Decision**: Instead of patching Simple Mode, we will **INTEGRATE its capability into Advanced Mode** per SPEC_16.
- >
- > **What this means:**
- > - ✅ Free-tier HuggingFace capability is PRESERVED via `HuggingFaceChatClient`
- > - ✅ Users without API keys still get full functionality (Advanced Mode + HuggingFace backend)
- > - 🗑️ Simple Mode's redundant orchestration CODE is retired (not the capability!)
- > - 🐛 The bug disappears because Advanced Mode's Manager agent handles termination correctly
-
  ---
 
- ## Problem Statement
-
- When HuggingFace Inference API fails 3 consecutive times, the `HFInferenceJudgeHandler` correctly returns a "forced synthesis" assessment with `sufficient=True, recommendation="synthesize"`. However, **Simple Mode's `_should_synthesize()` method ignores this signal** because of overly strict code-enforced thresholds.
-
- ### Observed Behavior
-
- ```
- ✅ JUDGE_COMPLETE: Assessment: synthesize (confidence: 10%)
- 🔄 LOOPING: Gathering more evidence... ← BUG: Should have synthesized!
- ```
-
- The orchestrator loops **10 full iterations** despite the judge repeatedly saying "synthesize" after iteration 4.
 
- ### Expected Behavior
 
- When `HFInferenceJudgeHandler._create_forced_synthesis_assessment()` returns:
- - `sufficient=True`
- - `recommendation="synthesize"`
-
- The orchestrator should **immediately synthesize**, regardless of score thresholds.
 
  ---
 
- ## Root Cause Analysis
-
- ### The Forced Synthesis Assessment (judges.py:514-549)
-
- ```python
- def _create_forced_synthesis_assessment(self, question, evidence):
-     return JudgeAssessment(
-         details=AssessmentDetails(
-             mechanism_score=0,          # ← Problem 1: Score is 0
-             clinical_evidence_score=0,  # ← Problem 2: Score is 0
-             drug_candidates=["AI analysis required..."],
-             key_findings=findings,
-         ),
-         sufficient=True,                # ← Correct: Says sufficient
-         confidence=0.1,                 # ← Problem 3: Too low for emergency
-         recommendation="synthesize",    # ← Correct: Says synthesize
-         ...
-     )
- ```
-
- ### The _should_synthesize Logic (simple.py:159-216)
 
- ```python
- def _should_synthesize(self, assessment, iteration, max_iterations, evidence_count):
-     combined_score = mechanism_score + clinical_evidence_score  # = 0
 
-     # Priority 1: Judge approved - BUT REQUIRES combined_score >= 10!
-     if assessment.sufficient and assessment.recommendation == "synthesize":
-         if combined_score >= 10:  # ← 0 >= 10 is FALSE!
-             return True, "judge_approved"
-
-     # Priority 2-5: All require scores or drug candidates we don't have
-
-     # Priority 6: Emergency synthesis
-     if is_late_iteration and evidence_count >= 30 and confidence >= 0.5:
-         # ↑ 0.1 >= 0.5 is FALSE!
-         return True, "emergency_synthesis"
-
-     return False, "continue_searching"  # ← Always ends up here!
  ```
 
- ### The Bug
-
- 1. **Priority 1 has wrong precondition**: It checks `combined_score >= 10` even when the judge explicitly says "synthesize". The score check should be skipped when it's a forced/error recovery synthesis.
-
- 2. **Priority 6 confidence threshold is too high**: 0.5 confidence is reasonable for "emergency" synthesis, but forced synthesis from API failures uses 0.1 confidence to indicate low quality - this should still trigger synthesis.
-
- ---
-
- ## Impact
-
- - **User sees**: 10 iterations of "Gathering more evidence" with 0% confidence
- - **Final output**: Partial synthesis with "Max iterations reached"
- - **Time wasted**: ~2-3 minutes of useless API calls
- - **UX**: Extremely confusing - user sees "synthesize" but system keeps searching
-
  ---
 
- ## Proposed Fix
 
- ### ~~Option A: Patch Simple Mode~~ (REJECTED)
 
- We considered patching `_should_synthesize()` to respect forced synthesis signals. However, this adds MORE complexity to an already complex system that we plan to delete.
 
- ### ✅ Strategic Fix: SPEC_16 Unification (APPROVED)
-
- **Delete Simple Mode entirely and unify on Advanced Mode.**
-
- See: [SPEC_16: Unified Chat Client Architecture](../specs/SPEC_16_UNIFIED_CHAT_CLIENT_ARCHITECTURE.md)
-
- The implementation path:
-
- 1. **Phase 1**: Create `HuggingFaceChatClient` adapter (~150 lines)
-    - Implements `agent_framework.BaseChatClient`
-    - Wraps `huggingface_hub.InferenceClient`
-    - Enables Advanced Mode to work with free tier
-
- 2. **Phase 2**: Delete Simple Mode
-    - Remove `src/orchestrators/simple.py` (~778 lines)
-    - Remove `src/tools/search_handler.py` (~219 lines)
-    - Update factory to always use `AdvancedOrchestrator`
-
- 3. **Why this works**: Advanced Mode uses Microsoft Agent Framework's built-in termination. When JudgeAgent returns "SUFFICIENT EVIDENCE" (per SPEC_15), the Manager agent immediately delegates to ReportAgent. **No custom `_should_synthesize()` thresholds needed.**
-
- ### Why Unification > Patching
-
- | Approach | Lines Changed | Bug Fixed? | Technical Debt |
- |----------|---------------|------------|----------------|
- | Patch Simple Mode | +20 lines | Temporarily | Adds complexity |
- | **SPEC_16 Unification** | **-997 lines** | **Permanently** | **Eliminates 778 lines** |
-
- ---
-
- ## Files to DELETE (via SPEC_16)
-
- | File | Lines | Reason |
- |------|-------|--------|
- | `src/orchestrators/simple.py` | 778 | Contains buggy `_should_synthesize()` - entire file deleted |
- | `src/tools/search_handler.py` | 219 | Manager agent handles orchestration in Advanced Mode |
-
- ## Files to CREATE (via SPEC_16)
-
- | File | Lines | Purpose |
- |------|-------|---------|
- | `src/clients/__init__.py` | ~10 | Package exports |
- | `src/clients/factory.py` | ~50 | `get_chat_client()` factory |
- | `src/clients/huggingface.py` | ~150 | `HuggingFaceChatClient` adapter |
-
- **Net change: -997 lines deleted, +210 lines added = ~787 lines removed**
-
- ---
-
- ## Acceptance Criteria (SPEC_16 Implementation)
-
- - [ ] `HuggingFaceChatClient` implements `agent_framework.BaseChatClient`
- - [ ] `get_chat_client()` returns HuggingFace client when no OpenAI key
- - [ ] `AdvancedOrchestrator` works with HuggingFace backend
- - [ ] `simple.py` is deleted (778 lines removed)
- - [ ] Free tier users get Advanced Mode with HuggingFace
- - [ ] No more "continue_searching" loops when HF fails
- - [ ] Manager agent respects "SUFFICIENT EVIDENCE" signal (SPEC_15)
-
- ---
-
- ## Test Case (SPEC_16 Verification)
-
- ```python
- @pytest.mark.asyncio
- async def test_unified_architecture_handles_hf_failures():
-     """
-     After SPEC_16: Free tier uses Advanced Mode with HuggingFace backend.
-     When HF fails, Manager agent should trigger synthesis via ReportAgent.
-
-     This test replaces the old Simple Mode test because:
-     - simple.py is DELETED
-     - Advanced Mode handles termination via Manager agent signals
-     - No _should_synthesize() thresholds to bypass
-     """
-     from unittest.mock import patch, MagicMock
-     from src.orchestrators.advanced import AdvancedOrchestrator
-     from src.clients.factory import get_chat_client
-
-     # Verify factory returns HuggingFace client when no OpenAI key
-     with patch("src.utils.config.settings") as mock_settings:
-         mock_settings.has_openai_key = False
-         mock_settings.has_gemini_key = False
-         mock_settings.has_huggingface_key = True
-
-         client = get_chat_client()
-         assert "HuggingFace" in type(client).__name__
-
-     # Verify AdvancedOrchestrator accepts HuggingFace client
-     # (The actual termination is handled by Manager agent respecting
-     #  "SUFFICIENT EVIDENCE" signals per SPEC_15)
- ```
 
  ---
 
- ## Related Issues & Specs
 
- | Reference | Type | Relationship |
- |-----------|------|--------------|
- | [SPEC_16](../specs/SPEC_16_UNIFIED_CHAT_CLIENT_ARCHITECTURE.md) | Spec | **THE FIX** - Unified architecture eliminates this bug |
- | [SPEC_15](../specs/SPEC_15_ADVANCED_MODE_PERFORMANCE.md) | Spec | Manager agent termination logic (already implemented) |
- | [Issue #105](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/issues/105) | GitHub | Deprecate Simple Mode |
- | [Issue #109](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/issues/109) | GitHub | Simplify Provider Architecture |
- | [Issue #110](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/issues/110) | GitHub | Remove Anthropic Support |
- | PR #71 (SPEC_06) | PR | Added `_should_synthesize()` - now causes this bug |
- | Commit 5e761eb | Commit | Added `_create_forced_synthesis_assessment()` |
 
  ---
 
- ## References
 
- - `src/orchestrators/simple.py:159-216` - `_should_synthesize()` method
- - `src/agent_factory/judges.py:514-549` - `_create_forced_synthesis_assessment()`
- - `src/agent_factory/judges.py:477-512` - `_create_quota_exhausted_assessment()`
+ # P0 BUG: Simple Mode Synthesis Bypass (WILL BE FIXED BY UNIFIED ARCHITECTURE)
 
+ **Status**: BLOCKED - Waiting for upstream PR #2566
  **Priority**: P0 (Demo-blocking)
  **Discovered**: 2025-12-01
  **GitHub Issue**: [#113](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/issues/113)
 
  ---
 
+ ## Current State
 
+ **`simple.py` is DELETED.** This bug existed in the old Simple Mode code.
 
+ The bug will NOT be fixed by restoring Simple Mode. Instead, it will be **automatically fixed** when we complete the unified architecture (after upstream PR #2566 merges).
 
  ---
 
+ ## The Bug (Historical)
 
+ When the HuggingFace Inference API failed, Simple Mode's `_should_synthesize()` ignored forced synthesis signals due to overly strict thresholds.
 
+ ```text
+ ✅ JUDGE_COMPLETE: Assessment: synthesize (confidence: 10%)
+ 🔄 LOOPING: Gathering more evidence... ← BUG: Should have synthesized!
  ```
 
  ---
 
+ ## Why Unified Architecture Fixes This
 
+ | Architecture | How Termination Works |
+ |--------------|----------------------|
+ | **Old (Simple Mode)** | Custom `_should_synthesize()` with buggy thresholds |
+ | **New (Unified)** | Manager agent respects "SUFFICIENT EVIDENCE" signals |
 
+ The Manager agent in Advanced Mode already works correctly. By completing the unified architecture with HuggingFace support, we inherit that correct behavior.
 
+ **No need to patch `_should_synthesize()` because the code is deleted.**
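The old-vs-new contrast can be sketched in a few lines. This is an illustrative sketch only: the signal phrases and the `should_stop` helper are assumptions for this document, not the actual Manager agent implementation.

```python
# Signal-based termination (new style): stop on an explicit judge signal,
# with no score or confidence thresholds that a degraded assessment can miss.
# Phrase list and helper name are illustrative assumptions.
STOP_SIGNALS = ("SUFFICIENT EVIDENCE", "STOP SEARCHING")

def should_stop(judge_message: str) -> bool:
    """Return True when the judge's natural-language output signals termination."""
    upper = judge_message.upper()
    return any(signal in upper for signal in STOP_SIGNALS)
```

Contrast this with the old threshold logic, where a forced-synthesis assessment carrying `score=0, confidence=0.1` failed every numeric gate even though the judge had already said "synthesize".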
 
  ---
 
+ ## Path Forward
 
+ 1. **Wait** for upstream PR #2566 to merge (fixes repr bug)
+ 2. **Update** the `agent-framework` dependency
+ 3. **Verify** Advanced Mode + HuggingFace works
+ 4. **Done** - This bug is gone (no `_should_synthesize()` thresholds)
 
  ---
 
+ ## Related
 
+ | Reference | Description |
+ |-----------|-------------|
+ | [ARCHITECTURE.md](../ARCHITECTURE.md) | Current state and unified plan |
+ | [SPEC_16](../specs/SPEC_16_UNIFIED_CHAT_CLIENT_ARCHITECTURE.md) | Unified architecture spec |
+ | [Issue #105](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/issues/105) | GitHub tracking |
+ | [Upstream #2562](https://github.com/microsoft/agent-framework/issues/2562) | Framework bug |
+ | [Upstream PR #2566](https://github.com/microsoft/agent-framework/pull/2566) | Framework fix |
docs/bugs/P1_SIMPLE_MODE_REMOVED_BREAKS_FREE_TIER_UX.md ADDED
@@ -0,0 +1,61 @@
+ # Free Tier (No API Key) - BLOCKED by Upstream #2562
+
+ **Status**: BLOCKED - Waiting for upstream PR #2566
+ **Priority**: P1
+ **Discovered**: 2025-12-01
+
+ ---
+
+ ## Problem
+
+ Free tier (no API key provided) shows garbage output:
+
+ ```text
+ 📚 **SEARCH_COMPLETE**: searcher: <agent_framework._types.ChatMessage object at 0x7fd3f8617b10>
+ ```
+
+ ## Cause
+
+ **Upstream Bug #2562**: Microsoft Agent Framework produces `repr()` garbage for tool-call-only messages.
+
+ ## Architecture
+
+ ```text
+ User provides API key?
+
+ NO (Free Tier)          YES (Paid Tier)
+ ──────────────          ───────────────
+ HuggingFace backend     OpenAI backend
+ Qwen 2.5 72B (free)     GPT-5 (paid)
+
+ SAME orchestration, different backends
+ ONE codebase, not parallel universes
+ ```
+ ```
34
+
35
+ ## Framework Stack
36
+
37
+ | Framework | Role |
38
+ |-----------|------|
39
+ | Microsoft Agent Framework | Multi-agent orchestration |
40
+ | Pydantic AI | Structured outputs & validation |
41
+
42
+ Both work TOGETHER. Not mutually exclusive.
43
+
44
+ ## Fix
45
+
46
+ **Upstream PR #2566** will fix this.
47
+
48
+ Once merged:
49
+ 1. `uv add agent-framework@latest`
50
+ 2. Verify free tier works
51
+ 3. Done
52
+
53
+ ## What Was Deleted
54
+
55
+ `simple.py` (778 lines) was a SEPARATE orchestrator. Created parallel universe. Now deleted. ONE orchestrator with different backends.
56
+
57
+ ## Related
58
+
59
+ - [Issue #105](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/issues/105)
60
+ - [Upstream #2562](https://github.com/microsoft/agent-framework/issues/2562)
61
+ - [Upstream PR #2566](https://github.com/microsoft/agent-framework/pull/2566)
docs/specs/SPEC_16_UNIFIED_CHAT_CLIENT_ARCHITECTURE.md CHANGED
@@ -1,350 +1,115 @@
1
- # SPEC_16: Unified Chat Client Architecture
2
 
3
- **Status**: Proposed
4
- **Priority**: P0 (Fixes Critical Bug #113)
5
- **Issue**: Updates [#105](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/issues/105), [#109](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/issues/109), **[#113](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/issues/113)** (P0 Bug)
6
  **Created**: 2025-12-01
7
- **Last Updated**: 2025-12-01
8
 
9
  ---
10
 
11
- ## ⚠️ CRITICAL CLARIFICATION: Integration, Not Deletion
12
-
13
- **This spec INTEGRATES Simple Mode's free-tier capability into Advanced Mode.**
14
-
15
- | What We're Doing | What We're NOT Doing |
16
- |------------------|----------------------|
17
- | βœ… Integrating HuggingFace support into Advanced Mode | ❌ Removing free-tier capability |
18
- | βœ… Unifying two parallel implementations into one | ❌ Breaking functionality for users without API keys |
19
- | βœ… Deleting redundant orchestration CODE | ❌ Deleting the CAPABILITY that code provided |
20
- | βœ… Making Advanced Mode work with ANY provider | ❌ Locking users into paid-only tiers |
21
-
22
- **After this spec:**
23
- - Users WITH OpenAI key β†’ Advanced Mode (OpenAI backend) βœ…
24
- - Users WITHOUT any key β†’ Advanced Mode (HuggingFace backend) βœ… **SAME CAPABILITY, UNIFIED ARCHITECTURE**
25
-
26
- ---
27
-
28
- ## Summary
29
-
30
- Unify Simple Mode and Advanced Mode into a **single orchestration system** by:
31
-
32
- 1. **Renaming the namespace**: `OpenAIChatClient` β†’ `BaseChatClient` (neutral protocol)
33
- 2. **Creating an adapter**: `HuggingFaceChatClient` implements `BaseChatClient`
34
- 3. **Retiring parallel code**: Simple Mode's while-loop becomes unnecessary
35
-
36
- The result: **One codebase, multiple providers, zero parallel universes.**
37
-
38
- > **πŸ”₯ P0 Bug Fix**: This also resolves Issue #113. Simple Mode's `_should_synthesize()` has a bug that ignores forced synthesis signals. Advanced Mode's Manager agent handles termination correctly. By integrating, the bug disappears.
39
-
40
- ---
41
-
42
- ## The Integration Concept
43
-
44
- ### Before: Two Parallel Universes (Current)
45
-
46
- ```text
47
- User Query
48
- β”‚
49
- β”œβ”€β”€ Has API Key? ──Yes──→ Advanced Mode (488 lines)
50
- β”‚ └── Microsoft Agent Framework
51
- β”‚ └── OpenAIChatClient (hardcoded) ◄── THE BOTTLENECK
52
- β”‚
53
- └── No API Key? ──────────→ Simple Mode (778 lines)
54
- └── While-loop orchestration (SEPARATE CODE)
55
- └── Pydantic AI + HuggingFace
56
- ```
57
-
58
- **Problem**: Same capability, two implementations, double maintenance, P0 bug in Simple Mode.
59
-
60
- ### After: Unified Architecture (This Spec)
61
 
62
  ```text
63
- User Query
64
- β”‚
65
- └──→ Advanced Mode (unified) ◄── ONE SYSTEM FOR ALL USERS
66
- └── Microsoft Agent Framework
67
- └── get_chat_client() returns: ◄── NAMESPACE NEUTRAL
68
- β”‚
69
- β”œβ”€β”€ OpenAIChatClient (if OpenAI key present)
70
- β”œβ”€β”€ GeminiChatClient (if Gemini key present) [Future]
71
- └── HuggingFaceChatClient (fallback - FREE TIER) ◄── INTEGRATED!
72
- ```
73
-
74
- **Result**: Free-tier users get the SAME Advanced Mode experience, just with HuggingFace as the LLM backend.
75
-
76
- ---
77
-
78
- ## What Gets Integrated vs Retired
79
-
80
- ### βœ… INTEGRATED (Capability Preserved)
81
-
82
- | Simple Mode Component | Integration Target | How |
83
- |-----------------------|-------------------|-----|
84
- | HuggingFace LLM calls | `HuggingFaceChatClient` | New adapter (~150 lines) |
85
- | Free-tier access | `get_chat_client()` factory | Auto-selects HF when no key |
86
- | Search tools (PubMed, etc.) | Already shared | `src/agents/tools.py` |
87
- | Evidence models | Already shared | `src/utils/models.py` |
88
-
89
- ### πŸ—‘οΈ RETIRED (Redundant Code Removed)
90
-
91
- | Simple Mode Component | Why Retired | Replacement in Advanced Mode |
92
- |-----------------------|-------------|------------------------------|
93
- | While-loop orchestration | Redundant | Manager agent orchestrates |
94
- | `_should_synthesize()` thresholds | **BUGGY** (P0 #113) | Manager agent signals |
95
- | `SearchHandler` scatter-gather | Redundant | SearchAgent handles this |
96
- | `JudgeHandler` | Redundant | JudgeAgent handles this |
97
-
98
- **Key insight**: We're not losing functionality. We're consolidating two implementations of the SAME functionality into one.
99
-
100
- ---
101
-
102
- ## Technical Implementation
103
-
104
- ### The Single Change That Enables Unification
105
-
106
- ```python
107
- # BEFORE (hardcoded to OpenAI):
108
- from agent_framework.openai import OpenAIChatClient
109
-
110
- class AdvancedOrchestrator:
111
- def __init__(self, ...):
112
- self._chat_client = OpenAIChatClient(...) # ❌ Only OpenAI works
113
-
114
- # AFTER (neutral - any provider):
115
- from agent_framework import BaseChatClient
116
- from src.clients.factory import get_chat_client
117
-
118
- class AdvancedOrchestrator:
119
- def __init__(self, ...):
120
- self._chat_client = get_chat_client() # βœ… OpenAI, Gemini, OR HuggingFace
121
- ```
122
-
123
- ### HuggingFaceChatClient Adapter
124
-
125
- ```python
126
- # src/clients/huggingface.py
127
- from agent_framework import BaseChatClient, ChatMessage, ChatResponse
128
- from huggingface_hub import InferenceClient
129
-
130
- class HuggingFaceChatClient(BaseChatClient):
131
- """Adapter that makes HuggingFace work with Microsoft Agent Framework."""
132
-
133
- def __init__(self, model_id: str = "meta-llama/Llama-3.1-70B-Instruct"):
134
- self._client = InferenceClient(model=model_id)
135
- self._model_id = model_id
136
-
137
- async def _inner_get_response(
138
- self,
139
- messages: list[ChatMessage],
140
- **kwargs
141
- ) -> ChatResponse:
142
- """Convert HuggingFace response to Agent Framework format."""
143
- # Convert messages to HF format
144
- hf_messages = [{"role": m.role, "content": m.content} for m in messages]
145
-
146
- # Call HuggingFace
147
- response = self._client.chat_completion(messages=hf_messages)
148
-
149
- # Convert back to Agent Framework format
150
- return ChatResponse(
151
- content=response.choices[0].message.content,
152
- # ... other fields
153
- )
154
-
155
- async def _inner_get_streaming_response(self, ...):
156
- """Streaming version."""
157
- ...
158
  ```
159
 
160
- ### ChatClientFactory
161
-
162
- ```python
163
- # src/clients/factory.py
164
- from agent_framework import BaseChatClient
165
- from agent_framework.openai import OpenAIChatClient
166
- from src.utils.config import settings
167
-
168
- def get_chat_client(provider: str | None = None) -> BaseChatClient:
169
- """
170
- Factory that returns the appropriate chat client.
171
-
172
- Priority:
173
- 1. OpenAI (if key available) - Best function calling, GPT-5
174
- 2. Gemini (if key available) - Good alternative [Future]
175
- 3. HuggingFace (always available) - FREE TIER FALLBACK
176
- """
177
- if provider == "openai" or (provider is None and settings.has_openai_key):
178
- return OpenAIChatClient(
179
- model_id=settings.openai_model, # gpt-5
180
- api_key=settings.openai_api_key,
181
- )
182
-
183
- # Future: Gemini support
184
- # if settings.has_gemini_key:
185
- # return GeminiChatClient(...)
186
-
187
- # FREE TIER: HuggingFace (no API key required for public models)
188
- from src.clients.huggingface import HuggingFaceChatClient
189
- return HuggingFaceChatClient(
190
- model_id="meta-llama/Llama-3.1-70B-Instruct",
191
- )
192
- ```
193
 
194
  ---
195
 
196
- ## Why This Fixes P0 Bug #113
197
 
198
- ### The Bug (Simple Mode)
199
 
200
- ```python
201
- # src/orchestrators/simple.py - THE BUG
202
- def _should_synthesize(self, assessment, ...):
203
- # When HF fails, judge returns: score=0, confidence=0.1, recommendation="synthesize"
204
 
205
- if assessment.sufficient and assessment.recommendation == "synthesize":
206
- if combined_score >= 10: # ❌ 0 >= 10 is FALSE
207
- return True
208
 
209
- if confidence >= 0.5: # ❌ 0.1 >= 0.5 is FALSE
210
- return True, "emergency"
211
 
212
- return False, "continue_searching" # ❌ LOOPS FOREVER
213
- ```
214
-
215
- ### The Fix (Advanced Mode - Already Works Correctly)
216
 
217
- ```python
218
- # Advanced Mode doesn't have this bug because:
219
- # 1. JudgeAgent says "SUFFICIENT EVIDENCE" in natural language
220
- # 2. Manager agent understands this and delegates to ReportAgent
221
- # 3. No hardcoded thresholds to bypass
222
-
223
- # The Manager agent prompt (src/orchestrators/advanced.py:152):
224
- """
225
- When JudgeAgent says "SUFFICIENT EVIDENCE" or "STOP SEARCHING":
226
- β†’ IMMEDIATELY delegate to ReportAgent for synthesis
227
- """
228
- ```
229
 
230
- **By integrating Simple Mode's capability into Advanced Mode, the bug disappears** because Advanced Mode's termination logic works correctly.
 
 
 
 
 
 
231
 
232
  ---
233
 
234
- ## Migration Plan
235
-
236
- ### Phase 1: Create HuggingFaceChatClient (Enables Integration)
237
-
238
- - [ ] Create `src/clients/` package
239
- - [ ] Implement `HuggingFaceChatClient` (~150 lines)
240
- - Extends `agent_framework.BaseChatClient`
241
- - Wraps `huggingface_hub.InferenceClient.chat_completion()`
242
- - Implements required abstract methods
243
- - [ ] Implement `get_chat_client()` factory (~50 lines)
244
- - [ ] Add unit tests
245
-
246
- **Exit Criteria**: `get_chat_client()` returns working HuggingFace client when no API key.
247
-
248
- ### Phase 2: Integrate into Advanced Mode (Fixes P0 Bug)
249
 
250
- - [ ] Update `AdvancedOrchestrator` to use `get_chat_client()`
251
- - [ ] Update `magentic_agents.py` type hints: `OpenAIChatClient` β†’ `BaseChatClient`
252
- - [ ] Update `orchestrators/factory.py` to always return `AdvancedOrchestrator`
253
- - [ ] Update `app.py` to remove mode toggle (everyone gets Advanced Mode)
254
- - [ ] Archive `simple.py` to `docs/archive/` (for reference)
255
- - [ ] Migrate Simple Mode tests to Advanced Mode tests
256
 
257
- **Exit Criteria**: Free-tier users get Advanced Mode with HuggingFace backend. P0 bug gone.
258
-
259
- ### Phase 3: Cleanup (Optional)
260
-
261
- - [ ] Remove Anthropic provider code (Issue #110)
262
- - [ ] Add Gemini support (Issue #109)
263
- - [ ] Delete archived files after verification period
264
 
265
  ---
266
 
267
- ## Files Changed
268
-
269
- ### New Files (~200 lines)
270
-
271
- | File | Lines | Purpose |
272
- |------|-------|---------|
273
- | `src/clients/__init__.py` | ~10 | Package exports |
274
- | `src/clients/factory.py` | ~50 | `get_chat_client()` |
275
- | `src/clients/huggingface.py` | ~150 | HuggingFace adapter |
276
 
277
- ### Modified Files
278
 
279
- | File | Change |
280
- |------|--------|
281
- | `src/orchestrators/advanced.py` | Use `get_chat_client()` instead of `OpenAIChatClient` |
282
- | `src/orchestrators/factory.py` | Always return `AdvancedOrchestrator` |
283
- | `src/agents/magentic_agents.py` | Type hints: `OpenAIChatClient` β†’ `BaseChatClient` |
284
- | `src/app.py` | Remove mode toggle, always use Advanced |
285
 
286
- ### Archived Files (NOT deleted from git history)
287
-
288
- | File | Lines | Reason |
289
- |------|-------|--------|
290
- | `src/orchestrators/simple.py` | 778 | Functionality INTEGRATED, code retired |
291
- | `src/tools/search_handler.py` | 219 | Manager agent handles this now |
292
 
293
  ---
294
 
295
- ## Verification Checklist
296
-
297
- ### Technical Prerequisites (Verified βœ…)
298
-
299
- - [x] `agent_framework.BaseChatClient` exists
300
- - [x] Abstract methods: `_inner_get_response`, `_inner_get_streaming_response`
301
- - [x] `huggingface_hub.InferenceClient.chat_completion()` exists
302
- - [x] `chat_completion()` has `tools` parameter (verified in 0.36.0)
303
- - [x] HuggingFace supports Llama 3.1 70B via free inference
304
- - [x] **Dependency pinned**: `huggingface-hub>=0.24.0` in pyproject.toml (required for stable tool calling)
305
-
306
- ### Capability Preservation Checklist
307
 
308
- After implementation, verify:
 
 
 
309
 
310
- - [ ] User with OpenAI key β†’ Gets Advanced Mode with OpenAI (GPT-5)
311
- - [ ] User with NO key β†’ Gets Advanced Mode with HuggingFace (Llama 3.1 70B)
312
- - [ ] Free-tier search works (PubMed, ClinicalTrials, EuropePMC)
313
- - [ ] Free-tier synthesis works (LLM generates report)
314
- - [ ] No more "continue_searching" infinite loops (P0 bug fixed)
315
 
316
  ---
317
 
318
- ## Implementation Notes (From Independent Audit)
319
-
320
- ### Dependency Requirement βœ… FIXED
321
-
322
- The `huggingface-hub` package must be `>=0.24.0` for stable `chat_completion` with tools support.
323
-
324
- ```toml
325
- # pyproject.toml - ALREADY UPDATED
326
- "huggingface-hub>=0.24.0", # Required for stable chat_completion with tools
327
- ```
328
-
329
- ### Llama 3.1 Prompt Considerations ⚠️
330
-
331
- The Manager agent prompt in `AdvancedOrchestrator._create_task_prompt()` was optimized for GPT-5. When using Llama 3.1 70B via HuggingFace, the prompt **may need tuning** to ensure strict adherence to delegation logic.
332
-
333
- **Potential issue**: Llama 3.1 may not immediately delegate to ReportAgent when JudgeAgent says "SUFFICIENT EVIDENCE".
334
-
335
- **Mitigation**: During implementation, test with HuggingFace backend and add reinforcement phrases if needed:
336
- - "You MUST delegate to ReportAgent when you see SUFFICIENT EVIDENCE"
337
- - "Do NOT continue searching after Judge approves"
338
 
339
- This is a **runtime verification** task, not a spec change.
340
 
341
  ---
342
 
343
  ## References
344
 
345
- - Microsoft Agent Framework: `agent_framework.BaseChatClient`
346
- - HuggingFace Inference: `huggingface_hub.InferenceClient`
347
- - Issue #105: Deprecate Simple Mode β†’ **Reframe as "Integrate Simple Mode"**
348
- - Issue #109: Simplify Provider Architecture
349
- - Issue #110: Remove Anthropic Provider Support
350
- - Issue #113: P0 Bug - Simple Mode ignores forced synthesis
 
1
+ # SPEC_16: Unified Architecture
2
 
3
+ **Status**: BLOCKED - Waiting for upstream PR #2566
4
+ **Priority**: P0
5
+ **Issue**: [#105](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/issues/105)
6
  **Created**: 2025-12-01
 
7
 
8
  ---
9
 
10
+ ## The Architecture (No Bullshit Version)
11
 
12
  ```text
+ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
+ β”‚                   UNIFIED ARCHITECTURE                   β”‚
+ β”‚                                                          β”‚
+ β”‚                  User provides API key?                  β”‚
+ β”‚                                                          β”‚
+ β”‚    NO (Free Tier)              YES (Paid Tier)           β”‚
+ β”‚    ──────────────              ───────────────           β”‚
+ β”‚    HuggingFace backend         OpenAI backend            β”‚
+ β”‚    Qwen 2.5 72B (free)         GPT-5 (paid)              β”‚
+ β”‚                                                          β”‚
+ β”‚            SAME orchestration logic for both             β”‚
+ β”‚           ONE codebase, different LLM backends           β”‚
+ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
  ```
27
 
28
+ **No "modes."** Just: do you have an API key or not?
29
 
30
  ---
31
 
32
+ ## Framework Stack
33
 
34
+ DeepBoner uses TWO frameworks that work TOGETHER:
35
 
36
+ | Framework | Role | Files |
37
+ |-----------|------|-------|
38
+ | **Microsoft Agent Framework** | Multi-agent ORCHESTRATION | `src/orchestrators/advanced.py` |
39
+ | **Pydantic AI** | Structured OUTPUTS & validation | `src/agent_factory/judges.py`, `src/agents/*.py` |
40
 
41
+ ### Why Both?
 
 
42
 
43
+ - **Microsoft AF** handles: Manager β†’ Search β†’ Judge β†’ Report agent coordination
44
+ - **Pydantic AI** handles: Structured responses, type validation, schema enforcement
45
 
46
+ They are **NOT mutually exclusive**. They are **complementary**:
47
+ - Microsoft AF = the highway system (routes agents)
48
+ - Pydantic AI = the cargo containers (structures data)
 
49
 
50
+ ### Current Integration
51
 
52
+ | Component | Framework | Purpose |
53
+ |-----------|-----------|---------|
54
+ | `AdvancedOrchestrator` | Microsoft AF | Coordinates multi-agent workflow |
55
+ | `JudgeAssessment` | Pydantic AI | Structured judge output with validation |
56
+ | `Evidence`, `Citation` | Pydantic | Validated data models |
57
+ | Agent tool calling | Microsoft AF | Function execution |
58
+ | Agent structured output | Pydantic AI | Response validation |
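The division of labor is easiest to see in miniature. Below is a minimal sketch of the Pydantic side, assuming a simplified schema (field names here are illustrative, not the real `JudgeAssessment` definition in `src/agent_factory/judges.py`):

```python
from pydantic import BaseModel, Field, ValidationError

class JudgeAssessment(BaseModel):
    """Illustrative judge schema -- Pydantic enforces types and ranges."""
    sufficient: bool = Field(description="Is the evidence sufficient to report?")
    confidence: float = Field(ge=0.0, le=1.0)
    rationale: str

# A well-formed LLM response validates cleanly...
ok = JudgeAssessment.model_validate(
    {"sufficient": True, "confidence": 0.9, "rationale": "Three RCTs found."}
)

# ...while malformed output raises instead of passing garbage downstream
try:
    JudgeAssessment.model_validate(
        {"sufficient": "maybe", "confidence": 2.0, "rationale": ""}
    )
    errors = []
except ValidationError as exc:
    errors = exc.errors()
```

Microsoft AF never sees these models directly; it only routes the agents whose outputs Pydantic has already validated.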
59
 
60
  ---
61
 
62
+ ## LLM Backend Selection
63
 
64
+ Auto-detected by `src/clients/factory.py`:
65
 
66
+ | Condition | Backend | Model |
67
+ |-----------|---------|-------|
68
+ | User provides OpenAI key | OpenAI | GPT-5 |
69
+ | No API key | HuggingFace | Qwen 2.5 72B (free) |
 
 
 
70
 
71
  ---
72
 
73
+ ## Current Blocker
74
 
75
+ **Upstream Bug #2562**: Microsoft Agent Framework produces `repr()` garbage for tool-call-only messages.
76
 
77
+ **Fix**: [PR #2566](https://github.com/microsoft/agent-framework/pull/2566) - waiting for merge.
 
 
 
 
 
78
 
79
+ **Once merged**:
80
+ 1. `uv add agent-framework@latest`
81
+ 2. Verify free tier works
82
+ 3. Done
 
 
83
 
84
  ---
85
 
86
+ ## What Was Deleted
87
 
88
+ `simple.py` (778 lines) was a SEPARATE orchestrator that created a parallel universe:
89
+ - Used Pydantic AI directly for LLM calls
90
+ - Had its own while-loop orchestration
91
+ - Duplicated search/judge logic
92
 
93
+ Now there's ONE orchestrator with different backends.
 
 
 
 
94
 
95
  ---
96
 
97
+ ## Files
98
 
99
+ | File | Framework | Purpose |
100
+ |------|-----------|---------|
101
+ | `src/orchestrators/advanced.py` | Microsoft AF | Multi-agent orchestration |
102
+ | `src/clients/factory.py` | - | Auto-selects LLM backend |
103
+ | `src/clients/huggingface.py` | - | HuggingFace adapter (free tier) |
104
+ | `src/agent_factory/judges.py` | Pydantic AI | Structured judge assessments |
105
+ | `src/agents/report_agent.py` | Pydantic AI | Structured report generation |
106
+ | `src/utils/models.py` | Pydantic | Data models (Evidence, Citation) |
107
 
108
  ---
109
 
110
  ## References
111
 
112
+ - [Microsoft Agent Framework](https://github.com/microsoft/agent-framework) - Multi-agent orchestration
113
+ - [Pydantic AI](https://ai.pydantic.dev/) - Structured outputs framework
114
+ - [Multi-Agent Research System with Pydantic](https://www.analyticsvidhya.com/blog/2025/03/multi-agent-research-assistant-system-using-pydantic/) - Architecture pattern
115
+ - [AG-UI Protocol](https://www.copilotkit.ai/blog/introducing-pydantic-ai-integration-with-ag-ui) - How frameworks integrate
 
 
docs/specs/SPEC_17_ACCUMULATOR_PATTERN.md ADDED
@@ -0,0 +1,62 @@
1
+ # SPEC 17: Accumulator Pattern for Agent Events
2
+
3
+ **Status**: IMPLEMENTED
4
+ **Created**: 2025-12-02
5
+ **Author**: AI Agent
6
+ **Related**: P0_REPR_BUG_ROOT_CAUSE_ANALYSIS.md
7
+
8
+ ## 1. Context
9
+
10
+ The Microsoft Agent Framework event model has a specific intended usage pattern:
11
+ - `MagenticAgentDeltaEvent.text` β†’ **Content Source** (Streaming)
12
+ - `MagenticAgentMessageEvent` β†’ **Completion Signal** (End of Turn)
13
+
14
+ Our previous implementation incorrectly attempted to extract content from `MagenticAgentMessageEvent.message`. This property is not designed for content extraction and can contain internal representation data (repr strings) for tool-only messages. This led to the "repr bug" where users saw raw Python object strings in the UI.
15
+
16
+ The **Accumulator Pattern** aligns our codebase with Microsoft's intended architecture (as demonstrated in their `04_magentic_one.py` sample) and resolves the display issues by using the correct event data source.
17
+
18
+ ## 2. The Solution: Accumulator Pattern
19
+
20
+ Instead of relying on the final message event for content, we adopt the **Accumulator Pattern**: streamed delta events become the sole source of displayed text.
21
+
22
+ ### 2.1 Core Concept
23
+
24
+ 1. **Streaming is Truth**: `MagenticAgentDeltaEvent` is the exclusive source of text content. These events are not affected by the upstream bug.
25
+ 2. **Accumulation**: The orchestrator maintains a stateful buffer (`current_message_buffer`) that appends text from delta events.
26
+ 3. **Signal Processing**: `MagenticAgentMessageEvent` is treated solely as a completion signal ("end of turn"). When received, we consume the buffer to form the final UI message and then clear the buffer.
27
+
28
+ ### 2.2 Logic Flow
29
+
30
+ ```python
+ current_message_buffer = ""
+
+ for event in stream:
+     if isinstance(event, MagenticAgentDeltaEvent):
+         current_message_buffer += event.text
+         emit_streaming_event(event.text)
+
+     elif isinstance(event, MagenticAgentMessageEvent):
+         # IGNORE event.message (it may contain corrupted repr output)
+         final_text = current_message_buffer or "Action completed (Tool Call)"
+         emit_complete_event(final_text)
+         current_message_buffer = ""
+ ```
47
+
48
+ ## 3. Test Plan
49
+
50
+ To verify this pattern ensures correct output regardless of upstream bugs, we define the following test scenarios:
51
+
52
+ ### 3.1 Scenario A: Standard Text Message
53
+ - **Input**: Sequence of `MagenticAgentDeltaEvent` (with text parts) -> `MagenticAgentMessageEvent` (with corrupted repr).
54
+ - **Expected Output**: The `AgentEvent` emitted at the end must contain the concatenated text from the deltas, NOT the repr string.
55
+
56
+ ### 3.2 Scenario B: Tool Call (No Text)
57
+ - **Input**: No text deltas -> `MagenticAgentMessageEvent` (with corrupted repr).
58
+ - **Expected Output**: The `AgentEvent` should contain a fallback message (e.g., "Action completed (Tool Call)"), NOT the repr string.
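The fallback behavior in Scenario B reduces to a small predicate plus a preference order. A sketch of the heuristic (simplified from `_handle_completion_event`; function names here are illustrative):

```python
def looks_like_repr(text: str) -> bool:
    """Default object reprs look like '<ClassName object at 0x7f...>'."""
    return text.startswith("<") and "object at" in text

def resolve_final_text(buffer: str, raw_message_text: str) -> str:
    """Prefer the accumulated delta buffer; only fall back to non-repr message text."""
    if buffer:
        return buffer
    if raw_message_text and not looks_like_repr(raw_message_text):
        return raw_message_text
    return "Action completed (Tool Call)"

# Scenario A: deltas accumulated, message corrupted -> buffer wins
scenario_a = resolve_final_text("Hello World", "<ChatMessage object at 0xDEADBEEF>")
# Scenario B: no deltas, message corrupted -> fallback string
scenario_b = resolve_final_text("", "<ChatMessage object at 0xDEADBEEF>")
```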
59
+
60
+ ## 4. Implementation Details
61
+
62
+ The pattern is implemented in `src/orchestrators/advanced.py` within the `run()` method loop. It bypasses `_process_event` for these specific event types to ensure strict control over data flow.
src/orchestrators/advanced.py CHANGED
@@ -17,7 +17,7 @@ Design Patterns:
17
 
18
  import asyncio
19
  from collections.abc import AsyncGenerator
20
- from typing import TYPE_CHECKING, Any
21
 
22
  import structlog
23
  from agent_framework import (
@@ -181,6 +181,69 @@ The final output should be a structured research report."""
181
 
182
  return f"Round {iteration}/{self._max_rounds} (~{est_display} remaining)"
183
184
  async def run(self, query: str) -> AsyncGenerator[AgentEvent, None]:
185
  """
186
  Run the workflow.
@@ -193,18 +256,10 @@ The final output should be a structured research report."""
193
  """
194
  logger.info("Starting Advanced orchestrator", query=query)
195
 
196
- yield AgentEvent(
197
- type="started",
198
- message=f"Starting research (Advanced mode): {query}",
199
- iteration=0,
200
- )
201
 
202
  # Initialize context state
203
- yield AgentEvent(
204
- type="progress",
205
- message="Loading embedding service (LlamaIndex/ChromaDB)...",
206
- iteration=0,
207
- )
208
  embedding_service = self._init_embedding_service()
209
 
210
  yield AgentEvent(
@@ -238,25 +293,52 @@ The final output should be a structured research report."""
238
  iteration = 0
239
  final_event_received = False
240
 
 
 
 
 
 
 
241
  try:
242
  async with asyncio.timeout(self._timeout_seconds):
243
  async for event in workflow.run_stream(task):
244
- agent_event = self._process_event(event, iteration)
245
- if agent_event:
246
- if isinstance(event, MagenticAgentMessageEvent):
247
- iteration += 1
248
- progress_msg = self._get_progress_message(iteration)
249
-
250
- # Yield progress update before the agent action
 
 
251
  yield AgentEvent(
252
- type="progress",
253
- message=progress_msg,
 
254
  iteration=iteration,
255
  )
256
257
  if agent_event.type == "complete":
258
  final_event_received = True
259
-
260
  yield agent_event
261
 
262
  # GUARANTEE: Always emit termination event if stream ends without one
@@ -278,52 +360,8 @@ The final output should be a structured research report."""
278
  )
279
 
280
  except TimeoutError:
281
- logger.warning("Workflow timed out", iterations=iteration)
282
-
283
- # ACTUALLY synthesize from gathered evidence
284
- try:
285
- from src.agents.magentic_agents import create_report_agent
286
- from src.agents.state import get_magentic_state
287
-
288
- state = get_magentic_state()
289
- memory = state.memory
290
-
291
- # Get evidence summary from memory
292
- evidence_summary = await memory.get_context_summary()
293
-
294
- # Create and invoke ReportAgent for synthesis
295
- report_agent = create_report_agent(self._chat_client, domain=self.domain)
296
-
297
- yield AgentEvent(
298
- type="synthesizing",
299
- message="Workflow timed out. Synthesizing available evidence...",
300
- iteration=iteration,
301
- )
302
-
303
- # Invoke ReportAgent directly
304
- # Note: ChatAgent.run() returns the final response string
305
- synthesis_result = await report_agent.run(
306
- "Synthesize research report from this evidence. "
307
- f"If evidence is sparse, say so.\n\n{evidence_summary}"
308
- )
309
-
310
- yield AgentEvent(
311
- type="complete",
312
- message=str(synthesis_result),
313
- data={"reason": "timeout_synthesis", "iterations": iteration},
314
- iteration=iteration,
315
- )
316
- except Exception as synth_error:
317
- logger.error("Timeout synthesis failed", error=str(synth_error))
318
- yield AgentEvent(
319
- type="complete",
320
- message=(
321
- f"Research timed out after {iteration} rounds. "
322
- f"Evidence gathered but synthesis failed: {synth_error}"
323
- ),
324
- data={"reason": "timeout_synthesis_failed", "iterations": iteration},
325
- iteration=iteration,
326
- )
327
 
328
  except Exception as e:
329
  logger.error("Workflow failed", error=str(e))
@@ -333,6 +371,45 @@ The final output should be a structured research report."""
333
  iteration=iteration,
334
  )
335
336
  def _extract_text(self, message: Any) -> str:
337
  """
338
  Defensively extract text from a message object.
@@ -384,7 +461,9 @@ The final output should be a structured research report."""
384
  # The repr is useless for display purposes
385
  return ""
386
 
387
- def _get_event_type_for_agent(self, agent_name: str) -> str:
 
 
388
  """Map agent name to appropriate event type.
389
 
390
  Args:
@@ -444,17 +523,8 @@ The final output should be a structured research report."""
444
  iteration=iteration,
445
  )
446
 
447
- elif isinstance(event, MagenticAgentMessageEvent):
448
- agent_name = event.agent_id or "unknown"
449
- text = self._extract_text(event.message)
450
- event_type = self._get_event_type_for_agent(agent_name)
451
-
452
- # All returned types are valid AgentEvent.type literals
453
- return AgentEvent(
454
- type=event_type, # type: ignore[arg-type]
455
- message=f"{agent_name}: {self._smart_truncate(text)}",
456
- iteration=iteration + 1,
457
- )
458
 
459
  elif isinstance(event, MagenticFinalResultEvent):
460
  text = self._extract_text(event.message) if event.message else "No result"
@@ -465,14 +535,8 @@ The final output should be a structured research report."""
465
  iteration=iteration,
466
  )
467
 
468
- elif isinstance(event, MagenticAgentDeltaEvent):
469
- if event.text:
470
- return AgentEvent(
471
- type="streaming",
472
- message=event.text,
473
- data={"agent_id": event.agent_id},
474
- iteration=iteration,
475
- )
476
 
477
  elif isinstance(event, WorkflowOutputEvent):
478
  if event.data:
 
17
 
18
  import asyncio
19
  from collections.abc import AsyncGenerator
20
+ from typing import TYPE_CHECKING, Any, Literal
21
 
22
  import structlog
23
  from agent_framework import (
 
181
 
182
  return f"Round {iteration}/{self._max_rounds} (~{est_display} remaining)"
183
 
184
+ async def _init_workflow_events(self, query: str) -> AsyncGenerator[AgentEvent, None]:
185
+ """Yield initialization events."""
186
+ yield AgentEvent(
187
+ type="started",
188
+ message=f"Starting research (Advanced mode): {query}",
189
+ iteration=0,
190
+ )
191
+
192
+ yield AgentEvent(
193
+ type="progress",
194
+ message="Loading embedding service (LlamaIndex/ChromaDB)...",
195
+ iteration=0,
196
+ )
197
+
198
+ async def _handle_timeout(self, iteration: int) -> AsyncGenerator[AgentEvent, None]:
199
+ """Handle workflow timeout by attempting synthesis."""
200
+ logger.warning("Workflow timed out", iterations=iteration)
201
+
202
+ # ACTUALLY synthesize from gathered evidence
203
+ try:
204
+ from src.agents.magentic_agents import create_report_agent
205
+ from src.agents.state import get_magentic_state
206
+
207
+ state = get_magentic_state()
208
+ memory = state.memory
209
+
210
+ # Get evidence summary from memory
211
+ evidence_summary = await memory.get_context_summary()
212
+
213
+ # Create and invoke ReportAgent for synthesis
214
+ report_agent = create_report_agent(self._chat_client, domain=self.domain)
215
+
216
+ yield AgentEvent(
217
+ type="synthesizing",
218
+ message="Workflow timed out. Synthesizing available evidence...",
219
+ iteration=iteration,
220
+ )
221
+
222
+ # Invoke ReportAgent directly
223
+ # Note: ChatAgent.run() returns AgentRunResponse; access text via .text
224
+ synthesis_result = await report_agent.run(
225
+ "Synthesize research report from this evidence. "
226
+ f"If evidence is sparse, say so.\n\n{evidence_summary}"
227
+ )
228
+
229
+ yield AgentEvent(
230
+ type="complete",
231
+ message=synthesis_result.text,
232
+ data={"reason": "timeout_synthesis", "iterations": iteration},
233
+ iteration=iteration,
234
+ )
235
+ except Exception as synth_error:
236
+ logger.error("Timeout synthesis failed", error=str(synth_error))
237
+ yield AgentEvent(
238
+ type="complete",
239
+ message=(
240
+ f"Research timed out after {iteration} rounds. "
241
+ f"Evidence gathered but synthesis failed: {synth_error}"
242
+ ),
243
+ data={"reason": "timeout_synthesis_failed", "iterations": iteration},
244
+ iteration=iteration,
245
+ )
246
+
247
  async def run(self, query: str) -> AsyncGenerator[AgentEvent, None]:
248
  """
249
  Run the workflow.
 
256
  """
257
  logger.info("Starting Advanced orchestrator", query=query)
258
 
259
+ async for event in self._init_workflow_events(query):
260
+ yield event
 
 
 
261
 
262
  # Initialize context state
263
  embedding_service = self._init_embedding_service()
264
 
265
  yield AgentEvent(
 
293
  iteration = 0
294
  final_event_received = False
295
 
296
+ # ACCUMULATOR PATTERN: Track streaming content to bypass upstream Repr Bug
297
+ # Upstream bug in _magentic.py flattens message.contents and sets message.text
298
+ # to repr(message) if text is empty. We must reconstruct text from Deltas.
299
+ current_message_buffer: str = ""
300
+ current_agent_id: str | None = None
301
+
302
  try:
303
  async with asyncio.timeout(self._timeout_seconds):
304
  async for event in workflow.run_stream(task):
305
+ # 1. Handle Streaming (Source of Truth for Content)
306
+ if isinstance(event, MagenticAgentDeltaEvent):
307
+ # Detect agent switch to clear buffer
308
+ if event.agent_id != current_agent_id:
309
+ current_message_buffer = ""
310
+ current_agent_id = event.agent_id
311
+
312
+ if event.text:
313
+ current_message_buffer += event.text
314
  yield AgentEvent(
315
+ type="streaming",
316
+ message=event.text,
317
+ data={"agent_id": event.agent_id},
318
  iteration=iteration,
319
  )
320
+ continue
321
+
322
+ # 2. Handle Completion Signal
323
+ # We use our accumulated buffer instead of the corrupted event.message
324
+ if isinstance(event, MagenticAgentMessageEvent):
325
+ iteration += 1
326
 
327
+ comp_event, prog_event = self._handle_completion_event(
328
+ event, current_message_buffer, iteration
329
+ )
330
+ yield comp_event
331
+ yield prog_event
332
+
333
+ # Clear buffer after consuming
334
+ current_message_buffer = ""
335
+ continue
336
+
337
+ # 3. Handle other events normally
338
+ agent_event = self._process_event(event, iteration)
339
+ if agent_event:
340
  if agent_event.type == "complete":
341
  final_event_received = True
 
342
  yield agent_event
343
 
344
  # GUARANTEE: Always emit termination event if stream ends without one
 
360
  )
361
 
362
  except TimeoutError:
363
+ async for event in self._handle_timeout(iteration):
364
+ yield event
365
 
366
  except Exception as e:
367
  logger.error("Workflow failed", error=str(e))
 
371
  iteration=iteration,
372
  )
373
 
374
+ def _handle_completion_event(
375
+ self, event: MagenticAgentMessageEvent, buffer: str, iteration: int
376
+ ) -> tuple[AgentEvent, AgentEvent]:
377
+ """Handle an agent completion event using the accumulated buffer."""
378
+ # Use buffer if available, otherwise fall back cautiously
379
+ # (Only fall back if buffer empty, which implies tool-only turn)
380
+ text_content = buffer
381
+ if not text_content:
382
+ # Try extraction but ignore repr strings AND empty strings
383
+ raw_text = self._extract_text(event.message)
384
+ if raw_text and not (raw_text.startswith("<") and "object at" in raw_text):
385
+ text_content = raw_text
386
+ else:
387
+ text_content = "Action completed (Tool Call)"
388
+
389
+ agent_name = event.agent_id or "unknown"
390
+ event_type = self._get_event_type_for_agent(agent_name)
391
+
392
+ completion_event = AgentEvent(
393
+ type=event_type,
394
+ message=f"{agent_name}: {text_content[:200]}...",
395
+ iteration=iteration,
396
+ )
397
+
398
+ # Progress update
399
+ rounds_remaining = max(self._max_rounds - iteration, 0)
400
+ est_seconds = rounds_remaining * 45
401
+ est_display = (
402
+ f"{est_seconds // 60}m {est_seconds % 60}s" if est_seconds >= 60 else f"{est_seconds}s"
403
+ )
404
+
405
+ progress_event = AgentEvent(
406
+ type="progress",
407
+ message=f"Round {iteration}/{self._max_rounds} (~{est_display} remaining)",
408
+ iteration=iteration,
409
+ )
410
+
411
+ return completion_event, progress_event
412
+
413
  def _extract_text(self, message: Any) -> str:
414
  """
415
  Defensively extract text from a message object.
 
461
  # The repr is useless for display purposes
462
  return ""
463
 
464
+ def _get_event_type_for_agent(
465
+ self, agent_name: str
466
+ ) -> Literal["search_complete", "judge_complete", "hypothesizing", "synthesizing", "judging"]:
467
  """Map agent name to appropriate event type.
468
 
469
  Args:
 
523
  iteration=iteration,
524
  )
525
 
526
+ # NOTE: MagenticAgentMessageEvent is handled in run() loop with Accumulator Pattern
527
+ # (see lines 322-335) and never reaches this method due to `continue` statement.
528
 
529
  elif isinstance(event, MagenticFinalResultEvent):
530
  text = self._extract_text(event.message) if event.message else "No result"
 
535
  iteration=iteration,
536
  )
537
 
538
+ # NOTE: MagenticAgentDeltaEvent is handled in run() loop with Accumulator Pattern
539
+ # (see lines 306-320) and never reaches this method due to `continue` statement.
540
 
541
  elif isinstance(event, WorkflowOutputEvent):
542
  if event.data:
tests/unit/orchestrators/test_accumulator_pattern.py ADDED
@@ -0,0 +1,294 @@
1
+ """
2
+ Test the Accumulator Pattern for Microsoft Agent Framework event handling.
3
+
4
+ This tests SPEC 17: We use MagenticAgentDeltaEvent.text as the sole source of content,
5
+ and MagenticAgentMessageEvent as a signal only (ignoring .message to avoid repr bug).
6
+ """
7
+
8
+ import importlib
9
+ import sys
10
+ import types
11
+ from unittest.mock import MagicMock, patch
12
+
13
+ import pytest
14
+
15
+
16
+ # --- Create real event classes ---
17
+ class MockDeltaEvent:
18
+ """Simulates MagenticAgentDeltaEvent with streaming text."""
19
+
20
+ def __init__(self, text: str, agent_id: str = "TestAgent"):
21
+ self.text = text
22
+ self.agent_id = agent_id
23
+
24
+
25
+ class MockMessageEvent:
26
+ """Simulates MagenticAgentMessageEvent with potentially corrupted .message."""
27
+
28
+ def __init__(self, message_text: str, agent_id: str = "TestAgent"):
29
+ self.message = MagicMock()
30
+ self.message.text = message_text # This could be repr garbage
31
+ self.agent_id = agent_id
32
+ self.text = None # No top-level .text on MessageEvent
33
+
34
+
35
+ class MockFinalResultEvent:
36
+ """Simulates MagenticFinalResultEvent."""
37
+
38
+ def __init__(self, text: str):
39
+ self.message = MagicMock()
40
+ self.message.text = text
41
+ self.text = None
42
+
43
+
44
+ class MockOrchestratorMessageEvent:
45
+ """Simulates MagenticOrchestratorMessageEvent."""
46
+
47
+ def __init__(self, kind: str = "user_task", message: str = "test"):
48
+ self.kind = kind
49
+ self.message = MagicMock()
50
+ self.message.text = message
51
+
52
+
53
+ class MockWorkflowOutputEvent:
54
+ """Simulates WorkflowOutputEvent."""
55
+
56
+ def __init__(self, data=None):
57
+ self.data = data
58
+
59
+
60
+ # Pass-through decorators
61
+ def mock_use_function_invocation(func=None):
62
+ return func if func else lambda f: f
63
+
64
+
65
+ def mock_use_observability(func=None):
66
+ return func if func else lambda f: f
67
+
68
+
69
+ @pytest.fixture
70
+ def mock_agent_framework():
71
+ """
72
+ Mock the agent_framework module structure in sys.modules.
73
+ """
74
+ # Create the mock module structure
75
+ mock_af = types.ModuleType("agent_framework")
76
+ mock_af_openai = types.ModuleType("agent_framework.openai")
77
+ mock_af_middleware = types.ModuleType("agent_framework._middleware")
78
+ mock_af_tools = types.ModuleType("agent_framework._tools")
79
+ mock_af_types = types.ModuleType("agent_framework._types")
80
+ mock_af_observability = types.ModuleType("agent_framework.observability")
81
+
82
+ # Populate submodules
83
+ mock_af.openai = mock_af_openai
84
+ mock_af._middleware = mock_af_middleware
85
+ mock_af._tools = mock_af_tools
86
+ mock_af._types = mock_af_types
87
+ mock_af.observability = mock_af_observability
88
+
89
+ # Assign our REAL event classes as the module-level types
90
+ mock_af.MagenticAgentDeltaEvent = MockDeltaEvent
91
+ mock_af.MagenticAgentMessageEvent = MockMessageEvent
92
+ mock_af.MagenticFinalResultEvent = MockFinalResultEvent
93
+ mock_af.MagenticOrchestratorMessageEvent = MockOrchestratorMessageEvent
94
+ mock_af.WorkflowOutputEvent = MockWorkflowOutputEvent
95
+
96
+ # Mock other classes
97
+ mock_af.MagenticBuilder = MagicMock
98
+ mock_af.ChatAgent = MagicMock
99
+ mock_af.ai_function = MagicMock
100
+ mock_af.BaseChatClient = MagicMock
101
+ mock_af.ToolProtocol = MagicMock
102
+ mock_af.ChatMessage = MagicMock
103
+ mock_af.ChatResponse = MagicMock
104
+ mock_af.ChatResponseUpdate = MagicMock
105
+ mock_af.ChatOptions = MagicMock
106
+ mock_af.FinishReason = MagicMock
107
+ mock_af.Role = MagicMock
108
+
109
+ # Populate symbols in submodules
110
+ mock_af_openai.OpenAIChatClient = MagicMock
111
+ mock_af_middleware.use_chat_middleware = MagicMock
112
+ mock_af_tools.use_function_invocation = mock_use_function_invocation
113
+ mock_af_types.FunctionCallContent = MagicMock
114
+ mock_af_types.FunctionResultContent = MagicMock
115
+ mock_af_observability.use_observability = mock_use_observability
116
+
117
+ # Patch sys.modules to include our mocks
118
+ with patch.dict(
119
+ sys.modules,
120
+ {
121
+ "agent_framework": mock_af,
122
+ "agent_framework.openai": mock_af_openai,
123
+ "agent_framework._middleware": mock_af_middleware,
124
+ "agent_framework._tools": mock_af_tools,
125
+ "agent_framework._types": mock_af_types,
126
+ "agent_framework.observability": mock_af_observability,
127
+ },
128
+ ):
129
+ yield mock_af
130
+
131
+
132
+ @pytest.fixture(scope="module", autouse=True)
133
+ def cleanup_orchestrator_module():
134
+ """
135
+ Ensure src.orchestrators.advanced is restored to a clean state after tests.
136
+ This prevents 'Mock' classes from leaking into other tests via module globals.
137
+ """
138
+ yield
139
+ # After all tests in this module, reload the orchestrator module
140
+ # This will use the REAL agent_framework (since the mock fixture is teardown)
141
+ import src.orchestrators.advanced
142
+
143
+ importlib.reload(src.orchestrators.advanced)
144
+
145
+
146
+ @pytest.fixture
147
+ def mock_orchestrator(mock_agent_framework):
148
+ """
149
+ Create an AdvancedOrchestrator with all dependencies mocked.
150
+ Relies on reloading the module to pick up the mocked agent_framework.
151
+ """
152
+ # Import locally
153
+ import src.orchestrators.advanced
154
+
155
+ # RELOAD to ensure it picks up the mocked agent_framework from sys.modules
156
+ importlib.reload(src.orchestrators.advanced)
157
+
158
+ from src.orchestrators.advanced import AdvancedOrchestrator
159
+
160
+ with (
161
+ patch("src.orchestrators.advanced.get_chat_client"),
162
+ patch("src.orchestrators.advanced.get_embedding_service_if_available", return_value=None),
163
+ patch("src.orchestrators.advanced.init_magentic_state"),
164
+ patch("src.agents.state.ResearchMemory"),
165
+ patch("src.utils.service_loader.get_embedding_service", return_value=MagicMock()),
166
+ ):
167
+ orch = AdvancedOrchestrator(max_rounds=5)
168
+    yield orch
+
+
+@pytest.mark.unit
+@pytest.mark.asyncio
+async def test_accumulator_pattern_scenario_a_standard_text(mock_orchestrator):
+    """
+    Scenario A: Standard Text Message
+    Input: Deltas ("Hello", " World") -> MessageEvent (Poisoned Repr)
+    Expected: AgentEvent with "Hello World", NOT the repr string
+    """
+    events = [
+        MockDeltaEvent("Hello", agent_id="ChatBot"),
+        MockDeltaEvent(" World", agent_id="ChatBot"),
+        MockMessageEvent("<ChatMessage object at 0xDEADBEEF>", agent_id="ChatBot"),
+    ]
+
+    async def mock_stream(*args, **kwargs):
+        for event in events:
+            yield event
+
+    mock_workflow = MagicMock()
+    mock_workflow.run_stream = mock_stream
+
+    with patch.object(mock_orchestrator, "_build_workflow", return_value=mock_workflow):
+        generated_events = []
+        async for event in mock_orchestrator.run("test query"):
+            generated_events.append(event)
+
+        # Find the completion event for ChatBot (non-streaming)
+        chat_events = [
+            e for e in generated_events if "ChatBot" in str(e.message) and e.type != "streaming"
+        ]
+
+        assert len(chat_events) >= 1, (
+            f"Expected ChatBot events, got: {[e.message for e in generated_events]}"
+        )
+        final_event = chat_events[0]
+
+        # CRITICAL: Must contain accumulated text, NOT repr
+        assert "Hello World" in final_event.message or "Hello" in final_event.message
+        assert "<ChatMessage" not in final_event.message, f"Repr bug! Got: {final_event.message}"
+
+
+@pytest.mark.unit
+@pytest.mark.asyncio
+async def test_accumulator_pattern_scenario_b_tool_call(mock_orchestrator):
+    """
+    Scenario B: Tool Call (No Text Deltas)
+    Input: No Deltas -> MessageEvent (Poisoned Repr)
+    Expected: AgentEvent with fallback text, NOT the repr string
+    """
+    events = [
+        MockMessageEvent("<ChatMessage object at 0xDEADBEEF>", agent_id="SearchAgent"),
+    ]
+
+    async def mock_stream(*args, **kwargs):
+        for event in events:
+            yield event
+
+    mock_workflow = MagicMock()
+    mock_workflow.run_stream = mock_stream
+
+    with patch.object(mock_orchestrator, "_build_workflow", return_value=mock_workflow):
+        generated_events = []
+        async for event in mock_orchestrator.run("test query"):
+            generated_events.append(event)
+
+        # Find completion events for SearchAgent
+        search_events = [
+            e for e in generated_events if "SearchAgent" in str(e.message) and e.type != "streaming"
+        ]
+
+        assert len(search_events) >= 1, (
+            f"Expected SearchAgent events, got: {[e.message for e in generated_events]}"
+        )
+        final_event = search_events[0]
+
+        # CRITICAL: Should use fallback, NOT repr
+        assert "<ChatMessage" not in final_event.message, f"Repr bug! Got: {final_event.message}"
+        # Should contain fallback or tool indicator
+        assert "Action completed" in final_event.message or "Tool" in final_event.message
+
+
+@pytest.mark.unit
+@pytest.mark.asyncio
+async def test_accumulator_pattern_buffer_clearing(mock_orchestrator):
+    """
+    Verify buffer clears between agents.
+    Agent B should NOT inherit Agent A's accumulated text.
+    """
+    events = [
+        MockDeltaEvent("Agent A says hi", agent_id="AgentA"),
+        MockMessageEvent("irrelevant", agent_id="AgentA"),
+        MockDeltaEvent("Agent B responds", agent_id="AgentB"),
+        MockMessageEvent("irrelevant", agent_id="AgentB"),
+    ]
+
+    async def mock_stream(*args, **kwargs):
+        for event in events:
+            yield event
+
+    mock_workflow = MagicMock()
+    mock_workflow.run_stream = mock_stream
+
+    with patch.object(mock_orchestrator, "_build_workflow", return_value=mock_workflow):
+        generated_events = []
+        async for event in mock_orchestrator.run("test query"):
+            generated_events.append(event)
+
+        # Find non-streaming events for each agent
+        agent_a_events = [
+            e for e in generated_events if "AgentA" in str(e.message) and e.type != "streaming"
+        ]
+        agent_b_events = [
+            e for e in generated_events if "AgentB" in str(e.message) and e.type != "streaming"
+        ]
+
+        # Both should have completion events
+        assert len(agent_a_events) >= 1, f"No AgentA events: {[e.message for e in generated_events]}"
+        assert len(agent_b_events) >= 1, f"No AgentB events: {[e.message for e in generated_events]}"
+
+        # Agent A should have its own text
+        assert "Agent A" in agent_a_events[0].message
+        # Agent B should have its own text, NOT Agent A's
+        assert "Agent B" in agent_b_events[0].message
+        assert "Agent A" not in agent_b_events[0].message, "Buffer not cleared between agents!"
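The three scenarios above all exercise one buffer discipline: accumulate delta text per agent, emit it (or a fallback) on the completion event, and clear the buffer so the next agent starts fresh. A minimal standalone sketch of that pattern, with illustrative names only (`Accumulator`, `on_delta`, `on_message` are not the orchestrator's actual API):

```python
# Minimal sketch of the accumulator pattern these tests exercise.
# Names here are illustrative, not the orchestrator's real API.

class Accumulator:
    """Per-agent text buffer that sidesteps the repr-poisoned MessageEvent."""

    def __init__(self) -> None:
        self._buffers: dict[str, list[str]] = {}

    def on_delta(self, agent_id: str, text: str) -> None:
        # Scenario A: accumulate streaming delta text per agent.
        self._buffers.setdefault(agent_id, []).append(text)

    def on_message(self, agent_id: str) -> str:
        # pop() clears the buffer, so Agent B never inherits Agent A's text.
        parts = self._buffers.pop(agent_id, [])
        # Scenario B: no deltas seen (e.g. a pure tool call) -> fallback
        # text, never the event payload's repr string.
        return "".join(parts) if parts else "Action completed"


acc = Accumulator()
acc.on_delta("ChatBot", "Hello")
acc.on_delta("ChatBot", " World")
print(acc.on_message("ChatBot"))      # Hello World
print(acc.on_message("SearchAgent"))  # Action completed
```

The key design choice is that the completion event's own payload is never trusted: the emitted text always comes from the delta buffer or the fallback.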
tests/unit/orchestrators/test_advanced_timeout.py CHANGED
@@ -41,8 +41,11 @@ async def test_timeout_synthesizes_evidence():
     mock_get_state.return_value = mock_state
 
     # Setup mock ReportAgent
+    # ChatAgent.run() returns AgentRunResponse with .text property
     mock_report_agent = AsyncMock()
-    mock_report_agent.run.return_value = "Final Synthesized Report"
+    mock_response = MagicMock()
+    mock_response.text = "Final Synthesized Report"
+    mock_report_agent.run.return_value = mock_response
     mock_create_agent.return_value = mock_report_agent
 
     events = []
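The mock change above can be reproduced standalone. A hedged sketch of why the bare-string return value broke (`agent_bad`/`agent_ok` are illustrative names; the real `AgentRunResponse` is stood in by `MagicMock`):

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock

# Why the bare-string mock broke: the caller reads response.text,
# and a plain str has no .text attribute.
agent_bad = AsyncMock()
agent_bad.run.return_value = "Final Synthesized Report"

try:
    asyncio.run(agent_bad.run("query")).text
except AttributeError:
    print("bare string mock: AttributeError on .text")

# The fix: wrap the text in a response-like object exposing .text.
mock_response = MagicMock()
mock_response.text = "Final Synthesized Report"

agent_ok = AsyncMock()
agent_ok.run.return_value = mock_response
print(asyncio.run(agent_ok.run("query")).text)  # Final Synthesized Report
```

Note that `AsyncMock` makes `run()` awaitable automatically; only the return value's shape needed fixing.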