Spaces:

VibecoderMcSwaggins
/

DeepBoner

Paused

VibecoderMcSwaggins commited on Dec 4, 2025

Commit

a5b5479

1 Parent(s): 1bfc1df

refactor(orchestrator): DRY synthesis + constants + clean imports

Addresses senior review feedback on P1 fix code quality:

Priority 1 - DRY Violation:
- Unified `_handle_timeout()` and `_force_synthesis()` into `_synthesize_fallback(iteration, reason)`
- Removed 40+ lines of duplicate code
- Single method handles timeout, no_reporter, and max_rounds scenarios

Priority 2 - Redundant Imports:
- Added `get_magentic_state` to module-level imports
- Removed duplicate imports from inside methods

Priority 3 - Magic Strings:
- Added constants: REPORTER_AGENT_ID, SEARCHER_AGENT_ID, JUDGE_AGENT_ID, HYPOTHESIZER_AGENT_ID
- Updated `_get_event_type_for_agent()` to use constants
- Updated reporter detection in run() to use constant

Also:
- Fixed test patch paths (must patch in module namespace, not source)
- Added SPEC_ARCHITECTURAL_DEBT.md documenting Phase 2 refactoring roadmap

All 318 tests pass.

Files changed (3) hide show

SPEC_ARCHITECTURAL_DEBT.md +259 -0
src/orchestrators/advanced.py +42 -79
tests/unit/orchestrators/test_advanced_timeout.py +4 -2

SPEC_ARCHITECTURAL_DEBT.md ADDED Viewed

	@@ -0,0 +1,259 @@

+# Architectural Debt Specification
+> **Status**: IMPLEMENTED (Phase 1 Complete)
+> **Date**: 2025-12-04
+> **Scope**: `src/orchestrators/advanced.py` (Primary), System-Wide (Secondary)
+> **Purpose**: Roadmap for "DeepMind Status" Code Quality
+> **Author**: Claude (Senior Review Incorporated)
+---
+## Executive Summary
+The P1/P2 bug fixes in PR #124 introduced technical debt that must be addressed before the PR is considered "done". This spec documents three immediate priorities (DRY violation, redundant imports, magic strings) and five medium-term system-wide improvements.
+---
+## Part 1: Immediate Cleanup (MUST Complete Before PR Merge)
+### Priority 1: DRY Violation - Synthesis Methods
+**Problem**: `_handle_timeout()` (lines 201-248) and `_force_synthesis()` (lines 250-297) are **95% identical**.
+| `_handle_timeout()` | `_force_synthesis()` |
+|---------------------|----------------------|
+| Lines 201-248 (47 lines) | Lines 250-297 (47 lines) |
+| Yields "Workflow timed out. Synthesizing..." | Yields "Synthesizing research findings..." |
+| Error data: `timeout_synthesis` | Error data: `forced_synthesis` |
+| **Everything else is identical** | **Everything else is identical** |
+**SOLID Violation**: **DRY (Don't Repeat Yourself)**. Changes to synthesis logic must be made in two places. This is a maintenance nightmare and a source of future bugs.
+**Fix**: Extract unified method `_synthesize_fallback(iteration: int, reason: str)`:
+```python
+async def _synthesize_fallback(
+    self, iteration: int, reason: str
+) -> AsyncGenerator[AgentEvent, None]:
+    """
+    Unified fallback synthesis for all termination scenarios.
+    Args:
+        iteration: Current workflow iteration
+        reason: Why synthesis is being forced ("timeout", "no_reporter", "max_rounds")
+    """
+    status_messages = {
+        "timeout": "Workflow timed out. Synthesizing available evidence...",
+        "no_reporter": "Synthesizing research findings...",
+        "max_rounds": "Max rounds reached. Synthesizing findings...",
+    }
+    try:
+        state = get_magentic_state()
+        evidence_summary = await state.memory.get_context_summary()
+        report_agent = create_report_agent(self._chat_client, domain=self.domain)
+        yield AgentEvent(
+            type="synthesizing",
+            message=status_messages.get(reason, "Synthesizing..."),
+            iteration=iteration,
+        )
+        synthesis_result = await report_agent.run(
+            f"Synthesize research report from this evidence. "
+            f"If evidence is sparse, say so.\n\n{evidence_summary}"
+        )
+        yield AgentEvent(
+            type="complete",
+            message=synthesis_result.text,
+            data={"reason": f"{reason}_synthesis", "iterations": iteration},
+            iteration=iteration,
+        )
+    except Exception as synth_error:
+        logger.error(f"{reason} synthesis failed", error=str(synth_error))
+        yield AgentEvent(
+            type="complete",
+            message=f"Research completed. Synthesis failed: {synth_error}",
+            data={"reason": f"{reason}_synthesis_failed", "iterations": iteration},
+            iteration=iteration,
+        )
+```
+**Call Sites to Update**:
+1. Line 447: `async for event in self._handle_timeout(iteration):` → `self._synthesize_fallback(iteration, "timeout")`
+2. Line 412: `async for synth_event in self._force_synthesis(iteration):` → `self._synthesize_fallback(iteration, "no_reporter")`
+3. Line 432: `async for synth_event in self._force_synthesis(iteration):` → `self._synthesize_fallback(iteration, "max_rounds")`
+**Delete After Refactor**:
+- `_handle_timeout()` method (lines 201-248)
+- `_force_synthesis()` method (lines 250-297)
+---
+### Priority 2: Redundant Imports
+**Problem**: Imports inside methods that already exist at module level.
+| Location | Import | Already At |
+|----------|--------|------------|
+| Line 207 | `from src.agents.magentic_agents import create_report_agent` | Line 35 |
+| Line 208 | `from src.agents.state import get_magentic_state` | Missing! |
+| Line 257 | `from src.agents.magentic_agents import create_report_agent` | Line 35 |
+| Line 258 | `from src.agents.state import get_magentic_state` | Missing! |
+**SOLID Violation**: **SRP (Single Responsibility)**. Import management is scattered across the file instead of centralized at the top.
+**Fix**:
+1. Add to module-level imports (around line 38):
+   ```python
+   from src.agents.state import get_magentic_state, init_magentic_state
+   ```
+   Note: `init_magentic_state` is already imported at line 38. Add `get_magentic_state` to that import.
+2. Remove redundant imports from:
+   - Lines 207-208 (inside `_handle_timeout`)
+   - Lines 257-258 (inside `_force_synthesis`)
+---
+### Priority 3: Magic Strings
+**Problem**: Agent detection relies on string literals that break silently if agents are renamed.
+**Current Code** (line 385):
+```python
+agent_name = (event.agent_id or "").lower()
+if "report" in agent_name:  # FRAGILE: Breaks if agent renamed
+    reporter_ran = True
+```
+**Also in** `_get_event_type_for_agent()` (lines 593-602):
+```python
+if "search" in agent_lower:    # Magic string
+if "judge" in agent_lower:     # Magic string
+if "hypothes" in agent_lower:  # Magic string
+if "report" in agent_lower:    # Magic string
+```
+**SOLID Violation**: **OCP (Open/Closed Principle)**. Renaming an agent requires changes in multiple locations.
+**Fix Option A** - Constants:
+```python
+# At module level (after imports)
+REPORTER_AGENT_ID = "reporter"
+SEARCHER_AGENT_ID = "searcher"
+JUDGE_AGENT_ID = "judge"
+HYPOTHESIZER_AGENT_ID = "hypothesizer"
+```
+**Fix Option B** - Agent Name Attribute (Preferred):
+```python
+# In magentic_agents.py, ensure each agent has a .name attribute
+# Then in advanced.py:
+if event.agent_id == report_agent.name:
+    reporter_ran = True
+```
+**Recommendation**: Option A is simpler and doesn't require modifying agent factory. Use constants.
+---
+## Part 2: System-Wide Issues (Future PRs)
+These are valid concerns identified during code review but are NOT blockers for the current PR.
+### Priority 4: Dead Config
+**Location**: `src/utils/config.py`
+**Issue**: Zombie configuration values that are never used or raise NotImplemented.
+- `magentic_timeout`: Deprecated, never read
+- `anthropic_api_key`: Config exists but factory raises NotImplemented
+**Fix**: Audit config.py, remove dead settings, add deprecation warnings for transitional settings.
+---
+### Priority 5: Prompt Unification
+**Location**: `src/prompts/` vs `src/config/domain.py`
+**Issue**: Two sources of truth for prompts. `src/prompts/` files exist but are ignored. System uses hardcoded strings in `domain.py`.
+**Fix**: Pick ONE source of truth. Recommendation: Delete `src/prompts/` if unused, or migrate `domain.py` prompts there.
+---
+### Priority 6: Factory Monolith
+**Location**: `src/clients/factory.py`
+**Issue**: Hardcoded logic for detecting API key prefixes (`sk-` → OpenAI, `sk-ant-` → Anthropic error).
+**SOLID Violation**: OCP. Adding a new provider requires modifying the factory.
+**Fix**: Provider registry pattern with auto-registration, or strategy pattern with key prefix handlers.
+---
+### Priority 7: State Class
+**Location**: `src/orchestrators/advanced.py` `run()` method
+**Issue**: Method manages 6+ loose variables (`iteration`, `reporter_ran`, `buffer`, `current_agent_id`, `last_streamed_length`, `final_event_received`).
+**Fix**: Extract to `WorkflowState` dataclass:
+```python
+@dataclass
+class WorkflowState:
+    iteration: int = 0
+    reporter_ran: bool = False
+    current_message_buffer: str = ""
+    current_agent_id: str | None = None
+    last_streamed_length: int = 0
+    final_event_received: bool = False
+```
+---
+### Priority 8: Real Integration Tests
+**Location**: `tests/e2e/`
+**Issue**: We deleted flaky integration tests. Now we have ZERO automated tests against real APIs.
+**Fix**: Create stable `make test-live` suite with:
+- Real HuggingFace Free Tier test
+- Real OpenAI BYOK test
+- Proper timeout handling
+- Skip markers for CI (run manually or on schedule)
+---
+## Execution Strategy
+### Phase 1: Current PR (REQUIRED)
+Implement **Priority 1, 2, and 3** before merging PR #124.
+**Definition of Done**:
+- [x] `_synthesize_fallback(iteration, reason)` implemented
+- [x] `_handle_timeout()` and `_force_synthesis()` deleted
+- [x] All synthesis call sites updated
+- [x] Redundant imports removed
+- [x] `get_magentic_state` added to module-level imports
+- [x] Magic strings replaced with constants
+- [x] All tests pass (`make check`)
+### Phase 2: Future PRs (Separate Tickets)
+Create GitHub issues for Priority 4-8. Do NOT bloat the current bug fix PR.
+---
+## Appendix: Line Number Reference
+| Item | Current Location |
+|------|------------------|
+| `_handle_timeout()` | Lines 201-248 |
+| `_force_synthesis()` | Lines 250-297 |
+| Redundant imports (timeout) | Lines 207-208 |
+| Redundant imports (force) | Lines 257-258 |
+| Magic string detection | Line 385 |
+| `_get_event_type_for_agent()` | Lines 582-602 |
+| Module imports | Lines 18-48 |
+| `run()` method | Lines 299-456 |

src/orchestrators/advanced.py CHANGED Viewed

@@ -35,7 +35,7 @@ from src.agents.magentic_agents import (
     create_report_agent,
     create_search_agent,
 )
-from src.agents.state import init_magentic_state
 from src.clients.base import BaseChatClient
 from src.clients.factory import get_chat_client
 from src.config.domain import ResearchDomain, get_domain_config
@@ -49,6 +49,12 @@ if TYPE_CHECKING:
 logger = structlog.get_logger()
 class AdvancedOrchestrator(OrchestratorProtocol):
     """
@@ -198,81 +204,39 @@ The final output should be a structured research report."""
             iteration=0,
         )
-    async def _handle_timeout(self, iteration: int) -> AsyncGenerator[AgentEvent, None]:
-        """Handle workflow timeout by attempting synthesis."""
-        logger.warning("Workflow timed out", iterations=iteration)
-        # ACTUALLY synthesize from gathered evidence
-        try:
-            from src.agents.magentic_agents import create_report_agent
-            from src.agents.state import get_magentic_state
-            state = get_magentic_state()
-            memory = state.memory
-            # Get evidence summary from memory
-            evidence_summary = await memory.get_context_summary()
-            # Create and invoke ReportAgent for synthesis
-            report_agent = create_report_agent(self._chat_client, domain=self.domain)
-            yield AgentEvent(
-                type="synthesizing",
-                message="Workflow timed out. Synthesizing available evidence...",
-                iteration=iteration,
-            )
-            # Invoke ReportAgent directly
-            # Note: ChatAgent.run() returns AgentRunResponse; access text via .text
-            synthesis_result = await report_agent.run(
-                "Synthesize research report from this evidence. "
-                f"If evidence is sparse, say so.\n\n{evidence_summary}"
-            )
-            yield AgentEvent(
-                type="complete",
-                message=synthesis_result.text,
-                data={"reason": "timeout_synthesis", "iterations": iteration},
-                iteration=iteration,
-            )
-        except Exception as synth_error:
-            logger.error("Timeout synthesis failed", error=str(synth_error))
-            yield AgentEvent(
-                type="complete",
-                message=(
-                    f"Research timed out after {iteration} rounds. "
-                    f"Evidence gathered but synthesis failed: {synth_error}"
-                ),
-                data={"reason": "timeout_synthesis_failed", "iterations": iteration},
-                iteration=iteration,
-            )
-    async def _force_synthesis(self, iteration: int) -> AsyncGenerator[AgentEvent, None]:
-        """Force synthesis when workflow ends without ReportAgent running (P1 Fix).
-        This is a safety net for when the Manager agent (especially 7B models)
-        fails to properly delegate to ReportAgent before workflow termination.
         """
-        try:
-            from src.agents.magentic_agents import create_report_agent
-            from src.agents.state import get_magentic_state
             state = get_magentic_state()
-            memory = state.memory
-            # Get evidence summary from memory
-            evidence_summary = await memory.get_context_summary()
-            # Create and invoke ReportAgent for synthesis
             report_agent = create_report_agent(self._chat_client, domain=self.domain)
             yield AgentEvent(
                 type="synthesizing",
-                message="Synthesizing research findings...",
                 iteration=iteration,
             )
-            # Invoke ReportAgent directly
             synthesis_result = await report_agent.run(
                 "Synthesize research report from this evidence. "
                 f"If evidence is sparse, say so.\n\n{evidence_summary}"
@@ -281,18 +245,15 @@ The final output should be a structured research report."""
             yield AgentEvent(
                 type="complete",
                 message=synthesis_result.text,
-                data={"reason": "forced_synthesis", "iterations": iteration},
                 iteration=iteration,
             )
         except Exception as synth_error:
-            logger.error("Forced synthesis failed", error=str(synth_error))
             yield AgentEvent(
                 type="complete",
-                message=(
-                    f"Research completed after {iteration} rounds. "
-                    f"Evidence gathered but synthesis failed: {synth_error}"
-                ),
-                data={"reason": "forced_synthesis_failed", "iterations": iteration},
                 iteration=iteration,
             )
@@ -382,7 +343,7 @@ The final output should be a structured research report."""
                         # P1 FIX: Track if ReportAgent produced output
                         agent_name = (event.agent_id or "").lower()
-                        if "report" in agent_name:
                             reporter_ran = True
                         comp_event, prog_event = self._handle_completion_event(
@@ -409,7 +370,9 @@ The final output should be a structured research report."""
                                 "ReportAgent never ran - forcing synthesis",
                                 iterations=iteration,
                             )
-                            async for synth_event in self._force_synthesis(iteration):
                                 yield synth_event
                         else:
                             yield self._handle_final_event(event, iteration, last_streamed_length)
@@ -429,7 +392,7 @@ The final output should be a structured research report."""
                 )
                 # P1 FIX: Force synthesis if ReportAgent never ran
                 if not reporter_ran:
-                    async for synth_event in self._force_synthesis(iteration):
                         yield synth_event
                 else:
                     yield AgentEvent(
@@ -444,7 +407,7 @@ The final output should be a structured research report."""
                     )
         except TimeoutError:
-            async for event in self._handle_timeout(iteration):
                 yield event
         except Exception as e:
@@ -591,13 +554,13 @@ The final output should be a structured research report."""
             Event type string matching AgentEvent.type Literal
         """
         agent_lower = agent_name.lower()
-        if "search" in agent_lower:
             return "search_complete"
-        if "judge" in agent_lower:
             return "judge_complete"
-        if "hypothes" in agent_lower:
             return "hypothesizing"
-        if "report" in agent_lower:
             return "synthesizing"
         return "judging"  # Default for unknown agents

     create_report_agent,
     create_search_agent,
 )
+from src.agents.state import get_magentic_state, init_magentic_state
 from src.clients.base import BaseChatClient
 from src.clients.factory import get_chat_client
 from src.config.domain import ResearchDomain, get_domain_config
 logger = structlog.get_logger()
+# Agent ID constants - prevents silent breakage if agents are renamed
+REPORTER_AGENT_ID = "reporter"
+SEARCHER_AGENT_ID = "searcher"
+JUDGE_AGENT_ID = "judge"
+HYPOTHESIZER_AGENT_ID = "hypothesizer"
 class AdvancedOrchestrator(OrchestratorProtocol):
     """
             iteration=0,
         )
+    async def _synthesize_fallback(
+        self, iteration: int, reason: str
+    ) -> AsyncGenerator[AgentEvent, None]:
+        """
+        Unified fallback synthesis for all termination scenarios.
+        This method handles synthesis when the workflow terminates without
+        a proper report from ReportAgent. It's a safety net for:
+        - Timeout scenarios
+        - Manager model failing to delegate to ReportAgent (7B model limitation)
+        - Max rounds reached without synthesis
+        Args:
+            iteration: Current workflow iteration count
+            reason: Why synthesis is being forced ("timeout", "no_reporter", "max_rounds")
         """
+        status_messages = {
+            "timeout": "Workflow timed out. Synthesizing available evidence...",
+            "no_reporter": "Synthesizing research findings...",
+            "max_rounds": "Max rounds reached. Synthesizing findings...",
+        }
+        try:
             state = get_magentic_state()
+            evidence_summary = await state.memory.get_context_summary()
             report_agent = create_report_agent(self._chat_client, domain=self.domain)
             yield AgentEvent(
                 type="synthesizing",
+                message=status_messages.get(reason, "Synthesizing..."),
                 iteration=iteration,
             )
             synthesis_result = await report_agent.run(
                 "Synthesize research report from this evidence. "
                 f"If evidence is sparse, say so.\n\n{evidence_summary}"
             yield AgentEvent(
                 type="complete",
                 message=synthesis_result.text,
+                data={"reason": f"{reason}_synthesis", "iterations": iteration},
                 iteration=iteration,
             )
         except Exception as synth_error:
+            logger.error(f"{reason} synthesis failed", error=str(synth_error))
             yield AgentEvent(
                 type="complete",
+                message=f"Research completed. Synthesis failed: {synth_error}",
+                data={"reason": f"{reason}_synthesis_failed", "iterations": iteration},
                 iteration=iteration,
             )
                         # P1 FIX: Track if ReportAgent produced output
                         agent_name = (event.agent_id or "").lower()
+                        if REPORTER_AGENT_ID in agent_name:
                             reporter_ran = True
                         comp_event, prog_event = self._handle_completion_event(
                                 "ReportAgent never ran - forcing synthesis",
                                 iterations=iteration,
                             )
+                            async for synth_event in self._synthesize_fallback(
+                                iteration, "no_reporter"
+                            ):
                                 yield synth_event
                         else:
                             yield self._handle_final_event(event, iteration, last_streamed_length)
                 )
                 # P1 FIX: Force synthesis if ReportAgent never ran
                 if not reporter_ran:
+                    async for synth_event in self._synthesize_fallback(iteration, "max_rounds"):
                         yield synth_event
                 else:
                     yield AgentEvent(
                     )
         except TimeoutError:
+            async for event in self._synthesize_fallback(iteration, "timeout"):
                 yield event
         except Exception as e:
             Event type string matching AgentEvent.type Literal
         """
         agent_lower = agent_name.lower()
+        if SEARCHER_AGENT_ID in agent_lower:
             return "search_complete"
+        if JUDGE_AGENT_ID in agent_lower:
             return "judge_complete"
+        if HYPOTHESIZER_AGENT_ID in agent_lower:
             return "hypothesizing"
+        if REPORTER_AGENT_ID in agent_lower:
             return "synthesizing"
         return "judging"  # Default for unknown agents

tests/unit/orchestrators/test_advanced_timeout.py CHANGED Viewed

@@ -27,11 +27,13 @@ async def test_timeout_synthesizes_evidence():
     mock_workflow.run_stream = slow_stream
     # Mock dependencies used inside the timeout block
     with (
         patch.object(orchestrator, "_build_workflow", return_value=mock_workflow),
         patch("src.orchestrators.advanced.init_magentic_state"),
-        patch("src.agents.state.get_magentic_state") as mock_get_state,
-        patch("src.agents.magentic_agents.create_report_agent") as mock_create_agent,
     ):
         # Setup mock state and memory
         mock_memory = AsyncMock()

     mock_workflow.run_stream = slow_stream
     # Mock dependencies used inside the timeout block
+    # Note: get_magentic_state and create_report_agent are imported at module level in advanced.py
+    # so we must patch them in that module's namespace, not their original location
     with (
         patch.object(orchestrator, "_build_workflow", return_value=mock_workflow),
         patch("src.orchestrators.advanced.init_magentic_state"),
+        patch("src.orchestrators.advanced.get_magentic_state") as mock_get_state,
+        patch("src.orchestrators.advanced.create_report_agent") as mock_create_agent,
     ):
         # Setup mock state and memory
         mock_memory = AsyncMock()