VibecoderMcSwaggins/DeepBoner


VibecoderMcSwaggins committed 20 days ago

Commit

0b27b1c

2 Parent(s): 3753c39 21dd8fe

Merge main into dev: P2 bug fixes + CodeRabbit feedback


Files changed (9)

docs/bugs/ACTIVE_BUGS.md +14 -1
docs/bugs/P2_DUPLICATE_REPORT_CONTENT.md +1 -1
docs/bugs/P2_FIRST_TURN_TIMEOUT.md +1 -1
docs/future-roadmap/P3_MODAL_INTEGRATION_REMOVAL.md +78 -0
src/agents/search_agent.py +1 -1
src/orchestrators/advanced.py +50 -27
src/utils/config.py +2 -2
tests/unit/agents/test_search_agent.py +3 -3
tests/unit/orchestrators/test_advanced_orchestrator.py +3 -3

docs/bugs/ACTIVE_BUGS.md CHANGED

@@ -1,6 +1,6 @@
 # Active Bugs
-> Last updated: 2025-12-03
 >
 > **Note:** Completed bug docs archived to `docs/bugs/archive/`
 > **See also:** [ARCHITECTURE.md](../ARCHITECTURE.md) for unified architecture plan
@@ -59,6 +59,17 @@
 ---
 ## Resolved Bugs (December 2025)
 All resolved bugs have been moved to `docs/bugs/archive/`. Summary:
@@ -81,6 +92,8 @@ All resolved bugs have been moved to `docs/bugs/archive/`. Summary:
 - **P1 Simple Mode Removed Breaks Free Tier UX** - FIXED via Accumulator Pattern (PR #117)
 ### P2 Bugs (All FIXED)
 - **P2 7B Model Garbage Output** - SUPERSEDED by P1 Free Tier fix (root cause was premature marker, not model capacity)
 - **P2 Advanced Mode Cold Start No Feedback** - FIXED, all phases complete
 - **P2 Architectural BYOK Gaps** - FIXED, end-to-end BYOK support in PR #119

 # Active Bugs
+> Last updated: 2025-12-04
 >
 > **Note:** Completed bug docs archived to `docs/bugs/archive/`
 > **See also:** [ARCHITECTURE.md](../ARCHITECTURE.md) for unified architecture plan
 ---
+### P3 - Remove Modal Integration
+**File:** `docs/future-roadmap/P3_MODAL_INTEGRATION_REMOVAL.md`
+**Status:** OPEN - Tech Debt
+**Problem:** Modal (cloud functions) is integrated in 9 files but was decided against for this project. Creates dead code paths and confusion.
+**Fix:** Remove all Modal references from codebase (config, services, agents, tools).
+---
 ## Resolved Bugs (December 2025)
 All resolved bugs have been moved to `docs/bugs/archive/`. Summary:
 - **P1 Simple Mode Removed Breaks Free Tier UX** - FIXED via Accumulator Pattern (PR #117)
 ### P2 Bugs (All FIXED)
+- **P2 Duplicate Report Content** - FIXED in PR fix/p2-double-bug-squash, stateful deduplication in `run()` loop
+- **P2 First Turn Timeout** - FIXED in PR fix/p2-double-bug-squash, reduced results per tool (10→5), increased timeout (5→10 min)
 - **P2 7B Model Garbage Output** - SUPERSEDED by P1 Free Tier fix (root cause was premature marker, not model capacity)
 - **P2 Advanced Mode Cold Start No Feedback** - FIXED, all phases complete
 - **P2 Architectural BYOK Gaps** - FIXED, end-to-end BYOK support in PR #119

docs/bugs/P2_DUPLICATE_REPORT_CONTENT.md CHANGED

@@ -1,7 +1,7 @@
 # P2 Bug: Duplicate Report Content in Output
 **Date**: 2025-12-03
-**Status**: OPEN
 **Severity**: P2 (UX - Duplicate content confuses users)
 **Component**: `src/orchestrators/advanced.py`
 **Affects**: Both Free Tier (HuggingFace) AND Paid Tier (OpenAI)

 # P2 Bug: Duplicate Report Content in Output
 **Date**: 2025-12-03
+**Status**: FIXED (PR fix/p2-double-bug-squash)
 **Severity**: P2 (UX - Duplicate content confuses users)
 **Component**: `src/orchestrators/advanced.py`
 **Affects**: Both Free Tier (HuggingFace) AND Paid Tier (OpenAI)

docs/bugs/P2_FIRST_TURN_TIMEOUT.md CHANGED

@@ -1,7 +1,7 @@
 # P2 Bug: First Agent Turn Exceeds Workflow Timeout
 **Date**: 2025-12-03
-**Status**: OPEN
 **Severity**: P2 (UX - Workflow always times out on complex queries)
 **Component**: `src/orchestrators/advanced.py` + `src/agents/search_agent.py`
 **Affects**: Both Free Tier (HuggingFace) AND Paid Tier (OpenAI)

 # P2 Bug: First Agent Turn Exceeds Workflow Timeout
 **Date**: 2025-12-03
+**Status**: FIXED (PR fix/p2-double-bug-squash)
 **Severity**: P2 (UX - Workflow always times out on complex queries)
 **Component**: `src/orchestrators/advanced.py` + `src/agents/search_agent.py`
 **Affects**: Both Free Tier (HuggingFace) AND Paid Tier (OpenAI)

docs/future-roadmap/P3_MODAL_INTEGRATION_REMOVAL.md ADDED

@@ -0,0 +1,78 @@

+# P3 Tech Debt: Modal Integration Removal
+**Date**: 2025-12-04
+**Status**: OPEN - Tech Debt (Future Roadmap)
+**Severity**: P3 (Tech Debt - Not blocking functionality)
+**Component**: Multiple files
+---
+## Executive Summary
+Modal (cloud function execution platform) is integrated throughout the codebase but was decided against for this project. This creates potential confusion and dead code paths that should be cleaned up when time permits.
+---
+## Affected Files
+The following files contain Modal references:
+| File | Usage |
+|------|-------|
+| `src/utils/config.py` | `MODAL_TOKEN_ID`, `MODAL_TOKEN_SECRET` settings |
+| `src/utils/service_loader.py` | Modal service initialization |
+| `src/services/llamaindex_rag.py` | Modal integration for RAG |
+| `src/agents/code_executor_agent.py` | Modal sandbox execution |
+| `src/utils/exceptions.py` | Modal-related exceptions |
+| `src/tools/code_execution.py` | Modal code execution tool |
+| `src/services/statistical_analyzer.py` | Modal statistical analysis |
+| `src/mcp_tools.py` | Modal MCP tool wrappers |
+| `src/agents/analysis_agent.py` | Modal analysis agent |
+---
+## Context
+Modal was originally integrated for:
+1. **Code Execution Sandbox**: Running untrusted code in isolated containers
+2. **Statistical Analysis**: Offloading heavy statistical computations
+3. **LlamaIndex RAG**: Premium embeddings with persistent storage
+However, the project decided against Modal because:
+- Added infrastructure complexity
+- Free Tier doesn't need cloud functions
+- Paid Tier uses OpenAI directly
+---
+## Recommended Fix
+1. Remove Modal-related code from all affected files
+2. Remove `MODAL_TOKEN_ID` and `MODAL_TOKEN_SECRET` from config
+3. Remove Modal from dependencies in `pyproject.toml`
+4. Update any documentation referencing Modal
+---
+## Impact If Not Fixed
+- Confusion for new contributors
+- Dead code in production
+- Unnecessary dependencies
+- Config settings that do nothing
+---
+## Test Plan
+1. Remove Modal code
+2. Run `make check` to ensure no breakage
+3. Verify Free Tier and Paid Tier still work
+4. Search codebase for any remaining Modal references
+---
+## Related
+- `P3_REMOVE_ANTHROPIC_PARTIAL_WIRING.md` - Similar tech debt for Anthropic
+- ARCHITECTURE.md - Current architecture excludes Modal
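
Step 4 of the test plan above (searching for leftover Modal references) could be scripted roughly as follows. This is a minimal sketch, not part of the commit: `find_modal_references`, its default suffix list, and the regex are all hypothetical choices made for illustration. The pattern is case-insensitive and rejects neighbouring letters, so it is meant to catch `MODAL_TOKEN_ID` while skipping words such as "multimodal".

```python
from __future__ import annotations

import re
from pathlib import Path

# Hypothetical pattern (an assumption, not from the repository):
# "modal" in any case, not embedded inside a longer alphabetic word.
MODAL_PATTERN = re.compile(r"(?<![A-Za-z])modal(?![A-Za-z])", re.IGNORECASE)


def find_modal_references(
    root: Path,
    suffixes: tuple[str, ...] = (".py", ".toml", ".md"),
) -> list[tuple[Path, int, str]]:
    """Return (path, line number, stripped line) for every line mentioning Modal."""
    hits: list[tuple[Path, int, str]] = []
    for path in sorted(root.rglob("*")):
        # Only scan regular files with one of the selected suffixes.
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if MODAL_PATTERN.search(line):
                hits.append((path, lineno, line.strip()))
    return hits
```

Calling `find_modal_references(Path("src"))` after the cleanup and checking that the result is empty would be one way to confirm the removal is complete. The pattern will still flag unrelated uses of the word, such as a UI "modal" dialog, so hits need a manual look.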

src/agents/search_agent.py CHANGED

@@ -67,7 +67,7 @@ class SearchAgent(BaseAgent):  # type: ignore[misc]
             )
         # Execute search
-        result: SearchResult = await self._handler.execute(query, max_results_per_tool=10)
         # Track what to show in response (initialized to search results as default)
         evidence_to_show: list[Evidence] = result.evidence

             )
         # Execute search
+        result: SearchResult = await self._handler.execute(query, max_results_per_tool=5)
         # Track what to show in response (initialized to search results as default)
         evidence_to_show: list[Evidence] = result.evidence

src/orchestrators/advanced.py CHANGED

@@ -301,6 +301,7 @@ The final output should be a structured research report."""
         # to repr(message) if text is empty. We must reconstruct text from Deltas.
         current_message_buffer: str = ""
         current_agent_id: str | None = None
         try:
             async with asyncio.timeout(self._timeout_seconds):
@@ -333,15 +334,21 @@ The final output should be a structured research report."""
                         yield comp_event
                         yield prog_event
                         # Clear buffer after consuming
                         current_message_buffer = ""
                         continue
-                    # 3. Handle other events normally
                     agent_event = self._process_event(event, iteration)
                     if agent_event:
-                        if agent_event.type == "complete":
-                            final_event_received = True
                         yield agent_event
             # GUARANTEE: Always emit termination event if stream ends without one
@@ -413,6 +420,40 @@ The final output should be a structured research report."""
         return completion_event, progress_event
     def _extract_text(self, message: Any) -> str:
         """
         Defensively extract text from a message object.
@@ -526,30 +567,12 @@ The final output should be a structured research report."""
                 iteration=iteration,
             )
-        # NOTE: MagenticAgentMessageEvent is handled in run() loop with Accumulator Pattern
-        # (see lines 322-335) and never reaches this method due to `continue` statement.
-        elif isinstance(event, MagenticFinalResultEvent):
-            text = self._extract_text(event.message) if event.message else "No result"
-            return AgentEvent(
-                type="complete",
-                message=text,
-                data={"iterations": iteration},
-                iteration=iteration,
-            )
-        # NOTE: MagenticAgentDeltaEvent is handled in run() loop with Accumulator Pattern
-        # (see lines 306-320) and never reaches this method due to `continue` statement.
-        elif isinstance(event, WorkflowOutputEvent):
-            if event.data:
-                # Use _extract_text to properly handle ChatMessage objects
-                text = self._extract_text(event.data)
-                return AgentEvent(
-                    type="complete",
-                    message=text if text else "Research complete (no synthesis)",
-                    iteration=iteration,
-                )
         return None

         # to repr(message) if text is empty. We must reconstruct text from Deltas.
         current_message_buffer: str = ""
         current_agent_id: str | None = None
+        last_streamed_length: int = 0  # Track for P2 Duplicate Report Bug Fix
         try:
             async with asyncio.timeout(self._timeout_seconds):
                         yield comp_event
                         yield prog_event
+                        # P2 BUG FIX: Save length before clearing
+                        last_streamed_length = len(current_message_buffer)
                         # Clear buffer after consuming
                         current_message_buffer = ""
                         continue
+                    # 3. Handle Final Events Inline (P2 Duplicate Report Fix)
+                    if isinstance(event, (MagenticFinalResultEvent, WorkflowOutputEvent)):
+                        final_event_received = True
+                        yield self._handle_final_event(event, iteration, last_streamed_length)
+                        continue
+                    # 4. Handle other events normally
                     agent_event = self._process_event(event, iteration)
                     if agent_event:
                         yield agent_event
             # GUARANTEE: Always emit termination event if stream ends without one
         return completion_event, progress_event
+    def _handle_final_event(
+        self,
+        event: MagenticFinalResultEvent | WorkflowOutputEvent,
+        iteration: int,
+        last_streamed_length: int,
+    ) -> AgentEvent:
+        """Handle final workflow events with duplicate content suppression (P2 Bug Fix)."""
+        # DECISION: Did we stream substantial content?
+        if last_streamed_length > 100:
+            # YES: Final event is a SIGNAL, not a payload
+            return AgentEvent(
+                type="complete",
+                message="Research complete.",
+                data={
+                    "iterations": iteration,
+                    "streamed_chars": last_streamed_length,
+                },
+                iteration=iteration,
+            )
+        # NO: Final event must carry the payload (tool-only turn, cache hit)
+        text = ""
+        if isinstance(event, MagenticFinalResultEvent):
+            text = self._extract_text(event.message) if event.message else "No result"
+        elif isinstance(event, WorkflowOutputEvent):
+            text = self._extract_text(event.data) if event.data else "Research complete"
+        return AgentEvent(
+            type="complete",
+            message=text,
+            data={"iterations": iteration},
+            iteration=iteration,
+        )
     def _extract_text(self, message: Any) -> str:
         """
         Defensively extract text from a message object.
                 iteration=iteration,
             )
+        # NOTE: The following event types are handled inline in run() loop and never reach
+        # this method due to `continue` statements:
+        # - MagenticAgentMessageEvent: Accumulator Pattern (lines 322-335)
+        # - MagenticAgentDeltaEvent: Accumulator Pattern (lines 306-320)
+        # - MagenticFinalResultEvent: P2 Duplicate Fix via _handle_final_event() (lines 343-347)
+        # - WorkflowOutputEvent: P2 Duplicate Fix via _handle_final_event() (lines 343-347)
         return None
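
The decision made in `_handle_final_event()` can be isolated in a few lines. The sketch below is a simplified stand-in, not the project's code: the `AgentEvent` dataclass and the `handle_final_event` function are hypothetical, and the final event is reduced to an optional string. Only the 100-character threshold and the two message shapes are taken from the diff above.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any

STREAMED_THRESHOLD = 100  # same cut-off as in the diff above


@dataclass
class AgentEvent:
    """Simplified stand-in for the project's AgentEvent (assumption)."""

    type: str
    message: str
    data: dict[str, Any] = field(default_factory=dict)


def handle_final_event(
    final_text: str | None,
    iteration: int,
    last_streamed_length: int,
) -> AgentEvent:
    """Emit a short completion signal if the report was already streamed."""
    if last_streamed_length > STREAMED_THRESHOLD:
        # Content already reached the user via streaming: do not repeat it.
        return AgentEvent(
            type="complete",
            message="Research complete.",
            data={"iterations": iteration, "streamed_chars": last_streamed_length},
        )
    # Nothing substantial was streamed: the final event carries the payload.
    return AgentEvent(
        type="complete",
        message=final_text or "No result",
        data={"iterations": iteration},
    )
```

With this shape, a streamed report of 2,500 characters yields only the "Research complete." signal, while a turn that streamed nothing returns the final text itself. The comparison is strict, so a buffer of exactly 100 characters still takes the payload path.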

src/utils/config.py CHANGED

@@ -71,10 +71,10 @@ class Settings(BaseSettings):
         description="Max coordination rounds for Advanced mode (default 5 for faster demos)",
     )
     advanced_timeout: float = Field(
-        default=300.0,
         ge=60.0,
         le=900.0,
-        description="Timeout for Advanced mode in seconds (default 5 min)",
     )
     search_timeout: int = Field(default=30, description="Seconds to wait for search")
     magentic_timeout: int = Field(

         description="Max coordination rounds for Advanced mode (default 5 for faster demos)",
     )
     advanced_timeout: float = Field(
+        default=600.0,
         ge=60.0,
         le=900.0,
+        description="Timeout for Advanced mode in seconds (default 10 min)",
     )
     search_timeout: int = Field(default=30, description="Seconds to wait for search")
     magentic_timeout: int = Field(
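
The effect of the `ge=60.0` / `le=900.0` bounds on the new 600-second default can be illustrated without pydantic. The following is a plain-dataclass sketch under that assumption; `TimeoutSettings` is a hypothetical name and is not the project's `Settings` class.

```python
from dataclasses import dataclass

MIN_TIMEOUT = 60.0   # mirrors ge=60.0 in the diff above
MAX_TIMEOUT = 900.0  # mirrors le=900.0 in the diff above


@dataclass(frozen=True)
class TimeoutSettings:
    """Hypothetical stand-in showing the bounds check on advanced_timeout."""

    advanced_timeout: float = 600.0  # new default from the diff (10 min)

    def __post_init__(self) -> None:
        # Reject values outside the inclusive [60, 900] second range.
        if not MIN_TIMEOUT <= self.advanced_timeout <= MAX_TIMEOUT:
            raise ValueError(
                f"advanced_timeout must be between {MIN_TIMEOUT} and "
                f"{MAX_TIMEOUT} seconds, got {self.advanced_timeout}"
            )
```

The new default sits inside the allowed range, so raising it from 300 to 600 seconds needs no change to the bounds; an override above 900 seconds would still be rejected.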

tests/unit/agents/test_search_agent.py CHANGED

@@ -47,7 +47,7 @@ async def test_run_executes_search(mock_handler: AsyncMock) -> None:
     response = await agent.run("test query")
     # Check handler called
-    mock_handler.execute.assert_awaited_once_with("test query", max_results_per_tool=10)
     # Check store updated
     assert len(store["current"]) == 1
@@ -67,7 +67,7 @@ async def test_run_handles_chat_message_input(mock_handler: AsyncMock) -> None:
     message = ChatMessage(role=Role.USER, text="test query")
     await agent.run(message)
-    mock_handler.execute.assert_awaited_once_with("test query", max_results_per_tool=10)
 @pytest.mark.asyncio
@@ -81,7 +81,7 @@ async def test_run_handles_list_input(mock_handler: AsyncMock) -> None:
         ChatMessage(role=Role.USER, text="test query"),
     ]
     await agent.run(messages)
-    mock_handler.execute.assert_awaited_once_with("test query", max_results_per_tool=10)
 @pytest.mark.asyncio

     response = await agent.run("test query")
     # Check handler called
+    mock_handler.execute.assert_awaited_once_with("test query", max_results_per_tool=5)
     # Check store updated
     assert len(store["current"]) == 1
     message = ChatMessage(role=Role.USER, text="test query")
     await agent.run(message)
+    mock_handler.execute.assert_awaited_once_with("test query", max_results_per_tool=5)
 @pytest.mark.asyncio
         ChatMessage(role=Role.USER, text="test query"),
     ]
     await agent.run(messages)
+    mock_handler.execute.assert_awaited_once_with("test query", max_results_per_tool=5)
 @pytest.mark.asyncio
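
The assertion pattern used in these tests can be reproduced with the standard library alone. In the sketch below, `run_search` is a hypothetical stand-in for `SearchAgent.run()` and `check_handler_call` is an illustrative helper; only `AsyncMock` and `assert_awaited_once_with` are real `unittest.mock` APIs.

```python
import asyncio
from unittest.mock import AsyncMock


async def run_search(handler: AsyncMock, query: str) -> object:
    """Hypothetical stand-in for SearchAgent.run(): delegates to the handler."""
    return await handler.execute(query, max_results_per_tool=5)


def check_handler_call() -> None:
    """Verify the handler was awaited exactly once with the new limit."""
    handler = AsyncMock()
    asyncio.run(run_search(handler, "test query"))
    # Raises AssertionError if the await count or the arguments differ.
    handler.execute.assert_awaited_once_with("test query", max_results_per_tool=5)
```

Because `assert_awaited_once_with` compares keyword arguments exactly, the old expectation of `max_results_per_tool=10` fails once the agent passes 5, which is why all three tests had to be updated together with the agent.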

tests/unit/orchestrators/test_advanced_orchestrator.py CHANGED

@@ -28,11 +28,11 @@ class TestAdvancedOrchestratorConfig:
         assert orch._max_rounds == 7
     @patch("src.orchestrators.advanced.get_chat_client")
-    def test_timeout_default_is_five_minutes(self, mock_get_client) -> None:
-        """Default timeout should be 300s (5 min) from settings."""
         mock_get_client.return_value = MagicMock()
         orch = AdvancedOrchestrator()
-        assert orch._timeout_seconds == 300.0
     @patch("src.orchestrators.advanced.get_chat_client")
     def test_explicit_timeout_overrides_settings(self, mock_get_client) -> None:

         assert orch._max_rounds == 7
     @patch("src.orchestrators.advanced.get_chat_client")
+    def test_timeout_default_is_ten_minutes(self, mock_get_client: MagicMock) -> None:
+        """Test that default timeout is 10 minutes (P2 fix)."""
         mock_get_client.return_value = MagicMock()
         orch = AdvancedOrchestrator()
+        assert orch._timeout_seconds == 600.0
     @patch("src.orchestrators.advanced.get_chat_client")
     def test_explicit_timeout_overrides_settings(self, mock_get_client) -> None: