Space: VibecoderMcSwaggins/DeepBoner
VibecoderMcSwaggins committed on Nov 29, 2025

Commit

d36ce3c

1 Parent(s): 3599f0a

Fix P3 bug: Guarantee termination event in Magentic mode

- Add fallback yield in MagenticOrchestrator.run when max rounds reached
- Track final_event_received flag to prevent double termination
- Add unit tests for termination guarantee and normal completion
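
The pattern these bullets describe can be sketched in isolation. This is a minimal illustration, not the project's code: `Event` is a hypothetical stand-in for `AgentEvent`, and `run_with_guarantee`, `silent_stream`, and `normal_stream` are names invented here. It assumes only what the commit message states: a flag records whether a "complete" event was seen, and a fallback is yielded when the stream ends without one.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, AsyncIterator


@dataclass
class Event:
    """Hypothetical stand-in for the project's AgentEvent."""
    type: str
    message: str
    data: dict[str, Any] = field(default_factory=dict)


async def run_with_guarantee(stream: AsyncIterator[Event]) -> AsyncIterator[Event]:
    """Relay events and append a fallback "complete" if the stream never sent one."""
    final_event_received = False
    try:
        async for event in stream:
            if event.type == "complete":
                final_event_received = True
            yield event
        # Reached only when the stream is exhausted without raising.
        if not final_event_received:
            yield Event(
                "complete",
                "Max iterations reached - results may be partial.",
                {"reason": "max_rounds_reached"},
            )
    except Exception as exc:
        # An error event is itself a terminal event, so no fallback follows it.
        yield Event("error", f"Workflow error: {exc!s}")


async def silent_stream() -> AsyncIterator[Event]:
    """Ends without a final event, like a workflow that hit its round limit."""
    yield Event("thinking", "Working...")


async def normal_stream() -> AsyncIterator[Event]:
    """Ends with its own final event, so no fallback should be added."""
    yield Event("thinking", "Working...")
    yield Event("complete", "Done!")


async def collect(stream: AsyncIterator[Event]) -> list[Event]:
    return [event async for event in run_with_guarantee(stream)]


silent = asyncio.run(collect(silent_stream()))
normal = asyncio.run(collect(normal_stream()))
```

In this sketch `silent` ends with the fallback event and `normal` ends with the stream's own "Done!" event, which mirrors the two unit tests added in this commit.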

Files changed (4)

docs/bugs/ACTIVE_BUGS.md +36 -20
docs/bugs/P3_MAGENTIC_NO_TERMINATION_EVENT.md +177 -0
src/orchestrator_magentic.py +24 -0
tests/unit/test_magentic_termination.py +111 -0

docs/bugs/ACTIVE_BUGS.md CHANGED

@@ -1,39 +1,55 @@
 # Active Bugs
-> Last updated: 2025-11-28
-## P0 - Critical
-### Magentic Mode Report Generation
-**File**: [FIX_PLAN_MAGENTIC_MODE.md](./FIX_PLAN_MAGENTIC_MODE.md)
-**Symptom**: Magentic mode returns `ChatMessage` object instead of synthesized report text.
-**Root Cause**:
-- `event.message.text` extraction fails in orchestrator
-- `max_rounds=3` too low for SearchAgent + JudgeAgent + ReportAgent sequence
-**Workaround**: Use Simple mode (default) - works correctly with all LLM providers.
-**Status**: Fix plan documented, not yet implemented.
----
-## P1 - Minor UX
-### Gradio Settings Accordion Won't Collapse
-**File**: [P1_GRADIO_SETTINGS_CLEANUP.md](./P1_GRADIO_SETTINGS_CLEANUP.md)
-**Symptom**: Settings accordion stays open after user interaction.
-**Root Cause**: Nested `gr.Blocks` context prevents accordion state management.
-**Impact**: UX only - all functionality works correctly.
-**Status**: Solution documented, not yet implemented.
 ---
-## Resolved Bugs
-*None currently - bugs above are still open.*

 # Active Bugs
+> Last updated: 2025-11-29
+## P3 - Edge Case
+*(None)*
+---
+## Resolved Bugs
+### ~~P3 - Magentic Mode Missing Termination Guarantee~~ FIXED
+**Commit**: `(Pending)` (2025-11-29)
+- Added `final_event_received` tracking in `orchestrator_magentic.py`
+- Added fallback yield for "max iterations reached" scenario
+- Verified with `test_magentic_termination.py`
+### ~~P0 - Magentic Mode Report Generation~~ FIXED
+**Commit**: `9006d69` (2025-11-29)
+- Fixed `_extract_text()` to handle various message object formats
+- Increased `max_rounds` to 10 (was 3)
+- Added `temperature=1.0` for reasoning model compatibility
+- Advanced mode now produces full research reports
+### ~~P1 - Streaming Spam + API Key Persistence~~ FIXED
+**Commit**: `0c9be4a` (2025-11-29)
+- Streaming events now buffered (not token-by-token spam)
+- API key persists across example clicks via `gr.State`
+- Examples use explicit `None` values to avoid overwriting keys
+### ~~P2 - Missing "Thinking" State~~ FIXED
+**Commit**: `9006d69` (2025-11-29)
+- Added `"thinking"` event type with hourglass icon
+- Yields "Multi-agent reasoning in progress..." before blocking workflow call
+- Users now see feedback during 2-5 minute initial processing
+### ~~P1 - Gradio Settings Accordion~~ WONTFIX
+**File**: [P1_GRADIO_SETTINGS_CLEANUP.md](./P1_GRADIO_SETTINGS_CLEANUP.md)
+Decision: Removed nested Blocks, using ChatInterface directly.
+Accordion behavior is default Gradio - acceptable for demo.
 ---
+## How to Report Bugs
+1. Create `docs/bugs/P{N}_{SHORT_NAME}.md`
+2. Include: Symptom, Root Cause, Fix Plan, Test Plan
+3. Update this index
+4. Priority: P0=blocker, P1=important, P2=UX, P3=edge case

docs/bugs/P3_MAGENTIC_NO_TERMINATION_EVENT.md ADDED

@@ -0,0 +1,177 @@

+# P3 Bug Report: Advanced Mode Missing Termination Guarantee
+## Status
+- **Date:** 2025-11-29
+- **Priority:** P3 (Edge case, but confusing UX)
+- **Component:** `src/orchestrator_magentic.py`
+- **Resolution:** Fixed (Guarantee termination event)
+---
+## Symptoms
+In **Advanced (Magentic) mode** with OpenAI API key:
+1. Workflow runs for many iterations (up to 10 rounds)
+2. Agents search, judge, hypothesize repeatedly
+3. Eventually... **nothing happens**
+   - No "complete" event
+   - No error message
+   - UI just stops updating
+**User perception:** "Did it finish? Did it crash? What happened?"
+### Observed Behavior
+When workflow hits `max_round_count=10`:
+- `workflow.run_stream(task)` iterator ends
+- NO `MagenticFinalResultEvent` is emitted by agent-framework
+- Our code yields nothing after the loop
+- User is left hanging
+---
+## Root Cause Analysis
+### Code Path (`src/orchestrator_magentic.py:170-186`)
+```python
+iteration = 0
+try:
+    async for event in workflow.run_stream(task):
+        agent_event = self._process_event(event, iteration)
+        if agent_event:
+            if isinstance(event, MagenticAgentMessageEvent):
+                iteration += 1
+            yield agent_event
+    # BUG: NO FALLBACK HERE!
+    # If loop ends without FinalResultEvent, user sees nothing
+except Exception as e:
+    logger.error("Magentic workflow failed", error=str(e))
+    yield AgentEvent(
+        type="error",
+        message=f"Workflow error: {e!s}",
+        iteration=iteration,
+    )
+# BUG: NO FINALLY BLOCK TO GUARANTEE TERMINATION EVENT
+```
+### Workflow Configuration (`src/orchestrator_magentic.py:110-116`)
+```python
+.with_standard_manager(
+    chat_client=manager_client,
+    max_round_count=self._max_rounds,  # 10 - can hit this limit
+    max_stall_count=3,                  # If agents repeat 3x
+    max_reset_count=2,                  # Workflow reset limit
+)
+```
+### Failure Modes
+| Scenario | What Happens | User Sees |
+|----------|--------------|-----------|
+| `MagenticFinalResultEvent` emitted | `_process_event` yields "complete" | Final report |
+| Max rounds (10) reached, no final event | Loop ends silently | **Nothing** |
+| `max_stall_count` triggered | Workflow ends | **Nothing** |
+| `max_reset_count` triggered | Workflow ends | **Nothing** |
+| OpenAI API error | Exception caught | Error message |
+---
+## The Fix
+Add guaranteed termination event after the loop:
+```python
+iteration = 0
+final_event_received = False
+try:
+    async for event in workflow.run_stream(task):
+        agent_event = self._process_event(event, iteration)
+        if agent_event:
+            if isinstance(event, MagenticAgentMessageEvent):
+                iteration += 1
+            if agent_event.type == "complete":
+                final_event_received = True
+            yield agent_event
+    # GUARANTEE: emit a termination event if the stream ends without one
+    # (e.g., max rounds reached). Kept inside the try rather than in a
+    # finally block, so it never yields while the generator is being closed.
+    if not final_event_received:
+        logger.warning(
+            "Workflow ended without final event",
+            iterations=iteration,
+        )
+        yield AgentEvent(
+            type="complete",
+            message=(
+                f"Research completed after {iteration} agent rounds. "
+                "Max iterations reached - results may be partial. "
+                "Try a more specific query for better results."
+            ),
+            data={"iterations": iteration, "reason": "max_rounds_reached"},
+            iteration=iteration,
+        )
+except Exception as e:
+    logger.error("Magentic workflow failed", error=str(e))
+    yield AgentEvent(
+        type="error",
+        message=f"Workflow error: {e!s}",
+        iteration=iteration,
+    )
+```
+---
+## Alternative: Increase Max Rounds
+The default `max_rounds=10` might be too low for complex queries.
+In `src/orchestrator_factory.py:52-53`:
+```python
+return orchestrator_cls(
+    max_rounds=config.max_iterations if config else 10,  # Could increase to 15-20
+    api_key=api_key,
+)
+```
+**Trade-off:** More rounds = more API cost, but better chance of complete results.
+---
+## Test Plan
+- [ ] Add fallback yield after async for loop
+- [ ] Add `final_event_received` flag tracking
+- [ ] Log warning when fallback is used
+- [ ] Test with `max_rounds=2` to force hitting limit
+- [ ] Verify user always sees termination event
+- [ ] `make check` passes
+---
+## Related Files
+- `src/orchestrator_magentic.py` - Main fix location
+- `src/orchestrator_factory.py` - Max rounds configuration
+- `src/utils/models.py` - AgentEvent types
+- `docs/bugs/P2_MAGENTIC_THINKING_STATE.md` - Related UX issue (implemented)
+---
+## Priority Justification
+**P3** because:
+- Advanced mode is working for most queries
+- Only hits edge case when max rounds reached without synthesis
+- User CAN retry with different query
+- Not blocking hackathon demo (free tier Simple mode works)
+Would be P2 if:
+- This happened frequently
+- No workaround existed

src/orchestrator_magentic.py CHANGED

@@ -168,14 +168,38 @@ The final output should be a structured research report."""
         )
         iteration = 0
+        final_event_received = False
         try:
             async for event in workflow.run_stream(task):
                 agent_event = self._process_event(event, iteration)
                 if agent_event:
                     if isinstance(event, MagenticAgentMessageEvent):
                         iteration += 1
+                    if agent_event.type == "complete":
+                        final_event_received = True
                     yield agent_event
+            # GUARANTEE: Always emit termination event if stream ends without one
+            # (e.g., max rounds reached)
+            if not final_event_received:
+                logger.warning(
+                    "Workflow ended without final event",
+                    iterations=iteration,
+                )
+                yield AgentEvent(
+                    type="complete",
+                    message=(
+                        f"Research completed after {iteration} agent rounds. "
+                        "Max iterations reached - results may be partial. "
+                        "Try a more specific query for better results."
+                    ),
+                    data={"iterations": iteration, "reason": "max_rounds_reached"},
+                    iteration=iteration,
+                )
         except Exception as e:
             logger.error("Magentic workflow failed", error=str(e))
             yield AgentEvent(
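
The fallback in this hunk sits after the `async for` loop, inside the `try`, not in a `finally` block. One reason that placement is the safer one in an async generator: a `finally` block also runs when the consumer closes the generator early, and yielding at that point makes `aclose()` raise `RuntimeError`. The sketch below is an illustration with invented names (`fallback_in_finally`, `fallback_after_loop`, `close_early`); it does not use the project's classes.

```python
import asyncio


async def fallback_in_finally():
    try:
        yield "working"
        yield "more work"
    finally:
        # Also runs on early close, where yielding is not allowed.
        yield "fallback"


async def fallback_after_loop():
    try:
        yield "working"
        yield "more work"
        # Runs only when the body above finishes normally.
        yield "fallback"
    except Exception as exc:
        # GeneratorExit is a BaseException, so an early close is not caught here.
        yield f"error: {exc}"


async def close_early(make_gen):
    """Consume one item, then close the generator as an abandoned request might."""
    gen = make_gen()
    await gen.__anext__()
    try:
        await gen.aclose()
    except RuntimeError as exc:
        return str(exc)
    return "closed cleanly"


finally_result = asyncio.run(close_early(fallback_in_finally))
after_loop_result = asyncio.run(close_early(fallback_after_loop))
```

Here `after_loop_result` is `"closed cleanly"`, while `finally_result` carries the interpreter's complaint that the generator ignored `GeneratorExit`.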

tests/unit/test_magentic_termination.py ADDED

@@ -0,0 +1,111 @@

+"""Tests for Magentic Orchestrator termination guarantee."""
+from unittest.mock import MagicMock, patch
+import pytest
+# Skip tests if agent_framework is not installed (must run before it is imported)
+pytest.importorskip("agent_framework")
+from agent_framework import MagenticAgentMessageEvent  # noqa: E402
+from src.orchestrator_magentic import MagenticOrchestrator  # noqa: E402
+from src.utils.models import AgentEvent  # noqa: E402
+class MockChatMessage:
+    def __init__(self, content):
+        self.content = content
+        self.role = "assistant"
+    @property
+    def text(self):
+        return self.content
+@pytest.fixture
+def mock_magentic_requirements():
+    """Mock requirements check."""
+    with patch("src.orchestrator_magentic.check_magentic_requirements"):
+        yield
+@pytest.mark.asyncio
+async def test_termination_event_emitted_on_stream_end(mock_magentic_requirements):
+    """
+    Verify that a termination event is emitted when the workflow stream ends
+    without a MagenticFinalResultEvent (e.g. max rounds reached).
+    """
+    orchestrator = MagenticOrchestrator(max_rounds=2)
+    # Use real event class
+    mock_message = MockChatMessage("Thinking...")
+    mock_agent_event = MagenticAgentMessageEvent(agent_id="SearchAgent", message=mock_message)
+    # Mock the workflow and its run_stream method
+    mock_workflow = MagicMock()
+    # Create an async generator for run_stream
+    async def mock_stream(task):
+        # Yield the real message event
+        yield mock_agent_event
+        # STOP HERE - No FinalResultEvent
+    mock_workflow.run_stream = mock_stream
+    # Mock _build_workflow to return our mock workflow
+    with patch.object(orchestrator, "_build_workflow", return_value=mock_workflow):
+        events = []
+        async for event in orchestrator.run("Research query"):
+            events.append(event)
+        for i, e in enumerate(events):
+            print(f"Event {i}: {e.type} - {e.message}")
+        assert len(events) >= 2
+        assert events[0].type == "started"
+        # Verify the message event was processed
+        # Depending on _process_event logic, MagenticAgentMessageEvent might map to different types
+        # We assume it maps to something valid or we just check presence.
+        assert any("Thinking..." in e.message for e in events)
+        # THE CRITICAL CHECK: Did we get the fallback termination event?
+        last_event = events[-1]
+        assert last_event.type == "complete"
+        assert "Max iterations reached" in last_event.message
+        assert last_event.data.get("reason") == "max_rounds_reached"
+@pytest.mark.asyncio
+async def test_no_double_termination_event(mock_magentic_requirements):
+    """
+    Verify that we DO NOT emit a fallback event if the workflow finished normally.
+    """
+    orchestrator = MagenticOrchestrator()
+    mock_workflow = MagicMock()
+    with patch.object(orchestrator, "_build_workflow", return_value=mock_workflow):
+        # Mock _process_event to simulate a natural completion event
+        with patch.object(orchestrator, "_process_event") as mock_process:
+            mock_process.side_effect = [
+                AgentEvent(type="thinking", message="Working...", iteration=1),
+                AgentEvent(type="complete", message="Done!", iteration=2),
+            ]
+            async def mock_stream_with_yields(task):
+                yield "raw_event_1"
+                yield "raw_event_2"
+            mock_workflow.run_stream = mock_stream_with_yields
+            events = []
+            async for event in orchestrator.run("Research query"):
+                events.append(event)
+            assert events[-1].message == "Done!"
+            assert events[-1].type == "complete"
+            # Verify we didn't get a SECOND "Max iterations reached" event
+            fallback_events = [e for e in events if "Max iterations reached" in e.message]
+            assert len(fallback_events) == 0
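
Both tests check the same invariant from different angles: the stream ends with exactly one terminal event. A possible way to state that invariant once is a small checker like the one below. It is a sketch under assumptions: `terminal_violations` and `Event` are hypothetical names, and treating `"complete"` and `"error"` as the terminal types is inferred from the bug report's remark that an error is a form of termination.

```python
from dataclasses import dataclass

# Assumed set of terminal event types (inferred from the bug report, not from the codebase).
TERMINAL_TYPES = {"complete", "error"}


@dataclass
class Event:
    """Hypothetical stand-in for the project's AgentEvent."""
    type: str
    message: str


def terminal_violations(events: list[Event]) -> list[str]:
    """Return a description of each way the event list breaks the invariant."""
    problems: list[str] = []
    positions = [i for i, e in enumerate(events) if e.type in TERMINAL_TYPES]
    if not positions:
        problems.append("no terminal event")
    if len(positions) > 1:
        problems.append(f"{len(positions)} terminal events")
    if positions and positions[-1] != len(events) - 1:
        problems.append("events after terminal event")
    return problems


ok = [Event("started", "go"), Event("thinking", "..."), Event("complete", "Done!")]
silent = [Event("started", "go"), Event("thinking", "...")]
doubled = [Event("started", "go"), Event("complete", "Done!"), Event("complete", "Max iterations reached")]

ok_problems = terminal_violations(ok)
silent_problems = terminal_violations(silent)
doubled_problems = terminal_violations(doubled)
```

A test for the error path, which this commit does not add, could reuse the same check with a stream that raises.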