VibecoderMcSwaggins commited on
Commit
d36ce3c
·
1 Parent(s): 3599f0a

Fix P3 bug: Guarantee termination event in Magentic mode

Browse files

- Add fallback yield in MagenticOrchestrator.run when max rounds reached
- Track final_event_received flag to prevent double termination
- Add unit tests for termination guarantee and normal completion

docs/bugs/ACTIVE_BUGS.md CHANGED
@@ -1,39 +1,55 @@
1
  # Active Bugs
2
 
3
- > Last updated: 2025-11-28
4
 
5
- ## P0 - Critical
6
 
7
- ### Magentic Mode Report Generation
8
- **File**: [FIX_PLAN_MAGENTIC_MODE.md](./FIX_PLAN_MAGENTIC_MODE.md)
9
 
10
- **Symptom**: Magentic mode returns `ChatMessage` object instead of synthesized report text.
11
 
12
- **Root Cause**:
13
- - `event.message.text` extraction fails in orchestrator
14
- - `max_rounds=3` too low for SearchAgent + JudgeAgent + ReportAgent sequence
15
 
16
- **Workaround**: Use Simple mode (default) - works correctly with all LLM providers.
 
17
 
18
- **Status**: Fix plan documented, not yet implemented.
 
 
19
 
20
- ---
 
21
 
22
- ## P1 - Minor UX
 
 
 
23
 
24
- ### Gradio Settings Accordion Won't Collapse
25
- **File**: [P1_GRADIO_SETTINGS_CLEANUP.md](./P1_GRADIO_SETTINGS_CLEANUP.md)
26
 
27
- **Symptom**: Settings accordion stays open after user interaction.
 
 
28
 
29
- **Root Cause**: Nested `gr.Blocks` context prevents accordion state management.
 
30
 
31
- **Impact**: UX only - all functionality works correctly.
 
 
32
 
33
- **Status**: Solution documented, not yet implemented.
 
 
 
 
34
 
35
  ---
36
 
37
- ## Resolved Bugs
38
 
39
- *None currently - bugs above are still open.*
 
 
 
 
1
  # Active Bugs
2
 
3
+ > Last updated: 2025-11-29
4
 
5
+ ## P3 - Edge Case
6
 
7
+ *(None)*
 
8
 
9
+ ---
10
 
11
+ ## Resolved Bugs
 
 
12
 
13
+ ### ~~P3 - Magentic Mode Missing Termination Guarantee~~ FIXED
14
+ **Commit**: `(Pending)` (2025-11-29)
15
 
16
+ - Added `final_event_received` tracking in `orchestrator_magentic.py`
17
+ - Added fallback yield for "max iterations reached" scenario
18
+ - Verified with `test_magentic_termination.py`
19
 
20
+ ### ~~P0 - Magentic Mode Report Generation~~ FIXED
21
+ **Commit**: `9006d69` (2025-11-29)
22
 
23
+ - Fixed `_extract_text()` to handle various message object formats
24
+ - Increased `max_rounds=10` (was 3)
25
+ - Added `temperature=1.0` for reasoning model compatibility
26
+ - Advanced mode now produces full research reports
27
 
28
+ ### ~~P1 - Streaming Spam + API Key Persistence~~ FIXED
29
+ **Commit**: `0c9be4a` (2025-11-29)
30
 
31
+ - Streaming events now buffered (not token-by-token spam)
32
+ - API key persists across example clicks via `gr.State`
33
+ - Examples use explicit `None` values to avoid overwriting keys
34
 
35
+ ### ~~P2 - Missing "Thinking" State~~ FIXED
36
+ **Commit**: `9006d69` (2025-11-29)
37
 
38
+ - Added `"thinking"` event type with hourglass icon
39
+ - Yields "Multi-agent reasoning in progress..." before blocking workflow call
40
+ - Users now see feedback during 2-5 minute initial processing
41
 
42
+ ### ~~P1 - Gradio Settings Accordion~~ WONTFIX
43
+ **File**: [P1_GRADIO_SETTINGS_CLEANUP.md](./P1_GRADIO_SETTINGS_CLEANUP.md)
44
+
45
+ Decision: Removed nested Blocks, using ChatInterface directly.
46
+ Accordion behavior is default Gradio - acceptable for demo.
47
 
48
  ---
49
 
50
+ ## How to Report Bugs
51
 
52
+ 1. Create `docs/bugs/P{N}_{SHORT_NAME}.md`
53
+ 2. Include: Symptom, Root Cause, Fix Plan, Test Plan
54
+ 3. Update this index
55
+ 4. Priority: P0=blocker, P1=important, P2=UX, P3=edge case
docs/bugs/P3_MAGENTIC_NO_TERMINATION_EVENT.md ADDED
@@ -0,0 +1,177 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # P3 Bug Report: Advanced Mode Missing Termination Guarantee
2
+
3
+ ## Status
4
+ - **Date:** 2025-11-29
5
+ - **Priority:** P3 (Edge case, but confusing UX)
6
+ - **Component:** `src/orchestrator_magentic.py`
7
+ - **Resolution:** Fixed (Guarantee termination event)
8
+
9
+ ---
10
+
11
+ ## Symptoms
12
+
13
+ In **Advanced (Magentic) mode** with OpenAI API key:
14
+
15
+ 1. Workflow runs for many iterations (up to 10 rounds)
16
+ 2. Agents search, judge, hypothesize repeatedly
17
+ 3. Eventually... **nothing happens**
18
+ - No "complete" event
19
+ - No error message
20
+ - UI just stops updating
21
+
22
+ **User perception:** "Did it finish? Did it crash? What happened?"
23
+
24
+ ### Observed Behavior
25
+
26
+ When workflow hits `max_round_count=10`:
27
+ - `workflow.run_stream(task)` iterator ends
28
+ - NO `MagenticFinalResultEvent` is emitted by agent-framework
29
+ - Our code yields nothing after the loop
30
+ - User is left hanging
31
+
32
+ ---
33
+
34
+ ## Root Cause Analysis
35
+
36
+ ### Code Path (`src/orchestrator_magentic.py:170-186`)
37
+
38
+ ```python
39
+ iteration = 0
40
+ try:
41
+ async for event in workflow.run_stream(task):
42
+ agent_event = self._process_event(event, iteration)
43
+ if agent_event:
44
+ if isinstance(event, MagenticAgentMessageEvent):
45
+ iteration += 1
46
+ yield agent_event
47
+ # BUG: NO FALLBACK HERE!
48
+ # If loop ends without FinalResultEvent, user sees nothing
49
+
50
+ except Exception as e:
51
+ logger.error("Magentic workflow failed", error=str(e))
52
+ yield AgentEvent(
53
+ type="error",
54
+ message=f"Workflow error: {e!s}",
55
+ iteration=iteration,
56
+ )
57
+ # BUG: NO FINALLY BLOCK TO GUARANTEE TERMINATION EVENT
58
+ ```
59
+
60
+ ### Workflow Configuration (`src/orchestrator_magentic.py:110-116`)
61
+
62
+ ```python
63
+ .with_standard_manager(
64
+ chat_client=manager_client,
65
+ max_round_count=self._max_rounds, # 10 - can hit this limit
66
+ max_stall_count=3, # If agents repeat 3x
67
+ max_reset_count=2, # Workflow reset limit
68
+ )
69
+ ```
70
+
71
+ ### Failure Modes
72
+
73
+ | Scenario | What Happens | User Sees |
74
+ |----------|--------------|-----------|
75
+ | `MagenticFinalResultEvent` emitted | `_process_event` yields "complete" | Final report |
76
+ | Max rounds (10) reached, no final event | Loop ends silently | **Nothing** |
77
+ | `max_stall_count` triggered | Workflow ends | **Nothing** |
78
+ | `max_reset_count` triggered | Workflow ends | **Nothing** |
79
+ | OpenAI API error | Exception caught | Error message |
80
+
81
+ ---
82
+
83
+ ## The Fix
84
+
85
+ Add guaranteed termination event after the loop:
86
+
87
+ ```python
88
+ iteration = 0
89
+ final_event_received = False
90
+
91
+ try:
92
+ async for event in workflow.run_stream(task):
93
+ agent_event = self._process_event(event, iteration)
94
+ if agent_event:
95
+ if isinstance(event, MagenticAgentMessageEvent):
96
+ iteration += 1
97
+ if agent_event.type == "complete":
98
+ final_event_received = True
99
+ yield agent_event
100
+
101
+ except Exception as e:
102
+ logger.error("Magentic workflow failed", error=str(e))
103
+ yield AgentEvent(
104
+ type="error",
105
+ message=f"Workflow error: {e!s}",
106
+ iteration=iteration,
107
+ )
108
+ final_event_received = True # Error is a form of termination
109
+
110
+ finally:
111
+ # GUARANTEE: Always emit termination event
112
+ if not final_event_received:
113
+ logger.warning(
114
+ "Workflow ended without final event",
115
+ iterations=iteration,
116
+ )
117
+ yield AgentEvent(
118
+ type="complete",
119
+ message=(
120
+ f"Research completed after {iteration} agent rounds. "
121
+ "Max iterations reached - results may be partial. "
122
+ "Try a more specific query for better results."
123
+ ),
124
+ data={"iterations": iteration, "reason": "max_rounds_reached"},
125
+ iteration=iteration,
126
+ )
127
+ ```
128
+
129
+ ---
130
+
131
+ ## Alternative: Increase Max Rounds
132
+
133
+ The default `max_rounds=10` might be too low for complex queries.
134
+
135
+ In `src/orchestrator_factory.py:52-53`:
136
+ ```python
137
+ return orchestrator_cls(
138
+ max_rounds=config.max_iterations if config else 10, # Could increase to 15-20
139
+ api_key=api_key,
140
+ )
141
+ ```
142
+
143
+ **Trade-off:** More rounds = more API cost, but better chance of complete results.
144
+
145
+ ---
146
+
147
+ ## Test Plan
148
+
149
+ - [ ] Add fallback yield after async for loop
150
+ - [ ] Add `final_event_received` flag tracking
151
+ - [ ] Log warning when fallback is used
152
+ - [ ] Test with `max_rounds=2` to force hitting limit
153
+ - [ ] Verify user always sees termination event
154
+ - [ ] `make check` passes
155
+
156
+ ---
157
+
158
+ ## Related Files
159
+
160
+ - `src/orchestrator_magentic.py` - Main fix location
161
+ - `src/orchestrator_factory.py` - Max rounds configuration
162
+ - `src/utils/models.py` - AgentEvent types
163
+ - `docs/bugs/P2_MAGENTIC_THINKING_STATE.md` - Related UX issue (implemented)
164
+
165
+ ---
166
+
167
+ ## Priority Justification
168
+
169
+ **P3** because:
170
+ - Advanced mode is working for most queries
171
+ - Only hits edge case when max rounds reached without synthesis
172
+ - User CAN retry with different query
173
+ - Not blocking hackathon demo (free tier Simple mode works)
174
+
175
+ Would be P2 if:
176
+ - This happened frequently
177
+ - No workaround existed
src/orchestrator_magentic.py CHANGED
@@ -168,14 +168,38 @@ The final output should be a structured research report."""
168
  )
169
 
170
  iteration = 0
 
 
171
  try:
172
  async for event in workflow.run_stream(task):
173
  agent_event = self._process_event(event, iteration)
174
  if agent_event:
175
  if isinstance(event, MagenticAgentMessageEvent):
176
  iteration += 1
 
 
 
 
177
  yield agent_event
178
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
179
  except Exception as e:
180
  logger.error("Magentic workflow failed", error=str(e))
181
  yield AgentEvent(
 
168
  )
169
 
170
  iteration = 0
171
+ final_event_received = False
172
+
173
  try:
174
  async for event in workflow.run_stream(task):
175
  agent_event = self._process_event(event, iteration)
176
  if agent_event:
177
  if isinstance(event, MagenticAgentMessageEvent):
178
  iteration += 1
179
+
180
+ if agent_event.type == "complete":
181
+ final_event_received = True
182
+
183
  yield agent_event
184
 
185
+ # GUARANTEE: Always emit termination event if stream ends without one
186
+ # (e.g., max rounds reached)
187
+ if not final_event_received:
188
+ logger.warning(
189
+ "Workflow ended without final event",
190
+ iterations=iteration,
191
+ )
192
+ yield AgentEvent(
193
+ type="complete",
194
+ message=(
195
+ f"Research completed after {iteration} agent rounds. "
196
+ "Max iterations reached - results may be partial. "
197
+ "Try a more specific query for better results."
198
+ ),
199
+ data={"iterations": iteration, "reason": "max_rounds_reached"},
200
+ iteration=iteration,
201
+ )
202
+
203
  except Exception as e:
204
  logger.error("Magentic workflow failed", error=str(e))
205
  yield AgentEvent(
tests/unit/test_magentic_termination.py ADDED
@@ -0,0 +1,111 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """Tests for Magentic Orchestrator termination guarantee."""
2
+
3
+ from unittest.mock import MagicMock, patch
4
+
5
+ import pytest
6
+ from agent_framework import MagenticAgentMessageEvent
7
+
8
+ from src.orchestrator_magentic import MagenticOrchestrator
9
+ from src.utils.models import AgentEvent
10
+
11
+ # Skip tests if agent_framework is not installed
12
+ pytest.importorskip("agent_framework")
13
+
14
+
15
+ class MockChatMessage:
16
+ def __init__(self, content):
17
+ self.content = content
18
+ self.role = "assistant"
19
+
20
+ @property
21
+ def text(self):
22
+ return self.content
23
+
24
+
25
+ @pytest.fixture
26
+ def mock_magentic_requirements():
27
+ """Mock requirements check."""
28
+ with patch("src.orchestrator_magentic.check_magentic_requirements"):
29
+ yield
30
+
31
+
32
+ @pytest.mark.asyncio
33
+ async def test_termination_event_emitted_on_stream_end(mock_magentic_requirements):
34
+ """
35
+ Verify that a termination event is emitted when the workflow stream ends
36
+ without a MagenticFinalResultEvent (e.g. max rounds reached).
37
+ """
38
+ orchestrator = MagenticOrchestrator(max_rounds=2)
39
+
40
+ # Use real event class
41
+ mock_message = MockChatMessage("Thinking...")
42
+ mock_agent_event = MagenticAgentMessageEvent(agent_id="SearchAgent", message=mock_message)
43
+
44
+ # Mock the workflow and its run_stream method
45
+ mock_workflow = MagicMock()
46
+
47
+ # Create an async generator for run_stream
48
+ async def mock_stream(task):
49
+ # Yield the real message event
50
+ yield mock_agent_event
51
+ # STOP HERE - No FinalResultEvent
52
+
53
+ mock_workflow.run_stream = mock_stream
54
+
55
+ # Mock _build_workflow to return our mock workflow
56
+ with patch.object(orchestrator, "_build_workflow", return_value=mock_workflow):
57
+ events = []
58
+ async for event in orchestrator.run("Research query"):
59
+ events.append(event)
60
+
61
+ for i, e in enumerate(events):
62
+ print(f"Event {i}: {e.type} - {e.message}")
63
+
64
+ assert len(events) >= 2
65
+ assert events[0].type == "started"
66
+
67
+ # Verify the message event was processed
68
+ # Depending on _process_event logic, MagenticAgentMessageEvent might map to different types
69
+ # We assume it maps to something valid or we just check presence.
70
+ assert any("Thinking..." in e.message for e in events)
71
+
72
+ # THE CRITICAL CHECK: Did we get the fallback termination event?
73
+ last_event = events[-1]
74
+ assert last_event.type == "complete"
75
+ assert "Max iterations reached" in last_event.message
76
+ assert last_event.data.get("reason") == "max_rounds_reached"
77
+
78
+
79
+ @pytest.mark.asyncio
80
+ async def test_no_double_termination_event(mock_magentic_requirements):
81
+ """
82
+ Verify that we DO NOT emit a fallback event if the workflow finished normally.
83
+ """
84
+ orchestrator = MagenticOrchestrator()
85
+
86
+ mock_workflow = MagicMock()
87
+
88
+ with patch.object(orchestrator, "_build_workflow", return_value=mock_workflow):
89
+ # Mock _process_event to simulate a natural completion event
90
+ with patch.object(orchestrator, "_process_event") as mock_process:
91
+ mock_process.side_effect = [
92
+ AgentEvent(type="thinking", message="Working...", iteration=1),
93
+ AgentEvent(type="complete", message="Done!", iteration=2),
94
+ ]
95
+
96
+ async def mock_stream_with_yields(task):
97
+ yield "raw_event_1"
98
+ yield "raw_event_2"
99
+
100
+ mock_workflow.run_stream = mock_stream_with_yields
101
+
102
+ events = []
103
+ async for event in orchestrator.run("Research query"):
104
+ events.append(event)
105
+
106
+ assert events[-1].message == "Done!"
107
+ assert events[-1].type == "complete"
108
+
109
+ # Verify we didn't get a SECOND "Max iterations reached" event
110
+ fallback_events = [e for e in events if "Max iterations reached" in e.message]
111
+ assert len(fallback_events) == 0