P3 Bug Report: Advanced Mode Missing Termination Guarantee

Status

  • Date: 2025-11-29
  • Priority: P3 (Edge case, but confusing UX)
  • Component: src/orchestrator_magentic.py
  • Resolution: Fixed (Guarantee termination event)

Symptoms

In Advanced (Magentic) mode with an OpenAI API key:

  1. Workflow runs for many iterations (up to 10 rounds)
  2. Agents search, judge, hypothesize repeatedly
  3. Eventually... nothing happens
    • No "complete" event
    • No error message
    • UI just stops updating

User perception: "Did it finish? Did it crash? What happened?"

Observed Behavior

When the workflow hits max_round_count=10:

  • workflow.run_stream(task) iterator ends
  • NO MagenticFinalResultEvent is emitted by agent-framework
  • Our code yields nothing after the loop
  • User is left hanging

Root Cause Analysis

Code Path (src/orchestrator_magentic.py:170-186)

iteration = 0
try:
    async for event in workflow.run_stream(task):
        agent_event = self._process_event(event, iteration)
        if agent_event:
            if isinstance(event, MagenticAgentMessageEvent):
                iteration += 1
            yield agent_event
    # BUG: NO FALLBACK HERE!
    # If loop ends without FinalResultEvent, user sees nothing

except Exception as e:
    logger.error("Magentic workflow failed", error=str(e))
    yield AgentEvent(
        type="error",
        message=f"Workflow error: {e!s}",
        iteration=iteration,
    )
# BUG: NO FINALLY BLOCK TO GUARANTEE TERMINATION EVENT

Workflow Configuration (src/orchestrator_magentic.py:110-116)

.with_standard_manager(
    chat_client=manager_client,
    max_round_count=self._max_rounds,  # 10 - can hit this limit
    max_stall_count=3,                  # If agents repeat 3x
    max_reset_count=2,                  # Workflow reset limit
)

Failure Modes

| Scenario | What Happens | User Sees |
|---|---|---|
| MagenticFinalResultEvent emitted | _process_event yields "complete" | Final report |
| Max rounds (10) reached, no final event | Loop ends silently | Nothing |
| max_stall_count triggered | Workflow ends | Nothing |
| max_reset_count triggered | Workflow ends | Nothing |
| OpenAI API error | Exception caught | Error message |

The Fix

Add a guaranteed termination event after the loop:

iteration = 0
final_event_received = False

try:
    async for event in workflow.run_stream(task):
        agent_event = self._process_event(event, iteration)
        if agent_event:
            if isinstance(event, MagenticAgentMessageEvent):
                iteration += 1
            if agent_event.type == "complete":
                final_event_received = True
            yield agent_event

except Exception as e:
    logger.error("Magentic workflow failed", error=str(e))
    yield AgentEvent(
        type="error",
        message=f"Workflow error: {e!s}",
        iteration=iteration,
    )
    final_event_received = True  # Error is a form of termination

finally:
    # GUARANTEE: Always emit termination event
    if not final_event_received:
        logger.warning(
            "Workflow ended without final event",
            iterations=iteration,
        )
        yield AgentEvent(
            type="complete",
            message=(
                f"Research completed after {iteration} agent rounds. "
                "Max iterations reached - results may be partial. "
                "Try a more specific query for better results."
            ),
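            # Note: this branch also covers max_stall_count / max_reset_count
            # exits, so "max_rounds_reached" is a best-effort label.
            data={"iterations": iteration, "reason": "max_rounds_reached"},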
            iteration=iteration,
        )
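The guarantee can be exercised in isolation. The following is a minimal sketch, not the project's code: `fake_stream` is a hypothetical stand-in for `workflow.run_stream(task)`, `AgentEvent` is a simplified stand-in for the model in src/utils/models.py, and the `"no_final_event"` reason string is illustrative. It runs the three cases from the Failure Modes table (final event, silent end, exception) and shows that each stream ends with exactly one terminal event.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, AsyncIterator


@dataclass
class AgentEvent:
    # Simplified stand-in for the project's AgentEvent model (assumption).
    type: str
    message: str
    iteration: int = 0
    data: dict[str, Any] = field(default_factory=dict)


async def fake_stream(kinds: list[str]) -> AsyncIterator[str]:
    # Hypothetical stand-in for workflow.run_stream(task).
    for kind in kinds:
        if kind == "boom":
            raise RuntimeError("simulated API failure")
        yield kind


async def run(kinds: list[str]) -> AsyncIterator[AgentEvent]:
    iteration = 0
    terminated = False
    try:
        async for kind in fake_stream(kinds):
            if kind == "message":
                iteration += 1
                yield AgentEvent("progress", "agent message", iteration)
            elif kind == "final":
                terminated = True
                yield AgentEvent("complete", "final report", iteration)
    except Exception as e:
        terminated = True  # an error is also a form of termination
        yield AgentEvent("error", f"Workflow error: {e!s}", iteration)

    # Fallback: reached only when the stream ended without a terminal event.
    if not terminated:
        yield AgentEvent(
            "complete",
            f"Research completed after {iteration} agent rounds. "
            "Results may be partial.",
            iteration,
            {"iterations": iteration, "reason": "no_final_event"},
        )


async def collect(kinds: list[str]) -> list[str]:
    return [event.type async for event in run(kinds)]


silent = asyncio.run(collect(["message", "message"]))
normal = asyncio.run(collect(["message", "final"]))
failed = asyncio.run(collect(["message", "boom"]))
print(silent, normal, failed)
```

One design difference from the fix above: the sketch places the fallback after the `try`/`except` (as the first Test Plan item describes) rather than inside `finally`. A `yield` inside `finally` also executes when the consumer closes the generator early, and Python raises `RuntimeError` if an async generator yields while it is being closed.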

Alternative: Increase Max Rounds

The default max_rounds=10 might be too low for complex queries.

In src/orchestrator_factory.py:52-53:

return orchestrator_cls(
    max_rounds=config.max_iterations if config else 10,  # Could increase to 15-20
    api_key=api_key,
)

Trade-off: More rounds = more API cost, but better chance of complete results.
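If the limit is raised, it may be worth bounding it so a misconfigured value cannot run up unbounded API cost. A minimal sketch, assuming a config object with a `max_iterations` attribute as in the factory snippet above; the `Config` class, the `resolve_max_rounds` helper, and the ceiling of 20 are hypothetical and chosen only to match the 15-20 range mentioned in the comment.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_MAX_ROUNDS = 10
MAX_ROUNDS_CEILING = 20  # hypothetical upper bound to cap API cost


@dataclass
class Config:
    # Hypothetical stand-in for the project's config object.
    max_iterations: int = DEFAULT_MAX_ROUNDS


def resolve_max_rounds(config: Optional[Config]) -> int:
    """Return the requested round limit, clamped to [1, MAX_ROUNDS_CEILING]."""
    requested = config.max_iterations if config else DEFAULT_MAX_ROUNDS
    return max(1, min(requested, MAX_ROUNDS_CEILING))


print(resolve_max_rounds(None), resolve_max_rounds(Config(max_iterations=50)))
```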


Test Plan

  • Add fallback yield after async for loop
  • Add final_event_received flag tracking
  • Log warning when fallback is used
  • Test with max_rounds=2 to force hitting limit
  • Verify user always sees termination event
  • make check passes
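The "force the limit" and "always sees termination event" items could be checked along these lines. This is a sketch under stated assumptions: `stub_orchestrator` is a hypothetical stand-in that mimics an orchestrator hitting `max_rounds` without a final result, events are plain dicts rather than the real `AgentEvent`, and `check_termination` is an illustrative helper, not an existing project function.

```python
import asyncio
from typing import AsyncIterator

TERMINAL_TYPES = {"complete", "error"}


async def stub_orchestrator(max_rounds: int) -> AsyncIterator[dict]:
    # Hypothetical stand-in: one progress event per round, no final result
    # from the workflow, then the fallback termination event.
    for round_no in range(1, max_rounds + 1):
        yield {"type": "progress", "iteration": round_no}
    yield {
        "type": "complete",
        "iteration": max_rounds,
        "data": {"reason": "max_rounds_reached"},
    }


def check_termination(events: list[dict]) -> None:
    """Assert that the stream ends with exactly one terminal event."""
    terminal = [e for e in events if e["type"] in TERMINAL_TYPES]
    assert len(terminal) == 1, f"expected one terminal event, got {len(terminal)}"
    assert events[-1]["type"] in TERMINAL_TYPES, "stream must end with a terminal event"


async def collect(max_rounds: int) -> list[dict]:
    return [event async for event in stub_orchestrator(max_rounds)]


# max_rounds=2 forces the limit quickly, as the test plan suggests.
events = asyncio.run(collect(max_rounds=2))
check_termination(events)
```

In a real test, the stub would be replaced by the Magentic orchestrator with a mocked workflow, and the same check applied to the collected events.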

Related Files

  • src/orchestrator_magentic.py - Main fix location
  • src/orchestrator_factory.py - Max rounds configuration
  • src/utils/models.py - AgentEvent types
  • docs/bugs/P2_MAGENTIC_THINKING_STATE.md - Related UX issue (implemented)

Priority Justification

P3 because:

  • Advanced mode works for most queries
  • The edge case only occurs when max rounds are reached without synthesis
  • The user can retry with a different query
  • It does not block the hackathon demo (free tier Simple mode works)

Would be P2 if:

  • This happened frequently
  • No workaround existed