DeepBoner / docs /bugs /P2_DUPLICATE_REPORT_CONTENT.md
VibecoderMcSwaggins's picture
fix(p2): Resolve duplicate report content and first-turn timeout bugs (#123)
21dd8fe unverified
|
raw
history blame
5.04 kB

P2 Bug: Duplicate Report Content in Output

Date: 2025-12-03 Status: FIXED (PR fix/p2-double-bug-squash) Severity: P2 (UX - Duplicate content confuses users) Component: src/orchestrators/advanced.py Affects: Both Free Tier (HuggingFace) AND Paid Tier (OpenAI)


Executive Summary

This is a confirmed stack bug, NOT a model limitation. The duplicate report appears because:

  1. Streaming events yield the full report content character-by-character
  2. Final events (MagenticFinalResultEvent/WorkflowOutputEvent) contain the SAME content
  3. No deduplication exists between streamed content and final event content
  4. Both are appended to the output

Symptom

The final research report appears twice in the UI output:

  1. First as streaming content (with πŸ“‘ **STREAMING**: prefix)
  2. Then again as a complete event (without prefix)

Root Cause

The _process_event() method handles final events but has no access to buffer state. The buffer was already cleared at line 337 before these events arrive.

# Line 337: Buffer cleared
current_message_buffer = ""
continue

# Line 341: Final events processed WITHOUT buffer context
agent_event = self._process_event(event, iteration)  # No buffer info!

The Fix (Consensus: Stateful Orchestrator Logic)

Location: src/orchestrators/advanced.py run() method

Strategy: Handle final events inline in the run() loop where buffer state exists. Track streaming volume to decide whether to re-emit content.

Why This Is Correct

Rejected Approach Why Wrong
UI-side string comparison Wrong layer, fragile, treats symptom
Stateless _process_event fix No state = can't know if streaming occurred
Stateful run() loop βœ… Only place with full lifecycle visibility

The run() loop is the single source of truth for the request lifecycle. It "saw" the content stream out. It must decide whether to re-emit.

Implementation

# In run() method, add tracking variable after line 302:
last_streamed_length: int = 0

# Before clearing buffer at line 337, save its length:
last_streamed_length = len(current_message_buffer)
current_message_buffer = ""
continue

# Replace lines 340-345 with inline handling of final events:
if isinstance(event, (MagenticFinalResultEvent, WorkflowOutputEvent)):
    final_event_received = True

    # DECISION: Did we stream substantial content?
    if last_streamed_length > 100:
        # YES: Final event is a SIGNAL, not a payload
        yield AgentEvent(
            type="complete",
            message="Research complete.",
            data={"iterations": iteration, "streamed_chars": last_streamed_length},
            iteration=iteration,
        )
    else:
        # NO: Final event must carry the payload (tool-only turn, cache hit)
        if isinstance(event, MagenticFinalResultEvent):
            text = self._extract_text(event.message) if event.message else "No result"
        else:  # WorkflowOutputEvent
            text = self._extract_text(event.data) if event.data else "Research complete"
        yield AgentEvent(
            type="complete",
            message=text,
            data={"iterations": iteration},
            iteration=iteration,
        )
    continue

# Keep existing fallback for other events:
agent_event = self._process_event(event, iteration)

Why Threshold of 100 Chars?

  • > 0 is too aggressive (might catch single-word streams)
  • > 500 is too conservative (might miss short but complete responses)
  • > 100 distinguishes "real content was streamed" from "just status messages"

Edge Cases Handled

Scenario last_streamed_length Action
Normal streaming report 5000+ Emit "Research complete."
Tool call, no text 0 Emit full content from final event
Very short response 50 Emit full content (fallback)
Agent switch mid-stream Reset on switch Tracks only final agent

Files to Modify

File Lines Change
src/orchestrators/advanced.py 296-345 Add last_streamed_length, handle final events inline
src/orchestrators/advanced.py 532-552 Optional: remove dead code from _process_event()

Test Plan

  1. Happy Path: Run query, verify report appears ONCE
  2. Fallback: Mock tool-only turn (no streaming), verify full content emitted
  3. Both Tiers: Test Free Tier and Paid Tier

Validation

This fix was independently validated by two AI agents (Claude and Gemini) analyzing the architecture. Both concluded:

"The Stateful Orchestrator Fix is the correct engineering solution. The 'Source of Truth' is the Orchestrator's runtime state."


Related

  • Not related to model quality - This is a stack bug
  • P1 Free Tier fix enabled streaming, exposing this bug
  • SPEC-17 Accumulator Pattern addressed repr bug but created this side effect