P1: Advanced Mode Exposes Uninterpretable Chain-of-Thought Events

Priority: P1 (UX Degradation)
Component: src/orchestrators/advanced.py
Status: Resolved
Issue: #106
PR: #107
Created: 2025-12-01
Resolved: 2025-12-01

Summary

The Advanced orchestrator exposes raw internal framework events from agent-framework-core directly to users. These events contain internal manager bookkeeping (task assignments, ledgers, instructions) and are:

Truncated mid-sentence at 200 characters
Written in internal framework terminology (user_task, task_ledger, instruction)
Labeled with a misleading "JUDGING" event type
Not meaningful to end users

Resolution

Implemented "Smart Filter + Transform" logic in src/orchestrators/advanced.py:

Filtered: task_ledger and instruction events are now hidden.
Transformed: user_task events are mapped to type="progress" with a friendly "Manager assigning research task..." message.
Smart Truncation: Text is now truncated at sentence boundaries or word boundaries, preventing mid-word cuts.

Verified with new unit tests in tests/unit/orchestrators/test_advanced_events.py.
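
The filter-and-transform behavior described above can be pictured with a minimal, self-contained sketch. `ManagerEvent`, `UiEvent`, and `to_ui_event` are hypothetical stand-ins, not the real `MagenticOrchestratorMessageEvent`, `AgentEvent`, or `_process_event()`. Only the event kinds, the `"progress"` type, and the friendly message are taken from this document, and the pass-through branch for other kinds is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

# Kinds hidden from users, per the Resolution list above.
HIDDEN_KINDS = frozenset({"task_ledger", "instruction"})


@dataclass
class ManagerEvent:
    """Hypothetical stand-in for the framework's manager event."""
    kind: str
    text: str


@dataclass
class UiEvent:
    """Hypothetical stand-in for AgentEvent."""
    type: str
    message: str


def to_ui_event(event: ManagerEvent) -> Optional[UiEvent]:
    """Return a user-facing event, or None when the event should be hidden."""
    if event.kind in HIDDEN_KINDS:
        return None  # filtered: pure bookkeeping
    if event.kind == "user_task":
        # transformed: a fixed friendly message replaces the raw prompt
        return UiEvent(type="progress", message="Manager assigning research task...")
    # Assumption: any other kind is passed through; the real handling may differ.
    return UiEvent(type="progress", message=event.text)
```

Returning `None` for hidden kinds lets the caller skip the event without a separate "should I show this?" check.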

Example of Bad Output

🧠 **JUDGING**: Manager (user_task): Research sexual health and wellness interventions for: sildenafil mechanism  ##...

🧠 **JUDGING**: Manager (task_ledger):  We are working to address the following user request:  Research sexual healt...

🧠 **JUDGING**: Manager (instruction): Conduct targeted searches on PubMed, ClinicalTrials.gov, and Europe PMC to ga...

Users see:

Raw internal prompts being passed between manager and agents
Truncated text that cuts off mid-word ("healt...", "ga...")
Technical jargon ("task_ledger") with no context
All events labeled as "JUDGING" even when they're task assignments

Root Cause Analysis

The Chain of Issues

| Location | Issue |
|---|---|
| `src/orchestrators/advanced.py:363-370` | `MagenticOrchestratorMessageEvent` raw events exposed without filtering |
| `src/orchestrators/advanced.py:368` | `event.kind` values (`user_task`, `task_ledger`, `instruction`) are internal framework concepts |
| `src/orchestrators/advanced.py:368` | Hard truncation: `text[:200]...` breaks mid-sentence |
| `src/orchestrators/advanced.py:367` | All manager events mapped to `type="judging"` regardless of actual purpose |
| `src/orchestrators/advanced.py:380` | Agent messages also truncated at 200 chars |
| `src/utils/models.py:136` | `"judging": "🧠"` icon shown for all these internal events |
| `src/app.py:248` | Events displayed verbatim via `event.to_markdown()` |

Code Path

agent-framework-core (Microsoft)
        ↓
MagenticOrchestratorMessageEvent(kind="task_ledger", message="...")
        ↓
advanced.py:_process_event() - NO FILTERING
        ↓
AgentEvent(type="judging", message=f"Manager ({event.kind}): {text[:200]}...")
        ↓
models.py:to_markdown() → "🧠 **JUDGING**: Manager (task_ledger): ..."
        ↓
app.py → Displayed to user verbatim
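
To make the failure concrete, the sketch below imitates the last three steps of this path in isolation. `format_manager_event` and `to_markdown` are hypothetical helpers modeled on the strings shown in this document; the real `AgentEvent.to_markdown()` may be implemented differently, and the sample prompt is made up.

```python
def format_manager_event(kind: str, text: str) -> str:
    """Mimic the pre-fix message: hard 200-character cut plus a literal '...'."""
    return f"Manager ({kind}): {text[:200]}..."


def to_markdown(event_type: str, message: str) -> str:
    """Hypothetical rendering that matches the bad output shown earlier."""
    icon = "🧠" if event_type == "judging" else "•"
    return f"{icon} **{event_type.upper()}**: {message}"


# A made-up long prompt; the slice stops wherever character 200 happens to fall.
prompt = "We are working to address the following user request: " + "research " * 40
line = to_markdown("judging", format_manager_event("task_ledger", prompt))
print(line)
```

Two details stand out in this form: the slice ignores word and sentence boundaries, and the `...` is appended even when the text is shorter than 200 characters.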

Impact

User Confusion: Users see internal framework bookkeeping, not meaningful progress
Truncated Gibberish: 200-char limit cuts prompts mid-sentence, making them uninterpretable
Misleading Labels: "JUDGING" event type is wrong - these are task assignments
No Actionable Info: Users can't understand what the system is actually doing

Proposed Solutions

Option A: Filter Internal Events (Minimal)

Skip internal manager events entirely - they're framework bookkeeping:

def _process_event(self, event: Any, iteration: int) -> AgentEvent | None:
    if isinstance(event, MagenticOrchestratorMessageEvent):
        # Skip internal framework bookkeeping events
        if event.kind in ("user_task", "task_ledger", "instruction"):
            return None  # Don't expose to users
        # ... rest of handling

Pros: Simple, removes noise
Cons: Users lose visibility into manager activity

Option B: Transform to User-Friendly Messages (Better UX)

Map internal events to meaningful user messages:

MANAGER_EVENT_MESSAGES = {
    "user_task": "Manager received research task",
    "task_ledger": "Manager tracking task progress",
    "instruction": "Manager assigning work to agent",
}

def _process_event(self, event: Any, iteration: int) -> AgentEvent | None:
    if isinstance(event, MagenticOrchestratorMessageEvent):
        if event.kind in MANAGER_EVENT_MESSAGES:
            return AgentEvent(
                type="progress",  # Not "judging"!
                message=MANAGER_EVENT_MESSAGES[event.kind],
                iteration=iteration,
            )

Pros: Users see meaningful progress, correct event types
Cons: More code, loses raw detail for debugging

Option C: Smart Truncation + Verbose Mode

Truncate at sentence boundaries instead of a hard character limit
Add verbose_mode setting that shows full internal events for debugging
Use appropriate event types based on event.kind

def _smart_truncate(self, text: str, max_len: int = 200) -> str:
    """Truncate at a sentence boundary, falling back to a word boundary."""
    if len(text) <= max_len:
        return text
    truncated = text[:max_len]
    # Prefer the last sentence end, but only if it lies past the midpoint of max_len
    last_period = truncated.rfind(". ")
    if last_period > max_len // 2:
        return truncated[:last_period + 1]
    # Otherwise cut at the last space so no word is split, and mark the cut
    return truncated.rsplit(" ", 1)[0] + "..."
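
A standalone variant of the helper is sketched below so its behavior can be checked outside the orchestrator. It is a sketch, not the code merged in PR #107: the name `smart_truncate`, the guard that avoids dropping a complete final word, and the sample strings are assumptions for illustration.

```python
def smart_truncate(text: str, max_len: int = 200) -> str:
    """Shorten text at a sentence boundary if possible, else at a word boundary."""
    if len(text) <= max_len:
        return text
    window = text[:max_len]
    # Prefer the last sentence end, but only if it lies past the window's midpoint.
    last_period = window.rfind(". ")
    if last_period > max_len // 2:
        return window[: last_period + 1]
    # Otherwise drop the trailing partial word (if any) and mark the cut.
    if not text[max_len].isspace() and " " in window:
        window = window.rsplit(" ", 1)[0]
    return window.rstrip() + "..."


print(smart_truncate("First sentence here. Second sentence continues.", 30))
print(smart_truncate("alpha beta gamma delta", 12))
```

The first call ends at the period after "here", and the second falls back to the last full word and appends `...`. Note that the word-boundary branch can return up to three characters more than `max_len` because of the appended ellipsis.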

Recommended Approach

Combine Option A + B:

Default: Filter out task_ledger and instruction events (pure bookkeeping)
Transform: user_task → "Assigning research task to agents"
Proper Types: Use "progress" not "judging" for manager events
Future: Add verbose mode for debugging
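
The "Future" item could take roughly the following shape. This is a hypothetical sketch: the `verbose` parameter, the `describe_manager_event` helper, and the `"debug"` event type are illustrative assumptions and are not described anywhere in the existing code paths listed in this document.

```python
from typing import Optional, Tuple

# Friendly text for kinds that are shown to users (wording from the list above).
FRIENDLY_MESSAGES = {"user_task": "Assigning research task to agents"}
# Kinds treated as pure bookkeeping and hidden by default.
BOOKKEEPING_KINDS = frozenset({"task_ledger", "instruction"})


def describe_manager_event(
    kind: str, text: str, verbose: bool = False
) -> Optional[Tuple[str, str]]:
    """Return (event_type, message) for display, or None to hide the event."""
    if verbose:
        # Debugging: show every manager event in full, under its real kind.
        return ("debug", f"Manager ({kind}): {text}")
    if kind in BOOKKEEPING_KINDS:
        return None
    if kind in FRIENDLY_MESSAGES:
        return ("progress", FRIENDLY_MESSAGES[kind])
    # Assumption: unknown kinds are passed through as progress messages.
    return ("progress", text)
```

Keeping the verbose branch first means debugging output bypasses both the filter and the friendly-message table, so nothing is hidden when a developer asks for full detail.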

Files to Modify

src/orchestrators/advanced.py:361-410 - _process_event() method
src/utils/models.py:107-123 - Add new event types if needed
tests/unit/orchestrators/test_advanced_timeout.py - Update assertions

Related Issues

P0: Advanced Mode Timeout No Synthesis (FIXED in PR #104)
This P1 was discovered while testing the P0 fix

Testing the Bug

import asyncio
from src.orchestrators.advanced import AdvancedOrchestrator

async def test():
    orch = AdvancedOrchestrator(max_rounds=3)
    async for event in orch.run("sildenafil mechanism"):
        if "Manager" in event.message:
            print(f"[{event.type}] {event.message}")
            # You'll see uninterpretable output

asyncio.run(test())

References

Microsoft Agent Framework: https://github.com/microsoft/agent-framework
AgentEvent model: src/utils/models.py:104
Advanced orchestrator: src/orchestrators/advanced.py