P1: Advanced Mode Exposes Uninterpretable Chain-of-Thought Events
Priority: P1 (UX Degradation)
Component: src/orchestrators/advanced.py
Status: Resolved
Issue: #106
PR: #107
Created: 2025-12-01
Resolved: 2025-12-01
Summary
The Advanced orchestrator exposes raw internal framework events from agent-framework-core directly to users. These events contain internal manager bookkeeping (task assignments, ledgers, instructions) that are:
- Truncated mid-sentence at 200 characters
- Use internal framework terminology (
user_task,task_ledger,instruction) - Shown with misleading "JUDGING" event type
- Not meaningful to end users
Resolution
Implemented "Smart Filter + Transform" logic in src/orchestrators/advanced.py:
- Filtered:
task_ledgerandinstructionevents are now hidden. - Transformed:
user_taskevents are mapped totype="progress"with a friendly "Manager assigning research task..." message. - Smart Truncation: Text is now truncated at sentence boundaries or word boundaries, preventing mid-word cuts.
Verified with new unit tests in tests/unit/orchestrators/test_advanced_events.py.
Example of Bad Output
π§ **JUDGING**: Manager (user_task): Research sexual health and wellness interventions for: sildenafil mechanism ##...
π§ **JUDGING**: Manager (task_ledger): We are working to address the following user request: Research sexual healt...
π§ **JUDGING**: Manager (instruction): Conduct targeted searches on PubMed, ClinicalTrials.gov, and Europe PMC to ga...
Users see:
- Raw internal prompts being passed between manager and agents
- Truncated text that cuts off mid-word ("healt...", "ga...")
- Technical jargon ("task_ledger") with no context
- All events labeled as "JUDGING" even when they're task assignments
Root Cause Analysis
The Chain of Issues
| Location | Issue |
|---|---|
src/orchestrators/advanced.py:363-370 |
MagenticOrchestratorMessageEvent raw events exposed without filtering |
src/orchestrators/advanced.py:368 |
event.kind values (user_task, task_ledger, instruction) are internal framework concepts |
src/orchestrators/advanced.py:368 |
Hard truncation: text[:200]... breaks mid-sentence |
src/orchestrators/advanced.py:367 |
All manager events mapped to type="judging" regardless of actual purpose |
src/orchestrators/advanced.py:380 |
Agent messages also truncated at 200 chars |
src/utils/models.py:136 |
"judging": "π§ " icon shown for all these internal events |
src/app.py:248 |
Events displayed verbatim via event.to_markdown() |
Code Path
agent-framework-core (Microsoft)
β
MagenticOrchestratorMessageEvent(kind="task_ledger", message="...")
β
advanced.py:_process_event() - NO FILTERING
β
AgentEvent(type="judging", message=f"Manager ({event.kind}): {text[:200]}...")
β
models.py:to_markdown() β "π§ **JUDGING**: Manager (task_ledger): ..."
β
app.py β Displayed to user verbatim
Impact
- User Confusion: Users see internal framework bookkeeping, not meaningful progress
- Truncated Gibberish: 200-char limit cuts prompts mid-sentence, making them uninterpretable
- Misleading Labels: "JUDGING" event type is wrong - these are task assignments
- No Actionable Info: Users can't understand what the system is actually doing
Proposed Solutions
Option A: Filter Internal Events (Minimal)
Skip internal manager events entirely - they're framework bookkeeping:
def _process_event(self, event: Any, iteration: int) -> AgentEvent | None:
if isinstance(event, MagenticOrchestratorMessageEvent):
# Skip internal framework bookkeeping events
if event.kind in ("user_task", "task_ledger", "instruction"):
return None # Don't expose to users
# ... rest of handling
Pros: Simple, removes noise Cons: Users lose visibility into manager activity
Option B: Transform to User-Friendly Messages (Better UX)
Map internal events to meaningful user messages:
MANAGER_EVENT_MESSAGES = {
"user_task": "Manager received research task",
"task_ledger": "Manager tracking task progress",
"instruction": "Manager assigning work to agent",
}
def _process_event(self, event: Any, iteration: int) -> AgentEvent | None:
if isinstance(event, MagenticOrchestratorMessageEvent):
if event.kind in MANAGER_EVENT_MESSAGES:
return AgentEvent(
type="progress", # Not "judging"!
message=MANAGER_EVENT_MESSAGES[event.kind],
iteration=iteration,
)
Pros: Users see meaningful progress, correct event types Cons: More code, loses raw detail for debugging
Option C: Smart Truncation + Verbose Mode
- Truncate at sentence boundaries, not hard character limit
- Add
verbose_modesetting that shows full internal events for debugging - Use appropriate event types based on
event.kind
def _smart_truncate(self, text: str, max_len: int = 200) -> str:
"""Truncate at sentence boundary."""
if len(text) <= max_len:
return text
# Find last sentence boundary before limit
truncated = text[:max_len]
last_period = truncated.rfind(". ")
if last_period > max_len // 2:
return truncated[:last_period + 1]
return truncated.rsplit(" ", 1)[0] + "..."
Recommended Approach
Combine Option A + B:
- Default: Filter out
task_ledgerandinstructionevents (pure bookkeeping) - Transform:
user_taskβ "Assigning research task to agents" - Proper Types: Use
"progress"not"judging"for manager events - Future: Add verbose mode for debugging
Files to Modify
src/orchestrators/advanced.py:361-410-_process_event()methodsrc/utils/models.py:107-123- Add new event types if neededtests/unit/orchestrators/test_advanced_timeout.py- Update assertions
Related Issues
- P0: Advanced Mode Timeout No Synthesis (FIXED in PR #104)
- This P1 was discovered while testing the P0 fix
Testing the Bug
import asyncio
from src.orchestrators.advanced import AdvancedOrchestrator
async def test():
orch = AdvancedOrchestrator(max_rounds=3)
async for event in orch.run("sildenafil mechanism"):
if "Manager" in event.message:
print(f"[{event.type}] {event.message}")
# You'll see uninterpretable output
asyncio.run(test())
References
- Microsoft Agent Framework: https://github.com/microsoft/agent-framework
- AgentEvent model:
src/utils/models.py:104 - Advanced orchestrator:
src/orchestrators/advanced.py