# P1: Advanced Mode Exposes Uninterpretable Chain-of-Thought Events
**Priority**: P1 (UX Degradation)
**Component**: `src/orchestrators/advanced.py`
**Status**: Resolved
**Issue**: [#106](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/issues/106)
**PR**: [#107](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/pull/107)
**Created**: 2025-12-01
**Resolved**: 2025-12-01
## Summary
The Advanced orchestrator exposes raw internal framework events from `agent-framework-core` directly to users. These events contain internal manager bookkeeping (task assignments, ledgers, instructions) and reach the UI with four problems:
1. They are truncated mid-sentence at 200 characters
2. They use internal framework terminology (`user_task`, `task_ledger`, `instruction`)
3. They are shown with a misleading "JUDGING" event type
4. They are not meaningful to end users
## Resolution
Implemented "Smart Filter + Transform" logic in `src/orchestrators/advanced.py`:
1. **Filtered**: `task_ledger` and `instruction` events are now hidden.
2. **Transformed**: `user_task` events are mapped to `type="progress"` with a friendly "Manager assigning research task..." message.
3. **Smart Truncation**: Text is now truncated at sentence boundaries or word boundaries, preventing mid-word cuts.
Verified with new unit tests in `tests/unit/orchestrators/test_advanced_events.py`.
## Example of Bad Output
```
🧠 **JUDGING**: Manager (user_task): Research sexual health and wellness interventions for: sildenafil mechanism ##...
🧠 **JUDGING**: Manager (task_ledger): We are working to address the following user request: Research sexual healt...
🧠 **JUDGING**: Manager (instruction): Conduct targeted searches on PubMed, ClinicalTrials.gov, and Europe PMC to ga...
```
Users see:
- Raw internal prompts being passed between manager and agents
- Truncated text that cuts off mid-word ("healt...", "ga...")
- Technical jargon ("task_ledger") with no context
- All events labeled as "JUDGING" even when they're task assignments
## Root Cause Analysis
### The Chain of Issues
| Location | Issue |
|----------|-------|
| `src/orchestrators/advanced.py:363-370` | `MagenticOrchestratorMessageEvent` raw events exposed without filtering |
| `src/orchestrators/advanced.py:368` | `event.kind` values (`user_task`, `task_ledger`, `instruction`) are internal framework concepts |
| `src/orchestrators/advanced.py:368` | Hard truncation: `text[:200]...` breaks mid-sentence |
| `src/orchestrators/advanced.py:367` | All manager events mapped to `type="judging"` regardless of actual purpose |
| `src/orchestrators/advanced.py:380` | Agent messages also truncated at 200 chars |
| `src/utils/models.py:136` | `"judging": "🧠"` icon shown for all these internal events |
| `src/app.py:248` | Events displayed verbatim via `event.to_markdown()` |
### Code Path
```
agent-framework-core (Microsoft)
    ↓
MagenticOrchestratorMessageEvent(kind="task_ledger", message="...")
    ↓
advanced.py:_process_event() - NO FILTERING
    ↓
AgentEvent(type="judging", message=f"Manager ({event.kind}): {text[:200]}...")
    ↓
models.py:to_markdown() → "🧠 **JUDGING**: Manager (task_ledger): ..."
    ↓
app.py → Displayed to user verbatim
```
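The formatting step in this path can be reproduced in isolation. The following is a minimal sketch, not the project's code: `ManagerEvent` and `format_unfiltered` are hypothetical stand-ins for the framework event and the pre-fix f-string, and the limit is lowered from 200 so the mid-word cut is easy to see.

```python
from dataclasses import dataclass


@dataclass
class ManagerEvent:
    """Hypothetical stand-in for MagenticOrchestratorMessageEvent."""

    kind: str
    text: str


def format_unfiltered(event: ManagerEvent, limit: int = 200) -> str:
    """Mimic the pre-fix formatting: hard slice plus an unconditional ellipsis."""
    return f"Manager ({event.kind}): {event.text[:limit]}..."


ledger = ManagerEvent(
    kind="task_ledger",
    text=(
        "We are working to address the following user request: "
        "Research sexual health and wellness interventions for: sildenafil mechanism"
    ),
)

# A small limit reproduces the mid-word cut without needing a 200+ character prompt.
print(format_unfiltered(ledger, limit=75))

# The ellipsis is appended even when nothing was cut off.
print(format_unfiltered(ManagerEvent(kind="instruction", text="Done.")))
```

Because the slice ignores word and sentence boundaries and the `...` suffix is unconditional, even short messages look truncated.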
## Impact
1. **User Confusion**: Users see internal framework bookkeeping, not meaningful progress
2. **Truncated Gibberish**: 200-char limit cuts prompts mid-sentence, making them uninterpretable
3. **Misleading Labels**: "JUDGING" event type is wrong - these are task assignments
4. **No Actionable Info**: Users can't understand what the system is actually doing
## Proposed Solutions
### Option A: Filter Internal Events (Minimal)
Skip internal manager events entirely - they're framework bookkeeping:
```python
def _process_event(self, event: Any, iteration: int) -> AgentEvent | None:
    if isinstance(event, MagenticOrchestratorMessageEvent):
        # Skip internal framework bookkeeping events
        if event.kind in ("user_task", "task_ledger", "instruction"):
            return None  # Don't expose to users
    # ... rest of handling
```
**Pros**: Simple, removes noise
**Cons**: Users lose visibility into manager activity
### Option B: Transform to User-Friendly Messages (Better UX)
Map internal events to meaningful user messages:
```python
MANAGER_EVENT_MESSAGES = {
    "user_task": "Manager received research task",
    "task_ledger": "Manager tracking task progress",
    "instruction": "Manager assigning work to agent",
}

def _process_event(self, event: Any, iteration: int) -> AgentEvent | None:
    if isinstance(event, MagenticOrchestratorMessageEvent):
        if event.kind in MANAGER_EVENT_MESSAGES:
            return AgentEvent(
                type="progress",  # Not "judging"!
                message=MANAGER_EVENT_MESSAGES[event.kind],
                iteration=iteration,
            )
```
**Pros**: Users see meaningful progress, correct event types
**Cons**: More code, loses raw detail for debugging
### Option C: Smart Truncation + Verbose Mode
1. Truncate at sentence boundaries, not hard character limit
2. Add `verbose_mode` setting that shows full internal events for debugging
3. Use appropriate event types based on `event.kind`
```python
def _smart_truncate(self, text: str, max_len: int = 200) -> str:
    """Truncate at sentence boundary."""
    if len(text) <= max_len:
        return text
    # Find last sentence boundary before limit
    truncated = text[:max_len]
    last_period = truncated.rfind(". ")
    if last_period > max_len // 2:
        return truncated[:last_period + 1]
    return truncated.rsplit(" ", 1)[0] + "..."
```
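To show both branches of this truncation logic, here is a standalone sketch of the same function without `self`. The sample strings and the small `max_len` values are illustrative assumptions, chosen so each branch is hit.

```python
def smart_truncate(text: str, max_len: int = 200) -> str:
    """Standalone copy of the truncation logic above, for experimentation."""
    if len(text) <= max_len:
        return text
    truncated = text[:max_len]
    # Prefer a sentence boundary, but only if it keeps more than half the budget.
    last_period = truncated.rfind(". ")
    if last_period > max_len // 2:
        return truncated[: last_period + 1]
    # Otherwise fall back to the last space and mark the cut.
    return truncated.rsplit(" ", 1)[0] + "..."


# Sentence branch: the cut lands inside the second sentence, so only the first is kept.
by_sentence = smart_truncate(
    "Search PubMed first. Then check ClinicalTrials.gov for ongoing studies.",
    max_len=30,
)

# Word branch: no ". " inside the window, so the last word in the window is dropped.
by_word = smart_truncate(
    "Conduct targeted searches on PubMed and Europe PMC to gather evidence",
    max_len=50,
)

print(by_sentence)
print(by_word)
```

Two behaviors are worth noting. The sentence branch returns no ellipsis, so readers get no hint that text was dropped. The word branch always discards the last word in the window, even when the cut happens to fall exactly on a word boundary.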
### Recommended Approach
**Combine Option A + B**:
1. **Default**: Filter out `task_ledger` and `instruction` events (pure bookkeeping)
2. **Transform**: `user_task` → "Assigning research task to agents"
3. **Proper Types**: Use `"progress"` not `"judging"` for manager events
4. **Future**: Add verbose mode for debugging
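
The combined approach could look roughly like the sketch below. It assumes simplified stand-ins: `ManagerMessage` and this `AgentEvent` dataclass are hypothetical substitutes for the framework event and the model in `src/utils/models.py`, and passing unknown kinds through as `"progress"` is an assumption rather than documented behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ManagerMessage:
    """Hypothetical stand-in for MagenticOrchestratorMessageEvent."""

    kind: str
    text: str


@dataclass
class AgentEvent:
    """Hypothetical, simplified stand-in for the AgentEvent model."""

    type: str
    message: str
    iteration: int


# Pure bookkeeping: never shown to users.
HIDDEN_KINDS = frozenset({"task_ledger", "instruction"})

# Kinds that are replaced by a fixed, user-friendly message.
FRIENDLY_MESSAGES = {"user_task": "Assigning research task to agents"}


def process_manager_event(event: ManagerMessage, iteration: int) -> Optional[AgentEvent]:
    """Filter bookkeeping events and transform the rest into progress events."""
    if event.kind in HIDDEN_KINDS:
        return None
    if event.kind in FRIENDLY_MESSAGES:
        return AgentEvent(
            type="progress",
            message=FRIENDLY_MESSAGES[event.kind],
            iteration=iteration,
        )
    # Assumption: any other manager event is still reported as progress, not judging.
    return AgentEvent(type="progress", message=f"Manager: {event.text}", iteration=iteration)


hidden = process_manager_event(ManagerMessage("task_ledger", "internal ledger"), iteration=1)
shown = process_manager_event(ManagerMessage("user_task", "raw prompt text"), iteration=1)
print(hidden, shown)
```

Keeping the hidden kinds and the friendly messages in two separate module-level tables means a later verbose mode could bypass either one without touching the control flow.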
## Files to Modify
1. `src/orchestrators/advanced.py:361-410` - `_process_event()` method
2. `src/utils/models.py:107-123` - Add new event types if needed
3. `tests/unit/orchestrators/test_advanced_timeout.py` - Update assertions
## Related Issues
- P0: Advanced Mode Timeout No Synthesis (FIXED in PR #104)
- This P1 was discovered while testing the P0 fix
## Testing the Bug
```python
import asyncio
from src.orchestrators.advanced import AdvancedOrchestrator

async def test():
    orch = AdvancedOrchestrator(max_rounds=3)
    async for event in orch.run("sildenafil mechanism"):
        if "Manager" in event.message:
            print(f"[{event.type}] {event.message}")
            # You'll see uninterpretable output

asyncio.run(test())
```
## References
- Microsoft Agent Framework: https://github.com/microsoft/agent-framework
- AgentEvent model: `src/utils/models.py:104`
- Advanced orchestrator: `src/orchestrators/advanced.py`