# P1: Advanced Mode Exposes Uninterpretable Chain-of-Thought Events

**Priority**: P1 (UX Degradation)
**Component**: `src/orchestrators/advanced.py`
**Status**: Resolved
**Issue**: [#106](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/issues/106)
**PR**: [#107](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/pull/107)
**Created**: 2025-12-01
**Resolved**: 2025-12-01

## Summary

The Advanced orchestrator exposes raw internal framework events from `agent-framework-core` directly to users. These events carry internal manager bookkeeping (task assignments, ledgers, instructions) and reach the UI in a form that is:

1. Truncated mid-sentence at 200 characters
2. Full of internal framework terminology (`user_task`, `task_ledger`, `instruction`)
3. Labeled with a misleading "JUDGING" event type
4. Not meaningful to end users

## Resolution

Implemented "Smart Filter + Transform" logic in `src/orchestrators/advanced.py`:

1. **Filtered**: `task_ledger` and `instruction` events are now hidden.
2. **Transformed**: `user_task` events are mapped to `type="progress"` with a friendly "Manager assigning research task..." message.
3. **Smart Truncation**: Text is now truncated at sentence boundaries or word boundaries, preventing mid-word cuts.

Verified with new unit tests in `tests/unit/orchestrators/test_advanced_events.py`.
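The three rules above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code merged in PR #107: `ManagerEvent`, the simplified `AgentEvent`, `HIDDEN_KINDS`, and `process_manager_event` are hypothetical stand-ins, and the pass-through branch for other event kinds is a guess at reasonable behavior.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical stand-in for the framework's manager event (assumption).
@dataclass
class ManagerEvent:
    kind: str
    text: str


# Simplified stand-in for the AgentEvent model in src/utils/models.py (assumption).
@dataclass
class AgentEvent:
    type: str
    message: str
    iteration: int


# Kinds treated as pure bookkeeping and hidden from users.
HIDDEN_KINDS = {"task_ledger", "instruction"}


def process_manager_event(event: ManagerEvent, iteration: int) -> Optional[AgentEvent]:
    """Filter bookkeeping events and transform user_task into a progress event."""
    if event.kind in HIDDEN_KINDS:
        return None  # Filtered: never shown to the user
    if event.kind == "user_task":
        # Transformed: friendly message, "progress" instead of "judging"
        return AgentEvent("progress", "Manager assigning research task...", iteration)
    # Assumption: any other kind passes through as a progress event.
    return AgentEvent("progress", f"Manager ({event.kind}): {event.text}", iteration)
```

Returning `None` for hidden kinds leaves it to the caller to skip the event, which matches the `AgentEvent | None` return type used in the proposals below.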

## Example of Bad Output

```
🧠 **JUDGING**: Manager (user_task): Research sexual health and wellness interventions for: sildenafil mechanism  ##...

🧠 **JUDGING**: Manager (task_ledger):  We are working to address the following user request:  Research sexual healt...

🧠 **JUDGING**: Manager (instruction): Conduct targeted searches on PubMed, ClinicalTrials.gov, and Europe PMC to ga...
```

Users see:
- Raw internal prompts being passed between manager and agents
- Truncated text that cuts off mid-word ("healt...", "ga...")
- Technical jargon ("task_ledger") with no context
- All events labeled as "JUDGING" even when they're task assignments

## Root Cause Analysis

### The Chain of Issues

| Location | Issue |
|----------|-------|
| `src/orchestrators/advanced.py:363-370` | `MagenticOrchestratorMessageEvent` raw events exposed without filtering |
| `src/orchestrators/advanced.py:368` | `event.kind` values (`user_task`, `task_ledger`, `instruction`) are internal framework concepts |
| `src/orchestrators/advanced.py:368` | Hard truncation: `text[:200]...` breaks mid-sentence |
| `src/orchestrators/advanced.py:367` | All manager events mapped to `type="judging"` regardless of actual purpose |
| `src/orchestrators/advanced.py:380` | Agent messages also truncated at 200 chars |
| `src/utils/models.py:136` | `"judging": "🧠"` icon shown for all these internal events |
| `src/app.py:248` | Events displayed verbatim via `event.to_markdown()` |

### Code Path

```
agent-framework-core (Microsoft)
        ↓
MagenticOrchestratorMessageEvent(kind="task_ledger", message="...")
        ↓
advanced.py:_process_event() - NO FILTERING
        ↓
AgentEvent(type="judging", message=f"Manager ({event.kind}): {text[:200]}...")
        ↓
models.py:to_markdown() → "🧠 **JUDGING**: Manager (task_ledger): ..."
        ↓
app.py β†’ Displayed to user verbatim
```
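The path above can be reproduced in isolation with a small sketch. The two helpers are hypothetical stand-ins: `format_manager_message` copies the f-string shown in the diagram, and `to_markdown` only imitates the output format seen in the bad-output example, not the real method in `src/utils/models.py`.

```python
def format_manager_message(kind: str, text: str) -> str:
    """Mimic the unfiltered formatting: hard cut at 200 characters."""
    # Taken literally, this expression appends "..." even to short text.
    return f"Manager ({kind}): {text[:200]}..."


def to_markdown(event_type: str, message: str) -> str:
    """Stand-in renderer (assumption): icon, upper-cased type, then message."""
    icon = {"judging": "🧠"}.get(event_type, "")
    return f"{icon} **{event_type.upper()}**: {message}"


if __name__ == "__main__":
    ledger = "We are working to address the following user request: " + "x" * 300
    print(to_markdown("judging", format_manager_message("task_ledger", ledger)))
```

Running the sketch prints a "JUDGING" line whose text stops after 200 characters regardless of word or sentence boundaries, which is the behavior users reported.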

## Impact

1. **User Confusion**: Users see internal framework bookkeeping, not meaningful progress
2. **Truncated Gibberish**: 200-char limit cuts prompts mid-sentence, making them uninterpretable
3. **Misleading Labels**: "JUDGING" event type is wrong - these are task assignments
4. **No Actionable Info**: Users can't understand what the system is actually doing

## Proposed Solutions

### Option A: Filter Internal Events (Minimal)

Skip internal manager events entirely - they're framework bookkeeping:

```python
def _process_event(self, event: Any, iteration: int) -> AgentEvent | None:
    if isinstance(event, MagenticOrchestratorMessageEvent):
        # Skip internal framework bookkeeping events
        if event.kind in ("user_task", "task_ledger", "instruction"):
            return None  # Don't expose to users
        # ... rest of handling
```

**Pros**: Simple, removes noise
**Cons**: Users lose visibility into manager activity

### Option B: Transform to User-Friendly Messages (Better UX)

Map internal events to meaningful user messages:

```python
MANAGER_EVENT_MESSAGES = {
    "user_task": "Manager received research task",
    "task_ledger": "Manager tracking task progress",
    "instruction": "Manager assigning work to agent",
}

def _process_event(self, event: Any, iteration: int) -> AgentEvent | None:
    if isinstance(event, MagenticOrchestratorMessageEvent):
        if event.kind in MANAGER_EVENT_MESSAGES:
            return AgentEvent(
                type="progress",  # Not "judging"!
                message=MANAGER_EVENT_MESSAGES[event.kind],
                iteration=iteration,
            )
```

**Pros**: Users see meaningful progress, correct event types
**Cons**: More code, loses raw detail for debugging

### Option C: Smart Truncation + Verbose Mode

1. Truncate at sentence boundaries, not hard character limit
2. Add `verbose_mode` setting that shows full internal events for debugging
3. Use appropriate event types based on `event.kind`

```python
def _smart_truncate(self, text: str, max_len: int = 200) -> str:
    """Truncate at a sentence boundary, falling back to the last word boundary."""
    if len(text) <= max_len:
        return text
    # Find last sentence boundary before limit
    truncated = text[:max_len]
    last_period = truncated.rfind(". ")
    if last_period > max_len // 2:
        return truncated[:last_period + 1]
    return truncated.rsplit(" ", 1)[0] + "..."
```
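To see how the truncation behaves, and how a verbose switch might sit on top of it, here is a standalone sketch. `smart_truncate` is a module-level copy of the method above; `render_manager_text` and its `verbose_mode` parameter are hypothetical names, since the document only proposes the setting without defining it.

```python
def smart_truncate(text: str, max_len: int = 200) -> str:
    """Module-level copy of the method above, for experimentation."""
    if len(text) <= max_len:
        return text
    truncated = text[:max_len]
    # Prefer the last sentence boundary if it falls in the second half
    last_period = truncated.rfind(". ")
    if last_period > max_len // 2:
        return truncated[: last_period + 1]
    # Otherwise drop the trailing partial word and mark the cut
    return truncated.rsplit(" ", 1)[0] + "..."


def render_manager_text(text: str, verbose_mode: bool = False) -> str:
    """Hypothetical helper: full text when debugging, truncated otherwise."""
    return text if verbose_mode else smart_truncate(text)


if __name__ == "__main__":
    two_sentences = "Search PubMed for trials. Then summarize the findings in detail."
    print(smart_truncate(two_sentences, max_len=40))

    one_sentence = "Conduct targeted searches on PubMed and ClinicalTrials.gov"
    print(smart_truncate(one_sentence, max_len=30))
```

The first call ends cleanly at the period after "trials"; the second has no sentence boundary within the limit, so it falls back to the last full word and appends an ellipsis.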

### Recommended Approach

**Combine Option A + B**:

1. **Default**: Filter out `task_ledger` and `instruction` events (pure bookkeeping)
2. **Transform**: `user_task` → "Assigning research task to agents"
3. **Proper Types**: Use `"progress"` not `"judging"` for manager events
4. **Future**: Add verbose mode for debugging

## Files to Modify

1. `src/orchestrators/advanced.py:361-410` - `_process_event()` method
2. `src/utils/models.py:107-123` - Add new event types if needed
3. `tests/unit/orchestrators/test_advanced_timeout.py` - Update assertions

## Related Issues

- P0: Advanced Mode Timeout No Synthesis (FIXED in PR #104)
- This P1 was discovered while testing the P0 fix

## Testing the Bug

```python
import asyncio
from src.orchestrators.advanced import AdvancedOrchestrator

async def test():
    orch = AdvancedOrchestrator(max_rounds=3)
    async for event in orch.run("sildenafil mechanism"):
        if "Manager" in event.message:
            print(f"[{event.type}] {event.message}")
            # You'll see uninterpretable output

asyncio.run(test())
```

## References

- Microsoft Agent Framework: https://github.com/microsoft/agent-framework
- AgentEvent model: `src/utils/models.py:104`
- Advanced orchestrator: `src/orchestrators/advanced.py`