VibecoderMcSwaggins committed on
Commit 67bdc5a · 1 Parent(s): 7e1184a

fix: P1 Advanced Mode chain-of-thought interpretability (#106)

Problem: Advanced orchestrator exposed raw internal framework events
from agent-framework-core to users:
- `Manager (task_ledger): We are working to address...` (truncated)
- `Manager (instruction): Conduct targeted searches...` (truncated)
- All mapped to type="judging" regardless of actual purpose

Solution:
1. Filter internal events: `task_ledger` and `instruction` now hidden
2. Transform: `user_task` → type="progress" with friendly message
3. Smart truncation: Cut at sentence/word boundaries, not mid-word
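
Steps 1 and 2 above can be sketched as a small standalone function. This is a minimal illustration under stated assumptions, not the code in `advanced.py`: the `ManagerEvent` dataclass, the `process_manager_event` name, and the `(type, message)` tuple return are hypothetical stand-ins for the framework event, `_process_event`, and `AgentEvent`. Truncation (step 3) is left out here to keep the sketch short.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Event kinds treated as internal bookkeeping and hidden from users.
HIDDEN_KINDS = ("task_ledger", "instruction")


@dataclass
class ManagerEvent:
    """Hypothetical stand-in for a manager event: a kind plus its text."""

    kind: str
    text: str


def process_manager_event(event: ManagerEvent) -> Optional[Tuple[str, str]]:
    """Return (type, message) for a user-visible event, or None if filtered."""
    # Step 1: filter internal framework bookkeeping.
    if event.kind in HIDDEN_KINDS:
        return None
    if not event.text:
        return None
    # Step 2: replace the raw user_task text with a generic friendly message.
    if event.kind == "user_task":
        return ("progress", "Manager assigning research task to agents...")
    # Any other kind keeps the previous "judging" mapping.
    return ("judging", f"Manager ({event.kind}): {event.text}")
```

With this shape, a caller can drop `None` results and forward everything else to the UI unchanged.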

Tests: tests/unit/orchestrators/test_advanced_events.py (4 tests)

Closes #106

docs/bugs/ACTIVE_BUGS.md CHANGED
@@ -23,16 +23,9 @@ _No active P0 bugs._
  - `Manager (task_ledger): We are working to address...`
  - `Manager (instruction): Conduct targeted searches on PubMed...`
 
- These are framework-internal bookkeeping truncated at 200 chars, making them uninterpretable.
-
  **Root Cause:** `_process_event()` in `advanced.py` doesn't filter or transform `MagenticOrchestratorMessageEvent` events from `agent-framework-core`.
 
- **Solution Options:**
- 1. Filter internal events (`user_task`, `task_ledger`, `instruction`)
- 2. Transform to user-friendly messages ("Manager assigning search task...")
- 3. Add verbose mode for debugging
-
- **Status:** Open
+ **Status:** PR [#107](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/pull/107) open, pending merge.
 
  ---
 
docs/bugs/P1_ADVANCED_MODE_UNINTERPRETABLE_CHAIN_OF_THOUGHT.md CHANGED
@@ -2,8 +2,9 @@
 
  **Priority**: P1 (UX Degradation)
  **Component**: `src/orchestrators/advanced.py`
- **Status**: Open
+ **Status**: Fix Ready (PR #107 open)
  **Issue**: [#106](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/issues/106)
+ **PR**: [#107](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/pull/107)
  **Created**: 2025-12-01
 
  ## Summary
@@ -15,6 +16,16 @@ The Advanced orchestrator exposes raw internal framework events from `agent-fram
  3. Shown with misleading "JUDGING" event type
  4. Not meaningful to end users
 
+ ## Resolution
+
+ Implemented "Smart Filter + Transform" logic in `src/orchestrators/advanced.py`:
+
+ 1. **Filtered**: `task_ledger` and `instruction` events are now hidden.
+ 2. **Transformed**: `user_task` events are mapped to `type="progress"` with a friendly "Manager assigning research task..." message.
+ 3. **Smart Truncation**: Text is now truncated at sentence boundaries or word boundaries, preventing mid-word cuts.
+
+ Verified with new unit tests in `tests/unit/orchestrators/test_advanced_events.py`.
+
  ## Example of Bad Output
 
  ```
src/orchestrators/advanced.py CHANGED
@@ -358,17 +358,44 @@ The final output should be a structured research report."""
              return "synthesizing"
          return "judging"  # Default for unknown agents
 
+     def _smart_truncate(self, text: str, max_len: int = 200) -> str:
+         """Truncate at sentence boundary to avoid cutting words."""
+         if len(text) <= max_len:
+             return text
+         # Find last sentence boundary before limit
+         truncated = text[:max_len]
+         last_period = truncated.rfind(". ")
+         if last_period > max_len // 2:
+             return truncated[: last_period + 1]
+         # Fallback to word boundary
+         return truncated.rsplit(" ", 1)[0] + "..."
+
      def _process_event(self, event: Any, iteration: int) -> AgentEvent | None:
          """Process workflow event into AgentEvent."""
          if isinstance(event, MagenticOrchestratorMessageEvent):
+             # FILTERING: Skip internal framework bookkeeping
+             if event.kind in ("task_ledger", "instruction"):
+                 return None
+
              text = self._extract_text(event.message)
-             if text:
+             if not text:
+                 return None
+
+             # TRANSFORMATION: Make manager events user-friendly
+             if event.kind == "user_task":
                  return AgentEvent(
-                     type="judging",
-                     message=f"Manager ({event.kind}): {text[:200]}...",
+                     type="progress",
+                     message="Manager assigning research task to agents...",
                      iteration=iteration,
                  )
 
+             # Default fallback for other manager events
+             return AgentEvent(
+                 type="judging",
+                 message=f"Manager ({event.kind}): {self._smart_truncate(text)}",
+                 iteration=iteration,
+             )
+
          elif isinstance(event, MagenticAgentMessageEvent):
              agent_name = event.agent_id or "unknown"
              text = self._extract_text(event.message)
@@ -377,7 +404,7 @@ The final output should be a structured research report."""
                  # All returned types are valid AgentEvent.type literals
                  return AgentEvent(
                      type=event_type,  # type: ignore[arg-type]
-                     message=f"{agent_name}: {text[:200]}...",
+                     message=f"{agent_name}: {self._smart_truncate(text)}",
                      iteration=iteration + 1,
                  )
 
 
tests/unit/orchestrators/test_advanced_events.py ADDED
@@ -0,0 +1,97 @@
+ """Test for AdvancedOrchestrator event processing (P1 Bug)."""
+
+ from unittest.mock import MagicMock
+
+ import pytest
+ from agent_framework import MagenticAgentMessageEvent, MagenticOrchestratorMessageEvent
+
+ from src.orchestrators.advanced import AdvancedOrchestrator
+
+
+ class TestAdvancedEventProcessing:
+     """Test event processing logic in AdvancedOrchestrator."""
+
+     @pytest.fixture
+     def orchestrator(self) -> AdvancedOrchestrator:
+         """Create an orchestrator instance with mocks."""
+         # Bypass __init__ logic that requires keys/env vars
+         orch = AdvancedOrchestrator.__new__(AdvancedOrchestrator)
+         # Minimal setup
+         orch._max_rounds = 5
+         orch._timeout_seconds = 300.0
+         return orch
+
+     def test_filters_internal_task_ledger_events(self, orchestrator: AdvancedOrchestrator) -> None:
+         """
+         Bug P1: Internal 'task_ledger' events should be filtered out.
+
+         Current behavior: Returns AgentEvent(type='judging', message='Manager (task_ledger): ...')
+         Desired behavior: Returns None (filtered)
+         """
+         # Create a raw internal framework event
+         raw_event = MagenticOrchestratorMessageEvent(
+             kind="task_ledger",
+             message="We are working to address the following user request: Research sildenafil...",
+         )
+
+         # Process the event
+         result = orchestrator._process_event(raw_event, iteration=1)
+
+         # FAIL if the event is NOT filtered (i.e., if it returns an event)
+         assert result is None, f"Should filter 'task_ledger' events, but got: {result}"
+
+     def test_filters_internal_instruction_events(self, orchestrator: AdvancedOrchestrator) -> None:
+         """
+         Bug P1: Internal 'instruction' events should be filtered out.
+
+         Current behavior: Returns AgentEvent(type='judging', message='Manager (instruction): ...')
+         Desired behavior: Returns None (filtered)
+         """
+         raw_event = MagenticOrchestratorMessageEvent(
+             kind="instruction", message="Conduct targeted searches on PubMed..."
+         )
+
+         result = orchestrator._process_event(raw_event, iteration=1)
+
+         assert result is None, f"Should filter 'instruction' events, but got: {result}"
+
+     def test_transforms_user_task_events(self, orchestrator: AdvancedOrchestrator) -> None:
+         """
+         Bug P1: 'user_task' events should be transformed to user-friendly messages.
+
+         Current behavior: 'Manager (user_task): Research...' (truncated, type='judging')
+         Desired behavior: 'Manager assigning research task...' (type='progress')
+         """
+         raw_event = MagenticOrchestratorMessageEvent(
+             kind="user_task",
+             message="Research sexual health and wellness interventions for: sildenafil mechanism",
+         )
+
+         result = orchestrator._process_event(raw_event, iteration=1)
+
+         assert result is not None
+         assert result.type == "progress"  # NOT "judging"
+         assert "Manager assigning research task" in result.message
+         # Should use the generic friendly message
+         assert "sildenafil mechanism" not in result.message
+
+     def test_prevents_mid_sentence_truncation(self, orchestrator: AdvancedOrchestrator) -> None:
+         """
+         Bug P1: Long messages should be smart-truncated, not hard cut at 200 chars.
+         """
+         # A long message (> 200 chars)
+         long_text = "A" * 250
+
+         # Mock a standard agent message
+         mock_message = MagicMock()
+         mock_message.content = long_text
+         mock_message.text = long_text
+
+         raw_event = MagenticAgentMessageEvent(agent_id="SearchAgent", message=mock_message)
+
+         result = orchestrator._process_event(raw_event, iteration=1)
+
+         assert result is not None
+         # Current buggy behavior: len(message) == 200 + len("SearchAgent: ...")
+         # We want to verify we don't just slice randomly.
+         assert len(result.message) < 300  # Sanity check
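
The truncation test above feeds a 250-character string with no spaces, so it can only reach the fallback where no word boundary exists. As a hedged sketch, the checks below exercise the sentence-boundary and word-boundary paths against a standalone copy of the logic in `_smart_truncate`. The free function `smart_truncate`, the test names, and the sample strings are illustrative assumptions and are not part of this commit.

```python
def smart_truncate(text: str, max_len: int = 200) -> str:
    """Standalone copy of the commit's truncation logic, for illustration only."""
    if len(text) <= max_len:
        return text
    truncated = text[:max_len]
    # Prefer the last sentence boundary, if it falls in the second half.
    last_period = truncated.rfind(". ")
    if last_period > max_len // 2:
        return truncated[: last_period + 1]
    # Otherwise cut at the last space and mark the cut with an ellipsis.
    return truncated.rsplit(" ", 1)[0] + "..."


def test_short_text_is_unchanged() -> None:
    assert smart_truncate("Short message.") == "Short message."


def test_cuts_at_sentence_boundary() -> None:
    first_sentence = "x" * 150 + "."
    text = first_sentence + " " + "y" * 100  # 252 characters
    assert smart_truncate(text) == first_sentence


def test_falls_back_to_word_boundary() -> None:
    text = "word " * 60  # 300 characters, no sentence boundary
    # The cut lands on the last space inside the first 200 characters.
    assert smart_truncate(text) == "word " * 39 + "word..."
```

The fallback appends `...` after the cut, so the returned string can be up to three characters longer than `max_len`. A length assertion in a test would need to allow for that.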