# P2: Advanced Mode Cold Start Has No User Feedback

**Priority**: P2 (UX Friction)
**Component**: `src/orchestrators/advanced.py`
**Status**: ✅ FIXED (All Phases Complete)
**Issue**: [#108](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/issues/108)
**Created**: 2025-12-01

## Summary

When Advanced Mode starts, users experience three significant "dead zones" with no visual feedback:

1. **Initialization delay** (5-15 seconds): Between "STARTED" and "THINKING" events
2. **First LLM call delay** (10-30+ seconds): Between "THINKING" and first "PROGRESS" event
3. **Agent execution delay** (30-90+ seconds): After "PROGRESS" while SearchAgent executes

Users see the UI freeze with no indication of what's happening, leading to confusion about whether the system is working.

## Visual Timeline

```
🚀 STARTED: Starting research (Advanced mode)...
   β”‚
   β”‚  ← DEAD ZONE #1: 5-15 seconds of nothing
   β”‚     - Loading LlamaIndex/ChromaDB
   β”‚     - Initializing embedding service
   β”‚     - Building 4 agents + manager
   β”‚
⏳ THINKING: Multi-agent reasoning in progress...
   β”‚
   β”‚  ← DEAD ZONE #2: 10-30+ seconds of nothing
   β”‚     - Manager agent's first OpenAI API call
   β”‚     - Cold connection to OpenAI
   β”‚
⏱️ PROGRESS: Manager assigning research task...
   β”‚
   β”‚  ← DEAD ZONE #3: 30-90+ seconds of nothing
   β”‚     - SearchAgent executing PubMed/ClinicalTrials/EuropePMC queries
   β”‚     - Embedding and storing results in ChromaDB
   β”‚     - No streaming events during search execution
   β”‚
📊 SEARCH_COMPLETE / PROGRESS: Round 1/5...
```

## Root Cause Analysis

### Dead Zone #1: Initialization (Lines 162-165)

```python
yield AgentEvent(type="started", ...)  # User sees this

# === BLOCKING OPERATIONS (no events yielded) ===
embedding_service = self._init_embedding_service()  # ChromaDB, embeddings
init_magentic_state(query, embedding_service)       # Shared state
workflow = self._build_workflow()                   # 4 agents + manager

yield AgentEvent(type="thinking", ...)  # User finally sees this
```

**What's happening:**
1. `_init_embedding_service()` → Loads LlamaIndex, connects to ChromaDB, initializes OpenAI embeddings
2. `init_magentic_state()` → Creates ResearchMemory, sets up context
3. `_build_workflow()` → Instantiates SearchAgent, JudgeAgent, HypothesisAgent, ReportAgent, Manager

### Dead Zone #2: First LLM Call (Line 206)

```python
yield AgentEvent(type="thinking", ...)  # User sees this

async for event in workflow.run_stream(task):  # BLOCKING until first event
    # Manager makes first OpenAI call here
    # No events until manager responds and starts delegating
```

**What's happening:**
- Microsoft Agent Framework's manager agent receives the task
- Makes an effectively synchronous call to OpenAI for orchestration planning
- Only after response does it emit `MagenticOrchestratorMessageEvent`
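
One framework-agnostic mitigation is to wrap `run_stream()` in a helper that emits keep-alive events whenever the underlying stream goes quiet. The sketch below is self-contained and hypothetical: `with_heartbeat` and the simulated slow source stand in for the real workflow stream.

```python
import asyncio
from typing import Any, AsyncIterator

async def with_heartbeat(
    source: AsyncIterator[Any],
    interval: float = 2.0,
    heartbeat: Any = "still working...",
) -> AsyncIterator[Any]:
    """Yield items from `source`, inserting `heartbeat` whenever the
    source stays silent longer than `interval` seconds."""
    it = source.__aiter__()
    pending = asyncio.ensure_future(it.__anext__())
    while True:
        try:
            # shield() keeps the timeout from cancelling the real fetch
            item = await asyncio.wait_for(asyncio.shield(pending), interval)
        except asyncio.TimeoutError:
            yield heartbeat
            continue
        except StopAsyncIteration:
            return
        yield item
        pending = asyncio.ensure_future(it.__anext__())

async def _demo() -> list[Any]:
    async def slow_source():      # stands in for workflow.run_stream()
        await asyncio.sleep(0.5)  # stands in for the cold first LLM call
        yield "first event"

    collected = []
    async for ev in with_heartbeat(slow_source(), interval=0.1):
        collected.append(ev)
    return collected

events = asyncio.run(_demo())
print(events)
```

In the orchestrator, the heartbeat value would be a `progress`-type `AgentEvent` rather than a string.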

### Dead Zone #3: Agent Execution (After PROGRESS event)

After "Manager assigning research task...", the SearchAgent executes but emits no events until complete:

**What's happening:**
- SearchAgent receives task from manager
- Executes parallel queries to PubMed, ClinicalTrials.gov, Europe PMC
- Each result is embedded and stored in ChromaDB
- Only after ALL searches complete does it emit `MagenticAgentMessageEvent`

**Why no streaming:**
- The agent's internal tool calls (search APIs, embeddings) don't emit framework events
- Microsoft Agent Framework only emits events at agent message boundaries
- 3 databases × multiple queries × embedding each result = long silent period

**Potential fix:** Add progress callbacks to `SearchAgent` tools:
```python
# In search_agent.py - hypothetical
from collections.abc import Callable

async def search_pubmed(
    query: str, on_progress: Callable[[str], None] | None = None
):
    results = await pubmed_client.search(query)
    if on_progress:
        on_progress(f"Found {len(results)} PubMed results")
    # ... embed and store
```

## Impact

1. **User Confusion**: "Is it frozen? Should I refresh?"
2. **Perceived Slowness**: Dead time feels longer than active progress
3. **No Cancel Option**: Users can't abort during these zones
4. **Support Burden**: Users report "it's not working" when it's actually initializing

## Proposed Solutions

### Option A: Granular Initialization Events (Quick Win)

Add progress events during initialization:

```python
yield AgentEvent(type="started", ...)

yield AgentEvent(
    type="progress",
    message="Loading embedding service...",
    iteration=0,
)
embedding_service = self._init_embedding_service()

yield AgentEvent(
    type="progress",
    message="Initializing research memory...",
    iteration=0,
)
init_magentic_state(query, embedding_service)

yield AgentEvent(
    type="progress",
    message="Building agent team (Search, Judge, Hypothesis, Report)...",
    iteration=0,
)
workflow = self._build_workflow()

yield AgentEvent(type="thinking", ...)
```

**Pros**: Simple, immediate feedback
**Cons**: Still sequential, doesn't speed up actual time

### Option B: Parallel Initialization (Performance + UX)

Use `asyncio.gather()` for independent operations:

```python
yield AgentEvent(type="progress", message="Initializing agents...", iteration=0)

# These could potentially run in parallel
embedding_task = asyncio.create_task(self._init_embedding_service_async())
workflow_task = asyncio.create_task(self._build_workflow_async())

embedding_service, workflow = await asyncio.gather(embedding_task, workflow_task)
init_magentic_state(query, embedding_service)
```

**Pros**: Faster initialization, better UX
**Cons**: Need to verify thread safety, more complex
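
If no true async variants of the initializers exist, a lighter-weight version of Option B offloads the existing synchronous calls with `asyncio.to_thread`. A minimal sketch, with stub functions standing in for `_init_embedding_service()` and `_build_workflow()`:

```python
import asyncio
import time

def init_embedding_service():  # stand-in for the real sync initializer
    time.sleep(0.2)            # simulates ChromaDB / embedding setup
    return "embedding-service"

def build_workflow():          # stand-in for building 4 agents + manager
    time.sleep(0.2)
    return "workflow"

async def parallel_init():
    start = time.monotonic()
    # to_thread() runs each blocking call in a worker thread, so the
    # two ~0.2 s initializations overlap instead of stacking
    embedding_service, workflow = await asyncio.gather(
        asyncio.to_thread(init_embedding_service),
        asyncio.to_thread(build_workflow),
    )
    return embedding_service, workflow, time.monotonic() - start

service, workflow, elapsed = asyncio.run(parallel_init())
print(f"initialized in {elapsed:.2f}s")
```

The thread-safety caveat still applies: this is only valid if the two initializers touch no shared mutable state.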

### Option C: Pre-warming / Singleton Services

Initialize expensive services once at app startup, not per-request:

```python
# In app.py startup
global_embedding_service = init_embedding_service()
global_workflow_template = build_workflow_template()

# In orchestrator
workflow = global_workflow_template.clone()  # Fast
```

**Pros**: Near-instant start after first request
**Cons**: Memory overhead, cold start on first request still slow

### Option D: Animated Progress Indicator (UI-Only)

Add a Gradio progress bar or spinner that animates during the dead zones:

```python
# In app.py
with gr.Blocks() as demo:
    # gr.Progress() is injected via the handler's default argument,
    # not created as a component inside the Blocks context
    async def research(query, progress=gr.Progress()):
        progress(0.1, desc="Initializing...")
        # ...
        progress(0.2, desc="Building agents...")
```

**Pros**: User sees activity even if nothing to report
**Cons**: Doesn't solve the actual blocking, Gradio-specific

## Recommended Approach

**Phase 1 (Quick Win)**: Option A - Add granular events ✅ COMPLETE
**Phase 2 (Performance)**: Option C - Pre-warm services at startup ✅ COMPLETE
**Phase 3 (Polish)**: Option D - Gradio progress bar ✅ COMPLETE

## Related Considerations

### Parallel Agent Orchestration

The current Microsoft Agent Framework runs agents sequentially through the manager. True parallel execution would require:

1. Breaking out of the framework's `run_stream()` pattern
2. Implementing our own parallel task dispatch
3. Managing agent coordination manually

This is a larger architectural change (P1 scope) and should be tracked separately if desired.
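
For scale, step 2 above might look like the following: concurrent fan-out over the three search backends with `asyncio.gather`, using stub coroutines in place of the real agents (all names here are illustrative):

```python
import asyncio

# Stubs for the three backends SearchAgent queries; the real calls go
# through PubMed, ClinicalTrials.gov, and Europe PMC clients.
async def query_pubmed(q: str) -> list[str]:
    await asyncio.sleep(0.1)
    return [f"pubmed:{q}"]

async def query_trials(q: str) -> list[str]:
    await asyncio.sleep(0.1)
    return [f"trials:{q}"]

async def query_europepmc(q: str) -> list[str]:
    await asyncio.sleep(0.1)
    return [f"epmc:{q}"]

async def dispatch(query: str) -> list[str]:
    # Our own dispatch: run all three sources concurrently and merge,
    # instead of letting the manager serialize agent turns
    batches = await asyncio.gather(
        query_pubmed(query), query_trials(query), query_europepmc(query)
    )
    return [hit for batch in batches for hit in batch]

hits = asyncio.run(dispatch("tamoxifen"))
print(hits)
```

The hard part the sketch omits is exactly what makes this P1 scope: coordinating agent state, memory writes, and the judge/hypothesis loop across concurrent tasks.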

## Files to Modify

1. `src/orchestrators/advanced.py:155-210` - Add initialization events in `run()` method
2. `src/utils/service_loader.py` - Pre-warming logic
3. `src/app.py` - Gradio progress integration

## Testing the Issue

```python
import asyncio
import time
from src.orchestrators.advanced import AdvancedOrchestrator

async def test():
    orch = AdvancedOrchestrator(max_rounds=3)
    start = time.time()
    async for event in orch.run("test query"):
        elapsed = time.time() - start
        print(f"[{elapsed:.1f}s] {event.type}: {event.message[:50]}...")
        if event.type == "complete":
            break

asyncio.run(test())
```

Expected output showing the gaps:
```
[0.0s] started: Starting research (Advanced mode)...
[8.2s] thinking: Multi-agent reasoning in progress...  ← 8 second gap!
[22.5s] progress: Manager assigning research task...   ← 14 second gap!
```

## References

- Advanced orchestrator: `src/orchestrators/advanced.py`
- Embedding service loader: `src/utils/service_loader.py`
- LlamaIndex RAG: `src/services/llamaindex_rag.py`
- Microsoft Agent Framework: `agent-framework-core`