File size: 4,379 Bytes
21dd8fe
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
# P2 Bug: First Agent Turn Exceeds Workflow Timeout

**Date**: 2025-12-03
**Status**: FIXED (PR fix/p2-double-bug-squash)
**Severity**: P2 (UX - Workflow always times out on complex queries)
**Component**: `src/orchestrators/advanced.py` + `src/agents/search_agent.py`
**Affects**: Both Free Tier (HuggingFace) AND Paid Tier (OpenAI)

---

## Executive Summary

The search agent's first turn can exceed the 5-minute workflow timeout, causing:
1. `iterations=0` at timeout (no agent completed a turn)
2. `_handle_timeout()` synthesizes from partial evidence
3. Users get incomplete research results

This is a **performance/architecture bug**, not a model issue.

---

## Symptom

```
[warning] Workflow timed out             iterations=0
```

The workflow times out with `iterations=0` - meaning the first agent (search agent) never completed its turn before the 5-minute timeout.

---

## Root Cause

The search agent's first turn is **extremely expensive**:

```
Search Agent First Turn:
β”œβ”€β”€ Manager assigns task
β”œβ”€β”€ Search agent starts
β”‚   β”œβ”€β”€ Calls PubMed search tool (10 results)
β”‚   β”œβ”€β”€ Calls ClinicalTrials search tool (10 results)
β”‚   β”œβ”€β”€ Calls EuropePMC search tool (10 results)
β”‚   └── For EACH result (30 total):
β”‚       β”œβ”€β”€ Generate embedding (OpenAI API call)
β”‚       β”œβ”€β”€ Check for duplicates (ChromaDB query)
β”‚       └── Store in ChromaDB
β”‚
β”‚   TOTAL: 30 results Γ— (embedding + dedup + store) = 90+ API/DB operations
β”‚
└── Agent turn completes (if timeout hasn't fired)
```

**The timeout is on the WORKFLOW, not individual agent turns.** A single greedy agent can consume the entire timeout budget.

---

## Impact

| Aspect | Impact |
|--------|--------|
| UX | Queries always timeout on first turn |
| Research quality | Synthesis happens on partial evidence |
| Confusion | `iterations=0` looks like nothing happened |

---

## The Fix (Consensus)

**Reduce work per turn + increase timeout budget.**

### Implementation

**1. Reduce results per tool (immediate)**

`src/agents/search_agent.py` line 70:
```python
# Change from 10 to 5
result: SearchResult = await self._handler.execute(query, max_results_per_tool=5)
```

**2. Increase workflow timeout (immediate)**

`src/utils/config.py`:
```python
advanced_timeout: float = Field(
    default=600.0,  # Was 300.0 (5 min), now 10 min
    ge=60.0,
    le=900.0,
    description="Timeout for Advanced mode in seconds",
)
```

### Why NOT Per-Turn Timeout

**DANGER**: The SearchHandler uses `asyncio.gather()`:

```python
# src/tools/search_handler.py line 163-164
results = await asyncio.gather(*tasks, return_exceptions=True)
```

This is an **all-or-nothing** operation. If you wrap it with `asyncio.timeout()` and the timeout fires, you get **zero results**, not partial results.

```python
# DON'T DO THIS - yields nothing on timeout
async with asyncio.timeout(60):
    result = await self._handler.execute(query)  # Cancelled = zero results
```

Per-turn timeout requires `SearchHandler` to support cancellation with partial results. That's a separate architectural change (see Future Work).

---

## Future Work (Streaming Evidence Ingestion)

For proper fix, `SearchHandler.execute()` should:
1. Yield results as they arrive (async generator)
2. Support cancellation with partial results
3. Allow agent to return "what we have so far" on timeout

```python
# Future architecture
async def execute_streaming(self, query: str) -> AsyncIterator[Evidence]:
    for tool in self.tools:
        async for evidence in tool.search_streaming(query):
            yield evidence  # Can be cancelled at any point
```

This is out of scope for the immediate fix.

---

## Test Plan

1. Run query with 10-minute timeout
2. Verify first agent turn completes before timeout
3. Verify `iterations >= 1` at workflow end

---

## Verification Data

From diagnostic run:
```
=== RAW FRAMEWORK EVENTS ===
  MagenticAgentDeltaEvent: 284
  MagenticOrchestratorMessageEvent: 3
  ...
  NO MagenticAgentMessageEvent  ← Agent never completed a turn!

[warning] Workflow timed out             iterations=0
```

---

## Related

- P2 Duplicate Report Bug (separate issue, happens after successful completion)
- `_handle_timeout()` correctly synthesizes, but with partial evidence
- Not related to model quality - this is infrastructure/performance