Commit 2eaf2d3
Parent(s): 9006d69

docs: fix misleading claims in bug documentation

The agent's doc claimed 'skip individual streaming events (don't yield)'
but the code actually yields for every token (for live typing UX).
Updated to accurately describe what the code does:
- Yields immediately for each token (UX)
- Doesn't append to response_parts (prevents O(N²))
- Each yield has ONE streaming marker (not accumulated)
Also noted that API key fix relies on undocumented Gradio behavior.

docs/bugs/P1_MAGENTIC_STREAMING_AND_KEY_PERSISTENCE.md CHANGED

```diff
@@ -57,13 +57,16 @@ async for event in orchestrator.run(message):
 For N tokens, this yields N times, each time showing all previous tokens. This is O(N²) string operations and creates massive visual spam.

 ### Fix Applied
-**File:** `src/app.py:
+**File:** `src/app.py:175-204`

-Implemented streaming token buffering:
+Implemented streaming token buffering with live updates:
 1. Added `streaming_buffer = ""` to accumulate tokens
-2.
-3.
-4.
+2. For each streaming event: append to buffer, yield immediately (for live typing UX)
+3. **Key fix**: Don't append streaming events to `response_parts` (prevents O(N²) list growth)
+4. Each yield has only ONE `📡 STREAMING:` line (the accumulated buffer)
+5. Flush buffer to `response_parts` only when non-streaming event occurs
+
+**Result**: Live typing feel preserved, but no visual spam (each update replaces, not accumulates)

 ### Proposed Fix Options
```
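Below is a minimal sketch of the buffering pattern this hunk describes, assuming a hypothetical `orchestrator.run()` event stream (the real code lives in `src/app.py:175-204`; the event shape and handler name here are illustrative, not the actual API):

```python
from typing import AsyncIterator

async def respond(message: str, orchestrator) -> AsyncIterator[str]:
    """Sketch of the buffered-streaming pattern from 'Fix Applied'."""
    response_parts: list[str] = []  # completed (non-streaming) segments only
    streaming_buffer = ""           # tokens of the in-flight stream segment

    async for event in orchestrator.run(message):  # hypothetical event API
        if getattr(event, "type", None) == "streaming_token":
            # Steps 1-2: accumulate the token and yield immediately,
            # preserving the live typing UX.
            streaming_buffer += event.token
            # Steps 3-4: response_parts is untouched, and the yield carries
            # exactly ONE streaming line, so each UI update replaces the
            # previous one instead of stacking N lines.
            yield "\n".join(response_parts + [f"📡 STREAMING: {streaming_buffer}"])
        else:
            # Step 5: a non-streaming event closes the segment; flush the
            # buffer into response_parts exactly once.
            if streaming_buffer:
                response_parts.append(streaming_buffer)
                streaming_buffer = ""
            response_parts.append(str(event))
            yield "\n".join(response_parts)
```

Since `response_parts` grows per segment rather than per token, the O(N²) list growth disappears while every token still triggers a UI update.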
```diff
@@ -137,18 +140,18 @@ Gradio's `ChatInterface` with `additional_inputs` has known issues:
 2. `src/utils/llm_factory.py`

 **Bug 1 (Streaming Spam):**
--
-
-
-
-- This prevents the O(N²) list growth and "new line spam" while keeping the UI responsive.
+- Accumulate tokens in `streaming_buffer`
+- Yield updates immediately for live typing UX
+- **Key**: Don't append to `response_parts` until stream segment complete
+- Each yield has ONE `📡 STREAMING:` line (not N accumulated lines)

 **Bug 2 (API Key Persistence):**
-- **Strategy:**
--
-- Gradio
--
--
+- **Strategy:** Partial example list (relies on Gradio behavior)
+- Examples have only 2 elements `[message, mode]` instead of 4
+- Gradio only updates inputs with corresponding example values
+- Remaining inputs (api_key textbox) are left unchanged
+- `api_key_state` parameter exists as fallback but may be redundant
+- **Note:** This is a workaround relying on undocumented Gradio behavior

 **Bug 3 (OpenAIModel Deprecation):** ✅ FIXED
 - Replaced all `OpenAIModel` imports with `OpenAIChatModel` in `src/app.py` and `src/utils/llm_factory.py`.
```
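For reference, the Bug 3 rename as it would look in code, assuming the project uses the pydantic-ai package (the commit itself does not name the library; there, `OpenAIModel` is a deprecated alias of `OpenAIChatModel`):

```python
# Before (deprecated, emits a deprecation warning in recent releases):
# from pydantic_ai.models.openai import OpenAIModel
# model = OpenAIModel("gpt-4o")

# After (the rename applied in src/app.py and src/utils/llm_factory.py):
from pydantic_ai.models.openai import OpenAIChatModel

model = OpenAIChatModel("gpt-4o")  # model name illustrative
```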