# Streaming System - Real-Time AI Updates
## Overview
The PyCatan AI system now supports **real-time streaming** of AI agent thoughts, actions, and tool calls! This provides immediate visibility into what the AI is thinking and doing as it plays.
## Architecture
```
┌─────────────┐         ┌──────────────┐         ┌─────────────┐
│             │ Stream  │              │   SSE   │             │
│ LLM Client  ├────────►│  AI Manager  ├────────►│ Web Viewer  │
│             │ Chunks  │              │ Events  │             │
└─────────────┘         └──────┬───────┘         └─────────────┘
                               │
                               │ HTTP POST
                               ▼
                        ┌──────────────┐
                        │Stream        │
                        │Broadcaster   │
                        └──────────────┘
```
## Components
### 1. LLM Client (`llm_client.py`)
**New:** `generate_stream()` method
- Uses `client.models.generate_content_stream()` for streaming
- Yields `StreamChunk` objects in real-time
- Supports `include_thoughts=True` in ThinkingConfig
- Handles three content chunk types (a fourth type, `done`, appears in the dataclass below):
  - `thought` - AI reasoning/thinking
  - `text` - Regular response text
  - `function_call` - Tool/function calls
**StreamChunk dataclass:**
```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class StreamChunk:
    chunk_type: str  # 'thought', 'text', 'function_call', 'done'
    content: Optional[str] = None
    function_call: Optional[Dict[str, Any]] = None
    is_complete: bool = False
```
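To illustrate how `generate_stream()` might map streamed response parts onto these chunk types, here is a minimal sketch. It assumes each part exposes `thought`, `text`, and `function_call` attributes, as response parts in the google-genai SDK do. The helper name `classify_parts` and the stand-in parts built with `SimpleNamespace` are hypothetical and not taken from `llm_client.py`; no API call is made.

```python
from dataclasses import dataclass
from types import SimpleNamespace
from typing import Any, Dict, Iterable, Iterator, Optional


@dataclass
class StreamChunk:  # same shape as the dataclass above
    chunk_type: str
    content: Optional[str] = None
    function_call: Optional[Dict[str, Any]] = None
    is_complete: bool = False


def classify_parts(parts: Iterable[Any]) -> Iterator[StreamChunk]:
    """Yield one StreamChunk per part, then a final 'done' chunk (hypothetical helper)."""
    for part in parts:
        call = getattr(part, "function_call", None)
        text = getattr(part, "text", None)
        if call is not None:
            yield StreamChunk(
                "function_call",
                function_call={"name": call.name, "args": dict(call.args or {})},
            )
        elif text and getattr(part, "thought", False):
            yield StreamChunk("thought", content=text)
        elif text:
            yield StreamChunk("text", content=text)
    yield StreamChunk("done", is_complete=True)


# Stand-in parts for demonstration only.
demo_parts = [
    SimpleNamespace(text="Ore is scarce.", thought=True, function_call=None),
    SimpleNamespace(
        text=None,
        thought=False,
        function_call=SimpleNamespace(name="build_road", args={"edge": 12}),
    ),
    SimpleNamespace(text="Building a road.", thought=False, function_call=None),
]
chunks = list(classify_parts(demo_parts))
```

Marking the last chunk with `is_complete=True` is an assumption about how the `done` type is used.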
### 2. AI Manager (`ai_manager.py`)
**New:** `_send_to_llm_stream()` method
- Similar to `_send_to_llm()` but uses streaming
- Broadcasts chunks via `_broadcast_stream_chunk()`
- Supports tool calling loop with streaming
- Each iteration can stream thoughts and tool calls
**Configuration:**
- `config.llm.enable_streaming` - Enable/disable streaming (default: True)
- Falls back to regular mode if disabled
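The tool calling loop with streaming can be pictured roughly as follows. This is a sketch under stated assumptions, not the actual `_send_to_llm_stream()` code: the function `run_streaming_loop`, the dict-based chunk shape, and the `fake_stream` stand-in for the LLM are all hypothetical.

```python
from typing import Any, Callable, Dict, Iterable, List

Chunk = Dict[str, Any]  # simplified stand-in for StreamChunk


def run_streaming_loop(
    stream_fn: Callable[[List[Dict[str, Any]]], Iterable[Chunk]],
    tools: Dict[str, Callable[..., Any]],
    broadcast: Callable[[Chunk], None],
    max_iterations: int = 3,
) -> str:
    """Stream a response, run any requested tools, and repeat until none remain."""
    tool_results: List[Dict[str, Any]] = []
    text_parts: List[str] = []
    for _ in range(max_iterations):
        calls = []
        for chunk in stream_fn(tool_results):
            broadcast(chunk)  # forward every chunk as soon as it arrives
            if chunk["type"] == "function_call":
                calls.append(chunk["function_call"])
            elif chunk["type"] == "text":
                text_parts.append(chunk["content"])
        if not calls:
            break  # no tools requested, so the answer is final
        tool_results = [
            {"name": c["name"], "result": tools[c["name"]](**c["args"])}
            for c in calls
        ]
    return "".join(text_parts)


def fake_stream(tool_results: List[Dict[str, Any]]) -> Iterable[Chunk]:
    """Stand-in for the LLM: first asks for a tool, then answers with its result."""
    if not tool_results:
        yield {"type": "thought", "content": "Check my resources first."}
        yield {
            "type": "function_call",
            "function_call": {"name": "count_resources", "args": {"kind": "wood"}},
        }
    else:
        yield {"type": "text", "content": f"I have {tool_results[0]['result']} wood."}


seen: List[Chunk] = []
answer = run_streaming_loop(fake_stream, {"count_resources": lambda kind: 3}, seen.append)
```

Passing `broadcast` as a callable keeps the loop independent of how chunks reach the viewer, which matches the decoupling described for the Stream Broadcaster.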
### 3. Stream Broadcaster (`stream_broadcaster.py`)
**New component** that pushes events to web viewer:
- Sends HTTP POST to `http://localhost:5001/api/stream/broadcast`
- Non-blocking with short timeout (0.5s)
- Automatically disables if web viewer not available
- Converts StreamChunk → JSON event
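A minimal sketch of this behaviour using only the standard library is shown below. The class layout, the injectable `sender` parameter, and the choice to disable after the first failed POST are assumptions for illustration; `stream_broadcaster.py` may differ. Note that the POST itself blocks for at most the timeout, so "non-blocking" here means "bounded by a short timeout".

```python
import json
import urllib.request
from typing import Any, Callable, Dict, List, Optional

BROADCAST_URL = "http://localhost:5001/api/stream/broadcast"


def post_json(url: str, payload: Dict[str, Any], timeout: float = 0.5) -> None:
    """POST a JSON payload; raises an OSError subclass on refusal or timeout."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout):
        pass  # the response body is not needed


class StreamBroadcaster:
    """Hypothetical broadcaster that turns itself off when the viewer is unreachable."""

    def __init__(
        self,
        url: str = BROADCAST_URL,
        sender: Callable[[str, Dict[str, Any]], None] = post_json,
    ) -> None:
        self.url = url
        self.sender = sender
        self.enabled = True

    def broadcast(
        self,
        player_name: str,
        chunk_type: str,
        content: Optional[str] = None,
        function_call: Optional[Dict[str, Any]] = None,
    ) -> bool:
        if not self.enabled:
            return False
        payload: Dict[str, Any] = {
            "player_name": player_name,
            "chunk_type": chunk_type,
            "content": content,
        }
        if function_call is not None:
            payload["function_call"] = function_call
        try:
            self.sender(self.url, payload)
        except OSError:
            self.enabled = False  # viewer offline: stop trying so the AI is not slowed down
            return False
        return True


# Demonstration with fake senders, so no network access is needed.
sent: List[Dict[str, Any]] = []
online = StreamBroadcaster(sender=lambda url, payload: sent.append(payload))
delivered = online.broadcast("Agent1", "thought", "Thinking...")


def refuse(url: str, payload: Dict[str, Any]) -> None:
    raise ConnectionRefusedError("viewer offline")


offline = StreamBroadcaster(sender=refuse)
first_try = offline.broadcast("Agent1", "thought", "Thinking...")
```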
### 4. Web Viewer (`web_viewer.py`)
**New endpoints:**
**`GET /api/stream/<player_name>`** - SSE endpoint
- Returns Server-Sent Events stream
- Clients connect and receive real-time updates
- Sends keepalive pings every 30s
- Clients auto-reconnect on error
**`POST /api/stream/broadcast`** - Broadcast endpoint
- Receives events from AI Manager
- Pushes to player-specific queue
- Queue is non-blocking (max 1000 events)
**Event format:**
```json
{
  "type": "thought|text|function_call|done",
  "timestamp": "ISO-8601",
  "content": "...",
  "function_call": {...}
}
```
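The queueing and SSE framing behind these two endpoints can be sketched without Flask. The names `push_event`, `format_sse`, and `sse_stream` are hypothetical, and dropping the oldest event when the queue is full is an assumption about the overflow policy. In a Flask app, a generator like `sse_stream` would typically be wrapped in a `Response` with the `text/event-stream` MIME type.

```python
import json
import queue
from collections import defaultdict
from typing import Any, DefaultDict, Dict, Iterator

MAX_EVENTS = 1000  # per-player queue bound, as described above

player_queues: DefaultDict[str, queue.Queue] = defaultdict(
    lambda: queue.Queue(maxsize=MAX_EVENTS)
)


def push_event(player: str, event: Dict[str, Any]) -> None:
    """Enqueue without blocking; when full, discard the oldest event (assumed policy)."""
    q = player_queues[player]
    try:
        q.put_nowait(event)
    except queue.Full:
        try:
            q.get_nowait()
        except queue.Empty:
            pass
        q.put_nowait(event)


def format_sse(event: Dict[str, Any]) -> str:
    """Frame one event as an SSE message: a 'data:' line followed by a blank line."""
    return f"data: {json.dumps(event)}\n\n"


def sse_stream(player: str, keepalive_s: float = 30.0) -> Iterator[str]:
    """Yield SSE messages for one player, with a comment line as keepalive when idle."""
    q = player_queues[player]
    while True:
        try:
            event = q.get(timeout=keepalive_s)
        except queue.Empty:
            yield ": keepalive\n\n"  # lines starting with ':' are ignored by EventSource
            continue
        yield format_sse(event)
```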
### 5. Dynamic Viewer UI (`viewer_dynamic.html`)
**New features:**
**Streaming Container** - Shows live updates:
- Appears at top of page when streaming active
- Shows player name with blinking indicator
- Auto-scrolls as new chunks arrive
- Fades out after completion
**Visual feedback:**
- Purple border for thoughts
- Green border for text
- Orange border for function calls
- Done status with green indicator
**JavaScript functions:**
- `initStreaming()` - Connect to SSE for all players
- `connectPlayerStream(player)` - Create EventSource
- `handleStreamChunk(player, chunk)` - Process incoming chunk
- `addStreamChunk(container, type, content)` - Display chunk
## Configuration
### Enable Streaming
In `config_dev.yaml`:
```yaml
llm:
  enable_streaming: true   # Enable real-time streaming
  enable_thinking: true    # Required for thought summaries
  thinking_budget: 8000    # Budget for thinking tokens
```
### Disable Streaming
Set `enable_streaming: false` to use traditional request-response mode.
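As a rough illustration of how the flag might select the mode, the sketch below reads the `llm` section from an already-parsed config dict. `LLMSettings`, `load_llm_settings`, and `choose_mode` are hypothetical names. Only the `enable_streaming` default of `True` comes from this document; the other defaults are assumptions.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict


@dataclass
class LLMSettings:
    enable_streaming: bool = True  # documented default
    enable_thinking: bool = True   # assumed default
    thinking_budget: int = 8000    # assumed default


def load_llm_settings(config: Dict[str, Any]) -> LLMSettings:
    """Build settings from the 'llm' section, ignoring keys this sketch does not know."""
    llm = config.get("llm") or {}
    known = {f.name for f in fields(LLMSettings)}
    return LLMSettings(**{k: v for k, v in llm.items() if k in known})


def choose_mode(settings: LLMSettings) -> str:
    """Return 'stream' when streaming is enabled, else 'request_response'."""
    return "stream" if settings.enable_streaming else "request_response"


default_mode = choose_mode(load_llm_settings({}))
disabled_mode = choose_mode(load_llm_settings({"llm": {"enable_streaming": False}}))
```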
## Usage
### 1. Start the Game
Run `play_ai_auto.bat`, which starts:
- Web Viewer on port 5001 (with SSE support)
- Game with AI agents
- LLM Logger console
### 2. Watch Real-Time Updates
Open browser to `http://localhost:5001`:
- Streaming boxes appear when AI is thinking
- See thoughts, tool calls, and responses as they happen
- Boxes disappear when complete
### 3. Review History
Completed requests are logged normally:
- Full prompt/response saved
- Tool iterations recorded
- All metadata preserved
## Technical Details
### Why SSE (Server-Sent Events)?
- One-way: Server → Client (perfect for our use case)
- Built-in reconnection
- Simple HTTP (no WebSocket complexity)
- Works with existing Flask app
### Why HTTP POST for Broadcasting?
- Decoupled architecture
- AI Manager doesn't need to know about SSE
- Non-blocking (fire and forget)
- Web viewer can be offline without breaking AI
### Token Budgets with Streaming
Streaming works with thinking budgets:
```yaml
# Single budget for all iterations
thinking_budget: 8000
thinking_budgets: []
# OR: Dynamic budgets per iteration
thinking_budgets: [8000, 4000, 2000] # 3 iterations
```
Each iteration streams its own thoughts and results.
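One way to resolve the budget for a given iteration is sketched below. The function name `budget_for_iteration` is hypothetical, and reusing the last list entry once the iterations outnumber the list is an assumption; the actual behaviour in that case is not specified here.

```python
from typing import Sequence


def budget_for_iteration(
    iteration: int,
    thinking_budget: int,
    thinking_budgets: Sequence[int] = (),
) -> int:
    """Pick the thinking budget for a zero-based tool-loop iteration."""
    if not thinking_budgets:
        return thinking_budget  # empty list: single budget for all iterations
    index = min(iteration, len(thinking_budgets) - 1)  # assumed: clamp to last entry
    return thinking_budgets[index]


single = [budget_for_iteration(i, 8000, []) for i in range(3)]
dynamic = [budget_for_iteration(i, 8000, [8000, 4000, 2000]) for i in range(4)]
```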
## Benefits
### For Development
- **Immediate feedback** - See what AI is doing in real-time
- **Debug tool calls** - Watch function calling decisions
- **Monitor thinking** - Understand reasoning process
- **Better UX** - Know the system is working
### For Users
- **Transparency** - See AI decision-making
- **Engagement** - Watch the game unfold
- **Understanding** - Learn how AI plays Catan
- **Entertainment** - More interesting than waiting
## Future Enhancements
Possible additions:
- [ ] Stream to multiple viewers simultaneously
- [ ] Replay streaming for historical games
- [ ] Filter streams by type (thoughts only, tools only)
- [ ] Stream game state updates
- [ ] WebSocket option for bidirectional communication
- [ ] Stream compression for high-frequency updates
## Troubleshooting
**No streaming visible:**
- Check `enable_streaming: true` in config
- Verify web viewer is running on port 5001
- Check browser console for connection errors
- Ensure `enable_thinking: true` for thought summaries
**Connection drops:**
- SSE reconnects automatically after 5s
- Check network/firewall
- Verify that Flask is not blocking long-lived connections
**Missing chunks:**
- Queue size is 1000 - may drop old events
- Increase queue size in `web_viewer.py` if needed
## API Reference
### StreamChunk
```python
chunk = StreamChunk(
    chunk_type='thought',  # or 'text', 'function_call', 'done'
    content='Analyzing situation...',
    is_complete=False
)
```
### SSE Event
```javascript
{
  type: 'thought',
  timestamp: '2026-01-10T12:34:56',
  content: 'I should build a settlement...'
}
```
### Broadcast API
```bash
# Send one thought chunk for Agent1 to the broadcast endpoint
curl -X POST http://localhost:5001/api/stream/broadcast \
  -H "Content-Type: application/json" \
  -d '{
    "player_name": "Agent1",
    "chunk_type": "thought",
    "content": "Thinking..."
  }'
```
## Credits
Built on top of:
- **Google Gemini API** - Streaming support with thinking mode
- **Flask** - SSE server
- **Server-Sent Events** - Real-time browser updates
- **PyCatan** - Settlers of Catan implementation
---
**Happy Streaming!**