# 🌊 Streaming System - Real-Time AI Updates

## Overview

The PyCatan AI system now supports **real-time streaming** of AI agent thoughts, actions, and tool calls! This provides immediate visibility into what the AI is thinking and doing as it plays.

## Architecture

```
┌─────────────┐         ┌──────────────┐         ┌─────────────┐
│             │ Stream  │              │   SSE   │             │
│ LLM Client  ├────────►│  AI Manager  ├────────►│ Web Viewer  │
│             │ Chunks  │              │ Events  │             │
└─────────────┘         └──────────────┘         └─────────────┘
                               │
                               │ HTTP POST
                               ▼
                        ┌──────────────┐
                        │Stream        │
                        │Broadcaster   │
                        └──────────────┘
```

## Components

### 1. LLM Client (`llm_client.py`)

**New:** `generate_stream()` method

- Uses `client.models.generate_content_stream()` for streaming
- Yields `StreamChunk` objects in real-time
- Supports `include_thoughts=True` in ThinkingConfig
- Handles three chunk types:
  - `thought` - AI reasoning/thinking
  - `text` - Regular response text
  - `function_call` - Tool/function calls

A fourth chunk type, `done`, marks the end of a stream.

**StreamChunk dataclass:**

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class StreamChunk:
    chunk_type: str  # 'thought', 'text', 'function_call', 'done'
    content: Optional[str] = None
    function_call: Optional[Dict[str, Any]] = None
    is_complete: bool = False
```
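
As a rough illustration of how the chunk types could be told apart, the sketch below maps response parts to `StreamChunk` objects. It is not the code in `llm_client.py`: the helper `chunks_from_parts`, the part attributes it reads (`thought`, `text`, `function_call`), and the `build_road` tool name are assumptions made for the example, and the parts are stand-in objects rather than real SDK responses.

```python
from dataclasses import dataclass
from types import SimpleNamespace
from typing import Any, Dict, Iterable, Iterator, Optional


@dataclass
class StreamChunk:
    chunk_type: str  # 'thought', 'text', 'function_call', 'done'
    content: Optional[str] = None
    function_call: Optional[Dict[str, Any]] = None
    is_complete: bool = False


def chunks_from_parts(parts: Iterable[Any]) -> Iterator[StreamChunk]:
    """Map response parts to StreamChunk objects, then emit a final 'done' chunk."""
    for part in parts:
        call = getattr(part, "function_call", None)
        if call is not None:
            # Tool requests carry a name and arguments instead of text.
            yield StreamChunk(
                "function_call",
                function_call={"name": call.name, "args": dict(call.args or {})},
            )
        elif getattr(part, "thought", False):
            # Assumed flag marking a part as a reasoning summary.
            yield StreamChunk("thought", content=part.text)
        elif getattr(part, "text", None):
            yield StreamChunk("text", content=part.text)
    yield StreamChunk("done", is_complete=True)


# Stand-in parts; a real stream would supply these from the SDK response.
demo_parts = [
    SimpleNamespace(thought=True, text="Wood is scarce this turn.", function_call=None),
    SimpleNamespace(
        thought=False,
        text=None,
        function_call=SimpleNamespace(name="build_road", args={"edge": 12}),
    ),
    SimpleNamespace(thought=False, text="Building a road.", function_call=None),
]
demo_chunks = list(chunks_from_parts(demo_parts))
```

Checking for a function call first keeps a tool request from being mistaken for empty text.
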

### 2. AI Manager (`ai_manager.py`)

**New:** `_send_to_llm_stream()` method

- Similar to `_send_to_llm()` but uses streaming
- Broadcasts chunks via `_broadcast_stream_chunk()`
- Supports the tool-calling loop with streaming
- Each iteration can stream thoughts and tool calls

**Configuration:**

- `config.llm.enable_streaming` - Enable/disable streaming (default: True)
- Falls back to regular request-response mode if streaming is disabled
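
The tool-calling loop described above could look roughly like the following. This is a simplified sketch, not the body of `_send_to_llm_stream()`: the function `run_streaming_turn`, its parameters, the dictionary-based chunks, and the `get_resources` tool are all hypothetical stand-ins.

```python
from typing import Any, Callable, Dict, Iterable, List

Chunk = Dict[str, Any]  # simplified stand-in for StreamChunk


def run_streaming_turn(
    stream_llm: Callable[[List[Chunk]], Iterable[Chunk]],
    broadcast: Callable[[Chunk], None],
    tools: Dict[str, Callable[..., Any]],
    max_iterations: int = 3,
) -> str:
    """Stream each iteration, forward every chunk, and run requested tools."""
    history: List[Chunk] = []
    text_parts: List[str] = []
    for _ in range(max_iterations):
        calls = []
        for chunk in stream_llm(history):
            broadcast(chunk)  # forward to viewers as soon as it arrives
            if chunk["type"] == "function_call":
                calls.append(chunk["function_call"])
            elif chunk["type"] == "text":
                text_parts.append(chunk["content"])
        if not calls:
            break  # the model answered without asking for more tools
        for call in calls:
            result = tools[call["name"]](**call["args"])
            history.append({"type": "tool_result", "name": call["name"], "content": result})
    return "".join(text_parts)


def fake_stream(history: List[Chunk]) -> Iterable[Chunk]:
    """Scripted model: ask for a tool first, answer once a result is in history."""
    if not history:
        yield {"type": "thought", "content": "Check my resources first."}
        yield {"type": "function_call", "function_call": {"name": "get_resources", "args": {}}}
    else:
        yield {"type": "text", "content": "I will build a road."}


seen: List[Chunk] = []
answer = run_streaming_turn(fake_stream, seen.append, {"get_resources": lambda: {"wood": 1}})
```

Because `broadcast` is called inside the inner loop, viewers see thoughts and tool calls from every iteration, not just the final answer.
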

### 3. Stream Broadcaster (`stream_broadcaster.py`)

**New component** that pushes events to the web viewer:

- Sends HTTP POST to `http://localhost:5001/api/stream/broadcast`
- Non-blocking with short timeout (0.5s)
- Disables itself automatically if the web viewer is not available
- Converts StreamChunk → JSON event
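
A minimal sketch of this behaviour, using only the standard library, is shown below. The class layout, the `opener` parameter, and the choice to disable after a single failure are assumptions for illustration; only the URL, the 0.5s timeout, and the payload fields come from this document.

```python
import json
import urllib.request
from typing import Any, Callable, Dict


class StreamBroadcaster:
    """Best-effort sender: one failed POST disables further attempts."""

    def __init__(
        self,
        url: str = "http://localhost:5001/api/stream/broadcast",
        timeout: float = 0.5,
        opener: Callable[..., Any] = urllib.request.urlopen,
    ) -> None:
        self.url = url
        self.timeout = timeout
        self.enabled = True
        self._opener = opener  # injectable so the sketch can be exercised offline

    def send(self, player_name: str, chunk_type: str, content: str = "") -> bool:
        if not self.enabled:
            return False
        payload: Dict[str, Any] = {
            "player_name": player_name,
            "chunk_type": chunk_type,
            "content": content,
        }
        request = urllib.request.Request(
            self.url,
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        try:
            with self._opener(request, timeout=self.timeout):
                return True
        except OSError:  # URLError and socket timeouts are OSError subclasses
            self.enabled = False  # viewer unreachable: stop trying
            return False


def unreachable(request: urllib.request.Request, timeout: float) -> Any:
    """Stand-in opener that behaves like an offline web viewer."""
    raise OSError("viewer is offline")


broadcaster = StreamBroadcaster(opener=unreachable)
first_attempt = broadcaster.send("Agent1", "thought", "Thinking...")
second_attempt = broadcaster.send("Agent1", "text", "Hello")
```

Note that a synchronous POST still waits up to the timeout; running `send` on a background thread would be one way to keep the game loop fully unblocked.
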

### 4. Web Viewer (`web_viewer.py`)

**New endpoints:**

**`GET /api/stream/<player_name>`** - SSE endpoint

- Returns Server-Sent Events stream
- Clients connect and receive real-time updates
- Sends keepalive pings every 30s
- Auto-reconnects on error

**`POST /api/stream/broadcast`** - Broadcast endpoint

- Receives events from AI Manager
- Pushes to player-specific queue
- Queue is non-blocking (max 1000 events)

**Event format:**

```json
{
  "type": "thought|text|function_call|done",
  "timestamp": "ISO-8601",
  "content": "...",
  "function_call": {...}
}
```
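
The queue and keepalive behaviour can be sketched without any web framework. The helper names `push_event` and `sse_frames`, the drop-oldest policy on a full queue, and the `data: <json>` wire framing are assumptions here, not a description of what `web_viewer.py` actually does.

```python
import json
import queue
from typing import Any, Dict, Iterator

MAX_EVENTS = 1000  # queue bound quoted above
KEEPALIVE_SECONDS = 30.0


def push_event(q: "queue.Queue[Dict[str, Any]]", event: Dict[str, Any]) -> None:
    """Enqueue without blocking; when full, drop the oldest event (assumed policy)."""
    while True:
        try:
            q.put_nowait(event)
            return
        except queue.Full:
            try:
                q.get_nowait()  # make room by discarding the oldest entry
            except queue.Empty:
                pass


def sse_frames(
    q: "queue.Queue[Dict[str, Any]]", keepalive: float = KEEPALIVE_SECONDS
) -> Iterator[str]:
    """Yield SSE frames forever; emit a comment ping whenever the queue is idle."""
    while True:
        try:
            event = q.get(timeout=keepalive)
        except queue.Empty:
            yield ": keepalive\n\n"  # lines starting with ':' are SSE comments
            continue
        yield f"data: {json.dumps(event)}\n\n"


# Small queue so the drop-oldest behaviour is visible.
demo_queue: "queue.Queue[Dict[str, Any]]" = queue.Queue(maxsize=2)
for i in range(3):
    push_event(demo_queue, {"type": "text", "content": str(i)})

frames = sse_frames(demo_queue, keepalive=0.01)
first_frame = next(frames)
second_frame = next(frames)
idle_frame = next(frames)
```

In a Flask app, a generator like this would typically be returned as `Response(sse_frames(q), mimetype="text/event-stream")`.
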

### 5. Dynamic Viewer UI (`viewer_dynamic.html`)

**New features:**

**Streaming Container** - Shows live updates:

- Appears at top of page when streaming is active
- Shows player name with blinking indicator
- Auto-scrolls as new chunks arrive
- Fades out after completion

**Visual feedback:**

- 💭 Purple border for thoughts
- 🔹 Green border for text
- 🔧 Orange border for function calls
- ✅ Done status with green indicator

**JavaScript functions:**

- `initStreaming()` - Connect to SSE for all players
- `connectPlayerStream(player)` - Create EventSource
- `handleStreamChunk(player, chunk)` - Process incoming chunk
- `addStreamChunk(container, type, content)` - Display chunk

## Configuration

### Enable Streaming

In `config_dev.yaml`:

```yaml
llm:
  enable_streaming: true   # Enable real-time streaming
  enable_thinking: true    # Required for thought summaries
  thinking_budget: 8000    # Budget for thinking tokens
```

### Disable Streaming

Set `enable_streaming: false` to use traditional request-response mode.

## Usage

### 1. Start the Game

Run `play_ai_auto.bat`, which starts:

- Web Viewer on port 5001 (with SSE support)
- Game with AI agents
- LLM Logger console

### 2. Watch Real-Time Updates

Open browser to `http://localhost:5001`:

- Streaming boxes appear when AI is thinking
- See thoughts, tool calls, and responses as they happen
- Boxes disappear when complete

### 3. Review History

Completed requests are logged normally:

- Full prompt/response saved
- Tool iterations recorded
- All metadata preserved

## Technical Details

### Why SSE (Server-Sent Events)?

- One-way: Server → Client (perfect for our use case)
- Built-in reconnection
- Simple HTTP (no WebSocket complexity)
- Works with existing Flask app

### Why HTTP POST for Broadcasting?

- Decoupled architecture
- AI Manager doesn't need to know about SSE
- Non-blocking (fire and forget)
- Web viewer can be offline without breaking AI

### Token Budgets with Streaming

Streaming works with thinking budgets:

```yaml
# Single budget for all iterations
thinking_budget: 8000
thinking_budgets: []

# OR: Dynamic budgets per iteration
thinking_budgets: [8000, 4000, 2000]  # 3 iterations
```

Each iteration streams its own thoughts and results.
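
One way to read the two settings is sketched below. The function `budget_for_iteration` is hypothetical, and reusing the last list entry for iterations beyond the end of `thinking_budgets` is an assumption; the document does not say what happens in that case.

```python
from typing import List, Optional


def budget_for_iteration(
    iteration: int,
    thinking_budget: int,
    thinking_budgets: Optional[List[int]] = None,
) -> int:
    """Pick the thinking budget for a 0-based tool-loop iteration."""
    if not thinking_budgets:
        return thinking_budget  # empty list: one budget for every iteration
    # Assumption: iterations past the end of the list reuse the last entry.
    return thinking_budgets[min(iteration, len(thinking_budgets) - 1)]


single = [budget_for_iteration(i, 8000, []) for i in range(3)]
dynamic = [budget_for_iteration(i, 8000, [8000, 4000, 2000]) for i in range(4)]
```

With the values from the YAML above, `single` stays at 8000 for every iteration while `dynamic` steps down through the list.
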

## Benefits

### For Development

- **Immediate feedback** - See what AI is doing in real-time
- **Debug tool calls** - Watch function calling decisions
- **Monitor thinking** - Understand reasoning process
- **Better UX** - Know the system is working

### For Users

- **Transparency** - See AI decision-making
- **Engagement** - Watch the game unfold
- **Understanding** - Learn how AI plays Catan
- **Entertainment** - More interesting than waiting

## Future Enhancements

Possible additions:

- [ ] Stream to multiple viewers simultaneously
- [ ] Replay streaming for historical games
- [ ] Filter streams by type (thoughts only, tools only)
- [ ] Stream game state updates
- [ ] WebSocket option for bidirectional communication
- [ ] Stream compression for high-frequency updates

## Troubleshooting

**No streaming visible:**

- Check that `enable_streaming: true` is set in the config
- Verify the web viewer is running on port 5001
- Check the browser console for connection errors
- Ensure `enable_thinking: true` is set if thought summaries are missing

**Connection drops:**

- SSE reconnects automatically after 5s
- Check network/firewall
- Verify that Flask is not blocking long-lived connections

**Missing chunks:**

- The queue holds at most 1000 events, so older events may be dropped
- Increase the queue size in `web_viewer.py` if needed

## API Reference

### StreamChunk

```python
chunk = StreamChunk(
    chunk_type='thought',  # or 'text', 'function_call', 'done'
    content='Analyzing situation...',
    is_complete=False
)
```

### SSE Event

```javascript
{
  type: 'thought',
  timestamp: '2026-01-10T12:34:56',
  content: 'I should build a settlement...'
}
```
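
For checking the stream outside the browser, events like the one above can be decoded from raw SSE lines. The parser below is a sketch that assumes each event is sent as a single JSON object on `data:` lines followed by a blank line; `parse_sse` is a hypothetical helper, not part of the project.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List


def parse_sse(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Decode JSON events from SSE lines; a blank line terminates each event."""
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(":"):
            continue  # comment line, e.g. a keepalive ping
        if line.startswith("data:"):
            value = line[5:]
            if value.startswith(" "):
                value = value[1:]  # a single leading space is not part of the payload
            data.append(value)
        elif line == "" and data:
            yield json.loads("\n".join(data))
            data = []


# Example wire content: a ping, one thought, and the end-of-stream marker.
wire = [
    ": keepalive\n",
    "\n",
    'data: {"type": "thought", "content": "I should build a settlement..."}\n',
    "\n",
    'data: {"type": "done"}\n',
    "\n",
]
events = list(parse_sse(wire))
```

The same function can be fed decoded lines read from an open HTTP response to `/api/stream/<player_name>`.
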

### Broadcast API

```http
POST http://localhost:5001/api/stream/broadcast
Content-Type: application/json

{
  "player_name": "Agent1",
  "chunk_type": "thought",
  "content": "Thinking..."
}
```

## Credits

Built on top of:

- **Google Gemini API** - Streaming support with thinking mode
- **Flask** - SSE server
- **Server-Sent Events** - Real-time browser updates
- **PyCatan** - Settlers of Catan implementation

---

**Happy Streaming! 🌊**