# Intermediate Responses Logging

## Overview

The system now saves **all intermediate LLM responses**, including the raw content returned when the LLM requests tools instead of providing a final answer.

## Directory Structure

```
session_YYYYMMDD_HHMMSS/
├── Alice/
│   ├── prompts/
│   │   ├── prompt_1.json              # Initial prompt
│   │   └── iterations/
│   │       └── prompt_1_iter2.json    # Follow-up with tool results
│   └── responses/
│       ├── response_1.json            # Final response (type: "final")
│       └── intermediate/
│           └── response_1_iter1.json  # NEW! Intermediate response with tool_calls
├── tool_executions.json
└── llm_communication.log
```

## What Gets Saved

### Intermediate Response Format

Location: `responses/intermediate/response_X_iterY.json`

```json
{
  "request_number": 1,
  "iteration": 1,
  "timestamp": "2026-01-09T16:07:34.123456",
  "player_name": "Alice",
  "type": "intermediate",
  "success": true,
  "raw_content": "...",          // Raw LLM response content
  "has_tool_calls": true,
  "tool_calls": [                // Full tool_calls array from LLM
    {
      "name": "find_best_nodes",
      "parameters": {
        "reasoning": "Looking for high-yield nodes...",
        "min_pips": 10
      }
    }
  ],
  "model": "gemini-2.0-flash-exp",
  "tokens": {
    "prompt": 2172,
    "completion": 79,
    "thinking": 0,
    "total": 2251
  },
  "latency_seconds": 16.234,
  "error": null
}
```

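
To inspect one of these files programmatically, something like the following sketch may be enough. It assumes the files on disk are plain JSON (the `//` annotations above are documentation only) and that each `tool_calls` entry carries the `name` and `parameters` keys shown. `summarize_tool_calls` is a hypothetical helper, not part of the project.

```python
import json
from pathlib import Path


def summarize_tool_calls(response_path: Path) -> list[str]:
    """Return one 'name(param=value, ...)' string per tool call in the file.

    Hypothetical helper: assumes the record layout documented above.
    """
    record = json.loads(response_path.read_text(encoding="utf-8"))
    # Only intermediate records that actually carry tool calls are of interest.
    if record.get("type") != "intermediate" or not record.get("has_tool_calls"):
        return []
    summaries = []
    for call in record.get("tool_calls") or []:
        params = call.get("parameters", {})
        # Leave out the free-text "reasoning" field to keep the summary short.
        args = ", ".join(
            f"{key}={value!r}" for key, value in params.items() if key != "reasoning"
        )
        summaries.append(f"{call.get('name', '<unknown>')}({args})")
    return summaries
```

For the example record above, this would return `["find_best_nodes(min_pips=10)"]`.
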
### Final Response Format

Location: `responses/response_X.json`

```json
{
  "request_number": 1,
  "timestamp": "2026-01-09T16:09:24.617751",
  "player_name": "Alice",
  "type": "final",               // Marked as final
  "success": true,
  "raw_content": "...",          // Final structured response
  "parsed": {                    // Parsed action
    "action_type": "place_starting_settlement",
    "parameters": {"node": 43}
  },
  "model": "gemini-2.0-flash-exp",
  "tokens": {
    "prompt": 3538,              // Accumulated tokens
    "completion": 355,
    "thinking": 5366,
    "total": 13070               // Total including all iterations + tools
  },
  "latency_seconds": 26.136,
  "error": null
}
```

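
The `type` and `parsed` fields make it straightforward to pull the chosen action out of a final response. The sketch below assumes the layout shown above and plain JSON on disk. `load_final_action` is a hypothetical name used for illustration only.

```python
import json
from pathlib import Path
from typing import Optional


def load_final_action(response_path: Path) -> Optional[tuple[str, dict]]:
    """Return (action_type, parameters) from a final response, or None.

    Hypothetical helper: assumes the "type", "success" and "parsed" fields
    documented above.
    """
    record = json.loads(response_path.read_text(encoding="utf-8"))
    # Intermediate or failed records carry no usable parsed action.
    if record.get("type") != "final" or not record.get("success"):
        return None
    parsed = record.get("parsed") or {}
    action_type = parsed.get("action_type")
    if action_type is None:
        return None
    return action_type, parsed.get("parameters", {})
```

For the example record above, this would return `("place_starting_settlement", {"node": 43})`.
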
## Complete Flow Example

1. **Initial Prompt** → `prompts/prompt_1.json`
2. **LLM Response (requests tools)** → `responses/intermediate/response_1_iter1.json` ✨ **NEW!**
3. **Tool Execution** → `tool_executions.json`
4. **Follow-up Prompt** → `prompts/iterations/prompt_1_iter2.json`
5. **Final Response** → `responses/response_1.json`

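
The per-request files in this flow can be put back in order from their names alone. The sketch below assumes the naming scheme shown in the directory structure and that a prompt at iteration N precedes the intermediate response at iteration N. `request_timeline` is a hypothetical helper. Since `tool_executions.json` is a single shared file, it is left out.

```python
import re
from pathlib import Path


def request_timeline(player_dir: Path, request_number: int) -> list[Path]:
    """Return the prompt/response files of one request in conversation order."""
    n = request_number
    # Each step is (iteration, kind, path); kind 0 = prompt, 1 = response,
    # so a prompt sorts before the response from the same iteration.
    steps: list[tuple[int, int, Path]] = []

    initial = player_dir / "prompts" / f"prompt_{n}.json"
    if initial.exists():
        steps.append((1, 0, initial))

    pattern = re.compile(rf"^(prompt|response)_{n}_iter(\d+)\.json$")
    folders = (
        player_dir / "prompts" / "iterations",
        player_dir / "responses" / "intermediate",
    )
    for folder in folders:
        if not folder.is_dir():
            continue
        for path in folder.iterdir():
            match = pattern.match(path.name)
            if match:
                kind = 0 if match.group(1) == "prompt" else 1
                steps.append((int(match.group(2)), kind, path))

    steps.sort(key=lambda step: (step[0], step[1]))
    ordered = [path for _, _, path in steps]

    # The final response, if present, always closes the request.
    final = player_dir / "responses" / f"response_{n}.json"
    if final.exists():
        ordered.append(final)
    return ordered
```

For the example session, `request_timeline(session_dir / "Alice", 1)` would list `prompt_1.json`, `response_1_iter1.json`, `prompt_1_iter2.json`, and `response_1.json`, in that order.
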
## Benefits

1. **Complete Audit Trail** - Every LLM interaction is saved
2. **Debug Tool Requests** - See exactly which tools the LLM asked for, and with what parameters
3. **Analyze Reasoning** - Understand why tools were requested
4. **Replay Capability** - The entire conversation can be reconstructed from the saved files
5. **Cost Tracking** - Token counts are recorded for each iteration

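
As an illustration of the cost-tracking point, the per-iteration token counts can be summed across a player's intermediate responses. This is only a sketch: it assumes the `tokens` object shown in the intermediate response format, and `intermediate_token_totals` is a hypothetical helper.

```python
import json
from collections import Counter
from pathlib import Path


def intermediate_token_totals(player_dir: Path) -> Counter:
    """Sum the 'tokens' fields over all intermediate responses of a player."""
    totals: Counter = Counter()
    intermediate_dir = player_dir / "responses" / "intermediate"
    for path in sorted(intermediate_dir.glob("response_*_iter*.json")):
        record = json.loads(path.read_text(encoding="utf-8"))
        # Counter.update adds the values key by key (prompt, completion, ...).
        totals.update(record.get("tokens") or {})
    return totals
```

The final response's `tokens` are described as accumulated across iterations, so adding them on top of these sums would likely double-count.
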
## Usage

Intermediate responses are saved automatically by `AILogger.log_intermediate_response()` whenever the LLM returns `tool_calls` instead of a final answer.

No changes to your code are needed.

## Finding Intermediate Responses

```python
from pathlib import Path

session_dir = Path("examples/ai_testing/my_games/session_20260109_160732")

# Find all intermediate responses for Alice, in file-name order
intermediate_dir = session_dir / "Alice" / "responses" / "intermediate"
for response_file in sorted(intermediate_dir.glob("*.json")):
    print(f"Found: {response_file.name}")
```

## Why This Matters

Previously, when the LLM requested tools, only the following was saved:

- That tools were requested (in logs)
- Which tools were called (in `tool_executions.json`)
- The follow-up prompt (in `iterations/`)

Intermediate responses now **also** record:

- The raw LLM response content
- The full `tool_calls` structure
- Token counts for that specific iteration
- Timing information
- Any error messages

Together, these give complete visibility into the AI agent's decision-making process.