Felix Lebel committed on
Commit ·
d725aa7
1
Parent(s): 615a63b
assignment done
- README.md +88 -4
- agent.py +653 -114
- mcp_server.py +260 -79
README.md
CHANGED
|
@@ -14,15 +14,99 @@ license: mit
|
|
| 14 |
|
| 15 |
## Overview
|
| 16 |
|
| 17 |
-
This is my submission for the Text Adventure Agent assignment. My agent uses the ReAct pattern to play text adventure games via MCP.
|
|
|
|
| 18 |
|
| 19 |
## Approach
|
| 20 |
|
| 21 |
<!-- Describe your approach here -->
|
| 22 |
|
| 23 |
-
|
| 24 |
-
|
| 25 |
-
|
| 26 |
|
| 27 |
## Files
|
| 28 |
|
|
|
|
| 14 |
|
| 15 |
## Overview
|
| 16 |
|
| 17 |
+
This is my submission for the Text Adventure Agent assignment. My agent uses the ReAct pattern to play text adventure games via MCP.
|
| 18 |
+
Author: Félix LEBEL
|
| 19 |
|
| 20 |
## Approach
|
| 21 |
|
| 22 |
<!-- Describe your approach here -->
|
| 23 |
|
| 24 |
+
### What strategy does your agent use?
|
| 25 |
+
General strategy: The agent is encouraged to explore (using movement actions) and to examine and interact with its environment as much as possible.
|
| 26 |
+
|
| 27 |
+
Here is an excerpt from my system prompt:
|
| 28 |
+
EXPLORATION STRATEGY (follow this priority):
|
| 29 |
+
1. EXPLORE a lot! Try new locations and exits frequently (north, south, east, west, northeast, northwest, southeast, southwest, up, down, enter, exit)
|
| 30 |
+
2. ALWAYS EXAMINE everything that could be interesting, especially details in objects, rooms... EXAMINE where you could find some loot or useful items, or clues for puzzles. INTERACT with characters and objects to discover new possibilities.
|
| 31 |
+
3. ALWAYS take items that seem useful (lamp, sword, key, etc.)
|
| 32 |
+
4. Open containers (mailbox, cases, doors, windows)
|
| 33 |
+
5. Try ALL exits from a location before moving on
|
| 34 |
+
6. Use get_map and location_log frequently to plan which unexplored exits to try, and what actions to take. It also helps you remember what you've tried at the current location and their outcomes, so you can avoid repeating failed actions and focus on promising ones.
|
| 35 |
+
7. Use memory to check if you're repeating yourself
|
| 36 |
+
8. If you've been in the same location for 3+ turns, MOVE to a new location
|
| 37 |
+
|
| 38 |
+
### What tools did you implement in your MCP server?
|
| 39 |
+
|
| 40 |
+
```python
|
| 41 |
+
def play_action(action: str) -> str:
|
| 42 |
+
"""
|
| 43 |
+
Execute a game command and return the result.
|
| 44 |
+
|
| 45 |
+
This is the main tool for interacting with the game.
|
| 46 |
+
|
| 47 |
+
Args:
|
| 48 |
+
action: The command to execute (e.g., "north", "take lamp", "open mailbox")
|
| 49 |
+
|
| 50 |
+
Returns:
|
| 51 |
+
The game's response to the action
|
| 52 |
+
|
| 53 |
+
Valid commands include:
|
| 54 |
+
- Movement: north, south, east, west, northeast, northwest, southeast, southwest, up, down, enter, exit
|
| 55 |
+
- Objects: take <item>, drop <item>, open <thing>, examine <thing>
|
| 56 |
+
- Other: look, inventory, read <thing>, turn on lamp
|
| 57 |
+
"""
|
| 58 |
+
```
|
| 59 |
+
```python
|
| 60 |
+
def memory() -> str:
|
| 61 |
+
"""
|
| 62 |
+
Get the current game state summary.
|
| 63 |
+
|
| 64 |
+
Returns:
|
| 65 |
+
A summary including current location (number of visits, actions tried, promising actions),
|
| 66 |
+
recent actions and current observation
|
| 67 |
+
"""
|
| 68 |
+
```
|
| 69 |
+
```python
|
| 70 |
+
def inventory() -> str:
|
| 71 |
+
"""
|
| 72 |
+
Check what the player is carrying.
|
| 73 |
+
|
| 74 |
+
Returns:
|
| 75 |
+
List of items in the player's inventory
|
| 76 |
+
"""
|
| 77 |
+
return get_game().get_inventory()
|
| 78 |
+
```
|
| 79 |
+
```python
|
| 80 |
+
def get_map() -> str:
|
| 81 |
+
"""
|
| 82 |
+
Get a map of explored locations, connections and exits.
|
| 83 |
+
Useful for navigation and avoiding getting lost.
|
| 84 |
+
|
| 85 |
+
Returns:
|
| 86 |
+
A text representation of explored locations and connections
|
| 87 |
+
"""
|
| 88 |
+
```
|
| 89 |
+
```python
|
| 90 |
+
def location_log() -> str:
|
| 91 |
+
"""
|
| 92 |
+
Shows what actions were tried and their outcomes at the current location, along with any promising actions to try.
|
| 93 |
+
|
| 94 |
+
|
| 95 |
+
Returns:
|
| 96 |
+
A detailed log of the current location, including visit count, actions taken and their outcomes, and promising leads.
|
| 97 |
+
"""
|
| 98 |
+
```
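
The server-side state behind `location_log` is not shown in this README. As a rough illustration only, a per-location log could be kept along these lines; `LocationRecord`, `LocationLog`, and all method names are hypothetical and are not taken from `mcp_server.py`.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class LocationRecord:
    """Hypothetical per-location record (names are illustrative only)."""
    visits: int = 0
    tried: list[tuple[str, str]] = field(default_factory=list)  # (action, outcome)
    promising: list[str] = field(default_factory=list)


class LocationLog:
    """Keeps visit counts, tried actions and pending leads per location."""

    def __init__(self) -> None:
        self._records: dict[str, LocationRecord] = defaultdict(LocationRecord)

    def visit(self, location: str) -> None:
        self._records[location].visits += 1

    def record(self, location: str, action: str, outcome: str) -> None:
        rec = self._records[location]
        rec.tried.append((action, outcome))
        # An action that has just been tried is no longer a pending lead.
        if action in rec.promising:
            rec.promising.remove(action)

    def suggest(self, location: str, action: str) -> None:
        rec = self._records[location]
        already_tried = any(a == action for a, _ in rec.tried)
        if not already_tried and action not in rec.promising:
            rec.promising.append(action)

    def render(self, location: str) -> str:
        rec = self._records[location]
        lines = [f"Location: {location} (visits: {rec.visits})"]
        lines += [f"> {a} -> {o}" for a, o in rec.tried]
        lines.append("Promising: " + (", ".join(rec.promising) or "none"))
        return "\n".join(lines)
```

A tool such as `location_log` could then simply return `render(current_location)`; how the real server formats its output may differ.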
|
| 99 |
+
### Any interesting techniques or optimizations?
|
| 100 |
+
|
| 101 |
+
Here is a list of ideas and techniques I implemented:
|
| 102 |
+
- I used the Jericho API to extract cleaner location names
|
| 103 |
+
- I used these "cleaner" locations to write a function that determines when a player enters a new location
|
| 104 |
+
- I kept a log of every action at every location, distinguishing movement from non-movement actions
|
| 105 |
+
- When building the LLM prompt for the agent, I call a second LLM whose task is to extract promising actions from the current observation, the general history of actions/tool calls taken by the agent, and the log of actions taken at the current location (this keeps the agent from getting stuck and makes it aware of its latest actions)
|
| 106 |
+
- I implemented an "Exploration Pressure" in several ways:
|
| 107 |
+
* if the agent stays too long at the same location, the agent's LLM prompt changes to suggest more movement (or using get_map or look)
|
| 108 |
+
* if the agent keeps returning to the same location after having already interacted extensively with its objects, I show it the directions/movements it has not yet tried at that location
|
| 109 |
+
- I extensively refined the system prompt and the extraction prompt
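
To make the "Exploration Pressure" idea concrete, here is a minimal sketch under stated assumptions: the 3-step threshold is borrowed from rule 8 of the system prompt, the `EXPLORATION HINT` label is one of the warning markers the extraction prompt looks for, and the function name `exploration_hint` and its parameters are hypothetical rather than the actual code in `agent.py`.

```python
# Direction commands the agent may try (same set as in the system prompt,
# without "enter"/"exit").
DIRECTIONS = (
    "north", "south", "east", "west",
    "northeast", "northwest", "southeast", "southwest",
    "up", "down",
)


def exploration_hint(steps_here: int, tried: set[str], max_steps_here: int = 3) -> str:
    """Return a hint to append to the prompt, or "" when no pressure is needed.

    steps_here: consecutive steps spent at the current location.
    tried: direction commands already attempted at this location.
    """
    if steps_here < max_steps_here:
        return ""
    untried = [d for d in DIRECTIONS if d not in tried]
    if untried:
        return (
            "EXPLORATION HINT: you have stayed here too long. "
            "Untried directions: " + ", ".join(untried)
        )
    return (
        "EXPLORATION HINT: every direction was tried here. "
        "Use get_map or look to find another route."
    )
```

The returned string would be added to the observation passed to the LLM, so the hint only appears once the agent has lingered.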
|
| 110 |
|
| 111 |
## Files
|
| 112 |
|
agent.py
CHANGED
|
@@ -28,39 +28,32 @@ import os
|
|
| 28 |
import re
|
| 29 |
from dataclasses import dataclass, field
|
| 30 |
from typing import Optional
|
|
|
|
| 31 |
|
| 32 |
from dotenv import load_dotenv
|
| 33 |
-
from huggingface_hub import InferenceClient
|
| 34 |
|
| 35 |
# Load environment variables
|
| 36 |
load_dotenv()
|
| 37 |
|
| 38 |
-
# Set USE_LOCAL_MODEL=1 in your .env to use a locally downloaded model
|
| 39 |
-
USE_LOCAL_MODEL = os.getenv("USE_LOCAL_MODEL", "0").strip() in ("1", "true", "yes")
|
| 40 |
-
LOCAL_MODEL_ID = os.getenv("LOCAL_MODEL_ID", "Qwen/Qwen2.5-3B-Instruct")
|
| 41 |
-
|
| 42 |
# =============================================================================
|
| 43 |
# LLM Configuration - DO NOT MODIFY
|
| 44 |
# =============================================================================
|
| 45 |
|
| 46 |
-
# Model to use (fixed for fair evaluation)
|
| 47 |
LLM_MODEL = "Qwen/Qwen2.5-72B-Instruct"
|
| 48 |
|
| 49 |
-
|
| 50 |
_local_pipeline = None
|
| 51 |
|
| 52 |
if USE_LOCAL_MODEL:
|
| 53 |
-
|
| 54 |
-
|
| 55 |
-
|
| 56 |
-
|
| 57 |
-
|
| 58 |
-
|
| 59 |
-
|
| 60 |
-
|
| 61 |
-
|
| 62 |
-
LLM_CLIENT = None
|
| 63 |
-
else:
|
| 64 |
_hf_token = os.getenv("HF_TOKEN")
|
| 65 |
if not _hf_token:
|
| 66 |
raise ValueError("HF_TOKEN not found. Set it in your .env file.")
|
|
@@ -79,13 +72,6 @@ def call_llm(prompt: str, system_prompt: str, seed: int, max_tokens: int = 300)
|
|
| 79 |
|
| 80 |
Returns:
|
| 81 |
The LLM's response text
|
| 82 |
-
|
| 83 |
-
Example:
|
| 84 |
-
response = call_llm(
|
| 85 |
-
prompt="You are in a forest. What do you do?",
|
| 86 |
-
system_prompt=SYSTEM_PROMPT,
|
| 87 |
-
seed=42,
|
| 88 |
-
)
|
| 89 |
"""
|
| 90 |
messages = [
|
| 91 |
{"role": "system", "content": system_prompt},
|
|
@@ -96,7 +82,7 @@ def call_llm(prompt: str, system_prompt: str, seed: int, max_tokens: int = 300)
|
|
| 96 |
outputs = _local_pipeline(
|
| 97 |
messages,
|
| 98 |
max_new_tokens=max_tokens,
|
| 99 |
-
temperature=0.0001,
|
| 100 |
do_sample=True,
|
| 101 |
)
|
| 102 |
return outputs[0]["generated_text"][-1]["content"]
|
|
@@ -104,7 +90,7 @@ def call_llm(prompt: str, system_prompt: str, seed: int, max_tokens: int = 300)
|
|
| 104 |
response = LLM_CLIENT.chat.completions.create(
|
| 105 |
model=LLM_MODEL,
|
| 106 |
messages=messages,
|
| 107 |
-
temperature=0.0,
|
| 108 |
max_tokens=max_tokens,
|
| 109 |
seed=seed,
|
| 110 |
)
|
|
@@ -125,61 +111,221 @@ class RunResult:
|
|
| 125 |
|
| 126 |
|
| 127 |
# =============================================================================
|
| 128 |
-
# System Prompt
|
| 129 |
# =============================================================================
|
| 130 |
|
| 131 |
-
SYSTEM_PROMPT = """You are playing a classic text adventure game.
|
| 132 |
|
| 133 |
-
|
| 134 |
|
| 135 |
AVAILABLE TOOLS (use via MCP):
|
| 136 |
- play_action: Execute a game command (north, take lamp, open mailbox, etc.)
|
| 137 |
-
-
|
| 138 |
-
-
|
| 139 |
|
| 140 |
VALID GAME COMMANDS for play_action:
|
| 141 |
-
- Movement: north, south, east, west, up, down, enter, exit
|
| 142 |
- Objects: take <item>, drop <item>, open <thing>, close <thing>, examine <thing>
|
| 143 |
-
-
|
| 144 |
|
| 145 |
RESPOND IN THIS EXACT FORMAT (no markdown):
|
| 146 |
THOUGHT: <your reasoning about what to do next>
|
| 147 |
TOOL: <tool_name>
|
| 148 |
ARGS: <JSON arguments, e.g., {"action": "look"}>
|
| 149 |
|
| 150 |
-
|
| 151 |
-
|
| 152 |
TOOL: play_action
|
| 153 |
ARGS: {"action": "look"}
|
| 154 |
"""
|
| 155 |
|
| 156 |
|
| 157 |
# =============================================================================
|
| 158 |
-
# Student Agent
|
| 159 |
# =============================================================================
|
| 160 |
|
|
|
|
|
|
|
| 161 |
class StudentAgent:
|
| 162 |
"""
|
| 163 |
-
|
| 164 |
-
|
| 165 |
-
TODO:
|
| 166 |
-
1. Implement the run() method with the ReAct loop
|
| 167 |
-
2. Parse LLM responses to extract tool calls
|
| 168 |
-
3. Track state and avoid loops
|
| 169 |
-
|
| 170 |
-
Use the provided call_llm() function to interact with the LLM.
|
| 171 |
"""
|
| 172 |
|
| 173 |
def __init__(self):
|
| 174 |
"""Initialize your agent here."""
|
| 175 |
-
#
|
| 176 |
-
|
| 177 |
-
|
| 178 |
-
|
| 179 |
|
| 180 |
async def run(
|
| 181 |
self,
|
| 182 |
-
client,
|
| 183 |
game: str,
|
| 184 |
max_steps: int,
|
| 185 |
seed: int,
|
|
@@ -187,89 +333,483 @@ class StudentAgent:
|
|
| 187 |
) -> RunResult:
|
| 188 |
"""
|
| 189 |
Run the agent for a game session.
|
| 190 |
-
|
| 191 |
-
Args:
|
| 192 |
-
client: FastMCP Client connected to your MCP server
|
| 193 |
-
game: Name of the game being played (e.g., "zork1")
|
| 194 |
-
max_steps: Maximum number of steps to take
|
| 195 |
-
seed: Random seed for reproducibility (use for LLM calls)
|
| 196 |
-
verbose: Whether to print detailed output
|
| 197 |
-
|
| 198 |
-
Returns:
|
| 199 |
-
RunResult with final score and statistics
|
| 200 |
"""
|
| 201 |
-
# TODO: Implement your ReAct loop here
|
| 202 |
-
#
|
| 203 |
-
# Basic structure:
|
| 204 |
-
# 1. Get initial observation (call play_action with "look")
|
| 205 |
-
# 2. Loop for max_steps:
|
| 206 |
-
# a. Build prompt with current observation and history
|
| 207 |
-
# b. Call LLM to get thought and action
|
| 208 |
-
# c. Parse the response to extract tool and args
|
| 209 |
-
# d. Call the tool via client.call_tool(tool_name, args)
|
| 210 |
-
# e. Update history and state
|
| 211 |
-
# f. Check for game over
|
| 212 |
-
# 3. Return RunResult with final statistics
|
| 213 |
-
|
| 214 |
-
# Example of calling a tool:
|
| 215 |
-
# result = await client.call_tool("play_action", {"action": "look"})
|
| 216 |
-
# observation = result[0].text if result else "No response"
|
| 217 |
-
|
| 218 |
-
# Example of calling the LLM:
|
| 219 |
-
# response = call_llm(
|
| 220 |
-
# prompt="Current observation: " + observation,
|
| 221 |
-
# system_prompt=SYSTEM_PROMPT,
|
| 222 |
-
# seed=seed,
|
| 223 |
-
# )
|
| 224 |
-
|
| 225 |
-
# Placeholder implementation - replace with your code
|
| 226 |
locations_visited = set()
|
| 227 |
history = []
|
| 228 |
-
final_score = 0
|
| 229 |
moves = 0
|
| 230 |
|
| 231 |
-
#
|
| 232 |
-
|
| 233 |
|
| 234 |
return RunResult(
|
| 235 |
-
final_score=
|
| 236 |
-
max_score=350,
|
| 237 |
moves=moves,
|
| 238 |
locations_visited=locations_visited,
|
| 239 |
-
game_completed=
|
| 240 |
history=history,
|
| 241 |
)
|
| 242 |
-
|
| 243 |
-
def
|
| 244 |
"""
|
| 245 |
-
|
| 246 |
|
| 247 |
-
|
| 248 |
"""
|
| 249 |
-
|
| 250 |
-
|
| 251 |
-
|
| 252 |
def _parse_response(self, response: str) -> tuple[str, str, dict]:
|
| 253 |
"""
|
| 254 |
Parse LLM response to extract thought, tool name, and arguments.
|
| 255 |
|
| 256 |
-
|
| 257 |
|
| 258 |
-
|
| 259 |
-
|
| 260 |
-
|
| 261 |
-
|
| 262 |
-
|
| 263 |
-
|
| 264 |
-
|
| 265 |
-
|
| 266 |
|
| 267 |
-
def
|
| 268 |
-
"""
|
| 269 |
-
|
| 270 |
|
| 271 |
-
|
| 272 |
-
"""
|
| 273 |
return call_llm(prompt, system_prompt, seed)
|
| 274 |
|
| 275 |
|
|
@@ -281,7 +821,6 @@ async def test_agent():
|
|
| 281 |
"""Test the agent locally."""
|
| 282 |
from fastmcp import Client
|
| 283 |
|
| 284 |
-
# Path to your MCP server
|
| 285 |
server_path = "mcp_server.py"
|
| 286 |
|
| 287 |
agent = StudentAgent()
|
|
@@ -302,4 +841,4 @@ async def test_agent():
|
|
| 302 |
|
| 303 |
if __name__ == "__main__":
|
| 304 |
import asyncio
|
| 305 |
-
asyncio.run(test_agent())
|
|
|
|
| 28 |
import re
|
| 29 |
from dataclasses import dataclass, field
|
| 30 |
from typing import Optional
|
| 31 |
+
import numpy as np
|
| 32 |
|
| 33 |
from dotenv import load_dotenv
|
|
|
|
| 34 |
|
| 35 |
# Load environment variables
|
| 36 |
load_dotenv()
|
| 37 |
|
| 38 |
# =============================================================================
|
| 39 |
# LLM Configuration - DO NOT MODIFY
|
| 40 |
# =============================================================================
|
| 41 |
|
|
|
|
| 42 |
LLM_MODEL = "Qwen/Qwen2.5-72B-Instruct"
|
| 43 |
|
| 44 |
+
USE_LOCAL_MODEL = os.getenv("USE_LOCAL_MODEL", "false").lower() == "true"
|
| 45 |
_local_pipeline = None
|
| 46 |
|
| 47 |
if USE_LOCAL_MODEL:
|
| 48 |
+
try:
|
| 49 |
+
from transformers import pipeline
|
| 50 |
+
LOCAL_MODEL = os.getenv("LOCAL_MODEL", "Qwen/Qwen2.5-3B-Instruct")
|
| 51 |
+
_local_pipeline = pipeline("text-generation", model=LOCAL_MODEL, device_map="auto")
|
| 52 |
+
except Exception:
|
| 53 |
+
USE_LOCAL_MODEL = False
|
| 54 |
+
|
| 55 |
+
if not USE_LOCAL_MODEL:
|
| 56 |
+
from huggingface_hub import InferenceClient
|
|
|
|
|
|
|
| 57 |
_hf_token = os.getenv("HF_TOKEN")
|
| 58 |
if not _hf_token:
|
| 59 |
raise ValueError("HF_TOKEN not found. Set it in your .env file.")
|
|
|
|
| 72 |
|
| 73 |
Returns:
|
| 74 |
The LLM's response text
|
| 75 |
"""
|
| 76 |
messages = [
|
| 77 |
{"role": "system", "content": system_prompt},
|
|
|
|
| 82 |
outputs = _local_pipeline(
|
| 83 |
messages,
|
| 84 |
max_new_tokens=max_tokens,
|
| 85 |
+
temperature=0.0001,
|
| 86 |
do_sample=True,
|
| 87 |
)
|
| 88 |
return outputs[0]["generated_text"][-1]["content"]
|
|
|
|
| 90 |
response = LLM_CLIENT.chat.completions.create(
|
| 91 |
model=LLM_MODEL,
|
| 92 |
messages=messages,
|
| 93 |
+
temperature=0.0,
|
| 94 |
max_tokens=max_tokens,
|
| 95 |
seed=seed,
|
| 96 |
)
|
|
|
|
| 111 |
|
| 112 |
|
| 113 |
# =============================================================================
|
| 114 |
+
# System Prompt
|
| 115 |
# =============================================================================
|
| 116 |
|
|
|
|
| 117 |
|
| 118 |
+
SYSTEM_PROMPT = """You are playing a classic text adventure game. Your goal is to EXPLORE widely, COLLECT treasures and MAXIMIZE your score.
|
| 119 |
|
| 120 |
AVAILABLE TOOLS (use via MCP):
|
| 121 |
- play_action: Execute a game command (north, take lamp, open mailbox, etc.)
|
| 122 |
+
- location_log: See what actions were tried at the current location, their outcomes and the promising actions to try.
|
| 123 |
+
- memory: Get a current game state summary including current location (number of visits, actions tried, promising actions), recent actions and current observation
|
| 124 |
+
- get_map: Get a map of explored locations, connections and exits. It also helps you remember what you've tried at the current location and their outcomes, so you can avoid repeating failed actions and focus on promising ones.
|
| 125 |
+
- inventory: Have a look at what you're currently carrying.
|
| 126 |
|
| 127 |
VALID GAME COMMANDS for play_action:
|
| 128 |
+
- Movement: north, south, east, west, northeast, northwest, southeast, southwest, up, down, enter, exit
|
| 129 |
- Objects: take <item>, drop <item>, open <thing>, close <thing>, examine <thing>
|
| 130 |
+
- Light: turn on lamp, turn off lamp
|
| 131 |
+
- Combat: attack <enemy> with <weapon>
|
| 132 |
+
- Other: inventory, look, read <thing>, wait
|
| 133 |
+
- Other: look, examine, listen, speak, take, drop, empty, fill, inventory, climb, swim, open, close, set, turn, push, pull, push [direction], throw at, eat, drink, wear, take off, burn, dig, kick, destroy, read, ask for, give, feed, show, ask about, tell about, talk to, kiss, attack, wake, answer, wave, rub, squeeze, jump, jump over, wait, sleep
|
| 134 |
+
sing, yell, think, pray
|
| 135 |
+
|
| 136 |
+
FORBIDDEN (will NOT work): check, inspect, search, grab, use, help
|
| 137 |
|
| 138 |
RESPOND IN THIS EXACT FORMAT (no markdown):
|
| 139 |
THOUGHT: <your reasoning about what to do next>
|
| 140 |
TOOL: <tool_name>
|
| 141 |
ARGS: <JSON arguments, e.g., {"action": "look"}>
|
| 142 |
|
| 143 |
+
EXPLORATION STRATEGY (follow this priority):
|
| 144 |
+
1. EXPLORE a lot! Try new locations and exits frequently (north, south, east, west, northeast, northwest, southeast, southwest, up, down, enter, exit)
|
| 145 |
+
2. ALWAYS EXAMINE everything that could be interesting, especially details in objects, rooms... EXAMINE where you could find some loot or useful items, or clues for puzzles. INTERACT with characters and objects to discover new possibilities.
|
| 146 |
+
3. ALWAYS take items that seem useful (lamp, sword, key, etc.)
|
| 147 |
+
4. Open containers (mailbox, cases, doors, windows)
|
| 148 |
+
5. Try ALL exits from a location before moving on
|
| 149 |
+
6. Use get_map and location_log frequently to plan which unexplored exits to try, and what actions to take. It also helps you remember what you've tried at the current location and their outcomes, so you can avoid repeating failed actions and focus on promising ones.
|
| 150 |
+
7. Use memory to check if you're repeating yourself
|
| 151 |
+
8. If you've been in the same location for 3+ turns, MOVE to a new location
|
| 152 |
+
|
| 153 |
+
HERE IS THE STRUCTURE OF THE GAME OUTPUT you receive after each action and tool call:
|
| 154 |
+
<BEGIN GAME OUTPUT>
|
| 155 |
+
- CURRENT LOCATION: <location name>
|
| 156 |
+
- STEPS AT THIS LOCATION: <number of steps taken at this location>
|
| 157 |
+
|
| 158 |
+
- RECENT ACTIONS:
|
| 159 |
+
[<location name>] > action -> outcome
|
| 160 |
+
[<other location name>] > other action -> other outcome
|
| 161 |
+
...
|
| 162 |
+
[<other location name>] > other action -> other outcome
|
| 163 |
+
|
| 164 |
+
- CURRENT SITUATION:
|
| 165 |
+
<text describing the current location, visible objects, characters, exits, inventory, map, etc.>
|
| 166 |
+
or <map description>
|
| 167 |
+
|
| 168 |
+
- ACTIONS ALREADY TRIED AT THIS LOCATION:
|
| 169 |
+
> action -> outcome
|
| 170 |
+
> other action -> other outcome
|
| 171 |
+
|
| 172 |
+
- ACTIONS SUGGESTED: action1, action2, action3
|
| 173 |
+
<END GAME OUTPUT>
|
| 174 |
+
|
| 175 |
+
|
| 176 |
+
"CURRENT SITUATION" is the most important part of the output, it is the direct consequence of your last action and the most up-to-date description of the world. Focus on it to find new interactions, objects, exits, and details to examine.
|
| 177 |
+
"RECENT ACTIONS" is a summary of what you've done recently and their outcomes. Use it to avoid repeating failed actions and to focus on promising ones.
|
| 178 |
+
DON'T SUGGEST ACTIONS YOU'VE ALREADY TRIED AT THIS LOCATION. If there are too many ACTIONS ALREADY TRIED AT THIS LOCATION, move to another place (use look to see the exits).
|
| 179 |
+
|
| 180 |
+
|
| 181 |
+
IMPORTANT:
|
| 182 |
+
- DO NOT repeat the same action multiple times in a row
|
| 183 |
+
- If an action doesn't work, try something DIFFERENT or EXAMINE more (precisely) to find new possibilities
|
| 184 |
+
|
| 185 |
+
Examples:
|
| 186 |
+
|
| 187 |
+
THOUGHT: I need to remember what I've tried here before. Let me check the location log.
|
| 188 |
+
TOOL: location_log
|
| 189 |
+
ARGS: {}
|
| 190 |
+
|
| 191 |
+
THOUGHT: I see an interesting object. Let me examine it.
|
| 192 |
+
TOOL: play_action
|
| 193 |
+
ARGS: {"action": "examine mailbox"}
|
| 194 |
+
|
| 195 |
+
THOUGHT: I should check the map to find unexplored exits and to remember what I've tried here before.
|
| 196 |
+
TOOL: get_map
|
| 197 |
+
ARGS: {}
|
| 198 |
+
|
| 199 |
+
THOUGHT: Look around to find more details about the room and possible interactions.
|
| 200 |
TOOL: play_action
|
| 201 |
ARGS: {"action": "look"}
|
| 202 |
+
|
| 203 |
+
THOUGHT: Let me remember to try opening the trapdoor later when I have a key.
|
| 204 |
+
TOOL: record_promising_action
|
| 205 |
+
ARGS: {"action": "open trapdoor"}
|
| 206 |
"""
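
Review note: the body of `_parse_response` is not visible in this view, so the following is only a sketch of how the `THOUGHT:` / `TOOL:` / `ARGS:` format requested by `SYSTEM_PROMPT` could be parsed. The standalone function name `parse_response` and the fallback to `play_action` with `look` are assumptions made for illustration, not the behaviour of the committed code.

```python
import json
import re


def parse_response(response: str) -> tuple[str, str, dict]:
    """Extract (thought, tool_name, args) from a THOUGHT/TOOL/ARGS reply."""
    thought = re.search(r"^THOUGHT:\s*(.+)$", response, re.MULTILINE)
    tool = re.search(r"^TOOL:\s*(\w+)", response, re.MULTILINE)
    args_match = re.search(r"^ARGS:\s*(\{.*\})\s*$", response, re.MULTILINE)

    args: dict = {}
    if args_match:
        try:
            parsed = json.loads(args_match.group(1))
            if isinstance(parsed, dict):
                args = parsed
        except json.JSONDecodeError:
            args = {}  # malformed JSON: fall through to the default below

    # Assumed fallback: an unparseable reply becomes a harmless "look".
    tool_name = tool.group(1) if tool else "play_action"
    if tool_name == "play_action" and "action" not in args:
        args = {"action": "look"}

    return (thought.group(1).strip() if thought else "", tool_name, args)
```

Falling back to `look` costs one step but keeps the loop alive when the model drifts from the format; rejecting the reply and re-prompting would be an equally reasonable choice.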
|
| 207 |
|
| 208 |
+
# =============================================================================
|
| 209 |
+
# Prompt for extracting promising actions from observations
|
| 210 |
+
# =============================================================================
|
| 211 |
+
|
| 212 |
+
EXTRACT_ACTIONS_PROMPT = """You are analyzing text adventure game output. Extract promising actions the player should try.
|
| 213 |
+
|
| 214 |
+
Here is the structure of the GAME OUTPUT you receive:
|
| 215 |
+
<BEGIN GAME OUTPUT>
|
| 216 |
+
- CURRENT LOCATION: <location name>
|
| 217 |
+
- STEPS AT THIS LOCATION: <number of steps taken at this location>
|
| 218 |
+
|
| 219 |
+
- RECENT ACTIONS:
|
| 220 |
+
[<location name>] > action -> outcome
|
| 221 |
+
[<other location name>] > other action -> other outcome
|
| 222 |
+
...
|
| 223 |
+
[<other location name>] > other action -> other outcome
|
| 224 |
+
|
| 225 |
+
- CURRENT SITUATION:
|
| 226 |
+
<text describing the current location, visible objects, characters, exits, inventory, map, etc.>
|
| 227 |
+
or <map description>
|
| 228 |
+
|
| 229 |
+
- ACTIONS ALREADY TRIED AT THIS LOCATION:
|
| 230 |
+
> action -> outcome
|
| 231 |
+
> other action -> other outcome
|
| 232 |
+
|
| 233 |
+
- ACTIONS SUGGESTED: action1, action2, action3
|
| 234 |
+
<END GAME OUTPUT>
|
| 235 |
+
|
| 236 |
+
|
| 237 |
+
Given the GAME OUTPUT, output a JSON list of action strings. Focus on:
|
| 238 |
+
- Objects mentioned in CURRENT SITUATION that can be TAKEN, examined, or opened
|
| 239 |
+
- Objects or places to examine mentioned in CURRENT SITUATION that could reveal new information or items
|
| 240 |
+
- Directions/exits mentioned in CURRENT SITUATION
|
| 241 |
+
- Interactive elements in CURRENT SITUATION (doors, containers, levers, buttons). Suggest interacting with them to discover new possibilities.
|
| 242 |
+
- Items that might be useful in CURRENT SITUATION
|
| 243 |
+
- Exploration if there is no interesting object to interact with mentioned in CURRENT SITUATION
|
| 244 |
+
|
| 245 |
+
Follow these additional guidelines:
|
| 246 |
+
- "CURRENT SITUATION" is the most important part of the output, it is the direct consequence of your last action and the most up-to-date description of the world. Focus on it to find new interactions, objects, exits, and details to examine.
|
| 247 |
+
- "RECENT ACTIONS" is a summary of what you've done recently and their outcomes. Use it to avoid repeating failed actions and to focus on promising ones.
|
| 248 |
+
- DON'T SUGGEST ACTIONS YOU'VE ALREADY TRIED AT THIS LOCATION. If there are too many ACTIONS ALREADY TRIED AT THIS LOCATION, move to another place (use look to see the exits).
|
| 249 |
+
- ACTIONS SUGGESTED are additionally useful, but make sure to focus on the CURRENT SITUATION and RECENT ACTIONS to find promising actions that are relevant to the current context.
|
| 250 |
+
|
| 251 |
+
IMPORTANT: If there is a warning 'WARNING', 'EXPLORATION HINT' or 'URGENT' in the GAME OUTPUT, prioritize suggesting actions that address those warnings.
|
| 252 |
+
|
| 253 |
+
VALID COMMANDS include:
|
| 254 |
+
- Movement: north, south, east, west, northeast, northwest, southeast, southwest, up, down, enter, exit
|
| 255 |
+
- Objects: take <item>, drop <item>, open <thing>, close <thing>, examine <thing>
|
| 256 |
+
- Light: turn on lamp, turn off lamp
|
| 257 |
+
- Combat: attack <enemy> with <weapon>
|
| 258 |
+
- Other: inventory, look, read <thing>, wait
|
| 259 |
+
- Other: look, examine, listen, speak, take, drop, empty, fill, inventory, climb, swim, open, close, set, turn, push, pull, push [direction], throw at, eat, drink, wear, take off, burn, dig, kick, destroy, read, ask for, give, feed, show, ask about, tell about, talk to, kiss, attack, wake, answer, wave, rub, squeeze, jump, jump over, wait, sleep
|
| 260 |
+
sing, yell, think, pray
|
| 261 |
+
KEEP VALID COMMANDS SIMPLE (e.g., "examine picture" instead of "examine picture on east wall").
|
| 262 |
+
SUGGEST look when you need more information.
|
| 263 |
+
|
| 264 |
+
Output ONLY a JSON list, no explanation. Example: ["examine table", "take key", "open door", "north"]
|
| 265 |
+
If nothing stands out, output: []"""
|
| 266 |
+
|
| 267 |
+
|
| 268 |
+
EXTRACT_ACTIONS_PROMPT_EXIT = """You are analyzing text adventure game output. Extract promising actions or directions the player should try.
|
| 269 |
+
|
| 270 |
+
Here is the structure of the GAME OUTPUT you receive:
|
| 271 |
+
<BEGIN GAME OUTPUT>
|
| 272 |
+
- CURRENT LOCATION: <location name>
|
| 273 |
+
- STEPS AT THIS LOCATION: <number of steps taken at this location>
|
| 274 |
+
|
| 275 |
+
- RECENT ACTIONS:
|
| 276 |
+
[<location name>] > action -> outcome
|
| 277 |
+
[<other location name>] > other action -> other outcome
|
| 278 |
+
...
|
| 279 |
+
[<other location name>] > other action -> other outcome
|
| 280 |
+
|
| 281 |
+
- CURRENT SITUATION:
|
| 282 |
+
<text describing the current location, visible objects, characters, exits, inventory, map, etc.>
|
| 283 |
+
or <map description>
|
| 284 |
+
|
| 285 |
+
- ACTIONS ALREADY TRIED AT THIS LOCATION:
|
| 286 |
+
> action -> outcome
|
| 287 |
+
> other action -> other outcome
|
| 288 |
+
|
| 289 |
+
<END GAME OUTPUT>
|
| 290 |
+
|
| 291 |
+
GUIDELINES:
|
| 292 |
+
The player needs to move to a different location. TRY TO DISCOVER NEW PLACES AND EXITS TO EXPLORE (look at RECENT ACTIONS to avoid going in the same direction again).
|
| 293 |
+
If no exits or directions are mentioned in the CURRENT SITUATION, suggest: look, get_map.
|
| 294 |
+
Otherwise, suggest exits and directions mentioned in the CURRENT SITUATION among the valid commands: north, south, east, west, northeast, northwest, southeast, southwest.
|
| 295 |
+
|
| 296 |
+
Output ONLY a JSON list, no explanation. Example: ["north", "look", "southwest", "east"]
|
| 297 |
+
If nothing stands out, output: []"""
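
Review note: both extraction prompts ask for "ONLY a JSON list", but LLM replies sometimes wrap the list in extra text. The sketch below shows one defensive way to read such a reply; `parse_action_list`, its `already_tried` argument (expected in lowercase) and the `limit` of 5 are hypothetical and not part of the committed code.

```python
import json
import re


def parse_action_list(raw: str, already_tried: set[str], limit: int = 5) -> list[str]:
    """Pull a JSON list of action strings out of an LLM reply.

    Returns lowercase, de-duplicated actions not yet tried at the location.
    """
    match = re.search(r"\[.*\]", raw, re.DOTALL)  # first '[' to last ']'
    if not match:
        return []
    try:
        items = json.loads(match.group(0))
    except json.JSONDecodeError:
        return []
    if not isinstance(items, list):
        return []

    seen: set[str] = set()
    result: list[str] = []
    for item in items:
        if not isinstance(item, str):
            continue  # ignore numbers, nested lists, etc.
        action = item.strip().lower()
        if action and action not in already_tried and action not in seen:
            seen.add(action)
            result.append(action)
    return result[:limit]
```

Filtering against the per-location log here duplicates the "DON'T SUGGEST ACTIONS YOU'VE ALREADY TRIED" instruction in the prompt, which is deliberate: the prompt is a request, the filter is a guarantee.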
|
| 298 |
+
|
| 299 |
|
| 300 |
# =============================================================================
|
| 301 |
+
# Student Agent
|
| 302 |
# =============================================================================
|
| 303 |
|
| 304 |
+
MVMT_COMMANDS = {"look", "north", "south", "east", "west", "up", "down", "northeast", "northwest", "southeast", "southwest"}
|
| 305 |
+
|
| 306 |
class StudentAgent:
|
| 307 |
"""
|
| 308 |
+
ReAct agent with enhanced exploration and location-aware reasoning.
|
| 309 |
"""
|
| 310 |
|
| 311 |
def __init__(self):
|
| 312 |
"""Initialize your agent here."""
|
| 313 |
+
self.history_agent: list[dict] = [] # global history of actions/tool calls taken by the agent and their outcomes
|
| 314 |
+
self.history_location: dict[str, list[dict]] = {} # location -> history of actions that are not directions and outcomes at that location
|
| 315 |
+
self.remaining_directions: dict[str, set[str]] = {} # location -> unexplored directions
|
| 316 |
+
self.recent_actions: list[str] = [] # track recent actions for loop detection
|
| 317 |
+
self.score: int = 0
|
| 318 |
+
self.previous_location: str = "" # track previous location to detect movement
|
| 319 |
+
self.current_location: str = "" # track current location
|
| 320 |
+
self.steps_at_current_location: int = 0 # track how many steps we've been at the current location to encourage exploration
|
| 321 |
+
self.visited_locations: dict[str, int] = {} # location -> visit count
|
| 322 |
+
self.promising_actions: list[str] = [] # promising actions extracted from observation at new locations
|
| 323 |
+
self.is_new_location: bool = False # flag to indicate if the last observation was a new location
|
| 324 |
+
|
| 325 |
|
| 326 |
async def run(
|
| 327 |
self,
|
| 328 |
+
client,
|
| 329 |
game: str,
|
| 330 |
max_steps: int,
|
| 331 |
seed: int,
|
|
|
|
| 333 |
) -> RunResult:
|
| 334 |
"""
|
| 335 |
Run the agent for a game session.
|
| 336 |
"""
|
| 337 |
locations_visited = set()
|
| 338 |
history = []
|
|
|
|
| 339 |
moves = 0
|
| 340 |
|
| 341 |
+
# Get list of available tools
|
| 342 |
+
tools = await client.list_tools()
|
| 343 |
+
tool_names = [t.name for t in tools]
|
| 344 |
+
|
| 345 |
+
# Get initial observation
|
| 346 |
+
result = await client.call_tool("play_action", {"action": "look"})
|
| 347 |
+
observation, location, is_new_location = self._extract_result(result)
|
| 348 |
+
|
| 349 |
+
# Track location (for counting unique locations visited, not necessarily the same as in-game location name)
|
| 350 |
+
dummy_location = observation.split("\n")[0] if observation else "Unknown"
|
| 351 |
+
locations_visited.add(dummy_location)
|
| 352 |
+
|
| 353 |
+
# Track location (location = in-game location name = the name of the room or area we're currently in, extracted from the observation)
|
| 354 |
+
self.current_location = location
|
| 355 |
+
self.previous_location = location
|
| 356 |
+
self.visited_locations[location] = 1
|
| 357 |
+
self.remaining_directions[location] = set(["north", "south", "east", "west", "northeast", "northwest", "southeast", "southwest"])
|
| 358 |
+
|
| 359 |
+
if verbose:
|
| 360 |
+
print(f"\n{observation}")
|
| 361 |
+
|
| 362 |
+
# Extract promising actions from initial observation
|
| 363 |
+
self.promising_actions = self._extract_promising_actions(observation, seed, EXTRACT_ACTIONS_PROMPT)
|
| 364 |
+
if self.promising_actions and verbose:
|
| 365 |
+
print(f"[PROMISING] {self.promising_actions}")
|
| 366 |
+
|
| 367 |
+
# Main ReAct loop
|
| 368 |
+
for step in range(1, max_steps + 1):
|
| 369 |
+
|
| 370 |
+
# Build prompt with context
|
| 371 |
+
prompt = self._build_prompt(observation, seed + step)
|
| 372 |
+
|
| 373 |
+
# Call LLM for reasoning
|
| 374 |
+
response = call_llm(prompt, SYSTEM_PROMPT, seed + step)
|
| 375 |
+
|
| 376 |
+
# Parse the response
|
| 377 |
+
thought, tool_name, tool_args = self._parse_response(response)
|
| 378 |
+
|
| 379 |
+
if verbose:
|
| 380 |
+
print(f"\n--- Step {step} ---")
|
| 381 |
+
print(f"[THOUGHT] {thought}")
|
| 382 |
+
print(f"[TOOL] {tool_name}({tool_args})")
|
| 383 |
+
|
| 384 |
+
# Validate and fix common issues
|
| 385 |
+
tool_name, tool_args = self._validate_tool_call(tool_name, tool_args, tool_names)
|
| 386 |
+
|
| 387 |
+
# Loop detection for play_action
|
| 388 |
+
if tool_name == "play_action":
|
| 389 |
+
action = tool_args.get("action", "look")
|
| 390 |
+
|
| 391 |
+
self.recent_actions.append(action)
|
| 392 |
+
if len(self.recent_actions) > 7:
|
| 393 |
+
self.recent_actions = self.recent_actions[-7:]
|
| 394 |
+
|
| 395 |
+
# Detect loops - if same action 3 times, force exploration
|
| 396 |
+
if len(self.recent_actions) >= 3 and len(set(self.recent_actions[-3:])) == 1:
|
| 397 |
+
if verbose:
|
| 398 |
+
print("[WARNING] Loop detected - forcing exploration")
|
| 399 |
+
# Try to move somewhere new
|
| 400 |
+
tool_name, tool_args = self._break_loop(tool_names)
|
| 401 |
+
self.recent_actions.append(tool_args.get("action", "look"))
|
| 402 |
+
|
| 403 |
+
# If stuck at same location too long, add exploration pressure
|
| 404 |
+
if self.steps_at_current_location >= 5 and tool_name == "play_action":
|
| 405 |
+
action = tool_args.get("action", "")
|
| 406 |
+
if action not in MVMT_COMMANDS:
|
| 407 |
+
if verbose:
|
| 408 |
+
print(f"[EXPLORATION BIAS] Been here {self.steps_at_current_location} steps, forcing movement")
|
| 409 |
+
|
| 410 |
+
moves += 1
|
| 411 |
+
|
| 412 |
+
# Execute the tool
|
| 413 |
+
try:
|
| 414 |
+
result = await client.call_tool(tool_name, tool_args)
|
| 415 |
+
observation, new_location, is_new_location = self._extract_result(result)
|
| 416 |
+
self.is_new_location = is_new_location
|
| 417 |
+
|
| 418 |
+
if verbose:
|
| 419 |
+
print(f"[RESULT] {observation[:200]}...")
|
| 420 |
+
|
| 421 |
+
except Exception as e:
|
| 422 |
+
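observation = f"Error: {e}"
# Keep location state defined when the tool call fails
new_location, is_new_location = self.current_location, False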
|
| 423 |
+
if verbose:
|
| 424 |
+
print(f"[ERROR] {e}")
|
| 425 |
+
|
| 426 |
+
# Detect location changes
|
| 427 |
+
self.previous_location = self.current_location
|
| 428 |
+
self.current_location = new_location
|
| 429 |
+
if is_new_location:
|
| 430 |
+
self.steps_at_current_location = 0
|
| 431 |
+
|
| 432 |
+
# Extract promising actions from new location
|
| 433 |
+
self.promising_actions = self._extract_promising_actions(observation, seed + step, EXTRACT_ACTIONS_PROMPT)
|
| 434 |
+
if self.promising_actions and verbose:
|
| 435 |
+
print(f"[PROMISING at new location] {self.promising_actions}")
|
| 436 |
+
|
| 437 |
+
else:
|
| 438 |
+
self.steps_at_current_location += 1
|
| 439 |
+
self.promising_actions = [] # Clear promising actions if we haven't moved
|
| 440 |
+
|
| 441 |
+
# Track number of visits to this location
|
| 442 |
+
if self._has_moved():
|
| 443 |
+
self.visited_locations[self.current_location] = self.visited_locations.get(self.current_location, 0) + 1
|
| 444 |
+
self.steps_at_current_location = 0
|
| 445 |
+
|
| 446 |
+
# Track location (for counting unique locations visited, not necessarily the same as in-game location name)
|
| 447 |
+
dummy_location = observation.split("\n")[0] if observation else "Unknown"
|
| 448 |
+
locations_visited.add(dummy_location)
|
| 449 |
+
|
| 450 |
+
# Update history of actions/directions and outcomes at that location
|
| 451 |
+
# Keep this general history not too long
|
| 452 |
+
self.history_agent.append({
|
| 453 |
+
"step": step,
|
| 454 |
+
"thought": thought,
|
| 455 |
+
"tool": tool_name,
|
| 456 |
+
"args": tool_args,
|
| 457 |
+
"result": observation[:200],
|
| 458 |
+
"location": self.current_location,
|
| 459 |
+
})
|
| 460 |
+
if len(self.history_agent) > 15:
|
| 461 |
+
self.history_agent = self.history_agent[-15:]
|
| 462 |
+
|
| 463 |
+
if self.current_location not in self.history_location:
|
| 464 |
+
self.history_location[self.current_location] = []
|
| 465 |
+
|
| 466 |
+
# Update remaining directions for this location if it's new
|
| 467 |
+
if self.current_location not in self.remaining_directions:
|
| 468 |
+
self.remaining_directions[self.current_location] = set(["north", "south", "east", "west", "northeast", "northwest", "southeast", "southwest"])
|
| 469 |
+
|
| 470 |
+
# Update history of non-movement actions at this location (to help the LLM learn from what worked and what didn't at this location).
|
| 471 |
+
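# `action` is only assigned above for play_action calls; define it for every tool to avoid an unbound name
action = tool_args.get("action", "") if tool_name == "play_action" else tool_name
if action not in MVMT_COMMANDS: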
|
| 472 |
+
self.history_location[self.current_location].append({
|
| 473 |
+
"step": step,
|
| 474 |
+
"thought": thought,
|
| 475 |
+
"tool": tool_name,
|
| 476 |
+
"args": tool_args,
|
| 477 |
+
"result": observation,
|
| 478 |
+
})
|
| 479 |
+
else:
|
| 480 |
+
# If it's a movement action, remove it from remaining directions for this location
|
| 481 |
+
if action in self.remaining_directions[self.current_location]:
|
| 482 |
+
self.remaining_directions[self.current_location].remove(action)
|
| 483 |
+
|
| 484 |
+
# Track score from observation
|
| 485 |
+
self._update_score(observation)
|
| 486 |
+
|
| 487 |
+
# Record in result history (for final output)
|
| 488 |
+
history.append((thought, f"{tool_name}({tool_args})", observation[:100]))
|
| 489 |
+
|
| 490 |
+
# Check for game over
|
| 491 |
+
if self._is_game_over(observation):
|
| 492 |
+
if verbose:
|
| 493 |
+
print("\n*** GAME OVER ***")
|
| 494 |
+
break
|
| 495 |
|
| 496 |
return RunResult(
|
| 497 |
+
final_score=self.score,
|
| 498 |
+
max_score=350,
|
| 499 |
moves=moves,
|
| 500 |
locations_visited=locations_visited,
|
| 501 |
+
game_completed=self._is_game_over(observation),
|
| 502 |
history=history,
|
| 503 |
)
|
| 504 |
+
|
| 505 |
+
def _has_moved(self) -> bool:
|
| 506 |
+
"""Check if the player has moved to a new location."""
|
| 507 |
+
return self.current_location != self.previous_location
|
| 508 |
+
|
| 509 |
+
def _parse_location_from_observation(self, observation: str) -> tuple[str, bool]:
|
| 510 |
+
"""Extract location name from observation text.
|
| 511 |
+
Also return whether it is a new location, based on tags in the observation."""
|
| 512 |
+
is_new_location = False
|
| 513 |
+
if not observation:
|
| 514 |
+
return "Unknown", False
|
| 515 |
+
first_line = observation.split("\n")[0].strip()
|
| 516 |
+
# If the first line begins with "[NEW LOCATION:", is_new_location = True
|
| 517 |
+
if first_line.startswith("[NEW LOCATION:"):
|
| 518 |
+
is_new_location = True
|
| 519 |
+
# Extract location from "[NEW/CURRENT LOCATION: location name]" if present
|
| 520 |
+
match = re.search(r'\[(?:NEW|CURRENT) LOCATION: (.+?)\]', first_line)
|
| 521 |
+
|
| 522 |
+
if match:
|
| 523 |
+
return match.group(1).strip(), is_new_location
|
| 524 |
+
else:
|
| 525 |
+
print(f"[ERROR] Could not parse location from observation. Defaulting to first line as location. Observation: \n{observation[:100]}...")
|
| 526 |
+
# Otherwise, return the first line as location
|
| 527 |
+
return first_line, is_new_location
|
| 528 |
+
|
| 529 |
+
def _parse_observation_wo_score(self, observation: str) -> str:
|
| 530 |
+
"""Remove score information from observation to avoid confusion."""
|
| 531 |
+
if not observation:
|
| 532 |
+
return ""
|
| 533 |
+
return observation.split("[Score:")[0].strip()
|
| 534 |
+
|
| 535 |
+
def _extract_promising_actions(self, observation: str, seed: int, prompt: str) -> list[str]:
|
| 536 |
"""
|
| 537 |
+
Use the LLM to extract promising actions from an observation.
|
| 538 |
+
Returns a list of action strings worth trying.
|
| 539 |
+
"""
|
| 540 |
+
try:
|
| 541 |
+
response = call_llm(
|
| 542 |
+
prompt=f"{observation}",
|
| 543 |
+
system_prompt=prompt,
|
| 544 |
+
seed=seed,
|
| 545 |
+
max_tokens=150,
|
| 546 |
+
)
|
| 547 |
+
# Try to parse JSON list from response
|
| 548 |
+
# Find the JSON array in the response
|
| 549 |
+
match = re.search(r'\[.*?\]', response, re.DOTALL)
|
| 550 |
+
if match:
|
| 551 |
+
actions = json.loads(match.group(0))
|
| 552 |
+
if isinstance(actions, list):
|
| 553 |
+
return [str(a) for a in actions if isinstance(a, str)]
|
| 554 |
+
except Exception:
|
| 555 |
+
pass
|
| 556 |
+
return []
|
| 557 |
+
|
| 558 |
+
def _break_loop(self, tool_names: list[str]) -> tuple[str, dict]:
|
| 559 |
+
"""Break out of a loop by choosing an unexplored action."""
|
| 560 |
+
# Try movement directions we haven't tried recently
|
| 561 |
+
directions = ["north", "south", "east", "west", "up", "down",
|
| 562 |
+
"northeast", "northwest", "southeast", "southwest"]
|
| 563 |
+
recent_set = set(self.recent_actions[-5:]) if self.recent_actions else set()
|
| 564 |
+
|
| 565 |
+
for d in directions:
|
| 566 |
+
if d not in recent_set:
|
| 567 |
+
return "play_action", {"action": d}
|
| 568 |
|
| 569 |
+
# If all directions were tried recently, fall back to the map or a plain look
|
| 570 |
+
if "get_map" in tool_names:
|
| 571 |
+
return "get_map", {}
|
| 572 |
+
|
| 573 |
+
return "play_action", {"action": "look"}
|
| 574 |
+
|
| 575 |
+
def _force_movement(self) -> tuple[str, dict]:
|
| 576 |
+
"""Force a movement action when stuck too long at a location."""
|
| 577 |
+
directions = ["north", "south", "east", "west", "up", "down",
|
| 578 |
+
"enter", "northeast", "northwest", "southeast", "southwest"]
|
| 579 |
+
recent_set = set(self.recent_actions[-5:]) if self.recent_actions else set()
|
| 580 |
+
|
| 581 |
+
for d in directions:
|
| 582 |
+
if d not in recent_set:
|
| 583 |
+
return "play_action", {"action": d}
|
| 584 |
+
|
| 585 |
+
# Fallback: just try north
|
| 586 |
+
return "play_action", {"action": "north"}
|
| 587 |
+
|
| 588 |
+
def _build_prompt(self, observation: str, seed: int = 0) -> str:
|
| 589 |
"""
|
| 590 |
+
Build the prompt for the LLM with rich context.
|
| 591 |
+
"""
|
| 592 |
+
parts = []
|
| 593 |
+
|
| 594 |
+
parts.append(f"- CURRENT LOCATION: {self.current_location}")
|
| 595 |
+
parts.append(f"- STEPS AT THIS LOCATION: {self.steps_at_current_location}")
|
| 596 |
+
|
| 597 |
+
# Recent history
|
| 598 |
+
if self.history_agent:
|
| 599 |
+
parts.append("\n- RECENT ACTIONS:")
|
| 600 |
+
for entry in self.history_agent[-5:]:
|
| 601 |
+
loc = entry.get("location", "?")
|
| 602 |
+
action = entry.get("args", {}).get("action", entry["tool"])
|
| 603 |
+
result = entry.get("result", "")
|
| 604 |
+
result = self._parse_observation_wo_score(result)
|
| 605 |
+
# replace newlines in result with spaces for better readability
|
| 606 |
+
result = result.replace("\n", " ")
|
| 607 |
+
result_short = result[:80] + "..." if len(result) > 80 else result
|
| 608 |
+
parts.append(f" [{loc}] > {action} -> {result_short}")
|
| 609 |
+
|
| 610 |
+
# Warn about repeated actions
|
| 611 |
+
if self.recent_actions and len(self.recent_actions) >= 4 and len(set(self.recent_actions[-3:])) == 1:
|
| 612 |
+
parts.append(f"\n[WARNING: You've been doing '{self.recent_actions[-1]}' repeatedly. TRY SOMETHING COMPLETELY DIFFERENT!]")
|
| 613 |
+
|
| 614 |
+
# Exploration pressure
|
| 615 |
+
if self.steps_at_current_location >= 4:
|
| 616 |
+
parts.append(f"\n[EXPLORATION HINT: You have been at '{self.current_location}' for {self.steps_at_current_location} steps. Consider moving to a NEW location soon! Use 'look' to find exits of the room, or 'get_map' to see the discovered map.]")
|
| 617 |
+
if self.steps_at_current_location >= 5:
|
| 618 |
+
parts.append(f"\n[URGENT: You MUST move to a different location NOW. Pick a direction and go.]")
|
| 619 |
+
|
| 620 |
+
parts.append(f"\n- CURRENT SITUATION:\n{observation}")
|
| 621 |
+
|
| 622 |
+
# Actions already tried at this location (to avoid repetition and encourage trying new things)
|
| 623 |
+
revisited = self.visited_locations.get(self.current_location, 0) > 1
|
| 624 |
+
location_history = self.history_location.get(self.current_location, [])
|
| 625 |
+
if revisited:
|
| 626 |
+
parts.append(f"\n- ACTIONS ALREADY TRIED AT THIS LOCATION ({self.current_location}):")
|
| 627 |
+
for entry in location_history[-20:]:
|
| 628 |
+
action = entry.get("args", {}).get("action", entry["tool"])
|
| 629 |
+
result = entry.get("result", "")
|
| 630 |
+
result = self._parse_observation_wo_score(result)
|
| 631 |
+
result = result.replace("\n", " ")
|
| 632 |
+
result_short = result[:100] + "..." if len(result) > 100 else result
|
| 633 |
+
parts.append(f" > {action} -> {result_short}")
|
| 634 |
+
|
| 635 |
+
# # Show remaining unexplored directions for current location
|
| 636 |
+
# if self.current_location in self.remaining_directions and (self.visited_locations.get(self.current_location, 0) >= 5 or self.steps_at_current_location >= 5):
|
| 637 |
+
# # remaining should be a list
|
| 638 |
+
# remaining = list(self.remaining_directions[self.current_location])
|
| 639 |
+
# if remaining:
|
| 640 |
+
# parts.append(f"\n- REMAINING UNEXPLORED DIRECTIONS AT THIS LOCATION: {', '.join(remaining)}")
|
| 641 |
+
|
| 642 |
+
# Actions suggested by the LLM
|
| 643 |
+
if self.promising_actions:
|
| 644 |
+
parts.append(f"\n- ACTIONS SUGGESTED AT NEW LOCATION: {', '.join(self.promising_actions)}")
|
| 645 |
+
else:
|
| 646 |
+
prompt = EXTRACT_ACTIONS_PROMPT
|
| 647 |
+
if self.steps_at_current_location >= 5:
|
| 648 |
+
prompt = EXTRACT_ACTIONS_PROMPT_EXIT
|
| 649 |
+
promising_actions = self._extract_promising_actions("\n".join(parts), seed=seed, prompt=prompt)
|
| 650 |
+
if len(location_history) >= 7 or self.visited_locations.get(self.current_location, 0) >= 7:
|
| 651 |
+
# If we've been here a lot, prioritize exit directions
|
| 652 |
+
directions = ['look', 'get_map', 'north', 'south', 'east', 'west', 'northeast', 'northwest', 'southeast', 'southwest', 'up', 'down', 'enter', 'exit']
|
| 653 |
+
# Take 4 random elements from directions to build promising_actions
|
| 654 |
+
promising_actions = np.random.choice(directions, size=min(4, len(directions)), replace=False).tolist()
|
| 655 |
+
if promising_actions:
|
| 656 |
+
parts.append(f"\n- ACTIONS SUGGESTED: {', '.join(promising_actions)}")
|
| 657 |
+
|
| 658 |
+
parts.append("\nWhat do you do next?")
|
| 659 |
+
|
| 660 |
+
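# Join first: a backslash inside an f-string expression is a SyntaxError before Python 3.12
debug_prompt = "\n".join(parts)
print(f"\n################### [START DEBUG] PROMPT RICH IN CONTEXT PASSED TO THE AGENT ###################\n{debug_prompt}\n[################### [END DEBUG] PROMPT RICH IN CONTEXT PASSED TO THE AGENT ###################]")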
|
| 661 |
+
|
| 662 |
+
return "\n".join(parts)
|
| 663 |
+
|
| 664 |
def _parse_response(self, response: str) -> tuple[str, str, dict]:
|
| 665 |
"""
|
| 666 |
Parse LLM response to extract thought, tool name, and arguments.
|
| 667 |
+
"""
|
| 668 |
+
thought = "No reasoning provided"
|
| 669 |
+
tool_name = "play_action"
|
| 670 |
+
tool_args = {"action": "look"}
|
| 671 |
|
| 672 |
+
lines = response.strip().split("\n")
|
| 673 |
|
| 674 |
+
for line in lines:
|
| 675 |
+
line_clean = line.strip()
|
| 676 |
+
line_upper = line_clean.upper()
|
| 677 |
+
|
| 678 |
+
if line_upper.startswith("THOUGHT:"):
|
| 679 |
+
thought = line_clean.split(":", 1)[1].strip()
|
| 680 |
+
|
| 681 |
+
elif line_upper.startswith("TOOL:"):
|
| 682 |
+
raw_tool = line_clean.split(":", 1)[1].strip().lower()
|
| 683 |
+
raw_tool = raw_tool.replace("**", "").replace("*", "").replace("`", "")
|
| 684 |
+
tool_name = raw_tool.strip()
|
| 685 |
+
|
| 686 |
+
elif line_upper.startswith("ARGS:"):
|
| 687 |
+
raw_args = line_clean.split(":", 1)[1].strip()
|
| 688 |
+
raw_args = raw_args.replace("**", "").replace("*", "").replace("`", "")
|
| 689 |
+
try:
|
| 690 |
+
parsed = json.loads(raw_args)
|
| 691 |
+
if isinstance(parsed, dict):
|
| 692 |
+
tool_args = parsed
|
| 693 |
+
except json.JSONDecodeError:
|
| 694 |
+
# Try to extract action from malformed JSON
|
| 695 |
+
match = re.search(r'"action"\s*:\s*"([^"]+)"', raw_args)
|
| 696 |
+
if match:
|
| 697 |
+
tool_args = {"action": match.group(1)}
|
| 698 |
+
else:
|
| 699 |
+
# Try bare string
|
| 700 |
+
clean = raw_args.strip().strip('"').strip("'")
|
| 701 |
+
if clean:
|
| 702 |
+
tool_args = {"action": clean}
|
| 703 |
+
|
| 704 |
+
return thought, tool_name, tool_args
|
| 705 |
+
|
| 706 |
+
def _validate_tool_call(self, tool_name: str, tool_args: dict, valid_tools: list[str]) -> tuple[str, dict]:
|
| 707 |
+
"""Validate and fix common tool call issues."""
|
| 708 |
+
# Fix tool name
|
| 709 |
+
if tool_name not in valid_tools:
|
| 710 |
+
if tool_name in ["action", "do", "command", "play"]:
|
| 711 |
+
tool_name = "play_action"
|
| 712 |
+
elif tool_name in ["map", "location", "locations"]:
|
| 713 |
+
tool_name = "get_map"
|
| 714 |
+
elif tool_name in ["mem", "state", "status", "history"]:
|
| 715 |
+
tool_name = "memory"
|
| 716 |
+
elif tool_name in ["inv", "items", "carrying"]:
|
| 717 |
+
tool_name = "inventory"
|
| 718 |
+
elif tool_name in ["valid", "valid_actions", "actions", "possible_actions"]:
|
| 719 |
+
tool_name = "get_valid_actions"
|
| 720 |
+
elif tool_name in ["log", "loc_log", "location_history"]:
|
| 721 |
+
tool_name = "location_log"
|
| 722 |
+
elif tool_name in ["record", "remember", "save_action", "promising"]:
|
| 723 |
+
tool_name = "record_promising_action"
|
| 724 |
+
else:
|
| 725 |
+
tool_name = "play_action"
|
| 726 |
+
|
| 727 |
+
# Fix action verbs
|
| 728 |
+
if tool_name == "play_action":
|
| 729 |
+
action = tool_args.get("action", "look")
|
| 730 |
+
|
| 731 |
+
invalid_verb_map = {
|
| 732 |
+
"check": "examine",
|
| 733 |
+
"inspect": "examine",
|
| 734 |
+
"search": "look",
|
| 735 |
+
"grab": "take",
|
| 736 |
+
"pick": "take",
|
| 737 |
+
"pick up": "take",
|
| 738 |
+
"get": "take",
|
| 739 |
+
"collect": "take",
|
| 740 |
+
"use": "turn on",
|
| 741 |
+
"switch on": "turn on",
|
| 742 |
+
"go north": "north",
|
| 743 |
+
"go south": "south",
|
| 744 |
+
"go east": "east",
|
| 745 |
+
"go west": "west",
|
| 746 |
+
"go up": "up",
|
| 747 |
+
"go down": "down",
|
| 748 |
+
"move north": "north",
|
| 749 |
+
"move south": "south",
|
| 750 |
+
"move east": "east",
|
| 751 |
+
"move west": "west",
|
| 752 |
+
}
|
| 753 |
+
|
| 754 |
+
action_lower = action.lower().strip()
|
| 755 |
+
if action_lower in invalid_verb_map:
|
| 756 |
+
action = invalid_verb_map[action_lower]
|
| 757 |
+
else:
|
| 758 |
+
# Check if action starts with an invalid verb
|
| 759 |
+
for invalid, valid in invalid_verb_map.items():
|
| 760 |
+
if action_lower.startswith(invalid + " "):
|
| 761 |
+
remainder = action_lower[len(invalid):].strip()
|
| 762 |
+
action = f"{valid} {remainder}"
|
| 763 |
+
break
|
| 764 |
+
|
| 765 |
+
tool_args["action"] = action
|
| 766 |
+
|
| 767 |
+
return tool_name, tool_args
|
| 768 |
|
| 769 |
+
def _extract_result(self, result) -> tuple[str, str, bool]:
|
| 770 |
+
"""Extract observation, location, and boolean indicating if it's a new location from MCP tool result."""
|
| 771 |
+
if hasattr(result, 'content') and result.content:
|
| 772 |
+
obs = result.content[0].text
|
| 773 |
+
elif isinstance(result, list) and result:
|
| 774 |
+
obs = result[0].text if hasattr(result[0], 'text') else str(result[0])
|
| 775 |
+
else:
|
| 776 |
+
obs = str(result)
|
| 777 |
+
location, is_new_location = self._parse_location_from_observation(obs)
|
| 778 |
|
| 779 |
+
# obs without the first line
|
| 780 |
+
obs_without_first_line = "\n".join(obs.split("\n")[1:]).strip() if "\n" in obs else obs
|
| 781 |
+
|
| 782 |
+
return obs_without_first_line, location, is_new_location
|
| 783 |
+
|
| 784 |
+
|
| 785 |
+
def _update_score(self, text: str) -> None:
|
| 786 |
+
"""Update score from game text."""
|
| 787 |
+
patterns = [
|
| 788 |
+
r'Score:\s*(\d+)',
|
| 789 |
+
r'score[:\s]+(\d+)',
|
| 790 |
+
r'\[Score:\s*(\d+)',
|
| 791 |
+
r'Total:\s*(\d+)',
|
| 792 |
+
]
|
| 793 |
+
|
| 794 |
+
for pattern in patterns:
|
| 795 |
+
match = re.search(pattern, text, re.IGNORECASE)
|
| 796 |
+
if match:
|
| 797 |
+
self.score = max(self.score, int(match.group(1)))
|
| 798 |
+
|
| 799 |
+
def _is_game_over(self, text: str) -> bool:
|
| 800 |
+
"""Check if the game is over."""
|
| 801 |
+
game_over_phrases = [
|
| 802 |
+
"game over",
|
| 803 |
+
"you have died",
|
| 804 |
+
"you are dead",
|
| 805 |
+
"*** you have died ***",
|
| 806 |
+
"*** you have won ***",
|
| 807 |
+
]
|
| 808 |
+
text_lower = text.lower()
|
| 809 |
+
return any(phrase in text_lower for phrase in game_over_phrases)
|
| 810 |
+
|
| 811 |
+
def _call_llm(self, prompt: str, system_prompt: str, seed: int) -> str:
|
| 812 |
+
"""Convenience wrapper for call_llm()."""
|
| 813 |
return call_llm(prompt, system_prompt, seed)
|
| 814 |
|
| 815 |
|
|
|
|
| 821 |
"""Test the agent locally."""
|
| 822 |
from fastmcp import Client
|
| 823 |
|
|
|
|
| 824 |
server_path = "mcp_server.py"
|
| 825 |
|
| 826 |
agent = StudentAgent()
|
|
|
|
| 841 |
|
| 842 |
if __name__ == "__main__":
|
| 843 |
import asyncio
|
| 844 |
+
asyncio.run(test_agent())
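The loop detection in `run` keeps a short window of recent actions (seven in the code above) and triggers when the last three are identical. The same idea is sketched below as a standalone class under those assumptions. The names `LoopDetector`, `window` and `repeat` are hypothetical and do not appear in the submission.

```python
from collections import deque


class LoopDetector:
    """Flag a loop when the last `repeat` recorded actions are all the same."""

    def __init__(self, window: int = 7, repeat: int = 3):
        self.repeat = repeat
        # deque(maxlen=...) discards the oldest entry automatically,
        # replacing the manual slicing of a growing list.
        self.recent: deque[str] = deque(maxlen=window)

    def record(self, action: str) -> bool:
        """Store `action` and report whether a loop is now detected."""
        self.recent.append(action)
        if len(self.recent) < self.repeat:
            return False
        tail = list(self.recent)[-self.repeat:]
        return len(set(tail)) == 1
```

A caller would invoke `record` once per `play_action` call and swap in a fallback action when it returns `True`. Alternating patterns such as north/south/north/south are not caught by this check, which is also true of the original condition.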
|
mcp_server.py
CHANGED
|
@@ -26,6 +26,7 @@ Then open the MCP Inspector in your browser to test the tools interactively.
|
|
| 26 |
|
| 27 |
import sys
|
| 28 |
import os
|
|
|
|
| 29 |
|
| 30 |
# Add parent directory to path to import games module
|
| 31 |
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
|
|
@@ -33,6 +34,9 @@ sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
|
|
| 33 |
from fastmcp import FastMCP
|
| 34 |
from games.zork_env import TextAdventureEnv
|
| 35 |
|
|
|
|
|
|
|
|
|
|
| 36 |
|
| 37 |
# =============================================================================
|
| 38 |
# Create the MCP Server
|
|
@@ -45,53 +49,213 @@ mcp = FastMCP("Student Text Adventure Server")
|
|
| 45 |
# Game State Management
|
| 46 |
# =============================================================================
|
| 47 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 48 |
class GameManager:
|
| 49 |
"""
|
| 50 |
-
Manages the text adventure game state.
|
| 51 |
-
|
| 52 |
-
TODO: Extend this class to track:
|
| 53 |
-
- Action history (for memory tool)
|
| 54 |
-
- Explored locations (for mapping)
|
| 55 |
-
- Current score and moves
|
| 56 |
"""
|
| 57 |
|
| 58 |
def __init__(self):
|
| 59 |
self.env: TextAdventureEnv = None
|
| 60 |
self.state = None
|
| 61 |
self.game_name: str = ""
|
| 62 |
-
#
|
| 63 |
-
|
| 64 |
-
|
| 65 |
-
|
| 66 |
-
|
| 67 |
def initialize(self, game: str = "zork1"):
|
| 68 |
"""Initialize or reset the game."""
|
| 69 |
self.game_name = game
|
| 70 |
self.env = TextAdventureEnv(game)
|
| 71 |
self.state = self.env.reset()
|
| 72 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 73 |
return self.state.observation
|
| 74 |
|
| 75 |
-
def
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 76 |
"""Execute an action and return the result."""
|
| 77 |
if self.env is None:
|
| 78 |
self.initialize()
|
| 79 |
|
| 80 |
-
self.
|
| 81 |
|
| 82 |
-
#
|
| 83 |
-
|
| 84 |
-
|
|
|
|
|
|
|
| 85 |
|
| 86 |
-
|
| 87 |
|
| 88 |
-
|
| 89 |
-
|
| 90 |
-
|
| 91 |
|
| 92 |
-
def
|
| 93 |
-
"""Get
|
| 94 |
-
|
| 95 |
|
| 96 |
|
| 97 |
# Global game manager
|
|
@@ -102,14 +266,13 @@ def get_game() -> GameManager:
|
|
| 102 |
"""Get or initialize the game manager."""
|
| 103 |
global _game
|
| 104 |
if _game.env is None:
|
| 105 |
-
|
| 106 |
-
game = os.environ.get("GAME", "zork1")
|
| 107 |
_game.initialize(game)
|
| 108 |
return _game
|
| 109 |
|
| 110 |
|
| 111 |
# =============================================================================
|
| 112 |
-
# MCP Tools
|
| 113 |
# =============================================================================
|
| 114 |
|
| 115 |
@mcp.tool()
|
|
@@ -126,78 +289,97 @@ def play_action(action: str) -> str:
|
|
| 126 |
The game's response to the action
|
| 127 |
|
| 128 |
Valid commands include:
|
| 129 |
-
- Movement: north, south, east, west, up, down, enter, exit
|
| 130 |
- Objects: take <item>, drop <item>, open <thing>, examine <thing>
|
| 131 |
- Other: look, inventory, read <thing>, turn on lamp
|
| 132 |
"""
|
| 133 |
game = get_game()
|
| 134 |
|
| 135 |
-
|
| 136 |
-
# TODO: You might want to include score changes in the response
|
| 137 |
|
| 138 |
-
|
|
|
|
| 139 |
|
| 140 |
-
|
| 141 |
-
|
| 142 |
|
| 143 |
-
|
| 144 |
-
|
| 145 |
|
| 146 |
-
# TODO: Implement additional tools to help your agent
|
| 147 |
|
| 148 |
-
|
| 149 |
-
|
| 150 |
-
|
| 151 |
-
|
| 152 |
-
|
| 153 |
-
|
| 154 |
-
|
| 155 |
-
|
| 156 |
-
|
| 157 |
-
|
| 158 |
-
# pass
|
| 159 |
|
| 160 |
|
| 161 |
-
|
| 162 |
-
|
| 163 |
-
|
| 164 |
-
|
| 165 |
-
|
| 166 |
-
|
| 167 |
-
|
| 168 |
-
|
| 169 |
-
|
| 170 |
-
# result = game.step("inventory")
|
| 171 |
-
# return result
|
| 172 |
|
| 173 |
|
| 174 |
-
|
| 175 |
-
|
| 176 |
-
|
| 177 |
-
|
| 178 |
-
|
| 179 |
-
|
| 180 |
-
|
| 181 |
-
|
| 182 |
-
|
| 183 |
-
|
| 184 |
-
# pass
|
| 185 |
|
| 186 |
|
| 187 |
# @mcp.tool()
|
| 188 |
# def get_valid_actions() -> str:
|
| 189 |
# """
|
| 190 |
-
# Get a list of
|
| 191 |
-
#
|
|
|
|
| 192 |
# Returns:
|
| 193 |
-
#
|
| 194 |
# """
|
| 195 |
-
# # This is a hint: Jericho provides get_valid_actions()
|
| 196 |
# game = get_game()
|
| 197 |
-
#
|
| 198 |
-
#
|
| 199 |
-
# return "Valid actions: " + ", ".join(valid
|
| 200 |
-
# return "Could not determine valid actions"
|
| 201 |
|
| 202 |
|
| 203 |
# =============================================================================
|
|
@@ -205,5 +387,4 @@ def play_action(action: str) -> str:
|
|
| 205 |
# =============================================================================
|
| 206 |
|
| 207 |
if __name__ == "__main__":
|
| 208 |
-
|
| 209 |
-
mcp.run()
|
|
|
|
| 26 |
|
| 27 |
import sys
|
| 28 |
import os
|
| 29 |
+
import re
|
| 30 |
|
| 31 |
# Add parent directory to path to import games module
|
| 32 |
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
|
|
|
|
| 34 |
from fastmcp import FastMCP
|
| 35 |
from games.zork_env import TextAdventureEnv
|
| 36 |
|
| 37 |
+
# Get game from environment variable (default: lostpig)
|
| 38 |
+
INITIAL_GAME = os.environ.get("GAME", "lostpig")
|
| 39 |
+
|
| 40 |
|
| 41 |
# =============================================================================
|
| 42 |
# Create the MCP Server
|
|
|
|
| 49 |
# Game State Management
|
| 50 |
# =============================================================================
|
| 51 |
|
| 52 |
+
class LocationLog:
|
| 53 |
+
"""Tracks actions, outcomes, and promising leads for a single location."""
|
| 54 |
+
def __init__(self, name: str):
|
| 55 |
+
self.name = name # Location name (e.g., "Kitchen")
|
| 56 |
+
self.visit_count: int = 0 # How many times we've been here
|
| 57 |
+
self.actions_taken: list[tuple[str, str]] = [] # (action, short_outcome)
|
| 58 |
+
self.exits_known: list[str] = [] # List of known exits from this location (e.g., "north -> Kitchen")
|
| 59 |
+
|
| 60 |
+
|
| 61 |
class GameManager:
|
| 62 |
"""
|
| 63 |
+
Manages the text adventure game state with rich location tracking.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 64 |
"""
|
| 65 |
|
| 66 |
def __init__(self):
|
| 67 |
self.env: TextAdventureEnv = None
|
| 68 |
self.state = None
|
| 69 |
self.game_name: str = ""
|
| 70 |
+
self.history: list[tuple[str, str]] = [] # list of (action, result) for recent actions
|
| 71 |
+
self.explored_locations: dict[str, set[str]] = {} # location name -> set of exits (e.g., "north -> Kitchen")
|
| 72 |
+
self.location_logs: dict[str, LocationLog] = {} # location name -> log of actions and outcomes at that location
|
| 73 |
+
self.previous_player_location: str = "" # name of the previous location, parsed from the Jericho location object
|
| 74 |
+
self.current_player_location: str = "" # name of the current location, parsed from the Jericho location object
|
| 75 |
+
self.global_action_count: int = 0
|
| 76 |
+
self.score_history: list[int] = []
|
| 77 |
+
|
| 78 |
+
def _get_jericho_location(self):
|
| 79 |
+
"""Get the internal Jericho player location object for comparison."""
|
| 80 |
+
try:
|
| 81 |
+
res = self.env.env.get_player_location()
|
| 82 |
+
match = re.search(r"Obj\d+: (.*) Parent\d+", res.name)
|
| 83 |
+
if match:
|
| 84 |
+
return match.group(1)
|
| 85 |
+
# Fallback: return the full location string
|
| 86 |
+
return res.name
|
| 87 |
+
except Exception:
|
| 88 |
+
return None
|
| 89 |
+
|
| 90 |
+
def has_moved(self) -> bool:
|
| 91 |
+
"""
|
| 92 |
+
Determine if we moved to a new location.
|
| 93 |
+
Compares current player location object to the previous one.
|
| 94 |
+
"""
|
| 95 |
+
current_loc = self.current_player_location
|
| 96 |
+
previous_loc = self.previous_player_location
|
| 97 |
+
|
| 98 |
+
changed = current_loc != previous_loc
|
| 99 |
+
return changed
|
| 100 |
+
|
| 101 |
def initialize(self, game: str = "zork1"):
|
| 102 |
"""Initialize or reset the game."""
|
| 103 |
self.game_name = game
|
| 104 |
self.env = TextAdventureEnv(game)
|
| 105 |
self.state = self.env.reset()
|
| 106 |
+
self.history = []
|
| 107 |
+
self.explored_locations = {}
|
| 108 |
+
self.location_logs = {}
|
| 109 |
+
self.previous_player_location = ""
|
| 110 |
+
self.current_player_location = self._get_jericho_location()
|
| 111 |
+
self.global_action_count = 0
|
| 112 |
+
self.score_history = [0]
|
| 113 |
+
self._ensure_location_log(self.current_player_location)
|
| 114 |
+
self.location_logs[self.current_player_location].visit_count += 1
|
| 115 |
return self.state.observation
|
| 116 |
|
| 117 |
+
def _ensure_location_log(self, location: str):
|
| 118 |
+
"""Ensure a LocationLog exists for the given location."""
|
| 119 |
+
if location not in self.location_logs:
|
| 120 |
+
self.location_logs[location] = LocationLog(location)
|
| 121 |
+
|
| 122 |
+
def take_action(self, action: str) -> tuple[str, bool]:
|
| 123 |
"""Execute an action and return (observation, is_new_place)."""
|
| 124 |
if self.env is None:
|
| 125 |
self.initialize()
|
| 126 |
|
| 127 |
+
self.previous_player_location = self._get_jericho_location() # Store previous location before taking action
|
| 128 |
+
self.state = self.env.step(action) # Execute the action in the game environment
|
| 129 |
+
self.current_player_location = self._get_jericho_location() # Store current location after taking action
|
| 130 |
+
|
| 131 |
+
result = self.state.observation # Get the observation/result of the action
|
| 132 |
+
self.global_action_count += 1
|
| 133 |
+
self.score_history.append(self.state.score)
|
| 134 |
+
|
| 135 |
+
# Track history
|
| 136 |
+
self.history.append((action, result))
|
| 137 |
+
if len(self.history) > 50:
|
| 138 |
+
self.history = self.history[-50:]
|
| 139 |
+
|
| 140 |
+
moved = self.has_moved()
|
| 141 |
+
is_new_place = False
|
| 142 |
+
|
| 143 |
+
# New place! Update explored locations map
|
| 144 |
+
if self.current_player_location not in self.explored_locations:
|
| 145 |
+
self.explored_locations[self.current_player_location] = set()
|
| 146 |
+
is_new_place = True
|
| 147 |
+
# Add exit from previous location to current location
|
| 148 |
+
if moved:
|
| 149 |
+
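# setdefault: the starting location may not be in the map yet when the first action is a move
self.explored_locations.setdefault(self.previous_player_location, set()).add(f"{action} -> {self.current_player_location}")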
|
| 150 |
|
| 151 |
+
# Update location log
|
| 152 |
+
self._ensure_location_log(self.current_player_location)
|
| 153 |
+
current_loc_log = self.location_logs[self.current_player_location]
|
| 154 |
+
if moved:
|
| 155 |
+
current_loc_log.visit_count += 1
|
| 156 |
|
| 157 |
+
# Log this action and a short outcome in the previous location's log
|
| 158 |
+
prev_loc_log = self.location_logs.get(self.previous_player_location)
|
| 159 |
+
if prev_loc_log is not None:
|
| 160 |
+
short_outcome = result[:120].replace('\n', ' ')
|
| 161 |
+
prev_loc_log.actions_taken.append((action, short_outcome))
|
| 162 |
+
# Keep log manageable
|
| 163 |
+
if len(prev_loc_log.actions_taken) > 30:
|
| 164 |
+
prev_loc_log.actions_taken = prev_loc_log.actions_taken[-30:]
|
| 165 |
+
|
| 166 |
+
return result, is_new_place
|
| 167 |
+
|
| 168 |
+
def get_memory(self) -> str:
|
| 169 |
+
"""Get a summary of current game state."""
|
| 170 |
+
recent = self.history[-5:] if self.history else []
|
| 171 |
+
recent_str = "\n".join([f" > {a} -> {r[:60]}..." for a, r in recent]) if recent else " (none yet)"
|
| 172 |
+
|
| 173 |
+
# Add location-specific info
|
| 174 |
+
loc_log = self.location_logs.get(self.current_player_location)
|
| 175 |
+
loc_info = ""
|
| 176 |
+
if loc_log:
|
| 177 |
+
loc_info = f"\nThis location visited {loc_log.visit_count} time(s)."
|
| 178 |
+
if loc_log.actions_taken:
|
| 179 |
+
loc_info += f"\nActions tried at this location: {len(loc_log.actions_taken)}"
|
| 180 |
+
recent_here = loc_log.actions_taken[-5:]
|
| 181 |
+
loc_info += "\nRecent actions at this location:"
|
| 182 |
+
for act, out in recent_here:
|
| 183 |
+
loc_info += f"\n > {act} -> {out[:50]}..."
|
| 184 |
+
if loc_log.promising_actions:
|
| 185 |
+
loc_info += f"\nPromising actions at this location: {', '.join(loc_log.promising_actions[:10])}"
|
| 186 |
+
|
| 187 |
+
return f"""[CURRENT LOCATION: {self.current_player_location}]
|
| 188 |
+
Location info:
|
| 189 |
+
{loc_info}
|
| 190 |
+
|
| 191 |
+
Recent Actions:
|
| 192 |
+
{recent_str}
|
| 193 |
+
|
| 194 |
+
Current Observation:
|
| 195 |
+
{self.state.observation}"""
|
| 196 |
|
| 197 |
+
|
| 198 |
+
def get_map(self) -> str:
|
| 199 |
+
"""Get a map of explored locations."""
|
| 200 |
+
if not self.explored_locations:
|
| 201 |
+
return "Map: No locations explored yet. Try moving around!"
|
| 202 |
+
lines = [f"[CURRENT LOCATION: {self.current_player_location}]"]
|
| 203 |
+
lines.append("EXPLORED LOCATIONS AND EXITS:")
|
| 204 |
+
for loc, exits in sorted(self.explored_locations.items()):
|
| 205 |
+
visit_info = ""
|
| 206 |
+
if loc in self.location_logs:
|
| 207 |
+
visit_info = f" (visited {self.location_logs[loc].visit_count}x, {len(self.location_logs[loc].actions_taken)} actions tried)"
|
| 208 |
+
lines.append(f"\n* {loc}{visit_info}")
|
| 209 |
+
if exits:
|
| 210 |
+
for exit_info in sorted(exits):
|
| 211 |
+
lines.append(f" -> {exit_info}")
|
| 212 |
+
else:
|
| 213 |
+
lines.append(" -> No exits mapped yet")
|
| 214 |
+
|
| 215 |
+
# Add detailed log for current location
|
| 216 |
+
location = self.current_player_location
|
| 217 |
+
loc_log = self.location_logs.get(location)
|
| 218 |
+
if loc_log:
|
| 219 |
+
lines.append(f"\n- INFORMATION FOR CURRENT LOCATION: {location}")
|
| 220 |
+
lines.append(f" * Visited: {loc_log.visit_count} time(s)")
|
| 221 |
+
lines.append(f" * Actions tried here: {len(loc_log.actions_taken)}")
|
| 222 |
+
|
| 223 |
+
if loc_log.actions_taken:
|
| 224 |
+
lines.append(f" * Action history at {location}:")
|
| 225 |
+
for act, out in loc_log.actions_taken[-10:]:
|
| 226 |
+
lines.append(f" > {act} -> {out[:80]}")
|
| 227 |
+
|
| 228 |
+
return "\n".join(lines)
|
| 229 |
+
|
| 230 |
+
def get_inventory(self) -> str:
|
| 231 |
+
"""Get current inventory."""
|
| 232 |
+
if self.env is None:
|
| 233 |
+
return "Game not initialized"
|
| 234 |
+
inv_state = self.env.step("inventory")
|
| 235 |
+
lines = [f"[CURRENT LOCATION: {self.current_player_location}]"]
|
| 236 |
+
lines.append(inv_state.observation)
|
| 237 |
+
return "\n".join(lines)
|
| 238 |
|
| 239 |
+
def get_location_log(self) -> str:
|
| 240 |
+
"""Get the detailed log for the current location."""
|
| 241 |
+
location = self.current_player_location
|
| 242 |
+
loc_log = self.location_logs.get(location)
|
| 243 |
+
if not loc_log:
|
| 244 |
+
return f"No log for location: {location}"
|
| 245 |
+
|
| 246 |
+
lines = [f"[CURRENT LOCATION: {location}]"]
|
| 247 |
+
lines.append(f"Visited: {loc_log.visit_count} time(s)")
|
| 248 |
+
lines.append(f"Actions tried: {len(loc_log.actions_taken)}")
|
| 249 |
+
|
| 250 |
+
if loc_log.actions_taken:
|
| 251 |
+
lines.append("\nAction history:")
|
| 252 |
+
for act, out in loc_log.actions_taken[-10:]:
|
| 253 |
+
lines.append(f" > {act} -> {out[:80]}")
|
| 254 |
+
|
| 255 |
+
if loc_log.exits_known:
|
| 256 |
+
lines.append(f"Known exits: {', '.join(loc_log.exits_known)}")
|
| 257 |
+
|
| 258 |
+
return "\n".join(lines)
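The exit bookkeeping done by `take_action` and rendered by `get_map` can be sketched in isolation. This is a minimal, simplified sketch and not the code from the commit: `ExitMap`, `record`, and `render` are hypothetical names introduced for illustration, and plain strings stand in for the location names obtained from Jericho.

```python
class ExitMap:
    """Hypothetical stand-in for GameManager.explored_locations."""

    def __init__(self) -> None:
        # location name -> set of "action -> destination" strings
        self.exits: dict[str, set[str]] = {}

    def record(self, previous: str, action: str, current: str) -> bool:
        """Record one step; return True if `current` is seen for the first time."""
        is_new_place = current not in self.exits
        self.exits.setdefault(current, set())
        if previous != current:
            # setdefault avoids a KeyError when the starting location
            # has not been registered yet.
            self.exits.setdefault(previous, set()).add(f"{action} -> {current}")
        return is_new_place

    def render(self) -> str:
        """Render the map in roughly the same layout as get_map()."""
        lines = ["EXPLORED LOCATIONS AND EXITS:"]
        for loc, exits in sorted(self.exits.items()):
            lines.append(f"* {loc}")
            if exits:
                lines.extend(f"  -> {e}" for e in sorted(exits))
            else:
                lines.append("  -> No exits mapped yet")
        return "\n".join(lines)


exit_map = ExitMap()
exit_map.record("Field", "north", "Forest")   # a move: adds an exit to "Field"
exit_map.record("Forest", "look", "Forest")   # not a move: no exit recorded
print(exit_map.render())
```

Keeping the exits in a set means that repeating the same move does not duplicate entries, which is presumably why the original uses `set[str]` as well.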
|
| 259 |
|
| 260 |
|
| 261 |
# Global game manager
|
|
|
|
| 266 |
"""Get or initialize the game manager."""
|
| 267 |
global _game
|
| 268 |
if _game.env is None:
|
| 269 |
+
game = os.environ.get("GAME", "lostpig")
|
|
|
|
| 270 |
_game.initialize(game)
|
| 271 |
return _game
|
| 272 |
|
| 273 |
|
| 274 |
# =============================================================================
|
| 275 |
+
# MCP Tools
|
| 276 |
# =============================================================================
|
| 277 |
|
| 278 |
@mcp.tool()
|
|
|
|
| 289 |
The game's response to the action
|
| 290 |
|
| 291 |
Valid commands include:
|
| 292 |
+
- Movement: north, south, east, west, northeast, northwest, southeast, southwest, up, down, enter, exit
|
| 293 |
- Objects: take <item>, drop <item>, open <thing>, examine <thing>
|
| 294 |
- Other: look, inventory, read <thing>, turn on lamp
|
| 295 |
"""
|
| 296 |
game = get_game()
|
| 297 |
|
| 298 |
+
result, is_new_place = game.take_action(action)
|
|
|
|
| 299 |
|
| 300 |
+
# Add score info
|
| 301 |
+
score_info = f"\n\n[Score: {game.state.score} | Moves: {game.state.moves}]"
|
| 302 |
|
| 303 |
+
if game.state.reward > 0:
|
| 304 |
+
score_info = f"\n\n+{game.state.reward} points! (Total: {game.state.score})"
|
| 305 |
|
| 306 |
+
# Indicate if we moved to a new location
|
| 307 |
+
location_info = ""
|
| 308 |
+
if is_new_place:
|
| 309 |
+
location_info = f"[NEW LOCATION: {game.current_player_location}]\n"
|
| 310 |
+
else:
|
| 311 |
+
location_info = f"[CURRENT LOCATION: {game.current_player_location}]\n"
|
| 312 |
+
|
| 313 |
+
done_info = ""
|
| 314 |
+
if game.state.done:
|
| 315 |
+
done_info = "\n\nGAME OVER"
|
| 316 |
+
|
| 317 |
+
return location_info + result + score_info + done_info
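The way the tool response above is assembled (location tag, observation, score footer, game-over marker) can be expressed as a pure function, which makes it easy to check without a running game. This is a sketch under assumptions: `format_response` and its parameters are hypothetical names, and the values would come from `game.state` in the real server.

```python
def format_response(observation: str, location: str, is_new_place: bool,
                    score: int, moves: int, reward: int, done: bool) -> str:
    """Build the text returned to the agent after one action (illustrative only)."""
    tag = "NEW LOCATION" if is_new_place else "CURRENT LOCATION"
    header = f"[{tag}: {location}]\n"

    # A positive reward replaces the regular score line, as in the tool above.
    if reward > 0:
        footer = f"\n\n+{reward} points! (Total: {score})"
    else:
        footer = f"\n\n[Score: {score} | Moves: {moves}]"

    if done:
        footer += "\n\nGAME OVER"
    return header + observation + footer


print(format_response("You are in a field.", "Field", True,
                      score=0, moves=1, reward=0, done=False))
```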
|
| 318 |
|
|
|
|
| 319 |
|
| 320 |
+
@mcp.tool()
|
| 321 |
+
def memory() -> str:
|
| 322 |
+
"""
|
| 323 |
+
Get the current game state summary.
|
| 324 |
+
|
| 325 |
+
Returns:
|
| 326 |
+
A summary including current location (number of visits, actions tried, promising actions),
|
| 327 |
+
recent actions and current observation
|
| 328 |
+
"""
|
| 329 |
+
return get_game().get_memory()
|
|
|
|
| 330 |
|
| 331 |
|
| 332 |
+
@mcp.tool()
|
| 333 |
+
def inventory() -> str:
|
| 334 |
+
"""
|
| 335 |
+
Check what the player is carrying.
|
| 336 |
+
|
| 337 |
+
Returns:
|
| 338 |
+
List of items in the player's inventory
|
| 339 |
+
"""
|
| 340 |
+
return get_game().get_inventory()
|
|
|
|
|
|
|
| 341 |
|
| 342 |
|
| 343 |
+
@mcp.tool()
|
| 344 |
+
def get_map() -> str:
|
| 345 |
+
"""
|
| 346 |
+
Get a map of explored locations, connections and exits.
|
| 347 |
+
Useful for navigation and avoiding getting lost.
|
| 348 |
+
|
| 349 |
+
Returns:
|
| 350 |
+
A text representation of explored locations and connections
|
| 351 |
+
"""
|
| 352 |
+
return get_game().get_map()
|
|
|
|
| 353 |
|
| 354 |
|
| 355 |
# @mcp.tool()
|
| 356 |
# def get_valid_actions() -> str:
|
| 357 |
# """
|
| 358 |
+
# Get a list of valid actions from the current game state using the game engine.
|
| 359 |
+
# Useful when entering a new location to understand what's possible.
|
| 360 |
+
|
| 361 |
# Returns:
|
| 362 |
+
# A list of valid actions that the game engine considers possible
|
| 363 |
# """
|
|
|
|
| 364 |
# game = get_game()
|
| 365 |
+
# valid = game.get_valid_actions_list()
|
| 366 |
+
# if valid:
|
| 367 |
+
# return "Valid actions: " + ", ".join(valid)
|
| 368 |
+
# return "Could not determine valid actions. Try: look, inventory, examine, north, south, east, west, up, down, take, drop, open, close, read"
|
| 369 |
+
|
| 370 |
+
|
| 371 |
+
@mcp.tool()
|
| 372 |
+
def location_log() -> str:
|
| 373 |
+
"""
|
| 374 |
+
Shows what actions were tried at the current location and their outcomes, along with any known exits.
|
| 375 |
+
|
| 376 |
+
|
| 377 |
+
Returns:
|
| 378 |
+
A detailed log of the current location, including visit count, actions taken with their outcomes, and known exits.
|
| 379 |
+
"""
|
| 380 |
+
game = get_game()
|
| 381 |
+
return game.get_location_log()
|
| 382 |
+
|
| 383 |
|
| 384 |
|
| 385 |
# =============================================================================
|
|
|
|
| 387 |
# =============================================================================
|
| 388 |
|
| 389 |
if __name__ == "__main__":
|
| 390 |
+
mcp.run()
|
|
|
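The name extraction in `_get_jericho_location` relies on a regular expression with a fallback to the raw string. A minimal sketch of that logic follows; `parse_location_name` is a hypothetical helper, and the `Obj180: ... Parent0 ...` input is an assumed example of the string form the pattern targets, not output captured from Jericho.

```python
import re

# Same pattern as in _get_jericho_location: capture the text between
# "ObjN: " and " ParentN".
_OBJ_PATTERN = re.compile(r"Obj\d+: (.*) Parent\d+")


def parse_location_name(raw: str) -> str:
    """Return the captured location name, or the raw string if there is no match."""
    match = _OBJ_PATTERN.search(raw)
    return match.group(1) if match else raw


# Assumed example inputs (hypothetical strings, not real game output).
print(parse_location_name("Obj180: West House Parent0 Sibling0 Child0"))
print(parse_location_name("Kitchen"))
```

If the value passed in is already a plain name, the pattern does not match and the fallback returns it unchanged, so the helper behaves the same for both string forms.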