submission
- README.md +14 -4
- agent.py +270 -181
- mcp_server.py +304 -158
- requirements.txt +1 -0
README.md
CHANGED
|
@@ -18,11 +18,21 @@ This is my submission for the Text Adventure Agent assignment. My agent uses the
|
|
| 18 |
|
| 19 |
## Approach
|
| 20 |
|
| 21 |
-
|
| 22 |
|
| 23 |
-
|
| 24 |
-
|
| 25 |
-
|
| 26 |
|
| 27 |
## Files
|
| 28 |
|
|
|
|
| 18 |
|
| 19 |
## Approach
|
| 20 |
|
| 21 |
+
Compared to the reference agent, this agent has two kinds of improvements: a **large mapping system** that differs substantially from the reference agent, and **smaller changes** that build directly on the ideas of the baseline.
|
| 22 |
|
| 23 |
+
**Context size management:** Each call to the model **automatically includes a short summary of past actions**, so the agent never needs to call the memory tool explicitly. This keeps the **context size small** while helping the model avoid getting stuck. The whole summary process has been rewritten and always includes the explicit function calls from past iterations of the LLM, which **greatly reduces the number of poorly formatted MCP requests**.
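
The summary described above can be sketched as follows. This is a minimal illustration, not the exact code of the submission: the entry keys (`thought`, `tool`, `args`, `result`) follow the history entries used in `agent.py`, while the helper names and the use of `json.dumps` for the arguments are assumptions made for this example.

```python
import json


def entry_to_str(entry: dict) -> str:
    """Render one past step in the THOUGHT/TOOL/ARGS format the model must emit."""
    return "\n".join([
        f"THOUGHT: {entry['thought']}",
        f"TOOL: {entry['tool']}",
        # Assumption: arguments are shown as valid JSON so the model copies the format.
        f"ARGS: {json.dumps(entry['args'])}",
        f"GAME: {entry['result']}",
    ])


def make_prompt(history: list[dict], score: int, n_past_steps: int = 4) -> str:
    """Build a short prompt from only the last n_past_steps history entries."""
    parts = ["Here are the last things that happened:"]
    parts.extend(entry_to_str(entry) for entry in history[-n_past_steps:])
    parts.append(f"\nYour current score is {score}. What do you do next?")
    return "\n".join(parts)
```

Because only the last few entries are included, the prompt length stays bounded no matter how long the game runs, and every prompt contains well-formed examples of past tool calls.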
|
| 24 |
+
|
| 25 |
+
**Better location counting:** The reference agent used broken filtering rules to detect the current location. These have been replaced with something more robust that seems to work on all games tested so far.
|
| 26 |
+
|
| 27 |
+
**Graph of locations:** The agent tends to struggle with spatial awareness in this type of game, especially as the number of locations grows. On top of that, directions can be inconsistent. For instance, at the beginning of Zork, going north then south does not lead back to the initial position, so backtracking with the opposite direction will not always work. This agent tries to alleviate the problem by building a directed graph of transitions between locations, in which some transitions are confirmed and others are unconfirmed.
|
| 28 |
+
|
| 29 |
+
*Example:* If going north from location A leads to location B, then the edges (A, B, north, confirmed) and (B, A, south, unconfirmed) are added to the graph. We cannot be sure that going south from B leads to A, but it is still a good guess.
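
A minimal sketch of such a graph is given below. The class name, the `OPPOSITE` table, and the storage layout (a dictionary keyed by source and direction) are assumptions made for illustration, not the data structure actually used in `mcp_server.py`.

```python
# Hypothetical table of opposite directions, used only to guess return edges.
OPPOSITE = {
    "north": "south", "south": "north",
    "east": "west", "west": "east",
    "up": "down", "down": "up",
}


class LocationGraph:
    """Directed graph: (source, direction) -> (destination, confirmed)."""

    def __init__(self) -> None:
        self.edges: dict[tuple[str, str], tuple[str, bool]] = {}

    def record_move(self, src: str, direction: str, dst: str) -> None:
        """Store an observed move and guess the reverse edge."""
        # An observed transition is always confirmed and overrides any guess.
        self.edges[(src, direction)] = (dst, True)
        back = OPPOSITE.get(direction)
        # Add the reverse edge only as an unconfirmed guess, never over known data.
        if back is not None and (dst, back) not in self.edges:
            self.edges[(dst, back)] = (src, False)
```

With this layout, a later observation such as "south from B leads to C" simply replaces the unconfirmed guess (B, A, south) with a confirmed edge.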
|
| 30 |
+
|
| 31 |
+
*Unexplored locations:* The graph also takes into account locations adjacent to previously visited locations, whose existence can be inferred through the use of `get_valid_actions`.
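
As a rough illustration of this inference, the sketch below turns a list of valid actions into placeholder names for unexplored neighbours. The function name, the `DIRECTIONS` set, and the edge dictionary layout are hypothetical; the `Unexplored (<Direction> Of <Location>)` naming is modelled on the destination example in the system prompt.

```python
# Hypothetical set of movement commands treated as directions.
DIRECTIONS = {"north", "south", "east", "west", "up", "down"}


def unexplored_neighbors(
    location: str,
    valid_actions: list[str],
    known_edges: dict[tuple[str, str], tuple[str, bool]],
) -> list[str]:
    """Name the neighbours reachable by a valid direction with no recorded edge yet."""
    names = []
    for action in valid_actions:
        direction = action.strip().lower()
        # Keep only movement actions that the graph knows nothing about.
        if direction in DIRECTIONS and (location, direction) not in known_edges:
            names.append(f"Unexplored ({direction.title()} Of {location})")
    return names
```

In this sketch an unconfirmed edge counts as known; whether the real implementation treats it that way is not stated in this README.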
|
| 32 |
+
|
| 33 |
+
**Listing locations and travelling:** From the above graph, the closest explored and unexplored locations can be inferred. In practice, routing prioritizes "confirmed" edges but may go through "unconfirmed" edges when needed. Accordingly, the agent is given tools to list those locations, to see how far away they are, and to travel to them. When travelling, the actions corresponding to the best-guess route to the destination are executed in order. When the destination is reached or something unexpected happens, control is given back to the agent. This shortens the context even further and keeps the agent focused on longer-term objectives.
|
| 34 |
+
|
| 35 |
+
*Note:* The agent is extremely slow on lostpig because calls to `get_valid_actions` take ages for some reason.
|
| 36 |
|
| 37 |
## Files
|
| 38 |
|
agent.py
CHANGED
|
@@ -1,26 +1,8 @@
|
|
| 1 |
"""
|
| 2 |
-
|
| 3 |
|
| 4 |
-
|
| 5 |
-
|
| 6 |
-
|
| 7 |
-
Your agent should:
|
| 8 |
-
1. Connect to the MCP server via the provided client
|
| 9 |
-
2. Use the ReAct pattern (Thought -> Action -> Observation)
|
| 10 |
-
3. Call MCP tools to interact with the game
|
| 11 |
-
4. Maximize the game score within the step limit
|
| 12 |
-
|
| 13 |
-
Required method:
|
| 14 |
-
async def run(self, client, game, max_steps, seed, verbose) -> RunResult
|
| 15 |
-
|
| 16 |
-
The 'client' is a FastMCP Client already connected to your MCP server.
|
| 17 |
-
Use it to call tools like: await client.call_tool("play_action", {"action": "look"})
|
| 18 |
-
|
| 19 |
-
Tips:
|
| 20 |
-
- Start by looking around and understanding your environment
|
| 21 |
-
- Keep track of visited locations to avoid loops
|
| 22 |
-
- Pick up useful items (lamp, sword, etc.)
|
| 23 |
-
- The seed parameter should be used to set your LLM's seed for reproducibility
|
| 24 |
"""
|
| 25 |
|
| 26 |
import json
|
|
@@ -32,83 +14,41 @@ from typing import Optional
|
|
| 32 |
from dotenv import load_dotenv
|
| 33 |
from huggingface_hub import InferenceClient
|
| 34 |
|
| 35 |
-
# Load environment variables
|
| 36 |
load_dotenv()
|
| 37 |
|
| 38 |
-
# Set USE_LOCAL_MODEL=1 in your .env to use a locally downloaded model
|
| 39 |
-
USE_LOCAL_MODEL = os.getenv("USE_LOCAL_MODEL", "0").strip() in ("1", "true", "yes")
|
| 40 |
-
LOCAL_MODEL_ID = os.getenv("LOCAL_MODEL_ID", "Qwen/Qwen2.5-3B-Instruct")
|
| 41 |
-
|
| 42 |
# =============================================================================
|
| 43 |
# LLM Configuration - DO NOT MODIFY
|
| 44 |
# =============================================================================
|
| 45 |
|
| 46 |
-
# Model to use (fixed for fair evaluation)
|
| 47 |
LLM_MODEL = "Qwen/Qwen2.5-72B-Instruct"
|
| 48 |
|
| 49 |
-
|
| 50 |
-
|
| 51 |
-
|
| 52 |
-
if USE_LOCAL_MODEL:
|
| 53 |
-
import torch
|
| 54 |
-
from transformers import pipeline as _hf_pipeline
|
| 55 |
|
| 56 |
-
|
| 57 |
-
"text-generation",
|
| 58 |
-
model=LOCAL_MODEL_ID,
|
| 59 |
-
torch_dtype=torch.bfloat16,
|
| 60 |
-
device_map="auto",
|
| 61 |
-
)
|
| 62 |
-
LLM_CLIENT = None
|
| 63 |
-
else:
|
| 64 |
-
_hf_token = os.getenv("HF_TOKEN")
|
| 65 |
-
if not _hf_token:
|
| 66 |
-
raise ValueError("HF_TOKEN not found. Set it in your .env file.")
|
| 67 |
-
LLM_CLIENT = InferenceClient(token=_hf_token)
|
| 68 |
|
| 69 |
|
| 70 |
def call_llm(prompt: str, system_prompt: str, seed: int, max_tokens: int = 300) -> str:
|
| 71 |
-
"""
|
| 72 |
-
Call the LLM with the given prompt. Use this function in your agent.
|
| 73 |
-
|
| 74 |
-
Args:
|
| 75 |
-
prompt: The user prompt (current game state, history, etc.)
|
| 76 |
-
system_prompt: The system prompt (instructions for the agent)
|
| 77 |
-
seed: Random seed for reproducibility
|
| 78 |
-
max_tokens: Maximum tokens in response (default: 300)
|
| 79 |
-
|
| 80 |
-
Returns:
|
| 81 |
-
The LLM's response text
|
| 82 |
-
|
| 83 |
-
Example:
|
| 84 |
-
response = call_llm(
|
| 85 |
-
prompt="You are in a forest. What do you do?",
|
| 86 |
-
system_prompt=SYSTEM_PROMPT,
|
| 87 |
-
seed=42,
|
| 88 |
-
)
|
| 89 |
-
"""
|
| 90 |
messages = [
|
| 91 |
{"role": "system", "content": system_prompt},
|
| 92 |
{"role": "user", "content": prompt},
|
| 93 |
]
|
| 94 |
|
| 95 |
-
|
| 96 |
-
|
| 97 |
-
|
| 98 |
-
|
| 99 |
-
temperature=0.0001, # Near-deterministic (0.0 unsupported by some backends)
|
| 100 |
-
do_sample=True,
|
| 101 |
-
)
|
| 102 |
-
return outputs[0]["generated_text"][-1]["content"]
|
| 103 |
|
| 104 |
response = LLM_CLIENT.chat.completions.create(
|
| 105 |
model=LLM_MODEL,
|
| 106 |
messages=messages,
|
| 107 |
-
temperature=0.0,
|
| 108 |
max_tokens=max_tokens,
|
| 109 |
seed=seed,
|
| 110 |
)
|
| 111 |
-
|
| 112 |
return response.choices[0].message.content
|
| 113 |
|
| 114 |
|
|
@@ -125,179 +65,328 @@ class RunResult:
|
|
| 125 |
|
| 126 |
|
| 127 |
# =============================================================================
|
| 128 |
-
# System Prompt
|
| 129 |
# =============================================================================
|
| 130 |
|
| 131 |
-
SYSTEM_PROMPT = """You are
|
| 132 |
-
|
| 133 |
-
GOAL: Explore the world, solve puzzles, and maximize your score.
|
| 134 |
|
| 135 |
-
AVAILABLE TOOLS (use via MCP):
|
| 136 |
-
|
| 137 |
-
-
|
| 138 |
-
-
|
| 139 |
|
| 140 |
VALID GAME COMMANDS for play_action:
|
| 141 |
- Movement: north, south, east, west, up, down, enter, exit
|
| 142 |
- Objects: take <item>, drop <item>, open <thing>, close <thing>, examine <thing>
|
| 143 |
-
-
|
| 144 |
|
| 145 |
RESPOND IN THIS EXACT FORMAT (no markdown):
|
| 146 |
-
THOUGHT: <
|
| 147 |
TOOL: <tool_name>
|
| 148 |
-
ARGS: <JSON arguments
|
| 149 |
|
| 150 |
-
|
| 151 |
-
THOUGHT: I
|
| 152 |
TOOL: play_action
|
| 153 |
ARGS: {"action": "look"}
|
| 154 |
-
|
| 155 |
|
| 156 |
|
| 157 |
# =============================================================================
|
| 158 |
-
# Student Agent
|
| 159 |
# =============================================================================
|
| 160 |
|
| 161 |
class StudentAgent:
|
| 162 |
-
"""
|
| 163 |
-
Your ReAct agent implementation.
|
| 164 |
-
|
| 165 |
-
TODO:
|
| 166 |
-
1. Implement the run() method with the ReAct loop
|
| 167 |
-
2. Parse LLM responses to extract tool calls
|
| 168 |
-
3. Track state and avoid loops
|
| 169 |
-
|
| 170 |
-
Use the provided call_llm() function to interact with the LLM.
|
| 171 |
-
"""
|
| 172 |
|
| 173 |
def __init__(self):
|
| 174 |
-
"""Initialize
|
| 175 |
-
|
| 176 |
-
|
| 177 |
-
# self.visited_locations = set()
|
| 178 |
-
pass
|
| 179 |
|
| 180 |
async def run(
|
| 181 |
self,
|
| 182 |
-
client,
|
| 183 |
game: str,
|
| 184 |
max_steps: int,
|
| 185 |
seed: int,
|
| 186 |
verbose: bool = False,
|
| 187 |
) -> RunResult:
|
| 188 |
-
"""
|
| 189 |
-
|
|
|
|
|
|
|
| 190 |
|
| 191 |
-
|
| 192 |
-
|
| 193 |
-
|
| 194 |
-
max_steps: Maximum number of steps to take
|
| 195 |
-
seed: Random seed for reproducibility (use for LLM calls)
|
| 196 |
-
verbose: Whether to print detailed output
|
| 197 |
-
|
| 198 |
-
Returns:
|
| 199 |
-
RunResult with final score and statistics
|
| 200 |
-
"""
|
| 201 |
-
# TODO: Implement your ReAct loop here
|
| 202 |
-
#
|
| 203 |
-
# Basic structure:
|
| 204 |
-
# 1. Get initial observation (call play_action with "look")
|
| 205 |
-
# 2. Loop for max_steps:
|
| 206 |
-
# a. Build prompt with current observation and history
|
| 207 |
-
# b. Call LLM to get thought and action
|
| 208 |
-
# c. Parse the response to extract tool and args
|
| 209 |
-
# d. Call the tool via client.call_tool(tool_name, args)
|
| 210 |
-
# e. Update history and state
|
| 211 |
-
# f. Check for game over
|
| 212 |
-
# 3. Return RunResult with final statistics
|
| 213 |
|
| 214 |
-
#
|
| 215 |
-
|
| 216 |
-
#
|
| 217 |
|
| 218 |
-
#
|
| 219 |
-
|
| 220 |
-
|
| 221 |
-
# system_prompt=SYSTEM_PROMPT,
|
| 222 |
-
# seed=seed,
|
| 223 |
-
# )
|
| 224 |
|
| 225 |
-
|
| 226 |
-
|
| 227 |
-
history = []
|
| 228 |
-
final_score = 0
|
| 229 |
-
moves = 0
|
| 230 |
|
| 231 |
-
#
|
| 232 |
-
|
| 233 |
|
| 234 |
return RunResult(
|
| 235 |
-
final_score=
|
| 236 |
-
max_score=350,
|
| 237 |
moves=moves,
|
| 238 |
locations_visited=locations_visited,
|
| 239 |
-
game_completed=
|
| 240 |
history=history,
|
| 241 |
)
|
| 242 |
|
| 243 |
-
def
|
| 244 |
-
"""
|
| 245 |
-
|
| 246 |
|
| 247 |
-
|
| 248 |
-
""
|
| 249 |
-
|
| 250 |
-
|
|
|
|
|
|
|
|
|
|
| 251 |
|
| 252 |
-
def _parse_response(self, response: str) -> tuple[str, str, dict]:
|
| 253 |
-
"""
|
| 254 |
-
|
|
|
|
|
|
|
| 255 |
|
| 256 |
-
|
| 257 |
|
| 258 |
-
|
| 259 |
-
|
| 260 |
-
|
| 261 |
-
|
| 262 |
-
|
| 263 |
-
|
| 264 |
-
|
| 265 |
-
|
| 266 |
|
| 267 |
-
def
|
| 268 |
-
"""
|
| 269 |
-
|
| 270 |
|
| 271 |
-
|
| 272 |
-
""
|
| 273 |
-
|
| 274 |
|
| 275 |
|
| 276 |
# =============================================================================
|
| 277 |
-
#
|
| 278 |
# =============================================================================
|
| 279 |
|
| 280 |
async def test_agent():
|
| 281 |
"""Test the agent locally."""
|
| 282 |
from fastmcp import Client
|
| 283 |
|
| 284 |
-
# Path to your MCP server
|
| 285 |
-
server_path = "mcp_server.py"
|
| 286 |
-
|
| 287 |
agent = StudentAgent()
|
| 288 |
|
| 289 |
-
async with Client(
|
| 290 |
result = await agent.run(
|
| 291 |
client=client,
|
| 292 |
game="zork1",
|
| 293 |
-
max_steps=
|
| 294 |
seed=42,
|
| 295 |
verbose=True,
|
| 296 |
)
|
| 297 |
|
| 298 |
-
print(f"\
|
|
|
|
| 299 |
print(f"Moves: {result.moves}")
|
| 300 |
-
print(f"Locations: {result.locations_visited}")
|
| 301 |
|
| 302 |
|
| 303 |
if __name__ == "__main__":
|
|
|
|
| 1 |
"""
|
| 2 |
+
Example: MCP ReAct Agent
|
| 3 |
|
| 4 |
+
A complete ReAct agent that uses MCP tools to play text adventure games.
|
| 5 |
+
This is a working example students can learn from.
|
| 6 |
"""
|
| 7 |
|
| 8 |
import json
|
|
|
|
| 14 |
from dotenv import load_dotenv
|
| 15 |
from huggingface_hub import InferenceClient
|
| 16 |
|
|
|
|
| 17 |
load_dotenv()
|
| 18 |
|
|
|
|
|
|
|
|
|
|
|
|
|
| 19 |
# =============================================================================
|
| 20 |
# LLM Configuration - DO NOT MODIFY
|
| 21 |
# =============================================================================
|
| 22 |
|
|
|
|
| 23 |
LLM_MODEL = "Qwen/Qwen2.5-72B-Instruct"
|
| 24 |
|
| 25 |
+
_hf_token = os.getenv("HF_TOKEN")
|
| 26 |
+
if not _hf_token:
|
| 27 |
+
raise ValueError("HF_TOKEN not found. Set it in your .env file.")
|
|
|
|
|
|
|
|
|
|
| 28 |
|
| 29 |
+
LLM_CLIENT = InferenceClient(token=_hf_token)
|
| 30 |
|
| 31 |
|
| 32 |
def call_llm(prompt: str, system_prompt: str, seed: int, max_tokens: int = 300) -> str:
|
| 33 |
+
"""Call the LLM with the given prompt."""
|
| 34 |
messages = [
|
| 35 |
{"role": "system", "content": system_prompt},
|
| 36 |
{"role": "user", "content": prompt},
|
| 37 |
]
|
| 38 |
|
| 39 |
+
# print("\n\n------------")
|
| 40 |
+
# for m in messages[1:]:
|
| 41 |
+
# print(f"{m['role']}: {m['content']}")
|
| 42 |
+
# print("------------\n\n")
|
|
|
|
|
|
|
|
|
|
|
|
|
| 43 |
|
| 44 |
response = LLM_CLIENT.chat.completions.create(
|
| 45 |
model=LLM_MODEL,
|
| 46 |
messages=messages,
|
| 47 |
+
temperature=0.0,
|
| 48 |
max_tokens=max_tokens,
|
| 49 |
seed=seed,
|
| 50 |
)
|
| 51 |
+
|
| 52 |
return response.choices[0].message.content
|
| 53 |
|
| 54 |
|
|
|
|
| 65 |
|
| 66 |
|
| 67 |
# =============================================================================
|
| 68 |
+
# System Prompt
|
| 69 |
# =============================================================================
|
| 70 |
|
| 71 |
+
SYSTEM_PROMPT = """You are an expert text adventure game player. Your goal is to explore, collect treasures, and maximize your score as fast as possible.
|
|
|
|
|
|
|
| 72 |
|
| 73 |
+
AVAILABLE TOOLS (use these via MCP):
|
| 74 |
+
1. play_action - Execute game commands and physically interact with your environment (north, take lamp, open mailbox, etc).
|
| 75 |
+
2. get_locations - List nearby locations that you visited or that are adjacent to locations you visited.
|
| 76 |
+
3. get_unexplored_locations - List nearby unexplored locations adjacent to locations you visited.
|
| 77 |
+
4. travel - Fast travel to a given location you previously visited through backtracking.
|
| 78 |
+
5. memory - Get a summary of the current game state, in case you feel lost.
|
| 79 |
+
6. inventory - Check your inventory. You have no inventory size limit.
|
| 80 |
|
| 81 |
VALID GAME COMMANDS for play_action:
|
| 82 |
- Movement: north, south, east, west, up, down, enter, exit
|
| 83 |
- Objects: take <item>, drop <item>, open <thing>, close <thing>, examine <thing>
|
| 84 |
+
- Light: turn on lamp, turn off lamp
|
| 85 |
+
- Combat: attack <enemy> with <weapon>
|
| 86 |
+
- Other: inventory, look, read <thing>, wait
|
| 87 |
+
|
| 88 |
+
FORBIDDEN (will NOT work): check, inspect, search, grab, use, help
|
| 89 |
|
| 90 |
RESPOND IN THIS EXACT FORMAT (no markdown):
|
| 91 |
+
THOUGHT: <brief reasoning about what to do next>
|
| 92 |
TOOL: <tool_name>
|
| 93 |
+
ARGS: <JSON arguments>
|
| 94 |
|
| 95 |
+
Examples:
|
| 96 |
+
THOUGHT: I need to see what's around me.
|
| 97 |
TOOL: play_action
|
| 98 |
ARGS: {"action": "look"}
|
| 99 |
+
|
| 100 |
+
THOUGHT: I'm completely lost and don't know where to go next. I will check for nearby unexplored locations.
|
| 101 |
+
TOOL: get_unexplored_locations
|
| 102 |
+
ARGS: {}
|
| 103 |
+
|
| 104 |
+
THOUGHT: I need to explore new locations. I will travel directly to the unexplored location north of the burnt forest.
|
| 105 |
+
TOOL: travel
|
| 106 |
+
ARGS: {"destination": "Unexplored (North Of Burnt Forest)"}
|
| 107 |
+
|
| 108 |
+
STRATEGY:
|
| 109 |
+
1. Explore systematically and travel to unexplored places. When relevant, explore up and down before exploring other directions.
|
| 110 |
+
2. Pick up useful items. They will not be collected automatically; you have to manually collect them (e.g. "take sword").
|
| 111 |
+
3. Open containers (mailbox, window, etc.)
|
| 112 |
+
4. Use get_locations and get_unexplored_locations to avoid getting lost. Use 'travel' for faster travel.
|
| 113 |
+
5. Turn on lamp before dark areas!
|
| 114 |
+
|
| 115 |
+
DO NOT repeat the same action multiple times in a row."""
|
| 116 |
|
| 117 |
|
| 118 |
# =============================================================================
|
| 119 |
+
# Student Agent Implementation
|
| 120 |
# =============================================================================
|
| 121 |
|
| 122 |
class StudentAgent:
|
| 123 |
|
| 124 |
def __init__(self):
|
| 125 |
+
"""Initialize the agent state."""
|
| 126 |
+
self.history: list[dict] = []
|
| 127 |
+
self.score: int = 0
|
|
|
|
|
|
|
| 128 |
|
| 129 |
async def run(
|
| 130 |
self,
|
| 131 |
+
client,
|
| 132 |
game: str,
|
| 133 |
max_steps: int,
|
| 134 |
seed: int,
|
| 135 |
verbose: bool = False,
|
| 136 |
) -> RunResult:
|
| 137 |
+
"""Run the agent for a game session."""
|
| 138 |
+
locations_visited = set()
|
| 139 |
+
history = []
|
| 140 |
+
moves = 0
|
| 141 |
|
| 142 |
+
# Get list of available tools
|
| 143 |
+
tools = await client.list_tools()
|
| 144 |
+
tool_names = [t.name for t in tools]
|
| 145 |
|
| 146 |
+
# Get initial observation
|
| 147 |
+
observation, self.score, is_game_over = (await client.call_tool("play_action", {"action": "look"})).data
|
| 148 |
+
# result = self._extract_result(await client.call_tool("play_action", {"action": "look"}))
|
| 149 |
+
# observation = '\n'.join(result.split('\n')[:-2])
|
| 150 |
+
# self.score = max(self.score, int(result.split('\n')[-2]))
|
| 151 |
+
# is_game_over = bool(result.split('\n')[-1])
|
| 152 |
+
|
| 153 |
+
self.history.append({
|
| 154 |
+
"step": 0,
|
| 155 |
+
"thought": "This is the start of the game. I need to see what is around me.",
|
| 156 |
+
"tool": 'play_action',
|
| 157 |
+
"args": {'action': 'look'},
|
| 158 |
+
"result": observation,
|
| 159 |
+
})
|
| 160 |
|
| 161 |
+
# Track initial location
|
| 162 |
+
location = observation.split("\n")[0] if observation else "Unknown"
|
| 163 |
+
locations_visited.add(location)
|
|
|
|
|
|
|
|
|
|
| 164 |
|
| 165 |
+
if verbose:
|
| 166 |
+
print(self._entry_to_str(self.history[-1]))
|
|
|
|
|
|
|
|
|
|
| 167 |
|
| 168 |
+
# Main ReAct loop
|
| 169 |
+
for step in range(1, max_steps + 1):
|
| 170 |
+
|
| 171 |
+
# Make prompt from game history and call LLM
|
| 172 |
+
prompt = self._make_prompt()
|
| 173 |
+
response = call_llm(prompt, SYSTEM_PROMPT, seed + step)
|
| 174 |
+
|
| 175 |
+
# Parse the response
|
| 176 |
+
thought, tool_name, tool_args = self._parse_response(response, tool_names)
|
| 177 |
+
|
| 178 |
+
if verbose:
|
| 179 |
+
print(f"\n--- Step {step} ---")
|
| 180 |
+
print(f"THOUGHT: {thought}")
|
| 181 |
+
print(f"TOOL: {tool_name}")
|
| 182 |
+
print(f"ARGS: {tool_args}")
|
| 183 |
+
|
| 184 |
+
# Validate and fix common issues
|
| 185 |
+
tool_name, tool_args = self._validate_tool_call(tool_name, tool_args, tool_names)
|
| 186 |
+
|
| 187 |
+
# Execute the tool
|
| 188 |
+
try:
|
| 189 |
+
if tool_name == "play_action" or tool_name == "travel":
|
| 190 |
+
moves += 1
|
| 191 |
+
# result = self._extract_result(await client.call_tool(tool_name, tool_args))
|
| 192 |
+
# observation = '\n'.join(result.split('\n')[:-2])
|
| 193 |
+
# self.score = max(self.score, int(result.split('\n')[-2]))
|
| 194 |
+
# is_game_over = bool(int(result.split('\n')[-1]))
|
| 195 |
+
observation, self.score, is_game_over = (await client.call_tool(tool_name, tool_args)).data
|
| 196 |
+
# else:
|
| 197 |
+
# # observation = self._extract_result(await client.call_tool(tool_name, tool_args))
|
| 198 |
+
# observation, = (await client.call_tool(tool_name, tool_args)).data
|
| 199 |
+
|
| 200 |
+
|
| 201 |
+
except Exception as e:
|
| 202 |
+
observation = f"Error: {e}"
|
| 203 |
+
|
| 204 |
+
# Track location
|
| 205 |
+
location = observation.split("\n")[0] if observation else "Unknown"
|
| 206 |
+
locations_visited.add(location)
|
| 207 |
+
|
| 208 |
+
# Update history
|
| 209 |
+
self.history.append({
|
| 210 |
+
'step': step,
|
| 211 |
+
'thought': thought,
|
| 212 |
+
'tool': tool_name,
|
| 213 |
+
'args': tool_args,
|
| 214 |
+
'result': observation,
|
| 215 |
+
'score': self.score,
|
| 216 |
+
'game_over': is_game_over,
|
| 217 |
+
})
|
| 218 |
+
|
| 219 |
+
if verbose:
|
| 220 |
+
print(f"GAME: {observation}")
|
| 221 |
+
|
| 222 |
+
if is_game_over:
|
| 223 |
+
if verbose:
|
| 224 |
+
print("\n*** GAME OVER ***")
|
| 225 |
+
break
|
| 226 |
|
| 227 |
return RunResult(
|
| 228 |
+
final_score=self.score,
|
| 229 |
+
max_score=350,
|
| 230 |
moves=moves,
|
| 231 |
locations_visited=locations_visited,
|
| 232 |
+
game_completed=is_game_over,
|
| 233 |
history=self.history,  # the local `history` list is never filled; steps are stored in self.history
|
| 234 |
)
|
| 235 |
+
|
| 236 |
+
def _entry_to_str(self, entry: dict) -> str:
|
| 237 |
+
parts = []
|
| 238 |
+
parts.append(f"THOUGHT: {entry['thought']}")
|
| 239 |
+
parts.append(f"TOOL: {entry['tool']}")
|
| 240 |
+
parts.append(f"ARGS: {entry['args']}")
|
| 241 |
+
parts.append(f"GAME: {entry['result']}")
|
| 242 |
+
return '\n'.join(parts)
|
| 243 |
|
| 244 |
+
def _make_prompt(self, n_past_steps: int = 4) -> str:
|
| 245 |
+
"""Build the prompt for the LLM with context."""
|
| 246 |
+
parts = []
|
| 247 |
|
| 248 |
+
# Recent history
|
| 249 |
+
parts.append("\nHere are the last things that happened:")
|
| 250 |
+
for entry in self.history[-n_past_steps:]:
|
| 251 |
+
parts.append(self._entry_to_str(entry))
|
| 252 |
+
|
| 253 |
+
parts.append(f"\nYour current score is {self.score}. Now it's your turn! What do you do next?")
|
| 254 |
+
return '\n'.join(parts)
|
| 255 |
|
| 256 |
+
def _parse_response(self, response: str, valid_tools: list[str]) -> tuple[str, str, dict]:
|
| 257 |
+
"""Parse the LLM response to extract thought, tool, and arguments."""
|
| 258 |
+
thought = "No reasoning provided"
|
| 259 |
+
tool_name = "play_action"
|
| 260 |
+
tool_args = {"action": "look"}
|
| 261 |
|
| 262 |
+
lines = response.strip().split("\n")
|
| 263 |
|
| 264 |
+
for line in lines:
|
| 265 |
+
line_clean = line.strip()
|
| 266 |
+
line_upper = line_clean.upper()
|
| 267 |
+
|
| 268 |
+
if line_upper.startswith("THOUGHT:"):
|
| 269 |
+
thought = line_clean.split(":", 1)[1].strip()
|
| 270 |
+
|
| 271 |
+
elif line_upper.startswith("TOOL:"):
|
| 272 |
+
raw_tool = line_clean.split(":", 1)[1].strip().lower()
|
| 273 |
+
raw_tool = raw_tool.replace("**", "").replace("*", "").replace("`", "")
|
| 274 |
+
raw_tool = raw_tool.split()[0] if raw_tool else "play_action"
|
| 275 |
+
tool_name = raw_tool
|
| 276 |
+
|
| 277 |
+
elif line_upper.startswith("ARGS:"):
|
| 278 |
+
args_part = line_clean.split(":", 1)[1].strip()
|
| 279 |
+
try:
|
| 280 |
+
args_part = args_part.replace("'", '"')
|
| 281 |
+
tool_args = json.loads(args_part)
|
| 282 |
+
except json.JSONDecodeError:
|
| 283 |
+
match = re.search(r'"action"\s*:\s*"([^"]+)"', args_part)
|
| 284 |
+
if match:
|
| 285 |
+
tool_args = {"action": match.group(1)}
|
| 286 |
+
else:
|
| 287 |
+
tool_args = {"action": "look"}
|
| 288 |
+
|
| 289 |
+
return thought, tool_name, tool_args
|
| 290 |
|
| 291 |
+
def _validate_tool_call(self, tool_name: str, tool_args: dict, valid_tools: list[str]) -> tuple[str, dict]:
|
| 292 |
+
"""Validate and fix common tool call issues."""
|
| 293 |
+
# Fix tool name
|
| 294 |
+
if tool_name not in valid_tools:
|
| 295 |
+
if tool_name in ["action", "do", "command"]:
|
| 296 |
+
tool_name = "play_action"
|
| 297 |
+
elif tool_name in ["map", "location"]:
|
| 298 |
+
tool_name = "get_locations"  # the system prompt lists get_locations; get_map is not among the advertised tools
|
| 299 |
+
elif tool_name in ["mem", "state", "status"]:
|
| 300 |
+
tool_name = "memory"
|
| 301 |
+
elif tool_name in ["inv", "items"]:
|
| 302 |
+
tool_name = "inventory"
|
| 303 |
+
else:
|
| 304 |
+
tool_name = "play_action"
|
| 305 |
|
| 306 |
+
# Fix action verbs
|
| 307 |
+
if tool_name == "play_action":
|
| 308 |
+
action = tool_args.get("action", "look")
|
| 309 |
+
|
| 310 |
+
invalid_verb_map = {
|
| 311 |
+
"check": "examine",
|
| 312 |
+
"inspect": "examine",
|
| 313 |
+
"search": "look",
|
| 314 |
+
"grab": "take",
|
| 315 |
+
"pick": "take",
|
| 316 |
+
"use": "examine",
|
| 317 |
+
"investigate": "examine",
|
| 318 |
+
}
|
| 319 |
+
|
| 320 |
+
words = action.lower().split()
|
| 321 |
+
if words and words[0] in invalid_verb_map:
|
| 322 |
+
words[0] = invalid_verb_map[words[0]]
|
| 323 |
+
action = " ".join(words)
|
| 324 |
+
|
| 325 |
+
action = action.lower().strip()
|
| 326 |
+
action = action.replace("**", "").replace("*", "").replace("`", "")
|
| 327 |
+
action = " ".join(action.split())
|
| 328 |
+
|
| 329 |
+
tool_args["action"] = action
|
| 330 |
+
|
| 331 |
+
return tool_name, tool_args
|
| 332 |
+
|
| 333 |
+
def _extract_result(self, result) -> str:
|
| 334 |
+
"""Extract text from MCP tool result."""
|
| 335 |
+
# return result.data
|
| 336 |
+
if hasattr(result, 'content') and result.content:
|
| 337 |
+
return result.content[0].text
|
| 338 |
+
if isinstance(result, list) and result:
|
| 339 |
+
return result[0].text if hasattr(result[0], 'text') else str(result[0])
|
| 340 |
+
return str(result)
|
| 341 |
+
|
| 342 |
+
def _update_score(self, text: str) -> None:
|
| 343 |
+
"""Update score from game text."""
|
| 344 |
+
patterns = [
|
| 345 |
+
r'Score:\s*(\d+)',
|
| 346 |
+
r'score[:\s]+(\d+)',
|
| 347 |
+
r'\[Score:\s*(\d+)',
|
| 348 |
+
]
|
| 349 |
+
|
| 350 |
+
for pattern in patterns:
|
| 351 |
+
match = re.search(pattern, text, re.IGNORECASE)
|
| 352 |
+
if match:
|
| 353 |
+
self.score = max(self.score, int(match.group(1)))
|
| 354 |
+
|
| 355 |
+
def _is_game_over(self, text: str) -> bool:
|
| 356 |
+
"""Check if the game is over."""
|
| 357 |
+
game_over_phrases = [
|
| 358 |
+
"game over",
|
| 359 |
+
"you have died",
|
| 360 |
+
"you are dead",
|
| 361 |
+
"*** you have died ***",
|
| 362 |
+
]
|
| 363 |
+
text_lower = text.lower()
|
| 364 |
+
return any(phrase in text_lower for phrase in game_over_phrases)
|
| 365 |
|
| 366 |
|
| 367 |
# =============================================================================
|
| 368 |
+
# Local Testing
|
| 369 |
# =============================================================================
|
| 370 |
|
| 371 |
async def test_agent():
|
| 372 |
"""Test the agent locally."""
|
| 373 |
from fastmcp import Client
|
| 374 |
|
|
|
|
|
|
|
|
|
|
| 375 |
agent = StudentAgent()
|
| 376 |
|
| 377 |
+
async with Client("mcp_server.py") as client:
|
| 378 |
result = await agent.run(
|
| 379 |
client=client,
|
| 380 |
game="zork1",
|
| 381 |
+
max_steps=20,
|
| 382 |
seed=42,
|
| 383 |
verbose=True,
|
| 384 |
)
|
| 385 |
|
| 386 |
+
print(f"\n{'=' * 50}")
|
| 387 |
+
print(f"Final Score: {result.final_score}")
|
| 388 |
print(f"Moves: {result.moves}")
|
| 389 |
+
print(f"Locations: {len(result.locations_visited)}")
|
| 390 |
|
| 391 |
|
| 392 |
if __name__ == "__main__":
|
mcp_server.py
CHANGED
|
@@ -1,209 +1,355 @@
|
|
| 1 |
"""
|
| 2 |
-
|
| 3 |
|
| 4 |
-
|
| 5 |
-
|
| 6 |
-
|
| 7 |
-
Required tool:
|
| 8 |
-
play_action(action: str) -> str
|
| 9 |
-
Execute a game command and return the result.
|
| 10 |
-
|
| 11 |
-
Recommended tools:
|
| 12 |
-
memory() -> str
|
| 13 |
-
Return current game state, score, and recent history.
|
| 14 |
-
|
| 15 |
-
inventory() -> str
|
| 16 |
-
Return the player's current inventory.
|
| 17 |
-
|
| 18 |
-
get_map() -> str
|
| 19 |
-
Return a map of explored locations.
|
| 20 |
-
|
| 21 |
-
Test your server with:
|
| 22 |
-
fastmcp dev submission_template/mcp_server.py
|
| 23 |
-
|
| 24 |
-
Then open the MCP Inspector in your browser to test the tools interactively.
|
| 25 |
"""
|
| 26 |
|
| 27 |
import sys
|
| 28 |
import os
|
|
|
|
|
|
|
| 29 |
|
| 30 |
# Add parent directory to path to import games module
|
| 31 |
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
|
| 32 |
|
| 33 |
from fastmcp import FastMCP
|
| 34 |
-
from games.zork_env import TextAdventureEnv
|
| 35 |
|
| 36 |
|
| 37 |
-
#
|
| 38 |
-
|
| 39 |
-
# =============================================================================
|
| 40 |
|
| 41 |
-
|
|
|
|
| 42 |
|
| 43 |
|
| 44 |
-
# =============================================================================
|
| 45 |
-
# Game State Management
|
| 46 |
-
# =============================================================================
|
| 47 |
|
| 48 |
-
class
|
| 49 |
-
"""
|
| 50 |
-
Manages the text adventure game state.
|
| 51 |
-
|
| 52 |
-
TODO: Extend this class to track:
|
| 53 |
-
- Action history (for memory tool)
|
| 54 |
-
- Explored locations (for mapping)
|
| 55 |
-
- Current score and moves
|
| 56 |
-
"""
|
| 57 |
|
| 58 |
-
def __init__(self):
|
| 59 |
-
self.env: TextAdventureEnv = None
|
| 60 |
-
self.state = None
|
| 61 |
-
self.game_name: str = ""
|
| 62 |
-
# TODO: Add more state tracking
|
| 63 |
-
# self.history: list[tuple[str, str]] = []
|
| 64 |
-
# self.explored_locations: dict[str, set[str]] = {}
|
| 65 |
-
# self.current_location: str = ""
|
| 66 |
-
|
| 67 |
-
def initialize(self, game: str = "zork1"):
|
| 68 |
-
"""Initialize or reset the game."""
|
| 69 |
self.game_name = game
|
| 70 |
self.env = TextAdventureEnv(game)
|
| 71 |
self.state = self.env.reset()
|
| 72 |
-
|
| 73 |
-
|
|
|
|
|
|
|
| 74 |
|
| 75 |
-
def
|
| 76 |
-
"""
|
| 77 |
-
|
| 78 |
-
|
| 79 |
-
|
| 80 |
self.state = self.env.step(action)
|
|
|
|
| 81 |
|
| 82 |
-
#
|
| 83 |
-
|
| 84 |
-
# Update location tracking, etc.
|
| 85 |
|
| 86 |
-
|
| 87 |
|
| 88 |
-
def
|
| 89 |
-
"""Get
|
| 90 |
-
|
| 91 |
|
| 92 |
-
def
|
| 93 |
-
"""Get
|
| 94 |
-
|
| 95 |
|
| 96 |
|
| 97 |
-
# Global game
|
| 98 |
-
|
| 99 |
|
| 100 |
|
| 101 |
-
def get_game() ->
|
| 102 |
-
"""Get or initialize the game
|
| 103 |
-
global
|
| 104 |
-
if
|
| 105 |
-
|
| 106 |
-
|
| 107 |
-
_game.initialize(game)
|
| 108 |
-
return _game
|
| 109 |
|
| 110 |
|
| 111 |
# =============================================================================
|
| 112 |
-
# MCP Tools
|
| 113 |
# =============================================================================
|
| 114 |
|
| 115 |
@mcp.tool()
|
| 116 |
-
def play_action(action: str) -> str:
|
| 117 |
"""
|
| 118 |
-
Execute a game
|
| 119 |
|
| 120 |
-
|
| 121 |
|
| 122 |
Args:
|
| 123 |
-
|
| 124 |
-
|
| 125 |
-
Returns:
|
| 126 |
-
|
| 127 |
-
|
| 128 |
-
|
| 129 |
-
- Movement: north, south, east, west, up, down, enter, exit
|
| 130 |
-
- Objects: take <item>, drop <item>, open <thing>, examine <thing>
|
| 131 |
-
- Other: look, inventory, read <thing>, turn on lamp
|
| 132 |
"""
|
| 133 |
game = get_game()
|
| 134 |
|
| 135 |
-
|
| 136 |
-
|
| 137 |
-
|
| 138 |
-
|
| 139 |
-
|
| 140 |
-
|
| 141 |
-
|
| 142 |
-
|
| 143 |
-
|
| 144 |
-
|
| 145 |
-
|
| 146 |
-
|
| 147 |
-
|
| 148 |
-
# @mcp.tool()
|
| 149 |
-
# def memory() -> str:
|
| 150 |
-
# """
|
| 151 |
-
# Get the current game state summary.
|
| 152 |
-
#
|
| 153 |
-
# Returns:
|
| 154 |
-
# A summary including current location, score, moves, and recent history
|
| 155 |
-
# """
|
| 156 |
-
# game = get_game()
|
| 157 |
-
# # TODO: Return useful state information
|
| 158 |
-
# pass
|
| 159 |
-
|
| 160 |
-
|
| 161 |
-
# @mcp.tool()
|
| 162 |
-
# def inventory() -> str:
|
| 163 |
-
# """
|
| 164 |
-
# Check what the player is carrying.
|
| 165 |
-
#
|
| 166 |
-
# Returns:
|
| 167 |
-
# List of items in the player's inventory
|
| 168 |
-
# """
|
| 169 |
-
# game = get_game()
|
| 170 |
-
# result = game.step("inventory")
|
| 171 |
-
# return result
|
| 172 |
-
|
| 173 |
-
|
| 174 |
-
# @mcp.tool()
|
| 175 |
-
# def get_map() -> str:
|
| 176 |
-
# """
|
| 177 |
-
# Get a map of explored locations.
|
| 178 |
-
#
|
| 179 |
-
# Returns:
|
| 180 |
-
# A text representation of explored locations and connections
|
| 181 |
-
# """
|
| 182 |
-
# game = get_game()
|
| 183 |
-
# # TODO: Return map of explored locations
|
| 184 |
-
# pass
|
| 185 |
-
|
| 186 |
-
|
| 187 |
-
# @mcp.tool()
|
| 188 |
-
# def get_valid_actions() -> str:
|
| 189 |
-
# """
|
| 190 |
-
# Get a list of likely valid actions from the current location.
|
| 191 |
-
#
|
| 192 |
-
# Returns:
|
| 193 |
-
# List of actions that might work here
|
| 194 |
-
# """
|
| 195 |
-
# # This is a hint: Jericho provides get_valid_actions()
|
| 196 |
-
# game = get_game()
|
| 197 |
-
# if game.env and game.env.env:
|
| 198 |
-
# valid = game.env.env.get_valid_actions()
|
| 199 |
-
# return "Valid actions: " + ", ".join(valid[:20])
|
| 200 |
-
# return "Could not determine valid actions"
|
| 201 |
|
| 202 |
|
| 203 |
# =============================================================================
|
| 204 |
-
#
|
| 205 |
# =============================================================================
|
| 206 |
|
| 207 |
if __name__ == "__main__":
|
| 208 |
-
# This runs the server with stdio transport (for MCP clients)
|
| 209 |
mcp.run()
|
|
|
|
| 1 |
"""
|
| 2 |
+
Example: MCP Server for Text Adventures
|
| 3 |
|
| 4 |
+
A complete MCP server that exposes text adventure games via tools.
|
| 5 |
+
This demonstrates a full-featured server with memory, mapping, and inventory.
|
| 6 |
"""
|
| 7 |
|
| 8 |
import sys
|
| 9 |
import os
|
| 10 |
+
from typing import Optional
|
| 11 |
+
import networkx as nx
|
| 12 |
|
| 13 |
# Add parent directory to path to import games module
|
| 14 |
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
|
| 15 |
|
| 16 |
from fastmcp import FastMCP
|
| 17 |
+
from games.zork_env import TextAdventureEnv, list_available_games
|
| 18 |
|
| 19 |
|
| 20 |
+
# Get game from environment variable (default: zork1)
|
| 21 |
+
INITIAL_GAME = os.environ.get("GAME", "zork1")
|
|
|
|
| 22 |
|
| 23 |
+
# Create the MCP server
|
| 24 |
+
mcp = FastMCP("Text Adventure Server")
|
| 25 |
|
| 26 |
+
OPPOSITE_DIRECTION = {
|
| 27 |
+
'north': 'south',
|
| 28 |
+
'south': 'north',
|
| 29 |
+
'east': 'west',
|
| 30 |
+
'west': 'east',
|
| 31 |
+
'up': 'down',
|
| 32 |
+
'down': 'up',
|
| 33 |
+
}
|
| 34 |
|
|
|
|
|
|
|
|
|
|
| 35 |
|
| 36 |
+
class GameState:
|
| 37 |
+
"""Manages the text adventure game state and exploration data."""
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 38 |
|
| 39 |
+
def __init__(self, game: str = "zork1"):
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 40 |
self.game_name = game
|
| 41 |
self.env = TextAdventureEnv(game)
|
| 42 |
self.state = self.env.reset()
|
| 43 |
+
self.history: list[tuple[str, str]] = []
|
| 44 |
+
self.loc_history: list[Optional[str]] = [None]
|
| 45 |
+
self.graph: nx.DiGraph = nx.DiGraph() # graph of locations
|
| 46 |
+
        self.current_location: Optional[str] = self._extract_location(self.state.observation)
|
| 47 |
|
| 48 |
+
def _extract_location(self, observation: str) -> Optional[str]:
|
| 49 |
+
"""Extract location name from observation (usually first line)."""
|
| 50 |
+
lines = observation.strip().split('\n')
|
| 51 |
+
return lines[0] if lines[0].istitle() and lines[0][-1] not in ['.', '!', ',', ';', '?'] else None
|
| 52 |
+
|
| 53 |
+
def valid_actions(self) -> list[str]:
|
| 54 |
+
"Returns valid actions as computed by Jericho"
|
| 55 |
+
print("start get_valid_actions", file=sys.stderr)
|
| 56 |
+
va = self.env.env.get_valid_actions(use_object_tree=True, use_ctypes=True, use_parallel=False)
|
| 57 |
+
print("end get_valid_actions", file=sys.stderr)
|
| 58 |
+
return va
|
| 59 |
+
|
| 60 |
+
def _follow_direction(self, location: Optional[str], direction: str):
|
| 61 |
+
"Returns what happens in the graph if a direction is followed from a location."
|
| 62 |
+
if not self.graph.has_node(location):
|
| 63 |
+
return None
|
| 64 |
+
candidates = [v for _, v, data in self.graph.out_edges(location, data=True) if data['direction'] == direction]
|
| 65 |
+
return candidates[0] if len(candidates) > 0 else None
|
| 66 |
+
|
| 67 |
+
def take_action(self, action: str) -> str:
|
| 68 |
+
"""Execute a game action and return the result."""
|
| 69 |
self.state = self.env.step(action)
|
| 70 |
+
result = self.state.observation
|
| 71 |
|
| 72 |
+
# Track history
|
| 73 |
+
self.history.append((action, result))
|
|
|
|
| 74 |
|
| 75 |
+
#########################
|
| 76 |
+
# Update location graph #
|
| 77 |
+
#########################
|
| 78 |
+
new_loc = self._extract_location(result)
|
| 79 |
+
if new_loc is None:
|
| 80 |
+
return result
|
| 81 |
+
|
| 82 |
+
new_loc = new_loc.lower()
|
| 83 |
+
        self.loc_history.append(new_loc)
        self.current_location = new_loc  # keep the location reported by get_memory() in sync
|
| 84 |
+
|
| 85 |
+
# when location hasn't changed, nothing to do
|
| 86 |
+
if self.loc_history[-2] == new_loc:
|
| 87 |
+
return result
|
| 88 |
+
|
| 89 |
+
last_loc = self.loc_history[-2]
|
| 90 |
+
|
| 91 |
+
        first_time_here = not self.graph.has_node(new_loc)  # True only if this location is not in the graph yet
|
| 92 |
+
|
| 93 |
+
# if we are not where we were supposed to go, update the graph accordingly
|
| 94 |
+
# this is done by deleting unsure edges between explored location and renaming unexplored locations
|
| 95 |
+
old_node = self._follow_direction(last_loc, action)
|
| 96 |
+
if old_node is not None and old_node != new_loc:
|
| 97 |
+
if self.graph.nodes(data=True)[old_node]['explored']:
|
| 98 |
+
self.graph.remove_edge(last_loc, old_node)
|
| 99 |
+
else:
|
| 100 |
+
                nx.relabel_nodes(self.graph, {old_node: new_loc}, copy=False)  # rename in place; the default returns a copy
|
| 101 |
+
|
| 102 |
+
self.graph.add_node(new_loc, explored=True)
|
| 103 |
+
|
| 104 |
+
|
| 105 |
+
# add sure forward direction
|
| 106 |
+
if last_loc is not None:
|
| 107 |
+
self.graph.add_edge(last_loc, new_loc, direction=action, sure=True)
|
| 108 |
+
|
| 109 |
+
# add unsure backward direction if possible and useful
|
| 110 |
+
if action in OPPOSITE_DIRECTION.keys() and not self.graph.has_edge(new_loc, last_loc) and last_loc is not None:
|
| 111 |
+
self.graph.add_edge(new_loc, last_loc, direction=OPPOSITE_DIRECTION[action], sure=False)
|
| 112 |
+
|
| 113 |
+
|
| 114 |
+
# if first time in this location, add all valid directions that seem to be unexplored as unexplored locations
|
| 115 |
+
if not first_time_here:
|
| 116 |
+
return result
|
| 117 |
+
|
| 118 |
+
# for direction in ['north', 'south', 'east', 'west']:
|
| 119 |
+
for direction in set(self.valid_actions()) & set(OPPOSITE_DIRECTION.keys()):
|
| 120 |
+
if self._follow_direction(new_loc, direction) is not None:
|
| 121 |
+
continue
|
| 122 |
+
unexplored = f"unexplored ({direction} of {new_loc})".lower()
|
| 123 |
+
self.graph.add_node(unexplored, explored=False)
|
| 124 |
+
self.graph.add_edge(new_loc, unexplored, direction=direction, sure=True)
|
| 125 |
+
            result += f"\nThere is a potentially unexplored location {direction} of here."
|
| 126 |
+
|
| 127 |
+
return result
|
| 128 |
+
|
| 129 |
+
def travel(self, dest_loc: str) -> str:
|
| 130 |
+
cur_loc = self.loc_history[-1]
|
| 131 |
+
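        dest_loc = dest_loc.lower()

        # no location has been recorded before the first successful move, so there is nothing to route from
        if cur_loc is None or cur_loc not in self.graph:
            return "Your current location is not on the map yet. Move once with play_action before travelling."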
|
| 132 |
+
|
| 133 |
+
if dest_loc not in self.graph.nodes():
|
| 134 |
+
return f"\"{dest_loc.title()}\" is not the exact name of any location you have seen. You can get a list of locations you can travel to by using get_locations or get_unexplored_locations. Make sure you type the exact name of the (unexplored) location."
|
| 135 |
+
|
| 136 |
+
# travel only through edges that are sure to exist if possible
|
| 137 |
+
sure_graph = nx.subgraph_view(self.graph, filter_edge = lambda u,v: self.graph[u][v]['sure'])
|
| 138 |
+
|
| 139 |
+
try:
|
| 140 |
+
path = nx.shortest_path(sure_graph, source=cur_loc, target=dest_loc)
|
| 141 |
+
except nx.NetworkXNoPath:
|
| 142 |
+
try:
|
| 143 |
+
path = nx.shortest_path(self.graph, source=cur_loc, target=dest_loc)
|
| 144 |
+
except nx.NetworkXNoPath:
|
| 145 |
+
return f"Cannot travel from {cur_loc.title()} to {dest_loc.title()} given current knowledge of the map. Did you type the location name properly?"
|
| 146 |
+
|
| 147 |
+
parts = []
|
| 148 |
+
for i in range(len(path)-1):
|
| 149 |
+
direction = self.graph[path[i]][path[i+1]]['direction']
|
| 150 |
+
parts.append(f"> {direction}\n")
|
| 151 |
+
parts.append(self.take_action(direction))
|
| 152 |
+
|
| 153 |
+
# if travel is not finished but location is unexpected, stop here
|
| 154 |
+
if i < len(path)-2 and self.loc_history[-1] != path[i+1]:
|
| 155 |
+
parts.append("This location is unexpected given your initial route. You stop travelling here.")
|
| 156 |
+
break
|
| 157 |
+
return '\n'.join(parts)
|
| 158 |
+
|
| 159 |
+
def get_locations(self, n_max=None, unexplored=False) -> str:
|
| 160 |
+
if unexplored:
|
| 161 |
+
graph = nx.subgraph_view(self.graph, filter_node = lambda u: not self.graph.nodes(data=True)[u]['explored'])
|
| 162 |
+
else:
|
| 163 |
+
graph = self.graph
|
| 164 |
+
|
| 165 |
+
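        # no location has been recorded before the first successful move
        if self.loc_history[-1] is None:
            return "No location recorded yet. Move once with play_action first."

        all_paths = nx.single_source_shortest_path(self.graph, self.loc_history[-1])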
|
| 166 |
+
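        all_paths = sorted(all_paths.values(), key=len)[1:]
        if unexplored:
            # keep only routes that end on a placeholder node that has not been visited yet
            all_paths = [p for p in all_paths if not self.graph.nodes[p[-1]].get('explored', True)]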
|
| 167 |
+
|
| 168 |
+
if len(all_paths) == 0:
|
| 169 |
+
            return f"No {'unexplored ' if unexplored else ''}location automatically detected. Try something else and figure it out by yourself."
|
| 170 |
+
|
| 171 |
+
parts = []
|
| 172 |
+
parts.append(f"Current location: {self.loc_history[-1].title()}")
|
| 173 |
+
if n_max is None or len(all_paths) <= n_max:
|
| 174 |
+
            parts.append(f"All {'unexplored' if unexplored else 'known'} locations you can potentially travel to given your current knowledge:")
|
| 175 |
+
else:
|
| 176 |
+
            parts.append(f"{n_max} closest {'unexplored' if unexplored else 'known'} locations you can potentially travel to given your current knowledge:")
|
| 177 |
+
|
| 178 |
+
        for path in all_paths[:n_max]:  # a slice with n_max=None keeps every path
|
| 179 |
+
parts.append(f"- {path[-1].title()}: {len(path)-1} step{'s' if len(path) > 2 else ''} away")
|
| 180 |
+
# parts.append(f"- {path[-1].title()} (current location -> {(' -> '.join(path[1:])).title()})")
|
| 181 |
+
|
| 182 |
+
return '\n'.join(parts)
|
| 183 |
+
|
| 184 |
+
def get_memory(self) -> str:
|
| 185 |
+
"""Get a summary of current game state."""
|
| 186 |
+
recent = self.history[-5:] if self.history else []
|
| 187 |
+
recent_str = "\n".join([f" > {a} -> {r[:60]}..." for a, r in recent]) if recent else " (none yet)"
|
| 188 |
+
|
| 189 |
+
return f"""Current State:
|
| 190 |
+
- Location: {self.current_location}
|
| 191 |
+
- Score: {self.state.score} points
|
| 192 |
+
- Moves: {self.state.moves}
|
| 193 |
+
- Game: {self.game_name}
|
| 194 |
+
|
| 195 |
+
Recent Actions:
|
| 196 |
+
{recent_str}
|
| 197 |
+
|
| 198 |
+
Current Observation:
|
| 199 |
+
{self.state.observation}"""
|
| 200 |
|
| 201 |
+
def get_map(self) -> str:
|
| 202 |
+
"""Get a map of explored locations."""
|
| 203 |
+
if not self.explored_locations:
|
| 204 |
+
return "Map: No locations explored yet. Try moving around!"
|
| 205 |
+
|
| 206 |
+
lines = ["Explored Locations and Exits:"]
|
| 207 |
+
for loc, exits in sorted(self.explored_locations.items()):
|
| 208 |
+
lines.append(f"\n* {loc}")
|
| 209 |
+
for exit_info in sorted(exits):
|
| 210 |
+
lines.append(f" -> {exit_info}")
|
| 211 |
+
|
| 212 |
+
lines.append(f"\n[Current] {self.current_location}")
|
| 213 |
+
return "\n".join(lines)
|
| 214 |
|
| 215 |
+
def get_inventory(self) -> str:
|
| 216 |
+
"""Get current inventory."""
|
| 217 |
+
items = self.state.inventory if hasattr(self.state, 'inventory') and self.state.inventory else []
|
| 218 |
+
|
| 219 |
+
if not items:
|
| 220 |
+
return "Inventory: You are empty-handed."
|
| 221 |
+
|
| 222 |
+
item_names = []
|
| 223 |
+
for item in items:
|
| 224 |
+
item_str = str(item)
|
| 225 |
+
item_lower = item_str.lower()
|
| 226 |
+
if "parent" in item_lower:
|
| 227 |
+
idx = item_lower.index("parent")
|
| 228 |
+
name = item_str[:idx].strip()
|
| 229 |
+
if ":" in name:
|
| 230 |
+
name = name.split(":", 1)[1].strip()
|
| 231 |
+
item_names.append(name)
|
| 232 |
+
elif ":" in item_str:
|
| 233 |
+
name = item_str.split(":")[1].strip()
|
| 234 |
+
item_names.append(name)
|
| 235 |
+
else:
|
| 236 |
+
item_names.append(item_str)
|
| 237 |
+
|
| 238 |
+
return f"Inventory: {', '.join(item_names)}"
|
| 239 |
|
| 240 |
|
| 241 |
+
# Global game state
|
| 242 |
+
_game_state: GameState | None = None
|
| 243 |
|
| 244 |
|
| 245 |
+
def get_game() -> GameState:
|
| 246 |
+
"""Get or initialize the game state."""
|
| 247 |
+
global _game_state
|
| 248 |
+
if _game_state is None:
|
| 249 |
+
_game_state = GameState(INITIAL_GAME)
|
| 250 |
+
return _game_state
|
|
|
|
|
|
|
| 251 |
|
| 252 |
|
| 253 |
# =============================================================================
|
| 254 |
+
# MCP Tools
|
| 255 |
# =============================================================================
|
| 256 |
|
| 257 |
@mcp.tool()
|
| 258 |
+
def play_action(action: str) -> tuple[str, int, bool]:
|
| 259 |
"""
|
| 260 |
+
Execute a game action in the text adventure.
|
| 261 |
+
|
| 262 |
+
Args:
|
| 263 |
+
action: The command to execute (e.g., 'north', 'take lamp', 'open mailbox')
|
| 264 |
+
|
| 265 |
+
Returns: tuple (result, score, game_over), where:
|
| 266 |
+
- result (str): The game's response to your action
|
| 267 |
+
- score (int): The current score
|
| 268 |
+
- game_over (bool): Whether or not the game is over
|
| 269 |
+
"""
|
| 270 |
+
game = get_game()
|
| 271 |
+
result = game.take_action(action)
|
| 272 |
+
|
| 273 |
+
# Add score info
|
| 274 |
+
score_info = f"\n\n[Score: {game.state.score} | Moves: {game.state.moves}]"
|
| 275 |
+
|
| 276 |
+
if game.state.reward > 0:
|
| 277 |
+
result += f"\n\n+{game.state.reward} points! (Total: {game.state.score})"
|
| 278 |
|
| 279 |
+
if game.state.done:
|
| 280 |
+
result += "\n\nGAME OVER"
|
| 281 |
+
else:
|
| 282 |
+
valid = [action for action in game.valid_actions() if action != 'jump'] # jump often means death
|
| 283 |
+
result += f"\n\nCurrent valid actions: {', '.join(valid)}"
|
| 284 |
+
|
| 285 |
+
# return f"{result}\n{game.state.score}\n{int(game.state.done)}"
|
| 286 |
+
return result, game.state.score, game.state.done
|
| 287 |
+
|
| 288 |
+
@mcp.tool()
|
| 289 |
+
def travel(destination: str) -> tuple[str, int, bool]:
|
| 290 |
+
"""
|
| 291 |
+
Travel to a location the user is aware of.
|
| 292 |
|
| 293 |
Args:
|
| 294 |
+
destination: The name of the location to travel to
|
| 295 |
+
|
| 296 |
+
Returns: tuple (result, score, game_over), where:
|
| 297 |
+
- result (str): The game's response to your action
|
| 298 |
+
- score (int): The current score
|
| 299 |
+
- game_over (bool): Whether or not the game is over
|
|
|
|
|
|
|
|
|
|
| 300 |
"""
|
| 301 |
game = get_game()
|
| 302 |
+
result = game.travel(destination)
|
| 303 |
+
|
| 304 |
+
# Add score info
|
| 305 |
+
score_info = f"\n\n[Score: {game.state.score} | Moves: {game.state.moves}]"
|
| 306 |
+
|
| 307 |
+
if game.state.reward > 0:
|
| 308 |
+
result += f"\n\n+{game.state.reward} points! (Total: {game.state.score})"
|
| 309 |
+
|
| 310 |
+
if game.state.done:
|
| 311 |
+
result += "\n\nGAME OVER"
|
| 312 |
+
else:
|
| 313 |
+
        result += f"\n\nCurrent valid actions: {', '.join(a for a in game.valid_actions() if a != 'jump')}"  # jump often means death
|
| 314 |
+
# return f"{result}\n{game.state.score}\n{int(game.state.done)}"
|
| 315 |
+
return result, game.state.score, game.state.done
|
| 316 |
+
|
| 317 |
+
@mcp.tool()
|
| 318 |
+
def get_locations() -> tuple[str, int, bool]:
|
| 319 |
+
"Get a list of all locations the player can go to."
|
| 320 |
+
game = get_game()
|
| 321 |
+
return game.get_locations(n_max=10), game.state.score, game.state.done
|
| 322 |
+
|
| 323 |
+
@mcp.tool()
|
| 324 |
+
def get_unexplored_locations() -> tuple[str, int, bool]:
|
| 325 |
+
"Get a list of all unexplored locations the player can go to."
|
| 326 |
+
game = get_game()
|
| 327 |
+
return game.get_locations(n_max=10, unexplored=True), game.state.score, game.state.done
|
| 328 |
+
# return get_game().get_locations(n_max=10, unexplored=True)
|
| 329 |
+
|
| 330 |
+
@mcp.tool()
|
| 331 |
+
def memory() -> tuple[str, int, bool]:
|
| 332 |
+
"""
|
| 333 |
+
Get a summary of the current game state.
|
| 334 |
|
| 335 |
+
Returns location, score, moves, recent actions, and current observation.
|
| 336 |
+
"""
|
| 337 |
+
game = get_game()
|
| 338 |
+
return game.get_memory(), game.state.score, game.state.done
|
| 339 |
+
|
| 340 |
+
|
| 341 |
+
@mcp.tool()
|
| 342 |
+
def inventory() -> tuple[str, int, bool]:
|
| 343 |
+
"""
|
| 344 |
+
Check what items you are currently carrying.
|
| 345 |
+
"""
|
| 346 |
+
game = get_game()
|
| 347 |
+
return game.get_inventory(), game.state.score, game.state.done
|
| 348 |
|
| 349 |
|
| 350 |
# =============================================================================
|
| 351 |
+
# Main
|
| 352 |
# =============================================================================
|
| 353 |
|
| 354 |
if __name__ == "__main__":
|
|
|
|
| 355 |
mcp.run()
|
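The routing idea used by `travel` (try confirmed edges first, then fall back to unconfirmed ones) can be sketched in isolation. This is a minimal illustration and not the submission's code: `plan_route` and the three-room `demo` map are hypothetical names invented here, and only the `direction` and `sure` edge attributes come from the diff above.

```python
import networkx as nx


def plan_route(graph: nx.DiGraph, source: str, target: str) -> list[str]:
    """Return the directions leading from source to target, or [] if unreachable.

    Hypothetical helper: routes made only of edges flagged sure=True are
    preferred; unconfirmed edges are used only when no confirmed route exists.
    """
    if source not in graph or target not in graph:
        return []

    # Read-only view of the graph that hides every unconfirmed edge.
    confirmed = nx.subgraph_view(
        graph, filter_edge=lambda u, v: graph[u][v]["sure"]
    )
    for candidate in (confirmed, graph):
        try:
            path = nx.shortest_path(candidate, source=source, target=target)
        except nx.NetworkXNoPath:
            continue
        return [graph[u][v]["direction"] for u, v in zip(path, path[1:])]
    return []


# Made-up map: the edge leading back south from "forest" is only a guess.
demo = nx.DiGraph()
demo.add_edge("west of house", "north of house", direction="north", sure=True)
demo.add_edge("north of house", "forest", direction="north", sure=True)
demo.add_edge("forest", "north of house", direction="south", sure=False)
demo.add_edge("forest", "clearing", direction="east", sure=True)
demo.add_edge("clearing", "north of house", direction="west", sure=True)
```

With this map, a route from "forest" to "north of house" goes through "clearing" even though the guessed `south` edge is shorter, which mirrors the choice to trust confirmed transitions over inferred ones.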
requirements.txt
CHANGED
|
@@ -7,3 +7,4 @@
|
|
| 7 |
# Add any additional packages your agent needs below:
|
| 8 |
# numpy
|
| 9 |
# requests
|
|
|
|
|
|
| 7 |
# Add any additional packages your agent needs below:
|
| 8 |
# numpy
|
| 9 |
# requests
|
| 10 |
+
networkx
|
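Since `networkx` is now a dependency, one detail of its API is worth illustrating: `nx.relabel_nodes` returns a relabelled copy by default and only edits the existing graph when `copy=False` is passed. The sketch below uses made-up room names and a placeholder node in the style of the "unexplored (...)" nodes from the diff; it is an illustration under those assumptions, not part of the submission.

```python
import networkx as nx

graph = nx.DiGraph()
graph.add_node("kitchen", explored=True)

# Placeholder for an exit that has been seen but not taken yet.
placeholder = "unexplored (west of kitchen)"
graph.add_node(placeholder, explored=False)
graph.add_edge("kitchen", placeholder, direction="west", sure=True)

# Default behaviour: a relabelled copy is returned, the input graph is untouched.
relabelled_copy = nx.relabel_nodes(graph, {placeholder: "living room"})
still_has_placeholder = placeholder in graph

# copy=False renames the node inside the existing graph, keeping edge attributes.
nx.relabel_nodes(graph, {placeholder: "living room"}, copy=False)
graph.add_node("living room", explored=True)  # mark the room as visited
```

After the in-place call, the edge from "kitchen" keeps its `direction` and `sure` attributes and now points at "living room".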