text-adventure-template

flams committed on Feb 23

Commit

4f5b6ec

1 Parent(s): 615a63b

Zork submission

Files changed (4)

README.md +12 -13
agent.py +388 -167
mcp_server.py +445 -176
requirements.txt +1 -2

README.md CHANGED

@@ -18,11 +18,19 @@ This is my submission for the Text Adventure Agent assignment. My agent uses the
 ## Approach
-<!-- Describe your approach here -->
-- What strategy does your agent use?
-- What tools did you implement in your MCP server?
-- Any interesting techniques or optimizations?
 ## Files
@@ -33,15 +41,6 @@ This is my submission for the Text Adventure Agent assignment. My agent uses the
 | `app.py` | Gradio interface for HF Space |
 | `requirements.txt` | Additional dependencies |
-## How to Submit
-1. Fork the template Space: `https://huggingface.co/spaces/LLM-course/text-adventure-template`
-2. Clone your fork locally
-3. Implement your agent in `agent.py` and `mcp_server.py`
-4. Test locally (see below)
-5. Push your changes to your Space
-6. Submit your Space URL on the course platform
 ## Local Testing
 ```bash

 ## Approach
+The agent follows a ReAct loop (Thought → Tool → Observation) with a clean separation between LLM reasoning and game state management via MCP. At each step the LLM produces structured text containing THOUGHT, TOOL, and ARGS fields; the agent parses this output and dispatches the appropriate MCP tool call. This decoupling means the LLM never interacts with the game directly — it reasons about what to do, and the tool layer executes it, returning observations that feed the next reasoning step.
+A central design choice is the graph-based world map (the WorldMap class). Every room the agent visits becomes a node; movement directions become directed edges. When the agent moves to a new room, a reverse edge is automatically recorded so backtracking is always possible. The map separately tracks unexplored exits — directions mentioned in room descriptions that the agent has not yet traversed. Two BFS routines operate on this graph: `find_path` computes the shortest route between any two explored rooms, and `suggest_exploration` finds the nearest unexplored exit from the current location, giving the agent a concrete next target when it runs out of local options.
+To compensate for the LLM's limited context window, the agent maintains a structured notebook with typed categories: Clue, Puzzle, Item, Danger, NPC, Code, Goal, and Map. Each entry is deduplicated on write to prevent bloat. The `memory` tool returns a full status dump — current location, inventory, all notebook entries, and the map — so the LLM can re-orient itself at any point without relying on conversation history alone.
+Exit detection is automated: a regex scan runs on every game response, extracting direction words (north, south, up, etc.) and automatically calling `register_exits`. This removes a common failure mode where the LLM forgets to register exits, wasting action steps.
+Stuck detection operates at multiple layers. On the server side, an action history counter flags when the same action appears three or more times in the last six actions. On the agent side, a `failed_actions` set records location-action pairs so the prompt can explicitly list what not to retry. A score stagnation detector fires once more than eight consecutive steps pass without a score increase (the check is `consecutive_no_score > 8`), injecting a hint into the prompt that encourages the agent to try a different area or strategy.
+The LLM output parser is deliberately forgiving: after normalizing case, backticks, and spaces, it maps near-miss tool names to real tools by keyword matching (for example, `getmap` resolves to `get_map`); collects ARGS JSON that spans multiple lines; falls back to regex extraction when the JSON is malformed; and folds extra argument keys (target, object, item, direction) into the single `action` string. It also normalizes phrases like "go north" to the bare direction "north".
+Finally, the system prompt encodes a ten-point strategy distilled from expert text adventure play: examine every object, take everything not nailed down, save clues to the notebook, try synonyms when a command fails, and routinely listen and search in each room.
 ## Files
 | `app.py` | Gradio interface for HF Space |
 | `requirements.txt` | Additional dependencies |
 ## Local Testing
 ```bash
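
The graph-based world map described in the README's Approach section can be pictured with a small sketch. This is not the `WorldMap` class from `mcp_server.py`, whose body does not appear in this part of the diff. The names `find_path`, `suggest_exploration`, and `register_exits` come from the README; `add_move`, the `OPPOSITE` table, the data layout, and the room names are assumptions made for illustration. Recording a reverse edge is a heuristic and can be wrong where a passage is one-way or bends.

```python
from collections import deque


class WorldMap:
    """Directed graph of rooms; edge labels are movement directions."""

    # Assumed table of opposites, used to record the way back.
    OPPOSITE = {
        "north": "south", "south": "north",
        "east": "west", "west": "east",
        "up": "down", "down": "up",
    }

    def __init__(self):
        self.edges = {}       # room -> {direction: destination room}
        self.unexplored = {}  # room -> set of directions not yet tried

    def register_exits(self, room, directions):
        """Mark mentioned directions as unexplored unless already traversed."""
        known = self.edges.setdefault(room, {})
        pending = self.unexplored.setdefault(room, set())
        pending.update(d for d in directions if d not in known)

    def add_move(self, src, direction, dst):
        """Record a traversed edge and, if possible, its assumed reverse."""
        self.edges.setdefault(src, {})[direction] = dst
        self.edges.setdefault(dst, {})
        self.unexplored.setdefault(src, set()).discard(direction)
        back = self.OPPOSITE.get(direction)
        if back and back not in self.edges[dst]:
            self.edges[dst][back] = src
            self.unexplored.setdefault(dst, set()).discard(back)

    def _bfs(self, start):
        """Yield (room, path of directions) in breadth-first order."""
        queue = deque([(start, [])])
        seen = {start}
        while queue:
            room, path = queue.popleft()
            yield room, path
            for direction, nxt in self.edges.get(room, {}).items():
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, path + [direction]))

    def find_path(self, start, goal):
        """Return the shortest list of directions, or None if unreachable."""
        for room, path in self._bfs(start):
            if room == goal:
                return path
        return None

    def suggest_exploration(self, start):
        """Return (room, path to it, untried direction) for the nearest frontier."""
        for room, path in self._bfs(start):
            pending = self.unexplored.get(room)
            if pending:
                return room, path, sorted(pending)[0]
        return None


world = WorldMap()
world.register_exits("Hall", ["north", "south"])
world.add_move("Hall", "north", "Kitchen")
world.add_move("Kitchen", "east", "Pantry")
print(world.find_path("Pantry", "Hall"))
print(world.suggest_exploration("Pantry"))
```

Because breadth-first search visits rooms in order of distance, the first room with a pending exit is also the nearest one, which is what makes a single traversal enough for both queries.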

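The server-side repetition check mentioned in the Approach section (the same action three or more times within the last six) could look roughly like the following. The class name `RepeatDetector` and its interface are hypothetical; only the window of six and the threshold of three are taken from the README.

```python
from collections import Counter, deque


class RepeatDetector:
    """Flags an action that keeps recurring inside a sliding window."""

    def __init__(self, window=6, threshold=3):
        # deque(maxlen=...) silently drops the oldest action when full.
        self.recent = deque(maxlen=window)
        self.threshold = threshold

    def record(self, action):
        """Store the action; return True if it now looks like a loop."""
        self.recent.append(action.strip().lower())
        return Counter(self.recent)[self.recent[-1]] >= self.threshold


detector = RepeatDetector()
for command in ["north", "look", "north", "take lamp", "north"]:
    print(command, detector.record(command))
```

A bounded deque keeps the check cheap and means old repetitions stop counting once they slide out of the window.
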
agent.py CHANGED

@@ -1,26 +1,13 @@
 """
-Student Agent for Text Adventure Games
-This is your submission file. Implement the StudentAgent class to play
-text adventure games using the MCP server you also implement.
-Your agent should:
-1. Connect to the MCP server via the provided client
-2. Use the ReAct pattern (Thought -> Action -> Observation)
-3. Call MCP tools to interact with the game
-4. Maximize the game score within the step limit
-Required method:
-    async def run(self, client, game, max_steps, seed, verbose) -> RunResult
-The 'client' is a FastMCP Client already connected to your MCP server.
-Use it to call tools like: await client.call_tool("play_action", {"action": "look"})
-Tips:
-- Start by looking around and understanding your environment
-- Keep track of visited locations to avoid loops
-- Pick up useful items (lamp, sword, etc.)
-- The seed parameter should be used to set your LLM's seed for reproducibility
 """
 import json
@@ -70,22 +57,15 @@ else:
 def call_llm(prompt: str, system_prompt: str, seed: int, max_tokens: int = 300) -> str:
     """
     Call the LLM with the given prompt. Use this function in your agent.
     Args:
         prompt: The user prompt (current game state, history, etc.)
         system_prompt: The system prompt (instructions for the agent)
         seed: Random seed for reproducibility
         max_tokens: Maximum tokens in response (default: 300)
     Returns:
         The LLM's response text
-    Example:
-        response = call_llm(
-            prompt="You are in a forest. What do you do?",
-            system_prompt=SYSTEM_PROMPT,
-            seed=42,
-        )
     """
     messages = [
         {"role": "system", "content": system_prompt},
@@ -124,153 +104,396 @@ class RunResult:
     history: list[tuple[str, str, str]] = field(default_factory=list)
-# =============================================================================
-# System Prompt - Customize this for your agent
-# =============================================================================
-SYSTEM_PROMPT = """You are playing a classic text adventure game.
-GOAL: Explore the world, solve puzzles, and maximize your score.
-AVAILABLE TOOLS (use via MCP):
-- play_action: Execute a game command (north, take lamp, open mailbox, etc.)
-- memory: Get current game state and history (if implemented)
-- inventory: Check what you're carrying (if implemented)
-VALID GAME COMMANDS for play_action:
-- Movement: north, south, east, west, up, down, enter, exit
-- Objects: take <item>, drop <item>, open <thing>, close <thing>, examine <thing>
-- Other: look, inventory, read <thing>, turn on lamp
-RESPOND IN THIS EXACT FORMAT (no markdown):
-THOUGHT: <your reasoning about what to do next>
-TOOL: <tool_name>
-ARGS: <JSON arguments, e.g., {"action": "look"}>
 Example:
-THOUGHT: I should look around to see where I am.
 TOOL: play_action
-ARGS: {"action": "look"}
-"""
-# =============================================================================
-# Student Agent - IMPLEMENT THIS CLASS
-# =============================================================================
 class StudentAgent:
-    """
-    Your ReAct agent implementation.
-    TODO:
-    1. Implement the run() method with the ReAct loop
-    2. Parse LLM responses to extract tool calls
-    3. Track state and avoid loops
-    Use the provided call_llm() function to interact with the LLM.
-    """
     def __init__(self):
-        """Initialize your agent here."""
-        # TODO: Initialize any state tracking you need
-        # self.history = []
-        # self.visited_locations = set()
-        pass
     async def run(
-        self,
-        client,  # FastMCP Client connected to your MCP server
-        game: str,
-        max_steps: int,
-        seed: int,
-        verbose: bool = False,
     ) -> RunResult:
-        """
-        Run the agent for a game session.
-        Args:
-            client: FastMCP Client connected to your MCP server
-            game: Name of the game being played (e.g., "zork1")
-            max_steps: Maximum number of steps to take
-            seed: Random seed for reproducibility (use for LLM calls)
-            verbose: Whether to print detailed output
-        Returns:
-            RunResult with final score and statistics
-        """
-        # TODO: Implement your ReAct loop here
-        #
-        # Basic structure:
-        # 1. Get initial observation (call play_action with "look")
-        # 2. Loop for max_steps:
-        #    a. Build prompt with current observation and history
-        #    b. Call LLM to get thought and action
-        #    c. Parse the response to extract tool and args
-        #    d. Call the tool via client.call_tool(tool_name, args)
-        #    e. Update history and state
-        #    f. Check for game over
-        # 3. Return RunResult with final statistics
-        # Example of calling a tool:
-        # result = await client.call_tool("play_action", {"action": "look"})
-        # observation = result[0].text if result else "No response"
-        # Example of calling the LLM:
-        # response = call_llm(
-        #     prompt="Current observation: " + observation,
-        #     system_prompt=SYSTEM_PROMPT,
-        #     seed=seed,
-        # )
-        # Placeholder implementation - replace with your code
-        locations_visited = set()
-        history = []
-        final_score = 0
-        moves = 0
-        # TODO: Your implementation here
-        # ...
         return RunResult(
-            final_score=final_score,
-            max_score=350,  # Zork1 max score, adjust if needed
-            moves=moves,
-            locations_visited=locations_visited,
-            game_completed=False,
-            history=history,
         )
-    def _build_prompt(self, observation: str, history: list) -> str:
-        """
-        Build the prompt for the LLM.
-        TODO: Implement this to create effective prompts
-        """
-        # TODO: Combine system prompt, history, and current observation
-        pass
-    def _parse_response(self, response: str) -> tuple[str, str, dict]:
-        """
-        Parse LLM response to extract thought, tool name, and arguments.
-        TODO: Implement robust parsing
-        Returns:
-            Tuple of (thought, tool_name, args_dict)
-        """
-        # TODO: Parse the response format:
-        # THOUGHT: ...
-        # TOOL: ...
-        # ARGS: {...}
-        pass
-    def _call_llm(self, prompt: str, system_prompt: str, seed: int) -> str:
-        """
-        Call the LLM with the given prompt.
-        This is a convenience wrapper - you can also use call_llm() directly.
-        """
-        return call_llm(prompt, system_prompt, seed)
 # =============================================================================
@@ -280,12 +503,10 @@ class StudentAgent:
 async def test_agent():
     """Test the agent locally."""
     from fastmcp import Client
-    # Path to your MCP server
     server_path = "mcp_server.py"
     agent = StudentAgent()
     async with Client(server_path) as client:
         result = await agent.run(
             client=client,
@@ -294,7 +515,7 @@ async def test_agent():
             seed=42,
             verbose=True,
         )
         print(f"\nFinal Score: {result.final_score}")
         print(f"Moves: {result.moves}")
         print(f"Locations: {result.locations_visited}")

 """
+MCP ReAct Agent - Enhanced Generalist
+Key improvements over v6:
+  - Richer system prompt with strategy patterns for different game types
+  - Stuck detection + automatic recovery (suggest_exploration, try new verbs)
+  - Smarter history: shows failed actions to avoid repetition
+  - Exit registration from game text (auto-detects mentioned directions)
+  - Multi-phase play: explore → collect → solve → backtrack
+  - Robust parsing with multiple fallback strategies
 """
 import json
 def call_llm(prompt: str, system_prompt: str, seed: int, max_tokens: int = 300) -> str:
     """
     Call the LLM with the given prompt. Use this function in your agent.
     Args:
         prompt: The user prompt (current game state, history, etc.)
         system_prompt: The system prompt (instructions for the agent)
         seed: Random seed for reproducibility
         max_tokens: Maximum tokens in response (default: 300)
     Returns:
         The LLM's response text
     """
     messages = [
         {"role": "system", "content": system_prompt},
     history: list[tuple[str, str, str]] = field(default_factory=list)
+# ─── System Prompt ─────────────────────────────────────────────────────────────
+SYSTEM_PROMPT = """You are an expert text adventure game player. You are methodical, curious, and never give up.
+AVAILABLE TOOLS:
+- play_action: Send a command to the game.
+  ARGS: {"action": "your command"}
+  For movement use direction words: north, south, east, west, up, down, in, out, ne, nw, se, sw
+  For interactions: examine <thing>, take <item>, drop <item>, open <thing>, close <thing>,
+    read <thing>, push <thing>, pull <thing>, turn <thing>, light <thing>, put <item> in <container>,
+    unlock <door> with <key>, give <item> to <npc>, attack <enemy> with <weapon>, tie <item> to <thing>,
+    climb <thing>, enter <thing>, search <thing>, listen, smell, wave <item>, eat <item>, drink <item>
+- think: Plan your strategy. ARGS: {"goal": "...", "thought": "..."}
+- notebook_write: Save clues, codes, puzzle info permanently.
+  ARGS: {"text": "...", "category": "Clue|Puzzle|Item|Danger|NPC|Code|Goal|Map"}
+- notebook_read: Read your saved notes. ARGS: {"keyword": "optional filter"}
+- memory: Full status dump (location, inventory, notes, map). ARGS: {}
+- get_map: View explored map and unexplored exits. ARGS: {}
+- find_path: Get directions to a known room. ARGS: {"target_room": "room name"}
+- suggest_exploration: Get suggestion for nearest unexplored area. ARGS: {}
+- register_exits: Record exits visible in current room.
+  ARGS: {"directions": "north, south, up"}
+STRATEGY — How to play well:
+1. EXPLORE SYSTEMATICALLY: When you enter a new room, ALWAYS do "look" first, then register visible exits with register_exits. Explore every exit.
+2. EXAMINE EVERYTHING: If the game describes objects, furniture, or features — examine them. Things hide under rugs, inside containers, behind paintings.
+3. TAKE EVERYTHING: Collect all portable items. You'll need them later for puzzles.
+4. READ CAREFULLY: The game text contains ALL clues. Unusual descriptions often hint at puzzles.
+5. SAVE CLUES: If you notice a code, inscription, locked door, NPC request, or puzzle — write it in notebook_write immediately.
+6. DON'T REPEAT FAILURES: Check your recent history. If a command didn't work, try a DIFFERENT approach. Use synonyms: get/take, look/examine, push/move.
+7. BACKTRACK SMARTLY: If stuck, call suggest_exploration to find unexplored exits, or find_path to return to a room with unsolved puzzles.
+8. USE ITEMS: When you have items and encounter obstacles, think about which item might help. Try "use X", "put X in Y", "unlock Y with X".
+9. LISTEN AND SEARCH: "listen", "search", "look under X", "look behind X" often reveal hidden things.
+10. CHECK SCORE: If your score increases, you're making progress. If not for a while, try a new area.
+RESPONSE FORMAT (strict):
+THOUGHT: <brief reasoning about what you observe and your plan>
+TOOL: <exactly one tool name>
+ARGS: <valid JSON for that tool>
 Example:
+THOUGHT: I see a rusty door to the north and a brass lamp on the ground. I should take the lamp first.
 TOOL: play_action
+ARGS: {"action": "take lamp"}"""
+# ─── Directions mentioned in text ──────────────────────────────────────────────
+EXIT_PATTERN = re.compile(
+    r"\b(north|south|east|west|up|down|northeast|northwest|southeast|southwest)\b",
+    re.IGNORECASE,
+)
+DIRECTION_SET = {
+    "n",
+    "s",
+    "e",
+    "w",
+    "u",
+    "d",
+    "ne",
+    "nw",
+    "se",
+    "sw",
+    "north",
+    "south",
+    "east",
+    "west",
+    "up",
+    "down",
+    "northeast",
+    "northwest",
+    "southeast",
+    "southwest",
+    "in",
+    "out",
+    "enter",
+    "exit",
+}
 class StudentAgent:
     def __init__(self):
+        self.history: list[dict] = []
+        self.score: int = 0
+        self.max_score: int = 0
+        self.location: str = "Unknown"
+        self.locations_visited: set[str] = set()
+        self.failed_actions: set[str] = set()  # track "location:action" that failed
+        self.consecutive_no_score: int = 0
+        self.last_score: int = 0
     async def run(
+        self, client, game: str, max_steps: int, seed: int, verbose: bool = False
     ) -> RunResult:
+        tools = await client.list_tools()
+        tool_names = [t.name for t in tools]
+        # Initial look
+        result = await client.call_tool("play_action", {"action": "look"})
+        observation = self._extract_result(result)
+        self._update_state(observation)
+        # Register initial exits
+        exits = self._detect_exits(observation)
+        if exits:
+            try:
+                await client.call_tool(
+                    "register_exits", {"directions": ", ".join(exits)}
+                )
+            except Exception:
+                pass
+        if verbose:
+            print(f"\n{'=' * 60}\nINITIAL OBSERVATION:\n{observation}\n{'=' * 60}")
+        step = 0
+        for step in range(1, max_steps + 1):
+            prompt = self._build_prompt(observation, step)
+            response = call_llm(prompt, SYSTEM_PROMPT, seed + step, max_tokens=400)
+            thought, tool_name, tool_args = self._parse_response(response, tool_names)
+            if verbose:
+                print(f"\n--- Step {step} ---")
+                print(f"  THOUGHT: {thought}")
+                print(f"  TOOL: {tool_name}({json.dumps(tool_args)})")
+            try:
+                result = await client.call_tool(tool_name, tool_args)
+                observation = self._extract_result(result)
+            except Exception as e:
+                observation = f"Error: {e}"
+            if verbose:
+                obs_preview = observation[:400].replace("\n", "\n    ")
+                print(f"  RESULT: {obs_preview}")
+            self._update_state(observation)
+            # Auto-register exits when we get a play_action result
+            if tool_name == "play_action":
+                exits = self._detect_exits(observation)
+                if exits:
+                    try:
+                        await client.call_tool(
+                            "register_exits", {"directions": ", ".join(exits)}
+                        )
+                    except Exception:
+                        pass
+                # Track failed movement
+                action = tool_args.get("action", "").lower()
+                if self._is_failure(observation):
+                    self.failed_actions.add(f"{self.location}:{action}")
+            # Track score progress
+            if self.score > self.last_score:
+                self.consecutive_no_score = 0
+                self.last_score = self.score
+            else:
+                self.consecutive_no_score += 1
+            self.history.append(
+                {
+                    "step": step,
+                    "thought": thought,
+                    "tool": tool_name,
+                    "args": tool_args,
+                    "result": observation[:200],
+                    "location": self.location,
+                    "score": self.score,
+                }
+            )
+            if self._is_game_over(observation):
+                break
         return RunResult(
+            final_score=self.score,
+            max_score=self.max_score,
+            moves=step,
+            locations_visited=self.locations_visited,
+            game_completed=self._is_game_over(observation),
+            error=None,
+            history=[
+                (h["tool"], json.dumps(h["args"]), h["result"]) for h in self.history
+            ],
+        )
+    def _build_prompt(self, observation: str, step: int) -> str:
+        parts = []
+        # Status line
+        parts.append(
+            f"[Step {step} | Score: {self.score}/{self.max_score} | "
+            f"Location: {self.location} | Rooms visited: {len(self.locations_visited)}]"
+        )
+        # Recent history (last 7 for better context)
+        if self.history:
+            parts.append("\nRecent history:")
+            for h in self.history[-7:]:
+                action_str = json.dumps(h["args"])
+                loc = h.get("location", "?")
+                result_short = h["result"].replace("\n", " ")[:80]
+                parts.append(f"  [{loc}] {h['tool']}({action_str}) -> {result_short}")
+        # Failed actions at current location (helps avoid repetition)
+        loc_failures = [
+            a.split(":", 1)[1]
+            for a in self.failed_actions
+            if a.startswith(f"{self.location}:")
+        ]
+        if loc_failures:
+            parts.append(f"\nActions that FAILED here: {', '.join(loc_failures)}")
+        # Stuck hint
+        if self.consecutive_no_score > 8:
+            parts.append(
+                "\n[HINT: Score hasn't changed in a while. Consider: "
+                "call suggest_exploration, check memory, examine objects more carefully, "
+                "or try using inventory items on things you've seen.]"
+            )
+        # Current game output
+        parts.append(f"\nGame output:\n{observation}")
+        parts.append("\nWhat do you do next?")
+        return "\n".join(parts)
+    def _parse_response(
+        self, response: str, valid_tools: list[str]
+    ) -> tuple[str, str, dict]:
+        thought = "..."
+        tool_name = "play_action"
+        tool_args = {"action": "look"}
+        lines = response.split("\n")
+        args_lines = []
+        collecting_args = False
+        for line in lines:
+            clean = line.strip()
+            up = clean.upper()
+            if up.startswith("THOUGHT:"):
+                thought = clean.split(":", 1)[1].strip()
+                collecting_args = False
+            elif up.startswith("TOOL:"):
+                raw_tool = clean.split(":", 1)[1].strip().lower().strip("`").strip()
+                # Handle common LLM mistakes
+                raw_tool = raw_tool.replace(" ", "_")
+                if raw_tool in valid_tools:
+                    tool_name = raw_tool
+                elif "play" in raw_tool or "action" in raw_tool:
+                    tool_name = "play_action"
+                elif "note" in raw_tool and "write" in raw_tool:
+                    tool_name = "notebook_write"
+                elif "note" in raw_tool and "read" in raw_tool:
+                    tool_name = "notebook_read"
+                elif "note" in raw_tool:
+                    tool_name = "notebook_write"
+                elif "map" in raw_tool:
+                    tool_name = "get_map"
+                elif "path" in raw_tool:
+                    tool_name = "find_path"
+                elif "suggest" in raw_tool or "explor" in raw_tool:
+                    tool_name = "suggest_exploration"
+                elif "register" in raw_tool or "exit" in raw_tool:
+                    tool_name = "register_exits"
+                collecting_args = False
+            elif up.startswith("ARGS:"):
+                raw = clean.split(":", 1)[1].strip()
+                args_lines = [raw]
+                collecting_args = True
+            elif collecting_args and clean:
+                args_lines.append(clean)
+        # Parse ARGS
+        if args_lines:
+            raw_args = " ".join(args_lines)
+            # Try direct JSON parse
+            try:
+                tool_args = json.loads(raw_args)
+            except json.JSONDecodeError:
+                # Try extracting JSON object
+                m = re.search(r"\{[^{}]+\}", raw_args)
+                if m:
+                    try:
+                        tool_args = json.loads(m.group())
+                    except json.JSONDecodeError:
+                        pass
+                # Fallback: try extracting action string
+                if tool_name == "play_action":
+                    m = re.search(r'"action"\s*:\s*"([^"]+)"', raw_args)
+                    if m:
+                        tool_args = {"action": m.group(1)}
+        # ─── Fix play_action args ───
+        if tool_name == "play_action":
+            action = str(tool_args.get("action", "")).strip()
+            # Merge split args (action + target/object)
+            for extra_key in ("target", "object", "item", "direction"):
+                extra = str(tool_args.get(extra_key, "")).strip()
+                if extra and extra.lower() not in action.lower():
+                    action = f"{action} {extra}".strip()
+            # Strip "go " prefix for bare directions
+            if action.lower().startswith("go "):
+                rest = action[3:].strip().lower()
+                if rest in DIRECTION_SET:
+                    action = rest
+            tool_args = {"action": action or "look"}
+        # ─── Fix find_path args ───
+        if tool_name == "find_path":
+            # Normalize: the tool expects "target_room" not "to" or "room"
+            for key in ("to", "room", "destination", "target"):
+                if key in tool_args and "target_room" not in tool_args:
+                    tool_args["target_room"] = tool_args.pop(key)
+        # Final validation
+        if tool_name not in valid_tools:
+            tool_name = "play_action"
+            if "action" not in tool_args:
+                tool_args = {"action": "look"}
+        return thought, tool_name, tool_args
+    def _extract_result(self, result) -> str:
+        if hasattr(result, "content") and result.content:
+            return result.content[0].text
+        return str(result)
+    def _update_state(self, text: str):
+        m = re.search(r"Score:\s*(\d+)/(\d+)", text, re.IGNORECASE)
+        if m:
+            self.score = int(m.group(1))
+            self.max_score = int(m.group(2))
+        m_loc = re.search(r"\[Location:\s*([^|\]]+)", text)
+        if m_loc:
+            loc = m_loc.group(1).strip()
+            if loc and loc != "Unknown":
+                self.location = loc
+                self.locations_visited.add(loc)
+    def _detect_exits(self, text: str) -> list[str]:
+        """Extract direction words mentioned in game text."""
+        return list(set(EXIT_PATTERN.findall(text.lower())))
+    def _is_failure(self, text: str) -> bool:
+        """Detect if the game rejected our action."""
+        fail_phrases = [
+            "you can't go",
+            "you can't do",
+            "i don't understand",
+            "that's not a verb",
+            "you don't see",
+            "you can't see",
+            "there's no",
+            "you can't",
+            "nothing happens",
+            "is locked",
+            "is closed",
+            "won't budge",
+            "doesn't seem to",
+            "you aren't",
+        ]
+        lower = text.lower()
+        return any(f in lower for f in fail_phrases)
+    def _is_game_over(self, text: str) -> bool:
+        return any(
+            x in text.lower()
+            for x in [
+                "*** you have died ***",
+                "*** you have won ***",
+                "game over",
+                "you have won",
+                "you have died",
+                "would you like to restart",
+            ]
         )
 # =============================================================================
 async def test_agent():
     """Test the agent locally."""
     from fastmcp import Client
     server_path = "mcp_server.py"
     agent = StudentAgent()
     async with Client(server_path) as client:
         result = await agent.run(
             client=client,
             seed=42,
             verbose=True,
         )
         print(f"\nFinal Score: {result.final_score}")
         print(f"Moves: {result.moves}")
         print(f"Locations: {result.locations_visited}")
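
To see what the `EXIT_PATTERN` scan in `agent.py` picks up, the pattern can be run on sample text. The helper name `detect_exits` and the sample sentences below are illustrative. Unlike `_detect_exits`, this variant sorts the result so the order is stable across runs, which is a suggestion rather than the committed behavior.

```python
import re

# Same pattern as EXIT_PATTERN in agent.py.
EXIT_PATTERN = re.compile(
    r"\b(north|south|east|west|up|down|northeast|northwest|southeast|southwest)\b",
    re.IGNORECASE,
)


def detect_exits(text: str) -> list[str]:
    """Return each direction word mentioned in text, once, in sorted order."""
    return sorted(set(EXIT_PATTERN.findall(text.lower())))


print(detect_exits("Paths lead north and East. A staircase goes up."))
# The word boundaries keep "upstairs" from matching "up".
print(detect_exits("You hear noises upstairs."))
# Any mention counts, even when the text says the way is blocked.
print(detect_exits("The door to the north is nailed shut."))
```

Since the scan matches mentions rather than confirmed passages, the exits it registers are best read as candidates to try.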

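The ARGS fallback chain in `_parse_response` can be isolated into a standalone helper to make the order of attempts explicit. The sketch below is one possible structure, not the committed code: the name `parse_args` is hypothetical, each stage returns as soon as it succeeds (so a recovered object keeps all of its keys), and JSON that is not an object, such as a list, is rejected.

```python
import json
import re
from typing import Optional


def parse_args(raw: str) -> Optional[dict]:
    """Recover a JSON object from possibly messy LLM output, or return None."""
    # Stage 1: the text is already valid JSON.
    try:
        parsed = json.loads(raw)
        if isinstance(parsed, dict):
            return parsed
    except json.JSONDecodeError:
        pass

    # Stage 2: a flat {...} object is embedded in surrounding chatter.
    match = re.search(r"\{[^{}]+\}", raw)
    if match:
        try:
            return json.loads(match.group())
        except json.JSONDecodeError:
            pass

    # Stage 3: salvage just the "action" string from broken JSON.
    match = re.search(r'"action"\s*:\s*"([^"]+)"', raw)
    if match:
        return {"action": match.group(1)}
    return None


print(parse_args('Sure: {"action": "put lamp", "target": "case"} done'))
print(parse_args('{"action": "go north",}'))
print(parse_args("just look around"))
```

Returning `None` leaves the choice of default (for example `{"action": "look"}`) to the caller, which keeps the helper independent of any particular tool.
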
mcp_server.py CHANGED

@@ -1,209 +1,478 @@
 """
-Student MCP Server for Text Adventure Games
-This is your MCP server submission. Implement the tools that your agent
-will use to play text adventure games.
-Required tool:
-    play_action(action: str) -> str
-        Execute a game command and return the result.
-Recommended tools:
-    memory() -> str
-        Return current game state, score, and recent history.
-    inventory() -> str
-        Return the player's current inventory.
-    get_map() -> str
-        Return a map of explored locations.
-Test your server with:
-    fastmcp dev submission_template/mcp_server.py
-Then open the MCP Inspector in your browser to test the tools interactively.
 """
 import sys
 import os
-# Add parent directory to path to import games module
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
-from fastmcp import FastMCP
 from games.zork_env import TextAdventureEnv
-# =============================================================================
-# Create the MCP Server
-# =============================================================================
-mcp = FastMCP("Student Text Adventure Server")
-# =============================================================================
-# Game State Management
-# =============================================================================
-class GameManager:
-    """
-    Manages the text adventure game state.
-    TODO: Extend this class to track:
-    - Action history (for memory tool)
-    - Explored locations (for mapping)
-    - Current score and moves
-    """
     def __init__(self):
-        self.env: TextAdventureEnv = None
-        self.state = None
-        self.game_name: str = ""
-        # TODO: Add more state tracking
-        # self.history: list[tuple[str, str]] = []
-        # self.explored_locations: dict[str, set[str]] = {}
-        # self.current_location: str = ""
-    def initialize(self, game: str = "zork1"):
-        """Initialize or reset the game."""
-        self.game_name = game
-        self.env = TextAdventureEnv(game)
         self.state = self.env.reset()
-        # TODO: Reset your state tracking here
-        return self.state.observation
     def step(self, action: str) -> str:
-        """Execute an action and return the result."""
-        if self.env is None:
-            self.initialize()
-        self.state = self.env.step(action)
-        # TODO: Update your state tracking here
-        # self.history.append((action, self.state.observation))
-        # Update location tracking, etc.
         return self.state.observation
-    def get_score(self) -> int:
-        """Get current score."""
-        return self.state.score if self.state else 0
-    def get_moves(self) -> int:
-        """Get number of moves taken."""
-        return self.state.moves if self.state else 0
-# Global game manager
-_game = GameManager()
-def get_game() -> GameManager:
-    """Get or initialize the game manager."""
     global _game
-    if _game.env is None:
-        # Get game from environment variable (set by evaluator)
-        game = os.environ.get("GAME", "zork1")
-        _game.initialize(game)
     return _game
 # =============================================================================
-# MCP Tools - IMPLEMENT THESE
 # =============================================================================
 @mcp.tool()
 def play_action(action: str) -> str:
-    """
-    Execute a game command and return the result.
-    This is the main tool for interacting with the game.
-    Args:
-        action: The command to execute (e.g., "north", "take lamp", "open mailbox")
-    Returns:
-        The game's response to the action
-    Valid commands include:
-        - Movement: north, south, east, west, up, down, enter, exit
-        - Objects: take <item>, drop <item>, open <thing>, examine <thing>
-        - Other: look, inventory, read <thing>, turn on lamp
-    """
-    game = get_game()
-    # TODO: You might want to add action validation here
-    # TODO: You might want to include score changes in the response
-    result = game.step(action)
-    # Optional: Append score info
-    # result += f"\n[Score: {game.get_score()} | Moves: {game.get_moves()}]"
-    return result
-# TODO: Implement additional tools to help your agent
-# @mcp.tool()
-# def memory() -> str:
-#     """
-#     Get the current game state summary.
-#
-#     Returns:
-#         A summary including current location, score, moves, and recent history
-#     """
-#     game = get_game()
-#     # TODO: Return useful state information
-#     pass
-# @mcp.tool()
-# def inventory() -> str:
-#     """
-#     Check what the player is carrying.
-#
-#     Returns:
-#         List of items in the player's inventory
-#     """
-#     game = get_game()
-#     result = game.step("inventory")
-#     return result
-# @mcp.tool()
-# def get_map() -> str:
-#     """
-#     Get a map of explored locations.
-#
-#     Returns:
-#         A text representation of explored locations and connections
-#     """
-#     game = get_game()
-#     # TODO: Return map of explored locations
-#     pass
-# @mcp.tool()
-# def get_valid_actions() -> str:
-#     """
-#     Get a list of likely valid actions from the current location.
-#
-#     Returns:
-#         List of actions that might work here
-#     """
-#     # This is a hint: Jericho provides get_valid_actions()
-#     game = get_game()
-#     if game.env and game.env.env:
-#         valid = game.env.env.get_valid_actions()
-#         return "Valid actions: " + ", ".join(valid[:20])
-#     return "Could not determine valid actions"
-# =============================================================================
-# Run the server
-# =============================================================================
 if __name__ == "__main__":
-    # This runs the server with stdio transport (for MCP clients)
     mcp.run()

 """
+MCP Server - Enhanced Generalist Agent
+Features:
+  - Graph-based mapping with BFS pathfinding
+  - Structured notebook (clues, puzzles, NPCs, dangers)
+  - Explicit inventory tracking
+  - Unexplored exit tracking for systematic exploration
+  - Stuck detection helpers
 """
+import re
 import sys
 import os
+from collections import deque
+from fastmcp import FastMCP
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from games.zork_env import TextAdventureEnv
+INITIAL_GAME = os.environ.get("GAME", "zork1")
+mcp = FastMCP("Text Adventure Server")
+# All recognized movement directions
+DIRECTIONS = {
+    "n",
+    "s",
+    "e",
+    "w",
+    "u",
+    "d",
+    "ne",
+    "nw",
+    "se",
+    "sw",
+    "north",
+    "south",
+    "east",
+    "west",
+    "up",
+    "down",
+    "northeast",
+    "northwest",
+    "southeast",
+    "southwest",
+    "in",
+    "out",
+    "enter",
+    "exit",
+}
+# Canonical direction mapping for consistency in the graph
+DIR_CANONICAL = {
+    "n": "north",
+    "s": "south",
+    "e": "east",
+    "w": "west",
+    "u": "up",
+    "d": "down",
+    "ne": "northeast",
+    "nw": "northwest",
+    "se": "southeast",
+    "sw": "southwest",
+}
+OPPOSITE_DIR = {
+    "north": "south",
+    "south": "north",
+    "east": "west",
+    "west": "east",
+    "up": "down",
+    "down": "up",
+    "northeast": "southwest",
+    "southwest": "northeast",
+    "northwest": "southeast",
+    "southeast": "northwest",
+    "in": "out",
+    "out": "in",
+    "enter": "exit",
+    "exit": "enter",
+}
+def canonicalize(direction: str) -> str:
+    d = direction.lower().strip()
+    return DIR_CANONICAL.get(d, d)
+class WorldMap:
+    """Graph-based world map with BFS pathfinding."""
+    def __init__(self):
+        # room_name -> {canonical_direction: destination_room_name}
+        self.graph: dict[str, dict[str, str]] = {}
+        # room_name -> brief description
+        self.room_info: dict[str, str] = {}
+        # room_name -> set of canonical directions mentioned but not yet taken
+        self.known_exits: dict[str, set[str]] = {}
+    def ensure_room(self, name: str):
+        if name and name != "Unknown":
+            if name not in self.graph:
+                self.graph[name] = {}
+            if name not in self.known_exits:
+                self.known_exits[name] = set()
+    def record_move(self, from_room: str, direction: str, to_room: str):
+        """Record a successful movement between rooms."""
+        d = canonicalize(direction)
+        self.ensure_room(from_room)
+        self.ensure_room(to_room)
+        if from_room != to_room and from_room != "Unknown" and to_room != "Unknown":
+            self.graph[from_room][d] = to_room
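+            # Record the inferred reverse edge, but never overwrite an edge
+            # that was actually observed (some passages are one-way)
+            opp = OPPOSITE_DIR.get(d)
+            if opp:
+                self.graph[to_room].setdefault(opp, from_room)
+                self.known_exits.get(to_room, set()).discard(opp)
+            # Remove from unexplored
+            self.known_exits.get(from_room, set()).discard(d)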
+    def record_blocked(self, room: str, direction: str):
+        """Record that a direction is blocked/doesn't work from a room."""
+        d = canonicalize(direction)
+        self.ensure_room(room)
+        self.graph[room][d] = "[BLOCKED]"
+        self.known_exits.get(room, set()).discard(d)
+    def register_exits(self, room: str, directions: list[str]):
+        """Register exits mentioned in room description that we haven't explored yet."""
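+        self.ensure_room(room)
+        if room not in self.known_exits:
+            # ensure_room skips empty/"Unknown" names; avoid a KeyError below
+            return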
+        for d in directions:
+            cd = canonicalize(d)
+            if cd not in self.graph.get(room, {}):
+                self.known_exits[room].add(cd)
+    def set_room_info(self, room: str, info: str):
+        self.room_info[room] = info[:200]
+    def find_path(self, start: str, end: str) -> list[str] | None:
+        """BFS shortest path. Returns list of directions, or None if no path."""
+        if start == end:
+            return []
+        if start not in self.graph or end not in self.graph:
+            return None
+        queue = deque([(start, [])])
+        visited = {start}
+        while queue:
+            current, path = queue.popleft()
+            for direction, neighbor in self.graph.get(current, {}).items():
+                if neighbor == "[BLOCKED]" or neighbor in visited:
+                    continue
+                if neighbor == end:
+                    return path + [direction]
+                visited.add(neighbor)
+                queue.append((neighbor, path + [direction]))
+        return None
+    def get_unexplored(self) -> list[tuple[str, str]]:
+        """Get all (room, direction) pairs that are known but unexplored."""
+        result = []
+        for room, dirs in self.known_exits.items():
+            for d in dirs:
+                result.append((room, d))
+        return result
+    def get_nearest_unexplored(self, current: str) -> tuple[list[str], str] | None:
+        """Find the nearest unexplored exit from current position.
+        Returns (path_to_room, unexplored_direction) or None."""
+        unexplored = self.get_unexplored()
+        if not unexplored:
+            return None
+        best = None
+        best_len = float("inf")
+        for room, direction in unexplored:
+            if room == current:
+                return ([], direction)
+            path = self.find_path(current, room)
+            if path is not None and len(path) < best_len:
+                best = (path, direction)
+                best_len = len(path)
+        return best
+    def to_text(self, current: str = "") -> str:
+        if not self.graph:
+            return "Map is empty — no paths recorded yet."
+        lines = []
+        for room in sorted(self.graph.keys()):
+            exits = self.graph[room]
+            marker = " << YOU ARE HERE" if room == current else ""
+            exit_parts = []
+            for d, dest in sorted(exits.items()):
+                if dest == "[BLOCKED]":
+                    exit_parts.append(f"{d}:BLOCKED")
+                else:
+                    exit_parts.append(f"{d}->{dest}")
+            unexplored = self.known_exits.get(room, set())
+            for d in sorted(unexplored):
+                exit_parts.append(f"{d}:???")
+            exits_str = ", ".join(exit_parts) if exit_parts else "no known exits"
+            lines.append(f"  [{room}]{marker}: {exits_str}")
+        return "Known Map:\n" + "\n".join(lines)
+class Notebook:
+    """Structured notebook for clues, items, puzzles, etc."""
     def __init__(self):
+        self.entries: list[dict] = []
+    def add(self, text: str, category: str = "General") -> str:
+        cat = category.upper().strip()
+        entry = {"category": cat, "text": text}
+        # Avoid exact duplicates
+        for e in self.entries:
+            if e["text"] == text and e["category"] == cat:
+                return f"(Already noted: {text})"
+        self.entries.append(entry)
+        return f"Noted [{cat}]: {text} (Total: {len(self.entries)} entries)"
+    def to_text(self) -> str:
+        if not self.entries:
+            return "Notebook is empty."
+        lines = []
+        for e in self.entries:
+            lines.append(f"  [{e['category']}] {e['text']}")
+        return "\n".join(lines)
+    def search(self, keyword: str) -> str:
+        kw = keyword.lower()
+        matches = [
+            e
+            for e in self.entries
+            if kw in e["text"].lower() or kw in e["category"].lower()
+        ]
+        if not matches:
+            return f"No notes matching '{keyword}'."
+        return "\n".join(f"  [{e['category']}] {e['text']}" for e in matches)
+class GameState:
+    def __init__(self, game_name: str):
+        self.env = TextAdventureEnv(game_name)
         self.state = self.env.reset()
+        self.world_map = WorldMap()
+        self.notebook = Notebook()
+        self.current_location = "Unknown"
+        self._update_location(self.state.location)
+        self.world_map.ensure_room(self.current_location)
+        self.current_goal = (
+            "Explore the environment, collect useful items, and increase score."
+        )
+        self.plan = ""
+        self.action_history: list[str] = []  # last N actions for stuck detection
+    def _clean_location_name(self, raw: str) -> str:
+        if not raw:
+            return "Unknown"
+        if ":" in raw:
+            raw = raw.split(":", 1)[1].strip()
+        raw = re.sub(
+            r"\s*(Parent\d+|Sibling\d+|Child\d+|Attributes\s*\[.*?\]|Properties\s*\[.*?\]).*",
+            "",
+            raw,
+        )
+        return raw.strip() or "Unknown"
+    def _update_location(self, raw_loc: str):
+        self.current_location = self._clean_location_name(raw_loc)
     def step(self, action: str) -> str:
+        prev_loc = self.current_location
+        action_clean = action.strip()
+        self.state = self.env.step(action_clean)
+        self._update_location(self.state.location)
+        # Track action history for stuck detection
+        self.action_history.append(action_clean.lower())
+        if len(self.action_history) > 20:
+            self.action_history = self.action_history[-20:]
+        # Determine if this was a movement command
+        action_lower = action_clean.lower()
+        is_move = action_lower in DIRECTIONS
+        if is_move:
+            if prev_loc != self.current_location and prev_loc != "Unknown":
+                self.world_map.record_move(
+                    prev_loc, action_lower, self.current_location
+                )
+            elif prev_loc == self.current_location and prev_loc != "Unknown":
+                self.world_map.record_blocked(prev_loc, action_lower)
         return self.state.observation
+    def get_inventory_text(self) -> str:
+        items = getattr(self.state, "inventory", None)
+        if items is None:
+            return "(unknown; try the 'inventory' command)"
+        if isinstance(items, list):
+            # An empty list means nothing is carried, not that the state is unknown
+            return ", ".join(str(i) for i in items) if items else "empty-handed"
+        return str(items) or "empty-handed"
+    def is_stuck(self) -> bool:
+        """Detect if agent is looping: same action repeated 3+ times in last 6."""
+        if len(self.action_history) < 6:
+            return False
+        recent = self.action_history[-6:]
+        from collections import Counter
+        counts = Counter(recent)
+        return counts.most_common(1)[0][1] >= 3
+_game: GameState | None = None
+def get_game() -> GameState:
     global _game
+    if _game is None:
+        _game = GameState(INITIAL_GAME)
     return _game
 # =============================================================================
+# TOOLS
 # =============================================================================
 @mcp.tool()
 def play_action(action: str) -> str:
+    """Execute a game command. Examples: 'north', 'take lamp', 'examine rug', 'open door', 'inventory'.
+    For directions, just use: north/south/east/west/up/down/in/out/ne/nw/se/sw.
+    Returns the game's response plus your current location and score."""
+    g = get_game()
+    obs = g.step(action)
+    # Build response with structured metadata
+    parts = [obs]
+    parts.append(
+        f"\n[Location: {g.current_location} | Score: {g.state.score}/{g.state.max_score}]"
+    )
+    # If stuck, add a gentle nudge
+    if g.is_stuck():
+        parts.append(
+            "[WARNING: You seem to be repeating actions. Try something different!]"
+        )
+    return "\n".join(parts)
+@mcp.tool()
+def think(goal: str, thought: str) -> str:
+    """Plan your strategy. Update your current goal and reasoning.
+    Use this to organize what you want to accomplish and why."""
+    g = get_game()
+    g.current_goal = goal
+    g.plan = thought
+    return f"Goal updated: {goal}\nPlan: {thought}\nLocation: {g.current_location}"
+@mcp.tool()
+def notebook_write(text: str, category: str = "Clue") -> str:
+    """Save important information to your permanent notebook.
+    Categories: Clue, Puzzle, Item, Danger, NPC, Code, Goal, Map.
+    Use this to remember puzzle hints, codes, locked doors, NPC dialogue, etc."""
+    g = get_game()
+    return g.notebook.add(text, category)
+@mcp.tool()
+def notebook_read(keyword: str = "") -> str:
+    """Read your notebook. Optionally filter by keyword or category.
+    Call with no keyword to see everything."""
+    g = get_game()
+    if keyword:
+        return g.notebook.search(keyword)
+    return g.notebook.to_text()
+@mcp.tool()
+def memory() -> str:
+    """Get a full status dump: location, goal, inventory, notebook, and map."""
+    g = get_game()
+    return f"""=== STATUS ===
+Location: {g.current_location}
+Score: {g.state.score}/{g.state.max_score}
+Goal: {g.current_goal}
+Plan: {g.plan}
+=== INVENTORY ===
+{g.get_inventory_text()}
+=== NOTEBOOK ({len(g.notebook.entries)} entries) ===
+{g.notebook.to_text()}
+=== MAP ===
+{g.world_map.to_text(g.current_location)}
+"""
+@mcp.tool()
+def get_map() -> str:
+    """View your explored map with all known connections and unexplored exits."""
+    g = get_game()
+    txt = g.world_map.to_text(g.current_location)
+    unexplored = g.world_map.get_unexplored()
+    if unexplored:
+        txt += f"\n\nUnexplored exits ({len(unexplored)}):"
+        for room, d in unexplored:
+            txt += f"\n  {room} -> {d} (not yet visited)"
+    return txt
+@mcp.tool()
+def find_path(target_room: str) -> str:
+    """Find the shortest path from your current location to a target room.
+    Returns step-by-step directions, or says if no path is known."""
+    g = get_game()
+    path = g.world_map.find_path(g.current_location, target_room)
+    if path is None:
+        # Try fuzzy match
+        for room in g.world_map.graph:
+            if target_room.lower() in room.lower():
+                path = g.world_map.find_path(g.current_location, room)
+                if path is not None:
+                    target_room = room
+                    break
+    if path is None:
+        return f"No known path from '{g.current_location}' to '{target_room}'. You may need to explore more."
+    if not path:
+        return f"You are already at '{target_room}'!"
+    return f"Path to '{target_room}': {' -> '.join(path)} ({len(path)} steps)"
+@mcp.tool()
+def suggest_exploration() -> str:
+    """Get a suggestion for where to explore next, based on unexplored exits.
+    Finds the nearest unexplored direction and tells you how to get there."""
+    g = get_game()
+    result = g.world_map.get_nearest_unexplored(g.current_location)
+    if result is None:
+        return "No unexplored exits recorded. Try: look (to spot exits), or try directions manually."
+    path, unexplored_dir = result
+    if not path:
+        return f"There's an unexplored exit right here: go '{unexplored_dir}'!"
+    path_str = " -> ".join(path)
+    return (
+        f"Nearest unexplored exit: follow [{path_str}], then try '{unexplored_dir}'."
+    )
+@mcp.tool()
+def register_exits(directions: str) -> str:
+    """Tell the map about exits you see in the current room description.
+    Pass a comma-separated list of directions, e.g. 'north, south, up'.
+    This helps track what you haven't explored yet."""
+    g = get_game()
+    dirs = [d.strip().lower() for d in directions.split(",") if d.strip()]
+    valid = [d for d in dirs if canonicalize(d) in DIRECTIONS or d in DIRECTIONS]
+    if valid:
+        g.world_map.register_exits(g.current_location, valid)
+        return f"Registered exits at {g.current_location}: {', '.join(valid)}"
+    return "No valid directions recognized. Use: north, south, east, west, up, down, in, out, etc."
 if __name__ == "__main__":
     mcp.run()

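The mapping logic in `mcp_server.py` above can be tried without a game environment. The following is a minimal standalone sketch, not the submitted code: it keeps only the edge recording and the BFS from `WorldMap`, and the room names (`Hall`, `Library`, `Study`) and their connections are made up for illustration.

```python
from collections import deque
from typing import Optional

# Hypothetical subset of OPPOSITE_DIR, enough for this example.
OPPOSITE = {"north": "south", "south": "north", "east": "west", "west": "east"}

# room -> {direction: destination}
graph: dict = {}


def record_move(src: str, direction: str, dst: str) -> None:
    """Store the observed edge plus an inferred reverse edge."""
    graph.setdefault(src, {})[direction] = dst
    # setdefault keeps any reverse edge that was already recorded
    graph.setdefault(dst, {}).setdefault(OPPOSITE[direction], src)


def find_path(start: str, end: str) -> Optional[list]:
    """Breadth-first search; returns the directions of a shortest route."""
    if start == end:
        return []
    if start not in graph or end not in graph:
        return None
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        room, path = queue.popleft()
        for direction, neighbor in graph[room].items():
            if neighbor in visited:
                continue
            if neighbor == end:
                return path + [direction]
            visited.add(neighbor)
            queue.append((neighbor, path + [direction]))
    return None


# Two moves made by the agent (illustrative rooms only).
record_move("Hall", "north", "Library")
record_move("Library", "east", "Study")

# Backtracking works because the reverse edges were inferred.
print(find_path("Study", "Hall"))
```

Because BFS expands rooms in order of distance, the first route that reaches the target is a shortest one in number of moves, which is what the `find_path` and `suggest_exploration` tools rely on.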
requirements.txt CHANGED Viewed

@@ -5,5 +5,4 @@
 # Do not add jericho, fastmcp here - they are installed during evaluation
 # Add any additional packages your agent needs below:
-# numpy
-# requests

 # Do not add jericho, fastmcp here - they are installed during evaluation
 # Add any additional packages your agent needs below:
+python-dotenv
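
The loop-detection heuristic used by `is_stuck` in `mcp_server.py` above (the same action at least 3 times within the last 6) can be exercised in isolation. This is a sketch under assumptions: the `WINDOW` and `THRESHOLD` names are introduced here for illustration, and a bounded `deque` stands in for the list slicing used in the submitted code.

```python
from collections import Counter, deque

WINDOW = 6      # number of recent actions inspected (hypothetical name)
THRESHOLD = 3   # repetitions inside the window that count as a loop


def is_stuck(history, window: int = WINDOW, threshold: int = THRESHOLD) -> bool:
    """Return True when one action dominates the most recent window."""
    if len(history) < window:
        return False
    recent = list(history)[-window:]
    # most_common(1) yields [(action, count)] for the most frequent action
    return Counter(recent).most_common(1)[0][1] >= threshold


# deque(maxlen=20) drops the oldest entries automatically,
# replacing the manual truncation to the last 20 actions.
history = deque(maxlen=20)
for action in ["north", "look", "north", "take lamp", "north", "south"]:
    history.append(action)

print(is_stuck(history))
```

The check only counts repetitions of a single action, so a two-step cycle such as `north`, `south`, `north`, `south` is flagged only once one of the two commands reaches the threshold inside the window.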