VZ22 committed
Commit 748ada7 · 1 Parent(s): adc4f14

cleaned and added README

Files changed (2)
  1. README.md +12 -5
  2. agent.py +5 -9
README.md CHANGED
@@ -16,13 +16,20 @@ license: mit
 
 This is my submission for the Text Adventure Agent assignment. My agent uses the ReAct pattern to play text adventure games via MCP.
 
-## Approach
-
-<!-- Describe your approach here -->
-
-- What strategy does your agent use?
-- What tools did you implement in your MCP server?
-- Any interesting techniques or optimizations?
+## Approach (My Report)
+The idea behind my approach is that the game has an internal state: within a given state, only a subset of actions has any effect, and the same action always produces the same output, so there is no point repeating an action within the same state. However, once the state changes, a previously tried action may become relevant again.
+
+For me, the state is first defined by the location. Within a location, the state can still change (in the Lost Pig game, for example, when we hear something and then `listen` again, `northeast` becomes a valid direction). As a naive heuristic, I treat an output as a new state if it contains a "!" and we have never seen that output before.
+
+At each step, I tell the agent which actions it has already tried within the current state and instruct it not to repeat them. If it repeats anyway, I loop (at most 5 times) until it outputs something new. If more than 5 actions have already been tried within the same state, I instead force a random action (either an idle action such as `wait` or `listen`, or a move to a different place).
+
+Additionally, I maintain an internal map: each time the agent performs an action that moves it, I update the map accordingly. At each step, I tell the agent about its neighbors (the location's name, unknown, or inaccessible if we already tried and failed to go there). When a new state is detected, the inaccessible places are reset to unknown.
+
+
+Finally, I treat the output of `look` as the default state description and store it. If another action produces something similar (Levenshtein ratio above 0.8), I consider it already seen, so even a "!" in the output does not matter (as in the fountain room of the Lost Pig game).
+
+In the MCP server, I only changed a few things. First, I used Jericho's `get_dictionary` function to get the list of words the game accepts; if the server receives an action that is not valid, it falls back to `look`. I added a `last_observation` variable, which is what I use to represent the state (it keeps only the game environment's output, not the trailing score). Finally, for the location, I used Jericho's `get_player_location` function as a foolproof way of determining where we are (before finding that function, I used a parser function in the agent file, which is still there).
+
 
 ## Files
 
agent.py CHANGED
@@ -215,7 +215,6 @@ class StudentAgent:
         self.history: list[dict] = []
         self.score: int = 0
         self.history_state_tried_action = {}
-        self.useless_actions = {}
         self.location_state = {}  # to each location, we have a set of every observation made here
 
         self.idle_actions = ["listen", "wait", "diagnose", "yell", "pray", "launch", "take all"]  # Actions that don't change location
@@ -293,7 +292,6 @@ class StudentAgent:
         else:
             observation += " Be thorough, examine everything around you and try to find all treasures and points of interest! Also remember your objective"
         locations_visited.add(current_location)
-        self.useless_actions[current_location] = []
 
         prompt = self._build_prompt(observation)
         prompt += self._look_for_neighboring_locations(prompt)
@@ -357,13 +355,6 @@ class StudentAgent:
             # Look if we got the same observation as for a "look"
             current_obs = await client.call_tool("last_observation", {})  # observation also has the score
             current_obs = self._extract_result(current_obs)
-            if tool_args.get("action", "").lower() == "look":
-                look_observation = current_obs.lower()
-                if "nothing" in look_observation or "can't see" in look_observation or "dark" in look_observation:
-                    self.useless_actions[current_location].append("look")
-                elif levenshtein(look_observation, current_obs, ratio=True) > 0.8:
-                    self.useless_actions[current_location].append(f"{tool_name}({tool_args})")
-                    not_new_state = True
             tried_action_in_same_state.append((tool_name, tool_args))
 
             if verbose:
@@ -373,6 +364,11 @@ class StudentAgent:
             if verbose:
                 print(f"[ERROR] {e}")
 
+            if tool_args.get("action", "").lower() == "look":
+                look_observation = current_obs.lower()
+            elif levenshtein(look_observation, current_obs, ratio=True) > 0.8:
+                not_new_state = True
+
             # Track location
             location = await client.call_tool("current_location", {})
             location = self._extract_result(location)
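The server-side vocabulary fallback described in the README ("if it receives an action that is not valid, then I default it to `look`") can be sketched roughly as follows. This is a hypothetical sketch: `sanitize_action` and the plain word set are illustrations, whereas the actual server builds its vocabulary from Jericho's `get_dictionary` (whose entries are truncated parser words, a detail this sketch ignores).

```python
def sanitize_action(action: str, vocabulary: set[str]) -> str:
    """Fall back to `look` when the proposed action uses a word the game's
    parser does not know. `vocabulary` stands in for the word list obtained
    from Jericho's get_dictionary (word truncation ignored for simplicity)."""
    words = action.lower().split()
    if not words or any(word not in vocabulary for word in words):
        return "look"
    return action


# Usage sketch with an invented vocabulary
vocab = {"go", "north", "take", "lamp", "look", "listen", "wait"}
print(sanitize_action("take lamp", vocab))      # take lamp
print(sanitize_action("cast fireball", vocab))  # look
```

Defaulting to `look` rather than rejecting the turn keeps the game loop moving and, conveniently, refreshes the stored default-state description at the same time.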