---
title: Agentic Zork
emoji: 🎮
colorFrom: green
colorTo: purple
sdk: gradio
sdk_version: 6.5.1
app_file: app.py
pinned: true
license: mit
hf_oauth: true
short_description: 'Third assignment: Playing Zork has never been so boring!'
---
# Text Adventure LLM Agent Project: Report on Ideas

The primary objective of the agent in this submission is to navigate the text adventure game using a depth-first search (DFS) approach, exploring as many actions as possible.
## DFS Strategy Implementation

The agent's design is specifically structured to mimic a depth-first search by:
- Exploring all directions from each location (north, south, east, west, up, down, etc.) before backtracking.
- Tracking the actions and directions that have been tried at each location, ensuring it does not repeat previous actions that failed or that didn't progress the game state.
- Prioritizing unexplored paths and trying new actions whenever possible. When all directions from a location are exhausted, the agent backtracks and tries new directions from other places.
The design rests on four main components:

**History and Exploration Tracking:** The agent maintains a history of recent actions and records the directions tried at each location. This information is used to build the prompt that guides the LLM (Large Language Model) in deciding what to do next. The agent keeps track of the following:
- Tried Actions: The list of actions already attempted in the current location (e.g., "moved north," "opened the mailbox").
- Failed Actions: The list of actions that didn’t work (e.g., "you can’t go that way," "nothing happens").
- Untried Directions: The directions at each location that have not been explored yet, which the LLM is encouraged to attempt next.
This history is fed into the LLM's prompt to guide decision-making, helping the agent explore new actions and avoid repeating failed ones.
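The tracking described above can be sketched as a small per-location state object. This is an illustrative sketch only: the class and method names (`ExplorationTracker`, `record`, `untried_directions`) are hypothetical and not taken from the submission's `app.py`.

```python
from collections import defaultdict

class ExplorationTracker:
    """Hypothetical per-location exploration state, as the report describes."""
    DIRECTIONS = ["north", "south", "east", "west", "up", "down"]

    def __init__(self):
        self.tried = defaultdict(set)    # location -> actions attempted there
        self.failed = defaultdict(set)   # location -> actions that failed there
        self.history = []                # recent (action, outcome) pairs

    def record(self, location, action, outcome, failed):
        self.tried[location].add(action)
        if failed:
            self.failed[location].add(action)
        self.history.append((action, outcome))
        self.history = self.history[-5:]  # keep only the last few moves

    def untried_directions(self, location):
        return [d for d in self.DIRECTIONS if d not in self.tried[location]]

tracker = ExplorationTracker()
tracker.record("West of House", "north", "You can't go that way.", failed=True)
print(tracker.untried_directions("West of House"))
# north was tried (and failed), so only the remaining five directions are returned
```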
**Depth-First Search in Action:** The DFS approach is realized by the following steps:
- Initial Exploration: When the agent first encounters a location, it enumerates the possible exits and keeps following unexplored directions until none remain.
- Backtracking: If the agent reaches a dead-end or exhausts all directions at a particular location, it backtracks to a previously visited location to try other unexplored directions.
- Exploring New Locations: The agent prioritizes traveling to new, unexplored locations to maximize the number of areas visited. The agent is encouraged not to revisit previously explored locations unless there is no other option.
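The explore-then-backtrack policy above can be sketched as a small decision function. This is an assumption-laden sketch, not the submission's code: the `next_move` helper and the `OPPOSITE` table are illustrative, and backtracking is modeled simply as retracing the last step taken.

```python
# Reverse of each movement command, used to retrace steps when backtracking.
OPPOSITE = {"north": "south", "south": "north",
            "east": "west", "west": "east",
            "up": "down", "down": "up"}

def next_move(untried, path):
    """Pick an unexplored direction, or backtrack when none remain.

    untried: directions not yet tried at the current location.
    path: the sequence of moves that led here from the start.
    """
    if untried:
        return untried[0], "explore"      # DFS: go deeper first
    if path:
        return OPPOSITE[path[-1]], "backtrack"  # dead end: retrace last step
    return None, "exhausted"              # nothing left anywhere

# At a dead end reached by going north, the agent backtracks south.
move, kind = next_move(untried=[], path=["north"])
print(move, kind)  # south backtrack
```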
**Prompt Construction for the LLM:** The LLM's decision-making is guided by the prompt, which includes a summary of the game state (current location, score), the recent history of actions, and directions tried. The prompt is designed to give the LLM all the necessary context to make an informed decision:
- Current Location: The agent’s current position in the game.
- Score: The current score, which helps the agent prioritize actions that might yield higher rewards.
- Failed Actions: Actions that have already failed at the current location are explicitly listed, so the LLM is discouraged from repeating them.
- Untried Directions: Directions that haven’t been explored yet are highlighted, prompting the LLM to try those.
- Recent Actions: A summary of the last few actions and their outcomes, giving the agent more context on the game state.
By feeding this rich context to the LLM, the agent is able to make more informed decisions, ensuring it explores new paths and actions efficiently.
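A prompt assembled from those five pieces of context might look like the sketch below. The field names and exact wording are assumptions for illustration; the submission's actual template will differ.

```python
def build_prompt(location, score, failed, untried, recent):
    """Assemble a context-rich prompt from the tracked game state (illustrative)."""
    lines = [
        f"You are playing a text adventure. Current location: {location}.",
        f"Score: {score}.",
    ]
    if failed:
        lines.append("Do NOT repeat these failed actions here: " + ", ".join(failed))
    if untried:
        lines.append("Untried directions to prioritize: " + ", ".join(untried))
    if recent:
        lines.append("Recent moves: " + "; ".join(f"{a} -> {o}" for a, o in recent))
    lines.append("Reply with a single game command.")
    return "\n".join(lines)

prompt = build_prompt(
    "West of House", 0,
    failed=["north"],
    untried=["south", "west"],
    recent=[("open mailbox", "Opening the mailbox reveals a leaflet.")],
)
print(prompt)
```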
**Failure Handling:** To avoid getting stuck in loops, the agent has mechanisms to:
- Track failed actions to avoid repeating them at the same location.
- Force actions when the LLM appears to be stuck in a state of reasoning without taking meaningful action. If the agent detects that it's not progressing (e.g., it calls "look" too many times), it forces a movement or interaction.
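The forcing mechanism can be sketched as a wrapper around the LLM's suggested command. The threshold of three consecutive "look" commands and the `choose_action` name are assumptions; the point is only the override-when-stalled pattern described above.

```python
def choose_action(llm_action, recent_actions, untried_directions, look_limit=3):
    """Override a stalled LLM with a forced movement (illustrative sketch).

    If the LLM suggests "look" again after the last `look_limit` actions were
    already "look", force an untried direction instead to make progress.
    """
    looks = sum(1 for a in recent_actions[-look_limit:] if a == "look")
    if llm_action == "look" and looks >= look_limit and untried_directions:
        return untried_directions[0]  # forced move breaks the loop
    return llm_action

# Three "look"s in a row: the suggested fourth is replaced by a movement.
print(choose_action("look", ["look", "look", "look"], ["south"]))  # south
```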