Zork-Adventure-RL-Agent

InesManelB

Updated README file

4eb5e8b about 2 months ago

metadata

title: Text Adventure Agent Submission
emoji: 🗺
colorFrom: green
colorTo: blue
sdk: gradio
sdk_version: 5.12.0
app_file: app.py
pinned: false
license: mit

Text Adventure Agent Submission

Overview

This is my submission for the Text Adventure Agent assignment. The agent builds on the baseline Agentic-Zork implementation and adds two things: improvements to the agentic framework (structured prompting, history summarization, valid-action constraining, and inventory injection) and a Just-In-Time Reinforcement Learning (JitRL) mechanism that learns across episodes without gradient updates.

Approach

Agentic Framework

ReAct agent with structured prompting: the agent proposes multiple candidate actions with confidence scores and reasoning at each step
History summarization: a dedicated LLM call generates a structured summary ([SUMMARY], [PROGRESS], [LOCATION]) at each step to provide compact context without overloading the context window
Valid-action constraining: the agent may only choose from the set of valid actions provided by the Jericho engine, which eliminates invalid-command errors
Inventory injection: current inventory is included directly in the prompt, avoiding unnecessary tool calls
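
The candidate-proposal and valid-action steps above can be sketched roughly as follows. This is a minimal illustration, not the code in `agent.py`: the `Candidate` dataclass, the `choose_action` helper, the confidence range, and the fallback behaviour are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One proposed action, as the LLM might return it (hypothetical shape)."""
    action: str
    confidence: float  # assumed to lie in [0, 1]
    reasoning: str


def choose_action(candidates: list[Candidate], valid_actions: list[str]) -> str:
    """Pick the highest-confidence candidate that the engine accepts."""
    # Normalize case and whitespace so "Take Lamp " matches "take lamp".
    valid = {a.strip().lower(): a for a in valid_actions}
    allowed = [c for c in candidates if c.action.strip().lower() in valid]
    if not allowed:
        # Fallback (an assumption): first valid action, or "look" if none.
        return valid_actions[0] if valid_actions else "look"
    best = max(allowed, key=lambda c: c.confidence)
    return valid[best.action.strip().lower()]


candidates = [
    Candidate("open mailbox", 0.7, "The mailbox may contain an item."),
    Candidate("fly north", 0.9, "Not something the engine accepts here."),
    Candidate("go north", 0.4, "Explore a new room."),
]
valid_actions = ["open mailbox", "go north", "go south"]
print(choose_action(candidates, valid_actions))  # open mailbox
```

Filtering before ranking means a confident but invalid proposal ("fly north") can never be sent to the game, which is the point of constraining to the engine's valid-action list.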

JitRL Cross-Episode Memory

After each episode, an LLM-based evaluator assigns a reward to every step based on that step's long-term impact on the episode
Discounted returns $G_t$ are computed for each (state, action) pair and stored in a FAISS vector index
At each step, similar past states are retrieved and used to estimate action advantages $\hat{A}(s, a) = \hat{Q}(s, a) - \hat{V}(s)$
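Candidate action scores are updated following the JitRL rule $z'(s,a) = z(s,a) + \beta \hat{A}(s,a)$, where $z(s,a)$ is the candidate's original score and $\beta$ controls how strongly the memory-based advantage shifts it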
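
The pipeline above might look roughly like the sketch below. Several things here are assumptions rather than the contents of `cross_episode_memory.py`: the function names, the values of `gamma`, `beta`, and `k`, the reward indexing ($G_t = r_t + \gamma G_{t+1}$), and the use of mean returns over neighbours as $\hat{Q}$ and $\hat{V}$. To keep the example self-contained, a brute-force NumPy nearest-neighbour search stands in for the FAISS index, and 2-D toy vectors stand in for real state embeddings.

```python
import numpy as np


def discounted_returns(rewards, gamma=0.95):
    """G_t = r_t + gamma * G_{t+1}, computed backwards over one episode."""
    returns = np.zeros(len(rewards))
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def estimate_advantages(query, memory_vecs, memory_actions, memory_returns,
                        candidates, k=3):
    """Estimate A(s, a) = Q(s, a) - V(s) from the k nearest stored states."""
    # Brute-force L2 search; stands in for the FAISS index lookup.
    dists = np.linalg.norm(memory_vecs - query, axis=1)
    nearest = np.argsort(dists)[:k]
    v_hat = memory_returns[nearest].mean()  # V(s): mean return of neighbours
    advantages = {}
    for action in candidates:
        hits = [i for i in nearest if memory_actions[i] == action]
        # Q(s, a): mean return of neighbours that took this action.
        # With no evidence, fall back to V(s), giving zero advantage.
        q_hat = memory_returns[hits].mean() if hits else v_hat
        advantages[action] = q_hat - v_hat
    return advantages


def update_scores(scores, advantages, beta=1.0):
    """Apply z'(s, a) = z(s, a) + beta * A(s, a) to each candidate."""
    return {a: z + beta * advantages[a] for a, z in scores.items()}


# Toy memory: three stored (state, action, return) entries.
memory_vecs = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
memory_actions = ["open mailbox", "go north", "go south"]
memory_returns = np.array([1.0, 0.0, -1.0])

query = np.array([0.0, 0.1])
scores = {"open mailbox": 0.4, "go north": 0.6}
advantages = estimate_advantages(query, memory_vecs, memory_actions,
                                 memory_returns, list(scores), k=2)
updated = update_scores(scores, advantages, beta=1.0)
print(discounted_returns([0.0, 0.0, 1.0], gamma=0.5).tolist())  # [0.25, 0.5, 1.0]
print(max(updated, key=updated.get))  # open mailbox
```

In this toy run the LLM initially prefers "go north", but the two nearest stored states show a higher return for "open mailbox", so the update flips the ranking without any gradient step.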

MCP Server Tools

play_action: executes a game command and returns the full game state
reset_game: initializes or resets the game environment
get_valid_actions: returns the list of valid actions at the current state
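
The contracts of the three tools could be illustrated as below. To stay runnable without Jericho or an MCP server, this sketch uses a hypothetical `ToyGame` stand-in and plain Python functions; in `mcp_server.py` the functions would be registered as MCP tools and backed by the real game environment. The fields returned by `play_action` are an assumption about what "full game state" contains.

```python
class ToyGame:
    """Hypothetical two-room stand-in for the Jericho environment."""

    ROOMS = {
        "field": {"go north": "house"},
        "house": {"go south": "field"},
    }

    def __init__(self):
        self.reset()

    def reset(self):
        self.location, self.moves = "field", 0
        return f"You are in the {self.location}."

    def valid_actions(self):
        return sorted(self.ROOMS[self.location]) + ["look"]

    def step(self, command):
        self.moves += 1
        exits = self.ROOMS[self.location]
        if command in exits:
            self.location = exits[command]
        elif command != "look":
            return "You can't do that."
        return f"You are in the {self.location}."


game = ToyGame()


def reset_game() -> str:
    """Initialize or reset the game and return the opening text."""
    return game.reset()


def get_valid_actions() -> list[str]:
    """Return the commands accepted in the current state."""
    return game.valid_actions()


def play_action(action: str) -> dict:
    """Execute a command and return the resulting game state."""
    observation = game.step(action)
    return {
        "observation": observation,
        "location": game.location,
        "moves": game.moves,
        "valid_actions": game.valid_actions(),
    }


reset_game()
print(get_valid_actions())                  # ['go north', 'look']
print(play_action("go north")["location"])  # house
```

Returning the valid actions alongside the observation is one way to spare the agent a second tool call per step; whether the actual server does this is not stated here.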

Files

File	Description
`agent.py`	ReAct agent with `StudentAgent` class
`mcp_server.py`	MCP server with game interaction tools
`cross_episode_memory.py`	Cross-episode memory that persists across agent runs
`app.py`	Gradio interface for HF Space
`requirements.txt`	Additional dependencies

Local Testing

# Install dependencies
pip install -r requirements.txt

# Test the MCP server interactively
fastmcp dev mcp_server.py

# Run your agent on a game
python run_agent.py --agent . --game lostpig -v -n 20