text-adventure-template

F10JM

Improved agent with structured exploration and LLM-powered action extraction

5c50c03 about 2 months ago

metadata

title: Text Adventure Agent
emoji: 🎮
colorFrom: green
colorTo: purple
sdk: gradio
sdk_version: 6.5.1
app_file: app.py
pinned: false
license: mit

Text Adventure LLM Agent

Approach

This submission implements a structured exploration agent that plays text adventure games. It combines an MCP server built with FastMCP and the Qwen2.5-72B-Instruct model, accessed through the Hugging Face Inference API.

MCP Server Design

The MCP server exposes five tools: play_action, memory, get_valid_actions, inventory, and get_map. The key design decision is failed action tracking: every action that neither scores points nor moves to a new location is added to a per-location failed_actions set, surfaced via the memory tool and used to filter get_valid_actions results.
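The failed action tracking described above can be sketched in plain Python. This is a minimal illustration, not the submission's actual server code: the class name `FailedActionTracker`, its method names, and the `score_delta` parameter are assumptions made for the example, and the FastMCP tool wiring is left out.

```python
from collections import defaultdict


class FailedActionTracker:
    """Hypothetical per-location record of actions that made no progress."""

    def __init__(self):
        # Maps a location name to the set of actions that failed there.
        self.failed_actions = defaultdict(set)

    def record(self, location, action, score_delta, new_location):
        """Mark the action as failed if it neither scored nor changed location."""
        scored = score_delta > 0
        moved = new_location != location
        if not scored and not moved:
            self.failed_actions[location].add(action)

    def filter_valid(self, location, candidates):
        """Return the candidates not yet known to fail at this location."""
        failed = self.failed_actions.get(location, set())
        return [action for action in candidates if action not in failed]


tracker = FailedActionTracker()
# "open window" changes nothing, so it is recorded as failed in the Kitchen.
tracker.record("Kitchen", "open window", score_delta=0, new_location="Kitchen")
# "go west" changes location, so it is not recorded as failed.
tracker.record("Kitchen", "go west", score_delta=0, new_location="Living Room")
remaining = tracker.filter_valid("Kitchen", ["open window", "go west", "take sack"])
```

Keying the sets by location matters because an action that fails in one room may still be useful in another.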

Agent Design

The agent moves beyond a pure ReAct loop with structured exploration:

New location detection: uses the memory tool to reliably detect location changes rather than parsing raw text
On entering a new location: immediately calls get_valid_actions and uses the LLM to extract promising actions from the observation
Per-location action log: tracks every action tried at each location with a one-sentence LLM-generated summary of the outcome
Promising action queue: maintains a prioritized list of actions to try at each location before falling back to the LLM
LLM-powered action extraction: calls the LLM on each new observation to identify scoring opportunities and useful interactions
Exploration bias: if the agent has been in the same location for 5+ steps, it forces a movement action in a direction it has not yet tried
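The per-location queue, action log, and exploration bias listed above could fit together roughly as follows. This is a sketch under stated assumptions: `Explorer`, `LocationState`, the `DIRECTIONS` list, and the rule for picking the forced direction (first untried one) are all hypothetical, and only the 5-step threshold comes from the description.

```python
from collections import deque

# Assumed set of movement commands; the real game may accept others.
DIRECTIONS = ["north", "south", "east", "west", "up", "down"]
STUCK_THRESHOLD = 5  # from the description: 5+ steps in one location


class LocationState:
    def __init__(self):
        self.queue = deque()  # promising actions, highest priority first
        self.log = []         # (action, one-sentence summary) pairs
        self.steps_here = 0   # steps taken since arriving


class Explorer:
    def __init__(self):
        self.locations = {}
        self.current = None

    def enter(self, location, promising):
        """Switch to a location; seed its queue the first time it is seen."""
        if location != self.current:
            self.current = location
            state = self.locations.setdefault(location, LocationState())
            state.steps_here = 0
            if not state.log and not state.queue:
                state.queue.extend(promising)

    def next_action(self):
        """Return a forced move, a queued action, or None for LLM fallback."""
        state = self.locations[self.current]
        tried = {action for action, _ in state.log}
        forced = state.steps_here >= STUCK_THRESHOLD
        state.steps_here += 1
        if forced:
            for direction in DIRECTIONS:
                if direction not in tried:
                    return direction
        while state.queue:
            action = state.queue.popleft()
            if action not in tried:
                return action
        return None

    def record(self, action, summary):
        """Append the action and its outcome summary to the location's log."""
        self.locations[self.current].log.append((action, summary))


explorer = Explorer()
explorer.enter("Cellar", ["take lamp", "open door"])
chosen = []
for _ in range(6):
    action = explorer.next_action()
    chosen.append(action)
    if action is not None:
        explorer.record(action, "Nothing happened.")
```

In this run the two queued actions are used first, the next three calls return `None` (where a real agent would ask the LLM), and the sixth call forces a move because five steps have passed without leaving the Cellar.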

This hybrid approach uses the LLM for two specific tasks (extracting promising actions, summarizing outcomes) rather than for every decision, making exploration much more efficient. The agent successfully reaches the Gnome Room within 100 steps.
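The two LLM tasks named above, extracting promising actions and summarizing outcomes, can be isolated behind small helper functions. The sketch below assumes a hypothetical `LLM` callable that takes a prompt string and returns the completion text; the prompts, function names, and the `fake_llm` stand-in are illustrative and not taken from the submission. A real agent would replace `fake_llm` with a call to the hosted model.

```python
from typing import Callable, List

# Hypothetical interface: prompt in, completion text out.
LLM = Callable[[str], str]


def extract_promising_actions(llm: LLM, observation: str, limit: int = 5) -> List[str]:
    """Ask the model for candidate commands and return a deduplicated list."""
    prompt = (
        f"List up to {limit} game commands worth trying, one per line.\n"
        f"Observation:\n{observation}"
    )
    actions: List[str] = []
    for line in llm(prompt).splitlines():
        # Strip list markers and whitespace, then normalize case.
        action = line.strip(" -*\t").lower()
        if action and action not in actions:
            actions.append(action)
    return actions[:limit]


def summarize_outcome(llm: LLM, action: str, observation: str) -> str:
    """Ask the model for a summary and keep only its first line."""
    prompt = f"In one sentence, summarize what happened after '{action}':\n{observation}"
    text = llm(prompt).strip()
    return text.splitlines()[0] if text else ""


def fake_llm(prompt: str) -> str:
    """Stand-in for the hosted model so the sketch runs offline."""
    if prompt.startswith("List"):
        return "- Take lamp\n- open trapdoor\n- take lamp\n"
    return "The lamp is now in your inventory.\nAnything after this is ignored."


actions = extract_promising_actions(fake_llm, "A brass lamp sits on the table.")
summary = summarize_outcome(fake_llm, "take lamp", "Taken.")
```

Passing the model in as a callable keeps the parsing logic testable without network access.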