InesManelB committed on
Commit
4eb5e8b
·
1 Parent(s): ac36746

Updated README file

Files changed (1): README.md (+17 −16)
README.md CHANGED
@@ -14,15 +14,27 @@ license: mit
 
 ## Overview
 
-This is my submission for the Text Adventure Agent assignment. My agent uses the ReAct pattern to play text adventure games via MCP.
+This is my submission for the Text Adventure Agent assignment. The agent builds upon the baseline Agentic-Zork implementation, introducing agentic framework improvements and a Just-In-Time Reinforcement Learning (JitRL) mechanism for cross-episode learning without gradient updates.
 
 ## Approach
 
-<!-- Describe your approach here -->
 
-- What strategy does your agent use?
-- What tools did you implement in your MCP server?
-- Any interesting techniques or optimizations?
+### Agentic Framework
+- **ReAct agent** with structured prompting: the agent proposes multiple candidate actions with confidence scores and reasoning at each step
+- **History summarization**: a dedicated LLM call generates a structured summary (`[SUMMARY]`, `[PROGRESS]`, `[LOCATION]`) at each step to provide compact context without overloading the context window
+- **Valid action constraining**: the agent is restricted to the set of valid actions provided by the Jericho engine, eliminating invalid command errors
+- **Inventory injection**: current inventory is included directly in the prompt, avoiding unnecessary tool calls
+
+### JitRL Cross-Episode Memory
+- After each episode, an LLM-based evaluator assigns step-level rewards based on long-term impact
+- Discounted returns $G_t$ are computed for each (state, action) pair and stored in a FAISS vector index
+- At each step, similar past states are retrieved and used to estimate action advantages $\hat{A}(s, a) = \hat{Q}(s, a) - \hat{V}(s)$
+- Candidate action scores are updated following the JitRL rule: $z'(s,a) = z(s,a) + \beta \hat{A}(s,a)$
+
+### MCP Server Tools
+- `play_action`: executes a game command and returns the full game state
+- `reset_game`: initializes or resets the game environment
+- `get_valid_actions`: returns the list of valid actions at the current state
 
 ## Files
 
@@ -34,14 +46,6 @@ This is my submission for the Text Adventure Agent assignment. My agent uses the
 | `app.py` | Gradio interface for HF Space |
 | `requirements.txt` | Additional dependencies |
 
-## How to Submit
-
-1. Fork the template Space: `https://huggingface.co/spaces/LLM-course/text-adventure-template`
-2. Clone your fork locally
-3. Implement your agent in `agent.py` and `mcp_server.py`
-4. Test locally (see below)
-5. Push your changes to your Space
-6. Submit your Space URL on the course platform
 
 ## Local Testing
 
@@ -54,7 +58,4 @@ fastmcp dev mcp_server.py
 
 # Run your agent on a game
 python run_agent.py --agent . --game lostpig -v -n 20
-
-# Run evaluation
-python -m evaluation.evaluate -s . -g lostpig -t 3
 ```