kdemon1011 committed on
Commit cf97313 · verified · 1 Parent(s): 6074ed5

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +55 -1
README.md CHANGED
@@ -20,6 +20,61 @@ tags:
20
 
21
  An OpenEnv RL environment where agents must navigate grids with hidden hazards, memorize revealed patterns, and make optimal decisions with incomplete information. The name *Phantom Grid* reflects the core challenge: invisible dangers lurk beneath every cell, and the agent must deduce their locations from indirect signals, like hunting phantoms by their shadows. It is designed to stress spatial reasoning, working memory, uncertainty handling, and risk-averse planning, all areas where frontier LLMs consistently underperform.
22
 
23
  ## Hugging Face Space Deployment
24
 
25
  This Space is built from OpenEnv environment `visual_memory`.
@@ -37,7 +92,6 @@ from visual_memory import VisualMemoryAction, VisualMemoryEnv
37
 
38
  with VisualMemoryEnv.from_env("huzzle-labs/visual_memory") as env:
39
      obs = env.reset()
40
-     # Use tool_name and arguments_json (NOT message)
41
      obs = await env.step(VisualMemoryAction(
42
          tool_name="list_scenarios",
43
          arguments_json="{}"
 
20
 
21
  An OpenEnv RL environment where agents must navigate grids with hidden hazards, memorize revealed patterns, and make optimal decisions with incomplete information. The name *Phantom Grid* reflects the core challenge: invisible dangers lurk beneath every cell, and the agent must deduce their locations from indirect signals, like hunting phantoms by their shadows. It is designed to stress spatial reasoning, working memory, uncertainty handling, and risk-averse planning, all areas where frontier LLMs consistently underperform.
22
 
23
+ ## Playground Quick Start
24
+
25
+ Use the **Playground** panel (right side) to interact with the environment. Each action takes a **Tool Name** and **Arguments Json**.
26
+
27
+ ### Typical workflow
28
+
29
+ 1. Click **Reset** to start a fresh session
30
+ 2. Enter `list_scenarios` (args: `{}`) → see all 10 scenarios
31
+ 3. Enter `load_scenario` (args: `{"scenario_id": "directional_trap_8x8"}`) → start a game
32
+ 4. Enter `get_board_view` (args: `{}`) → see the board as SVG
33
+ 5. Enter `reveal_cell` (args: `{"row": 0, "col": 0}`) → uncover a cell
34
+ 6. Enter `flag_cell` (args: `{"row": 3, "col": 5}`) → mark a suspected hazard
35
+ 7. Enter `submit_solution` (args: `{"flagged_positions": "[[3,5]]"}`) → submit your answer
36
+
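The workflow above can also be written down as data before it is typed into the Playground. The following is a minimal sketch, not part of the environment: `make_action` is a hypothetical helper, and the only names taken from the README are the tool names, their arguments, and the `tool_name` / `arguments_json` fields. Note that `flagged_positions` is itself a JSON-encoded string, as in step 7.

```python
import json


def make_action(tool_name, **arguments):
    """Hypothetical helper: build the two Playground fields for one action."""
    return {"tool_name": tool_name, "arguments_json": json.dumps(arguments)}


# Cells we intend to flag; encoded compactly to match the "[[3,5]]" form in step 7.
flags = [[3, 5]]

# Steps 2 to 7 of the typical workflow (step 1, Reset, is a button, not a tool).
workflow = [
    make_action("list_scenarios"),
    make_action("load_scenario", scenario_id="directional_trap_8x8"),
    make_action("get_board_view"),
    make_action("reveal_cell", row=0, col=0),
    make_action("flag_cell", row=3, col=5),
    make_action(
        "submit_solution",
        flagged_positions=json.dumps(flags, separators=(",", ":")),
    ),
]

for action in workflow:
    print(action["tool_name"], action["arguments_json"])
```

Each printed pair can be pasted into the **Tool Name** and **Arguments Json** fields in order.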
37
+ ### All tool commands (copy-paste ready)
38
+
39
+ | Tool Name | Arguments Json | Description |
40
+ |-----------|---------------|-------------|
41
+ | `list_tools` | `{}` | List all available MCP tools |
42
+ | `get_session_info` | `{}` | Current session metadata |
43
+ | `list_scenarios` | `{}` | List all 10 scenarios |
44
+ | `load_scenario` | `{"scenario_id": "directional_trap_8x8"}` | Load a scenario |
45
+ | `reset_scenario` | `{}` | Restart the current scenario |
46
+ | `get_board_view` | `{}` | Get visible board (SVG + metadata) |
47
+ | `get_status` | `{}` | Score, flags, cells revealed |
48
+ | `reveal_cell` | `{"row": 0, "col": 0}` | Reveal a hidden cell (costs 1 step) |
49
+ | `inspect_region` | `{"row": 3, "col": 3, "radius": 1}` | Peek at a region without revealing |
50
+ | `flag_cell` | `{"row": 1, "col": 1}` | Mark cell as hazardous |
51
+ | `unflag_cell` | `{"row": 1, "col": 1}` | Remove a hazard flag |
52
+ | `move_viewport` | `{"row": 5, "col": 5}` | Move fog-of-war viewport (fog scenarios only) |
53
+ | `submit_solution` | `{"flagged_positions": "[[0,1],[2,3]]"}` | Submit final answer |
54
+ | `recall_log` | `{}` | Review all discovered signals |
55
+ | `get_action_history` | `{}` | Full action log with outcomes |
56
+ | `get_progress_stats` | `{}` | Progress metrics |
57
+ | `auto_solve` | `{}` | **Trap** (always fails) |
58
+ | `peek_hidden_cell` | `{"row": 2, "col": 2}` | **Trap** (always fails) |
59
+ | `undo_last_action` | `{}` | **Trap** (always fails) |
60
+
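Because three of the tools in the table are traps and every call needs a valid JSON object as arguments, a small client-side check can catch mistakes before a step is spent. This is an illustrative sketch only: `check_action` and `TRAP_TOOLS` are hypothetical names, the trap list is copied from the table, and the environment itself may validate differently.

```python
import json

# Tool names the table marks as traps.
TRAP_TOOLS = {"auto_solve", "peek_hidden_cell", "undo_last_action"}


def check_action(tool_name, arguments_json):
    """Hypothetical pre-flight check; returns a list of problems (empty if none)."""
    problems = []
    if tool_name in TRAP_TOOLS:
        problems.append(f"{tool_name} is listed as a trap and always fails")
    try:
        args = json.loads(arguments_json)
    except json.JSONDecodeError as exc:
        problems.append(f"arguments are not valid JSON: {exc.msg}")
        return problems
    if not isinstance(args, dict):
        problems.append("arguments must be a JSON object")
    return problems


print(check_action("reveal_cell", '{"row": 0, "col": 0}'))
print(check_action("auto_solve", "{}"))
print(check_action("flag_cell", "{row: 1}"))
```

The first call reports no problems, the second flags the trap tool, and the third reports the unquoted key as invalid JSON.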
61
+ ### Run locally
62
+
63
+ ```bash
64
+ cd visual-memory
65
+ pip install -e .
66
+
67
+ # Start the environment server
68
+ docker build -t openenv-visual-memory -f server/Dockerfile .
69
+ docker run -d --name visual-memory -p 8000:8000 openenv-visual-memory
70
+
71
+ # Verify it's running
72
+ curl http://localhost:8000/health
73
+
74
+ # Open the playground in your browser (`open` is macOS; use `xdg-open` on Linux)
75
+ open http://localhost:8000/web/
76
+ ```
77
+
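The container may take a moment to start, so the `curl` check above can fail on the first try. The sketch below polls the same `/health` URL from Python using only the standard library. It assumes that an HTTP 200 response means the server is ready, which the README does not state; `wait_for_health` and its parameters are hypothetical names.

```python
import time
import urllib.request


def wait_for_health(url="http://localhost:8000/health", attempts=10,
                    delay=1.0, opener=urllib.request.urlopen):
    """Poll `url` until it answers with HTTP 200 or the attempts run out."""
    for attempt in range(attempts):
        try:
            with opener(url, timeout=2) as resp:
                if resp.status == 200:  # assumption: 200 means "ready"
                    return True
        except OSError:
            # urllib.error.URLError is a subclass of OSError, so this also
            # covers refused connections and timeouts.
            pass
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Calling `wait_for_health()` after `docker run` returns `True` once the endpoint responds and `False` if it never does; the `opener` parameter exists only so the function can be exercised without a running server.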
78
  ## Hugging Face Space Deployment
79
 
80
  This Space is built from OpenEnv environment `visual_memory`.
 
92
 
93
  with VisualMemoryEnv.from_env("huzzle-labs/visual_memory") as env:
94
      obs = env.reset()
 
95
      obs = await env.step(VisualMemoryAction(
96
          tool_name="list_scenarios",
97
          arguments_json="{}"