Spaces: huzzle-labs/visual_memory

kdemon1011 committed 10 days ago

Commit 84ba78e (verified) · 1 parent: 4e482f3

Upload folder using huggingface_hub

Files changed (2)

README.md +57 -27
server/app.py +51 -27

README.md CHANGED

@@ -22,41 +22,71 @@ An OpenEnv RL environment where agents must navigate grids with hidden hazards,
 ## Playground Quick Start
-Use the **Playground** panel (right side) to interact with the environment. Each action takes a **Tool Name** and **Arguments Json**.
 ### Typical workflow
 1. Click **Reset** to start a fresh session
-2. Enter `list_scenarios` (args: `{}`) → see all 10 scenarios
-3. Enter `load_scenario` (args: `{"scenario_id": "directional_trap_8x8"}`) → start a game
-4. Enter `get_board_view` (args: `{}`) → see the board as SVG
-5. Enter `reveal_cell` (args: `{"row": 0, "col": 0}`) → uncover a cell
-6. Enter `flag_cell` (args: `{"row": 3, "col": 5}`) → mark a suspected hazard
-7. Enter `submit_solution` (args: `{"flagged_positions": "[[3,5]]"}`) → submit your answer
 ### All tool commands (copy-paste ready)
 | Tool Name | Arguments Json | Description |
 |-----------|---------------|-------------|
-| `list_tools` | `{}` | List all available MCP tools |
-| `get_session_info` | `{}` | Current session metadata |
-| `list_scenarios` | `{}` | List all 10 scenarios |
-| `load_scenario` | `{"scenario_id": "directional_trap_8x8"}` | Load a scenario |
-| `reset_scenario` | `{}` | Restart the current scenario |
-| `get_board_view` | `{}` | Get visible board (SVG + metadata) |
-| `get_status` | `{}` | Score, flags, cells revealed |
-| `reveal_cell` | `{"row": 0, "col": 0}` | Reveal a hidden cell (costs 1 step) |
-| `inspect_region` | `{"center_row": 3, "center_col": 3, "radius": 1}` | Peek at a region without revealing |
-| `flag_cell` | `{"row": 1, "col": 1}` | Mark cell as hazardous |
-| `unflag_cell` | `{"row": 1, "col": 1}` | Remove a hazard flag |
-| `move_viewport` | `{"row": 5, "col": 5}` | Move fog-of-war viewport (fog scenarios only) |
-| `submit_solution` | `{"flagged_positions": "[[0,1],[2,3]]"}` | Submit final answer |
-| `recall_log` | `{}` | Review all discovered signals |
-| `get_action_history` | `{}` | Full action log with outcomes |
-| `get_progress_stats` | `{}` | Progress metrics |
-| `auto_solve` | `{}` | **Trap** — always fails |
-| `peek_hidden_cell` | `{"row": 2, "col": 2}` | **Trap** — always fails |
-| `undo_last_action` | `{}` | **Trap** — always fails |
 ### Run locally
@@ -65,7 +95,7 @@ cd visual-memory
 pip install -e .
 # Start the environment server
-docker build -t openenv-visual-memory -f server/Dockerfile .
 docker run -d --name visual-memory -p 8000:8000 openenv-visual-memory
 # Verify it's running

 ## Playground Quick Start
+Use the **Playground** panel (right side) to interact with the environment. Type a **Tool Name** and **Arguments Json**, then click **Step**.
 ### Typical workflow
 1. Click **Reset** to start a fresh session
+2. Enter `list_tools` (args: `{}`) → discover all available tools and their parameters
+3. Enter `list_scenarios` (args: `{}`) → see all 10 scenarios
+4. Enter `load_scenario` (args: `{"scenario_id": "directional_trap_8x8"}`) → start a game
+5. Enter `get_board_view` (args: `{}`) → see the board as SVG
+6. Enter `reveal_cell` (args: `{"row": 0, "col": 0}`) → uncover a cell and read its signal
+7. Enter `inspect_region` (args: `{"center_row": 3, "center_col": 3, "radius": 1}`) → peek at nearby cells without revealing
+8. Enter `flag_cell` (args: `{"row": 3, "col": 5}`) → mark a suspected hazard
+9. Enter `submit_solution` (args: `{"flagged_positions": "[[3,5]]"}`) → submit your answer (ends the game)
 ### All tool commands (copy-paste ready)
+#### Discovery & session tools
+| Tool Name | Arguments Json | Description |
+|-----------|---------------|-------------|
+| `list_tools` | `{}` | List every available tool with its parameters and types |
+| `get_session_info` | `{}` | Current session/episode ID, step count, whether a scenario is loaded |
+| `list_scenarios` | `{}` | List all 10 scenarios with difficulty, board size, and how-to-play hints |
+| `load_scenario` | `{"scenario_id": "directional_trap_8x8"}` | Load and start a scenario (resets any in-progress game) |
+| `reset_scenario` | `{}` | Restart the current scenario from scratch |
+#### Observation tools
 | Tool Name | Arguments Json | Description |
 |-----------|---------------|-------------|
+| `get_board_view` | `{}` | Render the board as SVG with cell-count metadata (free — no step cost) |
+| `get_status` | `{}` | Game status: step count, max steps, flags remaining, game over state (free) |
+| `reveal_cell` | `{"row": 0, "col": 0}` | Reveal a hidden cell — returns its content (costs 1 step) |
+| `inspect_region` | `{"center_row": 3, "center_col": 3, "radius": 1}` | Peek at cells in a radius without revealing them (costs 1 step) |
+| `move_viewport` | `{"row": 5, "col": 5}` | Move the fog-of-war camera center (fog scenarios only, costs 1 step) |
+> **Note:** `inspect_region` uses `center_row` / `center_col` (not `row` / `col`). `radius` is optional and defaults to `1`.
+#### Action tools
+| Tool Name | Arguments Json | Description |
+|-----------|---------------|-------------|
+| `flag_cell` | `{"row": 1, "col": 1}` | Mark a cell as hazardous (costs 1 step) |
+| `unflag_cell` | `{"row": 1, "col": 1}` | Remove a hazard flag (costs 1 step) |
+| `submit_solution` | `{"flagged_positions": "[[0,1],[2,3]]"}` | Submit your final answer — ends the game |
+> **Note:** `submit_solution` also accepts an optional `safe_positions` argument (JSON string of `[[row,col],...]`).
+#### Memory & history tools
+| Tool Name | Arguments Json | Description |
+|-----------|---------------|-------------|
+| `recall_log` | `{}` | Review all signals and memory events discovered so far (free) |
+| `get_action_history` | `{}` | Full log of every action taken and its outcome (free) |
+| `get_progress_stats` | `{}` | Progress metrics: % cells revealed, flags placed, steps remaining (free) |
+#### Trap tools (avoid these!)
+These exist to test whether an agent takes shortcuts. They always fail and give a **-0.1 reward penalty**.
+| Tool Name | Arguments Json | Description |
+|-----------|---------------|-------------|
+| `auto_solve` | `{}` | Attempts to auto-solve — always rejected |
+| `peek_hidden_cell` | `{"row": 2, "col": 2}` | Attempts to cheat-peek a cell — always rejected |
+| `undo_last_action` | `{}` | Attempts to undo — always rejected |
 ### Run locally
 pip install -e .
 # Start the environment server
+docker build -t openenv-visual-memory -f Dockerfile .
 docker run -d --name visual-memory -p 8000:8000 openenv-visual-memory
 # Verify it's running
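
The README examples pass `flagged_positions` (and the optional `safe_positions`) as a JSON string nested inside the Arguments Json object, so the list is encoded twice. A minimal sketch of building that payload with the standard library follows; the helper name `submit_solution_args` is hypothetical and not part of the environment's API.

```python
import json


def submit_solution_args(flagged, safe=None):
    """Build the Arguments Json string for `submit_solution`.

    Hypothetical helper: each position list is first serialized to a compact
    JSON string (matching the README examples), then embedded as a string
    value in the outer JSON object.
    """
    args = {"flagged_positions": json.dumps(flagged, separators=(",", ":"))}
    if safe is not None:
        # `safe_positions` is described as optional in the README note.
        args["safe_positions"] = json.dumps(safe, separators=(",", ":"))
    return json.dumps(args)


print(submit_solution_args([[0, 1], [2, 3]]))
```

The printed string can be pasted into the **Arguments Json** field as-is. Whether the server also accepts a plain nested list instead of a string is not stated in this commit, so the sketch sticks to the documented string form.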

server/app.py CHANGED

@@ -25,52 +25,76 @@ import openenv.core.env_server.web_interface as _wi  # noqa: E402
 _wi.DEFAULT_QUICK_START_MARKDOWN = """
 ### How to use this environment
-Use the **Playground** (right panel) to interact. Enter a **Tool Name** and **Arguments Json**, then click **Step**.
-#### 1. Start a game
-| Step | Tool Name | Arguments Json |
-|------|-----------|---------------|
-| Reset | Click **Reset** | — |
-| List scenarios | `list_scenarios` | `{}` |
-| Load a game | `load_scenario` | `{"scenario_id": "directional_trap_8x8"}` |
-#### 2. Explore the board
 | Tool Name | Arguments Json | What it does |
 |-----------|---------------|--------------|
-| `get_board_view` | `{}` | See the board (SVG) |
-| `get_status` | `{}` | Score, flags, progress |
-| `reveal_cell` | `{"row": 0, "col": 0}` | Uncover a cell (costs 1 step) |
-| `inspect_region` | `{"center_row": 3, "center_col": 3, "radius": 1}` | Peek without revealing |
-| `recall_log` | `{}` | Review all signals found |
-#### 3. Flag hazards & submit
 | Tool Name | Arguments Json | What it does |
 |-----------|---------------|--------------|
-| `flag_cell` | `{"row": 1, "col": 1}` | Mark as hazardous |
-| `unflag_cell` | `{"row": 1, "col": 1}` | Remove flag |
-| `submit_solution` | `{"flagged_positions": "[[0,1],[2,3]]"}` | Submit answer (ends game) |
-#### Available scenarios
-`ambiguous_cluster_10x10` · `directional_trap_8x8` · `partial_intel_9x9` · `cascading_deduction_11x11` · `safe_zone_identification_9x9` · `flash_fade_minefield_7x7` · `delayed_recall_keys_8x8` · `fog_labyrinth_10x10` · `fog_key_hunt_8x8` · `decoy_minefield_8x10`
 #### Connect from Python
 ```python
 from visual_memory import VisualMemoryAction, VisualMemoryEnv
-with VisualMemoryEnv.from_env("<SPACE_ID>") as env:
-    obs = env.reset()
-    obs = await env.step(VisualMemoryAction(
-        tool_name="load_scenario",
-        arguments_json='{"scenario_id": "directional_trap_8x8"}'
-    ))
-```
-Or connect directly: `VisualMemoryEnv(base_url="http://localhost:8000")`
 For more information, see the [OpenEnv documentation](https://meta-pytorch.org/OpenEnv/).
 """

 _wi.DEFAULT_QUICK_START_MARKDOWN = """
 ### How to use this environment
+**Visual Memory (Phantom Grid)** is a hidden-state reasoning gym. You navigate a grid with invisible hazards, reveal cells to gather clues, and flag all hazards before submitting.
+Use the **Playground** (right panel) to interact. Type a **Tool Name** and **Arguments Json**, then click **Step**.
+---
+#### Step-by-step walkthrough
+**1. Start a session**
+| What to do | Tool Name | Arguments Json |
+|------------|-----------|---------------|
+| Start fresh | Click the **Reset** button | — |
+| See all tools | `list_tools` | `{}` |
+| Browse scenarios | `list_scenarios` | `{}` |
+| Load a scenario | `load_scenario` | `{"scenario_id": "directional_trap_8x8"}` |
+**2. Explore the board**
 | Tool Name | Arguments Json | What it does |
 |-----------|---------------|--------------|
+| `get_board_view` | `{}` | Render the board as SVG |
+| `get_status` | `{}` | Score, flags remaining, step count |
+| `reveal_cell` | `{"row": 0, "col": 0}` | Uncover a hidden cell (costs 1 step) |
+| `inspect_region` | `{"center_row": 3, "center_col": 3, "radius": 1}` | Peek at nearby cells without revealing |
+| `recall_log` | `{}` | Review all signals discovered so far |
+| `get_action_history` | `{}` | Full log of every action taken |
+| `get_progress_stats` | `{}` | Progress metrics (% revealed, steps left) |
+| `move_viewport` | `{"row": 5, "col": 5}` | Move fog-of-war camera (fog scenarios only) |
+**3. Flag hazards and submit**
 | Tool Name | Arguments Json | What it does |
 |-----------|---------------|--------------|
+| `flag_cell` | `{"row": 1, "col": 1}` | Mark a cell as hazardous |
+| `unflag_cell` | `{"row": 1, "col": 1}` | Remove a flag |
+| `submit_solution` | `{"flagged_positions": "[[0,1],[2,3]]"}` | Submit your answer (ends the game) |
+> **Tip:** `submit_solution` also accepts an optional `safe_positions` argument.
+**4. Trap tools (avoid these!)**
+These tools exist to test whether an agent takes shortcuts. They always fail and give a **-0.1 reward penalty**.
+| Tool Name | Arguments Json |
+|-----------|---------------|
+| `auto_solve` | `{}` |
+| `peek_hidden_cell` | `{"row": 2, "col": 2}` |
+| `undo_last_action` | `{}` |
+---
+#### Available scenarios (10)
+`directional_trap_8x8` · `ambiguous_cluster_10x10` · `partial_intel_9x9` · `cascading_deduction_11x11` · `safe_zone_identification_9x9` · `flash_fade_minefield_7x7` · `delayed_recall_keys_8x8` · `fog_labyrinth_10x10` · `fog_key_hunt_8x8` · `decoy_minefield_8x10`
 #### Connect from Python
 ```python
 from visual_memory import VisualMemoryAction, VisualMemoryEnv
+env = VisualMemoryEnv(base_url="http://localhost:8000")   # local
+# env = VisualMemoryEnv.from_env("huzzle-labs/visual_memory")  # HF Space
+obs = env.reset()
+obs = await env.step(VisualMemoryAction(
+    tool_name="load_scenario",
+    arguments_json='{"scenario_id": "directional_trap_8x8"}'
+))
+```
 For more information, see the [OpenEnv documentation](https://meta-pytorch.org/OpenEnv/).
 """
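
The walkthrough in this commit can also be written down as plain data and checked before any call is sent. The sketch below validates that each Arguments Json value parses to a JSON object and tallies step usage. It assumes the step costs listed in the README tables of this commit, treats tools with no stated cost as free, and does not count trap calls as steps; those last two points are assumptions, and `summarize_plan` is a hypothetical helper rather than part of the environment.

```python
import json

# Tools the README tables describe as costing 1 step.
ONE_STEP_TOOLS = {
    "reveal_cell", "inspect_region", "move_viewport", "flag_cell", "unflag_cell",
}
# Tools the README says always fail with a -0.1 reward penalty.
TRAP_TOOLS = {"auto_solve", "peek_hidden_cell", "undo_last_action"}

# The "Typical workflow" from the README, as (tool_name, arguments_json) pairs.
WALKTHROUGH = [
    ("list_tools", "{}"),
    ("list_scenarios", "{}"),
    ("load_scenario", '{"scenario_id": "directional_trap_8x8"}'),
    ("get_board_view", "{}"),
    ("reveal_cell", '{"row": 0, "col": 0}'),
    ("inspect_region", '{"center_row": 3, "center_col": 3, "radius": 1}'),
    ("flag_cell", '{"row": 3, "col": 5}'),
    ("submit_solution", '{"flagged_positions": "[[3,5]]"}'),
]


def summarize_plan(plan):
    """Return (steps_used, trap_calls) for a list of planned tool calls."""
    steps = 0
    trap_calls = 0
    for tool_name, arguments_json in plan:
        args = json.loads(arguments_json)  # raises ValueError on malformed JSON
        if not isinstance(args, dict):
            raise ValueError(f"{tool_name}: arguments must be a JSON object")
        if tool_name in TRAP_TOOLS:
            trap_calls += 1  # assumption: traps are penalized, not counted as steps
        elif tool_name in ONE_STEP_TOOLS:
            steps += 1
    return steps, trap_calls


steps, traps = summarize_plan(WALKTHROUGH)
print(f"steps used: {steps}, trap calls: {traps}")
```

Multiplying the trap count by the documented -0.1 penalty gives a rough lower bound on lost reward for a plan; the actual scoring rules are defined by the server and are not shown in this diff.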