ViditOstwal committed on
Commit d8977cf · verified · 1 Parent(s): e7cea77

Upload folder using huggingface_hub

Dockerfile ADDED
@@ -0,0 +1,83 @@
+ # Copyright (c) Meta Platforms, Inc. and affiliates.
+ # All rights reserved.
+ #
+ # This source code is licensed under the BSD-style license found in the
+ # LICENSE file in the root directory of this source tree.
+
+ # Multi-stage build using openenv-base
+ # This Dockerfile is flexible and works for both:
+ # - In-repo environments (with local OpenEnv sources)
+ # - Standalone environments (with openenv from PyPI/Git)
+ # The build script (openenv build) handles context detection and sets appropriate build args.
+
+ ARG BASE_IMAGE=ghcr.io/meta-pytorch/openenv-base:latest
+ FROM ${BASE_IMAGE} AS builder
+
+ WORKDIR /app
+
+ # Ensure git is available (required for installing dependencies from VCS)
+ RUN apt-get update && \
+     apt-get install -y --no-install-recommends git && \
+     rm -rf /var/lib/apt/lists/*
+
+ # Build argument to control whether we're building standalone or in-repo
+ ARG BUILD_MODE=in-repo
+ ARG ENV_NAME=maze_env
+
+ # Copy environment code (always at root of build context)
+ COPY . /app/env
+
+ # For in-repo builds, openenv is already vendored in the build context
+ # For standalone builds, openenv will be installed via pyproject.toml
+ WORKDIR /app/env
+
+ # Ensure uv is available (for local builds where base image lacks it)
+ RUN if ! command -v uv >/dev/null 2>&1; then \
+         curl -LsSf https://astral.sh/uv/install.sh | sh && \
+         mv /root/.local/bin/uv /usr/local/bin/uv && \
+         mv /root/.local/bin/uvx /usr/local/bin/uvx; \
+     fi
+
+ # Install dependencies using uv sync
+ # If uv.lock exists, use it; otherwise resolve on the fly
+ RUN --mount=type=cache,target=/root/.cache/uv \
+     if [ -f uv.lock ]; then \
+         uv sync --frozen --no-install-project --no-editable; \
+     else \
+         uv sync --no-install-project --no-editable; \
+     fi
+
+ RUN --mount=type=cache,target=/root/.cache/uv \
+     if [ -f uv.lock ]; then \
+         uv sync --frozen --no-editable; \
+     else \
+         uv sync --no-editable; \
+     fi
+
+ # Final runtime stage
+ FROM ${BASE_IMAGE}
+
+ WORKDIR /app
+
+ # Copy the virtual environment from builder
+ COPY --from=builder /app/env/.venv /app/.venv
+
+ # Copy the environment code
+ COPY --from=builder /app/env /app/env
+
+ # Set PATH to use the virtual environment
+ ENV PATH="/app/.venv/bin:$PATH"
+
+ # Set PYTHONPATH so imports work correctly
+ ENV PYTHONPATH="/app/env:$PYTHONPATH"
+
+ # WEB INTERFACE
+ ENV ENABLE_WEB_INTERFACE=true
+
+ # Health check
+ HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
+     CMD curl -f http://localhost:8000/health || exit 1
+
+ # Run the FastAPI server
+ # The module path is constructed to work with the /app/env structure
+ CMD ["sh", "-c", "cd /app/env && uvicorn server.app:app --host 0.0.0.0 --port 8000"]
README.md CHANGED
@@ -1,10 +1,129 @@
  ---
- title: Maze Env
- emoji: 🏃
- colorFrom: pink
- colorTo: yellow
+ title: Maze Env Environment Server
+ emoji: 🎭
+ colorFrom: green
+ colorTo: pink
  sdk: docker
  pinned: false
+ app_port: 8000
+ base_path: /web
+ tags:
+   - openenv
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # Maze Env Environment
+
+ Ice-sliding maze environment on [OpenEnv](https://github.com/meta-pytorch/OpenEnv).
+
+ Agents call `reset` / `step` with directions. All players slide simultaneously in the chosen direction until blocked by a wall or another player.
+
+ Levels live in `dataset/ice-maze-levels.json`.
+
+ ## Core Rules
+
+ - **Actions**: `LEFT`, `RIGHT`, `UP`, `DOWN`
+ - **Movement**: on each step, every player slides as far as possible in that direction
+ - **Win condition**: episode is solved only when every player is on an exit after a step
+
+ Board symbols used at runtime:
+
+ - `#` wall
+ - `.` empty ice
+ - `e` unoccupied exit
+ - `a` player on non-exit cell
+ - `b` player on exit cell
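The movement rule described in the README's Core Rules can be sketched in a few lines of Python. This is an illustrative sketch, not the repository's implementation: `slide_all`, `replay`, `DELTAS` and `PATH_LETTERS` are hypothetical names, positions are assumed to be `(row, col)` in board coordinates including the border wall, and it assumes that only `#` cells and other players block a slide (exits are treated as passable). The board is the first level from `dataset/ice-maze-levels.json`.

```python
from typing import List, Tuple

Pos = Tuple[int, int]  # (row, col) in board coordinates, border included

# Hypothetical lookup tables; the repository may encode directions differently.
DELTAS = {"UP": (-1, 0), "DOWN": (1, 0), "LEFT": (0, -1), "RIGHT": (0, 1)}
PATH_LETTERS = {"U": "UP", "D": "DOWN", "L": "LEFT", "R": "RIGHT"}

# annotated_board of the first level in the dataset.
LEVEL_0_BOARD = [
    "##########",
    "#........#",
    "#........#",
    "#........#",
    "#......###",
    "#.......e#",
    "###....###",
    "###..a...#",
    "###......#",
    "##########",
]


def slide_all(board: List[str], players: List[Pos], direction: str) -> List[Pos]:
    """Slide every player until a wall or an already-settled player blocks it.

    Assumption: any cell other than '#' is passable, so exits do not stop a slide.
    """
    dr, dc = DELTAS[direction]
    # Resolve the players nearest the leading edge first, so that players
    # behind them stop against their final positions.
    order = sorted(
        range(len(players)),
        key=lambda i: -(players[i][0] * dr + players[i][1] * dc),
    )
    result = list(players)
    settled = set()
    for i in order:
        r, c = players[i]
        while True:
            nr, nc = r + dr, c + dc
            if board[nr][nc] == "#" or (nr, nc) in settled:
                break
            r, c = nr, nc
        result[i] = (r, c)
        settled.add((r, c))
    return result


def replay(board: List[str], players: List[Pos], path: str) -> List[Pos]:
    """Apply a path string such as 'ULDR' one letter at a time."""
    for letter in path:
        players = slide_all(board, players, PATH_LETTERS[letter])
    return players
```

Under these assumptions, `replay(LEVEL_0_BOARD, [(7, 5)], "ULDR")` starts on the `a` cell and finishes on the `e` cell at `(5, 8)`, which corresponds to the level's `end` entry `[4, 7]` once the border offset is removed.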
+
+ ## Quick Start (Client)
+
+ ```python
+ from maze_env import MazeAction, MazeEnv
+
+ with MazeEnv(base_url="http://localhost:8000").sync() as env:
+     reset_result = env.reset(level_index=0)
+     print(reset_result.observation.board)
+     print(reset_result.observation.system_prompt)
+
+     step_result = env.step(MazeAction(direction="LEFT"))
+     obs = step_result.observation
+     print(obs.board)
+     print(obs.message, obs.reward, obs.done)
+ ```
+
+ ## Run Locally
+
+ Start server:
+
+ ```bash
+ uv run --project . server
+ ```
+
+ Or with uvicorn directly:
+
+ ```bash
+ uv run uvicorn server.app:app --reload
+ ```
+
+ ## Docker
+
+ Build:
+
+ ```bash
+ docker build -t maze_env-env:latest -f server/Dockerfile .
+ ```
+
+ Run:
+
+ ```bash
+ docker run --rm -p 8000:8000 maze_env-env:latest
+ ```
+
+ ## Dataset Validation
+
+ Run:
+
+ ```bash
+ uv run python dataset/validate_dataset.py
+ ```
+
+ The validator checks:
+
+ - `start`/`end` consistency against `annotated_board`
+ - `diameter == len(path)` when both are present
+ - path replay through the actual environment:
+   - `done` must **not** become `True` before the final path move
+   - `done` must be `True` at the final path move
+
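The first two validator checks listed in the README can be approximated without the environment. The sketch below is not `dataset/validate_dataset.py` (its source is not part of this diff); `check_level` is a hypothetical name, and it assumes that `start` and `end` hold `[row, col]` pairs measured inside the one-cell border of `annotated_board`, which is consistent with the levels in `dataset/ice-maze-levels.json`. The path-replay check is omitted because it needs the real environment.

```python
from typing import Any, Dict, List

# First level of the dataset, copied from this commit.
LEVEL = {
    "diameter": 4,
    "start": [[6, 4]],
    "end": [[4, 7]],
    "path": "ULDR",
    "annotated_board": [
        "##########",
        "#........#",
        "#........#",
        "#........#",
        "#......###",
        "#.......e#",
        "###....###",
        "###..a...#",
        "###......#",
        "##########",
    ],
}


def check_level(level: Dict[str, Any]) -> List[str]:
    """Return human-readable problems found in one level record (empty if none)."""
    problems: List[str] = []
    starts = {tuple(p) for p in level["start"]}
    ends = {tuple(p) for p in level["end"]}
    # Drop the border so indices line up with start/end (assumed convention).
    inner = [row[1:-1] for row in level["annotated_board"][1:-1]]

    for r, row in enumerate(inner):
        for c, cell in enumerate(row):
            if (r, c) in starts:
                expected = "b" if (r, c) in ends else "a"
            elif (r, c) in ends:
                expected = "e"
            else:
                expected = None
            if expected is None and cell in "abe":
                problems.append(f"unexpected {cell!r} at {(r, c)}")
            elif expected is not None and cell != expected:
                problems.append(f"expected {expected!r} at {(r, c)}, found {cell!r}")

    # The diameter check only applies when both fields are present.
    if "diameter" in level and "path" in level:
        if level["diameter"] != len(level["path"]):
            problems.append(
                f"diameter {level['diameter']} != len(path) {len(level['path'])}"
            )
    return problems
```

For the record above, `check_level(LEVEL)` returns an empty list; changing `diameter` or moving a `start` entry produces one or more messages.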
+ ## Smoke Test Environment Logic
+
+ ```bash
+ uv run python server/maze_env_environment.py
+ ```
+
+ This runs a direct `reset`/`step` demo without starting the API server.
+
+ ## Deployment (OpenEnv / Hugging Face)
+
+ ```bash
+ openenv push
+ ```
+
+ This uses `openenv.yaml` and deploys the Docker-backed environment.
+
+ ## Project Structure
+
+ ```text
+ .
+ ├── __init__.py
+ ├── client.py
+ ├── models.py
+ ├── openenv.yaml
+ ├── pyproject.toml
+ ├── dataset/
+ │   ├── ice-maze-levels.json
+ │   └── validate_dataset.py
+ └── server/
+     ├── app.py
+     ├── maze_env_environment.py
+     ├── maze_env_helpers.py
+     └── Dockerfile
+ ```
__init__.py ADDED
@@ -0,0 +1,17 @@
+ # Copyright (c) Meta Platforms, Inc. and affiliates.
+ # All rights reserved.
+ #
+ # This source code is licensed under the BSD-style license found in the
+ # LICENSE file in the root directory of this source tree.
+
+ """Maze Env Environment."""
+
+ from .client import MazeEnv
+ from .models import MazeAction, MazeDirection, MazeObservation
+
+ __all__ = [
+     "MazeAction",
+     "MazeDirection",
+     "MazeObservation",
+     "MazeEnv",
+ ]
client.py ADDED
@@ -0,0 +1,119 @@
+ # Copyright (c) Meta Platforms, Inc. and affiliates.
+ # All rights reserved.
+ #
+ # This source code is licensed under the BSD-style license found in the
+ # LICENSE file in the root directory of this source tree.
+
+ """Ice Maze Environment Client."""
+
+ from typing import Dict
+
+ from openenv.core import EnvClient
+ from openenv.core.client_types import StepResult
+ from openenv.core.env_server.types import State
+
+ from .models import MazeAction, MazeObservation
+
+
+ class MazeEnv(EnvClient[MazeAction, MazeObservation, State]):
+     """
+     Client for the Ice Maze Environment.
+
+     Maintains a persistent WebSocket connection to the environment server,
+     enabling efficient multi-step interactions with lower latency.
+     Each client instance has its own dedicated environment session on the server.
+
+     Example (async):
+         >>> async with MazeEnv(base_url="http://localhost:8000") as env:
+         ...     obs = await env.reset(level_index=1)
+         ...     print(obs.observation.system_prompt)
+         ...
+         ...     obs = await env.step(MazeAction(direction="LEFT"))
+         ...     print(obs.observation.board)
+         ...     print(obs.observation.message)
+
+     Example (sync wrapper):
+         >>> with MazeEnv(base_url="http://localhost:8000").sync() as env:
+         ...     obs = env.reset(level_index=1)
+         ...     print(obs.observation.system_prompt)
+         ...     obs = env.step(MazeAction(direction="UP"))
+         ...     print(obs.observation.board)
+
+     Example with Docker:
+         >>> client = await MazeEnv.from_docker_image("maze_env-env:latest")
+         >>> try:
+         ...     obs = await client.reset(level_index=0)
+         ...     obs = await client.step(MazeAction(direction="RIGHT"))
+         ... finally:
+         ...     await client.close()
+     """
+
+     def _step_payload(self, action: MazeAction) -> Dict:
+         """
+         Convert MazeAction to JSON payload for the step WebSocket message.
+
+         Args:
+             action: MazeAction instance with a direction field.
+
+         Returns:
+             Dictionary representation suitable for JSON encoding.
+         """
+         # Send canonical wire value expected by the environment.
+         return {"direction": action.direction.value}
+
+     def _parse_result(self, payload: Dict) -> StepResult[MazeObservation]:
+         """
+         Parse server response into StepResult[MazeObservation].
+
+         The server serializes the observation via serialize_observation(), which
+         produces:
+             {
+                 "observation": { <MazeObservation fields minus done/reward/metadata> },
+                 "reward": float | None,
+                 "done": bool,
+             }
+
+         Args:
+             payload: JSON response data from server.
+
+         Returns:
+             StepResult with a fully populated MazeObservation.
+         """
+         obs_data = payload.get("observation", {})
+         done = payload.get("done", False)
+         reward = payload.get("reward")
+         observation = MazeObservation(
+             board=obs_data.get("board", ""),
+             step_count=obs_data.get("step_count", 0),
+             max_steps=obs_data.get("max_steps", 0),
+             previous_actions=obs_data.get("previous_actions", []),
+             system_prompt=obs_data.get("system_prompt", ""),
+             agent_positions=obs_data.get("agent_positions", []),
+             exit_positions=obs_data.get("exit_positions", []),
+             num_players=obs_data.get("num_players", 1),
+             message=obs_data.get("message", ""),
+             done=done,
+             reward=reward,
+             metadata=obs_data.get("metadata", payload.get("metadata", {})),
+         )
+
+         return StepResult(
+             observation=observation,
+             reward=reward,
+             done=done,
+         )
+
+     def _parse_state(self, payload: Dict) -> State:
+         """
+         Parse server response into a State object.
+
+         The state endpoint returns the full State dict including extra fields
+         (current_board, agent_positions, action_history, etc.).
+
+         Args:
+             payload: JSON response from the state WebSocket message.
+
+         Returns:
+             State object with episode_id, step_count, and all extra fields.
+         """
+         return State(**payload) if payload else State()
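The payload shape documented in `_parse_result` can be exercised without a server or the `openenv` package. The sketch below mirrors the same defaulting logic on a stand-in dataclass; `ObsSketch` and `parse_result` are hypothetical names that cover only a subset of the `MazeObservation` fields, and the sample payloads are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class ObsSketch:
    """Stand-in for MazeObservation with a subset of its fields (hypothetical)."""

    board: str = ""
    step_count: int = 0
    message: str = ""
    done: bool = False
    reward: Optional[float] = None
    metadata: Dict[str, Any] = field(default_factory=dict)


def parse_result(payload: Dict[str, Any]) -> ObsSketch:
    """Flatten the nested server payload, falling back to defaults for missing keys."""
    obs = payload.get("observation", {})
    return ObsSketch(
        board=obs.get("board", ""),
        step_count=obs.get("step_count", 0),
        message=obs.get("message", ""),
        # done and reward live at the top level of the payload, not in "observation".
        done=payload.get("done", False),
        reward=payload.get("reward"),
        # Prefer metadata nested in the observation, then top-level metadata.
        metadata=obs.get("metadata", payload.get("metadata", {})),
    )
```

An empty payload yields an all-defaults observation, and top-level `metadata` is used only when the nested observation carries none, matching the fallback order in the client code.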
dataset/ice-maze-levels.json ADDED
@@ -0,0 +1,1210 @@
+ [
+   {
+     "width": 8,
+     "height": 8,
+     "players": 1,
+     "diameter":4,
+     "threes_along_diameter": 1,
+     "open_cells": 58,
+     "states": 58,
+     "start": [
+       [
+         6,
+         4
+       ]
+     ],
+     "end": [
+       [
+         4,
+         7
+       ]
+     ],
+     "path": "ULDR",
+     "annotated_board": [
+       "##########",
+       "#........#",
+       "#........#",
+       "#........#",
+       "#......###",
+       "#.......e#",
+       "###....###",
+       "###..a...#",
+       "###......#",
+       "##########"
+     ],
+     "date": "2026-04-06T01:42:07-06:00"
+   },
+   {
+     "width": 8,
+     "height": 8,
+     "players": 1,
+     "diameter": 6,
+     "threes_along_diameter": 4,
+     "open_cells": 61,
+     "states": 61,
+     "start": [
+       [
+         3,
+         6
+       ]
+     ],
+     "end": [
+       [
+         0,
+         5
+       ]
+     ],
+     "path": "LULDRU",
+     "annotated_board": [
+       "##########",
+       "#.....e..#",
+       "#........#",
+       "#......#.#",
+       "###....a.#",
+       "#........#",
+       "#........#",
+       "#........#",
+       "#........#",
+       "##########"
+     ],
+     "date": "2026-04-06T01:48:26-06:00"
+   },
+   {
+     "width": 8,
+     "height": 8,
+     "players": 1,
+     "diameter": 8,
+     "threes_along_diameter": 5,
+     "open_cells": 60,
+     "states": 60,
+     "start": [
+       [
+         0,
+         5
+       ]
+     ],
+     "end": [
+       [
+         1,
+         0
+       ]
+     ],
+     "path": "DLURDRUL",
+     "annotated_board": [
+       "##########",
+       "#....#a..#",
+       "#e.......#",
+       "#........#",
+       "#........#",
+       "#.....#..#",
+       "#..##....#",
+       "#........#",
+       "#........#",
+       "##########"
+     ],
+     "date": "2026-04-06T01:48:25-06:00"
+   },
+   {
+     "width": 8,
+     "height": 8,
+     "players": 1,
+     "diameter": 12,
+     "threes_along_diameter": 8,
+     "open_cells": 58,
+     "states": 58,
+     "start": [
+       [
+         6,
+         6
+       ]
+     ],
+     "end": [
+       [
+         1,
+         5
+       ]
+     ],
+     "path": "LURURDRDRULU",
+     "annotated_board": [
+       "##########",
+       "#.....#..#",
+       "#.....e..#",
+       "##......##",
+       "#....#...#",
+       "#........#",
+       "#......###",
+       "#......a.#",
+       "#........#",
+       "##########"
+     ],
+     "date": "2026-04-06T01:42:07-06:00"
+   },
+   {
+     "width": 8,
+     "height": 8,
+     "players": 1,
+     "diameter": 18,
+     "threes_along_diameter": 15,
+     "open_cells": 51,
+     "states": 51,
+     "start": [
+       [
+         6,
+         1
+       ]
+     ],
+     "end": [
+       [
+         0,
+         3
+       ]
+     ],
+     "path": "LURDRURDLURULURDLU",
+     "annotated_board": [
+       "##########",
+       "#..#e#...#",
+       "##......##",
+       "#..#.....#",
+       "#......###",
+       "#...#....#",
+       "#.#.....##",
+       "#.a......#",
+       "#..#...###",
+       "##########"
+     ],
+     "date": "2026-04-06T01:40:27-06:00"
+   },
+   {
+     "width": 8,
+     "height": 8,
+     "players": 2,
+     "diameter": 8,
+     "threes_along_diameter": 1,
+     "open_cells": 63,
+     "states": 1953,
+     "start": [
+       [
+         0,
+         1
+       ],
+       [
+         0,
+         2
+       ]
+     ],
+     "end": [
+       [
+         1,
+         7
+       ],
+       [
+         2,
+         7
+       ]
+     ],
+     "path": "DLURDLUR",
+     "annotated_board": [
+       "##########",
+       "##aa.....#",
+       "#.......e#",
+       "#.......e#",
+       "#........#",
+       "#........#",
+       "#........#",
+       "#........#",
+       "#........#",
+       "##########"
+     ],
+     "date": "2026-04-06T01:54:31-06:00"
+   },
+   {
+     "width": 8,
+     "height": 8,
+     "players": 2,
+     "diameter": 8,
+     "threes_along_diameter": 1,
+     "open_cells": 62,
+     "states": 1891,
+     "start": [
+       [
+         0,
+         0
+       ],
+       [
+         1,
+         0
+       ]
+     ],
+     "end": [
+       [
+         3,
+         0
+       ],
+       [
+         7,
+         1
+       ]
+     ],
+     "path": "RDLDRULD",
+     "annotated_board": [
+       "##########",
+       "#a.......#",
+       "#a.......#",
+       "#........#",
+       "#e.......#",
+       "##.......#",
+       "##.......#",
+       "##.......#",
+       "#.e......#",
+       "##########"
+     ],
+     "date": "2026-04-06T01:54:30-06:00"
+   },
+   {
+     "width": 8,
+     "height": 8,
+     "players": 2,
+     "diameter": 9,
+     "threes_along_diameter": 1,
+     "open_cells": 61,
+     "states": 1830,
+     "start": [
+       [
+         0,
+         0
+       ],
+       [
+         1,
+         0
+       ]
+     ],
+     "end": [
+       [
+         1,
+         0
+       ],
+       [
+         2,
+         0
+       ]
+     ],
+     "path": "RDLULDRUL",
+     "annotated_board": [
+       "##########",
+       "#a......##",
+       "#b.......#",
+       "#e.......#",
+       "#........#",
+       "#........#",
+       "#....#...#",
+       "#........#",
+       "##.......#",
+       "##########"
+     ],
+     "date": "2026-04-06T02:03:47-06:00"
+   },
+   {
+     "width": 8,
+     "height": 8,
+     "players": 2,
+     "diameter": 9,
+     "threes_along_diameter": 1,
+     "open_cells": 60,
+     "states": 1770,
+     "start": [
+       [
+         6,
+         1
+       ],
+       [
+         7,
+         0
+       ]
+     ],
+     "end": [
+       [
+         0,
+         7
+       ],
+       [
+         7,
+         7
+       ]
+     ],
+     "path": "RULDRDLUR",
+     "annotated_board": [
+       "##########",
+       "#.......e#",
+       "#........#",
+       "#........#",
+       "#...#....#",
+       "#........#",
+       "#.#...#..#",
+       "##a......#",
+       "#a......e#",
+       "##########"
+     ],
+     "date": "2026-04-06T01:54:30-06:00"
+   },
+   {
+     "width": 8,
+     "height": 8,
+     "players": 2,
+     "diameter": 9,
+     "threes_along_diameter": 1,
+     "open_cells": 59,
+     "states": 1711,
+     "start": [
+       [
+         0,
+         0
+       ],
+       [
+         0,
+         1
+       ]
+     ],
+     "end": [
+       [
+         7,
+         1
+       ],
+       [
+         7,
+         5
+       ]
+     ],
+     "path": "DRURDLURD",
+     "annotated_board": [
+       "##########",
+       "#aa#######",
+       "#......#.#",
+       "#........#",
+       "#........#",
+       "#........#",
+       "#........#",
+       "#........#",
+       "#.e...e..#",
+       "##########"
+     ],
+     "date": "2026-04-06T02:05:44-06:00"
+   },
+   {
+     "width": 8,
+     "height": 8,
+     "players": 2,
+     "diameter": 15,
+     "threes_along_diameter": 12,
+     "open_cells": 60,
+     "states": 1770,
+     "start": [
+       [
+         6,
+         0
+       ],
+       [
+         7,
+         0
+       ]
+     ],
+     "end": [
+       [
+         1,
+         0
+       ],
+       [
+         4,
+         0
+       ]
+     ],
+     "path": "RULURULULDRULDL",
+     "annotated_board": [
+       "##########",
+       "#........#",
+       "#e.......#",
+       "#.#......#",
+       "#........#",
+       "#e.....#.#",
+       "##.......#",
+       "#a......##",
+       "#a.......#",
+       "##########"
+     ],
+     "date": "2026-04-06T02:05:47-06:00"
+   },
+   {
+     "width": 8,
+     "height": 8,
+     "players": 2,
+     "diameter": 20,
+     "threes_along_diameter": 12,
+     "open_cells": 59,
+     "states": 1711,
+     "start": [
+       [
+         4,
+         6
+       ],
+       [
+         4,
+         7
+       ]
+     ],
+     "end": [
+       [
+         5,
+         7
+       ],
+       [
+         7,
+         7
+       ]
+     ],
+     "path": "URULDLDLURULDRURDLDR",
+     "annotated_board": [
+       "##########",
+       "#..#.....#",
+       "#........#",
+       "#........#",
+       "#......#.#",
+       "#.....#aa#",
+       "#.......e#",
+       "##.......#",
+       "#...#...e#",
+       "##########"
+     ],
+     "date": "2026-04-06T02:05:49-06:00"
+   },
+   {
+     "width": 8,
+     "height": 8,
+     "players": 2,
+     "diameter": 16,
+     "threes_along_diameter": 12,
+     "open_cells": 57,
+     "states": 1596,
+     "start": [
+       [
+         1,
+         0
+       ],
+       [
+         1,
+         1
+       ]
+     ],
+     "end": [
+       [
+         0,
+         1
+       ],
+       [
+         0,
+         5
+       ]
+     ],
+     "path": "RDLDRULURURDRULU",
+     "annotated_board": [
+       "##########",
+       "#.e...e.##",
+       "#aa..#...#",
+       "#.......##",
+       "#.......##",
+       "##.......#",
+       "#..##....#",
+       "#........#",
+       "#........#",
+       "##########"
+     ],
+     "date": "2026-04-06T02:03:47-06:00"
+   },
+   {
+     "width": 8,
+     "height": 8,
+     "players": 2,
+     "diameter": 18,
+     "threes_along_diameter": 15,
+     "open_cells": 58,
+     "states": 1653,
+     "start": [
+       [
+         0,
+         1
+       ],
+       [
+         5,
+         0
+       ]
+     ],
+     "end": [
+       [
+         0,
+         2
+       ],
+       [
+         0,
+         3
+       ]
+     ],
+     "path": "LDRDRULURDRDLULDRU",
+     "annotated_board": [
+       "##########",
+       "#.aee....#",
+       "#........#",
+       "#.#......#",
+       "#....#...#",
+       "##......##",
+       "#a.......#",
+       "#...#....#",
+       "#.#......#",
+       "##########"
+     ],
+     "date": "2026-04-06T02:03:44-06:00"
+   },
+   {
+     "width": 8,
+     "height": 8,
+     "players": 2,
+     "diameter": 20,
+     "threes_along_diameter": 14,
+     "open_cells": 55,
+     "states": 1485,
+     "start": [
+       [
+         0,
+         5
+       ],
+       [
+         5,
+         0
+       ]
+     ],
+     "end": [
+       [
+         0,
+         5
+       ],
+       [
+         2,
+         0
+       ]
+     ],
+     "path": "URDLDRURURDLURULDLUL",
+     "annotated_board": [
+       "##########",
+       "##...#b..#",
+       "#....##..#",
+       "#e.......#",
+       "#.....#..#",
+       "#........#",
+       "#a.#....##",
+       "#.#......#",
+       "#...#....#",
+       "##########"
+     ],
+     "date": "2026-04-06T01:54:31-06:00"
+   },
+   {
+     "width": 8,
+     "height": 8,
+     "diameter": 35,
+     "threes_along_diameter": 30,
+     "open_cells": 56,
+     "states": 1540,
+     "start": [
+       [
+         4,
+         0
+       ],
+       [
+         5,
+         0
+       ]
+     ],
+     "end": [
+       [
+         2,
+         0
+       ],
+       [
+         4,
+         6
+       ]
+     ],
+     "path": "RULULULDRDRURULULULURURDRURULDRDLUL",
+     "annotated_board": [
+       "##########",
+       "#...#....#",
+       "#.#......#",
+       "#e.....#.#",
+       "##......##",
+       "#a....#e.#",
+       "#a.......#",
+       "#......#.#",
+       "#......#.#",
+       "##########"
+     ],
+     "date": "2026-03-29T23:13:48-06:00"
+   },
+   {
+     "width": 8,
+     "height": 8,
+     "diameter": 35,
+     "threes_along_diameter": 30,
+     "open_cells": 56,
+     "states": 1540,
+     "start": [
+       [
+         4,
+         0
+       ],
+       [
+         5,
+         1
+       ]
+     ],
+     "end": [[0, 3],[7, 0]],
+     "path": "RURULULDRURDLURURULDRURDRDLURULURDL",
+     "annotated_board": [
+       "##########",
+       "#..#e..#.#",
+       "#....#...#",
+       "#.......##",
+       "#........#",
+       "#a..#....#",
+       "##a..#..##",
+       "#........#",
+       "#e.......#",
+       "##########"
+     ],
+     "date": "2026-03-29T23:38:26-06:00"
+   },
+   {
+     "width": 8,
+     "height": 8,
+     "diameter": 38,
+     "threes_along_diameter": 30,
+     "open_cells": 56,
+     "states": 1540,
+     "start": [[0, 4], [7, 5]],
+     "end": [[0, 4], [2, 7]],
+     "path": "LULDRDLDLURDRDRDLULDRDRDLULURDRURDLDRU",
+     "annotated_board": [
+       "##########",
+       "#...#b...#",
+       "#.......##",
+       "##......e#",
+       "#......#.#",
+       "#........#",
+       "#.......##",
+       "#.....#..#",
+       "##....a#.#",
+       "##########"
+     ],
+     "date": "2026-03-30T00:03:39-06:00"
+   },
+   {
+     "width": 8,
+     "height": 8,
+     "diameter": 38,
+     "threes_along_diameter": 30,
+     "open_cells": 56,
+     "states": 1540,
+     "start": [
+       [
+         0,
+         6
+       ],
+       [
+         4,
+         0
+       ]
+     ],
+     "end": [
+       [
+         0,
+         1
+       ],
+       [
+         4,
+         0
+       ]
+     ],
+     "path": "RULDLULULDRURDLULULDRULURDRULDRULDRULU",
+     "annotated_board": [
+       "##########",
+       "#.e#..#a.#",
+       "#......#.#",
+       "#....#...#",
+       "##.......#",
+       "#b.......#",
+       "#.....#..#",
+       "#.#.....##",
+       "#........#",
+       "##########"
+     ],
+     "date": "2026-03-30T00:07:08-06:00"
+   },
+   {
+     "width": 8,
+     "height": 8,
+     "diameter": 35,
+     "threes_along_diameter": 30,
+     "open_cells": 56,
+     "states": 1540,
+     "start": [
+       [
+         5,
+         0
+       ],
+       [
+         6,
+         1
+       ]
+     ],
+     "end": [
+       [
+         0,
+         3
+       ],
+       [
+         4,
+         0
+       ]
+     ],
+     "path": "RURULULDRURDLURURULDRURDRDLURULULDL",
+     "annotated_board": [
+       "##########",
+       "#..#e..#.#",
+       "#....#...#",
+       "#.......##",
+       "#........#",
+       "#e.......#",
+       "#a..#....#",
+       "##a..#..##",
+       "#........#",
+       "##########"
+     ],
+     "date": "2026-03-30T00:45:01-06:00"
+   },
+   {
+     "width": 8,
+     "height": 8,
+     "diameter": 36,
+     "threes_along_diameter": 30,
+     "open_cells": 56,
+     "states": 1540,
+     "start": [
+       [
+         0,
+         0
+       ],
+       [
+         0,
+         3
+       ]
+     ],
+     "end": [
+       [
+         0,
+         2
+       ],
+       [
+         5,
+         1
+       ]
+     ],
+     "path": "LDLDRULDLDRDLDLDLURULDLULURURDLDLDLU",
+     "annotated_board": [
+       "##########",
+       "#a.ea....#",
+       "##.......#",
+       "#....#...#",
+       "#....#...#",
+       "#.#.....##",
+       "#.e...#..#",
+       "##.......#",
+       "#..#.....#",
+       "##########"
+     ],
+     "date": "2026-03-30T00:57:40-06:00"
+   },
+   {
+     "width": 8,
+     "height": 8,
+     "diameter": 35,
+     "threes_along_diameter": 30,
+     "open_cells": 56,
+     "states": 1540,
+     "start": [
+       [
+         4,
+         6
+       ],
+       [
+         6,
+         5
+       ]
+     ],
+     "end": [
+       [
+         0,
+         4
+       ],
+       [
+         2,
+         0
+       ]
+     ],
+     "path": "ULULDLDRULURDLULULDRULURURDLULDLDRU",
+     "annotated_board": [
+       "##########",
+       "#..#.e.#.#",
+       "##.......#",
+       "#e.......#",
+       "#.#....#.#",
+       "#.....#a.#",
+       "##.......#",
+       "#.....a..#",
+       "#......#.#",
+       "##########"
+     ],
+     "date": "2026-03-30T01:23:46-06:00"
+   },
+   {
+     "width": 8,
+     "height": 8,
+     "diameter": 35,
+     "threes_along_diameter": 30,
+     "open_cells": 56,
+     "states": 1540,
+     "start": [
+       [
+         4,
+         6
+       ],
+       [
+         5,
+         4
+       ]
+     ],
+     "end": [
+       [
+         0,
+         2
+       ],
+       [
+         7,
+         0
+       ]
+     ],
+     "path": "LULURURDLULDRULULURDLULDLDRULURULDL",
+     "annotated_board": [
+       "##########",
+       "#.#e..#..#",
+       "#...#....#",
+       "##.......#",
+       "#........#",
+       "#....#.a.#",
+       "##..#a..##",
+       "#........#",
+       "#e.......#",
+       "##########"
+     ],
+     "date": "2026-03-30T01:35:30-06:00"
+   },
+   {
+     "width": 8,
+     "height": 8,
+     "diameter": 37,
+     "threes_along_diameter": 30,
+     "open_cells": 56,
+     "states": 1540,
+     "start": [
+       [
+         0,
+         5
+       ],
+       [
+         7,
+         0
+       ]
+     ],
+     "end": [
+       [
+         0,
+         2
+       ],
+       [
+         1,
+         4
+       ]
+     ],
+     "path": "URULDRDLDLDRULURULULULDLURULULURDRULU",
+     "annotated_board": [
+       "##########",
+       "#..e.#a..#",
+       "###..e...#",
+       "#........#",
+       "#........#",
+       "#.....#..#",
+       "#...#...##",
+       "#......#.#",
+       "#a.#.....#",
+       "##########"
+     ],
+     "date": "2026-03-30T03:05:59-06:00"
+   },
+   {
+     "width": 8,
+     "height": 8,
+     "diameter": 42,
+     "threes_along_diameter": 30,
+     "open_cells": 56,
+     "states": 1540,
+     "start": [
+       [
+         0,
+         7
+       ],
+       [
+         1,
+         0
+       ]
+     ],
+     "end": [
+       [
+         0,
+         4
+       ],
+       [
+         0,
+         5
+       ]
+     ],
+     "path": "LDRDRURULDRDRDLULDRULDRDLDLURDLULULDRDLDRU",
+     "annotated_board": [
+       "##########",
+       "##...ee.a#",
+       "#a.......#",
+       "#........#",
+       "#.......##",
+       "#.#......#",
+       "##.......#",
+       "#...#....#",
+       "##.#...#.#",
+       "##########"
+     ],
+     "date": "2026-03-30T04:40:17-06:00"
+   },
+   {
+     "width": 8,
+     "height": 8,
+     "diameter": 56,
+     "threes_along_diameter": 49,
+     "open_cells": 47,
+     "states": 1081,
+     "start": [
+       [
+         1,
+         0
+       ],
+       [
+         4,
+         7
+       ]
+     ],
+     "end": [
+       [
+         1,
+         0
+       ],
+       [
+         6,
+         3
+       ]
+     ],
+     "path": "DLURDRULDLDLDRULDRULDLDLURDRDRURDLULDRDRULDLDLDRULULDLUL",
+     "annotated_board": [
+       "##########",
+       "##..#.#.##",
+       "#b#...#..#",
+       "#..#.....#",
+       "#....#..##",
+       "##.....#a#",
+       "#.....#..#",
+       "####e....#",
+       "#....#..##",
+       "##########"
+     ],
+     "date": "2026-04-02T21:34:31-06:00"
+   },
+   {
+     "width": 8,
+     "height": 8,
+     "diameter": 60,
+     "threes_along_diameter": 50,
+     "open_cells": 44,
+     "states": 946,
+     "start": [
+       [
+         4,
+         0
+       ],
+       [
+         7,
+         7
+       ]
+     ],
+     "end": [
+       [
+         0,
+         4
+       ],
+       [
+         6,
+         6
+       ]
+     ],
+     "path": "LULURDRURURDRULURURULDLULDLULURURDRDLDLDLURURDRURDRDLDRURULU",
+     "annotated_board": [
+       "##########",
+       "##..#e.#.#",
+       "#..#...#.#",
+       "#....#...#",
+       "####..#..#",
+       "#a.###..##",
+       "#...#..#.#",
+       "#.#....e##",
+       "#.#...#.a#",
+       "##########"
+     ],
+     "date": "2026-04-02T21:31:33-06:00"
+   },
+   {
+     "width": 8,
+     "height": 8,
+     "diameter": 69,
+     "threes_along_diameter": 50,
+     "open_cells": 46,
+     "states": 1035,
+     "start": [
+       [
+         0,
+         0
+       ],
+       [
+         0,
+         1
+       ]
+     ],
+     "end": [
+       [
+         1,
+         2
+       ],
+       [
+         4,
+         7
+       ]
+     ],
+     "path": "DRURURDLURURURDRDLDRDLDLULDLDRDLDRULDLURDRURULULURULULDRDRULDLDLURDRU",
+     "annotated_board": [
+       "##########",
+       "#aa#...###",
+       "##.e.#...#",
+       "#...#.#..#",
+       "#..#....##",
+       "###..#..e#",
+       "##....#..#",
+       "#...#..###",
+       "#..#.....#",
+       "##########"
+     ],
+     "date": "2026-04-02T21:40:15-06:00"
+   },
+   {
+     "width": 8,
+     "height": 8,
+     "diameter": 61,
+     "threes_along_diameter": 50,
+     "open_cells": 51,
+     "states": 1275,
+     "start": [
+       [
+         0,
+         7
+       ],
+       [
+         1,
+         5
+       ]
+     ],
+     "end": [
+       [
+         3,
+         1
+       ],
+       [
+         6,
+         7
+       ]
+     ],
+     "path": "LDLDLDRDLURDRDRDLURDRURURULULULURDRURDRDLDLURDRURDRDLULULDLUR",
+     "annotated_board": [
+       "##########",
+       "#......#a#",
+       "##..#.a..#",
+       "##......##",
+       "#.e#.....#",
+       "#......#.#",
+       "#....#.#.#",
+       "#.#...#.e#",
+       "#...#..#.#",
+       "##########"
+     ],
+     "date": "2026-04-02T21:37:40-06:00"
+   },
+   {
+     "width": 8,
+     "height": 8,
+     "diameter": 63,
+     "threes_along_diameter": 51,
+     "open_cells": 48,
+     "states": 1128,
+     "start": [
+       [
+         2,
+         7
+       ],
+       [
+         3,
+         7
+       ]
+     ],
+     "end": [
+       [
+         3,
+         5
+       ],
+       [
+         6,
+         0
+       ]
+     ],
+     "path": "LULULURDRDRURDLDRDRURDRDRDLDLULULULULDRDRDRDLDLURULULULDRDRDRDL",
+     "annotated_board": [
+       "##########",
+       "#...######",
+       "##....####",
+       "#..#....a#",
+       "#.#..#e.a#",
+       "##......##",
+       "#.......##",
+       "#e.#..#..#",
+       "#.#....#.#",
+       "##########"
+     ],
+ "date": "2026-04-02T21:33:28-06:00"
1209
+ }
1210
+ ]
dataset/validate_dataset.py ADDED
@@ -0,0 +1,203 @@
+ #!/usr/bin/env python3
+ # Copyright (c) Meta Platforms, Inc. and affiliates.
+ # All rights reserved.
+ #
+ # This source code is licensed under the BSD-style license found in the
+ # LICENSE file in the root directory of this source tree.
+
+ """
+ Validate dataset/ice-maze-levels.json: ``start`` / ``end`` match the board, and ``diameter`` equals ``len(path)`` when both are set.
+
+ Board glyphs: ``a`` = start (player) only, ``e`` = exit only, ``b`` = start and exit on the same cell.
+ Other lowercase letters (e.g. ``c``) are treated as additional player starts only.
+ """
+
+ from __future__ import annotations
+
+ import json
+ import sys
+ from pathlib import Path
+ from typing import Dict, List, Tuple
+
+ REPO_ROOT = Path(__file__).resolve().parent.parent
+ if str(REPO_ROOT) not in sys.path:
+     sys.path.insert(0, str(REPO_ROOT))
+
+ from models import MazeAction
+ from server.maze_env_environment import MazeEnvironment
+
+ STEP_CHAR_TO_DIRECTION = {
+     "U": "UP",
+     "D": "DOWN",
+     "L": "LEFT",
+     "R": "RIGHT",
+ }
+
+
+ def parse_board(rows: List[str]) -> Tuple[List[Tuple[int, int]], List[Tuple[int, int]]]:
+     """Board ``(row, col)`` from ``enumerate``: ``a``/``b``/other players → starts; ``e``/``b`` → exits."""
+     players: List[Tuple[int, int]] = []
+     exits: List[Tuple[int, int]] = []
+     for r, row in enumerate(rows):
+         for c, ch in enumerate(row):
+             if ch == "b":
+                 players.append((r, c))
+                 exits.append((r, c))
+             elif ch == "a":
+                 players.append((r, c))
+             elif ch == "e":
+                 exits.append((r, c))
+             elif len(ch) == 1 and ch.islower() and ch.isalpha():
+                 players.append((r, c))
+     return sorted(players), sorted(exits)
+
+
+ def to_interior_zero_based(board_rc: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
+     """0-based interior: subtract 1 from each board (row, col) from ``enumerate``."""
+     return sorted((r - 1, c - 1) for r, c in board_rc)
+
+
+ def json_coords(name: str, raw: object) -> List[Tuple[int, int]]:
+     if not isinstance(raw, list):
+         raise ValueError(f"{name} must be a list")
+     out: List[Tuple[int, int]] = []
+     for i, item in enumerate(raw):
+         if not isinstance(item, (list, tuple)) or len(item) != 2:
+             raise ValueError(f"{name}[{i}] must be [row, col]")
+         out.append((int(item[0]), int(item[1])))
+     return sorted(out)
+
+
+ def validate_level(i: int, level: Dict) -> List[str]:
+     err: List[str] = []
+     p = f"Level[{i}]"
+
+     if not isinstance(level.get("annotated_board"), list) or not level["annotated_board"]:
+         err.append(f"{p}: need non-empty annotated_board (list of strings)")
+         return err
+
+     if "start" not in level:
+         err.append(f"{p}: missing start")
+     if "end" not in level:
+         err.append(f"{p}: missing end")
+     if err:
+         return err
+
+     rows = level["annotated_board"]
+     try:
+         board_players, board_exits = parse_board(rows)
+     except (TypeError, ValueError) as e:
+         return [f"{p}: board parse error: {e}"]
+
+     try:
+         want_players = json_coords("start", level["start"])
+         want_exits = json_coords("end", level["end"])
+     except ValueError as e:
+         return [f"{p}: {e}"]
+
+     got_players = to_interior_zero_based(board_players)
+     got_exits = to_interior_zero_based(board_exits)
+
+     if got_players != want_players:
+         err.append(
+             f"{p}: start mismatch — JSON (0-based interior, sorted): {want_players}; "
+             f"parsed board row/col (sorted): {board_players}; after -1 per axis: {got_players}"
+         )
+     if got_exits != want_exits:
+         err.append(
+             f"{p}: end mismatch — JSON (0-based interior, sorted): {want_exits}; "
+             f"parsed board row/col (sorted): {board_exits}; after -1 per axis: {got_exits}"
+         )
+
+     if "diameter" in level and "path" in level:
+         path = level["path"]
+         diam = level["diameter"]
+         if not isinstance(path, str):
+             err.append(f"{p}: path must be a string, got {type(path).__name__}")
+         elif not isinstance(diam, int) or isinstance(diam, bool):
+             err.append(f"{p}: diameter must be an int, got {type(diam).__name__}")
+         elif len(path) != diam:
+             err.append(
+                 f"{p}: diameter ({diam}) != len(path) ({len(path)}); path={path!r}"
+             )
+
+     return err
+
+
+ def validate_level_path_replay(i: int, level: Dict, env: MazeEnvironment) -> List[str]:
+     """Replay level path in MazeEnvironment and verify done only at the final step."""
+     p = f"Level[{i}]"
+     path = level.get("path")
+     if path is None:
+         return []
+     if not isinstance(path, str):
+         return [f"{p}: path must be a string, got {type(path).__name__}"]
+
+     errors: List[str] = []
+     obs = env.reset(level_index=i)
+
+     if not path:
+         if not obs.done:
+             errors.append(f"{p}: empty path but reset state is not done")
+         return errors
+
+     if obs.done:
+         errors.append(f"{p}: reset starts done=True but path is non-empty ({path!r})")
+         return errors
+
+     for step_idx, token in enumerate(path, start=1):
+         direction = STEP_CHAR_TO_DIRECTION.get(token)
+         if direction is None:
+             errors.append(
+                 f"{p}: path contains invalid token {token!r} at 1-based step {step_idx}; "
+                 f"use only {sorted(STEP_CHAR_TO_DIRECTION)}"
+             )
+             break
+
+         obs = env.step(MazeAction(direction=direction))
+         is_last = step_idx == len(path)
+
+         if obs.done and not is_last:
+             errors.append(
+                 f"{p}: done became True too early at step {step_idx}/{len(path)}; path={path!r}"
+             )
+             break
+         if is_last and not obs.done:
+             errors.append(
+                 f"{p}: done is False at final path step {step_idx}/{len(path)}; path={path!r}"
+             )
+
+     return errors
+
+
+ def main() -> int:
+     path = Path(__file__).resolve().parent / "ice-maze-levels.json"
+     if not path.is_file():
+         print(f"error: missing {path}", file=sys.stderr)
+         return 1
+
+     data = json.loads(path.read_text(encoding="utf-8"))
+     if not isinstance(data, list):
+         print("error: root must be a JSON array", file=sys.stderr)
+         return 1
+
+     errors: List[str] = []
+     env = MazeEnvironment()
+     for i, level in enumerate(data):
+         if isinstance(level, dict):
+             errors.extend(validate_level(i, level))
+             errors.extend(validate_level_path_replay(i, level, env))
+         else:
+             errors.append(f"Level[{i}]: not an object")
+
+     if errors:
+         for msg in errors:
+             print(msg, file=sys.stderr)
+         return 1
+
+     print(f"OK: {len(data)} levels — {path}")
+     return 0
+
+
+ if __name__ == "__main__":
+     raise SystemExit(main())
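The coordinate convention checked by dataset/validate_dataset.py can be illustrated in isolation. The sketch below is not the repository's code: `interior_coords` is a hypothetical helper that handles only the `a`, `e`, and `b` glyphs (the real `parse_board` also treats other lowercase letters as starts). It reads the board of the 56-step level from dataset/ice-maze-levels.json and subtracts 1 per axis to account for the `#` border.

```python
from typing import List, Tuple

Coord = Tuple[int, int]

# Board copied from the level with "diameter": 56 in ice-maze-levels.json.
BOARD = [
    "##########",
    "##..#.#.##",
    "#b#...#..#",
    "#..#.....#",
    "#....#..##",
    "##.....#a#",
    "#.....#..#",
    "####e....#",
    "#....#..##",
    "##########",
]


def interior_coords(rows: List[str]) -> Tuple[List[Coord], List[Coord]]:
    """Hypothetical helper: sorted 0-based interior (row, col) of starts and exits."""
    starts: List[Coord] = []
    exits: List[Coord] = []
    for r, row in enumerate(rows):
        for c, ch in enumerate(row):
            if ch in ("a", "b"):  # 'b' is a start that is also an exit
                starts.append((r - 1, c - 1))
            if ch in ("e", "b"):
                exits.append((r - 1, c - 1))
    return sorted(starts), sorted(exits)


starts, exits = interior_coords(BOARD)
print(starts, exits)  # [(1, 0), (4, 7)] [(1, 0), (6, 3)]
```

The printed pairs agree with that level's `start` and `end` fields, which is the equality `validate_level` asserts.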
experiment.ipynb ADDED
@@ -0,0 +1,138 @@
+ {
+ "cells": [
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "id": "e962db90",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "([(0, 0), (1, 0)], [(3, 0), (7, 1)])"
+ ]
+ },
+ "execution_count": 1,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "from validate_dataset import scan_annotated_board\n",
+ "\n",
+ "scan_annotated_board(\n",
+ " [\n",
+ " \"##########\",\n",
+ " \"#a.......#\",\n",
+ " \"#a.......#\",\n",
+ " \"#........#\",\n",
+ " \"#e.......#\",\n",
+ " \"##.......#\",\n",
+ " \"##.......#\",\n",
+ " \"##.......#\",\n",
+ " \"#.e......#\",\n",
+ " \"##########\"\n",
+ " ]\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 9,
+ "id": "9b7f20ab",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from __future__ import annotations\n",
+ "\n",
+ "import json\n",
+ "import sys\n",
+ "from pathlib import Path\n",
+ "from typing import Dict, List, Tuple\n",
+ "\n",
+ "path = Path(\"dataset/ice-maze-levels.json\")\n",
+ "\n",
+ "if not path.is_file():\n",
+ " print(f\"error: missing {path}\", file=sys.stderr)\n",
+ " sys.exit(1)\n",
+ "\n",
+ "data = json.loads(path.read_text(encoding=\"utf-8\"))\n",
+ "\n",
+ "if not isinstance(data, list):\n",
+ " print(\"error: root must be a JSON array\", file=sys.stderr)\n",
+ " sys.exit(1)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 13,
+ "id": "a53bb716",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "30"
+ ]
+ },
+ "execution_count": 13,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "len(data)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 2,
+ "id": "9953a0b9",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "42"
+ ]
+ },
+ "execution_count": 2,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "len('LDRDRURULDRDRDLULDRULDRDLDLURDLULULDRDLDRU')"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "28535d64",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": ".venv (3.13.11)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.13.11"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+ }
models.py ADDED
@@ -0,0 +1,148 @@
+ # Copyright (c) Meta Platforms, Inc. and affiliates.
+ # All rights reserved.
+ #
+ # This source code is licensed under the BSD-style license found in the
+ # LICENSE file in the root directory of this source tree.
+
+ """
+ Data models for the Ice Maze Environment.
+
+ The maze environment loads ice-sliding puzzle levels and exposes them
+ to LLM agents via the OpenEnv protocol.
+ """
+
+ from enum import Enum
+ from typing import List
+
+ from openenv.core.env_server.types import Action, Observation
+ from pydantic import Field, field_validator
+
+
+ class MazeDirection(str, Enum):
+     """Cardinal direction for a single Ice Maze step (all players move together)."""
+
+     LEFT = "LEFT"
+     RIGHT = "RIGHT"
+     UP = "UP"
+     DOWN = "DOWN"
+
+
+ class MazeAction(Action):
+     """
+     Action for the Ice Maze environment.
+
+     The agent specifies a direction to slide all players simultaneously.
+     On ice, players slide until they hit a wall (#) or another player.
+     Exit cells (e) do NOT stop sliding — players slide through them.
+     """
+
+     direction: MazeDirection = Field(
+         ...,
+         description="Direction to move all players simultaneously: LEFT, RIGHT, UP, or DOWN.",
+     )
+
+     @field_validator("direction", mode="before")
+     @classmethod
+     def _coerce_direction(cls, v: object) -> object:
+         if isinstance(v, MazeDirection):
+             return v
+         if isinstance(v, str):
+             key = v.strip().upper()
+             if key in MazeDirection.__members__:
+                 return MazeDirection[key]
+         return v
+
+
+ class MazeObservation(Observation):
+     """
+     Observation from the Ice Maze environment.
+
+     Primary agent-facing fields: current board, step budget, action history,
+     and (on reset) the system prompt with rules and layout context.
+     Additional fields support tooling and state introspection.
+
+     Inherited from Observation base:
+         done (bool) — True when all players are simultaneously on exit cells
+         reward (float|None) — Reward signal for this step
+         metadata (dict) — Extra info: level_index, action_history (no oracle path)
+     """
+
+     board: str = Field(
+         default="",
+         description=(
+             "Current ASCII board rendered as a newline-separated string. "
+             "Symbols: # wall, . ice, a player on non-exit, e unoccupied exit, "
+             "b player currently on an exit."
+         ),
+     )
+     step_count: int = Field(
+         default=0,
+         description="Number of steps taken so far in this episode.",
+     )
+     max_steps: int = Field(
+         default=0,
+         description="Maximum steps allowed for this episode before a hard limit (set on reset).",
+     )
+     previous_actions: List[str] = Field(
+         default_factory=list,
+         description=(
+             "Directions applied so far in order, each value one of "
+             "LEFT, RIGHT, UP, DOWN (same vocabulary as MazeAction)."
+         ),
+     )
+     system_prompt: str = Field(
+         default="",
+         description=(
+             "Instructions for the LLM: maze rules, valid actions, symbols, layout, "
+             "step count vs max steps, and previous actions (oldest first). "
+             "Refreshed on reset() and on each step()."
+         ),
+     )
+     agent_positions: List[List[int]] = Field(
+         default_factory=list,
+         description=(
+             "Current interior coordinates of each player as [[row, col], ...]. "
+             "Interior coords are 0-indexed from the top-left non-wall cell "
+             "(i.e. board_row - 1, board_col - 1)."
+         ),
+     )
+     exit_positions: List[List[int]] = Field(
+         default_factory=list,
+         description=(
+             "Interior coordinates of all exit cells as [[row, col], ...]. "
+             "Exit cells are shared — any player can use any exit. "
+             "Fixed for the duration of the episode."
+         ),
+     )
+     num_players: int = Field(
+         default=1,
+         description="Number of players in this level.",
+     )
+     message: str = Field(
+         default="",
+         description=(
+             "Human-readable status message describing what just happened, "
+             "e.g. 'Moved LEFT. Step 3.', 'Solved! All players reached an exit in 6 steps.', "
+             "'Invalid direction. Use: LEFT, RIGHT, UP, DOWN'."
+         ),
+     )
+
+     def __str__(self) -> str:
+         parts = []
+
+         parts.append(f"done={self.done} | reward={self.reward}")
+         parts.append(f"step={self.step_count}/{self.max_steps}")
+         parts.append(f"players={self.agent_positions} exits={self.exit_positions}")
+
+         if self.previous_actions:
+             parts.append(f"actions={self.previous_actions}")
+
+         if self.message:
+             parts.append(f"message={self.message}")
+
+         # 👇 Full system prompt (clearly separated)
+         if self.system_prompt:
+             parts.append("\n=== SYSTEM PROMPT ===")
+             parts.append(self.system_prompt)
+
+         return "\n".join(parts)
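The sliding rule stated in the `MazeAction` docstring (slide until a wall, exits do not stop movement) can be sketched for a single player. This is only an illustration under assumptions: `slide`, `DELTAS`, and the small board are hypothetical, coordinates are board coordinates including the `#` border, and player-to-player blocking is ignored. The repository's own `apply_direction_slide` lives in `server/maze_env_helpers.py`, which is not part of this diff, so its behaviour may differ.

```python
from typing import Dict, List, Tuple

# Row/column deltas for the four MazeDirection values.
DELTAS: Dict[str, Tuple[int, int]] = {
    "UP": (-1, 0),
    "DOWN": (1, 0),
    "LEFT": (0, -1),
    "RIGHT": (0, 1),
}

# Hypothetical 2x4 interior surrounded by walls; 'a' is the player, 'e' an exit.
BOARD = [
    "######",
    "#a.e.#",
    "#....#",
    "######",
]


def slide(board: List[str], pos: Tuple[int, int], direction: str) -> Tuple[int, int]:
    """Move from pos in direction until the next cell is a wall ('#')."""
    dr, dc = DELTAS[direction]
    r, c = pos
    # The border of '#' cells guarantees the loop terminates.
    while board[r + dr][c + dc] != "#":
        r, c = r + dr, c + dc
    return r, c


print(slide(BOARD, (1, 1), "RIGHT"))  # (1, 4): passes over the exit at (1, 3)
print(slide(BOARD, (1, 1), "DOWN"))   # (2, 1)
```

Because the player does not stop on `e`, reaching an exit requires a wall (or, in the real environment, another player) directly behind it in the direction of travel.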
openenv.yaml ADDED
@@ -0,0 +1,7 @@
+ spec_version: 1
+ name: maze_env
+ type: space
+ runtime: fastapi
+ app: server.app:app
+ port: 8000
+
openenv_maze_env.egg-info/PKG-INFO ADDED
@@ -0,0 +1,9 @@
+ Metadata-Version: 2.4
+ Name: openenv-maze_env
+ Version: 0.1.0
+ Summary: Maze Env environment for OpenEnv
+ Requires-Python: >=3.10
+ Requires-Dist: openenv-core[core]>=0.2.2
+ Provides-Extra: dev
+ Requires-Dist: pytest>=8.0.0; extra == "dev"
+ Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
openenv_maze_env.egg-info/SOURCES.txt ADDED
@@ -0,0 +1,19 @@
+ README.md
+ __init__.py
+ client.py
+ models.py
+ pyproject.toml
+ ./__init__.py
+ ./client.py
+ ./models.py
+ ./validate_dataset.py
+ ./dataset/ice-maze-levels.json
+ openenv_maze_env.egg-info/PKG-INFO
+ openenv_maze_env.egg-info/SOURCES.txt
+ openenv_maze_env.egg-info/dependency_links.txt
+ openenv_maze_env.egg-info/entry_points.txt
+ openenv_maze_env.egg-info/requires.txt
+ openenv_maze_env.egg-info/top_level.txt
+ server/__init__.py
+ server/app.py
+ server/maze_env_environment.py
openenv_maze_env.egg-info/dependency_links.txt ADDED
@@ -0,0 +1 @@
+
openenv_maze_env.egg-info/entry_points.txt ADDED
@@ -0,0 +1,2 @@
+ [console_scripts]
+ server = maze_env.server.app:main
openenv_maze_env.egg-info/requires.txt ADDED
@@ -0,0 +1,5 @@
+ openenv-core[core]>=0.2.2
+
+ [dev]
+ pytest>=8.0.0
+ pytest-cov>=4.0.0
openenv_maze_env.egg-info/top_level.txt ADDED
@@ -0,0 +1 @@
+ maze_env
pyproject.toml ADDED
@@ -0,0 +1,48 @@
+ # Copyright (c) Meta Platforms, Inc. and affiliates.
+ # All rights reserved.
+ #
+ # This source code is licensed under the BSD-style license found in the
+ # LICENSE file in the root directory of this source tree.
+
+ [build-system]
+ requires = ["setuptools>=45", "wheel"]
+ build-backend = "setuptools.build_meta"
+
+ [project]
+ name = "openenv-maze_env"
+ version = "0.1.0"
+ description = "Maze Env environment for OpenEnv"
+ requires-python = ">=3.10"
+ dependencies = [
+     # Core OpenEnv runtime (provides FastAPI server + HTTP client types)
+     # install from github
+     # "openenv-core[core] @ git+https://github.com/meta-pytorch/OpenEnv.git",
+     "openenv-core[core]>=0.2.2",
+     # Environment-specific dependencies
+     # Add all dependencies needed for your environment here
+     # Examples:
+     # "numpy>=1.19.0",
+     # "torch>=2.0.0",
+     # "gymnasium>=0.29.0",
+     # "openspiel>=1.0.0",
+     # "smolagents>=1.22.0,<2",
+ ]
+
+ [project.optional-dependencies]
+ dev = [
+     "pytest>=8.0.0",
+     "pytest-cov>=4.0.0",
+ ]
+
+ [project.scripts]
+ # Server entry point - enables running via: uv run --project . server
+ # or: python -m maze_env.server.app
+ server = "maze_env.server.app:main"
+
+ [tool.setuptools]
+ include-package-data = true
+ packages = ["maze_env", "maze_env.server"]
+ package-dir = { "maze_env" = ".", "maze_env.server" = "server" }
+
+ [tool.setuptools.package-data]
+ maze_env = ["dataset/ice-maze-levels.json"]
server/__init__.py ADDED
@@ -0,0 +1,11 @@
+ # Copyright (c) Meta Platforms, Inc. and affiliates.
+ # All rights reserved.
+ #
+ # This source code is licensed under the BSD-style license found in the
+ # LICENSE file in the root directory of this source tree.
+
+ """Maze Env environment server components."""
+
+ from .maze_env_environment import MazeEnvironment
+
+ __all__ = ["MazeEnvironment"]
server/app.py ADDED
@@ -0,0 +1,84 @@
+ # Copyright (c) Meta Platforms, Inc. and affiliates.
+ # All rights reserved.
+ #
+ # This source code is licensed under the BSD-style license found in the
+ # LICENSE file in the root directory of this source tree.
+
+ """
+ FastAPI application for the Maze Env Environment.
+
+ This module creates an HTTP server that exposes the MazeEnvironment
+ over HTTP and WebSocket endpoints, compatible with EnvClient.
+
+ Endpoints:
+     - POST /reset: Reset the environment
+     - POST /step: Execute an action
+     - GET /state: Get current environment state
+     - GET /schema: Get action/observation schemas
+     - WS /ws: WebSocket endpoint for persistent sessions
+
+ Usage:
+     # Development (with auto-reload):
+     uvicorn server.app:app --reload --host 0.0.0.0 --port 8000
+
+     # Production:
+     uvicorn server.app:app --host 0.0.0.0 --port 8000 --workers 4
+
+     # Or run directly:
+     python -m server.app
+ """
+
+ try:
+     from openenv.core.env_server.http_server import create_app
+ except Exception as e:  # pragma: no cover
+     raise ImportError(
+         "openenv is required for the web interface. Install dependencies with '\n uv sync\n'"
+     ) from e
+
+ try:
+     from ..models import MazeAction, MazeObservation
+     from .maze_env_environment import MazeEnvironment
+ except ModuleNotFoundError:
+     from models import MazeAction, MazeObservation
+     from server.maze_env_environment import MazeEnvironment
+
+
+ # Create the app with web interface and README integration
+ app = create_app(
+     MazeEnvironment,
+     MazeAction,
+     MazeObservation,
+     env_name="maze_env",
+     max_concurrent_envs=1,  # increase this number to allow more concurrent WebSocket sessions
+ )
+
+
+ def main(host: str = "0.0.0.0", port: int = 8000):
+     """
+     Entry point for direct execution via uv run or python -m.
+
+     This function enables running the server without Docker:
+         uv run --project . server
+         python -m maze_env.server.app
+         python -m maze_env.server.app --port 8001
+
+     Note: --port is parsed only in the ``__main__`` block below (python -m);
+     the ``server`` console script calls main() with its default arguments.
+
+     Args:
+         host: Host address to bind to (default: "0.0.0.0")
+         port: Port number to listen on (default: 8000)
+
+     For production deployments, consider using uvicorn directly with
+     multiple workers:
+         uvicorn maze_env.server.app:app --workers 4
+     """
+     import uvicorn
+
+     uvicorn.run(app, host=host, port=port)
+
+
+ if __name__ == "__main__":
+     import argparse
+
+     parser = argparse.ArgumentParser()
+     parser.add_argument("--port", type=int, default=8000)
+     args = parser.parse_args()
+     main(port=args.port)
server/maze_env_environment.py ADDED
@@ -0,0 +1,291 @@
+ # Copyright (c) Meta Platforms, Inc. and affiliates.
+ # All rights reserved.
+ #
+ # This source code is licensed under the BSD-style license found in the
+ # LICENSE file in the root directory of this source tree.
+
+ """
+ Ice Maze Environment Implementation.
+
+ Players slide on ice in a given direction until they hit a wall (#) or
+ another player. All players move simultaneously. The episode is solved
+ when every player is simultaneously on an exit cell (e) after a move.
+ """
+
+ from typing import FrozenSet, List, Optional, Tuple
+ from uuid import uuid4
+
+ from openenv.core.env_server.interfaces import Environment
+ from openenv.core.env_server.types import State
+
+ try:
+     from ..models import MazeAction, MazeObservation
+     from .maze_env_helpers import (
+         apply_direction_slide,
+         build_step_feedback,
+         build_system_prompt,
+         load_ice_maze_levels,
+         parse_board_entities,
+         render_board,
+         resolve_max_steps,
+     )
+ except ImportError:
+     try:
+         from models import MazeAction, MazeObservation
+         from server.maze_env_helpers import (
+             apply_direction_slide,
+             build_step_feedback,
+             build_system_prompt,
+             load_ice_maze_levels,
+             parse_board_entities,
+             render_board,
+             resolve_max_steps,
+         )
+     except ImportError:
+         import os
+         import sys
+
+         repo_root = os.path.dirname(os.path.dirname(__file__))
+         if repo_root not in sys.path:
+             sys.path.insert(0, repo_root)
+
+         from models import MazeAction, MazeObservation
+         from maze_env_helpers import (
+             apply_direction_slide,
+             build_step_feedback,
+             build_system_prompt,
+             load_ice_maze_levels,
+             parse_board_entities,
+             render_board,
+             resolve_max_steps,
+         )
+
+ class MazeEnvironment(Environment):
+     """
+     Ice Maze environment.
+
+     Each episode loads one puzzle level from dataset/ice-maze-levels.json.
+     Players slide on ice until hitting a wall or another player.
+     All players move simultaneously in the same direction each turn.
+     The episode ends when every player is on an exit cell simultaneously.
+
+     Supports concurrent WebSocket sessions (each session gets its own instance).
+     """
+
+     SUPPORTS_CONCURRENT_SESSIONS: bool = True
+
+     def __init__(self):
+         """Initialise with empty state; call reset() to load a level."""
+         super().__init__()
+         self._levels: List[dict] = load_ice_maze_levels()
+         self._current_level_index: int = 0
+         self._reset_index: int = 0
+         self._level: dict = {}
+         self._grid: List[List[str]] = []
+         self._num_players: int = 1
+         # Interior coords (0-based inside ``#`` wall): grid row/col = interior + 1
+         self._agent_positions: List[Tuple[int, int]] = []
+         # Goal tile interiors are fixed for the episode.
+         self._exit_positions: FrozenSet[Tuple[int, int]] = frozenset()
+         self._action_history: List[str] = []
+         self._max_steps: int = 0
+         self._done: bool = False
+         self._state: State = State(episode_id=str(uuid4()), step_count=0)
+
+     def _interior_positions_lists(self) -> Tuple[List[List[int]], List[List[int]]]:
+         """Return agent/exit interior positions as JSON-friendly `[row, col]` lists."""
+         agents = [[r, c] for r, c in self._agent_positions]
+         exits = [[r, c] for r, c in sorted(self._exit_positions)]
+         return agents, exits
+
+     # ------------------------------------------------------------------
+     # Public API
+     # ------------------------------------------------------------------
+
+     def reset(self, level_index: Optional[int] = None, **kwargs) -> MazeObservation:
+         """
+         Reset the environment and load a puzzle level.
+
+         Args:
+             level_index: If given, load this level index (modulo number of levels).
+                 If omitted, use the internal reset counter and advance it so
+                 successive resets cycle through the dataset.
+
+         Returns:
+             MazeObservation with the initial board state and full system prompt.
+         """
+         n = len(self._levels)
+         if level_index is not None:
+             idx = int(level_index) % n
+         else:
+             idx = self._reset_index % n
+         # Count every reset call, including manual level picks.
+         self._reset_index += 1
+         self._current_level_index = idx
+         self._level = self._levels[idx]
+
+         self._grid = [list(row) for row in self._level["annotated_board"]]
+         agent_list, exit_list = parse_board_entities(self._grid)
+         self._agent_positions = list(agent_list)
+         self._exit_positions = frozenset(exit_list)
+
+         self._num_players = self._level.get("players", len(self._agent_positions))
+         self._action_history = []
+         self._max_steps = resolve_max_steps(self._level, kwargs)
+         self._done = False
+         self._state = State(episode_id=str(uuid4()), step_count=0)
+
+         # Check degenerate case: all players already on exits
+         self._done = all(pos in self._exit_positions for pos in self._agent_positions)
+
+         ag, ex = self._interior_positions_lists()
+         return MazeObservation(
+             board=render_board(self._grid),
+             step_count=0,
+             max_steps=self._max_steps,
+             previous_actions=[],
+             system_prompt=self._full_system_prompt(),
+             agent_positions=ag,
+             exit_positions=ex,
+             num_players=self._num_players,
+             message="Level loaded. Find the exit!",
+             done=self._done,
+             reward=0.0,
+             # path/diameter live on self._level for offline rubrics only — not agent-facing
+             metadata={
+                 "level_index": idx,
+             },
+         )
+
+     def step(self, action: MazeAction, **kwargs) -> MazeObservation:  # type: ignore[override]
+         """
+         Execute one environment step for a direction command.
+
+         If the episode is already done, return the current state unchanged.
+         Otherwise apply one directional slide move to all players and update
+         step history, solved status, reward, and message.
+         """
+         # MazeAction enforces a valid MazeDirection; value is canonical ("LEFT", etc.).
+         direction = action.direction.value
+         if self._done:
+             return self._current_obs(
+                 message="Episode already complete. Call reset() to start a new episode.",
+                 reward=0.0,
+             )
+
+         any_slide_moved = apply_direction_slide(
+             grid=self._grid,
+             direction=direction,
+             num_players=self._num_players,
+             agent_positions=self._agent_positions,
+             exit_positions=self._exit_positions,
+         )
+
+         self._action_history.append(direction)
+         self._state.step_count += 1
+
+         self._done = all(pos in self._exit_positions for pos in self._agent_positions)
+         reward, message = build_step_feedback(
+             done=self._done,
+             moved=any_slide_moved,
+             direction=direction,
+             step_count=self._state.step_count,
+         )
+
+         prev = list(self._action_history)
+         ag, ex = self._interior_positions_lists()
+         return MazeObservation(
+             board=render_board(self._grid),
+             step_count=self._state.step_count,
+             max_steps=self._max_steps,
+             previous_actions=prev,
+             system_prompt=self._full_system_prompt(),
+             agent_positions=ag,
+             exit_positions=ex,
+             num_players=self._num_players,
+             message=message,
+             done=self._done,
+             reward=reward,
+             metadata={
+                 "level_index": self._current_level_index,
+                 "action_history": prev,
+             },
+         )
+
+     @property
+     def state(self) -> State:
+         """
+         Return the full current state for LLM introspection.
+
+         Includes board, positions, action history, and level metadata.
+         """
+         ag, ex = self._interior_positions_lists()
+         return State(
+             episode_id=self._state.episode_id,
+             step_count=self._state.step_count,
+             # Extra fields (State uses extra="allow")
+             current_board=render_board(self._grid),
+             num_players=self._num_players,
+             agent_positions=ag,
+             exit_positions=ex,
+             action_history=list(self._action_history),
+             level_index=self._current_level_index,
+             done=self._done,
+         )
+
+     def _full_system_prompt(self) -> str:
+         """Rules + board + positions + step budget + current step count and action history."""
+         ag, ex = self._interior_positions_lists()
+         return build_system_prompt(
+             width=self._level.get("width", "?"),
+             height=self._level.get("height", "?"),
+             num_players=self._num_players,
+             board=render_board(self._grid),
+             agent_positions_interior=ag,
+             exit_positions_interior=ex,
+             max_steps=self._max_steps,
+             step_count=self._state.step_count,
+             previous_actions=list(self._action_history),
+         )
+
+     def _current_obs(self, message: str, reward: float) -> MazeObservation:
+         """Return an observation reflecting the current state (no movement)."""
+         prev = list(self._action_history)
254
+ ag, ex = self._interior_positions_lists()
255
+ return MazeObservation(
256
+ board=render_board(self._grid),
257
+ step_count=self._state.step_count,
258
+ max_steps=self._max_steps,
259
+ previous_actions=prev,
260
+ system_prompt=self._full_system_prompt(),
261
+ agent_positions=ag,
262
+ exit_positions=ex,
263
+ num_players=self._num_players,
264
+ message=message,
265
+ done=self._done,
266
+ reward=reward,
267
+ metadata={
268
+ "level_index": self._current_level_index,
269
+ "action_history": prev,
270
+ },
271
+ )
272
+
273
+
274
+ # ---------------------------------------------------------------------------
275
+ # Quick smoke-test (run directly: python server/maze_env_environment.py)
276
+ # ---------------------------------------------------------------------------
277
+
278
+ # if __name__ == "__main__":
279
+ # env = MazeEnvironment()
280
+
281
+ # print("=== RESET (level 23) ===")
282
+ # obs = env.reset(level_index=23)
283
+ # print(obs)
284
+ # print(f"done={obs.done}, reward={obs.reward}")
285
+
286
+ # moves = ["UP", "LEFT", "DOWN", "RIGHT"]
287
+ # for move in moves:
288
+ # print(f"\n=== STEP: {move} ===")
289
+ # obs = env.step(MazeAction(direction=move))
290
+ # print(obs)
291
+ # print("######################################################")
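Review note: the commented-out smoke test above drives the environment with a fixed list of moves. The sketch below shows the same loop shape in a self-contained form, stopping when the level is solved or when the step budget is spent. It makes some assumptions: `run_episode` and `make_scripted_step` are hypothetical names, the step function is a stand-in that returns `(reward, done)` rather than a `MazeObservation`, and the reward values (1.0 for solving, -0.01 for an ordinary move) are copied from `build_step_feedback` in this commit. Whether the real environment enforces `max_steps` itself is not visible in this diff, so the budget check here sits in the caller.

```python
from typing import Callable, Iterable, Tuple

# Hypothetical stand-in for env.step(): takes a direction string and
# returns (reward, done). The real method returns a MazeObservation.
StepFn = Callable[[str], Tuple[float, bool]]


def run_episode(
    step_fn: StepFn, actions: Iterable[str], max_steps: int
) -> Tuple[float, int, bool]:
    """Apply actions until solved, out of actions, or out of budget."""
    total, steps, done = 0.0, 0, False
    for direction in actions:
        if done or steps >= max_steps:
            break
        reward, done = step_fn(direction)
        total += reward
        steps += 1
    return total, steps, done


def make_scripted_step(solve_on: int) -> StepFn:
    """Toy step function: every move succeeds; call number `solve_on` solves."""
    calls = {"n": 0}

    def step(direction: str) -> Tuple[float, bool]:
        calls["n"] += 1
        if calls["n"] >= solve_on:
            return 1.0, True  # solved reward, as in build_step_feedback
        return -0.01, False  # ordinary move penalty

    return step


if __name__ == "__main__":
    moves = ["UP", "LEFT", "DOWN", "RIGHT"]
    total, steps, done = run_episode(make_scripted_step(3), moves, max_steps=10)
    print(f"return={total:.2f} steps={steps} done={done}")
```

Keeping the budget check in the caller leaves `step()` unchanged; moving it into the environment would be a separate design decision.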
server/maze_env_helpers.py ADDED
@@ -0,0 +1,298 @@
1
+ # Copyright (c) Meta Platforms, Inc. and affiliates.
2
+ # All rights reserved.
3
+ #
4
+ # This source code is licensed under the BSD-style license found in the
5
+ # LICENSE file in the root directory of this source tree.
6
+
7
+ """Pure helpers for the Ice Maze environment (I/O, coords, prompts, slide order)."""
8
+
9
+ from __future__ import annotations
10
+
11
+ import json
12
+ import os
13
+ from typing import Dict, FrozenSet, List, Optional, Set, Tuple
14
+
15
+ try:
16
+ from ..models import MazeDirection
17
+ except ImportError:
18
+ from models import MazeDirection
19
+
20
+ # ---------------------------------------------------------------------------
21
+ # Level dataset
22
+ # ---------------------------------------------------------------------------
23
+
24
+ _LEVELS_CACHE: Optional[List[dict]] = None
25
+
26
+
27
+ def load_ice_maze_levels() -> List[dict]:
28
+ """Load the level JSON once and reuse it across environment instances."""
29
+ global _LEVELS_CACHE
30
+ if _LEVELS_CACHE is not None:
31
+ return _LEVELS_CACHE
32
+ levels_path = os.path.normpath(
33
+ os.path.join(os.path.dirname(__file__), "..", "dataset", "ice-maze-levels.json")
34
+ )
35
+ with open(levels_path, "r", encoding="utf-8") as f:
36
+ _LEVELS_CACHE = json.load(f)
37
+ return _LEVELS_CACHE
38
+
39
+
40
+ # ---------------------------------------------------------------------------
41
+ # Board / cell utilities
42
+ # ---------------------------------------------------------------------------
43
+
44
+ def parse_board_entities(
45
+ grid: List[List[str]],
46
+ ) -> Tuple[List[Tuple[int, int]], List[Tuple[int, int]]]:
47
+ """Extract player (`a`/`b`) and exit (`e`/`b`) positions from the grid as interior coordinates (border offset removed)."""
48
+ agents: List[Tuple[int, int]] = []
49
+ exits: Set[Tuple[int, int]] = set()
50
+
51
+ for br, row in enumerate(grid):
52
+ for bc, cell in enumerate(row):
53
+ ir, ic = br - 1, bc - 1
54
+ if cell == "e":
55
+ exits.add((ir, ic))
56
+ elif cell == "b":
57
+ agents.append((ir, ic))
58
+ exits.add((ir, ic))
59
+ elif cell == "a":
60
+ agents.append((ir, ic))
61
+
62
+ return agents, sorted(exits)
63
+
64
+
65
+ def render_board(grid: List[List[str]]) -> str:
66
+ """Convert the 2D grid into the text board sent in observations/prompts."""
67
+ return "\n".join("".join(row) for row in grid)
68
+
69
+
70
+ # ---------------------------------------------------------------------------
71
+ # Movement
72
+ # ---------------------------------------------------------------------------
73
+
74
+ DIRECTION_DELTAS: Dict[str, Tuple[int, int]] = {
75
+ MazeDirection.UP.value: (-1, 0),
76
+ MazeDirection.DOWN.value: (1, 0),
77
+ MazeDirection.LEFT.value: (0, -1),
78
+ MazeDirection.RIGHT.value: (0, 1),
79
+ }
80
+
81
+
82
+ def cell_at_interior(grid: List[List[str]], ir: int, ic: int) -> str:
83
+ """Read the grid character at an interior coordinate (border-offset by +1)."""
84
+ return grid[ir + 1][ic + 1]
85
+
86
+
87
+ def set_cell_at_interior(grid: List[List[str]], ir: int, ic: int, ch: str) -> None:
88
+ """Write a grid character at an interior coordinate (border-offset by +1)."""
89
+ grid[ir + 1][ic + 1] = ch
90
+
91
+
92
+ def can_move_to_cell(
93
+ grid: List[List[str]],
94
+ ir: int,
95
+ ic: int,
96
+ exit_positions: FrozenSet[Tuple[int, int]],
97
+ ) -> bool:
98
+ """Return whether a player may slide into this interior cell."""
99
+ br, bc = ir + 1, ic + 1
100
+ if br < 0 or bc < 0 or br >= len(grid) or bc >= len(grid[0]):
101
+ return False
102
+ ch = cell_at_interior(grid, ir, ic)
103
+ # Movable: empty ice or unoccupied exit.
104
+ if ch == "." or ch == "e":
105
+ return True
106
+ # Blocked: wall, occupied floor, occupied exit.
107
+ if ch == "#" or ch == "a" or ch == "b":
108
+ return False
109
+ return False  # Any other glyph is treated as blocked ("." was handled above).
110
+
111
+
112
+ def glyph_agent_enters(
113
+ ir: int,
114
+ ic: int,
115
+ exit_positions: FrozenSet[Tuple[int, int]],
116
+ ) -> str:
117
+ """Return destination glyph after an agent enters a cell."""
118
+ if (ir, ic) in exit_positions:
119
+ return "b"
120
+ return "a"
121
+
122
+
123
+ def glyph_after_agent_leaves(
124
+ ir: int,
125
+ ic: int,
126
+ exit_positions: FrozenSet[Tuple[int, int]],
127
+ ) -> str:
128
+ """Return source glyph after an agent leaves a cell."""
129
+ if (ir, ic) in exit_positions:
130
+ return "e"
131
+ return "."
132
+
133
+
134
+ def slide_one_agent(
135
+ grid: List[List[str]],
136
+ agent_positions: List[Tuple[int, int]],
137
+ agent_index: int,
138
+ dr: int,
139
+ dc: int,
140
+ exit_positions: FrozenSet[Tuple[int, int]],
141
+ ) -> bool:
142
+ """Slide one agent until blocked; return True if it moved at least one cell."""
143
+ moved = False
144
+ while True:
145
+ ir, ic = agent_positions[agent_index]
146
+ nr, nc = ir + dr, ic + dc
147
+ if not can_move_to_cell(grid, nr, nc, exit_positions):
148
+ break
149
+ set_cell_at_interior(
150
+ grid,
151
+ ir,
152
+ ic,
153
+ glyph_after_agent_leaves(ir, ic, exit_positions),
154
+ )
155
+ set_cell_at_interior(
156
+ grid,
157
+ nr,
158
+ nc,
159
+ glyph_agent_enters(nr, nc, exit_positions),
160
+ )
161
+ agent_positions[agent_index] = (nr, nc)
162
+ moved = True
163
+ return moved
164
+
165
+
166
+ def apply_direction_slide(
167
+ grid: List[List[str]],
168
+ direction: str,
169
+ num_players: int,
170
+ agent_positions: List[Tuple[int, int]],
171
+ exit_positions: FrozenSet[Tuple[int, int]],
172
+ ) -> bool:
173
+ """Apply one directional move to all players; return whether any player moved."""
174
+ dr, dc = DIRECTION_DELTAS[direction]
175
+ any_moved = False
176
+ indices = sorted_slide_player_indices(direction, num_players, agent_positions)
177
+ for agent_index in indices:
178
+ if slide_one_agent(
179
+ grid=grid,
180
+ agent_positions=agent_positions,
181
+ agent_index=agent_index,
182
+ dr=dr,
183
+ dc=dc,
184
+ exit_positions=exit_positions,
185
+ ):
186
+ any_moved = True
187
+ return any_moved
188
+
189
+
190
+ def build_step_feedback(done: bool, moved: bool, direction: str, step_count: int) -> Tuple[float, str]:
191
+ """Return step reward and status message from current transition outcome."""
192
+ if done:
193
+ return (
194
+ 1.0,
195
+ f"Solved! All players reached an exit in {step_count} step(s).",
196
+ )
197
+ if not moved:
198
+ return (-0.1, f"No player moved: every player is already blocked going {direction}.")
199
+ return (-0.01, f"Moved {direction}. Step {step_count}.")
200
+
201
+
202
+ def sorted_slide_player_indices(
203
+ direction: str,
204
+ num_players: int,
205
+ agent_positions: List[Tuple[int, int]],
206
+ ) -> List[int]:
207
+ """Order player updates so simultaneous sliding resolves collisions correctly."""
208
+ if direction == MazeDirection.LEFT.value:
209
+ return sorted(range(num_players), key=lambda i: agent_positions[i][1])
210
+ if direction == MazeDirection.RIGHT.value:
211
+ return sorted(range(num_players), key=lambda i: agent_positions[i][1], reverse=True)
212
+ if direction == MazeDirection.UP.value:
213
+ return sorted(range(num_players), key=lambda i: agent_positions[i][0])
214
+ return sorted(range(num_players), key=lambda i: agent_positions[i][0], reverse=True)
215
+
216
+
217
+ # ---------------------------------------------------------------------------
218
+ # Episode parameters
219
+ # ---------------------------------------------------------------------------
220
+
221
+ def resolve_max_steps(level: dict, reset_kwargs: Optional[dict] = None) -> int:
222
+ """Choose max steps from reset args, level config, or a diameter-based default."""
223
+ reset_kwargs = reset_kwargs or {}
224
+ if "max_steps" in reset_kwargs:
225
+ return int(reset_kwargs["max_steps"])
226
+ if "max_steps" in level:
227
+ return int(level["max_steps"])
228
+ path = level.get("path") or ""
229
+ diam = int(level.get("diameter", len(path) if path else 1))
230
+ return max(1, diam * 5)
231
+
232
+
233
+ # ---------------------------------------------------------------------------
234
+ # LLM system prompt
235
+ # ---------------------------------------------------------------------------
236
+
237
+ def build_system_prompt(
238
+ *,
239
+ width: object,
240
+ height: object,
241
+ num_players: int,
242
+ board: str,
243
+ agent_positions_interior: List[List[int]],
244
+ exit_positions_interior: List[List[int]],
245
+ max_steps: int,
246
+ step_count: int,
247
+ previous_actions: List[str],
248
+ ) -> str:
249
+ """Build the full system prompt text describing rules and current episode state."""
250
+ player_line = (
251
+ "There is 1 player on the board."
252
+ if num_players == 1
253
+ else f"There are {num_players} players on the board."
254
+ )
255
+ move_line = (
256
+ "Each turn, send a direction to move the player."
257
+ if num_players == 1
258
+ else "Each turn, ALL players move SIMULTANEOUSLY in the same direction."
259
+ )
260
+ block_line = (
261
+ "" if num_players == 1
262
+ else " - Players act as walls — they block each other's sliding.\n"
263
+ )
264
+ prev_display = ", ".join(previous_actions) if previous_actions else "(none yet)"
265
+
266
+ return (
267
+ f"You are playing an Ice Maze puzzle.\n"
268
+ f"\n"
269
+ f"BOARD ({width}×{height}):\n"
270
+ f"{board}\n"
271
+ f"\n"
272
+ f"SYMBOLS:\n"
273
+ f" # = Wall (impassable)\n"
274
+ f" . = Open cell (slippery ice)\n"
275
+ f" a = Player on a non-exit cell\n"
276
+ f" b = Player currently on an exit cell\n"
277
+ f" e = Exit cell (goal)\n"
278
+ f"\n"
279
+ f"RULES:\n"
280
+ f" - {player_line}\n"
281
+ f" - {move_line}\n"
282
+ f" - On ice, each player SLIDES until they hit a wall (#) or another player (a or b).\n"
283
+ f"{block_line}"
284
+ f" - Exit cells (e) do NOT stop sliding — players slide through or onto them.\n"
285
+ f" - After all players stop: if EVERY player is on an exit cell → you win!\n"
286
+ f" - Exit cells are shared — any player can use any exit.\n"
287
+ f"\n"
288
+ f"STEP BUDGET: at most {max_steps} steps for this level.\n"
289
+ f"\n"
290
+ f"VALID ACTIONS: \"LEFT\", \"RIGHT\", \"UP\", \"DOWN\"\n"
291
+ f"\n"
292
+ f"Current player position(s): {agent_positions_interior}\n"
293
+ f"Exit cell position(s): {exit_positions_interior}\n"
294
+ f"\n"
295
+ f"EPISODE PROGRESS:\n"
296
+ f" - Step count (moves so far): {step_count} / {max_steps}\n"
297
+ f" - Previous actions (oldest → newest): {prev_display}\n"
298
+ )
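Review note: the slide logic in this file is spread across `can_move_to_cell`, `slide_one_agent`, `apply_direction_slide`, and `sorted_slide_player_indices`. The sketch below condenses the same idea into one self-contained function so the ordering rule is easier to check by hand. It is an illustration under assumptions, not the code from this commit: `slide_all` is a hypothetical name, it takes the board as a list of strings, and it uses full-grid coordinates (border included) instead of the interior coordinates used above.

```python
from typing import List, Tuple

DELTAS = {"UP": (-1, 0), "DOWN": (1, 0), "LEFT": (0, -1), "RIGHT": (0, 1)}


def slide_all(board: List[str], direction: str) -> Tuple[List[str], bool]:
    """Slide every player in `direction`; return (new_board, any_moved)."""
    grid = [list(row) for row in board]
    cells = [(r, c, ch) for r, row in enumerate(grid) for c, ch in enumerate(row)]
    exits = {(r, c) for r, c, ch in cells if ch in "eb"}
    players = [(r, c) for r, c, ch in cells if ch in "ab"]
    dr, dc = DELTAS[direction]
    # Move the player furthest along the direction of travel first, so a
    # trailing player stops behind it rather than behind its old position.
    players.sort(key=lambda p: p[0] * dr + p[1] * dc, reverse=True)
    any_moved = False
    for r, c in players:
        while True:
            nr, nc = r + dr, c + dc
            in_bounds = 0 <= nr < len(grid) and 0 <= nc < len(grid[nr])
            # Only empty ice and unoccupied exits can be entered.
            if not in_bounds or grid[nr][nc] not in ".e":
                break
            grid[r][c] = "e" if (r, c) in exits else "."
            grid[nr][nc] = "b" if (nr, nc) in exits else "a"
            r, c = nr, nc
            any_moved = True
    return ["".join(row) for row in grid], any_moved


if __name__ == "__main__":
    start = ["#######", "#a.a.e#", "#######"]
    after_right, moved = slide_all(start, "RIGHT")
    print("\n".join(after_right), moved)
```

On this board a RIGHT move sends the right-hand player onto the exit and the left-hand player stops directly behind it, which is the collision case the sort order exists to handle.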
server/requirements.txt ADDED
@@ -0,0 +1,6 @@
1
+ openenv[core]>=0.2.0
2
+ fastapi>=0.115.0
3
+ uvicorn>=0.24.0
4
+
5
+
6
+
uv.lock ADDED
The diff for this file is too large to render. See raw diff