Spaces:
Sleeping
Sleeping
| # Maze Environment | |
| Integration of Maze game with the OpenEnv framework. | |
| ## Architecture | |
| ``` | |
| ββββββββββββββββββββββββββββββββββββββ | |
| β RL Training Code (Client) β | |
| β MazeEnv.step(action) β | |
| ββββββββββββββββ¬ββββββββββββββββββββββ | |
| β HTTP | |
| ββββββββββββββββΌββββββββββββββββββββββ | |
| β FastAPI Server (Docker) β | |
| β MazeEnvironment β | |
| β ββ Wraps Maze environment β | |
| β ββ Agent controls player β | |
| ββββββββββββββββββββββββββββββββββββββ | |
| ``` | |
| ## Installation & Usage | |
| ### Option 1: Local Development (without Docker) | |
| **Requirements:** | |
| - Python 3.11+ | |
| - Numpy | |
| ```python | |
| from envs.maze_env import MazeEnv, MazeAction | |
| # Start local server manually | |
| # python -m envs.maze_env.server.app | |
| # Connect to local server | |
| env = MazeEnv(base_url="http://localhost:8000") | |
| # Reset environment | |
| result = env.reset() | |
| print(f"Initial state: {result.observation.info_state}") | |
| print(f"Legal actions: {result.observation.legal_actions}") | |
| # Take actions | |
| for _ in range(10): | |
| action_id = result.observation.legal_actions[0] # Choose first legal action | |
| result = env.step(MazeAction(action_id=action_id)) | |
| print(f"Reward: {result.reward}, Done: {result.done}") | |
| if result.done: | |
| break | |
| # Cleanup | |
| env.close() | |
| ``` | |
| ### Option 2: Docker (Recommended) | |
| **Build Docker image:** | |
| ```bash | |
| cd OpenEnv | |
| docker build -f src/envs/maze_env/server/Dockerfile -t maze-env:latest . | |
| ``` | |
| **Use with from_docker_image():** | |
| ```python | |
| from envs.maze_env import MazeEnv, MazeAction | |
| # Automatically starts container | |
| env = MazeEnv.from_docker_image("maze-env:latest") | |
| result = env.reset() | |
| result = env.step(MazeAction(action_id=0)) | |
| env.close() # Stops container | |
| ``` | |
| ## Configuration | |
| ### Variables | |
| - `maze` : Maze as a numpy array saved in mazearray.py | |
| ### Example | |
| ```bash | |
| docker run -p 8000:8000 maze-env:latest | |
| ``` | |
| ## API Reference | |
| ### MazeAction | |
| ```python | |
| @dataclass | |
| class MazeAction(Action): | |
| action: int # Action to be taken | |
| ``` | |
| ### MazeObservation | |
| ```python | |
| @dataclass | |
| class MazeObservation(Observation): | |
| position: List[int] # [row, col] | |
| total_reward: float # Total reward | |
| legal_actions: List[int] = field(default_factory=list) # Legal action based on the current position | |
| ``` | |
| ### MazeState | |
| ```python | |
| @dataclass | |
| class MazeState(State): | |
| episode_id: str # Episode | |
| step_count: int # Number of steps | |
| done: bool = False # Solve status | |
| ``` | |
| ## References | |
| - [Maze Environment](https://github.com/erikdelange/Reinforcement-Learning-Maze) | |