metadata
title: Grid World Environment Server
emoji: π»
colorFrom: yellow
colorTo: pink
sdk: docker
pinned: false
app_port: 8000
base_path: /web
tags:
- openenv
Grid World Environment
Grid World is a simple 5x5 navigation task with a fixed goal at (4, 4). The agent moves with cardinal actions and receives a small step penalty until it reaches the goal. Each observation also includes a suggested_action that you can pass directly into the next step.
Architecture
┌────────────────────────────────────┐
│ RL Client                          │
│   GridWorldEnv.step(action)        │
└──────────────┬─────────────────────┘
               │ HTTP / WebSocket
┌──────────────▼─────────────────────┐
│ FastAPI Server (Docker)            │
│   GridWorldEnvironment             │
│   ├─ Reset/Step/State endpoints    │
│   ├─ Reward + termination          │
│   └─ Action validation             │
└────────────────────────────────────┘
Installation & Usage
Option 1: Local Development (without Docker)
Requirements:
- Python 3.11+
- uv (recommended) or pip
cd envs/grid_world
# Install the package and dependencies
uv pip install -e .
# or
pip install -e .
Run the server locally:
cd envs/grid_world
uv run --project . server --port 8000
# or
uvicorn server.app:app --reload --port 8000
Connect with the client:
from grid_world import GridWorldAction, GridWorldEnv
env = GridWorldEnv(base_url="http://localhost:8000")
result = env.reset()
print(result.observation.message)
action = result.observation.suggested_action
result = env.step(GridWorldAction(action=action))
print(result.observation.suggested_action, result.reward)
env.close()
Option 2: Docker (Recommended)
Build the image from the repo root:
cd /path/to/OpenEnv
docker build -f envs/grid_world/server/Dockerfile -t grid-world-env:latest .
Run the container:
docker run -p 8000:8000 grid-world-env:latest
Use with from_docker_image():
from grid_world import GridWorldAction, GridWorldEnv

env = None
try:
    # Create environment from Docker image
    env = GridWorldEnv.from_docker_image("grid-world-env:latest")

    # Reset to start a new episode
    result = env.reset()
    print(f"Initial suggested action: {result.observation.suggested_action}")
    print(f"Message: {result.observation.message}")

    # Play until done, always following the server's suggestion
    while not result.done:
        action = result.observation.suggested_action
        result = env.step(GridWorldAction(action=action))
        print(f"Reward: {result.reward}, Done: {result.done}")
finally:
    # Always release the client, even if an error occurred
    if env is not None:
        env.close()
API Endpoints
- GET /health - Container health check
- POST /reset - Reset the environment
- POST /step - Execute an action
- GET /state - Fetch current state
- GET /schema - Action/observation schema
- WS /ws - WebSocket endpoint for low-latency sessions
Environment Details
Actions
GridWorldAction
- action (enum): UP, DOWN, LEFT, RIGHT
Observations
GridWorldObservation
- x (int): Agent x position
- y (int): Agent y position
- suggested_action (MoveAction | null): Recommended next move toward the goal
- message (str): Status message
- reward (float | null): Reward for the transition
- done (bool): Episode termination flag
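The observation fields above map naturally onto a typed record. The sketch below is a stand-in written with standard-library dataclasses, not the actual contents of `models.py`. `ObservationSketch` and `parse_observation` are hypothetical names, and it assumes that enum members are serialized as their uppercase names (for example `"RIGHT"`).

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class MoveAction(str, Enum):
    # ASSUMPTION: members serialize as their uppercase names.
    UP = "UP"
    DOWN = "DOWN"
    LEFT = "LEFT"
    RIGHT = "RIGHT"


@dataclass
class ObservationSketch:
    """Hypothetical mirror of the documented observation fields."""
    x: int
    y: int
    suggested_action: Optional[MoveAction]
    message: str
    reward: Optional[float]
    done: bool


def parse_observation(payload: dict) -> ObservationSketch:
    """Convert a decoded JSON observation into a typed record."""
    raw = payload.get("suggested_action")
    return ObservationSketch(
        x=int(payload["x"]),
        y=int(payload["y"]),
        suggested_action=MoveAction(raw) if raw is not None else None,
        message=str(payload["message"]),
        reward=payload.get("reward"),  # null after reset
        done=bool(payload["done"]),
    )


# Illustrative payload; the message text is made up for this example.
obs = parse_observation(
    {"x": 0, "y": 0, "suggested_action": "RIGHT",
     "message": "example", "reward": None, "done": False}
)
```

Both nullable fields are handled explicitly: `reward` is null after a reset, and `suggested_action` is typed as nullable, so callers should check it before stepping.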
Rewards & Termination
- Each step gives -0.1 reward.
- Reaching (4, 4) yields +1.0 and done = True.
- Reset returns reward = null and done = False.
State
GET /state returns:
- episode_id, step_count
- agent_x, agent_y
- goal_x, goal_y
- grid_size, episode_steps
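One use for the state fields above is estimating how far the agent is from the goal. The sketch below computes the Manhattan distance, which equals the minimum number of cardinal moves on a grid with no obstacles (the README mentions none). The `remaining_steps` helper is hypothetical, and the example dictionary contains made-up values, not a real server response.

```python
def remaining_steps(state: dict) -> int:
    """Minimum number of cardinal moves from the agent to the goal."""
    return (abs(state["goal_x"] - state["agent_x"])
            + abs(state["goal_y"] - state["agent_y"]))


# Illustrative values only; field names follow the list above.
example_state = {
    "episode_id": "demo",
    "step_count": 3,
    "agent_x": 2,
    "agent_y": 1,
    "goal_x": 4,
    "goal_y": 4,
    "grid_size": 5,
    "episode_steps": 3,
}

print(remaining_steps(example_state))  # 5
```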
Project Structure
grid_world/
├── __init__.py                    # Module exports
├── README.md                      # This file
├── openenv.yaml                   # OpenEnv manifest
├── pyproject.toml                 # Project metadata and dependencies
├── uv.lock                        # Locked dependencies (generated)
├── client.py                      # GridWorldEnv client
├── models.py                      # Action and Observation models
└── server/
    ├── __init__.py                # Server module exports
    ├── grid_world_environment.py  # Core environment logic
    ├── app.py                     # FastAPI application
    └── Dockerfile                 # Container image definition