grid_world / README.md
Jiyaaaaaa's picture
Upload folder using huggingface_hub
054b7f6 verified
metadata
title: Grid World Environment Server
emoji: 🎻
colorFrom: yellow
colorTo: pink
sdk: docker
pinned: false
app_port: 8000
base_path: /web
tags:
  - openenv

Grid World Environment

Grid World is a simple 5x5 navigation task with a fixed goal at (4, 4). The agent moves with cardinal actions and receives a small step penalty until it reaches the goal. Each observation also includes a suggested_action that you can pass directly into the next step.

Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ RL Client                           β”‚
β”‚   GridWorldEnv.step(action)         β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
               β”‚ HTTP / WebSocket
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ FastAPI Server (Docker)             β”‚
β”‚   GridWorldEnvironment              β”‚
β”‚     β”œβ”€ Reset/Step/State endpoints   β”‚
β”‚     β”œβ”€ Reward + termination         β”‚
β”‚     └─ Action validation            β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Installation & Usage

Option 1: Local Development (without Docker)

Requirements:

  • Python 3.11+
  • uv (recommended) or pip
cd envs/grid_world

# Install the package and dependencies
uv pip install -e .
# or
pip install -e .

Run the server locally:

cd envs/grid_world
uv run --project . server --port 8000
# or
uvicorn server.app:app --reload --port 8000

Connect with the client:

from grid_world import GridWorldAction, GridWorldEnv

env = GridWorldEnv(base_url="http://localhost:8000")
result = env.reset()
print(result.observation.message)

action = result.observation.suggested_action
result = env.step(GridWorldAction(action=action))
print(result.observation.suggested_action, result.reward)

env.close()

Option 2: Docker (Recommended)

Build the image from the repo root:

cd /path/to/OpenEnv
docker build -f envs/grid_world/server/Dockerfile -t grid-world-env:latest .

Run the container:

docker run -p 8000:8000 grid-world-env:latest

Use with from_docker_image():

from grid_world import GridWorldAction, GridWorldEnv

env = None
try:
    # Create environment from Docker image
    env = GridWorldEnv.from_docker_image("grid-world-env:latest")

    # Reset to start a new episode
    result = env.reset()
    print(f"Initial suggested action: {result.observation.suggested_action}")
    print(f"Message: {result.observation.message}")

    # Play until done
    while not result.done:
        action = result.observation.suggested_action
        result = env.step(GridWorldAction(action=action))
        print(f"Reward: {result.reward}, Done: {result.done}")

finally:
    if env is not None:
        env.close()

API Endpoints

  • GET /health - Container health check
  • POST /reset - Reset the environment
  • POST /step - Execute an action
  • GET /state - Fetch current state
  • GET /schema - Action/observation schema
  • WS /ws - WebSocket endpoint for low-latency sessions

Environment Details

Actions

GridWorldAction

  • action (enum): UP, DOWN, LEFT, RIGHT

Observations

GridWorldObservation

  • x (int): Agent x position
  • y (int): Agent y position
  • suggested_action (MoveAction | null): Recommended next move toward the goal
  • message (str): Status message
  • reward (float | null): Reward for the transition
  • done (bool): Episode termination flag

Rewards & Termination

  • Each step gives -0.1 reward.
  • Reaching (4, 4) yields +1.0 and done = True.
  • Reset returns reward = null and done = False.

State

GET /state returns:

  • episode_id, step_count
  • agent_x, agent_y
  • goal_x, goal_y
  • grid_size, episode_steps

Project Structure

grid_world/
β”œβ”€β”€ __init__.py            # Module exports
β”œβ”€β”€ README.md              # This file
β”œβ”€β”€ openenv.yaml           # OpenEnv manifest
β”œβ”€β”€ pyproject.toml         # Project metadata and dependencies
β”œβ”€β”€ uv.lock                # Locked dependencies (generated)
β”œβ”€β”€ client.py              # GridWorldEnv client
β”œβ”€β”€ models.py              # Action and Observation models
└── server/
    β”œβ”€β”€ __init__.py        # Server module exports
    β”œβ”€β”€ grid_world_environment.py  # Core environment logic
    β”œβ”€β”€ app.py             # FastAPI application
    └── Dockerfile         # Container image definition