Crashbandicoote2/snake_env

Crashbandicoote2 committed on Nov 17, 2025

Commit 20bb95e (verified) · 1 Parent(s): db9ae8b

Upload folder using huggingface_hub

Files changed (18)

Dockerfile +34 -0
README.md +277 -5
__init__.py +12 -0
client.py +115 -0
models.py +70 -0
openenv.yaml +6 -0
openenv_snake_env.egg-info/PKG-INFO +17 -0
openenv_snake_env.egg-info/SOURCES.txt +13 -0
openenv_snake_env.egg-info/dependency_links.txt +1 -0
openenv_snake_env.egg-info/entry_points.txt +2 -0
openenv_snake_env.egg-info/requires.txt +13 -0
openenv_snake_env.egg-info/top_level.txt +3 -0
pyproject.toml +43 -0
server/__init__.py +7 -0
server/app.py +59 -0
server/requirements.txt +5 -0
server/snake_environment.py +246 -0
uv.lock +0 -0

Dockerfile ADDED

	@@ -0,0 +1,34 @@

+# Copyright (c) Meta Platforms, Inc. and affiliates.
+# All rights reserved.
+#
+# This source code is licensed under the BSD-style license found in the
+# LICENSE file in the root directory of this source tree.
+# Use the standard openenv base image
+# Built from: docker build -t openenv-base:latest -f src/core/containers/images/Dockerfile .
+# In GitHub Actions, this is overridden to use the GHCR base image
+ARG BASE_IMAGE=openenv-base:latest
+FROM ${BASE_IMAGE}
+# Install dependencies
+COPY src/envs/snake_env/server/requirements.txt /tmp/requirements.txt
+RUN pip install --no-cache-dir -r /tmp/requirements.txt && rm /tmp/requirements.txt
+# Copy only what's needed for this environment
+COPY src/core/ /app/src/core/
+COPY src/envs/snake_env/ /app/src/envs/snake_env/
+# Copy README for web interface documentation
+COPY src/envs/snake_env/README.md /app/README.md
+# Expose port
+EXPOSE 8000
+# Health check
+HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
+    CMD curl -f http://localhost:8000/health || exit 1
+# Run the FastAPI server
+# CMD ["uvicorn", "envs.snake_env.server.app:app", "--host", "0.0.0.0", "--port", "8000"]
+ENV ENABLE_WEB_INTERFACE=true
+CMD ["python", "-m", "uvicorn", "envs.snake_env.server.app:app", "--host", "0.0.0.0", "--port", "8000"]
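
The `HEALTHCHECK` above polls `http://localhost:8000/health` with `curl`. A client that starts the container itself may want the same readiness check before calling `reset()`. The following is a minimal sketch of such a poll using only the standard library; `wait_for_health` and its parameters are hypothetical names introduced here, not part of OpenEnv.

```python
import time
import urllib.request


def wait_for_health(url: str, timeout: float = 30.0, interval: float = 1.0) -> bool:
    """Poll `url` until it answers with a 2xx status or `timeout` seconds pass.

    Hypothetical helper (an assumption for illustration), mirroring the
    Dockerfile's curl-based health check from the client side.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval) as response:
                if 200 <= response.status < 300:
                    return True
        except OSError:
            # Connection refused, timeouts, and HTTP errors all land here;
            # the server is treated as not ready yet.
            pass
        time.sleep(interval)
    return False


# Example: wait_for_health("http://localhost:8000/health")
```

Catching `OSError` covers `urllib.error.URLError` and `HTTPError`, so a non-2xx response simply counts as "not ready" rather than raising.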

README.md CHANGED

@@ -1,10 +1,282 @@
 ---
-title: Snake Env
-emoji: 🌍
-colorFrom: blue
-colorTo: red
+title: Snake Environment Server
+emoji: 🐉
+colorFrom: 'blue'
+colorTo: 'green'
 sdk: docker
 pinned: false
+app_port: 8000
+base_path: /web
+tags:
+  - openenv
 ---
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+# Snake Environment
+A snake game environment for OpenEnv, based on the multi-agent Snake-v1 from [marlenv](https://github.com/kc-ml2/marlenv). It exposes a single-agent interface to the classic snake game: the snake must navigate a grid, eat fruits, and avoid walls and its own body.
+## Overview
+The Snake environment wraps the marlenv Snake-v1 environment to provide an OpenEnv-compatible interface. marlenv supports multiple snakes battling on a fixed-size grid, but this implementation runs a single snake (`num_snakes=1`).
+### Features
+- **Grid-based gameplay**: Configurable grid size (default: 20x20)
+- **Fruit collection**: Snake grows when eating fruits
+- **Partial observability**: Optional vision range for limited field of view
+- **Customizable rewards**: Configurable reward function for different game aspects
+- **Two control modes**:
+  - `snake`: Relative actions (turn left/right)
+  - `human`: Global directions (up/down/left/right)
+### Game Rules
+- Snake dies when its head hits a wall or its own body
+- Snake grows by one unit when it eats a fruit
+- Episode ends when the snake dies or reaches maximum steps
+- Rewards can be customized for: eating fruits, survival time, and death penalty
+## Quick Start
+### Using Docker (Recommended)
+```python
+from envs.snake_env import SnakeAction, SnakeEnv
+# Start environment from Docker image
+client = SnakeEnv.from_docker_image("snake-env:latest")
+# Reset to start new episode
+result = client.reset()
+print(f"Snake alive: {result.observation.alive}")
+print(f"Grid shape: {len(result.observation.grid)}x{len(result.observation.grid[0])}")
+# Take actions
+result = client.step(SnakeAction(action=0))  # Continue straight
+print(f"Reward: {result.reward}")
+print(f"Score: {result.observation.episode_score}")
+result = client.step(SnakeAction(action=1))  # Turn left
+result = client.step(SnakeAction(action=2))  # Turn right
+# Check game state
+state = client.state()
+print(f"Episode: {state.episode_id}")
+print(f"Steps: {state.step_count}")
+# Cleanup
+client.close()
+```
+### Using Local Server
+```bash
+# Install dependencies
+cd src/envs/snake_env
+pip install -e .
+# Run server
+uv run --project . server
+```
+Then connect from another terminal:
+```python
+from envs.snake_env import SnakeAction, SnakeEnv
+# Connect to running server
+client = SnakeEnv(base_url="http://localhost:8000")
+result = client.reset()
+result = client.step(SnakeAction(action=0))
+```
+## Actions
+The action space depends on the `observer` mode:
+### Snake Mode (Default)
+Relative actions based on current direction:
+- `0`: No-op (continue in same direction)
+- `1`: Turn left (90 degrees counterclockwise)
+- `2`: Turn right (90 degrees clockwise)
+### Human Mode
+Global directional actions:
+- `0`: No-op
+- `1`: Move left
+- `2`: Move right
+- `3`: Move down
+- `4`: Move up
+## Observations
+Each observation includes:
+- `grid`: The full game grid as a 2D array (height × width)
+- `observation`: Encoded observation based on vision range
+- `episode_score`: Cumulative score in current episode
+- `episode_steps`: Number of steps taken
+- `episode_fruits`: Number of fruits eaten
+- `episode_kills`: Number of kills (always 0 in single-agent mode)
+- `alive`: Whether the snake is still alive
+## Configuration
+### Environment Parameters
+```python
+from envs.snake_env.server.snake_environment import SnakeEnvironment
+env = SnakeEnvironment(
+    height=20,           # Grid height (default: 20)
+    width=20,            # Grid width (default: 20)
+    snake_length=3,      # Initial snake length (default: 3)
+    vision_range=5,      # Partial observability (None for full grid)
+    observer='snake',    # 'snake' or 'human' mode
+    max_episode_steps=1000,  # Maximum steps per episode
+    reward_dict={        # Custom reward function
+        'fruit': 1.0,    # Reward for eating fruit
+        'kill': 0.0,     # Reward for kills (multi-agent)
+        'lose': -1.0,    # Penalty for death
+        'win': 0.0,      # Reward for winning (multi-agent)
+        'time': 0.0,     # Reward per timestep
+    }
+)
+```
+### Custom Rewards
+You can customize the reward function to encourage different behaviors:
+```python
+# Encourage survival
+reward_dict = {
+    'fruit': 1.0,
+    'lose': -10.0,
+    'time': 0.01,  # Small reward for staying alive
+}
+# Fast fruit collection
+reward_dict = {
+    'fruit': 10.0,
+    'lose': -1.0,
+    'time': -0.01,  # Penalty for taking too long
+}
+```
+## Building and Deployment
+### Build Docker Image
+From the repository root:
+```bash
+# Build base image first (if not already built)
+docker build -t openenv-base:latest -f src/core/containers/images/Dockerfile .
+# Build snake environment image
+docker build -t snake-env:latest -f src/envs/snake_env/server/Dockerfile .
+```
+The Dockerfile uses `pip install` with `requirements.txt` for maximum compatibility.
+### Run Docker Container
+```bash
+# Run the container
+docker run -p 8000:8000 snake-env:latest
+# Or with environment variables
+docker run -p 8000:8000 \
+  -e ENABLE_WEB_INTERFACE=true \
+  snake-env:latest
+```
+### Web Interface
+When `ENABLE_WEB_INTERFACE=true` is set, you can access the web interface at `http://localhost:8000/web` to interact with the environment through your browser.
+## Dependencies
+The snake environment requires:
+- `marlenv`: Multi-agent snake game implementation
+- `gym==0.24.1`: OpenAI Gym (required by marlenv)
+- `numpy`: Numerical operations
+- `Pillow`: Image rendering (required by marlenv)
+- Standard OpenEnv dependencies (fastapi, pydantic, uvicorn)
+These are automatically installed when using Docker or installing via pip.
+## Example Training Loop
+```python
+from envs.snake_env import SnakeAction, SnakeEnv
+import random
+# Connect to environment
+env = SnakeEnv.from_docker_image("snake-env:latest")
+# Training loop
+for episode in range(10):
+    result = env.reset()
+    total_reward = 0
+    done = False
+    while not done:
+        # Simple random policy (replace with your agent)
+        action = SnakeAction(action=random.randint(0, 2))
+        result = env.step(action)
+        total_reward += result.reward or 0.0  # reward may be None if the server omits it
+        done = result.done
+    print(f"Episode {episode}: Reward={total_reward}, "
+          f"Fruits={result.observation.episode_fruits}, "
+          f"Steps={result.observation.episode_steps}")
+env.close()
+```
+## Troubleshooting
+### marlenv Installation Issues
+If you encounter issues installing marlenv, you can install it from source:
+```bash
+pip install git+https://github.com/kc-ml2/marlenv.git
+```
+### Import Errors
+Make sure you're in the correct directory when running the server:
+```bash
+cd src/envs/snake_env
+uv run --project . server
+```
+### Docker Build Issues
+Ensure the base image is built first:
+```bash
+docker build -t openenv-base:latest -f src/core/containers/images/Dockerfile .
+```
+## Citation
+The underlying snake game is from marlenv:
+```bibtex
+@MISC{marlenv2021,
+    author = {ML2},
+    title = {Marlenv, Multi-agent Reinforcement Learning Environment},
+    howpublished = {\url{http://github.com/kc-ml2/marlenv}},
+    year = {2021}
+}
+```
+## License
+BSD 3-Clause License - See LICENSE file in the root directory.
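
The relative `snake` mode actions listed in the README (0 = no-op, 1 = turn left, 2 = turn right) can be read as rotations of the current heading. The sketch below illustrates that reading only; it is not marlenv's implementation. The `HEADINGS` table, the row/column convention (row 0 at the top), and both helper functions are assumptions introduced here.

```python
from typing import Tuple

# Hypothetical helpers, not part of marlenv or OpenEnv.
# Headings in clockwise order as (row_delta, col_delta), with row 0 at the top.
HEADINGS = [(-1, 0), (0, 1), (1, 0), (0, -1)]  # up, right, down, left


def apply_relative_action(heading_idx: int, action: int) -> int:
    """Return the heading index after a relative action (0, 1, or 2)."""
    if action == 0:  # no-op: keep going straight
        return heading_idx
    if action == 1:  # turn left: one step counterclockwise
        return (heading_idx - 1) % 4
    if action == 2:  # turn right: one step clockwise
        return (heading_idx + 1) % 4
    raise ValueError(f"snake mode expects an action in [0, 2], got {action}")


def next_head(head: Tuple[int, int], heading_idx: int, action: int):
    """Apply the action, then move the head one cell along the new heading."""
    new_idx = apply_relative_action(heading_idx, action)
    d_row, d_col = HEADINGS[new_idx]
    return (head[0] + d_row, head[1] + d_col), new_idx
```

Storing the headings in clockwise order turns both turns into index arithmetic modulo 4, which is why the same action integer means different grid moves depending on where the snake is currently facing.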

__init__.py ADDED

	@@ -0,0 +1,12 @@

+# Copyright (c) Meta Platforms, Inc. and affiliates.
+# All rights reserved.
+#
+# This source code is licensed under the BSD-style license found in the
+# LICENSE file in the root directory of this source tree.
+"""Snake Environment - A multi-agent snake game environment based on marlenv."""
+from .client import SnakeEnv
+from .models import SnakeAction, SnakeObservation
+__all__ = ["SnakeAction", "SnakeObservation", "SnakeEnv"]

client.py ADDED

	@@ -0,0 +1,115 @@

+# Copyright (c) Meta Platforms, Inc. and affiliates.
+# All rights reserved.
+#
+# This source code is licensed under the BSD-style license found in the
+# LICENSE file in the root directory of this source tree.
+"""
+Snake Environment HTTP Client.
+This module provides the client for connecting to a Snake Environment server
+over HTTP.
+"""
+from typing import Dict
+# Support both in-repo and standalone imports
+try:
+    # In-repo imports (when running from OpenEnv repository)
+    from core.client_types import StepResult
+    from core.env_server.types import State
+    from core.http_env_client import HTTPEnvClient
+    from .models import SnakeAction, SnakeObservation
+except ImportError:
+    # Standalone imports (when environment is standalone with openenv-core from pip)
+    from models import SnakeAction, SnakeObservation
+    from openenv_core.client_types import StepResult
+    from openenv_core.env_server.types import State
+    from openenv_core.http_env_client import HTTPEnvClient
+class SnakeEnv(HTTPEnvClient[SnakeAction, SnakeObservation]):
+    """
+    HTTP client for the Snake Environment.
+    This client connects to a SnakeEnvironment HTTP server and provides
+    methods to interact with it: reset(), step(), and state access.
+    Example:
+        >>> # Connect to a running server
+        >>> client = SnakeEnv(base_url="http://localhost:8000")
+        >>> result = client.reset()
+        >>> print(result.observation.alive)  # True
+        >>>
+        >>> # Take an action (turn left)
+        >>> result = client.step(SnakeAction(action=1))
+        >>> print(result.observation.episode_score)
+        >>> print(result.reward)
+    Example with Docker:
+        >>> # Automatically start container and connect
+        >>> client = SnakeEnv.from_docker_image("snake-env:latest")
+        >>> result = client.reset()
+        >>> result = client.step(SnakeAction(action=0))  # noop
+    """
+    def _step_payload(self, action: SnakeAction) -> Dict:
+        """
+        Convert SnakeAction to JSON payload for step request.
+        Args:
+            action: SnakeAction instance
+        Returns:
+            Dictionary representation suitable for JSON encoding
+        """
+        return {
+            "action": action.action,
+        }
+    def _parse_result(self, payload: Dict) -> StepResult[SnakeObservation]:
+        """
+        Parse server response into StepResult[SnakeObservation].
+        Args:
+            payload: JSON response from server
+        Returns:
+            StepResult with SnakeObservation
+        """
+        obs_data = payload.get("observation", {})
+        observation = SnakeObservation(
+            grid=obs_data.get("grid", []),
+            observation=obs_data.get("observation", []),
+            episode_score=obs_data.get("episode_score", 0.0),
+            episode_steps=obs_data.get("episode_steps", 0),
+            episode_fruits=obs_data.get("episode_fruits", 0),
+            episode_kills=obs_data.get("episode_kills", 0),
+            alive=obs_data.get("alive", True),
+            done=payload.get("done", False),
+            reward=payload.get("reward"),
+            metadata=obs_data.get("metadata", {}),
+        )
+        return StepResult(
+            observation=observation,
+            reward=payload.get("reward"),
+            done=payload.get("done", False),
+        )
+    def _parse_state(self, payload: Dict) -> State:
+        """
+        Parse server response into State object.
+        Args:
+            payload: JSON response from /state endpoint
+        Returns:
+            State object with episode_id and step_count
+        """
+        return State(
+            episode_id=payload.get("episode_id"),
+            step_count=payload.get("step_count", 0),
+        )
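
`_parse_result` above falls back to a default whenever the server omits a field, and leaves `reward` as `None` when it is missing. A self-contained sketch of that pattern follows. `MiniObservation` and `parse_observation` are stand-ins defined here so the snippet runs without `openenv_core`; they cover only a subset of the real fields.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class MiniObservation:
    """Stand-in for SnakeObservation (assumption: reduced field set)."""

    grid: List[List[int]] = field(default_factory=list)
    episode_score: float = 0.0
    alive: bool = True
    done: bool = False
    reward: Optional[float] = None


def parse_observation(payload: Dict[str, Any]) -> MiniObservation:
    """Build an observation from a JSON payload, tolerating missing keys."""
    obs = payload.get("observation", {})
    return MiniObservation(
        grid=obs.get("grid", []),
        episode_score=obs.get("episode_score", 0.0),
        alive=obs.get("alive", True),
        done=payload.get("done", False),   # top-level, not inside "observation"
        reward=payload.get("reward"),      # stays None when absent
    )


empty = parse_observation({})
full = parse_observation(
    {
        "observation": {"grid": [[0, 1]], "episode_score": 2.0, "alive": False},
        "done": True,
        "reward": -1.0,
    }
)
```

Because `reward` can be `None`, callers that accumulate rewards may want to guard with something like `result.reward or 0.0`.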

models.py ADDED

	@@ -0,0 +1,70 @@

+# Copyright (c) Meta Platforms, Inc. and affiliates.
+# All rights reserved.
+#
+# This source code is licensed under the BSD-style license found in the
+# LICENSE file in the root directory of this source tree.
+"""
+Data models for the Snake Environment.
+The Snake environment is a multi-agent reinforcement learning environment
+based on marlenv's Snake-v1. Multiple snakes battle on a fixed-size grid map.
+"""
+from dataclasses import dataclass
+from typing import List
+# Support both in-repo and standalone imports
+try:
+    # In-repo imports (when running from OpenEnv repository)
+    from core.env_server.types import Action, Observation
+except ImportError:
+    # Standalone imports (when environment is standalone with openenv-core from pip)
+    from openenv_core.env_server.types import Action, Observation
+@dataclass(kw_only=True)
+class SnakeAction(Action):
+    """
+    Action for the Snake environment.
+    For single snake (observer='snake'):
+        action: int in [0, 1, 2]
+            0 = noop (continue in same direction)
+            1 = turn left (90 degrees)
+            2 = turn right (90 degrees)
+    For single snake (observer='human'):
+        action: int in [0, 1, 2, 3, 4]
+            0 = noop
+            1 = left
+            2 = right
+            3 = down
+            4 = up
+    """
+    action: int
+@dataclass(kw_only=True)
+class SnakeObservation(Observation):
+    """
+    Observation from the Snake environment.
+    Attributes:
+        grid: The current game grid as a nested list (height x width)
+        observation: The encoded observation for the snake (can be full grid or vision range)
+        episode_score: Total score accumulated in this episode
+        episode_steps: Number of steps taken in this episode
+        episode_fruits: Number of fruits eaten in this episode
+        episode_kills: Number of kills in this episode
+        alive: Whether the snake is still alive
+    """
+    grid: List[List[int]]
+    observation: List[List[List[float]]]  # H x W x C observation
+    episode_score: float = 0.0
+    episode_steps: int = 0
+    episode_fruits: int = 0
+    episode_kills: int = 0
+    alive: bool = True
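
The `SnakeAction` docstring gives a different valid range per observer mode, but the dataclass itself accepts any `int`. One way to catch out-of-range actions before they reach the game is a small range check. The sketch below is an assumption for illustration: `VALID_ACTIONS` and `validate_action` are hypothetical names, not part of the models shown in this diff.

```python
# Valid action ranges taken from the SnakeAction docstring.
VALID_ACTIONS = {
    "snake": range(3),  # 0 = noop, 1 = turn left, 2 = turn right
    "human": range(5),  # 0 = noop, 1 = left, 2 = right, 3 = down, 4 = up
}


def validate_action(action: int, observer: str = "snake") -> int:
    """Return `action` unchanged if it is valid for `observer`, else raise."""
    if observer not in VALID_ACTIONS:
        raise ValueError(f"unknown observer mode: {observer!r}")
    # bool is a subclass of int, so reject it explicitly
    if isinstance(action, bool) or not isinstance(action, int):
        raise TypeError(f"action must be an int, got {type(action).__name__}")
    if action not in VALID_ACTIONS[observer]:
        raise ValueError(
            f"action {action} is out of range for observer={observer!r}"
        )
    return action
```

A check like this could be called from a `__post_init__` hook or at the top of the server's `step`, assuming the observer mode is known at that point.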

openenv.yaml ADDED

	@@ -0,0 +1,6 @@

+spec_version: 1
+name: snake_env
+type: space
+runtime: fastapi
+app: server.app:app
+port: 8000

openenv_snake_env.egg-info/PKG-INFO ADDED

	@@ -0,0 +1,17 @@

+Metadata-Version: 2.4
+Name: openenv-snake-env
+Version: 0.1.0
+Summary: Snake Environment for OpenEnv - multi-agent snake game based on marlenv
+Requires-Python: >=3.10
+Requires-Dist: openenv-core>=0.1.0
+Requires-Dist: fastapi>=0.115.0
+Requires-Dist: pydantic>=2.0.0
+Requires-Dist: uvicorn>=0.24.0
+Requires-Dist: requests>=2.31.0
+Requires-Dist: marlenv>=1.0.0
+Requires-Dist: gym==0.24.1
+Requires-Dist: numpy>=1.24.0
+Requires-Dist: Pillow>=10.0.0
+Provides-Extra: dev
+Requires-Dist: pytest>=8.0.0; extra == "dev"
+Requires-Dist: pytest-cov>=4.0.0; extra == "dev"

openenv_snake_env.egg-info/SOURCES.txt ADDED

	@@ -0,0 +1,13 @@

+README.md
+client.py
+models.py
+pyproject.toml
+openenv_snake_env.egg-info/PKG-INFO
+openenv_snake_env.egg-info/SOURCES.txt
+openenv_snake_env.egg-info/dependency_links.txt
+openenv_snake_env.egg-info/entry_points.txt
+openenv_snake_env.egg-info/requires.txt
+openenv_snake_env.egg-info/top_level.txt
+server/__init__.py
+server/app.py
+server/snake_environment.py

openenv_snake_env.egg-info/dependency_links.txt ADDED

	@@ -0,0 +1 @@


+

openenv_snake_env.egg-info/entry_points.txt ADDED

	@@ -0,0 +1,2 @@


+[console_scripts]
+server = server.app:main

openenv_snake_env.egg-info/requires.txt ADDED

	@@ -0,0 +1,13 @@

+openenv-core>=0.1.0
+fastapi>=0.115.0
+pydantic>=2.0.0
+uvicorn>=0.24.0
+requests>=2.31.0
+marlenv>=1.0.0
+gym==0.24.1
+numpy>=1.24.0
+Pillow>=10.0.0
+[dev]
+pytest>=8.0.0
+pytest-cov>=4.0.0

openenv_snake_env.egg-info/top_level.txt ADDED

	@@ -0,0 +1,3 @@

+client
+models
+server

pyproject.toml ADDED

	@@ -0,0 +1,43 @@

+# Copyright (c) Meta Platforms, Inc. and affiliates.
+# All rights reserved.
+#
+# This source code is licensed under the BSD-style license found in the
+# LICENSE file in the root directory of this source tree.
+[build-system]
+requires = ["setuptools>=45", "wheel"]
+build-backend = "setuptools.build_meta"
+[project]
+name = "openenv-snake-env"
+version = "0.1.0"
+description = "Snake Environment for OpenEnv - multi-agent snake game based on marlenv"
+requires-python = ">=3.10"
+dependencies = [
+    # Core OpenEnv dependencies (required for server functionality)
+    "openenv-core>=0.1.0",
+    "fastapi>=0.115.0",
+    "pydantic>=2.0.0",
+    "uvicorn>=0.24.0",
+    "requests>=2.31.0",
+    # Snake environment specific dependencies
+    "marlenv>=1.0.0",  # Multi-agent snake game environment
+    "gym==0.24.1",  # Required by marlenv
+    "numpy>=1.24.0",
+    "Pillow>=10.0.0",  # Required by marlenv for image rendering
+]
+[project.optional-dependencies]
+dev = [
+    "pytest>=8.0.0",
+    "pytest-cov>=4.0.0",
+]
+[project.scripts]
+# Server entry point - enables running via: uv run --project . server
+# or: python -m server.app
+server = "server.app:main"
+[tool.setuptools]
+py-modules = ["models", "client"]
+packages = ["server"]

server/__init__.py ADDED

	@@ -0,0 +1,7 @@

+# Copyright (c) Meta Platforms, Inc. and affiliates.
+# All rights reserved.
+#
+# This source code is licensed under the BSD-style license found in the
+# LICENSE file in the root directory of this source tree.
+"""Snake Environment Server - FastAPI HTTP server for snake game."""

server/app.py ADDED

	@@ -0,0 +1,59 @@

+# Copyright (c) Meta Platforms, Inc. and affiliates.
+# All rights reserved.
+#
+# This source code is licensed under the BSD-style license found in the
+# LICENSE file in the root directory of this source tree.
+"""
+FastAPI application for the Snake Environment.
+This module creates an HTTP server that exposes the SnakeEnvironment
+over HTTP endpoints, making it compatible with HTTPEnvClient.
+Usage:
+    # Development (with auto-reload):
+    uvicorn server.app:app --reload --host 0.0.0.0 --port 8000
+    # Production:
+    uvicorn server.app:app --host 0.0.0.0 --port 8000 --workers 4
+    # Or run directly:
+    uv run --project . server
+"""
+# Support both in-repo and standalone imports
+try:
+    # In-repo imports (when running from OpenEnv repository)
+    from core.env_server.http_server import create_app
+    from ..models import SnakeAction, SnakeObservation
+    from .snake_environment import SnakeEnvironment
+except ImportError:
+    # Standalone imports (when environment is standalone with openenv-core from pip)
+    from openenv_core.env_server.http_server import create_app
+    from models import SnakeAction, SnakeObservation
+    from server.snake_environment import SnakeEnvironment
+# Create the environment instance
+env = SnakeEnvironment()
+# Create the app with web interface and README integration
+app = create_app(env, SnakeAction, SnakeObservation, env_name="snake_env")
+def main():
+    """
+    Entry point for direct execution via uv run or python -m.
+    This function enables running the server without Docker:
+        uv run --project . server
+        python -m envs.snake_env.server.app
+        openenv serve snake_env
+    """
+    import uvicorn
+    uvicorn.run(app, host="0.0.0.0", port=8000)
+if __name__ == "__main__":
+    main()

server/requirements.txt ADDED

	@@ -0,0 +1,5 @@

+# Snake environment dependencies
+marlenv>=1.0.0
+gym==0.24.1
+numpy>=1.24.0
+Pillow>=10.0.0

server/snake_environment.py ADDED

	@@ -0,0 +1,246 @@

+# Copyright (c) Meta Platforms, Inc. and affiliates.
+# All rights reserved.
+#
+# This source code is licensed under the BSD-style license found in the
+# LICENSE file in the root directory of this source tree.
+"""
+Snake Environment Implementation.
+A multi-agent snake game environment that wraps marlenv's Snake-v1.
+This implementation provides a single-agent interface by wrapping the
+multi-agent marlenv environment.
+"""
+from uuid import uuid4
+import gym
+import marlenv.envs  # Register marlenv environments with gym
+import numpy as np
+# Support both in-repo and standalone imports
+try:
+    # In-repo imports (when running from OpenEnv repository)
+    from core.env_server.interfaces import Environment
+    from core.env_server.types import State
+    from ..models import SnakeAction, SnakeObservation
+except ImportError:
+    # Standalone imports (when environment is standalone with openenv-core from pip)
+    from models import SnakeAction, SnakeObservation
+    from openenv_core.env_server.interfaces import Environment
+    from openenv_core.env_server.types import State
+class SingleAgentWrapper(gym.Wrapper):
+    """
+    Custom wrapper to convert multi-agent marlenv to single-agent.
+    This wrapper properly handles the conversion without triggering
+    gym 0.24.1's strict type checking on done flags.
+    """
+    def __init__(self, env):
+        super().__init__(env)
+        # Unwrap observation and action spaces for single agent
+        if hasattr(env.observation_space, '__getitem__'):
+            self.observation_space = env.observation_space[0]
+        if hasattr(env.action_space, '__getitem__'):
+            self.action_space = env.action_space[0]
+    def reset(self, **kwargs):
+        obs = self.env.reset(**kwargs)
+        # Remove first dimension if it's a multi-agent array (num_agents, H, W, C)
+        if hasattr(obs, 'shape') and len(obs.shape) == 4 and obs.shape[0] == 1:
+            return obs[0]  # Return (H, W, C)
+        # Return first agent's observation if it's a list
+        if isinstance(obs, list):
+            return obs[0]
+        return obs
+    def step(self, action):
+        # Wrap action in list for multi-agent env
+        obs, rewards, dones, info = self.env.step([action])
+        # Unwrap returns for single agent
+        # Handle observation: remove first dimension if shape is (1, H, W, C)
+        if hasattr(obs, 'shape') and len(obs.shape) == 4 and obs.shape[0] == 1:
+            obs = obs[0]  # Convert (1, H, W, C) -> (H, W, C)
+        elif isinstance(obs, list):
+            obs = obs[0]
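+        # Per-agent values may arrive as lists, tuples, or NumPy arrays;
+        # np.ndim is 0 for plain scalars, so those pass through unchanged.
+        reward = rewards[0] if np.ndim(rewards) > 0 else rewards
+        done = dones[0] if np.ndim(dones) > 0 else dones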
+        # Ensure done is a boolean (not numpy bool)
+        done = bool(done)
+        return obs, reward, done, info
+class SnakeEnvironment(Environment):
+    """
+    A snake game environment that wraps marlenv's Snake-v1.
+    This environment provides a single-agent interface to the multi-agent
+    snake game. The snake must navigate a grid, eat fruits, and avoid walls
+    and its own body.
+    Args:
+        height: Height of the grid map (default: 20)
+        width: Width of the grid map (default: 20)
+        snake_length: Initial length of the snake (default: 3)
+        vision_range: Vision range for partial observability (default: None for full grid)
+        observer: 'snake' for relative actions or 'human' for global directions (default: 'snake')
+        max_episode_steps: Maximum steps per episode (default: 1000)
+        reward_dict: Custom reward function (default: fruit=1.0, kill=0.0, lose=-1.0, win=100.0, time=0.001)
+    Example:
+        >>> env = SnakeEnvironment()
+        >>> obs = env.reset()
+        >>> print(obs.alive)  # True
+        >>>
+        >>> obs = env.step(SnakeAction(action=1))  # Turn left
+        >>> print(obs.episode_score)
+        >>> print(obs.reward)
+    """
+    def __init__(
+        self,
+        height: int = 20,
+        width: int = 20,
+        snake_length: int = 3,
+        vision_range: int | None = None,
+        observer: str = "snake",
+        max_episode_steps: int = 1000,
+        reward_dict: dict | None = None,
+    ):
+        """Initialize the snake environment."""
+        self._state = State(episode_id=str(uuid4()), step_count=0)
+        # Default reward function
+        if reward_dict is None:
+            reward_dict = {
+                "fruit": 1.0,
+                "kill": 0.0,
+                "lose": -1.0,
+                "win": 100.0,
+                "time": 0.001,
+            }
+        # Create the marlenv snake environment for single agent
+        # Note: We don't use gym.make directly to avoid gym 0.24.1 wrappers
+        from marlenv.envs.snake_env import SnakeEnv as MarlenvSnake
+        self.base_env = MarlenvSnake(
+            height=height,
+            width=width,
+            num_snakes=1,  # Single agent
+            snake_length=snake_length,
+            vision_range=vision_range,
+            frame_stack=1,
+            observer=observer,
+            reward_dict=reward_dict,
+            max_episode_steps=max_episode_steps,
+        )
+        # Wrap with our custom SingleAgent wrapper
+        self.env = SingleAgentWrapper(self.base_env)
+        # Track episode statistics
+        self._episode_score = 0.0
+        self._episode_fruits = 0
+        self._episode_kills = 0
+    def reset(self) -> SnakeObservation:
+        """
+        Reset the environment.
+        Returns:
+            SnakeObservation with initial game state
+        """
+        self._state = State(episode_id=str(uuid4()), step_count=0)
+        self._episode_score = 0.0
+        self._episode_fruits = 0
+        self._episode_kills = 0
+        # Reset the marlenv environment
+        obs = self.env.reset()
+        # Convert observation to list format
+        obs_list = obs.tolist() if isinstance(obs, np.ndarray) else obs
+        # Get the grid from the environment (access base env directly)
+        grid = self.base_env.grid.tolist() if hasattr(self.base_env, "grid") else []
+        return SnakeObservation(
+            grid=grid,
+            observation=obs_list,
+            episode_score=self._episode_score,
+            episode_steps=self._state.step_count,
+            episode_fruits=self._episode_fruits,
+            episode_kills=self._episode_kills,
+            alive=True,
+            done=False,
+            reward=0.0,
+        )
+    def step(self, action: SnakeAction) -> SnakeObservation:  # type: ignore[override]
+        """
+        Execute a step in the environment.
+        Args:
+            action: SnakeAction containing the action to take
+        Returns:
+            SnakeObservation with the result of the action
+        """
+        self._state.step_count += 1
+        # Execute action in marlenv
+        obs, reward, done, info = self.env.step(action.action)
+        # Update episode statistics
+        self._episode_score += float(reward)  # keep the score a plain Python float
+        # Convert observation to list format
+        obs_list = obs.tolist() if isinstance(obs, np.ndarray) else obs
+        # Get the grid from the environment (access base env directly)
+        grid = self.base_env.grid.tolist() if hasattr(self.base_env, "grid") else []
+        # Extract episode statistics from info if available
+        # info holds per-agent lists; fall back to the tracked value if a key is absent
+        episode_fruits = info.get("episode_fruits", [self._episode_fruits])[0]
+        episode_kills = info.get("episode_kills", [self._episode_kills])[0]
+        return SnakeObservation(
+            grid=grid,
+            observation=obs_list,
+            episode_score=self._episode_score,
+            episode_steps=self._state.step_count,
+            episode_fruits=int(episode_fruits),
+            episode_kills=int(episode_kills),
+            alive=not done,
+            done=done,
+            reward=float(reward),
+            metadata={"info": info},
+        )
+    @property
+    def state(self) -> State:
+        """
+        Get the current environment state.
+        Returns:
+            Current State with episode_id and step_count
+        """
+        return self._state
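
`SnakeEnvironment.step` adds each step's reward to `_episode_score`, and the reward itself is driven by `reward_dict`. The sketch below shows how an episode score could build up, under the assumption that a step's reward is the sum of the `reward_dict` entries for the events that occurred on that step; marlenv's actual computation may differ. `step_reward`, `episode_score`, and the event lists are hypothetical, while `DEFAULT_REWARDS` copies the defaults from `__init__` above.

```python
from typing import Dict, Iterable, List, Optional

# Defaults copied from SnakeEnvironment.__init__ in this diff.
DEFAULT_REWARDS = {"fruit": 1.0, "kill": 0.0, "lose": -1.0, "win": 100.0, "time": 0.001}


def step_reward(events: Iterable[str], reward_dict: Dict[str, float]) -> float:
    """Sum the rewards for the events of one step (assumed additive model)."""
    return sum((reward_dict.get(event, 0.0) for event in events), 0.0)


def episode_score(
    steps: List[List[str]], reward_dict: Optional[Dict[str, float]] = None
) -> float:
    """Accumulate step rewards the way `_episode_score += reward` does."""
    rewards = DEFAULT_REWARDS if reward_dict is None else reward_dict
    score = 0.0
    for events in steps:
        score += step_reward(events, rewards)
    return score


# "Fast fruit collection" style shaping: a per-step time penalty.
fast = {"fruit": 10.0, "lose": -1.0, "time": -0.5}
trace = [["time"], ["time", "fruit"], ["time", "lose"]]
score = episode_score(trace, fast)
```

With this trace the time penalty applies on all three steps, so a longer route to the same fruit would lower the score, which is the behavior the README's "Fast fruit collection" example aims for.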

uv.lock ADDED

The diff for this file is too large to render.