Spaces:

Crashbandicoote2
/

unity_env

Runtime error

Crashbandicoote2 committed on Jan 11

Commit

0f53490

verified ·

1 Parent(s): 6c44dae

Upload folder using huggingface_hub

Files changed (16)

Dockerfile +67 -0
README.md +602 -5
__init__.py +12 -0
client.py +263 -0
models.py +164 -0
openenv.yaml +6 -0
openenv_unity_env.egg-info/PKG-INFO +16 -0
openenv_unity_env.egg-info/SOURCES.txt +15 -0
openenv_unity_env.egg-info/dependency_links.txt +1 -0
openenv_unity_env.egg-info/entry_points.txt +2 -0
openenv_unity_env.egg-info/requires.txt +12 -0
openenv_unity_env.egg-info/top_level.txt +1 -0
pyproject.toml +45 -0
server/__init__.py +11 -0
server/app.py +84 -0
server/unity_environment.py +554 -0

Dockerfile ADDED

	@@ -0,0 +1,67 @@

+# Copyright (c) Meta Platforms, Inc. and affiliates.
+# All rights reserved.
+#
+# This source code is licensed under the BSD-style license found in the
+# LICENSE file in the root directory of this source tree.
+# Multi-stage build for Unity ML-Agents environment
+# Uses pip for package installation (no virtual environment)
+# Note: Using Python 3.10.12 specifically because ml-agents requires >=3.10.1,<=3.10.12
+# Note: Unity binaries are x86_64 only, so we force linux/amd64 platform
+FROM --platform=linux/amd64 python:3.10.12-slim AS builder
+WORKDIR /app
+# Install build dependencies
+RUN apt-get update && apt-get install -y --no-install-recommends \
+    build-essential \
+    git \
+    && rm -rf /var/lib/apt/lists/*
+# Copy environment code
+COPY . /app/env
+WORKDIR /app/env
+# Install dependencies using pip
+# Note: mlagents packages are installed from git source via pyproject.toml
+RUN pip install --upgrade pip && \
+    pip install --no-cache-dir -e .
+# Final runtime stage
+FROM --platform=linux/amd64 python:3.10.12-slim
+WORKDIR /app
+# Install runtime dependencies (curl for healthcheck)
+RUN apt-get update && apt-get install -y --no-install-recommends \
+    curl \
+    && rm -rf /var/lib/apt/lists/*
+# Copy installed packages from builder
+COPY --from=builder /usr/local/lib/python3.10/site-packages /usr/local/lib/python3.10/site-packages
+COPY --from=builder /usr/local/bin /usr/local/bin
+# Copy the environment code
+COPY . /app/env
+# Create cache directory for Unity binaries
+RUN mkdir -p /root/.mlagents-cache
+# Set PYTHONPATH so imports work correctly
+ENV PYTHONPATH="/app/env:$PYTHONPATH"
+# Expose port
+EXPOSE 8000
+# Health check
+HEALTHCHECK --interval=30s --timeout=3s --start-period=60s --retries=3 \
+    CMD curl -f http://localhost:8000/health || exit 1
+# Note: Longer start period (60s) because Unity environment download may take time on first run
+# Run the FastAPI server
+# Note: a single worker (uvicorn's default) because Unity environments are not thread-safe
+ENV ENABLE_WEB_INTERFACE=true
+CMD ["sh", "-c", "cd /app/env && uvicorn server.app:app --host 0.0.0.0 --port 8000"]
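The `HEALTHCHECK` above polls `/health` and allows a 60-second start period while Unity binaries download. A client that starts the container itself may want the same patience before its first `reset`. The following is a minimal sketch of such a wait loop, not part of this commit: `http_probe` and `wait_until_healthy` are hypothetical helper names, and the 2-second polling interval is an assumption.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_probe(url: str, timeout_s: float = 3.0) -> bool:
    """Return True if GET `url` succeeds, roughly mirroring `curl -f`."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError):
        # Covers connection refused, timeouts, and HTTP error statuses.
        return False


def wait_until_healthy(
    probe: Callable[[], bool],
    deadline_s: float = 60.0,   # matches the Dockerfile's --start-period
    interval_s: float = 2.0,    # assumed polling interval
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `probe` until it succeeds or `deadline_s` seconds of waiting have passed."""
    waited = 0.0
    while True:
        if probe():
            return True
        if waited >= deadline_s:
            return False
        sleep(interval_s)
        waited += interval_s
```

A caller would typically run `wait_until_healthy(lambda: http_probe("http://localhost:8000/health"))` before connecting. The `probe` and `sleep` parameters are injectable so the loop can be exercised without a running server.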

README.md CHANGED

@@ -1,10 +1,607 @@
 ---
-title: Unity Env
-emoji: 🐢
-colorFrom: green
-colorTo: blue
 sdk: docker
 pinned: false
 ---
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

 ---
+title: Unity Environment Server
+emoji: 🌐
+colorFrom: blue
+colorTo: green
 sdk: docker
 pinned: false
+app_port: 8000
+base_path: /web
+tags:
+  - openenv
+  - Unity
+  - MlAgents
+  - MlAgentsUnity
+  - MlAgentsEnv
 ---
+<!--
+Copyright (c) Meta Platforms, Inc. and affiliates.
+All rights reserved.
+This source code is licensed under the BSD-style license found in the
+LICENSE file in the root directory of this source tree.
+-->
+<div align="center">
+# Unity ML-Agents Environment
+OpenEnv wrapper for [Unity ML-Agents](https://github.com/Unity-Technologies/ml-agents) environments. This environment provides access to Unity's reinforcement learning environments through a standardized HTTP/WebSocket interface.
+## Supported Environments
+| Environment | Action Type | Description |
+|------------|-------------|-------------|
+| **PushBlock** | Discrete (7) | Push a block to a goal position |
+| **3DBall** | Continuous (2) | Balance a ball on a platform |
+| **3DBallHard** | Continuous (2) | Harder version of 3DBall |
+| **GridWorld** | Discrete (5) | Navigate a grid to find goals |
+| **Basic** | Discrete (3) | Simple left/right movement |
+More environments may be available depending on the ML-Agents registry version.
+## Installation
+### Option 1: Non-Docker Installation (Local Development)
+#### Prerequisites
+- Python 3.10 (ml-agents requires >=3.10.1,<=3.10.12)
+- [uv](https://docs.astral.sh/uv/) (recommended) or pip
+#### Install from OpenEnv Repository
+```bash
+# Clone the OpenEnv repository (if not already done)
+git clone https://github.com/your-org/OpenEnv.git
+cd OpenEnv
+# Install the unity_env package with dependencies
+cd envs/unity_env
+uv pip install -e .
+# Or with pip
+pip install -e .
+```
+#### Install Dependencies Only
+```bash
+cd envs/unity_env
+# Using uv (recommended)
+uv sync
+# Or using pip
+pip install -r requirements.txt  # if available
+pip install mlagents-envs numpy pillow fastapi uvicorn pydantic
+```
+#### Verify Installation
+```bash
+# Test the installation
+cd envs/unity_env
+python -c "from server.unity_environment import UnityMLAgentsEnvironment; print('Installation successful!')"
+```
+**Note:** The first run will download Unity environment binaries (~500MB). These are cached in `~/.mlagents-cache/` for future use.
+### Option 2: Docker Installation
+#### Prerequisites
+- Docker installed and running
+- Python 3.10+ (for running the client)
+#### Build the Docker Image
+```bash
+cd envs/unity_env
+# Build the Docker image
+docker build -f server/Dockerfile -t unity-env:latest .
+# Verify the build
+docker images | grep unity-env
+```
+**Note for Apple Silicon (M1/M2/M3/M4) users:** Docker mode is **not supported** on Apple Silicon because Unity's Mono runtime crashes under x86_64 emulation. Use **direct mode** (`--direct`) or **server mode** (`--url`) instead, which run native macOS binaries. See [Troubleshooting](#docker-mode-fails-on-apple-silicon-m1m2m3m4) for details.
+#### Run the Docker Container
+```bash
+# Run with default settings (graphics enabled, 1280x720)
+docker run -p 8000:8000 unity-env:latest
+# Run with custom settings
+docker run -p 8000:8000 \
+  -e UNITY_NO_GRAPHICS=0 \
+  -e UNITY_WIDTH=1280 \
+  -e UNITY_HEIGHT=720 \
+  -e UNITY_TIME_SCALE=1.0 \
+  unity-env:latest
+# Run in headless mode (faster for training)
+docker run -p 8000:8000 \
+  -e UNITY_NO_GRAPHICS=1 \
+  -e UNITY_TIME_SCALE=20 \
+  unity-env:latest
+# Run with persistent cache (avoid re-downloading binaries)
+docker run -p 8000:8000 \
+  -v ~/.mlagents-cache:/root/.mlagents-cache \
+  unity-env:latest
+```
+#### Install Client Dependencies
+To connect to the Docker container, install the client on your host machine:
+```bash
+cd envs/unity_env
+pip install requests websockets
+```
+## Quick Start
+### Option 1: Direct Mode (Fastest for Testing)
+Run the Unity environment directly without a server:
+```bash
+cd envs/unity_env
+# Run with graphics (default: 1280x720)
+python example_usage.py --direct
+# Run with custom window size
+python example_usage.py --direct --width 800 --height 600
+# Run headless (faster for training)
+python example_usage.py --direct --no-graphics --time-scale 20
+# Run 3DBall environment
+python example_usage.py --direct --env 3DBall --episodes 5
+```
+### Option 2: Server Mode
+Start the server and connect with a client:
+```bash
+# Terminal 1: Start the server (graphics enabled by default)
+cd envs/unity_env
+uv run uvicorn server.app:app --host 0.0.0.0 --port 8000
+# Terminal 2: Run the example client
+python example_usage.py --url http://localhost:8000
+python example_usage.py --url http://localhost:8000 --env 3DBall --episodes 5
+```
+### Option 3: Docker Mode
+Run via Docker container (auto-starts and connects):
+```bash
+cd envs/unity_env
+# Run with default settings
+python example_usage.py --docker
+# Run with custom window size
+python example_usage.py --docker --width 1280 --height 720
+# Run headless (faster for training)
+python example_usage.py --docker --no-graphics --time-scale 20
+# Run 3DBall for 10 episodes
+python example_usage.py --docker --env 3DBall --episodes 10
+# Use a custom Docker image
+python example_usage.py --docker --docker-image my-unity-env:v1
+```
+## Example Scripts
+### Basic Usage Examples
+#### 1. Direct Mode - Quick Testing
+```bash
+# Run PushBlock with graphics (default)
+python example_usage.py --direct
+# Output:
+# ============================================================
+# Unity ML-Agents Environment - Direct Mode
+# ============================================================
+# Environment: PushBlock
+# Episodes: 3
+# Max steps: 500
+# Window size: 1280x720
+# Graphics: Enabled
+# ...
+```
+#### 2. Direct Mode - Training Configuration
+```bash
+# Headless mode with fast simulation (20x speed)
+python example_usage.py --direct --no-graphics --time-scale 20 --episodes 10 --max-steps 1000
+# This is ideal for training - no graphics overhead, faster simulation
+```
+#### 3. Direct Mode - 3DBall with Custom Window
+```bash
+# Run 3DBall (continuous actions) with larger window
+python example_usage.py --direct --env 3DBall --width 1280 --height 720 --episodes 5
+```
+#### 4. Docker Mode - Production-like Testing
+```bash
+# Build the image first
+docker build -f server/Dockerfile -t unity-env:latest .
+# Run via Docker with graphics
+python example_usage.py --docker --width 1280 --height 720
+# Run via Docker in headless mode for training
+python example_usage.py --docker --no-graphics --time-scale 20 --episodes 20
+```
+#### 5. Server Mode - Separate Server and Client
+```bash
+# Terminal 1: Start server with specific settings
+UNITY_WIDTH=1280 UNITY_HEIGHT=720 uv run uvicorn server.app:app --port 8000
+# Terminal 2: Connect and run episodes
+python example_usage.py --url http://localhost:8000 --env PushBlock --episodes 5
+python example_usage.py --url http://localhost:8000 --env 3DBall --episodes 5
+```
+#### 6. Alternating Environments
+```bash
+# Run alternating episodes between PushBlock and 3DBall
+python example_usage.py --direct --env both --episodes 6
+# Episodes 1,3,5 = PushBlock; Episodes 2,4,6 = 3DBall
+```
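The alternation described in the comment above (odd episodes on PushBlock, even episodes on 3DBall) can be sketched as a simple round-robin schedule. `example_usage.py` is not part of this diff, so `episode_schedule` is a hypothetical helper that only illustrates the documented behaviour.

```python
from typing import List


def episode_schedule(env: str, episodes: int) -> List[str]:
    """Return the environment ID used for each episode, in order.

    With env="both" the schedule alternates PushBlock, 3DBall, PushBlock, ...
    Any other value is used for every episode.
    """
    if episodes < 0:
        raise ValueError("episodes must be non-negative")
    rotation = ["PushBlock", "3DBall"] if env == "both" else [env]
    return [rotation[i % len(rotation)] for i in range(episodes)]


if __name__ == "__main__":
    for number, env_id in enumerate(episode_schedule("both", 6), start=1):
        print(f"Episode {number}: {env_id}")
```

Each entry would then be passed as `env_id` to `client.reset(...)` at the start of the corresponding episode.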
+### Command Line Options
+| Option | Default | Description |
+|--------|---------|-------------|
+| `--direct` | - | Run environment directly (no server) |
+| `--docker` | - | Run via Docker container |
+| `--url` | localhost:8000 | Server URL for server mode |
+| `--docker-image` | unity-env:latest | Docker image name |
+| `--env` | PushBlock | Environment: PushBlock, 3DBall, both |
+| `--episodes` | 3 | Number of episodes |
+| `--max-steps` | 500 | Max steps per episode |
+| `--width` | 1280 | Window width in pixels |
+| `--height` | 720 | Window height in pixels |
+| `--no-graphics` | - | Headless mode (faster) |
+| `--time-scale` | 1.0 | Simulation speed multiplier |
+| `--quality-level` | 5 | Graphics quality 0-5 |
+| `--quiet` | - | Reduce output verbosity |
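For readers who want to script against the same flags, the table above maps naturally onto `argparse`. This is a sketch under stated assumptions rather than the actual parser in `example_usage.py` (which is not shown in this diff): `build_parser` is a hypothetical name, the defaults are copied from the table, and the `http://` scheme on the `--url` default is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser whose flags and defaults mirror the options table."""
    parser = argparse.ArgumentParser(description="Unity ML-Agents example runner (sketch)")
    parser.add_argument("--direct", action="store_true", help="Run environment directly (no server)")
    parser.add_argument("--docker", action="store_true", help="Run via Docker container")
    # The table lists "localhost:8000"; the scheme is added here as an assumption.
    parser.add_argument("--url", default="http://localhost:8000", help="Server URL for server mode")
    parser.add_argument("--docker-image", default="unity-env:latest", help="Docker image name")
    parser.add_argument("--env", default="PushBlock", choices=["PushBlock", "3DBall", "both"])
    parser.add_argument("--episodes", type=int, default=3, help="Number of episodes")
    parser.add_argument("--max-steps", type=int, default=500, help="Max steps per episode")
    parser.add_argument("--width", type=int, default=1280, help="Window width in pixels")
    parser.add_argument("--height", type=int, default=720, help="Window height in pixels")
    parser.add_argument("--no-graphics", action="store_true", help="Headless mode (faster)")
    parser.add_argument("--time-scale", type=float, default=1.0, help="Simulation speed multiplier")
    parser.add_argument("--quality-level", type=int, default=5, choices=range(6), help="Graphics quality 0-5")
    parser.add_argument("--quiet", action="store_true", help="Reduce output verbosity")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

`argparse` converts dashes to underscores, so `--max-steps` is read back as `args.max_steps` and `--no-graphics` as `args.no_graphics`.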
+## Python Client Usage
+### Connect to Server
+```python
+from envs.unity_env import UnityEnv, UnityAction
+# Connect to the server
+with UnityEnv(base_url="http://localhost:8000") as client:
+    # Reset to PushBlock environment
+    result = client.reset(env_id="PushBlock")
+    print(f"Observation dims: {len(result.observation.vector_observations)}")
+    # Take actions
+    for _ in range(100):
+        # PushBlock actions: 0=noop, 1=forward, 2=backward,
+        # 3=rotate_left, 4=rotate_right, 5=strafe_left, 6=strafe_right
+        action = UnityAction(discrete_actions=[1])  # Move forward
+        result = client.step(action)
+        print(f"Reward: {result.reward}, Done: {result.done}")
+        if result.done:
+            result = client.reset()
+```
+### Connect via Docker
+```python
+from envs.unity_env import UnityEnv, UnityAction
+# Automatically start Docker container and connect
+client = UnityEnv.from_docker_image(
+    "unity-env:latest",
+    environment={
+        "UNITY_NO_GRAPHICS": "0",
+        "UNITY_WIDTH": "1280",
+        "UNITY_HEIGHT": "720",
+    }
+)
+try:
+    result = client.reset(env_id="PushBlock")
+    for _ in range(100):
+        action = UnityAction(discrete_actions=[1])
+        result = client.step(action)
+finally:
+    client.close()
+```
+### Switch Environments Dynamically
+```python
+# Start with PushBlock
+result = client.reset(env_id="PushBlock")
+# ... train on PushBlock ...
+# Switch to 3DBall (continuous actions)
+result = client.reset(env_id="3DBall")
+action = UnityAction(continuous_actions=[0.5, -0.3])
+result = client.step(action)
+```
+### Direct Environment Usage (No Server)
+```python
+from envs.unity_env.server.unity_environment import UnityMLAgentsEnvironment
+from envs.unity_env.models import UnityAction
+# Create environment directly
+env = UnityMLAgentsEnvironment(
+    env_id="PushBlock",
+    no_graphics=False,  # Show graphics window
+    width=1280,
+    height=720,
+    time_scale=1.0,
+)
+try:
+    obs = env.reset()
+    print(f"Observation: {len(obs.vector_observations)} dimensions")
+    for step in range(100):
+        action = UnityAction(discrete_actions=[1])  # Move forward
+        obs = env.step(action)
+        print(f"Step {step}: reward={obs.reward}, done={obs.done}")
+        if obs.done:
+            obs = env.reset()
+finally:
+    env.close()
+```
+## Action Spaces
+### PushBlock (Discrete)
+7 discrete actions:
+- `0`: No operation
+- `1`: Move forward
+- `2`: Move backward
+- `3`: Rotate left
+- `4`: Rotate right
+- `5`: Strafe left
+- `6`: Strafe right
+```python
+action = UnityAction(discrete_actions=[1])  # Move forward
+```
+### 3DBall (Continuous)
+2 continuous actions in range [-1, 1]:
+- Action 0: X-axis rotation
+- Action 1: Z-axis rotation
+```python
+action = UnityAction(continuous_actions=[0.5, -0.3])
+```
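The two action spaces above lend themselves to small client-side helpers: a name-to-index map for PushBlock and a clamp for continuous values. The sketch below is illustrative only. `PUSHBLOCK_ACTIONS`, `pushblock_action`, and `clip_continuous` are hypothetical names, and whether the server clips out-of-range values itself is not stated in this README, so clamping on the client is an assumption.

```python
from typing import Dict, List, Sequence

# Index assignments copied from the PushBlock list above.
PUSHBLOCK_ACTIONS: Dict[str, int] = {
    "noop": 0,
    "forward": 1,
    "backward": 2,
    "rotate_left": 3,
    "rotate_right": 4,
    "strafe_left": 5,
    "strafe_right": 6,
}


def pushblock_action(name: str) -> List[int]:
    """Return the single-branch discrete action list for a named PushBlock move."""
    return [PUSHBLOCK_ACTIONS[name]]


def clip_continuous(values: Sequence[float], low: float = -1.0, high: float = 1.0) -> List[float]:
    """Clamp each continuous action value into [low, high]."""
    return [min(high, max(low, float(v))) for v in values]
```

The results can be passed straight to the model, for example `UnityAction(discrete_actions=pushblock_action("forward"))` or `UnityAction(continuous_actions=clip_continuous(raw_policy_output))`.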
+## Observations
+Most environments provide vector observations. The size depends on the environment:
+- **PushBlock**: 70 dimensions (14 ray-casts detecting walls/goals/blocks)
+- **3DBall**: 8 dimensions (rotation and ball position/velocity)
+- **GridWorld**: Visual observations (grid view)
+```python
+result = client.reset()
+obs = result.observation
+# Access observations
+print(f"Vector obs: {obs.vector_observations}")
+print(f"Behavior: {obs.behavior_name}")
+print(f"Action spec: {obs.action_spec_info}")
+```
+### Visual Observations (Optional)
+Some environments support visual observations. Enable with `include_visual=True`:
+```python
+import base64
+result = client.reset(include_visual=True)
+if result.observation.visual_observations:
+    # Each entry is a base64-encoded PNG image
+    for img_b64 in result.observation.visual_observations:
+        # Decode to raw PNG bytes
+        img_bytes = base64.b64decode(img_b64)
+```
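Once decoded, the bytes are an ordinary PNG file. As a quick sanity check that needs no imaging library, the width and height can be read from the PNG header, which stores them as big-endian integers in the leading `IHDR` chunk. This is a minimal sketch. `png_size` is a hypothetical helper, and for real image processing a library such as Pillow (already listed under Dependencies) would be the more natural choice.

```python
import base64
import struct
from typing import Tuple

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(img_b64: str) -> Tuple[int, int]:
    """Decode a base64-encoded PNG and return (width, height) from its IHDR chunk."""
    data = base64.b64decode(img_b64)
    # Layout: 8-byte signature, 4-byte chunk length, 4-byte chunk type "IHDR",
    # then 4-byte width and 4-byte height, both big-endian.
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG image")
    width, height = struct.unpack(">II", data[16:24])
    return width, height
```

Applied to the loop above, `png_size(img_b64)` reports the resolution of each visual observation without decoding pixel data.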
+## Configuration
+### Constructor Arguments
+When creating `UnityMLAgentsEnvironment` directly:
+```python
+from envs.unity_env.server.unity_environment import UnityMLAgentsEnvironment
+env = UnityMLAgentsEnvironment(
+    env_id="PushBlock",      # Unity environment to load
+    no_graphics=False,       # False = show graphics window
+    width=1280,              # Window width in pixels
+    height=720,              # Window height in pixels
+    time_scale=1.0,          # Simulation speed (20.0 for fast training)
+    quality_level=5,         # Graphics quality 0-5
+)
+```
+### Environment Variables
+For Docker deployment, configure via environment variables:
+| Variable | Default | Description |
+|----------|---------|-------------|
+| `UNITY_ENV_ID` | PushBlock | Default Unity environment |
+| `UNITY_NO_GRAPHICS` | 0 | Set to 1 for headless mode |
+| `UNITY_WIDTH` | 1280 | Window width in pixels |
+| `UNITY_HEIGHT` | 720 | Window height in pixels |
+| `UNITY_TIME_SCALE` | 1.0 | Simulation speed multiplier |
+| `UNITY_QUALITY_LEVEL` | 5 | Graphics quality 0-5 |
+| `UNITY_CACHE_DIR` | ~/.mlagents-cache | Binary cache directory |
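The table above can be read as a small configuration schema. The sketch below shows one way to load it with the documented defaults. It is an illustration, not the server's actual parsing code (`server/app.py` and `server/unity_environment.py` are outside this excerpt): `UnityConfig` and `load_unity_config` are hypothetical names, and treating only the string `"1"` as headless is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class UnityConfig:
    env_id: str
    no_graphics: bool
    width: int
    height: int
    time_scale: float
    quality_level: int
    cache_dir: str


def load_unity_config(environ: Optional[Mapping[str, str]] = None) -> UnityConfig:
    """Read UNITY_* variables, falling back to the defaults in the table."""
    env = os.environ if environ is None else environ
    return UnityConfig(
        env_id=env.get("UNITY_ENV_ID", "PushBlock"),
        # Assumption: only the exact value "1" enables headless mode.
        no_graphics=env.get("UNITY_NO_GRAPHICS", "0") == "1",
        width=int(env.get("UNITY_WIDTH", "1280")),
        height=int(env.get("UNITY_HEIGHT", "720")),
        time_scale=float(env.get("UNITY_TIME_SCALE", "1.0")),
        quality_level=int(env.get("UNITY_QUALITY_LEVEL", "5")),
        cache_dir=os.path.expanduser(env.get("UNITY_CACHE_DIR", "~/.mlagents-cache")),
    )
```

Passing an explicit mapping, as in `load_unity_config({"UNITY_NO_GRAPHICS": "1"})`, keeps the function easy to test without touching the real process environment.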
+## Environment State
+Access detailed environment information:
+```python
+state = client.state()
+print(f"Environment: {state.env_id}")
+print(f"Episode ID: {state.episode_id}")
+print(f"Step count: {state.step_count}")
+print(f"Available envs: {state.available_envs}")
+print(f"Action spec: {state.action_spec}")
+print(f"Observation spec: {state.observation_spec}")
+```
+## Troubleshooting
+### Docker Mode Fails on Apple Silicon (M1/M2/M3/M4)
+**Symptom:** When running with `--docker` on Apple Silicon Macs, you see an error like:
+```
+Error running with Docker: Server error: The Unity environment took too long to respond...
+```
+Or in Docker logs:
+```
+* Assertion: should not be reached at tramp-amd64.c:605
+Environment shut down with return code -6 (SIGABRT)
+```
+**Cause:** Unity ML-Agents binaries are x86_64 (Intel) only. When Docker runs the x86_64 Linux container on Apple Silicon, it uses QEMU emulation. The Mono runtime inside Unity has architecture-specific code that crashes under emulation.
+**Solutions:**
+1. **Use Direct Mode** (recommended for macOS):
+   ```bash
+   python example_usage.py --direct --no-graphics
+   ```
+   Direct mode downloads native macOS binaries which work on Apple Silicon.
+2. **Use Server Mode** with a local server:
+   ```bash
+   # Terminal 1: Start server (uses native macOS binaries)
+   uvicorn server.app:app --host 0.0.0.0 --port 8000
+   # Terminal 2: Run client
+   python example_usage.py --url http://localhost:8000
+   ```
+3. **Use an x86_64 Linux machine** for Docker mode:
+   The Docker image works correctly on native x86_64 Linux machines (cloud VMs, dedicated servers, etc.).
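The choice between the solutions above can be automated with a platform check before launching. The following sketch assumes that Apple Silicon is reported as `Darwin` with machine `arm64` by Python's `platform` module. `docker_mode_supported` and `recommended_mode` are hypothetical helpers, and a Python interpreter running under Rosetta may report `x86_64` instead, so the check is a heuristic.

```python
import platform
from typing import Optional


def docker_mode_supported(system: Optional[str] = None, machine: Optional[str] = None) -> bool:
    """Return False on Apple Silicon, where the x86_64 container crashes under emulation."""
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    return not (system == "Darwin" and machine == "arm64")


def recommended_mode(system: Optional[str] = None, machine: Optional[str] = None) -> str:
    """Suggest "docker" where it is expected to work, otherwise "direct"."""
    return "docker" if docker_mode_supported(system, machine) else "direct"


if __name__ == "__main__":
    print(f"Recommended mode on this machine: {recommended_mode()}")
```

The optional arguments exist so the decision can be checked for other platforms without running on them.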
+### First Run is Slow
+The first run downloads Unity binaries (~500MB). This is normal and only happens once. Binaries are cached in `~/.mlagents-cache/`.
+### Graphics Not Showing
+- Ensure `--no-graphics` is NOT set
+- On Linux, ensure X11 is available
+- For Docker, you may need to set up X11 forwarding
+### Docker Container Fails to Start
+```bash
+# Check Docker logs
+docker logs <container_id>
+# Ensure the image is built
+docker images | grep unity-env
+# Rebuild if necessary
+docker build -f server/Dockerfile -t unity-env:latest .
+```
+### Import Errors
+```bash
+# Ensure you're in the correct directory
+cd envs/unity_env
+# Install dependencies
+uv sync
+# or
+pip install -e .
+```
+### mlagents-envs Installation Issues
+The `mlagents-envs` and `mlagents` packages are installed from source by default (via the GitHub repository). If you encounter issues or want to install manually:
+```bash
+# Clone the ml-agents repository
+git clone https://github.com/Unity-Technologies/ml-agents.git
+cd ml-agents
+# Install mlagents-envs from source
+pip install -e ./ml-agents-envs
+# Install the full ml-agents package
+pip install -e ./ml-agents
+```
+This approach is useful when:
+- You need to modify the mlagents source code
+- You want to use a specific branch or commit
+- The git dependency in pyproject.toml is causing issues
+## Caveats
+1. **First Run Download**: Unity binaries (~500MB) are downloaded on first use
+2. **Platform-Specific**: Binaries are platform-specific (macOS, Linux, Windows)
+3. **Apple Silicon + Docker**: Docker mode does not work on Apple Silicon Macs due to x86_64 emulation issues with Unity's Mono runtime. Use direct mode or server mode instead.
+4. **Single Worker**: Unity environments are not thread-safe; use `workers=1`
+5. **Graphics Mode**: Some features require X11/display for graphics mode
+6. **Multi-Agent**: Currently uses first agent only; full multi-agent support planned
+## Dependencies
+- `mlagents-envs` (installed from source via git)
+- `mlagents` (installed from source via git)
+- `numpy>=1.20.0`
+- `pillow>=9.0.0` (for visual observations)
+- `openenv-core[core]>=0.2.0`
+## References
+- [Unity ML-Agents Documentation](https://unity-technologies.github.io/ml-agents/)
+- [ML-Agents GitHub](https://github.com/Unity-Technologies/ml-agents)
+- [Example Environments](https://unity-technologies.github.io/ml-agents/Learning-Environment-Examples/)

__init__.py ADDED

	@@ -0,0 +1,12 @@

+# Copyright (c) Meta Platforms, Inc. and affiliates.
+# All rights reserved.
+#
+# This source code is licensed under the BSD-style license found in the
+# LICENSE file in the root directory of this source tree.
+"""Unity ML-Agents Environment for OpenEnv."""
+from .client import UnityEnv
+from .models import UnityAction, UnityObservation, UnityState
+__all__ = ["UnityAction", "UnityObservation", "UnityState", "UnityEnv"]

client.py ADDED

	@@ -0,0 +1,263 @@

+# Copyright (c) Meta Platforms, Inc. and affiliates.
+# All rights reserved.
+#
+# This source code is licensed under the BSD-style license found in the
+# LICENSE file in the root directory of this source tree.
+"""
+Unity ML-Agents Environment Client.
+This module provides the client for connecting to a Unity ML-Agents
+Environment server via WebSocket for persistent sessions.
+"""
+from typing import Any, Dict, List, Optional
+# Support multiple import scenarios
+try:
+    # In-repo imports (when running from OpenEnv repository root)
+    from openenv.core.client_types import StepResult
+    from openenv.core.env_client import EnvClient
+    from .models import UnityAction, UnityObservation, UnityState
+except ImportError:
+    # openenv from pip
+    from openenv.core.client_types import StepResult
+    from openenv.core.env_client import EnvClient
+    try:
+        # Direct execution from envs/unity_env/ directory
+        from models import UnityAction, UnityObservation, UnityState
+    except ImportError:
+        try:
+            # Package installed as unity_env
+            from unity_env.models import UnityAction, UnityObservation, UnityState
+        except ImportError:
+            # Running from OpenEnv root with envs prefix
+            from envs.unity_env.models import UnityAction, UnityObservation, UnityState
+class UnityEnv(EnvClient[UnityAction, UnityObservation, UnityState]):
+    """
+    Client for Unity ML-Agents environments.
+    This client maintains a persistent WebSocket connection to the environment
+    server, enabling efficient multi-step interactions with lower latency.
+    Each client instance has its own dedicated environment session on the server.
+    Note: Unity environments can take 30-60+ seconds to initialize on first reset
+    (downloading binaries, starting Unity process). The client is configured with
+    longer ping timeouts to handle this.
+    Supported Unity Environments:
+    - PushBlock: Push a block to a goal (discrete actions: 7)
+    - 3DBall: Balance a ball on a platform (continuous actions: 2)
+    - 3DBallHard: Harder version of 3DBall
+    - GridWorld: Navigate a grid to find goals
+    - Basic: Simple movement task
+    - And more from the ML-Agents registry
+    Example:
+        >>> # Connect to a running server
+        >>> with UnityEnv(base_url="http://localhost:8000") as client:
+        ...     result = client.reset()
+        ...     print(f"Vector obs: {len(result.observation.vector_observations)} dims")
+        ...
+        ...     # Take action (PushBlock: 1=forward)
+        ...     result = client.step(UnityAction(discrete_actions=[1]))
+        ...     print(f"Reward: {result.reward}")
+    Example with Docker:
+        >>> # Automatically start container and connect
+        >>> client = UnityEnv.from_docker_image("unity-env:latest")
+        >>> try:
+        ...     result = client.reset(env_id="3DBall")
+        ...     result = client.step(UnityAction(continuous_actions=[0.5, -0.3]))
+        ... finally:
+        ...     client.close()
+    Example switching environments:
+        >>> client = UnityEnv(base_url="http://localhost:8000")
+        >>> # Start with PushBlock
+        >>> result = client.reset(env_id="PushBlock")
+        >>> # ... train on PushBlock ...
+        >>> # Switch to 3DBall
+        >>> result = client.reset(env_id="3DBall")
+        >>> # ... train on 3DBall ...
+    """
+    def __init__(
+        self,
+        base_url: str,
+        connect_timeout_s: float = 10.0,
+        message_timeout_s: float = 180.0,  # 3 minutes for slow Unity initialization
+        provider: Optional[Any] = None,
+    ):
+        """
+        Initialize Unity environment client.
+        Uses longer default timeouts than the base EnvClient because Unity
+        environments can take 30-60+ seconds to initialize on first reset.
+        Args:
+            base_url: Base URL of the environment server (http:// or ws://).
+            connect_timeout_s: Timeout for establishing the WebSocket connection.
+            message_timeout_s: Timeout for receiving responses (default 3 min for Unity).
+            provider: Optional container/runtime provider for lifecycle management.
+        """
+        super().__init__(
+            base_url=base_url,
+            connect_timeout_s=connect_timeout_s,
+            message_timeout_s=message_timeout_s,
+            provider=provider,
+        )
+    def connect(self) -> "UnityEnv":
+        """
+        Establish WebSocket connection to the server.
+        Overrides the default connection to use longer ping timeouts,
+        since Unity environments can take 30-60+ seconds to initialize.
+        Returns:
+            self for method chaining
+        Raises:
+            ConnectionError: If connection cannot be established
+        """
+        from websockets.sync.client import connect as ws_connect
+        if self._ws is not None:
+            return self
+        try:
+            # Use a longer ping_timeout for Unity (120s) since environment
+            # initialization can block the server for a while
+            self._ws = ws_connect(
+                self._ws_url,
+                open_timeout=self._connect_timeout,
+                ping_timeout=120,  # 2 minutes for slow Unity initialization
+                ping_interval=30,  # Send pings every 30 seconds
+                close_timeout=30,
+            )
+        except Exception as e:
+            raise ConnectionError(f"Failed to connect to {self._ws_url}: {e}") from e
+        return self
+    def _step_payload(self, action: UnityAction) -> Dict:
+        """
+        Convert UnityAction to JSON payload for step request.
+        Args:
+            action: UnityAction instance
+        Returns:
+            Dictionary representation suitable for JSON encoding
+        """
+        payload: Dict[str, Any] = {}
+        if action.discrete_actions is not None:
+            payload["discrete_actions"] = action.discrete_actions
+        if action.continuous_actions is not None:
+            payload["continuous_actions"] = action.continuous_actions
+        if action.metadata:
+            payload["metadata"] = action.metadata
+        return payload
+    def _parse_result(self, payload: Dict) -> StepResult[UnityObservation]:
+        """
+        Parse server response into StepResult[UnityObservation].
+        Args:
+            payload: JSON response from server
+        Returns:
+            StepResult with UnityObservation
+        """
+        obs_data = payload.get("observation", {})
+        observation = UnityObservation(
+            vector_observations=obs_data.get("vector_observations", []),
+            visual_observations=obs_data.get("visual_observations"),
+            behavior_name=obs_data.get("behavior_name", ""),
+            action_spec_info=obs_data.get("action_spec_info", {}),
+            observation_spec_info=obs_data.get("observation_spec_info", {}),
+            done=payload.get("done", False),
+            reward=payload.get("reward"),
+            metadata=obs_data.get("metadata", {}),
+        )
+        return StepResult(
+            observation=observation,
+            reward=payload.get("reward"),
+            done=payload.get("done", False),
+        )
+    def _parse_state(self, payload: Dict) -> UnityState:
+        """
+        Parse server response into UnityState object.
+        Args:
+            payload: JSON response from /state endpoint
+        Returns:
+            UnityState object with environment information
+        """
+        return UnityState(
+            episode_id=payload.get("episode_id"),
+            step_count=payload.get("step_count", 0),
+            env_id=payload.get("env_id", ""),
+            behavior_name=payload.get("behavior_name", ""),
+            action_spec=payload.get("action_spec", {}),
+            observation_spec=payload.get("observation_spec", {}),
+            available_envs=payload.get("available_envs", []),
+        )
+    def reset(
+        self,
+        env_id: Optional[str] = None,
+        include_visual: bool = False,
+        **kwargs,
+    ) -> StepResult[UnityObservation]:
+        """
+        Reset the environment.
+        Args:
+            env_id: Optionally switch to a different Unity environment.
+                Available: PushBlock, 3DBall, 3DBallHard, GridWorld, Basic
+            include_visual: If True, include visual observations in response.
+            **kwargs: Additional arguments passed to server.
+        Returns:
+            StepResult with initial observation.
+        """
+        reset_kwargs = dict(kwargs)
+        if env_id is not None:
+            reset_kwargs["env_id"] = env_id
+        reset_kwargs["include_visual"] = include_visual
+        return super().reset(**reset_kwargs)
+    @staticmethod
+    def available_environments() -> List[str]:
+        """
+        List commonly available Unity environments.
+        Note: The actual list may vary based on the ML-Agents registry version.
+        Use state.available_envs after connecting for the authoritative list.
+        Returns:
+            List of environment identifiers.
+        """
+        return [
+            "PushBlock",
+            "3DBall",
+            "3DBallHard",
+            "GridWorld",
+            "Basic",
+            "VisualPushBlock",
+        ]
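The wire format handled by `_step_payload` and `_parse_result` can be exercised without `openenv` installed. The sketch below re-creates the same logic over plain dictionaries. `ActionStub`, `step_payload`, and `parse_result` are hypothetical stand-ins for illustration, not the classes from this commit, and the parsed result is reduced to three fields for brevity.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ActionStub:
    """Hypothetical stand-in for UnityAction with the same three fields."""
    discrete_actions: Optional[List[int]] = None
    continuous_actions: Optional[List[float]] = None
    metadata: Dict[str, Any] = field(default_factory=dict)


def step_payload(action: ActionStub) -> Dict[str, Any]:
    """Build the step payload, omitting fields that are None or empty metadata."""
    payload: Dict[str, Any] = {}
    if action.discrete_actions is not None:
        payload["discrete_actions"] = action.discrete_actions
    if action.continuous_actions is not None:
        payload["continuous_actions"] = action.continuous_actions
    if action.metadata:
        payload["metadata"] = action.metadata
    return payload


def parse_result(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Extract observation, reward, and done, using the same fallbacks as the client."""
    obs = payload.get("observation", {})
    return {
        "vector_observations": obs.get("vector_observations", []),
        "reward": payload.get("reward"),
        "done": payload.get("done", False),
    }
```

Because the check is `is not None`, an empty list such as `discrete_actions=[]` is still sent, whereas empty metadata is dropped. A response with missing keys parses to an empty observation, a `None` reward, and `done=False`.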

models.py ADDED

	@@ -0,0 +1,164 @@

+# Copyright (c) Meta Platforms, Inc. and affiliates.
+# All rights reserved.
+#
+# This source code is licensed under the BSD-style license found in the
+# LICENSE file in the root directory of this source tree.
+"""
+Data models for the Unity ML-Agents Environment.
+The Unity environment wraps Unity ML-Agents environments (PushBlock, 3DBall,
+GridWorld, etc.) providing a unified interface for reinforcement learning.
+"""
+from typing import Any, Dict, List, Optional
+from pydantic import Field
+# Support both in-repo and standalone imports
+try:
+    # In-repo imports (when running from OpenEnv repository)
+    from openenv.core.env_server.types import Action, Observation, State
+except ImportError:
+    # Standalone imports (when environment is standalone with openenv from pip)
+    from openenv.core.env_server.types import Action, Observation, State
+class UnityAction(Action):
+    """
+    Action for Unity ML-Agents environments.
+    Supports both discrete and continuous action spaces. Unity environments
+    may use either or both types of actions:
+    - Discrete actions: Integer indices for categorical choices
+      (e.g., movement direction: 0=forward, 1=backward, 2=left, 3=right)
+    - Continuous actions: Float values typically in [-1, 1] range
+      (e.g., joint rotations, force magnitudes)
+    Example (PushBlock - discrete):
+        >>> action = UnityAction(discrete_actions=[3])  # Rotate left
+    Example (Walker - continuous):
+        >>> action = UnityAction(continuous_actions=[0.5, -0.3, 0.0, ...])
+    Attributes:
+        discrete_actions: List of discrete action indices for each action branch.
+            For PushBlock: [0-6] where 0=noop, 1=forward, 2=backward,
+            3=rotate_left, 4=rotate_right, 5=strafe_left, 6=strafe_right
+        continuous_actions: List of continuous action values, typically in [-1, 1].
+        metadata: Additional action parameters.
+    """
+    discrete_actions: Optional[List[int]] = Field(
+        default=None,
+        description="Discrete action indices for each action branch",
+    )
+    continuous_actions: Optional[List[float]] = Field(
+        default=None,
+        description="Continuous action values, typically in [-1, 1] range",
+    )
+class UnityObservation(Observation):
+    """
+    Observation from Unity ML-Agents environments.
+    Contains vector observations (sensor readings) and optionally visual
+    observations (rendered images). Most Unity environments provide vector
+    observations; visual observations are optional and must be requested.
+    Attributes:
+        vector_observations: Flattened array of all vector observations.
+            Size and meaning depends on the specific environment.
+            For PushBlock: 70 values from 14 ray-casts detecting walls/goals/blocks.
+        visual_observations: Optional list of base64-encoded images (PNG format).
+            Only included when include_visual=True in reset/step.
+        behavior_name: Name of the Unity behavior (agent type).
+        action_spec_info: Information about the action space for this environment.
+        observation_spec_info: Information about the observation space.
+    """
+    vector_observations: List[float] = Field(
+        default_factory=list,
+        description="Flattened vector observations from the environment",
+    )
+    visual_observations: Optional[List[str]] = Field(
+        default=None,
+        description="Base64-encoded PNG images (when include_visual=True)",
+    )
+    behavior_name: str = Field(
+        default="",
+        description="Name of the Unity behavior/agent type",
+    )
+    action_spec_info: Dict[str, Any] = Field(
+        default_factory=dict,
+        description="Information about the action space",
+    )
+    observation_spec_info: Dict[str, Any] = Field(
+        default_factory=dict,
+        description="Information about the observation space",
+    )
+class UnityState(State):
+    """
+    Extended state for Unity ML-Agents environments.
+    Provides additional metadata about the currently loaded environment,
+    including action and observation space specifications.
+    Attributes:
+        episode_id: Unique identifier for the current episode.
+        step_count: Number of steps taken in the current episode.
+        env_id: Identifier of the currently loaded Unity environment.
+        behavior_name: Name of the Unity behavior (agent type).
+        action_spec: Detailed specification of the action space.
+        observation_spec: Detailed specification of the observation space.
+        available_envs: List of available environment identifiers.
+    """
+    env_id: str = Field(
+        default="PushBlock",
+        description="Identifier of the loaded Unity environment",
+    )
+    behavior_name: str = Field(
+        default="",
+        description="Name of the Unity behavior/agent type",
+    )
+    action_spec: Dict[str, Any] = Field(
+        default_factory=dict,
+        description="Specification of the action space",
+    )
+    observation_spec: Dict[str, Any] = Field(
+        default_factory=dict,
+        description="Specification of the observation space",
+    )
+    available_envs: List[str] = Field(
+        default_factory=list,
+        description="List of available Unity environments",
+    )
+# Available Unity environments from the ML-Agents registry
+# These are pre-built environments that can be downloaded automatically
+AVAILABLE_UNITY_ENVIRONMENTS = [
+    "PushBlock",
+    "3DBall",
+    "3DBallHard",
+    "GridWorld",
+    "Basic",
+    "VisualPushBlock",
+    # Note: More environments may be available in newer versions of ML-Agents
+]
+# Action descriptions for PushBlock (most commonly used example)
+PUSHBLOCK_ACTIONS = {
+    0: "noop",
+    1: "forward",
+    2: "backward",
+    3: "rotate_left",
+    4: "rotate_right",
+    5: "strafe_left",
+    6: "strafe_right",
+}
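
The `PUSHBLOCK_ACTIONS` table can be used to validate and label a discrete action before it is sent to the environment. The following is a minimal sketch, assuming a single discrete action branch as in PushBlock. `describe_pushblock_action` is a hypothetical helper that is not part of this commit, and the mapping is copied from `models.py` only so that the snippet runs on its own.

```python
from typing import Dict, List

# Copied from models.py so the sketch is self-contained.
PUSHBLOCK_ACTIONS: Dict[int, str] = {
    0: "noop",
    1: "forward",
    2: "backward",
    3: "rotate_left",
    4: "rotate_right",
    5: "strafe_left",
    6: "strafe_right",
}


def describe_pushblock_action(discrete_actions: List[int]) -> str:
    """Return the name of a PushBlock action, or raise ValueError if invalid.

    Hypothetical helper: assumes PushBlock exposes exactly one discrete branch.
    """
    if len(discrete_actions) != 1:
        raise ValueError(f"expected 1 action branch, got {len(discrete_actions)}")
    index = discrete_actions[0]
    if index not in PUSHBLOCK_ACTIONS:
        raise ValueError(f"action index {index} is not a known PushBlock action")
    return PUSHBLOCK_ACTIONS[index]


print(describe_pushblock_action([3]))  # rotate_left
```

Checking the index on the client side gives a clearer error than letting an out-of-range value reach the Unity process.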

openenv.yaml ADDED Viewed

	@@ -0,0 +1,6 @@

+spec_version: 1
+name: unity_env
+type: space
+runtime: fastapi
+app: server.app:app
+port: 8000

openenv_unity_env.egg-info/PKG-INFO ADDED Viewed

	@@ -0,0 +1,16 @@

+Metadata-Version: 2.4
+Name: openenv-unity-env
+Version: 0.1.0
+Summary: Unity ML-Agents Environment for OpenEnv - wraps Unity environments like PushBlock, 3DBall, GridWorld
+Requires-Python: >=3.10
+Requires-Dist: openenv-core[core]>=0.2.0
+Requires-Dist: fastapi>=0.115.0
+Requires-Dist: pydantic>=2.0.0
+Requires-Dist: uvicorn>=0.24.0
+Requires-Dist: requests>=2.31.0
+Requires-Dist: mlagents-envs>=1.0.0
+Requires-Dist: numpy>=1.20.0
+Requires-Dist: pillow>=9.0.0
+Provides-Extra: dev
+Requires-Dist: pytest>=8.0.0; extra == "dev"
+Requires-Dist: pytest-cov>=4.0.0; extra == "dev"

openenv_unity_env.egg-info/SOURCES.txt ADDED Viewed

	@@ -0,0 +1,15 @@

+README.md
+pyproject.toml
+./__init__.py
+./client.py
+./example_usage.py
+./models.py
+openenv_unity_env.egg-info/PKG-INFO
+openenv_unity_env.egg-info/SOURCES.txt
+openenv_unity_env.egg-info/dependency_links.txt
+openenv_unity_env.egg-info/entry_points.txt
+openenv_unity_env.egg-info/requires.txt
+openenv_unity_env.egg-info/top_level.txt
+server/__init__.py
+server/app.py
+server/unity_environment.py

openenv_unity_env.egg-info/dependency_links.txt ADDED Viewed

	@@ -0,0 +1 @@


+

openenv_unity_env.egg-info/entry_points.txt ADDED Viewed

	@@ -0,0 +1,2 @@


+[console_scripts]
+server = unity_env.server.app:main

openenv_unity_env.egg-info/requires.txt ADDED Viewed

	@@ -0,0 +1,12 @@

+openenv-core[core]>=0.2.0
+fastapi>=0.115.0
+pydantic>=2.0.0
+uvicorn>=0.24.0
+requests>=2.31.0
+mlagents-envs>=1.0.0
+numpy>=1.20.0
+pillow>=9.0.0
+[dev]
+pytest>=8.0.0
+pytest-cov>=4.0.0

openenv_unity_env.egg-info/top_level.txt ADDED Viewed

	@@ -0,0 +1 @@


+unity_env

pyproject.toml ADDED Viewed

	@@ -0,0 +1,45 @@

+# Copyright (c) Meta Platforms, Inc. and affiliates.
+# All rights reserved.
+#
+# This source code is licensed under the BSD-style license found in the
+# LICENSE file in the root directory of this source tree.
+[build-system]
+requires = ["setuptools>=45", "wheel"]
+build-backend = "setuptools.build_meta"
+[project]
+name = "openenv-unity-env"
+version = "0.1.0"
+description = "Unity ML-Agents Environment for OpenEnv - wraps Unity environments like PushBlock, 3DBall, GridWorld"
+requires-python = ">=3.10"
+dependencies = [
+    # Core OpenEnv dependencies (required for server functionality)
+    "openenv-core[core]>=0.2.0",
+    "fastapi>=0.115.0",
+    "pydantic>=2.0.0",
+    "uvicorn>=0.24.0",
+    "requests>=2.31.0",
+    # Unity ML-Agents dependencies (installed from source for latest features)
+    "mlagents-envs @ git+https://github.com/Unity-Technologies/ml-agents.git#subdirectory=ml-agents-envs",
+    # "mlagents @ git+https://github.com/Unity-Technologies/ml-agents.git#subdirectory=ml-agents",
+    "numpy>=1.20.0",
+    # Optional: for visual observations
+    "pillow>=9.0.0",
+]
+[project.optional-dependencies]
+dev = [
+    "pytest>=8.0.0",
+    "pytest-cov>=4.0.0",
+]
+[project.scripts]
+# Server entry point - enables running via: uv run --project . server
+# or: python -m unity_env.server.app
+server = "unity_env.server.app:main"
+[tool.setuptools]
+include-package-data = true
+packages = ["unity_env", "unity_env.server"]
+package-dir = { "unity_env" = ".", "unity_env.server" = "server" }

server/__init__.py ADDED Viewed

	@@ -0,0 +1,11 @@

+# Copyright (c) Meta Platforms, Inc. and affiliates.
+# All rights reserved.
+#
+# This source code is licensed under the BSD-style license found in the
+# LICENSE file in the root directory of this source tree.
+"""Unity environment server components."""
+from .unity_environment import UnityMLAgentsEnvironment
+__all__ = ["UnityMLAgentsEnvironment"]

server/app.py ADDED Viewed

	@@ -0,0 +1,84 @@

+# Copyright (c) Meta Platforms, Inc. and affiliates.
+# All rights reserved.
+#
+# This source code is licensed under the BSD-style license found in the
+# LICENSE file in the root directory of this source tree.
+"""
+FastAPI application for the Unity ML-Agents Environment.
+This module creates an HTTP server that exposes Unity ML-Agents environments
+over HTTP and WebSocket endpoints, compatible with EnvClient.
+Usage:
+    # Development (with auto-reload):
+    uvicorn server.app:app --reload --host 0.0.0.0 --port 8000
+    # Production:
+    uvicorn server.app:app --host 0.0.0.0 --port 8000 --workers 1
+    # Or run directly:
+    uv run --project . server
+Note: Unity environments are not thread-safe, so use workers=1.
+"""
+# Support multiple import scenarios
+try:
+    # In-repo imports (when running from OpenEnv repository root)
+    from openenv.core.env_server.http_server import create_app
+    from ..models import UnityAction, UnityObservation
+    from .unity_environment import UnityMLAgentsEnvironment
+except ImportError:
+    # openenv from pip
+    from openenv.core.env_server.http_server import create_app
+    try:
+        # Direct execution from envs/unity_env/ directory
+        import sys
+        from pathlib import Path
+        # Add parent directory to path for direct execution
+        _parent = str(Path(__file__).parent.parent)
+        if _parent not in sys.path:
+            sys.path.insert(0, _parent)
+        from models import UnityAction, UnityObservation
+        from server.unity_environment import UnityMLAgentsEnvironment
+    except ImportError:
+        try:
+            # Package installed as unity_env
+            from unity_env.models import UnityAction, UnityObservation
+            from unity_env.server.unity_environment import UnityMLAgentsEnvironment
+        except ImportError:
+            # Running from OpenEnv root with envs prefix
+            from envs.unity_env.models import UnityAction, UnityObservation
+            from envs.unity_env.server.unity_environment import UnityMLAgentsEnvironment
+# Create the app with web interface
+# Pass the class (factory) instead of an instance for WebSocket session support
+app = create_app(
+    UnityMLAgentsEnvironment,
+    UnityAction,
+    UnityObservation,
+    env_name="unity_env",
+)
+def main():
+    """
+    Entry point for direct execution via uv run or python -m.
+    This function enables running the server without Docker:
+        uv run --project . server
+        python -m envs.unity_env.server.app
+        openenv serve unity_env
+    """
+    import uvicorn
+    # Note: workers=1 because Unity environments are not thread-safe
+    uvicorn.run(app, host="0.0.0.0", port=8000, workers=1)
+if __name__ == "__main__":
+    main()
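
The nested `try`/`except ImportError` chain in `server/app.py` tries several module layouts in turn. The same idea can be written as a loop over candidate module names. This is only a sketch of the pattern, not what the commit does: `import_first` is a hypothetical helper, and the demo uses the standard-library `json` module as a stand-in for the real candidates.

```python
import importlib
from types import ModuleType
from typing import List, Sequence


def import_first(candidates: Sequence[str]) -> ModuleType:
    """Import and return the first module in `candidates` that can be imported."""
    errors: List[str] = []
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError as exc:
            # Remember why this layout failed, then try the next one.
            errors.append(f"{name}: {exc}")
    raise ImportError("no candidate could be imported:\n" + "\n".join(errors))


# Stand-in demo: the first name does not exist, so the second one is used.
module = import_first(["hypothetical_missing_module_xyz", "json"])
print(module.__name__)  # json
```

In the real file the candidates would correspond to `models`, `unity_env.models`, and `envs.unity_env.models`. Collecting the individual errors makes it easier to see which layout failed and why, which the nested form hides.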

server/unity_environment.py ADDED Viewed

	@@ -0,0 +1,554 @@

+# Copyright (c) Meta Platforms, Inc. and affiliates.
+# All rights reserved.
+#
+# This source code is licensed under the BSD-style license found in the
+# LICENSE file in the root directory of this source tree.
+"""
+Unity ML-Agents Environment Implementation.
+Wraps Unity ML-Agents environments (PushBlock, 3DBall, GridWorld, etc.)
+with the OpenEnv interface for standardized reinforcement learning.
+"""
+import base64
+import glob
+import hashlib
+import io
+import os
+from pathlib import Path
+from sys import platform
+from typing import Any, Dict, List, Optional
+from uuid import uuid4
+import numpy as np
+# Support multiple import scenarios
+try:
+    # In-repo imports (when running from OpenEnv repository root)
+    from openenv.core.env_server.interfaces import Environment
+    from ..models import UnityAction, UnityObservation, UnityState
+except ImportError:
+    # openenv from pip
+    from openenv.core.env_server.interfaces import Environment
+    try:
+        # Direct execution from envs/unity_env/ directory (imports from parent)
+        import sys
+        from pathlib import Path
+        # Add parent directory to path for direct execution
+        _parent = str(Path(__file__).parent.parent)
+        if _parent not in sys.path:
+            sys.path.insert(0, _parent)
+        from models import UnityAction, UnityObservation, UnityState
+    except ImportError:
+        try:
+            # Package installed as unity_env
+            from unity_env.models import UnityAction, UnityObservation, UnityState
+        except ImportError:
+            # Running from OpenEnv root with envs prefix
+            from envs.unity_env.models import UnityAction, UnityObservation, UnityState
+# Persistent cache directory to avoid re-downloading environment binaries
+PERSISTENT_CACHE_DIR = os.path.join(str(Path.home()), ".mlagents-cache")
+def get_cached_binary_path(cache_dir: str, name: str, url: str) -> Optional[str]:
+    """Check if binary is cached and return its path."""
+    if platform == "darwin":
+        extension = "*.app"
+    elif platform in ("linux", "linux2"):
+        extension = "*.x86_64"
+    elif platform == "win32":
+        extension = "*.exe"
+    else:
+        return None
+    bin_dir = os.path.join(cache_dir, "binaries")
+    url_hash = "-" + hashlib.md5(url.encode()).hexdigest()
+    search_path = os.path.join(bin_dir, name + url_hash, "**", extension)
+    candidates = glob.glob(search_path, recursive=True)
+    for c in candidates:
+        if "UnityCrashHandler64" not in c:
+            return c
+    return None
+class UnityMLAgentsEnvironment(Environment):
+    """
+    Wraps Unity ML-Agents environments with the OpenEnv interface.
+    This environment supports all Unity ML-Agents registry environments
+    including PushBlock, 3DBall, GridWorld, and more. Environments are
+    automatically downloaded on first use.
+    Features:
+    - Dynamic environment switching via reset(env_id="...")
+    - Support for both discrete and continuous action spaces
+    - Optional visual observations (base64-encoded images)
+    - Persistent caching to avoid re-downloading binaries
+    - Headless mode for faster training (no_graphics=True)
+    Example:
+        >>> env = UnityMLAgentsEnvironment()
+        >>> obs = env.reset()
+        >>> print(obs.vector_observations)
+        >>>
+        >>> # Take a single action
+        >>> obs = env.step(UnityAction(discrete_actions=[1]))  # Move forward
+        >>> print(obs.reward)
+    Example with different environment:
+        >>> env = UnityMLAgentsEnvironment(env_id="3DBall")
+        >>> obs = env.reset()
+        >>>
+        >>> # Or switch environment on reset
+        >>> obs = env.reset(env_id="PushBlock")
+    """
+    # Each WebSocket session gets its own environment instance, but Unity
+    # environments are not thread-safe, so concurrent sessions are disabled
+    SUPPORTS_CONCURRENT_SESSIONS = False
+    def __init__(
+        self,
+        env_id: Optional[str] = None,
+        no_graphics: Optional[bool] = None,
+        time_scale: Optional[float] = None,
+        width: Optional[int] = None,
+        height: Optional[int] = None,
+        quality_level: Optional[int] = None,
+        cache_dir: Optional[str] = None,
+    ):
+        """
+        Initialize the Unity ML-Agents environment.
+        Configuration can be provided via constructor arguments or environment
+        variables. Environment variables are used when constructor arguments
+        are not provided (useful for Docker deployment).
+        Args:
+            env_id: Identifier of the Unity environment to load.
+                Available: PushBlock, 3DBall, 3DBallHard, GridWorld, Basic
+                Env var: UNITY_ENV_ID (default: PushBlock)
+            no_graphics: If True, run in headless mode (faster training).
+                Env var: UNITY_NO_GRAPHICS (0 or 1, default: 0 = graphics enabled)
+            time_scale: Simulation speed multiplier.
+                Env var: UNITY_TIME_SCALE (default: 1.0)
+            width: Window width in pixels (when graphics enabled).
+                Env var: UNITY_WIDTH (default: 1280)
+            height: Window height in pixels (when graphics enabled).
+                Env var: UNITY_HEIGHT (default: 720)
+            quality_level: Graphics quality 0-5 (when graphics enabled).
+                Env var: UNITY_QUALITY_LEVEL (default: 5)
+            cache_dir: Directory to cache downloaded environment binaries.
+                Env var: UNITY_CACHE_DIR (default: ~/.mlagents-cache)
+        """
+        # Initialize cleanup-critical attributes first (for __del__ safety)
+        self._unity_env = None
+        self._behavior_name = None
+        self._behavior_spec = None
+        self._engine_channel = None
+        # Read from environment variables with defaults, allow constructor override
+        self._env_id = env_id or os.environ.get("UNITY_ENV_ID", "PushBlock")
+        # Handle no_graphics: default is False (graphics enabled)
+        if no_graphics is not None:
+            self._no_graphics = no_graphics
+        else:
+            env_no_graphics = os.environ.get("UNITY_NO_GRAPHICS", "0")
+            self._no_graphics = env_no_graphics.lower() in ("1", "true", "yes")
+        self._time_scale = (
+            time_scale
+            if time_scale is not None
+            else float(os.environ.get("UNITY_TIME_SCALE", "1.0"))
+        )
+        self._width = (
+            width
+            if width is not None
+            else int(os.environ.get("UNITY_WIDTH", "1280"))
+        )
+        self._height = (
+            height
+            if height is not None
+            else int(os.environ.get("UNITY_HEIGHT", "720"))
+        )
+        self._quality_level = (
+            quality_level
+            if quality_level is not None
+            else int(os.environ.get("UNITY_QUALITY_LEVEL", "5"))
+        )
+        self._cache_dir = cache_dir or os.environ.get(
+            "UNITY_CACHE_DIR", PERSISTENT_CACHE_DIR
+        )
+        self._include_visual = False
+        # State tracking
+        self._state = UnityState(
+            episode_id=str(uuid4()),
+            step_count=0,
+            env_id=self._env_id,
+        )
+        # Ensure cache directory exists
+        os.makedirs(self._cache_dir, exist_ok=True)
+    def _load_environment(self, env_id: str) -> None:
+        """Load or switch to a Unity environment."""
+        # Close existing environment if any
+        if self._unity_env is not None:
+            try:
+                self._unity_env.close()
+            except Exception:
+                pass
+        # Import ML-Agents components
+        try:
+            from mlagents_envs.base_env import ActionTuple
+            from mlagents_envs.registry import default_registry
+            from mlagents_envs.registry.remote_registry_entry import RemoteRegistryEntry
+            from mlagents_envs.side_channel.engine_configuration_channel import (
+                EngineConfigurationChannel,
+            )
+        except ImportError as e:
+            raise ImportError(
+                "mlagents-envs is required. Install with: pip install mlagents-envs"
+            ) from e
+        # Create engine configuration channel
+        self._engine_channel = EngineConfigurationChannel()
+        # Check if environment is in registry
+        if env_id not in default_registry:
+            available = list(default_registry.keys())
+            raise ValueError(
+                f"Environment '{env_id}' not found. Available: {available}"
+            )
+        # Get registry entry and create with persistent cache
+        entry = default_registry[env_id]
+        # Create a new entry with our persistent cache directory
+        persistent_entry = RemoteRegistryEntry(
+            identifier=entry.identifier,
+            expected_reward=entry.expected_reward,
+            description=entry.description,
+            linux_url=getattr(entry, "_linux_url", None),
+            darwin_url=getattr(entry, "_darwin_url", None),
+            win_url=getattr(entry, "_win_url", None),
+            additional_args=getattr(entry, "_add_args", []),
+            tmp_dir=self._cache_dir,
+        )
+        # Create the environment
+        self._unity_env = persistent_entry.make(
+            no_graphics=self._no_graphics,
+            side_channels=[self._engine_channel],
+        )
+        # Configure engine settings
+        if not self._no_graphics:
+            self._engine_channel.set_configuration_parameters(
+                width=self._width,
+                height=self._height,
+                quality_level=self._quality_level,
+                time_scale=self._time_scale,
+            )
+        else:
+            self._engine_channel.set_configuration_parameters(
+                time_scale=self._time_scale
+            )
+        # Get behavior info
+        if not self._unity_env.behavior_specs:
+            self._unity_env.step()
+        self._behavior_name = list(self._unity_env.behavior_specs.keys())[0]
+        self._behavior_spec = self._unity_env.behavior_specs[self._behavior_name]
+        # Update state
+        self._env_id = env_id
+        self._state.env_id = env_id
+        self._state.behavior_name = self._behavior_name
+        self._state.action_spec = self._get_action_spec_info()
+        self._state.observation_spec = self._get_observation_spec_info()
+        self._state.available_envs = list(default_registry.keys())
+    def _get_action_spec_info(self) -> Dict[str, Any]:
+        """Get information about the action space."""
+        spec = self._behavior_spec.action_spec
+        return {
+            "is_discrete": spec.is_discrete(),
+            "is_continuous": spec.is_continuous(),
+            "discrete_size": spec.discrete_size,
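+            # discrete_size > 0 also covers hybrid (discrete + continuous) action spaces,
+            # for which is_discrete() returns False
+            "discrete_branches": list(spec.discrete_branches) if spec.discrete_size > 0 else [],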
+            "continuous_size": spec.continuous_size,
+        }
+    def _get_observation_spec_info(self) -> Dict[str, Any]:
+        """Get information about the observation space."""
+        specs = self._behavior_spec.observation_specs
+        obs_info = []
+        for i, spec in enumerate(specs):
+            obs_info.append({
+                "index": i,
+                "shape": list(spec.shape),
+                "dimension_property": str(spec.dimension_property),
+                "observation_type": str(spec.observation_type),
+            })
+        return {"observations": obs_info, "count": len(specs)}
+    def _get_observation(
+        self,
+        decision_steps=None,
+        terminal_steps=None,
+        reward: float = 0.0,
+        done: bool = False,
+    ) -> UnityObservation:
+        """Convert Unity observation to UnityObservation."""
+        vector_obs = []
+        visual_obs = []
+        # Determine which steps to use
+        if terminal_steps is not None and len(terminal_steps) > 0:
+            steps = terminal_steps
+            done = True
+            # Get reward from terminal step
+            if len(terminal_steps.agent_id) > 0:
+                reward = float(terminal_steps[terminal_steps.agent_id[0]].reward)
+        elif decision_steps is not None and len(decision_steps) > 0:
+            steps = decision_steps
+            # Get reward from decision step
+            if len(decision_steps.agent_id) > 0:
+                reward = float(decision_steps[decision_steps.agent_id[0]].reward)
+        else:
+            # No agents, return empty observation
+            return UnityObservation(
+                vector_observations=[],
+                visual_observations=None,
+                behavior_name=self._behavior_name or "",
+                done=done,
+                reward=reward,
+                action_spec_info=self._state.action_spec,
+                observation_spec_info=self._state.observation_spec,
+            )
+        # Process observations from first agent
+        for obs in steps.obs:
+            if len(obs.shape) == 2:
+                # Vector observation (agents, features)
+                vector_obs.extend(obs[0].tolist())
+            elif len(obs.shape) == 4 and self._include_visual:
+                # Visual observation (agents, height, width, channels)
+                img_array = (obs[0] * 255).astype(np.uint8)
+                # Encode as base64 PNG
+                try:
+                    from PIL import Image
+                    img = Image.fromarray(img_array)
+                    buffer = io.BytesIO()
+                    img.save(buffer, format="PNG")
+                    img_b64 = base64.b64encode(buffer.getvalue()).decode("utf-8")
+                    visual_obs.append(img_b64)
+                except ImportError:
+                    # PIL not available, skip visual observations
+                    pass
+        return UnityObservation(
+            vector_observations=vector_obs,
+            visual_observations=visual_obs if visual_obs else None,
+            behavior_name=self._behavior_name or "",
+            done=done,
+            reward=reward,
+            action_spec_info=self._state.action_spec,
+            observation_spec_info=self._state.observation_spec,
+        )
+    def reset(
+        self,
+        env_id: Optional[str] = None,
+        seed: Optional[int] = None,
+        include_visual: bool = False,
+        **kwargs,
+    ) -> UnityObservation:
+        """
+        Reset the environment and return initial observation.
+        Args:
+            env_id: Optionally switch to a different Unity environment.
+            seed: Random seed. Accepted for interface compatibility, but
+                currently ignored by this wrapper.
+            include_visual: If True, include visual observations in output.
+            **kwargs: Additional arguments (ignored).
+        Returns:
+            UnityObservation with initial state.
+        """
+        self._include_visual = include_visual
+        # Load or switch environment if needed
+        target_env = env_id or self._env_id
+        if self._unity_env is None or target_env != self._env_id:
+            self._load_environment(target_env)
+        # Reset the environment
+        self._unity_env.reset()
+        # Update state
+        self._state = UnityState(
+            episode_id=str(uuid4()),
+            step_count=0,
+            env_id=self._env_id,
+            behavior_name=self._behavior_name,
+            action_spec=self._state.action_spec,
+            observation_spec=self._state.observation_spec,
+            available_envs=self._state.available_envs,
+        )
+        # Get initial observation
+        decision_steps, terminal_steps = self._unity_env.get_steps(self._behavior_name)
+        return self._get_observation(
+            decision_steps=decision_steps,
+            terminal_steps=terminal_steps,
+            reward=0.0,
+            done=False,
+        )
+    def step(self, action: UnityAction) -> UnityObservation:
+        """
+        Execute one step in the environment.
+        Args:
+            action: UnityAction with discrete and/or continuous actions.
+        Returns:
+            UnityObservation with new state, reward, and done flag.
+        """
+        if self._unity_env is None:
+            raise RuntimeError("Environment not initialized. Call reset() first.")
+        from mlagents_envs.base_env import ActionTuple
+        # Get current decision steps to find out how many agents are awaiting actions
+        decision_steps, terminal_steps = self._unity_env.get_steps(self._behavior_name)
+        # Check if episode already ended
+        if len(terminal_steps) > 0:
+            return self._get_observation(
+                decision_steps=decision_steps,
+                terminal_steps=terminal_steps,
+                done=True,
+            )
+        n_agents = len(decision_steps)
+        if n_agents == 0:
+            # No agents need decisions, just step
+            self._unity_env.step()
+            self._state.step_count += 1
+            decision_steps, terminal_steps = self._unity_env.get_steps(self._behavior_name)
+            return self._get_observation(
+                decision_steps=decision_steps,
+                terminal_steps=terminal_steps,
+            )
+        # Build action tuple
+        action_tuple = ActionTuple()
+        # Handle discrete actions
+        if action.discrete_actions is not None:
+            discrete = np.array([action.discrete_actions] * n_agents, dtype=np.int32)
+            # Ensure correct shape (n_agents, n_branches)
+            if discrete.ndim == 1:
+                discrete = discrete.reshape(n_agents, -1)
+            action_tuple.add_discrete(discrete)
+        elif self._behavior_spec.action_spec.discrete_size > 0:
+            # Default to no-op (action 0); this check also covers hybrid action spaces
+            n_branches = self._behavior_spec.action_spec.discrete_size
+            discrete = np.zeros((n_agents, n_branches), dtype=np.int32)
+            action_tuple.add_discrete(discrete)
+        # Handle continuous actions
+        if action.continuous_actions is not None:
+            continuous = np.array([action.continuous_actions] * n_agents, dtype=np.float32)
+            if continuous.ndim == 1:
+                continuous = continuous.reshape(n_agents, -1)
+            action_tuple.add_continuous(continuous)
+        elif self._behavior_spec.action_spec.continuous_size > 0:
+            # Default to zero actions; this check also covers hybrid action spaces
+            n_continuous = self._behavior_spec.action_spec.continuous_size
+            continuous = np.zeros((n_agents, n_continuous), dtype=np.float32)
+            action_tuple.add_continuous(continuous)
+        # Set actions and step
+        self._unity_env.set_actions(self._behavior_name, action_tuple)
+        self._unity_env.step()
+        self._state.step_count += 1
+        # Get new observation
+        decision_steps, terminal_steps = self._unity_env.get_steps(self._behavior_name)
+        return self._get_observation(
+            decision_steps=decision_steps,
+            terminal_steps=terminal_steps,
+        )
+    async def reset_async(
+        self,
+        env_id: Optional[str] = None,
+        seed: Optional[int] = None,
+        include_visual: bool = False,
+        **kwargs,
+    ) -> UnityObservation:
+        """
+        Async version of reset - runs in a thread to avoid blocking the event loop.
+        Unity ML-Agents environments can take 10-60+ seconds to initialize.
+        Running in a thread allows the event loop to continue processing
+        WebSocket keepalive pings during this time.
+        """
+        import asyncio
+        return await asyncio.to_thread(
+            self.reset,
+            env_id=env_id,
+            seed=seed,
+            include_visual=include_visual,
+            **kwargs,
+        )
+    async def step_async(self, action: UnityAction) -> UnityObservation:
+        """
+        Async version of step - runs in a thread to avoid blocking the event loop.
+        Although step() is usually fast, running in a thread ensures
+        the event loop remains responsive.
+        """
+        import asyncio
+        return await asyncio.to_thread(self.step, action)
+    @property
+    def state(self) -> UnityState:
+        """Get the current environment state."""
+        return self._state
+    def close(self) -> None:
+        """Close the Unity environment."""
+        unity_env = getattr(self, "_unity_env", None)
+        if unity_env is not None:
+            try:
+                unity_env.close()
+            except Exception:
+                pass
+            self._unity_env = None
+    def __del__(self):
+        """Cleanup on deletion."""
+        try:
+            self.close()
+        except Exception:
+            pass
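
The constructor of `UnityMLAgentsEnvironment` resolves each setting in the same order: an explicit argument wins, then the environment variable, then a built-in default. The sketch below isolates that precedence rule. `resolve_setting` and `parse_flag` are hypothetical helpers that do not appear in the commit, and a plain dictionary stands in for `os.environ` so the example is deterministic.

```python
import os
from typing import Callable, Mapping, Optional, TypeVar

T = TypeVar("T")


def resolve_setting(
    explicit: Optional[T],
    env_var: str,
    default: str,
    cast: Callable[[str], T],
    environ: Mapping[str, str] = os.environ,
) -> T:
    """Return `explicit` if given, else the env var (or `default`) passed through `cast`."""
    if explicit is not None:
        return explicit
    return cast(environ.get(env_var, default))


def parse_flag(value: str) -> bool:
    """Treat "1", "true", or "yes" (case-insensitive) as True, like UNITY_NO_GRAPHICS."""
    return value.lower() in ("1", "true", "yes")


# A fake environment mapping used instead of the real os.environ.
fake_env = {"UNITY_TIME_SCALE": "20", "UNITY_NO_GRAPHICS": "true"}

print(resolve_setting(None, "UNITY_TIME_SCALE", "1.0", float, fake_env))      # 20.0
print(resolve_setting(5.0, "UNITY_TIME_SCALE", "1.0", float, fake_env))       # 5.0
print(resolve_setting(None, "UNITY_WIDTH", "1280", int, fake_env))            # 1280
print(resolve_setting(None, "UNITY_NO_GRAPHICS", "0", parse_flag, fake_env))  # True
```

Testing `explicit is not None` rather than truthiness matters for values such as `0` or `False`, which are valid explicit settings. Note that the commit uses `env_id or ...` and `cache_dir or ...` for its two string settings, so an empty string there falls through to the environment variable.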