Commit 0f53490 by Crashbandicoote2 · verified · 1 Parent(s): 6c44dae

Upload folder using huggingface_hub

Dockerfile ADDED
@@ -0,0 +1,67 @@
1
+ # Copyright (c) Meta Platforms, Inc. and affiliates.
2
+ # All rights reserved.
3
+ #
4
+ # This source code is licensed under the BSD-style license found in the
5
+ # LICENSE file in the root directory of this source tree.
6
+
7
+ # Multi-stage build for Unity ML-Agents environment
8
+ # Uses pip for package installation (no virtual environment)
9
+ # Note: Using Python 3.10.12 specifically because ml-agents requires >=3.10.1,<=3.10.12
10
+ # Note: Unity binaries are x86_64 only, so we force linux/amd64 platform
11
+
12
+ FROM --platform=linux/amd64 python:3.10.12-slim AS builder
13
+
14
+ WORKDIR /app
15
+
16
+ # Install build dependencies
17
+ RUN apt-get update && apt-get install -y --no-install-recommends \
18
+ build-essential \
19
+ git \
20
+ && rm -rf /var/lib/apt/lists/*
21
+
22
+ # Copy environment code
23
+ COPY . /app/env
24
+
25
+ WORKDIR /app/env
26
+
27
+ # Install dependencies using pip
28
+ # Note: mlagents packages are installed from git source via pyproject.toml
29
+ RUN pip install --upgrade pip && \
30
+ pip install --no-cache-dir -e .
31
+
32
+ # Final runtime stage
33
+ FROM --platform=linux/amd64 python:3.10.12-slim
34
+
35
+ WORKDIR /app
36
+
37
+ # Install runtime dependencies (curl for healthcheck)
38
+ RUN apt-get update && apt-get install -y --no-install-recommends \
39
+ curl \
40
+ && rm -rf /var/lib/apt/lists/*
41
+
42
+ # Copy installed packages from builder
43
+ COPY --from=builder /usr/local/lib/python3.10/site-packages /usr/local/lib/python3.10/site-packages
44
+ COPY --from=builder /usr/local/bin /usr/local/bin
45
+
46
+ # Copy the environment code
47
+ COPY . /app/env
48
+
49
+ # Create cache directory for Unity binaries
50
+ RUN mkdir -p /root/.mlagents-cache
51
+
52
+ # Set PYTHONPATH so imports work correctly
53
+ ENV PYTHONPATH="/app/env:$PYTHONPATH"
54
+
55
+ # Expose port
56
+ EXPOSE 8000
57
+
58
+ # Health check
59
+ HEALTHCHECK --interval=30s --timeout=3s --start-period=60s --retries=3 \
60
+ CMD curl -f http://localhost:8000/health || exit 1
61
+
62
+ # Note: Longer start period (60s) because Unity environment download may take time on first run
63
+
64
+ # Run the FastAPI server
65
+ # Note: uvicorn starts a single worker by default; keep it at one because Unity environments are not thread-safe
66
+ ENV ENABLE_WEB_INTERFACE=true
67
+ CMD ["sh", "-c", "cd /app/env && uvicorn server.app:app --host 0.0.0.0 --port 8000"]
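The `HEALTHCHECK` in the Dockerfile above probes `/health` on port 8000 with three retries. A client waiting for the container to come up can apply the same idea. The following is a minimal sketch, not part of this commit: the helper names `http_probe` and `wait_until_healthy` and their parameters are hypothetical, and only the `/health` path and the port come from the Dockerfile.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_probe(url: str, timeout_s: float = 3.0) -> bool:
    """Return True if a GET on `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status
        return False


def wait_until_healthy(
    probe: Callable[[], bool],
    retries: int = 3,
    interval_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `probe` up to `retries` times, sleeping between failed attempts."""
    for attempt in range(retries):
        if probe():
            return True
        if attempt < retries - 1:
            sleep(interval_s)
    return False
```

A caller might use it as `wait_until_healthy(lambda: http_probe("http://localhost:8000/health"))`. The `probe` and `sleep` arguments are injectable so the retry logic can be exercised without a running server.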
README.md CHANGED
@@ -1,10 +1,607 @@
1
  ---
2
- title: Unity Env
3
- emoji: 🐢
4
- colorFrom: green
5
- colorTo: blue
6
  sdk: docker
7
  pinned: false
8
  ---
9
 
10
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
1
  ---
2
+ title: Unity Environment Server
3
+ emoji: 🌐
4
+ colorFrom: blue
5
+ colorTo: green
6
  sdk: docker
7
  pinned: false
8
+ app_port: 8000
9
+ base_path: /web
10
+ tags:
11
+ - openenv
12
+ - Unity
13
+ - MlAgents
14
+ - MlAgentsUnity
15
+ - MlAgentsEnv
16
  ---
17
 
18
+ <!--
19
+ Copyright (c) Meta Platforms, Inc. and affiliates.
20
+ All rights reserved.
21
+ This source code is licensed under the BSD-style license found in the
22
+ LICENSE file in the root directory of this source tree.
23
+ -->
24
+
25
+ <div align="center">
26
+
27
+ # Unity ML-Agents Environment
28
+
29
+ OpenEnv wrapper for [Unity ML-Agents](https://github.com/Unity-Technologies/ml-agents) environments. This environment provides access to Unity's reinforcement learning environments through a standardized HTTP/WebSocket interface.
30
+
31
+ ## Supported Environments
32
+
33
+ | Environment | Action Type | Description |
34
+ |------------|-------------|-------------|
35
+ | **PushBlock** | Discrete (7) | Push a block to a goal position |
36
+ | **3DBall** | Continuous (2) | Balance a ball on a platform |
37
+ | **3DBallHard** | Continuous (2) | Harder version of 3DBall |
38
+ | **GridWorld** | Discrete (5) | Navigate a grid to find goals |
39
+ | **Basic** | Discrete (3) | Simple left/right movement |
40
+
41
+ More environments may be available depending on the ML-Agents registry version.
42
+
43
+ ## Installation
44
+
45
+ ### Option 1: Non-Docker Installation (Local Development)
46
+
47
+ #### Prerequisites
48
+
49
+ - Python 3.10+
50
+ - [uv](https://docs.astral.sh/uv/) (recommended) or pip
51
+
52
+ #### Install from OpenEnv Repository
53
+
54
+ ```bash
55
+ # Clone the OpenEnv repository (if not already done)
56
+ git clone https://github.com/your-org/OpenEnv.git
57
+ cd OpenEnv
58
+
59
+ # Install the unity_env package with dependencies
60
+ cd envs/unity_env
61
+ uv pip install -e .
62
+
63
+ # Or with pip
64
+ pip install -e .
65
+ ```
66
+
67
+ #### Install Dependencies Only
68
+
69
+ ```bash
70
+ cd envs/unity_env
71
+
72
+ # Using uv (recommended)
73
+ uv sync
74
+
75
+ # Or using pip
76
+ pip install -r requirements.txt # if available
77
+ pip install mlagents-envs numpy pillow fastapi uvicorn pydantic
78
+ ```
79
+
80
+ #### Verify Installation
81
+
82
+ ```bash
83
+ # Test the installation
84
+ cd envs/unity_env
85
+ python -c "from server.unity_environment import UnityMLAgentsEnvironment; print('Installation successful!')"
86
+ ```
87
+
88
+ **Note:** The first run will download Unity environment binaries (~500MB). These are cached in `~/.mlagents-cache/` for future use.
89
+
90
+ ### Option 2: Docker Installation
91
+
92
+ #### Prerequisites
93
+
94
+ - Docker installed and running
95
+ - Python 3.10+ (for running the client)
96
+
97
+ #### Build the Docker Image
98
+
99
+ ```bash
100
+ cd envs/unity_env
101
+
102
+ # Build the Docker image
103
+ docker build -f server/Dockerfile -t unity-env:latest .
104
+
105
+ # Verify the build
106
+ docker images | grep unity-env
107
+ ```
108
+
109
+ **Note for Apple Silicon (M1/M2/M3/M4) users:** Docker mode is **not supported** on Apple Silicon because Unity's Mono runtime crashes under x86_64 emulation. Use **direct mode** (`--direct`) or **server mode** (`--url`) instead, which run native macOS binaries. See [Troubleshooting](#docker-mode-fails-on-apple-silicon-m1m2m3m4) for details.
110
+
111
+ #### Run the Docker Container
112
+
113
+ ```bash
114
+ # Run with default settings (graphics enabled, 1280x720)
115
+ docker run -p 8000:8000 unity-env:latest
116
+
117
+ # Run with custom settings
118
+ docker run -p 8000:8000 \
119
+ -e UNITY_NO_GRAPHICS=0 \
120
+ -e UNITY_WIDTH=1280 \
121
+ -e UNITY_HEIGHT=720 \
122
+ -e UNITY_TIME_SCALE=1.0 \
123
+ unity-env:latest
124
+
125
+ # Run in headless mode (faster for training)
126
+ docker run -p 8000:8000 \
127
+ -e UNITY_NO_GRAPHICS=1 \
128
+ -e UNITY_TIME_SCALE=20 \
129
+ unity-env:latest
130
+
131
+ # Run with persistent cache (avoid re-downloading binaries)
132
+ docker run -p 8000:8000 \
133
+ -v ~/.mlagents-cache:/root/.mlagents-cache \
134
+ unity-env:latest
135
+ ```
136
+
137
+ #### Install Client Dependencies
138
+
139
+ To connect to the Docker container, install the client dependencies on your host machine:
140
+
141
+ ```bash
142
+ cd envs/unity_env
143
+ pip install requests websockets
144
+ ```
145
+
146
+ ## Quick Start
147
+
148
+ ### Option 1: Direct Mode (Fastest for Testing)
149
+
150
+ Run the Unity environment directly without a server:
151
+
152
+ ```bash
153
+ cd envs/unity_env
154
+
155
+ # Run with graphics (default: 1280x720)
156
+ python example_usage.py --direct
157
+
158
+ # Run with custom window size
159
+ python example_usage.py --direct --width 800 --height 600
160
+
161
+ # Run headless (faster for training)
162
+ python example_usage.py --direct --no-graphics --time-scale 20
163
+
164
+ # Run 3DBall environment
165
+ python example_usage.py --direct --env 3DBall --episodes 5
166
+ ```
167
+
168
+ ### Option 2: Server Mode
169
+
170
+ Start the server and connect with a client:
171
+
172
+ ```bash
173
+ # Terminal 1: Start the server (graphics enabled by default)
174
+ cd envs/unity_env
175
+ uv run uvicorn server.app:app --host 0.0.0.0 --port 8000
176
+
177
+ # Terminal 2: Run the example client
178
+ python example_usage.py --url http://localhost:8000
179
+ python example_usage.py --url http://localhost:8000 --env 3DBall --episodes 5
180
+ ```
181
+
182
+ ### Option 3: Docker Mode
183
+
184
+ Run via Docker container (auto-starts and connects):
185
+
186
+ ```bash
187
+ cd envs/unity_env
188
+
189
+ # Run with default settings
190
+ python example_usage.py --docker
191
+
192
+ # Run with custom window size
193
+ python example_usage.py --docker --width 1280 --height 720
194
+
195
+ # Run headless (faster for training)
196
+ python example_usage.py --docker --no-graphics --time-scale 20
197
+
198
+ # Run 3DBall for 10 episodes
199
+ python example_usage.py --docker --env 3DBall --episodes 10
200
+
201
+ # Use a custom Docker image
202
+ python example_usage.py --docker --docker-image my-unity-env:v1
203
+ ```
204
+
205
+ ## Example Scripts
206
+
207
+ ### Basic Usage Examples
208
+
209
+ #### 1. Direct Mode - Quick Testing
210
+
211
+ ```bash
212
+ # Run PushBlock with graphics (default)
213
+ python example_usage.py --direct
214
+
215
+ # Output:
216
+ # ============================================================
217
+ # Unity ML-Agents Environment - Direct Mode
218
+ # ============================================================
219
+ # Environment: PushBlock
220
+ # Episodes: 3
221
+ # Max steps: 500
222
+ # Window size: 1280x720
223
+ # Graphics: Enabled
224
+ # ...
225
+ ```
226
+
227
+ #### 2. Direct Mode - Training Configuration
228
+
229
+ ```bash
230
+ # Headless mode with fast simulation (20x speed)
231
+ python example_usage.py --direct --no-graphics --time-scale 20 --episodes 10 --max-steps 1000
232
+
233
+ # This is ideal for training - no graphics overhead, faster simulation
234
+ ```
235
+
236
+ #### 3. Direct Mode - 3DBall with Custom Window
237
+
238
+ ```bash
239
+ # Run 3DBall (continuous actions) with larger window
240
+ python example_usage.py --direct --env 3DBall --width 1280 --height 720 --episodes 5
241
+ ```
242
+
243
+ #### 4. Docker Mode - Production-like Testing
244
+
245
+ ```bash
246
+ # Build the image first
247
+ docker build -f server/Dockerfile -t unity-env:latest .
248
+
249
+ # Run via Docker with graphics
250
+ python example_usage.py --docker --width 1280 --height 720
251
+
252
+ # Run via Docker in headless mode for training
253
+ python example_usage.py --docker --no-graphics --time-scale 20 --episodes 20
254
+ ```
255
+
256
+ #### 5. Server Mode - Separate Server and Client
257
+
258
+ ```bash
259
+ # Terminal 1: Start server with specific settings
260
+ UNITY_WIDTH=1280 UNITY_HEIGHT=720 uv run uvicorn server.app:app --port 8000
261
+
262
+ # Terminal 2: Connect and run episodes
263
+ python example_usage.py --url http://localhost:8000 --env PushBlock --episodes 5
264
+ python example_usage.py --url http://localhost:8000 --env 3DBall --episodes 5
265
+ ```
266
+
267
+ #### 6. Alternating Environments
268
+
269
+ ```bash
270
+ # Run alternating episodes between PushBlock and 3DBall
271
+ python example_usage.py --direct --env both --episodes 6
272
+ # Episodes 1,3,5 = PushBlock; Episodes 2,4,6 = 3DBall
273
+ ```
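The alternation described in the comment above (odd episodes on PushBlock, even episodes on 3DBall) amounts to a round-robin over the environment list. A minimal sketch follows; `env_for_episode` and `schedule` are hypothetical helpers for illustration, and how `example_usage.py` actually implements `--env both` is not shown in this commit.

```python
from typing import List, Sequence


def env_for_episode(
    episode: int, env_ids: Sequence[str] = ("PushBlock", "3DBall")
) -> str:
    """Map a 1-based episode number onto the environments in round-robin order."""
    if episode < 1:
        raise ValueError("episode numbers start at 1")
    return env_ids[(episode - 1) % len(env_ids)]


def schedule(episodes: int) -> List[str]:
    """Return the environment used for each episode, in order."""
    return [env_for_episode(n) for n in range(1, episodes + 1)]
```

Each name returned by `schedule` could then be passed as `env_id` to `client.reset(...)` at the start of the corresponding episode.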
274
+
275
+ ### Command Line Options
276
+
277
+ | Option | Default | Description |
278
+ |--------|---------|-------------|
279
+ | `--direct` | - | Run environment directly (no server) |
280
+ | `--docker` | - | Run via Docker container |
281
+ | `--url` | localhost:8000 | Server URL for server mode |
282
+ | `--docker-image` | unity-env:latest | Docker image name |
283
+ | `--env` | PushBlock | Environment: PushBlock, 3DBall, both |
284
+ | `--episodes` | 3 | Number of episodes |
285
+ | `--max-steps` | 500 | Max steps per episode |
286
+ | `--width` | 1280 | Window width in pixels |
287
+ | `--height` | 720 | Window height in pixels |
288
+ | `--no-graphics` | - | Headless mode (faster) |
289
+ | `--time-scale` | 1.0 | Simulation speed multiplier |
290
+ | `--quality-level` | 5 | Graphics quality 0-5 |
291
+ | `--quiet` | - | Reduce output verbosity |
292
+
293
+ ## Python Client Usage
294
+
295
+ ### Connect to Server
296
+
297
+ ```python
298
+ from envs.unity_env import UnityEnv, UnityAction
299
+
300
+ # Connect to the server
301
+ with UnityEnv(base_url="http://localhost:8000") as client:
+     # Reset to PushBlock environment
+     result = client.reset(env_id="PushBlock")
+     print(f"Observation dims: {len(result.observation.vector_observations)}")
+
+     # Take actions
+     for _ in range(100):
+         # PushBlock actions: 0=noop, 1=forward, 2=backward,
+         # 3=rotate_left, 4=rotate_right, 5=strafe_left, 6=strafe_right
+         action = UnityAction(discrete_actions=[1])  # Move forward
+         result = client.step(action)
+         print(f"Reward: {result.reward}, Done: {result.done}")
+
+         if result.done:
+             result = client.reset()
316
+ ```
317
+
318
+ ### Connect via Docker
319
+
320
+ ```python
321
+ from envs.unity_env import UnityEnv, UnityAction
322
+
323
+ # Automatically start Docker container and connect
324
+ client = UnityEnv.from_docker_image(
+     "unity-env:latest",
+     environment={
+         "UNITY_NO_GRAPHICS": "0",
+         "UNITY_WIDTH": "1280",
+         "UNITY_HEIGHT": "720",
+     }
+ )
+
+ try:
+     result = client.reset(env_id="PushBlock")
+     for _ in range(100):
+         action = UnityAction(discrete_actions=[1])
+         result = client.step(action)
+ finally:
+     client.close()
340
+ ```
341
+
342
+ ### Switch Environments Dynamically
343
+
344
+ ```python
345
+ # Start with PushBlock
346
+ result = client.reset(env_id="PushBlock")
347
+ # ... train on PushBlock ...
348
+
349
+ # Switch to 3DBall (continuous actions)
350
+ result = client.reset(env_id="3DBall")
351
+ action = UnityAction(continuous_actions=[0.5, -0.3])
352
+ result = client.step(action)
353
+ ```
354
+
355
+ ### Direct Environment Usage (No Server)
356
+
357
+ ```python
358
+ from envs.unity_env.server.unity_environment import UnityMLAgentsEnvironment
359
+ from envs.unity_env.models import UnityAction
360
+
361
+ # Create environment directly
362
+ env = UnityMLAgentsEnvironment(
+     env_id="PushBlock",
+     no_graphics=False,  # Show graphics window
+     width=1280,
+     height=720,
+     time_scale=1.0,
+ )
+
+ try:
+     obs = env.reset()
+     print(f"Observation: {len(obs.vector_observations)} dimensions")
+
+     for step in range(100):
+         action = UnityAction(discrete_actions=[1])  # Move forward
+         obs = env.step(action)
+         print(f"Step {step}: reward={obs.reward}, done={obs.done}")
+
+         if obs.done:
+             obs = env.reset()
+ finally:
+     env.close()
383
+ ```
384
+
385
+ ## Action Spaces
386
+
387
+ ### PushBlock (Discrete)
388
+
389
+ 7 discrete actions:
390
+ - `0`: No operation
391
+ - `1`: Move forward
392
+ - `2`: Move backward
393
+ - `3`: Rotate left
394
+ - `4`: Rotate right
395
+ - `5`: Strafe left
396
+ - `6`: Strafe right
397
+
398
+ ```python
399
+ action = UnityAction(discrete_actions=[1]) # Move forward
400
+ ```
401
+
402
+ ### 3DBall (Continuous)
403
+
404
+ 2 continuous actions in range [-1, 1]:
405
+ - Action 0: X-axis rotation
406
+ - Action 1: Z-axis rotation
407
+
408
+ ```python
409
+ action = UnityAction(continuous_actions=[0.5, -0.3])
410
+ ```
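The two action spaces above map onto the `discrete_actions` and `continuous_actions` payload keys that the client sends. The sketch below validates an action before it is sent, using plain dictionaries so it runs without the package installed. The helper names are hypothetical, and clipping continuous values to [-1, 1] on the client side is an assumption; this commit does not show whether the server clips them.

```python
from typing import Dict, List, Sequence

# PushBlock action indices as listed in the README
PUSHBLOCK_ACTIONS = {
    0: "noop",
    1: "forward",
    2: "backward",
    3: "rotate_left",
    4: "rotate_right",
    5: "strafe_left",
    6: "strafe_right",
}


def discrete_payload(action_index: int, num_actions: int = 7) -> Dict[str, List[int]]:
    """Build a discrete-action payload, rejecting out-of-range indices."""
    if not 0 <= action_index < num_actions:
        raise ValueError(f"action {action_index} not in [0, {num_actions})")
    return {"discrete_actions": [action_index]}


def continuous_payload(values: Sequence[float], size: int = 2) -> Dict[str, List[float]]:
    """Build a continuous-action payload, clipping each value to [-1, 1]."""
    if len(values) != size:
        raise ValueError(f"expected {size} values, got {len(values)}")
    return {"continuous_actions": [max(-1.0, min(1.0, float(v))) for v in values]}
```

With the real client, the same values would be passed to `UnityAction(discrete_actions=...)` or `UnityAction(continuous_actions=...)` instead of being placed in a dictionary.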
411
+
412
+ ## Observations
413
+
414
+ Most environments provide vector observations, whose size depends on the environment; GridWorld provides visual observations instead:
415
+
416
+ - **PushBlock**: 70 dimensions (14 ray-casts detecting walls/goals/blocks)
417
+ - **3DBall**: 8 dimensions (rotation and ball position/velocity)
418
+ - **GridWorld**: Visual observations (grid view)
419
+
420
+ ```python
421
+ result = client.reset()
422
+ obs = result.observation
423
+
424
+ # Access observations
425
+ print(f"Vector obs: {obs.vector_observations}")
426
+ print(f"Behavior: {obs.behavior_name}")
427
+ print(f"Action spec: {obs.action_spec_info}")
428
+ ```
429
+
430
+ ### Visual Observations (Optional)
431
+
432
+ Some environments support visual observations. Enable with `include_visual=True`:
433
+
434
+ ```python
435
+ result = client.reset(include_visual=True)
436
+ if result.observation.visual_observations:
437
+ # Base64-encoded PNG images
438
+ for img_b64 in result.observation.visual_observations:
439
+ # Decode and use the image
440
+ import base64
441
+ img_bytes = base64.b64decode(img_b64)
442
+ ```
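Since the visual observations arrive as base64-encoded PNG strings, it can help to check each decoded payload before handing it to an image library. A minimal standard-library sketch follows; `decode_png_observations` is a hypothetical helper, not part of the package.

```python
import base64
from typing import List

# Every PNG file starts with this fixed 8-byte signature
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_png_observations(encoded: List[str]) -> List[bytes]:
    """Decode base64 strings and verify that each result looks like a PNG."""
    images = []
    for img_b64 in encoded:
        raw = base64.b64decode(img_b64)
        if not raw.startswith(PNG_SIGNATURE):
            raise ValueError("visual observation is not a PNG image")
        images.append(raw)
    return images
```

The returned bytes can be opened with Pillow, which is already a dependency, for example via `Image.open(io.BytesIO(raw))`.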
443
+
444
+ ## Configuration
445
+
446
+ ### Constructor Arguments
447
+
448
+ When creating `UnityMLAgentsEnvironment` directly:
449
+
450
+ ```python
451
+ from envs.unity_env.server.unity_environment import UnityMLAgentsEnvironment
452
+
453
+ env = UnityMLAgentsEnvironment(
+     env_id="PushBlock",   # Unity environment to load
+     no_graphics=False,    # False = show graphics window
+     width=1280,           # Window width in pixels
+     height=720,           # Window height in pixels
+     time_scale=1.0,       # Simulation speed (20.0 for fast training)
+     quality_level=5,      # Graphics quality 0-5
460
+ )
461
+ ```
462
+
463
+ ### Environment Variables
464
+
465
+ For Docker deployment, configure via environment variables:
466
+
467
+ | Variable | Default | Description |
468
+ |----------|---------|-------------|
469
+ | `UNITY_ENV_ID` | PushBlock | Default Unity environment |
470
+ | `UNITY_NO_GRAPHICS` | 0 | Set to 1 for headless mode |
471
+ | `UNITY_WIDTH` | 1280 | Window width in pixels |
472
+ | `UNITY_HEIGHT` | 720 | Window height in pixels |
473
+ | `UNITY_TIME_SCALE` | 1.0 | Simulation speed multiplier |
474
+ | `UNITY_QUALITY_LEVEL` | 5 | Graphics quality 0-5 |
475
+ | `UNITY_CACHE_DIR` | ~/.mlagents-cache | Binary cache directory |
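The table above can be read as a small configuration schema. The sketch below shows one way a server could load it; the `UnitySettings` dataclass and `load_settings` function are hypothetical, and the parsing rules (for example, treating only `"1"` as headless) are assumptions, since the server code is not part of this excerpt. Variable names and defaults are taken from the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class UnitySettings:
    env_id: str
    no_graphics: bool
    width: int
    height: int
    time_scale: float
    quality_level: int
    cache_dir: str


def load_settings(environ: Mapping[str, str] = os.environ) -> UnitySettings:
    """Read the UNITY_* variables, falling back to the documented defaults."""
    return UnitySettings(
        env_id=environ.get("UNITY_ENV_ID", "PushBlock"),
        no_graphics=environ.get("UNITY_NO_GRAPHICS", "0") == "1",
        width=int(environ.get("UNITY_WIDTH", "1280")),
        height=int(environ.get("UNITY_HEIGHT", "720")),
        time_scale=float(environ.get("UNITY_TIME_SCALE", "1.0")),
        quality_level=int(environ.get("UNITY_QUALITY_LEVEL", "5")),
        cache_dir=os.path.expanduser(
            environ.get("UNITY_CACHE_DIR", "~/.mlagents-cache")
        ),
    )
```

Passing a plain dictionary instead of `os.environ` makes the defaults easy to check, for example `load_settings({"UNITY_NO_GRAPHICS": "1", "UNITY_TIME_SCALE": "20"})` for a headless training setup.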
476
+
477
+ ## Environment State
478
+
479
+ Access detailed environment information:
480
+
481
+ ```python
482
+ state = client.state()
483
+ print(f"Environment: {state.env_id}")
484
+ print(f"Episode ID: {state.episode_id}")
485
+ print(f"Step count: {state.step_count}")
486
+ print(f"Available envs: {state.available_envs}")
487
+ print(f"Action spec: {state.action_spec}")
488
+ print(f"Observation spec: {state.observation_spec}")
489
+ ```
490
+
491
+ ## Troubleshooting
492
+
493
+ ### Docker Mode Fails on Apple Silicon (M1/M2/M3/M4)
494
+
495
+ **Symptom:** When running with `--docker` on Apple Silicon Macs, you see an error like:
496
+
497
+ ```
498
+ Error running with Docker: Server error: The Unity environment took too long to respond...
499
+ ```
500
+
501
+ Or in Docker logs:
502
+
503
+ ```
504
+ * Assertion: should not be reached at tramp-amd64.c:605
505
+ Environment shut down with return code -6 (SIGABRT)
506
+ ```
507
+
508
+ **Cause:** Unity ML-Agents binaries are x86_64 (Intel) only. When Docker runs the x86_64 Linux container on Apple Silicon, it uses QEMU emulation. The Mono runtime inside Unity has architecture-specific code that crashes under emulation.
509
+
510
+ **Solutions:**
511
+
+ 1. **Use Direct Mode** (recommended for macOS):
+    ```bash
+    python example_usage.py --direct --no-graphics
+    ```
+    Direct mode downloads native macOS binaries which work on Apple Silicon.
+
+ 2. **Use Server Mode** with a local server:
+    ```bash
+    # Terminal 1: Start server (uses native macOS binaries)
+    uvicorn server.app:app --host 0.0.0.0 --port 8000
+
+    # Terminal 2: Run client
+    python example_usage.py --url http://localhost:8000
+    ```
+
+ 3. **Use an x86_64 Linux machine** for Docker mode:
+    The Docker image works correctly on native x86_64 Linux machines (cloud VMs, dedicated servers, etc.).
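A launcher script could pick between these solutions automatically by inspecting the host architecture. The sketch below is illustrative only: `docker_mode_supported` and `recommended_flag` are hypothetical helpers, and treating every non-x86_64 host as unsupported is an assumption that is stricter than the Apple Silicon case documented above.

```python
import platform
from typing import Optional


def docker_mode_supported(machine: Optional[str] = None) -> bool:
    """Return True when the host CPU is x86_64, so no emulation is needed."""
    if machine is None:
        machine = platform.machine()
    return machine.lower() in {"x86_64", "amd64"}


def recommended_flag(machine: Optional[str] = None) -> str:
    """Suggest an example_usage.py mode flag for this host."""
    return "--docker" if docker_mode_supported(machine) else "--direct"
```

On an Apple Silicon Mac running a native Python build, `platform.machine()` reports `arm64`, so `recommended_flag()` would suggest `--direct`.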
529
+
530
+ ### First Run is Slow
531
+
532
+ The first run downloads Unity binaries (~500MB). This is normal and only happens once. Binaries are cached in `~/.mlagents-cache/`.
533
+
534
+ ### Graphics Not Showing
535
+
536
+ - Ensure `--no-graphics` is NOT set
537
+ - On Linux, ensure X11 is available
538
+ - For Docker, you may need to set up X11 forwarding
539
+
540
+ ### Docker Container Fails to Start
541
+
542
+ ```bash
543
+ # Check Docker logs
544
+ docker logs <container_id>
545
+
546
+ # Ensure the image is built
547
+ docker images | grep unity-env
548
+
549
+ # Rebuild if necessary
550
+ docker build -f server/Dockerfile -t unity-env:latest .
551
+ ```
552
+
553
+ ### Import Errors
554
+
555
+ ```bash
556
+ # Ensure you're in the correct directory
557
+ cd envs/unity_env
558
+
559
+ # Install dependencies
560
+ uv sync
561
+ # or
562
+ pip install -e .
563
+ ```
564
+
565
+ ### mlagents-envs Installation Issues
566
+
567
+ The `mlagents-envs` and `mlagents` packages are installed from source by default (via the GitHub repository). If you encounter issues or want to install manually:
568
+
569
+ ```bash
570
+ # Clone the ml-agents repository
571
+ git clone https://github.com/Unity-Technologies/ml-agents.git
572
+ cd ml-agents
573
+
574
+ # Install mlagents-envs from source
575
+ pip install -e ./ml-agents-envs
576
+
577
+ # Install the full ml-agents package
578
+ pip install -e ./ml-agents
579
+ ```
580
+
581
+ This approach is useful when:
582
+ - You need to modify the mlagents source code
583
+ - You want to use a specific branch or commit
584
+ - The git dependency in pyproject.toml is causing issues
585
+
586
+ ## Caveats
587
+
588
+ 1. **First Run Download**: Unity binaries (~500MB) are downloaded on first use
589
+ 2. **Platform-Specific**: Binaries are platform-specific (macOS, Linux, Windows)
590
+ 3. **Apple Silicon + Docker**: Docker mode does not work on Apple Silicon Macs due to x86_64 emulation issues with Unity's Mono runtime. Use direct mode or server mode instead.
591
+ 4. **Single Worker**: Unity environments are not thread-safe; use `workers=1`
592
+ 5. **Graphics Mode**: Some features require X11/display for graphics mode
593
+ 6. **Multi-Agent**: Currently uses first agent only; full multi-agent support planned
594
+
595
+ ## Dependencies
596
+
597
+ - `mlagents-envs` (installed from source via git)
598
+ - `mlagents` (installed from source via git)
599
+ - `numpy>=1.20.0`
600
+ - `pillow>=9.0.0` (for visual observations)
601
+ - `openenv-core[core]>=0.2.0`
602
+
603
+ ## References
604
+
605
+ - [Unity ML-Agents Documentation](https://unity-technologies.github.io/ml-agents/)
606
+ - [ML-Agents GitHub](https://github.com/Unity-Technologies/ml-agents)
607
+ - [Example Environments](https://unity-technologies.github.io/ml-agents/Learning-Environment-Examples/)
__init__.py ADDED
@@ -0,0 +1,12 @@
1
+ # Copyright (c) Meta Platforms, Inc. and affiliates.
2
+ # All rights reserved.
3
+ #
4
+ # This source code is licensed under the BSD-style license found in the
5
+ # LICENSE file in the root directory of this source tree.
6
+
7
+ """Unity ML-Agents Environment for OpenEnv."""
8
+
9
+ from .client import UnityEnv
10
+ from .models import UnityAction, UnityObservation, UnityState
11
+
12
+ __all__ = ["UnityAction", "UnityObservation", "UnityState", "UnityEnv"]
client.py ADDED
@@ -0,0 +1,263 @@
1
+ # Copyright (c) Meta Platforms, Inc. and affiliates.
2
+ # All rights reserved.
3
+ #
4
+ # This source code is licensed under the BSD-style license found in the
5
+ # LICENSE file in the root directory of this source tree.
6
+
7
+ """
8
+ Unity ML-Agents Environment Client.
9
+
10
+ This module provides the client for connecting to a Unity ML-Agents
11
+ Environment server via WebSocket for persistent sessions.
12
+ """
13
+
14
+ from typing import Any, Dict, List, Optional
15
+
16
+ # Support multiple import scenarios
17
+ try:
+     # In-repo imports (when running from OpenEnv repository root)
+     from openenv.core.client_types import StepResult
+     from openenv.core.env_client import EnvClient
+
+     from .models import UnityAction, UnityObservation, UnityState
+ except ImportError:
+     # openenv from pip
+     from openenv.core.client_types import StepResult
+     from openenv.core.env_client import EnvClient
+
+     try:
+         # Direct execution from envs/unity_env/ directory
+         from models import UnityAction, UnityObservation, UnityState
+     except ImportError:
+         try:
+             # Package installed as unity_env
+             from unity_env.models import UnityAction, UnityObservation, UnityState
+         except ImportError:
+             # Running from OpenEnv root with envs prefix
+             from envs.unity_env.models import UnityAction, UnityObservation, UnityState
38
+
39
+
40
+ class UnityEnv(EnvClient[UnityAction, UnityObservation, UnityState]):
+     """
+     Client for Unity ML-Agents environments.
+
+     This client maintains a persistent WebSocket connection to the environment
+     server, enabling efficient multi-step interactions with lower latency.
+     Each client instance has its own dedicated environment session on the server.
+
+     Note: Unity environments can take 30-60+ seconds to initialize on first reset
+     (downloading binaries, starting Unity process). The client is configured with
+     longer ping timeouts to handle this.
+
+     Supported Unity Environments:
+         - PushBlock: Push a block to a goal (discrete actions: 7)
+         - 3DBall: Balance a ball on a platform (continuous actions: 2)
+         - 3DBallHard: Harder version of 3DBall
+         - GridWorld: Navigate a grid to find goals
+         - Basic: Simple movement task
+         - And more from the ML-Agents registry
+
+     Example:
+         >>> # Connect to a running server
+         >>> with UnityEnv(base_url="http://localhost:8000") as client:
+         ...     result = client.reset()
+         ...     print(f"Vector obs: {len(result.observation.vector_observations)} dims")
+         ...
+         ...     # Take action (PushBlock: 1=forward)
+         ...     result = client.step(UnityAction(discrete_actions=[1]))
+         ...     print(f"Reward: {result.reward}")
+
+     Example with Docker:
+         >>> # Automatically start container and connect
+         >>> client = UnityEnv.from_docker_image("unity-env:latest")
+         >>> try:
+         ...     result = client.reset(env_id="3DBall")
+         ...     result = client.step(UnityAction(continuous_actions=[0.5, -0.3]))
+         ... finally:
+         ...     client.close()
+
+     Example switching environments:
+         >>> client = UnityEnv(base_url="http://localhost:8000")
+         >>> # Start with PushBlock
+         >>> result = client.reset(env_id="PushBlock")
+         >>> # ... train on PushBlock ...
+         >>> # Switch to 3DBall
+         >>> result = client.reset(env_id="3DBall")
+         >>> # ... train on 3DBall ...
+     """
88
+
+     def __init__(
+         self,
+         base_url: str,
+         connect_timeout_s: float = 10.0,
+         message_timeout_s: float = 180.0,  # 3 minutes for slow Unity initialization
+         provider: Optional[Any] = None,
+     ):
+         """
+         Initialize Unity environment client.
+
+         Uses longer default timeouts than the base EnvClient because Unity
+         environments can take 30-60+ seconds to initialize on first reset.
+
+         Args:
+             base_url: Base URL of the environment server (http:// or ws://).
+             connect_timeout_s: Timeout for establishing WebSocket connection
+             message_timeout_s: Timeout for receiving responses (default 3 min for Unity)
+             provider: Optional container/runtime provider for lifecycle management.
+         """
+         super().__init__(
+             base_url=base_url,
+             connect_timeout_s=connect_timeout_s,
+             message_timeout_s=message_timeout_s,
+             provider=provider,
+         )
114
+
+     def connect(self) -> "UnityEnv":
+         """
+         Establish WebSocket connection to the server.
+
+         Overrides the default connection to use longer ping timeouts,
+         since Unity environments can take 30-60+ seconds to initialize.
+
+         Returns:
+             self for method chaining
+
+         Raises:
+             ConnectionError: If connection cannot be established
+         """
+         from websockets.sync.client import connect as ws_connect
+
+         if self._ws is not None:
+             return self
+
+         try:
+             # Use a longer ping_timeout for Unity (120s) since environment
+             # initialization can block the server for a while
+             self._ws = ws_connect(
+                 self._ws_url,
+                 open_timeout=self._connect_timeout,
+                 ping_timeout=120,  # 2 minutes for slow Unity initialization
+                 ping_interval=30,  # Send pings every 30 seconds
+                 close_timeout=30,
+             )
+         except Exception as e:
+             raise ConnectionError(f"Failed to connect to {self._ws_url}: {e}") from e
+
+         return self
147
+
+     def _step_payload(self, action: UnityAction) -> Dict:
+         """
+         Convert UnityAction to JSON payload for step request.
+
+         Args:
+             action: UnityAction instance
+
+         Returns:
+             Dictionary representation suitable for JSON encoding
+         """
+         payload: Dict[str, Any] = {}
+
+         if action.discrete_actions is not None:
+             payload["discrete_actions"] = action.discrete_actions
+
+         if action.continuous_actions is not None:
+             payload["continuous_actions"] = action.continuous_actions
+
+         if action.metadata:
+             payload["metadata"] = action.metadata
+
+         return payload
170
+
+     def _parse_result(self, payload: Dict) -> StepResult[UnityObservation]:
+         """
+         Parse server response into StepResult[UnityObservation].
+
+         Args:
+             payload: JSON response from server
+
+         Returns:
+             StepResult with UnityObservation
+         """
+         obs_data = payload.get("observation", {})
+
+         observation = UnityObservation(
+             vector_observations=obs_data.get("vector_observations", []),
+             visual_observations=obs_data.get("visual_observations"),
+             behavior_name=obs_data.get("behavior_name", ""),
+             action_spec_info=obs_data.get("action_spec_info", {}),
+             observation_spec_info=obs_data.get("observation_spec_info", {}),
+             done=payload.get("done", False),
+             reward=payload.get("reward"),
+             metadata=obs_data.get("metadata", {}),
+         )
+
+         return StepResult(
+             observation=observation,
+             reward=payload.get("reward"),
+             done=payload.get("done", False),
+         )
199
+
+     def _parse_state(self, payload: Dict) -> UnityState:
+         """
+         Parse server response into UnityState object.
+
+         Args:
+             payload: JSON response from /state endpoint
+
+         Returns:
+             UnityState object with environment information
+         """
+         return UnityState(
+             episode_id=payload.get("episode_id"),
+             step_count=payload.get("step_count", 0),
+             env_id=payload.get("env_id", ""),
+             behavior_name=payload.get("behavior_name", ""),
+             action_spec=payload.get("action_spec", {}),
+             observation_spec=payload.get("observation_spec", {}),
+             available_envs=payload.get("available_envs", []),
+         )
219
+
+     def reset(
+         self,
+         env_id: Optional[str] = None,
+         include_visual: bool = False,
+         **kwargs,
+     ) -> StepResult[UnityObservation]:
+         """
+         Reset the environment.
+
+         Args:
+             env_id: Optionally switch to a different Unity environment.
+                 Available: PushBlock, 3DBall, 3DBallHard, GridWorld, Basic
+             include_visual: If True, include visual observations in response.
+             **kwargs: Additional arguments passed to server.
+
+         Returns:
+             StepResult with initial observation.
+         """
+         reset_kwargs = dict(kwargs)
+         if env_id is not None:
+             reset_kwargs["env_id"] = env_id
+         reset_kwargs["include_visual"] = include_visual
+
+         return super().reset(**reset_kwargs)
+
+     @staticmethod
+     def available_environments() -> List[str]:
+         """
+         List commonly available Unity environments.
+
+         Note: The actual list may vary based on the ML-Agents registry version.
+         Use state.available_envs after connecting for the authoritative list.
+
+         Returns:
+             List of environment identifiers.
+         """
+         return [
+             "PushBlock",
+             "3DBall",
+             "3DBallHard",
+             "GridWorld",
+             "Basic",
+             "VisualPushBlock",
+         ]
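The payload construction in `_step_payload` above can be illustrated in isolation. The following is a minimal sketch, not the client itself: `SketchAction` is a hypothetical dataclass standing in for `UnityAction` (the real model is a pydantic class from `models.py`), and `step_payload` is a free-function copy of the method so the snippet runs without OpenEnv installed.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class SketchAction:
    """Hypothetical stand-in for UnityAction (not the real pydantic model)."""

    discrete_actions: Optional[List[int]] = None
    continuous_actions: Optional[List[float]] = None
    metadata: Dict[str, Any] = field(default_factory=dict)


def step_payload(action: SketchAction) -> Dict[str, Any]:
    """Build the step payload; unset fields are left out of the dict."""
    payload: Dict[str, Any] = {}
    if action.discrete_actions is not None:
        payload["discrete_actions"] = action.discrete_actions
    if action.continuous_actions is not None:
        payload["continuous_actions"] = action.continuous_actions
    if action.metadata:
        payload["metadata"] = action.metadata
    return payload


# A discrete-only action produces a payload with a single key.
discrete_only = step_payload(SketchAction(discrete_actions=[3]))

# A hybrid action carries both lists; empty metadata is still omitted.
hybrid = step_payload(
    SketchAction(discrete_actions=[1], continuous_actions=[0.5, -0.3])
)
```

Because unset fields are omitted rather than sent as `null`, the server can apply its own defaults for whichever action type the client did not supply.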
models.py ADDED
@@ -0,0 +1,164 @@
+ # Copyright (c) Meta Platforms, Inc. and affiliates.
+ # All rights reserved.
+ #
+ # This source code is licensed under the BSD-style license found in the
+ # LICENSE file in the root directory of this source tree.
+
+ """
+ Data models for the Unity ML-Agents Environment.
+
+ The Unity environment wraps Unity ML-Agents environments (PushBlock, 3DBall,
+ GridWorld, etc.) providing a unified interface for reinforcement learning.
+ """
+
+ from typing import Any, Dict, List, Optional
+
+ from pydantic import Field
+
+ # Support both in-repo and standalone imports
+ try:
+     # In-repo imports (when running from OpenEnv repository)
+     from openenv.core.env_server.types import Action, Observation, State
+ except ImportError:
+     # Standalone imports (when environment is standalone with openenv from pip)
+     from openenv.core.env_server.types import Action, Observation, State
+
+
+ class UnityAction(Action):
+     """
+     Action for Unity ML-Agents environments.
+
+     Supports both discrete and continuous action spaces. Unity environments
+     may use either or both types of actions:
+
+     - Discrete actions: Integer indices for categorical choices
+       (e.g., movement direction: 0=forward, 1=backward, 2=left, 3=right)
+     - Continuous actions: Float values typically in [-1, 1] range
+       (e.g., joint rotations, force magnitudes)
+
+     Example (PushBlock - discrete):
+         >>> action = UnityAction(discrete_actions=[3])  # Rotate left
+
+     Example (Walker - continuous):
+         >>> action = UnityAction(continuous_actions=[0.5, -0.3, 0.0, ...])
+
+     Attributes:
+         discrete_actions: List of discrete action indices for each action branch.
+             For PushBlock: [0-6] where 0=noop, 1=forward, 2=backward,
+             3=rotate_left, 4=rotate_right, 5=strafe_left, 6=strafe_right
+         continuous_actions: List of continuous action values, typically in [-1, 1].
+         metadata: Additional action parameters.
+     """
+
+     discrete_actions: Optional[List[int]] = Field(
+         default=None,
+         description="Discrete action indices for each action branch",
+     )
+     continuous_actions: Optional[List[float]] = Field(
+         default=None,
+         description="Continuous action values, typically in [-1, 1] range",
+     )
+
+
+ class UnityObservation(Observation):
+     """
+     Observation from Unity ML-Agents environments.
+
+     Contains vector observations (sensor readings) and optionally visual
+     observations (rendered images). Most Unity environments provide vector
+     observations; visual observations are optional and must be requested.
+
+     Attributes:
+         vector_observations: Flattened array of all vector observations.
+             Size and meaning depend on the specific environment.
+             For PushBlock: 70 values from 14 ray-casts detecting walls/goals/blocks.
+         visual_observations: Optional list of base64-encoded images (PNG format).
+             Only included when include_visual=True in reset/step.
+         behavior_name: Name of the Unity behavior (agent type).
+         action_spec_info: Information about the action space for this environment.
+         observation_spec_info: Information about the observation space.
+     """
+
+     vector_observations: List[float] = Field(
+         default_factory=list,
+         description="Flattened vector observations from the environment",
+     )
+     visual_observations: Optional[List[str]] = Field(
+         default=None,
+         description="Base64-encoded PNG images (when include_visual=True)",
+     )
+     behavior_name: str = Field(
+         default="",
+         description="Name of the Unity behavior/agent type",
+     )
+     action_spec_info: Dict[str, Any] = Field(
+         default_factory=dict,
+         description="Information about the action space",
+     )
+     observation_spec_info: Dict[str, Any] = Field(
+         default_factory=dict,
+         description="Information about the observation space",
+     )
+
+
+ class UnityState(State):
+     """
+     Extended state for Unity ML-Agents environments.
+
+     Provides additional metadata about the currently loaded environment,
+     including action and observation space specifications.
+
+     Attributes:
+         episode_id: Unique identifier for the current episode.
+         step_count: Number of steps taken in the current episode.
+         env_id: Identifier of the currently loaded Unity environment.
+         behavior_name: Name of the Unity behavior (agent type).
+         action_spec: Detailed specification of the action space.
+         observation_spec: Detailed specification of the observation space.
+         available_envs: List of available environment identifiers.
+     """
+
+     env_id: str = Field(
+         default="PushBlock",
+         description="Identifier of the loaded Unity environment",
+     )
+     behavior_name: str = Field(
+         default="",
+         description="Name of the Unity behavior/agent type",
+     )
+     action_spec: Dict[str, Any] = Field(
+         default_factory=dict,
+         description="Specification of the action space",
+     )
+     observation_spec: Dict[str, Any] = Field(
+         default_factory=dict,
+         description="Specification of the observation space",
+     )
+     available_envs: List[str] = Field(
+         default_factory=list,
+         description="List of available Unity environments",
+     )
+
+
+ # Available Unity environments from the ML-Agents registry
+ # These are pre-built environments that can be downloaded automatically
+ AVAILABLE_UNITY_ENVIRONMENTS = [
+     "PushBlock",
+     "3DBall",
+     "3DBallHard",
+     "GridWorld",
+     "Basic",
+     "VisualPushBlock",
+     # Note: More environments may be available in newer versions of ML-Agents
+ ]
+
+ # Action descriptions for PushBlock (most commonly used example)
+ PUSHBLOCK_ACTIONS = {
+     0: "noop",
+     1: "forward",
+     2: "backward",
+     3: "rotate_left",
+     4: "rotate_right",
+     5: "strafe_left",
+     6: "strafe_right",
+ }
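The `PUSHBLOCK_ACTIONS` table above lends itself to a small validation helper. The sketch below is illustrative only: `describe_pushblock_action` is a hypothetical function that does not exist in the package, and it assumes PushBlock exposes a single discrete branch, as the `UnityAction(discrete_actions=[3])` docstring example suggests. The mapping is copied so the snippet is self-contained.

```python
from typing import Dict, List

# Copy of the mapping from models.py so the sketch runs on its own.
PUSHBLOCK_ACTIONS: Dict[int, str] = {
    0: "noop",
    1: "forward",
    2: "backward",
    3: "rotate_left",
    4: "rotate_right",
    5: "strafe_left",
    6: "strafe_right",
}


def describe_pushblock_action(discrete_actions: List[int]) -> str:
    """Return the action name for a one-branch PushBlock action list.

    Hypothetical helper: assumes exactly one discrete branch.
    """
    if len(discrete_actions) != 1:
        raise ValueError("expected exactly one discrete branch")
    index = discrete_actions[0]
    if index not in PUSHBLOCK_ACTIONS:
        raise ValueError(f"unknown PushBlock action index: {index}")
    return PUSHBLOCK_ACTIONS[index]


name = describe_pushblock_action([3])
```

Validating on the client side in this way would surface an out-of-range index before the request reaches the Unity process.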
openenv.yaml ADDED
@@ -0,0 +1,6 @@
+ spec_version: 1
+ name: unity_env
+ type: space
+ runtime: fastapi
+ app: server.app:app
+ port: 8000
openenv_unity_env.egg-info/PKG-INFO ADDED
@@ -0,0 +1,16 @@
+ Metadata-Version: 2.4
+ Name: openenv-unity-env
+ Version: 0.1.0
+ Summary: Unity ML-Agents Environment for OpenEnv - wraps Unity environments like PushBlock, 3DBall, GridWorld
+ Requires-Python: >=3.10
+ Requires-Dist: openenv-core[core]>=0.2.0
+ Requires-Dist: fastapi>=0.115.0
+ Requires-Dist: pydantic>=2.0.0
+ Requires-Dist: uvicorn>=0.24.0
+ Requires-Dist: requests>=2.31.0
+ Requires-Dist: mlagents-envs>=1.0.0
+ Requires-Dist: numpy>=1.20.0
+ Requires-Dist: pillow>=9.0.0
+ Provides-Extra: dev
+ Requires-Dist: pytest>=8.0.0; extra == "dev"
+ Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
openenv_unity_env.egg-info/SOURCES.txt ADDED
@@ -0,0 +1,15 @@
+ README.md
+ pyproject.toml
+ ./__init__.py
+ ./client.py
+ ./example_usage.py
+ ./models.py
+ openenv_unity_env.egg-info/PKG-INFO
+ openenv_unity_env.egg-info/SOURCES.txt
+ openenv_unity_env.egg-info/dependency_links.txt
+ openenv_unity_env.egg-info/entry_points.txt
+ openenv_unity_env.egg-info/requires.txt
+ openenv_unity_env.egg-info/top_level.txt
+ server/__init__.py
+ server/app.py
+ server/unity_environment.py
openenv_unity_env.egg-info/dependency_links.txt ADDED
@@ -0,0 +1 @@
+
openenv_unity_env.egg-info/entry_points.txt ADDED
@@ -0,0 +1,2 @@
+ [console_scripts]
+ server = unity_env.server.app:main
openenv_unity_env.egg-info/requires.txt ADDED
@@ -0,0 +1,12 @@
+ openenv-core[core]>=0.2.0
+ fastapi>=0.115.0
+ pydantic>=2.0.0
+ uvicorn>=0.24.0
+ requests>=2.31.0
+ mlagents-envs>=1.0.0
+ numpy>=1.20.0
+ pillow>=9.0.0
+
+ [dev]
+ pytest>=8.0.0
+ pytest-cov>=4.0.0
openenv_unity_env.egg-info/top_level.txt ADDED
@@ -0,0 +1 @@
+ unity_env
pyproject.toml ADDED
@@ -0,0 +1,45 @@
+ # Copyright (c) Meta Platforms, Inc. and affiliates.
+ # All rights reserved.
+ #
+ # This source code is licensed under the BSD-style license found in the
+ # LICENSE file in the root directory of this source tree.
+
+ [build-system]
+ requires = ["setuptools>=45", "wheel"]
+ build-backend = "setuptools.build_meta"
+
+ [project]
+ name = "openenv-unity-env"
+ version = "0.1.0"
+ description = "Unity ML-Agents Environment for OpenEnv - wraps Unity environments like PushBlock, 3DBall, GridWorld"
+ requires-python = ">=3.10"
+ dependencies = [
+     # Core OpenEnv dependencies (required for server functionality)
+     "openenv-core[core]>=0.2.0",
+     "fastapi>=0.115.0",
+     "pydantic>=2.0.0",
+     "uvicorn>=0.24.0",
+     "requests>=2.31.0",
+     # Unity ML-Agents dependencies (installed from source for latest features)
+     "mlagents-envs @ git+https://github.com/Unity-Technologies/ml-agents.git#subdirectory=ml-agents-envs",
+     # "mlagents @ git+https://github.com/Unity-Technologies/ml-agents.git#subdirectory=ml-agents",
+     "numpy>=1.20.0",
+     # Optional: for visual observations
+     "pillow>=9.0.0",
+ ]
+
+ [project.optional-dependencies]
+ dev = [
+     "pytest>=8.0.0",
+     "pytest-cov>=4.0.0",
+ ]
+
+ [project.scripts]
+ # Server entry point - enables running via: uv run --project . server
+ # or: python -m unity_env.server.app
+ server = "unity_env.server.app:main"
+
+ [tool.setuptools]
+ include-package-data = true
+ packages = ["unity_env", "unity_env.server"]
+ package-dir = { "unity_env" = ".", "unity_env.server" = "server" }
server/__init__.py ADDED
@@ -0,0 +1,11 @@
+ # Copyright (c) Meta Platforms, Inc. and affiliates.
+ # All rights reserved.
+ #
+ # This source code is licensed under the BSD-style license found in the
+ # LICENSE file in the root directory of this source tree.
+
+ """Unity environment server components."""
+
+ from .unity_environment import UnityMLAgentsEnvironment
+
+ __all__ = ["UnityMLAgentsEnvironment"]
server/app.py ADDED
@@ -0,0 +1,84 @@
+ # Copyright (c) Meta Platforms, Inc. and affiliates.
+ # All rights reserved.
+ #
+ # This source code is licensed under the BSD-style license found in the
+ # LICENSE file in the root directory of this source tree.
+
+ """
+ FastAPI application for the Unity ML-Agents Environment.
+
+ This module creates an HTTP server that exposes Unity ML-Agents environments
+ over HTTP and WebSocket endpoints, compatible with EnvClient.
+
+ Usage:
+     # Development (with auto-reload):
+     uvicorn server.app:app --reload --host 0.0.0.0 --port 8000
+
+     # Production:
+     uvicorn server.app:app --host 0.0.0.0 --port 8000 --workers 1
+
+     # Or run directly:
+     uv run --project . server
+
+ Note: Unity environments are not thread-safe, so use workers=1.
+ """
+
+ # Support multiple import scenarios
+ try:
+     # In-repo imports (when running from OpenEnv repository root)
+     from openenv.core.env_server.http_server import create_app
+
+     from ..models import UnityAction, UnityObservation
+     from .unity_environment import UnityMLAgentsEnvironment
+ except ImportError:
+     # openenv from pip
+     from openenv.core.env_server.http_server import create_app
+
+     try:
+         # Direct execution from envs/unity_env/ directory
+         import sys
+         from pathlib import Path
+
+         # Add parent directory to path for direct execution
+         _parent = str(Path(__file__).parent.parent)
+         if _parent not in sys.path:
+             sys.path.insert(0, _parent)
+         from models import UnityAction, UnityObservation
+         from server.unity_environment import UnityMLAgentsEnvironment
+     except ImportError:
+         try:
+             # Package installed as unity_env
+             from unity_env.models import UnityAction, UnityObservation
+             from unity_env.server.unity_environment import UnityMLAgentsEnvironment
+         except ImportError:
+             # Running from OpenEnv root with envs prefix
+             from envs.unity_env.models import UnityAction, UnityObservation
+             from envs.unity_env.server.unity_environment import UnityMLAgentsEnvironment
+
+ # Create the app with web interface
+ # Pass the class (factory) instead of an instance for WebSocket session support
+ app = create_app(
+     UnityMLAgentsEnvironment,
+     UnityAction,
+     UnityObservation,
+     env_name="unity_env",
+ )
+
+
+ def main():
+     """
+     Entry point for direct execution via uv run or python -m.
+
+     This function enables running the server without Docker:
+         uv run --project . server
+         python -m envs.unity_env.server.app
+         openenv serve unity_env
+     """
+     import uvicorn
+
+     # Note: workers=1 because Unity environments are not thread-safe
+     uvicorn.run(app, host="0.0.0.0", port=8000, workers=1)
+
+
+ if __name__ == "__main__":
+     main()
server/unity_environment.py ADDED
@@ -0,0 +1,554 @@
+ # Copyright (c) Meta Platforms, Inc. and affiliates.
+ # All rights reserved.
+ #
+ # This source code is licensed under the BSD-style license found in the
+ # LICENSE file in the root directory of this source tree.
+
+ """
+ Unity ML-Agents Environment Implementation.
+
+ Wraps Unity ML-Agents environments (PushBlock, 3DBall, GridWorld, etc.)
+ with the OpenEnv interface for standardized reinforcement learning.
+ """
+
+ import base64
+ import glob
+ import hashlib
+ import io
+ import os
+ from pathlib import Path
+ from sys import platform
+ from typing import Any, Dict, List, Optional
+ from uuid import uuid4
+
+ import numpy as np
+
+ # Support multiple import scenarios
+ try:
+     # In-repo imports (when running from OpenEnv repository root)
+     from openenv.core.env_server.interfaces import Environment
+
+     from ..models import UnityAction, UnityObservation, UnityState
+ except ImportError:
+     # openenv from pip
+     from openenv.core.env_server.interfaces import Environment
+
+     try:
+         # Direct execution from envs/unity_env/ directory (imports from parent)
+         import sys
+         from pathlib import Path
+
+         # Add parent directory to path for direct execution
+         _parent = str(Path(__file__).parent.parent)
+         if _parent not in sys.path:
+             sys.path.insert(0, _parent)
+         from models import UnityAction, UnityObservation, UnityState
+     except ImportError:
+         try:
+             # Package installed as unity_env
+             from unity_env.models import UnityAction, UnityObservation, UnityState
+         except ImportError:
+             # Running from OpenEnv root with envs prefix
+             from envs.unity_env.models import UnityAction, UnityObservation, UnityState
+
+
+ # Persistent cache directory to avoid re-downloading environment binaries
+ PERSISTENT_CACHE_DIR = os.path.join(str(Path.home()), ".mlagents-cache")
+
+
+ def get_cached_binary_path(cache_dir: str, name: str, url: str) -> Optional[str]:
+     """Check if binary is cached and return its path."""
+     if platform == "darwin":
+         extension = "*.app"
+     elif platform in ("linux", "linux2"):
+         extension = "*.x86_64"
+     elif platform == "win32":
+         extension = "*.exe"
+     else:
+         return None
+
+     bin_dir = os.path.join(cache_dir, "binaries")
+     url_hash = "-" + hashlib.md5(url.encode()).hexdigest()
+     search_path = os.path.join(bin_dir, name + url_hash, "**", extension)
+
+     candidates = glob.glob(search_path, recursive=True)
+     for c in candidates:
+         if "UnityCrashHandler64" not in c:
+             return c
+     return None
+
+
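The cache lookup above keys each binary directory on the environment name plus an MD5 digest of its download URL. A minimal sketch of just that naming step follows; `cached_binary_dir` is a hypothetical helper, and the URLs are placeholders rather than real registry entries.

```python
import hashlib
import os


def cached_binary_dir(cache_dir: str, name: str, url: str) -> str:
    """Return the directory that would hold binaries for (name, url).

    Mirrors the naming in get_cached_binary_path:
    <cache_dir>/binaries/<name>-<md5 of url>
    """
    url_hash = "-" + hashlib.md5(url.encode()).hexdigest()
    return os.path.join(cache_dir, "binaries", name + url_hash)


# Placeholder URLs, used only to show that the digest depends on the URL.
dir_v1 = cached_binary_dir("/tmp/cache", "PushBlock", "https://example.com/v1.zip")
dir_v2 = cached_binary_dir("/tmp/cache", "PushBlock", "https://example.com/v2.zip")
```

Hashing the URL means that a registry update pointing the same environment name at a new archive resolves to a different directory, so a stale binary is not picked up by accident.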
+ class UnityMLAgentsEnvironment(Environment):
+     """
+     Wraps Unity ML-Agents environments with the OpenEnv interface.
+
+     This environment supports all Unity ML-Agents registry environments
+     including PushBlock, 3DBall, GridWorld, and more. Environments are
+     automatically downloaded on first use.
+
+     Features:
+     - Dynamic environment switching via reset(env_id="...")
+     - Support for both discrete and continuous action spaces
+     - Optional visual observations (base64-encoded images)
+     - Persistent caching to avoid re-downloading binaries
+     - Headless mode for faster training (no_graphics=True)
+
+     Example:
+         >>> env = UnityMLAgentsEnvironment()
+         >>> obs = env.reset()
+         >>> print(obs.vector_observations)
+         >>>
+         >>> # Take one action
+         >>> obs = env.step(UnityAction(discrete_actions=[1]))  # Move forward
+         >>> print(obs.reward)
+
+     Example with different environment:
+         >>> env = UnityMLAgentsEnvironment(env_id="3DBall")
+         >>> obs = env.reset()
+         >>>
+         >>> # Or switch environment on reset
+         >>> obs = env.reset(env_id="PushBlock")
+     """
+
+     # Concurrent sessions are not supported (Unity environments are not thread-safe)
+     SUPPORTS_CONCURRENT_SESSIONS = False
+
+     def __init__(
+         self,
+         env_id: Optional[str] = None,
+         no_graphics: Optional[bool] = None,
+         time_scale: Optional[float] = None,
+         width: Optional[int] = None,
+         height: Optional[int] = None,
+         quality_level: Optional[int] = None,
+         cache_dir: Optional[str] = None,
+     ):
+         """
+         Initialize the Unity ML-Agents environment.
+
+         Configuration can be provided via constructor arguments or environment
+         variables. Environment variables are used when constructor arguments
+         are not provided (useful for Docker deployment).
+
+         Args:
+             env_id: Identifier of the Unity environment to load.
+                 Available: PushBlock, 3DBall, 3DBallHard, GridWorld, Basic
+                 Env var: UNITY_ENV_ID (default: PushBlock)
+             no_graphics: If True, run in headless mode (faster training).
+                 Env var: UNITY_NO_GRAPHICS (1/true/yes = headless, default: 0 = graphics enabled)
+             time_scale: Simulation speed multiplier.
+                 Env var: UNITY_TIME_SCALE (default: 1.0)
+             width: Window width in pixels (when graphics enabled).
+                 Env var: UNITY_WIDTH (default: 1280)
+             height: Window height in pixels (when graphics enabled).
+                 Env var: UNITY_HEIGHT (default: 720)
+             quality_level: Graphics quality 0-5 (when graphics enabled).
+                 Env var: UNITY_QUALITY_LEVEL (default: 5)
+             cache_dir: Directory to cache downloaded environment binaries.
+                 Env var: UNITY_CACHE_DIR (default: ~/.mlagents-cache)
+         """
+         # Initialize cleanup-critical attributes first (for __del__ safety)
+         self._unity_env = None
+         self._behavior_name = None
+         self._behavior_spec = None
+         self._engine_channel = None
+
+         # Read from environment variables with defaults, allow constructor override
+         self._env_id = env_id or os.environ.get("UNITY_ENV_ID", "PushBlock")
+
+         # Handle no_graphics: default is False (graphics enabled)
+         if no_graphics is not None:
+             self._no_graphics = no_graphics
+         else:
+             env_no_graphics = os.environ.get("UNITY_NO_GRAPHICS", "0")
+             self._no_graphics = env_no_graphics.lower() in ("1", "true", "yes")
+
+         self._time_scale = (
+             time_scale
+             if time_scale is not None
+             else float(os.environ.get("UNITY_TIME_SCALE", "1.0"))
+         )
+         self._width = (
+             width
+             if width is not None
+             else int(os.environ.get("UNITY_WIDTH", "1280"))
+         )
+         self._height = (
+             height
+             if height is not None
+             else int(os.environ.get("UNITY_HEIGHT", "720"))
+         )
+         self._quality_level = (
+             quality_level
+             if quality_level is not None
+             else int(os.environ.get("UNITY_QUALITY_LEVEL", "5"))
+         )
+         self._cache_dir = cache_dir or os.environ.get(
+             "UNITY_CACHE_DIR", PERSISTENT_CACHE_DIR
+         )
+         self._include_visual = False
+
+         # State tracking
+         self._state = UnityState(
+             episode_id=str(uuid4()),
+             step_count=0,
+             env_id=self._env_id,
+         )
+
+         # Ensure cache directory exists
+         os.makedirs(self._cache_dir, exist_ok=True)
+
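The constructor's precedence rule (explicit argument first, then environment variable, then built-in default) can be exercised without Unity. The sketch below is an illustration under stated assumptions: `resolve_no_graphics` and `resolve_int` are hypothetical helpers that do not exist in the package, and they take the environment as a plain mapping so the behaviour can be checked without touching `os.environ`.

```python
import os
from typing import Mapping, Optional

# Same truthy spellings the constructor accepts for UNITY_NO_GRAPHICS.
TRUTHY = ("1", "true", "yes")


def resolve_no_graphics(
    arg: Optional[bool], environ: Mapping[str, str] = os.environ
) -> bool:
    """An explicit argument wins; otherwise parse UNITY_NO_GRAPHICS."""
    if arg is not None:
        return arg
    return environ.get("UNITY_NO_GRAPHICS", "0").lower() in TRUTHY


def resolve_int(
    arg: Optional[int],
    name: str,
    default: str,
    environ: Mapping[str, str] = os.environ,
) -> int:
    """An explicit argument wins (including 0); otherwise read the variable."""
    return arg if arg is not None else int(environ.get(name, default))


headless = resolve_no_graphics(None, {"UNITY_NO_GRAPHICS": "TRUE"})
width = resolve_int(None, "UNITY_WIDTH", "1280", {})
```

Testing against `is not None` rather than truthiness matters here: an explicit `no_graphics=False` or `quality_level=0` should not be silently replaced by the environment variable.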
+     def _load_environment(self, env_id: str) -> None:
+         """Load or switch to a Unity environment."""
+         # Close existing environment if any
+         if self._unity_env is not None:
+             try:
+                 self._unity_env.close()
+             except Exception:
+                 pass
+
+         # Import ML-Agents components
+         try:
+             from mlagents_envs.base_env import ActionTuple
+             from mlagents_envs.registry import default_registry
+             from mlagents_envs.registry.remote_registry_entry import RemoteRegistryEntry
+             from mlagents_envs.side_channel.engine_configuration_channel import (
+                 EngineConfigurationChannel,
+             )
+         except ImportError as e:
+             raise ImportError(
+                 "mlagents-envs is required. Install with: pip install mlagents-envs"
+             ) from e
+
+         # Create engine configuration channel
+         self._engine_channel = EngineConfigurationChannel()
+
+         # Check if environment is in registry
+         if env_id not in default_registry:
+             available = list(default_registry.keys())
+             raise ValueError(
+                 f"Environment '{env_id}' not found. Available: {available}"
+             )
+
+         # Get registry entry and create with persistent cache
+         entry = default_registry[env_id]
+
+         # Create a new entry with our persistent cache directory
+         persistent_entry = RemoteRegistryEntry(
+             identifier=entry.identifier,
+             expected_reward=entry.expected_reward,
+             description=entry.description,
+             linux_url=getattr(entry, "_linux_url", None),
+             darwin_url=getattr(entry, "_darwin_url", None),
+             win_url=getattr(entry, "_win_url", None),
+             additional_args=getattr(entry, "_add_args", []),
+             tmp_dir=self._cache_dir,
+         )
+
+         # Create the environment
+         self._unity_env = persistent_entry.make(
+             no_graphics=self._no_graphics,
+             side_channels=[self._engine_channel],
+         )
+
+         # Configure engine settings
+         if not self._no_graphics:
+             self._engine_channel.set_configuration_parameters(
+                 width=self._width,
+                 height=self._height,
+                 quality_level=self._quality_level,
+                 time_scale=self._time_scale,
+             )
+         else:
+             self._engine_channel.set_configuration_parameters(
+                 time_scale=self._time_scale
+             )
+
+         # Get behavior info
+         if not self._unity_env.behavior_specs:
+             self._unity_env.step()
+
+         self._behavior_name = list(self._unity_env.behavior_specs.keys())[0]
+         self._behavior_spec = self._unity_env.behavior_specs[self._behavior_name]
+
+         # Update state
+         self._env_id = env_id
+         self._state.env_id = env_id
+         self._state.behavior_name = self._behavior_name
+         self._state.action_spec = self._get_action_spec_info()
+         self._state.observation_spec = self._get_observation_spec_info()
+         self._state.available_envs = list(default_registry.keys())
+
+     def _get_action_spec_info(self) -> Dict[str, Any]:
+         """Get information about the action space."""
+         spec = self._behavior_spec.action_spec
+         return {
+             "is_discrete": spec.is_discrete(),
+             "is_continuous": spec.is_continuous(),
+             "discrete_size": spec.discrete_size,
+             "discrete_branches": list(spec.discrete_branches),
+             "continuous_size": spec.continuous_size,
+         }
+
+     def _get_observation_spec_info(self) -> Dict[str, Any]:
+         """Get information about the observation space."""
+         specs = self._behavior_spec.observation_specs
+         obs_info = []
+         for i, spec in enumerate(specs):
+             obs_info.append({
+                 "index": i,
+                 "shape": list(spec.shape),
+                 "dimension_property": str(spec.dimension_property),
+                 "observation_type": str(spec.observation_type),
+             })
+         return {"observations": obs_info, "count": len(specs)}
+
+     def _get_observation(
+         self,
+         decision_steps=None,
+         terminal_steps=None,
+         reward: float = 0.0,
+         done: bool = False,
+     ) -> UnityObservation:
+         """Convert Unity observation to UnityObservation."""
+         vector_obs = []
+         visual_obs = []
+
+         # Determine which steps to use
+         if terminal_steps is not None and len(terminal_steps) > 0:
+             steps = terminal_steps
+             done = True
+             # Get reward from terminal step
+             if len(terminal_steps.agent_id) > 0:
+                 reward = float(terminal_steps[terminal_steps.agent_id[0]].reward)
+         elif decision_steps is not None and len(decision_steps) > 0:
+             steps = decision_steps
+             # Get reward from decision step
+             if len(decision_steps.agent_id) > 0:
+                 reward = float(decision_steps[decision_steps.agent_id[0]].reward)
+         else:
+             # No agents, return empty observation
+             return UnityObservation(
+                 vector_observations=[],
+                 visual_observations=None,
+                 behavior_name=self._behavior_name or "",
+                 done=done,
+                 reward=reward,
+                 action_spec_info=self._state.action_spec,
+                 observation_spec_info=self._state.observation_spec,
+             )
+
+         # Process observations from first agent
+         for obs in steps.obs:
+             if len(obs.shape) == 2:
+                 # Vector observation (agents, features)
+                 vector_obs.extend(obs[0].tolist())
+             elif len(obs.shape) == 4 and self._include_visual:
+                 # Visual observation (agents, height, width, channels)
+                 img_array = (obs[0] * 255).astype(np.uint8)
+                 # Encode as base64 PNG
+                 try:
+                     from PIL import Image
+                     img = Image.fromarray(img_array)
+                     buffer = io.BytesIO()
+                     img.save(buffer, format="PNG")
+                     img_b64 = base64.b64encode(buffer.getvalue()).decode("utf-8")
+                     visual_obs.append(img_b64)
+                 except ImportError:
+                     # PIL not available, skip visual observations
+                     pass
+
+         return UnityObservation(
+             vector_observations=vector_obs,
+             visual_observations=visual_obs if visual_obs else None,
+             behavior_name=self._behavior_name or "",
+             done=done,
+             reward=reward,
+             action_spec_info=self._state.action_spec,
+             observation_spec_info=self._state.observation_spec,
+         )
+
+     def reset(
+         self,
+         env_id: Optional[str] = None,
+         seed: Optional[int] = None,
+         include_visual: bool = False,
+         **kwargs,
+     ) -> UnityObservation:
+         """
+         Reset the environment and return initial observation.
+
+         Args:
+             env_id: Optionally switch to a different Unity environment.
+             seed: Random seed (not fully supported by Unity ML-Agents).
+             include_visual: If True, include visual observations in output.
+             **kwargs: Additional arguments (ignored).
+
+         Returns:
+             UnityObservation with initial state.
+         """
+         self._include_visual = include_visual
+
+         # Load or switch environment if needed
+         target_env = env_id or self._env_id
+         if self._unity_env is None or target_env != self._env_id:
+             self._load_environment(target_env)
+
+         # Reset the environment
+         self._unity_env.reset()
+
+         # Update state
+         self._state = UnityState(
+             episode_id=str(uuid4()),
+             step_count=0,
+             env_id=self._env_id,
+             behavior_name=self._behavior_name,
+             action_spec=self._state.action_spec,
+             observation_spec=self._state.observation_spec,
+             available_envs=self._state.available_envs,
+         )
+
+         # Get initial observation
+         decision_steps, terminal_steps = self._unity_env.get_steps(self._behavior_name)
+
+         return self._get_observation(
+             decision_steps=decision_steps,
+             terminal_steps=terminal_steps,
+             reward=0.0,
+             done=False,
+         )
+
+     def step(self, action: UnityAction) -> UnityObservation:
+         """
+         Execute one step in the environment.
+
+         Args:
+             action: UnityAction with discrete and/or continuous actions.
+
+         Returns:
+             UnityObservation with new state, reward, and done flag.
+         """
+         if self._unity_env is None:
+             raise RuntimeError("Environment not initialized. Call reset() first.")
+
+         from mlagents_envs.base_env import ActionTuple
+
+         # Get current decision steps to know how many agents
+         decision_steps, terminal_steps = self._unity_env.get_steps(self._behavior_name)
+
+         # Check if episode already ended
+         if len(terminal_steps) > 0:
+             return self._get_observation(
+                 decision_steps=decision_steps,
+                 terminal_steps=terminal_steps,
+                 done=True,
+             )
+
+         n_agents = len(decision_steps)
+         if n_agents == 0:
+             # No agents need decisions, just step
+             self._unity_env.step()
+             self._state.step_count += 1
+             decision_steps, terminal_steps = self._unity_env.get_steps(self._behavior_name)
+             return self._get_observation(
+                 decision_steps=decision_steps,
+                 terminal_steps=terminal_steps,
+             )
+
+         # Build action tuple
+         action_tuple = ActionTuple()
+
+         # Handle discrete actions (the same action is repeated for every agent)
+         if action.discrete_actions is not None:
+             discrete = np.array([action.discrete_actions] * n_agents, dtype=np.int32)
+             # Ensure correct shape (n_agents, n_branches)
+             if discrete.ndim == 1:
+                 discrete = discrete.reshape(n_agents, -1)
+             action_tuple.add_discrete(discrete)
+         elif self._behavior_spec.action_spec.discrete_size > 0:
+             # Default to no-op (action 0). Checking discrete_size also covers hybrid
+             # action spaces, for which is_discrete() returns False.
+             n_branches = self._behavior_spec.action_spec.discrete_size
+             discrete = np.zeros((n_agents, n_branches), dtype=np.int32)
+             action_tuple.add_discrete(discrete)
+
+         # Handle continuous actions (the same action is repeated for every agent)
+         if action.continuous_actions is not None:
+             continuous = np.array([action.continuous_actions] * n_agents, dtype=np.float32)
+             if continuous.ndim == 1:
+                 continuous = continuous.reshape(n_agents, -1)
+             action_tuple.add_continuous(continuous)
+         elif self._behavior_spec.action_spec.continuous_size > 0:
+             # Default to zero actions. Checking continuous_size also covers hybrid
+             # action spaces, for which is_continuous() returns False.
+             n_continuous = self._behavior_spec.action_spec.continuous_size
+             continuous = np.zeros((n_agents, n_continuous), dtype=np.float32)
+             action_tuple.add_continuous(continuous)
+
+         # Set actions and step
+         self._unity_env.set_actions(self._behavior_name, action_tuple)
+         self._unity_env.step()
+         self._state.step_count += 1
+
+         # Get new observation
+         decision_steps, terminal_steps = self._unity_env.get_steps(self._behavior_name)
+
+         return self._get_observation(
+             decision_steps=decision_steps,
+             terminal_steps=terminal_steps,
+         )
+
+     async def reset_async(
+         self,
+         env_id: Optional[str] = None,
+         seed: Optional[int] = None,
+         include_visual: bool = False,
+         **kwargs,
+     ) -> UnityObservation:
+         """
+         Async version of reset - runs in a thread to avoid blocking the event loop.
+
+         Unity ML-Agents environments can take 10-60+ seconds to initialize.
+         Running in a thread allows the event loop to continue processing
+         WebSocket keepalive pings during this time.
+         """
+         import asyncio
+
+         return await asyncio.to_thread(
+             self.reset,
+             env_id=env_id,
+             seed=seed,
+             include_visual=include_visual,
+             **kwargs,
+         )
+
+     async def step_async(self, action: UnityAction) -> UnityObservation:
+         """
+         Async version of step - runs in a thread to avoid blocking the event loop.
+
+         Although step() is usually fast, running in a thread ensures
+         the event loop remains responsive.
+         """
+         import asyncio
+
+         return await asyncio.to_thread(self.step, action)
+
+     @property
+     def state(self) -> UnityState:
+         """Get the current environment state."""
+         return self._state
+
+     def close(self) -> None:
+         """Close the Unity environment."""
+         unity_env = getattr(self, "_unity_env", None)
+         if unity_env is not None:
+             try:
+                 unity_env.close()
+             except Exception:
+                 # Best-effort shutdown: errors raised while closing are ignored
+                 pass
+             self._unity_env = None
+
+     def __del__(self):
+         """Cleanup on deletion."""
+         try:
+             self.close()
+         except Exception:
+             pass
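
Reviewer note: the `reset_async` and `step_async` wrappers above hand the blocking call to `asyncio.to_thread` so the event loop can keep servicing other work. The sketch below illustrates that pattern in isolation and is not part of the committed file. `slow_reset` and `heartbeat` are hypothetical stand-ins for a slow Unity launch and for keepalive traffic, and the delays are arbitrary.

```python
import asyncio
import time


def slow_reset(delay: float = 0.2) -> str:
    """Hypothetical stand-in for a blocking reset() call."""
    time.sleep(delay)  # blocks the calling thread, as a slow environment launch would
    return "ready"


async def heartbeat(ticks: list, interval: float = 0.02) -> None:
    """Hypothetical stand-in for keepalive work running on the event loop."""
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(interval)


async def main() -> tuple:
    ticks: list = []
    beat = asyncio.create_task(heartbeat(ticks))
    # The blocking call runs in a worker thread, so heartbeat keeps ticking.
    result = await asyncio.to_thread(slow_reset, 0.2)
    beat.cancel()
    try:
        await beat
    except asyncio.CancelledError:
        pass  # expected: the heartbeat task was cancelled on purpose
    return result, len(ticks)


result, tick_count = asyncio.run(main())
print(result, tick_count > 1)
```

Calling `slow_reset` directly inside `main` would block the loop for the whole delay, and the heartbeat would record no ticks during that time.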