Upload README.md with huggingface_hub
README.md CHANGED
@@ -9,16 +9,38 @@ app_port: 7860
# Tetris OpenEnv

- A Tetris RL environment for LLM agent training, built on
+ A Tetris RL environment for LLM agent training, built on OpenEnv 0.2.1.

LLM agents receive a text-based board representation and must choose spatial actions (left, right, rotate, drop) to play Tetris. Features combo scoring where clearing multiple lines simultaneously gives disproportionately higher rewards.

- ##
+ ## Problem Statement

- -
-
-
-
+ **Wild Card (#5)** - Teaching LLMs spatial reasoning through Tetris. The agent must interpret a 2D text grid and plan piece placements, a fundamentally non-linguistic task solved through language.
+
+ ## Quick Start
+
+ ```python
+ from tetris_env import TetrisEnvClient, TetrisAction
+
+ with TetrisEnvClient(base_url="https://VortexedSquirrel-tetris-env.hf.space") as env:
+     result = env.reset(seed=42)
+     while not result.done:
+         action = TetrisAction(action="drop")
+         result = env.step(action)
+         print(f"Reward: {result.reward}, Score: {result.observation.score}")
+ ```
+
+ ## Actions
+
+ | Action | Description |
+ |---|---|
+ | `left` | Move piece left |
+ | `right` | Move piece right |
+ | `rotate_cw` | Rotate clockwise |
+ | `rotate_ccw` | Rotate counter-clockwise |
+ | `drop` | Hard drop to bottom |
+ | `down` | Soft drop one row |
+ | `noop` | Do nothing |

## Reward Structure

@@ -30,3 +52,8 @@ LLM agents receive a text-based board representation and must choose spatial act
| 4 (Tetris!) | +1500 | x15 |

Penalties: -1/step, -2*height, -5*holes, -500 game over.
+
+ ## Built With
+
+ - [OpenEnv 0.2.1](https://github.com/meta-pytorch/OpenEnv) by Meta PyTorch
+ - Deployed on [Hugging Face Spaces](https://huggingface.co/spaces/VortexedSquirrel/tetris-env)
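The penalty line in the README's Reward Structure section (-1/step, -2*height, -5*holes, -500 game over) can be read as a per-step formula. The sketch below is one possible interpretation and not the environment's actual code: the function names (`column_heights`, `count_holes`, `step_penalty`), the board encoding (a list of equal-length strings with `.` for an empty cell), and the definitions of "height" (tallest column) and "holes" (empty cells with a filled cell somewhere above them) are all assumptions made for illustration.

```python
from typing import List

EMPTY = "."  # assumed marker for an empty cell; the real text format may differ


def column_heights(board: List[str]) -> List[int]:
    """Height of each column, measured from the floor to its topmost filled cell."""
    rows = len(board)
    cols = len(board[0]) if board else 0
    heights = []
    for c in range(cols):
        height = 0
        for r in range(rows):
            if board[r][c] != EMPTY:
                height = rows - r  # first filled cell when scanning from the top
                break
        heights.append(height)
    return heights


def count_holes(board: List[str]) -> int:
    """Count empty cells that have at least one filled cell above them (assumed definition)."""
    holes = 0
    cols = len(board[0]) if board else 0
    for c in range(cols):
        seen_filled = False
        for row in board:  # rows are ordered top to bottom
            if row[c] != EMPTY:
                seen_filled = True
            elif seen_filled:
                holes += 1
    return holes


def step_penalty(board: List[str], game_over: bool = False) -> int:
    """Penalty terms from the README: -1/step, -2*height, -5*holes, -500 game over."""
    height = max(column_heights(board), default=0)  # assumption: tallest column
    penalty = -1 - 2 * height - 5 * count_holes(board)
    if game_over:
        penalty -= 500
    return penalty


# Hypothetical 4x4 board: '#' is a filled cell, '.' is empty.
demo_board = [
    "....",
    "#...",
    ".#..",
    "##.#",
]
print(step_penalty(demo_board))  # tallest column is 3, one hole: -1 - 6 - 5
```

Under these assumptions the penalty grows linearly with stack height and hole count, so an agent that always issues `drop` without repositioning would accumulate negative reward quickly. If the environment uses a different height measure (for example, the sum of column heights), only `step_penalty` would need to change.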