---
title: Moonfish Chess
emoji: ♟️
colorFrom: gray
colorTo: blue
sdk: docker
pinned: false
license: mit
base_path: /web
---
# Chess OpenEnv
A chess environment for reinforcement learning, built on [moonfish](https://github.com/luccab/moonfish) and compatible with the [OpenEnv](https://github.com/meta-pytorch/OpenEnv) framework.
## Features
- **Full Chess Rules**: Legal move generation, checkmate/stalemate detection, draw conditions
- **Position Evaluation**: PeSTO evaluation function from moonfish for reward shaping
- **OpenEnv Compatible**: Standard `reset()`, `step()`, `state()` interface
- **Configurable Rewards**: Win/loss/draw payoffs, illegal move penalties, evaluation-based rewards
- **HTTP API**: FastAPI server for remote training and multi-agent setups
- **Containerized**: Docker support for reproducible deployments
## Quick Start
### Local Usage (No Server)
```python
from moonfish.rl import ChessEnvironment, ChessAction
# Create environment
env = ChessEnvironment()
# Start a new game
obs = env.reset()
print(f"Legal moves: {obs.legal_moves}")
# Make a move
action = ChessAction(move="e2e4")
obs, reward, done = env.step(action)
print(f"FEN: {obs.fen}")
print(f"Reward: {reward}, Done: {done}")
```
### Client-Server Usage
Start the server:
```bash
cd moonfish/rl
python -m uvicorn server.app:app --host 0.0.0.0 --port 8000
```
Connect with the client:
```python
from moonfish.rl import ChessEnvClient, ChessAction
client = ChessEnvClient("http://localhost:8000")
obs = client.reset()
result = client.step(ChessAction(move="e2e4"))
print(f"Reward: {result.reward}")
client.close()
```
## Data Models
### ChessAction
```python
from dataclasses import dataclass

@dataclass
class ChessAction:
    move: str  # UCI format: "e2e4", "e7e8q" (promotion)
```
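Because `move` is a plain string, it can help to check its syntax before sending it to the environment. The following is a minimal sketch, not part of moonfish: the helper names `is_uci_syntax` and `split_uci` are hypothetical, the check is purely syntactic (it says nothing about legality in a given position), and the UCI null move `0000` is deliberately not handled.

```python
import re
from typing import Optional, Tuple

# Source square, target square, optional promotion piece (q, r, b, n).
UCI_MOVE = re.compile(r"[a-h][1-8][a-h][1-8][qrbn]?")


def is_uci_syntax(move: str) -> bool:
    """Return True if `move` looks like a UCI move (syntax only, not legality)."""
    return UCI_MOVE.fullmatch(move) is not None


def split_uci(move: str) -> Tuple[str, str, Optional[str]]:
    """Split a UCI move into (from_square, to_square, promotion_piece)."""
    if not is_uci_syntax(move):
        raise ValueError(f"not a UCI move: {move!r}")
    promotion = move[4] if len(move) == 5 else None
    return move[:2], move[2:4], promotion
```

Legality still has to be checked against `legal_moves` from the current observation; this helper only filters out malformed strings such as SAN input (`"Nf3"`).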
### ChessObservation
```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional

@dataclass
class ChessObservation:
    fen: str                  # Board state in FEN notation
    legal_moves: List[str]    # Available moves in UCI format
    is_check: bool            # Current player in check
    done: bool                # Game over
    reward: Optional[float]   # Terminal reward
    result: Optional[str]     # "1-0", "0-1", "1/2-1/2"
    metadata: Dict[str, Any]  # Evaluation, material, etc.
```
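The `fen` field packs six space-separated fields into one string: piece placement, side to move, castling rights, en passant target square, halfmove clock, and fullmove number. As a rough sketch of how an agent might unpack it without any chess library, the hypothetical helper `parse_fen` below splits those fields; it is an illustration, not a moonfish function, and it does not validate the piece placement itself.

```python
from typing import Any, Dict

START_FEN = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"


def parse_fen(fen: str) -> Dict[str, Any]:
    """Split a FEN string into its six standard fields."""
    fields = fen.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 FEN fields, got {len(fields)}")
    placement, active, castling, en_passant, halfmove, fullmove = fields
    if active not in ("w", "b"):
        raise ValueError(f"unknown side to move: {active!r}")
    return {
        "placement": placement,
        "side_to_move": "white" if active == "w" else "black",
        "castling": castling,
        # "-" means there is no en passant target square.
        "en_passant": None if en_passant == "-" else en_passant,
        "halfmove_clock": int(halfmove),
        "fullmove_number": int(fullmove),
    }
```

For example, `parse_fen(START_FEN)["side_to_move"]` yields `"white"`.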
### ChessState
```python
from dataclasses import dataclass
from typing import List

@dataclass
class ChessState:
    episode_id: str          # Unique game identifier
    step_count: int          # Half-moves played
    current_player: str      # "white" or "black"
    fen: str                 # Current position
    move_history: List[str]  # All moves in UCI format
```
## Reward Configuration
```python
from moonfish.rl import ChessEnvironment, RewardConfig

config = RewardConfig(
    win=1.0,                  # Reward for winning
    loss=-1.0,                # Penalty for losing
    draw=0.0,                 # Reward for draw
    illegal_move=-0.1,        # Penalty for illegal moves
    use_evaluation=True,      # Enable intermediate rewards
    evaluation_scale=0.0001,  # Scale for eval-based rewards
)
env = ChessEnvironment(reward_config=config)
```
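To get a feel for how `use_evaluation` and `evaluation_scale` could interact, here is a sketch of one common shaping scheme: the intermediate reward is the change in evaluation across a move, multiplied by the scale. This is an assumption for illustration only; the README does not state the exact formula moonfish uses, and `ShapingConfig` and `shaped_reward` are hypothetical names that stand in for the real `RewardConfig`.

```python
from dataclasses import dataclass


@dataclass
class ShapingConfig:
    """Hypothetical stand-in for the two evaluation fields of RewardConfig."""
    use_evaluation: bool = True
    evaluation_scale: float = 0.0001


def shaped_reward(eval_before: float, eval_after: float, cfg: ShapingConfig) -> float:
    """Scaled evaluation change for one move (assumed formula, not moonfish's).

    Both evaluations are taken from the mover's perspective, e.g. in centipawns.
    """
    if not cfg.use_evaluation:
        return 0.0
    return cfg.evaluation_scale * (eval_after - eval_before)
```

Under this assumed scheme, a scale of `0.0001` turns a 100-centipawn gain into a reward of roughly `0.01`, which keeps shaping signals small next to the terminal payoffs of `1.0` and `-1.0`.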
## Docker
Build and run:
```bash
docker build -t chess-openenv .
docker run -p 8000:8000 chess-openenv
```
## Integration with RL Frameworks
### With TorchRL
```python
from moonfish.rl import ChessEnvironment, ChessAction

class ChessTorchRLWrapper:
    # _obs_to_tensor and _idx_to_move are not shown here; supply your own
    # observation encoding and action-index mapping.
    def __init__(self):
        self.env = ChessEnvironment()

    def reset(self):
        obs = self.env.reset()
        return self._obs_to_tensor(obs)

    def step(self, action_idx):
        move = self._idx_to_move(action_idx)
        obs, reward, done = self.env.step(ChessAction(move=move))
        return self._obs_to_tensor(obs), reward, done
```
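The wrapper above leaves `_idx_to_move` undefined. One possible mapping, sketched below, encodes each move as `from_square * 64 + to_square`, giving 4096 discrete actions. This is an illustrative choice rather than anything moonfish prescribes: the function names are hypothetical, and promotions are collapsed to the queen promotion, so underpromotions are not reachable with this encoding.

```python
from typing import List, Optional

FILES = "abcdefgh"


def square_index(square: str) -> int:
    """Map a square name to 0..63, with a1 = 0, b1 = 1, ..., h8 = 63."""
    return (int(square[1]) - 1) * 8 + FILES.index(square[0])


def square_name(index: int) -> str:
    """Inverse of square_index."""
    return FILES[index % 8] + str(index // 8 + 1)


def move_to_idx(move: str) -> int:
    """Encode a UCI move as from * 64 + to, ignoring any promotion suffix."""
    return square_index(move[:2]) * 64 + square_index(move[2:4])


def idx_to_move(action_idx: int, legal_moves: List[str]) -> Optional[str]:
    """Decode an action index to a legal UCI move, or None if it is illegal."""
    base = square_name(action_idx // 64) + square_name(action_idx % 64)
    # Try the plain move first, then fall back to the queen promotion.
    for candidate in (base, base + "q"):
        if candidate in legal_moves:
            return candidate
    return None
```

Returning `None` for illegal indices lets the caller decide whether to mask the action, resample, or pass an illegal move through and take the `illegal_move` penalty.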
### With OpenEnv Training Loop
```python
from moonfish.rl import make_env, ChessAction
import random
client = make_env("http://localhost:8000")

for episode in range(100):
    obs = client.reset()
    episode_reward = 0.0
    while not obs.done:
        # Your policy here (random for demo)
        move = random.choice(obs.legal_moves)
        result = client.step(ChessAction(move=move))
        obs = result.observation
        # Treat a missing reward as 0.0 (assumes reward may be None, as in ChessObservation)
        episode_reward += result.reward or 0.0
    print(f"Episode {episode}: reward={episode_reward}")

client.close()
```
## API Endpoints
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/health` | GET | Health check |
| `/metadata` | GET | Environment configuration |
| `/reset` | POST | Start new episode |
| `/step` | POST | Execute a move |
| `/state` | GET | Get episode metadata |
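
For clients that cannot import `moonfish.rl`, the endpoints can in principle be called over plain HTTP. The sketch below uses only the Python standard library to build and send requests. The JSON body shape for `/step` (`{"action": {"move": ...}}`) and the empty body for `/reset` are assumptions, not documented here; check the server's schema before relying on them. `build_request` and `call` are hypothetical helper names.

```python
import json
from typing import Any, Dict, Optional
from urllib.request import Request, urlopen


def build_request(
    base_url: str,
    path: str,
    method: str,
    payload: Optional[Dict[str, Any]] = None,
) -> Request:
    """Build a GET or JSON POST request for one of the endpoints above."""
    url = base_url.rstrip("/") + path
    if method == "GET":
        return Request(url, method="GET")
    # POST endpoints get a JSON body; an empty object if no payload is given.
    body = json.dumps(payload or {}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return Request(url, data=body, headers=headers, method="POST")


def call(request: Request, timeout: float = 5.0) -> Any:
    """Send the request and decode the JSON response (needs a running server)."""
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    base = "http://localhost:8000"
    print(call(build_request(base, "/health", "GET")))
    print(call(build_request(base, "/reset", "POST")))
    # Assumed payload shape for a move; adjust to the server's actual schema.
    print(call(build_request(base, "/step", "POST", {"action": {"move": "e2e4"}})))
```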
## License
MIT - See the moonfish repository for full license details.