---
title: Moonfish Chess
emoji: ♟️
colorFrom: gray
colorTo: blue
sdk: docker
pinned: false
license: mit
base_path: /web
---
# Chess OpenEnv
A chess environment for reinforcement learning, built on [moonfish](https://github.com/luccab/moonfish) and compatible with the [OpenEnv](https://github.com/meta-pytorch/OpenEnv) framework.
## Features
- **Full Chess Rules**: Legal move generation, checkmate/stalemate detection, draw conditions
- **Position Evaluation**: PeSTO evaluation function from moonfish for reward shaping
- **OpenEnv Compatible**: Standard `reset()`, `step()`, `state()` interface
- **Configurable Rewards**: Win/loss/draw payoffs, illegal move penalties, evaluation-based rewards
- **HTTP API**: FastAPI server for remote training and multi-agent setups
- **Containerized**: Docker support for reproducible deployments
## Quick Start
### Local Usage (No Server)
```python
from moonfish.rl import ChessEnvironment, ChessAction
# Create environment
env = ChessEnvironment()
# Start a new game
obs = env.reset()
print(f"Legal moves: {obs.legal_moves}")
# Make a move
action = ChessAction(move="e2e4")
obs, reward, done = env.step(action)
print(f"FEN: {obs.fen}")
print(f"Reward: {reward}, Done: {done}")
```
### Client-Server Usage
Start the server:
```bash
cd moonfish/rl
python -m uvicorn server.app:app --host 0.0.0.0 --port 8000
```
Connect with the client:
```python
from moonfish.rl import ChessEnvClient, ChessAction
client = ChessEnvClient("http://localhost:8000")
obs = client.reset()
result = client.step(ChessAction(move="e2e4"))
print(f"Reward: {result.reward}")
client.close()
```
## Data Models
### ChessAction
```python
from dataclasses import dataclass

@dataclass
class ChessAction:
    move: str  # UCI format: "e2e4", "e7e8q" (promotion)
```
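Because `move` is a plain string, it can help to reject malformed input before it reaches the environment. The following is a minimal sketch of a syntax check for UCI move strings; `looks_like_uci` is a hypothetical helper and not part of `moonfish.rl`. It checks shape only, not legality, and does not handle the UCI null move `0000`.

```python
import re

# Shape of a UCI move: from-square, to-square, optional promotion piece.
UCI_MOVE_RE = re.compile(r"[a-h][1-8][a-h][1-8][qrbn]?")

def looks_like_uci(move: str) -> bool:
    """Return True if `move` has the shape of a UCI move string.

    Syntax only: this does not know whether the move is legal
    in any particular position.
    """
    return UCI_MOVE_RE.fullmatch(move) is not None

print(looks_like_uci("e2e4"))   # True
print(looks_like_uci("e7e8q"))  # True
print(looks_like_uci("Nf3"))    # False (SAN, not UCI)
```

For legality, compare the move against `obs.legal_moves` from the latest observation.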
### ChessObservation
```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional

@dataclass
class ChessObservation:
    fen: str                  # Board state in FEN notation
    legal_moves: List[str]    # Available moves in UCI format
    is_check: bool            # Current player in check
    done: bool                # Game over
    reward: Optional[float]   # Terminal reward
    result: Optional[str]     # "1-0", "0-1", "1/2-1/2"
    metadata: Dict[str, Any]  # Evaluation, material, etc.
```
### ChessState
```python
from dataclasses import dataclass
from typing import List

@dataclass
class ChessState:
    episode_id: str          # Unique game identifier
    step_count: int          # Half-moves played
    current_player: str      # "white" or "black"
    fen: str                 # Current position
    move_history: List[str]  # All moves in UCI format
```
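The `current_player` and `fen` fields overlap: the second field of a FEN string records the side to move (`w` or `b`). As an illustration, the sketch below recovers the player from a FEN string alone. `current_player_from_fen` is a hypothetical helper; whether `ChessState` derives `current_player` this way is not stated here.

```python
def current_player_from_fen(fen: str) -> str:
    """Return "white" or "black" from the side-to-move field of a FEN string."""
    fields = fen.split()
    if len(fields) < 2 or fields[1] not in ("w", "b"):
        raise ValueError(f"missing or invalid side-to-move field: {fen!r}")
    return "white" if fields[1] == "w" else "black"

START_FEN = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
print(current_player_from_fen(START_FEN))  # white
```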
## Reward Configuration
```python
from moonfish.rl import ChessEnvironment, RewardConfig

config = RewardConfig(
    win=1.0,                   # Reward for winning
    loss=-1.0,                 # Penalty for losing
    draw=0.0,                  # Reward for draw
    illegal_move=-0.1,         # Penalty for illegal moves
    use_evaluation=True,       # Enable intermediate rewards
    evaluation_scale=0.0001,   # Scale for eval-based rewards
)

env = ChessEnvironment(reward_config=config)
```
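This README does not spell out how `use_evaluation` and `evaluation_scale` combine into an intermediate reward. One common form of evaluation-based shaping is to scale the change in evaluation between consecutive positions. The sketch below shows that idea only. `shaped_reward` is a hypothetical function, and the assumptions that evaluations are in centipawns and taken from the mover's perspective are mine, not confirmed by the moonfish source.

```python
def shaped_reward(eval_before: float, eval_after: float,
                  evaluation_scale: float = 0.0001) -> float:
    """Hypothetical intermediate reward: scaled change in evaluation.

    ASSUMPTION: both evaluations are in centipawns and are measured from
    the perspective of the player who just moved.
    """
    return evaluation_scale * (eval_after - eval_before)

# Gaining roughly one pawn (100 centipawns) at the scale used in the config
print(shaped_reward(0.0, 100.0))
```

Under these assumptions, winning a pawn is worth about 0.01, two orders of magnitude below the terminal `win` and `loss` payoffs, so shaping nudges the agent without swamping the game result.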
## Docker
Build and run:
```bash
docker build -t chess-openenv .
docker run -p 8000:8000 chess-openenv
```
## Integration with RL Frameworks
### With TorchRL
```python
from moonfish.rl import ChessEnvironment, ChessAction

class ChessTorchRLWrapper:
    # _obs_to_tensor and _idx_to_move are not defined in this snippet;
    # supply your own observation and action encoders.
    def __init__(self):
        self.env = ChessEnvironment()

    def reset(self):
        obs = self.env.reset()
        return self._obs_to_tensor(obs)

    def step(self, action_idx):
        move = self._idx_to_move(action_idx)
        obs, reward, done = self.env.step(ChessAction(move=move))
        return self._obs_to_tensor(obs), reward, done
```
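The wrapper above calls `_obs_to_tensor` and `_idx_to_move` without defining them. As a starting point, here is a minimal pure-Python sketch of both: a 12-plane one-hot board encoding built from the FEN string, and an index-to-move mapping over the legal-move list. `fen_to_planes` and `idx_to_move` are hypothetical names, and the encoding is one simple choice among many, not the one moonfish prescribes.

```python
from typing import List

PIECES = "PNBRQKpnbrqk"  # plane order: white pieces, then black pieces

def fen_to_planes(fen: str) -> List[List[List[float]]]:
    """Encode the piece-placement field of a FEN as 12 one-hot 8x8 planes."""
    planes = [[[0.0] * 8 for _ in range(8)] for _ in PIECES]
    placement = fen.split()[0]
    for row, rank in enumerate(placement.split("/")):  # row 0 is rank 8
        col = 0
        for ch in rank:
            if ch.isdigit():
                col += int(ch)  # a digit means that many empty squares
            else:
                planes[PIECES.index(ch)][row][col] = 1.0
                col += 1
    return planes

def idx_to_move(action_idx: int, legal_moves: List[str]) -> str:
    """Map an integer action to a move by indexing the sorted legal-move list."""
    return sorted(legal_moves)[action_idx % len(legal_moves)]

START_FEN = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
planes = fen_to_planes(START_FEN)
print(sum(cell for plane in planes for row in plane for cell in row))  # 32.0
print(idx_to_move(0, ["e2e4", "d2d4", "g1f3"]))  # d2d4
```

Inside the wrapper, `_obs_to_tensor` could return `torch.tensor(fen_to_planes(obs.fen))`, and `_idx_to_move` could call `idx_to_move` with the latest `obs.legal_moves`. The modulo wrap-around keeps the sketch short; a fixed action space with a legality mask is usually a better fit for policy networks.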
### With OpenEnv Training Loop
```python
from moonfish.rl import make_env, ChessAction
import random

client = make_env("http://localhost:8000")

for episode in range(100):
    obs = client.reset()
    episode_reward = 0.0
    while not obs.done:
        # Your policy here (random for demo)
        move = random.choice(obs.legal_moves)
        result = client.step(ChessAction(move=move))
        obs = result.observation
        # Guard in case the reward is None on a step
        episode_reward += result.reward or 0.0
    print(f"Episode {episode}: reward={episode_reward}")

client.close()
```
## API Endpoints
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/health` | GET | Health check |
| `/metadata` | GET | Environment configuration |
| `/reset` | POST | Start new episode |
| `/step` | POST | Execute a move |
| `/state` | GET | Get episode metadata |
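
The table lists routes and methods but not request bodies. The sketch below builds requests for three of the endpoints with the Python standard library, without sending them. The base URL matches the server command in Quick Start. The JSON body for `/step` is an assumption, not taken from the server code, so check the server's actual schema (FastAPI apps typically expose it at `/docs`) before relying on it.

```python
import json
import urllib.request
from typing import Any, Dict, Optional

BASE_URL = "http://localhost:8000"  # matches the server command in this README

def build_request(path: str, method: str = "GET",
                  payload: Optional[Dict[str, Any]] = None) -> urllib.request.Request:
    """Build (but do not send) a request for one of the endpoints in the table."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(BASE_URL + path, data=data,
                                  headers=headers, method=method)

health = build_request("/health")
reset = build_request("/reset", method="POST", payload={})
# ASSUMPTION: the body shape for /step is a guess, not the documented schema.
step = build_request("/step", method="POST", payload={"move": "e2e4"})

for req in (health, reset, step):
    print(req.get_method(), req.full_url)

# To send one (requires a running server):
# with urllib.request.urlopen(health) as resp:
#     print(resp.status, resp.read().decode())
```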
## License
MIT - See the moonfish repository for full license details.