---
title: Moonfish Chess
emoji: ♟️
colorFrom: gray
colorTo: blue
sdk: docker
pinned: false
license: mit
base_path: /web
---
# Chess OpenEnv
A chess environment for reinforcement learning, built on moonfish and compatible with the OpenEnv framework.
## Features
- Full Chess Rules: Legal move generation, checkmate/stalemate detection, draw conditions
- Position Evaluation: PeSTO evaluation function from moonfish for reward shaping
- OpenEnv Compatible: Standard `reset()`, `step()`, `state()` interface
- Configurable Rewards: Win/loss/draw payoffs, illegal move penalties, evaluation-based rewards
- HTTP API: FastAPI server for remote training and multi-agent setups
- Containerized: Docker support for reproducible deployments
## Quick Start

### Local Usage (No Server)
```python
from moonfish.rl import ChessEnvironment, ChessAction

# Create environment
env = ChessEnvironment()

# Start a new game
obs = env.reset()
print(f"Legal moves: {obs.legal_moves}")

# Make a move
action = ChessAction(move="e2e4")
obs, reward, done = env.step(action)
print(f"FEN: {obs.fen}")
print(f"Reward: {reward}, Done: {done}")
```
### Client-Server Usage

Start the server:
```shell
cd moonfish/rl
python -m uvicorn server.app:app --host 0.0.0.0 --port 8000
```
Connect with the client:
```python
from moonfish.rl import ChessEnvClient, ChessAction

client = ChessEnvClient("http://localhost:8000")
obs = client.reset()
result = client.step(ChessAction(move="e2e4"))
print(f"Reward: {result.reward}")
client.close()
```
## Data Models

### ChessAction
```python
from dataclasses import dataclass

@dataclass
class ChessAction:
    move: str  # UCI format: "e2e4", "e7e8q" (promotion)
```
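A UCI move string packs the source square, destination square, and an optional promotion piece into four or five characters. A minimal parser (illustrative only; `parse_uci` is not part of the moonfish API) shows the layout:

```python
from typing import Optional, Tuple

def parse_uci(move: str) -> Tuple[str, str, Optional[str]]:
    """Split a UCI move string into (from_square, to_square, promotion).

    "e2e4"  -> ("e2", "e4", None)
    "e7e8q" -> ("e7", "e8", "q")
    """
    src, dst = move[:2], move[2:4]
    promo = move[4] if len(move) == 5 else None
    return src, dst, promo
```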
### ChessObservation
```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional

@dataclass
class ChessObservation:
    fen: str                  # Board state in FEN notation
    legal_moves: List[str]    # Available moves in UCI format
    is_check: bool            # Current player in check
    done: bool                # Game over
    reward: Optional[float]   # Terminal reward
    result: Optional[str]     # "1-0", "0-1", "1/2-1/2"
    metadata: Dict[str, Any]  # Evaluation, material, etc.
```
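The `fen` field follows standard Forsyth-Edwards Notation: six space-separated fields. Splitting it requires no chess library; this helper (a sketch, not part of moonfish) names each field:

```python
def fen_fields(fen: str) -> dict:
    """Split a FEN string into its six standard fields."""
    placement, side, castling, en_passant, halfmove, fullmove = fen.split()
    return {
        "placement": placement,           # piece placement, rank 8 down to rank 1
        "side_to_move": side,             # "w" or "b"
        "castling": castling,             # e.g. "KQkq", or "-" if none
        "en_passant": en_passant,         # target square, or "-"
        "halfmove_clock": int(halfmove),  # plies since last capture/pawn move
        "fullmove_number": int(fullmove), # increments after black moves
    }
```

For example, the starting position `"rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"` yields `side_to_move == "w"` and `fullmove_number == 1`.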
### ChessState
```python
from dataclasses import dataclass
from typing import List

@dataclass
class ChessState:
    episode_id: str          # Unique game identifier
    step_count: int          # Half-moves played
    current_player: str      # "white" or "black"
    fen: str                 # Current position
    move_history: List[str]  # All moves in UCI format
```
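Since `step_count` counts half-moves from the start of the game, `current_player` is derivable from its parity (assuming every episode begins from the standard starting position, where white moves first):

```python
def player_to_move(step_count: int) -> str:
    # White moves on even half-move counts (0, 2, 4, ...), black on odd.
    return "white" if step_count % 2 == 0 else "black"
```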
## Reward Configuration
```python
from moonfish.rl import ChessEnvironment, RewardConfig

config = RewardConfig(
    win=1.0,                  # Reward for winning
    loss=-1.0,                # Penalty for losing
    draw=0.0,                 # Reward for draw
    illegal_move=-0.1,        # Penalty for illegal moves
    use_evaluation=True,      # Enable intermediate rewards
    evaluation_scale=0.0001,  # Scale for eval-based rewards
)

env = ChessEnvironment(reward_config=config)
```
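With `use_evaluation=True`, intermediate rewards are driven by the PeSTO evaluation scaled by `evaluation_scale`. The exact formula lives in moonfish; as an illustrative sketch of the scaling (not the actual implementation), rewarding the change in centipawn evaluation would make a 100-centipawn swing worth 0.01 at the default scale:

```python
def shaped_reward(eval_before: int, eval_after: int, scale: float = 0.0001) -> float:
    """Hypothetical evaluation-based shaping: scale the change in
    centipawn evaluation between consecutive positions."""
    return scale * (eval_after - eval_before)
```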
## Docker

Build and run:
```shell
docker build -t chess-openenv .
docker run -p 8000:8000 chess-openenv
```
## Integration with RL Frameworks

### With TorchRL
```python
from moonfish.rl import ChessEnvironment, ChessAction

class ChessTorchRLWrapper:
    """Minimal wrapper sketch: _obs_to_tensor (observation encoding)
    and _idx_to_move (action decoding) are left to the user."""

    def __init__(self):
        self.env = ChessEnvironment()

    def reset(self):
        obs = self.env.reset()
        return self._obs_to_tensor(obs)

    def step(self, action_idx):
        move = self._idx_to_move(action_idx)
        obs, reward, done = self.env.step(ChessAction(move=move))
        return self._obs_to_tensor(obs), reward, done
```
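One simple way to realize the `_idx_to_move` helper above is to index into the current observation's legal-move list, wrapping out-of-range indices so every policy output maps to a legal move (a hypothetical scheme, not part of moonfish):

```python
from typing import List

def idx_to_move(legal_moves: List[str], action_idx: int) -> str:
    """Map a policy's action index onto the current legal-move list.
    Wrapping with modulo guarantees the chosen move is always legal."""
    return legal_moves[action_idx % len(legal_moves)]
```

Masking illegal actions in the policy's output distribution is the usual alternative when the action space is a fixed move encoding rather than a per-position list.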
### With OpenEnv Training Loop
```python
import random

from moonfish.rl import make_env, ChessAction

client = make_env("http://localhost:8000")

for episode in range(100):
    obs = client.reset()
    episode_reward = 0
    while not obs.done:
        # Your policy here (random for demo)
        move = random.choice(obs.legal_moves)
        result = client.step(ChessAction(move=move))
        obs = result.observation
        episode_reward += result.reward
    print(f"Episode {episode}: reward={episode_reward}")

client.close()
```
## API Endpoints

| Endpoint | Method | Description |
|---|---|---|
| `/health` | GET | Health check |
| `/metadata` | GET | Environment configuration |
| `/reset` | POST | Start new episode |
| `/step` | POST | Execute a move |
| `/state` | GET | Get episode metadata |
## License
MIT - See the moonfish repository for full license details.