---
library_name: transformers
pipeline_tag: text-generation
tags:
- chess
- llm.c
- chain-of-thought
- strategic-reasoning
license: mit
language:
- en
datasets:
- lfsm/rook-40m
metrics:
- accuracy
widget:
- text: 'P: 1r6/p5k1/5p2/PPp3p1/2r3P1/4B3/4KP2/R7 w - - 1 33 '
- text: 'P: 1k6/1pr1p3/3p2q1/pR1P4/P1pb4/5Q1P/8/1R3K2 w - - 7 46 '
- text: 'P: 2k4r/pb3R2/3Bpn2/1pp4P/8/1P6/4BPP1/3K4 b - - 0 33 '
---
# ROOK-LM-124M

A 124M-parameter language model for chess with chain-of-thought reasoning, trained on synthetic explanations from Stockfish 16.1.
## Model Details

### Model Description

ROOK-LM generates chess moves with detailed reasoning traces, incorporating position analysis, candidate evaluation, and move selection in a chain-of-thought format.

- **Developed by:** Jonathan Rahn, Jenia Jitsev (LAION/JSC), Qi Sun (Tokyo Tech/Sakana AI)
- **Model type:** GPT-2 (autoregressive language model)
- **Language(s):** Chess notation with natural-language explanations
- **License:** MIT
- **Repository:** GitHub
- **Paper:** [LAION Research Note](https://laion.ai/notes/rook/)
- **Logs:** Weights & Biases
## Model Architecture

- **Parameters:** 124M
- **Architecture:** GPT-2 family
- **Context length:** up to 2048 tokens
- **Training framework:** llm.c (training); the Hugging Face scripts in this repo support experiments
## Uses

### Direct Use

- Chess move generation with explanations
- Chess position analysis
- Educational chess tutoring
- Research on reasoning in language models
### Downstream Use

- Fine-tuning for specific chess styles
- Integration with chess interfaces
- Building chess teaching assistants
## Training Details

### Training Data

- **Dataset:** [lfsm/rook-40m](https://huggingface.co/datasets/lfsm/rook-40m)
- **Size:** 40M positions (6B tokens)
- **Generation:** Stockfish 16.1 on the Tsubame 4.0 supercomputer
- **Format:** FEN position → reasoning → move
### Chain-of-Thought Format

ROOK-LM uses a structured format with the position, candidate moves, evaluations, and the best move:

```text
<FEN position>
M: <candidate moves in UCI notation>
E: <evaluation scores for each candidate>
B: <best move in UCI notation>
```
**Concrete training example:**

```text
rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1
M: e2e4 d2d4 g1f3 c2c4 g2g3
E: 0.3 0.3 0.2 0.1 0.0
B: e2e4
```

Breakdown:

- Position in FEN notation (padded to 90 characters for consistency)
- `M:` — top 5 candidate moves from Stockfish analysis (UCI format, padded to 30 characters)
- `E:` — evaluation scores for each candidate move (centipawns/100, padded to 40 characters)
- `B:` — best move selected by Stockfish
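Based on the padding widths above, a training line could plausibly be assembled as follows. This is an illustrative sketch only: the function name and the exact field layout are assumptions, not the project's actual preprocessing code.

```python
def format_training_line(fen: str, moves: list[str], evals: list[float], best: str) -> str:
    """Assemble one ROOK-style training line from Stockfish analysis.

    Field widths follow the card: the FEN is padded to 90 characters, the
    candidate-move field to 30, and the evaluation field to 40. Hypothetical
    helper for illustration, not the released preprocessing code.
    """
    position = fen.ljust(90)
    candidates = ("M: " + " ".join(moves)).ljust(30)
    scores = ("E: " + " ".join(f"{e:.1f}" for e in evals)).ljust(40)
    return f"{position}{candidates}{scores}B: {best}"


line = format_training_line(
    "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1",
    ["e2e4", "d2d4", "g1f3", "c2c4", "g2g3"],
    [0.3, 0.3, 0.2, 0.1, 0.0],
    "e2e4",
)
```

Fixed-width fields keep every example the same shape, which simplifies batching during training.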
**Generation example (inference):**

```python
# Input prompt: the FEN of the current position
prompt = "r1bqkbnr/pppp1ppp/2n5/4p3/4P3/5N2/PPPP1PPP/RNBQKB1R w KQkq - 2 3"

# Model generates the continuation (padding stripped)
output = "M: d2d4 b1c3 f1c4 f1b5 d2d3 E: 0.6 0.5 0.4 0.3 0.2 B: d2d4"
```

The model learns to:

- Analyze the position
- Generate plausible candidate moves
- Evaluate each candidate
- Select the best move based on the evaluations
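To consume such a completion programmatically, the generated string can be split on its field markers. A minimal parser sketch (the helper name is an assumption, not part of the released code):

```python
import re


def parse_completion(text: str) -> dict:
    """Split a ROOK-LM completion into candidate moves, evaluations, and best move.

    Expects the "M: ... E: ... B: ..." layout shown above. Hypothetical helper
    for illustration only.
    """
    match = re.search(r"M:\s*(.*?)\s*E:\s*(.*?)\s*B:\s*(\S+)", text)
    if match is None:
        raise ValueError("completion does not match the M/E/B format")
    return {
        "moves": match.group(1).split(),
        "evals": [float(x) for x in match.group(2).split()],
        "best": match.group(3),
    }


result = parse_completion("M: d2d4 b1c3 f1c4 f1b5 d2d3 E: 0.6 0.5 0.4 0.3 0.2 B: d2d4")
```

The `best` field is what a chess interface would actually play; the candidates and evaluations are useful for displaying the model's reasoning.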
### Training Procedure

- **Hardware:** 2x NVIDIA RTX 4090
- **Framework:** llm.c (karpathy/llm.c)
- **Schedule:** multiple epochs on rook-40m; sequence length up to 2048 tokens
## Evaluation

### Performance Metrics

- **Action accuracy** (rook-40m, 3 epochs): 22.2%
- **BIG-bench Checkmate-in-One:** 24.4%

Both figures are taken from the LAION research note.
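Action accuracy here is the fraction of positions where the model's generated `B:` move matches the reference (Stockfish) move. A minimal scoring sketch, with an assumed function name:

```python
def action_accuracy(predicted: list[str], reference: list[str]) -> float:
    """Fraction of positions where the predicted move equals the reference move.

    Illustrative metric sketch; not the project's evaluation harness.
    """
    if len(predicted) != len(reference):
        raise ValueError("prediction and reference lists must align")
    hits = sum(p == r for p, r in zip(predicted, reference))
    return hits / len(reference)


score = action_accuracy(["e2e4", "d2d4", "g1f3"], ["e2e4", "g1f3", "g1f3"])  # 2 of 3 match
```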
### Reasoning Quality

The model generates coherent chess analysis, including:
- Position evaluation
- Tactical motif identification
- Strategic planning
- Move justification
## Technical Details

### Tokenization

A custom chess tokenizer combines:

- FEN notation tokens
- UCI move notation
- Natural-language vocabulary
- Special tokens for structure
### Integration with llm.c

The model uses the llm.c framework for efficient training:

```shell
# Training command
./train_gpt2 \
  --input_bin data/rook_train.bin \
  --val_bin data/rook_val.bin \
  --model_file log/model.bin \
  --batch_size 512 \
  --sequence_length 2048
```
## Limitations

- **Computation:** no deep search capabilities
- **Tactics:** may miss complex combinations
- **Consistency:** reasoning may not always align with the chosen move
- **Context:** limited by the 2048-token context window
## Related Models

- ROOK-CLF-9M: classification approach
- RookWorld-LM-124M: unified agent + environment model
## Citation

```bibtex
@article{rook2024,
  title={ROOK: Strategic Reasoning in Chess Without Search},
  author={Rahn, Jonathan and Jitsev, Jenia and Sun, Qi},
  journal={LAION Research Notes},
  year={2024},
  url={https://laion.ai/notes/rook/}
}
```
## Model Card Contact

Jonathan Rahn - GitHub | Research Page

## Metrics Source

LAION research note: https://laion.ai/notes/rook/