---
library_name: transformers
pipeline_tag: text-generation
tags:
- chess
- llm.c
- chain-of-thought
- strategic-reasoning
license: mit
language:
- en
datasets:
- lfsm/rook-40m
metrics:
- accuracy
widget:
- text: 'P: 1r6/p5k1/5p2/PPp3p1/2r3P1/4B3/4KP2/R7 w - - 1 33 '
- text: 'P: 1k6/1pr1p3/3p2q1/pR1P4/P1pb4/5Q1P/8/1R3K2 w - - 7 46 '
- text: 'P: 2k4r/pb3R2/3Bpn2/1pp4P/8/1P6/4BPP1/3K4 b - - 0 33 '
---
# ROOK-LM-124M

A 124M-parameter language model for chess with chain-of-thought reasoning, trained on synthetic explanations from Stockfish 16.1.
## Model Details

### Model Description

ROOK-LM generates chess moves with detailed reasoning traces, incorporating position analysis, candidate evaluation, and move selection in a chain-of-thought format.

- **Developed by:** Jonathan Rahn, Jenia Jitsev (LAION/JSC), Qi Sun (Tokyo Tech/Sakana AI)
- **Model type:** GPT-2 (autoregressive language model)
- **Language(s):** Chess notation with natural-language explanations
- **License:** MIT
- **Repository:** GitHub
- **Paper:** [LAION Research Note](https://laion.ai/notes/rook/)
- **Logs:** Weights & Biases
## Model Architecture

- **Parameters:** 124M
- **Architecture:** GPT-2 family
- **Context length:** up to 2048 tokens
- **Training framework:** llm.c (training); the Hugging Face scripts in this repo support experiments
## Uses

### Direct Use

- Chess move generation with explanations
- Chess position analysis
- Educational chess tutoring
- Research on reasoning in language models
### Downstream Use

- Fine-tuning for specific chess styles
- Integration with chess interfaces
- Building chess teaching assistants
## Training Details

### Training Data

- **Dataset:** [lfsm/rook-40m](https://huggingface.co/datasets/lfsm/rook-40m)
- **Size:** 40M positions (6B tokens)
- **Generation:** Stockfish 16.1 on the Tsubame 4.0 supercomputer
- **Format:** FEN position → reasoning → move
### Chain-of-Thought Format

ROOK-LM uses a structured format with the position, candidate moves, evaluations, and the best move:

```text
<FEN position>
M: <candidate moves in UCI notation>
E: <evaluation scores for each candidate>
B: <best move in UCI notation>
```
**Concrete training example:**

```text
rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1
M: e2e4 d2d4 g1f3 c2c4 g2g3
E: 0.3 0.3 0.2 0.1 0.0
B: e2e4
```

Breakdown:

- Position in FEN notation (padded to 90 characters for consistency)
- `M:` — top 5 candidate moves from Stockfish analysis (UCI format, padded to 30 characters)
- `E:` — evaluation scores for each candidate move (centipawns/100, padded to 40 characters)
- `B:` — best move selected by Stockfish
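Based on the padding widths above, a training line could plausibly be assembled as follows. This is an illustrative sketch only: the function name and the exact field layout are assumptions, not the project's actual preprocessing code.

```python
def format_training_line(fen: str, moves: list[str], evals: list[float], best: str) -> str:
    """Assemble one ROOK-style training line from Stockfish analysis.

    Field widths follow the card: the FEN is padded to 90 characters, the
    candidate-move field to 30, and the evaluation field to 40. Hypothetical
    helper for illustration, not the released preprocessing code.
    """
    position = fen.ljust(90)
    candidates = ("M: " + " ".join(moves)).ljust(30)
    scores = ("E: " + " ".join(f"{e:.1f}" for e in evals)).ljust(40)
    return f"{position}{candidates}{scores}B: {best}"


line = format_training_line(
    "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1",
    ["e2e4", "d2d4", "g1f3", "c2c4", "g2g3"],
    [0.3, 0.3, 0.2, 0.1, 0.0],
    "e2e4",
)
```

Fixed-width fields keep every example the same shape, which simplifies batching during training.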
**Generation example (inference):**

```python
# Input prompt: the FEN of the current position
prompt = "r1bqkbnr/pppp1ppp/2n5/4p3/4P3/5N2/PPPP1PPP/RNBQKB1R w KQkq - 2 3"

# Model generates the continuation (padding stripped)
output = "M: d2d4 b1c3 f1c4 f1b5 d2d3 E: 0.6 0.5 0.4 0.3 0.2 B: d2d4"
```

The model learns to:

- Analyze the position
- Generate plausible candidate moves
- Evaluate each candidate
- Select the best move based on the evaluations
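To consume such a completion programmatically, the generated string can be split on its field markers. A minimal parser sketch (the helper name is an assumption, not part of the released code):

```python
import re


def parse_completion(text: str) -> dict:
    """Split a ROOK-LM completion into candidate moves, evaluations, and best move.

    Expects the "M: ... E: ... B: ..." layout shown above. Hypothetical helper
    for illustration only.
    """
    match = re.search(r"M:\s*(.*?)\s*E:\s*(.*?)\s*B:\s*(\S+)", text)
    if match is None:
        raise ValueError("completion does not match the M/E/B format")
    return {
        "moves": match.group(1).split(),
        "evals": [float(x) for x in match.group(2).split()],
        "best": match.group(3),
    }


result = parse_completion("M: d2d4 b1c3 f1c4 f1b5 d2d3 E: 0.6 0.5 0.4 0.3 0.2 B: d2d4")
```

The `best` field is what a chess interface would actually play; the candidates and evaluations are useful for displaying the model's reasoning.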
### Training Procedure

- **Hardware:** 2x NVIDIA RTX 4090
- **Framework:** llm.c (karpathy/llm.c)
- **Schedule:** multiple epochs on rook-40m; sequence length up to 2048 tokens
## Evaluation

### Performance Metrics

- **Action accuracy** (rook-40m, 3 epochs): 22.2%
- **BIG-bench Checkmate-in-One:** 24.4%

Both figures are taken from the LAION research note.
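Action accuracy here is the fraction of positions where the model's generated `B:` move matches the reference (Stockfish) move. A minimal scoring sketch, with an assumed function name:

```python
def action_accuracy(predicted: list[str], reference: list[str]) -> float:
    """Fraction of positions where the predicted move equals the reference move.

    Illustrative metric sketch; not the project's evaluation harness.
    """
    if len(predicted) != len(reference):
        raise ValueError("prediction and reference lists must align")
    hits = sum(p == r for p, r in zip(predicted, reference))
    return hits / len(reference)


score = action_accuracy(["e2e4", "d2d4", "g1f3"], ["e2e4", "g1f3", "g1f3"])  # 2 of 3 match
```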
### Reasoning Quality

The model generates coherent chess analysis, including:
- Position evaluation
- Tactical motif identification
- Strategic planning
- Move justification
## Technical Details

### Tokenization

A custom chess tokenizer combines:

- FEN notation tokens
- UCI move notation
- Natural-language vocabulary
- Special tokens for structure
### Integration with llm.c

The model uses the llm.c framework for efficient training:

```shell
# Training command
./train_gpt2 \
  --input_bin data/rook_train.bin \
  --val_bin data/rook_val.bin \
  --model_file log/model.bin \
  --batch_size 512 \
  --sequence_length 2048
```
## Limitations

- **Computation:** no deep search capabilities
- **Tactics:** may miss complex combinations
- **Consistency:** reasoning may not always align with the chosen move
- **Context:** limited by the 2048-token context window
## Related Models

- ROOK-CLF-9M: classification approach
- RookWorld-LM-124M: unified agent + environment model
## Citation

```bibtex
@article{rook2024,
  title={ROOK: Strategic Reasoning in Chess Without Search},
  author={Rahn, Jonathan and Jitsev, Jenia and Sun, Qi},
  journal={LAION Research Notes},
  year={2024},
  url={https://laion.ai/notes/rook/}
}
```
## Model Card Contact

Jonathan Rahn - GitHub | Research Page

## Metrics Source

LAION research note: https://laion.ai/notes/rook/