OMLCheT-v1

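A fine-tuned distilgpt2 that generates chess moves in Standard Algebraic Notation (SAN), trained on ~86k AI-vs-AI tournament games from the Open Machine Learning Chess Tournament (OMLCheT) dataset. Move legality is learned from the data rather than enforced, so outputs should be validated.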


Model Overview

| Field | Detail |
|---|---|
| Base model | distilbert/distilgpt2 |
| Architecture | Decoder-only Transformer (GPT-2 family) |
| Task | Causal language modelling over SAN move sequences |
| Intended playstyle | Generalist: reproduces amateur-to-intermediate patterns from the ML-agent games in the training corpus; no explicit tactical or positional bias was enforced |
| Input/Output | Plain SAN string (e.g. e4 e5 Nf3) → continuation (e.g. Nc6 Bc4 …) |

The model treats a chess game as a text sequence: moves are separated by spaces, and the model is trained to predict the next token at each step. Because the GPT-2 BPE tokenizer can split one SAN move into several tokens, picking the next move at inference means sampling tokens until the next space-delimited move is complete.
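As a rough illustration of this text-sequence view, the sketch below builds a prompt from the moves played so far and pulls the first complete move out of a generated continuation. The helper names `build_prompt` and `first_move` are hypothetical and not part of the released code; the `<|chess|>` prefix and `<|endoftext|>` marker follow the pre-processing format described in this card.

```python
PREFIX = "<|chess|>"      # domain token used during fine-tuning
EOS = "<|endoftext|>"     # GPT-2 end-of-text token

def build_prompt(moves):
    """Join the SAN moves played so far into the training-time text format."""
    return " ".join([PREFIX, *moves])

def first_move(continuation):
    """Return the first space-delimited move in generated text, or None.

    Anything at or after the end-of-text marker is ignored, because it
    belongs to a different (packed) game.
    """
    text = continuation.split(EOS, 1)[0]
    tokens = text.split()
    return tokens[0] if tokens else None

prompt = build_prompt(["e4", "e5", "Nf3"])
print(prompt)
print(first_move(" Nc6 Bc4 Bc5"))
```

If generation stops at a token limit, the last move in a continuation may be cut short, so a move taken this way should still go through a legality check before it is played.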


Architecture Details

All figures are for the base distilgpt2 skeleton; the fine-tuning adds only one new embedding vector (<|chess|>).

| Attribute | Value |
|---|---|
| Total parameters | ~81.9 M |
| Transformer blocks | 6 |
| Embedding dimension | 768 |
| Attention heads | 12 |
| Feed-forward dimension | 3 072 |
| Context window | 1 024 tokens |
| Vocabulary size | 50 258 (50 257 GPT-2 BPE + 1 domain token <\|chess\|>) |
| Positional encoding | Learned absolute |
| Activation | GELU |
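The parameter total can be cross-checked from the other rows. The sketch below is a back-of-the-envelope count, assuming the standard GPT-2 layer layout (biases on every linear layer, two layer norms per block plus a final one) and an output head tied to the token embeddings; the function name is hypothetical.

```python
def gpt2_param_count(vocab, n_ctx, d_model, d_ff, n_layers):
    """Count parameters for a GPT-2 style decoder with a tied output head."""
    embeddings = vocab * d_model + n_ctx * d_model       # token + position tables
    layer_norm = 2 * d_model                             # scale + bias
    attention = d_model * 3 * d_model + 3 * d_model      # fused Q, K, V projection
    attention += d_model * d_model + d_model             # attention output projection
    mlp = (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)
    block = 2 * layer_norm + attention + mlp
    return embeddings + n_layers * block + layer_norm    # + final layer norm

total = gpt2_param_count(vocab=50_258, n_ctx=1_024, d_model=768, d_ff=3_072, n_layers=6)
print(f"{total:,} parameters (~{total / 1e6:.1f} M)")
```

Nearly half of the total sits in the token embedding table, which is why adding a single domain token changes the count by only 768 parameters.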

Training Data

| Field | Detail |
|---|---|
| Dataset | OMLCheT/chess-san-base |
| Subset used | clean |
| Volume | ~86 600 games (train: 81 860 / test: 4 740) |
| Source | Open Machine Learning Chess Tournament (OMLCheT), AI vs AI games played under tournament conditions |
| Format | Raw SAN strings, one game per row, e.g. e4 e5 Nf3 Nc6 Bc4 … |
| Pre-processing | Each game is wrapped as <\|chess\|> {moves} <\|endoftext\|> and short games are packed together into 256-token chunks |

Training Progress

| Training Loss | Validation Loss | Entropy | Num Tokens | Mean Token Accuracy |
|---|---|---|---|---|
| 1.1141 | 1.0671 | 1.0349 | 51,434,331 | 0.6441 |
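For readers who prefer perplexity, the reported losses can be converted directly. This is a small sketch that assumes the losses are mean cross-entropy values in nats per token, which is the usual convention for causal language-model training but is not stated explicitly in this card.

```python
import math

train_loss = 1.1141   # training loss reported above
val_loss = 1.0671     # validation loss reported above

# Perplexity is exp(cross-entropy) when the loss is measured in nats per token.
train_ppl = math.exp(train_loss)
val_ppl = math.exp(val_loss)

print(f"train perplexity ~ {train_ppl:.2f}")
print(f"val perplexity   ~ {val_ppl:.2f}")
```

Note that these are per-token figures over BPE tokens, not per-move figures, since one SAN move can span several tokens.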

What the corpus is and isn't:
The games come from ML-agent matches, not human grandmasters or large Lichess databases. This means the model has learned patterns produced by other (possibly imperfect) chess agents, not a broad human-style distribution. Move quality varies widely across the corpus.


Training Methodology

Supervised next-token prediction (standard causal language modelling). No reinforcement learning or RLHF was used.

Hyperparameters

| Hyperparameter | Value |
|---|---|
| Framework | HuggingFace transformers + trl (SFTTrainer) |
| Epochs | 3 |
| Per-device batch size | 16 |
| Gradient accumulation steps | 2 (effective batch = 32) |
| Learning rate | 5 × 10⁻⁴ |
| LR schedule | Cosine decay with 5% warmup |
| Weight decay | 0.01 |
| Optimiser | AdamW (default transformers implementation) |
| Max sequence length | 256 tokens |
| Packing | Enabled (packing=True); short games concatenated into full-length chunks |
| Precision | bf16 on Ampere+ GPUs, fp16 on older CUDA, fp32 on CPU |
| Seed | 42 |
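To make the learning-rate schedule concrete, here is a minimal pure-Python sketch of a cosine decay with 5% warmup and a peak of 5 × 10⁻⁴. It assumes a linear warmup and a decay to zero, which is the shape commonly used for this kind of schedule; the total step count below is a made-up placeholder, since the real number of optimiser steps is not given in this card.

```python
import math

def lr_at(step, total_steps, peak_lr=5e-4, warmup_frac=0.05):
    """Linear warmup to peak_lr, then cosine decay to zero."""
    warmup_steps = max(1, int(total_steps * warmup_frac))
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

TOTAL_STEPS = 1_000  # hypothetical; the real step count is not stated here
for step in (0, 25, 50, 525, 1_000):
    print(step, f"{lr_at(step, TOTAL_STEPS):.2e}")
```

With these placeholder numbers the rate climbs for the first 50 steps, peaks, and then falls smoothly to zero at the final step.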

Training process

  1. Load distilgpt2 weights from HuggingFace Hub.
  2. Add the <|chess|> domain prefix token and resize token embeddings.
  3. Format each game: <|chess|> {san_moves} <|endoftext|>.
  4. Pack multiple short games per 256-token chunk to maximise GPU utilisation.
  5. Train with cross-entropy loss over all tokens (moves and the prefix).
  6. Select the checkpoint with the lowest eval_loss.
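The formatting and packing steps can be sketched in a few lines. This is an illustration only: it uses a whitespace split as a stand-in for the GPT-2 BPE tokenizer, drops the trailing partial chunk for simplicity, and all function names are hypothetical. The actual run relied on the trainer's built-in packing, whose details may differ.

```python
PREFIX, EOS = "<|chess|>", "<|endoftext|>"
CHUNK_LEN = 256  # max sequence length used in training

def format_game(san_moves):
    """Wrap one game in the domain prefix and end-of-text marker."""
    return f"{PREFIX} {san_moves} {EOS}"

def toy_tokenize(text):
    """Stand-in tokenizer (whitespace split); the real model uses GPT-2 BPE."""
    return text.split()

def pack(games, chunk_len=CHUNK_LEN, tokenize=toy_tokenize):
    """Concatenate tokenised games and cut the stream into full chunks.

    The trailing partial chunk is dropped here; a real packing
    implementation may pad it or keep it instead.
    """
    stream = []
    for game in games:
        stream.extend(tokenize(format_game(game)))
    n_full = len(stream) // chunk_len
    return [stream[i * chunk_len:(i + 1) * chunk_len] for i in range(n_full)]

games = ["e4 e5 Nf3 Nc6", "d4 d5 c4 e6", "c4 e5 Nc3 Nf6"]
chunks = pack(games, chunk_len=8)  # tiny chunk length keeps the demo readable
for chunk in chunks:
    print(chunk)
```

Because chunks are cut at fixed lengths, a chunk can start or end in the middle of a game, which is one reason the end-of-text marker between games matters.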

Known Limitations / Failure Modes

| Failure mode | Severity | Notes |
|---|---|---|
| Illegal moves | Medium | The model has no explicit legality checker; it occasionally emits moves that are syntactically valid SAN but illegal given the current board position (e.g. moving a pinned piece) |
| Endgame blunders | High | The training corpus is dominated by middlegame positions. The model has seen relatively few endgame sequences and tends to play aimlessly once queens are traded |
| Pawn promotions | Medium–High | Promotion notation (e8=Q, a1=N, etc.) appears infrequently; underpromotions are rarely generated |
| Long games | Medium | Training used 256-token chunks (the architecture's context window is 1 024 tokens), which cuts off games running past ~60–70 full moves; the model loses positional coherence in very long endgames |
| Repetition | Low–Medium | Without a repetition detector the model can occasionally cycle through the same few moves |
| Opening diversity | Low | The model shows reasonable variety in common openings (Italian, Ruy López, Sicilian) but handles rare lines poorly |
| Engine-level play | N/A | This is a language model, not a search-based engine; it does not calculate variations or evaluate positions. Expect amateur-to-club strength at best |

Tip for downstream users: always wrap inference in a legality filter (e.g. python-chess) and re-sample on illegal output.
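A legality filter of this kind might look like the sketch below. It is written against any object with a `parse_san` method that raises `ValueError` on an illegal or malformed move, which is how a python-chess `Board` behaves; the helper names and the stub board are hypothetical, and the stub exists only so the snippet runs without a model or python-chess installed.

```python
def make_san_validator(board):
    """Build a legality check from a board whose parse_san raises
    ValueError on an illegal or malformed move (e.g. a python-chess Board)."""
    def is_legal(san):
        try:
            board.parse_san(san)
            return True
        except ValueError:
            return False
    return is_legal

def sample_legal_move(propose, is_legal, max_tries=10):
    """Call propose() until it yields a legal move; return None on failure."""
    for _ in range(max_tries):
        candidate = propose()
        if candidate and is_legal(candidate):
            return candidate
    return None

# Demo with stand-ins so the snippet runs on its own.
class StubBoard:
    legal = {"Nc6", "Nf6", "d6"}

    def parse_san(self, san):
        if san not in self.legal:
            raise ValueError(f"illegal move: {san}")

proposals = iter(["Ke7", "Qh4", "Nc6"])  # pretend these are model samples
move = sample_legal_move(lambda: next(proposals), make_san_validator(StubBoard()))
print(move)
```

In real use, `propose` would sample one move from the model and the accepted move would be applied with the board's `push_san` before the next turn. Returning `None` after `max_tries` leaves the fallback (for example a random legal move) up to the caller.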


Inference Speed

Benchmarked on a single NVIDIA T4 (Colab free tier) with the full fine-tuned checkpoint loaded in fp16:

| Metric | Value |
|---|---|
| Time per move (greedy) | ~15–25 ms |
| Time per move (sampling, top-p=0.9) | ~20–35 ms |
| Moves per second | ~30–60 |
| Full 40-move game generation | ~0.8–1.5 s |
| Memory footprint (fp16) | ~330 MB VRAM |
| Memory footprint (fp32 / CPU) | ~660 MB RAM |

On a Kaggle P100 expect roughly 2× faster; on CPU expect ~200–500 ms per move.
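To check these figures on other hardware, a small timing harness along the following lines may help. It is a generic sketch, not the benchmark script used for the numbers above: `time_per_move` is a hypothetical helper that times any zero-argument callable, and the `time.sleep` call is a stand-in workload.

```python
import statistics
import time

def time_per_move(generate_move, n_moves=50, warmup=5):
    """Return (median_ms, moves_per_second) for a zero-argument callable."""
    for _ in range(warmup):          # discard warm-up calls (caches, lazy init)
        generate_move()
    samples = []
    for _ in range(n_moves):
        start = time.perf_counter()
        generate_move()
        samples.append((time.perf_counter() - start) * 1_000)
    median_ms = statistics.median(samples)
    return median_ms, 1_000 / median_ms

# Stand-in workload; replace the lambda with a call into the model.
median_ms, moves_per_s = time_per_move(lambda: time.sleep(0.002), n_moves=20)
print(f"{median_ms:.1f} ms per move, ~{moves_per_s:.0f} moves/s")
```

When timing on a GPU, the callable should block until the result is actually available (for example by decoding the output to text), otherwise asynchronous execution can make the numbers look better than they are.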

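from transformers import pipeline
import torch

# Use the GPU in fp16 when one is available; otherwise fall back to CPU in fp32.
use_gpu = torch.cuda.is_available()

pipe = pipeline(
    "text-generation",
    model="OMLCheT/OMLCheT-v1",
    torch_dtype=torch.float16 if use_gpu else torch.float32,
    device=0 if use_gpu else -1,
)

# Provide the moves played so far, prefixed with the <|chess|> domain token;
# the model continues from here.
prompt = "<|chess|> e4 e5 Nf3 Nc6 Bc4"
result = pipe(
    prompt,
    max_new_tokens=80,                         # limit on generated tokens, not moves
    do_sample=True,
    temperature=0.8,
    top_p=0.9,
    pad_token_id=pipe.tokenizer.eos_token_id,  # GPT-2 defines no pad token
)
print(result[0]["generated_text"])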

License

MIT License

The model weights are released under the MIT License.

  • The base model (distilbert/distilgpt2) is released under the Apache 2.0 license.
  • The training dataset (OMLCheT/chess-san-base) is released by us — check the dataset card for its specific terms.
  • Chess move notation (SAN) is in the public domain.

You are free to use, modify, distribute, and build on top of this model for any purpose, commercial or non-commercial, provided the copyright and license notice are retained.

