---
license: apache-2.0
datasets:
- mlabonne/chessllm
library_name: transformers
tags:
- chess
pipeline_tag: text-generation
---

# ChessSLM

**ChessSLM** is a small language model designed to play chess using natural language move generation.  
Despite having only **30M parameters**, it competes with, and occasionally outperforms, larger language models at chess.

The model is based on the **GPT-2 architecture** and was pre-trained from scratch on **500,000 chess games** from the `mlabonne/chessllm` dataset using **SAN (Standard Algebraic Notation)**.

Play against ChessSLM [here](https://flamef0x.github.io/other/chess/chess).

---

## Overview

- **Architecture:** GPT-2  
- **Parameters:** ~30M  
- **Training data:** 500k chess games  
- **Notation:** SAN (Standard Algebraic Notation)  
- **Task:** Autoregressive chess move generation

ChessSLM demonstrates that **specialized small language models can perform competitively in narrow domains** such as chess.
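The autoregressive setup above can be sketched in plain Python: the game so far is serialized as numbered SAN movetext, the model continues the string, and the first SAN-shaped token of the continuation is taken as its move. The `generate_continuation` stub below is a placeholder standing in for a real `transformers` text-generation call; the prompt-building and move-extraction logic is the part being illustrated.

```python
import re

def moves_to_prompt(moves):
    """Serialize a SAN move list into numbered PGN-style movetext,
    the format the model is assumed to have seen in training."""
    parts = []
    for i, move in enumerate(moves):
        if i % 2 == 0:                     # White's turn: emit move number
            parts.append(f"{i // 2 + 1}.")
        parts.append(move)
    return " ".join(parts)

# Rough SAN pattern: castling, or optional piece letter + disambiguation
# + optional capture + destination square + optional promotion/check.
SAN_MOVE = re.compile(
    r"(O-O-O|O-O|[KQRBN]?[a-h]?[1-8]?x?[a-h][1-8](=[QRBN])?[+#]?)"
)

def extract_next_move(continuation):
    """Pull the first SAN-shaped token out of the raw continuation."""
    m = SAN_MOVE.search(continuation)
    return m.group(0) if m else None

# Stub for the model call (in practice: a transformers pipeline).
def generate_continuation(prompt):
    return " e5 2. Nf3 Nc6"

prompt = moves_to_prompt(["e4"])                        # "1. e4"
move = extract_next_move(generate_continuation(prompt))  # "e5"
```

In a real loop the extracted move is appended to the move list, the prompt is rebuilt, and the model is queried again for its next turn.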

---

## Capabilities

ChessSLM can play chess by generating moves sequentially in SAN notation.  
It has been evaluated in matches against several language models, including:

- Claude
- Gemini
- Qwen
- GPT-2
- GPT-Neo
- Pythia
- LLaMA
- Mistral
- other small chess-oriented models

The model achieves an average rating of **~1054 Elo** against other language models despite its small size.

---

## Benchmark Results

| Model | Elo Rating |
|------|------------|
| EleutherAI/pythia-70m-deduped | 1111 |
| mlabonne/chesspythia-70m | 1101 |
| nlpguy/amdchess-v9 | 1094 |
| nlpguy/smolchess-v2 | 1093 |
| DedeProGames/mini-chennus | 1083 |
| distilbert/distilgpt2 | 1061 |
| DedeProGames/dialochess | 1059 |
| facebook/opt-125m | 1057 |
| **FlameF0X/ChessSLM** | **1054** |
| **FlameF0X/ChessSLM-RL** | **1054** |
| mlabonne/grandpythia-200k-70m | 1050 |
| DedeProGames/Chesser-248K-Mini | 1048 |

---

## Limitations

Like many language-model-based chess systems, ChessSLM has several limitations:

- **Illegal move hallucinations:** The model may occasionally generate moves that violate chess rules.
- **No board-state verification:** Moves are generated purely from learned patterns rather than a validated game state.
- **Limited strategic depth:** While competitive at lower Elo levels, it cannot match dedicated chess engines.

These limitations are common for **pure language-model chess agents** that do not use external rule engines.

---

## Future Improvements

Potential improvements include:

- Adding **move legality filtering**
- Integrating **board-state validation**
- Training on **larger datasets**
- Reinforcement learning through **self-play**
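As a sketch of the first two bullets, a common pattern is to sample several candidate continuations from the model and keep the first one the current position accepts. The filtering logic below is dependency-free; the set of legal SAN moves is assumed to come from a rules library such as python-chess (e.g. `{board.san(m) for m in board.legal_moves}`).

```python
def first_legal_move(candidates, legal_san):
    """Return the first model-proposed SAN move that is legal in the
    current position, or None if every candidate is illegal.

    `legal_san` would be produced by a rules engine in practice,
    e.g. python-chess: {board.san(m) for m in board.legal_moves}.
    """
    legal = set(legal_san)
    for move in candidates:
        if move in legal:
            return move
    return None

# White to move after 1. e4 e5 (abbreviated legal-move set for brevity).
legal = {"Nf3", "Nc3", "Bc4", "d4", "Qh5"}
proposed = ["Nf6", "Nf3", "d4"]   # model samples, most likely first
print(first_legal_move(proposed, legal))   # Nf3
```

If no candidate is legal, the caller can fall back to resampling with a higher temperature or to a random legal move, which is what keeps a pure language-model agent from forfeiting on a hallucinated move.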

---

## Summary

ChessSLM shows that **very small language models can achieve meaningful chess performance** when trained on domain-specific data.  
It serves as a lightweight baseline for exploring **LLM-based chess agents** and **specialized small language models (SLMs)**.