---
license: apache-2.0
datasets:
- mlabonne/chessllm
library_name: transformers
tags:
- chess
pipeline_tag: text-generation
---

# ChessSLM

**ChessSLM** is a small language model designed to play chess through natural-language move generation.
Despite having only **30M parameters**, it can compete with, and occasionally outperform, larger language models at chess.

The model is based on the **GPT-2 architecture** and was pre-trained from scratch on **500,000 chess games** from the `mlabonne/chessllm` dataset using **SAN (Standard Algebraic Notation)**.

Play against ChessSLM [here](https://flamef0x.github.io/other/chess/chess).

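Since the model is trained on raw SAN game text, one plausible training-string shape is a numbered move list. The serialization sketched below (move numbering, spacing) is an assumption for illustration, not taken from the dataset card:

```python
# Illustrative only: ChessSLM trains on SAN game text, but the exact
# serialization (move numbering, separators) is an assumption here.
moves = ["e4", "e5", "Nf3", "Nc6", "Bb5", "a6"]

# Prefix White's moves (even indices) with the move number.
game_text = " ".join(
    f"{i // 2 + 1}. {m}" if i % 2 == 0 else m for i, m in enumerate(moves)
)
print(game_text)  # -> 1. e4 e5 2. Nf3 Nc6 3. Bb5 a6
```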
---

## Overview

- **Architecture:** GPT-2
- **Parameters:** ~30M
- **Training data:** 500k chess games
- **Notation:** SAN (Standard Algebraic Notation)
- **Task:** Autoregressive chess move generation

ChessSLM demonstrates that **specialized small language models can perform competitively in narrow domains** such as chess.

---

## Capabilities

ChessSLM plays chess by generating moves sequentially in SAN notation.
It has been evaluated in matches against several language models, including:

- Claude
- Gemini
- Qwen
- GPT-2
- GPT-Neo
- Pythia
- LLaMA
- Mistral
- other small chess-oriented models

Despite its small size, the model achieves an average rating of **~1054 Elo** against other language models.

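The sequential move generation can be sketched with a stand-in for the model. The `generate_move` stub and its tiny opening book below are hypothetical; a real loop would prompt ChessSLM with the game history instead:

```python
# Sketch of sequential SAN move generation. generate_move is a hypothetical
# stub standing in for the language model; here it replays a fixed opening.
def generate_move(history):
    book = {"": "e4", "e4": "e5", "e4 e5": "Nf3", "e4 e5 Nf3": "Nc6"}
    return book.get(" ".join(history))

history = []
while (move := generate_move(history)) is not None:
    history.append(move)  # append each move and re-prompt with the new history

print(" ".join(history))  # -> e4 e5 Nf3 Nc6
```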
---

## Benchmark Results

| Model | Elo Rating |
|-------|------------|
| EleutherAI/pythia-70m-deduped | 1111 |
| mlabonne/chesspythia-70m | 1101 |
| nlpguy/amdchess-v9 | 1094 |
| nlpguy/smolchess-v2 | 1093 |
| DedeProGames/mini-chennus | 1083 |
| distilbert/distilgpt2 | 1061 |
| DedeProGames/dialochess | 1059 |
| facebook/opt-125m | 1057 |
| **FlameF0X/ChessSLM** | **1054** |
| **FlameF0X/ChessSLM-RL** | **1054** |
| mlabonne/grandpythia-200k-70m | 1050 |
| DedeProGames/Chesser-248K-Mini | 1048 |

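To interpret the rating gaps in this table, the standard Elo expected-score formula gives the win expectancy implied by a rating difference (this is the generic formula, not something specific to this repository):

```python
# Standard Elo expected-score formula: the score (between 0 and 1) that a
# player with rating_a is expected to achieve against rating_b.
def expected_score(rating_a, rating_b):
    return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))

# e.g. ChessSLM (1054) against the table leader, pythia-70m-deduped (1111):
print(round(expected_score(1054, 1111), 3))  # -> 0.419
```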
---

## Limitations

Like many language-model-based chess systems, ChessSLM has several limitations:

- **Illegal-move hallucinations:** The model may occasionally generate moves that violate the rules of chess.
- **No board-state verification:** Moves are generated purely from learned patterns rather than from a validated game state.
- **Limited strategic depth:** While competitive at lower Elo levels, it cannot match dedicated chess engines.

These limitations are common to **pure language-model chess agents** that do not use an external rule engine.

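The usual mitigation for illegal-move hallucinations is to filter the model's candidates against the legal-move set. The sketch below is illustrative; in practice the legal set would come from a rules engine such as python-chess:

```python
# Hypothetical sketch of move-legality filtering: accept the model's
# highest-ranked candidate that the rules engine confirms is legal.
def pick_legal(candidates, legal_moves):
    for move in candidates:
        if move in legal_moves:
            return move
    return None  # caller falls back (e.g. resamples) if all were illegal

candidates = ["Qh7#", "Nf3", "e5"]       # model samples, best first
legal = {"Nf3", "e4", "d4", "c4", "g3"}  # from a board-state validator
print(pick_legal(candidates, legal))  # -> Nf3
```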
---

## Future Improvements

Potential improvements include:

- Adding **move-legality filtering**
- Integrating **board-state validation**
- Training on **larger datasets**
- Reinforcement learning through **self-play**

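One common shape for the self-play idea is to score each finished game and assign per-move rewards by side. Everything below is an illustrative sketch of that setup, not code from this repository:

```python
# Hypothetical self-play reward assignment: +1 for the winner's moves,
# -1 for the loser's, 0 for draws. White plays even-indexed moves.
def move_rewards(moves, winner):
    rewards = []
    for i, _ in enumerate(moves):
        side = "white" if i % 2 == 0 else "black"
        if winner == "draw":
            rewards.append(0.0)
        else:
            rewards.append(1.0 if side == winner else -1.0)
    return rewards

print(move_rewards(["e4", "e5", "Nf3"], "white"))  # -> [1.0, -1.0, 1.0]
```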
---

## Summary

ChessSLM shows that **very small language models can achieve meaningful chess performance** when trained on domain-specific data.
It serves as a lightweight baseline for exploring **LLM-based chess agents** and **specialized small language models (SLMs)**.