---
license: apache-2.0
library_name: transformers
tags:
- chess
- self-play
pipeline_tag: text-generation
base_model:
- FlameF0X/ChessSLM
---
# ChessSLM-RL
**ChessSLM-RL** is an improved version of **ChessSLM**, a small language model designed to play chess through natural-language move generation. It uses reinforcement learning (RL) to make the model hallucinate less and play a bit more deliberately.
Despite having only **30M parameters**, it is capable of competing with and occasionally outperforming larger language models in chess-playing tasks.
The model starts from the pretrained ChessSLM checkpoint and is fine-tuned with RL and Stockfish so that it plays more legal moves and attempts fewer illegal ones.
Play against ChessSLM [here](https://flamef0x.github.io/other/chess).
---
## Overview
- **Architecture:** GPT-2
- **Parameters:** ~30M
- **Training data:** Self-Play
- **Task:** Autoregressive chess move generation
---
## Capabilities
ChessSLM can play chess by generating moves sequentially in Standard Algebraic Notation (SAN).
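
As a rough illustration of what SAN output looks like in practice, the sketch below splits a generated string into individual move tokens and keeps only those that are syntactically valid SAN. It assumes the model emits PGN-style movetext such as `1. e4 e5 2. Nf3`; that output format, the regular expression, and the helper name `extract_san_moves` are illustrative assumptions, not part of this model card. The pattern is a common approximation of SAN syntax and says nothing about whether a move is legal in the current position.

```python
import re
from typing import List

# Approximate pattern for a single SAN token (an assumption for this sketch):
# castling, or optional piece letter, optional disambiguation, optional capture,
# target square, optional promotion; then an optional check or mate suffix.
SAN_TOKEN = re.compile(
    r"^(?:O-O(?:-O)?|[KQRBN]?[a-h]?[1-8]?x?[a-h][1-8](?:=[QRBN])?)[+#]?$"
)


def extract_san_moves(generated: str) -> List[str]:
    """Split generated movetext into SAN tokens, dropping move numbers."""
    moves = []
    for token in generated.split():
        # Strip leading move numbers such as "1." or "1..."
        token = re.sub(r"^\d+\.+", "", token)
        # Keep only tokens that look like SAN; this checks syntax, not legality.
        if token and SAN_TOKEN.match(token):
            moves.append(token)
    return moves


print(extract_san_moves("1. e4 e5 2. Nf3 Nc6 3. Bb5 a6 4. O-O"))
```

Tokens that fail the pattern are silently dropped here; a stricter caller might prefer to stop at the first malformed token instead.
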
It has been evaluated in matches against several language models, including:
- Claude
- Gemini
- Qwen
- GPT-2
- GPT-Neo
- Pythia
- LLaMA
- Mistral
- other small chess-oriented models
The model achieves an **Elo rating of approximately {TBD}**, averaging **around {TBD} Elo** against other language models despite its small size.
---
## Benchmark Results
| Model | Elo Rating |
|------|------------|
| EleutherAI/pythia-70m-deduped | 1113 |
| nlpguy/amdchess-v9 | 1094 |
| nlpguy/smolchess-v2 | 1093 |
| mlabonne/chesspythia-70m | 1088 |
| **FlameF0X/ChessSLM** | **1087** |
| DedeProGames/mini-chennus | 1083 |
| distilbert/distilgpt2 | 1061 |
| Locutusque/TinyMistral-248M-v2.5 | 1061 |
| facebook/opt-125m | 1057 |
| mlabonne/grandpythia-200k-70m | 1050 |
| DedeProGames/Chesser-248K-Mini | 1048 |
| bharathrajcl/chess_llama_68m | 1046 |
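
To help read the gaps in this table, the sketch below converts a rating difference into an expected score per game. It assumes the ratings follow the standard Elo logistic model with a 400-point scale; the model card does not state how the ratings were computed, so treat this as an interpretation aid only.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# Ratings taken from the table above.
chess_slm = 1087    # FlameF0X/ChessSLM
top_entry = 1113    # EleutherAI/pythia-70m-deduped

# A 26-point gap corresponds to an expected score a little below 0.5.
print(f"{expected_score(chess_slm, top_entry):.2f}")
```

Under that assumption, the 26-point gap between ChessSLM and the top entry corresponds to an expected score of roughly 0.46 per game, so the models in the upper half of the table are close to evenly matched.
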
---
## Limitations
Like many language-model-based chess systems, ChessSLM has several limitations:
- **Illegal move hallucinations:** The model may occasionally generate moves that violate chess rules.
- **No board-state verification:** Moves are generated purely from learned patterns rather than a validated game state.
- **Limited strategic depth:** While competitive at lower Elo levels, it cannot match dedicated chess engines.
These limitations are common for **pure language-model chess agents** that do not use external rule engines.
---
## Future Improvements
Potential improvements include:
- Adding **move legality filtering**
- Integrating **board-state validation**
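
One possible shape for move legality filtering is sketched below: take several candidate moves sampled from the model and keep the first one that is legal in the current position. This is a minimal sketch, not the author's implementation. The helper names `legal_san_moves` and `first_legal` are hypothetical, and the legal-move set is assumed to come from the third-party `python-chess` package, which is imported lazily so the filter itself has no dependencies.

```python
from typing import Iterable, Optional, Set


def legal_san_moves(fen: Optional[str] = None) -> Set[str]:
    """SAN strings for every legal move in a position (requires python-chess)."""
    import chess  # third-party: pip install chess

    # No FEN given means the standard starting position.
    board = chess.Board(fen) if fen else chess.Board()
    # Drop check/mate suffixes so "Qh5" and "Qh5+" compare equal.
    return {board.san(move).rstrip("+#") for move in board.legal_moves}


def first_legal(candidates: Iterable[str], legal: Set[str]) -> Optional[str]:
    """Return the first candidate that appears in the legal-move set, else None."""
    for san in candidates:
        token = san.strip()
        if token.rstrip("+#") in legal:
            return token
    return None


# Hypothetical candidates, as if sampled from the model for White's first move.
legal_at_start = {"e4", "d4", "Nf3", "c4"}  # hand-written subset for illustration
print(first_legal(["Ke2", "e5", "Nf3", "e4"], legal_at_start))
```

In a real game loop, `legal_san_moves` would be called with the current position after every move, which also gives the board-state validation listed above; returning `None` lets the caller decide whether to resample or fall back to another policy.
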
---
## Summary
ChessSLM shows that **very small language models can achieve meaningful chess performance** when trained on domain-specific data.
It serves as a lightweight baseline for exploring **LLM-based chess agents** and **specialized small language models (SLMs)**.