---
library_name: transformers
tags:
- chess
- fen
- uci
datasets:
- bonna46/Chess-FEN-and-NL-Format-30K-Dataset
- Vasanth/chessdevilai_fen_dataset
base_model:
- openai-community/gpt2
---
# Model Card for chess_model4
### Model Description
This model was trained for use in a chess-playing agent built on a fine-tuned GPT-2 model. It takes a board position in FEN format and is trained to return a legal move in UCI notation.
- **Developed by:** Aliyah Vos
- **Model type:** Decoder-only causal LM
- **Finetuned from model:** openai-community/gpt2
### Model Sources
- **Repository:** [almvos/Midterm_Chess_Tournament](https://github.com/almvos/Midterm_Chess_Tournament.git)
## Uses
### Direct Use
Given a chess position in FEN notation, the model predicts the best next move as a UCI string.
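
Because the model generates free text, a caller will usually want to pull a well-formed UCI move out of the output before using it. The following is a minimal sketch of that step using only the standard library. The helper names `is_plausible_fen` and `extract_uci_move` are hypothetical and not part of this repository, and the checks cover format only, not legality on the board.

```python
import re

# A UCI move is a source square, a target square, and an optional
# promotion piece, for example "e2e4" or "e7e8q".
UCI_PATTERN = re.compile(r"\b[a-h][1-8][a-h][1-8][qrbn]?\b")


def is_plausible_fen(fen: str) -> bool:
    """Cheap structural check: six space-separated fields, eight ranks."""
    fields = fen.split()
    return len(fields) == 6 and len(fields[0].split("/")) == 8


def extract_uci_move(generated_text: str):
    """Return the first UCI-shaped token in the text, or None if absent."""
    match = UCI_PATTERN.search(generated_text)
    return match.group(0) if match else None


start_fen = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
print(is_plausible_fen(start_fen))
print(extract_uci_move("Move: e2e4"))
```

A move that passes this check can still be illegal in the given position, so an agent would typically confirm legality with a chess library such as python-chess and fall back to another move if the check fails.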
### Out-of-Scope Use
This model has been fine-tuned only for chess move prediction. It is not intended for general text generation or for tasks outside chess.
## Training Details
### Training Data
A combination of datasets was used to train the model: <br>
HF: ["Vasanth/chessdevilai_fen_dataset"](https://huggingface.co/datasets/Vasanth/chessdevilai_fen_dataset) <br>
HF: ["bonna46/Chess-FEN-and-NL-Format-30K-Dataset"](https://huggingface.co/datasets/bonna46/Chess-FEN-and-NL-Format-30K-Dataset) <br>
Kaggle: ["yousefradwanlmao/stockfish-best-moves-compilation"](https://www.kaggle.com/datasets/yousefradwanlmao/stockfish-best-moves-compilation) <br>
#### Preprocessing
The datasets were normalised to a common format, combined, and shuffled. Rows with missing "Best move" values were filtered out of the Kaggle dataset.
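
The preprocessing described above can be sketched roughly as follows. This is an illustration under assumptions, not the training script: the helpers `normalise` and `combine_and_shuffle`, the output keys `fen` and `move`, the input column names other than "Best move", and the sample rows are all hypothetical.

```python
import random


def normalise(rows, fen_key, move_key):
    """Map source-specific columns onto a shared {"fen", "move"} schema.

    Rows whose FEN or move is missing or not a string are dropped, which
    covers the missing "Best move" filter for the Kaggle data.
    """
    cleaned = []
    for row in rows:
        fen, move = row.get(fen_key), row.get(move_key)
        if not isinstance(fen, str) or not isinstance(move, str):
            continue
        if not fen.strip() or not move.strip():
            continue
        cleaned.append({"fen": fen.strip(), "move": move.strip()})
    return cleaned


def combine_and_shuffle(datasets, seed=0):
    """Concatenate normalised datasets and shuffle them reproducibly."""
    combined = [row for dataset in datasets for row in dataset]
    random.Random(seed).shuffle(combined)
    return combined


# Hypothetical sample rows; the real column names may differ.
hf_rows = [
    {"fen": "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1",
     "move": "e2e4"},
]
kaggle_rows = [
    {"FEN": "rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq e3 0 1",
     "Best move": "e7e5"},
    {"FEN": "8/8/8/8/8/8/8/8 w - - 0 1", "Best move": None},  # dropped
]

combined = combine_and_shuffle([
    normalise(hf_rows, "fen", "move"),
    normalise(kaggle_rows, "FEN", "Best move"),
])
print(len(combined))
```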
#### Training Hyperparameters
learning_rate = 3e-5 <br>
metric_for_best_model = "eval_loss" <br>
weight_decay = 0.01 <br>
warmup_ratio = 0.05 <br>
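
To make the warmup setting concrete, here is a small sketch of how the learning rate would evolve under a linear warmup followed by linear decay. That schedule shape is an assumption (the card does not state the scheduler), and `TOTAL_STEPS` is a hypothetical value chosen only for illustration. Weight decay is applied by the optimizer and is not modelled here.

```python
import math

LEARNING_RATE = 3e-5  # peak learning rate from the list above
WARMUP_RATIO = 0.05   # fraction of training spent warming up


def warmup_steps(total_steps, warmup_ratio=WARMUP_RATIO):
    """Number of steps over which the learning rate ramps up."""
    return math.ceil(total_steps * warmup_ratio)


def lr_at(step, total_steps, peak_lr=LEARNING_RATE,
          warmup_ratio=WARMUP_RATIO):
    """Learning rate at a given step: linear warmup, then linear decay."""
    n_warm = warmup_steps(total_steps, warmup_ratio)
    if step < n_warm:
        scale = step / max(1, n_warm)
    else:
        scale = max(0.0, (total_steps - step) / max(1, total_steps - n_warm))
    return peak_lr * scale


TOTAL_STEPS = 1000  # hypothetical; the real step count is not given
for step in (0, 25, 50, 525, 1000):
    print(step, lr_at(step, TOTAL_STEPS))
```

With these assumptions, the first 5% of steps ramp the learning rate from zero to its peak, and the remaining steps decay it back to zero.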