Model Card for chess_model4

Model Description

This model is a fine-tuned GPT-2 intended to drive a chess-playing agent. It takes a board position in FEN format and is trained to return a legal move in UCI notation.

  • Developed by: Aliyah Vos
  • Model type: Decoder Causal LM
  • Finetuned from model: openai-community/gpt2
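Because the model expects a FEN string as input, it can help to check that a string is at least structurally a FEN before passing it on. The following is a minimal sketch, not part of the model's own code: the function name `is_well_formed_fen` is hypothetical, and the check covers only the field count, the side to move, and the 8x8 board layout. It does not verify that the position is reachable or legal.

```python
def is_well_formed_fen(fen: str) -> bool:
    """Structural check only: six fields, eight ranks of eight squares each."""
    fields = fen.split()
    if len(fields) != 6:
        return False
    placement, side_to_move = fields[0], fields[1]
    if side_to_move not in ("w", "b"):
        return False
    ranks = placement.split("/")
    if len(ranks) != 8:
        return False
    for rank in ranks:
        squares = 0
        for ch in rank:
            if ch in "12345678":
                squares += int(ch)  # a digit encodes a run of empty squares
            elif ch in "pnbrqkPNBRQK":
                squares += 1        # a letter encodes a single piece
            else:
                return False
        if squares != 8:
            return False
    return True


# Standard starting position in FEN.
START_FEN = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
print(is_well_formed_fen(START_FEN))
```

A UCI move for this position would be a string such as `e2e4` (from-square followed by to-square).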

Model Sources

Uses

Direct Use

Given a chess position in FEN notation, the model predicts the best next move as a UCI string.
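A possible way to query the model with the `transformers` library is sketched below. Several things here are assumptions, since this card does not document them: that the prompt is the raw FEN string with no extra template, that the repository ships a tokenizer, and that the move appears at the start of the generated text. The helper names `extract_uci_move` and `predict_move` are hypothetical.

```python
import re

# A UCI move: from-square, to-square, optional promotion piece ("e2e4", "e7e8q").
UCI_PATTERN = re.compile(r"\b[a-h][1-8][a-h][1-8][qrbn]?\b")


def extract_uci_move(text: str):
    """Return the first UCI-shaped token in text, or None if there is none."""
    match = UCI_PATTERN.search(text)
    return match.group(0) if match else None


def predict_move(fen: str, model_id: str = "almvos/chess_model4"):
    """Generate a move for a FEN position (ASSUMES the prompt is the bare FEN)."""
    # Imported lazily so extract_uci_move works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer(fen, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=8,                     # a UCI move is only a few tokens
        do_sample=False,                      # greedy decoding
        pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token
    )
    # Keep only the newly generated tokens, not the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    completion = tokenizer.decode(new_tokens, skip_special_tokens=True)
    return extract_uci_move(completion)
```

The regular expression only checks that the output looks like a UCI move. It does not confirm that the move is legal in the given position, so an agent would likely want a separate legality check (for example with a chess library such as python-chess) before playing it.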

Out-of-Scope Use

This model has been fine-tuned only for chess move prediction. Other uses, such as general-purpose text generation, are out of scope.

Training Details

Training Data

The model was trained on a combination of three datasets:

  • Hugging Face: "Vasanth/chessdevilai_fen_dataset"
  • Hugging Face: "bonna46/Chess-FEN-and-NL-Format-30K-Dataset"
  • Kaggle: "yousefradwanlmao/stockfish-best-moves-compilation"

Preprocessing

The datasets were normalised to a common format, then combined and shuffled. Rows with a missing "Best move" value were removed from the Kaggle dataset.
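The preprocessing described above could look roughly like the following sketch. It is an illustration under stated assumptions, not the code used to build the training set: apart from "Best move", the column names (`FEN`, `fen`, `move`), the target schema, and the function names are hypothetical, and the rows are made-up placeholders.

```python
import random


def drop_missing(rows, key):
    """Remove rows whose value for key is missing or empty."""
    return [row for row in rows if row.get(key)]


def normalise(rows, fen_key, move_key):
    """Map one dataset's rows onto a shared {"fen", "move"} schema."""
    return [
        {"fen": row[fen_key].strip(), "move": row[move_key].strip()}
        for row in rows
    ]


def combine(datasets, seed=0):
    """Concatenate the normalised datasets and shuffle them reproducibly."""
    merged = [row for dataset in datasets for row in dataset]
    random.Random(seed).shuffle(merged)
    return merged


# Placeholder rows; column names other than "Best move" are assumptions.
start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
hf_rows = [{"fen": start, "move": "e2e4"}]
kaggle_rows = [
    {"FEN": start, "Best move": "d2d4"},
    {"FEN": start, "Best move": None},  # filtered out below
]

training_rows = combine([
    normalise(hf_rows, "fen", "move"),
    normalise(drop_missing(kaggle_rows, "Best move"), "FEN", "Best move"),
])
print(len(training_rows))
```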

Training Hyperparameters

learning_rate = 3e-5
metric_for_best_model = "eval_loss"
weight_decay = 0.01
warmup_ratio = 0.05
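These names match parameters of `transformers.TrainingArguments`, which suggests the Hugging Face `Trainer` was used, although this card does not say so. To illustrate what `warmup_ratio = 0.05` means in practice, the sketch below computes the learning rate at a given step. It assumes a linear warmup followed by linear decay to zero, which is an assumption because the scheduler is not stated here. The total of 1000 steps is a hypothetical value chosen for the example.

```python
import math

# Values reported in this card.
HYPERPARAMS = {
    "learning_rate": 3e-5,
    "metric_for_best_model": "eval_loss",
    "weight_decay": 0.01,
    "warmup_ratio": 0.05,
}


def warmup_steps(total_steps, warmup_ratio=HYPERPARAMS["warmup_ratio"]):
    """Number of optimizer steps spent warming up."""
    return math.ceil(total_steps * warmup_ratio)


def lr_at(step, total_steps,
          peak_lr=HYPERPARAMS["learning_rate"],
          warmup_ratio=HYPERPARAMS["warmup_ratio"]):
    """ASSUMED schedule: linear warmup to peak_lr, then linear decay to zero."""
    warmup = warmup_steps(total_steps, warmup_ratio)
    if step < warmup:
        return peak_lr * step / max(1, warmup)
    return peak_lr * max(0, total_steps - step) / max(1, total_steps - warmup)


total = 1000  # hypothetical number of training steps
print(warmup_steps(total))
print(lr_at(25, total))
```

With these assumptions, the first 5% of steps ramp the learning rate up from zero to 3e-5, and the remaining steps decay it back down.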

Repository: almvos/chess_model4
Model size: 0.1B params (Safetensors, tensor type F32)