---
library_name: transformers
tags:
- chess
- fen
- uci
datasets:
- bonna46/Chess-FEN-and-NL-Format-30K-Dataset
- Vasanth/chessdevilai_fen_dataset
base_model:
- openai-community/gpt2
---

# Model Card for chess_model4

### Model Description

This model was trained for use in a chess-playing agent built on a fine-tuned GPT-2 model. The agent takes a board position in FEN format and returns a legal move in UCI notation.

- **Developed by:** Aliyah Vos
- **Model type:** Decoder-only causal language model
- **Finetuned from model:** openai-community/gpt2

### Model Sources

- **Repository:** [almvos/Midterm_Chess_Tournament](https://github.com/almvos/Midterm_Chess_Tournament.git)

## Uses

### Direct Use

Given a chess position in FEN notation, the model predicts the best next move as a UCI string (for example, `e2e4`).
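
As an illustration of the input and output formats, here is a minimal sketch that checks whether a string is structurally a FEN and pulls the first UCI-shaped token out of generated text. The helper names `looks_like_fen` and `extract_uci_move` are hypothetical and not part of this repository, and the exact prompt format the model expects is not documented here. The check is purely syntactic: it does not verify that a move is legal in the given position.

```python
import re

# UCI move shape: source square, target square, optional promotion piece.
UCI_MOVE = re.compile(r"\b([a-h][1-8][a-h][1-8][qrbn]?)\b")


def looks_like_fen(fen: str) -> bool:
    """Cheap structural check: six fields, eight ranks, eight squares per rank."""
    fields = fen.split()
    if len(fields) != 6:
        return False
    ranks = fields[0].split("/")
    if len(ranks) != 8:
        return False
    for rank in ranks:
        width = 0
        for ch in rank:
            if ch in "12345678":
                width += int(ch)      # a digit counts that many empty squares
            elif ch in "pnbrqkPNBRQK":
                width += 1            # a letter is one piece
            else:
                return False
        if width != 8:
            return False
    return fields[1] in ("w", "b")    # side to move


def extract_uci_move(generated_text: str):
    """Return the first UCI-shaped token in the text, or None if there is none."""
    match = UCI_MOVE.search(generated_text)
    return match.group(1) if match else None


start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
print(looks_like_fen(start))
print(extract_uci_move(" e7e8q\n"))
```

Pass only the newly generated text to `extract_uci_move`, not the prompt: a FEN rank such as `b3b3` has the same shape as a UCI move and would be matched by the pattern.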

### Out-of-Scope Use

This model has been fine-tuned specifically for chess move prediction. Other uses, such as general-purpose text generation, are out of scope.

## Training Details

### Training Data

The model was trained on a combination of three datasets:

HF: ["Vasanth/chessdevilai_fen_dataset"](https://huggingface.co/datasets/Vasanth/chessdevilai_fen_dataset) <br>
HF: ["bonna46/Chess-FEN-and-NL-Format-30K-Dataset"](https://huggingface.co/datasets/bonna46/Chess-FEN-and-NL-Format-30K-Dataset) <br>
Kaggle: ["yousefradwanlmao/stockfish-best-moves-compilation"](https://www.kaggle.com/datasets/yousefradwanlmao/stockfish-best-moves-compilation) <br>

#### Preprocessing

The datasets were normalised to a common format, then combined and shuffled. The Kaggle dataset was filtered to remove rows with missing "Best move" values.
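
The steps above could look roughly like the following sketch, which uses plain Python dictionaries in place of the real datasets. Only the `"Best move"` column name comes from this card. The other column names (`"fen"`, `"move"`, `"FEN"`), the helper functions, and the toy rows are assumptions made for illustration, and the actual preprocessing code may differ.

```python
import random


def drop_missing(rows, key):
    """Keep only rows whose value for `key` is present and not blank."""
    return [r for r in rows if r.get(key) is not None and str(r[key]).strip()]


def normalise(rows, fen_key, move_key):
    """Map source-specific column names onto common {"fen", "move"} records."""
    return [
        {"fen": str(r[fen_key]).strip(), "move": str(r[move_key]).strip()}
        for r in rows
    ]


def combine(sources, seed=0):
    """Concatenate the normalised sources and shuffle them reproducibly."""
    combined = [record for records in sources for record in records]
    random.Random(seed).shuffle(combined)
    return combined


# Toy rows standing in for the real datasets (hypothetical column names).
start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
hf_rows = [{"fen": start, "move": "e2e4"}]
kaggle_rows = [
    {"FEN": start, "Best move": "g1f3"},
    {"FEN": start, "Best move": None},  # removed by the missing-value filter
]

dataset = combine([
    normalise(hf_rows, "fen", "move"),
    normalise(drop_missing(kaggle_rows, "Best move"), "FEN", "Best move"),
])
print(len(dataset))
```

Fixing the shuffle seed keeps the combined ordering reproducible between runs; whether the original pipeline used a fixed seed is not stated.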

#### Training Hyperparameters

learning_rate = 3e-5 <br>
metric_for_best_model = "eval_loss" <br>
weight_decay = 0.01 <br>
warmup_ratio = 0.05 <br>
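
For reference, the reported values can be collected into a dictionary of keyword arguments. The names match keyword arguments of `transformers.TrainingArguments`, although how they were actually passed during training is not shown in this card. The sketch also works out what `warmup_ratio = 0.05` means in steps, assuming the warmup length is the ratio times the total number of optimizer steps, rounded up. The `total_steps` value is hypothetical, since the run length is not reported.

```python
import math

# Hyperparameter values as reported in this card.
training_kwargs = {
    "learning_rate": 3e-5,
    "metric_for_best_model": "eval_loss",
    "weight_decay": 0.01,
    "warmup_ratio": 0.05,
}


def warmup_steps(total_steps: int, warmup_ratio: float) -> int:
    """Warmup length in steps, assuming ratio * total steps, rounded up."""
    return math.ceil(total_steps * warmup_ratio)


# Hypothetical run length; the real number of training steps is not reported.
total_steps = 10_000
print(warmup_steps(total_steps, training_kwargs["warmup_ratio"]))
```

With a 5% warmup, the learning rate would ramp up over roughly the first twentieth of training before reaching 3e-5.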