Jochemvkem committed
Commit f6a90a2 · verified · 1 Parent(s): 8449ad5

Update README.md

Files changed (1)
  1. README.md +52 -41
README.md CHANGED
@@ -2,78 +2,89 @@
  library_name: transformers
  tags:
  - chess
- - fine-tuned
- - qwen2.5
- - causal-lm
- base_model: Qwen/Qwen2.5-0.5B
  ---

- # Magnus Chess Player — Qwen2.5-0.5B Fine-tune

- A Qwen2.5-0.5B language model fine-tuned on Magnus Carlsen's Lichess games
- to predict strong chess moves given a board position in FEN notation.

  ## Model Details

  ### Model Description

- This model is a causal language model fine-tuned to predict chess moves in UCI
- format given a board state in FEN notation. It was trained on over 400,000 moves
- played by World Chess Champion Magnus Carlsen on Lichess.org, with the goal of
- capturing his playing style and strategic intuition.

  - **Developed by:** Jochem van Kemenade
- - **Model type:** Causal Language Model (decoder-only)
- - **Language(s):** Chess notation (FEN + UCI)
  - **License:** Apache 2.0
- - **Finetuned from:** Qwen/Qwen2.5-0.5B

  ## Uses

  ### Direct Use

- Given a chess board state in FEN notation, the model predicts the next best move
- in UCI format. It is used as the backbone of a `TransformerPlayer` for a chess
- tournament setting.

  ### Out-of-Scope Use
- This model is not intended for general natural language tasks. It has been
- specialized for chess move prediction and will perform poorly outside that domain.

  ## Training Details
  ### Training Data
- ~400,000 moves sampled from a dataset of Magnus Carlsen's Lichess.org games.
- Each training example is formatted as:
- ```
- FEN: <fen string>
- Move: <uci move>
- ```

- 95/5 train/eval split

- ### Training Procedure

- #### Preprocessing

- FEN strings were taken from the `fen` column of the dataset. UCI moves were
- validated using the `python-chess` library. Invalid or malformed moves were
- discarded.

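For illustration, a minimal stand-in for this validation step is sketched below. The original pipeline used `python-chess` to check each move for full legality against the FEN position; this dependency-free sketch only rejects strings that are not syntactically valid UCI, which is a weaker check.

```python
import re

# Format-level stand-in for the validation step described above. The real
# pipeline used python-chess for full legality checks against the board;
# this sketch only verifies UCI syntax (from-square, to-square, optional
# promotion piece).
UCI_RE = re.compile(r"^[a-h][1-8][a-h][1-8][qrbn]?$")

def looks_like_uci(move):
    """Return True if `move` is syntactically a UCI move string."""
    return bool(UCI_RE.match(move))

moves = ["e2e4", "e7e8q", "O-O", "e2", "a1a9"]
valid = [m for m in moves if looks_like_uci(m)]
print(valid)  # ['e2e4', 'e7e8q']
```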
- #### Training Hyperparameters

- - **Training regime:** fp16 mixed precision
- - **Base model:** Qwen/Qwen2.5-0.5B
- - **Fine-tuning method:** LoRA (r=16, alpha=32, target modules: q_proj, v_proj, k_proj, o_proj)
- - **Epochs:** 3
- - **Batch size:** 12 (effective 60 with gradient accumulation steps=5)
- - **Learning rate:** 2e-4
- - **Max sequence length:** 128 tokens
- - **Warmup steps:** 100

  #### Hardware
  - **Hardware:** NVIDIA GeForce RTX 4070 Super (12GB VRAM)
- - **Training time:** ~5 hours

  library_name: transformers
  tags:
  - chess
+ - custom-transformer
+ - move-prediction
  ---

+ # MagnusBot — Custom Transformer Chess Move Predictor

+ A custom encoder-decoder Transformer trained from scratch to predict strong chess moves
+ given a board position in FEN notation, using a dataset of 100K–400K chess positions
+ including games and tactical puzzles.

  ## Model Details

  ### Model Description

+ MagnusBot is a custom sequence-to-sequence Transformer trained end-to-end for chess move
+ prediction. Given a board state in FEN notation, it outputs the predicted best move in UCI
+ format. The model was trained in two phases: a base training phase (25 epochs) followed by
+ a fine-tuning phase (4 epochs) focused on tactical positions, including checkmate threats
+ and winning combinations.

  - **Developed by:** Jochem van Kemenade
+ - **Model type:** Custom Encoder-Decoder Transformer (trained from scratch)
+ - **Domain:** Chess notation (FEN input → UCI move output)
  - **License:** Apache 2.0
+ - **Architecture:** Custom `Transformer` with `ChessTokenizer` (chess-specific vocabulary)

  ## Uses

  ### Direct Use

+ Given a chess board state in FEN notation, the model predicts the next best move in UCI
+ format. It is designed for use as a chess engine component or tournament player.

  ### Out-of-Scope Use

+ This model is not intended for general natural language tasks. It has been specialized for
+ chess move prediction and will perform poorly outside that domain.

  ## Training Details

  ### Training Data

+ Training data was sourced from four datasets, combined and deduplicated:

+ - **`chess_moves.csv`** — local dataset of chess positions (primary source)
+ - **`train_data.csv`** — local dataset of additional chess positions
+ - **`chess_moves_1st.csv`** — local dataset of first-move positions
+ - **`ssingh22/chess-evaluations` (tactics split, HuggingFace)** — tactical puzzles filtered
+ to positions with engine evaluations between –2000 and +2000 centipawns, balanced between
+ white-favourable and black-favourable positions

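The filtering and balancing described in the last bullet can be sketched as follows. This is illustrative only: the field names (`fen`, `eval_cp`) are assumptions, not the actual schema of `ssingh22/chess-evaluations`.

```python
import random

# Hypothetical sketch of the tactics filtering step: keep positions whose
# engine evaluation lies within +/-2000 centipawns, then balance the set
# between white-favourable (eval >= 0) and black-favourable (eval < 0).
def filter_and_balance(rows, limit_cp=2000, seed=0):
    kept = [r for r in rows if -limit_cp <= r["eval_cp"] <= limit_cp]
    white = [r for r in kept if r["eval_cp"] >= 0]   # white-favourable
    black = [r for r in kept if r["eval_cp"] < 0]    # black-favourable
    n = min(len(white), len(black))
    rng = random.Random(seed)
    balanced = rng.sample(white, n) + rng.sample(black, n)
    rng.shuffle(balanced)
    return balanced

rows = [
    {"fen": "startpos", "eval_cp": 35},
    {"fen": "p1", "eval_cp": -120},
    {"fen": "p2", "eval_cp": 5400},   # mate-level score, filtered out
    {"fen": "p3", "eval_cp": -90},
]
subset = filter_and_balance(rows)
print(len(subset))  # 2: one white-favourable, one black-favourable
```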
+ Total training data: **~4M examples**
+
+ Each training example is formatted as a tokenized FEN string (source) mapped to a UCI move
+ (target).
+
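A character-level tokenizer over FEN/UCI symbols might look like the sketch below. The actual `ChessTokenizer` vocabulary and special tokens are not documented in this README, so every detail here is an assumption.

```python
# Illustrative chess-specific tokenizer (NOT the real ChessTokenizer).
# FEN needs piece letters, digits, and a few separators; UCI moves reuse
# the file letters, rank digits, and promotion-piece symbols.
FEN_CHARS = "prnbqkPRNBQK12345678/abcdefgh wKQkq-0"  # duplicates removed below
SPECIALS = ["<pad>", "<bos>", "<eos>"]

vocab = SPECIALS + sorted(set(FEN_CHARS))
stoi = {ch: i for i, ch in enumerate(vocab)}

def encode(text):
    """FEN or UCI string -> list of token ids, wrapped in BOS/EOS."""
    return [stoi["<bos>"]] + [stoi[c] for c in text] + [stoi["<eos>"]]

src = encode("rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1")
tgt = encode("e2e4")
print(len(tgt))  # 6 tokens: BOS + e,2,e,4 + EOS
```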
+ #### Fine-Tuning Data
+
+ The fine-tuning phase uses a smaller curated subset focused on winning games of at most 200 moves:
+
+ - 1% replay of the base training data to mitigate catastrophic forgetting
+ - 50% sample of the local CSV data
+
+ 90/10 train/validation split for both phases.
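Building that mixture might look like the following sketch; the proportions come from the bullets above, but the sampling code itself is an assumption (the README does not include it).

```python
import random

# Sketch of the fine-tuning data mixture: 1% replay of the base set,
# a 50% sample of the local CSVs, then a 90/10 train/validation split.
rng = random.Random(42)

base_data = [f"base_{i}" for i in range(1000)]   # stand-in for base training set
local_csv = [f"local_{i}" for i in range(200)]   # stand-in for local CSV rows

replay = rng.sample(base_data, len(base_data) // 100)   # 1% replay
local = rng.sample(local_csv, len(local_csv) // 2)      # 50% sample
mixture = replay + local
rng.shuffle(mixture)

split = int(0.9 * len(mixture))                          # 90/10 split
train, val = mixture[:split], mixture[split:]
print(len(train), len(val))  # 99 11
```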
+
+ ### Training Procedure
+
+ Training is split into two phases:
+
+ **Phase 1 — Base Training**
+ - Trained from scratch on the full combined dataset
+ - 25 epochs, Adam optimizer
+ - Mixed precision training (AMP fp16 via `torch.amp`)
+ - Batch size and learning rate sourced from Optuna-tuned config (`opt-configs.yml`)
+
+ **Phase 2 — Fine-Tuning on Tactical Positions**
+ - Initialized from Phase 1 weights
+ - 4 epochs, learning rate reduced to 10% of base LR
+ - Gradient accumulation over 4 steps (effective batch size ×4)
+ - Mixed precision training (AMP fp16)
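The gradient-accumulation arithmetic behind "effective batch size ×4" can be verified with a dependency-free sketch: averaging the gradients of four equal-size micro-batches reproduces the gradient of one batch four times as large. (The actual training loop uses `torch.amp` for fp16; that part is omitted here.)

```python
# 1-parameter linear model y ~ w*x with mean-squared-error loss.
def grad(w, batch):
    """d/dw of mean((w*x - y)^2) over the batch."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

w = 0.5
data = [(x, 2.0 * x) for x in range(1, 17)]           # 16 examples, true w = 2
micro_batches = [data[i:i + 4] for i in range(0, 16, 4)]

# Accumulate over 4 micro-batches (average their gradients) ...
accum = sum(grad(w, mb) for mb in micro_batches) / len(micro_batches)
# ... and compare with a single pass over the full effective batch of 16.
full = grad(w, data)
print(abs(accum - full) < 1e-9)  # True
```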
 
 
 
  #### Hardware

  - **Hardware:** NVIDIA GeForce RTX 4070 Super (12GB VRAM)
+ - **Training time:** ~10 hours