Jochemvkem committed
Commit f6a90a2 · verified · 1 Parent(s): 8449ad5

Update README.md

Files changed (1)
  1. README.md +52 -41
README.md CHANGED
@@ -2,78 +2,89 @@
  library_name: transformers
  tags:
  - chess
- - fine-tuned
- - qwen2.5
- - causal-lm
- base_model: Qwen/Qwen2.5-0.5B
  ---

- # Magnus Chess Player — Qwen2.5-0.5B Fine-tune

- A Qwen2.5-0.5B language model fine-tuned on Magnus Carlsen's Lichess games
- to predict strong chess moves given a board position in FEN notation.

  ## Model Details

  ### Model Description

- This model is a causal language model fine-tuned to predict chess moves in UCI
- format given a board state in FEN notation. It was trained on over 400,000 moves
- played by World Chess Champion Magnus Carlsen on Lichess.org, with the goal of
- capturing his playing style and strategic intuition.

  - **Developed by:** Jochem van Kemenade
- - **Model type:** Causal Language Model (decoder-only)
- - **Language(s):** Chess notation (FEN + UCI)
  - **License:** Apache 2.0
- - **Finetuned from:** Qwen/Qwen2.5-0.5B

  ## Uses

  ### Direct Use

- Given a chess board state in FEN notation, the model predicts the next best move
- in UCI format. It is used as the backbone of a `TransformerPlayer` for a chess
- tournament setting.

  ### Out-of-Scope Use
- This model is not intended for general natural language tasks. It has been
- specialized for chess move prediction and will perform poorly outside that domain.

  ## Training Details
  ### Training Data
- ~400,000 moves sampled from a dataset of Magnus Carlsen's Lichess.org games.
- Each training example is formatted as:
- ```
- FEN: <fen string>
- Move: <uci move>
- ```

- 95/5 train/eval split

- ### Training Procedure

- #### Preprocessing

- FEN strings were taken from the `fen` column of the dataset. UCI moves were
- validated using the `python-chess` library. Invalid or malformed moves were
- discarded.

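For illustration, a minimal stand-in for this validation step is sketched below. The original pipeline used `python-chess` to check each move for full legality against the FEN position; this dependency-free sketch only rejects strings that are not syntactically valid UCI, which is a weaker check.

```python
import re

# Format-level stand-in for the validation step described above. The real
# pipeline used python-chess for full legality checks against the board;
# this sketch only verifies UCI syntax (from-square, to-square, optional
# promotion piece).
UCI_RE = re.compile(r"^[a-h][1-8][a-h][1-8][qrbn]?$")

def looks_like_uci(move):
    """Return True if `move` is syntactically a UCI move string."""
    return bool(UCI_RE.match(move))

moves = ["e2e4", "e7e8q", "O-O", "e2", "a1a9"]
valid = [m for m in moves if looks_like_uci(m)]
print(valid)  # ['e2e4', 'e7e8q']
```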
- #### Training Hyperparameters

- - **Training regime:** fp16 mixed precision
- - **Base model:** Qwen/Qwen2.5-0.5B
- - **Fine-tuning method:** LoRA (r=16, alpha=32, target modules: q_proj, v_proj, k_proj, o_proj)
- - **Epochs:** 3
- - **Batch size:** 12 (effective 60 with gradient accumulation steps=5)
- - **Learning rate:** 2e-4
- - **Max sequence length:** 128 tokens
- - **Warmup steps:** 100

  #### Hardware
  - **Hardware:** NVIDIA GeForce RTX 4070 Super (12GB VRAM)
- - **Training time:** ~5 hours

  library_name: transformers
  tags:
  - chess
+ - custom-transformer
+ - move-prediction
  ---

+ # MagnusBot — Custom Transformer Chess Move Predictor

+ A custom encoder-decoder Transformer trained from scratch to predict strong chess moves
+ given a board position in FEN notation, using a dataset of 100K–400K chess positions
+ including games and tactical puzzles.

  ## Model Details

  ### Model Description

+ MagnusBot is a custom sequence-to-sequence Transformer trained end-to-end for chess move
+ prediction. Given a board state in FEN notation, it outputs the predicted best move in UCI
+ format. The model was trained in two phases: a base training phase (25 epochs) followed by
+ a fine-tuning phase (4 epochs) focused on tactical positions, including checkmate threats
+ and winning combinations.

  - **Developed by:** Jochem van Kemenade
+ - **Model type:** Custom Encoder-Decoder Transformer (trained from scratch)
+ - **Domain:** Chess notation (FEN input → UCI move output)
  - **License:** Apache 2.0
+ - **Architecture:** Custom `Transformer` with `ChessTokenizer` (chess-specific vocabulary)

  ## Uses

  ### Direct Use

+ Given a chess board state in FEN notation, the model predicts the next best move in UCI
+ format. It is designed for use as a chess engine component or tournament player.

  ### Out-of-Scope Use

+ This model is not intended for general natural language tasks. It has been specialized for
+ chess move prediction and will perform poorly outside that domain.

  ## Training Details

  ### Training Data

+ Training data was sourced from four datasets, combined and deduplicated:

+ - **`chess_moves.csv`** — local dataset of chess positions (primary source)
+ - **`train_data.csv`** — local dataset of additional chess positions
+ - **`chess_moves_1st.csv`** — local dataset of first-move positions
+ - **`ssingh22/chess-evaluations` (tactics split, HuggingFace)** — tactical puzzles filtered
+ to positions with engine evaluations between –2000 and +2000 centipawns, balanced between
+ white-favourable and black-favourable positions

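The filtering and balancing described in the last bullet can be sketched as follows. This is illustrative only: the field names (`fen`, `eval_cp`) are assumptions, not the actual schema of `ssingh22/chess-evaluations`.

```python
import random

# Hypothetical sketch of the tactics filtering step: keep positions whose
# engine evaluation lies within +/-2000 centipawns, then balance the set
# between white-favourable (eval >= 0) and black-favourable (eval < 0).
def filter_and_balance(rows, limit_cp=2000, seed=0):
    kept = [r for r in rows if -limit_cp <= r["eval_cp"] <= limit_cp]
    white = [r for r in kept if r["eval_cp"] >= 0]   # white-favourable
    black = [r for r in kept if r["eval_cp"] < 0]    # black-favourable
    n = min(len(white), len(black))
    rng = random.Random(seed)
    balanced = rng.sample(white, n) + rng.sample(black, n)
    rng.shuffle(balanced)
    return balanced

rows = [
    {"fen": "startpos", "eval_cp": 35},
    {"fen": "p1", "eval_cp": -120},
    {"fen": "p2", "eval_cp": 5400},   # mate-level score, filtered out
    {"fen": "p3", "eval_cp": -90},
]
subset = filter_and_balance(rows)
print(len(subset))  # 2: one white-favourable, one black-favourable
```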
+ Total training data: **~4M examples**
+
+ Each training example is formatted as a tokenized FEN string (source) mapped to a UCI move
+ (target).
+
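A character-level tokenizer over FEN/UCI symbols might look like the sketch below. The actual `ChessTokenizer` vocabulary and special tokens are not documented in this README, so every detail here is an assumption.

```python
# Illustrative chess-specific tokenizer (NOT the real ChessTokenizer).
# FEN needs piece letters, digits, and a few separators; UCI moves reuse
# the file letters, rank digits, and promotion-piece symbols.
FEN_CHARS = "prnbqkPRNBQK12345678/abcdefgh wKQkq-0"  # duplicates removed below
SPECIALS = ["<pad>", "<bos>", "<eos>"]

vocab = SPECIALS + sorted(set(FEN_CHARS))
stoi = {ch: i for i, ch in enumerate(vocab)}

def encode(text):
    """FEN or UCI string -> list of token ids, wrapped in BOS/EOS."""
    return [stoi["<bos>"]] + [stoi[c] for c in text] + [stoi["<eos>"]]

src = encode("rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1")
tgt = encode("e2e4")
print(len(tgt))  # 6 tokens: BOS + e,2,e,4 + EOS
```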
+ #### Fine-Tuning Data
+
+ The fine-tuning phase uses a smaller curated subset focused on winning games of at most 200 moves:
+
+ - 1% replay of the base training data to mitigate catastrophic forgetting
+ - 50% sample of the local CSV data
+
+ 90/10 train/validation split for both phases.
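Building that mixture might look like the following sketch; the proportions come from the bullets above, but the sampling code itself is an assumption (the README does not include it).

```python
import random

# Sketch of the fine-tuning data mixture: 1% replay of the base set,
# a 50% sample of the local CSVs, then a 90/10 train/validation split.
rng = random.Random(42)

base_data = [f"base_{i}" for i in range(1000)]   # stand-in for base training set
local_csv = [f"local_{i}" for i in range(200)]   # stand-in for local CSV rows

replay = rng.sample(base_data, len(base_data) // 100)   # 1% replay
local = rng.sample(local_csv, len(local_csv) // 2)      # 50% sample
mixture = replay + local
rng.shuffle(mixture)

split = int(0.9 * len(mixture))                          # 90/10 split
train, val = mixture[:split], mixture[split:]
print(len(train), len(val))  # 99 11
```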
+
+ ### Training Procedure
+
+ Training is split into two phases:
+
+ **Phase 1 — Base Training**
+ - Trained from scratch on the full combined dataset
+ - 25 epochs, Adam optimizer
+ - Mixed precision training (AMP fp16 via `torch.amp`)
+ - Batch size and learning rate sourced from Optuna-tuned config (`opt-configs.yml`)
+
+ **Phase 2 — Fine-Tuning on Tactical Positions**
+ - Initialized from Phase 1 weights
+ - 4 epochs, learning rate reduced to 10% of base LR
+ - Gradient accumulation over 4 steps (effective batch size ×4)
+ - Mixed precision training (AMP fp16)
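The gradient-accumulation arithmetic behind "effective batch size ×4" can be verified with a dependency-free sketch: averaging the gradients of four equal-size micro-batches reproduces the gradient of one batch four times as large. (The actual training loop uses `torch.amp` for fp16; that part is omitted here.)

```python
# 1-parameter linear model y ~ w*x with mean-squared-error loss.
def grad(w, batch):
    """d/dw of mean((w*x - y)^2) over the batch."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

w = 0.5
data = [(x, 2.0 * x) for x in range(1, 17)]           # 16 examples, true w = 2
micro_batches = [data[i:i + 4] for i in range(0, 16, 4)]

# Accumulate over 4 micro-batches (average their gradients) ...
accum = sum(grad(w, mb) for mb in micro_batches) / len(micro_batches)
# ... and compare with a single pass over the full effective batch of 16.
full = grad(w, data)
print(abs(accum - full) < 1e-9)  # True
```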
 
 
 
  #### Hardware

  - **Hardware:** NVIDIA GeForce RTX 4070 Super (12GB VRAM)
+ - **Training time:** ~10 hours