sigmoidneuron123 committed on
Commit a917654 · verified · 1 Parent(s): 910467a

Upload 4 files

Files changed (4)
  1. README.md +61 -3
  2. chessy_model.pth +3 -0
  3. chessy_modelt-1.pth +3 -0
  4. selfchess.py +213 -0
README.md CHANGED
@@ -1,3 +1,61 @@
- ---
- license: mit
- ---
+ # NeoChess
+
+ NeoChess is a self-learning chess engine written in Python. It uses PyTorch to build a neural network that evaluates chess positions, and it learns by playing games against the Stockfish engine and against itself. The core learning mechanism follows reinforcement-learning principles: the model is rewarded for winning games and penalized for losing.
+
+ ## How It Works
+
+ The training process is orchestrated by the `selfchess.py` script, which follows these steps:
+
+ 1. **Game Simulation**: The engine plays a large number of chess games, divided into three categories:
+    * NeoChess (as White) vs. Stockfish (as Black)
+    * NeoChess (as Black) vs. Stockfish (as White)
+    * NeoChess vs. NeoChess (self-play)
+
+ 2. **Parallel Processing**: To speed up data generation, games are simulated in parallel using Python's `multiprocessing` library, utilizing the available CPU cores.
+
+ 3. **Move Selection**:
+    * **NeoChess**: Uses a negamax search (`search`) with alpha-beta pruning to explore future moves. Terminal positions in the search are evaluated by the neural network.
+    * **Stockfish**: A standard, powerful chess engine provides the opponent's moves.
+
+ 4. **Data Collection**: During each game, every board position (FEN string) where it is NeoChess's turn to move is stored.
+
+ 5. **Training**: After a game concludes, a reward is assigned: `+10` for a win, `-10` for a loss, and `0` for a draw. The network is then trained on the positions collected from that game. The training target for each position is the final reward scaled by how late in the game the position occurred, encouraging the model to value positions that lead to wins.
+
+ 6. **Model Saving**: The model's state (`chessy_model.pth`) is saved after each game. A backup (`chessy_modelt-1.pth`) is also kept and updated periodically.
+
+ ## Model Architecture
+
+ The brain of NeoChess is a neural network (the `NN1` class) with the following structure:
+
+ - **Embedding Layer**: Maps each of the 13 square states (empty plus 12 piece types) into a 64-dimensional vector space.
+ - **Multi-Head Attention**: An attention mechanism lets the model weigh the importance of different pieces and their relationships on the board.
+ - **Feed-Forward Network**: A deep stack of linear layers with ReLU activations processes the attention features to produce a final evaluation score for the position.
+
+ ## Requirements
+
+ - Python 3.x
+ - PyTorch
+ - The `python-chess` library
+ - A UCI-compatible chess engine binary (e.g., Stockfish)
+
+ You can install the Python dependencies with pip:
+
+ ```bash
+ pip install torch python-chess
+ ```
+
+ ## Setup and Usage
+
+ 1. **Download Stockfish**: Download the appropriate Stockfish binary for your system from the [official website](https://stockfishchess.org/download/).
+
+ 2. **Configure the Script**: Open `selfchess.py` and edit the `CONFIG` dictionary at the top of the file:
+    - `stockfish_path`: the absolute path of your downloaded Stockfish executable.
+    - `model_path`: the file name for the primary model.
+    - `backup_model_path`: the file name for the backup model.
+    - Adjust other parameters such as `num_games` and `learning_rate` as needed.
+
+ 3. **Run the Training**: Execute the script from your terminal:
+
+ ```bash
+ python selfchess.py
+ ```
+
+ The script will then begin the training process, printing the status of each game and the training loss.
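The outcome-weighted target described in step 5 of the training loop can be sketched in a few lines of plain Python. This is an illustration only; `position_targets` is a hypothetical helper, not a function in `selfchess.py`:

```python
def position_targets(outcome, total_moves, move_numbers):
    """Compute per-position regression targets for one game.

    outcome: final reward (+10.0 win, -10.0 loss, 0.0 draw).
    total_moves: number of half-moves played in the game.
    move_numbers: half-move indices at which positions were stored.
    Positions later in the game receive targets closer to the full reward.
    """
    return [outcome * m / total_moves for m in move_numbers]

# A 40-move win: early positions get small credit, late ones near +10.
print(position_targets(10.0, 40, [1, 20, 39]))  # [0.25, 5.0, 9.75]
```

This linear ramp means an early opening position contributes almost nothing to the loss, while positions just before checkmate are pushed strongly toward the game's outcome.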
chessy_model.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:23331be338f362b080799a951c5190a3047997e4e4f730524d65d9938d4a508e
+ size 21212144
chessy_modelt-1.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8b7194fdd0ee0c1a4347f98179ffab971e693d6e89f7c62f17f5262b07a75661
+ size 21212261
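The two `.pth` entries above are Git LFS pointer files: the repository stores only the `version`/`oid`/`size` triple, and the weights live in LFS storage. The pointer format can be parsed with a short sketch (`parse_lfs_pointer` is a hypothetical helper written against the v1 pointer layout shown above, not part of this repository):

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS v1 pointer file into a dict of its key/value lines."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:23331be338f362b080799a951c5190a3047997e4e4f730524d65d9938d4a508e
size 21212144"""
info = parse_lfs_pointer(pointer)
print(info["size"])  # 21212144
```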
selfchess.py ADDED
@@ -0,0 +1,213 @@
+ import torch
+ import torch.nn as nn
+ import torch.optim as optim
+ import chess
+ import chess.engine as eng
+ import torch.multiprocessing as mp
+ from functools import partial
+
+ # CONFIGURATION
+ CONFIG = {
+     "stockfish_path": "/Users/aaronvattay/Downloads/stockfish/stockfish-macos-m1-apple-silicon",
+     "model_path": "chessy_model.pth",
+     "backup_model_path": "chessy_modelt-1.pth",
+     # Falls back to CPU on machines without Apple's Metal backend.
+     "device": torch.device("mps" if torch.backends.mps.is_available() else "cpu"),
+     "learning_rate": 1e-4,
+     "num_games": 3000,
+     "stockfish_time_limit": 1.0,
+     "search_depth": 1,
+ }
+
+ device = CONFIG["device"]
+
+ def board_to_tensor(board):
+     """Encode a board as a (1, 64) LongTensor: 0 = empty, 1-6 = white
+     P/N/B/R/Q/K, 7-12 = black p/n/b/r/q/k, indexed by square (A1 = 0)."""
+     piece_encoding = {
+         'P': 1, 'N': 2, 'B': 3, 'R': 4, 'Q': 5, 'K': 6,
+         'p': 7, 'n': 8, 'b': 9, 'r': 10, 'q': 11, 'k': 12
+     }
+     tensor = torch.zeros(64, dtype=torch.long)
+     for square in chess.SQUARES:
+         piece = board.piece_at(square)
+         if piece:
+             tensor[square] = piece_encoding[piece.symbol()]
+     return tensor.unsqueeze(0)
+
+ class NN1(nn.Module):
+     def __init__(self):
+         super().__init__()
+         self.embedding = nn.Embedding(13, 64)
+         self.attention = nn.MultiheadAttention(embed_dim=64, num_heads=16)
+         self.neu = 512
+         # The layer stack is kept explicit (rather than generated in a loop)
+         # so the parameter names match the state_dict in the saved checkpoints.
+         self.neurons = nn.Sequential(
+             nn.Linear(4096, self.neu),
+             nn.ReLU(),
+             nn.Linear(self.neu, self.neu),
+             nn.ReLU(),
+             nn.Linear(self.neu, self.neu),
+             nn.ReLU(),
+             nn.Linear(self.neu, self.neu),
+             nn.ReLU(),
+             nn.Linear(self.neu, self.neu),
+             nn.ReLU(),
+             nn.Linear(self.neu, self.neu),
+             nn.ReLU(),
+             nn.Linear(self.neu, self.neu),
+             nn.ReLU(),
+             nn.Linear(self.neu, self.neu),
+             nn.ReLU(),
+             nn.Linear(self.neu, self.neu),
+             nn.ReLU(),
+             nn.Linear(self.neu, self.neu),
+             nn.ReLU(),
+             nn.Linear(self.neu, self.neu),
+             nn.ReLU(),
+             nn.Linear(self.neu, self.neu),
+             nn.ReLU(),
+             nn.Linear(self.neu, self.neu),
+             nn.ReLU(),
+             nn.Linear(self.neu, 64),
+             nn.ReLU(),
+             nn.Linear(64, 4)
+         )
+
+     def forward(self, x):
+         x = self.embedding(x)                      # (batch, 64, 64)
+         x = x.permute(1, 0, 2)                     # sequence-first for attention
+         attn_output, _ = self.attention(x, x, x)
+         x = attn_output.permute(1, 0, 2).contiguous()
+         x = x.view(x.size(0), -1)                  # flatten to (batch, 4096)
+         x = self.neurons(x)
+         return x
+
+ model = NN1().to(device)
+ optimizer = optim.Adam(model.parameters(), lr=CONFIG["learning_rate"])
+
+ try:
+     model.load_state_dict(torch.load(CONFIG["model_path"], map_location=device))
+     print(f"Loaded model from {CONFIG['model_path']}")
+ except FileNotFoundError:
+     try:
+         model.load_state_dict(torch.load(CONFIG["backup_model_path"], map_location=device))
+         print(f"Loaded backup model from {CONFIG['backup_model_path']}")
+     except FileNotFoundError:
+         print("No model file found, starting from scratch.")
+
+ model.train()
+ criterion = nn.MSELoss()
+ engine = eng.SimpleEngine.popen_uci(CONFIG["stockfish_path"])
+ lim = eng.Limit(time=CONFIG["stockfish_time_limit"])
+
+ def get_evaluation(board):
+     """Return the evaluation of the board from the perspective of the side
+     to move. The model's raw output is from White's perspective, so it is
+     negated when it is Black's turn."""
+     tensor = board_to_tensor(board).to(device)
+     with torch.no_grad():
+         evaluation = model(tensor)[0][0].item()
+     return evaluation if board.turn == chess.WHITE else -evaluation
+
+ def search(board, depth, alpha, beta):
+     """Negamax search with alpha-beta pruning."""
+     if depth == 0 or board.is_game_over():
+         return get_evaluation(board)
+     max_eval = float('-inf')
+     for move in board.legal_moves:
+         board.push(move)
+         score = -search(board, depth - 1, -beta, -alpha)
+         board.pop()
+         max_eval = max(max_eval, score)
+         alpha = max(alpha, score)
+         if alpha >= beta:  # opponent already has a better option; prune
+             break
+     return max_eval
+
+ def game_gen(_game_index, engine_side):
+     """Play one game and return (positions, reward, move_count).
+     engine_side is the color Stockfish plays, or None for pure self-play.
+     The unused first argument absorbs the index passed in by pool.map."""
+     data = []
+     mc = 0
+     board = chess.Board()
+     while not board.is_game_over():
+         is_bot_turn = board.turn != engine_side
+
+         if is_bot_turn:
+             evaling = {}
+             for move in board.legal_moves:
+                 board.push(move)
+                 evaling[move] = -search(board, depth=CONFIG["search_depth"], alpha=float('-inf'), beta=float('inf'))
+                 board.pop()
+
+             if not evaling:
+                 break
+
+             move = max(evaling, key=evaling.get)
+             data.append({
+                 'fen': board.fen(),
+                 'move_number': mc,
+             })
+         else:
+             result = engine.play(board, lim)
+             move = result.move
+
+         board.push(move)
+         mc += 1
+
+     result = board.result()
+     c = 0.0
+     if result == '1-0':
+         c = 10.0
+     elif result == '0-1':
+         c = -10.0
+     return data, c, mc
+
+ def train(data, c, mc):
+     """Fit the model on one game: each stored position is regressed toward
+     the final reward scaled by how late in the game it occurred."""
+     for entry in data:
+         tensor = board_to_tensor(chess.Board(entry['fen'])).to(device)
+         target = torch.tensor(c * entry['move_number'] / mc, dtype=torch.float32).to(device)
+         output = model(tensor)[0][0]
+         loss = criterion(output, target)
+         optimizer.zero_grad()
+         loss.backward()
+         optimizer.step()
+
+     print(f"Saving model to {CONFIG['model_path']}")
+     torch.save(model.state_dict(), CONFIG["model_path"])
+
+ def main():
+     num_games = CONFIG['num_games']
+     num_instances = mp.cpu_count()
+     print(f"Saving backup model to {CONFIG['backup_model_path']}")
+     torch.save(model.state_dict(), CONFIG["backup_model_path"])
+     with mp.Pool(processes=num_instances) as pool:
+         play_white = partial(game_gen, engine_side=chess.WHITE)
+         play_black = partial(game_gen, engine_side=chess.BLACK)
+         play_self = partial(game_gen, engine_side=None)
+
+         results = pool.map(play_white, range(num_games // 3))
+         results += pool.map(play_black, range(num_games // 3))
+         results += pool.map(play_self, range(num_games // 3))
+     for batch in results:
+         data, c, mc = batch
+         print(f"Saving backup model to {CONFIG['backup_model_path']}")
+         torch.save(model.state_dict(), CONFIG["backup_model_path"])
+         if data:
+             train(data, c, mc)
+     print("Training complete.")
+
+ if __name__ == "__main__":
+     main()
+     engine.quit()
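The negamax-with-alpha-beta idea behind `search()` can be illustrated independently of chess on a tiny explicit game tree. This is a minimal sketch: `negamax`, the toy `tree`, and `leaf_scores` are illustrative stand-ins, not part of the repository; the only requirement mirrored from `selfchess.py` is that the evaluation is always from the side-to-move's perspective, which is why each recursive call is negated:

```python
import math

def negamax(node, depth, alpha, beta, evaluate, children):
    """Negamax with alpha-beta pruning over an explicit tree.

    evaluate(node) must score a position from the perspective of the
    player to move at that node, exactly as get_evaluation() does in
    selfchess.py; the negation flips it back for the parent."""
    kids = children(node)
    if depth == 0 or not kids:
        return evaluate(node)
    best = -math.inf
    for child in kids:
        score = -negamax(child, depth - 1, -beta, -alpha, evaluate, children)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:  # the opponent will avoid this line; prune
            break
    return best

# Toy tree: leaves hold scores for the player to move at the leaf.
tree = {"root": ["a", "b"], "a": ["a1", "a2"], "b": ["b1", "b2"]}
leaf_scores = {"a1": 3, "a2": -1, "b1": 5, "b2": -7}
value = negamax("root", 2, -math.inf, math.inf,
                lambda n: leaf_scores.get(n, 0), lambda n: tree.get(n, []))
print(value)  # -1: the root player picks branch "a", whose best reply is -1
```

The move-selection loop in `game_gen` is the same pattern unrolled one ply: it pushes each legal move, negates the child's search value, and keeps the maximum.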