nathanael-fijalkow committed
Commit cb44915 · 1 Parent(s): eda3ee4

First version ready with webhook and deterministic eval
.gitignore CHANGED
@@ -9,6 +9,8 @@ dist/
  build/
  *.egg
 
+ .github/
+
  # Virtual environments
  .venv/
  venv/
README.md CHANGED
@@ -1,6 +1,6 @@
  ---
  title: Chess Challenge Arena
- emoji: ♟️
+ emoji: chess_pawn
  colorFrom: gray
  colorTo: yellow
  sdk: gradio
@@ -16,9 +16,3 @@ short_description: Play Chess like a Honey Bee
  This Space hosts the evaluation arena for the LLM Chess Challenge.
 
  **Chess Challenge Template**: https://github.com/nathanael-fijalkow/ChessChallengeTemplate
-
- ## Features
-
- - **Interactive Demo**: Test any submitted model against Stockfish
- - **Leaderboard**: See rankings of all submitted models
- - **Statistics**: View detailed performance metrics
TEMPLATE_README.md CHANGED
@@ -1,15 +1,16 @@
  # Chess Challenge
 
- Train a 1M parameter LLM to play chess!
 
  ## Objective
 
  Design and train a transformer-based language model to predict chess moves. Your model must:
 
  1. **Stay under 1M parameters** - This is the hard constraint!
- 2. **Use a custom tokenizer** - Design an efficient move-level tokenizer
- 3. **Play legal chess** - The model should learn to generate valid moves
- 4. **Beat Stockfish** - Your ELO will be measured against Stockfish Level 1
 
  ## Dataset
 
@@ -17,7 +18,7 @@ We use the Lichess dataset: [`dlouapre/lichess_2025-01_1M`](https://huggingface.
 
  The dataset uses an extended UCI notation:
  - `W`/`B` prefix for White/Black
- - Piece letter: `P`=Pawn, `N`=Knight, `B`=Bishop, `R`=Rook, `Q`=Queen, `K`=King
  - Source and destination squares (e.g., `e2e4`)
  - Special suffixes: `(x)`=capture, `(+)`=check, `(+*)`=checkmate, `(o)`/`(O)`=castling
 
@@ -26,127 +27,468 @@ Example game:
  WPe2e4 BPe7e5 WNg1f3 BNb8c6 WBf1b5 BPa7a6 WBb5c6(x) BPd7c6(x) ...
  ```
 
- ## Quick Start
 
- ### Train a Model
 
- ```bash
- # Basic training
- python -m src.train \
-     --output_dir ./my_model \
-     --num_train_epochs 3 \
-     --per_device_train_batch_size 32
  ```
 
- ### Evaluate Your Model
 
- Evaluation happens in two phases:
 
- ```bash
- # Phase 1: Legal Move Evaluation (quick sanity check)
- python -m src.evaluate \
-     --model_path ./my_model \
-     --mode legal \
-     --n_positions 500
-
- # Phase 2: Win Rate Evaluation (full games against Stockfish)
- python -m src.evaluate \
-     --model_path ./my_model \
-     --mode winrate \
-     --n_games 100 \
-     --stockfish_level 1
-
- # Or run both phases:
- python -m src.evaluate \
-     --model_path ./my_model \
-     --mode both
  ```
 
- ## Parameter Budget
 
- Use the utility function to check your budget:
 
  ```python
- from src import ChessConfig, print_parameter_budget
 
- config = ChessConfig(
-     vocab_size=1200,
      n_embd=128,
      n_layer=4,
      n_head=4,
  )
- print_parameter_budget(config)
  ```
 
- ### Pro Tips
 
- 1. **Weight Tying**: The default config ties the embedding and output layer weights, saving ~154k parameters
- 2. **Vocabulary Size**: Keep it small! ~1200 tokens covers all moves
- 3. **Depth vs Width**: With limited parameters, experiment with shallow-but-wide vs deep-but-narrow
 
- ## Customization
 
- ### Custom Tokenizer
 
- The template provides a move-level tokenizer that builds vocabulary from the actual dataset.
- Feel free to try different approaches!
 
- ### Custom Architecture
 
- Modify the model in `src/model.py`:
 
  ```python
- from src import ChessConfig, ChessForCausalLM
-
- # Customize configuration
- config = ChessConfig(
-     vocab_size=1200,
-     n_embd=128,    # Try 96, 128, or 192
-     n_layer=4,     # Try 3, 4, or 6
-     n_head=4,      # Try 4 or 8
-     n_inner=384,   # Feed-forward dimension (default: 3*n_embd)
-     dropout=0.1,
-     tie_weights=True,
- )
 
- model = ChessForCausalLM(config)
  ```
 
- ## Evaluation Metrics
 
- ### Phase 1: Legal Move Evaluation
 
- Tests if your model generates valid chess moves:
 
- | Metric | Description |
- |--------|-------------|
- | **Legal Rate (1st try)** | % of legal moves on first attempt |
- | **Legal Rate (with retry)** | % of legal moves within 3 attempts |
 
- > **Target**: >90% legal rate before proceeding to Phase 2
 
- ### Phase 2: Win Rate Evaluation
 
- Full games against Stockfish to measure playing strength:
 
  | Metric | Description |
  |--------|-------------|
- | **Win Rate** | % of games won against Stockfish |
- | **ELO Rating** | Estimated rating based on game results |
- | **Avg Game Length** | Average number of moves per game |
- | **Illegal Move Rate** | % of illegal moves during games |
 
- ## Submission
 
- 1. Train your model
- 2. Log in to Hugging Face: `hf auth login`
- 3. Submit your model using the submission script:
 
- ```bash
- python submit.py --model_path ./my_model/final_model --model_name your-model-name
- ```
 
- The script will:
- - Upload your model to the LLM-course organization
- - Include your HF username in the model card for tracking
  # Chess Challenge
 
+ Train a transformer with less than 1M parameters to play legal chess moves!
 
  ## Objective
 
  Design and train a transformer-based language model to predict chess moves. Your model must:
 
  1. **Stay under 1M parameters** - This is the hard constraint!
+ 2. **Create a custom tokenizer** - Design your own move-level tokenizer
+ 3. **Create a custom model architecture** - Build your own transformer
+ 4. **Play legal chess** - The model should learn to generate valid moves
+ 5. **Do NOT use python-chess to filter moves** - The model must generate legal moves on its own
 
  ## Dataset
 
  The dataset uses an extended UCI notation:
  - `W`/`B` prefix for White/Black
+ - Piece letter: `P`=Pawn, `N`=Knight, `B`=Bishop, `R`=Rook, `Q`=Queen, `K`=King
  - Source and destination squares (e.g., `e2e4`)
  - Special suffixes: `(x)`=capture, `(+)`=check, `(+*)`=checkmate, `(o)`/`(O)`=castling
 
  WPe2e4 BPe7e5 WNg1f3 BNb8c6 WBf1b5 BPa7a6 WBb5c6(x) BPd7c6(x) ...
  ```
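Each move token in this notation has a fixed layout, so it can be unpacked with plain string slicing. A minimal sketch (the `parse_move` helper and its field names are illustrative, not part of the challenge code):

```python
def parse_move(token: str) -> dict:
    """Unpack an extended-UCI move token like 'WBb5c6(x)' into its fields."""
    color = token[0]            # 'W' or 'B'
    piece = token[1]            # 'P', 'N', 'B', 'R', 'Q', 'K'
    source = token[2:4]         # e.g. 'b5'
    dest = token[4:6]           # e.g. 'c6'
    suffix = token[6:] or None  # '(x)', '(+)', '(+*)', '(o)', '(O)', or None

    return {"color": color, "piece": piece, "source": source,
            "dest": dest, "suffix": suffix}

print(parse_move("WBb5c6(x)"))
# {'color': 'W', 'piece': 'B', 'source': 'b5', 'dest': 'c6', 'suffix': '(x)'}
```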
 
+ ---
+
+ ## Building Your Solution
+
+ You need to create **from scratch**:
+
+ 1. A custom tokenizer class
+ 2. A custom model architecture
+ 3. A training script
+ 4. A script that saves everything in the correct format
+
+ A complete working example is available in `example_solution/` - use it as a reference, but build your own!
+
+ ---
+
+ ## Step 1: Create a Custom Tokenizer
+
+ Your tokenizer must inherit from `PreTrainedTokenizer` and implement the required methods.
+
+ ### Required Files
+
+ Create a file called `tokenizer.py` with your tokenizer class:
+
+ ```python
+ import json
+ import os
+ from typing import Dict, List, Optional
+
+ from transformers import PreTrainedTokenizer
+
+
+ class MyChessTokenizer(PreTrainedTokenizer):
+     """Custom tokenizer for chess moves."""
+
+     # Tell HuggingFace which files to save/load
+     vocab_files_names = {"vocab_file": "vocab.json"}
+
+     def __init__(
+         self,
+         vocab_file: Optional[str] = None,
+         **kwargs,
+     ):
+         # Define special tokens
+         self.pad_token = "[PAD]"
+         self.bos_token = "[BOS]"
+         self.eos_token = "[EOS]"
+         self.unk_token = "[UNK]"
+
+         # Load or create the vocabulary
+         if vocab_file is not None:
+             with open(vocab_file, "r") as f:
+                 self._vocab = json.load(f)
+         else:
+             # Create a default vocab with the special tokens only
+             self._vocab = {
+                 "[PAD]": 0,
+                 "[BOS]": 1,
+                 "[EOS]": 2,
+                 "[UNK]": 3,
+             }
+
+         self._ids_to_tokens = {v: k for k, v in self._vocab.items()}
+
+         # Call the parent init AFTER setting up the vocab
+         super().__init__(
+             pad_token=self.pad_token,
+             bos_token=self.bos_token,
+             eos_token=self.eos_token,
+             unk_token=self.unk_token,
+             **kwargs,
+         )
+
+     @property
+     def vocab_size(self) -> int:
+         return len(self._vocab)
+
+     def get_vocab(self) -> Dict[str, int]:
+         return self._vocab.copy()
+
+     def _tokenize(self, text: str) -> List[str]:
+         """Split text into tokens (moves are space-separated)."""
+         return text.strip().split()
+
+     def _convert_token_to_id(self, token: str) -> int:
+         return self._vocab.get(token, self._vocab.get(self.unk_token, 0))
+
+     def _convert_id_to_token(self, index: int) -> str:
+         return self._ids_to_tokens.get(index, self.unk_token)
+
+     def save_vocabulary(self, save_directory: str, filename_prefix: Optional[str] = None):
+         """Save the vocabulary to a JSON file."""
+         vocab_file = os.path.join(
+             save_directory,
+             (filename_prefix + "-" if filename_prefix else "") + "vocab.json",
+         )
+         with open(vocab_file, "w") as f:
+             json.dump(self._vocab, f, indent=2)
+         return (vocab_file,)
+ ```
 
+ ### Building the Vocabulary
 
+ You need to build a vocabulary. It can be written by hand or inferred from the dataset:
+
+ ```python
+ import json
+
+ from datasets import load_dataset
+
+ # Load the dataset
+ dataset = load_dataset("dlouapre/lichess_2025-01_1M", split="train")
+
+ # Collect all unique moves
+ vocab = {"[PAD]": 0, "[BOS]": 1, "[EOS]": 2, "[UNK]": 3}
+ for game in dataset:
+     moves = game["text"].split()
+     for move in moves:
+         if move not in vocab:
+             vocab[move] = len(vocab)
+
+ print(f"Vocabulary size: {len(vocab)}")
+
+ # Save the vocabulary
+ with open("vocab.json", "w") as f:
+     json.dump(vocab, f, indent=2)
+ ```
 
+ ---
 
+ ## Step 2: Create a Custom Model
+
+ Your model must inherit from `PreTrainedModel` and use a config that inherits from `PretrainedConfig`.
+
+ ### Required Files
+
+ Create a file called `model.py` with your model class:
 
+ ```python
+ import torch
+ import torch.nn as nn
+ from transformers import PretrainedConfig, PreTrainedModel
+ from transformers.modeling_outputs import CausalLMOutputWithPast
+
+
+ class MyChessConfig(PretrainedConfig):
+     """Configuration for the chess model."""
+
+     model_type = "my_chess_model"
+
+     def __init__(
+         self,
+         vocab_size: int = 1500,
+         n_embd: int = 128,
+         n_layer: int = 4,
+         n_head: int = 4,
+         n_ctx: int = 256,
+         dropout: float = 0.1,
+         **kwargs,
+     ):
+         super().__init__(**kwargs)
+         self.vocab_size = vocab_size
+         self.n_embd = n_embd
+         self.n_layer = n_layer
+         self.n_head = n_head
+         self.n_ctx = n_ctx
+         self.dropout = dropout
+
+
+ class MyChessModel(PreTrainedModel):
+     """A simple transformer for chess move prediction."""
+
+     config_class = MyChessConfig
+
+     def __init__(self, config: MyChessConfig):
+         super().__init__(config)
+
+         # Token and position embeddings
+         self.token_emb = nn.Embedding(config.vocab_size, config.n_embd)
+         self.pos_emb = nn.Embedding(config.n_ctx, config.n_embd)
+         self.dropout = nn.Dropout(config.dropout)
+
+         # Transformer layers
+         encoder_layer = nn.TransformerEncoderLayer(
+             d_model=config.n_embd,
+             nhead=config.n_head,
+             dim_feedforward=config.n_embd * 4,
+             dropout=config.dropout,
+             batch_first=True,
+         )
+         self.transformer = nn.TransformerEncoder(encoder_layer, config.n_layer)
+
+         # Output head
+         self.ln_f = nn.LayerNorm(config.n_embd)
+         self.lm_head = nn.Linear(config.n_embd, config.vocab_size, bias=False)
+
+         # Weight tying (saves parameters!)
+         self.lm_head.weight = self.token_emb.weight
+
+         self.post_init()
+
+     def forward(
+         self,
+         input_ids,
+         attention_mask=None,
+         labels=None,
+         **kwargs,
+     ):
+         batch_size, seq_len = input_ids.shape
+         device = input_ids.device
+
+         # Embeddings
+         positions = torch.arange(seq_len, device=device).unsqueeze(0)
+         x = self.token_emb(input_ids) + self.pos_emb(positions)
+         x = self.dropout(x)
+
+         # Causal mask for autoregressive generation
+         causal_mask = torch.triu(
+             torch.ones(seq_len, seq_len, device=device) * float("-inf"),
+             diagonal=1,
+         )
+
+         # Transformer
+         x = self.transformer(x, mask=causal_mask)
+         x = self.ln_f(x)
+         logits = self.lm_head(x)
+
+         # Compute the loss if labels are provided
+         loss = None
+         if labels is not None:
+             shift_logits = logits[..., :-1, :].contiguous()
+             shift_labels = labels[..., 1:].contiguous()
+             loss = nn.functional.cross_entropy(
+                 shift_logits.view(-1, self.config.vocab_size),
+                 shift_labels.view(-1),
+                 ignore_index=-100,
+             )
+
+         return CausalLMOutputWithPast(loss=loss, logits=logits)
+
+     def prepare_inputs_for_generation(self, input_ids, **kwargs):
+         return {"input_ids": input_ids}
+ ```
+
+ ### Parameter Budget Tips
+
+ With 1M parameters, you need to be careful:
+
+ | Component | Formula | Example (128 dim, 1500 vocab) |
+ |-----------|---------|-------------------------------|
+ | Token embeddings | vocab_size x n_embd | 1500 x 128 = 192,000 |
+ | Position embeddings | n_ctx x n_embd | 256 x 128 = 32,768 |
+ | Transformer layer | ~12 x n_embd^2 (attention + 4x feed-forward) | ~196,608 per layer |
+ | LM head | 0 (with weight tying) | 0 |
+
+ **Key savings:**
+ - **Weight tying**: Share the token embeddings with the output layer (saves vocab_size x n_embd parameters)
+ - **Smaller vocabulary**: Only include moves that appear in the training data
+ - **Fewer layers**: 4-6 layers is often enough
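As a sanity check, the budget can be reproduced with a few lines of arithmetic. This is a rough estimate for an architecture like the one sketched above (it ignores biases and LayerNorm parameters):

```python
def parameter_budget(vocab_size=1500, n_embd=128, n_layer=4, n_ctx=256, tie_weights=True):
    """Rough parameter count for a tied-embedding transformer (ignores biases/LayerNorm)."""
    token_emb = vocab_size * n_embd
    pos_emb = n_ctx * n_embd
    # Per layer: 4*d^2 for attention (Q, K, V, output) + 8*d^2 for a 4x feed-forward
    per_layer = 12 * n_embd ** 2
    lm_head = 0 if tie_weights else vocab_size * n_embd
    return token_emb + pos_emb + n_layer * per_layer + lm_head

print(f"{parameter_budget():,}")  # 1,011,200
```

With the defaults (1500 vocab, 128 dim, 4 layers, 256 context) the rough total is 1,011,200 parameters, slightly over budget even with weight tying, which is why trimming the vocabulary or dropping a layer matters.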
+
+ ---
+
+ ## Step 3: Train Your Model
+
+ Create a training script:
+
+ ```python
+ from datasets import load_dataset
+ from transformers import DataCollatorForLanguageModeling, Trainer, TrainingArguments
+
+ from model import MyChessConfig, MyChessModel
+ from tokenizer import MyChessTokenizer
+
+ # Load the tokenizer with your vocabulary
+ tokenizer = MyChessTokenizer(vocab_file="vocab.json")
+
+ # Create the model
+ config = MyChessConfig(
+     vocab_size=tokenizer.vocab_size,
+     n_embd=128,
+     n_layer=4,
+     n_head=4,
+ )
+ model = MyChessModel(config)
+
+ # Check the parameter count
+ n_params = sum(p.numel() for p in model.parameters())
+ print(f"Parameters: {n_params:,}")
+ assert n_params < 1_000_000, f"Model too large: {n_params:,} > 1M"
+
+ # Load and tokenize the dataset
+ dataset = load_dataset("dlouapre/lichess_2025-01_1M", split="train")
+
+ def tokenize_function(examples):
+     return tokenizer(
+         examples["text"],
+         truncation=True,
+         max_length=256,
+         padding="max_length",
+     )
+
+ tokenized_dataset = dataset.map(tokenize_function, batched=True)
+
+ # The collator copies input_ids into labels (with padding masked out),
+ # which the Trainer needs to compute the causal LM loss
+ data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)
+
+ # Training
+ training_args = TrainingArguments(
+     output_dir="./my_model",
+     num_train_epochs=3,
+     per_device_train_batch_size=32,
+     learning_rate=5e-4,
+     save_steps=1000,
+     logging_steps=100,
+ )
+
+ trainer = Trainer(
+     model=model,
+     args=training_args,
+     train_dataset=tokenized_dataset,
+     data_collator=data_collator,
+ )
+
+ trainer.train()
+
+ # Save the final model
+ model.save_pretrained("./my_model/final")
+ tokenizer.save_pretrained("./my_model/final")
+ ```
 
+ ---
 
+ ## Step 4: Prepare for Submission
 
+ Your model directory must contain these files:
 
+ ```
+ my_model/
+     config.json              # Model configuration
+     model.safetensors        # Model weights
+     tokenizer_config.json    # Tokenizer configuration
+     vocab.json               # Vocabulary
+     model.py                 # Your model class
+     tokenizer.py             # Your tokenizer class
+ ```
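Before running the submission script, it can be worth verifying that the directory matches this layout. A small sketch (the `check_submission_dir` helper is illustrative, not part of the challenge tooling):

```python
import os

REQUIRED_FILES = [
    "config.json",
    "model.safetensors",
    "tokenizer_config.json",
    "vocab.json",
    "model.py",
    "tokenizer.py",
]

def check_submission_dir(path: str) -> list:
    """Return the list of required files missing from `path`."""
    return [f for f in REQUIRED_FILES if not os.path.isfile(os.path.join(path, f))]

missing = check_submission_dir("./my_model/final")
if missing:
    print("Missing files:", ", ".join(missing))
else:
    print("All required files present - ready to submit!")
```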
 
+ ### Adding auto_map for Remote Loading
 
+ The `auto_map` field tells HuggingFace how to load your custom classes with `trust_remote_code=True`.
+
+ **In config.json**, add:
+
+ ```json
+ {
+     "auto_map": {
+         "AutoConfig": "model.MyChessConfig",
+         "AutoModelForCausalLM": "model.MyChessModel"
+     },
+     ...
+ }
+ ```
+
+ **In tokenizer_config.json**, add:
+
+ ```json
+ {
+     "auto_map": {
+         "AutoTokenizer": "tokenizer.MyChessTokenizer"
+     },
+     ...
+ }
+ ```
 
+ You can do this programmatically:
 
+ ```python
+ import shutil
+
+ # Register the custom classes for auto loading
+ model.config.auto_map = {
+     "AutoConfig": "model.MyChessConfig",
+     "AutoModelForCausalLM": "model.MyChessModel",
+ }
+ tokenizer.register_for_auto_class("AutoTokenizer")
+
+ # Save
+ model.save_pretrained("./my_model/final")
+ tokenizer.save_pretrained("./my_model/final")
+
+ # Copy your Python files alongside the weights
+ shutil.copy("model.py", "./my_model/final/model.py")
+ shutil.copy("tokenizer.py", "./my_model/final/tokenizer.py")
+ ```
+
+ ---
+
+ ## Local Evaluation (Optional but Recommended)
 
+ Before submitting, you can evaluate your model locally to check its performance. Since the evaluation is **fully deterministic** (fixed seed, deterministic opponent engine), you will get exactly the same results locally as on the HuggingFace Space after submission.
+
+ ```bash
+ python -m src --model ./my_model/final
+ ```
 
+ This runs the same evaluation procedure as the online leaderboard:
+ - 500 moves against the deterministic opponent
+ - The same random seed (42)
+ - The same move generation parameters
 
+ Use this to iterate quickly on your model before pushing to HuggingFace!
 
+ ---
 
+ ## Step 5: Submit
 
+ ```bash
+ python submit.py --model_path ./my_model/final --model_name my-chess-model
+ ```
+
+ The script will:
+ 1. Validate that all required files are present
+ 2. Check that `auto_map` is configured
+ 3. Count parameters and warn if over 1M
+ 4. Log you into HuggingFace (if needed)
+ 5. Upload the model to the LLM-course organization
+
+ ---
+
+ ## Evaluation
+
+ After submission, go to the [Chess Challenge Arena](https://huggingface.co/spaces/LLM-course/Chess1MChallenge) to run evaluation.
+
+ ### Evaluation Procedure
 
+ 1. **Parameter Check**: The model must have < 1M parameters
+ 2. **Security Check**: The code is scanned for illegal python-chess usage
+ 3. **Game Play**: 500 moves against a deterministic opponent engine
+ 4. **Move Generation**: 3 retries allowed per move (greedy on the 1st try, then sampling)
+ 5. **Scoring**: Legal move rate (first try and with retries)
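The greedy-then-sampling retry scheme in step 4 can be sketched as follows. This is a simplified illustration with a stand-in `propose_move` function in place of a real model and legality check; it is not the arena's actual code:

```python
import random

def propose_move(greedy: bool, rng: random.Random) -> str:
    """Stand-in for the model: greedy decoding or temperature sampling."""
    if greedy:
        return "WPe2e5"                       # greedy pick - illegal in this toy position
    return rng.choice(["WPe2e4", "WNg1f3"])   # sampled picks - legal alternatives

def generate_legal_move(is_legal, max_attempts: int = 3, seed: int = 42):
    """Try greedy decoding first, then fall back to sampling on retries."""
    rng = random.Random(seed)
    for attempt in range(max_attempts):
        move = propose_move(greedy=(attempt == 0), rng=rng)
        if is_legal(move):
            return move, attempt + 1  # the move and the number of attempts used
    return None, max_attempts         # no legal move found: counted as illegal

move, attempts = generate_legal_move(lambda m: m != "WPe2e5")
print(move, attempts)
```

A move that succeeds on the first (greedy) attempt counts toward the "1st try" rate; one that needs sampling retries only counts toward the "with retries" rate.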
 
+ ### Scoring
 
  | Metric | Description |
  |--------|-------------|
+ | **Legal Rate (1st try)** | % of moves legal on first attempt |
+ | **Legal Rate (with retries)** | % of moves legal within 3 attempts |
 
+ **Target**: >90% legal rate = excellent performance
 
+ ---
 
+ ## Example Solution
 
+ A complete working example is in `example_solution/`:
 
+ - `model.py` - Full transformer implementation
+ - `tokenizer.py` - Complete tokenizer class
+ - `train.py` - Training script with data loading
+ - `data.py` - Dataset utilities
+
+ Use it as a reference to understand the expected format and structure.
+
+ ---
+
+ ## Rules
+
+ 1. **< 1M parameters** - Hard limit, checked automatically
+ 2. **No python-chess for move filtering** - The model must generate legal moves on its own
+ 3. **Custom architecture required** - Must include model.py and tokenizer.py
+ 4. **Use the submission script** - Required for leaderboard tracking
+
+ Good luck!
app.py CHANGED
@@ -1,21 +1,26 @@
  """
- Play Chess like a Honey Bee
 
  This Gradio app provides:
- 1. Interactive demo to test models
- 2. Leaderboard of submitted models
- 3. Live game visualization
 
- Instructions:
  The goal is to train a language model to play chess, under a strict constraint:
  less than 1M parameters! This is approximately the number of neurons of a honey bee.
 
  Leaderboard data is stored in a private HuggingFace dataset for persistence.
  """
 
  import io
  import os
  import sys
  from datetime import datetime
  from pathlib import Path
  from typing import Optional
@@ -28,38 +33,138 @@ ORGANIZATION = os.environ.get("HF_ORGANIZATION", "LLM-course")
  LEADERBOARD_DATASET = os.environ.get("LEADERBOARD_DATASET", f"{ORGANIZATION}/chess-challenge-leaderboard")
  LEADERBOARD_FILENAME = "leaderboard.csv"
  HF_TOKEN = os.environ.get("HF_TOKEN")  # Required for private dataset access
-
- # Evaluation settings
- EVAL_SEED = 42
- EVAL_N_POSITIONS = 500
-
- STOCKFISH_LEVELS = {
-     "Beginner (Level 0)": 0,
-     "Easy (Level 1)": 1,
-     "Medium (Level 3)": 3,
-     "Hard (Level 5)": 5,
- }
 
  # CSV columns for the leaderboard
  LEADERBOARD_COLUMNS = [
      "model_id",
      "user_id",
-     "legal_rate",
      "legal_rate_first_try",
-     # "elo",
-     # "win_rate",
-     # "draw_rate",
-     # "games_played",
      "last_updated",
  ]
 
 
  def load_leaderboard() -> list:
      """Load leaderboard from private HuggingFace dataset."""
      try:
          from huggingface_hub import hf_hub_download
 
-         # Download the CSV file from the dataset
          csv_path = hf_hub_download(
              repo_id=LEADERBOARD_DATASET,
              filename=LEADERBOARD_FILENAME,
@@ -72,7 +177,6 @@ def load_leaderboard() -> list:
 
      except Exception as e:
          print(f"Could not load leaderboard from dataset: {e}")
-         # Return empty list if dataset doesn't exist yet
          return []
 
 
@@ -81,7 +185,6 @@ def save_leaderboard(data: list):
      try:
          from huggingface_hub import HfApi
 
-         # Convert to DataFrame
          df = pd.DataFrame(data, columns=LEADERBOARD_COLUMNS)
 
          # Fill missing columns with defaults
@@ -89,7 +192,6 @@ def save_leaderboard(data: list):
              if col not in df.columns:
                  df[col] = None
 
-         # Reorder columns
          df = df[LEADERBOARD_COLUMNS]
 
          # Convert to CSV bytes
@@ -118,25 +220,21 @@ def get_available_models() -> list:
      try:
          from huggingface_hub import list_models
 
-         # Get all chess models sorted by newest first
          models = list(list_models(author=ORGANIZATION, sort="lastModified", direction=-1))
          chess_models = [m for m in models if "chess" in m.id.lower()]
 
-         # Keep only the latest model per user (based on model name pattern: chess-<username>-*)
          seen_users = set()
          filtered_models = []
          for m in chess_models:
-             # Extract username from model id (format: LLM-course/chess-<username>-<modelname>)
-             model_name = m.id.split("/")[-1]  # e.g., "chess-johndoe-mymodel"
              parts = model_name.split("-")
              if len(parts) >= 2:
-                 # Username is after "chess-"
                  username = parts[1] if parts[0] == "chess" else None
                  if username and username not in seen_users:
                      seen_users.add(username)
                      filtered_models.append(m.id)
              else:
-                 # If pattern doesn't match, include the model anyway
                  filtered_models.append(m.id)
 
          return filtered_models if filtered_models else ["No models available"]
@@ -145,21 +243,55 @@
          return ["No models available"]
 
 
  def format_leaderboard_html(data: list) -> str:
      """Format leaderboard data as HTML table."""
      if not data:
          return "<p>No models evaluated yet. Be the first to submit!</p>"
 
-     # Keep only the best entry per user
      best_per_user = {}
      for entry in data:
          user_id = entry.get("user_id", "unknown")
-         legal_rate = entry.get("legal_rate", 0)
-         if user_id not in best_per_user or legal_rate > best_per_user[user_id].get("legal_rate", 0):
              best_per_user[user_id] = entry
 
-     # Sort by legal_rate
-     sorted_data = sorted(best_per_user.values(), key=lambda x: x.get("legal_rate", 0), reverse=True)
 
      html = """
      <style>
@@ -193,11 +325,10 @@ def format_leaderboard_html(data: list) -> str:
                  <th>Rank</th>
                  <th>User</th>
                  <th>Model</th>
-                 <th>Legal Rate</th>
                  <th>Legal Rate (1st try)</th>
-                 <!-- <th>ELO</th> -->
-                 <!-- <th>Win Rate</th> -->
-                 <!-- <th>Games</th> -->
                  <th>Last Updated</th>
              </tr>
          </thead>
@@ -211,7 +342,7 @@ def format_leaderboard_html(data: list) -> str:
          model_url = f"https://huggingface.co/{entry['model_id']}"
 
          # Color code legal rate
-         legal_rate = entry.get('legal_rate', 0)
          if legal_rate >= 0.9:
              legal_class = "legal-good"
          elif legal_rate >= 0.7:
@@ -219,19 +350,21 @@ def format_leaderboard_html(data: list) -> str:
          else:
              legal_class = "legal-bad"
 
-         legal_rate_first_try = entry.get('legal_rate_first_try', 0)
          user_id = entry.get('user_id', 'unknown')
          user_url = f"https://huggingface.co/{user_id}"
 
          html += f"""
          <tr>
              <td class="{rank_class}">{rank_display}</td>
              <td><a href="{user_url}" target="_blank" class="model-link">{user_id}</a></td>
              <td><a href="{model_url}" target="_blank" class="model-link">{entry['model_id'].split('/')[-1]}</a></td>
              <td class="{legal_class}">{legal_rate*100:.1f}%</td>
-             <td>{legal_rate_first_try*100:.1f}%</td>
-             <!-- <td><strong>{entry.get('elo', 'N/A')}</strong></td> -->
-             <!-- <td>{entry.get('win_rate', 0)*100:.1f}%</td> -->
-             <!-- <td>{entry.get('games_played', 0)}</td> -->
              <td>{entry.get('last_updated', 'N/A')}</td>
          </tr>
          """
@@ -240,377 +373,160 @@ def format_leaderboard_html(data: list) -> str:
240
  return html
241
 
242
 
243
- def render_board_svg(fen: str = "startpos") -> str:
244
- """Render a chess board as SVG."""
245
- try:
246
- import chess
247
- import chess.svg
248
-
249
- if fen == "startpos":
250
- board = chess.Board()
251
- else:
252
- board = chess.Board(fen)
253
-
254
- return chess.svg.board(board, size=400)
255
- except ImportError:
256
- return "<p>Install python-chess to see the board</p>"
257
-
258
 
259
- def play_move(
260
  model_id: str,
261
- current_fen: str,
262
- move_history: str,
263
- temperature: float,
264
- ) -> tuple:
265
- """Play a move with the selected model."""
 
 
 
 
 
 
 
266
  try:
267
- import chess
268
- import torch
269
- import sys
270
  sys.path.insert(0, str(Path(__file__).parent))
271
 
272
- from src.evaluate import load_model_from_hub
273
-
274
- # Load model using the same method as evaluation
275
- model, tokenizer = load_model_from_hub(model_id)
276
- model.eval()
277
-
278
- # Setup board
279
- board = chess.Board(current_fen) if current_fen != "startpos" else chess.Board()
280
-
281
- # Tokenize history
282
- if move_history:
283
- inputs = tokenizer(move_history, return_tensors="pt")
284
- else:
285
- inputs = tokenizer(tokenizer.bos_token, return_tensors="pt")
286
 
287
- # Generate move
288
- with torch.no_grad():
289
- outputs = model(**inputs)
290
- logits = outputs.logits[:, -1, :] / temperature
291
- probs = torch.softmax(logits, dim=-1)
292
- next_token = torch.multinomial(probs, num_samples=1)
293
 
294
- move_token = tokenizer.decode(next_token[0])
 
295
 
296
- # Parse move
297
- if len(move_token) >= 6:
298
- uci_move = move_token[2:4] + move_token[4:6]
299
- try:
300
- move = chess.Move.from_uci(uci_move)
301
- if move in board.legal_moves:
302
- board.push(move)
303
- new_history = f"{move_history} {move_token}".strip()
304
- return (
305
- render_board_svg(board.fen()),
306
- board.fen(),
307
- new_history,
308
- f"Model played: {move_token} ({uci_move})",
309
- )
310
- except:
311
- pass
312
-
313
- return (
314
- render_board_svg(current_fen if current_fen != "startpos" else None),
315
- current_fen,
316
- move_history,
317
- f"Model generated illegal move: {move_token}",
318
- )
319
 
320
- except Exception as e:
321
- return (
322
- render_board_svg(),
323
- "startpos",
324
- "",
325
- f"Error: {str(e)}",
326
  )
327
-
328
-
329
- def get_model_submitter(model_id: str) -> Optional[str]:
330
- """Extract the submitter's username from the model's README on HuggingFace.
331
-
332
- Returns None if the submitter cannot be determined.
333
- """
334
- try:
335
- from huggingface_hub import hf_hub_download
336
- import re
337
 
338
- # Download the README.md from the model repo
339
- readme_path = hf_hub_download(
340
- repo_id=model_id,
341
- filename="README.md",
342
- token=HF_TOKEN,
343
- )
344
 
345
- with open(readme_path, "r") as f:
346
- readme_content = f.read()
347
 
348
- # Look for the pattern: **Submitted by**: [username](https://huggingface.co/username)
349
- match = re.search(r'\*\*Submitted by\*\*:\s*\[([^\]]+)\]', readme_content)
350
- if match:
351
- return match.group(1)
352
 
353
- # Fallback: try to get from model info
354
- from huggingface_hub import model_info
355
- info = model_info(model_id, token=HF_TOKEN)
356
- if info.author:
357
- return info.author
358
-
359
- except Exception as e:
360
- print(f"Could not extract submitter from model: {e}")
361
-
362
- return None
363
 
 
 
364
 
365
-def evaluate_legal_moves(
-    model_id: str,
-    progress: gr.Progress = gr.Progress(),
-) -> str:
-    """Evaluate a model's legal move generation."""
-    try:
-        import sys
-        import io
-        from contextlib import redirect_stdout
-
-        sys.path.insert(0, str(Path(__file__).parent))
-
-        from src.evaluate import ChessEvaluator, load_model_from_hub
-
-        progress(0, desc="Loading model...")
-
-        # Capture tokenizer debug info
-        debug_output = io.StringIO()
-        with redirect_stdout(debug_output):
-            model, tokenizer = load_model_from_hub(model_id, verbose=True)
-        tokenizer_info = debug_output.getvalue()
 
-        progress(0.1, desc="Setting up evaluator...")
-        evaluator = ChessEvaluator(
-            model=model,
-            tokenizer=tokenizer,
-            stockfish_level=1,  # Not used for legal move eval
-        )
 
-        progress(0.2, desc=f"Testing {EVAL_N_POSITIONS} positions...")
-        results = evaluator.evaluate_legal_moves(
-            n_positions=EVAL_N_POSITIONS,
-            verbose=False,
-            seed=EVAL_SEED,
-        )
 
-        # Extract user_id from model's README (submitted by field)
         user_id = get_model_submitter(model_id)
         if user_id is None:
-            return f"""## Evaluation Failed
 
 Could not determine the submitter for model `{model_id}`.
 
 Please ensure your model was submitted using the official submission script (`submit.py`),
 which adds the required metadata to the README.md file.
 """
 
-        # Update leaderboard - only one entry per user, keep the best
         leaderboard = load_leaderboard()
 
-        # Find existing entry for this user (not model - one entry per user)
         user_entry = next((e for e in leaderboard if e.get("user_id") == user_id), None)
 
-        new_legal_rate = results.get("legal_rate_with_retry", 0)
-        new_legal_rate_first_try = results.get("legal_rate_first_try", 0)
 
         if user_entry is None:
-            # New user - add to leaderboard
-            entry = {
-                "model_id": model_id,
-                "user_id": user_id,
-                "legal_rate": new_legal_rate,
-                "legal_rate_first_try": new_legal_rate_first_try,
-                "last_updated": datetime.now().strftime("%Y-%m-%d %H:%M"),
-            }
-            leaderboard.append(entry)
             save_leaderboard(leaderboard)
             update_message = "New entry added to leaderboard!"
         else:
-            # Existing user - only update if this submission is better
-            old_legal_rate = user_entry.get("legal_rate", 0)
-            old_model = user_entry.get("model_id", "unknown")
-            if new_legal_rate > old_legal_rate:
-                user_entry.update({
-                    "model_id": model_id,  # Update to new model if better
-                    "legal_rate": new_legal_rate,
-                    "legal_rate_first_try": new_legal_rate_first_try,
-                    "last_updated": datetime.now().strftime("%Y-%m-%d %H:%M"),
-                })
                 save_leaderboard(leaderboard)
-                if old_model != model_id:
-                    update_message = f"🎉 Improved! New best model for {user_id}: {old_legal_rate*100:.1f}% → {new_legal_rate*100:.1f}%"
-                else:
-                    update_message = f"🎉 Improved! Previous: {old_legal_rate*100:.1f}% → New: {new_legal_rate*100:.1f}%"
             else:
-                update_message = f"ℹ️ No improvement. Your best: {old_legal_rate*100:.1f}% (model: {old_model.split('/')[-1]}), This run: {new_legal_rate*100:.1f}%"
 
-        progress(1.0, desc="Done!")
 
-        # Format tokenizer info for display
-        tokenizer_debug = tokenizer_info.strip().replace("  ", "- ")
 
-        return f"""
-## Legal Move Evaluation for {model_id.split('/')[-1]}
 
-| Metric | Value |
-|--------|-------|
-| **Positions Tested** | {results['total_positions']} |
-| **Legal (1st try)** | {results['legal_first_try']} ({results['legal_rate_first_try']*100:.1f}%) |
-| **Legal (with retries)** | {results['legal_first_try'] + results['legal_with_retry']} ({results['legal_rate_with_retry']*100:.1f}%) |
-| **Always Illegal** | {results['illegal_all_retries']} ({results['illegal_rate']*100:.1f}%) |
 
-### Tokenizer Info
-```
-{tokenizer_debug}
-```
 
 ### Leaderboard Update
 {update_message}
 
-### Interpretation
-- **>90% legal rate**: Great! Model has learned chess rules well.
-- **70-90% legal rate**: Decent, but room for improvement.
-- **<70% legal rate**: Model struggles with legal move generation.
 """
 
     except Exception as e:
-        return f"Evaluation failed: {str(e)}"
-
-
-# def evaluate_winrate(
-#     model_id: str,
-#     stockfish_level: str,
-#     n_games: int,
-#     progress: gr.Progress = gr.Progress(),
-# ) -> str:
-#     """Evaluate a model's win rate against Stockfish."""
-#     try:
-#         import sys
-#         sys.path.insert(0, str(Path(__file__).parent))
-#
-#         from src.evaluate import ChessEvaluator, load_model_from_hub
-#
-#         progress(0, desc="Loading model...")
-#         model, tokenizer = load_model_from_hub(model_id)
-#
-#         progress(0.1, desc="Setting up Stockfish...")
-#         level = STOCKFISH_LEVELS.get(stockfish_level, 1)
-#         evaluator = ChessEvaluator(
-#             model=model,
-#             tokenizer=tokenizer,
-#             stockfish_level=level,
-#         )
-#
-#         progress(0.2, desc=f"Playing {n_games} games...")
-#         results = evaluator.evaluate(n_games=n_games, verbose=False)
-#
-#         # Update leaderboard
-#         leaderboard = load_leaderboard()
-#         entry = next((e for e in leaderboard if e["model_id"] == model_id), None)
-#         if entry is None:
-#             entry = {"model_id": model_id}
-#             leaderboard.append(entry)
-#
-#         entry.update({
-#             "elo": results.get("estimated_elo", 1000),
-#             "win_rate": results.get("win_rate", 0),
-#             "games_played": entry.get("games_played", 0) + n_games,
-#             "last_updated": datetime.now().strftime("%Y-%m-%d %H:%M"),
-#         })
-#
-#         save_leaderboard(leaderboard)
-#         progress(1.0, desc="Done!")
-#
-#         return f"""
-# ## Win Rate Evaluation for {model_id.split('/')[-1]}
-#
-# | Metric | Value |
-# |--------|-------|
-# | **Estimated ELO** | {results.get('estimated_elo', 'N/A'):.0f} |
-# | **Win Rate** | {results.get('win_rate', 0)*100:.1f}% |
-# | **Draw Rate** | {results.get('draw_rate', 0)*100:.1f}% |
-# | **Loss Rate** | {results.get('loss_rate', 0)*100:.1f}% |
-# | **Avg Game Length** | {results.get('avg_game_length', 0):.1f} moves |
-# | **Illegal Move Rate** | {results.get('illegal_move_rate', 0)*100:.2f}% |
-#
-# Games played: {n_games} against Stockfish {stockfish_level}
-# """
-#
-#     except Exception as e:
-#         return f"Evaluation failed: {str(e)}"
-
-
-# def evaluate_model(
-#     model_id: str,
-#     stockfish_level: str,
-#     n_games: int,
-#     progress: gr.Progress = gr.Progress(),
-# ) -> str:
-#     """Evaluate a model against Stockfish."""
-#     try:
-#         # Import evaluation code
-#         import sys
-#         sys.path.insert(0, str(Path(__file__).parent))
-#
-#         from src.evaluate import ChessEvaluator, load_model_from_hub
-#
-#         progress(0, desc="Loading model...")
-#         model, tokenizer = load_model_from_hub(model_id)
-#
-#         progress(0.1, desc="Setting up Stockfish...")
-#         level = STOCKFISH_LEVELS.get(stockfish_level, 1)
-#         evaluator = ChessEvaluator(
-#             model=model,
-#             tokenizer=tokenizer,
-#             stockfish_level=level,
-#         )
-#
-#         progress(0.2, desc=f"Playing {n_games} games...")
-#         results = evaluator.evaluate(n_games=n_games, verbose=False)
-#
-#         # Update leaderboard
-#         leaderboard = load_leaderboard()
-#
-#         # Find or create entry
-#         entry = next((e for e in leaderboard if e["model_id"] == model_id), None)
-#         if entry is None:
-#             entry = {"model_id": model_id}
-#             leaderboard.append(entry)
-#
-#         entry.update({
-#             "elo": results.get("estimated_elo", 1000),
-#             "win_rate": results.get("win_rate", 0),
-#             "games_played": entry.get("games_played", 0) + n_games,
-#             "illegal_rate": results.get("illegal_move_rate", 0),
-#             "last_updated": datetime.now().strftime("%Y-%m-%d %H:%M"),
-#         })
-#
-#         save_leaderboard(leaderboard)
-#
-#         progress(1.0, desc="Done!")
-#
-#         return f"""
-# ## Evaluation Results for {model_id.split('/')[-1]}
-#
-# | Metric | Value |
-# |--------|-------|
-# | **Estimated ELO** | {results.get('estimated_elo', 'N/A'):.0f} |
-# | **Win Rate** | {results.get('win_rate', 0)*100:.1f}% |
-# | **Draw Rate** | {results.get('draw_rate', 0)*100:.1f}% |
-# | **Loss Rate** | {results.get('loss_rate', 0)*100:.1f}% |
-# | **Avg Game Length** | {results.get('avg_game_length', 0):.1f} moves |
-# | **Illegal Move Rate** | {results.get('illegal_move_rate', 0)*100:.2f}% |
-#
-# Games played: {n_games} against Stockfish {stockfish_level}
-# """
-#
-#     except Exception as e:
-#         return f"Evaluation failed: {str(e)}"
-
-
 
 def refresh_leaderboard() -> str:
@@ -618,7 +534,10 @@ def refresh_leaderboard() -> str:
     return format_leaderboard_html(load_leaderboard())
 
 
-# Build Gradio Interface
 with gr.Blocks(
     title="Play Chess like a Honey Bee",
     theme=gr.themes.Soft(),
@@ -633,178 +552,208 @@ with gr.Blocks(
     """)
 
     with gr.Tabs():
-        # Submission Guide Tab
-        with gr.TabItem("How to Submit"):
            gr.Markdown(f"""
            ### Submitting Your Model
 
-           The goal is to create a chess-playing language model with **under 1 million parameters**, which is roughly the number of neurons in a honey bee's brain.
-           At this scale, efficiency and clever architecture choices are key! We are not targetting superhuman performance, but rather exploring how well small models can learn the rules of chess, the goal being (only) to play **legal moves**.
 
-           0. **Clone this repository**:
            ```bash
            git clone https://huggingface.co/spaces/LLM-course/Chess1MChallenge
            ```
-           and check the `TEMPLATE_README.md` for detailed instructions.
-
-           1. **Train your model**
 
-           2. **Push to Hugging Face Hub** using the `submit.py` script provided in the template to make sure that your model is registered correctly.
 
-           3. **Verify your submission** by checking the model page on Hugging Face
 
-           4. **Run evaluations**:
            ### Requirements
 
            - Model must be under **1M parameters**
-           - Model must use the `ChessConfig` and `ChessForCausalLM` classes
            - Include the tokenizer with your submission
 
            ### Tips for Better Performance
 
            - Experiment with different architectures (layers, heads, dimensions)
            - Try weight tying to save parameters
            """)
 
-        # Interactive Demo Tab (commented out for now)
-        # with gr.TabItem("🎮 Interactive Demo"):
-        #     gr.Markdown("### Test a Model")
-        #
-        #     with gr.Row():
-        #         with gr.Column(scale=1):
-        #             with gr.Row():
-        #                 model_dropdown = gr.Dropdown(
-        #                     choices=get_available_models(),
-        #                     label="Select Model",
-        #                     value=None,
-        #                     scale=4,
-        #                 )
-        #                 refresh_models_btn = gr.Button("🔄", scale=1)
-        #             temperature_slider = gr.Slider(
-        #                 minimum=0.1,
-        #                 maximum=2.0,
-        #                 value=0.7,
-        #                 step=0.1,
-        #                 label="Temperature",
-        #             )
-        #
-        #             with gr.Row():
-        #                 play_btn = gr.Button("Model Move", variant="primary")
-        #                 reset_btn = gr.Button("Reset")
-        #
-        #             status_text = gr.Textbox(label="Status", interactive=False)
-        #
-        #         with gr.Column(scale=1):
-        #             board_display = gr.HTML(value=render_board_svg())
-        #
-        #     # Hidden state
-        #     current_fen = gr.State("startpos")
-        #     move_history = gr.State("")
-        #
-        #     def refresh_models():
-        #         return gr.update(choices=get_available_models())
-        #
-        #     refresh_models_btn.click(
-        #         refresh_models,
-        #         outputs=[model_dropdown],
-        #     )
-        #
-        #     play_btn.click(
-        #         play_move,
-        #         inputs=[model_dropdown, current_fen, move_history, temperature_slider],
-        #         outputs=[board_display, current_fen, move_history, status_text],
-        #     )
-        #
-        #     def reset_game():
-        #         return render_board_svg(), "startpos", "", "Game reset!"
-        #
-        #     reset_btn.click(
-        #         reset_game,
-        #         outputs=[board_display, current_fen, move_history, status_text],
-        #     )
-
-        # Legal Move Evaluation Tab
-        with gr.TabItem("Legal Move Eval"):
            gr.Markdown("""
-           ### Phase 1: Legal Move Evaluation
 
-           Test if your model can generate **legal chess moves** in random positions.
-
-           - Tests the model on random board positions
-           - Measures how often it generates legal moves
            """)
 
            with gr.Row():
-               legal_model = gr.Dropdown(
                    choices=get_available_models(),
                    label="Model to Evaluate",
                )
-               refresh_legal_models_btn = gr.Button("🔄", scale=0, min_width=40)
 
-           def refresh_legal_models():
                return gr.update(choices=get_available_models())
 
-           refresh_legal_models_btn.click(
-               refresh_legal_models,
-               outputs=[legal_model],
            )
 
-           legal_btn = gr.Button("Run Legal Move Evaluation", variant="primary")
-           legal_results = gr.Markdown()
 
-           legal_btn.click(
-               evaluate_legal_moves,
-               inputs=[legal_model],
-               outputs=legal_results,
            )
 
-        # Win Rate Evaluation Tab (commented out for now)
-        # with gr.TabItem("🏆 Win Rate Eval"):
-        #     gr.Markdown("""
-        #     ### Phase 2: Win Rate Evaluation
-        #
-        #     Play full games against Stockfish and measure win rate.
-        #     This evaluation computes your model's **ELO rating**.
-        #
-        #     - Plays complete games against Stockfish
-        #     - Measures win/draw/loss rates
-        #     - Estimates ELO rating
-        #     """)
-        #
-        #     with gr.Row():
-        #         eval_model = gr.Dropdown(
-        #             choices=get_available_models(),
-        #             label="Model to Evaluate",
-        #         )
-        #         eval_level = gr.Dropdown(
-        #             choices=list(STOCKFISH_LEVELS.keys()),
-        #             value="Easy (Level 1)",
-        #             label="Stockfish Level",
-        #         )
-        #         eval_games = gr.Slider(
-        #             minimum=10,
-        #             maximum=100,
-        #             value=50,
-        #             step=10,
-        #             label="Number of Games",
-        #         )
-        #
-        #     eval_btn = gr.Button("Run Win Rate Evaluation", variant="primary")
-        #     eval_results = gr.Markdown()
-        #
-        #     eval_btn.click(
-        #         evaluate_winrate,
-        #         inputs=[eval_model, eval_level, eval_games],
-        #         outputs=eval_results,
-        #     )
-
-        # Leaderboard Tab (moved to the end)
-        with gr.TabItem("🏆 Leaderboard"):
            gr.Markdown("### Current Rankings")
            leaderboard_html = gr.HTML(value=format_leaderboard_html(load_leaderboard()))
            refresh_btn = gr.Button("Refresh Leaderboard")
            refresh_btn.click(refresh_leaderboard, outputs=leaderboard_html)
 
 
 if __name__ == "__main__":
     demo.launch(server_name="0.0.0.0", server_port=7860)
 """
+Play Chess like a Honey Bee - Chess Challenge Arena
 
 This Gradio app provides:
+1. Leaderboard of submitted models
+2. Model evaluation interface
+3. Submission guide
+4. Webhook endpoint for automatic evaluation
 
 The goal is to train a language model to play chess, under a strict constraint:
 less than 1M parameters! This is approximately the number of neurons of a honey bee.
 
 Leaderboard data is stored in a private HuggingFace dataset for persistence.
 """
 
+import hashlib
+import hmac
 import io
+import json
 import os
+import queue
 import sys
+import threading
 from datetime import datetime
 from pathlib import Path
 from typing import Optional
 
 LEADERBOARD_DATASET = os.environ.get("LEADERBOARD_DATASET", f"{ORGANIZATION}/chess-challenge-leaderboard")
 LEADERBOARD_FILENAME = "leaderboard.csv"
 HF_TOKEN = os.environ.get("HF_TOKEN")  # Required for private dataset access
+WEBHOOK_SECRET = os.environ.get("WEBHOOK_SECRET", "")  # HMAC secret; signature checks are skipped if unset
 
 # CSV columns for the leaderboard
 LEADERBOARD_COLUMNS = [
     "model_id",
     "user_id",
+    "n_parameters",
     "legal_rate_first_try",
+    "legal_rate_with_retry",
+    "games_played",
     "last_updated",
 ]
 
 
+# =============================================================================
+# Webhook Queue and Worker
+# =============================================================================
+
+eval_queue = queue.Queue()
+eval_status = {}  # Track status of queued evaluations
+eval_lock = threading.Lock()
+
+
+def evaluation_worker():
+    """Background worker that processes evaluation queue."""
+    while True:
+        try:
+            model_id = eval_queue.get()
+
+            with eval_lock:
+                eval_status[model_id] = "running"
+
+            print(f"[Webhook Worker] Starting evaluation for: {model_id}")
+
+            try:
+                sys.path.insert(0, str(Path(__file__).parent))
+                from src.evaluate import (
+                    ChessEvaluator,
+                    load_model_and_tokenizer,
+                    post_discussion_summary,
+                )
+
+                # Load and evaluate
+                model, tokenizer, _ = load_model_and_tokenizer(model_id, verbose=True)
+                evaluator = ChessEvaluator(model=model, tokenizer=tokenizer, model_path=model_id)
+                result = evaluator.evaluate(verbose=True)
+
+                # Update leaderboard if evaluation succeeded
+                if result.passed_param_check and result.passed_pychess_check and not result.error_message:
+                    user_id = get_model_submitter(model_id)
+                    if user_id:
+                        leaderboard = load_leaderboard()
+                        user_entry = next((e for e in leaderboard if e.get("user_id") == user_id), None)
+
+                        new_entry = {
+                            "model_id": model_id,
+                            "user_id": user_id,
+                            "n_parameters": result.n_parameters,
+                            "legal_rate_first_try": result.legal_rate_first_try,
+                            "legal_rate_with_retry": result.legal_rate_with_retry,
+                            "games_played": result.games_played,
+                            "last_updated": datetime.now().strftime("%Y-%m-%d %H:%M"),
+                        }
+
+                        if user_entry is None:
+                            leaderboard.append(new_entry)
+                            save_leaderboard(leaderboard)
+                            print(f"[Webhook Worker] Added {model_id} to leaderboard")
+                        elif result.legal_rate_with_retry > user_entry.get("legal_rate_with_retry", 0):
+                            user_entry.update(new_entry)
+                            save_leaderboard(leaderboard)
+                            print(f"[Webhook Worker] Updated {model_id} on leaderboard (improvement)")
+                        else:
+                            print(f"[Webhook Worker] {model_id} - no improvement, not updating leaderboard")
+
+                        # Post results to model discussion
+                        if HF_TOKEN:
+                            try:
+                                post_discussion_summary(model_id, result, HF_TOKEN)
+                                print(f"[Webhook Worker] Posted results to {model_id} discussion")
+                            except Exception as e:
+                                print(f"[Webhook Worker] Failed to post discussion: {e}")
+                    else:
+                        print(f"[Webhook Worker] Could not determine submitter for {model_id}")
+                else:
+                    print(f"[Webhook Worker] Evaluation failed for {model_id}: {result.error_message}")
+
+                with eval_lock:
+                    eval_status[model_id] = "completed"
+
+            except Exception as e:
+                print(f"[Webhook Worker] Error evaluating {model_id}: {e}")
+                with eval_lock:
+                    eval_status[model_id] = f"error: {str(e)}"
+
+        except Exception as e:
+            print(f"[Webhook Worker] Queue error: {e}")
+        finally:
+            eval_queue.task_done()
+
+
+# Start the background worker thread
+worker_thread = threading.Thread(target=evaluation_worker, daemon=True)
+worker_thread.start()
+print("[Webhook] Evaluation worker started")
+
+
+def is_chess_model(model_id: str) -> bool:
+    """Check if a model ID looks like a chess challenge submission."""
+    if not model_id.startswith(f"{ORGANIZATION}/"):
+        return False
+    model_name = model_id.split("/")[-1].lower()
+    return "chess" in model_name
+
+
+def verify_webhook_signature(body: bytes, signature: str) -> bool:
+    """Verify the webhook signature using HMAC-SHA256."""
+    if not WEBHOOK_SECRET:
+        return True  # Skip verification if no secret configured
+    expected = hmac.new(WEBHOOK_SECRET.encode(), body, hashlib.sha256).hexdigest()
+    return hmac.compare_digest(signature or "", expected)
+
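The HMAC check added above can be exercised in isolation. A minimal sketch (the secret and payload below are made-up stand-ins for the `WEBHOOK_SECRET` env var and the raw webhook request body): the sender computes a SHA-256 HMAC of the body, and the receiver recomputes it and compares in constant time, so any tampering with the body invalidates the signature.

```python
import hashlib
import hmac

secret = "example-secret"  # stand-in for the WEBHOOK_SECRET env var
body = b'{"repo": {"name": "LLM-course/chess-alice"}}'  # hypothetical payload

# Sender side: hex digest of the raw request body.
signature = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()

# Receiver side: recompute and compare in constant time.
expected = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
assert hmac.compare_digest(signature, expected)

# A modified body no longer matches the original signature.
tampered = hmac.new(secret.encode(), body + b"x", hashlib.sha256).hexdigest()
assert not hmac.compare_digest(signature, tampered)
```

Note that `hmac.compare_digest` is used instead of `==` to avoid leaking information through timing differences.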
+
+# =============================================================================
+# Leaderboard Management
+# =============================================================================
+
 def load_leaderboard() -> list:
     """Load leaderboard from private HuggingFace dataset."""
     try:
         from huggingface_hub import hf_hub_download
 
         csv_path = hf_hub_download(
             repo_id=LEADERBOARD_DATASET,
             filename=LEADERBOARD_FILENAME,
     except Exception as e:
         print(f"Could not load leaderboard from dataset: {e}")
         return []
 
     try:
         from huggingface_hub import HfApi
 
         df = pd.DataFrame(data, columns=LEADERBOARD_COLUMNS)
 
         # Fill missing columns with defaults
             if col not in df.columns:
                 df[col] = None
 
         df = df[LEADERBOARD_COLUMNS]
 
         # Convert to CSV bytes
     try:
         from huggingface_hub import list_models
 
         models = list(list_models(author=ORGANIZATION, sort="lastModified", direction=-1))
         chess_models = [m for m in models if "chess" in m.id.lower()]
 
+        # Keep only the latest model per user
         seen_users = set()
         filtered_models = []
         for m in chess_models:
+            model_name = m.id.split("/")[-1]
             parts = model_name.split("-")
             if len(parts) >= 2:
                 username = parts[1] if parts[0] == "chess" else None
                 if username and username not in seen_users:
                     seen_users.add(username)
                     filtered_models.append(m.id)
             else:
                 filtered_models.append(m.id)
 
         return filtered_models if filtered_models else ["No models available"]
         return ["No models available"]
 
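The per-user filtering above relies on the `chess-<username>-...` naming that `submit.py` produces. A small standalone sketch of that parsing, with toy repo names and hypothetical usernames (since the models list comes sorted newest-first, keeping the first hit per user keeps the latest model):

```python
# Toy repo names, assumed sorted newest-first as list_models returns them.
names = ["chess-alice-v2", "chess-bob-small", "chess-alice-v1", "mychessnet"]

seen_users = set()
filtered = []
for name in names:
    parts = name.split("-")
    if len(parts) >= 2:
        # Only names of the form chess-<username>-... carry a username.
        username = parts[1] if parts[0] == "chess" else None
        if username and username not in seen_users:
            seen_users.add(username)   # first hit per user = latest model
            filtered.append(name)
    else:
        filtered.append(name)          # unrecognized naming: keep as-is

print(filtered)  # → ['chess-alice-v2', 'chess-bob-small', 'mychessnet']
```

Note that `chess-alice-v1` is dropped because a newer model for `alice` was already seen.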
+def get_model_submitter(model_id: str) -> Optional[str]:
+    """Extract the submitter's username from the model's README on HuggingFace."""
+    try:
+        from huggingface_hub import hf_hub_download
+        import re
+
+        readme_path = hf_hub_download(
+            repo_id=model_id,
+            filename="README.md",
+            token=HF_TOKEN,
+        )
+
+        with open(readme_path, "r") as f:
+            readme_content = f.read()
+
+        match = re.search(r'\*\*Submitted by\*\*:\s*\[([^\]]+)\]', readme_content)
+        if match:
+            return match.group(1)
+
+        from huggingface_hub import model_info
+        info = model_info(model_id, token=HF_TOKEN)
+        if info.author:
+            return info.author
+
+    except Exception as e:
+        print(f"Could not extract submitter from model: {e}")
+
+    return None
+
+
276
+ # =============================================================================
277
+ # Leaderboard Formatting
278
+ # =============================================================================
279
+
280
  def format_leaderboard_html(data: list) -> str:
281
  """Format leaderboard data as HTML table."""
282
  if not data:
283
  return "<p>No models evaluated yet. Be the first to submit!</p>"
284
 
285
+ # Keep only the best entry per user (by legal_rate_with_retry)
286
  best_per_user = {}
287
  for entry in data:
288
  user_id = entry.get("user_id", "unknown")
289
+ legal_rate = entry.get("legal_rate_with_retry", 0)
290
+ if user_id not in best_per_user or legal_rate > best_per_user[user_id].get("legal_rate_with_retry", 0):
291
  best_per_user[user_id] = entry
292
 
293
+ # Sort by legal_rate_with_retry
294
+ sorted_data = sorted(best_per_user.values(), key=lambda x: x.get("legal_rate_with_retry", 0), reverse=True)
295
 
296
  html = """
297
  <style>
 
325
  <th>Rank</th>
326
  <th>User</th>
327
  <th>Model</th>
328
+ <th>Parameters</th>
329
  <th>Legal Rate (1st try)</th>
330
+ <th>Legal Rate (with retries)</th>
331
+ <th>Games</th>
 
332
  <th>Last Updated</th>
333
  </tr>
334
  </thead>
 
342
  model_url = f"https://huggingface.co/{entry['model_id']}"
343
 
344
  # Color code legal rate
345
+ legal_rate = entry.get('legal_rate_with_retry', 0)
346
  if legal_rate >= 0.9:
347
  legal_class = "legal-good"
348
  elif legal_rate >= 0.7:
 
350
  else:
351
  legal_class = "legal-bad"
352
 
 
353
  user_id = entry.get('user_id', 'unknown')
354
  user_url = f"https://huggingface.co/{user_id}"
355
+ n_params = entry.get('n_parameters', 0)
356
+ legal_rate_first = entry.get('legal_rate_first_try', 0)
357
+ games = entry.get('games_played', 0)
358
+
359
  html += f"""
360
  <tr>
361
  <td class="{rank_class}">{rank_display}</td>
362
  <td><a href="{user_url}" target="_blank" class="model-link">{user_id}</a></td>
363
  <td><a href="{model_url}" target="_blank" class="model-link">{entry['model_id'].split('/')[-1]}</a></td>
364
+ <td>{n_params:,}</td>
365
+ <td>{legal_rate_first*100:.1f}%</td>
366
  <td class="{legal_class}">{legal_rate*100:.1f}%</td>
367
+ <td>{games}</td>
 
 
 
368
  <td>{entry.get('last_updated', 'N/A')}</td>
369
  </tr>
370
  """
 
373
  return html
374
 
375
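The best-entry-per-user deduplication and ranking used for the leaderboard can be sketched on toy data (usernames and rates below are made up): each user keeps only their highest `legal_rate_with_retry`, and the survivors are sorted descending.

```python
# Toy leaderboard entries (hypothetical users and rates).
data = [
    {"user_id": "alice", "legal_rate_with_retry": 0.82},
    {"user_id": "bob",   "legal_rate_with_retry": 0.91},
    {"user_id": "alice", "legal_rate_with_retry": 0.95},
]

# Keep only the best entry per user.
best_per_user = {}
for entry in data:
    user_id = entry.get("user_id", "unknown")
    rate = entry.get("legal_rate_with_retry", 0)
    if user_id not in best_per_user or rate > best_per_user[user_id]["legal_rate_with_retry"]:
        best_per_user[user_id] = entry

# Rank the survivors by legal rate, best first.
ranked = sorted(best_per_user.values(), key=lambda e: e["legal_rate_with_retry"], reverse=True)
print([(e["user_id"], e["legal_rate_with_retry"]) for e in ranked])
# → [('alice', 0.95), ('bob', 0.91)]
```

Alice's weaker 0.82 run is discarded, so resubmitting a worse model can never lower a user's standing.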
 
+# =============================================================================
+# Evaluation Functions
+# =============================================================================
+
+def run_evaluation(
     model_id: str,
+    progress: gr.Progress = gr.Progress(),
+) -> str:
+    """
+    Run evaluation on a model and update the leaderboard.
+
+    Evaluation procedure:
+    1. Check if model has < 1M parameters
+    2. Check if model uses python-chess illegally
+    3. Play 500 moves against opponent engine (restart after 25 moves)
+    4. Track legal move rates
+    5. Update leaderboard and post discussion
+    """
     try:
         sys.path.insert(0, str(Path(__file__).parent))
 
+        from src.evaluate import (
+            ChessEvaluator,
+            load_model_and_tokenizer,
+            post_discussion_summary,
+        )
 
+        progress(0, desc="Loading model...")
 
+        # Load model
+        model, tokenizer, _ = load_model_and_tokenizer(model_id, verbose=True)
 
+        progress(0.1, desc="Setting up evaluator...")
 
+        # Create evaluator
+        evaluator = ChessEvaluator(
+            model=model,
+            tokenizer=tokenizer,
+            model_path=model_id,
         )
 
+        progress(0.2, desc="Running evaluation (500 moves)...")
 
+        # Run evaluation
+        result = evaluator.evaluate(verbose=True)
 
+        progress(0.9, desc="Updating leaderboard...")
 
+        # Check if evaluation was successful
+        if not result.passed_param_check:
+            return f"""## Evaluation Failed
 
+**Model**: `{model_id}`
+**Parameters**: {result.n_parameters:,}
 
+Model exceeds the **1M parameter limit**. Please reduce model size and resubmit.
+"""
 
+        if not result.passed_pychess_check:
+            return f"""## Evaluation Failed
+
+**Model**: `{model_id}`
+
+Model illegally uses python-chess for move filtering: {result.error_message}
+
+This is not allowed. The model must generate moves without access to legal move lists.
+"""
 
+        if result.error_message:
+            return f"""## Evaluation Error
+
+**Model**: `{model_id}`
+
+An error occurred during evaluation: {result.error_message}
+"""
 
+        # Get submitter info
         user_id = get_model_submitter(model_id)
         if user_id is None:
+            return f"""## Evaluation Issue
 
 Could not determine the submitter for model `{model_id}`.
 
 Please ensure your model was submitted using the official submission script (`submit.py`),
 which adds the required metadata to the README.md file.
+
+**Evaluation Results** (not saved to leaderboard):
+{result.summary()}
 """
 
+        # Update leaderboard
         leaderboard = load_leaderboard()
 
+        # Find existing entry for this user
         user_entry = next((e for e in leaderboard if e.get("user_id") == user_id), None)
 
+        new_entry = {
+            "model_id": model_id,
+            "user_id": user_id,
+            "n_parameters": result.n_parameters,
+            "legal_rate_first_try": result.legal_rate_first_try,
+            "legal_rate_with_retry": result.legal_rate_with_retry,
+            "games_played": result.games_played,
+            "last_updated": datetime.now().strftime("%Y-%m-%d %H:%M"),
+        }
 
         if user_entry is None:
+            leaderboard.append(new_entry)
             save_leaderboard(leaderboard)
             update_message = "New entry added to leaderboard!"
         else:
+            old_rate = user_entry.get("legal_rate_with_retry", 0)
+            if result.legal_rate_with_retry > old_rate:
+                user_entry.update(new_entry)
                 save_leaderboard(leaderboard)
+                update_message = f"Improved! {old_rate*100:.1f}% -> {result.legal_rate_with_retry*100:.1f}%"
             else:
+                update_message = f"No improvement. Best: {old_rate*100:.1f}%, This run: {result.legal_rate_with_retry*100:.1f}%"
 
+        # Post discussion to model page
+        if HF_TOKEN:
+            try:
+                post_discussion_summary(model_id, result, HF_TOKEN)
+                discussion_message = "Results posted to model page"
+            except Exception as e:
+                discussion_message = f"Could not post to model page: {e}"
+        else:
+            discussion_message = "No HF_TOKEN - results not posted to model page"
 
+        progress(1.0, desc="Done!")
 
+        return f"""## Evaluation Complete
 
+{result.summary()}
 
+---
 
 ### Leaderboard Update
 {update_message}
 
+### Model Page Discussion
+{discussion_message}
 """
 
     except Exception as e:
+        import traceback
+        return f"""## Evaluation Failed
+
+An unexpected error occurred:
+
+```
+{traceback.format_exc()}
+```
+"""
 
 
 def refresh_leaderboard() -> str:
     return format_leaderboard_html(load_leaderboard())
 
 
+# =============================================================================
+# Gradio Interface
+# =============================================================================
+
 with gr.Blocks(
     title="Play Chess like a Honey Bee",
     theme=gr.themes.Soft(),
     """)
 
     with gr.Tabs():
556
+ with gr.TabItem("📖 How to Submit"):
557
  gr.Markdown(f"""
558
  ### Submitting Your Model
559
 
560
+ The goal is to create a chess-playing language model with **under 1 million parameters**,
561
+ which is roughly the number of neurons in a honey bee's brain.
562
+
563
+ At this scale, efficiency and clever architecture choices are key! We are not targeting
564
+ superhuman performance, but rather exploring how well small models can learn the rules
565
+ of chess. The goal is to play **legal moves**.
566
+
567
+ ---
568
 
569
+ ### Getting Started
570
+
571
+ 1. **Clone this repository**:
572
  ```bash
573
  git clone https://huggingface.co/spaces/LLM-course/Chess1MChallenge
574
  ```
575
+
576
+ 2. **Check the example solution** in the `example_solution/` folder for reference
577
+
578
+ 3. **Train your model** using the provided training script or your own approach
579
+
580
+ 4. **Submit using the official script**:
581
+ ```bash
582
+ python submit.py --model_path ./my_model --model_name my-chess-model
583
+ ```
584
+
585
+ 5. **Run evaluation** on this page to see your results on the leaderboard
586
+
587
+ ---
588
+
589
+ ### Evaluation Procedure
590
+
591
+ Your model will be evaluated as follows:
592
+
593
+ 1. **Parameter check**: Must have < 1M parameters
594
+ 2. **Security check**: The model must not use python-chess to filter legal moves
595
+ 3. **Game play**: 500 moves against the opponent engine (games restart every 25 moves)
596
+ 4. **Move generation**: 3 retries allowed per move, greedy decoding
597
+ 5. **Scoring**: Legal move rate (first try and with retries)
598
 
599
+ The evaluation is **fully deterministic** (seeded randomness, deterministic opponent).
600
 
601
+ ---
602
 
 
603
  ### Requirements
604
 
605
  - Model must be under **1M parameters**
606
+ - Model must use the `ChessConfig` and `ChessForCausalLM` classes (or compatible)
607
  - Include the tokenizer with your submission
608
+ - **Do not** use python-chess to filter moves during generation
609
 
610
  ### Tips for Better Performance
611
 
612
  - Experiment with different architectures (layers, heads, dimensions)
613
  - Try weight tying to save parameters
614
+ - Focus on learning the rules of chess, not just memorizing openings
615
+ - Check the `example_solution/` folder for ideas
616
  """)
617
 
618
+ # Evaluation Tab
619
+ with gr.TabItem("Evaluate Model"):
 
 
 
620
  gr.Markdown("""
621
+ ### Run Evaluation
622
 
623
+ Select a model to evaluate. The evaluation will:
624
+ - Check parameter count (< 1M required)
625
+ - Verify that the model does not use python-chess to filter moves
626
+ - Play 500 moves against the opponent engine
627
+ - Track legal move rates
628
+ - Update the leaderboard (if the score improves)
629
+ - Post results to the model's discussion page
630
  """)
631
 
632
  with gr.Row():
633
+ model_dropdown = gr.Dropdown(
634
  choices=get_available_models(),
635
  label="Model to Evaluate",
636
+ scale=4,
637
  )
638
+ refresh_models_btn = gr.Button("Refresh", scale=1, min_width=50)
639
 
640
+ def refresh_models():
641
  return gr.update(choices=get_available_models())
642
 
643
+ refresh_models_btn.click(
644
+ refresh_models,
645
+ outputs=[model_dropdown],
646
  )
647
 
648
+ eval_btn = gr.Button("Run Evaluation", variant="primary")
649
+ eval_results = gr.Markdown()
650
 
651
+ eval_btn.click(
652
+ run_evaluation,
653
+ inputs=[model_dropdown],
654
+ outputs=eval_results,
655
  )
656
 
657
+ # Leaderboard Tab
658
+ with gr.TabItem("Leaderboard"):
 
 
 
659
  gr.Markdown("### Current Rankings")
660
+ gr.Markdown("""
661
+ Rankings are based on **legal move rate (with retries)**.
662
+
663
+ - **Legal Rate (1st try)**: Percentage of moves that were legal on first attempt
664
+ - **Legal Rate (with retries)**: Percentage of moves that were legal within 3 attempts
665
+ """)
666
+
667
  leaderboard_html = gr.HTML(value=format_leaderboard_html(load_leaderboard()))
668
  refresh_btn = gr.Button("Refresh Leaderboard")
669
  refresh_btn.click(refresh_leaderboard, outputs=leaderboard_html)
670
 
671
 
672
+ # =============================================================================
673
+ # Webhook Endpoint (mounted on Gradio's FastAPI app)
674
+ # =============================================================================
675
+
676
+ from fastapi import Request
677
+ from fastapi.responses import JSONResponse
678
+
679
+ @demo.app.post("/webhook")
680
+ async def handle_webhook(request: Request):
681
+ """
682
+ Handle HuggingFace webhook events for automatic model evaluation.
683
+
684
+ Triggered on model creation and update events in the organization.
685
+ """
686
+ # Verify webhook signature
687
+ body = await request.body()
688
+ signature = request.headers.get("X-Webhook-Signature")
689
+
690
+ if not verify_webhook_signature(body, signature):
691
+ print("[Webhook] Invalid signature")
692
+ return JSONResponse({"error": "Invalid signature"}, status_code=401)
693
+
694
+ try:
695
+ payload = json.loads(body)
696
+ except json.JSONDecodeError:
697
+ return JSONResponse({"error": "Invalid JSON"}, status_code=400)
698
+
699
+ event = payload.get("event", {})
700
+ repo = payload.get("repo", {})
701
+
702
+ action = event.get("action")
703
+ scope = event.get("scope")
704
+ repo_type = repo.get("type")
705
+ repo_name = repo.get("name", "")
706
+
707
+ print(f"[Webhook] Received: action={action}, scope={scope}, type={repo_type}, repo={repo_name}")
708
+
709
+ # Only process model repos in our organization
710
+ if repo_type != "model":
711
+ return JSONResponse({"status": "ignored", "reason": "not a model"})
712
+
713
+ if not repo_name.startswith(f"{ORGANIZATION}/"):
714
+ return JSONResponse({"status": "ignored", "reason": "not in organization"})
715
+
716
+ # Only process create and update actions
717
+ if action not in ("create", "update"):
718
+ return JSONResponse({"status": "ignored", "reason": f"action {action} not handled"})
719
+
720
+ # Check if it looks like a chess model
721
+ if not is_chess_model(repo_name):
722
+ return JSONResponse({"status": "ignored", "reason": "not a chess model"})
723
+
724
+ # Check if already queued or running
725
+ with eval_lock:
726
+ current_status = eval_status.get(repo_name)
727
+ if current_status == "running":
728
+ return JSONResponse({"status": "ignored", "reason": "evaluation already running"})
729
+ if current_status == "queued":
730
+ return JSONResponse({"status": "ignored", "reason": "already in queue"})
731
+ eval_status[repo_name] = "queued"
732
+
733
+ # Queue the model for evaluation
734
+ eval_queue.put(repo_name)
735
+ queue_size = eval_queue.qsize()
736
+
737
+ print(f"[Webhook] Queued {repo_name} for evaluation (queue size: {queue_size})")
738
+
739
+ return JSONResponse({
740
+ "status": "queued",
741
+ "model_id": repo_name,
742
+ "queue_position": queue_size,
743
+ })
744
+
745
+
746
+ @demo.app.get("/webhook/status")
747
+ async def webhook_status():
748
+ """Get the current status of the evaluation queue."""
749
+ with eval_lock:
750
+ status_copy = dict(eval_status)
751
+
752
+ return JSONResponse({
753
+ "queue_size": eval_queue.qsize(),
754
+ "evaluations": status_copy,
755
+ })
756
+
757
+
758
  if __name__ == "__main__":
759
  demo.launch(server_name="0.0.0.0", server_port=7860)
example_solution/README.md ADDED
@@ -0,0 +1,109 @@
 
 
1
+ # Example Solution
2
+
3
+ This folder contains a complete reference implementation for the Chess Challenge.
4
+
5
+ **Use this to understand the expected format**: see how `model.py`, `tokenizer.py`, and the configuration files should be structured.
6
+
7
+ ## Files Included
8
+
9
+ | File | Description |
10
+ |------|-------------|
11
+ | `model.py` | Custom transformer architecture |
12
+ | `tokenizer.py` | Custom move-level tokenizer |
13
+ | `train.py` | Training script |
14
+ | `data.py` | Dataset utilities |
15
+ | `config.json` | Model configuration with auto_map |
16
+ | `model.safetensors` | Trained model weights |
17
+ | `vocab.json` | Tokenizer vocabulary |
18
+ | `tokenizer_config.json` | Tokenizer configuration with auto_map |
19
+ | `special_tokens_map.json` | Special token mappings |
20
+
21
+ ## Model Architecture
22
+
23
+ This example uses a small GPT-style transformer:
24
+
25
+ | Parameter | Value |
26
+ |-----------|-------|
27
+ | Embedding dim | 128 |
28
+ | Layers | 4 |
29
+ | Attention heads | 4 |
30
+ | Context length | 256 |
31
+ | Total parameters | ~910K |
32
+
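The ~910K figure can be sanity-checked with back-of-envelope arithmetic. The breakdown below assumes standard GPT-style blocks with bias terms and two LayerNorms per layer; the exact split depends on the implementation in `model.py`, so treat this as an estimate:

```python
# Rough parameter count for the example configuration
# (assumes GPT-style blocks with biases; exact numbers depend on model.py)
vocab_size, n_ctx, n_embd, n_inner, n_layer = 1682, 256, 128, 384, 4

embeddings = vocab_size * n_embd + n_ctx * n_embd   # token + positional tables
attn = 4 * (n_embd * n_embd + n_embd)               # q, k, v, and output projections
mlp = (n_embd * n_inner + n_inner) + (n_inner * n_embd + n_embd)
norms = 2 * 2 * n_embd                              # two LayerNorms (weight + bias)
per_layer = attn + mlp + norms

total = embeddings + n_layer * per_layer + 2 * n_embd  # plus a final LayerNorm
print(f"{total:,}")  # → 909,824
```

With weight tying enabled, the output head reuses the embedding matrix and adds no parameters, which is what keeps this configuration under the 1M budget.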
33
+ ## Training Details
34
+
35
+ The model was trained on the Lichess dataset with:
36
+ - 3 epochs
37
+ - Batch size 32
38
+ - Learning rate 5e-4
39
+ - Weight tying (embedding = output layer)
40
+
41
+ ## How to Use This Example
42
+
43
+ ### Load the model:
44
+
45
+ ```python
46
+ from transformers import AutoModelForCausalLM, AutoTokenizer
47
+
48
+ model = AutoModelForCausalLM.from_pretrained("./example_solution", trust_remote_code=True)
49
+ tokenizer = AutoTokenizer.from_pretrained("./example_solution", trust_remote_code=True)
50
+ ```
51
+
52
+ ### Generate a move:
53
+
54
+ ```python
55
+ import torch
56
+
57
+ # Game history in the format: WPe2e4 BPe7e5 WNg1f3 ...
58
+ history = "[BOS] WPe2e4 BPe7e5"
59
+
60
+ inputs = tokenizer(history, return_tensors="pt")
61
+ with torch.no_grad():
62
+ outputs = model(**inputs)
63
+ next_token = outputs.logits[0, -1].argmax()
64
+
65
+ predicted_move = tokenizer.decode([next_token])
66
+ print(f"Predicted move: {predicted_move}")
67
+ ```
68
+
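The decoded token follows the extended UCI format from the dataset: a color prefix (`W`/`B`), a piece letter, the from- and to-squares, and an optional parenthesized annotation. A small illustrative parser (not part of the template) for inspecting predictions:

```python
import re

# Matches tokens such as "WPe2e4", "BBb4c3(x+)", or "BKe8g8(o)".
# The annotation suffix is kept opaque here; the vocabulary uses e.g.
# x for captures, + for checks, Q for promotions, o/O for castling.
MOVE_RE = re.compile(r"^([WB])([KQRBNP])([a-h][1-8])([a-h][1-8])(?:\((.+)\))?$")

def parse_move_token(token):
    """Split a move token into (color, piece, from_sq, to_sq, annotation)."""
    match = MOVE_RE.match(token)
    if match is None:
        raise ValueError(f"not a move token: {token!r}")
    return match.groups()

print(parse_move_token("WPe2e4"))      # → ('W', 'P', 'e2', 'e4', None)
print(parse_move_token("BBb4c3(x+)"))  # → ('B', 'B', 'b4', 'c3', 'x+')
```

Special tokens like `[BOS]` and `[EOS]` fail this pattern by design, which makes the parser a cheap filter for non-move predictions.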
69
+ ## Evaluation
70
+
71
+ To evaluate this example:
72
+
73
+ ```bash
74
+ python -m src.evaluate --model_path ./example_solution
75
+ ```
76
+
77
+ ## Key Implementation Details
78
+
79
+ ### auto_map Configuration
80
+
81
+ The `config.json` contains:
82
+ ```json
83
+ {
84
+ "auto_map": {
85
+ "AutoConfig": "model.ChessConfig",
86
+ "AutoModelForCausalLM": "model.ChessForCausalLM"
87
+ }
88
+ }
89
+ ```
90
+
91
+ The `tokenizer_config.json` contains:
92
+ ```json
93
+ {
94
+ "auto_map": {
95
+ "AutoTokenizer": ["tokenizer.ChessTokenizer", null]
96
+ }
97
+ }
98
+ ```
99
+
100
+ Note: `AutoTokenizer` requires a list `[slow_class, fast_class]`, not a string!
101
+
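A quick way to catch this mistake before uploading is to validate the saved config directly. The helper below is hypothetical (not part of the template) and only checks the list shape:

```python
def check_auto_map(tokenizer_config: dict) -> None:
    """Raise if AutoTokenizer's auto_map entry is not a [slow, fast] pair."""
    entry = tokenizer_config.get("auto_map", {}).get("AutoTokenizer")
    if not (isinstance(entry, list) and len(entry) == 2):
        raise ValueError(
            "AutoTokenizer auto_map must be a [slow_class, fast_class] list, "
            f"got: {entry!r}"
        )

# A bare string here would make AutoTokenizer fail to resolve the class.
check_auto_map({"auto_map": {"AutoTokenizer": ["tokenizer.ChessTokenizer", None]}})
print("auto_map shape OK")
```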
102
+ ## Your Turn!
103
+
104
+ Use this as inspiration, but create your own solution! Ideas to explore:
105
+
106
+ 1. **Architecture changes**: Different number of layers, heads, or embedding dimensions
107
+ 2. **Training strategies**: Different learning rates, warmup schedules, or optimizers
108
+ 3. **Data augmentation**: Flip board colors, use different game phases
109
+ 4. **Tokenization**: Different move representation formats
example_solution/config.json ADDED
@@ -0,0 +1,24 @@
 
 
1
+ {
2
+ "architectures": [
3
+ "ChessForCausalLM"
4
+ ],
5
+ "auto_map": {
6
+ "AutoConfig": "model.ChessConfig",
7
+ "AutoModelForCausalLM": "model.ChessForCausalLM"
8
+ },
9
+ "bos_token_id": 1,
10
+ "dropout": 0.1,
11
+ "dtype": "float32",
12
+ "eos_token_id": 2,
13
+ "layer_norm_epsilon": 1e-05,
14
+ "model_type": "chess_transformer",
15
+ "n_ctx": 256,
16
+ "n_embd": 128,
17
+ "n_head": 4,
18
+ "n_inner": 384,
19
+ "n_layer": 4,
20
+ "pad_token_id": 0,
21
+ "tie_weights": true,
22
+ "transformers_version": "4.57.3",
23
+ "vocab_size": 1682
24
+ }
{src → example_solution}/data.py RENAMED
@@ -24,7 +24,7 @@ class ChessDataset(Dataset):
24
  The labels are shifted by one position for next-token prediction.
25
 
26
  Example:
27
- >>> from src.tokenizer import ChessTokenizer
28
  >>> tokenizer = ChessTokenizer.build_vocab_from_dataset()
29
  >>> dataset = ChessDataset(tokenizer, max_length=256)
30
  >>> sample = dataset[0]
 
24
  The labels are shifted by one position for next-token prediction.
25
 
26
  Example:
27
+ >>> from tokenizer import ChessTokenizer
28
  >>> tokenizer = ChessTokenizer.build_vocab_from_dataset()
29
  >>> dataset = ChessDataset(tokenizer, max_length=256)
30
  >>> sample = dataset[0]
{src → example_solution}/model.py RENAMED
File without changes
example_solution/special_tokens_map.json ADDED
@@ -0,0 +1,6 @@
 
 
1
+ {
2
+ "bos_token": "[BOS]",
3
+ "eos_token": "[EOS]",
4
+ "pad_token": "[PAD]",
5
+ "unk_token": "[UNK]"
6
+ }
{src → example_solution}/tokenizer.py RENAMED
File without changes
example_solution/tokenizer_config.json ADDED
@@ -0,0 +1,50 @@
 
 
1
+ {
2
+ "added_tokens_decoder": {
3
+ "0": {
4
+ "content": "[PAD]",
5
+ "lstrip": false,
6
+ "normalized": false,
7
+ "rstrip": false,
8
+ "single_word": false,
9
+ "special": true
10
+ },
11
+ "1": {
12
+ "content": "[BOS]",
13
+ "lstrip": false,
14
+ "normalized": false,
15
+ "rstrip": false,
16
+ "single_word": false,
17
+ "special": true
18
+ },
19
+ "2": {
20
+ "content": "[EOS]",
21
+ "lstrip": false,
22
+ "normalized": false,
23
+ "rstrip": false,
24
+ "single_word": false,
25
+ "special": true
26
+ },
27
+ "3": {
28
+ "content": "[UNK]",
29
+ "lstrip": false,
30
+ "normalized": false,
31
+ "rstrip": false,
32
+ "single_word": false,
33
+ "special": true
34
+ }
35
+ },
36
+ "auto_map": {
37
+ "AutoTokenizer": [
38
+ "tokenizer.ChessTokenizer",
39
+ "tokenizer.ChessTokenizer"
40
+ ]
41
+ },
42
+ "bos_token": "[BOS]",
43
+ "clean_up_tokenization_spaces": false,
44
+ "eos_token": "[EOS]",
45
+ "extra_special_tokens": {},
46
+ "model_max_length": 1000000000000000019884624838656,
47
+ "pad_token": "[PAD]",
48
+ "tokenizer_class": "ChessTokenizer",
49
+ "unk_token": "[UNK]"
50
+ }
{src → example_solution}/train.py RENAMED
@@ -22,10 +22,16 @@ from transformers import (
22
  set_seed,
23
  )
24
 
25
- from src.data import ChessDataCollator, create_train_val_datasets
26
- from src.model import ChessConfig, ChessForCausalLM
27
- from src.tokenizer import ChessTokenizer
28
- from src.utils import count_parameters, print_parameter_budget
 
 
29
 
30
 
31
  def parse_args():
@@ -168,8 +174,13 @@ def main():
168
  eos_token_id=tokenizer.eos_token_id,
169
  )
170
 
171
- # Print parameter budget
172
- print_parameter_budget(config)
 
 
173
 
174
  # Create model
175
  print("\nCreating model...")
@@ -180,7 +191,7 @@ def main():
180
  if n_params > 1_000_000:
181
  print("WARNING: Model exceeds 1M parameter limit!")
182
  else:
183
- print("Model is within 1M parameter limit")
184
 
185
  # Load datasets
186
  print("\nLoading datasets...")
@@ -235,11 +246,44 @@ def main():
235
 
236
  # Save final model
237
  print("\nSaving final model...")
238
- trainer.save_model(os.path.join(args.output_dir, "final_model"))
239
- tokenizer.save_pretrained(os.path.join(args.output_dir, "final_model"))
 
 
240
 
241
  print("\nTraining complete!")
242
- print(f" Model saved to: {args.output_dir}/final_model")
 
243
 
244
 
245
  if __name__ == "__main__":
 
22
  set_seed,
23
  )
24
 
25
+ from data import ChessDataCollator, create_train_val_datasets
26
+ from model import ChessConfig, ChessForCausalLM
27
+ from tokenizer import ChessTokenizer
28
+
29
+
30
+ def count_parameters(model, trainable_only=True):
31
+ """Count the number of parameters in a model."""
32
+ if trainable_only:
33
+ return sum(p.numel() for p in model.parameters() if p.requires_grad)
34
+ return sum(p.numel() for p in model.parameters())
35
 
36
 
37
  def parse_args():
 
174
  eos_token_id=tokenizer.eos_token_id,
175
  )
176
 
177
+ # Print configuration
178
+ print(f"\nModel configuration:")
179
+ print(f" vocab_size: {config.vocab_size}")
180
+ print(f" n_embd: {config.n_embd}")
181
+ print(f" n_layer: {config.n_layer}")
182
+ print(f" n_head: {config.n_head}")
183
+ print(f" tie_weights: {config.tie_weights}")
184
 
185
  # Create model
186
  print("\nCreating model...")
 
191
  if n_params > 1_000_000:
192
  print("WARNING: Model exceeds 1M parameter limit!")
193
  else:
194
+ print("OK: Model is within 1M parameter limit")
195
 
196
  # Load datasets
197
  print("\nLoading datasets...")
 
246
 
247
  # Save final model
248
  print("\nSaving final model...")
249
+ final_model_dir = os.path.join(args.output_dir, "final_model")
250
+ trainer.save_model(final_model_dir)
251
+ tokenizer.save_pretrained(final_model_dir)
252
+
253
+ # Copy model.py and tokenizer.py for trust_remote_code loading
254
+ import shutil
255
+ import json
256
+ script_dir = Path(__file__).parent
257
+ shutil.copy(script_dir / "model.py", final_model_dir)
258
+ shutil.copy(script_dir / "tokenizer.py", final_model_dir)
259
+ print(" Copied model.py and tokenizer.py")
260
+
261
+ # Add auto_map to config.json for AutoModelForCausalLM
262
+ config_path = os.path.join(final_model_dir, "config.json")
263
+ with open(config_path) as f:
264
+ config_dict = json.load(f)
265
+ config_dict["auto_map"] = {
266
+ "AutoConfig": "model.ChessConfig",
267
+ "AutoModelForCausalLM": "model.ChessForCausalLM",
268
+ }
269
+ with open(config_path, "w") as f:
270
+ json.dump(config_dict, f, indent=2)
271
+ print(" Added auto_map to config.json")
272
+
273
+ # Add auto_map to tokenizer_config.json for AutoTokenizer
274
+ tokenizer_config_path = os.path.join(final_model_dir, "tokenizer_config.json")
275
+ with open(tokenizer_config_path) as f:
276
+ tokenizer_dict = json.load(f)
277
+ tokenizer_dict["auto_map"] = {
278
+ "AutoTokenizer": ["tokenizer.ChessTokenizer", None],
279
+ }
280
+ with open(tokenizer_config_path, "w") as f:
281
+ json.dump(tokenizer_dict, f, indent=2)
282
+ print(" Added auto_map to tokenizer_config.json")
283
 
284
  print("\nTraining complete!")
285
+ print(f" Model saved to: {final_model_dir}")
286
+ print(" Ready for submission with: python submit.py --model_path " + final_model_dir)
287
 
288
 
289
  if __name__ == "__main__":
example_solution/vocab.json ADDED
@@ -0,0 +1,1684 @@
 
 
1
+ {
2
+ "[PAD]": 0,
3
+ "[BOS]": 1,
4
+ "[EOS]": 2,
5
+ "[UNK]": 3,
6
+ "BBa5b6": 4,
7
+ "BBa6b7": 5,
8
+ "BBb4a5": 6,
9
+ "BBb4c3(x)": 7,
10
+ "BBb4c3(x+)": 8,
11
+ "BBb4c5": 9,
12
+ "BBb4d2(x)": 10,
13
+ "BBb4d2(x+)": 11,
14
+ "BBb4d6": 12,
15
+ "BBb4e7": 13,
16
+ "BBb6c7": 14,
17
+ "BBb6d4(x)": 15,
18
+ "BBb7a6": 16,
19
+ "BBb7c6": 17,
20
+ "BBb7c6(x)": 18,
21
+ "BBb7c8": 19,
22
+ "BBb7d5": 20,
23
+ "BBb7d5(x)": 21,
24
+ "BBb7e4(x)": 22,
25
+ "BBb7f3(x)": 23,
26
+ "BBb7g2(x)": 24,
27
+ "BBc5a7": 25,
28
+ "BBc5b4": 26,
29
+ "BBc5b4(+)": 27,
30
+ "BBc5b6": 28,
31
+ "BBc5d4": 29,
32
+ "BBc5d4(x)": 30,
33
+ "BBc5d6": 31,
34
+ "BBc5e3(x)": 32,
35
+ "BBc5e7": 33,
36
+ "BBc5f2(x+)": 34,
37
+ "BBc6b5": 35,
38
+ "BBc6d5": 36,
39
+ "BBc6d5(x)": 37,
40
+ "BBc6d7": 38,
41
+ "BBc6e4(x)": 39,
42
+ "BBc6f3(x)": 40,
43
+ "BBc8a6": 41,
44
+ "BBc8b7": 42,
45
+ "BBc8d7": 43,
46
+ "BBc8d7(x)": 44,
47
+ "BBc8e6": 45,
48
+ "BBc8e6(x)": 46,
49
+ "BBc8f5": 47,
50
+ "BBc8f5(x)": 48,
51
+ "BBc8g4": 49,
52
+ "BBc8g4(x)": 50,
53
+ "BBc8h3": 51,
54
+ "BBc8h3(x)": 52,
55
+ "BBd5e6": 53,
56
+ "BBd5f3(x)": 54,
57
+ "BBd6b4": 55,
58
+ "BBd6c5": 56,
59
+ "BBd6c5(x)": 57,
60
+ "BBd6c7": 58,
61
+ "BBd6e5": 59,
62
+ "BBd6e5(x)": 60,
63
+ "BBd6e7": 61,
64
+ "BBd6f4": 62,
65
+ "BBd6f4(x)": 63,
66
+ "BBd6g3(x)": 64,
67
+ "BBd6h2(x+)": 65,
68
+ "BBd7b5": 66,
69
+ "BBd7b5(x)": 67,
70
+ "BBd7c6": 68,
71
+ "BBd7c6(x)": 69,
72
+ "BBd7c8": 70,
73
+ "BBd7e6": 71,
74
+ "BBd7e8": 72,
75
+ "BBd7f5": 73,
76
+ "BBd7f5(x)": 74,
77
+ "BBd7g4": 75,
78
+ "BBe4f3(x)": 76,
79
+ "BBe4g6": 77,
80
+ "BBe5d6": 78,
81
+ "BBe5f6": 79,
82
+ "BBe5g7": 80,
83
+ "BBe6a2(x)": 81,
84
+ "BBe6b3(x)": 82,
85
+ "BBe6c4": 83,
86
+ "BBe6c4(x)": 84,
87
+ "BBe6d5": 85,
88
+ "BBe6d5(x)": 86,
89
+ "BBe6d7": 87,
90
+ "BBe6f5": 88,
91
+ "BBe6f5(x)": 89,
92
+ "BBe6f7": 90,
93
+ "BBe6g4": 91,
94
+ "BBe6h3(x)": 92,
95
+ "BBe7b4": 93,
96
+ "BBe7c5": 94,
97
+ "BBe7c5(x)": 95,
98
+ "BBe7d6": 96,
99
+ "BBe7d6(x)": 97,
100
+ "BBe7d8": 98,
101
+ "BBe7f6": 99,
102
+ "BBe7f6(x)": 100,
103
+ "BBe7f8": 101,
104
+ "BBe7g5": 102,
105
+ "BBe7g5(x)": 103,
106
+ "BBe7h4": 104,
107
+ "BBe7h4(x)": 105,
108
+ "BBf5c2(x)": 106,
109
+ "BBf5d3": 107,
110
+ "BBf5d3(x)": 108,
111
+ "BBf5d7": 109,
112
+ "BBf5e4": 110,
113
+ "BBf5e4(x)": 111,
114
+ "BBf5e6": 112,
115
+ "BBf5g4": 113,
116
+ "BBf5g6": 114,
117
+ "BBf6b2(x)": 115,
118
+ "BBf6c3(x)": 116,
119
+ "BBf6d4(x)": 117,
120
+ "BBf6e5": 118,
121
+ "BBf6e5(x)": 119,
122
+ "BBf6e7": 120,
123
+ "BBf6g5": 121,
124
+ "BBf6g7": 122,
125
+ "BBf8b4": 123,
126
+ "BBf8b4(+)": 124,
127
+ "BBf8c5": 125,
128
+ "BBf8c5(+)": 126,
129
+ "BBf8c5(x)": 127,
130
+ "BBf8d6": 128,
131
+ "BBf8d6(x)": 129,
132
+ "BBf8e7": 130,
133
+ "BBf8e7(x)": 131,
134
+ "BBf8g7": 132,
135
+ "BBf8h6": 133,
136
+ "BBg4d1(x)": 134,
137
+ "BBg4d7": 135,
138
+ "BBg4e2(x)": 136,
139
+ "BBg4e6": 137,
140
+ "BBg4f3(x)": 138,
141
+ "BBg4f5": 139,
142
+ "BBg4h5": 140,
143
+ "BBg5f6": 141,
144
+ "BBg6d3(x)": 142,
145
+ "BBg6e4(x)": 143,
146
+ "BBg6h7": 144,
147
+ "BBg7b2(x)": 145,
148
+ "BBg7c3(x)": 146,
149
+ "BBg7d4(x)": 147,
150
+ "BBg7e5": 148,
151
+ "BBg7e5(x)": 149,
152
+ "BBg7f6": 150,
153
+ "BBg7f6(x)": 151,
154
+ "BBg7f8": 152,
155
+ "BBg7h6": 153,
156
+ "BBg7h6(x)": 154,
157
+ "BBh3g2(x)": 155,
158
+ "BBh5f3(x)": 156,
159
+ "BBh5g6": 157,
160
+ "BBh6g7": 158,
161
+ "BKb6a5": 159,
162
+ "BKb6b5": 160,
163
+ "BKb6c5": 161,
164
+ "BKb6c6": 162,
165
+ "BKb6c7": 163,
166
+ "BKb7a6": 164,
167
+ "BKb7b6": 165,
168
+ "BKb7c6": 166,
169
+ "BKb7c7": 167,
170
+ "BKb8a7": 168,
171
+ "BKb8a8": 169,
172
+ "BKb8b7": 170,
173
+ "BKb8c7": 171,
174
+ "BKb8c8": 172,
175
+ "BKc5b4": 173,
176
+ "BKc5c4": 174,
177
+ "BKc5d4": 175,
178
+ "BKc5d6": 176,
179
+ "BKc6b5": 177,
180
+ "BKc6b6": 178,
181
+ "BKc6b7": 179,
182
+ "BKc6c5": 180,
183
+ "BKc6c7": 181,
184
+ "BKc6d5": 182,
185
+ "BKc6d6": 183,
186
+ "BKc6d7": 184,
187
+ "BKc7b6": 185,
188
+ "BKc7b7": 186,
189
+ "BKc7b8": 187,
190
+ "BKc7c6": 188,
191
+ "BKc7c8": 189,
192
+ "BKc7d6": 190,
193
+ "BKc7d7": 191,
194
+ "BKc7d8": 192,
195
+ "BKc8b7": 193,
196
+ "BKc8b8": 194,
197
+ "BKc8c7": 195,
198
+ "BKc8d7": 196,
199
+ "BKc8d8": 197,
200
+ "BKd4c3": 198,
201
+ "BKd5c4": 199,
202
+ "BKd5c5": 200,
203
+ "BKd5c6": 201,
204
+ "BKd5d4": 202,
205
+ "BKd5e4": 203,
206
+ "BKd5e6": 204,
207
+ "BKd6c5": 205,
208
+ "BKd6c6": 206,
209
+ "BKd6c7": 207,
210
+ "BKd6d5": 208,
211
+ "BKd6d7": 209,
212
+ "BKd6e5": 210,
213
+ "BKd6e6": 211,
214
+ "BKd6e7": 212,
215
+ "BKd7c6": 213,
216
+ "BKd7c7": 214,
217
+ "BKd7c8": 215,
218
+ "BKd7d6": 216,
219
+ "BKd7d8": 217,
220
+ "BKd7e6": 218,
221
+ "BKd7e7": 219,
222
+ "BKd7e8": 220,
223
+ "BKd8c7": 221,
224
+ "BKd8c8": 222,
225
+ "BKd8d7": 223,
226
+ "BKd8e7": 224,
227
+ "BKd8e8": 225,
228
+ "BKe4d3": 226,
229
+ "BKe5d4": 227,
230
+ "BKe5d5": 228,
231
+ "BKe5d6": 229,
232
+ "BKe5e4": 230,
233
+ "BKe5f4": 231,
234
+ "BKe5f5": 232,
235
+ "BKe5f6": 233,
236
+ "BKe6d5": 234,
237
+ "BKe6d6": 235,
238
+ "BKe6d7": 236,
239
+ "BKe6e5": 237,
240
+ "BKe6e7": 238,
241
+ "BKe6f5": 239,
242
+ "BKe6f6": 240,
243
+ "BKe6f7": 241,
244
+ "BKe7d6": 242,
245
+ "BKe7d7": 243,
246
+ "BKe7d8": 244,
247
+ "BKe7e6": 245,
248
+ "BKe7e8": 246,
249
+ "BKe7f6": 247,
250
+ "BKe7f7": 248,
251
+ "BKe7f8": 249,
252
+ "BKe8c8(O)": 250,
253
+ "BKe8d7": 251,
254
+ "BKe8d7(x)": 252,
255
+ "BKe8d8": 253,
256
+ "BKe8d8(x)": 254,
257
+ "BKe8e7": 255,
258
+ "BKe8e7(x)": 256,
259
+ "BKe8f7": 257,
260
+ "BKe8f7(x)": 258,
261
+ "BKe8f8": 259,
262
+ "BKe8g8(o)": 260,
263
+ "BKf4f3": 261,
264
+ "BKf5e4": 262,
265
+ "BKf5e5": 263,
266
+ "BKf5e6": 264,
267
+ "BKf5f4": 265,
268
+ "BKf5f6": 266,
269
+ "BKf5g4": 267,
270
+ "BKf5g5": 268,
271
+ "BKf5g6": 269,
272
+ "BKf6e5": 270,
273
+ "BKf6e6": 271,
274
+ "BKf6e7": 272,
275
+ "BKf6f5": 273,
276
+ "BKf6f7": 274,
277
+ "BKf6g5": 275,
278
+ "BKf6g6": 276,
279
+ "BKf6g7": 277,
280
+ "BKf7e6": 278,
281
+ "BKf7e7": 279,
282
+ "BKf7e8": 280,
283
+ "BKf7f6": 281,
284
+ "BKf7f8": 282,
285
+ "BKf7g6": 283,
286
+ "BKf7g7": 284,
287
+ "BKf7g8": 285,
288
+ "BKf8e7": 286,
289
+ "BKf8e8": 287,
290
+ "BKf8f7": 288,
291
+ "BKf8g7": 289,
292
+ "BKf8g8": 290,
293
+ "BKg5f4": 291,
294
+ "BKg5f5": 292,
295
+ "BKg5f6": 293,
296
+ "BKg5g4": 294,
297
+ "BKg5h4": 295,
298
+ "BKg6f5": 296,
299
+ "BKg6f6": 297,
300
+ "BKg6f7": 298,
301
+ "BKg6g5": 299,
302
+ "BKg6g7": 300,
303
+ "BKg6h5": 301,
304
+ "BKg6h6": 302,
305
+ "BKg6h7": 303,
306
+ "BKg7f6": 304,
307
+ "BKg7f7": 305,
308
+ "BKg7f8": 306,
309
+ "BKg7g6": 307,
310
+ "BKg7g8": 308,
311
+ "BKg7h6": 309,
312
+ "BKg7h7": 310,
313
+ "BKg7h8": 311,
314
+ "BKg8f7": 312,
315
+ "BKg8f7(x)": 313,
316
+ "BKg8f8": 314,
317
+ "BKg8f8(x)": 315,
318
+ "BKg8g7": 316,
319
+ "BKg8g7(x)": 317,
320
+ "BKg8h7": 318,
321
+ "BKg8h7(x)": 319,
322
+ "BKg8h8": 320,
323
+ "BKh5g4": 321,
324
+ "BKh5g6": 322,
325
+ "BKh5h4": 323,
326
+ "BKh6g5": 324,
327
+ "BKh6g6": 325,
328
+ "BKh6g7": 326,
329
+ "BKh6h5": 327,
330
+ "BKh6h7": 328,
331
+ "BKh7g6": 329,
332
+ "BKh7g7": 330,
333
+ "BKh7g8": 331,
334
+ "BKh7h6": 332,
335
+ "BKh7h8": 333,
336
+ "BKh8g7": 334,
337
+ "BKh8g8": 335,
338
+ "BKh8h7": 336,
339
+ "BNa5b3(x)": 337,
340
+ "BNa5c4": 338,
341
+ "BNa5c4(x)": 339,
342
+ "BNa5c6": 340,
343
+ "BNa6b4": 341,
344
+ "BNa6c5": 342,
345
+ "BNa6c7": 343,
346
+ "BNb4a6": 344,
347
+ "BNb4c2": 345,
348
+ "BNb4c2(x)": 346,
349
+ "BNb4c6": 347,
350
+ "BNb4d3": 348,
351
+ "BNb4d3(x)": 349,
352
+ "BNb4d5": 350,
353
+ "BNb6c4": 351,
354
+ "BNb6c4(x)": 352,
355
+ "BNb6d5": 353,
356
+ "BNb6d5(x)": 354,
357
+ "BNb6d7": 355,
358
+ "BNb8a6": 356,
359
+ "BNb8c6": 357,
360
+ "BNb8c6(x)": 358,
361
+ "BNb8d7": 359,
362
+ "BNb8d7(x)": 360,
363
+ "BNc2a1(x)": 361,
364
+ "BNc4b2(x)": 362,
365
+ "BNc4d6": 363,
366
+ "BNc4e5": 364,
367
+ "BNc5d3": 365,
368
+ "BNc5d3(x)": 366,
369
+ "BNc5d7": 367,
370
+ "BNc5e4": 368,
371
+ "BNc5e4(x)": 369,
372
+ "BNc5e6": 370,
373
+ "BNc6a5": 371,
374
+ "BNc6a7": 372,
375
+ "BNc6b4": 373,
376
+ "BNc6b4(x)": 374,
377
+ "BNc6b8": 375,
378
+ "BNc6d4": 376,
379
+ "BNc6d4(x)": 377,
380
+ "BNc6d8": 378,
381
+ "BNc6d8(x)": 379,
382
+ "BNc6e5": 380,
383
+ "BNc6e5(x)": 381,
384
+ "BNc6e7": 382,
385
+ "BNc6e7(x)": 383,
386
+ "BNd3b2(x)": 384,
387
+ "BNd4b3(x)": 385,
388
+ "BNd4c2(x)": 386,
389
+ "BNd4c6": 387,
390
+ "BNd4e2(+)": 388,
391
+ "BNd4e2(x+)": 389,
392
+ "BNd4e6": 390,
393
+ "BNd4f3(+)": 391,
394
+ "BNd4f3(x+)": 392,
395
+ "BNd4f5": 393,
396
+ "BNd5b4": 394,
397
+ "BNd5b6": 395,
398
+ "BNd5c3": 396,
399
+ "BNd5c3(x)": 397,
400
+ "BNd5e3": 398,
401
+ "BNd5e3(x)": 399,
402
+ "BNd5e7": 400,
403
+ "BNd5f4": 401,
404
+ "BNd5f4(x)": 402,
405
+ "BNd5f6": 403,
406
+ "BNd6e4": 404,
407
+ "BNd6f5": 405,
408
+ "BNd7b6": 406,
409
+ "BNd7b8": 407,
410
+ "BNd7c5": 408,
411
+ "BNd7c5(x)": 409,
412
+ "BNd7e5": 410,
413
+ "BNd7e5(x)": 411,
414
+ "BNd7f6": 412,
415
+ "BNd7f6(x)": 413,
416
+ "BNd7f8": 414,
417
+ "BNe3f1(x)": 415,
418
+ "BNe4c3": 416,
419
+ "BNe4c3(x)": 417,
420
+ "BNe4c5": 418,
421
+ "BNe4d2": 419,
422
+ "BNe4d2(x)": 420,
423
+ "BNe4d6": 421,
424
+ "BNe4f2(x)": 422,
425
+ "BNe4f6": 423,
426
+ "BNe4g3(x)": 424,
427
+ "BNe4g5": 425,
428
+ "BNe4g5(x)": 426,
429
+ "BNe5c4": 427,
430
+ "BNe5c4(x)": 428,
431
+ "BNe5c6": 429,
432
+ "BNe5d3": 430,
433
+ "BNe5d3(x)": 431,
434
+ "BNe5d7": 432,
435
+ "BNe5f3(+)": 433,
436
+ "BNe5f3(x+)": 434,
437
+ "BNe5g4": 435,
438
+ "BNe5g6": 436,
439
+ "BNe6d4": 437,
440
+ "BNe6f4": 438,
441
+ "BNe7c6": 439,
442
+ "BNe7c6(x)": 440,
443
+ "BNe7c8": 441,
444
+ "BNe7d5": 442,
445
+ "BNe7d5(x)": 443,
446
+ "BNe7f5": 444,
447
+ "BNe7f5(x)": 445,
448
+ "BNe7g6": 446,
449
+ "BNe8d6": 447,
450
+ "BNe8f6": 448,
451
+ "BNf4e2(+)": 449,
452
+ "BNf5d4": 450,
453
+ "BNf5d4(x)": 451,
454
+ "BNf5d6": 452,
455
+ "BNf5e3": 453,
456
+ "BNf5e3(x)": 454,
457
+ "BNf5e7": 455,
458
+ "BNf5h4": 456,
459
+ "BNf6d5": 457,
460
+ "BNf6d5(x)": 458,
461
+ "BNf6d7": 459,
462
+ "BNf6d7(x)": 460,
463
+ "BNf6e4": 461,
464
+ "BNf6e4(x)": 462,
465
+ "BNf6e8": 463,
466
+ "BNf6g4": 464,
467
+ "BNf6g4(x)": 465,
468
+ "BNf6g8": 466,
469
+ "BNf6h5": 467,
470
+ "BNf6h5(x)": 468,
471
+ "BNf6h7": 469,
472
+ "BNf8e6": 470,
473
+ "BNf8g6": 471,
474
+ "BNg4e3": 472,
475
+ "BNg4e3(x)": 473,
476
+ "BNg4e5": 474,
477
+ "BNg4e5(x)": 475,
478
+ "BNg4f2(x)": 476,
479
+ "BNg4f6": 477,
480
+ "BNg4h6": 478,
481
+ "BNg6e5": 479,
482
+ "BNg6e5(x)": 480,
483
+ "BNg6e7": 481,
484
+ "BNg6f4": 482,
485
+ "BNg6f4(x)": 483,
486
+ "BNg6h4": 484,
487
+ "BNg8e7": 485,
488
+ "BNg8f6": 486,
489
+ "BNg8f6(x)": 487,
490
+ "BNg8h6": 488,
491
+ "BNh5f4": 489,
492
+ "BNh5f4(x)": 490,
493
+ "BNh5f6": 491,
494
+ "BNh5g3(x)": 492,
495
+ "BNh6f5": 493,
496
+ "BNh6f7": 494,
497
+ "BNh6g4": 495,
498
+ "BNh7f6": 496,
499
+ "BNh7g5": 497,
500
+ "BPa2a1(Q)": 498,
501
+ "BPa3a2": 499,
502
+ "BPa4a3": 500,
503
+ "BPa4b3(x)": 501,
504
+ "BPa5a4": 502,
505
+ "BPa5b4(x)": 503,
506
+ "BPa6a5": 504,
507
+ "BPa6b5(x)": 505,
508
+ "BPa7a5": 506,
509
+ "BPa7a6": 507,
510
+ "BPa7b6(x)": 508,
511
+ "BPb2b1(Q)": 509,
512
+ "BPb3b2": 510,
513
+ "BPb4a3(x)": 511,
514
+ "BPb4b3": 512,
515
+ "BPb4c3(x)": 513,
516
+ "BPb5a4(x)": 514,
517
+ "BPb5b4": 515,
518
+ "BPb5c4(x)": 516,
519
+ "BPb6a5(x)": 517,
520
+ "BPb6b5": 518,
521
+ "BPb6c5(x)": 519,
522
+ "BPb7a6(x)": 520,
523
+ "BPb7b5": 521,
524
+ "BPb7b6": 522,
525
+ "BPb7c6(x)": 523,
526
+ "BPc2c1(Q)": 524,
527
+ "BPc3c2": 525,
528
+ "BPc4b3(x)": 526,
529
+ "BPc4c3": 527,
530
+ "BPc4d3(x)": 528,
531
+ "BPc5b4(x)": 529,
532
+ "BPc5c4": 530,
533
+ "BPc5d4(x)": 531,
534
+ "BPc6b5(x)": 532,
535
+ "BPc6c5": 533,
536
+ "BPc6d5(x)": 534,
537
+ "BPc7b6(x)": 535,
538
+ "BPc7c5": 536,
539
+ "BPc7c6": 537,
540
+ "BPc7d6(x)": 538,
541
+ "BPd3d2": 539,
542
+ "BPd4c3(x)": 540,
543
+ "BPd4d3": 541,
544
+ "BPd4e3(x)": 542,
545
+ "BPd5c4(x)": 543,
546
+ "BPd5d4": 544,
547
+ "BPd5e4(x)": 545,
548
+ "BPd6c5(x)": 546,
549
+ "BPd6d5": 547,
550
+ "BPd6e5(x)": 548,
551
+ "BPd7c6(x)": 549,
552
+ "BPd7d5": 550,
553
+ "BPd7d6": 551,
554
+ "BPe3e2": 552,
555
+ "BPe4d3(x)": 553,
556
+ "BPe4e3": 554,
557
+ "BPe4f3(x)": 555,
558
+ "BPe5d4(x)": 556,
559
+ "BPe5e4": 557,
560
+ "BPe5f4(x)": 558,
561
+ "BPe6d5(x)": 559,
562
+ "BPe6e5": 560,
563
+ "BPe6f5(x)": 561,
564
+ "BPe7d6(x)": 562,
565
+ "BPe7e5": 563,
566
+ "BPe7e6": 564,
567
+ "BPe7f6(x)": 565,
568
+ "BPf3f2": 566,
569
+ "BPf4e3(x)": 567,
570
+ "BPf4f3": 568,
571
+ "BPf4g3(x)": 569,
572
+ "BPf5e4(x)": 570,
573
+ "BPf5f4": 571,
574
+ "BPf5g4(x)": 572,
575
+ "BPf6e5(x)": 573,
576
+ "BPf6f5": 574,
577
+ "BPf6g5(x)": 575,
578
+ "BPf7e6(x)": 576,
579
+ "BPf7f5": 577,
580
+ "BPf7f6": 578,
581
+ "BPf7g6(x)": 579,
582
+ "BPg2g1(Q)": 580,
583
+ "BPg3g2": 581,
584
+ "BPg4f3(x)": 582,
585
+ "BPg4g3": 583,
586
+ "BPg4h3(x)": 584,
587
+ "BPg5f4(x)": 585,
588
+ "BPg5g4": 586,
589
+ "BPg5h4(x)": 587,
590
+ "BPg6f5(x)": 588,
591
+ "BPg6g5": 589,
592
+ "BPg6h5(x)": 590,
593
+ "BPg7f6(x)": 591,
594
+ "BPg7g5": 592,
595
+ "BPg7g6": 593,
596
+ "BPg7h6(x)": 594,
597
+ "BPh2h1(Q)": 595,
598
+ "BPh3h2": 596,
599
+ "BPh4g3(x)": 597,
600
+ "BPh4h3": 598,
601
+ "BPh5g4(x)": 599,
602
+ "BPh5h4": 600,
603
+ "BPh6g5(x)": 601,
604
+ "BPh6h5": 602,
605
+ "BPh7g6(x)": 603,
606
+ "BPh7h5": 604,
607
+ "BPh7h6": 605,
608
+ "BQa5b6": 606,
609
+ "BQa5c7": 607,
610
+ "BQa5d8": 608,
611
+ "BQb2a2(x)": 609,
612
+ "BQb4b2(x)": 610,
613
+ "BQb6a5": 611,
614
+ "BQb6b2(x)": 612,
615
+ "BQb6c6": 613,
616
+ "BQb6c7": 614,
617
+ "BQb6d4(x)": 615,
618
+ "BQb6d8": 616,
619
+ "BQc7a5": 617,
620
+ "BQc7b6": 618,
621
+ "BQc7b7": 619,
622
+ "BQc7c6": 620,
623
+ "BQc7d6": 621,
624
+ "BQc7d6(x)": 622,
625
+ "BQc7d7": 623,
626
+ "BQc7d8": 624,
627
+ "BQc7e5(x)": 625,
628
+ "BQc7e7": 626,
629
+ "BQd5a5": 627,
630
+ "BQd5d6": 628,
631
+ "BQd5d8": 629,
632
+ "BQd6c6": 630,
633
+ "BQd6c7": 631,
634
+ "BQd6d7": 632,
635
+ "BQd6e6": 633,
636
+ "BQd6e7": 634,
637
+ "BQd7c6": 635,
638
+ "BQd7c7": 636,
639
+ "BQd7d6": 637,
640
+ "BQd7e6": 638,
641
+ "BQd7e7": 639,
642
+ "BQd7f5": 640,
643
+ "BQd7g4": 641,
644
+ "BQd8a5": 642,
645
+ "BQd8a5(+)": 643,
646
+ "BQd8a8(x)": 644,
647
+ "BQd8b6": 645,
648
+ "BQd8b6(+)": 646,
649
+ "BQd8b8": 647,
650
+ "BQd8c7": 648,
651
+ "BQd8c8": 649,
652
+ "BQd8d1(x)": 650,
653
+ "BQd8d1(x+)": 651,
654
+ "BQd8d4": 652,
655
+ "BQd8d4(x)": 653,
656
+ "BQd8d5": 654,
657
+ "BQd8d5(x)": 655,
658
+ "BQd8d6": 656,
659
+ "BQd8d6(x)": 657,
660
+ "BQd8d7": 658,
661
+ "BQd8d7(x)": 659,
662
+ "BQd8e7": 660,
663
+ "BQd8e7(+)": 661,
664
+ "BQd8e7(x)": 662,
665
+ "BQd8e8": 663,
666
+ "BQd8f6": 664,
667
+ "BQd8f6(x)": 665,
668
+ "BQd8f8": 666,
669
+ "BQd8g5": 667,
670
+ "BQd8g5(x)": 668,
671
+ "BQd8h4": 669,
672
+ "BQd8h4(+)": 670,
673
+ "BQd8h4(x)": 671,
674
+ "BQe7c5": 672,
675
+ "BQe7c7": 673,
676
+ "BQe7d6": 674,
677
+ "BQe7d7": 675,
678
+ "BQe7d8": 676,
679
+ "BQe7e5": 677,
680
+ "BQe7e5(x)": 678,
681
+ "BQe7e6": 679,
682
+ "BQe7e6(x)": 680,
683
+ "BQe7f6": 681,
684
+ "BQe7f6(x)": 682,
685
+ "BQe7f7": 683,
686
+ "BQe7g5": 684,
687
+ "BQe7h4": 685,
688
+ "BQf6d8": 686,
689
+ "BQf6e5(x)": 687,
690
+ "BQf6e6": 688,
691
+ "BQf6e7": 689,
692
+ "BQf6f3(x)": 690,
693
+ "BQf6f5": 691,
694
+ "BQf6g5": 692,
695
+ "BQf6g6": 693,
696
+ "BQg5f6": 694,
697
+ "BQg5g6": 695,
698
+ "BQg6f6": 696,
699
+ "BRa2a1(+)": 697,
700
+ "BRa2b2": 698,
701
+ "BRa8a1(x)": 699,
702
+ "BRa8a2(x)": 700,
703
+ "BRa8a6": 701,
704
+ "BRa8a6(x)": 702,
705
+ "BRa8a7": 703,
706
+ "BRa8b8": 704,
707
+ "BRa8c8": 705,
708
+ "BRa8c8(x)": 706,
709
+ "BRa8d8": 707,
710
+ "BRa8d8(x)": 708,
711
+ "BRa8e8": 709,
712
+ "BRa8e8(x)": 710,
713
+ "BRa8f8": 711,
714
+ "BRa8f8(x)": 712,
715
+ "BRa8g8": 713,
716
+ "BRa8h8": 714,
717
+ "BRb2a2(x)": 715,
718
+ "BRb8a8": 716,
719
+ "BRb8b2": 717,
720
+ "BRb8b2(x)": 718,
721
+ "BRb8b6": 719,
722
+ "BRb8b7": 720,
723
+ "BRb8b7(x)": 721,
724
+ "BRb8c8": 722,
725
+ "BRb8d8": 723,
726
+ "BRb8e8": 724,
727
+ "BRb8f8": 725,
728
+ "BRc2a2(x)": 726,
729
+ "BRc2b2(x)": 727,
730
+ "BRc8a8": 728,
731
+ "BRc8b8": 729,
732
+ "BRc8c1(x)": 730,
733
+ "BRc8c1(x+)": 731,
734
+ "BRc8c2": 732,
735
+ "BRc8c2(x)": 733,
736
+ "BRc8c3": 734,
737
+ "BRc8c3(x)": 735,
738
+ "BRc8c4": 736,
739
+ "BRc8c4(x)": 737,
740
+ "BRc8c5": 738,
741
+ "BRc8c5(x)": 739,
742
+ "BRc8c6": 740,
743
+ "BRc8c6(x)": 741,
744
+ "BRc8c7": 742,
745
+ "BRc8c7(x)": 743,
746
+ "BRc8d8": 744,
747
+ "BRc8e8": 745,
748
+ "BRc8f8": 746,
749
+ "BRd7c7": 747,
750
+ "BRd7e7": 748,
751
+ "BRd8a8": 749,
752
+ "BRd8b8": 750,
753
+ "BRd8c8": 751,
754
+ "BRd8d1(+)": 752,
755
+ "BRd8d1(x)": 753,
756
+ "BRd8d1(x+)": 754,
757
+ "BRd8d2": 755,
758
+ "BRd8d2(x)": 756,
759
+ "BRd8d3": 757,
760
+ "BRd8d3(x)": 758,
761
+ "BRd8d4": 759,
762
+ "BRd8d4(x)": 760,
763
+ "BRd8d5": 761,
764
+ "BRd8d5(x)": 762,
765
+ "BRd8d6": 763,
766
+ "BRd8d6(x)": 764,
767
+ "BRd8d7": 765,
768
+ "BRd8d7(x)": 766,
769
+ "BRd8e8": 767,
770
+ "BRd8f8": 768,
771
+ "BRd8g8": 769,
772
+ "BRd8h8": 770,
773
+ "BRe7d7": 771,
774
+ "BRe8b8": 772,
775
+ "BRe8c8": 773,
776
+ "BRe8d8": 774,
777
+ "BRe8d8(x)": 775,
778
+ "BRe8e1(+)": 776,
779
+ "BRe8e1(x)": 777,
780
+ "BRe8e1(x+)": 778,
781
+ "BRe8e2": 779,
782
+ "BRe8e2(x)": 780,
783
+ "BRe8e3": 781,
784
+ "BRe8e3(x)": 782,
785
+ "BRe8e4": 783,
786
+ "BRe8e4(x)": 784,
787
+ "BRe8e5": 785,
788
+ "BRe8e5(x)": 786,
789
+ "BRe8e6": 787,
790
+ "BRe8e6(x)": 788,
791
+ "BRe8e7": 789,
792
+ "BRe8e7(x)": 790,
793
+ "BRe8f8": 791,
794
+ "BRe8g8": 792,
795
+ "BRf6g6": 793,
796
+ "BRf7e7": 794,
797
+ "BRf7f8": 795,
798
+ "BRf8a8": 796,
799
+ "BRf8a8(x)": 797,
800
+ "BRf8b8": 798,
801
+ "BRf8c8": 799,
802
+ "BRf8c8(x)": 800,
803
+ "BRf8d8": 801,
804
+ "BRf8d8(x)": 802,
805
+ "BRf8e8": 803,
806
+ "BRf8e8(+)": 804,
807
+ "BRf8e8(x)": 805,
808
+ "BRf8f1(x+)": 806,
809
+ "BRf8f2(x)": 807,
810
+ "BRf8f3(x)": 808,
811
+ "BRf8f4": 809,
812
+ "BRf8f4(x)": 810,
813
+ "BRf8f5": 811,
814
+ "BRf8f5(x)": 812,
815
+ "BRf8f6": 813,
816
+ "BRf8f6(x)": 814,
817
+ "BRf8f7": 815,
818
+ "BRf8f7(x)": 816,
819
+ "BRf8g8": 817,
820
+ "BRf8h8": 818,
821
+ "BRg8f8": 819,
822
+ "BRg8g6": 820,
823
+ "BRg8g7": 821,
824
+ "BRg8h8": 822,
825
+ "BRh8c8": 823,
826
+ "BRh8d8": 824,
827
+ "BRh8e8": 825,
828
+ "BRh8f8": 826,
829
+ "BRh8g8": 827,
830
+ "BRh8h6": 828,
831
+ "BRh8h7": 829,
832
+ "WBa3b2": 830,
833
+ "WBa4b3": 831,
834
+ "WBa4c2": 832,
835
+ "WBb2a3": 833,
836
+ "WBb2c1": 834,
837
+ "WBb2c3": 835,
838
+ "WBb2c3(x)": 836,
839
+ "WBb2d4": 837,
840
+ "WBb2d4(x)": 838,
841
+ "WBb2e5(x)": 839,
842
+ "WBb2f6(x)": 840,
843
+ "WBb2g7(x)": 841,
844
+ "WBb3a2": 842,
845
+ "WBb3c2": 843,
846
+ "WBb3d5": 844,
847
+ "WBb3d5(x)": 845,
848
+ "WBb3e6(x)": 846,
849
+ "WBb5a4": 847,
850
+ "WBb5c4": 848,
851
+ "WBb5c6(x)": 849,
852
+ "WBb5c6(x+)": 850,
853
+ "WBb5d3": 851,
854
+ "WBb5d7(x)": 852,
855
+ "WBb5d7(x+)": 853,
856
+ "WBb5e2": 854,
857
+ "WBc1a3": 855,
858
+ "WBc1b2": 856,
859
+ "WBc1d2": 857,
860
+ "WBc1d2(x)": 858,
861
+ "WBc1e3": 859,
862
+ "WBc1e3(x)": 860,
863
+ "WBc1f4": 861,
864
+ "WBc1f4(x)": 862,
865
+ "WBc1g5": 863,
866
+ "WBc1g5(x)": 864,
867
+ "WBc1h6": 865,
868
+ "WBc1h6(x)": 866,
869
+ "WBc2b3": 867,
870
+ "WBc2e4(x)": 868,
871
+ "WBc3d2": 869,
872
+ "WBc4a2": 870,
873
+ "WBc4b3": 871,
874
+ "WBc4b5": 872,
875
+ "WBc4b5(+)": 873,
876
+ "WBc4d3": 874,
877
+ "WBc4d5": 875,
878
+ "WBc4d5(x)": 876,
879
+ "WBc4e2": 877,
880
+ "WBc4e6(x)": 878,
881
+ "WBc4f7(x)": 879,
882
+ "WBc4f7(x+)": 880,
883
+ "WBd2b4": 881,
884
+ "WBd2b4(x)": 882,
885
+ "WBd2c1": 883,
886
+ "WBd2c3": 884,
887
+ "WBd2c3(x)": 885,
888
+ "WBd2e1": 886,
889
+ "WBd2e3": 887,
890
+ "WBd2f4": 888,
891
+ "WBd2f4(x)": 889,
892
+ "WBd2g5": 890,
893
+ "WBd3a6(x)": 891,
894
+ "WBd3b1": 892,
895
+ "WBd3b5": 893,
896
+ "WBd3b5(x)": 894,
897
+ "WBd3c2": 895,
898
+ "WBd3c4": 896,
899
+ "WBd3c4(x)": 897,
900
+ "WBd3e2": 898,
901
+ "WBd3e4": 899,
902
+ "WBd3e4(x)": 900,
903
+ "WBd3f1": 901,
904
+ "WBd3f5": 902,
905
+ "WBd3f5(x)": 903,
906
+ "WBd3g6(x)": 904,
907
+ "WBd3h7(x+)": 905,
908
+ "WBd4e3": 906,
909
+ "WBd4f6(x)": 907,
910
+ "WBd5b3": 908,
911
+ "WBe2b5": 909,
912
+ "WBe2c4": 910,
913
+ "WBe2c4(x)": 911,
914
+ "WBe2d1": 912,
915
+ "WBe2d3": 913,
916
+ "WBe2f1": 914,
917
+ "WBe2f3": 915,
918
+ "WBe2f3(x)": 916,
919
+ "WBe2g4": 917,
920
+ "WBe2g4(x)": 918,
921
+ "WBe2h5": 919,
922
+ "WBe2h5(x)": 920,
923
+ "WBe3a7(x)": 921,
924
+ "WBe3b6(x)": 922,
925
+ "WBe3c1": 923,
926
+ "WBe3c5": 924,
927
+ "WBe3c5(x)": 925,
928
+ "WBe3d2": 926,
929
+ "WBe3d4": 927,
930
+ "WBe3d4(x)": 928,
931
+ "WBe3f2": 929,
932
+ "WBe3f4": 930,
933
+ "WBe3f4(x)": 931,
934
+ "WBe3g5": 932,
935
+ "WBe3g5(x)": 933,
936
+ "WBe3h6": 934,
937
+ "WBe3h6(x)": 935,
938
+ "WBe4d3": 936,
939
+ "WBe4f3": 937,
940
+ "WBe5f6(x)": 938,
941
+ "WBe5g3": 939,
942
+ "WBf1b5": 940,
943
+ "WBf1b5(+)": 941,
944
+ "WBf1c4": 942,
945
+ "WBf1c4(x)": 943,
946
+ "WBf1d3": 944,
947
+ "WBf1d3(x)": 945,
948
+ "WBf1e2": 946,
949
+ "WBf1g2": 947,
950
+ "WBf1h3": 948,
951
+ "WBf3b7(x)": 949,
952
+ "WBf3c6(x)": 950,
953
+ "WBf3d5(x)": 951,
954
+ "WBf3e2": 952,
955
+ "WBf3e4": 953,
956
+ "WBf3e4(x)": 954,
957
+ "WBf3g2": 955,
958
+ "WBf3g4": 956,
959
+ "WBf4c7(x)": 957,
960
+ "WBf4d2": 958,
961
+ "WBf4d6": 959,
962
+ "WBf4d6(x)": 960,
963
+ "WBf4e3": 961,
964
+ "WBf4e5": 962,
965
+ "WBf4e5(x)": 963,
966
+ "WBf4g3": 964,
967
+ "WBf4g5": 965,
968
+ "WBf4h2": 966,
969
+ "WBf4h6": 967,
970
+ "WBg2b7(x)": 968,
971
+ "WBg2c6(x)": 969,
972
+ "WBg2d5(x)": 970,
973
+ "WBg2e4": 971,
974
+ "WBg2e4(x)": 972,
975
+ "WBg2f1": 973,
976
+ "WBg2f3": 974,
977
+ "WBg2f3(x)": 975,
978
+ "WBg2h3": 976,
979
+ "WBg3d6(x)": 977,
980
+ "WBg3e5": 978,
981
+ "WBg3e5(x)": 979,
982
+ "WBg3f2": 980,
983
+ "WBg3h2": 981,
984
+ "WBg3h4": 982,
985
+ "WBg4f3": 983,
986
+ "WBg5d2": 984,
987
+ "WBg5d8(x)": 985,
988
+ "WBg5e3": 986,
989
+ "WBg5e7(x)": 987,
990
+ "WBg5f4": 988,
991
+ "WBg5f6": 989,
992
+ "WBg5f6(x)": 990,
993
+ "WBg5h4": 991,
994
+ "WBg5h6": 992,
995
+ "WBh4e7(x)": 993,
996
+ "WBh4f6(x)": 994,
997
+ "WBh4g3": 995,
998
+ "WBh6f8(x)": 996,
999
+ "WBh6g5": 997,
1000
+ "WBh6g7(x)": 998,
1001
+ "WKb1a1": 999,
1002
+ "WKb1a2": 1000,
1003
+ "WKb1b2": 1001,
1004
+ "WKb1c1": 1002,
1005
+ "WKb1c2": 1003,
1006
+ "WKb2a3": 1004,
1007
+ "WKb2b3": 1005,
1008
+ "WKb2c2": 1006,
1009
+ "WKb2c3": 1007,
1010
+ "WKb3a4": 1008,
1011
+ "WKb3c2": 1009,
1012
+ "WKb3c4": 1010,
1013
+ "WKc1b1": 1011,
1014
+ "WKc1b2": 1012,
1015
+ "WKc1c2": 1013,
1016
+ "WKc1d1": 1014,
1017
+ "WKc1d2": 1015,
1018
+ "WKc2b1": 1016,
1019
+ "WKc2b2": 1017,
1020
+ "WKc2b3": 1018,
1021
+ "WKc2c3": 1019,
1022
+ "WKc2d2": 1020,
1023
+ "WKc2d3": 1021,
1024
+ "WKc3b2": 1022,
1025
+ "WKc3b3": 1023,
1026
+ "WKc3b4": 1024,
1027
+ "WKc3c4": 1025,
1028
+ "WKc3d2": 1026,
1029
+ "WKc3d3": 1027,
1030
+ "WKc3d4": 1028,
1031
+ "WKc4b5": 1029,
1032
+ "WKc4c5": 1030,
1033
+ "WKc4d5": 1031,
1034
+ "WKd1c1": 1032,
1035
+ "WKd1c2": 1033,
1036
+ "WKd1d2": 1034,
1037
+ "WKd1e1": 1035,
1038
+ "WKd1e2": 1036,
1039
+ "WKd2c1": 1037,
1040
+ "WKd2c2": 1038,
1041
+ "WKd2c3": 1039,
1042
+ "WKd2d1": 1040,
1043
+ "WKd2d3": 1041,
1044
+ "WKd2e1": 1042,
1045
+ "WKd2e2": 1043,
1046
+ "WKd2e3": 1044,
1047
+ "WKd3c2": 1045,
1048
+ "WKd3c3": 1046,
1049
+ "WKd3c4": 1047,
1050
+ "WKd3d2": 1048,
1051
+ "WKd3d4": 1049,
1052
+ "WKd3e2": 1050,
1053
+ "WKd3e3": 1051,
1054
+ "WKd3e4": 1052,
1055
+ "WKd4c3": 1053,
1056
+ "WKd4c4": 1054,
1057
+ "WKd4c5": 1055,
1058
+ "WKd4d5": 1056,
1059
+ "WKd4e3": 1057,
1060
+ "WKd4e4": 1058,
1061
+ "WKd4e5": 1059,
1062
+ "WKd5c6": 1060,
1063
+ "WKe1c1(O)": 1061,
1064
+ "WKe1d1": 1062,
1065
+ "WKe1d1(x)": 1063,
1066
+ "WKe1d2": 1064,
1067
+ "WKe1d2(x)": 1065,
1068
+ "WKe1e2": 1066,
1069
+ "WKe1e2(x)": 1067,
1070
+ "WKe1f1": 1068,
1071
+ "WKe1f2": 1069,
1072
+ "WKe1f2(x)": 1070,
1073
+ "WKe1g1(o)": 1071,
1074
+ "WKe2d1": 1072,
1075
+ "WKe2d2": 1073,
1076
+ "WKe2d3": 1074,
1077
+ "WKe2e1": 1075,
1078
+ "WKe2e3": 1076,
1079
+ "WKe2f1": 1077,
1080
+ "WKe2f2": 1078,
1081
+ "WKe2f3": 1079,
1082
+ "WKe3d2": 1080,
1083
+ "WKe3d3": 1081,
1084
+ "WKe3d4": 1082,
1085
+ "WKe3e2": 1083,
1086
+ "WKe3e4": 1084,
1087
+ "WKe3f2": 1085,
1088
+ "WKe3f3": 1086,
1089
+ "WKe3f4": 1087,
1090
+ "WKe4d3": 1088,
1091
+ "WKe4d4": 1089,
1092
+ "WKe4d5": 1090,
1093
+ "WKe4e3": 1091,
1094
+ "WKe4e5": 1092,
1095
+ "WKe4f3": 1093,
1096
+ "WKe4f4": 1094,
1097
+ "WKe4f5": 1095,
1098
+ "WKe5d6": 1096,
1099
+ "WKe5f6": 1097,
1100
+ "WKf1e1": 1098,
1101
+ "WKf1e2": 1099,
1102
+ "WKf1f2": 1100,
1103
+ "WKf1g1": 1101,
1104
+ "WKf1g2": 1102,
1105
+ "WKf2e1": 1103,
1106
+ "WKf2e2": 1104,
1107
+ "WKf2e3": 1105,
1108
+ "WKf2f1": 1106,
1109
+ "WKf2f3": 1107,
1110
+ "WKf2g1": 1108,
1111
+ "WKf2g2": 1109,
1112
+ "WKf2g3": 1110,
1113
+ "WKf3e2": 1111,
1114
+ "WKf3e3": 1112,
1115
+ "WKf3e4": 1113,
1116
+ "WKf3f2": 1114,
1117
+ "WKf3f4": 1115,
1118
+ "WKf3g2": 1116,
1119
+ "WKf3g3": 1117,
1120
+ "WKf3g4": 1118,
1121
+ "WKf4e3": 1119,
1122
+ "WKf4e4": 1120,
1123
+ "WKf4e5": 1121,
1124
+ "WKf4f3": 1122,
1125
+ "WKf4f5": 1123,
1126
+ "WKf4g3": 1124,
1127
+ "WKf4g4": 1125,
1128
+ "WKf4g5": 1126,
1129
+ "WKg1f1": 1127,
1130
+ "WKg1f1(x)": 1128,
1131
+ "WKg1f2": 1129,
1132
+ "WKg1f2(x)": 1130,
1133
+ "WKg1g2": 1131,
1134
+ "WKg1g2(x)": 1132,
1135
+ "WKg1h1": 1133,
1136
+ "WKg1h2": 1134,
1137
+ "WKg1h2(x)": 1135,
1138
+ "WKg2f1": 1136,
1139
+ "WKg2f2": 1137,
1140
+ "WKg2f3": 1138,
1141
+ "WKg2g1": 1139,
1142
+ "WKg2g3": 1140,
1143
+ "WKg2h1": 1141,
1144
+ "WKg2h2": 1142,
1145
+ "WKg2h3": 1143,
1146
+ "WKg3f2": 1144,
1147
+ "WKg3f3": 1145,
1148
+ "WKg3f4": 1146,
1149
+ "WKg3g2": 1147,
1150
+ "WKg3g4": 1148,
1151
+ "WKg3h2": 1149,
1152
+ "WKg3h3": 1150,
1153
+ "WKg3h4": 1151,
1154
+ "WKg4f3": 1152,
1155
+ "WKg4f4": 1153,
1156
+ "WKg4f5": 1154,
1157
+ "WKg4g3": 1155,
1158
+ "WKg4g5": 1156,
1159
+ "WKg4h3": 1157,
1160
+ "WKg4h5": 1158,
1161
+ "WKg5f6": 1159,
1162
+ "WKh1g1": 1160,
1163
+ "WKh1g2": 1161,
1164
+ "WKh1h2": 1162,
1165
+ "WKh2g1": 1163,
1166
+ "WKh2g2": 1164,
1167
+ "WKh2g3": 1165,
1168
+ "WKh2h1": 1166,
1169
+ "WKh2h3": 1167,
1170
+ "WKh3g2": 1168,
1171
+ "WKh3g3": 1169,
1172
+ "WKh3g4": 1170,
1173
+ "WKh3h2": 1171,
1174
+ "WKh3h4": 1172,
1175
+ "WKh4g3": 1173,
1176
+ "WKh4g5": 1174,
1177
+ "WKh4h5": 1175,
1178
+ "WNa3b5": 1176,
1179
+ "WNa3c2": 1177,
1180
+ "WNa3c4": 1178,
1181
+ "WNa4c3": 1179,
1182
+ "WNa4c5": 1180,
1183
+ "WNa4c5(x)": 1181,
1184
+ "WNb1a3": 1182,
1185
+ "WNb1c3": 1183,
1186
+ "WNb1c3(x)": 1184,
1187
+ "WNb1d2": 1185,
1188
+ "WNb1d2(x)": 1186,
1189
+ "WNb3c5": 1187,
1190
+ "WNb3d2": 1188,
1191
+ "WNb3d4": 1189,
1192
+ "WNb5a3": 1190,
1193
+ "WNb5c3": 1191,
1194
+ "WNb5c7": 1192,
1195
+ "WNb5d4": 1193,
1196
+ "WNb5d6": 1194,
1197
+ "WNb5d6(+)": 1195,
1198
+ "WNb5d6(x)": 1196,
1199
+ "WNc2e3": 1197,
1200
+ "WNc3a2": 1198,
1201
+ "WNc3a4": 1199,
1202
+ "WNc3b1": 1200,
1203
+ "WNc3b5": 1201,
1204
+ "WNc3b5(x)": 1202,
1205
+ "WNc3d1": 1203,
1206
+ "WNc3d1(x)": 1204,
1207
+ "WNc3d5": 1205,
1208
+ "WNc3d5(x)": 1206,
1209
+ "WNc3e2": 1207,
1210
+ "WNc3e2(x)": 1208,
1211
+ "WNc3e4": 1209,
1212
+ "WNc3e4(x)": 1210,
1213
+ "WNc4d2": 1211,
1214
+ "WNc4d6": 1212,
1215
+ "WNc4e3": 1213,
1216
+ "WNc4e5": 1214,
1217
+ "WNc4e5(x)": 1215,
1218
+ "WNc5d3": 1216,
1219
+ "WNc7a8(x)": 1217,
1220
+ "WNd1e3": 1218,
1221
+ "WNd2b1": 1219,
1222
+ "WNd2b3": 1220,
1223
+ "WNd2c4": 1221,
1224
+ "WNd2c4(x)": 1222,
1225
+ "WNd2e4": 1223,
1226
+ "WNd2e4(x)": 1224,
1227
+ "WNd2f1": 1225,
1228
+ "WNd2f3": 1226,
1229
+ "WNd2f3(x)": 1227,
1230
+ "WNd3e5": 1228,
1231
+ "WNd3f4": 1229,
1232
+ "WNd4b3": 1230,
1233
+ "WNd4b5": 1231,
1234
+ "WNd4c6": 1232,
1235
+ "WNd4c6(x)": 1233,
1236
+ "WNd4e2": 1234,
1237
+ "WNd4e6": 1235,
1238
+ "WNd4e6(x)": 1236,
1239
+ "WNd4f3": 1237,
1240
+ "WNd4f5": 1238,
1241
+ "WNd4f5(x)": 1239,
1242
+ "WNd5c3": 1240,
1243
+ "WNd5c7(x)": 1241,
1244
+ "WNd5e3": 1242,
1245
+ "WNd5e7(+)": 1243,
1246
+ "WNd5e7(x)": 1244,
1247
+ "WNd5e7(x+)": 1245,
1248
+ "WNd5f4": 1246,
1249
+ "WNd5f6(+)": 1247,
1250
+ "WNd5f6(x+)": 1248,
1251
+ "WNd6b7(x)": 1249,
1252
+ "WNe1f3": 1250,
1253
+ "WNe2c3": 1251,
1254
+ "WNe2c3(x)": 1252,
1255
+ "WNe2d4": 1253,
1256
+ "WNe2d4(x)": 1254,
1257
+ "WNe2f4": 1255,
1258
+ "WNe2f4(x)": 1256,
1259
+ "WNe2g3": 1257,
1260
+ "WNe3c4": 1258,
1261
+ "WNe3d5": 1259,
1262
+ "WNe3f5": 1260,
1263
+ "WNe3g4": 1261,
1264
+ "WNe4c3": 1262,
1265
+ "WNe4c5": 1263,
1266
+ "WNe4c5(x)": 1264,
1267
+ "WNe4d2": 1265,
1268
+ "WNe4d6": 1266,
1269
+ "WNe4d6(+)": 1267,
1270
+ "WNe4d6(x)": 1268,
1271
+ "WNe4f6(+)": 1269,
1272
+ "WNe4f6(x+)": 1270,
1273
+ "WNe4g3": 1271,
1274
+ "WNe4g5": 1272,
1275
+ "WNe5c4": 1273,
1276
+ "WNe5c4(x)": 1274,
1277
+ "WNe5c6": 1275,
1278
+ "WNe5c6(x)": 1276,
1279
+ "WNe5d3": 1277,
1280
+ "WNe5d7": 1278,
1281
+ "WNe5d7(x)": 1279,
1282
+ "WNe5f3": 1280,
1283
+ "WNe5f7(x)": 1281,
1284
+ "WNe5g4": 1282,
1285
+ "WNe5g4(x)": 1283,
1286
+ "WNe5g6": 1284,
1287
+ "WNe5g6(x)": 1285,
1288
+ "WNe6f8(x)": 1286,
1289
+ "WNf1e3": 1287,
1290
+ "WNf1g3": 1288,
1291
+ "WNf3d2": 1289,
1292
+ "WNf3d2(x)": 1290,
1293
+ "WNf3d4": 1291,
1294
+ "WNf3d4(x)": 1292,
1295
+ "WNf3e1": 1293,
1296
+ "WNf3e5": 1294,
1297
+ "WNf3e5(+)": 1295,
1298
+ "WNf3e5(x)": 1296,
1299
+ "WNf3g1": 1297,
1300
+ "WNf3g5": 1298,
1301
+ "WNf3g5(+)": 1299,
1302
+ "WNf3g5(x)": 1300,
1303
+ "WNf3h2": 1301,
1304
+ "WNf3h4": 1302,
1305
+ "WNf3h4(x)": 1303,
1306
+ "WNf4d3": 1304,
1307
+ "WNf4d5": 1305,
1308
+ "WNf4d5(x)": 1306,
1309
+ "WNf4e6(x)": 1307,
1310
+ "WNf4h5": 1308,
1311
+ "WNf5e3": 1309,
1312
+ "WNf5e7(+)": 1310,
1313
+ "WNf7h8(x)": 1311,
1314
+ "WNg1e2": 1312,
1315
+ "WNg1f3": 1313,
1316
+ "WNg1f3(x)": 1314,
1317
+ "WNg1h3": 1315,
1318
+ "WNg3e2": 1316,
1319
+ "WNg3e4": 1317,
1320
+ "WNg3e4(x)": 1318,
1321
+ "WNg3f5": 1319,
1322
+ "WNg3f5(x)": 1320,
1323
+ "WNg3h5": 1321,
1324
+ "WNg4e3": 1322,
1325
+ "WNg4e5": 1323,
1326
+ "WNg5e4": 1324,
1327
+ "WNg5e4(x)": 1325,
1328
+ "WNg5e6": 1326,
1329
+ "WNg5e6(x)": 1327,
1330
+ "WNg5f3": 1328,
1331
+ "WNg5f7(x)": 1329,
1332
+ "WNg5h3": 1330,
1333
+ "WNh2f3": 1331,
1334
+ "WNh2g4": 1332,
1335
+ "WNh3f2": 1333,
1336
+ "WNh3f4": 1334,
1337
+ "WNh3g5": 1335,
1338
+ "WNh4f3": 1336,
1339
+ "WNh4f5": 1337,
1340
+ "WNh4f5(x)": 1338,
1341
+ "WNh4g6(x)": 1339,
1342
+ "WPa2a3": 1340,
1343
+ "WPa2a4": 1341,
1344
+ "WPa2b3(x)": 1342,
1345
+ "WPa3a4": 1343,
1346
+ "WPa3b4(x)": 1344,
1347
+ "WPa4a5": 1345,
1348
+ "WPa4b5(x)": 1346,
1349
+ "WPa5a6": 1347,
1350
+ "WPa5b6(x)": 1348,
1351
+ "WPa6a7": 1349,
1352
+ "WPa7a8(Q)": 1350,
1353
+ "WPb2a3(x)": 1351,
1354
+ "WPb2b3": 1352,
1355
+ "WPb2b4": 1353,
1356
+ "WPb2c3(x)": 1354,
1357
+ "WPb3a4(x)": 1355,
1358
+ "WPb3b4": 1356,
1359
+ "WPb3c4(x)": 1357,
1360
+ "WPb4a5(x)": 1358,
1361
+ "WPb4b5": 1359,
1362
+ "WPb4c5(x)": 1360,
1363
+ "WPb5a6(x)": 1361,
1364
+ "WPb5b6": 1362,
1365
+ "WPb5c6(x)": 1363,
1366
+ "WPb6b7": 1364,
1367
+ "WPb7b8(Q)": 1365,
1368
+ "WPc2b3(x)": 1366,
1369
+ "WPc2c3": 1367,
1370
+ "WPc2c4": 1368,
1371
+ "WPc2d3(x)": 1369,
1372
+ "WPc3b4(x)": 1370,
1373
+ "WPc3c4": 1371,
1374
+ "WPc3d4(x)": 1372,
1375
+ "WPc4b5(x)": 1373,
1376
+ "WPc4c5": 1374,
1377
+ "WPc4d5(x)": 1375,
1378
+ "WPc5b6(x)": 1376,
1379
+ "WPc5c6": 1377,
1380
+ "WPc5d6(x)": 1378,
1381
+ "WPc6c7": 1379,
1382
+ "WPc7c8(Q)": 1380,
1383
+ "WPd2c3(x)": 1381,
1384
+ "WPd2d3": 1382,
1385
+ "WPd2d4": 1383,
1386
+ "WPd3c4(x)": 1384,
1387
+ "WPd3d4": 1385,
1388
+ "WPd3e4(x)": 1386,
1389
+ "WPd4c5(x)": 1387,
1390
+ "WPd4d5": 1388,
1391
+ "WPd4e5(x)": 1389,
1392
+ "WPd5c6(x)": 1390,
1393
+ "WPd5d6": 1391,
1394
+ "WPd5e6(x)": 1392,
1395
+ "WPd6d7": 1393,
1396
+ "WPd7d8(Q)": 1394,
1397
+ "WPe2e3": 1395,
1398
+ "WPe2e4": 1396,
1399
+ "WPe3d4(x)": 1397,
1400
+ "WPe3e4": 1398,
1401
+ "WPe3f4(x)": 1399,
1402
+ "WPe4d5(x)": 1400,
1403
+ "WPe4e5": 1401,
1404
+ "WPe4f5(x)": 1402,
1405
+ "WPe5d6(x)": 1403,
1406
+ "WPe5e6": 1404,
1407
+ "WPe5f6(x)": 1405,
1408
+ "WPe5f6(xE)": 1406,
1409
+ "WPe6e7": 1407,
1410
+ "WPe6f7(x+)": 1408,
1411
+ "WPf2e3(x)": 1409,
1412
+ "WPf2f3": 1410,
1413
+ "WPf2f4": 1411,
1414
+ "WPf2g3(x)": 1412,
1415
+ "WPf3e4(x)": 1413,
1416
+ "WPf3f4": 1414,
1417
+ "WPf3g4(x)": 1415,
1418
+ "WPf4e5(x)": 1416,
1419
+ "WPf4f5": 1417,
1420
+ "WPf4g5(x)": 1418,
1421
+ "WPf5e6(x)": 1419,
1422
+ "WPf5f6": 1420,
1423
+ "WPf5g6(x)": 1421,
1424
+ "WPf6f7": 1422,
1425
+ "WPg2f3(x)": 1423,
1426
+ "WPg2g3": 1424,
1427
+ "WPg2g4": 1425,
1428
+ "WPg2h3(x)": 1426,
1429
+ "WPg3f4(x)": 1427,
1430
+ "WPg3g4": 1428,
1431
+ "WPg3h4(x)": 1429,
1432
+ "WPg4f5(x)": 1430,
1433
+ "WPg4g5": 1431,
1434
+ "WPg4h5(x)": 1432,
1435
+ "WPg5f6(x)": 1433,
1436
+ "WPg5g6": 1434,
1437
+ "WPg5h6(x)": 1435,
1438
+ "WPg6g7": 1436,
1439
+ "WPg7g8(Q)": 1437,
1440
+ "WPh2g3(x)": 1438,
1441
+ "WPh2h3": 1439,
1442
+ "WPh2h4": 1440,
1443
+ "WPh3g4(x)": 1441,
1444
+ "WPh3h4": 1442,
1445
+ "WPh4g5(x)": 1443,
1446
+ "WPh4h5": 1444,
1447
+ "WPh5g6(x)": 1445,
1448
+ "WPh5h6": 1446,
1449
+ "WPh6h7": 1447,
1450
+ "WPh7h8(Q)": 1448,
1451
+ "WQa4b3": 1449,
1452
+ "WQa4c2": 1450,
1453
+ "WQb3b7(x)": 1451,
1454
+ "WQb3c2": 1452,
1455
+ "WQb3d1": 1453,
1456
+ "WQc2b3": 1454,
1457
+ "WQc2c3": 1455,
1458
+ "WQc2d2": 1456,
1459
+ "WQc2d3": 1457,
1460
+ "WQc2e2": 1458,
1461
+ "WQc2e4(x)": 1459,
1462
+ "WQd1a1(x)": 1460,
1463
+ "WQd1a4": 1461,
1464
+ "WQd1a4(+)": 1462,
1465
+ "WQd1b3": 1463,
1466
+ "WQd1c1": 1464,
1467
+ "WQd1c2": 1465,
1468
+ "WQd1d2": 1466,
1469
+ "WQd1d2(x)": 1467,
1470
+ "WQd1d3": 1468,
1471
+ "WQd1d3(x)": 1469,
1472
+ "WQd1d4": 1470,
1473
+ "WQd1d4(x)": 1471,
1474
+ "WQd1d5": 1472,
1475
+ "WQd1d5(x)": 1473,
1476
+ "WQd1d6(x)": 1474,
1477
+ "WQd1d8(x)": 1475,
1478
+ "WQd1d8(x+)": 1476,
1479
+ "WQd1e1": 1477,
1480
+ "WQd1e2": 1478,
1481
+ "WQd1e2(+)": 1479,
1482
+ "WQd1e2(x)": 1480,
1483
+ "WQd1f3": 1481,
1484
+ "WQd1f3(x)": 1482,
1485
+ "WQd1g4": 1483,
1486
+ "WQd1g4(x)": 1484,
1487
+ "WQd1h5": 1485,
1488
+ "WQd1h5(+)": 1486,
1489
+ "WQd1h5(x)": 1487,
1490
+ "WQd2c2": 1488,
1491
+ "WQd2c3": 1489,
1492
+ "WQd2d3": 1490,
1493
+ "WQd2e2": 1491,
1494
+ "WQd2e3": 1492,
1495
+ "WQd2e3(x)": 1493,
1496
+ "WQd2f2": 1494,
1497
+ "WQd2f4": 1495,
1498
+ "WQd2f4(x)": 1496,
1499
+ "WQd2g5": 1497,
1500
+ "WQd2h6(x)": 1498,
1501
+ "WQd3c2": 1499,
1502
+ "WQd3d2": 1500,
1503
+ "WQd3e2": 1501,
1504
+ "WQd3e3": 1502,
1505
+ "WQd3e4(x)": 1503,
1506
+ "WQd3f3": 1504,
1507
+ "WQd3g3": 1505,
1508
+ "WQd4d1": 1506,
1509
+ "WQd4d3": 1507,
1510
+ "WQd4e3": 1508,
1511
+ "WQe2c2": 1509,
1512
+ "WQe2c4": 1510,
1513
+ "WQe2c4(x)": 1511,
1514
+ "WQe2d1": 1512,
1515
+ "WQe2d2": 1513,
1516
+ "WQe2d3": 1514,
1517
+ "WQe2e3": 1515,
1518
+ "WQe2e3(x)": 1516,
1519
+ "WQe2e4": 1517,
1520
+ "WQe2e4(x)": 1518,
1521
+ "WQe2f2": 1519,
1522
+ "WQe2f3": 1520,
1523
+ "WQe2f3(x)": 1521,
1524
+ "WQe2g4": 1522,
1525
+ "WQe2h5": 1523,
1526
+ "WQe3e2": 1524,
1527
+ "WQe3f3": 1525,
1528
+ "WQe3g3": 1526,
1529
+ "WQf3b7(x)": 1527,
1530
+ "WQf3d1": 1528,
1531
+ "WQf3d3": 1529,
1532
+ "WQf3d5(x)": 1530,
1533
+ "WQf3e2": 1531,
1534
+ "WQf3e3": 1532,
1535
+ "WQf3e4(x)": 1533,
1536
+ "WQf3f4": 1534,
1537
+ "WQf3f6(x)": 1535,
1538
+ "WQf3g3": 1536,
1539
+ "WQf3g4": 1537,
1540
+ "WQf3h3": 1538,
1541
+ "WQf3h5": 1539,
1542
+ "WQg3f3": 1540,
1543
+ "WQg3g4": 1541,
1544
+ "WQg3h4": 1542,
1545
+ "WQg4f3": 1543,
1546
+ "WQg4g3": 1544,
1547
+ "WQh5f3": 1545,
1548
+ "WRa1a2": 1546,
1549
+ "WRa1a3": 1547,
1550
+ "WRa1a6(x)": 1548,
1551
+ "WRa1a7(x)": 1549,
1552
+ "WRa1a8(x)": 1550,
1553
+ "WRa1b1": 1551,
1554
+ "WRa1c1": 1552,
1555
+ "WRa1c1(x)": 1553,
1556
+ "WRa1d1": 1554,
1557
+ "WRa1d1(x)": 1555,
1558
+ "WRa1e1": 1556,
1559
+ "WRa1e1(x)": 1557,
1560
+ "WRa1f1": 1558,
1561
+ "WRa1f1(x)": 1559,
1562
+ "WRa1g1": 1560,
1563
+ "WRa1h1": 1561,
1564
+ "WRb1a1": 1562,
1565
+ "WRb1b2": 1563,
1566
+ "WRb1b2(x)": 1564,
1567
+ "WRb1b3": 1565,
1568
+ "WRb1b7": 1566,
1569
+ "WRb1b7(x)": 1567,
1570
+ "WRb1c1": 1568,
1571
+ "WRb1d1": 1569,
1572
+ "WRb1e1": 1570,
1573
+ "WRb1f1": 1571,
1574
+ "WRb7a7(x)": 1572,
1575
+ "WRc1a1": 1573,
1576
+ "WRc1b1": 1574,
1577
+ "WRc1c2": 1575,
1578
+ "WRc1c2(x)": 1576,
1579
+ "WRc1c3": 1577,
1580
+ "WRc1c3(x)": 1578,
1581
+ "WRc1c4(x)": 1579,
1582
+ "WRc1c5": 1580,
1583
+ "WRc1c5(x)": 1581,
1584
+ "WRc1c6(x)": 1582,
1585
+ "WRc1c7": 1583,
1586
+ "WRc1c7(x)": 1584,
1587
+ "WRc1c8(x)": 1585,
1588
+ "WRc1d1": 1586,
1589
+ "WRc1e1": 1587,
1590
+ "WRc1f1": 1588,
1591
+ "WRc7b7(x)": 1589,
1592
+ "WRd1a1": 1590,
1593
+ "WRd1b1": 1591,
1594
+ "WRd1c1": 1592,
1595
+ "WRd1d2": 1593,
1596
+ "WRd1d2(x)": 1594,
1597
+ "WRd1d3": 1595,
1598
+ "WRd1d3(x)": 1596,
1599
+ "WRd1d4": 1597,
1600
+ "WRd1d4(x)": 1598,
1601
+ "WRd1d5": 1599,
1602
+ "WRd1d5(x)": 1600,
1603
+ "WRd1d6": 1601,
1604
+ "WRd1d6(x)": 1602,
1605
+ "WRd1d7": 1603,
1606
+ "WRd1d7(+)": 1604,
1607
+ "WRd1d7(x)": 1605,
1608
+ "WRd1d8(+)": 1606,
1609
+ "WRd1d8(x)": 1607,
1610
+ "WRd1d8(x+)": 1608,
1611
+ "WRd1e1": 1609,
1612
+ "WRd1e1(x)": 1610,
1613
+ "WRd1f1": 1611,
1614
+ "WRd1g1": 1612,
1615
+ "WRd1h1": 1613,
1616
+ "WRd2e2": 1614,
1617
+ "WRd7b7(x)": 1615,
1618
+ "WRe1a1": 1616,
1619
+ "WRe1b1": 1617,
1620
+ "WRe1c1": 1618,
1621
+ "WRe1d1": 1619,
1622
+ "WRe1d1(x)": 1620,
1623
+ "WRe1e2": 1621,
1624
+ "WRe1e2(x)": 1622,
1625
+ "WRe1e3": 1623,
1626
+ "WRe1e3(x)": 1624,
1627
+ "WRe1e4": 1625,
1628
+ "WRe1e4(x)": 1626,
1629
+ "WRe1e5": 1627,
1630
+ "WRe1e5(x)": 1628,
1631
+ "WRe1e6": 1629,
1632
+ "WRe1e6(x)": 1630,
1633
+ "WRe1e7": 1631,
1634
+ "WRe1e7(x)": 1632,
1635
+ "WRe1e8(+)": 1633,
1636
+ "WRe1e8(x)": 1634,
1637
+ "WRe1e8(x+)": 1635,
1638
+ "WRe1f1": 1636,
1639
+ "WRe1g1": 1637,
1640
+ "WRe1h1": 1638,
1641
+ "WRe2d2": 1639,
1642
+ "WRe2e3": 1640,
1643
+ "WRe3f3": 1641,
1644
+ "WRe3g3": 1642,
1645
+ "WRf1a1": 1643,
1646
+ "WRf1a1(x)": 1644,
1647
+ "WRf1b1": 1645,
1648
+ "WRf1c1": 1646,
1649
+ "WRf1c1(x)": 1647,
1650
+ "WRf1d1": 1648,
1651
+ "WRf1d1(x)": 1649,
1652
+ "WRf1e1": 1650,
1653
+ "WRf1e1(+)": 1651,
1654
+ "WRf1e1(x)": 1652,
1655
+ "WRf1f2": 1653,
1656
+ "WRf1f2(x)": 1654,
1657
+ "WRf1f3": 1655,
1658
+ "WRf1f3(x)": 1656,
1659
+ "WRf1f4": 1657,
1660
+ "WRf1f4(x)": 1658,
1661
+ "WRf1f5(x)": 1659,
1662
+ "WRf1f6(x)": 1660,
1663
+ "WRf1f7": 1661,
1664
+ "WRf1f7(x)": 1662,
1665
+ "WRf1f8(x+)": 1663,
1666
+ "WRf1g1": 1664,
1667
+ "WRf1h1": 1665,
1668
+ "WRf2e2": 1666,
1669
+ "WRf3g3": 1667,
1670
+ "WRf3h3": 1668,
1671
+ "WRg1e1": 1669,
1672
+ "WRg1f1": 1670,
1673
+ "WRg1g2": 1671,
1674
+ "WRg1g3": 1672,
1675
+ "WRg1h1": 1673,
1676
+ "WRh1c1": 1674,
1677
+ "WRh1d1": 1675,
1678
+ "WRh1e1": 1676,
1679
+ "WRh1f1": 1677,
1680
+ "WRh1g1": 1678,
1681
+ "WRh1h2": 1679,
1682
+ "WRh1h3": 1680,
1683
+ "WRh1h5(x)": 1681
1684
+ }
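The mapping above is the tail of a move-level vocabulary: each extended-UCI move string (side + piece + from-square + to-square, with `(x)`, `(+)`, `(Q)` annotations) gets a single token ID. An illustrative sketch of encoding and decoding with such a dict (the four entries are copied from the diff; a real tokenizer would load the complete vocab file):

```python
# Illustrative sketch, not the project's tokenizer: a move-level vocab
# maps one extended-UCI move string to one token ID.
vocab = {
    "WPe2e4": 1396,
    "BPe7e5": 563,
    "WNg1f3": 1313,
    "WQd1h5(+)": 1486,
}
id_to_move = {i: m for m, i in vocab.items()}

def encode(moves):
    # One token per move; raises KeyError for out-of-vocabulary moves.
    return [vocab[m] for m in moves]

def decode(ids):
    return [id_to_move[i] for i in ids]

game = ["WPe2e4", "BPe7e5", "WNg1f3"]
assert decode(encode(game)) == game
```

A move-level vocabulary keeps sequences short (one token per ply) at the cost of a larger, position-dependent vocabulary.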
pyproject.toml CHANGED
@@ -23,7 +23,7 @@ classifiers = [
23
  ]
24
  dependencies = [
25
  "torch>=2.0.0",
26
- "transformers>=4.40.0",
27
  "accelerate>=0.26.0",
28
  "datasets>=2.14.0",
29
  "python-chess>=1.999",
@@ -39,12 +39,8 @@ dev = [
39
  "black>=23.0.0",
40
  "ruff>=0.1.0",
41
  ]
42
- eval = [
43
- "stockfish>=3.28.0",
44
- ]
45
 
46
  [project.scripts]
47
- chess-train = "src.train:main"
48
  chess-eval = "src.evaluate:main"
49
 
50
  [tool.setuptools.packages.find]
 
23
  ]
24
  dependencies = [
25
  "torch>=2.0.0",
26
+ "transformers>=4.40.0,<5.0.0",
27
  "accelerate>=0.26.0",
28
  "datasets>=2.14.0",
29
  "python-chess>=1.999",
 
39
  "black>=23.0.0",
40
  "ruff>=0.1.0",
41
  ]
 
 
 
42
 
43
  [project.scripts]
 
44
  chess-eval = "src.evaluate:main"
45
 
46
  [tool.setuptools.packages.find]
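The dependency change above pins `transformers` below 5.0.0. As a rough illustration of what a `>=4.40.0,<5.0.0` specifier accepts (naive version-tuple comparison; real resolvers implement full PEP 440 semantics, e.g. via the `packaging` library):

```python
# Toy sketch of the ">=4.40.0,<5.0.0" bound added above.
# Assumes plain dotted numeric versions; no pre-release handling.
def parse(v):
    return tuple(int(p) for p in v.split("."))

def satisfies(version, lower="4.40.0", upper="5.0.0"):
    return parse(lower) <= parse(version) < parse(upper)

assert satisfies("4.40.0")      # lower bound is inclusive
assert not satisfies("5.0.0")   # upper bound is exclusive
assert not satisfies("4.39.3")
```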
src/__init__.py CHANGED
@@ -1,22 +1,20 @@
1
- """Chess Challenge source module."""
2
 
3
- from .model import ChessConfig, ChessForCausalLM
4
- from .tokenizer import ChessTokenizer
5
-
6
- # Lazy import for evaluate to avoid RuntimeWarning when running as module
7
  def __getattr__(name):
8
  if name == "ChessEvaluator":
9
  from .evaluate import ChessEvaluator
10
  return ChessEvaluator
11
- if name == "load_model_from_hub":
12
- from .evaluate import load_model_from_hub
13
- return load_model_from_hub
 
 
 
14
  raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
15
 
16
  __all__ = [
17
- "ChessConfig",
18
- "ChessForCausalLM",
19
- "ChessTokenizer",
20
  "ChessEvaluator",
21
- "load_model_from_hub",
 
22
  ]
 
1
+ """Chess Challenge evaluation module."""
2
 
3
+ # Lazy imports to avoid circular dependencies
 
 
 
4
  def __getattr__(name):
5
  if name == "ChessEvaluator":
6
  from .evaluate import ChessEvaluator
7
  return ChessEvaluator
8
+ if name == "load_model_and_tokenizer":
9
+ from .evaluate import load_model_and_tokenizer
10
+ return load_model_and_tokenizer
11
+ if name == "count_parameters":
12
+ from .evaluate import count_parameters
13
+ return count_parameters
14
  raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
15
 
16
  __all__ = [
 
 
 
17
  "ChessEvaluator",
18
+ "load_model_and_tokenizer",
19
+ "count_parameters",
20
  ]
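The `__getattr__` defined in `src/__init__.py` above is the PEP 562 lazy-import pattern: heavy submodules are only imported when an attribute is first accessed, which avoids circular imports and import-time side effects. A minimal self-contained sketch of the mechanism, built as an in-memory module (`lazydemo` and `answer` are made-up names):

```python
# Sketch of PEP 562 module-level __getattr__: attribute lookups that
# miss the module's __dict__ fall through to __getattr__.
import sys
import types

mod = types.ModuleType("lazydemo")

def _module_getattr(name):
    if name == "answer":
        # The real package does `from .evaluate import ChessEvaluator` here.
        return 42
    raise AttributeError(f"module 'lazydemo' has no attribute {name!r}")

mod.__getattr__ = _module_getattr
sys.modules["lazydemo"] = mod

import lazydemo
assert lazydemo.answer == 42  # triggers _module_getattr lazily
```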
src/__main__.py ADDED
@@ -0,0 +1,44 @@
1
+ """
2
+ CLI entry point for running evaluation as a module.
3
+
4
+ Usage:
5
+ python -m src --model ./my_model/final
6
+ python -m src --model username/model-name
7
+ """
8
+
9
+ import argparse
10
+ import sys
11
+
12
+ from .evaluate import evaluate_model
13
+
14
+
15
+ def main():
16
+ parser = argparse.ArgumentParser(
17
+ description="Evaluate a chess model",
18
+ prog="python -m src",
19
+ )
20
+ parser.add_argument(
21
+ "--model",
22
+ "-m",
23
+ type=str,
24
+ required=True,
25
+ help="Path to model directory or HuggingFace model ID",
26
+ )
27
+ parser.add_argument(
28
+ "--quiet",
29
+ "-q",
30
+ action="store_true",
31
+ help="Suppress progress output",
32
+ )
33
+
34
+ args = parser.parse_args()
35
+
36
+ result = evaluate_model(args.model, verbose=not args.quiet)
37
+ print()
38
+ print(result.summary())
39
+
40
+ return 0
41
+
42
+
43
+ if __name__ == "__main__":
44
+ sys.exit(main())
src/evaluate.py CHANGED
@@ -1,838 +1,1002 @@
1
  """
2
  Evaluation script for the Chess Challenge.
3
 
4
- This script evaluates a trained chess model by playing games against
5
- Stockfish and computing ELO ratings.
 
 
 
 
 
6
  """
7
 
8
  from __future__ import annotations
9
 
10
  import argparse
 
 
11
  import random
12
  import re
13
- from dataclasses import dataclass
 
 
14
  from typing import List, Optional, Tuple
15
 
16
  import torch
17
 
18
 
19
  @dataclass
20
- class GameResult:
21
- """Result of a single game."""
22
- moves: List[str]
23
- result: str # "1-0", "0-1", or "1/2-1/2"
24
- model_color: str # "white" or "black"
25
- termination: str # "checkmate", "stalemate", "illegal_move", "max_moves", etc.
26
- illegal_move_count: int
27
-
28
-
29
- class ChessEvaluator:
30
- """
31
- Evaluator for chess models.
 
 
 
 
32
 
33
- This class handles playing games between a trained model and Stockfish,
34
- tracking results, and computing ELO ratings.
 
35
 
36
- Supports any tokenization format as long as the model generates valid
37
- chess squares (e.g., e2, e4). The evaluator extracts UCI moves by finding
38
- square patterns in the generated output.
 
39
  """
 
-     # Regex pattern to match chess squares
-     SQUARE_PATTERN = r'[a-h][1-8]'
-
-     def __init__(
-         self,
-         model,
-         tokenizer,
-         stockfish_path: Optional[str] = None,
-         stockfish_level: int = 1,
-         max_retries: int = 3,
-         device: str = "cuda" if torch.cuda.is_available() else "cpu",
-     ):
-         """
-         Initialize the evaluator.
-
-         Args:
-             model: The trained chess model.
-             tokenizer: The chess tokenizer.
-             stockfish_path: Path to Stockfish executable.
-             stockfish_level: Stockfish skill level (0-20).
-             max_retries: Maximum retries for illegal moves.
-             device: Device to run the model on.
-         """
-         self.model = model.to(device)
-         self.tokenizer = tokenizer
-         self.max_retries = max_retries
-         self.device = device
-
-         # Initialize Stockfish
-         try:
-             import chess
-             import chess.engine
-
-             self.chess = chess
-
-             if stockfish_path is None:
-                 # Try common paths
-                 import shutil
-                 stockfish_path = shutil.which("stockfish")
-
-             if stockfish_path:
-                 self.engine = chess.engine.SimpleEngine.popen_uci(stockfish_path)
-                 self.engine.configure({"Skill Level": stockfish_level})
-             else:
-                 print("WARNING: Stockfish not found. Install it for full evaluation.")
-                 self.engine = None
-
-         except ImportError:
-             raise ImportError(
-                 "python-chess is required for evaluation. "
-                 "Install it with: pip install python-chess"
-             )
-
-     def __del__(self):
-         """Clean up Stockfish engine."""
-         if hasattr(self, 'engine') and self.engine:
-             self.engine.quit()
-
-     def _detect_tokenizer_format(self) -> str:
-         """
-         Detect the tokenizer's expected move format by testing tokenization.
-
-         Tests various formats with a sample move and picks the one that
-         produces the fewest unknown tokens. This makes evaluation work
-         with any tokenizer format.
-
-         Supported formats:
-         - 'decomposed': "WP e2_f e4_t" (piece, from_suffix, to_suffix)
-         - 'standard': "WPe2e4" (combined with optional annotations)
-         - 'uci': "e2e4" (pure UCI notation)
-         - 'uci_spaced': "e2 e4" (UCI with space separator)
-
-         Returns:
-             The format string that best matches the tokenizer's vocabulary.
-         """
-         if hasattr(self, '_cached_format'):
-             return self._cached_format
-
-         # Sample move representations to test
-         test_formats = {
-             'decomposed': "WP e2_f e4_t",
-             'standard': "WPe2e4",
-             'uci': "e2e4",
-             'uci_spaced': "e2 e4",
-         }
-
-         unk_token_id = getattr(self.tokenizer, 'unk_token_id', None)
-         best_format = 'standard'
-         min_unk_count = float('inf')
-
-         for fmt, sample in test_formats.items():
-             try:
-                 tokens = self.tokenizer.encode(sample, add_special_tokens=False)
-                 # Count unknown tokens
-                 unk_count = tokens.count(unk_token_id) if unk_token_id is not None else 0
-                 # Also penalize if the entire thing became one UNK
-                 if len(tokens) == 1 and unk_count == 1:
-                     unk_count = 100  # Heavy penalty
-
-                 if unk_count < min_unk_count:
-                     min_unk_count = unk_count
-                     best_format = fmt
-             except Exception:
-                 continue
-
-         self._cached_format = best_format
-         return best_format
-
-     def _format_move(self, color: str, piece: str, from_sq: str, to_sq: str,
-                      promotion: str = None) -> str:
-         """
-         Format a single move according to the detected tokenizer format.
-
-         Args:
-             color: 'W' or 'B'
-             piece: Piece letter (P, N, B, R, Q, K)
-             from_sq: Source square (e.g., 'e2')
-             to_sq: Destination square (e.g., 'e4')
-             promotion: Promotion piece letter or None
-
-         Returns:
-             Formatted move string.
-         """
-         fmt = self._detect_tokenizer_format()
-
-         if fmt == 'decomposed':
-             move_str = f"{color}{piece} {from_sq}_f {to_sq}_t"
-         elif fmt == 'uci':
-             move_str = f"{from_sq}{to_sq}"
-             if promotion:
-                 move_str += promotion.lower()
-         elif fmt == 'uci_spaced':
-             move_str = f"{from_sq} {to_sq}"
-             if promotion:
-                 move_str += f" {promotion.lower()}"
-         else:  # standard
-             move_str = f"{color}{piece}{from_sq}{to_sq}"
-             if promotion:
-                 move_str += f"={promotion}"
-
-         return move_str
-
-     def _convert_board_to_moves(self, board) -> str:
-         """
-         Convert board move history to model input format.
-
-         Automatically detects the tokenizer's expected format and outputs
-         moves accordingly. Supports any tokenization strategy.
-         """
-         moves = []
-         temp_board = self.chess.Board()
-         fmt = self._detect_tokenizer_format()
-
-         for move in board.move_stack:
-             # Get piece and color
-             color = "W" if temp_board.turn == self.chess.WHITE else "B"
-             piece = temp_board.piece_at(move.from_square)
-             piece_letter = piece.symbol().upper() if piece else "P"
-
-             # Get squares
-             from_sq = self.chess.square_name(move.from_square)
-             to_sq = self.chess.square_name(move.to_square)
-
-             # Get promotion piece if any
-             promo = None
-             if move.promotion:
-                 promo = self.chess.piece_symbol(move.promotion).upper()
-
-             # Format based on detected tokenizer format
-             move_str = self._format_move(color, piece_letter, from_sq, to_sq, promo)
-
-             # For standard format, add annotations (capture, check, castling)
-             if fmt == 'standard':
-                 # Add capture suffix
-                 if temp_board.is_capture(move):
-                     move_str += "(x)"
-
-                 # Push move to check for check/checkmate
-                 temp_board.push(move)
-
-                 if temp_board.is_checkmate():
-                     if "(x)" in move_str:
-                         move_str = move_str.replace("(x)", "(x+*)")
-                     else:
-                         move_str += "(+*)"
-                 elif temp_board.is_check():
-                     if "(x)" in move_str:
-                         move_str = move_str.replace("(x)", "(x+)")
-                     else:
-                         move_str += "(+)"
-
-                 # Handle castling notation
-                 if piece_letter == "K":
-                     if abs(ord(from_sq[0]) - ord(to_sq[0])) > 1:
-                         if to_sq[0] == 'g':  # Kingside
-                             move_str = move_str.split("(")[0] + "(o)"
-                         else:  # Queenside
-                             move_str = move_str.split("(")[0] + "(O)"
              else:
-                 # For non-standard formats, just push the move
-                 temp_board.push(move)
-
-             moves.append(move_str)
-
-         return " ".join(moves)
-
-     def _is_separator_token(self, token_str: str) -> bool:
-         """
-         Check if a token represents a separator (whitespace, EOS, etc.).
-
-         This allows the evaluator to work with different tokenization strategies:
-         - Move-level tokenizers: each move is one token, no separators generated
-         - Character-level tokenizers: space character marks end of move
-         - BPE/subword tokenizers: may generate partial moves
-
-         Args:
-             token_str: The decoded token string.
-
-         Returns:
-             True if this token indicates end of a move.
-         """
-         # Check for EOS token
          if hasattr(self.tokenizer, 'eos_token') and token_str == self.tokenizer.eos_token:
              return True
-
-         # Check for whitespace (space, newline, etc.)
-         if token_str.strip() == "" and len(token_str) > 0:
-             return True
-
-         # Check if the token ends with whitespace (some tokenizers include trailing space)
-         if token_str != token_str.rstrip():
-             return True
-
-         return False
-
      def _extract_uci_move(self, text: str) -> Optional[str]:
          """
-         Extract a UCI move from generated text using pattern matching.
-
-         This generic method works with any tokenization format by finding
-         chess square patterns ([a-h][1-8]) in the output.
-
-         Supported formats include:
-         - Standard: "WPe2e4" -> "e2e4"
-         - Decomposed: "WP e2_f e4_t" -> "e2e4"
-         - Pure UCI: "e2e4" -> "e2e4"
-         - With separators: "e2-e4", "e2 e4" -> "e2e4"
-         - With promotion: "e7e8=Q", "e7e8q" -> "e7e8q"
-
-         Args:
-             text: The generated text containing a move.
-
-         Returns:
-             UCI move string (e.g., "e2e4", "e7e8q") or None if not found.
          """
-         if not text:
-             return None
-
-         # Find all squares in the text
-         squares = re.findall(self.SQUARE_PATTERN, text)

          if len(squares) < 2:
              return None

-         # Take the first two squares as from and to
          from_sq, to_sq = squares[0], squares[1]
          uci_move = from_sq + to_sq

-         # Check for promotion (letter after to_square)
-         # Look for patterns like "=Q", "=q", or just "q" after the to_square
-         to_sq_idx = text.find(to_sq)
-         if to_sq_idx != -1:
-             remaining = text[to_sq_idx + 2:to_sq_idx + 5]  # Check next few chars
          promo_match = re.search(r'[=]?([qrbnQRBN])', remaining)
          if promo_match:
              uci_move += promo_match.group(1).lower()

          return uci_move
-
-     def _has_complete_move(self, text: str) -> bool:
-         """
-         Check if the generated text contains a complete move.
-
-         A complete move has at least two valid chess squares.
-
-         Args:
-             text: The generated text so far.
-
-         Returns:
-             True if text contains at least two squares.
-         """
-         squares = re.findall(self.SQUARE_PATTERN, text)
-         return len(squares) >= 2
-
-     def _generate_move_tokens(
-         self,
          input_ids: torch.Tensor,
-         temperature: float = 0.7,
-         top_k: int = 10,
-         max_tokens: int = 20,
      ) -> str:
          """
-         Generate tokens until a complete move is detected or separator is hit.
-
-         This method is tokenizer-agnostic and stops when:
-         - A separator token (whitespace/EOS) is encountered
-         - Two chess squares have been generated (complete move)
-         - max_tokens limit is reached

          Args:
-             input_ids: The input token IDs.
-             temperature: Sampling temperature.
-             top_k: Top-k filtering parameter.
-             max_tokens: Maximum tokens to generate for a single move.

-         Returns:
-             The generated move string.
          """
          generated_tokens = []
          current_ids = input_ids.clone()
-         accumulated_text = ""

-         for _ in range(max_tokens):
-             with torch.no_grad():
                  outputs = self.model(input_ids=current_ids)
-                 logits = outputs.logits[:, -1, :] / temperature

-             # Apply top-k filtering
-             if top_k > 0:
-                 top_k_vals = torch.topk(logits, min(top_k, logits.size(-1)))
-                 indices_to_remove = logits < top_k_vals[0][..., -1, None]
-                 logits[indices_to_remove] = float("-inf")

-             # Sample
-             probs = torch.softmax(logits, dim=-1)
-             next_token = torch.multinomial(probs, num_samples=1)
-
-             # Decode the token
-             token_str = self.tokenizer.decode(next_token[0])
-
-             # Check if this is a separator token
-             if self._is_separator_token(token_str):
-                 # If we already have a complete move, stop
-                 if self._has_complete_move(accumulated_text):
-                     break
-                 # Otherwise, if it's EOS, we should also stop
-                 if hasattr(self.tokenizer, 'eos_token'):
-                     if token_str == self.tokenizer.eos_token:
-                         break
-                 # For whitespace separators, only stop if we have content
-                 if accumulated_text:
                      break
-
-             generated_tokens.append(next_token[0])
-             current_ids = torch.cat([current_ids, next_token], dim=-1)
-             accumulated_text += token_str
-
-             # Stop if we have a complete move (two squares found)
-             if self._has_complete_move(accumulated_text):
-                 # Check if this might be a promotion - peek for one more token
-                 # if the move is to rank 1 or 8
-                 squares = re.findall(self.SQUARE_PATTERN, accumulated_text)
-                 if len(squares) >= 2:
-                     to_sq = squares[1]
-                     if to_sq[1] in '18':  # Potential promotion
-                         # Allow one more iteration to capture promotion piece
-                         if len(generated_tokens) > 3:  # Already have enough
-                             break
-                 else:
-                     break

-         # Decode all generated tokens together
          if generated_tokens:
-             all_tokens = torch.cat(generated_tokens, dim=0)
-             move_str = self.tokenizer.decode(all_tokens, skip_special_tokens=True)
-             return move_str.strip()

          return ""
-
-     def _get_model_move(
          self,
-         board,
-         temperature: float = 0.7,
-         top_k: int = 10,
-     ) -> Tuple[Optional[str], int]:
          """
-         Get the model's next move prediction.
-
-         This method is tokenizer-agnostic. It generates tokens and extracts
-         UCI moves using pattern matching on chess squares.
-
-         Works with any tokenization format:
-         - Move-level: "WPe2e4" -> e2e4
-         - Decomposed: "WP e2_f e4_t" -> e2e4
-         - Pure UCI: "e2e4" -> e2e4
-         - Character-level: "e" "2" "e" "4" -> e2e4
-         - BPE/subword: "e2" "e4" -> e2e4

          Returns:
-             Tuple of (UCI move string, number of retries used).
          """
-         self.model.eval()
-
-         # Convert board to input format
-         moves_str = self._convert_board_to_moves(board)
-
-         # Add BOS token if no moves yet
-         if not moves_str:
-             input_text = self.tokenizer.bos_token
          else:
-             input_text = self.tokenizer.bos_token + " " + moves_str

-         # Tokenize
          inputs = self.tokenizer(
              input_text,
              return_tensors="pt",
              truncation=True,
-             max_length=self.model.config.n_ctx - 10,
          ).to(self.device)

          # Try to generate a legal move
-         for retry in range(self.max_retries):
-             # Generate tokens until we have a move
-             move_text = self._generate_move_tokens(
-                 inputs["input_ids"],
-                 temperature=temperature,
-                 top_k=top_k,
-             )

-             # Extract UCI move using generic pattern matching
              uci_move = self._extract_uci_move(move_text)

-             if uci_move:
-                 try:
-                     move = self.chess.Move.from_uci(uci_move)
-                     if move in board.legal_moves:
-                         return uci_move, retry
-                 except (ValueError, self.chess.InvalidMoveError):
-                     pass

-         return None, self.max_retries
-     def _get_stockfish_move(self, board, time_limit: float = 0.1) -> str:
-         """Get Stockfish's move."""
-         if self.engine is None:
-             raise RuntimeError("Stockfish engine not initialized")
-
-         result = self.engine.play(board, self.chess.engine.Limit(time=time_limit))
-         return result.move.uci()
-     def play_game(
-         self,
-         model_color: str = "white",
-         max_moves: int = 200,
-         temperature: float = 0.7,
-     ) -> GameResult:
          """
-         Play a single game between the model and Stockfish.

-         Args:
-             model_color: "white" or "black".
-             max_moves: Maximum number of moves before draw.
-             temperature: Sampling temperature for model.

-         Returns:
-             GameResult with the game details.
          """
-         board = self.chess.Board()
-         moves = []
-         illegal_move_count = 0

-         model_is_white = model_color == "white"

-         while not board.is_game_over() and len(moves) < max_moves:
-             is_model_turn = (board.turn == self.chess.WHITE) == model_is_white
-
-             if is_model_turn:
-                 # Model's turn
-                 uci_move, retries = self._get_model_move(board, temperature)
-                 illegal_move_count += retries
-
-                 if uci_move is None:
-                     # Model couldn't find a legal move
-                     return GameResult(
-                         moves=moves,
-                         result="0-1" if model_is_white else "1-0",
-                         model_color=model_color,
-                         termination="illegal_move",
-                         illegal_move_count=illegal_move_count + 1,
-                     )
-
-                 move = self.chess.Move.from_uci(uci_move)
-             else:
-                 # Stockfish's turn
-                 if self.engine:
-                     uci_move = self._get_stockfish_move(board)
-                     move = self.chess.Move.from_uci(uci_move)
-                 else:
-                     # Random move if no engine
-                     move = random.choice(list(board.legal_moves))
-
-             board.push(move)
-             moves.append(move.uci())

-         if board.is_checkmate():
-             if board.turn == self.chess.WHITE:
-                 result = "0-1"  # Black wins
-             else:
-                 result = "1-0"  # White wins
-             termination = "checkmate"
-         elif board.is_stalemate():
-             result = "1/2-1/2"
-             termination = "stalemate"
-         elif board.is_insufficient_material():
-             result = "1/2-1/2"
-             termination = "insufficient_material"
-         elif board.can_claim_draw():
-             result = "1/2-1/2"
-             termination = "draw_claim"
-         elif len(moves) >= max_moves:
-             result = "1/2-1/2"
-             termination = "max_moves"
          else:
-             result = "1/2-1/2"
-             termination = "unknown"
-
-         return GameResult(
-             moves=moves,
-             result=result,
-             model_color=model_color,
-             termination=termination,
-             illegal_move_count=illegal_move_count,
          )
-     def evaluate_legal_moves(
-         self,
-         n_positions: int = 1000,
-         temperature: float = 0.7,
-         verbose: bool = True,
-         seed: int = 42,
-     ) -> dict:
-         """
-         Evaluate the model's ability to generate legal moves.
-
-         This evaluation only checks if the model generates legal moves,
-         without playing full games. Useful as a first-pass evaluation.
-
-         Args:
-             n_positions: Number of positions to test.
-             temperature: Sampling temperature.
-             verbose: Whether to print progress.
-             seed: Random seed for reproducibility.
-
-         Returns:
-             Dictionary with legal move statistics.
          """
-         # Set random seed for reproducibility
-         random.seed(seed)
-         torch.manual_seed(seed)
-
-         results = {
-             "total_positions": 0,
-             "legal_first_try": 0,
-             "legal_with_retry": 0,
-             "illegal_all_retries": 0,
-             "positions": [],
-         }

-         # Generate random positions by playing random moves
-         for i in range(n_positions):
-             board = self.chess.Board()

-             # Play random number of moves (5-40) to get varied positions
-             n_random_moves = random.randint(5, 40)
-             for _ in range(n_random_moves):
-                 if board.is_game_over():
-                     break
-                 move = random.choice(list(board.legal_moves))
-                 board.push(move)

-             if board.is_game_over():
-                 continue  # Skip terminal positions

-             results["total_positions"] += 1

-             # Test model's move generation
-             uci_move, retries = self._get_model_move(board, temperature)

-             position_result = {
-                 "fen": board.fen(),
-                 "move_number": len(board.move_stack),
-                 "legal": uci_move is not None,
-                 "retries": retries,
-             }
-             results["positions"].append(position_result)

-             if uci_move is not None:
-                 if retries == 0:
-                     results["legal_first_try"] += 1
                  else:
-                     results["legal_with_retry"] += 1
-             else:
-                 results["illegal_all_retries"] += 1

-             if verbose and (i + 1) % 100 == 0:
-                 legal_rate = (results["legal_first_try"] + results["legal_with_retry"]) / results["total_positions"]
-                 print(f"  Positions: {i + 1}/{n_positions} | Legal rate: {legal_rate:.1%}")
-
-         # Calculate statistics
-         total = results["total_positions"]
-         if total > 0:
-             results["legal_rate_first_try"] = results["legal_first_try"] / total
-             results["legal_rate_with_retry"] = (results["legal_first_try"] + results["legal_with_retry"]) / total
-             results["illegal_rate"] = results["illegal_all_retries"] / total
-         else:
-             results["legal_rate_first_try"] = 0
-             results["legal_rate_with_retry"] = 0
-             results["illegal_rate"] = 1

-         return results
-     def evaluate(
          self,
-         n_games: int = 100,
-         temperature: float = 0.7,
-         verbose: bool = True,
-     ) -> dict:
-         """
-         Run a full win-rate evaluation of the model against Stockfish.

-         Args:
-             n_games: Number of games to play.
-             temperature: Sampling temperature.
-             verbose: Whether to print progress.

          Returns:
-             Dictionary with evaluation metrics.
          """
-         results = {
-             "wins": 0,
-             "losses": 0,
-             "draws": 0,
-             "illegal_moves": 0,
-             "total_moves": 0,
-             "games": [],
-         }

-         for i in range(n_games):
-             # Alternate colors
-             model_color = "white" if i % 2 == 0 else "black"
-
-             game = self.play_game(
-                 model_color=model_color,
-                 temperature=temperature,
              )

-             results["games"].append(game)
-             results["total_moves"] += len(game.moves)
-             results["illegal_moves"] += game.illegal_move_count
-
-             # Count result
-             if game.result == "1/2-1/2":
-                 results["draws"] += 1
-             elif (game.result == "1-0" and model_color == "white") or \
-                  (game.result == "0-1" and model_color == "black"):
-                 results["wins"] += 1
-             else:
-                 results["losses"] += 1

-             if verbose and (i + 1) % 10 == 0:
-                 print(f"  Games: {i + 1}/{n_games} | "
-                       f"W: {results['wins']} L: {results['losses']} D: {results['draws']}")
-
-         # Calculate statistics
-         total = results["wins"] + results["losses"] + results["draws"]
-         results["win_rate"] = results["wins"] / total if total > 0 else 0
-         results["draw_rate"] = results["draws"] / total if total > 0 else 0
-         results["loss_rate"] = results["losses"] / total if total > 0 else 0
-
-         total_attempts = results["total_moves"] + results["illegal_moves"]
-
-         # Average length counts both legal moves and illegal attempts so early illegal terminations
-         # don't show as near-zero length games.
-         results["avg_game_length"] = total_attempts / total if total > 0 else 0
-
-         # Illegal move rate: illegal attempts over total attempts
-         results["illegal_move_rate"] = results["illegal_moves"] / total_attempts if total_attempts > 0 else 0
-
-         # Estimate ELO (simplified)
-         # Stockfish Level 1 is approximately 1350 ELO
-         stockfish_elo = 1350
-         if results["win_rate"] > 0 or results["loss_rate"] > 0:
-             score = results["wins"] + 0.5 * results["draws"]
-             expected = total * 0.5  # Expected score against equal opponent

-             # Simple ELO estimation
-             if score > 0:
-                 win_ratio = score / total
-                 if win_ratio > 0 and win_ratio < 1:
-                     elo_diff = -400 * (1 - 2 * win_ratio) / (1 if win_ratio > 0.5 else -1)
-                     results["estimated_elo"] = stockfish_elo + elo_diff
-                 else:
-                     results["estimated_elo"] = stockfish_elo + (400 if win_ratio >= 1 else -400)
-             else:
-                 results["estimated_elo"] = stockfish_elo - 400
-         else:
-             results["estimated_elo"] = None
-
-         return results
 
 
- def load_model_from_hub(model_id: str, device: str = "auto", verbose: bool = True):
      """
-     Load a model from the Hugging Face Hub.

      Args:
-         model_id: Model ID on Hugging Face Hub.
-         device: Device to load the model on.
-         verbose: Whether to print debug info about loaded tokenizer.
-
-     Returns:
-         Tuple of (model, tokenizer).
      """
-     from transformers import AutoModelForCausalLM, AutoTokenizer
-
-     # Import to register custom classes
-     from src.model import ChessConfig, ChessForCausalLM
-     from src.tokenizer import ChessTokenizer
-
-     # Try AutoTokenizer with trust_remote_code first to load custom tokenizer.py from Hub
-     # Fall back to local ChessTokenizer if the model doesn't have a custom tokenizer
-     tokenizer_source = None
      try:
-         tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
-         tokenizer_source = "AutoTokenizer (from Hub with trust_remote_code=True)"
      except Exception as e:
-         if verbose:
-             print(f"  AutoTokenizer failed: {e}")
-         tokenizer = ChessTokenizer.from_pretrained(model_id)
-         tokenizer_source = "ChessTokenizer (local class, vocab from Hub)"
-
-     model = AutoModelForCausalLM.from_pretrained(
-         model_id,
-         trust_remote_code=True,
-         device_map=device,
-     )
-
-     # Print debug info
-     if verbose:
-         print(f"  Tokenizer loaded via: {tokenizer_source}")
-         print(f"  Tokenizer class: {type(tokenizer).__name__}")
-         print(f"  Tokenizer vocab size: {tokenizer.vocab_size}")
-         # Check if tokenizer has custom attributes that might differ
-         if hasattr(tokenizer, '_vocab'):
-             print(f"  Tokenizer has _vocab attribute: yes ({len(tokenizer._vocab)} entries)")
-
-     return model, tokenizer

  def main():
      """Main evaluation function."""
-     parser = argparse.ArgumentParser(description="Evaluate a chess model")

      parser.add_argument(
          "--model_path", type=str, required=True,
-         help="Path to the model or Hugging Face model ID"
-     )
-     parser.add_argument(
-         "--mode", type=str, default="legal", choices=["legal", "winrate", "both"],
-         help="Evaluation mode: 'legal' for legal move rate, 'winrate' for games, 'both' for both"
-     )
-     parser.add_argument(
-         "--stockfish_path", type=str, default=None,
-         help="Path to Stockfish executable"
-     )
-     parser.add_argument(
-         "--stockfish_level", type=int, default=1,
-         help="Stockfish skill level (0-20)"
      )
      parser.add_argument(
-         "--n_positions", type=int, default=500,
-         help="Number of positions for legal move evaluation"
      )
      parser.add_argument(
-         "--seed", type=int, default=42,
-         help="Random seed for reproducibility"
      )
      parser.add_argument(
-         "--n_games", type=int, default=100,
-         help="Number of games to play for win rate evaluation"
-     )
-     parser.add_argument(
-         "--temperature", type=float, default=0.7,
-         help="Sampling temperature"
      )

      args = parser.parse_args()
@@ -840,95 +1004,76 @@ def main():
      print("=" * 60)
      print("CHESS CHALLENGE - EVALUATION")
      print("=" * 60)

-     # Load model
-     print(f"\nLoading model from: {args.model_path}")
-
-     import os
-     is_local_path = os.path.exists(args.model_path)

-     if is_local_path:
-         # Local path
-         from transformers import AutoModelForCausalLM
-         from src.tokenizer import ChessTokenizer
-         from src.model import ChessConfig, ChessForCausalLM
-
-         tokenizer = ChessTokenizer.from_pretrained(args.model_path)
-         model = AutoModelForCausalLM.from_pretrained(
-             args.model_path,
-             device_map="auto",
-         )
-     else:
-         # Assume Hugging Face model ID (or invalid path)
-         if args.model_path.startswith(".") or args.model_path.startswith("/"):
-             raise FileNotFoundError(
-                 f"Local model path not found: {args.model_path}\n"
-                 f"Please check that the path exists and contains model files."
-             )
-         model, tokenizer = load_model_from_hub(args.model_path)

      # Create evaluator
-     print(f"\nSetting up evaluator...")
      evaluator = ChessEvaluator(
          model=model,
          tokenizer=tokenizer,
-         stockfish_path=args.stockfish_path,
-         stockfish_level=args.stockfish_level,
      )

-     # Run legal move evaluation
-     if args.mode in ["legal", "both"]:
-         print(f"\n" + "=" * 60)
-         print("PHASE 1: LEGAL MOVE EVALUATION")
-         print("=" * 60)
-         print(f"Testing {args.n_positions} random positions...")
-
-         legal_results = evaluator.evaluate_legal_moves(
-             n_positions=args.n_positions,
-             temperature=args.temperature,
-             verbose=True,
-             seed=args.seed,
-         )
-
-         print("\n" + "-" * 40)
-         print("LEGAL MOVE RESULTS")
-         print("-" * 40)
-         print(f"  Positions tested: {legal_results['total_positions']}")
-         print(f"  Legal (1st try): {legal_results['legal_first_try']} ({legal_results['legal_rate_first_try']:.1%})")
-         print(f"  Legal (with retry): {legal_results['legal_first_try'] + legal_results['legal_with_retry']} ({legal_results['legal_rate_with_retry']:.1%})")
-         print(f"  Always illegal: {legal_results['illegal_all_retries']} ({legal_results['illegal_rate']:.1%})")
-
-     # Run win rate evaluation
-     if args.mode in ["winrate", "both"]:
-         print(f"\n" + "=" * 60)
-         print("PHASE 2: WIN RATE EVALUATION")
-         print("=" * 60)
-         print(f"Playing {args.n_games} games against Stockfish (Level {args.stockfish_level})...")
-
-         winrate_results = evaluator.evaluate(
-             n_games=args.n_games,
-             temperature=args.temperature,
-             verbose=True,
-         )
-
-         print("\n" + "-" * 40)
-         print("WIN RATE RESULTS")
-         print("-" * 40)
-         print(f"  Wins: {winrate_results['wins']}")
-         print(f"  Losses: {winrate_results['losses']}")
-         print(f"  Draws: {winrate_results['draws']}")
-         print(f"\n  Win Rate: {winrate_results['win_rate']:.1%}")
-         print(f"  Draw Rate: {winrate_results['draw_rate']:.1%}")
-         print(f"  Loss Rate: {winrate_results['loss_rate']:.1%}")
-         print(f"\n  Avg Game Length: {winrate_results['avg_game_length']:.1f} moves")
-         print(f"  Illegal Move Rate: {winrate_results['illegal_move_rate']:.2%}")
-
-         if winrate_results["estimated_elo"]:
-             print(f"\n  Estimated ELO: {winrate_results['estimated_elo']:.0f}")
-
-     print("\n" + "=" * 60)
      print("EVALUATION COMPLETE")
      print("=" * 60)

  if __name__ == "__main__":
 
1
  """
2
  Evaluation script for the Chess Challenge.
3
 
4
+ This script evaluates a trained chess model by:
5
+ 1. Checking if the model has < 1M parameters
6
+ 2. Verifying no illegal use of python-chess for move filtering
7
+ 3. Playing games against a deterministic engine (500 total moves, restarting after 25 moves)
8
+ 4. Tracking legal move rates (first try and with retries)
9
+
10
+ The evaluation is deterministic (greedy decoding, seeded random).
11
  """
12
 
13
  from __future__ import annotations
14
 
15
  import argparse
16
+ import ast
17
+ import os
18
  import random
19
  import re
20
+ import warnings
21
+ from dataclasses import dataclass, field
22
+ from pathlib import Path
23
  from typing import List, Optional, Tuple
24
 
25
  import torch
26
 
27
+ # Suppress HuggingFace warning about empty module names (harmless)
28
+ # This warning comes from transformers' dynamic_module_utils when loading custom code
29
+ import transformers.utils.logging as hf_logging
30
+ hf_logging.set_verbosity_error()
31
+
32
+
33
+ # =============================================================================
34
+ # Data Classes
35
+ # =============================================================================
36
 
  @dataclass
+ class EvaluationResult:
+     """Complete result of an evaluation run."""
+     model_id: str
+     n_parameters: int
+     passed_param_check: bool
+     passed_pychess_check: bool
+     total_moves: int
+     legal_moves_first_try: int
+     legal_moves_with_retry: int
+     games_played: int
+     moves_per_game: List[int] = field(default_factory=list)
+     error_message: Optional[str] = None
+
+     @property
+     def legal_rate_first_try(self) -> float:
+         return self.legal_moves_first_try / self.total_moves if self.total_moves > 0 else 0.0
 
+     @property
+     def legal_rate_with_retry(self) -> float:
+         return self.legal_moves_with_retry / self.total_moves if self.total_moves > 0 else 0.0
 
+     def to_dict(self) -> dict:
+         return {
+             "model_id": self.model_id,
+             "n_parameters": self.n_parameters,
+             "passed_param_check": self.passed_param_check,
+             "passed_pychess_check": self.passed_pychess_check,
+             "total_moves": self.total_moves,
+             "legal_moves_first_try": self.legal_moves_first_try,
+             "legal_moves_with_retry": self.legal_moves_with_retry,
+             "legal_rate_first_try": self.legal_rate_first_try,
+             "legal_rate_with_retry": self.legal_rate_with_retry,
+             "games_played": self.games_played,
+             "moves_per_game": self.moves_per_game,
+             "error_message": self.error_message,
+         }
+
+     def summary(self) -> str:
+         """Generate a human-readable summary for the model page discussion."""
+         lines = [
+             "## Evaluation Results",
+             "",
+             f"**Model**: `{self.model_id}`",
+             f"**Parameters**: {self.n_parameters:,} {'[PASS]' if self.passed_param_check else '[FAIL] (exceeds 1M limit)'}",
+             f"**Chess library check**: {'[PASS]' if self.passed_pychess_check else '[FAIL] (illegal use of python-chess)'}",
+             "",
+         ]
+
+         if not self.passed_param_check:
+             lines.append("**Evaluation not performed**: Model exceeds 1M parameter limit.")
+             return "\n".join(lines)
+
+         if not self.passed_pychess_check:
+             lines.append("**Evaluation not performed**: Model illegally uses python-chess for move filtering.")
+             return "\n".join(lines)
+
+         if self.error_message:
+             lines.append(f"**Evaluation error**: {self.error_message}")
+             return "\n".join(lines)
+
+         lines.extend([
+             "### Performance",
+             "",
+             "| Metric | Value |",
+             "|--------|-------|",
+             f"| Total moves played | {self.total_moves} |",
+             f"| Games played | {self.games_played} |",
+             f"| Legal moves (first try) | {self.legal_moves_first_try} ({self.legal_rate_first_try*100:.1f}%) |",
+             f"| Legal moves (with retries) | {self.legal_moves_with_retry} ({self.legal_rate_with_retry*100:.1f}%) |",
+             "",
+             "### Interpretation",
+             "",
+             "- **>90% legal rate**: Excellent! The model has learned chess rules well.",
+             "- **70-90% legal rate**: Good, but room for improvement.",
+             "- **<70% legal rate**: The model struggles with legal move generation.",
+         ])
+
+         return "\n".join(lines)
+
+
+ # =============================================================================
+ # Security Checks
+ # =============================================================================
+
+ def count_parameters(model) -> int:
+     """Count the total number of parameters in a model."""
+     return sum(p.numel() for p in model.parameters())
+
+
+ def check_pychess_usage(model_path: str) -> Tuple[bool, Optional[str]]:
      """
+     Check whether the model code illegally uses python-chess for move filtering.
 
+     Scans Python files in the model directory for patterns that suggest
+     using chess.Board.legal_moves or similar to filter model outputs.
 
+     Args:
+         model_path: Path to the model directory.
 
+     Returns:
+         Tuple of (passed_check, error_message).
+         passed_check is True if no illegal usage was detected.
+     """
+     forbidden_patterns = [
+         r'\.legal_moves',
+         r'board\.is_legal\s*\(',
+         r'move\s+in\s+.*legal',
+         r'filter.*legal',
+         r'legal.*filter',
+     ]
 
+     model_dir = Path(model_path)
+     if not model_dir.is_dir():
+         # If it's a HuggingFace model ID, we can't check local files here;
+         # the downloaded files are checked after loading.
+         return True, None
 
+     python_files = list(model_dir.glob("*.py"))
 
+     for py_file in python_files:
+         try:
+             content = py_file.read_text()
 
+             # For the template's model.py / tokenizer.py, only scan the
+             # generation-related methods for suspicious patterns.
+             if py_file.name in ["model.py", "tokenizer.py"]:
+                 tree = ast.parse(content)
 
+                 for node in ast.walk(tree):
+                     if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
+                         if node.name in ["forward", "generate", "__call__", "get_move"]:
+                             func_code = ast.get_source_segment(content, node)
+                             if func_code:
+                                 for pattern in forbidden_patterns:
+                                     if re.search(pattern, func_code, re.IGNORECASE):
+                                         return False, f"Illegal chess library usage in {py_file.name}:{node.name}"
              else:
+                 # For other files, check the entire content
+                 for pattern in forbidden_patterns:
+                     if re.search(pattern, content, re.IGNORECASE):
+                         return False, f"Illegal chess library usage detected in {py_file.name}"
+
+         except Exception:
+             # If we can't parse the file, skip it
+             continue
+
+     return True, None
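The forbidden patterns above can be exercised in isolation to see what they flag. A minimal sketch, using the same regexes and a hypothetical helper name (`contains_forbidden` is not part of the evaluation script):

```python
import re

# Same forbidden patterns as in check_pychess_usage
FORBIDDEN_PATTERNS = [
    r'\.legal_moves',
    r'board\.is_legal\s*\(',
    r'move\s+in\s+.*legal',
    r'filter.*legal',
    r'legal.*filter',
]

def contains_forbidden(code: str) -> bool:
    """Return True if any forbidden pattern appears in the code snippet."""
    return any(re.search(p, code, re.IGNORECASE) for p in FORBIDDEN_PATTERNS)

print(contains_forbidden("moves = list(board.legal_moves)"))   # flagged
print(contains_forbidden("logits = model(input_ids).logits"))  # clean
```

Note that the patterns are deliberately broad (`filter.*legal` matches comments too), so submissions should avoid even mentioning legal-move filtering in scanned files.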
+
+
+ # =============================================================================
+ # Model Loading
+ # =============================================================================
+
+ REQUIRED_MODEL_FILES = [
+     "config.json",        # Model configuration
+     "model.safetensors",  # Model weights (or pytorch_model.bin)
+ ]
+
+ REQUIRED_TOKENIZER_FILES = [
+     "tokenizer_config.json",  # Tokenizer configuration
+     "vocab.json",             # Vocabulary file
+ ]
+
+
+ def validate_model_files(model_path: str) -> Tuple[bool, List[str]]:
+     """
+     Validate that a model directory contains all required files.
+
+     For local paths, checks that the model contains:
+     - Model architecture (config.json + weights)
+     - Tokenizer (tokenizer_config.json + vocab.json)
+
+     For HuggingFace Hub models, this is handled by the Hub.
+
+     Args:
+         model_path: Local path or HuggingFace model ID.
 
+     Returns:
+         Tuple of (is_valid, list of missing files).
+     """
+     is_local = os.path.exists(model_path)
 
+     if not is_local:
+         # HuggingFace Hub - validation happens during download
+         return True, []
+
+     model_dir = Path(model_path)
+     missing_files = []
+
+     # Check model files
+     has_safetensors = (model_dir / "model.safetensors").exists()
+     has_pytorch = (model_dir / "pytorch_model.bin").exists()
+     if not (has_safetensors or has_pytorch):
+         missing_files.append("model.safetensors (or pytorch_model.bin)")
+
+     if not (model_dir / "config.json").exists():
+         missing_files.append("config.json")
+
+     # Check tokenizer files
+     for fname in REQUIRED_TOKENIZER_FILES:
+         if not (model_dir / fname).exists():
+             missing_files.append(fname)
+
+     return len(missing_files) == 0, missing_files
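The file check can be reproduced standalone against a throwaway directory. A sketch using only the standard library; `missing_model_files` is a hypothetical helper mirroring the logic of `validate_model_files`:

```python
import tempfile
from pathlib import Path

REQUIRED_TOKENIZER_FILES = ["tokenizer_config.json", "vocab.json"]

def missing_model_files(model_dir: Path) -> list:
    """Return the list of required files missing from a model directory."""
    missing = []
    has_weights = (model_dir / "model.safetensors").exists() or (model_dir / "pytorch_model.bin").exists()
    if not has_weights:
        missing.append("model.safetensors (or pytorch_model.bin)")
    if not (model_dir / "config.json").exists():
        missing.append("config.json")
    for fname in REQUIRED_TOKENIZER_FILES:
        if not (model_dir / fname).exists():
            missing.append(fname)
    return missing

with tempfile.TemporaryDirectory() as tmp:
    d = Path(tmp)
    (d / "config.json").write_text("{}")
    (d / "model.safetensors").write_text("")
    print(missing_model_files(d))  # tokenizer files are still missing
```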
+
+
+ def load_model_and_tokenizer(
+     model_path: str,
+     device: str = "auto",
+     verbose: bool = True,
+ ) -> Tuple[any, any, str]:
+     """
+     Load a model and tokenizer from a local path or the HuggingFace Hub.
+
+     The model must contain all necessary files:
+     - config.json: Model configuration
+     - model.safetensors (or pytorch_model.bin): Model weights
+     - tokenizer_config.json: Tokenizer configuration
+     - vocab.json: Vocabulary file
+
+     Models are loaded with trust_remote_code=True to support custom architectures.
+
+     Args:
+         model_path: Local path or HuggingFace model ID.
+         device: Device to load the model on.
+         verbose: Whether to print debug info.
 
+     Returns:
+         Tuple of (model, tokenizer, source_description).
 
+     Raises:
+         FileNotFoundError: If required model files are missing.
+         RuntimeError: If the model or tokenizer cannot be loaded.
+     """
+     from transformers import AutoModelForCausalLM, AutoTokenizer
+
+     is_local = os.path.exists(model_path)
+
+     # Validate model files for local paths
+     is_valid, missing_files = validate_model_files(model_path)
+     if not is_valid:
+         raise FileNotFoundError(
+             f"Model is missing required files: {', '.join(missing_files)}\n"
+             f"Your model must contain:\n"
+             f"  - config.json (model configuration)\n"
+             f"  - model.safetensors or pytorch_model.bin (model weights)\n"
+             f"  - tokenizer_config.json (tokenizer configuration)\n"
+             f"  - vocab.json (vocabulary)\n"
+             f"See example_solution/ for a reference."
+         )
+
+     if verbose:
+         source = "local path" if is_local else "HuggingFace Hub"
+         print(f"Loading model from {source}: {model_path}")
+
+     # Load the tokenizer
+     load_kwargs = {"trust_remote_code": True}
+     if is_local:
+         load_kwargs["local_files_only"] = True
+
+     try:
+         tokenizer = AutoTokenizer.from_pretrained(model_path, **load_kwargs)
+     except Exception as e:
+         raise RuntimeError(
+             f"Failed to load tokenizer from {model_path}: {e}\n"
+             f"Make sure your model includes tokenizer files and a custom tokenizer class."
+         )
+
+     # Load the model
+     try:
+         model = AutoModelForCausalLM.from_pretrained(
+             model_path,
+             trust_remote_code=True,
+             device_map=device,
+             local_files_only=is_local,
+         )
+     except Exception as e:
+         raise RuntimeError(
+             f"Failed to load model from {model_path}: {e}\n"
+             f"Make sure your model includes config.json with auto_map and model weights."
+         )
+
+     if verbose:
+         print(f"  Tokenizer: {type(tokenizer).__name__} (vocab_size={tokenizer.vocab_size})")
+         print(f"  Model: {type(model).__name__}")
+         print(f"  Parameters: {count_parameters(model):,}")
+
+     return model, tokenizer, model_path
+
+
+ # =============================================================================
+ # Move Generation
+ # =============================================================================
+
+ class MoveGenerator:
+     """
+     Generates moves from a chess model using greedy decoding.
+
+     The generation process:
+     1. Tokenize the current game history
+     2. Generate tokens greedily until whitespace is produced
+     3. Extract the UCI move from the generated text
+     4. Retry up to max_retries times if the move is illegal
+     """
+
+     SQUARE_PATTERN = re.compile(r'[a-h][1-8]')
+
+     def __init__(
+         self,
+         model,
+         tokenizer,
+         device: str = "cuda" if torch.cuda.is_available() else "cpu",
+         max_retries: int = 3,
+         max_tokens_per_move: int = 20,
+     ):
+         self.model = model
+         self.tokenizer = tokenizer
+         self.device = device
+         self.max_retries = max_retries
+         self.max_tokens_per_move = max_tokens_per_move
 
+         # Move the model to the device and set it to eval mode
+         if hasattr(model, 'to'):
+             self.model = model.to(device)
+         self.model.eval()
+
+     def _is_whitespace_token(self, token_str: str) -> bool:
+         """Check if a token represents whitespace (the separator between moves)."""
+         if not token_str:
+             return False
+         # Check for EOS
          if hasattr(self.tokenizer, 'eos_token') and token_str == self.tokenizer.eos_token:
              return True
+         # Check for whitespace
+         return token_str.strip() == "" and len(token_str) > 0
+
      def _extract_uci_move(self, text: str) -> Optional[str]:
          """
+         Extract a UCI move from generated text.
 
+         Looks for two consecutive chess squares (e.g., e2e4) and handles
+         promotion by looking for q/r/b/n after the destination square.
          """
+         squares = self.SQUARE_PATTERN.findall(text)
 
          if len(squares) < 2:
              return None
 
          from_sq, to_sq = squares[0], squares[1]
          uci_move = from_sq + to_sq
 
+         # Check for a promotion piece just after the destination square
+         to_idx = text.find(to_sq)
+         if to_idx != -1:
+             remaining = text[to_idx + 2:to_idx + 5]
              promo_match = re.search(r'[=]?([qrbnQRBN])', remaining)
              if promo_match:
                  uci_move += promo_match.group(1).lower()
 
          return uci_move
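The extraction step can be sketched as a standalone function on hypothetical generated strings (same regexes and slicing as `_extract_uci_move` above):

```python
import re
from typing import Optional

SQUARE_PATTERN = re.compile(r'[a-h][1-8]')

def extract_uci_move(text: str) -> Optional[str]:
    """Find two consecutive squares, plus an optional promotion piece."""
    squares = SQUARE_PATTERN.findall(text)
    if len(squares) < 2:
        return None
    from_sq, to_sq = squares[0], squares[1]
    uci_move = from_sq + to_sq
    to_idx = text.find(to_sq)
    if to_idx != -1:
        # Look at the few characters after the destination square
        remaining = text[to_idx + 2:to_idx + 5]
        promo_match = re.search(r'[=]?([qrbnQRBN])', remaining)
        if promo_match:
            uci_move += promo_match.group(1).lower()
    return uci_move

print(extract_uci_move("WPe2e4"))    # e2e4
print(extract_uci_move("WPe7e8=Q"))  # e7e8q
```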
+
+     def _generate_until_whitespace(
+         self,
          input_ids: torch.Tensor,
+         temperature: float = 0.0,
      ) -> str:
          """
+         Generate tokens until whitespace is encountered.
 
          Args:
+             input_ids: Input token IDs.
+             temperature: Sampling temperature. 0.0 = greedy (argmax).
 
+         Uses greedy decoding (argmax) when temperature=0 for determinism,
+         and sampling when temperature>0 for retries.
          """
          generated_tokens = []
          current_ids = input_ids.clone()
 
+         with torch.no_grad():
+             for _ in range(self.max_tokens_per_move):
                  outputs = self.model(input_ids=current_ids)
+                 logits = outputs.logits[:, -1, :]
 
+                 if temperature == 0.0:
+                     # Greedy decoding: take the argmax
+                     next_token = logits.argmax(dim=-1, keepdim=True)
+                 else:
+                     # Sampling with temperature
+                     probs = torch.softmax(logits / temperature, dim=-1)
+                     next_token = torch.multinomial(probs, num_samples=1)
 
+                 # Decode the token
+                 token_str = self.tokenizer.decode(next_token[0])
+
+                 # Stop at a whitespace/separator token
+                 if self._is_whitespace_token(token_str):
                      break
+
+                 generated_tokens.append(next_token)
+                 current_ids = torch.cat([current_ids, next_token], dim=-1)
 
          if generated_tokens:
+             all_tokens = torch.cat(generated_tokens, dim=1)
+             return self.tokenizer.decode(all_tokens[0], skip_special_tokens=True)
 
          return ""
+
+     def get_move(
          self,
+         game_history: str,
+         legal_moves: set,
+     ) -> Tuple[Optional[str], bool]:
          """
+         Generate a move for the current position.
 
+         The first attempt uses greedy decoding (deterministic).
+         Retries use sampling with temperature (seeded for reproducibility).
 
+         Args:
+             game_history: Space-separated move history in the model's format.
+             legal_moves: Set of legal UCI moves for validation.
+
          Returns:
+             Tuple of (uci_move, was_first_try).
+             uci_move is None if all retries failed.
          """
+         # Prepare the input
+         if game_history:
+             input_text = self.tokenizer.bos_token + " " + game_history
          else:
+             input_text = self.tokenizer.bos_token
+
+         # Get the maximum context length
+         max_length = getattr(self.model.config, 'n_ctx', 512)
 
          inputs = self.tokenizer(
              input_text,
              return_tensors="pt",
              truncation=True,
+             max_length=max_length - self.max_tokens_per_move,
          ).to(self.device)
 
          # Try to generate a legal move
+         for attempt in range(self.max_retries):
+             # First attempt: greedy (temperature=0)
+             # Retries: sampling with increasing temperature
+             temperature = 0.0 if attempt == 0 else 0.5 + 0.25 * attempt
 
+             move_text = self._generate_until_whitespace(inputs["input_ids"], temperature)
              uci_move = self._extract_uci_move(move_text)
 
+             if uci_move and uci_move in legal_moves:
+                 return uci_move, (attempt == 0)
 
+         return None, False
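The retry schedule in `get_move` is deterministic on the first attempt and increasingly exploratory on retries. Spelled out for the default `max_retries=3`:

```python
def retry_temperature(attempt: int) -> float:
    """Temperature used for a given 0-based attempt index, as in get_move."""
    return 0.0 if attempt == 0 else 0.5 + 0.25 * attempt

schedule = [retry_temperature(a) for a in range(3)]
print(schedule)  # [0.0, 0.75, 1.0]
```

Since retries sample under a fixed global seed, the whole evaluation stays reproducible even when the greedy move is illegal.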
+
+
+ # =============================================================================
+ # Chess Game Handler (with built-in deterministic engine)
+ # =============================================================================
+
+ # Piece values for a simple material evaluation
+ PIECE_VALUES = {
+     'P': 100, 'N': 320, 'B': 330, 'R': 500, 'Q': 900, 'K': 20000,
+     'p': -100, 'n': -320, 'b': -330, 'r': -500, 'q': -900, 'k': -20000,
+ }
+
+ # Piece-square table for positional evaluation (simplified)
+ PAWN_TABLE = [
+      0,  0,  0,  0,  0,  0,  0,  0,
+     50, 50, 50, 50, 50, 50, 50, 50,
+     10, 10, 20, 30, 30, 20, 10, 10,
+      5,  5, 10, 25, 25, 10,  5,  5,
+      0,  0,  0, 20, 20,  0,  0,  0,
+      5, -5,-10,  0,  0,-10, -5,  5,
+      5, 10, 10,-20,-20, 10, 10,  5,
+      0,  0,  0,  0,  0,  0,  0,  0,
+ ]
+
+
+ class SimpleEngine:
+     """
+     A simple deterministic chess engine using minimax with alpha-beta pruning.
 
+     This replaces Stockfish to ensure fully deterministic evaluation.
+     The engine is intentionally weak (shallow search) to be beatable.
+     """
 
+     def __init__(self, depth: int = 2):
+         self.depth = depth
+
+     def evaluate_board(self, board) -> int:
          """
+         Evaluate the board position.
 
+         Returns a score from white's perspective:
+         positive = white advantage, negative = black advantage.
+         """
+         if board.is_checkmate():
+             return -30000 if board.turn else 30000
+         if board.is_stalemate() or board.is_insufficient_material():
+             return 0
+
+         score = 0
+
+         # Material counting
+         for square in range(64):
+             piece = board.piece_at(square)
+             if piece:
+                 symbol = piece.symbol()
+                 score += PIECE_VALUES.get(symbol, 0)
+
+                 # Positional bonus for pawns
+                 if symbol == 'P':
+                     score += PAWN_TABLE[63 - square]  # Flip for white
+                 elif symbol == 'p':
+                     score -= PAWN_TABLE[square]
+
+         # Small bonus for mobility
+         if board.turn:  # White to move
+             score += len(list(board.legal_moves))
+         else:
+             score -= len(list(board.legal_moves))
 
+         return score
+
+     def minimax(self, board, depth: int, alpha: int, beta: int, maximizing: bool) -> Tuple[int, Optional[any]]:
          """
+         Minimax with alpha-beta pruning.
 
+         Returns (score, best_move).
+         """
+         if depth == 0 or board.is_game_over():
+             return self.evaluate_board(board), None
 
+         # Sort moves for better pruning (captures and checks first)
+         moves = list(board.legal_moves)
+         moves.sort(key=lambda m: (board.is_capture(m), board.gives_check(m)), reverse=True)
 
+         best_move = moves[0] if moves else None
+
+         if maximizing:
+             max_eval = -float('inf')
+             for move in moves:
+                 board.push(move)
+                 eval_score, _ = self.minimax(board, depth - 1, alpha, beta, False)
+                 board.pop()
+
+                 if eval_score > max_eval:
+                     max_eval = eval_score
+                     best_move = move
+                 alpha = max(alpha, eval_score)
+                 if beta <= alpha:
+                     break
+             return max_eval, best_move
          else:
+             min_eval = float('inf')
+             for move in moves:
+                 board.push(move)
+                 eval_score, _ = self.minimax(board, depth - 1, alpha, beta, True)
+                 board.pop()
+
+                 if eval_score < min_eval:
+                     min_eval = eval_score
+                     best_move = move
+                 beta = min(beta, eval_score)
+                 if beta <= alpha:
+                     break
+             return min_eval, best_move
+
+     def get_best_move(self, board) -> str:
+         """Get the best move for the current position."""
+         _, best_move = self.minimax(
+             board,
+             self.depth,
+             -float('inf'),
+             float('inf'),
+             board.turn  # True if white to move
          )
+         return best_move.uci() if best_move else None
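The material term of `evaluate_board` can be illustrated without python-chess by scoring the piece-placement field of a FEN string directly. A standalone sketch (same `PIECE_VALUES` as above; the positional and mobility terms are omitted):

```python
# Same piece values as in the evaluator (centipawns, signed by color)
PIECE_VALUES = {
    'P': 100, 'N': 320, 'B': 330, 'R': 500, 'Q': 900, 'K': 20000,
    'p': -100, 'n': -320, 'b': -330, 'r': -500, 'q': -900, 'k': -20000,
}

def material_score(fen_placement: str) -> int:
    """Sum piece values over a FEN piece-placement field (digits = empty squares)."""
    return sum(PIECE_VALUES.get(ch, 0) for ch in fen_placement if ch.isalpha())

# Starting position: material is balanced
print(material_score("rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR"))  # 0
# Same position with the black queen removed: white is up 900
print(material_score("rnb1kbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR"))  # 900
```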
+
+
+ class ChessGameHandler:
+     """
+     Handles chess game logic using python-chess.
 
+     This class is used ONLY by the evaluation framework, never by the model.
+     It manages the chess board state and uses a simple built-in engine
+     for deterministic opponent moves.
+     """
+
+     def __init__(self, engine_depth: int = 2):
+         import chess
 
+         self.chess = chess
+         self.board = chess.Board()
+         self.engine = SimpleEngine(depth=engine_depth)
+
+     def reset(self):
+         """Reset the board to the starting position."""
+         self.board = self.chess.Board()
+
+     def get_legal_moves_uci(self) -> set:
+         """Get the set of legal moves in UCI format."""
+         return {move.uci() for move in self.board.legal_moves}
+
+     def make_move(self, uci_move: str) -> bool:
+         """Make a move on the board. Returns True if successful."""
+         try:
+             move = self.chess.Move.from_uci(uci_move)
+             if move in self.board.legal_moves:
+                 self.board.push(move)
+                 return True
+         except (ValueError, self.chess.InvalidMoveError):
+             pass
+         return False
+
+     def get_opponent_move(self) -> str:
+         """Get the opponent engine's move for the current position.
 
+         Uses the built-in SimpleEngine for deterministic moves.
+         """
+         return self.engine.get_best_move(self.board)
+
+     def is_game_over(self) -> bool:
+         """Check if the game is over."""
+         return self.board.is_game_over()
+
+     def get_turn(self) -> str:
+         """Get whose turn it is ('white' or 'black')."""
+         return "white" if self.board.turn == self.chess.WHITE else "black"
+
+     def get_move_history_formatted(self) -> str:
+         """
+         Get the move history in the model's expected format.
 
+         Converts UCI moves to the format: WPe2e4, BNg8f6, etc.
          """
+         moves = []
+         temp_board = self.chess.Board()
 
+         for move in self.board.move_stack:
+             color = "W" if temp_board.turn == self.chess.WHITE else "B"
+             piece = temp_board.piece_at(move.from_square)
+             piece_letter = piece.symbol().upper() if piece else "P"
 
+             from_sq = self.chess.square_name(move.from_square)
+             to_sq = self.chess.square_name(move.to_square)
 
+             move_str = f"{color}{piece_letter}{from_sq}{to_sq}"
 
+             # Handle promotion
+             if move.promotion:
+                 promo_piece = self.chess.piece_symbol(move.promotion).upper()
+                 move_str += f"={promo_piece}"
 
+             # Handle capture
+             if temp_board.is_capture(move):
+                 move_str += "(x)"
 
+             temp_board.push(move)
 
+             # Handle check/checkmate
+             if temp_board.is_checkmate():
+                 if "(x)" in move_str:
+                     move_str = move_str.replace("(x)", "(x+*)")
+                 else:
+                     move_str += "(+*)"
+             elif temp_board.is_check():
+                 if "(x)" in move_str:
+                     move_str = move_str.replace("(x)", "(x+)")
+                 else:
+                     move_str += "(+)"
 
+             moves.append(move_str)
 
+         return " ".join(moves)
+
+     def close(self):
+         """Clean up resources (no-op for the built-in engine)."""
+         pass
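The token assembly in `get_move_history_formatted` can be sketched as a small pure function over pre-computed move attributes. `format_move` is a hypothetical helper (not part of the script) mirroring how the flags compose into the dataset's extended UCI notation:

```python
def format_move(color: str, piece: str, from_sq: str, to_sq: str,
                promotion: str = "", capture: bool = False,
                check: bool = False, checkmate: bool = False) -> str:
    """Build one token of the extended UCI notation, e.g. WPe2e4 or BNg8f6(x)."""
    move_str = f"{color}{piece}{from_sq}{to_sq}"
    if promotion:
        move_str += f"={promotion.upper()}"
    # Capture, check, and checkmate markers combine inside one parenthesis
    flags = ""
    if capture:
        flags += "x"
    if checkmate:
        flags += "+*"
    elif check:
        flags += "+"
    if flags:
        move_str += f"({flags})"
    return move_str

print(format_move("W", "P", "e2", "e4"))                                # WPe2e4
print(format_move("B", "N", "g8", "f6", capture=True))                  # BNg8f6(x)
print(format_move("W", "Q", "h5", "f7", capture=True, checkmate=True))  # WQh5f7(x+*)
```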
+
+
+ # =============================================================================
+ # Main Evaluator
+ # =============================================================================
+
+ class ChessEvaluator:
+     """
+     Main evaluator for the Chess Challenge.
+
+     Evaluation procedure:
+     1. Check that the model has at most 1M parameters
+     2. Check that the model doesn't use python-chess illegally
+     3. Play games against the deterministic engine:
+        - 500 total moves (model moves)
+        - Restart the game after 25 moves
+        - The model always plays white
+     4. Track legal move rates
+     """
 
+     TOTAL_MOVES = 500
+     MOVES_PER_GAME = 25
+     SEED = 42
+
+     def __init__(
          self,
+         model,
+         tokenizer,
+         model_path: str,
+         engine_depth: int = 2,
+         max_retries: int = 3,
+         device: str = "auto",
+         total_moves: int = None,     # Override TOTAL_MOVES for testing
+         moves_per_game: int = None,  # Override MOVES_PER_GAME for testing
+     ):
+         self.model = model
+         self.tokenizer = tokenizer
+         self.model_path = model_path
+         self.max_retries = max_retries
 
+         # Allow overriding the constants for testing
+         self.total_moves = total_moves if total_moves is not None else self.TOTAL_MOVES
+         self.moves_per_game = moves_per_game if moves_per_game is not None else self.MOVES_PER_GAME
+
+         # Determine the device
+         if device == "auto":
+             device = "cuda" if torch.cuda.is_available() else "cpu"
+         self.device = device
+
+         # Initialize the move generator
+         self.move_generator = MoveGenerator(
+             model=model,
+             tokenizer=tokenizer,
+             device=device,
+             max_retries=max_retries,
+         )
+
+         # Initialize the game handler with the built-in deterministic engine
+         self.game_handler = ChessGameHandler(engine_depth=engine_depth)
+
+     def __del__(self):
+         if hasattr(self, 'game_handler'):
+             self.game_handler.close()
+
+     def evaluate(self, verbose: bool = True) -> EvaluationResult:
+         """
+         Run the complete evaluation procedure.
 
          Returns:
+             EvaluationResult with all metrics.
          """
+         # Set seeds for determinism
+         random.seed(self.SEED)
+         torch.manual_seed(self.SEED)
+         if torch.cuda.is_available():
+             torch.cuda.manual_seed_all(self.SEED)
 
+         # Count parameters
+         n_params = count_parameters(self.model)
+         passed_param_check = n_params <= 1_000_000
+
+         if verbose:
+             status = "[PASS]" if passed_param_check else "[FAIL]"
+             print(f"Parameter check: {n_params:,} parameters {status}")
+
+         if not passed_param_check:
+             return EvaluationResult(
+                 model_id=self.model_path,
+                 n_parameters=n_params,
+                 passed_param_check=False,
+                 passed_pychess_check=True,
+                 total_moves=0,
+                 legal_moves_first_try=0,
+                 legal_moves_with_retry=0,
+                 games_played=0,
+                 error_message="Model exceeds 1M parameter limit",
              )
+
+         # Check for illegal python-chess usage
+         passed_pychess, pychess_error = check_pychess_usage(self.model_path)
+
+         if verbose:
+             status = "[PASS]" if passed_pychess else "[FAIL]"
+             print(f"Python-chess check: {status}")
+
+         if not passed_pychess:
+             return EvaluationResult(
+                 model_id=self.model_path,
+                 n_parameters=n_params,
+                 passed_param_check=True,
+                 passed_pychess_check=False,
+                 total_moves=0,
+                 legal_moves_first_try=0,
+                 legal_moves_with_retry=0,
+                 games_played=0,
+                 error_message=pychess_error,
+             )
+
+         # Run the evaluation games
+         if verbose:
+             print("\nPlaying games against the opponent engine...")
+             print(f"  Total moves: {self.total_moves}")
+             print(f"  Moves per game: {self.moves_per_game}")
+
+         try:
+             result = self._play_evaluation_games(verbose=verbose)
+             result.passed_param_check = True
+             result.passed_pychess_check = True
+             result.n_parameters = n_params
+             return result
+         except Exception as e:
+             return EvaluationResult(
+                 model_id=self.model_path,
+                 n_parameters=n_params,
+                 passed_param_check=True,
+                 passed_pychess_check=True,
+                 total_moves=0,
+                 legal_moves_first_try=0,
+                 legal_moves_with_retry=0,
+                 games_played=0,
+                 error_message=str(e),
+             )
+
+     def _play_evaluation_games(self, verbose: bool = True) -> EvaluationResult:
+         """Play the evaluation games and collect statistics."""
+         total_model_moves = 0
+         legal_first_try = 0
+         legal_with_retry = 0
+         games_played = 0
+         moves_per_game = []
+
+         while total_model_moves < self.total_moves:
+             # Start a new game
+             self.game_handler.reset()
+             game_moves = 0
+             games_played += 1
 
+             while game_moves < self.moves_per_game and total_model_moves < self.total_moves:
+                 if self.game_handler.is_game_over():
+                     break
+
+                 turn = self.game_handler.get_turn()
+
+                 if turn == "white":
+                     # Model's turn
+                     legal_moves = self.game_handler.get_legal_moves_uci()
+                     history = self.game_handler.get_move_history_formatted()
+
+                     move, was_first_try = self.move_generator.get_move(history, legal_moves)
+
+                     total_model_moves += 1
+                     game_moves += 1
+
+                     if move:
+                         if was_first_try:
+                             legal_first_try += 1
+                         legal_with_retry += 1
+                         self.game_handler.make_move(move)
+                     else:
+                         # All retries failed - make a random legal move to continue.
+                         # Sort for determinism (set iteration order is not guaranteed).
+                         if legal_moves:
+                             sorted_moves = sorted(legal_moves)
+                             random_move = random.choice(sorted_moves)
+                             self.game_handler.make_move(random_move)
+                 else:
+                     # Opponent engine's turn
+                     opp_move = self.game_handler.get_opponent_move()
+                     self.game_handler.make_move(opp_move)
 
+             moves_per_game.append(game_moves)
 
+             if verbose and games_played % 5 == 0:
+                 rate = legal_with_retry / total_model_moves if total_model_moves > 0 else 0
+                 print(f"  Games: {games_played} | Moves: {total_model_moves}/{self.total_moves} | Legal rate: {rate:.1%}")
+
+         return EvaluationResult(
+             model_id=self.model_path,
+             n_parameters=0,  # Set by the caller
+             passed_param_check=True,
+             passed_pychess_check=True,
+             total_moves=total_model_moves,
+             legal_moves_first_try=legal_first_try,
+             legal_moves_with_retry=legal_with_retry,
+             games_played=games_played,
+             moves_per_game=moves_per_game,
+         )
+
 
+ # =============================================================================
+ # Hub Integration
+ # =============================================================================
 
+ def post_discussion_summary(model_id: str, result: EvaluationResult, token: Optional[str] = None):
      """
+     Post the evaluation summary as a discussion on the model's HuggingFace page.
 
      Args:
+         model_id: The HuggingFace model ID.
+         result: The evaluation result.
+         token: HuggingFace token with write access.
      """
      try:
+         from huggingface_hub import HfApi
+
+         api = HfApi(token=token)
+
+         # Create a discussion with the evaluation results
+         api.create_discussion(
+             repo_id=model_id,
+             title="🏆 Evaluation Results",
+             description=result.summary(),
+             repo_type="model",
+         )
+
+         print(f"Posted evaluation summary to {model_id}")
+
      except Exception as e:
+         print(f"Failed to post discussion: {e}")
 
 
+ # =============================================================================
+ # CLI
+ # =============================================================================
+
  def main():
      """Main evaluation function."""
+     parser = argparse.ArgumentParser(
+         description="Evaluate a chess model for the Chess Challenge",
+         formatter_class=argparse.RawDescriptionHelpFormatter,
+         epilog="""
+ Examples:
+   # Evaluate a local model
+   python -m src.evaluate --model_path ./my_model
+
+   # Evaluate a HuggingFace model
+   python -m src.evaluate --model_path LLM-course/chess-example
+
+   # Evaluate and post results to HuggingFace
+   python -m src.evaluate --model_path LLM-course/chess-example --post_results
+ """
+     )
 
      parser.add_argument(
          "--model_path", type=str, required=True,
+         help="Path to the model directory or HuggingFace model ID"
      )
      parser.add_argument(
+         "--engine_depth", type=int, default=2,
+         help="Opponent engine search depth (default: 2)"
      )
      parser.add_argument(
+         "--post_results", action="store_true",
+         help="Post results as a discussion on the model's HuggingFace page"
      )
      parser.add_argument(
+         "--hf_token", type=str, default=None,
+         help="HuggingFace token for posting results (uses the HF_TOKEN env var if not provided)"
      )
 
      args = parser.parse_args()
 
      print("=" * 60)
      print("CHESS CHALLENGE - EVALUATION")
      print("=" * 60)
+     print()
 
+     # Load the model and tokenizer
+     model, tokenizer, model_id = load_model_and_tokenizer(
+         args.model_path,
+         verbose=True,
+     )
 
+     print()
 
      # Create evaluator
      evaluator = ChessEvaluator(
          model=model,
          tokenizer=tokenizer,
+         model_path=args.model_path,
+         engine_depth=args.engine_depth,
      )
 
+     # Run the evaluation
+     result = evaluator.evaluate(verbose=True)
+
+     # Print the results
+     print()
+     print("=" * 60)
+     print("RESULTS")
+     print("=" * 60)
+     print()
+     print(result.summary())
+
+     # Post the results if requested
+     if args.post_results:
+         token = args.hf_token or os.environ.get("HF_TOKEN")
+         if token:
+             post_discussion_summary(model_id, result, token)
+         else:
+             print("\nWarning: No HuggingFace token provided. Cannot post results.")
+
+     print()
+     print("=" * 60)
      print("EVALUATION COMPLETE")
      print("=" * 60)
+
+     return result
+
+
+ def evaluate_model(model_path: str, verbose: bool = True) -> EvaluationResult:
+     """
+     Convenience function to evaluate a model from a path.
+
+     Args:
+         model_path: Path to the model directory (local or HuggingFace repo ID)
+         verbose: Whether to print progress
+
+     Returns:
+         EvaluationResult with all metrics
+
+     Example:
+         >>> from src.evaluate import evaluate_model
+         >>> results = evaluate_model("./my_model/final")
+         >>> print(results.summary())
+     """
+     model, tokenizer, model_id = load_model_and_tokenizer(model_path, verbose=verbose)
+
+     evaluator = ChessEvaluator(
+         model=model,
+         tokenizer=tokenizer,
+         model_path=model_path,
+     )
+
+     return evaluator.evaluate(verbose=verbose)
 
 
  if __name__ == "__main__":
src/utils.py DELETED
@@ -1,305 +0,0 @@
-"""
-Utility functions for the Chess Challenge.
-
-This module provides helper functions for:
-- Parameter counting and budget analysis
-- Model registration with Hugging Face
-- Move validation with python-chess
-"""
-
-from __future__ import annotations
-
-from typing import Dict, Optional, TYPE_CHECKING
-
-import torch.nn as nn
-
-if TYPE_CHECKING:
-    from src.model import ChessConfig
-
-
-def count_parameters(model: nn.Module, trainable_only: bool = True) -> int:
-    """
-    Count the number of parameters in a model.
-
-    Args:
-        model: The PyTorch model.
-        trainable_only: If True, only count trainable parameters.
-
-    Returns:
-        Total number of parameters.
-    """
-    if trainable_only:
-        return sum(p.numel() for p in model.parameters() if p.requires_grad)
-    return sum(p.numel() for p in model.parameters())
-
-
-def count_parameters_by_component(model: nn.Module) -> Dict[str, int]:
-    """
-    Count parameters broken down by model component.
-
-    Args:
-        model: The PyTorch model.
-
-    Returns:
-        Dictionary mapping component names to parameter counts.
-    """
-    counts = {}
-    for name, module in model.named_modules():
-        if len(list(module.children())) == 0:  # Leaf module
-            param_count = sum(p.numel() for p in module.parameters(recurse=False))
-            if param_count > 0:
-                counts[name] = param_count
-    return counts
-
-
-def estimate_parameters(config: "ChessConfig") -> Dict[str, int]:
-    """
-    Estimate the parameter count for a given configuration.
-
-    This is useful for planning your architecture before building the model.
-
-    Args:
-        config: Model configuration.
-
-    Returns:
-        Dictionary with estimated parameter counts by component.
-    """
-    V = config.vocab_size
-    d = config.n_embd
-    L = config.n_layer
-    n_ctx = config.n_ctx
-    n_inner = config.n_inner
-
-    estimates = {
-        "token_embeddings": V * d,
-        "position_embeddings": n_ctx * d,
-        "attention_qkv_per_layer": 3 * d * d,
-        "attention_proj_per_layer": d * d,
-        "ffn_per_layer": 2 * d * n_inner,
-        "layernorm_per_layer": 4 * d,  # 2 LayerNorms, each with weight and bias
-        "final_layernorm": 2 * d,
-    }
-
-    # Calculate totals
-    per_layer = (
-        estimates["attention_qkv_per_layer"] +
-        estimates["attention_proj_per_layer"] +
-        estimates["ffn_per_layer"] +
-        estimates["layernorm_per_layer"]
-    )
-
-    estimates["total_transformer_layers"] = L * per_layer
-
-    # LM head (tied with embeddings by default)
-    if config.tie_weights:
-        estimates["lm_head"] = 0
-        estimates["lm_head_note"] = "Tied with token embeddings"
-    else:
-        estimates["lm_head"] = V * d
-
-    # Grand total
-    estimates["total"] = (
-        estimates["token_embeddings"] +
-        estimates["position_embeddings"] +
-        estimates["total_transformer_layers"] +
-        estimates["final_layernorm"] +
-        estimates["lm_head"]
-    )
-
-    return estimates
-
-
-def print_parameter_budget(config: "ChessConfig", limit: int = 1_000_000) -> None:
-    """
-    Print a formatted parameter budget analysis.
-
-    Args:
-        config: Model configuration.
-        limit: Parameter limit to compare against.
-    """
-    estimates = estimate_parameters(config)
-
-    print("=" * 60)
-    print("PARAMETER BUDGET ANALYSIS")
-    print("=" * 60)
-    print(f"\nConfiguration:")
-    print(f"  vocab_size (V) = {config.vocab_size}")
-    print(f"  n_embd (d)     = {config.n_embd}")
-    print(f"  n_layer (L)    = {config.n_layer}")
-    print(f"  n_head         = {config.n_head}")
-    print(f"  n_ctx          = {config.n_ctx}")
-    print(f"  n_inner        = {config.n_inner}")
-    print(f"  tie_weights    = {config.tie_weights}")
-
-    print(f"\nParameter Breakdown:")
-    print(f"  Token Embeddings:     {estimates['token_embeddings']:>10,}")
-    print(f"  Position Embeddings:  {estimates['position_embeddings']:>10,}")
-    print(f"  Transformer Layers:   {estimates['total_transformer_layers']:>10,}")
-    print(f"  Final LayerNorm:      {estimates['final_layernorm']:>10,}")
-
-    if config.tie_weights:
-        print(f"  LM Head:              {'(tied)':>10}")
-    else:
-        print(f"  LM Head:              {estimates['lm_head']:>10,}")
-
-    print(f"  " + "-" * 30)
-    print(f"  TOTAL:                {estimates['total']:>10,}")
-
-    print(f"\nBudget Status:")
-    print(f"  Limit:    {limit:>10,}")
-    print(f"  Used:     {estimates['total']:>10,}")
-    print(f"  Remaining:{limit - estimates['total']:>10,}")
-
-    if estimates['total'] <= limit:
-        print(f"\n  Within budget! ({estimates['total'] / limit * 100:.1f}% used)")
-    else:
-        print(f"\n  OVER BUDGET by {estimates['total'] - limit:,} parameters!")
-
-    print("=" * 60)
-
-
-def validate_move_with_chess(move: str, board_fen: Optional[str] = None) -> bool:
-    """
-    Validate a move using python-chess.
-
-    This function converts the dataset's extended UCI format to standard UCI
-    and validates it against the current board state.
-
-    Args:
-        move: Move in extended UCI format (e.g., "WPe2e4", "BNg8f6(x)").
-        board_fen: FEN string of the current board state (optional).
-
-    Returns:
-        True if the move is legal, False otherwise.
-    """
-    try:
-        import chess
-    except ImportError:
-        raise ImportError("python-chess is required for move validation. "
-                          "Install it with: pip install python-chess")
-
-    # Parse the extended UCI format
-    # Format: [W|B][Piece][from_sq][to_sq][suffix]
-    # Example: WPe2e4, BNg8f6(x), WKe1g1(o)
-
-    if len(move) < 6:
-        return False
-
-    # Extract components
-    color = move[0]      # W or B
-    piece = move[1]      # P, N, B, R, Q, K
-    from_sq = move[2:4]  # e.g., "e2"
-    to_sq = move[4:6]    # e.g., "e4"
-
-    # Check for promotion
-    promotion = None
-    if "=" in move:
-        promo_idx = move.index("=")
-        promotion = move[promo_idx + 1].lower()
-
-    # Create board
-    board = chess.Board(board_fen) if board_fen else chess.Board()
-
-    # Build UCI move string
-    uci_move = from_sq + to_sq
-    if promotion:
-        uci_move += promotion
-
-    try:
-        move_obj = chess.Move.from_uci(uci_move)
-        return move_obj in board.legal_moves
-    except (ValueError, chess.InvalidMoveError):
-        return False
-
-
-def convert_extended_uci_to_uci(move: str) -> str:
-    """
-    Convert extended UCI format to standard UCI format.
-
-    Args:
-        move: Move in extended UCI format (e.g., "WPe2e4").
-
-    Returns:
-        Move in standard UCI format (e.g., "e2e4").
-    """
-    if len(move) < 6:
-        return move
-
-    # Extract squares
-    from_sq = move[2:4]
-    to_sq = move[4:6]
-
-    # Check for promotion
-    promotion = ""
-    if "=" in move:
-        promo_idx = move.index("=")
-        promotion = move[promo_idx + 1].lower()
-
-    return from_sq + to_sq + promotion
-
-
-def convert_uci_to_extended(
-    uci_move: str,
-    board_fen: str,
-) -> str:
-    """
-    Convert standard UCI format to extended UCI format.
-
-    Args:
-        uci_move: Move in standard UCI format (e.g., "e2e4").
-        board_fen: FEN string of the current board state.
-
-    Returns:
-        Move in extended UCI format (e.g., "WPe2e4").
-    """
-    try:
-        import chess
-    except ImportError:
-        raise ImportError("python-chess is required for move conversion.")
-
-    board = chess.Board(board_fen)
-    move = chess.Move.from_uci(uci_move)
-
-    # Get color
-    color = "W" if board.turn == chess.WHITE else "B"
-
-    # Get piece
-    piece = board.piece_at(move.from_square)
-    piece_letter = piece.symbol().upper() if piece else "P"
-
-    # Build extended UCI
-    from_sq = chess.square_name(move.from_square)
-    to_sq = chess.square_name(move.to_square)
-
-    result = f"{color}{piece_letter}{from_sq}{to_sq}"
-
-    # Add promotion
-    if move.promotion:
-        result += f"={chess.piece_symbol(move.promotion).upper()}"
-
-    # Add suffix for captures
-    if board.is_capture(move):
-        result += "(x)"
-
-    # Add suffix for check/checkmate
-    board.push(move)
-    if board.is_checkmate():
-        if "(x)" in result:
-            result = result.replace("(x)", "(x+*)")
-        else:
-            result += "(+*)"
-    elif board.is_check():
-        if "(x)" in result:
-            result = result.replace("(x)", "(x+)")
-        else:
-            result += "(+)"
-    board.pop()
-
-    # Handle castling notation
-    if board.is_castling(move):
-        if move.to_square in [chess.G1, chess.G8]:  # Kingside
-            result = result.replace("(x)", "").replace("(+)", "") + "(o)"
-        else:  # Queenside
-            result = result.replace("(x)", "").replace("(+)", "") + "(O)"
-
-    return result
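
The deleted `convert_extended_uci_to_uci` strips the `W`/`B` color, piece letter, and `(x)`/`(+)` suffixes by string slicing. As a standalone sketch (not part of the repo's code), the same parse can be expressed with a regex that also rejects malformed moves instead of passing them through:

```python
import re

# Extended-UCI move, e.g. "WPe2e4", "BNg8f6(x)", "WPe7e8=Q(+)":
# color, piece, from-square, to-square, optional "=<piece>" promotion,
# optional parenthesized suffix for capture/check/castling.
EXT_UCI = re.compile(
    r"^(?P<color>[WB])(?P<piece>[PNBRQK])"
    r"(?P<frm>[a-h][1-8])(?P<to>[a-h][1-8])"
    r"(?:=(?P<promo>[NBRQ]))?"
    r"(?P<suffix>\(.+\))?$"
)

def to_standard_uci(move: str) -> str:
    """Convert an extended-UCI move to standard UCI, raising on bad input."""
    m = EXT_UCI.match(move)
    if m is None:
        raise ValueError(f"not an extended-UCI move: {move!r}")
    promo = (m.group("promo") or "").lower()
    return m.group("frm") + m.group("to") + promo

print(to_standard_uci("WPe2e4"))       # e2e4
print(to_standard_uci("BNg8f6(x)"))    # g8f6
print(to_standard_uci("WPe7e8=Q(+)"))  # e7e8q
```

Unlike the slicing version, a non-matching string raises `ValueError` rather than returning a fragment, which is usually what you want when validating model output.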
submit.py CHANGED
@@ -2,23 +2,143 @@
 """
 Submission script for the Chess Challenge.
 
-This script pushes your trained model to the Hugging Face Hub under the
-LLM-course organization, with metadata tracking who submitted it.
 
 Usage:
-    python submit.py --model_path ./my_model/final_model --model_name my-chess-model
 """
 
 import argparse
 import os
-import tempfile
 from pathlib import Path
 
 
 def main():
-    parser = argparse.ArgumentParser(description="Submit your chess model to Hugging Face Hub")
     parser.add_argument(
-        "--model_path", type=str, default="./my_model/final_model",
         help="Path to your trained model directory"
     )
     parser.add_argument(
@@ -26,89 +146,95 @@ def main():
         help="Name for your model on the Hub (e.g., 'my-chess-model')"
     )
     args = parser.parse_args()
-
-    # Fixed organization
     organization = "LLM-course"
-
-    # Check model path exists
-    if not os.path.exists(args.model_path):
-        print(f"Error: Model path '{args.model_path}' does not exist.")
-        print("Train a model first with: python -m src.train --output_dir ./my_model")
-        return 1
-
-    # Import here to avoid slow startup
-    from huggingface_hub import HfApi, HfFolder, whoami
-    from transformers import AutoModelForCausalLM
-
-    # Ensure user is logged in and get their info
     print("=" * 60)
     print("CHESS CHALLENGE - MODEL SUBMISSION")
     print("=" * 60)
-
     try:
         user_info = whoami()
         username = user_info["name"]
-        print(f"\nLogged in as: {username}")
     except Exception:
-        print("\nYou need to log in to Hugging Face first.")
-        print("Run: huggingface-cli login")
-        return 1
-
-    # Import custom classes to register them
-    from src.model import ChessConfig, ChessForCausalLM
-    from src.tokenizer import ChessTokenizer
-
-    # Load model and tokenizer
-    print(f"\nLoading model from: {args.model_path}")
-    model = AutoModelForCausalLM.from_pretrained(args.model_path)
-    tokenizer = ChessTokenizer.from_pretrained(args.model_path)
-
-    # Count parameters
-    n_params = sum(p.numel() for p in model.parameters())
-    print(f"Model parameters: {n_params:,}")
-
-    if n_params > 1_000_000:
-        print(f"WARNING: Model exceeds 1M parameter limit ({n_params:,} params)")
-
-    # Prepare repo name
-    repo_id = f"{organization}/{args.model_name}"
-    print(f"\nSubmitting to: {repo_id}")
-
-    # Create a temporary directory to prepare submission
-    with tempfile.TemporaryDirectory() as tmp_dir:
-        tmp_path = Path(tmp_dir)
-
-        # Register tokenizer for AutoTokenizer so it can be loaded with trust_remote_code=True
-        # This adds the 'auto_map' field to tokenizer_config.json
-        tokenizer.register_for_auto_class("AutoTokenizer")
-
-        # Register model for AutoModelForCausalLM so custom architectures load correctly
-        # This adds the 'auto_map' field to config.json
-        model.config.auto_map = {
-            "AutoConfig": "model.ChessConfig",
-            "AutoModelForCausalLM": "model.ChessForCausalLM",
-        }
-
-        # Save model and tokenizer
-        model.save_pretrained(tmp_path)
-        tokenizer.save_pretrained(tmp_path)
 
-        # Copy tokenizer.py to allow loading with trust_remote_code=True
-        # This ensures the custom ChessTokenizer can be loaded from the Hub
-        import shutil
-        tokenizer_src = Path(__file__).parent / "src" / "tokenizer.py"
-        if tokenizer_src.exists():
-            shutil.copy(tokenizer_src, tmp_path / "tokenizer.py")
-            print("  Included tokenizer.py for remote loading")
 
-        # Copy model.py to allow loading custom model architectures with trust_remote_code=True
-        # This ensures students who modify the model architecture can load their models from the Hub
-        model_src = Path(__file__).parent / "src" / "model.py"
-        if model_src.exists():
-            shutil.copy(model_src, tmp_path / "model.py")
-            print("  Included model.py for remote loading")
-
-        # Create model card with submitter info
         model_card = f"""---
 library_name: transformers
 tags:
@@ -128,43 +254,47 @@ Chess model submitted to the LLM Course Chess Challenge.
 - **Parameters**: {n_params:,}
 - **Organization**: {organization}
 
-## Model Details
 
-- **Architecture**: Chess Transformer (GPT-style)
-- **Vocab size**: {tokenizer.vocab_size}
-- **Embedding dim**: {model.config.n_embd}
-- **Layers**: {model.config.n_layer}
-- **Heads**: {model.config.n_head}
-"""
-        (tmp_path / "README.md").write_text(model_card)
 
-        # Push to Hub
-        print("\nUploading to Hugging Face Hub...")
-        api = HfApi()
 
-        # Create repo if it doesn't exist
-        api.create_repo(
-            repo_id=repo_id,
-            exist_ok=True,
-        )
 
         # Upload all files
         api.upload_folder(
-            folder_path=tmp_path,
             repo_id=repo_id,
             commit_message=f"Chess Challenge submission by {username}",
         )
-
     print("\n" + "=" * 60)
     print("SUBMISSION COMPLETE!")
     print("=" * 60)
-    print(f"\nYour model is now available at:")
     print(f"  https://huggingface.co/{repo_id}")
     print(f"\nSubmitted by: {username}")
     print(f"Parameters: {n_params:,}")
-
     return 0
 
 
 if __name__ == "__main__":
-    exit(main())
 
 """
 Submission script for the Chess Challenge.
 
+This script validates and uploads your trained model to the Hugging Face Hub
+under the LLM-course organization.
+
+Your model directory must contain:
+- config.json: Model configuration with auto_map for custom architecture
+- model.safetensors (or pytorch_model.bin): Model weights
+- tokenizer_config.json: Tokenizer configuration with auto_map
+- vocab.json: Vocabulary file
+- model.py: Your custom model architecture (for trust_remote_code)
+- tokenizer.py: Your custom tokenizer (for trust_remote_code)
 
 Usage:
+    python submit.py --model_path ./my_model --model_name my-chess-model
 """
 
 import argparse
 import os
+import sys
 from pathlib import Path
 
 
+# Required files for a valid submission
+REQUIRED_FILES = {
+    "config.json": "Model configuration (must include auto_map)",
+    "tokenizer_config.json": "Tokenizer configuration (must include auto_map)",
+    "vocab.json": "Vocabulary file",
+    "model.py": "Custom model architecture (for trust_remote_code=True)",
+    "tokenizer.py": "Custom tokenizer class (for trust_remote_code=True)",
+}
+
+# At least one of these weight files must exist
+WEIGHT_FILES = ["model.safetensors", "pytorch_model.bin"]
+
+
+def validate_model_directory(model_path: Path) -> tuple[bool, list[str]]:
+    """
+    Validate that the model directory contains all required files.
+
+    Returns:
+        Tuple of (is_valid, list of error messages).
+    """
+    errors = []
+
+    # Check required files
+    for filename, description in REQUIRED_FILES.items():
+        if not (model_path / filename).exists():
+            errors.append(f"Missing {filename}: {description}")
+
+    # Check weight files (need at least one)
+    has_weights = any((model_path / f).exists() for f in WEIGHT_FILES)
+    if not has_weights:
+        errors.append(f"Missing model weights: need {' or '.join(WEIGHT_FILES)}")
+
+    return len(errors) == 0, errors
+
+
+def validate_auto_map(model_path: Path) -> tuple[bool, list[str]]:
+    """
+    Validate that config.json and tokenizer_config.json have auto_map fields.
+
+    Returns:
+        Tuple of (is_valid, list of error messages).
+    """
+    import json
+
+    errors = []
+
+    # Check config.json for auto_map
+    config_path = model_path / "config.json"
+    if config_path.exists():
+        with open(config_path) as f:
+            config = json.load(f)
+        if "auto_map" not in config:
+            errors.append(
+                "config.json missing 'auto_map' field. Add:\n"
+                '  "auto_map": {\n'
+                '    "AutoConfig": "model.YourConfig",\n'
+                '    "AutoModelForCausalLM": "model.YourModel"\n'
+                '  }'
+            )
+
+    # Check tokenizer_config.json for auto_map
+    tokenizer_config_path = model_path / "tokenizer_config.json"
+    if tokenizer_config_path.exists():
+        with open(tokenizer_config_path) as f:
+            tokenizer_config = json.load(f)
+        if "auto_map" not in tokenizer_config:
+            errors.append(
+                "tokenizer_config.json missing 'auto_map' field. Add:\n"
+                '  "auto_map": {\n'
+                '    "AutoTokenizer": ["tokenizer.YourTokenizer", null]\n'
+                '  }\n'
+                'Note: AutoTokenizer value must be a list [slow_class, fast_class].'
+            )
+        elif "AutoTokenizer" in tokenizer_config.get("auto_map", {}):
+            auto_tok = tokenizer_config["auto_map"]["AutoTokenizer"]
+            if isinstance(auto_tok, str):
+                errors.append(
+                    "tokenizer_config.json auto_map.AutoTokenizer must be a list, not a string.\n"
+                    'Change from: "AutoTokenizer": "tokenizer.YourTokenizer"\n'
+                    'To:          "AutoTokenizer": ["tokenizer.YourTokenizer", null]'
+                )
+
+    return len(errors) == 0, errors
+
+
+def count_parameters(model_path: Path) -> int:
+    """Count parameters in the model."""
+    from transformers import AutoModelForCausalLM
+
+    model = AutoModelForCausalLM.from_pretrained(
+        model_path,
+        trust_remote_code=True,
+        local_files_only=True,
+    )
+    return sum(p.numel() for p in model.parameters())
+
+
 def main():
+    parser = argparse.ArgumentParser(
+        description="Submit your chess model to the Hugging Face Hub",
+        formatter_class=argparse.RawDescriptionHelpFormatter,
+        epilog="""
+Required files in your model directory:
+  - config.json            Model configuration with auto_map
+  - model.safetensors      Model weights (or pytorch_model.bin)
+  - tokenizer_config.json  Tokenizer configuration with auto_map
+  - vocab.json             Vocabulary file
+  - model.py               Custom model architecture
+  - tokenizer.py           Custom tokenizer class
+
+Example:
+  python submit.py --model_path ./my_model --model_name my-chess-model
+        """
+    )
     parser.add_argument(
+        "--model_path", type=str, required=True,
         help="Path to your trained model directory"
     )
     parser.add_argument(
         help="Name for your model on the Hub (e.g., 'my-chess-model')"
     )
     args = parser.parse_args()
+
+    model_path = Path(args.model_path)
     organization = "LLM-course"
+
     print("=" * 60)
     print("CHESS CHALLENGE - MODEL SUBMISSION")
     print("=" * 60)
+
+    # Check model path exists
+    if not model_path.exists():
+        print(f"\nError: Model path '{model_path}' does not exist.")
+        return 1
+
+    # Validate required files
+    print("\n[1/5] Checking required files...")
+    is_valid, errors = validate_model_directory(model_path)
+    if not is_valid:
+        print("\nError: Model directory is incomplete:")
+        for error in errors:
+            print(f"  - {error}")
+        print("\nSee example_solution/ for a complete example.")
+        return 1
+    print("  All required files present.")
+
+    # Validate auto_map fields
+    print("\n[2/5] Validating auto_map configuration...")
+    is_valid, errors = validate_auto_map(model_path)
+    if not is_valid:
+        print("\nError: Configuration files need auto_map:")
+        for error in errors:
+            print(f"  - {error}")
+        return 1
+    print("  auto_map configuration valid.")
+
+    # Count parameters
+    print("\n[3/5] Counting parameters...")
+    try:
+        n_params = count_parameters(model_path)
+        print(f"  Parameters: {n_params:,}")
+        if n_params > 1_000_000:
+            print(f"\n  WARNING: Model exceeds 1M parameter limit!")
+            print(f"  Your model has {n_params:,} parameters.")
+            print(f"  It will fail the evaluation parameter check.")
+    except Exception as e:
+        print(f"\nError: Could not load model to count parameters: {e}")
+        return 1
+
+    # Hugging Face login
+    print("\n[4/5] Checking Hugging Face authentication...")
+    try:
+        from huggingface_hub import HfApi, whoami
+    except ImportError:
+        print("\nError: huggingface_hub not installed.")
+        print("Install with: pip install huggingface_hub")
+        return 1
+
     try:
         user_info = whoami()
         username = user_info["name"]
+        print(f"  Logged in as: {username}")
     except Exception:
+        print("\n  Not logged in. Starting login process...")
+        print("  You need a Hugging Face account and access token.")
+        print("  Get your token at: https://huggingface.co/settings/tokens")
+        print()
 
+        # Interactive login
+        from huggingface_hub import login
+        try:
+            login()
+            user_info = whoami()
+            username = user_info["name"]
+            print(f"\n  Successfully logged in as: {username}")
+        except Exception as e:
+            print(f"\nError: Login failed: {e}")
+            return 1
+
+    # Upload to Hub
+    print("\n[5/5] Uploading to Hugging Face Hub...")
+    repo_id = f"{organization}/{args.model_name}"
+    print(f"  Repository: {repo_id}")
+
+    api = HfApi()
+
+    try:
+        # Create repo if it doesn't exist
+        api.create_repo(repo_id=repo_id, exist_ok=True)
 
+        # Create a model card
         model_card = f"""---
 library_name: transformers
 tags:
 - **Parameters**: {n_params:,}
 - **Organization**: {organization}
 
+## Usage
 
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
 
+model = AutoModelForCausalLM.from_pretrained("{repo_id}", trust_remote_code=True)
+tokenizer = AutoTokenizer.from_pretrained("{repo_id}", trust_remote_code=True)
+```
 
+## Evaluation
 
+This model is evaluated at the [Chess Challenge Arena](https://huggingface.co/spaces/LLM-course/Chess1MChallenge).
+"""
+
+        # Write model card
+        readme_path = model_path / "README.md"
+        readme_path.write_text(model_card)
+
         # Upload all files
         api.upload_folder(
+            folder_path=model_path,
             repo_id=repo_id,
             commit_message=f"Chess Challenge submission by {username}",
         )
+
+    except Exception as e:
+        print(f"\nError: Upload failed: {e}")
+        return 1
+
     print("\n" + "=" * 60)
     print("SUBMISSION COMPLETE!")
     print("=" * 60)
+    print(f"\nYour model is available at:")
     print(f"  https://huggingface.co/{repo_id}")
     print(f"\nSubmitted by: {username}")
     print(f"Parameters: {n_params:,}")
+    print(f"\nNext step: Go to the Chess Challenge Arena to run evaluation:")
+    print(f"  https://huggingface.co/spaces/LLM-course/Chess1MChallenge")
+
    return 0
 
 
 if __name__ == "__main__":
+    sys.exit(main())
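
The `auto_map` validation added above reduces to a key lookup in the JSON configs. A minimal sketch of that check against a synthetic `config.json` (the `ChessConfig`/`ChessForCausalLM` targets follow the template's `src/model.py`, as the earlier version of this script wrote them; the `model_type` value is illustrative):

```python
import json
import tempfile
from pathlib import Path

def has_auto_map(config_path: Path) -> bool:
    """True if the config file declares an auto_map for trust_remote_code loading."""
    with open(config_path) as f:
        return "auto_map" in json.load(f)

# A config.json shaped the way submit.py expects.
cfg = {
    "model_type": "chess",
    "auto_map": {
        "AutoConfig": "model.ChessConfig",
        "AutoModelForCausalLM": "model.ChessForCausalLM",
    },
}
with tempfile.TemporaryDirectory() as d:
    path = Path(d) / "config.json"
    path.write_text(json.dumps(cfg, indent=2))
    print(has_auto_map(path))  # True
```

Without `auto_map`, `AutoModelForCausalLM.from_pretrained(..., trust_remote_code=True)` cannot locate the custom classes in the uploaded `model.py`, which is why the submission script treats it as a hard error.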
uv.lock ADDED
The diff for this file is too large to render. See raw diff