# Example Solution

This folder contains a complete reference implementation for the Chess Challenge.

**Use this to understand the expected format**: see how `model.py`, `tokenizer.py`, and the configuration files should be structured.

## Files Included

| File | Description |
|------|-------------|
| `model.py` | Custom transformer architecture |
| `tokenizer.py` | Custom move-level tokenizer |
| `train.py` | Training script |
| `data.py` | Dataset utilities |
| `config.json` | Model configuration with auto_map |
| `model.safetensors` | Trained model weights |
| `vocab.json` | Tokenizer vocabulary |
| `tokenizer_config.json` | Tokenizer configuration with auto_map |
| `special_tokens_map.json` | Special token mappings |

## Model Architecture

This example uses a small GPT-style transformer:

| Parameter | Value |
|-----------|-------|
| Embedding dim | 128 |
| Layers | 4 |
| Attention heads | 4 |
| Context length | 256 |
| Total parameters | ~910K |
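The parameter total can be sanity-checked with a quick back-of-the-envelope count. The sketch below assumes learned positional embeddings, biased attention and MLP projections, and a vocabulary of roughly 650 move tokens (the real size comes from `vocab.json`); with weight tying, the output head adds no extra parameters:

```python
# Rough parameter count for the architecture above (assumptions noted in comments).
d, n_layers, ctx = 128, 4, 256
vocab = 650  # assumed; the actual vocabulary size comes from vocab.json

tok_emb = vocab * d               # token embedding, shared with the output head (weight tying)
pos_emb = ctx * d                 # learned positional embedding (assumed)
attn = 4 * d * d + 4 * d          # q/k/v/out projections with biases
mlp = 2 * 4 * d * d + 4 * d + d   # up (d -> 4d) and down (4d -> d) with biases
norms = 2 * 2 * d                 # two LayerNorms per block (weight + bias each)
per_layer = attn + mlp + norms
total = tok_emb + pos_emb + n_layers * per_layer + 2 * d  # + final LayerNorm
print(f"~{total / 1e3:.0f}K parameters")  # lands near the ~910K reported above
```

The transformer blocks dominate the count; the embedding table only matters once the vocabulary grows into the thousands.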

## Training Details

The model was trained on the Lichess dataset with:
- 3 epochs
- Batch size 32
- Learning rate 5e-4
- Weight tying (embedding = output layer)
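Weight tying means the token-embedding matrix and the output projection share one set of parameters, saving `vocab × dim` weights. A minimal PyTorch sketch of the idea (the sizes here are illustrative, not read from the checkpoint):

```python
import torch.nn as nn

vocab_size, dim = 650, 128  # illustrative sizes, not read from the checkpoint

embedding = nn.Embedding(vocab_size, dim)
lm_head = nn.Linear(dim, vocab_size, bias=False)
lm_head.weight = embedding.weight  # tie: both modules now share one tensor
```

After tying, a gradient step through either module updates the same underlying tensor.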

## How to Use This Example

### Load the model:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("./example_solution", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("./example_solution", trust_remote_code=True)
```

### Generate a move:

```python
import torch

# Game history in the format: WPe2e4 BPe7e5 WNg1f3 ...
history = "[BOS] WPe2e4 BPe7e5"

inputs = tokenizer(history, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
    next_token = outputs.logits[0, -1].argmax().item()

predicted_move = tokenizer.decode([next_token])
print(f"Predicted move: {predicted_move}")
```
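The `WPe2e4`-style tokens above pack colour, piece, source square, and destination square into a single token. A hedged sketch of how such a move-level vocabulary could be enumerated (promotions and castling aside; the actual token set is whatever `vocab.json` contains):

```python
# Enumerate candidate colour+piece+from+to tokens (before any legality filtering).
files, ranks = "abcdefgh", "12345678"
squares = [f + r for f in files for r in ranks]

candidates = [
    colour + piece + src + dst
    for colour in "WB"          # White / Black
    for piece in "PNBRQK"       # pawn, knight, bishop, rook, queen, king
    for src in squares
    for dst in squares
    if src != dst
]
print(len(candidates))  # 2 * 6 * 64 * 63 = 48384 raw combinations
```

An unfiltered set this size would dwarf the model's actual vocabulary, so the real tokenizer presumably keeps only moves that occur in the training games.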

## Evaluation

To evaluate this example:

```bash
python -m src.evaluate --model_path ./example_solution
```

## Key Implementation Details

### auto_map Configuration

The `config.json` contains:
```json
{
  "auto_map": {
    "AutoConfig": "model.ChessConfig",
    "AutoModelForCausalLM": "model.ChessForCausalLM"
  }
}
```

The `tokenizer_config.json` contains:
```json
{
  "auto_map": {
    "AutoTokenizer": ["tokenizer.ChessTokenizer", null]
  }
}
```

Note: `AutoTokenizer` requires a list `[slow_class, fast_class]`, not a string!
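A quick self-contained sanity check of the two shapes, with plain JSON strings standing in for the actual files:

```python
import json

# Stand-ins for config.json and tokenizer_config.json shown above.
config = json.loads("""{
  "auto_map": {
    "AutoConfig": "model.ChessConfig",
    "AutoModelForCausalLM": "model.ChessForCausalLM"
  }
}""")
tokenizer_config = json.loads("""{
  "auto_map": {
    "AutoTokenizer": ["tokenizer.ChessTokenizer", null]
  }
}""")

# AutoConfig / AutoModelForCausalLM map to plain "module.Class" strings...
assert isinstance(config["auto_map"]["AutoModelForCausalLM"], str)
# ...while AutoTokenizer maps to a [slow_class, fast_class] pair (null = no fast tokenizer).
slow, fast = tokenizer_config["auto_map"]["AutoTokenizer"]
assert fast is None
```

Running a check like this before uploading catches the string-vs-list mistake early, since the error from `AutoTokenizer.from_pretrained` can be cryptic.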

## Your Turn!

Use this as inspiration, but create your own solution! Ideas to explore:

1. **Architecture changes**: Different number of layers, heads, or embedding dimensions
2. **Training strategies**: Different learning rates, warmup schedules, or optimizers
3. **Data augmentation**: Flip board colors, use different game phases
4. **Tokenization**: Different move representation formats