Spaces:

LLM-course
/

Chess1MChallenge

Running

App Files Files Community

Chess1MChallenge / example_solution /README.md

nathanael-fijalkow

First version ready with webhook and deterministic eval

cb44915 8 days ago

preview code

raw

history blame contribute delete

2.75 kB

	# Example Solution

	This folder contains a complete reference implementation for the Chess Challenge.

	Use this to understand the expected format - see how model.py, tokenizer.py, and configuration files should be structured.

	## Files Included

	\| File \| Description \|
	\|------\|-------------\|
	\| `model.py` \| Custom transformer architecture \|
	\| `tokenizer.py` \| Custom move-level tokenizer \|
	\| `train.py` \| Training script \|
	\| `data.py` \| Dataset utilities \|
	\| `config.json` \| Model configuration with auto_map \|
	\| `model.safetensors` \| Trained model weights \|
	\| `vocab.json` \| Tokenizer vocabulary \|
	\| `tokenizer_config.json` \| Tokenizer configuration with auto_map \|
	\| `special_tokens_map.json` \| Special token mappings \|

	## Model Architecture

	This example uses a small GPT-style transformer:

	\| Parameter \| Value \|
	\|-----------\|-------\|
	\| Embedding dim \| 128 \|
	\| Layers \| 4 \|
	\| Attention heads \| 4 \|
	\| Context length \| 256 \|
	\| Total parameters \| ~910K \|

	## Training Details

	The model was trained on the Lichess dataset with:
	- 3 epochs
	- Batch size 32
	- Learning rate 5e-4
	- Weight tying (embedding = output layer)

	## How to Use This Example

	### Load the model:

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer

	model = AutoModelForCausalLM.from_pretrained("./example_solution", trust_remote_code=True)
	tokenizer = AutoTokenizer.from_pretrained("./example_solution", trust_remote_code=True)
	```

	### Generate a move:

	```python
	import torch

	# Game history in the format: WPe2e4 BPe7e5 WNg1f3 ...
	history = "[BOS] WPe2e4 BPe7e5"

	inputs = tokenizer(history, return_tensors="pt")
	with torch.no_grad():
	outputs = model(**inputs)
	next_token = outputs.logits[0, -1].argmax()

	predicted_move = tokenizer.decode([next_token])
	print(f"Predicted move: {predicted_move}")
	```

	## Evaluation

	To evaluate this example:

	```bash
	python -m src.evaluate --model_path ./example_solution
	```

	## Key Implementation Details

	### auto_map Configuration

	The `config.json` contains:
	```json
	{
	"auto_map": {
	"AutoConfig": "model.ChessConfig",
	"AutoModelForCausalLM": "model.ChessForCausalLM"
	}
	}
	```

	The `tokenizer_config.json` contains:
	```json
	{
	"auto_map": {
	"AutoTokenizer": ["tokenizer.ChessTokenizer", null]
	}
	}
	```

	Note: `AutoTokenizer` requires a list `[slow_class, fast_class]`, not a string!

	## Your Turn!

	Use this as inspiration, but create your own solution! Ideas to explore:

	1. Architecture changes: Different number of layers, heads, or embedding dimensions
	2. Training strategies: Different learning rates, warmup schedules, or optimizers
	3. Data augmentation: Flip board colors, use different game phases
	4. Tokenization: Different move representation formats