---
library_name: transformers
tags:
- chess
- llm-course
- chess-challenge
license: mit
---
# chess-ooooooooo
A chess transformer model trained for the LLM Course Chess Challenge.
## Model Architecture
This model uses a GPT-style transformer architecture optimized for chess move prediction:
- **Parameters**: 948,352 (0.95M)
- **Vocabulary size**: 85
- **Embedding dimension**: 128
- **Number of layers**: 6
- **Attention heads**: 4
- **Feed-forward dimension**: 320
- **Context length**: 256
- **Dropout**: 0.1
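
As a sanity check, the listed hyperparameters reproduce the stated parameter count under standard GPT assumptions (learned positional embeddings, biases on linear layers and LayerNorms, an untied LM head). This is a back-of-the-envelope sketch inferred from the numbers above, not the actual model code:

```python
# Reproduce the stated 948,352 parameter count from the card's hyperparameters.
# Assumes a standard GPT layout: learned positional embeddings, biased linear
# layers, two LayerNorms per block, and an untied output projection.
vocab_size, d_model, n_layers, d_ff, context_len = 85, 128, 6, 320, 256

token_emb = vocab_size * d_model                            # 10,880
pos_emb = context_len * d_model                             # 32,768
attn = 4 * (d_model * d_model + d_model)                    # Wq, Wk, Wv, Wo + biases
ffn = (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)  # up- and down-projection
layer_norms = 2 * (2 * d_model)                             # two LayerNorms per block
per_layer = attn + ffn + layer_norms                        # 148,928
final_ln = 2 * d_model                                      # final LayerNorm
lm_head = d_model * vocab_size                              # untied output projection

total = token_emb + pos_emb + n_layers * per_layer + final_ln + lm_head
print(total)  # 948352 -- matches the card
```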
## Training
The model was trained on a subset of the Lichess 2025 dataset, focusing on learning valid chess move sequences. The architecture was carefully tuned to stay within the 1M parameter constraint while maintaining reasonable performance.
## Usage
```python
from transformers import AutoModelForCausalLM
from src.tokenizer import ChessTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "LLM-course/chess-ooooooooo",
    trust_remote_code=True,
)
tokenizer = ChessTokenizer.from_pretrained(
    "LLM-course/chess-ooooooooo",
    trust_remote_code=True,
)

# Generate a continuation for the opening move 1. e4
input_text = "[BOS] WPe2e4"
input_ids = tokenizer.encode(input_text)
outputs = model.generate(input_ids, max_length=50)
predicted_moves = tokenizer.decode(outputs[0])
print(predicted_moves)
```
## Submission
Submitted by [etienneLefranc](https://huggingface.co/etienneLefranc) for the LLM Course Chess Challenge.