---
library_name: transformers
tags:
- chess
- llm-course
- chess-challenge
license: mit
---
|
|
|
# chess-ooooooooo-2

A chess transformer model trained for the LLM Course Chess Challenge.
|
|
|
## Model Architecture

This model uses a GPT-style transformer architecture optimized for chess move prediction:

- **Parameters**: 948,352 (0.95M)
- **Vocabulary size**: 85
- **Embedding dimension**: 128
- **Number of layers**: 6
- **Attention heads**: 4
- **Feed-forward dimension**: 320
- **Context length**: 256
- **Dropout**: 0.102
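
The hyperparameters above are consistent with the stated parameter count under one plausible accounting. This sketch assumes a standard GPT block layout (learned positional embeddings, biased linear layers, two LayerNorms per block, and an untied, bias-free LM head), which is an assumption rather than a detail taken from the released code:

```python
# Sanity-check the 948,352 figure against the listed hyperparameters.
# Assumes a standard GPT layout: learned positional embeddings, biased
# linear layers, two LayerNorms per block, and an untied, bias-free LM head.
def gpt_param_count(vocab=85, d=128, n_layers=6, d_ff=320, ctx=256):
    tok_emb = vocab * d                      # token embedding table
    pos_emb = ctx * d                        # learned positional embeddings
    attn = 4 * d * d + 4 * d                 # q, k, v, out projections (+ biases)
    ff = (d * d_ff + d_ff) + (d_ff * d + d)  # two feed-forward layers (+ biases)
    norms = 2 * 2 * d                        # two LayerNorms (weight + bias each)
    block = attn + ff + norms
    final_norm = 2 * d                       # final LayerNorm before the head
    lm_head = vocab * d                      # untied output projection, no bias
    return tok_emb + pos_emb + n_layers * block + final_norm + lm_head

print(gpt_param_count())  # 948352
```

Under these assumptions the total matches the reported count exactly, which also shows how tight the 1M parameter budget is.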
|
|
|
## Training

The model was trained on a subset of the Lichess 2025 dataset, focusing on learning valid chess move sequences. The architecture was carefully tuned to stay within the challenge's 1M parameter constraint while maintaining reasonable performance.
|
|
|
## Usage

```python
from transformers import AutoModelForCausalLM

from src.tokenizer import ChessTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "LLM-course/chess-ooooooooo-2",
    trust_remote_code=True,
)
tokenizer = ChessTokenizer.from_pretrained(
    "LLM-course/chess-ooooooooo-2",
    trust_remote_code=True,
)

# Generate a continuation of the game
input_text = "[BOS] WPe2e4"
input_ids = tokenizer.encode(input_text)
outputs = model.generate(input_ids, max_length=50)
predicted_moves = tokenizer.decode(outputs[0])
```
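
The `WPe2e4` token in the example above appears to pack color, piece, source square, and destination square into one string. If that reading is right (the actual vocabulary is defined by the model's `ChessTokenizer`, so treat this as an assumption), a small helper can split generated tokens for downstream use:

```python
def parse_move(token: str):
    """Split a move token such as 'WPe2e4' into its parts.

    The (color, piece, from-square, to-square) layout is inferred from the
    usage example above; the tokenizer's real vocabulary may differ.
    """
    color, piece = token[0], token[1]
    return color, piece, token[2:4], token[4:6]

print(parse_move("WPe2e4"))  # ('W', 'P', 'e2', 'e4')
```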
|
|
|
| ## Submission
|
|
|
| Submitted by [etienneLefranc](https://huggingface.co/etienneLefranc) for the LLM Course Chess Challenge.
|
| Version 2 of chess-ooooooooo.
|
|
|