INFOMTALC 2026: Midterm assignment 1: Transformers-based chess player

Introduction

This model was created for the midterm assignment of the course Transformers: Applications in Language and Communication (INFOMTALC), part of the Applied Data Science Master's programme at Utrecht University. The assignment rules permitted only the free tier of Google Colab for training, and the use of a pre-trained chess LLM was not allowed.

Model Summary

A QLoRA fine-tuned version of the Qwen2.5-1.5B LLM for chess move prediction. Given a board state in Forsyth–Edwards Notation (FEN), the model outputs a move in Universal Chess Interface (UCI) format.

Model Architecture:

  • Base model: Qwen2.5-1.5B (Causal LM)
  • Architecture: Transformer with RoPE, SwiGLU, RMSNorm, Attention QKV bias and tied word embeddings
  • Total parameters: 1.54B (1.31B non-embedding)
  • Trainable QLoRA parameters: 4,358,144 (0.49% of total)
  • Layers: 28 | Context length: 32,768 tokens

Training Data

Two datasets were combined (239,077 total samples):

Dataset                                             Samples   Share
Lichess/chess-position-evaluations (depth ≥ 16)     218,217   91.3%
Synthetic Stockfish games (1,000 simulated games)    20,860    8.7%

Lichess entries were filtered to engine depth ≥ 16 to ensure high-quality
move annotations, at the cost of reduced dataset size.
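The depth filter described above can be sketched as a simple pass over the raw evaluation records. The field names below are illustrative; the actual Lichess dataset schema may differ.

```python
def filter_by_depth(records, min_depth=16):
    """Keep only entries whose engine evaluation reached at least min_depth."""
    return [r for r in records if r.get("depth", 0) >= min_depth]

# Toy records mimicking the evaluation schema (illustrative only).
records = [
    {"fen": "startpos", "depth": 22, "best_move": "e2e4"},
    {"fen": "startpos", "depth": 10, "best_move": "d2d4"},
]
filtered = filter_by_depth(records)
print(len(filtered))  # 1 — only the depth-22 record survives
```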

Training Procedure

  • Method: QLoRA fine-tuning (PEFT)
  • Training samples: 227,134
  • Effective batch size: 64
  • Training steps: 7,098 (2 epochs)
  • Final training loss: 1.089 (cross-entropy, steadily decreasing)
  • Environment: Google Colab (free tier, T4 GPU)
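A minimal sketch of a quantization and adapter setup consistent with the procedure above. The specific LoRA hyperparameters (r, alpha, target modules) are assumptions for illustration — the card does not report them — and float16 compute is chosen to match the T4 GPU.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit NF4 quantization of the frozen base model (the "Q" in QLoRA).
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,  # T4 has no bfloat16 support
)
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-1.5B", quantization_config=bnb
)

# LoRA adapters on the attention projections; r=8 / alpha=16 are assumed values.
lora = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora)
model.print_trainable_parameters()  # reports the trainable count and share
```

With this setup only the small adapter matrices receive gradients, which is what keeps the trainable fraction near 0.5% of total parameters.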

How to Use

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the frozen base model, then attach the QLoRA adapter weights.
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-1.5B")
model = PeftModel.from_pretrained(base, "MSweetbread/qwen2.5-1.5b-chess-qlora")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-1.5B")

# Prompt with a FEN board state; the model completes with a UCI move.
fen = "rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq e3 0 1"
inputs = tokenizer(fen, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(output[0], skip_special_tokens=True))

Limitations

  • Training loss of 1.089 suggests the model is not highly confident in its
    predictions and will produce suboptimal moves in many positions.
  • QLoRA at 0.49% of parameters means only a small fraction of the model was
    updated; chess knowledge is partially constrained by the base LLM's pretraining.
  • Training data consists primarily of Lichess position evaluations (filtered
    to engine depth ≥ 16), meaning the model is tuned on strong engine moves and
    may struggle with unusual or unconventional positions.
  • No legal move validation is enforced — the model may occasionally output
    illegal moves.
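
Because legality is not enforced, generated moves should be checked before use. The sketch below is a format-level sanity check only — it verifies UCI syntax, not whether the move is legal in the given position (full legality checking would require a chess library such as python-chess, which is not assumed here).

```python
import re

# UCI moves: source square, target square, optional promotion piece
# (e.g. "e2e4", "e7e8q").
UCI_RE = re.compile(r"^[a-h][1-8][a-h][1-8][qrbn]?$")

def looks_like_uci(move: str) -> bool:
    """Return True if the string is syntactically a UCI move."""
    return bool(UCI_RE.fullmatch(move.strip()))

print(looks_like_uci("e7e5"))   # True
print(looks_like_uci("e7e8q"))  # True
print(looks_like_uci("Nf3"))    # False — SAN notation, not UCI
```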