	---
	license: mit
	base_model: Qwen/Qwen2.5-1.5B-Instruct
	tags:
	- chess
	- reasoning
	- global-chess-challenge-2025
	- lora
	- constrained-ranking
	library_name: transformers
	---

	# LLM Chess Agent - Global Chess Challenge 2025

	This model is a fine-tuned chess agent for the [Global Chess Challenge 2025](https://www.aicrowd.com/challenges/global-chess-challenge-2025).

	## 🎯 Architecture

	- Base Model: Qwen/Qwen2.5-1.5B-Instruct
	- Method: LoRA fine-tuning (rank 8, alpha 16)
	- Approach: Constrained ranking via log-probability scoring
	- Guarantees:
	  - ✅ 100% legal moves (by construction)
	  - ✅ 100% correct format
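
For intuition about what "rank 8, alpha 16" implies, the sketch below computes the standard LoRA scaling factor (alpha / rank) and the number of trainable parameters an adapter adds to a single weight matrix. The 1536 x 1536 shape is an illustrative assumption, not a figure from this card, and the helper names are hypothetical; which layers were actually adapted is not stated here.

```python
def lora_scale(alpha: float, rank: int) -> float:
    """Scaling applied to the low-rank update: W' = W + (alpha / rank) * B @ A."""
    if rank <= 0:
        raise ValueError("rank must be positive")
    return alpha / rank


def lora_extra_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters added to one (d_out x d_in) weight matrix.

    A has shape (rank, d_in) and B has shape (d_out, rank).
    """
    return rank * (d_in + d_out)


RANK, ALPHA = 8, 16      # values from this model card
D_IN = D_OUT = 1536      # illustrative matrix shape (assumption)

scale = lora_scale(ALPHA, RANK)
extra = lora_extra_params(D_IN, D_OUT, RANK)
full = D_IN * D_OUT
print(f"scale={scale}, adapter params={extra}, full matrix params={full}")
```

Under these assumptions the adapter trains roughly 1% of the parameters of the matrix it wraps, which is consistent with the short training time reported on a laptop-class device.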

	## 🎮 How It Works

	The agent uses constrained ranking instead of free generation:

	1. The environment provides the position (FEN), the side to move, and the list of legal moves
	2. The agent scores each candidate move by its log-probability under the model
	3. The agent selects: `best_move = argmax(scores)`
	4. Result: the chosen move is always legal, because it is drawn from the provided list
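
The four steps above can be sketched in plain Python. This is a minimal illustration, not the agent's actual code: `log_softmax`, `sequence_logprob`, and `pick_move` are hypothetical helper names, and the scores in the usage lines are made-up numbers standing in for model output. With a real model, each candidate's score would be the summed log-probability of its tokens given the prompt.

```python
import math
from typing import Callable, List, Sequence


def log_softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable log-softmax over one vocabulary-sized logit vector."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def sequence_logprob(step_logits: Sequence[Sequence[float]],
                     token_ids: Sequence[int]) -> float:
    """Sum of log P(token_t | prompt, earlier tokens) over a candidate's tokens.

    step_logits[t] holds the logits that predict token_ids[t].
    """
    if len(step_logits) != len(token_ids):
        raise ValueError("need one logit vector per candidate token")
    return sum(log_softmax(logits)[tok]
               for logits, tok in zip(step_logits, token_ids))


def pick_move(legal_moves: Sequence[str],
              score_fn: Callable[[str], float]) -> str:
    """Return the highest-scoring move; the result is always in legal_moves."""
    if not legal_moves:
        raise ValueError("no legal moves (game is over)")
    return max(legal_moves, key=score_fn)


# Toy usage: made-up scores stand in for model log-probabilities.
legal_moves = ["e2e4", "d2d4", "g1f3"]
toy_scores = {"e2e4": -0.9, "d2d4": -1.4, "g1f3": -2.1}
best_move = pick_move(legal_moves, toy_scores.__getitem__)
print(best_move)
```

Because `pick_move` only ever returns an element of `legal_moves`, legality holds regardless of how good the scores are. One design point to be aware of: a plain sum of token log-probabilities tends to favor candidates that tokenize into fewer tokens; whether this agent applies any length normalization is not stated in this card.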

	## 📊 Performance

	- Legality: 100% (guaranteed by constrained ranking)
	- Format: 100% (hardcoded output)
	- Top-1 Accuracy: ~70-80% (agreement with Stockfish's best move at depth 10)
	- ACPL (average centipawn loss): ~100-150
	- Playing Strength: ~1500-1800 Elo
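
As a reference for how the two engine-based metrics are commonly defined, here is a small sketch. It assumes each evaluated position comes with the engine's best move and centipawn evaluations (from the mover's point of view) for both the best move and the move played. The `MoveRecord` type and the sample numbers are hypothetical, not data from this model, and the exact evaluation protocol used for the figures above is not described in this card.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class MoveRecord:
    played: str        # move chosen by the agent (UCI)
    engine_best: str   # engine's preferred move (UCI)
    best_cp: int       # engine eval after the best move, in centipawns
    played_cp: int     # engine eval after the played move, in centipawns


def top1_accuracy(records: Sequence[MoveRecord]) -> float:
    """Fraction of positions where the agent played the engine's top move."""
    if not records:
        raise ValueError("no records to evaluate")
    return sum(r.played == r.engine_best for r in records) / len(records)


def acpl(records: Sequence[MoveRecord]) -> float:
    """Average centipawn loss; per-move loss is clamped at zero."""
    if not records:
        raise ValueError("no records to evaluate")
    return sum(max(0, r.best_cp - r.played_cp) for r in records) / len(records)


# Made-up sample, for illustration only.
sample = [
    MoveRecord("e2e4", "e2e4", 30, 30),
    MoveRecord("g1f3", "d2d4", 40, 10),
    MoveRecord("a2a3", "c2c4", 25, -95),
    MoveRecord("d2d4", "d2d4", 15, 15),
]
print(top1_accuracy(sample), acpl(sample))
```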

	## 🚀 Usage

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer
	from peft import PeftModel

	# Load base model
	base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-1.5B-Instruct")
	tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-1.5B-Instruct")

	# Load LoRA adapter
	model = PeftModel.from_pretrained(base_model, "SBellilty/llm-chess-agent-v2")

	# Use with the official challenge environment
	# See: https://github.com/AIcrowd/global-chess-challenge-2025-starter-kit
	```

	## 📝 Training

	- Dataset: Lichess games + Stockfish labels
	- Positions: 20k-50k
	- Training Steps: 2000-5000
	- Hardware: Apple Silicon Mac (PyTorch MPS backend)
	- Training Time: ~1-2 hours

	## 🏆 Challenge

	Submitted to the [Global Chess Challenge 2025](https://www.aicrowd.com/challenges/global-chess-challenge-2025).

	## 📄 License

	MIT License

	## 🙏 Acknowledgments

	- Challenge organizers: AIcrowd & AGI House
	- Base model: Qwen team
	- Chess engine: Stockfish
	- Data source: Lichess Open Database