# 🧠 Zweig Chess Engine / Cognitive Chess Coach
The brain needs a body. Get the inference code and tools here.
"A chess engine that makes human mistakes, on purpose."
This repository contains human-aligned PyTorch chess models. Unlike traditional engines like Stockfish that calculate the absolute best move, Zweig is trained to predict the move a human player would likely make at a specific ELO rating.
The models were fine-tuned on the Lichess 2025 dataset, split into 12 skill brackets, and are available here on Hugging Face.
Select the model that corresponds to the skill level you wish to simulate.
| Filename | ELO Target | Skill Level | Description |
|---|---|---|---|
| `maia_finetuned_train_01_400-1000.pth` | 400 - 1000 | Novice | |
| `maia_finetuned_train_02_1001-1200.pth` | 1001 - 1200 | Beginner | |
| `maia_finetuned_train_03_1201-1325.pth` | 1201 - 1325 | Casual | |
| `maia_finetuned_train_05_1426-1500.pth` | 1426 - 1500 | Intermediate | |
| `maia_finetuned_train_08_1651-1750.pth` | 1651 - 1750 | Club Player | |
| `maia_finetuned_train_09_1751-1875.pth` | 1751 - 1875 | Strong Club | |
| `maia_finetuned_train_10_1876-2100.pth` | 1876 - 2100 | Expert | |
| `maia_finetuned_train_11_2101-2400.pth` | 2101 - 2400 | Master | |
| `maia_finetuned_train_12_2401-PLUS.pth` | 2401+ | Elite / GM | |
| `..._12_2401-PLUS_AGGRESSIVE.pth` | 2401+ | Elite (Aggro) | 🧪 Experimental: aggressive tactical lines. |
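Picking the right file by hand is error-prone, so a small convenience helper (hypothetical, not part of the repo) can map a player's rating to the matching checkpoint from the table above:

```python
# Hypothetical helper (not part of the repo): map an ELO rating to the
# matching checkpoint filename, using the brackets from the table above.
BRACKETS = [
    (400, 1000, "maia_finetuned_train_01_400-1000.pth"),
    (1001, 1200, "maia_finetuned_train_02_1001-1200.pth"),
    (1201, 1325, "maia_finetuned_train_03_1201-1325.pth"),
    (1426, 1500, "maia_finetuned_train_05_1426-1500.pth"),
    (1651, 1750, "maia_finetuned_train_08_1651-1750.pth"),
    (1751, 1875, "maia_finetuned_train_09_1751-1875.pth"),
    (1876, 2100, "maia_finetuned_train_10_1876-2100.pth"),
    (2101, 2400, "maia_finetuned_train_11_2101-2400.pth"),
]

def model_for_elo(elo: int) -> str:
    """Return the checkpoint filename whose ELO bracket contains `elo`."""
    for low, high, filename in BRACKETS:
        if low <= elo <= high:
            return filename
    if elo >= 2401:
        return "maia_finetuned_train_12_2401-PLUS.pth"
    # Ratings falling in a gap between listed brackets (e.g. 1326-1425)
    # snap to the nearest bracket boundary.
    return min(BRACKETS, key=lambda b: min(abs(elo - b[0]), abs(elo - b[1])))[2]
```

For example, `model_for_elo(1450)` selects the Intermediate checkpoint, `maia_finetuned_train_05_1426-1500.pth`.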
To run inference you need the model definition from the GitHub repository; the snippet below assumes its `src/` layout.
```python
import torch
import chess  # python-chess, used by the repo's board utilities
from src.model import Maia2_New  # requires the repo's directory structure
from src.utils import board_to_tensor_19ch, create_vocab

DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
VOCAB_SIZE = 4208  # fixed move-vocabulary size

# 1. Initialize the model
model = Maia2_New(vocab_size=VOCAB_SIZE).to(DEVICE)

# 2. Load a checkpoint
# (The provided utility functions can download the model automatically.)
model_path = "maia_finetuned_train_05_1426-1500.pth"
checkpoint = torch.load(model_path, map_location=DEVICE)

# Checkpoints may store the weights directly or under 'model_state_dict'.
if "model_state_dict" in checkpoint:
    model.load_state_dict(checkpoint["model_state_dict"])
else:
    model.load_state_dict(checkpoint)

model.eval()
print(f"✅ Model loaded: {model_path}")
```
⚠️ **Note:**
- ❌ These models are not a Stockfish replacement.
- ❌ No tree search or evaluation function is included.
- ✅ The focus is purely on human-like move prediction and style.
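Because there is no search on top, playing a move amounts to masking the model's policy output to the legal moves and sampling from what remains. A minimal, dependency-free sketch of that selection step (toy probability dict and move strings; in practice the probabilities come from the model's policy head):

```python
import random

def sample_human_move(policy, legal_moves, temperature=1.0):
    """Mask a move-probability dict to legal moves, then sample one.

    `policy` maps move strings (e.g. UCI) to probabilities; moves the
    model never predicted default to 0. Temperature is applied crudely
    to the probabilities: values < 1 sharpen, values > 1 flatten.
    """
    masked = {m: policy.get(m, 0.0) ** (1.0 / temperature) for m in legal_moves}
    total = sum(masked.values())
    if total == 0:
        # Model put no mass on any legal move: fall back to uniform choice.
        return random.choice(legal_moves)
    # Sample proportionally to the (renormalized) masked probabilities.
    r = random.uniform(0, total)
    acc = 0.0
    for move, prob in masked.items():
        acc += prob
        if r <= acc:
            return move
    return move  # guard against floating-point rounding at the boundary
```

This is the whole "engine" loop: no minimax, no evaluation, just one forward pass and a sample, which is what produces the human-like (and human-fallible) play.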
## 📊 Hugging Face Dataset
Training data is hosted separately and can be streamed directly. No local .pgn files required.
```python
from datasets import load_dataset

# Load the dataset from Hugging Face
# (pass streaming=True to iterate without downloading everything locally)
dataset = load_dataset("ygkla/zweig-chess-engine-processed")

# Example: access the 'Intermediate' bracket
elo_05 = dataset["train"]  # or specify a split/configuration if available

# Inspect the first game
print(elo_05[0])
```
**Dataset Features:**
- **ELO Brackets:** 12 distinct levels (400 to 2401+)
- **Format:** Processed PGN sequences
- **Volume:** ~11 million games
- **Source:** Lichess Open Database (2025)
## 📜 References
- **Original Paper:** McIlroy-Young, R., Sen, S., Kleinberg, J., & Anderson, A. (2020). Aligning Superhuman AI with Human Behavior: Chess as a Model System. KDD '20.
- **AlphaZero:** Silver, D., et al. (2017). Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm. arXiv:1712.01815.
- **Data Source:** Lichess Open Database
## ⚖️ License
Released under the MIT License.