Breakthrough MCVS - Zone Guided AI
Advanced Monte-Carlo Value Search (MCVS) engine for the game Breakthrough (8x8), powered by a novel Displacement-based ABC Model and Weighted Adjacency Matrices with Hilbert-ordered Zone Guidance.
This repository implements a complete zone-guided reinforcement learning system, including self-play training, neural networks, and comparative tournaments against classic UCT.
Core Idea
The engine uses:
- Displacement-based ABC Model with homogeneous coordinates
- Dynamic Weighted Adjacency Matrices W = A ⊙ S ⊙ F
- Hilbert curve ordering for efficient zone retrieval
- A learned Zone Database that stores winning/losing position patterns
- Zone Guidance (λ-PUCT) to bias search toward promising zones
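The Hilbert curve ordering mentioned in the list can be illustrated with the standard cell-to-index conversion for a square grid. This is a minimal sketch of the textbook algorithm, not code taken from this repository; the names `hilbert_index` and `hilbert_order` are hypothetical, and how the engine maps zones to curve positions is described in the paper rather than here.

```python
def hilbert_index(n, x, y):
    """Map cell (x, y) on an n x n grid (n a power of two) to its position on the Hilbert curve."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve is oriented consistently.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def hilbert_order(n):
    """Return all cells of the n x n grid sorted by Hilbert index."""
    cells = [(x, y) for x in range(n) for y in range(n)]
    return sorted(cells, key=lambda c: hilbert_index(n, c[0], c[1]))


# Illustrative use on an 8x8 board, the size used by Breakthrough.
order = hilbert_order(8)
print(order[:4])
```

Consecutive cells along the curve are always neighbours on the board, which is why sorting by this index tends to keep spatially close squares close together in a one-dimensional store.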
For more information please refer to the paper at: https://doi.org/10.13140/RG.2.2.18795.09764
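To give a rough feel for what biasing a PUCT search toward promising zones can look like, the sketch below adds a zone term, scaled by λ, to an AlphaZero-style PUCT score. The additive form, the function names, and the default constants are all assumptions made for illustration; the actual λ-PUCT rule is defined in the paper and in breakthrough_mcvs.py.

```python
import math


def lambda_puct_scores(q_values, priors, visit_counts, zone_bias, c_puct=1.5, lam=0.5):
    """Score each child move: value + prior-weighted exploration + lam * zone bias.

    ASSUMPTION: the zone bias enters additively; this is not confirmed by the source.
    """
    total_visits = sum(visit_counts)
    scores = []
    for q, p, n, z in zip(q_values, priors, visit_counts, zone_bias):
        exploration = c_puct * p * math.sqrt(total_visits) / (1 + n)
        scores.append(q + exploration + lam * z)
    return scores


def select_action(scores):
    """Return the index of the highest-scoring move."""
    return max(range(len(scores)), key=scores.__getitem__)


# Two otherwise identical moves; only the second lies in a favourable zone.
q_values = [0.0, 0.0]
priors = [0.5, 0.5]
visit_counts = [1, 1]
zone_bias = [0.0, 1.0]

without_zones = lambda_puct_scores(q_values, priors, visit_counts, zone_bias, lam=0.0)
with_zones = lambda_puct_scores(q_values, priors, visit_counts, zone_bias, lam=0.5)
print(select_action(with_zones))
```

With λ set to 0 the score reduces to plain PUCT, so in this sketch λ acts as a single dial for how strongly the zone information steers the search.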
Files Overview
| File | Purpose |
|---|---|
| breakthrough_mcvs.py | Main implementation: game logic, ABC model, Zone Database, MCVS, neural networks, incremental training |
| mcvs_vs_uct.py | 200-game tournament between MCVS and UCT with detailed logging and online learning |
| abc_model.py | Displacement-based ABC Model |
| matrix_model.py | Computes the weighted adjacency matrix |
| breakthrough_zone_db.npz | Learned Zone Database |
| breakthrough_checkpoint.pt | Saved Policy & Value neural network weights |
| inspect_npz.py | Utility to inspect the zone database |
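Since the Zone Database is stored as a NumPy .npz archive, its contents can be listed with `numpy.load`. The sketch below is not the repository's inspect_npz.py; `summarize_npz` and the array names in the demo archive (`zone_keys`, `zone_values`) are hypothetical stand-ins, because the real key names are not documented here.

```python
import os
import tempfile

import numpy as np


def summarize_npz(path):
    """Return {array_name: (shape, dtype)} for every array in an .npz archive."""
    # allow_pickle=False is the safe default; an archive holding object arrays
    # would need allow_pickle=True (only do that for files you trust).
    with np.load(path, allow_pickle=False) as archive:
        return {
            name: (archive[name].shape, str(archive[name].dtype))
            for name in archive.files
        }


# Demo on a throwaway archive with made-up array names.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo_zone_db.npz")
    np.savez(
        demo_path,
        zone_keys=np.arange(4, dtype=np.int64),
        zone_values=np.zeros((4, 2), dtype=np.float32),
    )
    summary = summarize_npz(demo_path)

for name, (shape, dtype) in summary.items():
    print(f"{name}: shape={shape}, dtype={dtype}")
```

Pointing `summarize_npz` at breakthrough_zone_db.npz should list whatever arrays the training run actually saved.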
Requirements
PyTorch and NumPy (the torch and numpy packages).
How to use:
A. Incremental Training
python breakthrough_mcvs.py
This runs a continuous loop of self-play and training:
- Generates games using MCVS
- Trains the neural networks
- Updates and saves the Zone Database
- Fully incremental (you can stop and resume anytime)
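The stop-and-resume behaviour described above usually comes down to loading saved state at startup and saving it again after every game. The following is a minimal sketch of that pattern using a NumPy archive; the state layout (`games_played`, `zone_counts`) and the function names are hypothetical, and the placeholder update stands in for the real self-play and training step in breakthrough_mcvs.py.

```python
import os
import tempfile

import numpy as np


def load_state(path):
    """Load saved progress, or start fresh if no file exists yet."""
    if os.path.exists(path):
        with np.load(path) as data:
            return int(data["games_played"]), data["zone_counts"].copy()
    return 0, np.zeros(64, dtype=np.int64)


def save_state(path, games_played, zone_counts):
    """Write to a temporary file first so an interrupted save cannot corrupt the old state."""
    tmp_path = path + ".tmp.npz"
    np.savez(tmp_path, games_played=games_played, zone_counts=zone_counts)
    os.replace(tmp_path, path)


def run_session(path, n_games):
    """Play n_games, saving after each one; returns the total number of games so far."""
    games_played, zone_counts = load_state(path)
    for _ in range(n_games):
        # Placeholder for one self-play game plus a training update.
        zone_counts[games_played % 64] += 1
        games_played += 1
        save_state(path, games_played, zone_counts)
    return games_played


# Two sessions against the same file: the second picks up where the first stopped.
with tempfile.TemporaryDirectory() as tmp_dir:
    state_path = os.path.join(tmp_dir, "demo_state.npz")
    first = run_session(state_path, 3)
    second = run_session(state_path, 2)

print(first, second)
```

Saving after every game keeps the cost of an interruption to at most one game of lost work.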
B. Tournament with Online Learning
This script runs a 200-game tournament while the AI learns.
With Neural Networks (Full version, Online Learning):
python mcvs_vs_uct.py
What happens:
- MCVS plays against classic UCT (alternating sides)
- Neural Policy and Value networks learn online after every game
- Zone Database is updated and saved after each game
- Zone guidance turns on automatically after game 1
- Creates detailed logs:
  - breakthrough_full_results.txt: tournament summary
  - move_log.txt: per-move statistics
  - learning_log.txt: training progress
Without Neural Networks (Zone-only Ablation)
To run the faster ablation version:
- Open mcvs_vs_uct.py
- Change this line near the bottom:
ablation_no_nets=False  # ← Change to True
Install the dependencies with:
pip install torch numpy