---
license: mit
library_name: pytorch
tags:
- gomoku
- alphazero
- mcts
- board-game-ai
- pytorch
pipeline_tag: reinforcement-learning
---
# GomokuZeroAI
GomokuZeroAI is an AlphaZero-style Gomoku checkpoint: a PyTorch policy-value network trained through self-play guided by Monte Carlo Tree Search (MCTS).
This repository hosts the model weights used by the companion project:
```text
https://github.com/maojh15/GomokuZeroAI
```
The current checkpoint is:
```text
iter_0150_15x15.pt
```
It is intended for local human-vs-AI play through the project's web UI.
## Quick Start
Clone the code repository:
```bash
git clone https://github.com/maojh15/GomokuZeroAI.git
cd GomokuZeroAI
```
Install dependencies:
```bash
pip install numpy torch pyyaml huggingface_hub
```
Download the checkpoint:
```bash
hf download maojh15/GomokuZeroAI iter_0150_15x15.pt --local-dir result_15x15/checkpoints
```
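If you prefer to fetch the checkpoint from a script rather than the CLI, the same download can be sketched with `huggingface_hub.hf_hub_download`. The function name `download_checkpoint` is only illustrative; the repository ID, filename, and target directory are the ones used in the command above.

```python
from pathlib import Path

REPO_ID = "maojh15/GomokuZeroAI"
FILENAME = "iter_0150_15x15.pt"
LOCAL_DIR = Path("result_15x15/checkpoints")


def download_checkpoint(local_dir: Path = LOCAL_DIR) -> Path:
    """Fetch the checkpoint from the Hub and return its local path."""
    # Imported lazily so this module can be imported without huggingface_hub.
    from huggingface_hub import hf_hub_download

    path = hf_hub_download(
        repo_id=REPO_ID,
        filename=FILENAME,
        local_dir=str(local_dir),
    )
    return Path(path)
```

Calling `download_checkpoint()` from the root of the cloned repository should place the file where the CLI command above puts it.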
Start the local human-vs-AI server:
```bash
python play_human.py --host 127.0.0.1 --port 8765
```
Open the web UI:
```text
http://127.0.0.1:8765
```
Select `iter_0150_15x15.pt` in the checkpoint dropdown and click the new-game button to start playing.
## Model Details
- Game: Gomoku / Five in a Row
- Board size: 15x15
- Checkpoint: `iter_0150_15x15.pt`
- Training iteration: 150
- Framework: PyTorch
- Architecture: convolutional policy-value network
- Input channels: 2
- Network width: 128 channels
- Player encodings: `1` and `-1`
- MCTS backend used during training: C++ PyTorch extension
- MCTS playouts during training: 2000
- Opening self-play temperature: 1.0 for the first 12 moves
- Evaluation temperature: 0.001 after the opening
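The two temperature settings above can be illustrated with the usual AlphaZero rule, where a move is drawn with probability proportional to `N(a) ** (1 / T)` over the MCTS visit counts. This is a minimal sketch, not the project's code: the function names, the zero-based `move_number`, and the assumption that the low temperature applies to every move after the first 12 are illustrative.

```python
import math
import random


def visit_counts_to_probs(visits: list[int], temperature: float) -> list[float]:
    """Return probabilities proportional to N(a) ** (1 / temperature)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Work in log space: N ** (1 / 0.001) overflows a float for any N > 1.
    logits = [math.log(n) / temperature if n > 0 else -math.inf for n in visits]
    peak = max(logits)
    if peak == -math.inf:
        raise ValueError("at least one move needs a positive visit count")
    weights = [math.exp(x - peak) for x in logits]
    total = sum(weights)
    return [w / total for w in weights]


def sample_move(visits, move_number, opening_moves=12, rng=random):
    """Sample a move index; move_number is zero-based (assumption)."""
    temperature = 1.0 if move_number < opening_moves else 0.001
    probs = visit_counts_to_probs(visits, temperature)
    return rng.choices(range(len(visits)), weights=probs, k=1)[0]
```

At temperature 1.0 the sample follows the visit counts directly, which keeps openings varied. At 0.001 the distribution collapses onto the most visited move, so play is effectively greedy.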
The network predicts:
- a policy distribution over legal board moves
- a value estimate in `[0, 1]` from the current player's perspective
The local web UI can display both the raw network value and the MCTS root value.
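To make "a policy distribution over legal board moves" concrete, the sketch below renormalises per-cell scores over the empty cells only. It assumes a flattened 15x15 board that stores `1` and `-1` for the two players and `0` for empty cells, and that the network emits one raw score per cell; the `0` encoding, the flat layout, and the name `legal_policy` are assumptions, not details confirmed by the checkpoint.

```python
import math

BOARD_SIZE = 15
EMPTY = 0  # assumption: empty cells are stored as 0, stones as 1 and -1


def legal_policy(scores, board):
    """Softmax over empty cells only; occupied cells get probability 0."""
    legal = [i for i, cell in enumerate(board) if cell == EMPTY]
    if not legal:
        raise ValueError("no legal moves left")
    # Subtract the largest legal score for numerical stability.
    peak = max(scores[i] for i in legal)
    weights = {i: math.exp(scores[i] - peak) for i in legal}
    total = sum(weights.values())
    return [weights.get(i, 0.0) / total for i in range(len(board))]
```

For example, on a board with two stones and uniform scores, each of the remaining 223 cells receives probability 1/223.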
## Intended Use
This checkpoint is meant for:
- playing Gomoku against the AI locally
- inspecting policy and visit overlays in the web UI
- comparing future GomokuZeroAI checkpoints
- experimenting with AlphaZero-style self-play training code
This is not a Transformers model and is not intended for use through the Hugging Face `pipeline()` API.
## Limitations
- The model was trained for 15x15 Gomoku only.
- It requires the GomokuZeroAI codebase to load and run correctly.
- Playing strength depends heavily on the MCTS playout setting used at inference time.
- Higher playout counts usually improve move quality but increase the time per move.
- The checkpoint is an experimental game AI model, not a benchmarked tournament engine.
## Recommended Inference Settings
For interactive human-vs-AI play, start with:
- `MCTS playouts`: 2000
- `c_puct`: 5.0
- `candidate distance`: leave empty to consider all legal moves
- `mcts_tactical_shortcuts`: enabled for faster tactical responses in the web UI
If moves are too slow on your machine, reduce `MCTS playouts` to 400-1000.
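For intuition about what `c_puct` controls, the sketch below uses the standard AlphaZero PUCT selection rule, `Q + c_puct * P * sqrt(N_parent) / (1 + N_child)`. Whether GomokuZeroAI uses exactly this variant is not confirmed here; the function names and the dictionary layout of `children` are illustrative.

```python
import math


def puct_score(q_value, prior, parent_visits, child_visits, c_puct=5.0):
    """Standard AlphaZero PUCT: exploitation term plus a prior-weighted bonus."""
    exploration = c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return q_value + exploration


def select_child(children, c_puct=5.0):
    """Pick the index of the child with the highest PUCT score.

    children: list of dicts with keys 'q', 'prior', 'visits' (hypothetical layout).
    """
    parent_visits = sum(c["visits"] for c in children)
    scores = [
        puct_score(c["q"], c["prior"], parent_visits, c["visits"], c_puct)
        for c in children
    ]
    return max(range(len(children)), key=scores.__getitem__)
```

A larger `c_puct` weights the network prior and under-visited moves more heavily, while a smaller value makes the search commit sooner to moves with a high average value.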
## Files
```text
iter_0150_15x15.pt
```
This file contains the model weights together with the training configuration that the GomokuZeroAI checkpoint loader reads.
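To see what the file holds before wiring it into other code, you can load it with `torch.load` and list its top-level entries. The key names inside the payload are not documented here, so this sketch only reports whatever it finds; `load_payload` and `summarize_payload` are illustrative helpers, not part of the GomokuZeroAI codebase.

```python
def load_payload(path, trusted=False):
    """Load the checkpoint on CPU.

    With trusted=False only tensors and plain containers are unpickled.
    Pass trusted=True only for files you trust, since full unpickling
    can execute arbitrary code.
    """
    import torch  # imported lazily so the summary helper works without torch

    return torch.load(path, map_location="cpu", weights_only=not trusted)


def summarize_payload(payload):
    """Map each top-level key to a short description of its value."""
    if not isinstance(payload, dict):
        return {"<root>": type(payload).__name__}
    summary = {}
    for key, value in payload.items():
        if isinstance(value, dict):
            summary[key] = f"dict with {len(value)} entries"
        else:
            summary[key] = type(value).__name__
    return summary
```

If loading with `trusted=False` fails because the configuration contains non-tensor objects, retrying with `trusted=True` is reasonable for a checkpoint you downloaded from this repository yourself.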
## Citation
If you use this checkpoint or codebase in your own experiments, please reference the project repository:
```text
https://github.com/maojh15/GomokuZeroAI
```