| --- |
| license: mit |
| library_name: pytorch |
| tags: |
| - gomoku |
| - alphazero |
| - mcts |
| - board-game-ai |
| - pytorch |
| pipeline_tag: reinforcement-learning |
| --- |
| |
| # GomokuZeroAI |
|
|
| GomokuZeroAI is an AlphaZero-style Gomoku checkpoint trained with self-play, a PyTorch policy-value network, and Monte Carlo Tree Search. |
|
|
| This repository hosts the model weights used by the companion project: |
|
|
| ```text |
| https://github.com/maojh15/GomokuZeroAI |
| ``` |
|
|
| The current checkpoint is: |
|
|
| ```text |
| iter_0150_15x15.pt |
| ``` |
|
|
| It is intended for local human-vs-AI play through the project's web UI. |
|
|
| ## Quick Start |
|
|
| Clone the code repository: |
|
|
| ```bash |
| git clone https://github.com/maojh15/GomokuZeroAI.git |
| cd GomokuZeroAI |
| ``` |
|
|
| Install dependencies: |
|
|
| ```bash |
| pip install numpy torch pyyaml huggingface_hub |
| ``` |
|
|
| Download the checkpoint: |
|
|
| ```bash |
| hf download maojh15/GomokuZeroAI iter_0150_15x15.pt --local-dir result_15x15/checkpoints |
| ``` |
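
The same download can also be scripted from Python with `hf_hub_download` from `huggingface_hub`. The following is a minimal sketch that reuses the repository id, filename, and target directory from the command above; `download_checkpoint` is a hypothetical helper name, not part of the GomokuZeroAI codebase.

```python
from pathlib import Path

REPO_ID = "maojh15/GomokuZeroAI"
FILENAME = "iter_0150_15x15.pt"
LOCAL_DIR = Path("result_15x15/checkpoints")


def download_checkpoint(local_dir: Path = LOCAL_DIR) -> Path:
    """Download the checkpoint into local_dir and return its local path."""
    # Imported lazily so this module can be imported without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    local_dir.mkdir(parents=True, exist_ok=True)
    path = hf_hub_download(repo_id=REPO_ID, filename=FILENAME, local_dir=local_dir)
    return Path(path)
```

Calling `download_checkpoint()` from the repository root should place the file where the CLI command above puts it.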
|
|
| Start the local human-vs-AI server: |
|
|
| ```bash |
| python play_human.py --host 127.0.0.1 --port 8765 |
| ``` |
|
|
| Open the web UI: |
|
|
| ```text |
| http://127.0.0.1:8765 |
| ``` |
|
|
| Select `iter_0150_15x15.pt` in the checkpoint dropdown and click the new-game button to start playing. |
|
|
| ## Model Details |
|
|
| - Game: Gomoku / Five in a Row |
| - Board size: 15x15 |
| - Checkpoint: `iter_0150_15x15.pt` |
| - Training iteration: 150 |
| - Framework: PyTorch |
| - Architecture: convolutional policy-value network |
| - Input channels: 2 |
| - Network width: 128 channels |
| - Player encodings: `1` and `-1` |
| - MCTS backend used during training: C++ Torch Extension |
| - MCTS playouts during training: 2000 |
| - Opening self-play temperature: 1.0 for the first 12 moves |
| - Evaluation temperature: 0.001 after the opening |
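
The two temperature settings control how a move is drawn from the MCTS visit counts. In the usual AlphaZero formulation, move probabilities are proportional to `N ** (1 / T)`, so `T = 1.0` samples in proportion to visits and `T = 0.001` is effectively greedy. The sketch below illustrates that rule under this assumption; `visit_probs` is a hypothetical helper, and the project's own implementation may differ.

```python
import math


def visit_probs(visits: list[int], temperature: float) -> list[float]:
    """Turn MCTS visit counts into move probabilities, p_i proportional to N_i ** (1 / T)."""
    if temperature <= 0 or not any(n > 0 for n in visits):
        raise ValueError("need a positive temperature and at least one visited move")
    # Work in log space: N ** (1 / T) overflows for a small T such as 0.001.
    logits = [math.log(n) / temperature if n > 0 else -math.inf for n in visits]
    peak = max(logits)
    weights = [math.exp(x - peak) for x in logits]
    total = sum(weights)
    return [w / total for w in weights]
```

With visit counts `[10, 30, 60]`, a temperature of 1.0 keeps the 10/30/60 split, while 0.001 puts essentially all probability on the most visited move.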
|
|
| The network predicts: |
|
|
| - a policy distribution over legal board moves |
| - a value estimate in `[0, 1]` from the current player's perspective |
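
One common way to obtain a distribution over legal moves is to mask occupied cells and renormalize, and a `[0, 1]` value can be flipped to the other player's perspective as `1 - v` if the game is treated as zero-sum. The sketch below shows both under those assumptions. The helper names are hypothetical, and the `0` code for empty cells is assumed here rather than taken from the project.

```python
def mask_policy(raw_policy: list[float], board: list[int]) -> list[float]:
    """Zero out occupied cells and renormalize the remaining probabilities."""
    # Assumption: cells hold 1 or -1 for the two players and 0 when empty.
    masked = [p if cell == 0 else 0.0 for p, cell in zip(raw_policy, board)]
    total = sum(masked)
    if total <= 0.0:
        # Fall back to a uniform distribution over the empty cells.
        empty = sum(1 for cell in board if cell == 0)
        if empty == 0:
            raise ValueError("no legal moves left")
        return [1.0 / empty if cell == 0 else 0.0 for cell in board]
    return [p / total for p in masked]


def opponent_value(value: float) -> float:
    """Flip a [0, 1] value to the other player's perspective (zero-sum assumption)."""
    return 1.0 - value
```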
|
|
| The local web UI can display both the raw network value and the MCTS root value. |
|
|
| ## Intended Use |
|
|
| This checkpoint is meant for: |
|
|
| - playing Gomoku against the AI locally |
| - inspecting policy and visit overlays in the web UI |
| - comparing future GomokuZeroAI checkpoints |
| - experimenting with AlphaZero-style self-play training code |
|
|
| This is not a Transformers model and is not intended for use through the Hugging Face `pipeline()` API. |
|
|
| ## Limitations |
|
|
| - The model was trained for 15x15 Gomoku only. |
| - It requires the GomokuZeroAI codebase to load and run correctly. |
| - Playing strength depends heavily on the MCTS playout setting used at inference time. |
| - More playouts usually improve move quality but increase latency.
| - The checkpoint is an experimental game AI model, not a benchmarked tournament engine. |
|
|
| ## Recommended Inference Settings |
|
|
| For interactive human-vs-AI play, start with: |
|
|
| - `MCTS playouts`: 2000 |
| - `c_puct`: 5.0 |
| - `candidate distance`: empty (all legal moves are considered)
| - `mcts_tactical_shortcuts`: enabled for faster tactical responses in the web UI |
|
|
| If moves are too slow on your machine, reduce `MCTS playouts` to somewhere between 400 and 1000.
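
For context on `c_puct`: in AlphaZero-style MCTS it scales the exploration bonus added to each child's mean value during selection. The sketch below uses the standard PUCT formula from the AlphaZero literature. It is an illustration only, and the project's C++ backend may use a different variant; `puct_score` is a hypothetical name.

```python
import math


def puct_score(q: float, prior: float, parent_visits: int, child_visits: int,
               c_puct: float = 5.0) -> float:
    """Standard PUCT score: Q + c_puct * P * sqrt(N_parent) / (1 + N_child)."""
    exploration = c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return q + exploration
```

The bonus shrinks as a child accumulates visits, so a larger `c_puct` keeps the search on high-prior but less visited moves for longer before the mean value dominates.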
|
|
| ## Files |
|
|
| ```text |
| iter_0150_15x15.pt |
| ``` |
|
|
| This file contains the model weights and training configuration payload used by the GomokuZeroAI checkpoint loader. |
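
The exact layout of the payload is not documented here, so the sketch below only lists the top-level keys and value types, assuming the file unpickles to a `dict`. Both function names are hypothetical. The supported way to load the checkpoint remains the GomokuZeroAI checkpoint loader.

```python
def summarize_payload(payload: dict) -> dict[str, str]:
    """Map each top-level key of a checkpoint dict to the type name of its value."""
    return {str(key): type(value).__name__ for key, value in payload.items()}


def inspect_checkpoint(path: str) -> dict[str, str]:
    """Load a checkpoint on CPU and summarize its top-level structure."""
    import torch  # imported lazily; only needed when a real file is loaded

    # weights_only=False unpickles arbitrary objects, so use it only on trusted files.
    payload = torch.load(path, map_location="cpu", weights_only=False)
    if not isinstance(payload, dict):
        raise TypeError(f"expected a dict payload, got {type(payload).__name__}")
    return summarize_payload(payload)
```

For example, `inspect_checkpoint("result_15x15/checkpoints/iter_0150_15x15.pt")` would return the key names without assuming what they are.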
|
|
| ## Citation |
|
|
| If you use this checkpoint or codebase in your own experiments, please reference the project repository: |
|
|
| ```text |
| https://github.com/maojh15/GomokuZeroAI |
| ``` |
|
|