+---
+license: mit
+library_name: pytorch
+tags:
+  - gomoku
+  - alphazero
+  - mcts
+  - board-game-ai
+  - pytorch
+pipeline_tag: reinforcement-learning
+---
+# GomokuZeroAI
+GomokuZeroAI is an AlphaZero-style Gomoku checkpoint trained with self-play, a PyTorch policy-value network, and Monte Carlo Tree Search.
+This repository hosts the model weights used by the companion project:
+```text
+https://github.com/maojh15/GomokuZeroAI
+```
+The current checkpoint is:
+```text
+iter_0150_15x15.pt
+```
+It is intended for local human-vs-AI play through the project's web UI.
+## Quick Start
+Clone the code repository:
+```bash
+git clone https://github.com/maojh15/GomokuZeroAI.git
+cd GomokuZeroAI
+```
+Install dependencies:
+```bash
+pip install numpy torch pyyaml huggingface_hub
+```
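+Download the checkpoint (the `hf` command ships with recent `huggingface_hub` releases; older releases provide `huggingface-cli download` instead):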
+```bash
+hf download maojh15/GomokuZeroAI iter_0150_15x15.pt --local-dir result_15x15/checkpoints
+```
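The same download can also be scripted from Python. The sketch below is one possible equivalent of the CLI call above, not part of the project itself: the helper name `download_checkpoint` is hypothetical, while `hf_hub_download` and its `repo_id`, `filename`, and `local_dir` parameters come from the `huggingface_hub` library.

```python
from pathlib import Path


def download_checkpoint(local_dir: str = "result_15x15/checkpoints") -> Path:
    """Fetch iter_0150_15x15.pt from the Hub into local_dir and return its path.

    Hypothetical helper; it mirrors the `hf download` command shown above.
    """
    # Imported lazily so this module can be loaded even where
    # huggingface_hub is not installed yet.
    from huggingface_hub import hf_hub_download

    path = hf_hub_download(
        repo_id="maojh15/GomokuZeroAI",
        filename="iter_0150_15x15.pt",
        local_dir=local_dir,
    )
    return Path(path)
```

Calling `download_checkpoint()` from the repository root should place the file at `result_15x15/checkpoints/iter_0150_15x15.pt`, the same location the CLI command targets.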
+Start the local human-vs-AI server:
+```bash
+python play_human.py --host 127.0.0.1 --port 8765
+```
+Open the web UI:
+```text
+http://127.0.0.1:8765
+```
+Select `iter_0150_15x15.pt` in the checkpoint dropdown and click the new-game button to start playing.
+## Model Details
+- Game: Gomoku / Five in a Row
+- Board size: 15x15
+- Checkpoint: `iter_0150_15x15.pt`
+- Training iteration: 150
+- Framework: PyTorch
+- Architecture: convolutional policy-value network
+- Input channels: 2
+- Network width: 128 channels
+- Player encodings: `1` and `-1`
+- MCTS backend used during training: C++ extension for PyTorch
+- MCTS playouts during training: 2000
+- Opening self-play temperature: 1.0 for the first 12 moves
+- Evaluation temperature: 0.001 after the opening
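The two temperature entries above can be read as a schedule applied to MCTS visit counts. The sketch below illustrates the usual AlphaZero-style rule, where move probabilities are proportional to `visits ** (1 / temperature)`. It is an assumption that the project uses exactly this rule and that "the first 12 moves" means 0-based move indices 0 to 11; all function names are hypothetical.

```python
import math
import random


def move_temperature(move_index, opening_moves=12, opening_temp=1.0, late_temp=0.001):
    """Temperature for the move at 0-based move_index (assumed indexing)."""
    return opening_temp if move_index < opening_moves else late_temp


def visit_probabilities(visits, temperature):
    """Return p_i proportional to visits_i ** (1 / temperature).

    Works in log space so that a tiny temperature does not overflow.
    Moves with zero visits receive probability 0.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if not any(v > 0 for v in visits):
        raise ValueError("at least one move needs a positive visit count")
    logits = [math.log(v) / temperature if v > 0 else -math.inf for v in visits]
    peak = max(logits)
    weights = [math.exp(x - peak) for x in logits]
    total = sum(weights)
    return [w / total for w in weights]


def sample_move(visits, move_index, rng=random):
    """Sample a move index from the temperature-adjusted visit distribution."""
    probs = visit_probabilities(visits, move_temperature(move_index))
    return rng.choices(range(len(visits)), weights=probs, k=1)[0]
```

At temperature 1.0 the distribution simply follows the visit counts, which keeps openings varied. At 0.001 nearly all probability mass falls on the most-visited move, so play after the opening is close to greedy.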
+The network predicts:
+- a policy distribution over legal board moves
+- a value estimate in `[0, 1]` from the current player's perspective
+The local web UI can display both the raw network value and the MCTS root value.
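To make the "2 input channels" and the `1` / `-1` player encodings more concrete, here is a minimal sketch of how a board could be turned into two binary planes. The plane order (player to move first, opponent second) and the use of `0` for empty cells are assumptions made for illustration; the actual encoding is defined by the GomokuZeroAI codebase.

```python
def encode_board(board, to_move):
    """Split a board of {1, -1, 0} cells into two binary planes.

    Plane 0 marks stones of the player to move, plane 1 marks the
    opponent's stones. Both the plane order and the empty-cell value 0
    are illustrative assumptions.
    """
    if to_move not in (1, -1):
        raise ValueError("to_move must be 1 or -1")
    own = [[1 if cell == to_move else 0 for cell in row] for row in board]
    opp = [[1 if cell == -to_move else 0 for cell in row] for row in board]
    return [own, opp]
```

Encoding relative to the player to move fits a value head that reports its estimate from the current player's perspective, since the network then sees every position from the same side.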
+## Intended Use
+This checkpoint is meant for:
+- playing Gomoku against the AI locally
+- inspecting policy and visit overlays in the web UI
+- comparing future GomokuZeroAI checkpoints
+- experimenting with AlphaZero-style self-play training code
+This is not a Transformers model and is not intended for use through the Hugging Face `pipeline()` API.
+## Limitations
+- The model was trained for 15x15 Gomoku only.
+- It requires the GomokuZeroAI codebase to load and run correctly.
+- Playing strength depends heavily on the MCTS playout setting used at inference time.
+- Higher playouts usually improve move quality but increase latency.
+- The checkpoint is an experimental game AI model, not a benchmarked tournament engine.
+## Recommended Inference Settings
+For interactive human-vs-AI play, start with:
+- `MCTS playouts`: 2000
+- `c_puct`: 5.0
+- `candidate distance`: leave empty so that all legal moves are considered
+- `mcts_tactical_shortcuts`: enabled for faster tactical responses in the web UI
+If moves are too slow on your machine, reduce `MCTS playouts` to a value between 400 and 1000.
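For context on what `c_puct` controls, the sketch below shows the standard AlphaZero-style PUCT selection rule, in which each child is scored as its mean value plus an exploration bonus scaled by `c_puct`. It is an assumption that the project's C++ backend uses this exact form. The `Child` class, the function names, and the choice of `0.0` as the value of an unvisited child are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Child:
    prior: float             # P(s, a) from the policy head
    visits: int = 0          # N(s, a)
    value_sum: float = 0.0   # sum of backed-up values, seen from the parent

    @property
    def q(self) -> float:
        # Mean backed-up value; 0.0 for unvisited children is an assumption.
        return self.value_sum / self.visits if self.visits else 0.0


def puct_score(child, parent_visits, c_puct=5.0):
    """Q(s, a) + c_puct * P(s, a) * sqrt(N(s)) / (1 + N(s, a))."""
    exploration = c_puct * child.prior * math.sqrt(parent_visits) / (1 + child.visits)
    return child.q + exploration


def select_child(children, parent_visits, c_puct=5.0):
    """Return the move whose child has the highest PUCT score."""
    return max(children, key=lambda move: puct_score(children[move], parent_visits, c_puct))
```

A larger `c_puct` weights the prior-driven exploration term more heavily relative to the observed mean value, so the search spreads its playouts over more candidate moves.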
+## Files
+```text
+iter_0150_15x15.pt
+```
+This file contains the model weights and training configuration payload used by the GomokuZeroAI checkpoint loader.
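To see what the payload contains without starting the web UI, something like the following could be used. It is only a sketch: the helper name `describe_checkpoint` is hypothetical, the key names inside the file are not documented here, and the assumption that the payload is a dictionary may not hold. The project's own checkpoint loader remains the supported way to load the model.

```python
def describe_checkpoint(path="result_15x15/checkpoints/iter_0150_15x15.pt"):
    """Return the sorted top-level keys of the checkpoint payload.

    Assumes the payload is a dict; raises TypeError otherwise.
    """
    # Imported lazily so this module can be loaded without PyTorch installed.
    import torch

    # map_location="cpu" avoids needing a GPU just to inspect the file.
    payload = torch.load(path, map_location="cpu")
    if not isinstance(payload, dict):
        raise TypeError(f"expected a dict payload, got {type(payload).__name__}")
    return sorted(str(key) for key in payload)
```

Recent PyTorch releases default `torch.load` to `weights_only=True`, which rejects payloads containing arbitrary pickled objects. If loading fails for that reason and you trust the file, passing `weights_only=False` is the usual workaround.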
+## Citation
+If you use this checkpoint or codebase in your own experiments, please reference the project repository:
+```text
+https://github.com/maojh15/GomokuZeroAI
+```