HeXO Bootstrap Model
Pretrained policy/value network for Hex Tic-Tac-Toe โ
a two-player game on an infinite hexagonal grid, 6-in-a-row to win. Used as
the starting point for AlphaZero-style self-play training in
seeligto/hexo_rl.
Architecture
- Input: 18 ร 19 ร 19 float tensor (AlphaZero-style history + scalar planes)
- ResNet-12 trunk with squeeze-and-excitation blocks
- GroupNorm(8) throughout (BN-free, stable under small batch sizes)
- Dual-pool value head with BCE loss
- Auxiliary heads: ownership prediction + winning-line prediction
- Saved as a
state_dictinside a standardtorch.savecheckpoint
The board is genuinely infinite: the accompanying Rust engine uses a sparse coordinate hashmap. The network receives a 19ร19 window assembled around the active stone cluster, so the model itself has no board-size prior.
Training
Supervised bootstrap only โ no self-play was used to produce this artifact. Trained on a mixed corpus of:
- SealBot self-play games (community minimax engine, mixed time limits)
- Anonymized public human games (visibility=public, PII-stripped at ingestion)
- Hybrid human-seed + bot-continuation games
See the companion dataset (access-restricted):
timmyburn/hexo-bootstrap-corpus.
Usage
import torch
from huggingface_hub import hf_hub_download
path = hf_hub_download(
repo_id="timmyburn/hexo-bootstrap-models",
filename="bootstrap_model.pt",
)
ckpt = torch.load(path, map_location="cpu", weights_only=False)
# Load into the network defined in seeligto/hexo_rl:
# from hexo_rl.model.network import HexTacToeNet
# model = HexTacToeNet(in_channels=18)
# model.load_state_dict(ckpt["model"])
# model.eval()
The full inference path (windowing, legal-move masking, policy projection
over the 362-dim action space) lives in the
hexo_rl repo.
Evaluation
Calibrated against a threat-detection probe on 18-plane fixtures:
| Metric | Pass threshold | Notes |
|---|---|---|
| C2: extension cell in policy top-5 | โฅ 25% | baseline for bootstrap-v4 |
| C3: extension cell in policy top-10 | โฅ 40% | baseline for bootstrap-v4 |
Thresholds are minimum-viable โ later self-play checkpoints should clear these comfortably and will be released as a separate model variant.
Files
| File | Size | Description |
|---|---|---|
bootstrap_model.pt |
~17 MB | PyTorch checkpoint (state dict + optimizer + metadata) |
License
MIT โ see the repository LICENSE.