add model card
README.md
ADDED
@@ -0,0 +1,87 @@
---
license: mit
library_name: pytorch
tags:
- reinforcement-learning
- alphazero
- board-games
- hex-tic-tac-toe
- mcts
pipeline_tag: other
---

# HeXO Bootstrap Model

Pretrained policy/value network for [Hex Tic-Tac-Toe](https://hex-tic-tac-toe.github.io/),
a two-player game on an infinite hexagonal grid, 6-in-a-row to win. Used as
the starting point for AlphaZero-style self-play training in
[`seeligto/hexo_rl`](https://github.com/seeligto/hexo_rl).

## Architecture

- Input: **18 × 19 × 19** float tensor (AlphaZero-style history + scalar planes)
- ResNet-12 trunk with squeeze-and-excitation blocks
- GroupNorm(8) throughout (BN-free, stable under small batch sizes)
- Dual-pool value head with BCE loss
- Auxiliary heads: ownership prediction + winning-line prediction
- Saved as a `state_dict` under the `"model"` key of a standard `torch.save` checkpoint

The board is genuinely infinite: the accompanying Rust engine uses a sparse
coordinate hashmap. The network receives a 19 × 19 window assembled around
the active stone cluster, so the model itself has no board-size prior.
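
To make the windowing idea concrete, here is a minimal pure-Python sketch of cutting a 19 × 19 view out of a sparse stone map. It is not the `hexo_rl` implementation: the axial `(q, r)` coordinates, the bounding-box centring rule, the two-plane layout, and the names `window_origin` and `occupancy_planes` are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Coord = Tuple[int, int]  # hypothetical axial (q, r) hex coordinate
WINDOW = 19              # window side length, from the model card


def window_origin(stones: Dict[Coord, int], size: int = WINDOW) -> Coord:
    """Return the (q, r) of the window's first cell, centred on the stones' bounding box.

    Assumption: centring on the bounding-box midpoint; the real engine may
    pick the window differently.
    """
    if not stones:
        return (-(size // 2), -(size // 2))
    qs = [q for q, _ in stones]
    rs = [r for _, r in stones]
    centre_q = (min(qs) + max(qs)) // 2
    centre_r = (min(rs) + max(rs)) // 2
    return (centre_q - size // 2, centre_r - size // 2)


def occupancy_planes(stones: Dict[Coord, int], size: int = WINDOW) -> List[List[List[float]]]:
    """Build two size x size planes, one per player (0 or 1).

    Stones that fall outside the window are silently dropped.
    """
    q0, r0 = window_origin(stones, size)
    planes = [[[0.0] * size for _ in range(size)] for _ in range(2)]
    for (q, r), player in stones.items():
        row, col = r - r0, q - q0
        if 0 <= row < size and 0 <= col < size:
            planes[player][row][col] = 1.0
    return planes


# Stones far from the origin still land near the middle of the window.
stones = {(1000, -500): 0, (1001, -500): 1, (1002, -501): 0}
planes = occupancy_planes(stones)
```

Because the window is re-centred on every call, absolute board coordinates never reach the network; only positions relative to the cluster do.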

## Training

Supervised bootstrap only: no self-play by the network itself was used to produce this artifact.
Trained on a mixed corpus of:

- **SealBot self-play games** (community minimax engine, mixed time limits)
- **Anonymized public human games** (visibility=public, PII-stripped at ingestion)
- **Hybrid human-seed + bot-continuation games**

See the companion dataset (access-restricted):
[`timmyburn/hexo-bootstrap-corpus`](https://huggingface.co/datasets/timmyburn/hexo-bootstrap-corpus).

## Usage

```python
import torch
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="timmyburn/hexo-bootstrap-models",
    filename="bootstrap_model.pt",
)
# weights_only=False allows arbitrary pickled objects to be loaded,
# so only use it on checkpoints you trust.
ckpt = torch.load(path, map_location="cpu", weights_only=False)
# Load into the network defined in seeligto/hexo_rl:
# from hexo_rl.model.network import HexTacToeNet
# model = HexTacToeNet(in_channels=18)
# model.load_state_dict(ckpt["model"])
# model.eval()
```

The full inference path (windowing, legal-move masking, policy projection
over the 362-dim action space) lives in the
[`hexo_rl`](https://github.com/seeligto/hexo_rl) repo.
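
As a rough illustration of the legal-move masking step, the sketch below renormalises policy logits over legal actions only. It is a generic masked softmax in pure Python, not the code from `hexo_rl`; the function name `masked_softmax` and the boolean-mask representation are assumptions, and the card does not say what the extra slot beyond the 19 × 19 = 361 cells encodes.

```python
import math
from typing import List, Sequence

ACTION_DIM = 362  # from the model card; 362 = 19 * 19 + 1, meaning of the extra slot unspecified


def masked_softmax(logits: Sequence[float], legal: Sequence[bool]) -> List[float]:
    """Softmax over legal actions only; illegal actions get probability 0."""
    if len(logits) != len(legal):
        raise ValueError("logits and legal mask must have the same length")
    if not any(legal):
        raise ValueError("at least one action must be legal")
    # Subtract the largest legal logit for numerical stability.
    peak = max(x for x, ok in zip(logits, legal) if ok)
    exps = [math.exp(x - peak) if ok else 0.0 for x, ok in zip(logits, legal)]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical example: only actions 0 and 5 are legal.
logits = [0.0] * ACTION_DIM
logits[5] = math.log(3.0)
legal = [False] * ACTION_DIM
legal[0] = legal[5] = True
probs = masked_softmax(logits, legal)
```

Masking before normalisation, rather than zeroing probabilities afterwards, keeps the result a valid distribution without a second renormalisation pass.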

## Evaluation

Calibrated against a threat-detection probe on 18-plane fixtures:

| Metric | Pass threshold | Notes |
|---|---|---|
| C2: extension cell in policy top-5 | ≥ 25% | baseline for bootstrap-v4 |
| C3: extension cell in policy top-10 | ≥ 40% | baseline for bootstrap-v4 |

Thresholds are minimum-viable; later self-play checkpoints should clear
these comfortably and will be released as a separate model variant.
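
A top-k metric of this kind can be computed along the following lines. This is a sketch under assumptions, not the probe used for this model: it assumes each fixture pairs a policy vector with a single target action index (the "extension cell"), and the fixture data and the names `top_k_indices` and `top_k_hit_rate` are made up for illustration.

```python
from typing import List, Sequence, Tuple

Fixture = Tuple[Sequence[float], int]  # (policy over actions, target action index)


def top_k_indices(policy: Sequence[float], k: int) -> List[int]:
    """Indices of the k highest-probability actions (ties keep index order)."""
    return sorted(range(len(policy)), key=lambda i: policy[i], reverse=True)[:k]


def top_k_hit_rate(fixtures: Sequence[Fixture], k: int) -> float:
    """Fraction of fixtures whose target index appears in the policy's top k."""
    hits = sum(1 for policy, target in fixtures if target in top_k_indices(policy, k))
    return hits / len(fixtures)


# Toy fixtures over a 6-action space (hypothetical data, not real probe results).
fixtures = [
    ([0.10, 0.50, 0.20, 0.10, 0.05, 0.05], 2),  # target ranked 2nd: hit for k=2
    ([0.40, 0.30, 0.10, 0.10, 0.05, 0.05], 3),  # target outside top 2: miss
    ([0.05, 0.05, 0.10, 0.10, 0.30, 0.40], 5),  # target ranked 1st: hit
    ([0.30, 0.30, 0.20, 0.10, 0.05, 0.05], 4),  # target outside top 2: miss
]
rate = top_k_hit_rate(fixtures, k=2)
passes = rate >= 0.25  # compare against a threshold such as the C2 bar
```

With the real probe, `k` would be 5 or 10 and the comparison would use the corresponding threshold from the table.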

## Files

| File | Size | Description |
|---|---|---|
| `bootstrap_model.pt` | ~17 MB | PyTorch checkpoint (state dict + optimizer + metadata) |

## License

MIT. See the [repository LICENSE](https://github.com/seeligto/hexo_rl/blob/master/LICENSE).