NanoGo

Train and play a strong Go AI on a single GPU.

Reaches ~3900 Elo trained on a single RTX 4090. Using the raw policy alone, with no search, it beats KataGo b6c64 100% of the time and plays even with KataGo b6c96 (~3956 Elo).

Model Files

| File | Description | Size |
| --- | --- | --- |
| sl_final_ema.pt | SL policy (EMA), 56.48% top-1 on KGS expert games | ~263 MB |
| rl_iter9950_policy.pt | RL policy (iter 9950), 100% vs KataGo b6c64, 46% vs b6c96-s18M | ~263 MB |
| rl_iter9950_value.pt | Value network (co-trained with RL policy) | ~261 MB |

Architecture

  • Backbone: 12-layer Transformer, dim=384, 21.6M params
  • Policy head: Decoupled pass classifier — 361-way board softmax + separate sigmoid pass
  • Value head: Same backbone, scalar output in [-1, 1]
  • Features: 48-plane AlphaGo-style board encoding (19x19)
  • Training: SL on KGS expert games → REINFORCE self-play (10K iters, 3.5 days on RTX 4090)
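The decoupled pass head above can be sketched as follows. This is an illustrative PyTorch snippet, not the repo's actual class (`DecoupledPolicyHead` and its layer names are assumptions): a 361-way linear projection for the board-point softmax, plus a separate linear projection producing the sigmoid pass logit.

```python
import torch
import torch.nn as nn

class DecoupledPolicyHead(nn.Module):
    """Sketch of a decoupled pass classifier: 361-way board logits
    plus a separate scalar pass logit (names are illustrative)."""

    def __init__(self, dim: int = 384, board_points: int = 19 * 19):
        super().__init__()
        self.board_proj = nn.Linear(dim, board_points)  # board softmax logits
        self.pass_proj = nn.Linear(dim, 1)              # sigmoid pass logit

    def forward(self, x: torch.Tensor):
        # x: (B, dim) pooled backbone features
        board_logits = self.board_proj(x)               # (B, 361)
        pass_logit = self.pass_proj(x).squeeze(-1)      # (B,)
        return board_logits, pass_logit

head = DecoupledPolicyHead()
feats = torch.randn(2, 384)
board_logits, pass_logit = head(feats)
```

Decoupling pass from the board softmax lets the network decide *whether* to pass independently of *where* it would otherwise play.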

Usage

import torch
from models.factory import build_policy_net
from training.utils import load_checkpoint
import config as cfg

cfg.BACKBONE = "transformer"
cfg.BACKBONE_DIM = 384
cfg.TRANSFORMER_LAYERS = 12
cfg.NORM_TYPE = "rmsnorm"
cfg.FFN_ACTIVATION = "swiglu"
cfg.POS_EMBED = "learned_2d"

model = build_policy_net(cfg)
load_checkpoint("rl_iter9950_policy.pt", model)
model.eval()

# features: (B, 48, 19, 19) board encoding (see Architecture)
# Returns PolicyOutput(board_logits=(B, 361), pass_logit=(B,))
output = model(features)
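To turn the `PolicyOutput` into a single move distribution, the board softmax and the sigmoid pass probability can be combined into one 362-way distribution (index 361 = pass). `move_distribution` is a hypothetical helper for illustration, not part of the repo:

```python
import torch
import torch.nn.functional as F

def move_distribution(board_logits: torch.Tensor, pass_logit: torch.Tensor) -> torch.Tensor:
    """Illustrative helper (assumption, not the repo's API): combine the
    361-way board softmax with the sigmoid pass head into a 362-way
    distribution where index 361 is pass."""
    p_pass = torch.sigmoid(pass_logit)                   # (B,)
    p_board = F.softmax(board_logits, dim=-1)            # (B, 361)
    p_board = p_board * (1.0 - p_pass).unsqueeze(-1)     # rescale board mass
    return torch.cat([p_board, p_pass.unsqueeze(-1)], dim=-1)  # (B, 362)

probs = move_distribution(torch.randn(1, 361), torch.randn(1))
best_move = probs.argmax(dim=-1)  # 0..360 = board point, 361 = pass
```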
