# simplified__first_attack checkpoint
Accepted fixed-bot initialization checkpoint for the selected-five MuZero/EfficientZero training sequence.
- Source run: `main5_hf_bot_mode_recovery_50k_20260522`
- Checkpoint: `checkpoints/envstep_150000.pth.tar`
- Uploaded: `2026-05-23T11:46:20+00:00`
<!-- CRPT_LATEST_BEST_START -->
## Latest Best Checkpoint
The current latest best checkpoint pointer for this repository is:
`checkpoints/first_attack_bot_only_bnfreeze_detcollect_lr000001_gate_20260525/ckpt_last.pth.tar`
Metadata:
- Game: `simplified__first_attack`
- Source run: `first_attack_bot_only_bnfreeze_detcollect_lr000001_gate_20260525`
- Checkpoint role: BN-frozen bot-mode latest-best `ckpt_last.pth.tar` after 200K env-step stabilization
- Local source at upload: `/mnt/nvme/home/molfetta/molfetta-reasoning/models/first_attack_bot_only_bnfreeze_detcollect_lr000001_gate_20260525/attempt-01/ckpt/ckpt_last.pth.tar`
- Checkpoint SHA256 at upload: `5507d982c6958fa07755e1c7749f856563c592366af87b1f89d4e1f14f3bd353`
- Source note: conservative continuation from the externally audited First Attack frozen checkpoint, with fresh optimizer LR `1e-6`, deterministic collection, and `freeze_batch_norm_stats=true` to avoid BatchNorm running-stat drift.
- Evaluation note: final training-time bot-mode fixed-bot eval on 2026-05-25, 20 games, 50 simulations, rule bot, seat-swapped: reward_mean 1.0, reward_min 1.0, reward_max 1.0, seat1_reward_mean 1.0, seat2_reward_mean 1.0.
- Upload manifest: `metadata/first_attack_bot_only_bnfreeze_detcollect_lr000001_gate_20260525_latest_best_manifest.json`
Older checkpoint files in this repository are preserved; this section is the canonical pointer to use when a consumer needs the latest best checkpoint.
<!-- CRPT_LATEST_BEST_END -->
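A consumer may want to confirm that a downloaded copy of the latest best checkpoint matches the SHA256 recorded in the metadata above. The following is a minimal sketch using only the Python standard library; the helper names `sha256_of_file` and `verify_checkpoint` are hypothetical and not part of this repository's tooling, and the path passed on the command line is assumed to point at a local copy of `ckpt_last.pth.tar`.

```python
import hashlib
import sys

# Digest copied from the "Checkpoint SHA256 at upload" metadata entry.
EXPECTED_SHA256 = "5507d982c6958fa07755e1c7749f856563c592366af87b1f89d4e1f14f3bd353"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checkpoint(path, expected=EXPECTED_SHA256):
    """Return True if the file at `path` hashes to `expected` (hypothetical helper)."""
    return sha256_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Usage (assumed local path): python verify_ckpt.py path/to/ckpt_last.pth.tar
    checkpoint_path = sys.argv[1]
    if verify_checkpoint(checkpoint_path):
        print("checksum OK")
    else:
        print("checksum MISMATCH")
        sys.exit(1)
```

Checking the digest before loading guards against truncated downloads and against accidentally picking up one of the older checkpoint files preserved in this repository.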