|
|
| ======================================================================== |
| WORKER 0 |
| ======================================================================== |
|
|
| ======================================================================== |
| EXP075 WORKER 0 on GPU 0 (NVIDIA A40) |
| ======================================================================== |
| Timestamp: 2026-03-28T15:19:06.860207+00:00 |
| VRAM: 47.7 GB |
|
|
| Data split: files [0:818) = 818/3275 files, ~208M positions |
| /usr/local/lib/python3.11/dist-packages/torch/nn/modules/transformer.py:307: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.norm_first was True |
| warnings.warn(f"enable_nested_tensor is True, but self.use_nested_tensor is False because {why_not_sparsity_fast_path}") |
| Model: 204.0M params |
| Downloading best_model.pt from avewright/chess-transformer-200m-v2... |
| Loaded v2 weights OK (816 MB) |
| Downloading eval data: data/test-00000-of-00001.parquet... |
| Tokenized 488,309 eval candidates, using 5,000 |
| Eval ready: 5000 positions |
| StreamingHFChessLoader: 818 parquet files (src), est ~208M positions, batch=256, device=cuda:0, rev=a9dfd59e |
| Training: ~208M positions, ~202,902 opt steps |
| Batch: 256 x accum=4 (eff=1024), lr=0.0001 |
| Warmup: 2029 steps, sync every 500 steps |
|
|
| Initial evaluation... |
| Loaded model: acc=16.3% top3=41.8% sf_rank=66.6 val=78.5% |
| endgame: 18.1% |
| middlegame: 17.9% |
| opening: 15.9% |
| Saved baseline as best_model.pt (acc=16.3%) |
|
|
| ------------------------------------------------------------------------ |
| [W0] Training started (~208M positions, ~202,902 opt steps) |
| ------------------------------------------------------------------------ |
|
|
|
|
| ======================================================================== |
| WORKER 1 |
| ======================================================================== |
|
|
| ======================================================================== |
| EXP075 WORKER 1 on GPU 1 (NVIDIA A40) |
| ======================================================================== |
| Timestamp: 2026-03-28T15:19:09.020936+00:00 |
| VRAM: 47.7 GB |
|
|
| Data split: files [818:1636) = 818/3275 files, ~208M positions |
| /usr/local/lib/python3.11/dist-packages/torch/nn/modules/transformer.py:307: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.norm_first was True |
| warnings.warn(f"enable_nested_tensor is True, but self.use_nested_tensor is False because {why_not_sparsity_fast_path}") |
| Model: 204.0M params |
| Downloading best_model.pt from avewright/chess-transformer-200m-v2... |
| Loaded v2 weights OK (816 MB) |
| StreamingHFChessLoader: 818 parquet files (src), est ~208M positions, batch=256, device=cuda:1, rev=a9dfd59e |
| Training: ~208M positions, ~202,902 opt steps |
| Batch: 256 x accum=4 (eff=1024), lr=0.0001 |
| Warmup: 2029 steps, sync every 500 steps |
|
|
| ------------------------------------------------------------------------ |
| [W1] Training started (~208M positions, ~202,902 opt steps) |
| ------------------------------------------------------------------------ |
|
|
|
|
| ======================================================================== |
| WORKER 2 |
| ======================================================================== |
|
|
| ======================================================================== |
| EXP075 WORKER 2 on GPU 2 (NVIDIA A40) |
| ======================================================================== |
| Timestamp: 2026-03-28T15:19:11.098770+00:00 |
| VRAM: 47.7 GB |
|
|
| Data split: files [1636:2454) = 818/3275 files, ~208M positions |
| /usr/local/lib/python3.11/dist-packages/torch/nn/modules/transformer.py:307: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.norm_first was True |
| warnings.warn(f"enable_nested_tensor is True, but self.use_nested_tensor is False because {why_not_sparsity_fast_path}") |
| Model: 204.0M params |
| Downloading best_model.pt from avewright/chess-transformer-200m-v2... |
| Loaded v2 weights OK (816 MB) |
| StreamingHFChessLoader: 818 parquet files (src), est ~208M positions, batch=256, device=cuda:2, rev=a9dfd59e |
| Training: ~208M positions, ~202,902 opt steps |
| Batch: 256 x accum=4 (eff=1024), lr=0.0001 |
| Warmup: 2029 steps, sync every 500 steps |
|
|
| ------------------------------------------------------------------------ |
| [W2] Training started (~208M positions, ~202,902 opt steps) |
| ------------------------------------------------------------------------ |
|
|
|
|
| ======================================================================== |
| WORKER 3 |
| ======================================================================== |
|
|
| ======================================================================== |
| EXP075 WORKER 3 on GPU 3 (NVIDIA A40) |
| ======================================================================== |
| Timestamp: 2026-03-28T15:19:13.165183+00:00 |
| VRAM: 47.7 GB |
|
|
| Data split: files [2454:3275) = 821/3275 files, ~209M positions |
| /usr/local/lib/python3.11/dist-packages/torch/nn/modules/transformer.py:307: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.norm_first was True |
| warnings.warn(f"enable_nested_tensor is True, but self.use_nested_tensor is False because {why_not_sparsity_fast_path}") |
| Model: 204.0M params |
| Downloading best_model.pt from avewright/chess-transformer-200m-v2... |
| Loaded v2 weights OK (816 MB) |
| StreamingHFChessLoader: 821 parquet files (src), est ~209M positions, batch=256, device=cuda:3, rev=a9dfd59e |
| Training: ~209M positions, ~203,646 opt steps |
| Batch: 256 x accum=4 (eff=1024), lr=0.0001 |
| Warmup: 2036 steps, sync every 500 steps |
|
|
| ------------------------------------------------------------------------ |
| [W3] Training started (~209M positions, ~203,646 opt steps) |
| ------------------------------------------------------------------------ |
|
|
|
|