seconds-0 committed on
Commit 5f89419 · verified · 1 parent: f82c4e8

Upload README.md with huggingface_hub
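The commit message indicates the README was pushed with the `huggingface_hub` library. A minimal sketch of that call, using the library's `upload_file` API; the repository id below is a placeholder, since the actual repo id is not shown on this page:

```python
def readme_upload_kwargs(repo_id: str) -> dict:
    """Build the arguments for uploading README.md as a single-file commit."""
    return {
        "path_or_fileobj": "README.md",   # local file to upload
        "path_in_repo": "README.md",      # destination path in the repo
        "repo_id": repo_id,               # placeholder, e.g. "user/model-name"
        "commit_message": "Upload README.md with huggingface_hub",
    }

def push_readme(repo_id: str):
    """Create a commit on the Hub containing only README.md."""
    # Lazy import so the helper above can be inspected without the library installed.
    from huggingface_hub import upload_file
    return upload_file(**readme_upload_kwargs(repo_id))
```

Calling `push_readme("user/model-name")` (with a valid token configured) would produce a single-file commit like the one shown here.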

Files changed (1)
  1. README.md +3 -3
README.md CHANGED

```diff
@@ -13,7 +13,7 @@ tags:
 datasets:
 - arc-prize-2025
 model-index:
-- name: Tiny Recursive Models — ARC-AGI-2 (Step 72,385)
+- name: Tiny Recursive Models — ARC-AGI-2
   results:
   - task:
       type: program-synthesis
@@ -34,9 +34,9 @@ model-index:
       value: 0.9070
 ---
 
-# Tiny Recursive Models — ARC-AGI-2 (8×GPU, Step 72,385)
+# Tiny Recursive Models — ARC-AGI-2 (8×GPU)
 
-**Abstract.** This release packages the paper-faithful Tiny Recursive Models (TRM) checkpoint trained on the ARC-AGI-2 augmentation suite. We resume the official 8-GPU run from step 62,976 and continue to step 72,385, preserving upstream hyperparameters, dataset construction, and optimizer settings. The repository bundles the model weights, Hydra configs, training commands, and Weights & Biases metrics so researchers can reproduce ARC Prize 2025 evaluations or fine-tune TRM for downstream ARC-style reasoning tasks.
+**Abstract.** This release packages a Tiny Recursive Models (TRM) checkpoint trained on the ARC-AGI-2 augmentation suite using the paper-faithful configuration (which targets 100,000 training steps). This particular checkpoint comes from an interrupted training run, captured at step 72,385 after resuming from step 62,976, and preserves upstream hyperparameters, dataset construction, and optimizer settings. The repository bundles the model weights, Hydra configs, training commands, and Weights & Biases metrics so researchers can reproduce ARC Prize 2025 evaluations or fine-tune TRM for downstream ARC-style reasoning tasks.
 
 **Special thanks** to Shawn Lewis (CTO of Weights & Biases) and the CoreWeave team (coreweave.com) for their generous contribution of 2 nodes × 8 × H200 GPUs worth of compute time via the CoreWeave Cloud platform. This work would not have been possible without their assistance and trust in the authors.
```