seconds-0 committed on
Commit 5f89419 · verified · 1 parent: f82c4e8

Upload README.md with huggingface_hub
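The commit message indicates the README was pushed with the `huggingface_hub` library. A minimal sketch of that call, using the library's `upload_file` API; the repository id below is a placeholder, since the actual repo id is not shown on this page:

```python
def readme_upload_kwargs(repo_id: str) -> dict:
    """Build the arguments for uploading README.md as a single-file commit."""
    return {
        "path_or_fileobj": "README.md",   # local file to upload
        "path_in_repo": "README.md",      # destination path in the repo
        "repo_id": repo_id,               # placeholder, e.g. "user/model-name"
        "commit_message": "Upload README.md with huggingface_hub",
    }

def push_readme(repo_id: str):
    """Create a commit on the Hub containing only README.md."""
    # Lazy import so the helper above can be inspected without the library installed.
    from huggingface_hub import upload_file
    return upload_file(**readme_upload_kwargs(repo_id))
```

Calling `push_readme("user/model-name")` (with a valid token configured) would produce a single-file commit like the one shown here.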

Files changed (1)
  1. README.md +3 -3
README.md CHANGED

```diff
@@ -13,7 +13,7 @@ tags:
 datasets:
 - arc-prize-2025
 model-index:
-- name: Tiny Recursive Models — ARC-AGI-2 (Step 72,385)
+- name: Tiny Recursive Models — ARC-AGI-2
   results:
   - task:
       type: program-synthesis
@@ -34,9 +34,9 @@ model-index:
       value: 0.9070
 ---
 
-# Tiny Recursive Models — ARC-AGI-2 (8×GPU, Step 72,385)
+# Tiny Recursive Models — ARC-AGI-2 (8×GPU)
 
-**Abstract.** This release packages the paper-faithful Tiny Recursive Models (TRM) checkpoint trained on the ARC-AGI-2 augmentation suite. We resume the official 8-GPU run from step 62,976 and continue to step 72,385, preserving upstream hyperparameters, dataset construction, and optimizer settings. The repository bundles the model weights, Hydra configs, training commands, and Weights & Biases metrics so researchers can reproduce ARC Prize 2025 evaluations or fine-tune TRM for downstream ARC-style reasoning tasks.
+**Abstract.** This release packages a Tiny Recursive Models (TRM) checkpoint trained on the ARC-AGI-2 augmentation suite using the paper-faithful configuration (which targets 100,000 training steps). This particular checkpoint comes from an interrupted training run, captured at step 72,385 after resuming from step 62,976, and preserves upstream hyperparameters, dataset construction, and optimizer settings. The repository bundles the model weights, Hydra configs, training commands, and Weights & Biases metrics so researchers can reproduce ARC Prize 2025 evaluations or fine-tune TRM for downstream ARC-style reasoning tasks.
 
 **Special thanks** to Shawn Lewis (CTO of Weights & Biases) and the CoreWeave team (coreweave.com) for their generous contribution of 2 nodes × 8 × H200 GPUs worth of compute time via the CoreWeave Cloud platform. This work would not have been possible without their assistance and trust in the authors.
```