seconds-0 committed on
Commit b011c40 · verified · 1 parent: 7ba83da

Drop stale README backup

Files changed (1): README.md.bak +0 -165
README.md.bak DELETED
@@ -1,165 +0,0 @@
---
library_name: pytorch
license: mit
pipeline_tag: other
tags:
- arc-prize-2025
- program-synthesis
- tiny-recursive-models
- recursive-reasoning
- kaggle
- act
- reproducibility
datasets:
- arc-prize-2025
model-index:
- name: Tiny Recursive Models — ARC-AGI-2
  results:
  - task:
      type: program-synthesis
      name: ARC Prize 2025
    dataset:
      name: ARC Prize 2025 Public Evaluation
      type: arc-prize-2025
      split: evaluation
    metrics:
    - type: accuracy
      name: Accuracy
      value: 0.6283
    - type: loss
      name: LM Loss
      value: 2.0186
    - type: accuracy
      name: Halt Accuracy
      value: 0.9070
---

# Tiny Recursive Models — ARC-AGI-2 (8×GPU)

**Abstract.** This release packages the paper-faithful Tiny Recursive Models (TRM) checkpoint trained on the ARC-AGI-2 augmentation suite. We resume the official 8-GPU run from step 62,976 and continue to step 72,385, preserving upstream hyperparameters, dataset construction, and optimizer settings. The repository bundles the model weights, Hydra configs, training commands, and Weights & Biases metrics so researchers can reproduce ARC Prize 2025 evaluations or fine-tune TRM for downstream ARC-style reasoning tasks.

**Special thanks** to Shawn Lewis (CTO of Weights & Biases) and the CoreWeave team (coreweave.com) for generously contributing two nodes of 8× H200 GPUs on the CoreWeave Cloud platform. This work would not have been possible without their assistance and trust in the authors.

**Note on authorship.** All engineering, documentation, and packaging work in this reproduction project was completed with the assistance of coding-oriented large language models operating under human supervision. The models handled end-to-end implementation, from training orchestration and dataset packaging to documentation and publishing, while humans provided oversight, safety validation, and access control.

## Model Summary
- **Architecture**: Tiny Recursive Model (TRM) with ACT V1 controller:
  `L_layers=2`, `H_cycles=3`, `L_cycles=4`, hidden size 512, 8 attention heads, RoPE positional encodings, bfloat16 activations.
- **Checkpoint**: `model.ckpt` captured after **72,385** optimizer steps while training on the ARC-AGI-2 augmentation suite (`arc2concept-aug-1000`).
- **Upstream Commit**: `e7b68717f0a6c4cbb4ce6fbef787b14f42083bd9` (SamsungSAILMontreal/TinyRecursiveModels).
- **Optimizer**: Adam-atan2 variant (`beta1=0.9`, `beta2=0.95`, `weight_decay=0.1`, global batch size 768).
- **License**: MIT (inherits upstream TRM license).

This release reproduces the ARC-AGI-2 configuration described in the TRM paper using the officially provided dataset builder and training recipe. It is the same checkpoint published for Kaggle inference, packaged here for broader research use.

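As a quick sanity check on the recursion budget above, an illustrative calculation, assuming each of the `H_cycles × L_cycles` recursion steps applies all `L_layers` transformer blocks (the exact composition is defined by the upstream TRM code):

```python
# Illustrative arithmetic only: how many transformer-block applications
# one ACT step amounts to, under the assumption that every recursion
# step runs all L_layers blocks.
L_layers, H_cycles, L_cycles = 2, 3, 4

block_applications = H_cycles * L_cycles * L_layers
print(block_applications)  # 24 block applications per ACT step
```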
## Files Included
| Path | Description |
| --- | --- |
| `model.ckpt` | PyTorch checkpoint (fp32/bf16 mix) containing model + optimizer state. |
| `ENVIRONMENT.txt` | Hydra-resolved configuration used for the run (mirrors `all_config.yaml`). |
| `COMMANDS.txt` | Launch command showing the exact training flags. |
| `COMMANDS_resumed.txt` | Resume command showing the restart from step 62,976. |
| `TRM_COMMIT.txt` | Git SHA of the TinyRecursiveModels source at training time. |
| `all_config.yaml` | Full structured config exported from the training job. |
| `step_72385.zip` | Raw checkpoint directory as produced by the trainer (weights, EMA, optimizer). |
| `wandb_ljxzfy3z_history.csv` / `wandb_ljxzfy3z_summary.json` | Captured metrics from Weights & Biases run `Arc2concept-aug-1000-ACT-torch/ljxzfy3z`. |

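The exported W&B history can be inspected with the standard library alone; a minimal sketch, assuming the CSV uses the metric names reported below (e.g. `all/accuracy`) as column headers:

```python
import csv
import io

def last_logged(history_csv: str, metric: str) -> float:
    """Return the last non-empty value of `metric` from a W&B history CSV."""
    rows = list(csv.DictReader(io.StringIO(history_csv)))
    values = [row[metric] for row in rows if row.get(metric)]
    return float(values[-1])

# Synthetic two-row example standing in for wandb_ljxzfy3z_history.csv:
demo = "_step,all/accuracy\n62976,0.61\n72385,0.704\n"
print(last_logged(demo, "all/accuracy"))  # 0.704
```

For the real export, pass `Path("wandb_ljxzfy3z_history.csv").read_text()` instead of the synthetic string.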
## Intended Use & Limitations
- **Primary use**: Research on ARC-AGI-style program synthesis and evaluation, benchmarking Tiny Recursive Models, and reproducing Kaggle ARC Prize 2025 submissions.
- **Downstream evaluation**: Pair with the official ARC Prize 2025 evaluation set or ARC-AGI-2 validation splits.
- **Misuse**: The checkpoint is not designed for domains outside program synthesis. No safety mitigations are baked in; users are responsible for verifying results before deployment.
- **Limitations**: Performance is capped by the paper-faithful hyperparameters; there is no fine-tuning on ARC-AGI-1. As an ACT model, inference cost varies per puzzle and can be high on longer tasks.

## Training Procedure
- **Data**: `data/arc2concept-aug-1000`, constructed via `python -m dataset.build_arc_dataset --subsets training2 evaluation2 concept --test-set-name evaluation2`.
- **Hardware**: 8× NVIDIA H100 (80 GB) GPUs, torch distributed launch with gradient accumulation to reach global batch size 768.
- **Precision**: Mixed bfloat16 compute with fp32 master weights; EMA enabled (`ema_rate=0.999`).
- **Duration**: 72,385 total optimizer steps (~85,900 s runtime), resumed from checkpoint `step_62976`.
- **Scheduler**: Constant LR 1e-4 (warmup completed before resume); cosine decay disabled (`lr_min_ratio=1.0`).

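The batch-size bookkeeping above can be sketched as follows; only the global batch of 768 and the 8-GPU count come from this run, while the per-GPU micro-batch of 24 is a hypothetical value for illustration:

```python
global_batch = 768
num_gpus = 8
micro_batch = 24  # hypothetical per-GPU micro-batch, for illustration only

per_gpu_batch = global_batch // num_gpus    # samples each GPU contributes per optimizer step
accum_steps = per_gpu_batch // micro_batch  # forward/backward passes before each optimizer step
print(per_gpu_batch, accum_steps)  # 96 4
```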
### Key Training Metrics (Weights & Biases)
- `all/accuracy`: **0.704**
- `all/lm_loss`: **1.70**
- `all/q_halt_accuracy`: **0.799**
- `ARC/pass@1`: **1.67 %**
- `ARC/pass@10`: **5.83 %**
- `ARC/pass@100`: **8.19 %**
- `ARC/pass@1000`: **13.75 %**

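For context, `pass@k` over n sampled attempts with c successes is commonly reported with the unbiased estimator 1 − C(n−c, k)/C(n, k); a generic sketch, not necessarily the exact aggregation used by the TRM evaluator:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate from n samples with c successes."""
    if n - c < k:
        # Fewer than k failures: every size-k subset contains a success.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# E.g. 20 correct samples out of 1000 gives pass@1 = 0.02.
print(round(pass_at_k(n=1000, c=20, k=1), 4))  # 0.02
```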
## Evaluation
- **ARC Prize 2025 public evaluation (Kaggle GPU)**
  - Accuracy: **0.6283**
  - LM Loss: **2.0186**
  - Halt accuracy: **0.907**
- Evaluator script: `TinyRecursiveModels/evaluators/arc.py` with the default two-attempt submission writer.
- Submission artifact: `/kaggle/working/trm_eval_outputs/evaluator_ARC_step_72385/submission.json`.

## How to Use
Install TinyRecursiveModels (commit above) and load the checkpoint via PyTorch:

```python
from pathlib import Path

import torch

from recursive_reasoning.trm import TinyRecursiveReasoningModel_ACTV1
from recursive_reasoning.utils.checkpoint import load_trm_checkpoint


def load_trm(weights_path: str) -> TinyRecursiveReasoningModel_ACTV1:
    ckpt = torch.load(weights_path, map_location="cpu")
    model_cfg = ckpt["hyperparameters"]["arch"]
    model = TinyRecursiveReasoningModel_ACTV1(**model_cfg)
    load_trm_checkpoint(model, ckpt, strict=True)
    model.eval()
    return model


weights = Path("model.ckpt")  # replace with hf_hub_download path if needed
model = load_trm(str(weights))
```

To fetch the checkpoint programmatically:

```python
from huggingface_hub import hf_hub_download

ckpt_path = hf_hub_download(
    repo_id="seconds0/trm-arc2-8gpu",
    filename="model.ckpt",
    repo_type="model",
)
```

For Kaggle inference, reuse `kaggle/trm_arc2_inference_notebook.py` (packaged separately) and replace the dataset mount with `hf_hub_download`.

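A lightweight structural check for a generated `submission.json` can catch malformed outputs before upload. A sketch, assuming the ARC Prize 2025 layout of task id → list of attempt dicts with `attempt_1`/`attempt_2` grids; verify against the competition's submission spec before relying on it:

```python
def check_submission(sub: dict) -> list[str]:
    """Return a list of structural problems found in an ARC-style submission."""
    problems = []
    for task_id, attempts in sub.items():
        if not isinstance(attempts, list):
            problems.append(f"{task_id}: expected a list of attempt dicts")
            continue
        for i, att in enumerate(attempts):
            if not isinstance(att, dict):
                problems.append(f"{task_id}[{i}]: expected a dict")
                continue
            for key in ("attempt_1", "attempt_2"):
                grid = att.get(key)
                if not (isinstance(grid, list) and all(isinstance(r, list) for r in grid)):
                    problems.append(f"{task_id}[{i}].{key}: expected a grid (list of lists)")
    return problems

demo = {"00576224": [{"attempt_1": [[0, 1]], "attempt_2": [[1, 0]]}]}
print(check_submission(demo))  # []
```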
## Reproducibility Checklist
- ✅ ARC-AGI-2 data builder command versioned in the repository.
- ✅ Training invocation and config saved (`COMMANDS.txt`, `COMMANDS_resumed.txt`, `ENVIRONMENT.txt`, `all_config.yaml`).
- ✅ Upstream commit recorded (`TRM_COMMIT.txt`).
- ✅ W&B metrics exported for independent verification.
- ✅ Checkpoint archive (`step_72385.zip`) matches `model.ckpt` contents (torch + EMA).

## Citation & Acknowledgements
If you use this model, please cite the Tiny Recursive Models paper and the ARC Prize competition:

```
@misc{shridhar2025trm,
  title         = {Tiny Recursive Models},
  author        = {Shridhar, Mohit and others},
  year          = {2025},
  eprint        = {2502.12345},
  archivePrefix = {arXiv}
}

@misc{arcprize2025,
  title        = {ARC Prize 2025},
  howpublished = {https://www.kaggle.com/competitions/arc-prize-2025}
}
```

- Upstream TRM repository: https://github.com/SamsungSAILMontreal/TinyRecursiveModels
- Tiny Recursive Models paper: https://arxiv.org/abs/2502.12345

## Responsible AI Considerations
- **Bias**: The ARC-AGI corpus reflects synthetic puzzle distributions; extrapolation to human-generated tasks may degrade.
- **Safety**: No harmful content is generated, but downstream automation (e.g., code execution) should be sandboxed.
- **Data Privacy**: Training and evaluation use public ARC datasets; no personal data involved.

---