	---
	license: apache-2.0
	language:
	- en
	tags:
	- floydnet
	- diffusion-models
	- ARC-AGI
	---


	# FloydARC (ARC-AGI Reasoning)

	## Model Summary

	FloydARC is a neural algorithmic reasoning model adapted from FloydNet for the ARC-AGI benchmark.
	This checkpoint is trained primarily on ARC-style synthetic and curated data rather than large-scale web data, and it solves ARC tasks via iterative refinement and test-time adaptation.

	Among models trained mainly on ARC-like data, FloydARC achieves state-of-the-art results on both ARC-AGI-1 (70.5) and ARC-AGI-2 (15.3), narrowing the gap to very large proprietary models.

	---

	## Performance

	FloydARC demonstrates strong generalization on ARC benchmarks under standard evaluation protocols.
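
ARC benchmarks are commonly scored by exact match: a test output counts as solved only if a submitted grid equals the target cell for cell, and the public ARC-AGI protocol allows two attempts per test input. The sketch below assumes that two-attempt scoring. This card does not state the exact protocol behind its reported numbers, and `task_score` is a hypothetical helper, not part of the FloydARC codebase.

```python
def task_score(attempts_per_test, targets, max_attempts=2):
    """Fraction of test outputs matched exactly within max_attempts tries.

    attempts_per_test: one list of candidate grids per test input.
    targets: one ground-truth grid per test input.
    """
    solved = 0
    for attempts, target in zip(attempts_per_test, targets):
        # Grids are lists of rows, so == compares them cell for cell.
        if any(attempt == target for attempt in attempts[:max_attempts]):
            solved += 1
    return solved / len(targets)
```

A benchmark score under this assumption would be the mean of `task_score` over all evaluation tasks, expressed as a percentage.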

	ARC-AGI benchmark results:

	| Model | #Params | ARC-AGI-1 | ARC-AGI-2 |
	| ------------ | ------: | --------: | --------: |
	| VARC | 73M | 60.4 | 11.1 |
	| Loop-ViT | 11.2M | 61.2 | 10.3 |
	| HRM | 27M | 40.3 | 5.0 |
	| FloydARC | 153.7M | 70.5 | 15.3 |



	---

	## Model Details

	* Model ID: `ocxlabs/FloydARC`
	* Task: Abstraction and Reasoning Corpus (ARC-AGI)
	* Architecture: FloydNet-based global relational reasoning with looped refinement
	* Input / Output: ARC grid-based visual reasoning (query canvas → predicted answer canvas)
	* License: Apache 2.0

	---

	## Usage: Inference & Evaluation

	This checkpoint is intended for research and evaluation use on ARC-AGI. Full reproduction of reported results requires multi-GPU inference with test-time training.

	### 1. Download checkpoint

	Download the pretrained checkpoint from Hugging Face:

	```
	https://huggingface.co/ocxlabs/FloydARC
	```

	Place the downloaded folder anywhere on disk and pass its path via `--ckpt_path`.
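
One way to fetch the files programmatically is the `huggingface_hub` client. The sketch below assumes that package is installed and that the repository is publicly accessible. The `./floydarc_ckpt` target directory and the `download_checkpoint` helper are illustrative names, not part of the FloydARC codebase.

```python
from pathlib import Path

REPO_ID = "ocxlabs/FloydARC"  # model ID from this card


def download_checkpoint(local_dir="./floydarc_ckpt"):
    """Download every file in the model repo and return the local folder."""
    # Imported lazily so this helper can be defined without the dependency.
    from huggingface_hub import snapshot_download

    path = snapshot_download(repo_id=REPO_ID, local_dir=local_dir)
    return Path(path)
```

Calling `download_checkpoint()` returns the folder to pass via `--ckpt_path`.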

	---

	### 2. Prepare ARC evaluation data

	Place the original ARC JSON files under `rawdata/`, then preprocess:

	```bash
	# Preprocess the ARC-AGI-1 evaluation set
	python -m scripts.process_data \
	    --input_dir ./rawdata/ARC-AGI-1_evaluation/ \
	    --output_dir ./preprocessed/arc1 \
	    --split test
	```

	Repeat with `ARC-AGI-2_evaluation` for ARC-AGI-2.
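
Before preprocessing, it can help to confirm that the raw files parse. The sketch below assumes the public ARC task layout, where each JSON file holds `train` and `test` lists of `input`/`output` grids (lists of rows of integers). `load_arc_tasks` and `grid_shape` are hypothetical helpers and do not reflect what `scripts.process_data` does internally.

```python
import json
from pathlib import Path


def load_arc_tasks(input_dir):
    """Return {task_id: task_dict} for every *.json file in input_dir."""
    tasks = {}
    for path in sorted(Path(input_dir).glob("*.json")):
        with path.open() as f:
            task = json.load(f)
        # Each task is expected to carry demonstration and test pairs.
        if "train" not in task or "test" not in task:
            raise ValueError(f"{path.name}: missing 'train' or 'test' key")
        tasks[path.stem] = task
    return tasks


def grid_shape(grid):
    """Return (rows, cols) of a grid stored as a list of rows."""
    rows = len(grid)
    cols = len(grid[0]) if grid else 0
    return rows, cols
```

For example, `len(load_arc_tasks("./rawdata/ARC-AGI-1_evaluation/"))` gives the number of task files found in that folder.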

	---

	### 3. Run inference with Test-Time Training (recommended)

	```bash
	# Run inference with test-time training on ARC-AGI-1
	python -m scripts.TTT \
	    --ckpt_path /path/to/floydarc_ckpt \
	    --subset arc1 \
	    --output_dir ./output/TTT_results
	```

	Notes:

	* Default configuration uses 8 GPUs on a single node
	* LoRA-based TTT is enabled by default and recommended
	* For ARC-AGI-2, set `--subset arc2`
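
For context, LoRA adapts a frozen weight matrix by learning a low-rank update. The formulation below is the standard one from the LoRA literature. The symbols, the rank $r$, and the scale $\alpha$ are generic, and this card does not specify which layers or hyperparameters FloydARC's TTT uses.

$$
W' = W_0 + \frac{\alpha}{r} B A, \qquad B \in \mathbb{R}^{d \times r},\; A \in \mathbb{R}^{r \times k},\; r \ll \min(d, k)
$$

Only $A$ and $B$ receive gradients, so the pretrained weight $W_0$ stays fixed and the number of trainable parameters per adapted layer drops from $dk$ to $r(d + k)$.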

	---

	### 4. Ensembling & visualization

	For reproducible evaluation and qualitative inspection:

	```bash
	# Analyze TTT results and write an HTML report
	python -m scripts.analyze \
	    --result-folder ./output/TTT_results \
	    --subset arc1 \
	    --out-html output/arc1_results.html
	```

	Multiple result folders can be passed to enable max-voting ensembling.
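
As a rough illustration of max-voting, the sketch below picks, for each task, the grid predicted most often across runs. It assumes predictions are available as `{task_id: grid}` dictionaries, one per result folder. That layout and the `max_vote` helper are hypothetical, and `scripts.analyze` may break ties or rank candidates differently.

```python
from collections import Counter


def to_key(grid):
    """Convert a list-of-lists grid into a hashable tuple of tuples."""
    return tuple(tuple(row) for row in grid)


def max_vote(runs):
    """Return {task_id: most frequent grid} across a list of prediction dicts."""
    votes = {}
    for run in runs:
        for task_id, grid in run.items():
            votes.setdefault(task_id, Counter())[to_key(grid)] += 1
    ensembled = {}
    for task_id, counter in votes.items():
        # most_common(1) returns the top entry; ties go to the grid seen first.
        best_key, _ = counter.most_common(1)[0]
        ensembled[task_id] = [list(row) for row in best_key]
    return ensembled
```

Voting on whole grids rather than individual cells keeps each ensembled answer identical to a grid that some run actually produced.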