---
license: apache-2.0
language:
- en
tags:
- floydnet
- diffusion-models
- ARC-AGI
---

# FloydARC (ARC-AGI Reasoning)

## Model Summary

**FloydARC** is a neural algorithmic reasoning model adapted from FloydNet for the **ARC-AGI** benchmark.
This checkpoint is trained primarily on ARC-style synthetic and curated data, and is designed to solve ARC tasks via **iterative refinement and test-time adaptation**, rather than large-scale web pretraining.

Among models trained mainly on ARC-like data, FloydARC achieves **state-of-the-art performance** on both ARC-AGI-1 and ARC-AGI-2, significantly narrowing the gap to very large proprietary models.

---

## Performance

FloydARC demonstrates strong generalization on ARC benchmarks under standard evaluation protocols.

**ARC-AGI benchmark results:**

| Model        | #Params | ARC-AGI-1 | ARC-AGI-2 |
| ------------ | ------: | --------: | --------: |
| VARC         |     73M |      60.4 |      11.1 |
| Loop-ViT     |   11.2M |      61.2 |      10.3 |
| HRM          |     27M |      40.3 |       5.0 |
| **FloydARC** |  153.7M |  **70.5** |  **15.3** |

---

## Model Details

* **Model ID**: `ocxlabs/FloydARC`
* **Task**: Abstraction and Reasoning Corpus (ARC-AGI)
* **Architecture**: FloydNet-based global relational reasoning with looped refinement
* **Input / Output**: ARC grid-based visual reasoning (query canvas → predicted answer canvas)
* **License**: Apache 2.0

---

## Usage: Inference & Evaluation

This checkpoint is intended for **research and evaluation use** on ARC-AGI.
Full reproduction of reported results requires multi-GPU inference with test-time training.

### 1. Download checkpoint

Download the pretrained checkpoint from Hugging Face:

```
https://huggingface.co/ocxlabs/FloydARC
```

Place the downloaded folder anywhere on disk and pass its path via `--ckpt_path`.

---

### 2. Prepare ARC evaluation data

Place the original ARC JSON files under `rawdata/`, then preprocess:

```bash
python -m scripts.process_data \
  --input_dir ./rawdata/ARC-AGI-1_evaluation/ \
  --output_dir ./preprocessed/arc1 \
  --split test
```

Repeat with `ARC-AGI-2_evaluation` for ARC-AGI-2.

---

### 3. Run inference with Test-Time Training (recommended)

```bash
python -m scripts.TTT \
  --ckpt_path /path/to/floydarc_ckpt \
  --subset arc1 \
  --output_dir ./output/TTT_results
```

Notes:

* Default configuration uses **8 GPUs on a single node**
* LoRA-based TTT is enabled by default and recommended
* For ARC-AGI-2, set `--subset arc2`

---

### 4. Ensembling & visualization

For reproducible evaluation and qualitative inspection:

```bash
python -m scripts.analyze \
  --result-folder ./output/TTT_results \
  --subset arc1 \
  --out-html output/arc1_results.html
```

Multiple result folders can be passed to enable max-voting ensembling.
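
To give a rough idea of what max-voting over several runs could look like, the sketch below counts identical predicted grids for one test input and keeps the most frequent ones. It is an illustration only, not the logic of `scripts.analyze`: the function name `max_vote`, the `top_k` parameter, and the representation of a prediction as a nested list of integers are all assumptions made for this example.

```python
from collections import Counter
from typing import List, Sequence

# Assumed representation: one ARC grid is a list of rows of integer colors.
Grid = List[List[int]]


def max_vote(predictions: Sequence[Grid], top_k: int = 2) -> List[Grid]:
    """Return the top_k most frequently predicted grids (hypothetical helper).

    Grids are compared by exact equality. Ties are broken by first
    appearance, which is the ordering Counter.most_common provides.
    """
    # Convert each grid to nested tuples so it can be used as a Counter key.
    counts = Counter(tuple(tuple(row) for row in grid) for grid in predictions)
    # Convert the winning keys back to plain nested lists.
    return [[list(row) for row in key] for key, _ in counts.most_common(top_k)]


if __name__ == "__main__":
    # Predictions for the same test input, e.g. collected from several result folders.
    run_a = [[1, 0], [0, 1]]
    run_b = [[2, 2], [2, 2]]
    print(max_vote([run_a, run_b, run_a], top_k=1))
```

Exact-match voting is a natural fit here because ARC answers are scored on whole-grid correctness, so partially overlapping grids are treated as distinct candidates.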