| --- |
| license: mit |
| --- |
| |
| # 🧠 NeuroGolf 2026: Ultra-Efficient ARC-AGI Solver |
|
|
| ## 📌 Overview |
| This repository contains the implementation of **NeuroGolf 2026**, an ultra-efficient model for solving **Abstraction and Reasoning Corpus (ARC-AGI)** grid-transformation tasks. |
|
|
| The project aims to maximize reasoning capability while staying within the strict model-size limits required for competition submission. |
|
|
| --- |
|
|
| ## 🏁 Competition Constraints |
|
|
| The model is strictly optimized to meet the following requirements: |
|
|
| - **ONNX File Size Limit:** ≤ **1.44 MB** |
| - **Parameter Budget:** |
|   - ~360K parameters (Float32, 4 bytes per weight) |
|   - ~1.4M parameters (INT8 quantized, 1 byte per weight) |
| - **Input/Output Shape:** `(1, 10, 30, 30)` for both the input tensor and the output logits |
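The parameter figures above follow from dividing the size limit by the bytes each weight occupies. The sketch below reproduces that arithmetic and also shows one plausible way to fill a `(1, 10, 30, 30)` tensor from an ARC grid. It assumes 1 MB = 1,000,000 bytes, ignores ONNX graph overhead, and assumes a one-hot encoding over the 10 ARC colors with zero padding up to 30×30. The source does not state the encoding, and the names `max_parameters` and `encode_grid` are hypothetical.

```python
SIZE_LIMIT_BYTES = 1_440_000  # assumes 1 MB = 10**6 bytes; graph overhead ignored


def max_parameters(bytes_per_weight: int, limit: int = SIZE_LIMIT_BYTES) -> int:
    """Upper bound on weight count if the file held nothing but weights."""
    return limit // bytes_per_weight


def encode_grid(grid: list[list[int]], colors: int = 10, size: int = 30):
    """One-hot encode a grid into nested lists shaped (1, colors, size, size).

    Cells outside the grid stay all-zero (a padding choice assumed here).
    """
    tensor = [[[[0.0] * size for _ in range(size)] for _ in range(colors)]]
    for r, row in enumerate(grid):
        for c, color in enumerate(row):
            tensor[0][color][r][c] = 1.0
    return tensor


print(max_parameters(4))  # Float32, 4 bytes per weight -> 360000
print(max_parameters(1))  # INT8, 1 byte per weight -> 1440000
```

In practice the real budget is somewhat lower than these bounds, since the exported file also stores the graph structure and tensor metadata.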
|
|
| --- |
|
|
| ## 🏗️ Architecture |
|
|
| The system uses a **Teacher–Student Distillation framework** to compress high-level reasoning into a micro-scale deployable model. |
|
|
| ### 🧠 Mega-Teacher Model (`MegaTeacherARCNet`) |
| - **Purpose:** Captures complex patterns and logic across 400+ ARC tasks |
| - **Dimensions:** 512 hidden units, 16 residual blocks deep |
| - **Technique:** Standard convolutions + deep residual architecture for maximum pattern recognition |
|
|
| --- |
|
|
| ### ⚡ Student Model (`UltraTinyARCNet`) |
| - **Purpose:** Final deployable model optimized for strict size limits |
| - **Dimensions:** 56 hidden units, 5 residual blocks deep |
|
|
| #### 🔧 Key Techniques |
| - **Depthwise Separable Convolutions** → ~10× parameter reduction |
| - **No Bias Terms** → `bias=False` in `Conv2d` to reduce parameter count |
| - **Residual Blocks** → Maintain gradient flow in ultra-small networks |
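To make the saving from depthwise separable convolutions concrete, the sketch below counts bias-free weights for one layer in both forms. It treats the student's 56 hidden units as the channel width and assumes a 3×3 kernel; the kernel size is not stated in this README, and the function names are hypothetical.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution without bias."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise k x k conv plus a 1 x 1 pointwise conv, no bias."""
    depthwise = c_in * k * k  # one k x k filter per input channel
    pointwise = c_in * c_out  # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


channels, kernel = 56, 3  # 56 is from the README; the 3 x 3 kernel is assumed
standard = standard_conv_params(channels, channels, kernel)
separable = depthwise_separable_params(channels, channels, kernel)
print(standard, separable, round(standard / separable, 2))  # 28224 3640 7.75
```

Under these assumptions the per-layer reduction is about 7.75×. With equal input and output channels the ratio approaches k² as the channel count grows (9× for a 3×3 kernel), so a figure near 10× would correspond to wider layers or larger kernels. In PyTorch, the depthwise step is typically written as a `Conv2d` with `groups` equal to the input channel count.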
|
|
| --- |
|
|
| ## 🔄 Training Pipeline |
|
|
| ### 1️⃣ Teacher Training |
| - Train Mega-Teacher for **50 epochs** |
| - Dataset: Full **400+ ARC tasks** |
| - Augmentation: **8× (rotations + flips)** |
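The 8× augmentation corresponds to the eight symmetries of a square: four rotations, each with and without a mirror flip. A minimal, framework-free sketch on grids stored as lists of lists follows; the helper names are hypothetical and the project's actual augmentation code may differ.

```python
Grid = list[list[int]]


def rotate90(grid: Grid) -> Grid:
    """Rotate a grid 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def flip_horizontal(grid: Grid) -> Grid:
    """Mirror a grid left to right."""
    return [row[::-1] for row in grid]


def dihedral_variants(grid: Grid) -> list[Grid]:
    """Return the 8 variants: 4 rotations, each with and without a flip."""
    variants = []
    current = [list(row) for row in grid]
    for _ in range(4):
        variants.append(current)
        variants.append(flip_horizontal(current))
        current = rotate90(current)
    return variants
```

When augmenting a training pair, the same transform has to be applied to both the input grid and the target grid so that the underlying rule stays consistent.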
|
|
| --- |
|
|
| ### 2️⃣ Knowledge Distillation |
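| - Student learns from the teacher’s **soft probability distributions** rather than from hard labels alone |
| - Transfers **“dark knowledge”**: the relative probabilities the teacher assigns to the incorrect classes |
| - Aims for better generalization than hard-label training |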
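A common way to train on soft targets is to blend a temperature-scaled KL divergence against the teacher with ordinary cross-entropy against the true label. The sketch below computes that loss for a single grid cell in plain Python. The temperature of 2.0 and the mixing weight `alpha` of 0.5 are illustrative assumptions, since this README does not state the values used, and the function names are hypothetical.

```python
import math


def softmax(logits: list[float], temperature: float = 1.0) -> list[float]:
    """Temperature-scaled softmax; higher temperatures give softer outputs."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, target,
                      temperature=2.0, alpha=0.5):
    """Blend soft-target KL divergence with hard-label cross-entropy."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student) over the softened distributions
    kl = sum(t * math.log(t / s) for t, s in zip(p_teacher, p_student) if t > 0)
    # Standard cross-entropy against the true class at temperature 1
    hard = -math.log(softmax(student_logits)[target])
    # Scaling by T**2 keeps the soft-term gradients on a comparable scale
    return alpha * temperature**2 * kl + (1 - alpha) * hard
```

For the `(1, 10, 30, 30)` outputs described earlier, this per-cell loss would be averaged over all 30×30 positions, with the 10 channel logits at each position playing the role of `student_logits` and `teacher_logits`.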
|
|
| --- |
|
|
| ### 3️⃣ Pruning & Fine-Tuning |
|
|
| #### ✂️ Pruning |
| - Remove **30–35%** of low-magnitude weights |
| - Method: **L1 unstructured pruning** |
| - Ensures ONNX file remains under **1.44 MB** |
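L1 unstructured pruning ranks individual weights by absolute value and zeroes the smallest fraction, regardless of which filter or channel they belong to. The sketch below shows the idea on a flat list of weights. It is a plain-Python illustration with a hypothetical function name, not the project's code; PyTorch provides a utility for this as `torch.nn.utils.prune.l1_unstructured`.

```python
def l1_unstructured_prune(weights: list[float], amount: float) -> list[float]:
    """Zero the `amount` fraction of weights with the smallest absolute value."""
    n_prune = round(amount * len(weights))
    # Indices sorted by magnitude; the first n_prune are the ones to remove.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:n_prune])
    return [0.0 if i in pruned else w for i, w in enumerate(weights)]


weights = [0.8, -0.05, 0.3, -0.9, 0.01, 0.4, -0.2, 0.6, 0.07, -0.5]
print(l1_unstructured_prune(weights, amount=0.3))
# [0.8, 0.0, 0.3, -0.9, 0.0, 0.4, -0.2, 0.6, 0.0, -0.5]
```

Zeroed weights still occupy space in a dense tensor, so how much the exported file shrinks depends on how the weights are stored or compressed after pruning.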
|
|
| #### 🔧 Fine-Tuning |
| - **20 epochs** of recovery training |
| - Restores performance lost during pruning |
|
|
| --- |
|
|
| ## ⚙️ Installation & Usage |
|
|
| ### 📋 Prerequisites |
| - Linux environment (**Debian/Kali recommended**) |
| - Python **3.10+** |
| - NVIDIA GPU with CUDA support (**optimized for 2× T4 setup**) |
|
|
| --- |
|
|
| ### 🛠️ Setup |
| ```bash |
| pip install torch torchvision numpy onnx onnxruntime |
| mkdir -p data/training/ |
| # Place ARC task JSON files in data/training/ |