---
license: mit
---
# 🧠 NeuroGolf 2026: Ultra-Efficient ARC-AGI Solver
## 📌 Overview
This repository contains the implementation of **NeuroGolf 2026**, an ultra-efficient model designed to solve **Abstraction and Reasoning Corpus (ARC-AGI)** image transformations.
The project focuses on maximizing reasoning capability while strictly adhering to extreme model size constraints required for competition submission.
---
## 🏁 Competition Constraints
The model is strictly optimized to meet the following requirements:
- **ONNX File Size Limit:** **1.44 MB**
- **Parameter Budget:**
  - ~360K parameters (Float32)
  - ~1.4M parameters (INT8 quantized)
- **Input/Output Shape:** `(1, 10, 30, 30)` for both the input and the output logits
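
The parameter budgets above follow from dividing the file size limit by the storage cost of one weight. The sketch below reproduces that arithmetic. It assumes 1.44 MB means 1.44 × 10⁶ bytes and that the whole file holds weights (a real ONNX file also carries graph metadata, so the usable budget is slightly smaller). The names `SIZE_LIMIT_BYTES` and `max_params` are illustrative and not part of this repository.

```python
# Assumption: 1.44 MB is read as 1.44 * 10**6 bytes and every byte stores a weight.
SIZE_LIMIT_BYTES = 1_440_000

# Storage cost of a single parameter for each numeric format.
BYTES_PER_PARAM = {"float32": 4, "int8": 1}


def max_params(dtype: str, limit_bytes: int = SIZE_LIMIT_BYTES) -> int:
    """Upper bound on the parameter count for a given dtype and size limit."""
    return limit_bytes // BYTES_PER_PARAM[dtype]


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: at most {max_params(dtype):,} parameters")
```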
---
## 🏗️ Architecture
The system uses a **Teacher–Student Distillation framework** to compress high-level reasoning into a micro-scale deployable model.
### 🧠 Mega-Teacher Model (`MegaTeacherARCNet`)
- **Purpose:** Captures complex patterns and logic across 400+ ARC tasks
- **Dimensions:** 512 hidden units, 16 residual blocks deep
- **Technique:** Standard convolutions + deep residual architecture for maximum pattern recognition
---
### ⚡ Student Model (`UltraTinyARCNet`)
- **Purpose:** Final deployable model optimized for strict size limits
- **Dimensions:** 56 hidden units, 5 residual blocks deep
#### 🔧 Key Techniques
- **Depthwise Separable Convolutions** → ~10× parameter reduction
- **No Bias Terms** → `bias=False` in Conv2d to reduce parameter count
- **Residual Blocks** → Maintain gradient flow in ultra-small networks
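
To see where the savings from depthwise separable convolutions come from, the sketch below counts bias-free parameters for a standard convolution and for its depthwise + pointwise replacement. The 3×3 kernel size is an assumption (the README does not state it), and the helper names are illustrative. Under that assumption, a 56-channel layer shrinks by roughly 7.75×, and the ratio approaches 9× for wide layers; larger kernels would give a larger factor. In PyTorch, the depthwise step usually corresponds to `nn.Conv2d(..., groups=in_channels, bias=False)` followed by a 1×1 `nn.Conv2d`.

```python
def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Parameters of a standard k x k convolution without bias."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Parameters of a depthwise k x k conv plus a 1 x 1 pointwise conv, no bias."""
    depthwise = c_in * k * k   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Assumed configuration: 56 hidden channels (from the README), 3 x 3 kernels (assumption).
standard = conv_params(56, 56, 3)
separable = depthwise_separable_params(56, 56, 3)
print(f"standard: {standard}, separable: {separable}, "
      f"reduction: {standard / separable:.2f}x")
```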
---
## 🔄 Training Pipeline
### 1️⃣ Teacher Training
- Train Mega-Teacher for **50 epochs**
- Dataset: Full **400+ ARC tasks**
- Augmentation: **8× (rotations + flips)**
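
The 8× augmentation matches the eight symmetries of a square: four rotations, each with and without a mirror flip. A minimal sketch in plain Python, assuming grids are lists of rows as in the ARC JSON files; the function names are illustrative and the repository's own implementation may differ. Note that the same transform has to be applied to both the input and the output grid of a pair.

```python
def rotate90(grid):
    """Rotate a grid (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def flip_horizontal(grid):
    """Mirror a grid left to right."""
    return [list(reversed(row)) for row in grid]


def dihedral_variants(grid):
    """Return the 8 rotation/flip variants of a grid, in a fixed order."""
    variants = []
    current = [list(row) for row in grid]
    for _ in range(4):
        variants.append(current)
        variants.append(flip_horizontal(current))
        current = rotate90(current)
    return variants


def augment_pair(input_grid, output_grid):
    """Apply the same symmetry to input and output, giving 8 training pairs."""
    return list(zip(dihedral_variants(input_grid), dihedral_variants(output_grid)))


pairs = augment_pair([[1, 2], [3, 4]], [[4, 3], [2, 1]])
print(len(pairs))
```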
---
### 2️⃣ Knowledge Distillation
- Student learns from teacher’s **soft probability distributions**
- Transfers **“dark knowledge”**
- Achieves better generalization vs hard-label training
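
A common way to learn from soft distributions is to minimize the KL divergence between temperature-softened teacher and student outputs. The sketch below shows that loss for a single grid cell with plain Python lists of logits. The temperature value and the `T²` scaling are assumptions taken from the standard distillation recipe, not settings confirmed by this repository; in PyTorch the same quantity is typically computed with `torch.nn.functional.kl_div` on log-softmax outputs.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft-target loss for one cell; temperature=2.0 is an assumed value."""
    p_teacher = softmax(teacher_logits, temperature)
    q_student = softmax(student_logits, temperature)
    # Scaling by T^2 keeps gradient magnitudes comparable across temperatures.
    return temperature ** 2 * kl_divergence(p_teacher, q_student)


teacher = [4.0, 1.0, 0.5]   # hypothetical teacher logits for one cell
student = [2.0, 2.0, 0.0]   # hypothetical student logits for the same cell
print(distillation_loss(student, teacher))
```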
---
### 3️⃣ Pruning & Fine-Tuning
#### ✂️ Pruning
- Remove **30–35%** of low-magnitude weights
- Method: **L1 unstructured pruning**
- Goal: keep the ONNX file under **1.44 MB** (zeroed weights save space only when they are stored sparsely or compressed)
#### 🔧 Fine-Tuning
- **20 epochs** recovery training
- Restores performance lost during pruning
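
L1 unstructured pruning ranks individual weights by absolute value and zeroes the smallest fraction, regardless of which filter or channel they belong to. The sketch below illustrates the idea on a flat list of weights; the function name and the example values are hypothetical. PyTorch ships this technique as `torch.nn.utils.prune.l1_unstructured`, though the README does not state which implementation the project uses.

```python
def l1_unstructured_prune(weights, amount):
    """Zero the `amount` fraction of weights with the smallest absolute value."""
    n_prune = round(amount * len(weights))
    # Indices ordered by magnitude, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:n_prune])
    return [0.0 if i in pruned else w for i, w in enumerate(weights)]


# Hypothetical weights; 30% pruning matches the lower end of the range above.
weights = [0.5, -0.05, 0.3, 0.01, -0.9, 0.2, -0.02, 0.7, 0.1, -0.4]
sparse = l1_unstructured_prune(weights, amount=0.3)
sparsity = sparse.count(0.0) / len(sparse)
print(sparse, f"sparsity={sparsity:.0%}")
```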
---
## ⚙️ Installation & Usage
### 📋 Prerequisites
- Linux environment (**Debian/Kali recommended**)
- Python **3.10+**
- NVIDIA GPU with CUDA support (**optimized for 2× T4 setup**)
---
### 🛠️ Setup
```bash
pip install torch torchvision numpy onnx onnxruntime
mkdir -p data/training/
# Place ARC task JSON files in data/training/