---
license: cc-by-nc-4.0
tags:
- sudoku
- reasoning
- pytorch
- rhan
---

# PotatoAGI (RHAN-Sudoku)

This is the official weight repository for the **Recurrent Hybrid Attention Network (RHAN)** trained on Sudoku.

It uses a **Universal Linear Attention** mechanism combined with **Recursive Memory** and was trained using **Adversarial Erasure**.

## Stats

- **Parameters:** ~150k
- **Architecture:** 12-Loop Recurrent CNN + Linear Attention
- **Accuracy:** 99% Cell Accuracy / 90%+ Perfect Solve Rate
- **License:** CC BY-NC 4.0 (Non-Commercial Research Use Only)

## Files in this Repository

- `model.py`: Model architecture (`UniversalPotato`)
- `model.safetensors`: Trained weights
- `local_test_sudoku.py`: Dataset-based local evaluation
- `README.md`

## Usage

### 1️⃣ Install dependencies

```bash
pip install torch safetensors
```

Python ≥ 3.10 recommended.

### 2️⃣ Load the model and weights

```python
import torch
from safetensors.torch import load_file
from model import UniversalPotato, HIDDEN_DIM

device = "cuda" if torch.cuda.is_available() else "cpu"

model = UniversalPotato().to(device)
model.load_state_dict(load_file("model.safetensors"), strict=True)
model.eval()
```

### 3️⃣ Run inference on a single Sudoku puzzle

Sudoku grids are represented as a flat tensor of length 81, with 0 indicating empty cells.
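Puzzles are often stored as 81-character strings rather than Python lists. The sketch below shows one way to convert such a string into the flat, 0-for-empty layout described above. The helper names `parse_puzzle` and `format_grid` are hypothetical and not part of this repository, and treating `.` as an empty cell is an assumption about the input format.

```python
def parse_puzzle(text: str) -> list[int]:
    """Convert a puzzle string into a flat list of 81 ints (0 = empty).

    Hypothetical helper: digits 1-9 are clues, '0' and '.' both mean empty.
    Whitespace is ignored, so a 9-line grid works as well as a single line.
    """
    chars = [ch for ch in text if not ch.isspace()]
    if len(chars) != 81:
        raise ValueError(f"expected 81 cells, got {len(chars)}")

    grid = []
    for ch in chars:
        if ch == ".":
            grid.append(0)
        elif ch in "0123456789":
            grid.append(int(ch))
        else:
            raise ValueError(f"unexpected character {ch!r}")
    return grid


def format_grid(grid: list[int]) -> str:
    """Render a flat 81-element grid as 9 text rows, using '.' for empty cells."""
    rows = [grid[r * 9:(r + 1) * 9] for r in range(9)]
    return "\n".join(" ".join(str(v) if v else "." for v in row) for row in rows)


puzzle_text = (
    "530070000" "600195000" "098000060"
    "800060003" "400803001" "700020006"
    "060000280" "000419005" "000080079"
)
flat = parse_puzzle(puzzle_text)
print(format_grid(flat))
```

The resulting list has the same layout as a hand-written 81-element list, so it can be wrapped with `torch.tensor(flat, dtype=torch.long)` before being passed to the model.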
```python
# Example puzzle (0 = empty)
puzzle = [
    5,3,0,0,7,0,0,0,0,
    6,0,0,1,9,5,0,0,0,
    0,9,8,0,0,0,0,6,0,
    8,0,0,0,6,0,0,0,3,
    4,0,0,8,0,3,0,0,1,
    7,0,0,0,2,0,0,0,6,
    0,6,0,0,0,0,2,8,0,
    0,0,0,4,1,9,0,0,5,
    0,0,0,0,8,0,0,7,9,
]

# Add a batch dimension: clues has shape (1, 81)
clues = torch.tensor(puzzle, dtype=torch.long).unsqueeze(0).to(device)
board = clues.clone()

# The recurrent memory starts at zero
memory = torch.zeros(1, HIDDEN_DIM, 9, 9, device=device)

with torch.no_grad():
    for _ in range(24):  # reasoning steps
        logits, memory = model(
            clues=clues,
            current_board=board,
            memory=memory,
            blindfold=False,
        )
        # Feed the current best guess back in as the next board state
        board = logits.argmax(dim=-1)

solution = board.view(9, 9).cpu()
print(solution)
```

### 4️⃣ Dataset-based evaluation

To evaluate the model on a real Sudoku dataset:

- Download `sudoku.csv` from Kaggle 👉 https://www.kaggle.com/datasets/rohanrao/sudoku
- Place it in the repository root
- Run:

```bash
python local_test_sudoku.py
```

This script:

- runs multi-step inference
- compares predictions against ground truth
- reports the solve success rate

## Notes

- This model does not use Hugging Face Transformers
- `model.py` is the authoritative architecture definition
- Inference requires multiple recurrent steps for best results
- Designed for reasoning research, not commercial deployment

## License

This project is released under CC BY-NC 4.0.

You may:

- use
- modify
- redistribute

for non-commercial research purposes only, with attribution. Commercial use is not permitted.