---
license: cc-by-nc-4.0
tags:
- sudoku
- reasoning
- pytorch
- rhan
---
# PotatoAGI (RHAN-Sudoku)
This is the official weight repository for the **Recurrent Hybrid Attention Network (RHAN)** trained on Sudoku.
It uses a **Universal Linear Attention** mechanism combined with **Recursive Memory** and was trained using **Adversarial Erasure**.
## Stats
- **Parameters:** ~150k
- **Architecture:** 12-Loop Recurrent CNN + Linear Attention
- **Accuracy:** 99% Cell Accuracy / 90%+ Perfect Solve Rate
- **License:** CC BY-NC 4.0 (Non-Commercial Research Use Only)
## Files in this Repository
- `model.py`: model architecture (`UniversalPotato`)
- `model.safetensors`: trained weights
- `local_test_sudoku.py`: dataset-based local evaluation
- `README.md`
## Usage
### 1️⃣ Install dependencies
```bash
pip install torch safetensors
```
Python ≥ 3.10 recommended.
### 2️⃣ Load the model and weights
```python
import torch
from safetensors.torch import load_file
from model import UniversalPotato, HIDDEN_DIM

device = "cuda" if torch.cuda.is_available() else "cpu"

model = UniversalPotato().to(device)
model.load_state_dict(load_file("model.safetensors"), strict=True)
model.eval()
```
### 3️⃣ Run inference on a single Sudoku puzzle
Sudoku grids are represented as a flat tensor of length 81, with 0 indicating empty cells.
```python
# Example puzzle (0 = empty)
puzzle = [
    5,3,0,0,7,0,0,0,0,
    6,0,0,1,9,5,0,0,0,
    0,9,8,0,0,0,0,6,0,
    8,0,0,0,6,0,0,0,3,
    4,0,0,8,0,3,0,0,1,
    7,0,0,0,2,0,0,0,6,
    0,6,0,0,0,0,2,8,0,
    0,0,0,4,1,9,0,0,5,
    0,0,0,0,8,0,0,7,9,
]

clues = torch.tensor(puzzle, dtype=torch.long).unsqueeze(0).to(device)
board = clues.clone()
memory = torch.zeros(1, HIDDEN_DIM, 9, 9, device=device)

with torch.no_grad():
    for _ in range(24):  # reasoning steps
        logits, memory = model(
            clues=clues,
            current_board=board,
            memory=memory,
            blindfold=False,
        )
        # Feed the current prediction back in as the next board state
        board = logits.argmax(dim=-1)

solution = board.view(9, 9).cpu()
print(solution)
```
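
The printed grid is only the model's prediction, so it can be useful to check it against the Sudoku rules before trusting it. The following is a minimal sketch of such a check in plain Python; `is_valid_solution` is a hypothetical helper that is not part of this repository, and it assumes the predicted board holds the digits 1 to 9 directly, as the `argmax` step above suggests.

```python
def is_valid_solution(grid, clues=None):
    """Return True if `grid` (81 ints, row-major) is a complete, valid Sudoku.

    If `clues` is given (81 ints, 0 = empty), every non-zero clue must also
    be preserved in `grid`.
    """
    if len(grid) != 81:
        return False

    target = set(range(1, 10))
    rows = [grid[r * 9:(r + 1) * 9] for r in range(9)]
    cols = [grid[c::9] for c in range(9)]
    boxes = [
        [grid[(br + dr) * 9 + bc + dc] for dr in range(3) for dc in range(3)]
        for br in (0, 3, 6)
        for bc in (0, 3, 6)
    ]

    # Each row, column and 3x3 box must contain exactly the digits 1..9.
    if any(set(unit) != target for unit in rows + cols + boxes):
        return False

    if clues is not None:
        return all(c == 0 or c == g for c, g in zip(clues, grid))
    return True
```

With the variables from the inference snippet, a call could look like `is_valid_solution(board.view(-1).tolist(), puzzle)`.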
### 4️⃣ Dataset-based evaluation
To evaluate the model on a real Sudoku dataset:

1. Download `sudoku.csv` from Kaggle: https://www.kaggle.com/datasets/rohanrao/sudoku
2. Place it in the repository root.

Then run:
```bash
python local_test_sudoku.py
```
This script:
- runs multi-step inference
- compares predictions against ground truth
- reports solve success rate
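
To make the two reported metrics concrete, here is a rough sketch of how cell accuracy and perfect solve rate could be computed from predictions and ground truth. It is not the code in `local_test_sudoku.py`. The helpers `parse_grid`, `load_pairs` and `score` are hypothetical, the `puzzle` and `solution` column names are an assumption about the Kaggle CSV, and cell accuracy is counted over all 81 cells here, which may differ from how the script counts it.

```python
import csv


def parse_grid(text):
    """Convert an 81-character digit string into a flat list of ints (0 = empty)."""
    if len(text) != 81 or not text.isdigit():
        raise ValueError("expected a string of exactly 81 digits")
    return [int(ch) for ch in text]


def load_pairs(csv_file, limit=None):
    """Read (puzzle, solution) pairs from an open CSV file object.

    Assumes columns named 'puzzle' and 'solution' (an assumption, check the file).
    """
    pairs = []
    for i, row in enumerate(csv.DictReader(csv_file)):
        if limit is not None and i >= limit:
            break
        pairs.append((parse_grid(row["puzzle"]), parse_grid(row["solution"])))
    return pairs


def score(predictions, solutions):
    """Return (cell_accuracy, perfect_solve_rate) over paired flat grids."""
    if not solutions:
        raise ValueError("no grids to score")
    correct_cells = 0
    solved = 0
    for pred, truth in zip(predictions, solutions):
        matches = sum(p == t for p, t in zip(pred, truth))
        correct_cells += matches
        solved += matches == 81  # a grid counts as solved only if every cell matches
    n = len(solutions)
    return correct_cells / (81 * n), solved / n
```

A single wrong cell barely moves cell accuracy but turns the whole grid into a failed solve, which is why the two numbers in the Stats section can differ by several points.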
## Notes
- This model does not use Hugging Face Transformers.
- `model.py` is the authoritative architecture definition.
- Inference requires multiple recurrent steps for best results.
- Designed for reasoning research, not commercial deployment.
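
Since the number of recurrent steps matters, one option is to stop early once the board no longer changes instead of always running a fixed count. The sketch below illustrates that idea with plain Python lists. `run_until_stable` and its `step` argument are hypothetical and not part of this repository, and the default of 24 steps is simply taken from the usage example. With the real model, `step` would wrap one forward pass (including the memory update) and the comparison would use `torch.equal`; a stable board is also no guarantee that further steps would never change it.

```python
def run_until_stable(step, board, max_steps=24):
    """Apply `step` until the board stops changing or `max_steps` is reached.

    `step` is any callable that takes a board and returns the next board.
    Returns (final_board, steps_used).
    """
    for used in range(1, max_steps + 1):
        new_board = step(board)
        if new_board == board:
            # Fixed point reached: another step would return the same board.
            return new_board, used
        board = new_board
    return board, max_steps
```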
## License
This project is released under CC BY-NC 4.0.

You may use, modify, and redistribute it for non-commercial research purposes only, with attribution.

Commercial use is not permitted.