---

license: cc-by-nc-4.0
tags:
- sudoku
- reasoning
- pytorch
- rhan
---


# PotatoAGI (RHAN-Sudoku)

This is the official weight repository for the **Recurrent Hybrid Attention Network (RHAN)** trained on Sudoku.

It uses a **Universal Linear Attention** mechanism combined with **Recursive Memory** and was trained using **Adversarial Erasure**.

## Stats
- **Parameters:** ~150k
- **Architecture:** 12-Loop Recurrent CNN + Linear Attention
- **Accuracy:** 99% Cell Accuracy / 90%+ Perfect Solve Rate
- **License:** CC BY-NC 4.0 (Non-Commercial Research Use Only)
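
The two accuracy figures measure different things: cell accuracy counts individual cells, while a perfect solve requires all 81 cells of a puzzle to be correct. The following is a minimal sketch of how the two could be computed, assuming predictions and solutions are flat lists of 81 integers. The function names are illustrative and not part of this repository, and the sketch counts all 81 cells (including given clues), which may differ from how the numbers above were measured.

```python
def cell_accuracy(predictions, solutions):
    """Fraction of individual cells predicted correctly across all puzzles."""
    correct = total = 0
    for pred, sol in zip(predictions, solutions):
        correct += sum(p == s for p, s in zip(pred, sol))
        total += len(sol)
    return correct / total


def perfect_solve_rate(predictions, solutions):
    """Fraction of puzzles in which every cell matches the solution."""
    solved = sum(list(pred) == list(sol) for pred, sol in zip(predictions, solutions))
    return solved / len(solutions)


# Toy data: two 81-cell grids; the second prediction has one wrong cell.
solutions = [[1] * 81, [2] * 81]
predictions = [[1] * 81, [2] * 80 + [3]]

print(cell_accuracy(predictions, solutions))       # 161 of 162 cells correct
print(perfect_solve_rate(predictions, solutions))  # 0.5
```

A single wrong cell barely moves cell accuracy but turns the whole puzzle into a failed solve, which is why the perfect solve rate is the lower of the two numbers.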

## Files in this Repository

- `model.py`: Model architecture (`UniversalPotato`)
- `model.safetensors`: Trained weights
- `local_test_sudoku.py`: Dataset-based local evaluation
- `README.md`

## Usage
### 1️⃣ Install dependencies

```bash
pip install torch safetensors
```

Python ≥ 3.10 recommended.

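### 2️⃣ Load the model and weights

```python
import torch
from safetensors.torch import load_file

from model import UniversalPotato, HIDDEN_DIM

# Use the GPU when one is available, otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

model = UniversalPotato().to(device)

# strict=True raises an error if any weight is missing or unexpected.
model.load_state_dict(load_file("model.safetensors"), strict=True)
model.eval()
```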

### 3️⃣ Run inference on a single Sudoku puzzle

Sudoku grids are represented as a flat tensor of length 81,
with 0 indicating empty cells.

```python
# Example puzzle (0 = empty)
puzzle = [
    5,3,0,0,7,0,0,0,0,
    6,0,0,1,9,5,0,0,0,
    0,9,8,0,0,0,0,6,0,
    8,0,0,0,6,0,0,0,3,
    4,0,0,8,0,3,0,0,1,
    7,0,0,0,2,0,0,0,6,
    0,6,0,0,0,0,2,8,0,
    0,0,0,4,1,9,0,0,5,
    0,0,0,0,8,0,0,7,9,
]

# Add a batch dimension: shape (1, 81).
clues = torch.tensor(puzzle, dtype=torch.long).unsqueeze(0).to(device)
board = clues.clone()
# Recurrent memory starts at zero and is carried across steps.
memory = torch.zeros(1, HIDDEN_DIM, 9, 9, device=device)

with torch.no_grad():
    for _ in range(24):  # reasoning steps
        logits, memory = model(
            clues=clues,
            current_board=board,
            memory=memory,
            blindfold=False,
        )
        # Take the highest-scoring class per cell as the next board.
        board = logits.argmax(dim=-1)

solution = board.view(9, 9).cpu()
print(solution)
```
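
Because the model's output is a prediction rather than a guaranteed solution, it can be useful to check it against the Sudoku rules. The following is a minimal sketch, not part of this repository: `is_valid_solution` is an illustrative helper that takes flat lists of 81 integers, and the solved grid shown is the known solution to the example puzzle above.

```python
def is_valid_solution(solution, puzzle):
    """Check a flat 81-cell solution against the Sudoku rules and the clues."""
    if len(solution) != 81 or len(puzzle) != 81:
        return False
    # Every clue (non-zero puzzle cell) must be preserved in the solution.
    if any(c != 0 and c != s for c, s in zip(puzzle, solution)):
        return False
    digits = set(range(1, 10))
    rows = [solution[r * 9:(r + 1) * 9] for r in range(9)]
    cols = [solution[c::9] for c in range(9)]
    boxes = [
        [solution[(br + r) * 9 + bc + c] for r in range(3) for c in range(3)]
        for br in (0, 3, 6)
        for bc in (0, 3, 6)
    ]
    # Each row, column and 3x3 box must contain the digits 1-9 exactly once.
    return all(set(unit) == digits for unit in rows + cols + boxes)


puzzle = [int(ch) for ch in (
    "530070000" "600195000" "098000060"
    "800060003" "400803001" "700020006"
    "060000280" "000419005" "000080079"
)]
solved = [int(ch) for ch in (
    "534678912" "672195348" "198342567"
    "859761423" "426853791" "713924856"
    "961537284" "287419635" "345286179"
)]

print(is_valid_solution(solved, puzzle))  # True
```

A tensor such as `solution` from the inference loop can be passed in as `solution.flatten().tolist()`.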

### 4️⃣ Dataset-based evaluation

To evaluate the model on a real Sudoku dataset:

1. Download `sudoku.csv` from Kaggle: 👉 https://www.kaggle.com/datasets/rohanrao/sudoku
2. Place it in the repository root.
3. Run:

```bash
python local_test_sudoku.py
```

This script:

- runs multi-step inference
- compares predictions against ground truth
- reports solve success rate
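
The evaluation loop described above can be sketched as follows. This is not the content of `local_test_sudoku.py`; it is an illustration that assumes the CSV has `puzzle` and `solution` columns holding 81-digit strings, and `solve_fn` is a hypothetical stand-in for model inference that maps a flat puzzle list to a flat predicted solution.

```python
import csv
import io


def parse_grid(text):
    """Convert an 81-character digit string into a flat list of ints (0 = empty)."""
    cells = [int(ch) for ch in text.strip()]
    if len(cells) != 81:
        raise ValueError(f"expected 81 digits, got {len(cells)}")
    return cells


def evaluate(csv_file, solve_fn, limit=None):
    """Return the fraction of puzzles that solve_fn reproduces exactly."""
    solved = total = 0
    for row in csv.DictReader(csv_file):
        if limit is not None and total >= limit:
            break
        puzzle = parse_grid(row["puzzle"])    # assumed column name
        target = parse_grid(row["solution"])  # assumed column name
        solved += solve_fn(puzzle) == target
        total += 1
    return solved / total if total else 0.0


# Tiny in-memory stand-in for sudoku.csv with a single row.
PUZZLE = (
    "530070000600195000098000060800060003400803001"
    "700020006060000280000419005000080079"
)
SOLUTION = (
    "534678912672195348198342567859761423426853791"
    "713924856961537284287419635345286179"
)
sample = io.StringIO(f"puzzle,solution\n{PUZZLE},{SOLUTION}\n")

# A placeholder solver that returns the puzzle unchanged, so it solves nothing.
print(evaluate(sample, solve_fn=lambda p: p))  # 0.0
```

With the real file, the same function could be called as `evaluate(open("sudoku.csv", newline=""), solve_fn, limit=1000)`, where `limit` keeps the run short on a large dataset.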


## Notes

- This model does not use Hugging Face Transformers
- `model.py` is the authoritative architecture definition
- Inference requires multiple recurrent steps for best results
- Designed for reasoning research, not commercial deployment
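
The note on recurrent steps suggests a generic pattern: repeat a refinement step until the board stops changing or a step budget runs out. The following is a minimal sketch of that pattern only. `step_fn` is a hypothetical stand-in for one forward pass, the toy step fills one empty cell per call and is not a Sudoku solver, and whether an unchanged board means this model has converged (given its separate memory state) is an assumption.

```python
def refine(board, step_fn, max_steps=24):
    """Apply step_fn until the board stops changing or max_steps is reached.

    Returns the final board and the number of steps actually taken.
    """
    for step in range(1, max_steps + 1):
        new_board = step_fn(board)
        if new_board == board:  # fixed point: this step changed nothing
            return board, step
        board = new_board
    return board, max_steps


def toy_step(board):
    """Placeholder step: fill the first empty cell (0) with a 1."""
    out = list(board)
    if 0 in out:
        out[out.index(0)] = 1
    return out


final, steps = refine([5, 0, 0, 3], toy_step)
print(final, steps)  # [5, 1, 1, 3] 3
```

The fixed step count used in the usage example is the simpler choice; an early exit like this only saves computation when easy puzzles settle in fewer steps.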


## License

This project is released under CC BY-NC 4.0.

You may:

- use
- modify
- redistribute

for non-commercial research purposes only, with attribution.

Commercial use is not permitted.