	---
	license: cc-by-nc-4.0
	tags:
	- sudoku
	- reasoning
	- pytorch
	- rhan
	---

	# PotatoAGI (RHAN-Sudoku)

	This is the official weight repository for the Recurrent Hybrid Attention Network (RHAN) trained on Sudoku.

	It uses a Universal Linear Attention mechanism combined with Recursive Memory and was trained using Adversarial Erasure.

	## Stats
	- Parameters: ~150k
	- Architecture: 12-Loop Recurrent CNN + Linear Attention
	- Accuracy: 99% Cell Accuracy / 90%+ Perfect Solve Rate
	- License: CC BY-NC 4.0 (Non-Commercial Research Use Only)

	## Files in this Repository

	- `model.py`: Model architecture (`UniversalPotato`)
	- `model.safetensors`: Trained weights
	- `local_test_sudoku.py`: Dataset-based local evaluation
	- `README.md`

	## Usage
	### 1️⃣ Install dependencies

	```bash
	pip install torch safetensors
	```

	Python ≥ 3.10 recommended.

	### 2️⃣ Load the model and weights

	```python
	import torch
	from safetensors.torch import load_file
	from model import UniversalPotato, HIDDEN_DIM

	device = "cuda" if torch.cuda.is_available() else "cpu"

	model = UniversalPotato().to(device)
	model.load_state_dict(load_file("model.safetensors"), strict=True)
	model.eval()
	```

	### 3️⃣ Run inference on a single Sudoku puzzle

	Sudoku grids are represented as a flat tensor of length 81,
	with 0 indicating empty cells.

	```python
	# Reuses model, device, and HIDDEN_DIM from step 2.

	# Example puzzle (0 = empty)
	puzzle = [
	    5,3,0,0,7,0,0,0,0,
	    6,0,0,1,9,5,0,0,0,
	    0,9,8,0,0,0,0,6,0,
	    8,0,0,0,6,0,0,0,3,
	    4,0,0,8,0,3,0,0,1,
	    7,0,0,0,2,0,0,0,6,
	    0,6,0,0,0,0,2,8,0,
	    0,0,0,4,1,9,0,0,5,
	    0,0,0,0,8,0,0,7,9,
	]

	# Shape (1, 81): a batch containing one flattened puzzle
	clues = torch.tensor(puzzle, dtype=torch.long).unsqueeze(0).to(device)
	board = clues.clone()
	memory = torch.zeros(1, HIDDEN_DIM, 9, 9, device=device)

	with torch.no_grad():
	    for _ in range(24):  # reasoning steps
	        logits, memory = model(
	            clues=clues,
	            current_board=board,
	            memory=memory,
	            blindfold=False,
	        )
	        board = logits.argmax(dim=-1)

	solution = board.view(9, 9).cpu()
	print(solution)
	```
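
The printed board is an argmax over logits, so it is not guaranteed to be a legal grid. The check below is a minimal sketch, assuming the flat 81-value layout described above is in row-major order. `is_valid_solution` is a hypothetical helper and is not part of this repository.

```python
def is_valid_solution(cells, clues=None):
    """Return True if `cells` (81 ints, row-major) is a complete, legal Sudoku grid.

    If `clues` is given (81 ints, 0 = empty), every clue must also be preserved.
    """
    if len(cells) != 81:
        return False
    digits = set(range(1, 10))
    rows = [cells[r * 9:(r + 1) * 9] for r in range(9)]
    cols = [cells[c::9] for c in range(9)]
    boxes = [
        [cells[(br + r) * 9 + (bc + c)] for r in range(3) for c in range(3)]
        for br in (0, 3, 6)
        for bc in (0, 3, 6)
    ]
    # Every row, column, and 3x3 box must contain each digit 1-9 exactly once.
    if any(set(unit) != digits for unit in rows + cols + boxes):
        return False
    if clues is not None:
        return all(k == 0 or k == v for k, v in zip(clues, cells))
    return True
```

With the variables from the snippet above, a call would look like `is_valid_solution(solution.flatten().tolist(), puzzle)`.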

	### 4️⃣ Dataset-based evaluation

	To evaluate the model on a real Sudoku dataset:

	1. Download `sudoku.csv` from Kaggle:
	   👉 https://www.kaggle.com/datasets/rohanrao/sudoku
	2. Place it in the repository root.
	3. Run:

	```bash
	python local_test_sudoku.py
	```

	This script:

	- runs multi-step inference
	- compares predictions against ground truth
	- reports solve success rate
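
For reference, cell accuracy and perfect solve rate can be computed from predictions and ground truth roughly as follows. This is a sketch of the metrics only, not the contents of `local_test_sudoku.py`. It assumes each grid is an 81-character digit string, which is how the Kaggle file is believed to store its puzzles and solutions, and it counts all 81 cells including the given clues, where the actual script may count only the empty ones. `score_predictions` is a hypothetical name.

```python
def score_predictions(predicted, expected):
    """Compare predicted grids with ground truth.

    Both arguments are equal-length lists of 81-character digit strings.
    Returns (cell_accuracy, solve_rate), each in the range [0, 1].
    """
    if len(predicted) != len(expected) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    correct_cells = 0
    solved = 0
    for pred, truth in zip(predicted, expected):
        if len(pred) != 81 or len(truth) != 81:
            raise ValueError("each grid must have exactly 81 cells")
        matches = sum(p == t for p, t in zip(pred, truth))
        correct_cells += matches
        solved += matches == 81  # a puzzle counts only if every cell matches
    total_cells = 81 * len(expected)
    return correct_cells / total_cells, solved / len(expected)
```

A single wrong cell lowers cell accuracy only slightly but drops that puzzle from the solve rate entirely, which is why the two figures can differ by several points.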

	## Notes

	- This model does not use Hugging Face Transformers.
	- `model.py` is the authoritative architecture definition.
	- Inference requires multiple recurrent steps for best results.
	- Designed for reasoning research, not commercial deployment.

	## License

	This project is released under CC BY-NC 4.0.

	You may:

	- use
	- modify
	- redistribute

	for non-commercial research purposes only, with attribution.

	Commercial use is not permitted.