pixelmodel / README.md

metadata

license: mit
pipeline_tag: text-to-image

PixelModel 🖼️

A neural network where the weights are the image.

📌 What is this?

model.png is not a picture — it is the model.

Each pixel encodes one network weight. At inference, the PNG is decoded into the weight matrices of a tiny MLP. The prompt is embedded into a 32-dim vector, and the MLP maps that vector to a 32×32 RGB image.

Training directly optimizes pixel values via gradient descent until the PNG becomes the model itself.
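The README does not show how train.py performs this optimization, so the following is only a minimal sketch of one plausible approach: run gradient descent on a float weight, clamp it to the range a pixel can store after every step, and quantize it to a pixel at the end. The names `train_one_weight` and `to_pixel` are hypothetical, and the single-weight squared-error problem is a stand-in for the real network.

```python
def train_one_weight(x: float, target: float, epochs: int = 500, lr: float = 0.05) -> float:
    """Minimise (w * x - target)^2 over one weight w, kept inside [-1, 1]."""
    w = 0.0
    for _ in range(epochs):
        grad = 2.0 * (w * x - target) * x  # derivative of the squared error w.r.t. w
        w -= lr * grad                     # gradient descent step
        w = max(-1.0, min(1.0, w))         # assumption: clamp to what a pixel can represent
    return w


def to_pixel(w: float) -> tuple[int, int, int]:
    """Quantize with the R = magnitude, B = sign scheme (G left at 0)."""
    return (round(abs(w) * 255), 0, 255 if w >= 0 else 0)


w = train_one_weight(x=1.0, target=-0.6)
pixel = to_pixel(w)
print(pixel)  # (153, 0, 0): magnitude 153/255 = 0.6, B < 128 means negative
```

The clamp matters because a weight outside [-1, 1] cannot be written back into the image under this encoding.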

🎨 Weight Encoding

R channel → weight magnitude (0–255 → 0.0–1.0)
B channel → weight sign (<128 = negative, ≥128 = positive)
G channel → unused / reserved
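The channel mapping above can be sketched as a pair of plain-Python functions. `decode_pixel` follows the rules as stated. `encode_weight` is an assumed inverse (the README does not say which B value is written for each sign, so 0 and 255 are used here for illustration). Both function names are hypothetical.

```python
def decode_pixel(r: int, g: int, b: int) -> float:
    """Turn one RGB pixel into one signed weight in [-1.0, 1.0]."""
    magnitude = r / 255.0             # R: 0-255 -> 0.0-1.0
    sign = 1.0 if b >= 128 else -1.0  # B: >= 128 positive, < 128 negative
    return sign * magnitude           # G is ignored (unused / reserved)


def encode_weight(w: float) -> tuple[int, int, int]:
    """Assumed inverse: clamp to [-1, 1], quantize the magnitude to 8 bits."""
    w = max(-1.0, min(1.0, w))
    r = round(abs(w) * 255)
    b = 255 if w >= 0 else 0          # assumption: any value on the correct side of 128 would do
    return (r, 0, b)


print(decode_pixel(255, 0, 255))  # 1.0
print(decode_pixel(255, 0, 0))    # -1.0
print(encode_weight(2.0))         # (255, 0, 255), clamped to the maximum weight
```

Under this scheme every weight lies in [-1, 1] and has 256 magnitude levels, so the smallest non-zero step is 1/255 ≈ 0.0039.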

🧠 Architecture

prompt string
  → char embedding → 32-dim vector
  → W1 (64×32)  → tanh
  → W2 (64×64)  → tanh
  → W3 (3072×64) → sigmoid
  → reshape → 32×32×3 image

All weights live inside model.png.
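As a rough illustration of the pipeline above, here is a dependency-free forward pass. Several parts are assumptions: the README does not specify the character embedding, so `embed_prompt` is a hypothetical stand-in that buckets character codes into 32 slots; no bias terms are used; and the weights are random rather than decoded from model.png.

```python
import math
import random

EMBED_DIM, HIDDEN, OUT = 32, 64, 32 * 32 * 3  # OUT = 3072 output values


def embed_prompt(prompt: str, dim: int = EMBED_DIM) -> list[float]:
    """Hypothetical char embedding: bucket character codes into `dim` slots."""
    vec = [0.0] * dim
    for ch in prompt:
        vec[ord(ch) % dim] += 1.0
    n = max(len(prompt), 1)
    return [v / n for v in vec]


def matvec(w, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def forward(prompt, w1, w2, w3):
    h1 = [math.tanh(z) for z in matvec(w1, embed_prompt(prompt))]  # W1 (64x32) -> tanh
    h2 = [math.tanh(z) for z in matvec(w2, h1)]                    # W2 (64x64) -> tanh
    flat = [sigmoid(z) for z in matvec(w3, h2)]                    # W3 (3072x64) -> sigmoid
    # reshape 3072 values -> 32 rows x 32 columns x 3 channels
    return [[flat[(y * 32 + x) * 3:(y * 32 + x) * 3 + 3] for x in range(32)]
            for y in range(32)]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
w1 = random_matrix(HIDDEN, EMBED_DIM, rng)
w2 = random_matrix(HIDDEN, HIDDEN, rng)
w3 = random_matrix(OUT, HIDDEN, rng)
image = forward("red", w1, w2, w3)

n_weights = HIDDEN * EMBED_DIM + HIDDEN * HIDDEN + OUT * HIDDEN
print(len(image), len(image[0]), len(image[0][0]))  # 32 32 3
print(n_weights, 64 * 3200)                         # 202752 204800
```

The three matrices need 202,752 weights, which fits within the 204,800 pixels of a 64×3200 image and leaves 2,048 pixels over. The README does not say what those remaining pixels hold.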

🧪 Dataset vs Outputs

[Table: target images vs. model outputs; images not reproduced]

📁 Files

model.png       ← THE MODEL (64×3200 px)
main.py         ← inference
train.py        ← training
model.py        ← architecture
dataset/
  red.png
  red.txt       ← prompt: "red"
  ...
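The dataset layout pairs each image with a text file of the same name. A minimal sketch of collecting those pairs, assuming every NAME.png has its prompt in NAME.txt as in the red example (the function name `load_pairs` is hypothetical and is not taken from train.py):

```python
from pathlib import Path


def load_pairs(dataset_dir: str) -> list[tuple[Path, str]]:
    """Return (image_path, prompt) for every NAME.png that has a NAME.txt."""
    pairs = []
    for png in sorted(Path(dataset_dir).glob("*.png")):
        txt = png.with_suffix(".txt")
        if txt.exists():  # skip images without a prompt file
            pairs.append((png, txt.read_text(encoding="utf-8").strip()))
    return pairs


if __name__ == "__main__":
    for path, prompt in load_pairs("dataset"):
        print(path.name, "->", prompt)
```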

⚙️ Usage

python train.py
python train.py --epochs 500 --lr 0.05

python main.py "red"
python main.py "a cat" --out cat.png --scale 8
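The README does not describe what `--scale` does. If it enlarges the 32×32 output by an integer factor with nearest-neighbor sampling (an assumption), the operation could look like the sketch below, where `upscale_nearest` is a hypothetical helper and an image is a list of rows of RGB tuples.

```python
def upscale_nearest(image, scale: int):
    """Repeat every pixel `scale` times horizontally and vertically."""
    out = []
    for row in image:
        wide = [px for px in row for _ in range(scale)]  # stretch the row horizontally
        out.extend(list(wide) for _ in range(scale))     # repeat the stretched row vertically
    return out


red, blue = (255, 0, 0), (0, 0, 255)
big = upscale_nearest([[red, blue]], 2)
print(len(big), len(big[0]))  # 2 4
```

Under that assumption, a scale of 8 would turn the 32×32 output into a 256×256 image with hard pixel edges, which suits blocky toy outputs better than smooth interpolation.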

📊 Tips

6–20 image/prompt pairs are enough
Simple patterns converge fastest
200–500 epochs is typical
A loss below 0.001 is a strong result for toy datasets

It’s a toy. It’s not useful. But it works.

Bench Labs · Simple, Reliable, Open sourced