---
license: mit
pipeline_tag: text-to-image
---
# PixelModel 🖼️
A neural network where the weights **are** the image.
## 📌 What is this?
`model.png` is not a picture; it *is* the model.
Every pixel encodes neural network weights. At inference, the PNG is decoded into weight matrices forming a tiny MLP. The prompt is embedded into a vector, and the model generates a 32×32 image.
Training directly optimizes pixel values via gradient descent until the PNG becomes the model itself.
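
One way to picture this kind of training is a straight-through scheme: keep float weights, round them to what an 8-bit magnitude plus a sign can store before each forward pass, and apply the gradient to the float copy. The sketch below does this on a toy linear regression. It is an illustration under those assumptions, not the contents of `train.py`; the `quantize` helper, the toy data, and the loss are all hypothetical.

```python
import numpy as np

def quantize(w):
    """Snap weights to values an 8-bit magnitude plus a sign can represent."""
    return np.round(np.clip(w, -1.0, 1.0) * 255.0) / 255.0

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, (64, 8))             # toy inputs (hypothetical data)
W_true = quantize(rng.uniform(-1.0, 1.0, (4, 8)))
Y = X @ W_true.T                                 # toy targets

W = np.zeros((4, 8))                             # latent float weights
lr = 0.05
losses = []
for epoch in range(500):
    err = X @ quantize(W).T - Y                  # forward pass uses storable weights
    losses.append(float(np.sum(err ** 2) / len(X)))
    grad = 2.0 * err.T @ X / len(X)              # gradient passed straight through the rounding
    W = np.clip(W - lr * grad, -1.0, 1.0)        # keep weights inside the encodable range

final_weights = quantize(W)                      # what would be written back to pixels
```

After the loop, every entry of `final_weights` is a multiple of 1/255, so it can be stored as pixel values without further loss.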
---
## 🎨 Weight Encoding
- **R channel** → weight magnitude (0–255 → 0.0–1.0)
- **B channel** → weight sign (<128 = negative, ≥128 = positive)
- **G channel** → unused / reserved
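
The channel mapping above can be sketched with NumPy as follows. This is a minimal illustration, assuming the pixels are already available as an `(H, W, 3)` `uint8` array. The function names are hypothetical, and in `encode_weights` the choice of B = 255 or 0 and G = 0 is arbitrary, since any value on the right side of 128 would decode the same way.

```python
import numpy as np

def decode_weights(rgb: np.ndarray) -> np.ndarray:
    """Map an (H, W, 3) uint8 RGB array to signed float weights in [-1, 1]."""
    magnitude = rgb[..., 0].astype(np.float32) / 255.0   # R: 0-255 -> 0.0-1.0
    sign = np.where(rgb[..., 2] >= 128, 1.0, -1.0)       # B: >=128 positive, <128 negative
    return (magnitude * sign).astype(np.float32)         # G is ignored

def encode_weights(weights: np.ndarray) -> np.ndarray:
    """Inverse mapping (assumed): clip to [-1, 1] and round to 8 bits."""
    w = np.clip(weights, -1.0, 1.0)
    rgb = np.zeros(w.shape + (3,), dtype=np.uint8)
    rgb[..., 0] = np.round(np.abs(w) * 255.0).astype(np.uint8)
    rgb[..., 2] = np.where(w >= 0, 255, 0).astype(np.uint8)
    return rgb

pixels = np.array([[[255, 0, 200], [128, 0, 10]]], dtype=np.uint8)
weights = decode_weights(pixels)   # one positive and one negative weight
```

Under this scheme each weight has a resolution of 1/255 and cannot exceed 1.0 in magnitude.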
---
## 🧠 Architecture
```text
prompt string
  → char embedding → 32-dim vector
  → W1 (64×32) → tanh
  → W2 (64×64) → tanh
  → W3 (3072×64) → sigmoid
  → reshape → 32×32×3 image
```
All weights live inside `model.png`.
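
The forward pass in the diagram can be sketched in NumPy as below. Several details are assumptions: the character embedding is not specified here, so `embed_prompt` is a hypothetical stand-in; no bias terms are used because none are listed; the output is reshaped in height × width × channel order; and random weights in [-1, 1] stand in for the ones decoded from `model.png`.

```python
import numpy as np

def embed_prompt(prompt: str, dim: int = 32) -> np.ndarray:
    """Hypothetical embedding: one character code per slot, scaled to [0, 1]."""
    codes = [ord(c) % 256 for c in prompt[:dim]]
    vec = np.zeros(dim, dtype=np.float32)
    vec[:len(codes)] = np.array(codes, dtype=np.float32) / 255.0
    return vec

def forward(x, W1, W2, W3):
    """Three-layer MLP: tanh, tanh, sigmoid, then reshape to an image."""
    h1 = np.tanh(W1 @ x)                    # (64,)
    h2 = np.tanh(W2 @ h1)                   # (64,)
    out = 1.0 / (1.0 + np.exp(-(W3 @ h2)))  # sigmoid, (3072,)
    return out.reshape(32, 32, 3)           # values in [0, 1]

rng = np.random.default_rng(0)
W1 = rng.uniform(-1.0, 1.0, (64, 32))       # placeholder weights, not from model.png
W2 = rng.uniform(-1.0, 1.0, (64, 64))
W3 = rng.uniform(-1.0, 1.0, (3072, 64))

image = forward(embed_prompt("red"), W1, W2, W3)
```

The three matrices hold 2,048 + 4,096 + 196,608 = 202,752 weights, which fits within the 64×3200 = 204,800 pixels listed for `model.png`. How the weights are laid out across the image is not specified here.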
---
## 🧪 Dataset vs Outputs
| Target | Output |
| ------------------------------------------ | -------------------------------------- |
| <img src="dataset/red.png" width="120"> | <img src="out_red.png" width="120"> |
| <img src="dataset/green.png" width="120"> | <img src="out_green.png" width="120"> |
| <img src="dataset/blue.png" width="120"> | <img src="out_blue.png" width="120"> |
| <img src="dataset/white.png" width="120"> | <img src="out_white.png" width="120"> |
| <img src="dataset/yellow.png" width="120"> | <img src="out_yellow.png" width="120"> |
| <img src="dataset/dark.png" width="120"> | <img src="out_dark.png" width="120"> |
---
## 📁 Files
```text
model.png       ← THE MODEL (64×3200 px)
main.py         ← inference
train.py        ← training
model.py        ← architecture
dataset/
  red.png
  red.txt       ← prompt: "red"
  ...
```
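
The dataset layout pairs each image with a same-named `.txt` prompt file. A loader for that convention might look like the sketch below; `list_samples` is a hypothetical helper, not a function from `train.py`, and it returns paths rather than decoded images so that it needs only the standard library.

```python
from pathlib import Path

def list_samples(root):
    """Return (image_path, prompt) pairs for every PNG that has a matching .txt file."""
    samples = []
    for image_path in sorted(Path(root).glob("*.png")):
        prompt_path = image_path.with_suffix(".txt")
        if prompt_path.exists():
            prompt = prompt_path.read_text(encoding="utf-8").strip()
            samples.append((image_path, prompt))
    return samples
```

Calling `list_samples("dataset")` on the tree above would yield one pair per image that has a prompt file; images without one are skipped.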
---
## ⚙️ Usage
```bash
python train.py
python train.py --epochs 500 --lr 0.05
python main.py "red"
python main.py "a cat" --out cat.png --scale 8
```
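
What `--scale` does is not documented here. A plausible reading is an integer nearest-neighbour upscale of the 32×32 output, so that `--scale 8` would produce a 256×256 image. Under that assumption, the operation could be sketched as follows; `upscale_nearest` is a hypothetical name.

```python
import numpy as np

def upscale_nearest(image: np.ndarray, scale: int) -> np.ndarray:
    """Repeat every pixel `scale` times along height and width."""
    return np.repeat(np.repeat(image, scale, axis=0), scale, axis=1)

tiny = np.zeros((32, 32, 3), dtype=np.uint8)
tiny[0, 0] = (255, 0, 0)            # a single red pixel in the corner
big = upscale_nearest(tiny, 8)      # shape (256, 256, 3)
```

Nearest-neighbour scaling keeps pixel edges sharp, which suits blocky 32×32 outputs better than interpolation would.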
---
## 📊 Tips
* 6–20 training samples are enough
* Simple patterns converge fastest
* 200–500 epochs is typical
* A loss below 0.001 is a strong result for toy datasets
---
*It’s a toy. It’s not useful. But it works.*
Bench Labs · Simple, Reliable, Open sourced