| --- |
| license: mit |
| pipeline_tag: text-to-image |
| --- |
| |
| # PixelModel 🖼️ |
|
|
| A neural network where the weights **are** the image. |
|
|
| ## What is this? |
|
|
| `model.png` is not a picture; it *is* the model. |
|
|
| Every pixel encodes neural network weights. At inference, the PNG is decoded into weight matrices forming a tiny MLP. The prompt is embedded into a vector, and the model generates a 32×32 image. |
|
|
| Training directly optimizes pixel values via gradient descent until the PNG becomes the model itself. |
|
|
| --- |
|
|
| ## 🎨 Weight Encoding |
|
|
| - **R channel** → weight magnitude (0–255 → 0.0–1.0) |
| - **B channel** → weight sign (<128 = negative, ≥128 = positive) |
| - **G channel** → unused / reserved |
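
The channel mapping above can be sketched in NumPy as follows. This is a minimal illustration of the stated encoding, not the repository's code: the function names `decode_pixels` and `encode_weights` are hypothetical, and writing the sign as exactly 0 or 255 in the B channel is an assumption (the README only gives the 128 threshold).

```python
import numpy as np

def decode_pixels(rgb: np.ndarray) -> np.ndarray:
    """Map an (H, W, 3) uint8 RGB array to float32 weights in [-1, 1]."""
    magnitude = rgb[..., 0].astype(np.float32) / 255.0  # R: 0-255 -> 0.0-1.0
    sign = np.where(rgb[..., 2] >= 128, 1.0, -1.0)      # B: >=128 positive, <128 negative
    return (magnitude * sign).astype(np.float32)        # G is ignored (reserved)

def encode_weights(weights: np.ndarray) -> np.ndarray:
    """Inverse mapping. Assumption: sign is stored as B=0 (negative) or B=255 (positive)."""
    w = np.clip(weights, -1.0, 1.0)
    rgb = np.zeros(w.shape + (3,), dtype=np.uint8)
    rgb[..., 0] = np.round(np.abs(w) * 255.0).astype(np.uint8)
    rgb[..., 2] = np.where(w < 0, 0, 255)
    return rgb

# Three example pixels: full positive, half negative, zero.
pixels = np.array([[[255, 0, 255], [128, 0, 0], [0, 0, 200]]], dtype=np.uint8)
weights = decode_pixels(pixels)
roundtrip = decode_pixels(encode_weights(weights))
```

Under this encoding each weight has 8 bits of magnitude plus a sign, so values are quantized in steps of 1/255 and cannot leave the range [-1, 1].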
|
|
| --- |
|
|
| ## 🧠 Architecture |
|
|
| ```text |
| prompt string |
| → char embedding → 32-dim vector |
| → W1 (64×32) → tanh |
| → W2 (64×64) → tanh |
| → W3 (3072×64) → sigmoid |
| → reshape → 32×32×3 image |
| ``` |
|
|
| All weights live inside `model.png`. |
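
A forward pass through this stack might look like the sketch below. Several details are assumptions, since the README does not specify them: `embed_prompt` is a hypothetical placeholder for the char embedding, the layers are assumed to have no biases, the output is assumed to reshape in height × width × channel order, and random matrices stand in for the weights that would be decoded from `model.png`.

```python
import numpy as np

def embed_prompt(prompt: str, dim: int = 32) -> np.ndarray:
    """Hypothetical char embedding: count characters into `dim` slots, then average."""
    vec = np.zeros(dim, dtype=np.float32)
    for ch in prompt:
        vec[ord(ch) % dim] += 1.0
    return vec / max(len(prompt), 1)

def forward(x: np.ndarray, W1: np.ndarray, W2: np.ndarray, W3: np.ndarray) -> np.ndarray:
    """32-dim vector -> tanh -> tanh -> sigmoid -> 32x32x3 image in [0, 1]."""
    h1 = np.tanh(W1 @ x)                     # (64,)
    h2 = np.tanh(W2 @ h1)                    # (64,)
    out = 1.0 / (1.0 + np.exp(-(W3 @ h2)))   # (3072,), sigmoid
    return out.reshape(32, 32, 3)            # assumed HWC layout

# Random stand-ins for the matrices decoded from model.png (weights lie in [-1, 1]).
rng = np.random.default_rng(0)
W1 = rng.uniform(-1, 1, (64, 32)).astype(np.float32)
W2 = rng.uniform(-1, 1, (64, 64)).astype(np.float32)
W3 = rng.uniform(-1, 1, (3072, 64)).astype(np.float32)

image = forward(embed_prompt("red"), W1, W2, W3)
image_u8 = np.round(image * 255).astype(np.uint8)  # ready to save as an 8-bit RGB image
```

The sigmoid keeps every output value in [0, 1], so scaling by 255 maps directly onto 8-bit pixel intensities.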
|
|
| --- |
|
|
| ## 🧪 Dataset vs Outputs |
|
|
| | Target | Output | |
| | ------------------------------------------ | -------------------------------------- | |
| | <img src="dataset/red.png" width="120"> | <img src="out_red.png" width="120"> | |
| | <img src="dataset/green.png" width="120"> | <img src="out_green.png" width="120"> | |
| | <img src="dataset/blue.png" width="120"> | <img src="out_blue.png" width="120"> | |
| | <img src="dataset/white.png" width="120"> | <img src="out_white.png" width="120"> | |
| | <img src="dataset/yellow.png" width="120"> | <img src="out_yellow.png" width="120"> | |
| | <img src="dataset/dark.png" width="120"> | <img src="out_dark.png" width="120"> | |
|
|
| --- |
|
|
| ## Files |
|
|
| ```text |
| model.png ← THE MODEL (64×3200 px) |
| main.py ← inference |
| train.py ← training |
| model.py ← architecture |
| dataset/ |
| red.png |
| red.txt ← prompt: "red" |
| ... |
| ``` |
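
As a quick sanity check, the stated PNG size can be compared with the number of weights in the three matrices from the Architecture section. The arithmetic below uses only figures given in this README; what the leftover pixels hold is not documented here, so the sketch only reports the difference.

```python
# Layer shapes as listed in the Architecture section.
layer_shapes = {"W1": (64, 32), "W2": (64, 64), "W3": (3072, 64)}

weight_count = sum(rows * cols for rows, cols in layer_shapes.values())
pixel_count = 64 * 3200                # model.png dimensions from the file listing
spare_pixels = pixel_count - weight_count
output_values = 32 * 32 * 3            # one value per channel of the 32x32 output

print(weight_count, pixel_count, spare_pixels, output_values)
# 202752 204800 2048 3072
```

One pixel per weight accounts for 202,752 of the 204,800 pixels, and the 3,072 rows of W3 match the 32×32×3 output exactly.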
|
|
| --- |
|
|
| ## ⚙️ Usage |
|
|
| ```bash |
| python train.py |
| python train.py --epochs 500 --lr 0.05 |
| |
| python main.py "red" |
| python main.py "a cat" --out cat.png --scale 8 |
| ``` |
|
|
| --- |
|
|
| ## Tips |
|
|
| * 6–20 samples are enough |
| * Simple patterns converge fastest |
| * 200–500 epochs typical |
| * Loss < 0.001 is strong for toy datasets |
|
|
| --- |
|
|
| *It's a toy. It's not useful. But it works.* |
|
|
| Bench Labs · Simple, Reliable, Open sourced |
|
|