wop committed on
Commit 1d88bd9 · verified · 1 Parent(s): 9936e1a

Update README.md

Files changed (1)
  1. README.md +84 -147
README.md CHANGED
@@ -2,156 +2,93 @@
2
  license: mit
3
  pipeline_tag: text-to-image
4
  ---
5
- <div style="font-family: system-ui, sans-serif; background: #0f0f0f; color: #eaeaea; padding: 2rem; border-radius: 16px; max-width: 860px; margin: auto;">
6
-
7
- <!-- HERO -->
8
- <div style="background: #161616; border: 1px solid #222; padding: 2rem; border-radius: 14px; text-align: center; margin-bottom: 1.5rem;">
9
- <h1 style="color: #33b0d8; font-size: 2rem; margin: 0 0 0.5rem;">PixelModel 🖼️</h1>
10
- <p style="color: #aaa; font-size: 0.95rem; margin: 0;">
11
- A neural network where the weights <strong style="color: #ddd;">are</strong> the image.
12
- </p>
13
- </div>
14
-
15
- <!-- DATASET VS OUTPUTS -->
16
- <div style="background: #161616; border: 1px solid #222; padding: 1.4rem; border-radius: 12px; margin-bottom: 1.5rem;">
17
- <h2 style="color: #fff; font-size: 1rem; margin: 0 0 0.5rem;">🧪 Dataset vs Outputs</h2>
18
- <p style="color: #aaa; font-size: 0.875rem; margin: 0 0 1rem;">Ground truth dataset images compared with generated outputs.</p>
19
- <table style="width: 100%; border-collapse: collapse; text-align: center;">
20
- <tr>
21
- <th style="padding: 8px; color: #33b0d8; font-size: 0.85rem;">Red</th>
22
- <th style="padding: 8px; color: #33b0d8; font-size: 0.85rem;">Green</th>
23
- <th style="padding: 8px; color: #33b0d8; font-size: 0.85rem;">Blue</th>
24
- </tr>
25
- <tr>
26
- <td style="padding: 8px;">
27
- <div style="font-size: 0.75rem; color: #555; margin-bottom: 4px;">dataset</div>
28
- <img src="dataset/red.png" style="width: 100px; image-rendering: pixelated; border-radius: 6px; display: block; margin: auto;" />
29
- <div style="font-size: 0.75rem; color: #555; margin: 4px 0;">output</div>
30
- <img src="out_red.png" style="width: 100px; image-rendering: pixelated; border-radius: 6px; display: block; margin: auto;" />
31
- </td>
32
- <td style="padding: 8px;">
33
- <div style="font-size: 0.75rem; color: #555; margin-bottom: 4px;">dataset</div>
34
- <img src="dataset/green.png" style="width: 100px; image-rendering: pixelated; border-radius: 6px; display: block; margin: auto;" />
35
- <div style="font-size: 0.75rem; color: #555; margin: 4px 0;">output</div>
36
- <img src="out_green.png" style="width: 100px; image-rendering: pixelated; border-radius: 6px; display: block; margin: auto;" />
37
- </td>
38
- <td style="padding: 8px;">
39
- <div style="font-size: 0.75rem; color: #555; margin-bottom: 4px;">dataset</div>
40
- <img src="dataset/blue.png" style="width: 100px; image-rendering: pixelated; border-radius: 6px; display: block; margin: auto;" />
41
- <div style="font-size: 0.75rem; color: #555; margin: 4px 0;">output</div>
42
- <img src="out_blue.png" style="width: 100px; image-rendering: pixelated; border-radius: 6px; display: block; margin: auto;" />
43
- </td>
44
- </tr>
45
- <tr>
46
- <th style="padding: 8px; color: #33b0d8; font-size: 0.85rem;">White</th>
47
- <th style="padding: 8px; color: #33b0d8; font-size: 0.85rem;">Yellow</th>
48
- <th style="padding: 8px; color: #33b0d8; font-size: 0.85rem;">Dark</th>
49
- </tr>
50
- <tr>
51
- <td style="padding: 8px;">
52
- <div style="font-size: 0.75rem; color: #555; margin-bottom: 4px;">dataset</div>
53
- <img src="dataset/white.png" style="width: 100px; image-rendering: pixelated; border-radius: 6px; display: block; margin: auto;" />
54
- <div style="font-size: 0.75rem; color: #555; margin: 4px 0;">output</div>
55
- <img src="out_white.png" style="width: 100px; image-rendering: pixelated; border-radius: 6px; display: block; margin: auto;" />
56
- </td>
57
- <td style="padding: 8px;">
58
- <div style="font-size: 0.75rem; color: #555; margin-bottom: 4px;">dataset</div>
59
- <img src="dataset/yellow.png" style="width: 100px; image-rendering: pixelated; border-radius: 6px; display: block; margin: auto;" />
60
- <div style="font-size: 0.75rem; color: #555; margin: 4px 0;">output</div>
61
- <img src="out_yellow.png" style="width: 100px; image-rendering: pixelated; border-radius: 6px; display: block; margin: auto;" />
62
- </td>
63
- <td style="padding: 8px;">
64
- <div style="font-size: 0.75rem; color: #555; margin-bottom: 4px;">dataset</div>
65
- <img src="dataset/dark.png" style="width: 100px; image-rendering: pixelated; border-radius: 6px; display: block; margin: auto;" />
66
- <div style="font-size: 0.75rem; color: #555; margin: 4px 0;">output</div>
67
- <img src="out_dark.png" style="width: 100px; image-rendering: pixelated; border-radius: 6px; display: block; margin: auto;" />
68
- </td>
69
- </tr>
70
- </table>
71
- </div>
72
-
73
- <!-- WHAT IS THIS -->
74
- <div style="background: #161616; border: 1px solid #222; padding: 1.4rem; border-radius: 12px; margin-bottom: 1.5rem;">
75
- <h2 style="color: #fff; font-size: 1rem; margin: 0 0 0.6rem;">What is this?</h2>
76
- <p style="color: #aaa; font-size: 0.875rem; line-height: 1.7; margin: 0 0 0.75rem;">
77
- <code style="background: #1e1e1e; padding: 2px 6px; border-radius: 4px;">model.png</code> is not a picture of anything — it <em>is</em> the model.
78
- Every pixel's RGB values encode neural network weights:
79
- </p>
80
- <ul style="color: #aaa; font-size: 0.875rem; line-height: 1.7; margin: 0 0 0.75rem; padding-left: 1.1rem;">
81
- <li><strong style="color: #ddd;">R channel</strong> — weight magnitude</li>
82
- <li><strong style="color: #ddd;">B channel</strong> — weight sign (≥128 = positive)</li>
83
- <li><strong style="color: #ddd;">G channel</strong> — bias values</li>
84
- </ul>
85
- <p style="color: #aaa; font-size: 0.875rem; line-height: 1.7; margin: 0;">
86
- At inference, pixels are parsed into 3 weight matrices forming a tiny MLP.
87
- The prompt is embedded into a vector, then a forward pass generates a 32×32 image.
88
- Training directly optimizes pixel values via gradient descent until the PNG itself becomes the model.
89
- </p>
90
- </div>
91
-
92
- <!-- FILES -->
93
- <div style="background: #161616; border: 1px solid #222; padding: 1.4rem; border-radius: 12px; margin-bottom: 1.5rem;">
94
- <h2 style="color: #fff; font-size: 1rem; margin: 0 0 0.75rem;">📁 Files</h2>
95
- <pre style="background: #111; border: 1px solid #1e1e1e; padding: 1rem; border-radius: 8px; color: #aaa; font-size: 0.8rem; overflow-x: auto; margin: 0;">
96
  model.png ← THE MODEL (64×3200 px)
97
  main.py ← inference
98
  train.py ← training
99
  model.py ← architecture
100
- dataset/ ← training data
101
- cat.png
102
- cat.txt ← prompt: "a cat"
103
- ...</pre>
104
- </div>
105
-
106
- <!-- USAGE -->
107
- <div style="background: #161616; border: 1px solid #222; padding: 1.4rem; border-radius: 12px; margin-bottom: 1.5rem;">
108
- <h2 style="color: #fff; font-size: 1rem; margin: 0 0 0.75rem;">⚙️ Usage</h2>
109
- <p style="color: #33b0d8; font-size: 0.8rem; font-weight: 600; margin: 0 0 0.3rem;">Train</p>
110
- <pre style="background: #111; border: 1px solid #1e1e1e; padding: 0.9rem; border-radius: 8px; color: #aaa; font-size: 0.8rem; margin: 0 0 1rem;">
111
  python train.py
112
- python train.py --epochs 500 --lr 0.05</pre>
113
- <p style="color: #33b0d8; font-size: 0.8rem; font-weight: 600; margin: 0 0 0.3rem;">Generate</p>
114
- <pre style="background: #111; border: 1px solid #1e1e1e; padding: 0.9rem; border-radius: 8px; color: #aaa; font-size: 0.8rem; margin: 0 0 0.75rem;">
115
  python main.py "red"
116
- python main.py "a cat" --out cat_out.png --scale 8</pre>
117
- <p style="color: #aaa; font-size: 0.8rem; margin: 0;">
118
- <code style="background: #1e1e1e; padding: 2px 6px; border-radius: 4px;">--scale 8</code> upscales 32×32 → 256×256 using nearest-neighbour interpolation.
119
- </p>
120
- </div>
121
-
122
- <!-- ARCHITECTURE -->
123
- <div style="background: #161616; border: 1px solid #222; padding: 1.4rem; border-radius: 12px; margin-bottom: 1.5rem;">
124
- <h2 style="color: #fff; font-size: 1rem; margin: 0 0 0.75rem;">🧠 Architecture</h2>
125
- <pre style="background: #111; border: 1px solid #1e1e1e; padding: 1rem; border-radius: 8px; color: #aaa; font-size: 0.8rem; overflow-x: auto; margin: 0 0 0.75rem;">
126
- prompt string
127
- → char-level embedding → 32-dim vector
128
- → W1 (64×32) → tanh
129
- → W2 (64×64) → tanh
130
- → W3 (3072×64) → sigmoid
131
- → reshape → 32×32×3 image</pre>
132
- <p style="color: #aaa; font-size: 0.875rem; margin: 0;">
133
- All weights live inside <code style="background: #1e1e1e; padding: 2px 6px; border-radius: 4px;">model.png</code>. Opening the PNG is literally opening the neural network.
134
- </p>
135
- </div>
136
-
137
- <!-- DATASET TIPS -->
138
- <div style="background: #161616; border: 1px solid #222; padding: 1.4rem; border-radius: 12px; margin-bottom: 1.5rem;">
139
- <h2 style="color: #fff; font-size: 1rem; margin: 0 0 0.6rem;">📊 Dataset Tips</h2>
140
- <ul style="color: #aaa; font-size: 0.875rem; line-height: 1.7; margin: 0; padding-left: 1.1rem;">
141
- <li>6–20 image-prompt pairs is enough</li>
142
- <li>Simple targets converge fastest (solid colors, gradients, shapes)</li>
143
- <li>200–500 epochs typically sufficient</li>
144
- <li>Loss below 0.001 is good for simple datasets</li>
145
- <li>Model capacity is fixed (~600K implicit parameters)</li>
146
- </ul>
147
- </div>
148
-
149
- <!-- FOOTER -->
150
- <div style="background: #161616; border: 1px solid #222; padding: 1.2rem; border-radius: 12px; text-align: center;">
151
- <p style="color: #aaa; font-size: 0.875rem; margin: 0 0 0.25rem;">
152
- It's a toy. It's not useful. But it's cool that it works.
153
- </p>
154
- <p style="color: #444; font-size: 0.8rem; margin: 0;">Bench Labs · Simple, Reliable, Open sourced</p>
155
- </div>
156
-
157
- </div>
 
2
  license: mit
3
  pipeline_tag: text-to-image
4
  ---
5
+
6
+ # PixelModel 🖼️
7
+
8
+ A neural network where the weights **are** the image.
9
+
10
+ ## 📌 What is this?
11
+
12
+ `model.png` is not a picture — it *is* the model.
13
+
14
+ Every pixel encodes neural network weights. At inference, the PNG is decoded into weight matrices forming a tiny MLP. The prompt is embedded into a vector, and the model generates a 32×32 image.
15
+
16
+ Training directly optimizes pixel values via gradient descent until the PNG becomes the model itself.
17
+
18
+ ---
19
+
20
+ ## 🎨 Weight Encoding
21
+
22
+ - **R channel** → weight magnitude (0–255 → 0.0–1.0)
23
+ - **B channel** → weight sign (<128 = negative, ≥128 = positive)
24
+ - **G channel** → unused / reserved
25
+
26
+ ---
27
+
28
+ ## 🧠 Architecture
29
+
30
+ ```text
31
+ prompt string
32
+ → char embedding → 32-dim vector
33
+ → W1 (64×32) → tanh
34
+ → W2 (64×64) → tanh
35
+ → W3 (3072×64) → sigmoid
36
+ → reshape → 32×32×3 image
37
+ ```
38
+
39
+ All weights live inside `model.png`.
40
+
41
+ ---
42
+
43
+ ## 🧪 Dataset vs Outputs
44
+
45
+ | Target | Output |
46
+ | ------------------------------------------ | -------------------------------------- |
47
+ | <img src="dataset/red.png" width="120"> | <img src="out_red.png" width="120"> |
48
+ | <img src="dataset/green.png" width="120"> | <img src="out_green.png" width="120"> |
49
+ | <img src="dataset/blue.png" width="120"> | <img src="out_blue.png" width="120"> |
50
+ | <img src="dataset/white.png" width="120"> | <img src="out_white.png" width="120"> |
51
+ | <img src="dataset/yellow.png" width="120"> | <img src="out_yellow.png" width="120"> |
52
+ | <img src="dataset/dark.png" width="120"> | <img src="out_dark.png" width="120"> |
53
+
54
+ ---
55
+
56
+ ## 📁 Files
57
+
58
+ ```text
59
  model.png ← THE MODEL (64×3200 px)
60
  main.py ← inference
61
  train.py ← training
62
  model.py ← architecture
63
+ dataset/
64
+ red.png
65
+ red.txt ← prompt: "red"
66
+ ...
67
+ ```
68
+
69
+ ---
70
+
71
+ ## ⚙️ Usage
72
+
73
+ ```bash
74
  python train.py
75
+ python train.py --epochs 500 --lr 0.05
76
+
77
  python main.py "red"
78
+ python main.py "a cat" --out cat.png --scale 8
79
+ ```
80
+
81
+ ---
82
+
83
+ ## 📊 Tips
84
+
85
+ * 6–20 samples are enough
86
+ * Simple patterns converge fastest
87
+ * 200–500 epochs typical
88
+ * Loss < 0.001 is strong for toy datasets
89
+
90
+ ---
91
+
92
+ *It’s a toy. It’s not useful. But it works.*
93
+
94
+ Bench Labs · Simple, Reliable, Open sourced
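
The weight encoding and layer shapes described in the updated README can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code in `model.py`: the names `decode_weight` and `encode_weight` are hypothetical, the one-weight-per-pixel layout is assumed rather than confirmed by the README, and the real files may scale or pack values differently.

```python
# Hypothetical sketch of the per-pixel weight encoding described in the README.
# decode_weight / encode_weight are illustrative names, not taken from model.py.

def decode_weight(r: int, b: int) -> float:
    """Map one pixel's R and B channel values (0-255) to a signed weight."""
    magnitude = r / 255.0               # R channel: 0-255 maps to 0.0-1.0
    sign = 1.0 if b >= 128 else -1.0    # B channel: >=128 positive, <128 negative
    return sign * magnitude


def encode_weight(w: float) -> tuple[int, int, int]:
    """Assumed inverse mapping for a weight in [-1, 1]; returns (R, G, B)."""
    clipped = max(-1.0, min(1.0, w))
    r = round(abs(clipped) * 255)
    b = 255 if clipped >= 0 else 0      # assumption: any value on the correct side of 128 would do
    return (r, 0, b)                    # G is left at 0 (unused / reserved)


# Pixel budget, assuming one weight per pixel and the layer shapes in the README.
layer_shapes = [(64, 32), (64, 64), (3072, 64)]
weights_needed = sum(rows * cols for rows, cols in layer_shapes)
pixels_available = 64 * 3200

print(decode_weight(255, 200))            # 1.0
print(decode_weight(255, 0))              # -1.0
print(weights_needed, pixels_available)   # 202752 204800
```

Under that assumption the three matrices need 202,752 pixels, which fits within the 204,800 pixels of a 64×3200 image.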