Update README.md

#1
by calcuis - opened
Files changed (1)
  1. README.md +9 -526
README.md CHANGED
@@ -3,543 +3,26 @@ license: mit
3
  ---
4
  # Pixel
5
 
6
- A PyTorch-based Generative Adversarial Network (GAN) for training and generating CryptoPunk-style pixel art images.
7
 
8
- ## Table of Contents
9
 
10
- - [Overview](#overview)
11
- - [Project Structure](#project-structure)
12
- - [Architecture](#architecture)
13
- - [Workflow](#workflow)
14
- - [Installation](#installation)
15
- - [User Guide](#user-guide)
16
- - [Configuration](#configuration)
17
- - [Examples](#examples)
18
-
19
- ---
20
-
21
- ## Overview
22
-
23
- This project implements a Deep Convolutional GAN (DCGAN) to generate pixel art images. It includes:
24
-
25
- - **Training Pipeline**: Train a GAN model on your custom image dataset
26
- - **Image Generation**: Generate new images using trained models
27
- - **Interactive GUI**: Tkinter-based interface for real-time generation
28
- - **GGUF Support**: Convert and use GGUF quantized models
29
-
30
- ---
31
-
32
- ## Project Structure
33
-
34
- ```
35
- ai-picture-model-trainer/
36
- │
37
- ├── trainer.py              # GAN training script
38
- ├── generator.py            # Image generation (GUI + CLI)
39
- │
40
- ├── data/                   # Training data
41
- │   ├── attributes.csv      # Dataset metadata
42
- │   └── images/             # Training images (punk000.png, punk001.png, ...)
43
- │
44
- ├── models/                 # Trained model storage
45
- │   └── generator_model.safetensors
46
- │
47
- ├── generated/              # Generated image outputs
48
- │   ├── output.png          # Grid visualizations
49
- │   └── individual/         # Individual generated images
50
- │
51
- ├── gen_images/             # Training progress images
52
- │   ├── epoch_0.png
53
- │   ├── epoch_1.png
54
- │   └── ...
55
- │
56
- └── gguf/                   # GGUF format support
57
-     └── generator.py        # GGUF model converter/generator
58
- ```
59
-
60
- ### Directory Purpose
61
-
62
- | Directory | Purpose |
63
- |-----------|---------|
64
- | `data/` | Input training images and metadata |
65
- | `models/` | Saved trained models (.safetensors format) |
66
- | `generated/` | Output from generator.py |
67
- | `gen_images/` | Training progress visualizations |
68
- | `gguf/` | GGUF quantized model support |
69
-
70
- ---
71
-
72
- ## Architecture
73
-
74
- ### GAN Components
75
-
76
- ```
77
- ┌──────────────────────────────────────────────────────────┐
78
- │                     GAN Architecture                     │
79
- └──────────────────────────────────────────────────────────┘
80
-
81
- ┌──────────────────┐                    ┌──────────────────┐
82
- │    Generator     │                    │  Discriminator   │
83
- │                  │                    │                  │
84
- │  Input: Noise    │                    │  Input: Images   │
85
- │  (100-dim)       │                    │  (24x24x4)       │
86
- │                  │                    │                  │
87
- │  ┌────────────┐  │                    │  ┌────────────┐  │
88
- │  │     FC     │  │                    │  │   Conv2D   │  │
89
- │  │  (9,216)   │  │                    │  │ 64 filters │  │
90
- │  └────────────┘  │                    │  └────────────┘  │
91
- │        ↓         │                    │        ↓         │
92
- │  ┌────────────┐  │                    │  ┌────────────┐  │
93
- │  │  Reshape   │  │    ┌──────────┐    │  │   Conv2D   │  │
94
- │  │ (256,6,6)  │  │───→│ Real or  │←───│  │ 128 filters│  │
95
- │  └────────────┘  │    │  Fake?   │    │  └────────────┘  │
96
- │        ↓         │    └──────────┘    │        ↓         │
97
- │  ┌────────────┐  │                    │  ┌────────────┐  │
98
- │  │ ConvTrans  │  │                    │  │   Conv2D   │  │
99
- │  │ 128 filters│  │                    │  │ 256 filters│  │
100
- │  └────────────┘  │                    │  └────────────┘  │
101
- │        ↓         │                    │        ↓         │
102
- │  ┌────────────┐  │                    │  ┌────────────┐  │
103
- │  │ ConvTrans  │  │                    │  │ GlobalAvg  │  │
104
- │  │  64 filters│  │                    │  │    Pool    │  │
105
- │  └────────────┘  │                    │  └────────────┘  │
106
- │        ↓         │                    │        ↓         │
107
- │  ┌────────────┐  │                    │  ┌────────────┐  │
108
- │  │ ConvTrans  │  │                    │  │  FC + Sig  │  │
109
- │  │  4 channels│  │                    │  │   (0-1)    │  │
110
- │  └────────────┘  │                    │  └────────────┘  │
111
- │                  │                    │                  │
112
- │  Output: Image   │                    │  Output: Score   │
113
- │  (24x24x4 RGBA)  │                    │  (Real/Fake)     │
114
- └──────────────────┘                    └──────────────────┘
115
- ```
116
-
117
- ### Model Details
118
-
119
- **Generator**:
120
- - Input: 100-dimensional latent vector (random noise)
121
- - Architecture: FC → BatchNorm → 3x ConvTranspose2D with BatchNorm
122
- - Output: 24x24x4 RGBA image (values in [-1, 1])
123
- - Activation: LeakyReLU + Tanh (output)
124
-
125
- **Discriminator**:
126
- - Input: 24x24x4 RGBA image
127
- - Architecture: 3x Conv2D with Dropout → GlobalAvgPool → FC
128
- - Output: Probability score [0, 1] (real vs. fake)
129
- - Activation: LeakyReLU + Sigmoid (output)
130
-
131
- ---
132
-
133
- ## Workflow
134
-
135
- ### Training Workflow
136
-
137
- ```
138
- ┌─────────────┐
138
- │    Start    │
139
- └──────┬──────┘
140
-        │
141
-        ▼
142
- ┌──────────────────────┐
143
- │  Load Dataset        │
144
- │  (data/images/)      │
145
- └──────┬───────────────┘
146
-        │
147
-        ▼
148
- ┌──────────────────────┐
149
- │  Initialize Models   │
150
- │  - Generator         │
151
- │  - Discriminator     │
152
- └──────┬───────────────┘
153
-        │
154
-        ▼
155
- ┌──────────────────────┐
156
- │  Training Loop       │◄─────────┐
157
- │  (N epochs)          │          │
158
- └──────┬───────────────┘          │
159
-        │                          │
160
-        ▼                          │
161
- ┌──────────────────────┐          │
162
- │  For each batch:     │          │
163
- │  1. Train Discrim.   │          │
164
- │  2. Train Generator  │          │
165
- └──────┬───────────────┘          │
166
-        │                          │
167
-        ▼                          │
168
- ┌──────────────────────┐          │
169
- │  Save Progress       │          │
170
- │  (gen_images/)       │──────────┘
171
- └──────┬───────────────┘
172
-        │
173
-        ▼
174
- ┌──────────────────────┐
175
- │  Save Final Model    │
176
- │  (models/*.safetensors)
177
- └──────┬───────────────┘
178
-        │
179
-        ▼
180
- ┌──────────────┐
181
- │   Complete   │
182
- └──────────────┘
184
- ```
185
-
186
- ### Generation Workflow
187
-
188
- ```
189
- ┌─────────────────────┐
189
- │   Mode Selection    │
190
- └──────┬──────────────┘
191
-        │
192
-        ├─────────────────────────┐
193
-        │                         │
194
-        ▼                         ▼
195
- ┌──────────────┐          ┌──────────────────┐
196
- │   GUI Mode   │          │     CLI Mode     │
197
- └──────┬───────┘          └──────┬───────────┘
198
-        │                         │
199
-        ▼                         ▼
200
- ┌──────────────┐          ┌──────────────────┐
201
- │  Load Model  │          │  Load Model      │
202
- │ (safetensors)│          │  Parse Args      │
203
- └──────┬───────┘          └──────┬───────────┘
204
-        │                         │
205
-        ▼                         ▼
206
- ┌──────────────┐          ┌──────────────────┐
207
- │ Tkinter GUI  │          │ Generate N Images│
208
- │ - Button 1x1 │          │ - Custom grid    │
209
- │ - Button 3x3 │          │ - Custom seed    │
210
- │ - Button 5x5 │          │ - Save options   │
211
- └──────┬───────┘          └──────┬───────────┘
212
-        │                         │
213
-        ▼                         ▼
214
- ┌──────────────┐          ┌──────────────────┐
215
- │  On Click:   │          │  Save Grid       │
216
- │  Generate    │          │  Save Individual │
217
- │  Display     │          │  (optional)      │
218
- └──────┬───────┘          └──────────────────┘
219
-        │
220
-        ▼
221
- ┌──────────────┐
222
- │ Interactive  │
223
- │ Generation   │
224
- └──────────────┘
226
- ```
227
-
228
- ---
229
-
230
- ## Installation
231
-
232
- ### Prerequisites
233
-
234
- - Python 3.8+
235
- - CUDA-compatible GPU (optional, for faster training)
236
-
237
- ### Setup
238
-
239
- 1. **Clone/Download the repository**
240
-
241
- 2. **Install dependencies**:
242
-
243
- ```bash
244
- pip install torch torchvision
245
- pip install numpy pandas matplotlib pillow
246
- pip install safetensors
247
- ```
248
-
249
- 3. **Prepare your dataset**:
250
-
251
- Place your training images in `data/images/` with filenames like `punk000.png`, `punk001.png`, etc.
252
-
253
- Create `data/attributes.csv`:
254
- ```csv
255
- id
256
- 0
257
- 1
258
- 2
259
- ...
260
- ```
261
-
262
- ---
263
-
264
- ## User Guide
265
-
266
- ### 1. Training a Model
267
-
268
- Train the GAN on your dataset:
269
-
270
- ```bash
271
- python trainer.py \
272
- --data_path ./data/attributes.csv \
273
- --images_path ./data/images/ \
274
- --model_output_path ./models/ \
275
- --images_output_path ./gen_images/ \
276
- --epochs 50 \
277
- --batch_size 16
278
  ```
279
-
280
- **Training Parameters**:
281
-
282
- | Parameter | Default | Description |
283
- |-----------|---------|-------------|
284
- | `--data_path` | `./data/attributes.csv` | Path to dataset metadata |
285
- | `--images_path` | `./data/images/` | Directory containing training images |
286
- | `--model_output_path` | `./models/` | Where to save trained model |
287
- | `--images_output_path` | `./gen_images/` | Save progress images during training |
288
- | `--epochs` | `50` | Number of training epochs |
289
- | `--batch_size` | `16` | Training batch size |
290
- | `--codings_size` | `100` | Latent vector dimension |
291
- | `--image_size` | `24` | Output image size (24x24) |
292
- | `--image_channels` | `4` | Image channels (4=RGBA, 3=RGB) |
293
-
294
- **Training Output**:
295
- - Progress displayed: `Epoch X/Y - Gen Loss: X.XXXX, Disc Loss: X.XXXX`
296
- - Progress images saved to `gen_images/epoch_N.png`
297
- - Final model saved to `models/generator_model.safetensors`
298
-
299
- ---
300
-
301
- ### 2. Generating Images
302
-
303
- #### Option A: Interactive GUI Mode (Default)
304
-
305
- Launch the Tkinter GUI for real-time generation:
306
-
307
- ```bash
308
- python generator.py
309
  ```
310
 
311
- or explicitly:
312
-
313
- ```bash
314
- python generator.py --gui
315
  ```
316
-
317
- **GUI Controls**:
318
- - **Generate 1 cryptopunk**: Single image
319
- - **Generate 3x3 cryptopunks**: 3x3 grid (9 images)
320
- - **Generate 5x5 cryptopunks**: 5x5 grid (25 images)
321
- - **Terminate**: Close the application
322
-
323
- ---
324
-
325
- #### Option B: Command-Line Interface (CLI)
326
-
327
- Batch generate images from terminal:
328
-
329
- **Basic generation** (16 images):
330
- ```bash
331
- python generator.py --num_images 16 --output_path ./generated/output.png
332
  ```
333
 
334
- **Custom grid** (8x8 = 64 images):
335
- ```bash
336
- python generator.py --grid_size 8 --output_path ./generated/grid_8x8.png
337
  ```
338
-
339
- **Reproducible generation** (with seed):
340
- ```bash
341
- python generator.py --grid_size 4 --seed 42 --output_path ./generated/seed42.png
342
  ```
343
 
344
- **Save individual images**:
345
- ```bash
346
- python generator.py \
347
- --num_images 100 \
348
- --save_individual \
349
- --individual_output_dir ./generated/individual/ \
350
- --output_path ./generated/batch.png
351
  ```
352
-
353
- **CLI Parameters**:
354
-
355
- | Parameter | Default | Description |
356
- |-----------|---------|-------------|
357
- | `--model_path` | `./models/generator_model.safetensors` | Path to trained model |
358
- | `--output_path` | `./generated/output.png` | Output path for grid image |
359
- | `--num_images` | `16` | Number of images to generate |
360
- | `--grid_size` | `None` | Grid size N for NxN layout |
361
- | `--seed` | `None` | Random seed for reproducibility |
362
- | `--save_individual` | `False` | Save each image separately |
363
- | `--individual_output_dir` | `./generated/individual/` | Directory for individual images |
364
-
365
- ---
366
-
367
- ### 3. GGUF Model Support
368
-
369
- Use quantized GGUF models for smaller file sizes:
370
-
371
- ```bash
372
- cd gguf/
373
  python generator.py
374
  ```
375
-
376
- The GGUF generator will:
377
- 1. Detect available `.gguf` files in the directory
378
- 2. Prompt you to select a model
379
- 3. Convert GGUF → SafeTensors format
380
- 4. Launch the standard generator
381
-
382
- ---
383
-
384
- ## Configuration
385
-
386
- ### Model Architecture Configuration
387
-
388
- Modify these parameters in both `trainer.py` and `generator.py`:
389
-
390
- ```python
391
- --codings_size 100 # Latent vector dimension
392
- --image_size 24 # Output image size
393
- --image_channels 4 # RGBA (4) or RGB (3)
394
- ```
395
-
396
- ### Training Hyperparameters
397
-
398
- In `trainer.py`:
399
-
400
- ```python
401
- # Optimizer
402
- gen_optimizer = optim.RMSprop(generator.parameters(), lr=0.001)
403
- disc_optimizer = optim.RMSprop(discriminator.parameters(), lr=0.001)
404
-
405
- # Loss function
406
- criterion = nn.BCELoss()
407
-
408
- # Dropout rate (in Discriminator)
409
- nn.Dropout(0.4)
410
- ```
411
-
412
- ### Data Preprocessing
413
-
414
- In `trainer.py` → `ImageDataset`:
415
-
416
- ```python
417
- transforms.Compose([
418
- transforms.Resize((image_size, image_size)),
419
- transforms.ToTensor(),
420
- transforms.Normalize([0.5] * channels, [0.5] * channels) # [-1, 1]
421
- ])
422
- ```
423
-
424
- ---
425
-
426
- ## Examples
427
-
428
- ### Example 1: Train on Custom Dataset
429
-
430
- ```bash
431
- # Prepare your data
432
- # data/images/punk000.png, punk001.png, ..., punk099.png
433
- # data/attributes.csv with ids 0-99
434
-
435
- # Train for 100 epochs
436
- python trainer.py \
437
- --data_path ./data/attributes.csv \
438
- --images_path ./data/images/ \
439
- --epochs 100 \
440
- --batch_size 32 \
441
- --model_output_path ./models/my_model.safetensors
442
- ```
443
-
444
- ### Example 2: Generate with Specific Seed
445
-
446
- ```bash
447
- # Generate same images every time
448
- python generator.py \
449
- --model_path ./models/generator_model.safetensors \
450
- --grid_size 5 \
451
- --seed 12345 \
452
- --output_path ./results/reproducible.png
453
- ```
454
-
455
- ### Example 3: Batch Generation
456
-
457
- ```bash
458
- # Generate 1000 individual images
459
- python generator.py \
460
- --num_images 1000 \
461
- --save_individual \
462
- --individual_output_dir ./dataset_synthetic/ \
463
- --output_path ./dataset_synthetic/overview.png
464
- ```
465
-
466
- ### Example 4: Monitor Training Progress
467
-
468
- ```bash
469
- # Training with progress visualization
470
- python trainer.py \
471
- --epochs 200 \
472
- --images_output_path ./training_progress/
473
-
474
- # View progress images
475
- ls ./training_progress/
476
- # epoch_0.png, epoch_1.png, ..., epoch_199.png
477
- ```
478
-
479
- ---
480
-
481
- ## Technical Details
482
-
483
- ### Model File Format
484
-
485
- Models are saved in **SafeTensors** format (`.safetensors`) with embedded metadata:
486
-
487
- ```python
488
- metadata = {
489
- 'codings_size': '100',
490
- 'image_size': '24',
491
- 'image_channels': '4'
492
- }
493
- ```
494
-
495
- This ensures the generator automatically loads the correct architecture.
496
-
497
- ### Image Value Ranges
498
-
499
- - **Training**: Images normalized to [-1, 1]
500
- - **Generation output**: Images scaled to [0, 1]
501
- - **Saved files**: Images saved as uint8 [0, 255]
502
-
503
- ### GPU Support
504
-
505
- The code automatically detects and uses CUDA if available:
506
-
507
- ```python
508
- device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
509
- ```
510
-
511
- ---
512
-
513
- ## Troubleshooting
514
-
515
- **Q: Training loss not decreasing?**
516
- - Try adjusting learning rates
517
- - Increase batch size or epochs
518
- - Check if dataset has sufficient variety
519
-
520
- **Q: Generated images look like noise?**
521
- - Model needs more training epochs
522
- - Dataset may be too small (need 50+ images minimum)
523
- - Try adjusting discriminator dropout rate
524
-
525
- **Q: GUI not launching?**
526
- - Check Tkinter installation: `python -m tkinter`
527
- - On Linux: `sudo apt-get install python3-tk`
528
-
529
- **Q: CUDA out of memory?**
530
- - Reduce batch size: `--batch_size 8`
531
- - Reduce image size: `--image_size 16`
532
-
533
- ---
534
-
535
- ## License
536
-
537
- This project is provided as-is for educational and creative purposes.
538
-
539
- ---
540
-
541
- ## Acknowledgments
542
-
543
- - Built with PyTorch
544
- - Inspired by DCGAN architecture
545
- - Uses SafeTensors for model serialization
 
3
  ---
4
  # Pixel
5
 
6
+ A PyTorch-based Generative Adversarial Network (GAN) for training and generating pixel art images.
7
 
8
+ ## Setup
9
 
10
+ Git clone the pixel repo:
11
  ```
12
+ git clone https://github.com/mochiyaki/pixel
13
  ```
14
 
15
+ Get inside the cloned folder:
16
  ```
17
+ cd pixel
18
  ```
19
 
20
+ Start training with your dataset (in ./data/):
21
  ```
22
+ python trainer.py
23
  ```
24
 
25
+ When finished, check the model file (in ./models/) then run the inference:
26
  ```
27
  python generator.py
28
  ```