Files changed (1)
  1. README.md +9 -526
README.md CHANGED
@@ -3,543 +3,26 @@ license: mit
3
  ---
4
  # Pixel
5
 
6
- A PyTorch-based Generative Adversarial Network (GAN) for training and generating CryptoPunk-style pixel art images.
7
 
8
- ## Table of Contents
9
 
10
- - [Overview](#overview)
11
- - [Project Structure](#project-structure)
12
- - [Architecture](#architecture)
13
- - [Workflow](#workflow)
14
- - [Installation](#installation)
15
- - [User Guide](#user-guide)
16
- - [Configuration](#configuration)
17
- - [Examples](#examples)
18
-
19
- ---
20
-
21
- ## Overview
22
-
23
- This project implements a Deep Convolutional GAN (DCGAN) to generate pixel art images. It includes:
24
-
25
- - **Training Pipeline**: Train a GAN model on your custom image dataset
26
- - **Image Generation**: Generate new images using trained models
27
- - **Interactive GUI**: Tkinter-based interface for real-time generation
28
- - **GGUF Support**: Convert and use GGUF quantized models
29
-
30
- ---
31
-
32
- ## Project Structure
33
-
34
- ```
35
- ai-picture-model-trainer/
36
-
37
- ├── trainer.py # GAN training script
38
- ├── generator.py # Image generation (GUI + CLI)
39
-
40
- ├── data/ # Training data
41
- │ ├── attributes.csv # Dataset metadata
42
- │ └── images/ # Training images (punk000.png, punk001.png, ...)
43
-
44
- ├── models/ # Trained model storage
45
- │ └── generator_model.safetensors
46
-
47
- ├── generated/ # Generated image outputs
48
- │ ├── output.png # Grid visualizations
49
- │ └── individual/ # Individual generated images
50
-
51
- ├── gen_images/ # Training progress images
52
- │ ├── epoch_0.png
53
- │ ├── epoch_1.png
54
- │ └── ...
55
-
56
- └── gguf/ # GGUF format support
57
- └── generator.py # GGUF model converter/generator
58
- ```
59
-
60
- ### Directory Purpose
61
-
62
- | Directory | Purpose |
63
- |-----------|---------|
64
- | `data/` | Input training images and metadata |
65
- | `models/` | Saved trained models (.safetensors format) |
66
- | `generated/` | Output from generator.py |
67
- | `gen_images/` | Training progress visualizations |
68
- | `gguf/` | GGUF quantized model support |
69
-
70
- ---
71
-
72
- ## Architecture
73
-
74
- ### GAN Components
75
-
76
- ```
77
- ┌─────────────────────────────────────────────────────────────┐
78
- │ GAN Architecture │
79
- └─────────────────────────────────────────────────────────────┘
80
-
81
- ┌──────────────────┐ ┌──────────────────┐
82
- │ Generator │ │ Discriminator │
83
- │ │ │ │
84
- │ Input: Noise │ │ Input: Images │
85
- │ (100-dim) │ │ (24x24x4) │
86
- │ │ │ │
87
- │ ┌────────────┐ │ │ ┌────────────┐ │
88
- │ │ FC │ │ │ │ Conv2D │ │
89
- │ │ (9,216) │ │ │ │ 64 filters│ │
90
- │ └────────────┘ │ │ └────────────┘ │
91
- │ ↓ │ │ ↓ │
92
- │ ┌────────────┐ │ │ ┌────────────┐ │
93
- │ │ Reshape │ │ ┌──────────┐ │ │ Conv2D │ │
94
- │ │ (256,6,6) │ │───→│ Real or │←───│ │ 128 filters│ │
95
- │ └────────────┘ │ │ Fake? │ │ └────────────┘ │
96
- │ ↓ │ └──────────┘ │ ↓ │
97
- │ ┌────────────┐ │ │ ┌────────────┐ │
98
- │ │ ConvTrans │ │ │ │ Conv2D │ │
99
- │ │ 128 filters│ │ │ │ 256 filters│ │
100
- │ └────────────┘ │ │ └────────────┘ │
101
- │ ↓ │ │ ↓ │
102
- │ ┌────────────┐ │ │ ┌────────────┐ │
103
- │ │ ConvTrans │ │ │ │ GlobalAvg │ │
104
- │ │ 64 filters│ │ │ │ Pool │ │
105
- │ └────────────┘ │ │ └────────────┘ │
106
- │ ↓ │ │ ↓ │
107
- │ ┌────────────┐ │ │ ┌────────────┐ │
108
- │ │ ConvTrans │ │ │ │ FC + Sig │ │
109
- │ │ 4 channels│ │ │ │ (0-1) │ │
110
- │ └────────────┘ │ │ └────────────┘ │
111
- │ │ │ │
112
- │ Output: Image │ │ Output: Score │
113
- │ (24x24x4 RGBA) │ │ (Real/Fake) │
114
- └──────────────────┘ └──────────────────┘
115
- ```
116
-
117
- ### Model Details
118
-
119
- **Generator**:
120
- - Input: 100-dimensional latent vector (random noise)
121
- - Architecture: FC → BatchNorm → 3x ConvTranspose2D with BatchNorm
122
- - Output: 24x24x4 RGBA image (values in [-1, 1])
123
- - Activation: LeakyReLU + Tanh (output)
124
-
125
- **Discriminator**:
126
- - Input: 24x24x4 RGBA image
127
- - Architecture: 3x Conv2D with Dropout → GlobalAvgPool → FC
128
- - Output: Probability score [0, 1] (real vs. fake)
129
- - Activation: LeakyReLU + Sigmoid (output)
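The spatial sizes in the diagram can be checked with a little arithmetic. The sketch below applies the standard transposed-convolution output-size formula to the 6 → 24 path. The kernel, stride, and padding values are assumptions chosen so the shapes work out, since the README does not state them.

```python
def conv_transpose_out(size, kernel, stride, padding, output_padding=0, dilation=1):
    """Output size of a transposed convolution along one spatial dimension."""
    return (size - 1) * stride - 2 * padding + dilation * (kernel - 1) + output_padding + 1

# The FC output is reshaped to (256, 6, 6): 256 * 6 * 6 features.
fc_features = 256 * 6 * 6

# Hypothetical (kernel, stride, padding) per ConvTranspose layer; not taken from trainer.py.
layers = [(4, 2, 1), (4, 2, 1), (3, 1, 1)]

sizes = [6]
for kernel, stride, padding in layers:
    sizes.append(conv_transpose_out(sizes[-1], kernel, stride, padding))

print(fc_features, sizes)
```

Under these assumed settings, two stride-2 layers double the resolution twice and the final stride-1 layer only changes the channel count.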
130
-
131
- ---
132
-
133
- ## Workflow
134
-
135
- ### Training Workflow
136
-
137
- ```
138
- ┌─────────────┐
139
- │ Start │
140
- └──────┬──────┘
141
-
142
-
143
- ┌──────────────────────┐
144
- │ Load Dataset │
145
- │ (data/images/) │
146
- └──────┬───────────────┘
147
-
148
-
149
- ┌──────────────────────┐
150
- │ Initialize Models │
151
- │ - Generator │
152
- │ - Discriminator │
153
- └──────┬───────────────┘
154
-
155
-
156
- ┌──────────────────────┐
157
- │ Training Loop │◄─────────┐
158
- │ (N epochs) │ │
159
- └──────┬───────────────┘ │
160
- │ │
161
- ▼ │
162
- ┌──────────────────────┐ │
163
- │ For each batch: │ │
164
- │ 1. Train Discrim. │ │
165
- │ 2. Train Generator │ │
166
- └──────┬───────────────┘ │
167
- │ │
168
- ▼ │
169
- ┌──────────────────────┐ │
170
- │ Save Progress │ │
171
- │ (gen_images/) │──────────┘
172
- └──────┬───────────────┘
173
-
174
-
175
- ┌──────────────────────┐
176
- │ Save Final Model │
177
- │ (models/*.safetensors)
178
- └──────┬───────────────┘
179
-
180
-
181
- ┌──────────────┐
182
- │ Complete │
183
- └──────────────┘
184
- ```
185
-
186
- ### Generation Workflow
187
-
188
- ```
189
- ┌─────────────────────┐
190
- │ Mode Selection │
191
- └──────┬──────────────┘
192
-
193
- ├─────────────────────────┐
194
- │ │
195
- ▼ ▼
196
- ┌──────────────┐ ┌──────────────────┐
197
- │ GUI Mode │ │ CLI Mode │
198
- └──────┬───────┘ └──────┬───────────┘
199
- │ │
200
- ▼ ▼
201
- ┌──────────────┐ ┌──────────────────┐
202
- │ Load Model │ │ Load Model │
203
- │ (safetensors)│ │ Parse Args │
204
- └──────┬───────┘ └──────┬───────────┘
205
- │ │
206
- ▼ ▼
207
- ┌──────────────┐ ┌──────────────────┐
208
- │ Tkinter GUI │ │ Generate N Images│
209
- │ - Button 1x1 │ │ - Custom grid │
210
- │ - Button 3x3 │ │ - Custom seed │
211
- │ - Button 5x5 │ │ - Save options │
212
- └──────┬───────┘ └──────┬───────────┘
213
- │ │
214
- ▼ ▼
215
- ┌──────────────┐ ┌──────────────────┐
216
- │ On Click: │ │ Save Grid │
217
- │ Generate │ │ Save Individual │
218
- │ Display │ │ (optional) │
219
- └──────┬───────┘ └──────────────────┘
220
-
221
-
222
- ┌──────────────┐
223
- │ Interactive │
224
- │ Generation │
225
- └──────────────┘
226
- ```
227
-
228
- ---
229
-
230
- ## Installation
231
-
232
- ### Prerequisites
233
-
234
- - Python 3.8+
235
- - CUDA-compatible GPU (optional, for faster training)
236
-
237
- ### Setup
238
-
239
- 1. **Clone/Download the repository**
240
-
241
- 2. **Install dependencies**:
242
-
243
- ```bash
244
- pip install torch torchvision
245
- pip install numpy pandas matplotlib pillow
246
- pip install safetensors
247
- ```
248
-
249
- 3. **Prepare your dataset**:
250
-
251
- Place your training images in `data/images/` with filenames like `punk000.png`, `punk001.png`, etc.
252
-
253
- Create `data/attributes.csv`:
254
- ```csv
255
- id
256
- 0
257
- 1
258
- 2
259
- ...
260
- ```
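As a rough illustration of how the metadata and image files relate, the sketch below reads the `id` column and builds the matching filenames. The three-digit zero padding and the `image_filenames` helper are assumptions inferred from the `punk000.png` naming above, not code from `trainer.py`.

```python
import csv
import io

def image_filenames(csv_file, prefix="punk", digits=3):
    """Map each id in the CSV to a filename such as punk007.png (assumed naming scheme)."""
    reader = csv.DictReader(csv_file)
    return [f"{prefix}{int(row['id']):0{digits}d}.png" for row in reader]

# In-memory stand-in for data/attributes.csv
sample = io.StringIO("id\n0\n1\n2\n")
names = image_filenames(sample)
print(names)
```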
261
-
262
- ---
263
-
264
- ## User Guide
265
-
266
- ### 1. Training a Model
267
-
268
- Train the GAN on your dataset:
269
-
270
- ```bash
271
- python trainer.py \
272
- --data_path ./data/attributes.csv \
273
- --images_path ./data/images/ \
274
- --model_output_path ./models/ \
275
- --images_output_path ./gen_images/ \
276
- --epochs 50 \
277
- --batch_size 16
278
  ```
279
-
280
- **Training Parameters**:
281
-
282
- | Parameter | Default | Description |
283
- |-----------|---------|-------------|
284
- | `--data_path` | `./data/attributes.csv` | Path to dataset metadata |
285
- | `--images_path` | `./data/images/` | Directory containing training images |
286
- | `--model_output_path` | `./models/` | Where to save trained model |
287
- | `--images_output_path` | `./gen_images/` | Save progress images during training |
288
- | `--epochs` | `50` | Number of training epochs |
289
- | `--batch_size` | `16` | Training batch size |
290
- | `--codings_size` | `100` | Latent vector dimension |
291
- | `--image_size` | `24` | Output image size (24x24) |
292
- | `--image_channels` | `4` | Image channels (4=RGBA, 3=RGB) |
293
-
294
- **Training Output**:
295
- - Progress displayed: `Epoch X/Y - Gen Loss: X.XXXX, Disc Loss: X.XXXX`
296
- - Progress images saved to `gen_images/epoch_N.png`
297
- - Final model saved to `models/generator_model.safetensors`
298
-
299
- ---
300
-
301
- ### 2. Generating Images
302
-
303
- #### Option A: Interactive GUI Mode (Default)
304
-
305
- Launch the Tkinter GUI for real-time generation:
306
-
307
- ```bash
308
- python generator.py
309
  ```
310
 
311
- or explicitly:
312
-
313
- ```bash
314
- python generator.py --gui
315
  ```
316
-
317
- **GUI Controls**:
318
- - **Generate 1 cryptopunk**: Single image
319
- - **Generate 3x3 cryptopunks**: 3x3 grid (9 images)
320
- - **Generate 5x5 cryptopunks**: 5x5 grid (25 images)
321
- - **Terminate**: Close the application
322
-
323
- ---
324
-
325
- #### Option B: Command-Line Interface (CLI)
326
-
327
- Batch generate images from terminal:
328
-
329
- **Basic generation** (16 images):
330
- ```bash
331
- python generator.py --num_images 16 --output_path ./generated/output.png
332
  ```
333
 
334
- **Custom grid** (8x8 = 64 images):
335
- ```bash
336
- python generator.py --grid_size 8 --output_path ./generated/grid_8x8.png
337
  ```
338
-
339
- **Reproducible generation** (with seed):
340
- ```bash
341
- python generator.py --grid_size 4 --seed 42 --output_path ./generated/seed42.png
342
  ```
343
 
344
- **Save individual images**:
345
- ```bash
346
- python generator.py \
347
- --num_images 100 \
348
- --save_individual \
349
- --individual_output_dir ./generated/individual/ \
350
- --output_path ./generated/batch.png
351
  ```
352
-
353
- **CLI Parameters**:
354
-
355
- | Parameter | Default | Description |
356
- |-----------|---------|-------------|
357
- | `--model_path` | `./models/generator_model.safetensors` | Path to trained model |
358
- | `--output_path` | `./generated/output.png` | Output path for grid image |
359
- | `--num_images` | `16` | Number of images to generate |
360
- | `--grid_size` | `None` | Grid size N for NxN layout |
361
- | `--seed` | `None` | Random seed for reproducibility |
362
- | `--save_individual` | `False` | Save each image separately |
363
- | `--individual_output_dir` | `./generated/individual/` | Directory for individual images |
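The table does not say how `--num_images` maps to a grid when `--grid_size` is unset. One plausible rule, shown here purely as an assumption, is to use the smallest square grid that fits all images. `grid_side` is a hypothetical helper, not a function from `generator.py`.

```python
import math

def grid_side(num_images, grid_size=None):
    """Return N for an NxN grid: the explicit grid_size, else the smallest square that fits."""
    if grid_size is not None:
        return grid_size
    if num_images < 1:
        raise ValueError("num_images must be at least 1")
    return math.isqrt(num_images - 1) + 1

print(grid_side(16), grid_side(17), grid_side(100, grid_size=8))
```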
364
-
365
- ---
366
-
367
- ### 3. GGUF Model Support
368
-
369
- Use quantized GGUF models for smaller file sizes:
370
-
371
- ```bash
372
- cd gguf/
373
  python generator.py
374
  ```
375
-
376
- The GGUF generator will:
377
- 1. Detect available `.gguf` files in the directory
378
- 2. Prompt you to select a model
379
- 3. Convert GGUF → SafeTensors format
380
- 4. Launch the standard generator
381
-
382
- ---
383
-
384
- ## Configuration
385
-
386
- ### Model Architecture Configuration
387
-
388
- Modify these parameters in both `trainer.py` and `generator.py`:
389
-
390
- ```bash
391
- --codings_size 100 # Latent vector dimension
392
- --image_size 24 # Output image size
393
- --image_channels 4 # RGBA (4) or RGB (3)
394
- ```
395
-
396
- ### Training Hyperparameters
397
-
398
- In `trainer.py`:
399
-
400
- ```python
401
- # Optimizer
402
- gen_optimizer = optim.RMSprop(generator.parameters(), lr=0.001)
403
- disc_optimizer = optim.RMSprop(discriminator.parameters(), lr=0.001)
404
-
405
- # Loss function
406
- criterion = nn.BCELoss()
407
-
408
- # Dropout rate (in Discriminator)
409
- nn.Dropout(0.4)
410
- ```
411
-
412
- ### Data Preprocessing
413
-
414
- In `trainer.py` → `ImageDataset`:
415
-
416
- ```python
417
- transforms.Compose([
418
- transforms.Resize((image_size, image_size)),
419
- transforms.ToTensor(),
420
- transforms.Normalize([0.5] * channels, [0.5] * channels) # [-1, 1]
421
- ])
422
- ```
423
-
424
- ---
425
-
426
- ## Examples
427
-
428
- ### Example 1: Train on Custom Dataset
429
-
430
- ```bash
431
- # Prepare your data
432
- # data/images/punk000.png, punk001.png, ..., punk099.png
433
- # data/attributes.csv with ids 0-99
434
-
435
- # Train for 100 epochs
436
- python trainer.py \
437
- --data_path ./data/attributes.csv \
438
- --images_path ./data/images/ \
439
- --epochs 100 \
440
- --batch_size 32 \
441
- --model_output_path ./models/my_model.safetensors
442
- ```
443
-
444
- ### Example 2: Generate with Specific Seed
445
-
446
- ```bash
447
- # Generate same images every time
448
- python generator.py \
449
- --model_path ./models/generator_model.safetensors \
450
- --grid_size 5 \
451
- --seed 12345 \
452
- --output_path ./results/reproducible.png
453
- ```
454
-
455
- ### Example 3: Batch Generation
456
-
457
- ```bash
458
- # Generate 1000 individual images
459
- python generator.py \
460
- --num_images 1000 \
461
- --save_individual \
462
- --individual_output_dir ./dataset_synthetic/ \
463
- --output_path ./dataset_synthetic/overview.png
464
- ```
465
-
466
- ### Example 4: Monitor Training Progress
467
-
468
- ```bash
469
- # Training with progress visualization
470
- python trainer.py \
471
- --epochs 200 \
472
- --images_output_path ./training_progress/
473
-
474
- # View progress images
475
- ls ./training_progress/
476
- # epoch_0.png, epoch_1.png, ..., epoch_199.png
477
- ```
478
-
479
- ---
480
-
481
- ## Technical Details
482
-
483
- ### Model File Format
484
-
485
- Models are saved in **SafeTensors** format (`.safetensors`) with embedded metadata:
486
-
487
- ```python
488
- metadata = {
489
- 'codings_size': '100',
490
- 'image_size': '24',
491
- 'image_channels': '4'
492
- }
493
- ```
494
-
495
- This ensures the generator automatically loads the correct architecture.
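SafeTensors metadata is a string-to-string mapping, which is why the values above are quoted. A loader therefore has to convert them back to integers before building the network. The sketch below shows one way to do that, with `parse_metadata` as a hypothetical helper rather than the project's actual code.

```python
REQUIRED_KEYS = ("codings_size", "image_size", "image_channels")

def parse_metadata(metadata):
    """Convert string metadata values to ints, failing clearly on missing keys."""
    missing = [key for key in REQUIRED_KEYS if key not in metadata]
    if missing:
        raise KeyError(f"metadata is missing: {', '.join(missing)}")
    return {key: int(metadata[key]) for key in REQUIRED_KEYS}

config = parse_metadata({"codings_size": "100", "image_size": "24", "image_channels": "4"})
print(config)
```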
496
-
497
- ### Image Value Ranges
498
-
499
- - **Training**: Images normalized to [-1, 1]
500
- - **Generation output**: Images scaled to [0, 1]
501
- - **Saved files**: Images saved as uint8 [0, 255]
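The three ranges are linked by simple affine maps. The sketch below works through them for single values in plain Python. The rounding and clamping choices are assumptions, since the README does not show how `generator.py` converts its output.

```python
def normalize(x):
    """[0, 1] -> [-1, 1], matching Normalize(mean=0.5, std=0.5)."""
    return (x - 0.5) / 0.5

def denormalize(y):
    """[-1, 1] -> [0, 1], the inverse of normalize."""
    return (y + 1.0) / 2.0

def to_uint8(x):
    """[0, 1] -> integer in [0, 255], clamping out-of-range values first."""
    clamped = min(max(x, 0.0), 1.0)
    return int(clamped * 255 + 0.5)

pixel = denormalize(normalize(1.0))
print(pixel, to_uint8(pixel), to_uint8(denormalize(-1.0)))
```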
502
-
503
- ### GPU Support
504
-
505
- The code automatically detects and uses CUDA if available:
506
-
507
- ```python
508
- device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
509
- ```
510
-
511
- ---
512
-
513
- ## Troubleshooting
514
-
515
- **Q: Training loss not decreasing?**
516
- - Try adjusting learning rates
517
- - Increase batch size or epochs
518
- - Check if dataset has sufficient variety
519
-
520
- **Q: Generated images look like noise?**
521
- - Model needs more training epochs
522
- - Dataset may be too small (need 50+ images minimum)
523
- - Try adjusting discriminator dropout rate
524
-
525
- **Q: GUI not launching?**
526
- - Check Tkinter installation: `python -m tkinter`
527
- - On Linux: `sudo apt-get install python3-tk`
528
-
529
- **Q: CUDA out of memory?**
530
- - Reduce batch size: `--batch_size 8`
531
- - Reduce image size: `--image_size 16`
532
-
533
- ---
534
-
535
- ## License
536
-
537
- This project is provided as-is for educational and creative purposes.
538
-
539
- ---
540
-
541
- ## Acknowledgments
542
-
543
- - Built with PyTorch
544
- - Inspired by DCGAN architecture
545
- - Uses SafeTensors for model serialization
 
3
  ---
4
  # Pixel
5
 
6
+ A PyTorch-based Generative Adversarial Network (GAN) for training and generating pixel art images.
7
 
8
+ ## Setup
9
 
10
+ Clone the pixel repo:
11
  ```
12
+ git clone https://github.com/mochiyaki/pixel
13
  ```
14
 
15
+ Change into the cloned folder:
16
  ```
17
+ cd pixel
18
  ```
19
 
20
+ Start training with your dataset (in ./data/):
21
  ```
22
+ python trainer.py
23
  ```
24
 
25
+ When training finishes, check for the model file (in ./models/), then run inference:
26
  ```
27
  python generator.py
28
  ```