| --- |
| license: mit |
| tags: |
| - gan |
| - pytorch |
| - vision |
| - cats |
| - dcgan |
| metrics: |
| - loss |
| datasets: |
| - huggan/cats |
| --- |
| |
| # CatGen v2 - 128px DCGAN |
|
|
| This model is a Deep Convolutional Generative Adversarial Network (DCGAN) that generates 128x128 RGB images of cats. It was trained for 165 epochs on the `huggan/cats` dataset. |
|
|
| ## Sample |
| Here's a sample after epoch 165: |
|  |
|
|
| ## Best of - Cat Images |
|
|
|  |
|  |
|  |
|
|
| ## Model Details |
| - **Architecture:** DCGAN (Deep Convolutional GAN) |
| - **Resolution:** 128x128 pixels (RGB) |
| - **Parameters:** ~186M (Generator) |
| - **Training Duration:** ~5 hours on an NVIDIA T4 GPU |
| - **Framework:** PyTorch with Mixed Precision (AMP) |
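The exact layer layout of the generator is defined in the training code and is not reproduced here. As a rough illustration of how a DCGAN-style generator can reach 128x128 from a latent vector, the sketch below assumes a `(N, 128, 1, 1)` latent input, a first transposed convolution that produces a 4x4 feature map, and five stride-2 transposed convolutions that each double the resolution (4, 8, 16, 32, 64, 128). The names `SketchGenerator` and `up_block` and all channel widths are hypothetical and chosen to keep the example small, so its parameter count is far below the ~186M listed above.

```python
import torch
from torch import nn

LATENT_DIM = 128  # latent size listed in this card


def up_block(in_ch: int, out_ch: int) -> nn.Sequential:
    """Transposed conv with kernel 4, stride 2, padding 1 doubles height and width."""
    return nn.Sequential(
        nn.ConvTranspose2d(in_ch, out_ch, kernel_size=4, stride=2, padding=1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )


class SketchGenerator(nn.Module):
    """Hypothetical DCGAN-style generator; channel widths are illustrative only."""

    def __init__(self, latent_dim: int = LATENT_DIM, base: int = 16):
        super().__init__()
        self.net = nn.Sequential(
            # Project the 1x1 latent to a 4x4 feature map.
            nn.ConvTranspose2d(latent_dim, base * 16, kernel_size=4, stride=1, padding=0, bias=False),
            nn.BatchNorm2d(base * 16),
            nn.ReLU(inplace=True),
            up_block(base * 16, base * 8),  # 4x4 -> 8x8
            up_block(base * 8, base * 4),   # 8x8 -> 16x16
            up_block(base * 4, base * 2),   # 16x16 -> 32x32
            up_block(base * 2, base),       # 32x32 -> 64x64
            # Final doubling to 64x64 -> 128x128 with 3 RGB channels.
            nn.ConvTranspose2d(base, 3, kernel_size=4, stride=2, padding=1, bias=False),
            nn.Tanh(),  # assumption: outputs scaled to [-1, 1]
        )

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        return self.net(z)


generator = SketchGenerator().eval()
with torch.no_grad():
    images = generator(torch.randn(2, LATENT_DIM, 1, 1))
print(images.shape)
```

Each stride-2 block follows the transposed-convolution size rule `out = (in - 1) * stride - 2 * padding + kernel`, which gives `2 * in` for kernel 4, stride 2, padding 1.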
|
|
| ## Training Hyperparameters |
| - **Batch Size:** 128 |
| - **Learning Rate:** 0.0002 |
| - **Optimizer:** Adam (Beta1: 0.5, Beta2: 0.999) |
| - **Latent Vector (Z):** 128 dimensions |
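The hyperparameters above map onto PyTorch roughly as follows. This is a minimal sketch, not the code from the notebook: the two tiny networks are hypothetical stand-ins so that the snippet runs on its own, and the `(N, 128, 1, 1)` latent shape is an assumption based on the usual DCGAN convention.

```python
import torch
from torch import nn

BATCH_SIZE = 128
LEARNING_RATE = 2e-4
BETAS = (0.5, 0.999)  # Adam Beta1 and Beta2 from the list above
LATENT_DIM = 128

# Hypothetical stand-in networks; the real models are much larger.
generator = nn.Sequential(nn.ConvTranspose2d(LATENT_DIM, 3, kernel_size=4), nn.Tanh())
discriminator = nn.Sequential(nn.Conv2d(3, 1, kernel_size=4), nn.Flatten())

# One Adam optimizer per network, both with the listed settings.
opt_g = torch.optim.Adam(generator.parameters(), lr=LEARNING_RATE, betas=BETAS)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=LEARNING_RATE, betas=BETAS)

# One batch of latent vectors, shaped as 1x1 feature maps (assumed layout).
noise = torch.randn(BATCH_SIZE, LATENT_DIM, 1, 1)
fake = generator(noise)        # stand-in output: (128, 3, 4, 4)
scores = discriminator(fake)   # one logit per image: (128, 1)
print(noise.shape, fake.shape, scores.shape)
```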
|
|
| ## Training Details |
| The full training code is available as `catgen-v2.ipynb` in this repo. |
| The training data is the `huggan/cats` dataset from the Hugging Face Hub. |
|
|
| ## Intended Use |
| This model is intended for artistic and research purposes. It demonstrates how GANs can capture complex textures like fur and eye reflections at medium resolutions. |
|
|
| ## How to Use |
| To use this model, clone this repository and run the provided inference script. Ensure you have `matplotlib`, `torch` and `torchvision` installed. |
|
|
| ```bash |
| python3 inference.py |
| ``` |
|
|
| Sample output: |
|  |
|
|
| ## Limitations & Bias |
| As a GAN, the model might occasionally produce "dream-like" artifacts or distorted anatomy (e.g., extra ears or eyes). It is not a diffusion model and generates images in a single forward pass. |