---
license: mit
tags:
- gan
- pytorch
- vision
- cats
- dcgan
metrics:
- loss
datasets:
- huggan/cats
---
# CatGen v2 - 128px DCGAN
This model is a Deep Convolutional Generative Adversarial Network (DCGAN) trained to generate 128x128 RGB images of cats. It was trained for 165 epochs on a curated dataset of cat images, at a higher resolution than the 64x64 output of the original DCGAN architecture.
## Sample
Here's a sample after epoch 165:

## Best of - Cat Images



## Model Details
- **Architecture:** DCGAN (Deep Convolutional GAN)
- **Resolution:** 128x128 pixels (RGB)
- **Parameters:** ~186M (Generator)
- **Training Duration:** ~5 hours on NVIDIA T4 GPU
- **Framework:** PyTorch with Mixed Precision (AMP)
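
As a rough illustration of how a DCGAN-style generator can reach the 128x128 resolution listed above, the sketch below computes the feature-map size after each transposed convolution. The layer settings (a first layer with kernel 4, stride 1, padding 0, followed by layers with kernel 4, stride 2, padding 1) are the common DCGAN convention and are an assumption here; the actual layer configuration lives in the training notebook and may differ. The function names are hypothetical.

```python
def conv_transpose_out(size, kernel=4, stride=2, padding=1):
    """Output side length of a transposed convolution
    (no dilation, no output padding)."""
    return (size - 1) * stride - 2 * padding + kernel


def upsampling_path(start, target):
    """Side lengths produced by repeatedly applying the default
    stride-2 layer until the target resolution is reached."""
    sizes = [start]
    while sizes[-1] < target:
        sizes.append(conv_transpose_out(sizes[-1]))
    return sizes


# Assumed first layer: project the 1x1 latent "pixel" to a 4x4 map.
first = conv_transpose_out(1, kernel=4, stride=1, padding=0)

# Each following layer (assumed kernel 4, stride 2, padding 1) doubles the size.
path = upsampling_path(first, 128)
print(path)  # [4, 8, 16, 32, 64, 128]
```

Under these assumptions, the generator needs one projection layer plus five doubling layers to go from the latent vector to a 128x128 image.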
## Training Hyperparameters
- **Batch Size:** 128
- **Learning Rate:** 0.0002
- **Optimizer:** Adam (Beta1: 0.5, Beta2: 0.999)
- **Latent Vector (Z):** 128 dimensions
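
The hyperparameters above could be wired up in PyTorch roughly as follows. This is a minimal sketch, not the code from the notebook: the `generator` and `discriminator` modules are single-layer hypothetical stand-ins so the snippet runs on its own, and the `(N, Z, 1, 1)` latent shape is the usual DCGAN convention, assumed here.

```python
import torch
from torch import nn

# Values taken from the hyperparameter list above.
BATCH_SIZE = 128
LEARNING_RATE = 2e-4
ADAM_BETAS = (0.5, 0.999)
LATENT_DIM = 128

# Hypothetical single-layer stand-ins; the real networks are defined
# in the training notebook and are much larger.
generator = nn.ConvTranspose2d(LATENT_DIM, 3, kernel_size=4)
discriminator = nn.Conv2d(3, 1, kernel_size=4)

# One Adam optimizer per network, both with the listed settings.
opt_g = torch.optim.Adam(generator.parameters(), lr=LEARNING_RATE, betas=ADAM_BETAS)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=LEARNING_RATE, betas=ADAM_BETAS)

# One batch of latent vectors drawn from a standard normal distribution.
# Shape (N, Z, 1, 1) is an assumption based on the usual DCGAN layout.
z = torch.randn(BATCH_SIZE, LATENT_DIM, 1, 1)
fake = generator(z)  # the stand-in layer maps each 1x1 latent to a 4x4 RGB map
```

A Beta1 of 0.5, lower than Adam's default of 0.9, is the value commonly used for DCGAN training.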
## Training Details
The full training code can be found as `catgen-v2.ipynb` in this repo.
The training data is the `huggan/cats` dataset from the Hugging Face Hub.
## Intended Use
This model is intended for artistic and research purposes. It demonstrates how GANs can capture complex textures like fur and eye reflections at medium resolutions.
## How to Use
To use this model, clone this repository and run the provided inference script. Ensure you have `matplotlib`, `torch` and `torchvision` installed.
```bash
python3 inference.py
```
Sample output:

## Limitations & Bias
As a GAN, the model might occasionally produce "dream-like" artifacts or distorted anatomy (e.g., extra ears or eyes). It is not a diffusion model and generates images in a single forward pass.