LH-Tech-AI/CatGen-v2

	---
	license: mit
	tags:
	- gan
	- pytorch
	- vision
	- cats
	- dcgan
	metrics:
	- loss
	datasets:
	- huggan/cats
	---

	# CatGen v2 - 128px DCGAN

	This model is a Deep Convolutional Generative Adversarial Network (DCGAN) that generates 128x128 RGB images of cats. It was trained for 165 epochs on the `huggan/cats` dataset.

	## Sample
	Here's a sample after epoch 165:
	![__results___8_0](https://cdn-uploads.huggingface.co/production/uploads/697f2832c2c5e4daa93cece7/VV8AhZgJFA_dvsV1-ul7P.png)

	## Best of - Cat Images

	![best_of_1](https://cdn-uploads.huggingface.co/production/uploads/697f2832c2c5e4daa93cece7/bOThglzoRcfy8nNnVjxGg.png)
	![best_of_2](https://cdn-uploads.huggingface.co/production/uploads/697f2832c2c5e4daa93cece7/OGAZijZhGyY4Ss1k2zRPo.png)
	![best_of_3](https://cdn-uploads.huggingface.co/production/uploads/697f2832c2c5e4daa93cece7/UATLmkIogTIZhyFJUTImA.png)

	## Model Details
	- Architecture: DCGAN (Deep Convolutional GAN)
	- Resolution: 128x128 pixels (RGB)
	- Parameters: ~186M (Generator)
	- Training Duration: ~5 hours on NVIDIA T4 GPU
	- Framework: PyTorch with Mixed Precision (AMP)
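
As a rough sanity check on the parameter figure above, the sketch below counts the weights of a standard DCGAN generator layout scaled up to 128x128 output. The layer layout (six bias-free transposed convolutions with 4x4 kernels, batch norm after all but the last) and the base width `ngf=256` are assumptions made for illustration; they are not taken from `catgen-v2.ipynb`. The function name `dcgan_generator_params` is likewise hypothetical.

```python
def dcgan_generator_params(nz=128, ngf=256, out_channels=3, kernel=4):
    """Count parameters of a hypothetical 128x128 DCGAN generator.

    Assumed layout (not confirmed by the model card): bias-free transposed
    convolutions nz -> 16*ngf -> 8*ngf -> 4*ngf -> 2*ngf -> ngf -> out_channels,
    with BatchNorm (one weight and one bias per channel) after every layer
    except the last.
    """
    channels = [nz, 16 * ngf, 8 * ngf, 4 * ngf, 2 * ngf, ngf, out_channels]
    # Each transposed convolution holds c_in * c_out * kernel * kernel weights.
    conv = sum(c_in * c_out * kernel * kernel
               for c_in, c_out in zip(channels, channels[1:]))
    # BatchNorm contributes 2 learnable values per channel on the hidden layers.
    batch_norm = sum(2 * c for c in channels[1:-1])
    return conv + batch_norm


total = dcgan_generator_params()
print(f"{total:,} parameters (~{total / 1e6:.1f}M)")
```

With these assumed values the total comes to 186,674,688, which is in line with the ~186M stated above. That agreement is suggestive only; the notebook remains the authoritative description of the architecture.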

	## Training Hyperparameters
	- Batch Size: 128
	- Learning Rate: 0.0002
	- Optimizer: Adam (Beta1: 0.5, Beta2: 0.999)
	- Latent Vector (Z): 128 dimensions
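
The hyperparameters above translate to PyTorch roughly as follows. This is a minimal sketch: the two networks are tiny stand-ins so the snippet runs on its own, and using the same Adam settings for both generator and discriminator is an assumption (it is the usual DCGAN convention, but the card does not say so explicitly).

```python
import torch
from torch import nn

BATCH_SIZE = 128
LEARNING_RATE = 2e-4      # 0.0002
BETAS = (0.5, 0.999)      # Adam Beta1, Beta2
LATENT_DIM = 128          # size of the latent vector Z

# Stand-in networks for illustration only; the real generator and
# discriminator are defined in the training notebook.
generator = nn.Sequential(nn.ConvTranspose2d(LATENT_DIM, 3, 4, 1, 0), nn.Tanh())
discriminator = nn.Sequential(nn.Conv2d(3, 1, 4, 1, 0), nn.Sigmoid())

# Assumption: both optimizers share the listed learning rate and betas.
opt_g = torch.optim.Adam(generator.parameters(), lr=LEARNING_RATE, betas=BETAS)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=LEARNING_RATE, betas=BETAS)

# One latent batch, shaped (N, Z, 1, 1) as is conventional for DCGAN generators.
z = torch.randn(BATCH_SIZE, LATENT_DIM, 1, 1)
fake = generator(z)       # the stand-in produces 4x4 images, not 128x128
```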

	## Training details
	The full training code is available as `catgen-v2.ipynb` in this repo.
	The training data is the `huggan/cats` dataset from the Hugging Face Hub.

	## Intended Use
	This model is intended for artistic and research purposes. It demonstrates how GANs can capture complex textures like fur and eye reflections at medium resolutions.

	## How to use
	To use this model, clone this repository and run the provided inference script. Make sure `matplotlib`, `torch`, and `torchvision` are installed.

	```bash
	python3 inference.py
	```

	Sample output:
	![image](https://cdn-uploads.huggingface.co/production/uploads/697f2832c2c5e4daa93cece7/UA3btFlIlqwEhrTOEaqRe.png)

	## Limitations & Bias
	As a GAN, the model might occasionally produce "dream-like" artifacts or distorted anatomy (e.g., extra ears or eyes). It is not a diffusion model and generates images in a single forward pass.
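
To illustrate what "a single forward pass" means here, the sketch below builds a small DCGAN-style generator and samples from it once. The layer layout and the helper name `make_generator` are hypothetical, the width is deliberately reduced (`ngf=8`) so it runs quickly on CPU, and the weights are random, so the output is noise rather than cats. Loading this repository's trained weights is not shown because the card does not state the checkpoint's filename or format.

```python
import torch
from torch import nn


def make_generator(nz=128, ngf=8):
    """Hypothetical DCGAN-style generator: (N, nz, 1, 1) -> (N, 3, 128, 128)."""
    widths = [16 * ngf, 8 * ngf, 4 * ngf, 2 * ngf, ngf]
    layers = [
        nn.ConvTranspose2d(nz, widths[0], 4, 1, 0, bias=False),  # 1x1 -> 4x4
        nn.BatchNorm2d(widths[0]),
        nn.ReLU(inplace=True),
    ]
    for c_in, c_out in zip(widths, widths[1:]):
        layers += [
            nn.ConvTranspose2d(c_in, c_out, 4, 2, 1, bias=False),  # doubles H and W
            nn.BatchNorm2d(c_out),
            nn.ReLU(inplace=True),
        ]
    # Final layer: 64x64 -> 128x128, three RGB channels, tanh output in [-1, 1].
    layers += [nn.ConvTranspose2d(widths[-1], 3, 4, 2, 1, bias=False), nn.Tanh()]
    return nn.Sequential(*layers)


generator = make_generator().eval()
with torch.no_grad():
    z = torch.randn(4, 128, 1, 1)   # four latent vectors
    images = generator(z)           # one forward pass, no iterative refinement
images = (images + 1) / 2           # rescale from [-1, 1] to [0, 1] for display
print(images.shape)
```

Each image costs exactly one call to the network, whereas a diffusion model calls its network repeatedly over a sequence of denoising steps.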