gperdrizet
/

compression_autoencoder


	---
	license: mit
	---

	# Image compression autoencoder

	A convolutional autoencoder trained to compress 256×256 RGB images into a compact 1024-dimensional latent representation, a 192× reduction in the number of values.

	## Model description

	This model learns to compress high-quality images by encoding them into a compact latent space, then reconstructing them with minimal quality loss. The encoder reduces a 196,608-value image (256×256×3) to just 1024 numbers, and the decoder reconstructs an approximation of the original image from this compressed representation.
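
The 192× figure follows directly from the value counts above. The short sketch below reproduces that arithmetic and also shows a byte-level ratio under one assumption that the model card does not state: 8-bit input pixels and a float32 latent vector.

```python
# Values in one 256x256 RGB image versus values in the latent vector.
height, width, channels = 256, 256, 3
latent_dim = 1024

image_values = height * width * channels   # 196,608
value_ratio = image_values / latent_dim    # 192.0

# Assumption (not stated in the model card): uint8 pixels, float32 latent.
image_bytes = image_values * 1             # 1 byte per uint8 value
latent_bytes = latent_dim * 4              # 4 bytes per float32 value
byte_ratio = image_bytes / latent_bytes    # 48.0

print(f"values: {image_values}, value ratio: {value_ratio:.0f}x, byte ratio: {byte_ratio:.0f}x")
```

Under that storage assumption the saving in bytes is smaller than the saving in values, which is worth keeping in mind when comparing against file-based codecs.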

	Architecture:
	- Encoder: Convolutional layers with downsampling (256×256×3 → 1024)
	- Decoder: Transposed convolutional layers with upsampling (1024 → 256×256×3)
	- Activation: LeakyReLU and Sigmoid
	- Normalization: Batch normalization
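
To make the downsampling path concrete, here is a rough sketch that traces how the spatial size could shrink through a stack of stride-2 convolutions. The number of layers, the filter counts, and the final dense layer are all hypothetical and chosen for illustration; the actual layer configuration is in the project repository.

```python
def trace_encoder_shapes(size=256, channels=3, filters=(32, 64, 128, 256, 512)):
    """Return the (height, width, channels) shape after each stride-2 convolution.

    The filter counts are hypothetical. With 'same' padding, a stride-2
    convolution produces ceil(size / 2) outputs along each spatial axis.
    """
    shapes = [(size, size, channels)]
    for n_filters in filters:
        size = (size + 1) // 2  # ceil(size / 2)
        shapes.append((size, size, n_filters))
    return shapes

shapes = trace_encoder_shapes()
for shape in shapes:
    print(shape)

# Assumed final step: flatten the last feature map and project it to 1024 values.
flattened = shapes[-1][0] * shapes[-1][1] * shapes[-1][2]
print(f"flattened: {flattened} -> dense layer -> 1024")
```

A decoder built from transposed convolutions would walk the same shapes in reverse, doubling the spatial size at each step.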

	Performance:
	- Compression ratio: 192×
	- Target PSNR: >30 dB
	- Target SSIM: >0.90
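
As a point of reference for the PSNR target, the following is a minimal NumPy sketch of the standard PSNR formula, assuming pixel values scaled to [0, 1] (the scaling is an assumption, not something the model card states). On that scale, a PSNR above 30 dB corresponds to an MSE below 1e-3. SSIM is more involved and is usually taken from a library, for example `tf.image.ssim`.

```python
import numpy as np

def psnr(original, reconstructed, max_value=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, max_value]."""
    original = np.asarray(original, dtype=np.float64)
    reconstructed = np.asarray(reconstructed, dtype=np.float64)
    mse = np.mean((original - reconstructed) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)

# Synthetic check: a uniform error of 0.01 gives MSE = 1e-4, i.e. 40 dB.
image = np.full((256, 256, 3), 0.5)
noisy = image + 0.01
print(f"{psnr(image, noisy):.1f} dB")
```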

	## Intended use

	This model is designed for educational purposes: it demonstrates how an autoencoder can learn compression directly from data, rather than relying on hand-crafted codecs such as JPEG or PNG.

	Use cases:
	- Understanding autoencoder architectures
	- Learning about lossy compression
	- Exploring latent space representations
	- Teaching AI/ML concepts in bootcamps

	## Training data

	Trained on [DF2K_OST](https://huggingface.co/datasets/gperdrizet/DF2K_OST), a combined dataset of high-quality images from:
	- DIV2K (800 images)
	- Flickr2K (2,650 images)
	- OutdoorSceneTraining (10,424 images)

	All images were resized to 256×256 pixels using Lanczos resampling.
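
The resizing step might look something like the sketch below, which uses Pillow. The helper name `resize_to_square` is hypothetical, and the sketch assumes a direct resize without cropping or padding; the model card does not say how aspect ratio was handled.

```python
from PIL import Image

def resize_to_square(image, size=256):
    """Resize a PIL image to size x size RGB with Lanczos resampling.

    Assumption: the image is stretched to fit (no cropping or padding).
    """
    return image.convert("RGB").resize((size, size), Image.LANCZOS)

# Stand-in for a dataset image; in practice this would come from Image.open(path).
example = Image.new("RGB", (640, 480), color=(120, 60, 30))
resized = resize_to_square(example)
print(resized.size, resized.mode)
```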

	## Training details

	Hyperparameters:
	- Optimizer: Adam (lr=1e-3)
	- Loss function: Mean Squared Error (MSE)
	- Batch size: 16
	- Epochs: Up to 100 (with early stopping)
	- Train/validation split: 90/10

	Callbacks:
	- Early stopping (patience=5, monitoring validation loss)
	- Learning rate reduction (factor=0.5, patience=3)
	- Model checkpoint (best validation loss)
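
To illustrate how the two patience settings interact, here is a simplified pure-Python replay of the rules above on a made-up validation loss curve. It is a sketch of the general idea, not Keras's implementation: it ignores options such as `min_delta` and cooldown, and the function name `simulate_callbacks` is hypothetical.

```python
def simulate_callbacks(val_losses, lr=1e-3, stop_patience=5, lr_patience=3, factor=0.5):
    """Replay validation losses through simplified early-stopping and
    learning-rate-reduction rules.

    Returns (best_epoch, stop_epoch, final_lr). Epochs are 1-indexed and
    stop_epoch is None if training would run through every epoch.
    """
    best_loss = float("inf")
    best_epoch = None
    stop_wait = 0  # epochs since the last improvement (early stopping)
    lr_wait = 0    # epochs since the last improvement or LR reduction

    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            # A checkpoint of the best model would be saved here.
            best_loss, best_epoch = loss, epoch
            stop_wait = 0
            lr_wait = 0
            continue

        stop_wait += 1
        lr_wait += 1
        if lr_wait >= lr_patience:
            lr *= factor
            lr_wait = 0
        if stop_wait >= stop_patience:
            return best_epoch, epoch, lr

    return best_epoch, None, lr

# Made-up loss curve: improves for three epochs, then plateaus.
losses = [0.020, 0.012, 0.010, 0.011, 0.011, 0.012, 0.011, 0.011, 0.011]
best_epoch, stop_epoch, final_lr = simulate_callbacks(losses)
print(best_epoch, stop_epoch, final_lr)
```

With a learning-rate patience of 3 and a stopping patience of 5, the learning rate is halved once on a plateau before training stops, giving the optimizer a chance to recover at a smaller step size.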

	Hardware:
	- Single NVIDIA GPU with memory growth enabled

	## How to use

	```python
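	from tensorflow import keras
	from huggingface_hub import hf_hub_download

	# Download the saved model file from the Hugging Face Hub
	downloaded_model = hf_hub_download(
	    repo_id='gperdrizet/compression_autoencoder',
	    filename='models/compression_ae.keras',
	    repo_type='model'
	)

	# Load the full autoencoder (encoder + decoder)
	autoencoder = keras.models.load_model(downloaded_model)

	# `images` is your own batch of shape (N, 256, 256, 3); pixel values are
	# presumably expected in [0, 1], matching the sigmoid output.
	# The full model runs a compress/decompress round trip, so the output is
	# the reconstruction, not the 1024-dimensional latent vector.
	reconstructed = autoencoder.predict(images)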
	```

	For complete examples, see the [training notebook](https://github.com/gperdrizet/autoencoders/blob/main/notebooks/01-compression.ipynb).
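
Preparing a batch for the model might look like the sketch below. It assumes the model expects float32 inputs scaled to [0, 1], which is consistent with the sigmoid output activation but is not confirmed by the model card; the helper names `to_model_input` and `to_uint8` are hypothetical.

```python
import numpy as np

def to_model_input(images_uint8):
    """Scale uint8 images of shape (N, 256, 256, 3) to float32 in [0, 1]."""
    images_uint8 = np.asarray(images_uint8)
    if images_uint8.shape[1:] != (256, 256, 3):
        raise ValueError(f"expected (N, 256, 256, 3), got {images_uint8.shape}")
    return images_uint8.astype(np.float32) / 255.0

def to_uint8(images_float):
    """Map values in [0, 1] back to displayable uint8 pixels."""
    return (np.clip(images_float, 0.0, 1.0) * 255.0).round().astype(np.uint8)

# Random stand-in batch; real usage would pass `images` to the loaded model.
rng = np.random.default_rng(0)
batch = rng.integers(0, 256, size=(4, 256, 256, 3), dtype=np.uint8)
images = to_model_input(batch)
print(images.shape, images.dtype, images.min() >= 0.0, images.max() <= 1.0)
```

The same `to_uint8` helper can be applied to the model's output before saving or displaying reconstructed images.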

	## Limitations

	- Fixed input size (256×256 RGB images)
	- Lossy compression (some quality loss)
	- Not optimized for specific image types
	- Slower than traditional codecs
	- Educational model, not production-ready

	## Project repository

	Full code, training notebooks, and interactive demo: [gperdrizet/autoencoders](https://github.com/gperdrizet/autoencoders)

	## Citation

	If you use this model for educational purposes, please reference the project repository.