license: mit

Image compression autoencoder

A convolutional autoencoder trained to compress 256×256 RGB images into a compact 1024-dimensional latent representation, achieving a 192× compression ratio.

Model description

This model learns to compress high-quality images by encoding them into a compact latent space, then reconstructing them with minimal quality loss. The encoder reduces a 196,608-value image (256×256×3) to just 1024 numbers, while the decoder reconstructs the original image from this compressed representation.
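The 192× figure follows directly from the shapes above. The short sketch below reproduces the arithmetic. The byte-level comparison at the end is an assumption (8-bit input pixels and a float32 latent vector), since the storage types are not stated here.

```python
# Shapes taken from the model description
height, width, channels = 256, 256, 3
latent_dim = 1024

# Ratio by number of values, which is how the 192x figure is defined
input_values = height * width * channels
value_ratio = input_values / latent_dim

# Assumption: uint8 pixels (1 byte each) and float32 latents (4 bytes each)
input_bytes = input_values * 1
latent_bytes = latent_dim * 4
byte_ratio = input_bytes / latent_bytes

print(f"values: {input_values} -> {latent_dim} ({value_ratio:.0f}x)")
print(f"bytes (assumed dtypes): {input_bytes} -> {latent_bytes} ({byte_ratio:.0f}x)")
```

Under those assumed data types the ratio in bytes would be 48× rather than 192×, so the headline number is best read as a reduction in element count.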

Architecture:

  • Encoder: Convolutional layers with downsampling (256×256×3 → 1024)
  • Decoder: Transposed convolutional layers with upsampling (1024 → 256×256×3)
  • Activation: LeakyReLU and Sigmoid
  • Normalization: Batch normalization
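To make the downsampling path concrete, the sketch below traces tensor shapes through a stack of stride-2 convolutions. The number of stages and the filter counts are hypothetical, chosen only so the final feature map holds 1024 values. The released model's actual layer configuration may differ.

```python
def trace_encoder_shapes(size=256, filters_per_stage=(32, 64, 128, 256, 16)):
    """Track (height, width, channels) through stride-2, 'same'-padded convolutions.

    filters_per_stage is a hypothetical configuration, not the model's real one.
    """
    shapes = [(size, size, 3)]
    for filters in filters_per_stage:
        size = -(-size // 2)  # ceiling division: stride 2 halves each spatial dimension
        shapes.append((size, size, filters))
    return shapes


shapes = trace_encoder_shapes()
for shape in shapes:
    print(shape)

# Flattening the last feature map gives the latent vector length
final_height, final_width, final_channels = shapes[-1]
latent_size = final_height * final_width * final_channels
print("latent size:", latent_size)
```

With these assumed settings, five halvings take 256×256 down to 8×8, and 8×8×16 gives the 1024-value latent. A decoder would mirror the same steps with transposed convolutions.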

Performance:

  • Compression ratio: 192×
  • Target PSNR: >30 dB
  • Target SSIM: >0.90
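Because the model is trained with an MSE loss, the PSNR target can be translated into a loss value. The sketch below uses the standard PSNR definition and assumes pixel values scaled to [0, 1], which is consistent with a sigmoid output but is not stated explicitly here.

```python
import math


def psnr_from_mse(mse, max_value=1.0):
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    if mse <= 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_for_psnr(psnr_db, max_value=1.0):
    """Mean squared error that corresponds to a given PSNR in dB."""
    return max_value ** 2 / 10 ** (psnr_db / 10.0)


# MSE the validation loss would need to fall below to exceed 30 dB
target_mse = mse_for_psnr(30.0)
print(f"30 dB corresponds to an MSE of about {target_mse:.4f}")
print(f"An MSE of 0.0005 corresponds to {psnr_from_mse(0.0005):.1f} dB")
```

Under that scaling assumption, a PSNR above 30 dB means a per-pixel MSE below roughly 0.001.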

Intended use

This model is designed for educational purposes: it demonstrates how an autoencoder can learn compression directly from data, rather than relying on hand-designed codecs such as JPEG or PNG.

Use cases:

  • Understanding autoencoder architectures
  • Learning about lossy compression
  • Exploring latent space representations
  • Teaching AI/ML concepts in bootcamps

Training data

Trained on DF2K_OST, a combined dataset of high-quality images from:

  • DIV2K (800 images)
  • Flickr2K (2,650 images)
  • OutdoorSceneTraining (10,424 images)

All images were resized to 256×256 pixels using Lanczos resampling.

Training details

Hyperparameters:

  • Optimizer: Adam (lr=1e-3)
  • Loss function: Mean Squared Error (MSE)
  • Batch size: 16
  • Epochs: Up to 100 (with early stopping)
  • Train/validation split: 90/10
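As a rough guide to what these settings mean per epoch, the sketch below combines the image counts listed under Training data with the 90/10 split and the batch size of 16. How the split is rounded, and whether the last partial batch is kept, are assumptions.

```python
import math

# Image counts as listed in the training data section
dataset_sizes = {"DIV2K": 800, "Flickr2K": 2650, "OutdoorSceneTraining": 10424}
batch_size = 16

total_images = sum(dataset_sizes.values())

# Assumption: the 90% training share is rounded down, the remainder is validation
train_images = total_images * 9 // 10
val_images = total_images - train_images

# Assumption: the final partial batch is kept rather than dropped
train_steps = math.ceil(train_images / batch_size)
val_steps = math.ceil(val_images / batch_size)

print(f"total: {total_images}, train: {train_images}, validation: {val_images}")
print(f"steps per epoch: {train_steps} (train), {val_steps} (validation)")
```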

Callbacks:

  • Early stopping (patience=5, monitoring validation loss)
  • Learning rate reduction (factor=0.5, patience=3)
  • Model checkpoint (best validation loss)
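To show how the two patience settings interact, here is a simplified pure-Python replay of a validation-loss curve. It is a sketch of the general idea only, not Keras's implementation: it ignores options such as `min_delta` and `cooldown`, and the loss values are made up for illustration.

```python
def simulate_callbacks(val_losses, lr=1e-3, stop_patience=5,
                       lr_patience=3, lr_factor=0.5):
    """Replay validation losses through simplified early-stopping and LR-reduction rules.

    Returns a list of (epoch, val_loss, learning_rate) up to the stopping epoch.
    """
    best_loss = float("inf")
    epochs_since_best = 0       # counter for early stopping
    epochs_since_lr_reset = 0   # counter for learning rate reduction
    history = []
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            # Improvement: this is also where a checkpoint would be saved
            best_loss = loss
            epochs_since_best = 0
            epochs_since_lr_reset = 0
        else:
            epochs_since_best += 1
            epochs_since_lr_reset += 1
            if epochs_since_lr_reset >= lr_patience:
                lr *= lr_factor
                epochs_since_lr_reset = 0
        history.append((epoch, loss, lr))
        if epochs_since_best >= stop_patience:
            break
    return history


# Hypothetical loss curve: improves for three epochs, then plateaus
losses = [0.010, 0.008, 0.007, 0.0071, 0.0072, 0.0073, 0.0074, 0.0075, 0.0076]
for epoch, loss, lr in simulate_callbacks(losses):
    print(f"epoch {epoch}: val_loss={loss:.4f}, lr={lr:.1e}")
```

In this replay the learning rate is halved once, after three epochs without improvement, and training stops two epochs later when the early-stopping patience of five is reached. The checkpoint would hold the weights from epoch 3.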

Hardware:

  • Single NVIDIA GPU with memory growth enabled

How to use

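from tensorflow import keras
from huggingface_hub import hf_hub_download

# Download the saved model file from the Hugging Face Hub
downloaded_model = hf_hub_download(
    repo_id='gperdrizet/compression_autoencoder',
    filename='models/compression_ae.keras',
    repo_type='model'
)

# Load the full autoencoder (encoder and decoder together)
autoencoder = keras.models.load_model(downloaded_model)

# Compress and decompress in a single pass. Note that predict() on the full
# autoencoder returns the reconstructed images, not the 1024-dimensional codes.
# images: array of shape (N, 256, 256, 3), scaled the same way as in training
reconstructed = autoencoder.predict(images)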

For complete examples, see the training notebook.

Limitations

  • Fixed input size (256×256 RGB images)
  • Lossy compression (some quality loss)
  • Not optimized for specific image types
  • Slower than traditional codecs
  • Educational model, not production-ready

Project repository

Full code, training notebooks, and interactive demo: gperdrizet/autoencoders

Citation

If you use this model for educational purposes, please reference the project repository.