license: mit

Image compression autoencoder

A convolutional autoencoder trained to compress 256×256 RGB images into a compact 1024-dimensional latent representation, achieving a 192× compression ratio (measured in number of values).

Model description

This model learns to compress high-quality images by encoding them into a compact latent space, then reconstructing them with minimal quality loss. The encoder reduces a 196,608-value image (256×256×3) to just 1024 numbers, and the decoder reconstructs an approximation of the original image from this compressed representation.
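The 192× figure follows directly from the shapes given above. The short calculation below reproduces it, and also shows what the ratio would look like in bytes under one assumed storage format (8-bit input pixels, 32-bit float latent values). That storage format is an assumption for illustration only; the model card does not say how the latent vector is stored.

```python
# Shapes taken from the model description
HEIGHT, WIDTH, CHANNELS = 256, 256, 3
LATENT_DIM = 1024

# Ratio in number of values (the figure quoted in this README)
input_values = HEIGHT * WIDTH * CHANNELS
value_ratio = input_values / LATENT_DIM
print(f"{input_values} values -> {LATENT_DIM} values ({value_ratio:.0f}x)")

# Ratio in bytes, ASSUMING uint8 pixels and float32 latent values
input_bytes = input_values * 1      # 1 byte per uint8 pixel value
latent_bytes = LATENT_DIM * 4       # 4 bytes per float32 latent value
byte_ratio = input_bytes / latent_bytes
print(f"{input_bytes} bytes -> {latent_bytes} bytes ({byte_ratio:.0f}x)")
```

Under that assumption the byte-level ratio is lower than the value-level ratio, because each latent value takes more storage than each pixel value.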

Architecture:

Encoder: Convolutional layers with downsampling (256×256×3 → 1024)
Decoder: Transposed convolutional layers with upsampling (1024 → 256×256×3)
Activation: LeakyReLU and Sigmoid
Normalization: Batch normalization
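As a rough illustration of the kind of network described above, here is a minimal Keras sketch. It is not the published architecture: the number of layers, the filter counts, the kernel size, and the dense bottleneck are all assumptions chosen only to match the stated input shape, latent size, activations, and normalization.

```python
from tensorflow import keras
from tensorflow.keras import layers

LATENT_DIM = 1024


def build_autoencoder(filters=(16, 32, 64, 128, 128)):
    """Hypothetical autoencoder; layer sizes are illustrative assumptions."""
    inputs = keras.Input(shape=(256, 256, 3))

    # Encoder: each stride-2 convolution halves the spatial size
    # (256 -> 128 -> 64 -> 32 -> 16 -> 8 with five layers)
    x = inputs
    for f in filters:
        x = layers.Conv2D(f, 3, strides=2, padding="same")(x)
        x = layers.BatchNormalization()(x)
        x = layers.LeakyReLU()(x)
    x = layers.Flatten()(x)
    latent = layers.Dense(LATENT_DIM, name="latent")(x)

    # Decoder: expand the latent vector back to an 8x8 feature map,
    # then double the spatial size with each transposed convolution
    x = layers.Dense(8 * 8 * filters[-1])(latent)
    x = layers.Reshape((8, 8, filters[-1]))(x)
    for f in reversed(filters[:-1]):
        x = layers.Conv2DTranspose(f, 3, strides=2, padding="same")(x)
        x = layers.BatchNormalization()(x)
        x = layers.LeakyReLU()(x)

    # Final upsampling step; sigmoid keeps outputs in [0, 1]
    outputs = layers.Conv2DTranspose(
        3, 3, strides=2, padding="same", activation="sigmoid"
    )(x)
    return keras.Model(inputs, outputs, name="autoencoder_sketch")


model = build_autoencoder()
```

Calling `model.summary()` on this sketch shows how the spatial dimensions shrink toward the 1024-value bottleneck and grow back to 256×256×3.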

Performance:

Compression ratio: 192× (196,608 input values → 1024 latent values)
Target PSNR: >30 dB
Target SSIM: >0.90
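To make the PSNR target concrete, the sketch below computes PSNR from the mean squared error between an original and a reconstructed image. It assumes pixel values scaled to [0, 1] (so the peak value is 1.0), which the model card does not state explicitly; the function names are illustrative.

```python
import numpy as np


def mse(original, reconstructed):
    """Mean squared error between two arrays of the same shape."""
    original = np.asarray(original, dtype=np.float64)
    reconstructed = np.asarray(reconstructed, dtype=np.float64)
    return float(np.mean((original - reconstructed) ** 2))


def psnr(original, reconstructed, max_value=1.0):
    """Peak signal-to-noise ratio in dB; max_value is the peak pixel value."""
    error = mse(original, reconstructed)
    if error == 0.0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / error))
```

On a [0, 1] pixel scale, a PSNR of 30 dB corresponds to an MSE of 0.001, so the >30 dB target is equivalent to driving the MSE training loss below that value.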

Intended use

This model is designed for educational purposes to demonstrate how autoencoders can learn compression automatically from data, rather than using hand-crafted rules like JPEG or PNG.

Use cases:

Understanding autoencoder architectures
Learning about lossy compression
Exploring latent space representations
Teaching AI/ML concepts in bootcamps

Training data

Trained on DF2K_OST, a combined dataset of high-quality images from:

DIV2K (800 images)
Flickr2K (2,650 images)
OutdoorSceneTraining (10,424 images)

All images were resized to 256×256 pixels using Lanczos resampling.
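A resize step like this could be sketched with Pillow as follows. The helper name is hypothetical, and two details are assumptions rather than facts from this README: images are stretched directly to 256×256 without cropping, and pixel values are scaled to [0, 1] to match the sigmoid output.

```python
import numpy as np
from PIL import Image


def prepare_image(image, size=(256, 256)):
    """Resize a PIL image with Lanczos resampling and return a float array."""
    resized = image.convert("RGB").resize(size, Image.LANCZOS)
    # ASSUMPTION: scale 8-bit pixel values to the [0, 1] range
    return np.asarray(resized, dtype=np.float32) / 255.0


# Example with a synthetic image; replace with Image.open(path) for real files
example = prepare_image(Image.new("RGB", (640, 480), (255, 0, 0)))
print(example.shape)
```

Stacking several prepared images with `np.stack` gives a batch of shape (N, 256, 256, 3).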

Training details

Hyperparameters:

Optimizer: Adam (lr=1e-3)
Loss function: Mean Squared Error (MSE)
Batch size: 16
Epochs: Up to 100 (with early stopping)
Train/validation split: 90/10
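For a sense of scale, the image counts listed under Training data can be combined with the batch size and split above to estimate the size of each epoch. The exact numbers depend on how the split was rounded, which is not stated here; the sketch below assumes the validation set is one tenth of the total, rounded down.

```python
# Image counts from the Training data section
dataset_sizes = {"DIV2K": 800, "Flickr2K": 2650, "OutdoorSceneTraining": 10424}
BATCH_SIZE = 16

total_images = sum(dataset_sizes.values())

# ASSUMPTION: 10% validation, rounded down; the rest is used for training
n_val = total_images // 10
n_train = total_images - n_val

# Ceiling division: the last batch of an epoch may be partially filled
train_steps = -(-n_train // BATCH_SIZE)
val_steps = -(-n_val // BATCH_SIZE)

print(f"total={total_images} train={n_train} val={n_val}")
print(f"steps per epoch: train={train_steps} val={val_steps}")
```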

Callbacks:

Early stopping (patience=5, monitoring validation loss)
Learning rate reduction (factor=0.5, patience=3)
Model checkpoint (best validation loss)
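The optimizer and callbacks listed above map onto standard Keras classes. The following is a sketch of one way to configure them, not the author's training script: the checkpoint filename is hypothetical, and any argument not listed above is left at its Keras default.

```python
from tensorflow import keras

# Adam with the learning rate listed under Hyperparameters
optimizer = keras.optimizers.Adam(learning_rate=1e-3)

callbacks = [
    # Stop when validation loss has not improved for 5 epochs
    keras.callbacks.EarlyStopping(monitor="val_loss", patience=5),
    # Halve the learning rate after 3 epochs without improvement
    keras.callbacks.ReduceLROnPlateau(
        monitor="val_loss", factor=0.5, patience=3
    ),
    # Keep only the model with the best validation loss
    # (hypothetical filename)
    keras.callbacks.ModelCheckpoint(
        "best_autoencoder.keras", monitor="val_loss", save_best_only=True
    ),
]
```

These objects would then be passed to `model.compile(optimizer=optimizer, loss="mse")` and `model.fit(..., epochs=100, callbacks=callbacks)`. Because the learning-rate patience (3) is shorter than the early-stopping patience (5), a plateau triggers at least one learning-rate reduction before training stops.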

Hardware:

Single NVIDIA GPU with memory growth enabled
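In TensorFlow, memory growth is typically enabled as shown below, so that GPU memory is allocated as needed instead of being reserved all at once. This is a general-purpose sketch rather than the exact code used for training, and it must run before any operation initializes the GPU.

```python
import tensorflow as tf

# List visible GPUs; the list is empty on CPU-only machines
gpus = tf.config.list_physical_devices("GPU")

# Allocate GPU memory incrementally rather than all up front
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)

print(f"GPUs with memory growth enabled: {len(gpus)}")
```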

How to use

from tensorflow import keras
from huggingface_hub import hf_hub_download

# Download the saved model file from the Hugging Face Hub
downloaded_model = hf_hub_download(
    repo_id='gperdrizet/compression_autoencoder',
    filename='models/compression_ae.keras',
    repo_type='model'
)

# Load the full autoencoder (encoder + decoder)
autoencoder = keras.models.load_model(downloaded_model)

# Run a full compress/decompress round trip. predict() returns the
# reconstructed images, not the 1024-dimensional latent vectors.
# images: array of shape (N, 256, 256, 3)
reconstructed = autoencoder.predict(images)

For complete examples, see the training notebook.

Limitations

Fixed input size (256×256 RGB images)
Lossy compression (some quality loss)
Not optimized for specific image types
Slower than traditional codecs
Educational model, not production-ready

Project repository

Full code, training notebooks, and interactive demo: gperdrizet/autoencoders

Citation

If you use this model for educational purposes, please reference the project repository.