---
license: mit
---
# Image compression autoencoder
A convolutional autoencoder trained to compress 256×256 RGB images into a compact 1024-dimensional latent representation, achieving a 192× compression ratio.
## Model description
This model learns to compress high-quality images by encoding them into a compact latent space, then reconstructing them with minimal quality loss. The encoder reduces a 196,608-value image (256×256×3) to just 1024 numbers, while the decoder reconstructs the original image from this compressed representation.
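The compression ratio follows directly from the input and latent sizes; a quick sanity check in Python:

```python
# A 256x256 RGB image flattened to raw values
input_values = 256 * 256 * 3      # 196,608 values
latent_values = 1024              # size of the latent representation

compression_ratio = input_values / latent_values
print(compression_ratio)          # 192.0
```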
**Architecture:**
- Encoder: Convolutional layers with downsampling (256×256×3 → 1024)
- Decoder: Transposed convolutional layers with upsampling (1024 → 256×256×3)
- Activation: LeakyReLU and Sigmoid
- Normalization: Batch normalization
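The exact layer counts and filter sizes are not specified in this card; a minimal Keras sketch of an architecture matching the description above (the number of stages and filter counts are illustrative assumptions, not the trained model's actual configuration) might look like:

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_autoencoder(latent_dim=1024):
    """Sketch of a convolutional autoencoder: 256x256x3 -> latent -> 256x256x3."""
    # Encoder: strided convolutions downsample 256 -> 128 -> 64 -> 32 -> 16
    encoder = keras.Sequential([
        layers.Input(shape=(256, 256, 3)),
        layers.Conv2D(32, 3, strides=2, padding='same'),
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        layers.Conv2D(64, 3, strides=2, padding='same'),
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        layers.Conv2D(128, 3, strides=2, padding='same'),
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        layers.Conv2D(128, 3, strides=2, padding='same'),
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        layers.Flatten(),                     # 16 * 16 * 128 = 32,768 values
        layers.Dense(latent_dim),             # compact latent representation
    ], name='encoder')

    # Decoder: transposed convolutions upsample 16 -> 32 -> 64 -> 128 -> 256
    decoder = keras.Sequential([
        layers.Input(shape=(latent_dim,)),
        layers.Dense(16 * 16 * 128),
        layers.Reshape((16, 16, 128)),
        layers.Conv2DTranspose(128, 3, strides=2, padding='same'),
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        layers.Conv2DTranspose(64, 3, strides=2, padding='same'),
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        layers.Conv2DTranspose(32, 3, strides=2, padding='same'),
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        # Sigmoid maps reconstructed pixels back into [0, 1]
        layers.Conv2DTranspose(3, 3, strides=2, padding='same',
                               activation='sigmoid'),
    ], name='decoder')

    return keras.Sequential([encoder, decoder], name='autoencoder')
```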
**Performance:**
- Compression ratio: 192×
- Target PSNR: >30 dB
- Target SSIM: >0.90
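The PSNR target relates directly to the MSE training loss: for images scaled to [0, 1], a reconstruction MSE of 0.001 corresponds exactly to 30 dB, so any lower MSE meets the target.

```python
import math

def psnr(mse, max_value=1.0):
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    return 20 * math.log10(max_value) - 10 * math.log10(mse)

print(psnr(0.001))  # 30.0 dB -- the boundary of the >30 dB target
```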
## Intended use
This model is designed for **educational purposes** to demonstrate how autoencoders can learn compression automatically from data, rather than using hand-crafted rules like JPEG or PNG.
**Use cases:**
- Understanding autoencoder architectures
- Learning about lossy compression
- Exploring latent space representations
- Teaching AI/ML concepts in bootcamps
## Training data
Trained on [DF2K_OST](https://huggingface.co/datasets/gperdrizet/DF2K_OST), a combined dataset of high-quality images from:
- DIV2K (800 images)
- Flickr2K (2,650 images)
- OutdoorSceneTraining (10,424 images)
All images resized to 256×256 pixels using Lanczos resampling.
## Training details
**Hyperparameters:**
- Optimizer: Adam (lr=1e-3)
- Loss function: Mean Squared Error (MSE)
- Batch size: 16
- Epochs: Up to 100 (with early stopping)
- Train/validation split: 90/10
**Callbacks:**
- Early stopping (patience=5, monitoring validation loss)
- Learning rate reduction (factor=0.5, patience=3)
- Model checkpoint (best validation loss)
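Putting the hyperparameters and callbacks above together, the Keras training setup can be sketched as follows (the checkpoint path is a hypothetical example, not the repository's actual path):

```python
from tensorflow import keras

def make_callbacks(checkpoint_path='compression_ae_best.keras'):
    """Callbacks matching the training setup described above."""
    return [
        keras.callbacks.EarlyStopping(monitor='val_loss', patience=5),
        keras.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.5,
                                          patience=3),
        keras.callbacks.ModelCheckpoint(checkpoint_path, monitor='val_loss',
                                        save_best_only=True),
    ]

def compile_and_fit(autoencoder, images):
    """Adam (lr=1e-3), MSE loss, batch size 16, up to 100 epochs, 90/10 split."""
    autoencoder.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-3),
                        loss='mse')
    return autoencoder.fit(
        images, images,          # an autoencoder's target is its own input
        validation_split=0.1,    # 90/10 train/validation split
        batch_size=16,
        epochs=100,
        callbacks=make_callbacks(),
    )
```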
**Hardware:**
- Single NVIDIA GPU with memory growth enabled
## How to use
```python
from tensorflow import keras
from huggingface_hub import hf_hub_download

# Download the trained model from the Hugging Face Hub
downloaded_model = hf_hub_download(
    repo_id='gperdrizet/compression_autoencoder',
    filename='models/compression_ae.keras',
    repo_type='model'
)

# Load the model
autoencoder = keras.models.load_model(downloaded_model)

# Full compress/decompress round trip; the output is the reconstructed
# image batch, not the latent vectors
reconstructed = autoencoder.predict(images)  # images shape: (N, 256, 256, 3)
```
For complete examples, see the [training notebook](https://github.com/gperdrizet/autoencoders/blob/main/notebooks/01-compression.ipynb).
## Limitations
- Fixed input size (256×256 RGB images)
- Lossy compression (some quality loss)
- Not optimized for specific image types
- Slower than traditional codecs
- Educational model, not production-ready
## Project repository
Full code, training notebooks, and interactive demo: [gperdrizet/autoencoders](https://github.com/gperdrizet/autoencoders)
## Citation
If you use this model for educational purposes, please reference the project repository.