Commit 2bef96e (verified), parent ff4b304, by gperdrizet

Update README.md

Files changed (1): README.md (+98, -3)
---
license: mit
---

# Image compression autoencoder

A convolutional autoencoder trained to compress 256×256 RGB images into a compact 1024-dimensional latent representation, achieving a 192× compression ratio.

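The 192× figure follows directly from the tensor sizes; a quick sanity check in plain Python:

```python
# Each 256x256 RGB image holds 256 * 256 * 3 = 196,608 values;
# the latent code holds 1,024 values, giving the 192x ratio.
raw_values = 256 * 256 * 3
latent_values = 1024
ratio = raw_values / latent_values
print(ratio)  # 192.0
```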
## Model description

This model learns to compress high-quality images by encoding them into a compact latent space, then reconstructing them with minimal quality loss. The encoder reduces a 196,608-value image (256×256×3) to just 1024 numbers, while the decoder reconstructs the original image from this compressed representation.

**Architecture:**
- Encoder: convolutional layers with downsampling (256×256×3 → 1024)
- Decoder: transposed convolutional layers with upsampling (1024 → 256×256×3)
- Activations: LeakyReLU (hidden layers) and sigmoid (output)
- Normalization: batch normalization

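The README does not list exact layer counts or filter widths, so the following is only a minimal sketch of one way to realize the stated 256×256×3 → 1024 → 256×256×3 shape contract in Keras. Every filter count and layer choice here is an assumption, not the trained model's actual configuration:

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_autoencoder(latent_dim=1024):
    """Sketch of a conv autoencoder with a 1024-d latent bottleneck."""
    # Encoder: stride-2 convs halve spatial dims (filter counts are assumptions)
    encoder = keras.Sequential([
        keras.Input(shape=(256, 256, 3)),
        layers.Conv2D(32, 3, strides=2, padding='same'),    # -> 128x128x32
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        layers.Conv2D(64, 3, strides=2, padding='same'),    # -> 64x64x64
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        layers.Conv2D(128, 3, strides=2, padding='same'),   # -> 32x32x128
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        layers.Flatten(),
        layers.Dense(latent_dim),                           # -> 1024-d latent
    ], name='encoder')

    # Decoder: mirror image with transposed convs; sigmoid maps output to [0, 1]
    decoder = keras.Sequential([
        keras.Input(shape=(latent_dim,)),
        layers.Dense(32 * 32 * 128),
        layers.Reshape((32, 32, 128)),
        layers.Conv2DTranspose(64, 3, strides=2, padding='same'),   # -> 64x64
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        layers.Conv2DTranspose(32, 3, strides=2, padding='same'),   # -> 128x128
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        layers.Conv2DTranspose(3, 3, strides=2, padding='same',
                               activation='sigmoid'),               # -> 256x256x3
    ], name='decoder')

    inputs = keras.Input(shape=(256, 256, 3))
    return keras.Model(inputs, decoder(encoder(inputs)))
```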
**Performance:**
- Compression ratio: 192×
- Target PSNR: >30 dB
- Target SSIM: >0.90

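PSNR compares reconstruction error to the peak signal value, so higher is better; a minimal NumPy sketch for images scaled to [0, 1] (SSIM is structural and needs a library such as scikit-image, so it is not shown here):

```python
import numpy as np


def psnr(original, reconstructed, max_val=1.0):
    """Peak signal-to-noise ratio in dB for arrays scaled to [0, max_val]."""
    mse = np.mean((original - reconstructed) ** 2)
    return 10.0 * np.log10(max_val ** 2 / mse)


# Tiny synthetic example: a uniform error of 0.01 per pixel
img = np.full((256, 256, 3), 0.5)
noisy = img + 0.01
print(round(psnr(img, noisy), 1))  # 40.0
```

A reconstruction above the 30 dB target corresponds to a mean squared error below 0.001 on [0, 1]-scaled images.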
## Intended use

This model is designed for **educational purposes**: it demonstrates how autoencoders can learn a compression scheme directly from data, rather than relying on hand-crafted transforms the way codecs like JPEG or PNG do.

**Use cases:**
- Understanding autoencoder architectures
- Learning about lossy compression
- Exploring latent-space representations
- Teaching AI/ML concepts in bootcamps

## Training data

Trained on [DF2K_OST](https://huggingface.co/datasets/gperdrizet/DF2K_OST), a combined dataset of high-quality images drawn from:
- DIV2K (800 images)
- Flickr2K (2,650 images)
- OutdoorSceneTraining (10,424 images)

All images were resized to 256×256 pixels using Lanczos resampling.

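The same preprocessing can be reproduced with Pillow; this is a sketch, and the dataset's actual pipeline may differ in details such as cropping or color handling:

```python
from PIL import Image


def preprocess(source, size=(256, 256)):
    """Resize an image to 256x256 RGB with Lanczos resampling."""
    img = source if isinstance(source, Image.Image) else Image.open(source)
    return img.convert('RGB').resize(size, Image.LANCZOS)


# Example with an in-memory image standing in for a dataset file
src = Image.new('RGB', (1024, 768), color=(120, 60, 200))
out = preprocess(src)
print(out.size)  # (256, 256)
```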
## Training details

**Hyperparameters:**
- Optimizer: Adam (lr=1e-3)
- Loss function: mean squared error (MSE)
- Batch size: 16
- Epochs: up to 100 (with early stopping)
- Train/validation split: 90/10

**Callbacks:**
- Early stopping (patience=5, monitoring validation loss)
- Learning-rate reduction (factor=0.5, patience=3)
- Model checkpoint (best validation loss)

**Hardware:**
- Single NVIDIA GPU with memory growth enabled

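The callback settings above map directly onto the standard Keras callback classes; a sketch of how they might be wired up (the checkpoint filename and `restore_best_weights` choice are assumptions):

```python
from tensorflow import keras

# Callbacks matching the settings listed above, all monitoring validation loss
callbacks = [
    keras.callbacks.EarlyStopping(monitor='val_loss', patience=5,
                                  restore_best_weights=True),
    keras.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.5,
                                      patience=3),
    keras.callbacks.ModelCheckpoint('compression_ae.keras',
                                    monitor='val_loss', save_best_only=True),
]

# Passed to training as: model.fit(..., callbacks=callbacks)
```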
## How to use

```python
from tensorflow import keras
from huggingface_hub import hf_hub_download

# Download the trained model from the Hugging Face Hub
model_path = hf_hub_download(
    repo_id='gperdrizet/compression_autoencoder',
    filename='models/compression_ae.keras',
    repo_type='model'
)

# Load the full autoencoder (encoder + decoder)
autoencoder = keras.models.load_model(model_path)

# Running the full model compresses and decompresses in one step:
# the output is the reconstruction, not the latent vector
reconstructed = autoencoder.predict(images)  # images shape: (N, 256, 256, 3)
```

For complete examples, see the [training notebook](https://github.com/gperdrizet/autoencoders/blob/main/notebooks/01-compression.ipynb).

## Limitations

- Fixed input size (256×256 RGB images)
- Lossy compression (reconstructions are not pixel-perfect)
- Not optimized for specific image types
- Slower than traditional codecs such as JPEG
- Educational model, not production-ready

## Project repository

Full code, training notebooks, and an interactive demo: [gperdrizet/autoencoders](https://github.com/gperdrizet/autoencoders)

## Citation

If you use this model for educational purposes, please reference the project repository.