--- |
|
|
language: en |
|
|
license: mit |
|
|
tags: |
|
|
- image-denoising |
|
|
- tensorflow |
|
|
- unet |
|
|
- computer-vision |
|
|
inference: true |
|
|
datasets: |
|
|
- AIOmarRehan/Cropped_Yale_Faces |
|
|
--- |
|
|
|
|
|
[For a detailed walkthrough of this project, see the accompanying Medium article.](https://medium.com/@ai.omar.rehan/a-u-net-based-cnn-autoencoder-for-cleaning-noisy-images-before-classification-132e27b828e2)
|
|
|
|
|
--- |
|
|
|
|
|
# **U-Net CNN Autoencoder for Image Denoising** |
|
|
|
|
|
A hands-on guide to building a deep-learning model that cleans noisy images, improving downstream classification tasks. |
|
|
|
|
|
When I began experimenting with image-classification projects, I quickly realized how sensitive models are to noise. Small imperfections—sensor noise, compression artifacts, random pixel disturbances—could drastically reduce performance. |
|
|
|
|
|
Instead of training classifiers directly on noisy images, I decided to build a **preprocessing model**: one whose sole purpose is to take a noisy input and output a cleaner version. This approach allows classifiers to focus on meaningful patterns rather than irrelevant distortions. |
|
|
|
|
|
That led me to design a **U-Net–based CNN Autoencoder**. |
|
|
|
|
|
This repository covers: |
|
|
|
|
|
* Why I chose a U-Net structure |
|
|
* The design of the autoencoder |
|
|
* How noisy images were generated |
|
|
* Training and evaluation process |
|
|
* Key results and insights |
|
|
|
|
|
**Goal:** Leverage a robust deep-learning architecture to denoise images before feeding them to classifiers. |
|
|
|
|
|
--- |
|
|
|
|
|
## 1. Environment Setup |
|
|
|
|
|
The project uses the standard TensorFlow/Keras stack: |
|
|
|
|
|
```python
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D, concatenate
from tensorflow.keras.models import Model
```
|
|
|
|
|
This provides a flexible foundation for building custom CNN architectures. |
|
|
|
|
|
--- |
|
|
|
|
|
## 2. Why a U-Net Autoencoder? |
|
|
|
|
|
Traditional autoencoders compress and reconstruct images, but the narrow bottleneck often discards fine spatial detail.
|
|
|
|
|
**U-Net advantages:** |
|
|
|
|
|
* Downsamples to learn a compact representation |
|
|
* Upsamples to reconstruct the image |
|
|
* Uses **skip connections** to preserve high-resolution features |
|
|
|
|
|
This makes U-Net ideal for: denoising, segmentation, super-resolution, and image restoration tasks. |
|
|
|
|
|
--- |
|
|
|
|
|
## 3. Building the Model |
|
|
|
|
|
**Encoder:** |
|
|
|
|
|
```python
# Assumed input shape: 28x28 grayscale images (e.g., MNIST)
inputs = Input(shape=(28, 28, 1))

c1 = Conv2D(64, 3, activation='relu', padding='same')(inputs)
p1 = MaxPooling2D((2, 2))(c1)
c2 = Conv2D(128, 3, activation='relu', padding='same')(p1)
p2 = MaxPooling2D((2, 2))(c2)
```
|
|
|
|
|
**Bottleneck:** |
|
|
|
|
|
```python
bn = Conv2D(256, 3, activation='relu', padding='same')(p2)
```
|
|
|
|
|
**Decoder:** |
|
|
|
|
|
```python
u1 = UpSampling2D((2, 2))(bn)
m1 = concatenate([u1, c2])  # skip connection from the encoder
c3 = Conv2D(128, 3, activation='relu', padding='same')(m1)
u2 = UpSampling2D((2, 2))(c3)
m2 = concatenate([u2, c1])  # skip connection from the encoder
c4 = Conv2D(64, 3, activation='relu', padding='same')(m2)
outputs = Conv2D(1, 3, activation='sigmoid', padding='same')(c4)

# Assemble the full network; `model` is used in the training step below
model = Model(inputs, outputs)
```
|
|
|
|
|
Core concept: **down → compress → up → reconnect → reconstruct** |
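
With the model assembled, `model.summary()` offers a quick sanity check that the encoder and decoder shapes line up and that each skip connection matches its counterpart:

```python
# Print layer-by-layer output shapes to verify the U-shape is symmetric
model.summary()
```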
|
|
|
|
|
--- |
|
|
|
|
|
## 4. Creating Noisy Data |
|
|
|
|
|
I added Gaussian noise to MNIST digits to generate training pairs: |
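
Before noise can be added, the digits have to be loaded and scaled to the [0, 1] range. A minimal preparation sketch, assuming the standard Keras MNIST loader:

```python
# Load MNIST, scale pixels to [0, 1], and add a channel axis for Conv2D
(x_train, _), (x_test, _) = tf.keras.datasets.mnist.load_data()
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0
x_train = x_train[..., np.newaxis]  # shape: (60000, 28, 28, 1)
x_test = x_test[..., np.newaxis]    # shape: (10000, 28, 28, 1)
```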
|
|
|
|
|
```python
noise_factor = 0.4
x_train_noisy = x_train + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=x_train.shape)
x_train_noisy = np.clip(x_train_noisy, 0.0, 1.0)  # keep pixel values in [0, 1]
```
|
|
|
|
|
Training pairs: |
|
|
|
|
|
* **Clean image** |
|
|
* **Noisy version** |
|
|
|
|
|
Perfect for learning a denoising function. |
|
|
|
|
|
--- |
|
|
|
|
|
## 5. Training the Autoencoder |
|
|
|
|
|
Compile: |
|
|
|
|
|
```python
model.compile(optimizer='adam', loss='binary_crossentropy')
```
|
|
|
|
|
Train: |
|
|
|
|
|
```python
model.fit(x_train_noisy, x_train, epochs=10, batch_size=128, validation_split=0.1)
```
|
|
|
|
|
The model learns a simple rule: **Noisy input → Clean output**. |
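
Once trained, denoising is a single forward pass. A minimal sketch, assuming a noisy test set built the same way as the training data:

```python
# Build a noisy test set and denoise it in one forward pass
x_test_noisy = np.clip(
    x_test + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=x_test.shape),
    0.0, 1.0)
denoised = model.predict(x_test_noisy)
```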
|
|
|
|
|
--- |
|
|
|
|
|
## 6. Visualizing Results |
|
|
|
|
|
After training, compare three views of each test sample:
|
|
|
|
|
* Noisy input |
|
|
* Denoised output |
|
|
* Original image |
|
|
|
|
|
The autoencoder effectively removes the noise while keeping key structures intact, which makes it a good fit for lightweight models and simple datasets like MNIST.
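
A minimal plotting sketch, assuming `x_test_noisy` and `denoised` from the previous step:

```python
# Show noisy input, denoised output, and original for the first n test digits
n = 5
plt.figure(figsize=(2 * n, 6))
rows = [(x_test_noisy, 'Noisy'), (denoised, 'Denoised'), (x_test, 'Original')]
for row, (images, title) in enumerate(rows):
    for i in range(n):
        ax = plt.subplot(3, n, row * n + i + 1)
        ax.imshow(images[i].squeeze(), cmap='gray')
        ax.set_title(title)
        ax.axis('off')
plt.tight_layout()
plt.show()
```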
|
|
|
|
|
--- |
|
|
|
|
|
## 7. Benefits for Classification |
|
|
|
|
|
A denoising preprocessing step improves real-world image classification pipelines: |
|
|
|
|
|
**Pipeline:** |
|
|
`Noisy Image → Autoencoder → Classifier → Prediction` |
|
|
|
|
|
This setup helps with noise from:
|
|
|
|
|
* Cameras or sensors |
|
|
* Low-light conditions |
|
|
* Compression or motion blur |
|
|
|
|
|
Cleaner inputs → better predictions. |
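
A minimal sketch of that pipeline, where `classifier` is a hypothetical, separately trained digit classifier (not part of this repo):

```python
# Hypothetical two-stage pipeline: denoise first, then classify
cleaned = model.predict(x_test_noisy)       # autoencoder removes noise
predictions = classifier.predict(cleaned)   # `classifier` is a placeholder model
predicted_labels = np.argmax(predictions, axis=1)
```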
|
|
|
|
|
--- |
|
|
|
|
|
## 8. Key Takeaways |
|
|
|
|
|
* U-Net skip connections preserve important features |
|
|
* Autoencoders are powerful preprocessing tools |
|
|
* Denoising improves classifier performance |
|
|
* Lightweight, easy to integrate, and straightforward to adapt to other datasets
|
|
|
|
|
This method is practical and immediately applicable to real-world noisy data. |