---
license: apache-2.0
tags:
- image-deblurring
- image-restoration
- stripformer
- pytorch
- computer-vision
- image-to-image
---
# Stripformer TTA Ensembling (Image Deblurring)
This repository provides a **PyTorch implementation of an image deblurring model**
based on the **official Stripformer architecture (ECCV 2022)**, trained on the
**GoPro Motion Deblurring Dataset**.
The released model includes a **full Test-Time Augmentation (TTA) ensembling
pipeline** and **post-processing enhancements**, improving robustness on
real-world motion blur.
---
## 📘 Conceptual Walkthrough (Medium Article)
This model is accompanied by a detailed **conceptual explanation of modern image
deblurring**: a step-by-step, intuitive account of *how and why* neural networks
learn to undo blur.
The article walks through the full deblurring pipeline—from feature extraction
and spatial reasoning to coarse-to-fine refinement and artifact suppression—and
uses **Stripformer + TTA** as a concrete reference throughout to bridge theory
with practice.
👉 **Read the full explanation here:**
https://medium.com/@varunpatels2004/how-neural-network-learns-to-undo-the-blur-4751bbf86f29
---
## Architecture and References
- **Model Architecture:** Stripformer (ECCV 2022, official design)
- **Paper:** https://arxiv.org/abs/2204.04627
- **Official GitHub Repository:**
https://github.com/pp00704831/Stripformer-ECCV-2022-
The architecture follows an encoder–transformer–decoder design with
strip-based self-attention and residual learning (`output + input`).
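The residual formulation above can be illustrated with a minimal sketch. The `ResidualDeblur` wrapper and its toy backbone are hypothetical stand-ins, not the repository's actual classes; they only show the `output = input + residual` idea:

```python
import torch
import torch.nn as nn

class ResidualDeblur(nn.Module):
    """Illustrative residual wrapper: the backbone predicts a residual
    that is added back onto the blurry input (output = input + residual)."""

    def __init__(self, backbone: nn.Module):
        super().__init__()
        self.backbone = backbone

    def forward(self, blurry: torch.Tensor) -> torch.Tensor:
        # The network only has to learn the correction, not the full image.
        return blurry + self.backbone(blurry)
```

Learning the residual rather than the full sharp image is a common choice in restoration networks, since most of the output is already present in the input.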
---
## Model Overview
- **Task:** Image Deblurring (Image-to-Image)
- **Framework:** PyTorch
- **Training Data:** GoPro Motion Deblurring Dataset
- **Inference:** Single-image inference with full TTA
- **Checkpoint format:** PyTorch `state_dict`
- **License:** Apache-2.0
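Because the checkpoint is a plain PyTorch `state_dict`, loading it follows the standard save/restore round-trip. The sketch below uses a toy `nn.Conv2d` in place of the actual Stripformer class (which comes from the official repository) and a temporary file name, purely to illustrate the format:

```python
import os
import tempfile
import torch
import torch.nn as nn

# Toy stand-in for the Stripformer class; the real architecture
# is defined in the official repository, not here.
model = nn.Conv2d(3, 3, kernel_size=3, padding=1)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "weights.pth")
    torch.save(model.state_dict(), path)  # weights only, no architecture

    # Restoring requires instantiating the same architecture first.
    restored = nn.Conv2d(3, 3, kernel_size=3, padding=1)
    restored.load_state_dict(torch.load(path, map_location="cpu"))
    restored.eval()  # switch to inference mode
```

For the released weights you would instead instantiate the Stripformer model and point `torch.load` at `best_model.pth`.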
---
## Training Details
- **Dataset:** GoPro Motion Deblurring Dataset
- **Supervision:** Paired blurry–sharp images
- **Blur Type:** Realistic motion blur
- **Image Domain:** Natural RGB images
The model was trained using the **official Stripformer architecture without
architectural modification**.
---
## Inference Pipeline
The provided `inference.py` reproduces the **exact inference pipeline used
during evaluation**.
### Test-Time Augmentation (TTA)
The following augmentations are applied:
- Identity
- Horizontal flip
- Vertical flip
- Horizontal + vertical flip
- Transpose
Each prediction is de-augmented and **averaged** to produce the final output.
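The augment/de-augment/average loop can be sketched as follows. This is an illustrative NumPy version on a single-channel image, not the repository's exact code; each augmentation is paired with its inverse so predictions land back in the original orientation before averaging:

```python
import numpy as np

# (augment, de-augment) pairs for a 2-D image: identity, horizontal
# flip, vertical flip, both flips, and transpose.
AUGS = [
    (lambda x: x,             lambda y: y),
    (lambda x: x[:, ::-1],    lambda y: y[:, ::-1]),
    (lambda x: x[::-1, :],    lambda y: y[::-1, :]),
    (lambda x: x[::-1, ::-1], lambda y: y[::-1, ::-1]),
    (lambda x: x.T,           lambda y: y.T),
]

def tta_predict(model, image: np.ndarray) -> np.ndarray:
    """Run `model` on each augmented view, undo the augmentation,
    and average the de-augmented predictions."""
    preds = [inv(model(aug(image))) for aug, inv in AUGS]
    return np.mean(preds, axis=0)
```

Averaging in this way cancels orientation-dependent artifacts, which is where the small but consistent PSNR/SSIM gains reported below come from.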
### Post-Processing
- High-frequency detail enhancement
- Adaptive CLAHE (contrast normalization)
These steps improve perceptual sharpness and stability for strong motion blur.
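The detail-enhancement step can be sketched as a simple unsharp mask: boost the difference between the image and a blurred copy. This toy NumPy version (3x3 box blur, hypothetical `amount` parameter) only illustrates the idea; the CLAHE step would typically use OpenCV's `cv2.createCLAHE` and is not reproduced here:

```python
import numpy as np

def unsharp_mask(img: np.ndarray, amount: float = 0.5) -> np.ndarray:
    """Toy high-frequency enhancement for an image in [0, 1]:
    add back a scaled copy of (image - box-blurred image)."""
    pad = np.pad(img, 1, mode="edge")
    # 3x3 box blur via nine shifted views of the padded image.
    blurred = sum(
        pad[1 + dy : pad.shape[0] - 1 + dy, 1 + dx : pad.shape[1] - 1 + dx]
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
    ) / 9.0
    return np.clip(img + amount * (img - blurred), 0.0, 1.0)
```

Flat regions are left untouched (the blur equals the image there), so only edges and fine texture are amplified.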
---
## Evaluation Summary (Limited)
A focused evaluation was conducted on **10 extreme-blur samples**.
| Metric | Observation |
|------|------------|
| Best base PSNR | ~31.71 dB |
| TTA PSNR gain | +0.14 to +0.33 dB |
| Avg SSIM improvement | +0.0039 |
| Consistency | Improvement on all tested samples |
These results illustrate **consistent but modest gains from TTA**.
This is **not** a full benchmark evaluation.
---
## Qualitative Results
The following examples compare:
- Blurred input
- Base Stripformer output
- Stripformer + full TTA
- Ground-truth sharp image
### Example 1
![](assets/extreme_1571_450.png)
### Example 2
![](assets/extreme_0133_1118.png)
---
## Repository Structure
| File | Description |
|----|----|
| `best_model.pth` | Trained model weights (`state_dict`) |
| `inference.py` | Full inference pipeline with TTA |
| `requirements.txt` | Python dependencies |
| `config.json` | Model metadata |
| `.gitattributes` | Git LFS configuration |
---
## Installation
```bash
pip install -r requirements.txt
```