---
license: apache-2.0
tags:
- image-deblurring
- image-restoration
- stripformer
- pytorch
- computer-vision
- image-to-image
---
# Stripformer TTA Ensembling (Image Deblurring)
This repository provides a **PyTorch implementation of an image deblurring model**
based on the **official Stripformer architecture (ECCV 2022)**, trained on the
**GoPro Motion Deblurring Dataset**.
The released model includes a **full Test-Time Augmentation (TTA) ensembling
pipeline** and **post-processing enhancements**, improving robustness on
real-world motion blur.
---
## 📘 Conceptual Walkthrough (Medium Article)
This model is accompanied by a detailed **conceptual explanation of modern image
deblurring**: a step-by-step, intuitive account of *how and why* neural networks
learn to undo blur.
The article walks through the full deblurring pipeline—from feature extraction
and spatial reasoning to coarse-to-fine refinement and artifact suppression—and
uses **Stripformer + TTA** as a concrete reference throughout to bridge theory
with practice.
👉 **Read the full explanation here:**
https://medium.com/@varunpatels2004/how-neural-network-learns-to-undo-the-blur-4751bbf86f29
---
## Architecture and References
- **Model Architecture:** Stripformer (ECCV 2022, official design)
- **Paper:** https://arxiv.org/abs/2204.04627
- **Official GitHub Repository:**
https://github.com/pp00704831/Stripformer-ECCV-2022-
The architecture follows an encoder–transformer–decoder design with
strip-based self-attention and residual learning (`output + input`).
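The residual formulation above can be illustrated with a minimal sketch. The `ResidualDeblur` wrapper and its toy backbone are hypothetical stand-ins, not the repository's actual classes; they only show the `output = input + residual` idea:

```python
import torch
import torch.nn as nn

class ResidualDeblur(nn.Module):
    """Illustrative residual wrapper: the backbone predicts a residual
    that is added back onto the blurry input (output = input + residual)."""

    def __init__(self, backbone: nn.Module):
        super().__init__()
        self.backbone = backbone

    def forward(self, blurry: torch.Tensor) -> torch.Tensor:
        # The network only has to learn the correction, not the full image.
        return blurry + self.backbone(blurry)
```

Learning the residual rather than the full sharp image is a common choice in restoration networks, since most of the output is already present in the input.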
---
## Model Overview
- **Task:** Image Deblurring (Image-to-Image)
- **Framework:** PyTorch
- **Training Data:** GoPro Motion Deblurring Dataset
- **Inference:** Single-image inference with full TTA
- **Checkpoint format:** PyTorch `state_dict`
- **License:** Apache-2.0
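Because the checkpoint is a plain PyTorch `state_dict`, loading it follows the standard save/restore round-trip. The sketch below uses a toy `nn.Conv2d` in place of the actual Stripformer class (which comes from the official repository) and a temporary file name, purely to illustrate the format:

```python
import os
import tempfile
import torch
import torch.nn as nn

# Toy stand-in for the Stripformer class; the real architecture
# is defined in the official repository, not here.
model = nn.Conv2d(3, 3, kernel_size=3, padding=1)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "weights.pth")
    torch.save(model.state_dict(), path)  # weights only, no architecture

    # Restoring requires instantiating the same architecture first.
    restored = nn.Conv2d(3, 3, kernel_size=3, padding=1)
    restored.load_state_dict(torch.load(path, map_location="cpu"))
    restored.eval()  # switch to inference mode
```

For the released weights you would instead instantiate the Stripformer model and point `torch.load` at `best_model.pth`.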
---
## Training Details
- **Dataset:** GoPro Motion Deblurring Dataset
- **Supervision:** Paired blurry–sharp images
- **Blur Type:** Realistic motion blur
- **Image Domain:** Natural RGB images
The model was trained using the **official Stripformer architecture without
architectural modification**.
---
## Inference Pipeline
The provided `inference.py` reproduces the **exact inference pipeline used
during evaluation**.
### Test-Time Augmentation (TTA)
The following augmentations are applied:
- Identity
- Horizontal flip
- Vertical flip
- Horizontal + vertical flip
- Transpose
Each prediction is de-augmented and **averaged** to produce the final output.
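The augment/de-augment/average loop can be sketched as follows. This is an illustrative NumPy version on a single-channel image, not the repository's exact code; each augmentation is paired with its inverse so predictions land back in the original orientation before averaging:

```python
import numpy as np

# (augment, de-augment) pairs for a 2-D image: identity, horizontal
# flip, vertical flip, both flips, and transpose.
AUGS = [
    (lambda x: x,             lambda y: y),
    (lambda x: x[:, ::-1],    lambda y: y[:, ::-1]),
    (lambda x: x[::-1, :],    lambda y: y[::-1, :]),
    (lambda x: x[::-1, ::-1], lambda y: y[::-1, ::-1]),
    (lambda x: x.T,           lambda y: y.T),
]

def tta_predict(model, image: np.ndarray) -> np.ndarray:
    """Run `model` on each augmented view, undo the augmentation,
    and average the de-augmented predictions."""
    preds = [inv(model(aug(image))) for aug, inv in AUGS]
    return np.mean(preds, axis=0)
```

Averaging in this way cancels orientation-dependent artifacts, which is where the small but consistent PSNR/SSIM gains reported below come from.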
### Post-Processing
- High-frequency detail enhancement
- Adaptive CLAHE (contrast normalization)
These steps improve perceptual sharpness and stability for strong motion blur.
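The detail-enhancement step can be sketched as a simple unsharp mask: boost the difference between the image and a blurred copy. This toy NumPy version (3x3 box blur, hypothetical `amount` parameter) only illustrates the idea; the CLAHE step would typically use OpenCV's `cv2.createCLAHE` and is not reproduced here:

```python
import numpy as np

def unsharp_mask(img: np.ndarray, amount: float = 0.5) -> np.ndarray:
    """Toy high-frequency enhancement for an image in [0, 1]:
    add back a scaled copy of (image - box-blurred image)."""
    pad = np.pad(img, 1, mode="edge")
    # 3x3 box blur via nine shifted views of the padded image.
    blurred = sum(
        pad[1 + dy : pad.shape[0] - 1 + dy, 1 + dx : pad.shape[1] - 1 + dx]
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
    ) / 9.0
    return np.clip(img + amount * (img - blurred), 0.0, 1.0)
```

Flat regions are left untouched (the blur equals the image there), so only edges and fine texture are amplified.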
---
## Evaluation Summary (Limited)
A focused evaluation was conducted on **10 extreme-blur samples**.
| Metric | Observation |
|------|------------|
| Best base PSNR | ~31.71 dB |
| TTA PSNR gain | +0.14 to +0.33 dB |
| Avg SSIM improvement | +0.0039 |
| Consistency | Improvement on all tested samples |
These results illustrate **consistent but modest gains from TTA**.
This is **not** a full benchmark evaluation.
---
## Qualitative Results
The following examples compare:
- Blurred input
- Base Stripformer output
- Stripformer + full TTA
- Ground-truth sharp image
### Example 1
![](assets/extreme_1571_450.png)
### Example 2
![](assets/extreme_0133_1118.png)
---
## Repository Structure
| File | Description |
|----|----|
| `best_model.pth` | Trained model weights (`state_dict`) |
| `inference.py` | Full inference pipeline with TTA |
| `requirements.txt` | Python dependencies |
| `config.json` | Model metadata |
| `.gitattributes` | Git LFS configuration |
---
## Installation
```bash
pip install -r requirements.txt
```