---
license: apache-2.0
tags:
  - image-deblurring
  - image-restoration
  - stripformer
  - pytorch
  - computer-vision
  - image-to-image
---

# Stripformer TTA Ensembling (Image Deblurring)

This repository provides a PyTorch implementation of an image deblurring model based on the official Stripformer architecture (ECCV 2022), trained on the GoPro Motion Deblurring Dataset.

The released model includes a full Test-Time Augmentation (TTA) ensembling pipeline and post-processing enhancements, improving robustness on real-world motion blur.


## 📘 Conceptual Walkthrough (Medium Article)

This model is accompanied by a detailed, step-by-step conceptual explanation of modern image deblurring, covering how and why neural networks learn to undo blur.

The article walks through the full deblurring pipeline—from feature extraction and spatial reasoning to coarse-to-fine refinement and artifact suppression—and uses Stripformer + TTA as a concrete reference throughout to bridge theory with practice.

👉 Read the full explanation here:
https://medium.com/@varunpatels2004/how-neural-network-learns-to-undo-the-blur-4751bbf86f29


## Architecture and References

The architecture follows an encoder–transformer–decoder design with strip-based self-attention and residual learning: the network predicts a correction that is added back onto the blurry input (output + input).
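The residual wiring can be sketched as follows. This is an illustration only, not the official implementation: the placeholder convolutions stand in for the real encoder, strip-attention blocks, and decoder.

```python
import torch
import torch.nn as nn

class ResidualDeblurSketch(nn.Module):
    """Toy sketch of encoder -> middle -> decoder with a residual skip.

    The real Stripformer uses strip-based self-attention in the middle
    stage; plain convolutions are used here only to show the wiring.
    """

    def __init__(self, channels: int = 32):
        super().__init__()
        self.encoder = nn.Conv2d(3, channels, 3, padding=1)        # placeholder encoder
        self.middle = nn.Conv2d(channels, channels, 3, padding=1)  # stands in for strip attention
        self.decoder = nn.Conv2d(channels, 3, 3, padding=1)        # placeholder decoder

    def forward(self, blurry: torch.Tensor) -> torch.Tensor:
        features = self.middle(self.encoder(blurry))
        # Residual learning: predict a correction and add it to the input.
        return self.decoder(features) + blurry
```

Because the network only has to learn the correction, the identity mapping (no change to the input) is trivially available, which tends to stabilize training for restoration tasks.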


## Model Overview

- Task: Image Deblurring (Image-to-Image)
- Framework: PyTorch
- Training Data: GoPro Motion Deblurring Dataset
- Inference: Single-image inference with full TTA
- Checkpoint format: PyTorch `state_dict`
- License: Apache-2.0

## Training Details

- Dataset: GoPro Motion Deblurring Dataset
- Supervision: Paired blurry–sharp images
- Blur Type: Realistic motion blur
- Image Domain: Natural RGB images

The model was trained using the official Stripformer architecture without architectural modification.


## Inference Pipeline

The provided `inference.py` reproduces the exact inference pipeline used during evaluation.
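Since the checkpoint is a plain `state_dict`, loading it follows the standard PyTorch pattern. The sketch below uses a toy `Conv2d` in place of the Stripformer class (which lives in the official repository) and an in-memory buffer in place of `best_model.pth`, purely to illustrate the workflow.

```python
import io
import torch
import torch.nn as nn

# Toy module standing in for the Stripformer model class.
net = nn.Conv2d(3, 3, 3, padding=1)

# Save the state_dict; an io.BytesIO buffer stands in for a .pth file.
buffer = io.BytesIO()
torch.save(net.state_dict(), buffer)
buffer.seek(0)

# Restore the weights into a freshly constructed module.
restored = nn.Conv2d(3, 3, 3, padding=1)
restored.load_state_dict(torch.load(buffer, map_location="cpu"))
restored.eval()  # switch to inference mode, as an inference script would
```

With the real checkpoint, the same pattern applies: construct the Stripformer model, call `load_state_dict` on the loaded file, then `eval()` before running TTA.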

### Test-Time Augmentation (TTA)

The following augmentations are applied:

- Identity
- Horizontal flip
- Vertical flip
- Horizontal + vertical flip
- Transpose

Each prediction is de-augmented and averaged to produce the final output.
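The augment, predict, de-augment, and average steps can be sketched as below. The `deblur` callable is a placeholder for the model's forward pass; the actual pipeline in `inference.py` operates on tensors and may order the operations differently.

```python
import numpy as np

def tta_deblur(image: np.ndarray, deblur) -> np.ndarray:
    """Average de-augmented predictions over the five TTA variants.

    image: H x W x C array; deblur: callable mapping an image to an image.
    """
    # Each pair is (apply_augmentation, undo_augmentation). For flips the
    # inverse is the same flip; for transpose it is transposing back.
    augmentations = [
        (lambda x: x,                    lambda x: x),                     # identity
        (lambda x: x[:, ::-1],           lambda x: x[:, ::-1]),            # horizontal flip
        (lambda x: x[::-1, :],           lambda x: x[::-1, :]),            # vertical flip
        (lambda x: x[::-1, ::-1],        lambda x: x[::-1, ::-1]),         # both flips
        (lambda x: x.transpose(1, 0, 2), lambda x: x.transpose(1, 0, 2)),  # transpose
    ]
    outputs = [undo(deblur(apply(image))) for apply, undo in augmentations]
    return np.mean(outputs, axis=0)  # average the de-augmented predictions
```

Averaging in the original orientation is what makes the ensemble valid: each prediction is mapped back into the input's coordinate frame before the mean is taken.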

### Post-Processing

- High-frequency detail enhancement
- Adaptive CLAHE (contrast normalization)

These steps improve perceptual sharpness and stability for strong motion blur.
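As an illustration of the detail-enhancement step, the sketch below applies a simple unsharp mask: blur the image, take the high-frequency residual, and add a scaled copy back. The kernel and strength here are assumptions, not the released pipeline's exact settings, and CLAHE (typically `cv2.createCLAHE` on the luminance channel) is omitted to keep the sketch dependency-free.

```python
import numpy as np

def unsharp_mask(image: np.ndarray, amount: float = 0.5) -> np.ndarray:
    """Boost high frequencies in a single-channel float image in [0, 1]."""
    # 3x3 box blur as a cheap low-pass filter.
    h, w = image.shape
    padded = np.pad(image, 1, mode="edge")
    blurred = sum(
        padded[i : i + h, j : j + w] for i in range(3) for j in range(3)
    ) / 9.0
    # Add back a scaled high-frequency residual, then clip to valid range.
    return np.clip(image + amount * (image - blurred), 0.0, 1.0)
```

The clipping step matters in practice: without it, sharpening around strong edges can push values outside the displayable range and produce halo artifacts.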


## Evaluation Summary (Limited)

A focused evaluation was conducted on 10 extreme-blur samples.

| Metric | Observation |
| --- | --- |
| Best base PSNR | ~31.71 dB |
| TTA PSNR gain | +0.14 to +0.33 dB |
| Avg. SSIM improvement | +0.0039 |
| Consistency | Improvement on all tested samples |

These results illustrate consistent but modest gains from TTA.

This is not a full benchmark evaluation.


## Qualitative Results

The following examples compare:

- Blurred input
- Base Stripformer output
- Stripformer + full TTA
- Ground-truth sharp image

### Example 1

### Example 2


## Repository Structure

| File | Description |
| --- | --- |
| `best_model.pth` | Trained model weights (`state_dict`) |
| `inference.py` | Full inference pipeline with TTA |
| `requirements.txt` | Python dependencies |
| `config.json` | Model metadata |
| `.gitattributes` | Git LFS configuration |

## Installation

```bash
pip install -r requirements.txt
```