
Stripformer TTA Ensembling (Image Deblurring)

This repository provides a PyTorch implementation of an image deblurring model based on the official Stripformer architecture (ECCV 2022), trained on the GoPro Motion Deblurring Dataset.

The released model includes a full Test-Time Augmentation (TTA) ensembling pipeline and post-processing enhancements, improving robustness on real-world motion blur.


📘 Conceptual Walkthrough (Medium Article)

This model is accompanied by a detailed conceptual explanation of modern image deblurring that walks through, step by step, how and why neural networks learn to undo blur.

The article walks through the full deblurring pipeline—from feature extraction and spatial reasoning to coarse-to-fine refinement and artifact suppression—and uses Stripformer + TTA as a concrete reference throughout to bridge theory with practice.

👉 Read the full explanation here:
https://medium.com/@varunpatels2004/how-neural-network-learns-to-undo-the-blur-4751bbf86f29


Architecture and References

The architecture follows an encoder–transformer–decoder design with strip-based self-attention and residual learning: the network predicts a residual that is added back to the blurry input to form the restored image.
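The residual formulation can be sketched as a thin wrapper around any backbone. `ResidualDeblur` is an illustrative name, and the backbone here is a stand-in, not the actual Stripformer encoder–transformer–decoder:

```python
import torch
import torch.nn as nn

class ResidualDeblur(nn.Module):
    """Residual learning: the backbone predicts (sharp - blurry),
    which is added back to the input image."""
    def __init__(self, backbone):
        super().__init__()
        self.backbone = backbone  # stand-in for the Stripformer trunk

    def forward(self, x):
        # Output = input + predicted residual
        return x + self.backbone(x)
```

This keeps the network's job easy when the input is only mildly blurred: predicting a near-zero residual already reproduces the input.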


Model Overview

  • Task: Image Deblurring (Image-to-Image)
  • Framework: PyTorch
  • Training Data: GoPro Motion Deblurring Dataset
  • Inference: Single-image inference with full TTA
  • Checkpoint format: PyTorch state_dict
  • License: Apache-2.0

Training Details

  • Dataset: GoPro Motion Deblurring Dataset
  • Supervision: Paired blurry–sharp images
  • Blur Type: Realistic motion blur
  • Image Domain: Natural RGB images

The model was trained using the official Stripformer architecture without architectural modification.


Inference Pipeline

The provided inference.py reproduces the exact inference pipeline used during evaluation.

Test-Time Augmentation (TTA)

The following augmentations are applied:

  • Identity
  • Horizontal flip
  • Vertical flip
  • Horizontal + vertical flip
  • Transpose

Each prediction is de-augmented and averaged to produce the final output.
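A minimal sketch of this ensembling scheme is shown below. The function name `tta_inference` and the forward/inverse lambda pairs are illustrative, not the repo's actual code; note that for flips and transposes, each augmentation is its own inverse:

```python
import torch

def tta_inference(model, x):
    """Ensemble a model over flip/transpose test-time augmentations.
    x: (B, C, H, W) tensor. Each augmented prediction is mapped back
    to the original orientation before averaging."""
    augs = [
        (lambda t: t, lambda t: t),                                              # identity
        (lambda t: torch.flip(t, dims=[-1]), lambda t: torch.flip(t, dims=[-1])),      # horizontal flip
        (lambda t: torch.flip(t, dims=[-2]), lambda t: torch.flip(t, dims=[-2])),      # vertical flip
        (lambda t: torch.flip(t, dims=[-2, -1]), lambda t: torch.flip(t, dims=[-2, -1])),  # h + v flip
        (lambda t: t.transpose(-2, -1), lambda t: t.transpose(-2, -1)),          # transpose
    ]
    # De-augment each prediction, then average the five outputs
    preds = [undo(model(fwd(x))) for fwd, undo in augs]
    return torch.stack(preds, dim=0).mean(dim=0)
```

Since each de-augmented prediction has the original shape, the average is well defined even for non-square images.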

Post-Processing

  • High-frequency detail enhancement
  • Adaptive CLAHE (contrast normalization)

These steps improve perceptual sharpness and stability for strong motion blur.
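The high-frequency enhancement step is typically an unsharp-mask-style operation. The sketch below is an assumption about the general technique, not the repo's implementation: it uses a 3×3 box blur in place of the usual Gaussian, and CLAHE would normally be applied separately (e.g. via OpenCV's `cv2.createCLAHE`) on the luminance channel:

```python
import numpy as np

def unsharp_mask(img, amount=0.5):
    """Boost high frequencies: out = img + amount * (img - blurred).
    img: float32 array in [0, 1], shape (H, W)."""
    pad = np.pad(img, 1, mode="edge")
    # 3x3 box blur as a simple low-pass stand-in for a Gaussian
    h, w = img.shape
    blurred = sum(pad[i:i + h, j:j + w]
                  for i in range(3) for j in range(3)) / 9.0
    return np.clip(img + amount * (img - blurred), 0.0, 1.0)
```

Subtracting the low-pass version isolates edges and fine texture, so adding a scaled copy back sharpens them without touching flat regions.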


Evaluation Summary (Limited)

A focused evaluation was conducted on 10 extreme-blur samples.

Metric                  Observation
Best base PSNR          ~31.71 dB
TTA PSNR gain           +0.14 to +0.33 dB
Avg SSIM improvement    +0.0039
Consistency             Improvement on all tested samples

These results illustrate consistent but modest gains from TTA.

This is not a full benchmark evaluation.


Qualitative Results

The following examples compare:

  • Blurred input
  • Base Stripformer output
  • Stripformer + full TTA
  • Ground-truth sharp image

Example 1

Example 2


Repository Structure

File              Description
best_model.pth    Trained model weights (state_dict)
inference.py      Full inference pipeline with TTA
requirements.txt  Python dependencies
config.json       Model metadata
.gitattributes    Git LFS configuration

Installation

pip install -r requirements.txt
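Since the checkpoint is a raw state_dict rather than a pickled model, loading it looks like the sketch below. The helper name `load_weights` is illustrative; the actual model class is defined in the repo's code:

```python
import torch

def load_weights(model, ckpt_path):
    """Load a raw state_dict checkpoint (the release format) into a
    model instance. map_location='cpu' keeps loading device-agnostic."""
    state = torch.load(ckpt_path, map_location="cpu")
    model.load_state_dict(state)
    return model.eval()  # inference mode
```

For this repository, `model` would be a Stripformer instance and `ckpt_path` would be `best_model.pth`.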