Stripformer TTA Ensembling (Image Deblurring)
This repository provides a PyTorch implementation of an image deblurring model based on the official Stripformer architecture (ECCV 2022), trained on the GoPro Motion Deblurring Dataset.
The released model includes a full Test-Time Augmentation (TTA) ensembling pipeline and post-processing enhancements, improving robustness on real-world motion blur.
📘 Conceptual Walkthrough (Medium Article)
This model is accompanied by a conceptual article on modern image deblurring, written to explain, step by step, how and why neural networks learn to undo blur.
The article walks through the full deblurring pipeline—from feature extraction and spatial reasoning to coarse-to-fine refinement and artifact suppression—and uses Stripformer + TTA as a concrete reference throughout to bridge theory with practice.
👉 Read the full explanation here:
https://medium.com/@varunpatels2004/how-neural-network-learns-to-undo-the-blur-4751bbf86f29
Architecture and References
- Model Architecture: Stripformer (ECCV 2022, official design)
- Paper: https://arxiv.org/abs/2204.04627
- Official GitHub Repository:
https://github.com/pp00704831/Stripformer-ECCV-2022-
The architecture follows an encoder–transformer–decoder design with
strip-based self-attention and residual learning (output + input).
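As an illustration of the residual formulation, the sketch below wraps a placeholder backbone so that the network's prediction is added back onto the blurry input. This is a minimal sketch, not the official code; `backbone` stands in for the actual Stripformer encoder–transformer–decoder stack.

```python
import torch
import torch.nn as nn

class ResidualDeblur(nn.Module):
    """Illustrative residual-learning wrapper (output = prediction + input)."""

    def __init__(self, backbone: nn.Module):
        super().__init__()
        self.backbone = backbone  # placeholder for the encoder–transformer–decoder stack

    def forward(self, blurry: torch.Tensor) -> torch.Tensor:
        # The network learns the sharp-minus-blurry residual rather than
        # predicting the sharp image directly.
        return self.backbone(blurry) + blurry
```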
Model Overview
- Task: Image Deblurring (Image-to-Image)
- Framework: PyTorch
- Training Data: GoPro Motion Deblurring Dataset
- Inference: Single-image inference with full TTA
- Checkpoint format: PyTorch state_dict (a loading sketch follows this list)
- License: Apache-2.0
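A minimal loading sketch is shown below. The import path and class name are assumptions based on the official repository; adjust them to match your local copy of the Stripformer code.

```python
import torch
# Assumption: the Stripformer class definition comes from the official repository
# code placed alongside this checkpoint; adjust the import path to your layout.
from models.Stripformer import Stripformer

model = Stripformer()
state_dict = torch.load("best_model.pth", map_location="cpu")
model.load_state_dict(state_dict)
model.eval()
```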
Training Details
- Dataset: GoPro Motion Deblurring Dataset
- Supervision: Paired blurry–sharp images
- Blur Type: Realistic motion blur
- Image Domain: Natural RGB images
The model was trained using the official Stripformer architecture without architectural modification.
Inference Pipeline
The provided inference.py reproduces the exact inference pipeline used
during evaluation.
Test-Time Augmentation (TTA)
The following augmentations are applied:
- Identity
- Horizontal flip
- Vertical flip
- Horizontal + vertical flip
- Transpose
Each prediction is de-augmented and averaged to produce the final output.
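A minimal sketch of this ensembling step, assuming `model` is the loaded network and `image` is a `(1, 3, H, W)` tensor; the exact ordering and implementation details in `inference.py` may differ.

```python
import torch

def tta_deblur(model: torch.nn.Module, image: torch.Tensor) -> torch.Tensor:
    """image: (1, 3, H, W) tensor in the model's expected value range."""
    # (augment, de-augment) pairs; every transform used here is its own inverse.
    transforms = [
        (lambda x: x,                            lambda x: x),                            # identity
        (lambda x: torch.flip(x, dims=[-1]),     lambda x: torch.flip(x, dims=[-1])),     # horizontal flip
        (lambda x: torch.flip(x, dims=[-2]),     lambda x: torch.flip(x, dims=[-2])),     # vertical flip
        (lambda x: torch.flip(x, dims=[-2, -1]), lambda x: torch.flip(x, dims=[-2, -1])), # both flips
        (lambda x: x.transpose(-2, -1),          lambda x: x.transpose(-2, -1)),          # transpose (swaps H and W)
    ]
    outputs = []
    with torch.no_grad():
        for aug, deaug in transforms:
            # Predict on the augmented input, then undo the augmentation.
            outputs.append(deaug(model(aug(image))))
    # Average the de-augmented predictions to form the final ensemble output.
    return torch.stack(outputs).mean(dim=0)
```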
Post-Processing
- High-frequency detail enhancement
- Adaptive CLAHE (contrast normalization)
These steps improve perceptual sharpness and stability for strong motion blur.
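A hedged sketch of these two steps using OpenCV (unsharp masking for high-frequency detail, then CLAHE on the luminance channel). The kernel size, clip limit, and tile grid below are illustrative values, not necessarily the ones used in `inference.py`.

```python
import cv2
import numpy as np

def post_process(img_bgr: np.ndarray) -> np.ndarray:
    """img_bgr: uint8 BGR image produced by the network."""
    # High-frequency detail enhancement via unsharp masking.
    blurred = cv2.GaussianBlur(img_bgr, (0, 0), sigmaX=2.0)
    sharpened = cv2.addWeighted(img_bgr, 1.5, blurred, -0.5, 0)

    # Adaptive CLAHE on the L channel only, normalizing local contrast
    # without shifting colours.
    lab = cv2.cvtColor(sharpened, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    l = clahe.apply(l)
    return cv2.cvtColor(cv2.merge([l, a, b]), cv2.COLOR_LAB2BGR)
```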
Evaluation Summary (Limited)
A focused evaluation was conducted on 10 extreme-blur samples.
| Metric | Observation |
|---|---|
| Best base PSNR | ~31.71 dB |
| TTA PSNR gain | +0.14 to +0.33 dB |
| Avg SSIM improvement | +0.0039 |
| Consistency | Improvement on all tested samples |
These results illustrate consistent but modest gains from TTA.
This is not a full benchmark evaluation.
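For reference, per-image PSNR and SSIM of this kind can be computed with scikit-image as below; this is a sketch of the metric computation, not the exact evaluation script.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def compare(restored: np.ndarray, sharp: np.ndarray):
    """restored, sharp: uint8 RGB arrays of identical shape."""
    psnr = peak_signal_noise_ratio(sharp, restored, data_range=255)
    ssim = structural_similarity(sharp, restored, channel_axis=-1, data_range=255)
    return psnr, ssim
```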
Qualitative Results
The following examples compare:
- Blurred input
- Base Stripformer output
- Stripformer + full TTA
- Ground-truth sharp image
Example 1
Example 2
Repository Structure
| File | Description |
|---|---|
| `best_model.pth` | Trained model weights (state_dict) |
| `inference.py` | Full inference pipeline with TTA |
| `requirements.txt` | Python dependencies |
| `config.json` | Model metadata |
| `.gitattributes` | Git LFS configuration |
Installation
```bash
pip install -r requirements.txt
```