---
license: apache-2.0
tags:
  - image-deblurring
  - image-restoration
  - stripformer
  - pytorch
  - computer-vision
  - image-to-image
---

# Stripformer TTA Ensembling (Image Deblurring)

This repository provides a PyTorch implementation of an image deblurring model based on the official Stripformer architecture (ECCV 2022), trained on the GoPro Motion Deblurring Dataset.

The released model includes a full Test-Time Augmentation (TTA) ensembling pipeline and post-processing enhancements, improving robustness on real-world motion blur.


## 📘 Conceptual Walkthrough (Medium Article)

This model is accompanied by a detailed, step-by-step conceptual explanation of modern image deblurring, covering how and why neural networks learn to undo blur.

The article walks through the full deblurring pipeline—from feature extraction and spatial reasoning to coarse-to-fine refinement and artifact suppression—and uses Stripformer + TTA as a concrete reference throughout to bridge theory with practice.

👉 Read the full explanation here:
https://medium.com/@varunpatels2004/how-neural-network-learns-to-undo-the-blur-4751bbf86f29


## Architecture and References

The architecture follows an encoder–transformer–decoder design with strip-based self-attention and residual learning: the network predicts a correction that is added back onto the blurry input (output + input).
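The residual wiring can be sketched as follows. This is an illustration only, not the official implementation: the placeholder convolutions stand in for the real encoder, strip-attention blocks, and decoder.

```python
import torch
import torch.nn as nn

class ResidualDeblurSketch(nn.Module):
    """Toy sketch of encoder -> middle -> decoder with a residual skip.

    The real Stripformer uses strip-based self-attention in the middle
    stage; plain convolutions are used here only to show the wiring.
    """

    def __init__(self, channels: int = 32):
        super().__init__()
        self.encoder = nn.Conv2d(3, channels, 3, padding=1)        # placeholder encoder
        self.middle = nn.Conv2d(channels, channels, 3, padding=1)  # stands in for strip attention
        self.decoder = nn.Conv2d(channels, 3, 3, padding=1)        # placeholder decoder

    def forward(self, blurry: torch.Tensor) -> torch.Tensor:
        features = self.middle(self.encoder(blurry))
        # Residual learning: predict a correction and add it to the input.
        return self.decoder(features) + blurry
```

Because the network only has to learn the correction, the identity mapping (no change to the input) is trivially available, which tends to stabilize training for restoration tasks.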


## Model Overview

- Task: Image Deblurring (Image-to-Image)
- Framework: PyTorch
- Training Data: GoPro Motion Deblurring Dataset
- Inference: Single-image inference with full TTA
- Checkpoint format: PyTorch `state_dict`
- License: Apache-2.0

## Training Details

- Dataset: GoPro Motion Deblurring Dataset
- Supervision: Paired blurry–sharp images
- Blur Type: Realistic motion blur
- Image Domain: Natural RGB images

The model was trained using the official Stripformer architecture without architectural modification.


## Inference Pipeline

The provided `inference.py` reproduces the exact inference pipeline used during evaluation.
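Since the checkpoint is a plain `state_dict`, loading it follows the standard PyTorch pattern. The sketch below uses a toy `Conv2d` in place of the Stripformer class (which lives in the official repository) and an in-memory buffer in place of `best_model.pth`, purely to illustrate the workflow.

```python
import io
import torch
import torch.nn as nn

# Toy module standing in for the Stripformer model class.
net = nn.Conv2d(3, 3, 3, padding=1)

# Save the state_dict; an io.BytesIO buffer stands in for a .pth file.
buffer = io.BytesIO()
torch.save(net.state_dict(), buffer)
buffer.seek(0)

# Restore the weights into a freshly constructed module.
restored = nn.Conv2d(3, 3, 3, padding=1)
restored.load_state_dict(torch.load(buffer, map_location="cpu"))
restored.eval()  # switch to inference mode, as an inference script would
```

With the real checkpoint, the same pattern applies: construct the Stripformer model, call `load_state_dict` on the loaded file, then `eval()` before running TTA.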

### Test-Time Augmentation (TTA)

The following augmentations are applied:

- Identity
- Horizontal flip
- Vertical flip
- Horizontal + vertical flip
- Transpose

Each prediction is de-augmented and averaged to produce the final output.
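The augment, predict, de-augment, and average steps can be sketched as below. The `deblur` callable is a placeholder for the model's forward pass; the actual pipeline in `inference.py` operates on tensors and may order the operations differently.

```python
import numpy as np

def tta_deblur(image: np.ndarray, deblur) -> np.ndarray:
    """Average de-augmented predictions over the five TTA variants.

    image: H x W x C array; deblur: callable mapping an image to an image.
    """
    # Each pair is (apply_augmentation, undo_augmentation). For flips the
    # inverse is the same flip; for transpose it is transposing back.
    augmentations = [
        (lambda x: x,                    lambda x: x),                     # identity
        (lambda x: x[:, ::-1],           lambda x: x[:, ::-1]),            # horizontal flip
        (lambda x: x[::-1, :],           lambda x: x[::-1, :]),            # vertical flip
        (lambda x: x[::-1, ::-1],        lambda x: x[::-1, ::-1]),         # both flips
        (lambda x: x.transpose(1, 0, 2), lambda x: x.transpose(1, 0, 2)),  # transpose
    ]
    outputs = [undo(deblur(apply(image))) for apply, undo in augmentations]
    return np.mean(outputs, axis=0)  # average the de-augmented predictions
```

Averaging in the original orientation is what makes the ensemble valid: each prediction is mapped back into the input's coordinate frame before the mean is taken.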

### Post-Processing

- High-frequency detail enhancement
- Adaptive CLAHE (contrast normalization)

These steps improve perceptual sharpness and stability for strong motion blur.
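As an illustration of the detail-enhancement step, the sketch below applies a simple unsharp mask: blur the image, take the high-frequency residual, and add a scaled copy back. The kernel and strength here are assumptions, not the released pipeline's exact settings, and CLAHE (typically `cv2.createCLAHE` on the luminance channel) is omitted to keep the sketch dependency-free.

```python
import numpy as np

def unsharp_mask(image: np.ndarray, amount: float = 0.5) -> np.ndarray:
    """Boost high frequencies in a single-channel float image in [0, 1]."""
    # 3x3 box blur as a cheap low-pass filter.
    h, w = image.shape
    padded = np.pad(image, 1, mode="edge")
    blurred = sum(
        padded[i : i + h, j : j + w] for i in range(3) for j in range(3)
    ) / 9.0
    # Add back a scaled high-frequency residual, then clip to valid range.
    return np.clip(image + amount * (image - blurred), 0.0, 1.0)
```

The clipping step matters in practice: without it, sharpening around strong edges can push values outside the displayable range and produce halo artifacts.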


## Evaluation Summary (Limited)

A focused evaluation was conducted on 10 extreme-blur samples.

| Metric | Observation |
| --- | --- |
| Best base PSNR | ~31.71 dB |
| TTA PSNR gain | +0.14 to +0.33 dB |
| Avg. SSIM improvement | +0.0039 |
| Consistency | Improvement on all tested samples |

These results illustrate consistent but modest gains from TTA.

This is not a full benchmark evaluation.


## Qualitative Results

The following examples compare:

- Blurred input
- Base Stripformer output
- Stripformer + full TTA
- Ground-truth sharp image

### Example 1

### Example 2


## Repository Structure

| File | Description |
| --- | --- |
| `best_model.pth` | Trained model weights (`state_dict`) |
| `inference.py` | Full inference pipeline with TTA |
| `requirements.txt` | Python dependencies |
| `config.json` | Model metadata |
| `.gitattributes` | Git LFS configuration |

## Installation

```bash
pip install -r requirements.txt
```