---
title: Siren Super Resolution
emoji: 🔥
colorFrom: purple
colorTo: pink
sdk: gradio
sdk_version: 6.0.2
app_file: app.py
pinned: false
---

🔥 SIREN Super-Resolution Demo

A Gradio demo showcasing SIREN (Sinusoidal Representation Networks) for image super-resolution.

What is SIREN?

SIREN networks use periodic activation functions (sine) instead of traditional ReLU activations, making them exceptionally well-suited for representing continuous signals and capturing fine details in images.

Key advantages:

  • Smooth, continuous representations
  • Excellent for capturing high-frequency details
  • Can represent images at arbitrary resolutions
  • Implicit neural representation - no upsampling layers needed!
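
For intuition, here is a minimal NumPy sketch of a single SIREN layer: a plain linear map whose output passes through sin(w0 · x) instead of ReLU, with the uniform initialization scheme from the SIREN paper. The function names are illustrative, not from this repo (the real model is in siren.py):

```python
import numpy as np

def init_siren_layer(in_features, out_features, w0=30.0, is_first=False, seed=0):
    """SIREN's initialization: U(-1/n, 1/n) for the first layer,
    U(-sqrt(6/n)/w0, sqrt(6/n)/w0) for later layers (n = fan-in)."""
    rng = np.random.default_rng(seed)
    bound = 1.0 / in_features if is_first else np.sqrt(6.0 / in_features) / w0
    W = rng.uniform(-bound, bound, size=(in_features, out_features))
    b = rng.uniform(-bound, bound, size=out_features)
    return W, b

def siren_layer(x, W, b, w0=30.0):
    """One SIREN layer: sin(w0 * (x @ W + b)) in place of relu(x @ W + b)."""
    return np.sin(w0 * (x @ W + b))

# Map 2-D pixel coordinates (in [-1, 1]) to 64 features.
coords = np.array([[0.0, 0.0], [-1.0, 1.0]])
W, b = init_siren_layer(2, 64, is_first=True)
features = siren_layer(coords, W, b)
```

Because the activation is a sine, every output stays in [-1, 1] and the representation is smooth in the input coordinates, which is what lets the network be queried between training pixels.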

How This Demo Works

  1. Upload a high-resolution image (this serves as the ground truth)
  2. Downsample the image artificially by a selected scale factor (2x, 4x, or 8x)
  3. Train SIREN to learn the downsampled image representation
  4. Generate a super-resolved version at the original resolution
  5. Compare the results: downsampled input, SIREN output, and ground truth
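
The data preparation behind steps 2–4 can be sketched as follows. The helper names here are hypothetical, and the demo's actual resizing in utils.py may differ (e.g. interpolation rather than striding):

```python
import numpy as np

def downsample(img, scale):
    """Artificial downsampling by striding (a stand-in for the demo's resize)."""
    return img[::scale, ::scale]

def coordinate_grid(h, w):
    """SIREN's training input: one (y, x) coordinate in [-1, 1]^2 per pixel."""
    ys = np.linspace(-1.0, 1.0, h)
    xs = np.linspace(-1.0, 1.0, w)
    gy, gx = np.meshgrid(ys, xs, indexing="ij")
    return np.stack([gy, gx], axis=-1).reshape(-1, 2)

hr = np.random.default_rng(0).random((256, 256, 3))  # stands in for the upload
lr = downsample(hr, 4)                               # 4x: 64x64 training image
train_coords = coordinate_grid(*lr.shape[:2])        # inputs:  (64*64, 2)
train_pixels = lr.reshape(-1, 3)                     # targets: (64*64, 3)
# After fitting (train_coords -> train_pixels), super-resolution is just
# querying the same network on a denser grid at the original resolution:
query_coords = coordinate_grid(*hr.shape[:2])        # (256*256, 2)
```

This is why no upsampling layers are needed: the output resolution is controlled entirely by how densely the coordinate grid is sampled at inference time.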

Features

  • 🎚️ Multiple scale factors: 2x, 4x, 8x super-resolution
  • 📊 Quality metrics: PSNR, SSIM, and MAE for objective quality assessment
  • 💾 Model caching: Save and reuse trained models to avoid retraining
  • 🎨 Improved UI: Tabbed interface with side-by-side comparison view
  • 🎛️ Configurable model: Adjust hidden layers, features, and training steps
  • 📈 Training visualization: Watch the loss curve during training
  • 📸 Real sample images: High-quality photos from Unsplash (cat, landscape, portrait, flower)

Installation

```shell
# Install dependencies
pip install -r requirements.txt

# Generate sample images (optional - already included)
python create_samples.py

# Run the demo
python app.py
```

Usage

Running locally:

```shell
python app.py
```

Then open your browser to the URL shown (usually http://127.0.0.1:7860).

Quick test:

```shell
python test_siren.py
```

This runs a quick test to verify that the SIREN implementation works correctly.

Files

  • app.py - Main Gradio application
  • siren.py - SIREN model implementation
  • utils.py - Image processing utilities
  • create_samples.py - Script to generate sample images
  • test_siren.py - Quick test script
  • samples/ - Sample images for testing

Parameters

Model Architecture

  • Hidden Features: Width of the network (128-512)
    • More features = more capacity but slower training
  • Hidden Layers: Depth of the network (2-6)
    • More layers = more capacity but slower training
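
As a sketch of how these two knobs shape the model, the following assembles a forward-only SIREN of configurable width and depth in NumPy. This is illustrative only; the actual model lives in siren.py:

```python
import numpy as np

def build_siren(in_f=2, hidden_features=256, hidden_layers=3, out_f=3,
                w0=30.0, seed=0):
    """Stack of sine layers: in_f -> hidden_features (x hidden_layers) -> out_f."""
    rng = np.random.default_rng(seed)
    sizes = [in_f] + [hidden_features] * hidden_layers + [out_f]
    params = []
    for i, (n_in, n_out) in enumerate(zip(sizes[:-1], sizes[1:])):
        bound = 1.0 / n_in if i == 0 else np.sqrt(6.0 / n_in) / w0
        params.append((rng.uniform(-bound, bound, (n_in, n_out)),
                       rng.uniform(-bound, bound, n_out)))
    return params

def siren_forward(coords, params, w0=30.0):
    x = coords
    for W, b in params[:-1]:
        x = np.sin(w0 * (x @ W + b))   # sine activation on hidden layers
    W, b = params[-1]
    return x @ W + b                   # linear output layer -> RGB

net = build_siren(hidden_features=128, hidden_layers=2)
rgb = siren_forward(np.zeros((10, 2)), net)
```

Parameter count grows roughly with hidden_layers × hidden_features², which is why wider and deeper settings train more slowly.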

Training

  • Training Steps: Number of optimization steps (500-5000)
    • More steps = better quality but takes longer
    • 2000 steps is a good balance

Super-Resolution

  • Scale Factor: Downsampling/upsampling factor (2x, 4x, 8x)
    • 2x: Easier task, faster training
    • 4x: Moderate difficulty
    • 8x: Challenging, may need more steps

Example Results

The demo shows three outputs:

  1. Downsampled (Input): The artificially downsampled low-resolution image
  2. Super-Resolved (SIREN): The SIREN-generated high-resolution output
  3. Ground Truth (Original): The original high-resolution image for comparison

Quality Metrics Explained

The demo reports three standard image quality metrics:

  • PSNR (Peak Signal-to-Noise Ratio): Measures reconstruction quality in dB. Higher is better.

    • >30 dB: Good quality
    • >40 dB: Excellent quality
  • SSIM (Structural Similarity Index): Perceptual quality metric ranging from 0 to 1. Closer to 1.0 is better.

    • >0.9: Very good quality
    • >0.95: Excellent quality
  • MAE (Mean Absolute Error): Average pixel-wise difference. Lower is better.

    • <0.01: Excellent
    • <0.05: Good
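
PSNR and MAE can be computed in a few lines; here is a sketch on images with pixel values in [0, 1] (SSIM needs windowed local statistics, so a library routine such as skimage.metrics.structural_similarity is typically used for it):

```python
import numpy as np

def psnr(a, b, data_range=1.0):
    """Peak signal-to-noise ratio in dB; higher is better."""
    mse = np.mean((a - b) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(data_range ** 2 / mse)

def mae(a, b):
    """Mean absolute per-pixel error; lower is better."""
    return float(np.mean(np.abs(a - b)))

gt = np.zeros((8, 8))
pred = gt + 0.1                # every pixel off by 0.1
score = psnr(gt, pred)         # 10 * log10(1 / 0.1**2), i.e. about 20 dB
err = mae(gt, pred)            # about 0.1
```

Note that PSNR is undefined (infinite) for a perfect reconstruction, and that a fixed data_range must be agreed on for the numbers to be comparable.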

Model Caching

Trained models are automatically saved and can be reused:

  • Models are cached in model_cache/ directory
  • Cache key includes: image size, scale factor, training steps, and architecture
  • Enable/disable caching with the checkbox in the UI
  • Drastically speeds up repeated experiments with the same settings
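
One way such a cache key can be built is by hashing the settings that determine the trained model. This sketch is illustrative; the demo's actual key format may differ:

```python
import hashlib
import json

def cache_key(image_size, scale, steps, hidden_features, hidden_layers):
    """Hash the settings that determine a trained model, so a run with
    identical settings can reuse a checkpoint instead of retraining."""
    settings = {
        "image_size": list(image_size),
        "scale": scale,
        "steps": steps,
        "hidden_features": hidden_features,
        "hidden_layers": hidden_layers,
    }
    blob = json.dumps(settings, sort_keys=True)   # canonical, order-independent
    return hashlib.sha256(blob.encode()).hexdigest()[:16]

key = cache_key((256, 256), 4, 2000, 256, 3)
path = f"model_cache/siren_{key}.pt"              # hypothetical file layout
```

Serializing with sort_keys=True keeps the key stable across runs, while any change to a setting produces a different key and therefore a fresh training run.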

Tips for Best Results

  1. Start with lower scale factors (2x) for faster experimentation
  2. Scale-specific training steps:
    • 2x: 1500-2000 steps
    • 4x: 3000 steps
    • 8x: 4000-5000 steps
  3. For 8x super-resolution:
    • Use 4000-5000 training steps
    • Increase hidden layers to 4-5
    • Use 512 hidden features
    • Check quality metrics to verify results
  4. Use images with rich details to see SIREN's strength in capturing high-frequency content
  5. Enable model cache to avoid retraining with identical settings

License

This demo is for educational purposes. Please cite the original SIREN paper if you use this in your work:

```bibtex
@inproceedings{sitzmann2020implicit,
    title={Implicit Neural Representations with Periodic Activation Functions},
    author={Sitzmann, Vincent and Martel, Julien NP and Bergman, Alexander W and Lindell, David B and Wetzstein, Gordon},
    booktitle={Proc. NeurIPS},
    year={2020}
}
```