---
title: Siren Super Resolution
emoji: πŸ”₯
colorFrom: purple
colorTo: pink
sdk: gradio
sdk_version: 6.0.2
app_file: app.py
pinned: false
---
# πŸ”₯ SIREN Super-Resolution Demo
A Gradio demo showcasing **SIREN** (Sinusoidal Representation Networks) for image super-resolution.
## What is SIREN?
SIREN networks use periodic activation functions (sine) instead of traditional ReLU activations, making them exceptionally well-suited for representing continuous signals and capturing fine details in images.
**Key advantages:**
- Smooth, continuous representations
- Excellent for capturing high-frequency details
- Can represent images at arbitrary resolutions
- Implicit neural representation - no upsampling layers needed!
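In code, a SIREN layer is just a linear map followed by `sin`, with the paper's frequency factor ω₀ = 30 and the matching weight initialization. Below is a minimal NumPy sketch of the idea, not the actual `siren.py` implementation:

```python
import numpy as np

def siren_layer(in_features, out_features, omega_0=30.0, is_first=False, rng=None):
    """Build one SIREN layer (paper's init scheme) and return its forward function."""
    rng = rng or np.random.default_rng(0)
    # First layer: U(-1/n, 1/n); deeper layers: U(-sqrt(6/n)/omega_0, sqrt(6/n)/omega_0)
    bound = 1.0 / in_features if is_first else np.sqrt(6.0 / in_features) / omega_0
    W = rng.uniform(-bound, bound, size=(out_features, in_features))
    b = rng.uniform(-bound, bound, size=out_features)
    return lambda x: np.sin(omega_0 * (x @ W.T + b))

# Tiny 2-layer SIREN mapping (y, x) coordinates to 16 features
layer1 = siren_layer(2, 16, is_first=True)
layer2 = siren_layer(16, 16)
coords = np.array([[0.0, 0.0], [0.5, -0.5]])  # normalized pixel coordinates
h = layer2(layer1(coords))
print(h.shape)  # (2, 16)
```

Because every activation is a sine, outputs stay bounded in [-1, 1] and the network remains smooth and differentiable everywhere.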
## How This Demo Works
1. **Upload** a high-resolution image (this serves as the ground truth)
2. **Downsample** the image artificially by a selected scale factor (2x, 4x, or 8x)
3. **Train** SIREN to learn the downsampled image representation
4. **Generate** a super-resolved version at the original resolution
5. **Compare** the results: downsampled input, SIREN output, and ground truth
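Steps 2-4 hinge on SIREN being a function of continuous coordinates: training fits the network to the pixels of the low-resolution grid, and super-resolution simply evaluates the same network on a denser grid. A minimal NumPy sketch of the grid construction (the helper name `coord_grid` is illustrative, not from this repo):

```python
import numpy as np

def coord_grid(h, w):
    """Pixel coordinates normalized to [-1, 1], returned as (h*w, 2) rows of (y, x)."""
    ys = np.linspace(-1.0, 1.0, h)
    xs = np.linspace(-1.0, 1.0, w)
    yy, xx = np.meshgrid(ys, xs, indexing="ij")
    return np.stack([yy.ravel(), xx.ravel()], axis=-1)

# Train on the coordinates of a 64x64 downsampled image...
train_coords = coord_grid(64, 64)    # (4096, 2)
# ...then query the trained network on a 4x denser grid for super-resolution.
sr_coords = coord_grid(256, 256)     # (65536, 2)
print(train_coords.shape, sr_coords.shape)
```

No upsampling layers are involved: the output resolution is chosen entirely by how densely the coordinate grid is sampled.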
## Features
- 🎚️ **Multiple scale factors**: 2x, 4x, 8x super-resolution
- πŸ“Š **Quality metrics**: PSNR, SSIM, and MAE for objective quality assessment
- πŸ’Ύ **Model caching**: Save and reuse trained models to avoid retraining
- 🎨 **Improved UI**: Tabbed interface with side-by-side comparison view
- πŸŽ›οΈ **Configurable model**: Adjust hidden layers, features, and training steps
- πŸ“ˆ **Training visualization**: Watch the loss curve during training
- πŸ“Έ **Real sample images**: High-quality photos from Unsplash (cat, landscape, portrait, flower)
## Installation
```bash
# Install dependencies
pip install -r requirements.txt
# Generate sample images (optional - already included)
python create_samples.py
# Run the demo
python app.py
```
## Usage
### Running locally:
```bash
python app.py
```
Then open your browser to the URL shown (usually `http://127.0.0.1:7860`).
### Quick test:
```bash
python test_siren.py
```
This runs a quick test to verify the SIREN implementation works correctly.
## Files
- `app.py` - Main Gradio application
- `siren.py` - SIREN model implementation
- `utils.py` - Image processing utilities
- `create_samples.py` - Script to generate sample images
- `test_siren.py` - Quick test script
- `samples/` - Sample images for testing
## Parameters
### Model Architecture
- **Hidden Features**: Width of the network (128-512)
- More features = more capacity but slower training
- **Hidden Layers**: Depth of the network (2-6)
- More layers = more capacity but slower training
### Training
- **Training Steps**: Number of optimization steps (500-5000)
- More steps = better quality but takes longer
- 2000 steps is a good balance
### Super-Resolution
- **Scale Factor**: Downsampling/upsampling factor (2x, 4x, 8x)
- 2x: Easier task, faster training
- 4x: Moderate difficulty
- 8x: Challenging, may need more steps
## Example Results
The demo shows three outputs:
1. **Downsampled (Input)**: The artificially downsampled low-resolution image
2. **Super-Resolved (SIREN)**: The SIREN-generated high-resolution output
3. **Ground Truth (Original)**: The original high-resolution image for comparison
## References
- **Paper**: [Implicit Neural Representations with Periodic Activation Functions (SIREN)](https://arxiv.org/abs/2006.09661)
- **Project Page**: [https://vsitzmann.github.io/siren/](https://vsitzmann.github.io/siren/)
- **Notebook Tutorial**: [SIREN Tutorial by Nipun Batra](https://github.com/nipunbatra/pml-teaching/blob/master/notebooks/siren.ipynb)
## Quality Metrics Explained
The demo reports three standard image quality metrics:
- **PSNR (Peak Signal-to-Noise Ratio)**: Measures reconstruction quality in dB. Higher is better.
- \>30 dB: Good quality
- \>40 dB: Excellent quality
- **SSIM (Structural Similarity Index)**: Perceptual quality metric ranging from 0 to 1. Closer to 1.0 is better.
- \>0.9: Very good quality
- \>0.95: Excellent quality
- **MAE (Mean Absolute Error)**: Average pixel-wise difference. Lower is better.
- <0.01: Excellent
- <0.05: Good
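PSNR and MAE are simple enough to compute directly with NumPy; SSIM is more involved, and implementations such as scikit-image's `structural_similarity` are typically used instead of hand-rolling it. A small sketch of the first two:

```python
import numpy as np

def psnr(gt, pred, data_range=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, data_range]."""
    mse = np.mean((gt - pred) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)

def mae(gt, pred):
    """Mean absolute per-pixel error."""
    return np.mean(np.abs(gt - pred))

# Simulate a reconstruction with small Gaussian noise on a random image
rng = np.random.default_rng(0)
gt = rng.random((32, 32, 3))
pred = np.clip(gt + rng.normal(0.0, 0.01, gt.shape), 0.0, 1.0)
print(round(psnr(gt, pred), 1), round(mae(gt, pred), 4))
```

With noise of standard deviation 0.01 on a [0, 1] image, PSNR lands near 40 dB and MAE near 0.008, consistent with the "excellent" thresholds above.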
## Model Caching
Trained models are automatically saved and can be reused:
- Models are cached in `model_cache/` directory
- Cache key includes: image size, scale factor, training steps, and architecture
- Enable/disable caching with the checkbox in the UI
- Drastically speeds up repeated experiments with the same settings
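A typical way to build such a cache key is to hash the settings that determine the trained model. The sketch below is illustrative only; the exact scheme used in `app.py` may differ:

```python
import hashlib
import json

def cache_key(img_size, scale, steps, hidden_features, hidden_layers):
    """Derive a short, deterministic key from the settings (names are illustrative)."""
    cfg = dict(img_size=img_size, scale=scale, steps=steps,
               hidden_features=hidden_features, hidden_layers=hidden_layers)
    blob = json.dumps(cfg, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()[:16]

key = cache_key(img_size=(256, 256), scale=4, steps=2000,
                hidden_features=256, hidden_layers=3)
print(key)  # a model would be stored under model_cache/<key>.pt
```

Because the key is a pure function of the settings, rerunning with identical parameters finds the cached weights immediately, while changing any single parameter yields a fresh key and a fresh training run.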
## Tips for Best Results
1. **Start with lower scale factors** (2x) for faster experimentation
2. **Scale-specific training steps**:
- 2x: 1500-2000 steps
- 4x: 3000 steps
- 8x: 4000-5000 steps
3. **For 8x super-resolution**:
- Use 4000-5000 training steps
- Increase hidden layers to 4-5
- Use 512 hidden features
- Check quality metrics to verify results
4. **Use images with rich details** to see SIREN's strength in capturing high-frequency content
5. **Enable model cache** to avoid retraining with identical settings
## License
This demo is for educational purposes. Please cite the original SIREN paper if you use this in your work:
```bibtex
@inproceedings{sitzmann2020implicit,
title={Implicit Neural Representations with Periodic Activation Functions},
author={Sitzmann, Vincent and Martel, Julien NP and Bergman, Alexander W and Lindell, David B and Wetzstein, Gordon},
booktitle={Proc. NeurIPS},
year={2020}
}
```