---
title: Siren Super Resolution
emoji: πŸ”₯
colorFrom: purple
colorTo: pink
sdk: gradio
sdk_version: 6.0.2
app_file: app.py
pinned: false
---
# πŸ”₯ SIREN Super-Resolution Demo
A Gradio demo showcasing **SIREN** (Sinusoidal Representation Networks) for image super-resolution.
## What is SIREN?
SIREN networks use periodic activation functions (sine) instead of traditional ReLU activations, making them exceptionally well-suited for representing continuous signals and capturing fine details in images.
**Key advantages:**
- Smooth, continuous representations
- Excellent for capturing high-frequency details
- Can represent images at arbitrary resolutions
- Implicit neural representation - no upsampling layers needed!
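In code, a SIREN layer is just a linear map followed by `sin`, with the paper's frequency factor ω₀ = 30 and the matching weight initialization. Below is a minimal NumPy sketch of the idea, not the actual `siren.py` implementation:

```python
import numpy as np

def siren_layer(in_features, out_features, omega_0=30.0, is_first=False, rng=None):
    """Build one SIREN layer (paper's init scheme) and return its forward function."""
    rng = rng or np.random.default_rng(0)
    # First layer: U(-1/n, 1/n); deeper layers: U(-sqrt(6/n)/omega_0, sqrt(6/n)/omega_0)
    bound = 1.0 / in_features if is_first else np.sqrt(6.0 / in_features) / omega_0
    W = rng.uniform(-bound, bound, size=(out_features, in_features))
    b = rng.uniform(-bound, bound, size=out_features)
    return lambda x: np.sin(omega_0 * (x @ W.T + b))

# Tiny 2-layer SIREN mapping (y, x) coordinates to 16 features
layer1 = siren_layer(2, 16, is_first=True)
layer2 = siren_layer(16, 16)
coords = np.array([[0.0, 0.0], [0.5, -0.5]])  # normalized pixel coordinates
h = layer2(layer1(coords))
print(h.shape)  # (2, 16)
```

Because every activation is a sine, outputs stay bounded in [-1, 1] and the network remains smooth and differentiable everywhere.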
## How This Demo Works
1. **Upload** a high-resolution image (this serves as the ground truth)
2. **Downsample** the image artificially by a selected scale factor (2x, 4x, or 8x)
3. **Train** SIREN to learn the downsampled image representation
4. **Generate** a super-resolved version at the original resolution
5. **Compare** the results: downsampled input, SIREN output, and ground truth
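Steps 2-4 hinge on SIREN being a function of continuous coordinates: training fits the network to the pixels of the low-resolution grid, and super-resolution simply evaluates the same network on a denser grid. A minimal NumPy sketch of the grid construction (the helper name `coord_grid` is illustrative, not from this repo):

```python
import numpy as np

def coord_grid(h, w):
    """Pixel coordinates normalized to [-1, 1], returned as (h*w, 2) rows of (y, x)."""
    ys = np.linspace(-1.0, 1.0, h)
    xs = np.linspace(-1.0, 1.0, w)
    yy, xx = np.meshgrid(ys, xs, indexing="ij")
    return np.stack([yy.ravel(), xx.ravel()], axis=-1)

# Train on the coordinates of a 64x64 downsampled image...
train_coords = coord_grid(64, 64)    # (4096, 2)
# ...then query the trained network on a 4x denser grid for super-resolution.
sr_coords = coord_grid(256, 256)     # (65536, 2)
print(train_coords.shape, sr_coords.shape)
```

No upsampling layers are involved: the output resolution is chosen entirely by how densely the coordinate grid is sampled.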
## Features
- 🎚️ **Multiple scale factors**: 2x, 4x, 8x super-resolution
- πŸ“Š **Quality metrics**: PSNR, SSIM, and MAE for objective quality assessment
- πŸ’Ύ **Model caching**: Save and reuse trained models to avoid retraining
- 🎨 **Improved UI**: Tabbed interface with side-by-side comparison view
- πŸŽ›οΈ **Configurable model**: Adjust hidden layers, features, and training steps
- πŸ“ˆ **Training visualization**: Watch the loss curve during training
- πŸ“Έ **Real sample images**: High-quality photos from Unsplash (cat, landscape, portrait, flower)
## Installation
```bash
# Install dependencies
pip install -r requirements.txt
# Generate sample images (optional - already included)
python create_samples.py
# Run the demo
python app.py
```
## Usage
### Running locally:
```bash
python app.py
```
Then open your browser to the URL shown (usually `http://127.0.0.1:7860`).
### Quick test:
```bash
python test_siren.py
```
This runs a quick test to verify the SIREN implementation works correctly.
## Files
- `app.py` - Main Gradio application
- `siren.py` - SIREN model implementation
- `utils.py` - Image processing utilities
- `create_samples.py` - Script to generate sample images
- `test_siren.py` - Quick test script
- `samples/` - Sample images for testing
## Parameters
### Model Architecture
- **Hidden Features**: Width of the network (128-512)
- More features = more capacity but slower training
- **Hidden Layers**: Depth of the network (2-6)
- More layers = more capacity but slower training
### Training
- **Training Steps**: Number of optimization steps (500-5000)
- More steps = better quality but takes longer
- 2000 steps is a good balance
### Super-Resolution
- **Scale Factor**: Downsampling/upsampling factor (2x, 4x, 8x)
- 2x: Easier task, faster training
- 4x: Moderate difficulty
- 8x: Challenging, may need more steps
## Example Results
The demo shows three outputs:
1. **Downsampled (Input)**: The artificially downsampled low-resolution image
2. **Super-Resolved (SIREN)**: The SIREN-generated high-resolution output
3. **Ground Truth (Original)**: The original high-resolution image for comparison
## References
- **Paper**: [Implicit Neural Representations with Periodic Activation Functions (SIREN)](https://arxiv.org/abs/2006.09661)
- **Project Page**: [https://vsitzmann.github.io/siren/](https://vsitzmann.github.io/siren/)
- **Notebook Tutorial**: [SIREN Tutorial by Nipun Batra](https://github.com/nipunbatra/pml-teaching/blob/master/notebooks/siren.ipynb)
## Quality Metrics Explained
The demo reports three standard image quality metrics:
- **PSNR (Peak Signal-to-Noise Ratio)**: Measures reconstruction quality in dB. Higher is better.
- \>30 dB: Good quality
- \>40 dB: Excellent quality
- **SSIM (Structural Similarity Index)**: Perceptual quality metric ranging from 0 to 1. Closer to 1.0 is better.
- \>0.9: Very good quality
- \>0.95: Excellent quality
- **MAE (Mean Absolute Error)**: Average pixel-wise difference. Lower is better.
- <0.01: Excellent
- <0.05: Good
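PSNR and MAE are simple enough to compute directly with NumPy; SSIM is more involved, and implementations such as scikit-image's `structural_similarity` are typically used instead of hand-rolling it. A small sketch of the first two:

```python
import numpy as np

def psnr(gt, pred, data_range=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, data_range]."""
    mse = np.mean((gt - pred) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)

def mae(gt, pred):
    """Mean absolute per-pixel error."""
    return np.mean(np.abs(gt - pred))

# Simulate a reconstruction with small Gaussian noise on a random image
rng = np.random.default_rng(0)
gt = rng.random((32, 32, 3))
pred = np.clip(gt + rng.normal(0.0, 0.01, gt.shape), 0.0, 1.0)
print(round(psnr(gt, pred), 1), round(mae(gt, pred), 4))
```

With noise of standard deviation 0.01 on a [0, 1] image, PSNR lands near 40 dB and MAE near 0.008, consistent with the "excellent" thresholds above.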
## Model Caching
Trained models are automatically saved and can be reused:
- Models are cached in `model_cache/` directory
- Cache key includes: image size, scale factor, training steps, and architecture
- Enable/disable caching with the checkbox in the UI
- Drastically speeds up repeated experiments with the same settings
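A typical way to build such a cache key is to hash the settings that determine the trained model. The sketch below is illustrative only; the exact scheme used in `app.py` may differ:

```python
import hashlib
import json

def cache_key(img_size, scale, steps, hidden_features, hidden_layers):
    """Derive a short, deterministic key from the settings (names are illustrative)."""
    cfg = dict(img_size=img_size, scale=scale, steps=steps,
               hidden_features=hidden_features, hidden_layers=hidden_layers)
    blob = json.dumps(cfg, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()[:16]

key = cache_key(img_size=(256, 256), scale=4, steps=2000,
                hidden_features=256, hidden_layers=3)
print(key)  # a model would be stored under model_cache/<key>.pt
```

Because the key is a pure function of the settings, rerunning with identical parameters finds the cached weights immediately, while changing any single parameter yields a fresh key and a fresh training run.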
## Tips for Best Results
1. **Start with lower scale factors** (2x) for faster experimentation
2. **Scale-specific training steps**:
- 2x: 1500-2000 steps
- 4x: 3000 steps
- 8x: 4000-5000 steps
3. **For 8x super-resolution**:
- Use 4000-5000 training steps
- Increase hidden layers to 4-5
- Use 512 hidden features
- Check quality metrics to verify results
4. **Use images with rich details** to see SIREN's strength in capturing high-frequency content
5. **Enable model cache** to avoid retraining with identical settings
## License
This demo is for educational purposes. Please cite the original SIREN paper if you use this in your work:
```bibtex
@inproceedings{sitzmann2020implicit,
title={Implicit Neural Representations with Periodic Activation Functions},
author={Sitzmann, Vincent and Martel, Julien NP and Bergman, Alexander W and Lindell, David B and Wetzstein, Gordon},
booktitle={Proc. NeurIPS},
year={2020}
}
```