---
title: Siren Super Resolution
emoji: 🔥
colorFrom: purple
colorTo: pink
sdk: gradio
sdk_version: 6.0.2
app_file: app.py
pinned: false
---

# 🔥 SIREN Super-Resolution Demo

A Gradio demo showcasing **SIREN** (Sinusoidal Representation Networks) for image super-resolution.
## What is SIREN?

SIREN networks use periodic activation functions (sine) instead of traditional ReLU activations, making them exceptionally well-suited for representing continuous signals and capturing fine details in images.

**Key advantages:**

- Smooth, continuous representations
- Excellent for capturing high-frequency details
- Can represent images at arbitrary resolutions
- Implicit neural representation - no upsampling layers needed!
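The core idea fits in a few lines: a SIREN layer is a linear map followed by a sine, with the frequency-scaled initialization from the paper. This is a NumPy sketch for illustration; the demo's actual implementation lives in `siren.py`, and the function names here are made up:

```python
import numpy as np

rng = np.random.default_rng(0)

def siren_layer_init(in_features, out_features, omega_0=30.0, is_first=False):
    # Initialization from the SIREN paper: U(-1/n, 1/n) for the first layer,
    # U(-sqrt(6/n)/omega_0, sqrt(6/n)/omega_0) for hidden layers.
    bound = 1.0 / in_features if is_first else np.sqrt(6.0 / in_features) / omega_0
    W = rng.uniform(-bound, bound, size=(in_features, out_features))
    b = rng.uniform(-bound, bound, size=out_features)
    return W, b

def siren_layer_forward(x, W, b, omega_0=30.0):
    # Sine activation in place of ReLU: sin(omega_0 * (xW + b)).
    return np.sin(omega_0 * (x @ W + b))
```

The `omega_0` factor controls how high-frequency the first layer's response is; 30 is the paper's default.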
## How This Demo Works

1. **Upload** a high-resolution image (this serves as the ground truth)
2. **Downsample** the image artificially by a selected scale factor (2x, 4x, or 8x)
3. **Train** SIREN to learn the downsampled image representation
4. **Generate** a super-resolved version at the original resolution
5. **Compare** the results: downsampled input, SIREN output, and ground truth
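The steps above hinge on SIREN taking pixel *coordinates*, not pixels, as input: the network is fit to (coordinate, color) pairs from the low-resolution image, then queried on a denser grid over the same [-1, 1] range to produce the super-resolved output. A minimal sketch of the grid construction (the demo's own helper presumably lives in `utils.py`):

```python
import numpy as np

def coord_grid(h, w):
    # Pixel coordinates normalized to [-1, 1], flattened to shape (h*w, 2).
    ys = np.linspace(-1.0, 1.0, h)
    xs = np.linspace(-1.0, 1.0, w)
    yy, xx = np.meshgrid(ys, xs, indexing="ij")
    return np.stack([yy.ravel(), xx.ravel()], axis=-1)

# Train on the coarse grid (paired with low-res pixel colors), then query
# the same trained network on a 4x denser grid over the same range:
low = coord_grid(32, 32)
high = coord_grid(128, 128)
```

Because the representation is continuous, no upsampling layers are needed: higher resolution is just denser sampling of the same function.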
## Features

- 🎚️ **Multiple scale factors**: 2x, 4x, 8x super-resolution
- 📊 **Quality metrics**: PSNR, SSIM, and MAE for objective quality assessment
- 💾 **Model caching**: Save and reuse trained models to avoid retraining
- 🎨 **Improved UI**: Tabbed interface with side-by-side comparison view
- 🎛️ **Configurable model**: Adjust hidden layers, features, and training steps
- 📈 **Training visualization**: Watch the loss curve during training
- 📸 **Real sample images**: High-quality photos from Unsplash (cat, landscape, portrait, flower)
## Installation

```bash
# Install dependencies
pip install -r requirements.txt

# Generate sample images (optional - already included)
python create_samples.py

# Run the demo
python app.py
```
## Usage

### Running locally

```bash
python app.py
```

Then open your browser to the URL shown (usually `http://127.0.0.1:7860`).

### Quick test

```bash
python test_siren.py
```

This runs a quick test to verify that the SIREN implementation works correctly.
## Files

- `app.py` - Main Gradio application
- `siren.py` - SIREN model implementation
- `utils.py` - Image processing utilities
- `create_samples.py` - Script to generate sample images
- `test_siren.py` - Quick test script
- `samples/` - Sample images for testing
## Parameters

### Model Architecture

- **Hidden Features**: Width of the network (128-512)
  - More features = more capacity but slower training
- **Hidden Layers**: Depth of the network (2-6)
  - More layers = more capacity but slower training

### Training

- **Training Steps**: Number of optimization steps (500-5000)
  - More steps = better quality but longer training
  - 2000 steps is a good balance

### Super-Resolution

- **Scale Factor**: Downsampling/upsampling factor (2x, 4x, 8x)
  - 2x: Easier task, faster training
  - 4x: Moderate difficulty
  - 8x: Challenging, may need more steps
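To get a feel for the capacity/speed trade-off these sliders control, you can estimate the parameter count of a coordinate MLP mapping (y, x) to RGB. This is a back-of-the-envelope helper, not part of the app:

```python
def siren_param_count(hidden_features, hidden_layers, in_f=2, out_f=3):
    # Rough weight + bias count for a coordinate MLP: (y, x) in, RGB out.
    n = in_f * hidden_features + hidden_features                  # input layer
    n += hidden_layers * (hidden_features ** 2 + hidden_features) # hidden layers
    n += hidden_features * out_f + out_f                          # output layer
    return n
```

Doubling the width roughly quadruples the hidden-layer parameters, which is why wider settings train noticeably slower.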
## Example Results

The demo shows three outputs:

1. **Downsampled (Input)**: The artificially downsampled low-resolution image
2. **Super-Resolved (SIREN)**: The SIREN-generated high-resolution output
3. **Ground Truth (Original)**: The original high-resolution image for comparison
## References

- **Paper**: [Implicit Neural Representations with Periodic Activation Functions (SIREN)](https://arxiv.org/abs/2006.09661)
- **Project Page**: [https://vsitzmann.github.io/siren/](https://vsitzmann.github.io/siren/)
- **Notebook Tutorial**: [SIREN Tutorial by Nipun Batra](https://github.com/nipunbatra/pml-teaching/blob/master/notebooks/siren.ipynb)
## Quality Metrics Explained

The demo reports three standard image quality metrics:

- **PSNR (Peak Signal-to-Noise Ratio)**: Measures reconstruction quality in dB. Higher is better.
  - >30 dB: Good quality
  - >40 dB: Excellent quality
- **SSIM (Structural Similarity Index)**: Perceptual quality metric ranging from 0 to 1. Closer to 1.0 is better.
  - >0.9: Very good quality
  - >0.95: Excellent quality
- **MAE (Mean Absolute Error)**: Average pixel-wise difference. Lower is better.
  - <0.01: Excellent
  - <0.05: Good
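PSNR and MAE are simple enough to compute directly, as in this NumPy sketch for images scaled to [0, 1] (SSIM is more involved; the app would typically delegate it to a library such as scikit-image):

```python
import numpy as np

def psnr(a, b, data_range=1.0):
    # Peak Signal-to-Noise Ratio in dB; higher means a closer reconstruction.
    mse = np.mean((a - b) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(data_range ** 2 / mse)

def mae(a, b):
    # Mean absolute pixel difference; lower is better.
    return float(np.mean(np.abs(a - b)))
```

Note that PSNR of an image against itself is infinite, which is why the thresholds above are open-ended.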
## Model Caching

Trained models are automatically saved and can be reused:

- Models are cached in the `model_cache/` directory
- The cache key includes: image size, scale factor, training steps, and architecture
- Enable/disable caching with the checkbox in the UI
- Caching drastically speeds up repeated experiments with the same settings
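A typical cache key for this kind of setup hashes every setting that affects the trained weights, so any change invalidates the cache. This sketch uses illustrative field names, not the app's exact scheme:

```python
import hashlib
import json

def cache_key(image_size, scale, steps, hidden_features, hidden_layers):
    # Serialize the settings deterministically, then hash them so the key
    # is short and filesystem-safe.
    cfg = {
        "size": image_size,
        "scale": scale,
        "steps": steps,
        "features": hidden_features,
        "layers": hidden_layers,
    }
    blob = json.dumps(cfg, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()[:16]
```

`sort_keys=True` matters: it makes the serialization order-independent, so the same settings always map to the same key.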
## Tips for Best Results

1. **Start with lower scale factors** (2x) for faster experimentation
2. **Scale-specific training steps**:
   - 2x: 1500-2000 steps
   - 4x: 3000 steps
   - 8x: 4000-5000 steps
3. **For 8x super-resolution**:
   - Use 4000-5000 training steps
   - Increase hidden layers to 4-5
   - Use 512 hidden features
   - Check quality metrics to verify results
4. **Use images with rich details** to see SIREN's strength in capturing high-frequency content
5. **Enable model caching** to avoid retraining with identical settings
## License

This demo is for educational purposes. Please cite the original SIREN paper if you use this in your work:

```bibtex
@inproceedings{sitzmann2020implicit,
  title={Implicit Neural Representations with Periodic Activation Functions},
  author={Sitzmann, Vincent and Martel, Julien NP and Bergman, Alexander W and Lindell, David B and Wetzstein, Gordon},
  booktitle={Proc. NeurIPS},
  year={2020}
}
```