| --- |
| license: cc-by-nc-4.0 |
| library_name: pytorch |
| tags: |
| - inverse-rendering |
| - image-decomposition |
| - basecolor |
| - normal-map |
| - pbr |
| - material-estimation |
| - shadenet |
| pipeline_tag: image-to-image |
| datasets: |
| - flickr8k |
| --- |
| |
| # ShadeNet 28M |
|
|
| A lightweight inverse rendering model that decomposes an RGB image into PBR material maps (basecolor, normal, and a packed roughness/metallic/depth map) and reconstructs the RGB image from those maps. |
|
|
| ## Examples |
|
|
| <a href="https://huggingface.co/singam96/ShadeNet/resolve/main/assets/input_result.png" target="_blank"><img src="https://huggingface.co/singam96/ShadeNet/resolve/main/assets/input_result.png" alt="Sample 1"></a> |
|
|
| <a href="https://huggingface.co/singam96/ShadeNet/resolve/main/assets/152029243_b3582c36fa_result.png" target="_blank"><img src="https://huggingface.co/singam96/ShadeNet/resolve/main/assets/152029243_b3582c36fa_result.png" alt="Sample 2"></a> |
|
|
| <a href="https://huggingface.co/singam96/ShadeNet/resolve/main/assets/160585932_fa6339f248_result.png" target="_blank"><img src="https://huggingface.co/singam96/ShadeNet/resolve/main/assets/160585932_fa6339f248_result.png" alt="Sample 3"></a> |
|
|
| *Each result shows: Input (blue) → Basecolor, Normal, Depth, Roughness, Metallic (green) → Recon RGB (orange). Click an image to view full size.* |
|
|
| ## Architecture |
|
|
| The model is a **MobileNetUNet (27.9M params)** with: |
| - **MobileNetV2 backbone** (frozen except last 8 layers) for feature extraction |
| - **Parallel Encoder** for additional learned features |
| - **UNet-style decoder** with skip connections, channel attention, and spatial attention |
| - **Dual mode** forward pass: |
|   - **Mode 0**: RGB → Inverse Maps (basecolor, normal, roughness/metallic/depth) |
|   - **Mode 1**: Inverse Maps → RGB reconstruction |
|
|
| ## Output Maps |
|
|
| | Map | Channels | Description | |
| |-----|----------|-------------| |
| | **Basecolor** | 3 | Albedo / diffuse color | |
| | **Normal** | 3 | Surface normals (tangent space) | |
| | **Roughness** | 1 | R channel of RMD - surface roughness | |
| | **Metallic** | 1 | G channel of RMD - metalness | |
| | **Depth** | 1 | B channel of RMD - relative depth | |
| | **RGB** | 3 | Reconstructed RGB from inverse maps | |
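
The table above describes roughness, metallic, and depth as the R, G, and B channels of a single packed RMD map. A minimal sketch of unpacking that map with NumPy follows. It assumes a channel-first `[3, H, W]` array (consistent with the `[1, 3, 512, 512]` ONNX input layout, but not confirmed for the outputs); `split_rmd` and `merge_rmd` are hypothetical helper names, not functions from this repo.

```python
import numpy as np


def split_rmd(rmd: np.ndarray):
    """Split a packed RMD map into (roughness, metallic, depth).

    Assumption: `rmd` is channel-first with shape [3, H, W], where
    channel 0 = R = roughness, 1 = G = metallic, 2 = B = depth.
    """
    if rmd.ndim != 3 or rmd.shape[0] != 3:
        raise ValueError("expected a channel-first [3, H, W] array")
    roughness, metallic, depth = rmd[0], rmd[1], rmd[2]
    return roughness, metallic, depth


def merge_rmd(roughness: np.ndarray, metallic: np.ndarray, depth: np.ndarray) -> np.ndarray:
    """Inverse of split_rmd: stack three [H, W] maps back into [3, H, W]."""
    return np.stack([roughness, metallic, depth], axis=0)
```

If the model output carries a batch dimension, index it first (for example `rmd_batch[0]`) before calling `split_rmd`.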
|
|
| ## Files |
|
|
| ``` |
| shadenet/ |
| ├── app.py                          # Gradio Space app |
| ├── inference.py                    # Standalone inference script (CLI) |
| ├── inference_utils.py              # Inference utilities (tiling, compositing) |
| ├── model.py                        # Model architecture |
| ├── layers.py                       # Layer components |
| ├── config.py                       # Configuration |
| ├── requirements.txt                # Python dependencies |
| ├── README.md                       # This file |
| ├── checkpoints/ |
| │   └── last.ckpt                   # PyTorch Lightning checkpoint (model weights) |
| └── onnx/ |
|     ├── model_mode0.onnx            # Mode 0 ONNX (RGB → inverse maps) |
|     ├── model_mode0_quantized.onnx  # Quantized mode 0 |
|     ├── model_mode1.onnx            # Mode 1 ONNX (inverse maps → RGB) |
|     └── model_mode1_quantized.onnx  # Quantized mode 1 |
| ``` |
|
|
| ## Usage |
|
|
| ### Gradio Space |
|
|
| A **Hugging Face Space** hosts this model as an interactive web app: upload an image in your browser and view the results, with no installation needed. |
|
|
| The `app.py` in this repo is the Space entrypoint. To create one: |
| 1. Go to [huggingface.co/new-space](https://huggingface.co/new-space) |
| 2. Select **Gradio SDK**, choose this repo as the source |
| 3. The Space launches automatically with the Gradio interface |
|
|
| ### CLI Inference |
|
|
| ```bash |
| pip install -r requirements.txt |
| python inference.py input.jpg --output_dir ./output |
| ``` |
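
To process a whole folder, the single-image command above can be wrapped in a small driver script. The sketch below only relies on the positional input path and the `--output_dir` flag shown above. The folder names, the per-image output subfolder, and the helper names `build_commands` and `run_batch` are illustrative assumptions, not part of this repo.

```python
import subprocess
import sys
from pathlib import Path

# Assumption: these are the extensions worth sending to inference.py.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def build_commands(input_dir, output_dir):
    """Build one `python inference.py <image> --output_dir <dir>` command per image."""
    commands = []
    for path in sorted(Path(input_dir).iterdir()):
        if path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        # One subfolder per image, so results cannot overwrite each other
        # regardless of how inference.py names its output files.
        target = Path(output_dir) / path.stem
        commands.append(
            [sys.executable, "inference.py", str(path), "--output_dir", str(target)]
        )
    return commands


def run_batch(input_dir, output_dir):
    """Run the commands one after another, stopping at the first failure."""
    for command in build_commands(input_dir, output_dir):
        subprocess.run(command, check=True)
```

Calling `run_batch("./images", "./output")` from the repo root would then process every image in `./images`.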
|
|
| ### ONNX Inference |
|
|
| The `onnx/` folder contains exported ONNX models for deployment without PyTorch: |
|
|
| - `model_mode0.onnx` / `model_mode0_quantized.onnx`: RGB → basecolor, normal, RMD |
| - `model_mode1.onnx` / `model_mode1_quantized.onnx`: Inverse maps → RGB |
|
|
| Input shape: `[1, 3, 512, 512]`, values in `[-1, 1]` |
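
As a rough illustration of the input contract above, the sketch below turns an `H x W x 3` uint8 RGB array into a `[1, 3, 512, 512]` float32 tensor in `[-1, 1]` and passes it to ONNX Runtime. Several details are assumptions rather than documented behavior: the linear `x / 127.5 - 1` scaling, the RGB channel order, and the nearest-neighbour resize (a stand-in for whatever resampling `inference.py` uses). The output names and their order are read from the model file at run time rather than guessed. `preprocess` and `run_mode0` are hypothetical helper names.

```python
import numpy as np


def preprocess(image_u8: np.ndarray, size: int = 512) -> np.ndarray:
    """Convert an HxWx3 uint8 RGB image to a [1, 3, size, size] float32 array in [-1, 1]."""
    if image_u8.ndim != 3 or image_u8.shape[2] != 3:
        raise ValueError("expected an HxWx3 RGB array")
    h, w = image_u8.shape[:2]
    # Nearest-neighbour resize by index lookup (assumption: good enough for a demo).
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    resized = image_u8[rows][:, cols]
    # Assumed scaling: 0 -> -1.0, 255 -> 1.0.
    x = resized.astype(np.float32) / 127.5 - 1.0
    # HWC -> CHW, then add the batch dimension.
    return np.transpose(x, (2, 0, 1))[None, ...]


def run_mode0(onnx_path: str, x: np.ndarray) -> dict:
    """Run a mode 0 ONNX model and return its outputs keyed by output name."""
    import onnxruntime as ort  # imported lazily so preprocess() works without it

    session = ort.InferenceSession(onnx_path, providers=["CPUExecutionProvider"])
    input_name = session.get_inputs()[0].name
    output_names = [out.name for out in session.get_outputs()]
    outputs = session.run(None, {input_name: x})
    return dict(zip(output_names, outputs))
```

For example, `run_mode0("onnx/model_mode0.onnx", preprocess(image))` returns a dictionary whose keys are whatever output names the exported graph defines.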
|
|
| ## Training |
|
|
| Trained on Flickr8k with paired inverse-rendered data. The model learns both forward (RGB→inverse) and reverse (inverse→RGB) mappings simultaneously using a combined L1 + MSE loss per output map. |
|
|
| - Optimizer: AdamW / Prodigy |
| - Image size: 512×512 |
| - Precision: 16-mixed |
| - Loss weights: basecolor=1.0, normal=1.5, RMD=1.0, RGB=1.0 |
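
The per-map loss and the weights listed above can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the training code: it assumes the L1 and MSE terms are added with equal weight and mean reduction, and the dictionary keys and function names are hypothetical.

```python
import numpy as np

# Weights as listed above; the key names are illustrative.
LOSS_WEIGHTS = {"basecolor": 1.0, "normal": 1.5, "rmd": 1.0, "rgb": 1.0}


def map_loss(pred: np.ndarray, target: np.ndarray) -> float:
    """Combined L1 + MSE for one output map (assumed: equal weight, mean reduction)."""
    diff = pred - target
    return float(np.mean(np.abs(diff)) + np.mean(diff ** 2))


def total_loss(preds: dict, targets: dict, weights: dict = LOSS_WEIGHTS) -> float:
    """Weighted sum of the per-map losses over all output maps."""
    return sum(w * map_loss(preds[name], targets[name]) for name, w in weights.items())
```

With these weights, an error in the normal map costs 1.5 times as much as the same error in any other map.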
|
|
| ## Citation |
|
|
| ``` |
| @software{shadenet, |
| author = {Sachin}, |
| title = {ShadeNet}, |
| year = {2026}, |
| } |
| ``` |
|
|