---
license: cc-by-nc-4.0
library_name: pytorch
tags:
- inverse-rendering
- image-decomposition
- basecolor
- normal-map
- pbr
- material-estimation
- shadenet
pipeline_tag: image-to-image
datasets:
- flickr8k
---

# ShadeNet 28M

A lightweight inverse rendering model that decomposes an RGB image into PBR material maps (basecolor, normal, and a packed roughness/metallic/depth map) and can reconstruct an RGB image from those maps.

## Examples

<a href="https://huggingface.co/singam96/ShadeNet/resolve/main/assets/input_result.png" target="_blank"><img src="https://huggingface.co/singam96/ShadeNet/resolve/main/assets/input_result.png" alt="Sample 1"></a>

<a href="https://huggingface.co/singam96/ShadeNet/resolve/main/assets/152029243_b3582c36fa_result.png" target="_blank"><img src="https://huggingface.co/singam96/ShadeNet/resolve/main/assets/152029243_b3582c36fa_result.png" alt="Sample 2"></a>

<a href="https://huggingface.co/singam96/ShadeNet/resolve/main/assets/160585932_fa6339f248_result.png" target="_blank"><img src="https://huggingface.co/singam96/ShadeNet/resolve/main/assets/160585932_fa6339f248_result.png" alt="Sample 3"></a>

*Each result shows: Input (blue) → Basecolor, Normal, Depth, Roughness, Metallic (green) → Recon RGB (orange). Click an image to view full size.*

## Architecture

The model is a **MobileNetUNet (27.9M params)** with:
- **MobileNetV2 backbone** (frozen except last 8 layers) for feature extraction
- **Parallel Encoder** for additional learned features
- **UNet-style decoder** with skip connections, channel attention, and spatial attention
- **Dual mode** forward pass:
  - **Mode 0**: RGB → Inverse Maps (basecolor, normal, roughness/metallic/depth)
  - **Mode 1**: Inverse Maps → RGB reconstruction

## Output Maps

| Map | Channels | Description |
|-----|----------|-------------|
| **Basecolor** | 3 | Albedo / diffuse color |
| **Normal** | 3 | Surface normals (tangent space) |
| **Roughness** | 1 | R channel of RMD - surface roughness |
| **Metallic** | 1 | G channel of RMD - metalness |
| **Depth** | 1 | B channel of RMD - relative depth |
| **RGB** | 3 | Reconstructed RGB from inverse maps |
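The RMD packing in the table above can be unpacked with a few lines of NumPy. This is a minimal sketch, not code from this repo: `split_rmd` is a hypothetical helper, and it assumes the RMD map arrives as a channel-first array of shape `(3, H, W)`.

```python
import numpy as np


def split_rmd(rmd: np.ndarray) -> dict:
    """Split a packed RMD map of shape (3, H, W) into single-channel maps.

    Hypothetical helper; the channel-first layout is an assumption.
    """
    if rmd.ndim != 3 or rmd.shape[0] != 3:
        raise ValueError(f"expected shape (3, H, W), got {rmd.shape}")
    # Channel order follows the table: R = roughness, G = metallic, B = depth.
    roughness, metallic, depth = rmd
    return {"roughness": roughness, "metallic": metallic, "depth": depth}


# Dummy data stands in for a real model output.
rmd = np.stack([
    np.full((4, 4), 0.8),  # roughness
    np.full((4, 4), 0.1),  # metallic
    np.full((4, 4), 0.5),  # depth
]).astype(np.float32)

maps = split_rmd(rmd)
print({name: m.shape for name, m in maps.items()})
```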

## Files

```
shadenet/
├── app.py                 # Gradio Space app
├── inference.py           # Standalone inference script (CLI)
├── inference_utils.py     # Inference utilities (tiling, compositing)
├── model.py               # Model architecture
├── layers.py              # Layer components
├── config.py              # Configuration
├── requirements.txt       # Python dependencies
├── README.md              # This file
├── checkpoints/
│   └── last.ckpt          # PyTorch Lightning checkpoint (model weights)
└── onnx/
    ├── model_mode0.onnx              # Mode 0 ONNX (RGB → inverse maps)
    ├── model_mode0_quantized.onnx    # Quantized mode 0
    ├── model_mode1.onnx              # Mode 1 ONNX (inverse maps → RGB)
    └── model_mode1_quantized.onnx    # Quantized mode 1
```

## Usage

### Gradio Space

A **Hugging Face Space** can host this model as an interactive web app: upload an image in your browser and view the results, with no local installation needed.

The `app.py` in this repo is the Space entrypoint. To create one:
1. Go to [huggingface.co/new-space](https://huggingface.co/new-space)
2. Select **Gradio SDK**, choose this repo as the source
3. The Space launches automatically with the Gradio interface

### CLI Inference

```bash
pip install -r requirements.txt
python inference.py input.jpg --output_dir ./output
```

### ONNX Inference

The `onnx/` folder contains exported ONNX models for deployment without PyTorch:

- `model_mode0.onnx` / `model_mode0_quantized.onnx`: RGB → basecolor, normal, RMD
- `model_mode1.onnx` / `model_mode1_quantized.onnx`: Inverse maps → RGB

Input shape: `[1, 3, 512, 512]`, values in `[-1, 1]`
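A rough sketch of preparing an input of that shape and range, and of passing it to ONNX Runtime, might look like the following. The function names `preprocess` and `run_mode0` are hypothetical and not part of this repo. The linear mapping from `[0, 255]` to `[-1, 1]` is an assumption, since only the target range is documented, and the order of the returned outputs is not documented here, so inspect `session.get_outputs()` before relying on it. Resizing to 512×512 is left to the caller.

```python
import numpy as np


def preprocess(image_u8: np.ndarray) -> np.ndarray:
    """Convert a 512x512x3 uint8 RGB image to a [1, 3, 512, 512] float32 array in [-1, 1]."""
    if image_u8.shape != (512, 512, 3):
        raise ValueError(f"expected shape (512, 512, 3), got {image_u8.shape}")
    # Assumed scaling: 0 -> -1.0 and 255 -> 1.0.
    x = image_u8.astype(np.float32) / 127.5 - 1.0
    # HWC -> CHW, then add the batch dimension.
    x = np.transpose(x, (2, 0, 1))[np.newaxis, ...]
    return np.ascontiguousarray(x)


def run_mode0(onnx_path: str, image_u8: np.ndarray) -> list:
    """Run a mode 0 ONNX model on one image and return its raw outputs."""
    # Imported lazily so preprocess() can be used without onnxruntime installed.
    import onnxruntime as ort

    session = ort.InferenceSession(onnx_path, providers=["CPUExecutionProvider"])
    input_name = session.get_inputs()[0].name
    return session.run(None, {input_name: preprocess(image_u8)})


# Shape and range check on a dummy white image.
x = preprocess(np.full((512, 512, 3), 255, dtype=np.uint8))
print(x.shape, x.dtype, x.min(), x.max())
```

With the model files downloaded, a call such as `run_mode0("onnx/model_mode0.onnx", image)` would return a list of NumPy arrays, one per model output.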

## Training

Trained on Flickr8k images paired with inverse-rendered target maps. The model learns the forward (RGB → inverse maps) and reverse (inverse maps → RGB) mappings jointly, using a combined L1 + MSE loss on each output map.

- Optimizer: AdamW / Prodigy
- Image size: 512×512
- Precision: 16-mixed
- Loss weights: basecolor=1.0, normal=1.5, RMD=1.0, RGB=1.0
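To make the weighting concrete, here is a small NumPy sketch of a weighted L1 + MSE objective using the loss weights listed above. It is not the training code: the names `l1_plus_mse` and `total_loss` are hypothetical, and it assumes the L1 and MSE terms are added with equal weight and averaged over all elements, which this card does not specify.

```python
import numpy as np

# Loss weights as listed above.
LOSS_WEIGHTS = {"basecolor": 1.0, "normal": 1.5, "rmd": 1.0, "rgb": 1.0}


def l1_plus_mse(pred: np.ndarray, target: np.ndarray) -> float:
    """Mean absolute error plus mean squared error (equal mixing is an assumption)."""
    diff = pred - target
    return float(np.mean(np.abs(diff)) + np.mean(diff ** 2))


def total_loss(preds: dict, targets: dict) -> float:
    """Weighted sum of the per-map L1 + MSE losses."""
    return sum(w * l1_plus_mse(preds[k], targets[k]) for k, w in LOSS_WEIGHTS.items())


# Dummy maps: predictions are all 0.0, targets are all 0.5.
shape = (3, 8, 8)
preds = {k: np.zeros(shape, dtype=np.float32) for k in LOSS_WEIGHTS}
targets = {k: np.full(shape, 0.5, dtype=np.float32) for k in LOSS_WEIGHTS}

# Each map contributes 0.5 (L1) + 0.25 (MSE) = 0.75; the weights sum to 4.5.
print(total_loss(preds, targets))  # 3.375
```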

## Citation

```
@software{shadenet,
  author = {Sachin},
  title = {ShadeNet},
  year = {2026},
}
```