---
license: cc-by-nc-4.0
library_name: pytorch
tags:
- inverse-rendering
- image-decomposition
- basecolor
- normal-map
- pbr
- material-estimation
- shadenet
pipeline_tag: image-to-image
datasets:
- flickr8k
---

# ShadeNet 28M

A lightweight inverse rendering model that decomposes RGB images into PBR material maps (basecolor, normal, roughness/metallic/depth) and reconstructs them back to RGB.

## Examples

<a href="https://huggingface.co/singam96/ShadeNet/resolve/main/assets/input_result.png" target="_blank"><img src="https://huggingface.co/singam96/ShadeNet/resolve/main/assets/input_result.png" alt="Sample 1"></a>

<a href="https://huggingface.co/singam96/ShadeNet/resolve/main/assets/152029243_b3582c36fa_result.png" target="_blank"><img src="https://huggingface.co/singam96/ShadeNet/resolve/main/assets/152029243_b3582c36fa_result.png" alt="Sample 2"></a>

<a href="https://huggingface.co/singam96/ShadeNet/resolve/main/assets/160585932_fa6339f248_result.png" target="_blank"><img src="https://huggingface.co/singam96/ShadeNet/resolve/main/assets/160585932_fa6339f248_result.png" alt="Sample 3"></a>

*Each result shows: Input (blue) → Basecolor, Normal, Depth, Roughness, Metallic (green) → Recon RGB (orange). Click an image to view full size.*

## Architecture

The model is a **MobileNetUNet (27.9M params)** with:
- **MobileNetV2 backbone** (frozen except last 8 layers) for feature extraction
- **Parallel Encoder** for additional learned features
- **UNet-style decoder** with skip connections, channel attention, and spatial attention
- **Dual mode** forward pass:
  - **Mode 0**: RGB → Inverse Maps (basecolor, normal, roughness/metallic/depth)
  - **Mode 1**: Inverse Maps → RGB reconstruction

## Output Maps

| Map | Channels | Description |
|-----|----------|-------------|
| **Basecolor** | 3 | Albedo / diffuse color |
| **Normal** | 3 | Surface normals (tangent space) |
| **Roughness** | 1 | R channel of RMD - surface roughness |
| **Metallic** | 1 | G channel of RMD - metalness |
| **Depth** | 1 | B channel of RMD - relative depth |
| **RGB** | 3 | Reconstructed RGB from inverse maps |
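
The RMD packing in the table above can be unpacked with a few lines of NumPy. This is a minimal sketch, not code from the repo: the helper name `split_rmd` is hypothetical, and it assumes the RMD map arrives channel-first as `[3, H, W]` with channels ordered R, G, B as listed in the table.

```python
import numpy as np


def split_rmd(rmd: np.ndarray) -> dict:
    """Split a channel-first [3, H, W] RMD map into named [H, W] maps.

    Hypothetical helper; channel order follows the table:
    R = roughness, G = metallic, B = depth.
    """
    if rmd.ndim != 3 or rmd.shape[0] != 3:
        raise ValueError("expected an RMD array of shape [3, H, W]")
    roughness, metallic, depth = rmd  # unpack along the channel axis
    return {"roughness": roughness, "metallic": metallic, "depth": depth}


# Example with a dummy map whose channels hold distinct constants.
dummy = np.stack([np.full((4, 4), v, dtype=np.float32) for v in (0.2, 0.5, 0.9)])
maps = split_rmd(dummy)
```

If your pipeline stores maps channel-last (`[H, W, 3]`), transpose before calling the helper.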

## Files

```
shadenet/
├── app.py                 # Gradio Space app
├── inference.py           # Standalone inference script (CLI)
├── inference_utils.py     # Inference utilities (tiling, compositing)
├── model.py               # Model architecture
├── layers.py              # Layer components
├── config.py              # Configuration
├── requirements.txt       # Python dependencies
├── README.md              # This file
├── checkpoints/
│   └── last.ckpt          # PyTorch Lightning checkpoint (model weights)
└── onnx/
    ├── model_mode0.onnx              # Mode 0 ONNX (RGB → inverse maps)
    ├── model_mode0_quantized.onnx    # Quantized mode 0
    ├── model_mode1.onnx              # Mode 1 ONNX (inverse maps → RGB)
    └── model_mode1_quantized.onnx    # Quantized mode 1
```

## Usage

### Gradio Space

A **Hugging Face Space** hosts this model as an interactive web app: upload an image in your browser and view the results, with no installation needed.

The `app.py` in this repo is the Space entrypoint. To create your own Space:
1. Go to [huggingface.co/new-space](https://huggingface.co/new-space)
2. Select **Gradio SDK** and choose this repo as the source
3. The Space launches automatically with the Gradio interface

### CLI Inference

```bash
pip install -r requirements.txt
python inference.py input.jpg --output_dir ./output
```

### ONNX Inference

The `onnx/` folder contains exported ONNX models for deployment without PyTorch:

- `model_mode0.onnx` / `model_mode0_quantized.onnx`: RGB → basecolor, normal, RMD
- `model_mode1.onnx` / `model_mode1_quantized.onnx`: Inverse maps → RGB

Input shape: `[1, 3, 512, 512]`, values in `[-1, 1]`
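
To illustrate the input contract above, here is a minimal preprocessing sketch in NumPy. The function name `preprocess` is hypothetical, the image is assumed to be already resized to 512×512, and the linear mapping `x / 127.5 - 1` is one common way to reach `[-1, 1]`; the exact normalization in `inference.py` may differ.

```python
import numpy as np


def preprocess(image_uint8: np.ndarray) -> np.ndarray:
    """Convert an HxWx3 uint8 RGB image to a [1, 3, H, W] float32 array in [-1, 1].

    Hypothetical helper; assumes the image is already 512x512.
    """
    if image_uint8.ndim != 3 or image_uint8.shape[2] != 3:
        raise ValueError("expected an HxWx3 RGB image")
    x = image_uint8.astype(np.float32) / 127.5 - 1.0  # [0, 255] -> [-1, 1]
    x = np.transpose(x, (2, 0, 1))                    # HWC -> CHW
    return x[np.newaxis, ...]                         # add batch dimension


# Example with a synthetic pure-red image.
image = np.zeros((512, 512, 3), dtype=np.uint8)
image[..., 0] = 255
tensor = preprocess(image)
```

The resulting array can be passed to an ONNX runtime session as the model input; the input tensor name is not documented here, so query it from the loaded model rather than hard-coding it.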

## Training

Trained on Flickr8k with paired inverse-rendered data. The model learns both forward (RGB→inverse) and reverse (inverse→RGB) mappings simultaneously using a combined L1 + MSE loss per output map.

- Optimizer: AdamW / Prodigy
- Image size: 512×512
- Precision: 16-mixed
- Loss weights: basecolor=1.0, normal=1.5, RMD=1.0, RGB=1.0
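
The weighted objective described above can be sketched as follows. This is an illustration in NumPy rather than the training code: it assumes L1 and MSE are summed with equal weight inside each map and reduced by mean, neither of which is stated here, and the names `l1_plus_mse` and `total_loss` are hypothetical.

```python
import numpy as np

# Per-map weights taken from the list above.
LOSS_WEIGHTS = {"basecolor": 1.0, "normal": 1.5, "rmd": 1.0, "rgb": 1.0}


def l1_plus_mse(pred: np.ndarray, target: np.ndarray) -> float:
    """Mean absolute error plus mean squared error (equal weighting is an assumption)."""
    diff = pred - target
    return float(np.mean(np.abs(diff)) + np.mean(diff ** 2))


def total_loss(preds: dict, targets: dict) -> float:
    """Weighted sum of the per-map L1 + MSE terms."""
    return sum(
        weight * l1_plus_mse(preds[name], targets[name])
        for name, weight in LOSS_WEIGHTS.items()
    )


# Example: predictions of all zeros against targets of all ones.
shapes = {"basecolor": (3, 8, 8), "normal": (3, 8, 8), "rmd": (3, 8, 8), "rgb": (3, 8, 8)}
preds = {k: np.zeros(s, dtype=np.float32) for k, s in shapes.items()}
targets = {k: np.ones(s, dtype=np.float32) for k, s in shapes.items()}
loss = total_loss(preds, targets)
```

With these weights, an error in the normal map contributes 1.5 times as much to the total as the same error in any other map.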

## Citation

```
@software{shadenet,
  author = {Sachin},
  title = {ShadeNet},
  year = {2026},
}
```