---
license: mit
language:
- en
base_model:
- sd2-community/stable-diffusion-2-1
pipeline_tag: image-to-image
tags:
- material-decomposition
- diffusion
---

# Model Overview

This repository hosts the pretrained parameters for the SuperMat project, as described in [*"SuperMat: Physically Consistent PBR Material Estimation at Interactive Rates"* (ICCV 2025)](https://arxiv.org/abs/2411.17515).

| Model | Description |
| --- | --- |
| supermat.pth | Base SuperMat model for material decomposition |
| supermat_mv.pth | Multi-view version of SuperMat that processes six orthogonal views |
| uv_refine_bc.pth | UV refinement network for albedo materials |
| uv_refine_rm.pth | UV refinement network for roughness & metallic materials |

All models are built on the base model `stabilityai/stable-diffusion-2-1`.

Note: The official `stabilityai/stable-diffusion-2-1` model has been removed, so you may need to obtain the base model parameters from an alternative source such as `sd2-community/stable-diffusion-2-1`.

# Model Details
#### SuperMat (supermat.pth)
The core model for material decomposition. It takes RGBA images as input and decomposes materials from the target object.
#### SuperMat Multi-View (supermat_mv.pth)
An extended version that processes six orthogonal views simultaneously. This model leverages multi-view consistency for improved material estimation. For each view, the camera-to-world (c2w) matrix is provided as camera embeddings.
#### UV Refinement Networks
Two specialized networks for refining UV maps:
- uv_refine_bc.pth: Refines the UV map for albedo materials
- uv_refine_rm.pth: Refines the UV map for roughness & metallic materials

# Download & Usage
Download the model(s) you need from this repository and place them in the `checkpoints` folder:

```
checkpoints/
├── supermat.pth
├── supermat_mv.pth
├── uv_refine_bc.pth
└── uv_refine_rm.pth
```
The models are independent of each other, so you only need to download those required for your specific inference task.
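
As a rough sketch of fetching only what a task needs, the snippet below maps each task to its checkpoint file and downloads whatever is missing using `hf_hub_download` from `huggingface_hub`. The task names, the `REPO_ID` placeholder, and both helper functions are illustrative assumptions rather than part of the SuperMat codebase; replace `REPO_ID` with this repository's actual id.

```python
from pathlib import Path

# Placeholder: replace with this repository's id on the Hugging Face Hub.
REPO_ID = "<namespace>/<repo-name>"

# Hypothetical task names mapped to the checkpoint files listed above.
CHECKPOINTS = {
    "single": "supermat.pth",
    "multiview": "supermat_mv.pth",
    "uv_albedo": "uv_refine_bc.pth",
    "uv_rm": "uv_refine_rm.pth",
}


def missing_checkpoints(tasks, folder="checkpoints"):
    """Return the checkpoint filenames needed for `tasks` that are not yet in `folder`."""
    folder = Path(folder)
    needed = [CHECKPOINTS[task] for task in tasks]
    return [name for name in needed if not (folder / name).is_file()]


def download_missing(tasks, folder="checkpoints", repo_id=REPO_ID):
    """Download any missing checkpoints into `folder` (requires `huggingface_hub`)."""
    from huggingface_hub import hf_hub_download  # imported lazily; only needed here

    for name in missing_checkpoints(tasks, folder):
        hf_hub_download(repo_id=repo_id, filename=name, local_dir=folder)
```

For example, `download_missing(["single"])` would fetch only `supermat.pth`, and calling it again would do nothing once the file is in place.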

# Input Requirements
#### Image Format
- SuperMat models expect RGBA images where only the target object appears as foreground, with alpha values set to `0` for all other regions
- During inference, the input image is alpha-composited with a gray background `(0.5, 0.5, 0.5)`
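
To preview what the model sees after the compositing step described above, a minimal sketch follows. It assumes straight (non-premultiplied) alpha and pixel values scaled to `[0, 1]`; the function name `composite_on_gray` is hypothetical, and the inference scripts may implement this step differently.

```python
import numpy as np


def composite_on_gray(rgba, gray=0.5):
    """Alpha-composite an RGBA image (H, W, 4; floats in [0, 1]) over a uniform gray background."""
    rgba = np.asarray(rgba, dtype=np.float32)
    if rgba.ndim != 3 or rgba.shape[-1] != 4:
        raise ValueError("expected an array of shape (H, W, 4)")
    rgb, alpha = rgba[..., :3], rgba[..., 3:4]
    # Standard "over" operator, assuming straight (non-premultiplied) alpha.
    return rgb * alpha + gray * (1.0 - alpha)


# Usage with Pillow (the file name is illustrative):
# from PIL import Image
# rgba = np.asarray(Image.open("object.png").convert("RGBA"), dtype=np.float32) / 255.0
# rgb = composite_on_gray(rgba)
```

Fully transparent pixels come out as `(0.5, 0.5, 0.5)`, so any stray color data stored under zero alpha does not reach the model.
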
#### Resolution Preferences
- SuperMat models: `512×512` resolution (recommended)
- UV refinement networks: `1024×1024` resolution (recommended)
#### Multi-View Specific Requirements
For the multi-view model:
- All inputs for a single case should be organized in one folder
- Input images must follow the naming convention shown in `examples/bag_rendered_6views`
- Camera information is stored in `meta.json` (refer to the example for the required format with c2w matrices)
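
When preparing your own camera data, a generic sanity check on each c2w matrix can catch transposed or malformed entries early. The sketch below makes no assumption about the layout of `meta.json` or about the axis convention used by the examples; it only tests that a matrix is a 4×4 rigid transform. The function name `is_valid_c2w` is hypothetical, and matrices that intentionally include scaling would fail this check.

```python
import numpy as np


def is_valid_c2w(matrix, tol=1e-4):
    """Check that `matrix` is a 4x4 rigid camera-to-world transform."""
    m = np.asarray(matrix, dtype=np.float64)
    if m.shape != (4, 4):
        return False
    rotation = m[:3, :3]
    # The rotation block must be orthonormal with determinant +1.
    orthonormal = np.allclose(rotation @ rotation.T, np.eye(3), atol=tol)
    proper = np.isclose(np.linalg.det(rotation), 1.0, atol=tol)
    # The last row of a homogeneous rigid transform is [0, 0, 0, 1].
    last_row_ok = np.allclose(m[3], [0.0, 0.0, 0.0, 1.0], atol=tol)
    return bool(orthonormal and proper and last_row_ok)
```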

# Quick Inference Examples
#### SuperMat Single-Image
```
python inference_supermat.py \
  --input examples/ring_rendered_2views \
  --output-dir outputs \
  --checkpoint checkpoints/supermat.pth \
  --base-model sd2-community/stable-diffusion-2-1 \
  --device cuda:0 \
  --image-size 512
```

#### SuperMat Multi-View
```
python inference_supermat_mv.py \
  --input examples/bag_rendered_6views \
  --output-dir outputs_mv \
  --checkpoint checkpoints/supermat_mv.pth \
  --base-model sd2-community/stable-diffusion-2-1 \
  --device cuda:0 \
  --image-size 512 \
  --num_views 6 \
  --use-camera-embeds
```

#### UV Refinement (Albedo)
```
python inference_uv_refine.py \
  --input-uv examples/axe_uv/uv_bc.png \
  --input-uv-position examples/axe_uv/uv_position.png \
  --input-uv-mask examples/axe_uv/uv_mask.png \
  --output-dir outputs_uv_bc \
  --checkpoint checkpoints/uv_refine_bc.pth \
  --base-model sd2-community/stable-diffusion-2-1 \
  --device cuda:0 \
  --image-size 1024
```

For complete usage instructions, please refer to the [main repository](https://github.com/hyj542682306/SuperMat).

# Citation
If you find these models useful in your research, please cite:
```
@inproceedings{hong2025supermat,
  title={{SuperMat}: Physically Consistent {PBR} Material Estimation at Interactive Rates},
  author={Hong, Yijia and Guo, Yuan-Chen and Yi, Ran and Chen, Yulong and Cao, Yan-Pei and Ma, Lizhuang},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={25083--25093},
  year={2025}
}
```