| --- |
| license: mit |
| language: |
| - en |
| base_model: |
| - sd2-community/stable-diffusion-2-1 |
| pipeline_tag: image-to-image |
| tags: |
| - material-decomposition |
| - diffusion |
| --- |
| |
| # Model Overview |
|
|
| This repository hosts the pretrained parameters for the SuperMat project, described in [*"SuperMat: Physically Consistent PBR Material Estimation at Interactive Rates"* (ICCV 2025)](https://arxiv.org/abs/2411.17515). |
|
|
| Model | Description |
| --- | --- |
| | supermat.pth | Base SuperMat model for material decomposition |
| | supermat_mv.pth | Multi-view version of SuperMat processing six orthogonal views |
| | uv_refine_bc.pth | UV refinement network for albedo materials |
| | uv_refine_rm.pth | UV refinement network for roughness & metallic materials |
| |
| All models are built upon the base model `stabilityai/stable-diffusion-2-1`. |
| Note: The official `stabilityai/stable-diffusion-2-1` model has been removed. You may need to obtain the base model parameters through alternative sources, such as `sd2-community/stable-diffusion-2-1`. |
| |
| # Model Details |
| #### SuperMat (supermat.pth) |
| The core model for material decomposition. It takes an RGBA image as input and decomposes the target object's appearance into material maps (albedo, roughness, and metallic). |
| #### SuperMat Multi-View (supermat_mv.pth) |
| An extended version that processes six orthogonal views simultaneously. This model leverages multi-view consistency for improved material estimation. For each view, the camera-to-world (c2w) matrix is provided as camera embeddings. |
| #### UV Refinement Networks |
| Two specialized networks for refining UV maps: |
| - uv_refine_bc.pth: Refines the UV map for albedo materials |
| - uv_refine_rm.pth: Refines the UV map for roughness & metallic materials |
|
|
| # Download & Usage |
| Download the model(s) you need from this repository and place them in a `checkpoints/` folder: |
|
|
| ``` |
| checkpoints/ |
| ├── supermat.pth |
| ├── supermat_mv.pth |
| ├── uv_refine_bc.pth |
| └── uv_refine_rm.pth |
| ``` |
| The models are independent of each other, so you only need to download those required for your specific inference task. |
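
As a rough illustration of that independence, the sketch below maps each inference task to the single checkpoint it needs and reports any that are missing from `checkpoints/`. The task names and the `missing_checkpoints` helper are hypothetical and not part of the SuperMat codebase; only the file names come from the model table.

```python
from pathlib import Path

# Hypothetical task names mapped to the checkpoint files listed in this repository.
REQUIRED = {
    "single_image": "supermat.pth",
    "multi_view": "supermat_mv.pth",
    "uv_refine_albedo": "uv_refine_bc.pth",
    "uv_refine_roughness_metallic": "uv_refine_rm.pth",
}


def missing_checkpoints(tasks, root="checkpoints"):
    """Return the checkpoint paths required by `tasks` that do not exist under `root`."""
    root = Path(root)
    needed = [root / REQUIRED[task] for task in tasks]
    return [path for path in needed if not path.is_file()]


if __name__ == "__main__":
    # Only the single-image checkpoint is checked here.
    for path in missing_checkpoints(["single_image"]):
        print(f"missing: {path}")
```

Running the check before launching an inference script gives an early, readable error instead of a failure partway through model loading.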
|
|
| # Input Requirements |
| #### Image Format |
| - SuperMat models expect RGBA images where only the target object appears as foreground, with alpha values set to `0` for all other regions |
| - During inference, the input image is alpha-composited with a gray background `(0.5, 0.5, 0.5)` |
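
The compositing step described above can be sketched with NumPy as follows. This is a minimal sketch, not the repository's preprocessing code: the function name `composite_on_gray` and the assumption of a float RGBA array in `[0, 1]` with straight (non-premultiplied) alpha are illustrative.

```python
import numpy as np


def composite_on_gray(rgba, gray=0.5):
    """Alpha-composite an (H, W, 4) float RGBA image in [0, 1] over a uniform gray background."""
    rgba = np.asarray(rgba, dtype=np.float32)
    rgb, alpha = rgba[..., :3], rgba[..., 3:4]
    # Straight-alpha "over" operator: foreground where alpha is 1, gray where alpha is 0.
    return rgb * alpha + gray * (1.0 - alpha)


if __name__ == "__main__":
    image = np.zeros((2, 2, 4), dtype=np.float32)
    image[0, 0] = [1.0, 0.0, 0.0, 1.0]  # opaque red foreground pixel
    out = composite_on_gray(image)
    print(out[0, 0], out[1, 1])
```

Pixels with alpha `0` come out as pure gray, which is why background regions should be fully transparent rather than filled with an arbitrary color.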
| #### Resolution Preferences |
| - SuperMat models: `512×512` resolution (recommended) |
| - UV refinement networks: `1024×1024` resolution (recommended) |
| #### Multi-View Specific Requirements |
| For the multi-view model: |
| - All inputs for a single case should be organized in one folder |
| - Input images must follow the naming convention as shown in `examples/bag_rendered_6views` |
| - Camera information is stored in `meta.json` (refer to the example for the required format with c2w matrices) |
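
A malformed camera matrix is easy to miss, so a quick sanity check before inference may help. The sketch below assumes each c2w matrix is a 4×4 rigid transform (orthonormal rotation, bottom row `[0, 0, 0, 1]`); this is a common convention rather than something stated here, and the layout of `meta.json` itself should be taken from the example folder. The helper name `is_valid_c2w` is hypothetical.

```python
import numpy as np


def is_valid_c2w(matrix, tol=1e-4):
    """Check that `matrix` is a 4x4 rigid camera-to-world transform."""
    m = np.asarray(matrix, dtype=np.float64)
    if m.shape != (4, 4):
        return False
    rotation = m[:3, :3]
    # A rotation matrix is orthonormal with determinant +1.
    orthonormal = np.allclose(rotation @ rotation.T, np.eye(3), atol=tol)
    proper = np.isclose(np.linalg.det(rotation), 1.0, atol=tol)
    # The last row of a homogeneous rigid transform is [0, 0, 0, 1].
    homogeneous = np.allclose(m[3], [0.0, 0.0, 0.0, 1.0], atol=tol)
    return bool(orthonormal and proper and homogeneous)


if __name__ == "__main__":
    # Identity rotation, camera placed 2 units along +z (illustrative values only).
    c2w = np.eye(4)
    c2w[2, 3] = 2.0
    print(is_valid_c2w(c2w))
```

The check says nothing about the axis convention (for example OpenGL versus OpenCV camera axes); that still has to match the provided example.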
|
|
| # Quick Inference Examples |
| #### SuperMat Single-Image |
| ``` |
| python inference_supermat.py \ |
| --input examples/ring_rendered_2views \ |
| --output-dir outputs \ |
| --checkpoint checkpoints/supermat.pth \ |
| --base-model sd2-community/stable-diffusion-2-1 \ |
| --device cuda:0 \ |
| --image-size 512 |
| ``` |
|
|
| #### SuperMat Multi-View |
| ``` |
| python inference_supermat_mv.py \ |
| --input examples/bag_rendered_6views \ |
| --output-dir outputs_mv \ |
| --checkpoint checkpoints/supermat_mv.pth \ |
| --base-model sd2-community/stable-diffusion-2-1 \ |
| --device cuda:0 \ |
| --image-size 512 \ |
| --num_views 6 \ |
| --use-camera-embeds |
| ``` |
|
|
| #### UV Refinement (Albedo) |
| ``` |
| python inference_uv_refine.py \ |
| --input-uv examples/axe_uv/uv_bc.png \ |
| --input-uv-position examples/axe_uv/uv_position.png \ |
| --input-uv-mask examples/axe_uv/uv_mask.png \ |
| --output-dir outputs_uv_bc \ |
| --checkpoint checkpoints/uv_refine_bc.pth \ |
| --base-model sd2-community/stable-diffusion-2-1 \ |
| --device cuda:0 \ |
| --image-size 1024 |
| ``` |
|
|
| For complete usage instructions, please refer to the [main repository](https://github.com/hyj542682306/SuperMat). |
|
|
| # Citation |
| If you find these models useful in your research, please cite: |
| ``` |
| @inproceedings{hong2025supermat, |
| title={{SuperMat}: Physically Consistent {PBR} Material Estimation at Interactive Rates}, |
| author={Hong, Yijia and Guo, Yuan-Chen and Yi, Ran and Chen, Yulong and Cao, Yan-Pei and Ma, Lizhuang}, |
| booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision}, |
| pages={25083--25093}, |
| year={2025} |
| } |
| ``` |