---
license: mit
language:
- en
base_model:
- sd2-community/stable-diffusion-2-1
pipeline_tag: image-to-image
tags:
- material-decomposition
- diffusion
---

# Model Overview

This repository hosts the pretrained parameters for the SuperMat project, as described in [*"SuperMat: Physically Consistent PBR Material Estimation at Interactive Rates"* (ICCV 2025)](https://arxiv.org/abs/2411.17515).

| Model | Description |
| --- | --- |
| supermat.pth | Base SuperMat model for material decomposition |
| supermat_mv.pth | Multi-view version of SuperMat processing six orthogonal views |
| uv_refine_bc.pth | UV refinement network for albedo materials |
| uv_refine_rm.pth | UV refinement network for roughness & metallic materials |

All models are built upon the base model `stabilityai/stable-diffusion-2-1`.

Note: The official `stabilityai/stable-diffusion-2-1` model has been removed. You may need to obtain the base model parameters through alternative sources, such as `sd2-community/stable-diffusion-2-1`.

# Model Details

#### SuperMat (supermat.pth)

The core model for material decomposition. It takes RGBA images as input and decomposes materials from the target object.

#### SuperMat Multi-View (supermat_mv.pth)

An extended version that processes six orthogonal views simultaneously. This model leverages multi-view consistency for improved material estimation. For each view, the camera-to-world (c2w) matrix is provided as camera embeddings.

#### UV Refinement Networks

Two specialized networks for refining UV maps:

- uv_refine_bc.pth: Refines the UV map for albedo materials
- uv_refine_rm.pth: Refines the UV map for roughness & metallic materials

# Download & Usage

Download the desired model(s) from this repository and place them in the checkpoints folder:

```
checkpoints/
├── supermat.pth
├── supermat_mv.pth
├── uv_refine_bc.pth
└── uv_refine_rm.pth
```

The models are independent of each other, so you only need to download those required for your specific inference task.
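
Because only the checkpoints for a given task are needed, it can help to confirm they are in place before launching inference. The following is a minimal sketch, not part of the SuperMat codebase: the helper name `missing_checkpoints` is hypothetical, and it assumes the `checkpoints/` folder sits in the current working directory.

```python
from pathlib import Path

# Checkpoint file names as listed in this model card.
CHECKPOINTS = {
    "supermat.pth": "single-image material decomposition",
    "supermat_mv.pth": "multi-view material decomposition",
    "uv_refine_bc.pth": "UV refinement (albedo)",
    "uv_refine_rm.pth": "UV refinement (roughness & metallic)",
}


def missing_checkpoints(required, root="checkpoints"):
    """Return the names in `required` that are not files under `root`.

    Hypothetical helper for illustration; `root` is assumed to be the
    checkpoints folder described above.
    """
    root = Path(root)
    return [name for name in required if not (root / name).is_file()]


if __name__ == "__main__":
    # Example: only the single-image model is needed for this task.
    missing = missing_checkpoints(["supermat.pth"])
    if missing:
        for name in missing:
            print(f"Missing {name}: {CHECKPOINTS.get(name, 'unknown model')}")
    else:
        print("All required checkpoints found.")
```

Passing only the file names required for one task mirrors the independence of the models: a missing `supermat_mv.pth` is not reported when only single-image inference is planned.
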
# Input Requirements

#### Image Format

- SuperMat models expect RGBA images where only the target object appears as foreground, with alpha values set to `0` for all other regions
- During inference, the input image is alpha-composited with a gray background `(0.5, 0.5, 0.5)`

#### Resolution Preferences

- SuperMat models: `512×512` resolution (recommended)
- UV refinement networks: `1024×1024` resolution (recommended)

#### Multi-View Specific Requirements

For the multi-view model:

- All inputs for a single case should be organized in one folder
- Input images must follow the naming convention as shown in `examples/bag_rendered_6views`
- Camera information is stored in `meta.json` (refer to the example for the required format with c2w matrices)

# Quick Inference Examples

#### SuperMat Single-Image

```
python inference_supermat.py \
  --input examples/ring_rendered_2views \
  --output-dir outputs \
  --checkpoint checkpoints/supermat.pth \
  --base-model sd2-community/stable-diffusion-2-1 \
  --device cuda:0 \
  --image-size 512
```

#### SuperMat Multi-View

```
python inference_supermat_mv.py \
  --input examples/bag_rendered_6views \
  --output-dir outputs_mv \
  --checkpoint checkpoints/supermat_mv.pth \
  --base-model sd2-community/stable-diffusion-2-1 \
  --device cuda:0 \
  --image-size 512 \
  --num_views 6 \
  --use-camera-embeds
```

#### UV Refinement (Albedo)

```
python inference_uv_refine.py \
  --input-uv examples/axe_uv/uv_bc.png \
  --input-uv-position examples/axe_uv/uv_position.png \
  --input-uv-mask examples/axe_uv/uv_mask.png \
  --output-dir outputs_uv_bc \
  --checkpoint checkpoints/uv_refine_bc.pth \
  --base-model sd2-community/stable-diffusion-2-1 \
  --device cuda:0 \
  --image-size 1024
```

For complete usage instructions, please refer to the [main repository](https://github.com/hyj542682306/SuperMat).
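
To preview what the model sees after the gray-background compositing described under Input Requirements, the step can be sketched with NumPy. This is an illustrative sketch only: the function name `composite_on_gray` is hypothetical, it assumes straight (non-premultiplied) alpha and float values in `[0, 1]`, and the preprocessing inside the inference scripts may differ in detail.

```python
import numpy as np

# Background color stated in this model card.
GRAY = (0.5, 0.5, 0.5)


def composite_on_gray(rgba, background=GRAY):
    """Alpha-composite a float RGBA image of shape (H, W, 4) onto a solid color.

    Assumes straight (non-premultiplied) alpha and values in [0, 1].
    Returns an RGB array of shape (H, W, 3).
    """
    rgba = np.asarray(rgba, dtype=np.float32)
    if rgba.ndim != 3 or rgba.shape[-1] != 4:
        raise ValueError("expected an array of shape (H, W, 4)")
    rgb, alpha = rgba[..., :3], rgba[..., 3:4]
    bg = np.asarray(background, dtype=np.float32)  # broadcasts over H and W
    # Standard "over" operator: foreground weighted by alpha, background by the rest.
    return rgb * alpha + bg * (1.0 - alpha)


if __name__ == "__main__":
    # Tiny 1x2 image: one opaque red pixel and one fully transparent pixel.
    demo = np.array([[[1.0, 0.0, 0.0, 1.0], [1.0, 1.0, 1.0, 0.0]]])
    print(composite_on_gray(demo))
```

An 8-bit RGBA file can be brought into this range by loading it with an image library and dividing by 255 before calling the function. Pixels with alpha `0` come out as pure gray regardless of their stored color, which is why the background regions of the input must carry zero alpha.
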
# Citation

If you find these models useful in your research, please cite:

```
@inproceedings{hong2025supermat,
  title={Supermat: Physically consistent pbr material estimation at interactive rates},
  author={Hong, Yijia and Guo, Yuan-Chen and Yi, Ran and Chen, Yulong and Cao, Yan-Pei and Ma, Lizhuang},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={25083--25093},
  year={2025}
}
```