---
license: mit
pipeline_tag: image-to-image
tags:
- image-to-image
- reflection-removal
- highlight-removal
- computer-vision
- dinov3
- surgical-imaging
---

# UnReflectAnything
[PyPI](https://pypi.org/project/unreflectanything/)
[arXiv](https://arxiv.org/abs/2512.09583)
[Hugging Face Space](https://huggingface.co/spaces/AlbeRota/UnReflectAnything)
[Hugging Face Model](https://huggingface.co/AlbeRota/UnReflectAnything)
[Wiki](https://github.com/alberto-rota/UnReflectAnything/wiki)
[License: MIT](https://mit-license.org/)

UnReflectAnything takes any RGB image, removes its specular highlights, and returns a clean, diffuse-only output. We trained UnReflectAnything by synthesizing specularities and supervising in DINOv3 feature space.

UnReflectAnything works on both natural indoor scenes and surgical/endoscopic data.

---

## Architecture

*(Figure: the UnReflectAnything architecture)*

* **<font color="#a001e0">Encoder</font> ($\mathit{\textcolor{a001e0}{E}}$)**: Processes the input image $\mathbf{I}$ to extract a rich latent representation $\mathbf{F}_\ell$. This is the off-the-shelf pretrained [DINOv3-large](https://huggingface.co/facebook/dinov3-vitl16-pretrain-lvd1689m).

* **<font color="#0167ff">Reflection Predictor</font> ($\mathit{\textcolor{0167ff}{H}}$)**: Predicts a soft highlight mask $\mathbf{H}$, identifying areas of specular highlights.

* **Masking Operation ($\mathit{P}$)**: A binary mask $\mathbf{P}$ is derived from the prediction and applied to the feature map: $(1-\mathbf{P}) \odot \mathbf{F}_\ell$. This removes features contaminated by reflections, leaving "holes" in the data.

* **<font color="#23ac2c">Token Inpainter</font> ($\mathit{\textcolor{23ac2c}{T}}$)**: Acts as a neural inpainter. It processes the masked features and uses the surrounding clean context and a learned mask token to synthesize the missing information in embedding space, producing the completed feature map $\mathbf{F}_{\text{comp}}$.

* **<font color="#ff7700">Decoder</font> ($\mathit{\textcolor{ff7700}{D}}$)**: Projects the completed features back into pixel space to generate the final, reflection-free image $\mathbf{I}_{\text{diff}}$.
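The components above chain as encode → mask → inpaint → decode. A minimal sketch of that data flow, using toy linear layers and illustrative dimensions (not the actual UnReflectAnything modules):

```python
import torch
import torch.nn as nn

class UnReflectSketch(nn.Module):
    """Toy stand-in for the E -> H -> P -> T -> D pipeline described above."""
    def __init__(self, dim: int = 32):
        super().__init__()
        self.encoder = nn.Linear(3, dim)        # stand-in for the frozen DINOv3 encoder E
        self.predictor = nn.Linear(dim, 1)      # reflection predictor H (soft mask)
        self.mask_token = nn.Parameter(torch.zeros(dim))  # learned mask token
        self.inpainter = nn.Linear(dim, dim)    # token inpainter T
        self.decoder = nn.Linear(dim, 3)        # decoder D back to pixel space

    def forward(self, tokens: torch.Tensor):
        f = self.encoder(tokens)                    # F_l: (B, N, dim)
        h = torch.sigmoid(self.predictor(f))        # soft highlight mask H in [0, 1]
        p = (h > 0.5).float()                       # binary mask P
        masked = (1 - p) * f + p * self.mask_token  # (1 - P) ⊙ F_l, holes filled with the mask token
        f_comp = self.inpainter(masked)             # completed features F_comp
        return self.decoder(f_comp), h              # diffuse output I_diff, predicted mask

model = UnReflectSketch()
diffuse, mask = model(torch.rand(2, 16, 3))  # 2 images, 16 patch tokens each
```

The masking step is the key design choice: contaminated tokens are not corrected in place but removed and re-synthesized from clean context, so the decoder only ever sees completed feature maps.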

---

## Training Strategy
We train UnReflectAnything with **Synthetic Specular Supervision**: we infer 3D geometry with [MoGe-2](https://wangrc.site/MoGe2Page/) and render highlights with a Blinn-Phong reflection model. We randomly sample the light source position in 3D space at every training iteration to enhance heterogeneity.

*(Figure: the synthetic specular supervision pipeline)*

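For reference, the specular term of a Blinn-Phong model can be sketched in a few lines of NumPy (the shininess exponent here is an arbitrary illustrative value, not the one used in training):

```python
import numpy as np

def blinn_phong_specular(normal, light_dir, view_dir, shininess=32.0):
    """Blinn-Phong specular intensity (N . H)^s, with H the normalized half-vector."""
    half = np.asarray(light_dir, float) + np.asarray(view_dir, float)
    half /= np.linalg.norm(half)
    n_dot_h = max(float(np.dot(normal, half)), 0.0)
    return n_dot_h ** shininess

n = np.array([0.0, 0.0, 1.0])  # surface normal (from the inferred 3D geometry)
v = np.array([0.0, 0.0, 1.0])  # viewing direction

# Light aligned with the viewer: the half-vector equals the normal -> maximum highlight.
peak = blinn_phong_specular(n, v, v)
# Grazing light: the half-vector tilts away from the normal -> the highlight almost vanishes.
graze = blinn_phong_specular(n, np.array([1.0, 0.0, 0.0]), v)
```

Randomizing the light position per iteration moves these highlights across the surface, which is what produces the diverse synthetic supervision pairs.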
We train the model in two stages:
1. **DPT Decoder Pre-Training**: the **<font color="#ff7700">Decoder</font>** is first pre-trained in an autoencoder configuration ($\min_{\theta} \mathcal{L}(M_{\theta}(\mathbf{I}), \mathbf{I})$) to ensure it can reconstruct realistic RGB textures from the DINOv3 latent space.
2. **End-to-End Refinement**: the full pipeline is then trained to predict reflection masks with $\mathit{\textcolor{0167ff}{H}}$ and fill them with the **<font color="#38761D">Token Inpainter</font>**, ensuring the final output is both visually consistent and physically accurate. The decoder is also fine-tuned at this stage.

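Stage 1 amounts to a standard reconstruction objective through a frozen encoder. A hypothetical sketch of one pre-training step, again with toy linear layers standing in for DINOv3 and the DPT decoder:

```python
import torch
import torch.nn as nn

# Toy stand-ins: a frozen encoder and a trainable decoder (the real pipeline
# uses frozen DINOv3 features and a DPT-style decoder).
encoder = nn.Linear(3, 32)
for param in encoder.parameters():
    param.requires_grad_(False)        # the encoder stays frozen
decoder = nn.Linear(32, 3)             # only the decoder is optimized in stage 1

opt = torch.optim.Adam(decoder.parameters(), lr=1e-3)
image_tokens = torch.rand(4, 16, 3)    # a batch of patch tokens standing in for I

# min_theta L(M_theta(I), I): reconstruct the clean input from its own features.
recon = decoder(encoder(image_tokens))
loss = nn.functional.mse_loss(recon, image_tokens)
opt.zero_grad()
loss.backward()
opt.step()
```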
## Weights
Install the API and CLI in a **Python>=3.11** environment from [PyPI](https://pypi.org/project/unreflectanything/), then run `unreflectanything download --weights` to download the `.pth` weights into the package cache dir, usually at `.cache/unreflectanything`.


---

### Basic Python Usage

Inference runs through `unreflectanything.inference(...)`; refer to the [Wiki](https://github.com/alberto-rota/UnReflectAnything/wiki) for all details on the API endpoints.

---

### CLI Overview

The package provides a comprehensive command-line interface via `ura`, `unreflect`, or `unreflectanything`.

Refer to the [Wiki](https://github.com/alberto-rota/UnReflectAnything/wiki) for all details on the CLI endpoints.

---

## Citation

If you use UnReflectAnything in your research or pipeline, please cite [our paper](https://arxiv.org/abs/2512.09583).

---