AlbeRota committed on
Commit
6298130
·
verified ·
1 Parent(s): 10a2918

Upload weights, notebooks, sample images

Files changed (1)
  1. README.md +31 -16
README.md CHANGED
@@ -1,13 +1,12 @@
  ---
- tags:
- - image-to-image
- - reflection-removal
- - highlight-removal
- - computer-vision
- - dinov3
- - surgical-imaging
  license: mit
- pipeline_tag: image-to-image
  ---

  # UnReflectAnything
@@ -16,30 +15,43 @@ pipeline_tag: image-to-image
  [![PyPI](https://img.shields.io/pypi/v/unreflectanything?color=76b1f3&label=pip%20install&logo=python&logoColor=76b1f3)](https://pypi.org/project/unreflectanything/)
  [![Paper](https://img.shields.io/badge/Paper-arXiv-B31B1B?logo=arxiv&logoColor=B31B1B)](https://arxiv.org/abs/2512.09583)
  [![Demo](https://img.shields.io/badge/Demo-HF%20-FFD21E?logo=huggingface&logoColor=FFD21E)](https://huggingface.co/spaces/AlbeRota/UnReflectAnything)
  [![Wiki](https://img.shields.io/badge/API-Wiki-9187FF?logo=wikipedia&logoColor=9187FF)](https://github.com/alberto-rota/UnReflectAnything/wiki)
  [![Licence](https://img.shields.io/badge/MIT-License-1E811F)](https://mit-license.org/)

  UnReflectAnything takes any RGB image and removes specular highlights, returning a clean, diffuse-only output. We trained UnReflectAnything by synthesizing specularities and supervising in DINOv3 feature space.

  UnReflectAnything works on both natural indoor and surgical/endoscopic domain data.

  ## Architecture
  ![Architecture](https://raw.githubusercontent.com/alberto-rota/UnReflectAnything/refs/heads/main/assets/architecture.png)

- * **<font color="#a001e0">Encoder E</font>**: Processes the input image to extract a rich latent representation. This is the off-the-shelf pretrained [DINOv3-large](https://huggingface.co/facebook/dinov3-vitl16-pretrain-lvd1689m).
- * **<font color="#0167ff">Reflection Predictor H</font>**: Predicts a soft highlight mask (**H**), identifying areas of specular highlights.
- * **Masking Operation**: A binary mask **P** is derived from the prediction and applied to the feature map. This removes features contaminated by reflections, leaving "holes" in the data.
- * **<font color="#23ac2c">Token Inpainter T</font>**: Acts as a neural inpainter. It processes the masked features and uses the surrounding clean context prior and a learned mask token to synthesize the missing information in embedding space, producing the completed feature map $\mathbf{F}_{\text{comp}}$.
- * **<font color="#ff7700">Decoder D</font>**: Projects the completed features back into pixel space to generate the final, reflection-free image.

  ---

  ## Training Strategy
  We train UnReflectAnything with **Synthetic Specular Supervision** by inferring 3D geometry from [MoGe-2](https://wangrc.site/MoGe2Page/) and rendering highlights with a Blinn-Phong reflection model. We randomly sample the light source position in 3D space at every training iteration to enhance heterogeneity.

  ![SupervisionExamples](https://raw.githubusercontent.com/alberto-rota/UnReflectAnything/refs/heads/main/assets/highlights.png)

  We train the model in two stages:
- 1. **DPT Decoder Pre-Training**: The **<font color="#ff7700">Decoder</font>** is first pre-trained in an autoencoder configuration to ensure it can reconstruct realistic RGB textures from the DINOv3 latent space.
- 2. **End-to-End Refinement**: The full pipeline is then trained to predict reflection masks and fill them using the **<font color="#23ac2c">Token Inpainter</font>**, ensuring the final output is both visually consistent and physically accurate. We use Synthetic Specular Supervision to generate ground-truth signals in feature space. The decoder is also fine-tuned at this stage.

  ## Weights
  Install the API and CLI on a **Python>=3.11** environment with
@@ -53,6 +65,7 @@ unreflectanything download --weights
  to download the `.pth` weights into the package cache dir. The cache dir is usually at `.cache/unreflectanything`.

  ---

  ### Basic Python Usage

  ```python
@@ -72,6 +85,7 @@ unreflectanything.inference("input_with_highlights.png", output="diffuse_result.
  Refer to the [Wiki](https://github.com/alberto-rota/UnReflectAnything/wiki) for all details on the API endpoints.

  ---

  ### CLI Overview

  The package provides a comprehensive command-line interface via `ura`, `unreflect`, or `unreflectanything`.
@@ -83,6 +97,7 @@ The package provides a comprehensive command-line interface via `ura`, `unreflec
  Refer to the [Wiki](https://github.com/alberto-rota/UnReflectAnything/wiki) for all details on the CLI endpoints.

  ---

  ## Citation

  If you use UnReflectAnything in your research or pipeline, please cite our paper:
@@ -100,4 +115,4 @@ If you use UnReflectAnything in your research or pipeline, please cite our paper

  ```

- ---
 
  ---
  license: mit
+ tags:
+ - image-to-image
+ - reflection-removal
+ - highlight-removal
+ - computer-vision
+ - dinov3
+ - surgical-imaging
  ---

  # UnReflectAnything
 
  [![PyPI](https://img.shields.io/pypi/v/unreflectanything?color=76b1f3&label=pip%20install&logo=python&logoColor=76b1f3)](https://pypi.org/project/unreflectanything/)
  [![Paper](https://img.shields.io/badge/Paper-arXiv-B31B1B?logo=arxiv&logoColor=B31B1B)](https://arxiv.org/abs/2512.09583)
  [![Demo](https://img.shields.io/badge/Demo-HF%20-FFD21E?logo=huggingface&logoColor=FFD21E)](https://huggingface.co/spaces/AlbeRota/UnReflectAnything)
+ [![Modelcard](https://img.shields.io/badge/Model%20Card-HF%20-FFD21E?logo=huggingface&logoColor=FFD21E)](https://huggingface.co/AlbeRota/UnReflectAnything)
  [![Wiki](https://img.shields.io/badge/API-Wiki-9187FF?logo=wikipedia&logoColor=9187FF)](https://github.com/alberto-rota/UnReflectAnything/wiki)
  [![Licence](https://img.shields.io/badge/MIT-License-1E811F)](https://mit-license.org/)

  UnReflectAnything takes any RGB image and removes specular highlights, returning a clean, diffuse-only output. We trained UnReflectAnything by synthesizing specularities and supervising in DINOv3 feature space.
+
  UnReflectAnything works on both natural indoor and surgical/endoscopic domain data.

+ ---
+
  ## Architecture
  ![Architecture](https://raw.githubusercontent.com/alberto-rota/UnReflectAnything/refs/heads/main/assets/architecture.png)

+ * **<font color="#a001e0">Encoder</font> ($\mathit{\textcolor{#a001e0}{E}}$)**: Processes the input image $\mathbf{I}$ to extract a rich latent representation, $\mathbf{F}_\ell$. This is the off-the-shelf pretrained [DINOv3-large](https://huggingface.co/facebook/dinov3-vitl16-pretrain-lvd1689m).
+
+ * **<font color="#0167ff">Reflection Predictor</font> ($\mathit{\textcolor{#0167ff}{H}}$)**: Predicts a soft highlight mask (**H**), identifying areas of specular highlights.
+
+ * **Masking Operation ($\mathit{P}$)**: A binary mask **P** is derived from the prediction and applied to the feature map: $(1-\mathbf{P}) \odot \mathbf{F}_\ell$. This removes features contaminated by reflections, leaving "holes" in the data.
+
+ * **<font color="#23ac2c">Token Inpainter</font> ($\mathit{\textcolor{#23ac2c}{T}}$)**: Acts as a neural inpainter. It processes the masked features and uses the surrounding clean context prior and a learned mask token to synthesize the missing information in embedding space, producing the completed feature map $\mathbf{F}_{\text{comp}}$.
+
+ * **<font color="#ff7700">Decoder</font> ($\mathit{\textcolor{#ff7700}{D}}$)**: Projects the completed features back into pixel space to generate the final, reflection-free image $\mathbf{I}_{\text{diff}}$.
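The masking-and-inpainting flow above can be sketched in a few lines. This is a toy illustration of the data flow only, not the actual model: the real Token Inpainter is a learned network over 1024-dim DINOv3 tokens, whereas here we use length-2 lists and fill holes with the mean clean token.

```python
# Toy sketch of the masking + inpainting data flow (not the real model).
F = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]   # F_l: encoder tokens
H = [0.1, 0.9, 0.2, 0.8]                                # soft highlight mask from H
P = [1 if h > 0.5 else 0 for h in H]                    # binarized mask P

mask_token = [0.0, 0.0]                                 # stand-in learned mask token
F_masked = [mask_token if p else f for f, p in zip(F, P)]

# Stand-in for the Token Inpainter T: fill holes with the mean clean token.
clean = [f for f, p in zip(F, P) if p == 0]
mean_clean = [sum(c[i] for c in clean) / len(clean) for i in range(2)]
F_comp = [mean_clean if p else f for f, p in zip(F_masked, P)]

print(F_comp)  # → [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [3.0, 4.0]]
```

Unmasked tokens pass through untouched; only the "holes" marked by **P** are resynthesized, which is why the surrounding clean context dominates the completion.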
 
  ---

  ## Training Strategy
  We train UnReflectAnything with **Synthetic Specular Supervision** by inferring 3D geometry from [MoGe-2](https://wangrc.site/MoGe2Page/) and rendering highlights with a Blinn-Phong reflection model. We randomly sample the light source position in 3D space at every training iteration to enhance heterogeneity.

  ![SupervisionExamples](https://raw.githubusercontent.com/alberto-rota/UnReflectAnything/refs/heads/main/assets/highlights.png)

  We train the model in two stages:
+ 1. **DPT Decoder Pre-Training**: The **<font color="#ff7700">Decoder</font>** is first pre-trained in an autoencoder configuration ($\min_{\theta} \mathcal{L}(M_{\theta}(\mathbf{I}), \mathbf{I})$) to ensure it can reconstruct realistic RGB textures from the DINOv3 latent space.
+ 2. **End-to-End Refinement**: The full pipeline is then trained to predict reflection masks from $\mathit{\textcolor{#0167ff}{H}}$ and fill them using the **<font color="#23ac2c">Token Inpainter</font>**, ensuring the final output is both visually consistent and physically accurate. The decoder is also fine-tuned at this stage.
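The Blinn-Phong specular term behind this supervision can be sketched as follows, with a light position resampled per iteration as described above. This is an illustrative single-point version; the parameter names (`k_s`, `shininess`) and values are assumptions, not the paper's rendering code.

```python
# Hedged sketch of a per-point Blinn-Phong specular term with a randomly
# sampled light position (illustrative; not the project's actual renderer).
import math
import random

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def blinn_phong_specular(normal, point, light_pos, view_pos, k_s=1.0, shininess=32):
    L = normalize([l - p for l, p in zip(light_pos, point)])   # direction to light
    V = normalize([c - p for c, p in zip(view_pos, point)])    # direction to camera
    H = normalize([l + v for l, v in zip(L, V)])               # half vector
    return k_s * max(0.0, sum(n * h for n, h in zip(normal, H))) ** shininess

random.seed(0)
# Resample the light position in 3D, as done at every training iteration
light = [random.uniform(-1, 1), random.uniform(-1, 1), random.uniform(1, 2)]
spec = blinn_phong_specular(normal=[0, 0, 1], point=[0, 0, 0],
                            view_pos=[0, 0, 1], light_pos=light)
print(round(spec, 4))  # specular intensity in [0, 1]
```

The high `shininess` exponent concentrates the term into tight, bright lobes, which is what makes the synthesized highlights look specular rather than diffuse.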
 
  ## Weights
  Install the API and CLI on a **Python>=3.11** environment with

  to download the `.pth` weights into the package cache dir. The cache dir is usually at `.cache/unreflectanything`.
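If helpful, the presence of downloaded weights can be checked from Python. This is a convenience sketch that assumes the cache directory above sits under the user's home, which may differ on your system.

```python
# Hedged check for downloaded weights; the `.cache/unreflectanything`
# location is from the text above, the home-dir prefix is an assumption.
from pathlib import Path

cache_dir = Path.home() / ".cache" / "unreflectanything"
weights = sorted(cache_dir.glob("*.pth"))   # empty if nothing downloaded yet
print(f"found {len(weights)} weight file(s) in {cache_dir}")
```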
 
  ---

  ### Basic Python Usage

  ```python
 
  Refer to the [Wiki](https://github.com/alberto-rota/UnReflectAnything/wiki) for all details on the API endpoints.

  ---

  ### CLI Overview

  The package provides a comprehensive command-line interface via `ura`, `unreflect`, or `unreflectanything`.
 
  Refer to the [Wiki](https://github.com/alberto-rota/UnReflectAnything/wiki) for all details on the CLI endpoints.

  ---

  ## Citation

  If you use UnReflectAnything in your research or pipeline, please cite our paper:

  ```

+ ---