AlbeRota committed · verified
Commit 105104a · 1 Parent(s): 5081b32

Update README.md

Files changed (1):
  1. README.md +6 -21

README.md CHANGED
@@ -1,5 +1,4 @@
  ---
- license: mit
  tags:
  - image-to-image
  - reflection-removal
@@ -7,8 +6,6 @@ tags:
  - computer-vision
  - dinov3
  - surgical-imaging
- language:
- - en
  base_model:
  - facebook/dinov3-vitl16-pretrain-lvd1689m
  - Ruicheng/moge-2-vitl-normal
@@ -24,35 +21,26 @@ base_model:
  [![Licence](https://img.shields.io/badge/MIT-License-1E811F)](https://mit-license.org/)

  UnReflectAnything takes any RGB image as input and removes specular highlights, returning a clean diffuse-only output. We trained UnReflectAnything by synthesizing specularities and supervising in DINOv3 feature space.
-
  UnReflectAnything works on both natural indoor and surgical/endoscopic domain data.

- ---
-
  ## Architecture
  ![Architecture](https://raw.githubusercontent.com/alberto-rota/UnReflectAnything/refs/heads/main/assets/architecture.png)

- * **<font color="#a001e0">Encoder E <font>**: Processes the input image $\mathbf{I}$ to extract a rich latent representation, $\mathbf{F}_\ell$. This is the off-the-shelf pretrained [DINOv3-large](https://huggingface.co/facebook/dinov3-vitl16-pretrain-lvd1689m)
- * **<font color="#0167ff">Reflection Predictor H</font> **: Predicts a soft highlight mask (**H**), identifying areas of specular highlights.
-
- * **Masking Operation</font>**: A binary mask **P** is derived from the prediction and applied to the feature map: This removes features contaminated by reflections, leaving "holes" in the data.
-
+ * **<font color="#a001e0">Encoder E </font>**: Processes the input image to extract a rich latent representation. This is the off-the-shelf pretrained [DINOv3-large](https://huggingface.co/facebook/dinov3-vitl16-pretrain-lvd1689m).
+ * **<font color="#0167ff">Reflection Predictor H</font>**: Predicts a soft highlight mask (**H**), identifying areas of specular highlights.
+ * **Masking Operation**: A binary mask **P** is derived from the prediction and applied to the feature map: this removes features contaminated by reflections, leaving "holes" in the data.
  * **<font color="#23ac2c">Token Inpainter T</font>**: Acts as a neural in-painter. It processes the masked features and uses the surrounding clean context prior and a learned mask token to synthesize the missing information in embedding space, producing the completed feature map $\mathbf{F}_{\text{comp}}$.
-
- * **<font color="#ff7700">Decoder D </font> **: Project the completed features back into the pixel space to generate the final, reflection-free image $\mathbf{I}_{\text{diff}}$.
+ * **<font color="#ff7700">Decoder D </font>**: Projects the completed features back into pixel space to generate the final, reflection-free image.

  ---
-
  ## Training Strategy
  We train UnReflectAnything with **Synthetic Specular Supervision** by inferring 3D geometry from [MoGe-2](https://wangrc.site/MoGe2Page/) and rendering highlights with a Blinn-Phong reflection model. We randomly sample the light source position in 3D space at every training iteration to enhance heterogeneity.

  ![SupervisionExamples](https://raw.githubusercontent.com/alberto-rota/UnReflectAnything/refs/heads/main/assets/highlights.png)

  We train the model in two stages:
- 1. **DPT Decoder Pre-Training**: The **<font color="#ff7700">Decoder</font>** is first pre-trained in an autoencoder configuration ($\min_{\theta} \mathcal{L}(M_{\theta}(\mathbf{I}), \mathbf{I})$) to ensure it can reconstruct realistic RGB textures from the DINOV3 latent space.
- 2. **End-to-End Refinement**: The full pipeline is then trained to predict reflection masks from $\mathit{\textcolor{0167ff}{H}}$, and fill them using the **<font color="#38761D">Token Inpainter</font>**, ensuring the final output is both visually consistent and physically accurate. The decoder is also fine-tuned at this stage
-
-
+ 1. **DPT Decoder Pre-Training**: The **<font color="#ff7700">Decoder</font>** is first pre-trained in an autoencoder configuration to ensure it can reconstruct realistic RGB textures from the DINOv3 latent space.
+ 2. **End-to-End Refinement**: The full pipeline is then trained to predict reflection masks and fill them using the **<font color="#38761D">Token Inpainter</font>**, ensuring the final output is both visually consistent and physically accurate. We use Synthetic Specular Supervision to generate ground-truth signals in feature space. The decoder is also fine-tuned at this stage.

  ## Weights
  Install the API and CLI on a **Python>=3.11** environment with
@@ -66,7 +54,6 @@ unreflectanything download --weights
  to download the `.pth` weights in the package cache dir. The cache dir is usually at `.cache/unreflectanything`

  ---
-
  ### Basic Python Usage

  ```python
@@ -86,7 +73,6 @@ unreflectanything.inference("input_with_highlights.png", output="diffuse_result.
  Refer to the [Wiki](https://github.com/alberto-rota/UnReflectAnything/wiki) for all details on the API endpoints

  ---
-
  ### CLI Overview

  The package provides a comprehensive command-line interface via `ura`, `unreflect`, or `unreflectanything`.
@@ -98,7 +84,6 @@ The package provides a comprehensive command-line interface via `ura`, `unreflec
  Refer to the [Wiki](https://github.com/alberto-rota/UnReflectAnything/wiki) for all details on the CLI endpoints

  ---
-
  ## Citation

  If you use UnReflectAnything in your research or pipeline, please cite our paper:
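The masking-and-inpainting step described in the Architecture bullets can be sketched in a few lines. This is a toy illustration, not the model's code: the threshold value, the zero mask token, and the mean-of-clean-tokens "inpainter" standing in for the Token Inpainter transformer are all assumptions.

```python
import numpy as np

def binarize_mask(soft_mask, threshold=0.5):
    """Derive a binary mask P from the soft highlight prediction H."""
    return soft_mask > threshold

def mask_and_inpaint(features, soft_mask, mask_token, threshold=0.5):
    """Punch out reflection-contaminated tokens, then fill the holes.
    The mean of the clean tokens stands in for the learned Token Inpainter."""
    P = binarize_mask(soft_mask, threshold)       # True = contaminated token
    masked = features.copy()
    masked[P] = mask_token                        # holes carry the mask token
    completed = masked.copy()
    completed[P] = features[~P].mean(axis=0)      # toy "inpainting" from clean context
    return completed

F = np.arange(16 * 8, dtype=float).reshape(16, 8)   # 16 tokens, 8-dim features
H = np.linspace(0.0, 1.0, 16)                        # soft highlight scores per token
F_comp = mask_and_inpaint(F, H, mask_token=np.zeros(8))
```

The real inpainter uses attention over the surviving clean tokens rather than a global mean, but the data flow — soft mask, binarization, hole-punching, feature completion — is the same.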
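The Blinn-Phong model named in the Training Strategy computes a per-pixel specular term from the surface normal and the half-vector between the light and view directions. A minimal sketch follows; the function names, shininess value, specular coefficient, and additive compositing are illustrative assumptions, and the repo's renderer may differ.

```python
import numpy as np

def blinn_phong_specular(normal, light_dir, view_dir, shininess=32.0):
    """Per-pixel Blinn-Phong specular intensity from unit vectors of shape (..., 3)."""
    half = light_dir + view_dir
    half = half / np.linalg.norm(half, axis=-1, keepdims=True)   # half-vector
    n_dot_h = np.clip(np.sum(normal * half, axis=-1), 0.0, 1.0)
    return n_dot_h ** shininess

def render_highlights(diffuse, normals, points, light_pos, view_dir, k_s=0.8):
    """Composite a synthetic specular layer over a diffuse image.
    `points` are per-pixel 3D positions (the role MoGe-2 geometry plays here);
    `view_dir` is assumed unit-length."""
    light_dir = light_pos - points
    light_dir = light_dir / np.linalg.norm(light_dir, axis=-1, keepdims=True)
    spec = blinn_phong_specular(normals, light_dir, view_dir)
    return np.clip(diffuse + k_s * spec[..., None], 0.0, 1.0)
```

Resampling `light_pos` each iteration, as the README describes, moves the highlight lobe across the surface, so the network sees a different corruption of the same diffuse image every time.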
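Stage 1 of training amounts to an autoencoder objective: with the encoder frozen, fit the decoder so that decoding the features reproduces the input. A toy linear version trained by gradient descent (purely illustrative; the real pipeline uses DINOv3 features and a DPT decoder, and every dimension here is made up):

```python
import numpy as np

rng = np.random.default_rng(0)
E = rng.normal(size=(8, 4)) / np.sqrt(8)   # frozen toy "encoder": 8-dim signal -> 4-dim latent
D = np.zeros((4, 8))                        # trainable toy "decoder"
X = rng.normal(size=(64, 8))                # batch of toy flattened "images"

Z = X @ E                                   # features from the frozen encoder
init_loss = np.mean((Z @ D - X) ** 2)

lr = 0.1
for _ in range(500):
    err = Z @ D - X                         # gradient of the L2 reconstruction loss
    D -= lr * (Z.T @ err) / len(X)

final_loss = np.mean((Z @ D - X) ** 2)
```

Only `D` is updated, mirroring how stage 1 touches the decoder alone; stage 2 then unfreezes the rest of the pipeline and fine-tunes the decoder jointly.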