Julien Blanchon committed on
Commit
c4db8c0
·
1 Parent(s): 3c3566a
Files changed (1)
  1. README.md +65 -33
README.md CHANGED
@@ -5,7 +5,11 @@ colorFrom: blue
5
  colorTo: green
6
  sdk: gradio
7
  sdk_version: 5.0.0
8
- app_port: 7860
 
 
 
 
9
  pinned: false
10
  ---
11
 
@@ -52,64 +56,77 @@ pinned: false
52
  </div>
53
 
54
  ## Setup
 
55
  1. Create a dedicated Python environment and install the dependencies
56
- ```bash
57
- git clone https://github.com/NYU-ICL/image-gs.git
58
- cd image-gs
59
- conda env create -f environment.yml
60
- conda activate image-gs
61
- pip install git+https://github.com/rahul-goel/fused-ssim/ --no-build-isolation
62
- cd gsplat
63
- pip install -e ".[dev]"
64
- cd ..
65
- ```
66
  2. Download the image and texture datasets from [OneDrive](https://1drv.ms/u/c/3a8968df8a027819/EeshjZJlMtdCmvvmESiN2pABM71EDaoLYmEwuOvecg0tAA?e=GybqBv) and organize the folder structure as follows
67
- ```
68
- image-gs
69
- └── media
70
-     ├── images
71
-     └── textures
72
- ```
73
  3. (Optional) To run saliency-guided Gaussian position initialization, download the pre-trained [EML-Net](https://github.com/SenJia/EML-NET-Saliency) models ([res_imagenet.pth](https://drive.google.com/open?id=1-a494canr9qWKLdm-DUDMgbGwtlAJz71), [res_places.pth](https://drive.google.com/open?id=18nRz0JSRICLqnLQtAvq01azZAsH0SEzS), [res_decoder.pth](https://drive.google.com/open?id=1vwrkz3eX-AMtXQE08oivGMwS4lKB74sH)) and place them under the `models/emlnet/` folder
74
- ```
75
- image-gs
76
- └── models
77
-     └── emlnet
78
-         ├── res_decoder.pth
79
-         ├── res_imagenet.pth
80
-         └── res_places.pth
81
- ```
82
 
83
  ## Quick Start
84
 
85
- #### Image Compression
 
86
  - Optimize an Image-GS representation for an input image `anime-1_2k.png` using `10000` Gaussians with half-precision parameters
 
87
  ```bash
88
  python main.py --input_path="images/anime-1_2k.png" --exp_name="test/anime-1_2k" --num_gaussians=10000 --quantize
89
  ```
 
90
  - Render the corresponding optimized Image-GS representation at a new resolution with height `4000` (aspect ratio is maintained)
 
91
  ```bash
92
  python main.py --input_path="images/anime-1_2k.png" --exp_name="test/anime-1_2k" --num_gaussians=10000 --quantize --eval --render_height=4000
93
  ```
94
 
95
  #### Texture Stack Compression
 
96
  - Optimize an Image-GS representation for an input texture stack `alarm-clock_2k` using `30000` Gaussians with half-precision parameters
 
97
  ```bash
98
  python main.py --input_path="textures/alarm-clock_2k" --exp_name="test/alarm-clock_2k" --num_gaussians=30000 --quantize
99
  ```
 
100
  - Render the corresponding optimized Image-GS representation at a new resolution with height `3000` (aspect ratio is maintained)
 
101
  ```bash
102
  python main.py --input_path="textures/alarm-clock_2k" --exp_name="test/alarm-clock_2k" --num_gaussians=30000 --quantize --eval --render_height=3000
103
  ```
104
 
105
  #### Control bit precision of Gaussian parameters
 
106
  - Optimize an Image-GS representation for an input image `anime-1_2k.png` using `10000` Gaussians with 12-bit-precision parameters
 
107
  ```bash
108
  python main.py --input_path="images/anime-1_2k.png" --exp_name="test/anime-1_2k" --num_gaussians=10000 --quantize --pos_bits=12 --scale_bits 12 --rot_bits 12 --feat_bits 12
109
  ```
110
 
111
- #### Switch to saliency-guided Gaussian position initialization
 
112
  - Optimize an Image-GS representation for an input image `anime-1_2k.png` using `10000` Gaussians with half-precision parameters and saliency-guided initialization
 
113
  ```bash
114
  python main.py --input_path="images/anime-1_2k.png" --exp_name="test/anime-1_2k" --num_gaussians=10000 --quantize --init_mode="saliency"
115
  ```
@@ -121,11 +138,13 @@ We provide a user-friendly web interface built with Gradio for easy experimentat
121
  ### Setup for Web Interface
122
 
123
  1. Install Gradio (in addition to the main dependencies):
 
124
  ```bash
125
- pip install gradio>=4.0.0
126
  ```
127
 
128
  2. Launch the web interface:
 
129
  ```bash
130
  python gradio_app.py
131
  ```
@@ -145,18 +164,20 @@ The Gradio interface provides:
145
 
146
  ### Interface Sections
147
 
148
- 1. **Configuration Panel**:
 
149
  - Basic parameters (number of Gaussians, training steps)
150
  - Quantization settings for memory efficiency
151
  - Initialization modes (gradient, saliency, random)
152
  - Advanced optimization parameters (learning rates, loss weights)
153
 
154
- 2. **Training Progress**:
 
155
  - Real-time streaming logs
156
  - Current render and Gaussian visualization updates
157
  - Training status and control buttons
158
 
159
- 3. **Results Display**:
160
  - Final optimized image
161
  - Gradient and saliency maps used for initialization
162
  - Download capabilities for all results
@@ -170,13 +191,16 @@ The Gradio interface provides:
170
  - For quick tests, reduce **max steps** to 500-1000
171
 
172
  ### Command Line Arguments
 
173
  Please refer to `cfgs/default.yaml` for the full list of arguments and their default values.
174
 
175
  **Post-optimization rendering**
 
176
  - `--eval` render the optimized Image-GS representation.
177
  - `--render_height` image height for rendering (aspect ratio is maintained).
178
 
179
- **Bit precision control**: 32 bits (float32) per dimension by default
 
180
  - `--quantize` enable bit precision control of Gaussian parameters.
181
  - `--pos_bits` bit precision of individual coordinate dimension.
182
  - `--scale_bits` bit precision of individual scale dimension.
@@ -184,18 +208,21 @@ Please refer to `cfgs/default.yaml` for the full list of arguments and their def
184
  - `--feat_bits` bit precision of individual feature dimension.
185
 
186
  **Logging**
 
187
  - `--exp_name` path to the logging directory.
188
  - `--vis_gaussians`: visualize Gaussians during optimization.
189
  - `--save_image_steps` frequency of rendering intermediate results during optimization.
190
  - `--save_ckpt_steps` frequency of checkpointing during optimization.
191
 
192
  **Input image**
 
193
  - `--input_path` path to an image file or a directory containing a texture stack.
194
  - `--downsample` load a downsampled version of the input image or texture stack as the optimization target to evaluate image upsampling performance.
195
  - `--downsample_ratio` downsampling ratio.
196
  - `--gamma` optimize in a gamma-corrected space, modify with caution.
197
 
198
  **Gaussian**
 
199
  - `--num_gaussians` number of Gaussians (for compression rate control).
200
  - `--init_scale` initial Gaussian scale in number of pixels.
201
  - `--disable_topk_norm` disable top-K normalization.
@@ -204,6 +231,7 @@ Please refer to `cfgs/default.yaml` for the full list of arguments and their def
204
  - `--init_random_ratio` ratio of Gaussians with randomly initialized position.
205
 
206
  **Optimization**
 
207
  - `--disable_tiles` disable tile-based rendering (warning: optimization and rendering without tiles will be way slower).
208
  - `--max_steps` maximum number of optimization steps.
209
  - `--pos_lr` Gaussian position learning rate.
@@ -214,13 +242,17 @@ Please refer to `cfgs/default.yaml` for the full list of arguments and their def
214
  - `--disable_prog_optim` disable error-guided progressive optimization.
215
 
216
  ## Acknowledgements
 
217
  We would like to thank the [gsplat](https://github.com/nerfstudio-project/gsplat) team, and the authors of [3DGS](https://github.com/graphdeco-inria/gaussian-splatting), [fused-ssim](https://github.com/rahul-goel/fused-ssim), and [EML-Net](https://github.com/SenJia/EML-NET-Saliency) for their great work, based on which Image-GS was developed.
218
 
219
  ## License
 
220
  This project is licensed under the terms of the MIT license.
221
 
222
  ## Citation
 
223
  If you find this project helpful to your research, please consider citing [BibTeX](assets/docs/image-gs.bib):
 
224
  ```bibtex
225
  @inproceedings{zhang2025image,
226
  title={Image-gs: Content-adaptive image representation via 2d gaussians},
@@ -229,4 +261,4 @@ If you find this project helpful to your research, please consider citing [BibTe
229
  pages={1--11},
230
  year={2025}
231
  }
232
- ```
 
5
  colorTo: green
6
  sdk: gradio
7
  sdk_version: 5.0.0
8
+ python_version: "3.10"
9
+ app_file: gradio_app.py
10
+ suggested_hardware: "cpu-basic"
11
+ models:
12
+ - blanchon/image-gs-models-utils
13
  pinned: false
14
  ---
15
 
 
56
  </div>
57
 
58
  ## Setup
59
+
60
  1. Create a dedicated Python environment and install the dependencies
61
+ ```bash
62
+ git clone https://github.com/NYU-ICL/image-gs.git
63
+ cd image-gs
64
+ conda env create -f environment.yml
65
+ conda activate image-gs
66
+ pip install git+https://github.com/rahul-goel/fused-ssim/ --no-build-isolation
67
+ cd gsplat
68
+ pip install -e ".[dev]"
69
+ cd ..
70
+ ```
71
  2. Download the image and texture datasets from [OneDrive](https://1drv.ms/u/c/3a8968df8a027819/EeshjZJlMtdCmvvmESiN2pABM71EDaoLYmEwuOvecg0tAA?e=GybqBv) and organize the folder structure as follows
72
+ ```
73
+ image-gs
74
+ └── media
75
+     ├── images
76
+     └── textures
77
+ ```
78
  3. (Optional) To run saliency-guided Gaussian position initialization, download the pre-trained [EML-Net](https://github.com/SenJia/EML-NET-Saliency) models ([res_imagenet.pth](https://drive.google.com/open?id=1-a494canr9qWKLdm-DUDMgbGwtlAJz71), [res_places.pth](https://drive.google.com/open?id=18nRz0JSRICLqnLQtAvq01azZAsH0SEzS), [res_decoder.pth](https://drive.google.com/open?id=1vwrkz3eX-AMtXQE08oivGMwS4lKB74sH)) and place them under the `models/emlnet/` folder
79
+ ```
80
+ image-gs
81
+ └── models
82
+     └── emlnet
83
+         ├── res_decoder.pth
84
+         ├── res_imagenet.pth
85
+         └── res_places.pth
86
+ ```
87
 
88
  ## Quick Start
89
 
90
+ #### Image Compression
91
+
92
  - Optimize an Image-GS representation for an input image `anime-1_2k.png` using `10000` Gaussians with half-precision parameters
93
+
94
  ```bash
95
  python main.py --input_path="images/anime-1_2k.png" --exp_name="test/anime-1_2k" --num_gaussians=10000 --quantize
96
  ```
97
+
98
  - Render the corresponding optimized Image-GS representation at a new resolution with height `4000` (aspect ratio is maintained)
99
+
100
  ```bash
101
  python main.py --input_path="images/anime-1_2k.png" --exp_name="test/anime-1_2k" --num_gaussians=10000 --quantize --eval --render_height=4000
102
  ```
103
 
104
  #### Texture Stack Compression
105
+
106
  - Optimize an Image-GS representation for an input texture stack `alarm-clock_2k` using `30000` Gaussians with half-precision parameters
107
+
108
  ```bash
109
  python main.py --input_path="textures/alarm-clock_2k" --exp_name="test/alarm-clock_2k" --num_gaussians=30000 --quantize
110
  ```
111
+
112
  - Render the corresponding optimized Image-GS representation at a new resolution with height `3000` (aspect ratio is maintained)
113
+
114
  ```bash
115
  python main.py --input_path="textures/alarm-clock_2k" --exp_name="test/alarm-clock_2k" --num_gaussians=30000 --quantize --eval --render_height=3000
116
  ```
117
 
118
  #### Control bit precision of Gaussian parameters
119
+
120
  - Optimize an Image-GS representation for an input image `anime-1_2k.png` using `10000` Gaussians with 12-bit-precision parameters
121
+
122
  ```bash
123
  python main.py --input_path="images/anime-1_2k.png" --exp_name="test/anime-1_2k" --num_gaussians=10000 --quantize --pos_bits=12 --scale_bits 12 --rot_bits 12 --feat_bits 12
124
  ```
125
 
126
+ #### Switch to saliency-guided Gaussian position initialization
127
+
128
  - Optimize an Image-GS representation for an input image `anime-1_2k.png` using `10000` Gaussians with half-precision parameters and saliency-guided initialization
129
+
130
  ```bash
131
  python main.py --input_path="images/anime-1_2k.png" --exp_name="test/anime-1_2k" --num_gaussians=10000 --quantize --init_mode="saliency"
132
  ```
 
138
  ### Setup for Web Interface
139
 
140
  1. Install Gradio (in addition to the main dependencies):
141
+
142
  ```bash
143
+ pip install "gradio>=5.0.0"
144
  ```
145
 
146
  2. Launch the web interface:
147
+
148
  ```bash
149
  python gradio_app.py
150
  ```
 
164
 
165
  ### Interface Sections
166
 
167
+ 1. **Configuration Panel**:
168
+
169
  - Basic parameters (number of Gaussians, training steps)
170
  - Quantization settings for memory efficiency
171
  - Initialization modes (gradient, saliency, random)
172
  - Advanced optimization parameters (learning rates, loss weights)
173
 
174
+ 2. **Training Progress**:
175
+
176
  - Real-time streaming logs
177
  - Current render and Gaussian visualization updates
178
  - Training status and control buttons
179
 
180
+ 3. **Results Display**:
181
  - Final optimized image
182
  - Gradient and saliency maps used for initialization
183
  - Download capabilities for all results
 
191
  - For quick tests, reduce **max steps** to 500-1000
192
 
193
  ### Command Line Arguments
194
+
195
  Please refer to `cfgs/default.yaml` for the full list of arguments and their default values.
196
 
197
  **Post-optimization rendering**
198
+
199
  - `--eval` render the optimized Image-GS representation.
200
  - `--render_height` image height for rendering (aspect ratio is maintained).
201
 
202
+ **Bit precision control**: 32 bits (float32) per dimension by default
203
+
204
  - `--quantize` enable bit precision control of Gaussian parameters.
205
  - `--pos_bits` bit precision of individual coordinate dimension.
206
  - `--scale_bits` bit precision of individual scale dimension.
 
208
  - `--feat_bits` bit precision of individual feature dimension.
209
 
210
  **Logging**
211
+
212
  - `--exp_name` path to the logging directory.
213
  - `--vis_gaussians`: visualize Gaussians during optimization.
214
  - `--save_image_steps` frequency of rendering intermediate results during optimization.
215
  - `--save_ckpt_steps` frequency of checkpointing during optimization.
216
 
217
  **Input image**
218
+
219
  - `--input_path` path to an image file or a directory containing a texture stack.
220
  - `--downsample` load a downsampled version of the input image or texture stack as the optimization target to evaluate image upsampling performance.
221
  - `--downsample_ratio` downsampling ratio.
222
  - `--gamma` optimize in a gamma-corrected space, modify with caution.
223
 
224
  **Gaussian**
225
+
226
  - `--num_gaussians` number of Gaussians (for compression rate control).
227
  - `--init_scale` initial Gaussian scale in number of pixels.
228
  - `--disable_topk_norm` disable top-K normalization.
 
231
  - `--init_random_ratio` ratio of Gaussians with randomly initialized position.
232
 
233
  **Optimization**
234
+
235
  - `--disable_tiles` disable tile-based rendering (warning: optimization and rendering without tiles will be way slower).
236
  - `--max_steps` maximum number of optimization steps.
237
  - `--pos_lr` Gaussian position learning rate.
 
242
  - `--disable_prog_optim` disable error-guided progressive optimization.
243
 
244
  ## Acknowledgements
245
+
246
  We would like to thank the [gsplat](https://github.com/nerfstudio-project/gsplat) team, and the authors of [3DGS](https://github.com/graphdeco-inria/gaussian-splatting), [fused-ssim](https://github.com/rahul-goel/fused-ssim), and [EML-Net](https://github.com/SenJia/EML-NET-Saliency) for their great work, based on which Image-GS was developed.
247
 
248
  ## License
249
+
250
  This project is licensed under the terms of the MIT license.
251
 
252
  ## Citation
253
+
254
  If you find this project helpful to your research, please consider citing [BibTeX](assets/docs/image-gs.bib):
255
+
256
  ```bibtex
257
  @inproceedings{zhang2025image,
258
  title={Image-gs: Content-adaptive image representation via 2d gaussians},
 
261
  pages={1--11},
262
  year={2025}
263
  }
264
+ ```
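
The Setup steps in this README expect a specific folder layout (`media/images`, `media/textures`, and optionally three EML-Net weights under `models/emlnet/`). A minimal sketch of a pre-flight check for that layout might look like the following. The function name `check_layout` and its return format are illustrative assumptions and are not part of the Image-GS code base; only the paths come from the README.

```python
from pathlib import Path

# Paths copied from the Setup section of the README.
REQUIRED_DIRS = ("media/images", "media/textures")
# Only needed for saliency-guided initialization (--init_mode="saliency").
OPTIONAL_SALIENCY_FILES = (
    "models/emlnet/res_decoder.pth",
    "models/emlnet/res_imagenet.pth",
    "models/emlnet/res_places.pth",
)


def check_layout(root):
    """Report which expected folders and weight files are absent under root.

    Hypothetical helper: returns a dict with two lists of relative paths.
    """
    root = Path(root)
    missing_dirs = [d for d in REQUIRED_DIRS if not (root / d).is_dir()]
    missing_models = [
        f for f in OPTIONAL_SALIENCY_FILES if not (root / f).is_file()
    ]
    return {
        "missing_dirs": missing_dirs,
        "missing_saliency_models": missing_models,
    }


if __name__ == "__main__":
    report = check_layout(".")
    for name in report["missing_dirs"]:
        print(f"missing dataset folder: {name}")
    for name in report["missing_saliency_models"]:
        print(f"missing optional saliency weight: {name}")
```

Run from the repository root, the script prints one line per missing path and prints nothing when the layout matches the README.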