Julien Blanchon committed
Commit c4db8c0
1 Parent(s): 3c3566a
Update
README.md CHANGED
|
@@ -5,7 +5,11 @@ colorFrom: blue
|
|
| 5 |
colorTo: green
|
| 6 |
sdk: gradio
|
| 7 |
sdk_version: 5.0.0
|
| 8 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
| 9 |
pinned: false
|
| 10 |
---
|
| 11 |
|
|
@@ -52,64 +56,77 @@ pinned: false
|
|
| 52 |
</div>
|
| 53 |
|
| 54 |
## Setup
|
|
|
|
| 55 |
1. Create a dedicated Python environment and install the dependencies
|
| 56 |
-
|
| 57 |
-
|
| 58 |
-
|
| 59 |
-
|
| 60 |
-
|
| 61 |
-
|
| 62 |
-
|
| 63 |
-
|
| 64 |
-
|
| 65 |
-
|
| 66 |
2. Download the image and texture datasets from [OneDrive](https://1drv.ms/u/c/3a8968df8a027819/EeshjZJlMtdCmvvmESiN2pABM71EDaoLYmEwuOvecg0tAA?e=GybqBv) and organize the folder structure as follows
|
| 67 |
-
|
| 68 |
-
|
| 69 |
-
|
| 70 |
-
|
| 71 |
-
|
| 72 |
-
|
| 73 |
3. (Optional) To run saliency-guided Gaussian position initialization, download the pre-trained [EML-Net](https://github.com/SenJia/EML-NET-Saliency) models ([res_imagenet.pth](https://drive.google.com/open?id=1-a494canr9qWKLdm-DUDMgbGwtlAJz71), [res_places.pth](https://drive.google.com/open?id=18nRz0JSRICLqnLQtAvq01azZAsH0SEzS), [res_decoder.pth](https://drive.google.com/open?id=1vwrkz3eX-AMtXQE08oivGMwS4lKB74sH)) and place them under the `models/emlnet/` folder
|
| 74 |
-
|
| 75 |
-
|
| 76 |
-
|
| 77 |
-
|
| 78 |
-
|
| 79 |
-
|
| 80 |
-
|
| 81 |
-
|
| 82 |
|
| 83 |
## Quick Start
|
| 84 |
|
| 85 |
-
#### Image Compression
|
|
|
|
| 86 |
- Optimize an Image-GS representation for an input image `anime-1_2k.png` using `10000` Gaussians with half-precision parameters
|
|
|
|
| 87 |
```bash
|
| 88 |
python main.py --input_path="images/anime-1_2k.png" --exp_name="test/anime-1_2k" --num_gaussians=10000 --quantize
|
| 89 |
```
|
|
|
|
| 90 |
- Render the corresponding optimized Image-GS representation at a new resolution with height `4000` (aspect ratio is maintained)
|
|
|
|
| 91 |
```bash
|
| 92 |
python main.py --input_path="images/anime-1_2k.png" --exp_name="test/anime-1_2k" --num_gaussians=10000 --quantize --eval --render_height=4000
|
| 93 |
```
|
| 94 |
|
| 95 |
#### Texture Stack Compression
|
|
|
|
| 96 |
- Optimize an Image-GS representation for an input texture stack `alarm-clock_2k` using `30000` Gaussians with half-precision parameters
|
|
|
|
| 97 |
```bash
|
| 98 |
python main.py --input_path="textures/alarm-clock_2k" --exp_name="test/alarm-clock_2k" --num_gaussians=30000 --quantize
|
| 99 |
```
|
|
|
|
| 100 |
- Render the corresponding optimized Image-GS representation at a new resolution with height `3000` (aspect ratio is maintained)
|
|
|
|
| 101 |
```bash
|
| 102 |
python main.py --input_path="textures/alarm-clock_2k" --exp_name="test/alarm-clock_2k" --num_gaussians=30000 --quantize --eval --render_height=3000
|
| 103 |
```
|
| 104 |
|
| 105 |
#### Control bit precision of Gaussian parameters
|
|
|
|
| 106 |
- Optimize an Image-GS representation for an input image `anime-1_2k.png` using `10000` Gaussians with 12-bit-precision parameters
|
|
|
|
| 107 |
```bash
|
| 108 |
python main.py --input_path="images/anime-1_2k.png" --exp_name="test/anime-1_2k" --num_gaussians=10000 --quantize --pos_bits=12 --scale_bits 12 --rot_bits 12 --feat_bits 12
|
| 109 |
```
|
| 110 |
|
| 111 |
-
#### Switch to saliency-guided Gaussian position initialization
|
|
|
|
| 112 |
- Optimize an Image-GS representation for an input image `anime-1_2k.png` using `10000` Gaussians with half-precision parameters and saliency-guided initialization
|
|
|
|
| 113 |
```bash
|
| 114 |
python main.py --input_path="images/anime-1_2k.png" --exp_name="test/anime-1_2k" --num_gaussians=10000 --quantize --init_mode="saliency"
|
| 115 |
```
|
|
@@ -121,11 +138,13 @@ We provide a user-friendly web interface built with Gradio for easy experimentat
|
|
| 121 |
### Setup for Web Interface
|
| 122 |
|
| 123 |
1. Install Gradio (in addition to the main dependencies):
|
|
|
|
| 124 |
```bash
|
| 125 |
-
pip install gradio>=
|
| 126 |
```
|
| 127 |
|
| 128 |
2. Launch the web interface:
|
|
|
|
| 129 |
```bash
|
| 130 |
python gradio_app.py
|
| 131 |
```
|
|
@@ -145,18 +164,20 @@ The Gradio interface provides:
|
|
| 145 |
|
| 146 |
### Interface Sections
|
| 147 |
|
| 148 |
-
1. **Configuration Panel**:
|
|
|
|
| 149 |
- Basic parameters (number of Gaussians, training steps)
|
| 150 |
- Quantization settings for memory efficiency
|
| 151 |
- Initialization modes (gradient, saliency, random)
|
| 152 |
- Advanced optimization parameters (learning rates, loss weights)
|
| 153 |
|
| 154 |
-
2. **Training Progress**:
|
|
|
|
| 155 |
- Real-time streaming logs
|
| 156 |
- Current render and Gaussian visualization updates
|
| 157 |
- Training status and control buttons
|
| 158 |
|
| 159 |
-
3. **Results Display**:
|
| 160 |
- Final optimized image
|
| 161 |
- Gradient and saliency maps used for initialization
|
| 162 |
- Download capabilities for all results
|
|
@@ -170,13 +191,16 @@ The Gradio interface provides:
|
|
| 170 |
- For quick tests, reduce **max steps** to 500-1000
|
| 171 |
|
| 172 |
### Command Line Arguments
|
|
|
|
| 173 |
Please refer to `cfgs/default.yaml` for the full list of arguments and their default values.
|
| 174 |
|
| 175 |
**Post-optimization rendering**
|
|
|
|
| 176 |
- `--eval` render the optimized Image-GS representation.
|
| 177 |
- `--render_height` image height for rendering (aspect ratio is maintained).
|
| 178 |
|
| 179 |
-
**Bit precision control**: 32 bits (float32) per dimension by default
|
|
|
|
| 180 |
- `--quantize` enable bit precision control of Gaussian parameters.
|
| 181 |
- `--pos_bits` bit precision of individual coordinate dimension.
|
| 182 |
- `--scale_bits` bit precision of individual scale dimension.
|
|
@@ -184,18 +208,21 @@ Please refer to `cfgs/default.yaml` for the full list of arguments and their def
|
|
| 184 |
- `--feat_bits` bit precision of individual feature dimension.
|
| 185 |
|
| 186 |
**Logging**
|
|
|
|
| 187 |
- `--exp_name` path to the logging directory.
|
| 188 |
- `--vis_gaussians` visualize Gaussians during optimization.
|
| 189 |
- `--save_image_steps` frequency of rendering intermediate results during optimization.
|
| 190 |
- `--save_ckpt_steps` frequency of checkpointing during optimization.
|
| 191 |
|
| 192 |
**Input image**
|
|
|
|
| 193 |
- `--input_path` path to an image file or a directory containing a texture stack.
|
| 194 |
- `--downsample` load a downsampled version of the input image or texture stack as the optimization target to evaluate image upsampling performance.
|
| 195 |
- `--downsample_ratio` downsampling ratio.
|
| 196 |
- `--gamma` optimize in a gamma-corrected space, modify with caution.
|
| 197 |
|
| 198 |
**Gaussian**
|
|
|
|
| 199 |
- `--num_gaussians` number of Gaussians (for compression rate control).
|
| 200 |
- `--init_scale` initial Gaussian scale in number of pixels.
|
| 201 |
- `--disable_topk_norm` disable top-K normalization.
|
|
@@ -204,6 +231,7 @@ Please refer to `cfgs/default.yaml` for the full list of arguments and their def
|
|
| 204 |
- `--init_random_ratio` ratio of Gaussians with randomly initialized position.
|
| 205 |
|
| 206 |
**Optimization**
|
|
|
|
| 207 |
- `--disable_tiles` disable tile-based rendering (warning: optimization and rendering without tiles will be much slower).
|
| 208 |
- `--max_steps` maximum number of optimization steps.
|
| 209 |
- `--pos_lr` Gaussian position learning rate.
|
|
@@ -214,13 +242,17 @@ Please refer to `cfgs/default.yaml` for the full list of arguments and their def
|
|
| 214 |
- `--disable_prog_optim` disable error-guided progressive optimization.
|
| 215 |
|
| 216 |
## Acknowledgements
|
|
|
|
| 217 |
We would like to thank the [gsplat](https://github.com/nerfstudio-project/gsplat) team, and the authors of [3DGS](https://github.com/graphdeco-inria/gaussian-splatting), [fused-ssim](https://github.com/rahul-goel/fused-ssim), and [EML-Net](https://github.com/SenJia/EML-NET-Saliency) for their great work, based on which Image-GS was developed.
|
| 218 |
|
| 219 |
## License
|
|
|
|
| 220 |
This project is licensed under the terms of the MIT license.
|
| 221 |
|
| 222 |
## Citation
|
|
|
|
| 223 |
If you find this project helpful to your research, please consider citing [BibTeX](assets/docs/image-gs.bib):
|
|
|
|
| 224 |
```bibtex
|
| 225 |
@inproceedings{zhang2025image,
|
| 226 |
title={Image-gs: Content-adaptive image representation via 2d gaussians},
|
|
@@ -229,4 +261,4 @@ If you find this project helpful to your research, please consider citing [BibTe
|
|
| 229 |
pages={1--11},
|
| 230 |
year={2025}
|
| 231 |
}
|
| 232 |
-
```
|
|
|
|
| 5 |
colorTo: green
|
| 6 |
sdk: gradio
|
| 7 |
sdk_version: 5.0.0
|
| 8 |
+
python_version: "3.10"
|
| 9 |
+
app_file: gradio_app.py
|
| 10 |
+
suggested_hardware: "cpu-basic"
|
| 11 |
+
models:
|
| 12 |
+
- blanchon/image-gs-models-utils
|
| 13 |
pinned: false
|
| 14 |
---
|
| 15 |
|
|
|
|
| 56 |
</div>
|
| 57 |
|
| 58 |
## Setup
|
| 59 |
+
|
| 60 |
1. Create a dedicated Python environment and install the dependencies
|
| 61 |
+
```bash
|
| 62 |
+
git clone https://github.com/NYU-ICL/image-gs.git
|
| 63 |
+
cd image-gs
|
| 64 |
+
conda env create -f environment.yml
|
| 65 |
+
conda activate image-gs
|
| 66 |
+
pip install git+https://github.com/rahul-goel/fused-ssim/ --no-build-isolation
|
| 67 |
+
cd gsplat
|
| 68 |
+
pip install -e ".[dev]"
|
| 69 |
+
cd ..
|
| 70 |
+
```
|
| 71 |
2. Download the image and texture datasets from [OneDrive](https://1drv.ms/u/c/3a8968df8a027819/EeshjZJlMtdCmvvmESiN2pABM71EDaoLYmEwuOvecg0tAA?e=GybqBv) and organize the folder structure as follows
|
| 72 |
+
```
|
| 73 |
+
image-gs
|
| 74 |
+
└── media
|
| 75 |
+
    ├── images
|
| 76 |
+
    └── textures
|
| 77 |
+
```
|
| 78 |
3. (Optional) To run saliency-guided Gaussian position initialization, download the pre-trained [EML-Net](https://github.com/SenJia/EML-NET-Saliency) models ([res_imagenet.pth](https://drive.google.com/open?id=1-a494canr9qWKLdm-DUDMgbGwtlAJz71), [res_places.pth](https://drive.google.com/open?id=18nRz0JSRICLqnLQtAvq01azZAsH0SEzS), [res_decoder.pth](https://drive.google.com/open?id=1vwrkz3eX-AMtXQE08oivGMwS4lKB74sH)) and place them under the `models/emlnet/` folder
|
| 79 |
+
```
|
| 80 |
+
image-gs
|
| 81 |
+
└── models
|
| 82 |
+
    └── emlnet
|
| 83 |
+
        ├── res_decoder.pth
|
| 84 |
+
        ├── res_imagenet.pth
|
| 85 |
+
        └── res_places.pth
|
| 86 |
+
```
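
Before switching to saliency-guided initialization, it can help to confirm that the three weight files are where the layout above expects them. The following is a minimal sketch, not part of the repository: the helper name `missing_emlnet_weights` is hypothetical, and only the file names and the `models/emlnet/` folder come from the setup step.

```python
from pathlib import Path

# File names taken from the setup step above; the helper itself is hypothetical.
EMLNET_WEIGHTS = ("res_decoder.pth", "res_imagenet.pth", "res_places.pth")


def missing_emlnet_weights(repo_root="."):
    """Return the EML-Net weight files not found under <repo_root>/models/emlnet/."""
    weights_dir = Path(repo_root) / "models" / "emlnet"
    return [name for name in EMLNET_WEIGHTS if not (weights_dir / name).is_file()]


if __name__ == "__main__":
    missing = missing_emlnet_weights()
    if missing:
        print("Missing EML-Net weights:", ", ".join(missing))
    else:
        print("All EML-Net weights found.")
```

Run it from the repository root; an empty result means all three files are in place.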
|
| 87 |
|
| 88 |
## Quick Start
|
| 89 |
|
| 90 |
+
#### Image Compression
|
| 91 |
+
|
| 92 |
- Optimize an Image-GS representation for an input image `anime-1_2k.png` using `10000` Gaussians with half-precision parameters
|
| 93 |
+
|
| 94 |
```bash
|
| 95 |
python main.py --input_path="images/anime-1_2k.png" --exp_name="test/anime-1_2k" --num_gaussians=10000 --quantize
|
| 96 |
```
|
| 97 |
+
|
| 98 |
- Render the corresponding optimized Image-GS representation at a new resolution with height `4000` (aspect ratio is maintained)
|
| 99 |
+
|
| 100 |
```bash
|
| 101 |
python main.py --input_path="images/anime-1_2k.png" --exp_name="test/anime-1_2k" --num_gaussians=10000 --quantize --eval --render_height=4000
|
| 102 |
```
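
Since only the height is given and the aspect ratio is maintained, the output width follows from the source dimensions. The sketch below illustrates that arithmetic only; the function name `render_size`, the rounding to the nearest pixel, and the 2048x1024 source size are assumptions, not taken from the Image-GS code.

```python
def render_size(src_width, src_height, render_height):
    """Scale (width, height) to a new height while keeping the aspect ratio."""
    if src_height <= 0 or render_height <= 0:
        raise ValueError("heights must be positive")
    # Rounding to the nearest pixel is an assumption of this sketch.
    render_width = max(1, round(src_width * render_height / src_height))
    return render_width, render_height


# Hypothetical 2048x1024 source rendered with --render_height=4000
print(render_size(2048, 1024, 4000))  # (8000, 4000)
```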
|
| 103 |
|
| 104 |
#### Texture Stack Compression
|
| 105 |
+
|
| 106 |
- Optimize an Image-GS representation for an input texture stack `alarm-clock_2k` using `30000` Gaussians with half-precision parameters
|
| 107 |
+
|
| 108 |
```bash
|
| 109 |
python main.py --input_path="textures/alarm-clock_2k" --exp_name="test/alarm-clock_2k" --num_gaussians=30000 --quantize
|
| 110 |
```
|
| 111 |
+
|
| 112 |
- Render the corresponding optimized Image-GS representation at a new resolution with height `3000` (aspect ratio is maintained)
|
| 113 |
+
|
| 114 |
```bash
|
| 115 |
python main.py --input_path="textures/alarm-clock_2k" --exp_name="test/alarm-clock_2k" --num_gaussians=30000 --quantize --eval --render_height=3000
|
| 116 |
```
|
| 117 |
|
| 118 |
#### Control bit precision of Gaussian parameters
|
| 119 |
+
|
| 120 |
- Optimize an Image-GS representation for an input image `anime-1_2k.png` using `10000` Gaussians with 12-bit-precision parameters
|
| 121 |
+
|
| 122 |
```bash
|
| 123 |
python main.py --input_path="images/anime-1_2k.png" --exp_name="test/anime-1_2k" --num_gaussians=10000 --quantize --pos_bits=12 --scale_bits 12 --rot_bits 12 --feat_bits 12
|
| 124 |
```
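
To get a feel for what a given bit budget costs in accuracy, here is a generic uniform quantizer. It is a sketch under stated assumptions: the README does not describe how Image-GS quantizes its parameters, so the function `quantize_uniform`, the `[lo, hi]` value range, and the round-to-nearest rule are illustrative only and may differ from the actual scheme.

```python
def quantize_uniform(value, bits, lo=0.0, hi=1.0):
    """Snap value to the nearest of 2**bits evenly spaced levels in [lo, hi]."""
    if bits < 1:
        raise ValueError("bits must be at least 1")
    levels = (1 << bits) - 1            # index of the highest level
    clamped = min(max(value, lo), hi)   # keep the value inside the range
    index = round((clamped - lo) / (hi - lo) * levels)
    return lo + index * (hi - lo) / levels


# Show how the rounding error shrinks as the bit budget grows.
for bits in (4, 8, 12, 16):
    q = quantize_uniform(0.123456, bits)
    print(f"{bits:2d} bits -> {q:.6f} (error {abs(q - 0.123456):.2e})")
```

With this scheme the worst-case error is half a quantization step, so each extra bit roughly halves it.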
|
| 125 |
|
| 126 |
+
#### Switch to saliency-guided Gaussian position initialization
|
| 127 |
+
|
| 128 |
- Optimize an Image-GS representation for an input image `anime-1_2k.png` using `10000` Gaussians with half-precision parameters and saliency-guided initialization
|
| 129 |
+
|
| 130 |
```bash
|
| 131 |
python main.py --input_path="images/anime-1_2k.png" --exp_name="test/anime-1_2k" --num_gaussians=10000 --quantize --init_mode="saliency"
|
| 132 |
```
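
The Quick Start commands share the same handful of flags, so they are easy to generate when scripting a sweep. The helper below is a hypothetical convenience, not part of the repository; it only uses flags that appear in the commands above, and the Gaussian budgets and experiment names in the loop are arbitrary examples.

```python
import shlex


def build_command(input_path, exp_name, num_gaussians, quantize=True,
                  init_mode=None, render_height=None):
    """Assemble a main.py invocation from flags shown in the Quick Start."""
    cmd = ["python", "main.py",
           f"--input_path={input_path}",
           f"--exp_name={exp_name}",
           f"--num_gaussians={num_gaussians}"]
    if quantize:
        cmd.append("--quantize")
    if init_mode is not None:
        cmd.append(f"--init_mode={init_mode}")
    if render_height is not None:
        # Rendering an optimized representation uses --eval with --render_height.
        cmd += ["--eval", f"--render_height={render_height}"]
    return cmd


# Print one command per Gaussian budget; the budgets are arbitrary examples.
for n in (5000, 10000, 20000):
    cmd = build_command("images/anime-1_2k.png", f"test/anime-1_2k_{n}", n)
    print(shlex.join(cmd))
```

The returned list can be passed directly to `subprocess.run` instead of being printed.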
|
|
|
|
| 138 |
### Setup for Web Interface
|
| 139 |
|
| 140 |
1. Install Gradio (in addition to the main dependencies):
|
| 141 |
+
|
| 142 |
```bash
|
| 143 |
+
pip install "gradio>=5.0.0"
|
| 144 |
```
|
| 145 |
|
| 146 |
2. Launch the web interface:
|
| 147 |
+
|
| 148 |
```bash
|
| 149 |
python gradio_app.py
|
| 150 |
```
|
|
|
|
| 164 |
|
| 165 |
### Interface Sections
|
| 166 |
|
| 167 |
+
1. **Configuration Panel**:
|
| 168 |
+
|
| 169 |
- Basic parameters (number of Gaussians, training steps)
|
| 170 |
- Quantization settings for memory efficiency
|
| 171 |
- Initialization modes (gradient, saliency, random)
|
| 172 |
- Advanced optimization parameters (learning rates, loss weights)
|
| 173 |
|
| 174 |
+
2. **Training Progress**:
|
| 175 |
+
|
| 176 |
- Real-time streaming logs
|
| 177 |
- Current render and Gaussian visualization updates
|
| 178 |
- Training status and control buttons
|
| 179 |
|
| 180 |
+
3. **Results Display**:
|
| 181 |
- Final optimized image
|
| 182 |
- Gradient and saliency maps used for initialization
|
| 183 |
- Download capabilities for all results
|
|
|
|
| 191 |
- For quick tests, reduce **max steps** to 500-1000
|
| 192 |
|
| 193 |
### Command Line Arguments
|
| 194 |
+
|
| 195 |
Please refer to `cfgs/default.yaml` for the full list of arguments and their default values.
|
| 196 |
|
| 197 |
**Post-optimization rendering**
|
| 198 |
+
|
| 199 |
- `--eval` render the optimized Image-GS representation.
|
| 200 |
- `--render_height` image height for rendering (aspect ratio is maintained).
|
| 201 |
|
| 202 |
+
**Bit precision control**: 32 bits (float32) per dimension by default
|
| 203 |
+
|
| 204 |
- `--quantize` enable bit precision control of Gaussian parameters.
|
| 205 |
- `--pos_bits` bit precision of individual coordinate dimension.
|
| 206 |
- `--scale_bits` bit precision of individual scale dimension.
|
|
|
|
| 208 |
- `--feat_bits` bit precision of individual feature dimension.
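
As a rough guide to how the bit-precision flags interact with `--num_gaussians`, the parameter storage can be estimated as the number of Gaussians times the bits spent per Gaussian. This is a back-of-the-envelope sketch: the per-Gaussian layout in `DIMS` (2 position, 2 scale, 1 rotation, 3 feature dimensions) is an assumption, not taken from this README, and any file-format overhead is ignored.

```python
def estimate_size_bytes(num_gaussians, dims, bits):
    """Estimate parameter storage: sum over groups of dims * bits, per Gaussian."""
    bits_per_gaussian = sum(dims[name] * bits[name] for name in dims)
    return num_gaussians * bits_per_gaussian / 8


# Assumed per-Gaussian layout (not taken from the README): 2D position,
# 2D scale, 1 rotation angle, 3 feature (color) channels.
DIMS = {"pos": 2, "scale": 2, "rot": 1, "feat": 3}

for b in (32, 16, 12):
    size = estimate_size_bytes(10000, DIMS, {name: b for name in DIMS})
    print(f"{b} bits per dimension: {size / 1024:.1f} KiB")
```

Under these assumptions, going from 32 to 12 bits per dimension cuts the estimate from 320,000 to 120,000 bytes for 10,000 Gaussians.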
|
| 209 |
|
| 210 |
**Logging**
|
| 211 |
+
|
| 212 |
- `--exp_name` path to the logging directory.
|
| 213 |
- `--vis_gaussians` visualize Gaussians during optimization.
|
| 214 |
- `--save_image_steps` frequency of rendering intermediate results during optimization.
|
| 215 |
- `--save_ckpt_steps` frequency of checkpointing during optimization.
|
| 216 |
|
| 217 |
**Input image**
|
| 218 |
+
|
| 219 |
- `--input_path` path to an image file or a directory containing a texture stack.
|
| 220 |
- `--downsample` load a downsampled version of the input image or texture stack as the optimization target to evaluate image upsampling performance.
|
| 221 |
- `--downsample_ratio` downsampling ratio.
|
| 222 |
- `--gamma` optimize in a gamma-corrected space, modify with caution.
|
| 223 |
|
| 224 |
**Gaussian**
|
| 225 |
+
|
| 226 |
- `--num_gaussians` number of Gaussians (for compression rate control).
|
| 227 |
- `--init_scale` initial Gaussian scale in number of pixels.
|
| 228 |
- `--disable_topk_norm` disable top-K normalization.
|
|
|
|
| 231 |
- `--init_random_ratio` ratio of Gaussians with randomly initialized position.
|
| 232 |
|
| 233 |
**Optimization**
|
| 234 |
+
|
| 235 |
- `--disable_tiles` disable tile-based rendering (warning: optimization and rendering without tiles will be much slower).
|
| 236 |
- `--max_steps` maximum number of optimization steps.
|
| 237 |
- `--pos_lr` Gaussian position learning rate.
|
|
|
|
| 242 |
- `--disable_prog_optim` disable error-guided progressive optimization.
|
| 243 |
|
| 244 |
## Acknowledgements
|
| 245 |
+
|
| 246 |
We would like to thank the [gsplat](https://github.com/nerfstudio-project/gsplat) team, and the authors of [3DGS](https://github.com/graphdeco-inria/gaussian-splatting), [fused-ssim](https://github.com/rahul-goel/fused-ssim), and [EML-Net](https://github.com/SenJia/EML-NET-Saliency) for their great work, based on which Image-GS was developed.
|
| 247 |
|
| 248 |
## License
|
| 249 |
+
|
| 250 |
This project is licensed under the terms of the MIT license.
|
| 251 |
|
| 252 |
## Citation
|
| 253 |
+
|
| 254 |
If you find this project helpful to your research, please consider citing [BibTeX](assets/docs/image-gs.bib):
|
| 255 |
+
|
| 256 |
```bibtex
|
| 257 |
@inproceedings{zhang2025image,
|
| 258 |
title={Image-gs: Content-adaptive image representation via 2d gaussians},
|
|
|
|
| 261 |
pages={1--11},
|
| 262 |
year={2025}
|
| 263 |
}
|
| 264 |
+
```
|