Commit 4452342 (verified), committed by lhmd
1 Parent(s): c9d8efd

Update README.md

Files changed (1): README.md +70 -3
@@ -1,3 +1,70 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ pipeline_tag: image-to-3d
+ ---
+
+ <p align="center">
+ <h1 align="center">VolSplat: Rethinking Feed-Forward 3D Gaussian Splatting with Voxel-Aligned Prediction</h1>
+ <p align="center">
+ <a href="https://lhmd.top">Weijie Wang*</a>
+
+ <a href="https://github.com/Alresd">Yeqing Chen*</a>
+
+ <a href="https://steve-zeyu-zhang.github.io">Zeyu Zhang</a>
+
+ <a href="https://liuhengyu321.github.io">Hengyu Liu</a>
+
+ <a href="https://wang-haoxiao.github.io">Haoxiao Wang</a>
+
+ <a href="https://scholar.google.com/citations?user=4HaLG0oAAAAJ">Zhiyuan Feng</a>
+
+ <a href="https://scholar.google.com/citations?user=TE9stNgAAAAJ">Wenkang Qin</a>
+
+ <a href="http://www.zhengzhu.net/">Zheng Zhu</a>
+
+ <a href="https://donydchen.github.io">Donny Y. Chen</a>
+
+ <a href="https://bohanzhuang.github.io">Bohan Zhuang</a>
+ </p>
+ <h3 align="center"><a href="https://arxiv.org/abs/2505.23734">Paper</a> | <a href="https://lhmd.top/volsplat">Project Page</a> | <a href="https://github.com/ziplab/VolSplat">Code</a> | <a href="https://huggingface.co/lhmd/VolSplat">Models</a> </h3>
+ <div align="center"></div>
+ </p>
+
+
+ <p align="center">
+ <a href="">
+ <img src="https://lhmd.top/volsplat/assets/teaser_horizontal.jpg" alt="Logo" width="100%">
+ </a>
+ </p>
+ Pixel-aligned feed-forward 3DGS methods suffer from two primary limitations: 1) 2D feature matching struggles to effectively resolve the multi-view alignment problem, and 2) the Gaussian density is constrained and cannot be adaptively controlled according to scene complexity. We propose VolSplat, a method that directly regresses Gaussians from 3D features based on a voxel-aligned prediction strategy. This approach achieves adaptive control over scene complexity and resolves the multi-view alignment challenge.
+
+ ## Method
+ <p align="center">
+ <a href="">
+ <img src="https://lhmd.top/volsplat/assets/pipeline.jpg" alt="Logo" width="100%">
+ </a>
+ </p>
+ <strong>Overview of VolSplat</strong>. Given multi-view images as input, we first extract 2D features for each image using a Transformer-based network and construct per-view cost volumes with plane sweeping. A depth prediction module then estimates a depth map for each view, which is used to unproject the 2D features into 3D space to form a voxel feature grid. We then employ a sparse 3D decoder to refine these features in 3D space and predict the parameters of a 3D Gaussian for each occupied voxel. Finally, novel views are rendered from the predicted 3D Gaussians.
+
+
+ ## TODOs
+ - [ ] Release Code.
+ - [ ] Release Model Checkpoints.
+
+ ## Citation
+ If you find our work useful for your research, please consider citing us:
+
+ ```bibtex
+ @article{wang2025volsplat,
+ title={VolSplat: Rethinking Feed-Forward 3D Gaussian Splatting with Voxel-Aligned Prediction},
+ author={Wang, Weijie and Chen, YeQing and Zhang, Zeyu and Liu, Hengyu and Wang, Haoxiao and Feng, Zhiyuan and Qin, Wenkang and Zhu, Zheng and Chen, Donny Y. and Zhuang, Bohan},
+ journal={},
+ year={2025}
+ }
+ ```
+ ## Contact
+ If you have any questions, please create an issue on this repository or contact wangweijie@zju.edu.cn.
+
+ ## Acknowledgements
+
+ This project is developed with [DepthSplat](https://github.com/cvg/depthsplat). We thank the original authors for their excellent work.
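
The Method overview in the README describes unprojecting per-view 2D features into 3D with a predicted depth map and collecting them into a voxel feature grid. The snippet below is a minimal sketch of that one step, not the VolSplat implementation (the code had not been released at this commit). It assumes a pinhole camera, a single view in camera coordinates, and mean pooling of features that land in the same voxel; the function names `unproject_pixel` and `voxelize_features`, the intrinsics layout, and the pooling choice are all hypothetical.

```python
import math
from collections import defaultdict


def unproject_pixel(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) at the given depth into camera coordinates
    using a pinhole model (assumed here; not specified by the README)."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def voxelize_features(depth_map, feature_map, intrinsics, voxel_size):
    """Average the features of all pixels whose 3D points fall in the same voxel.

    Returns a dict mapping integer voxel indices (i, j, k) to a mean feature
    list. Only occupied voxels appear, so the grid is sparse.
    Mean pooling is an illustrative assumption.
    """
    fx, fy, cx, cy = intrinsics
    sums = {}
    counts = defaultdict(int)
    for v, (depth_row, feat_row) in enumerate(zip(depth_map, feature_map)):
        for u, (depth, feat) in enumerate(zip(depth_row, feat_row)):
            if depth <= 0:  # skip pixels without a valid depth
                continue
            point = unproject_pixel(u, v, depth, fx, fy, cx, cy)
            key = tuple(math.floor(c / voxel_size) for c in point)
            if key not in sums:
                sums[key] = [0.0] * len(feat)
            for i, f in enumerate(feat):
                sums[key][i] += f
            counts[key] += 1
    return {k: [s / counts[k] for s in total] for k, total in sums.items()}


# Toy 2x2 view: three nearby pixels and one far pixel, 2-channel features.
depth_map = [[1.0, 1.0], [1.0, 4.0]]
feature_map = [[[1.0, 0.0], [3.0, 0.0]], [[5.0, 0.0], [0.0, 8.0]]]
intrinsics = (1.0, 1.0, 0.0, 0.0)  # fx, fy, cx, cy (made-up values)
grid = voxelize_features(depth_map, feature_map, intrinsics, voxel_size=2.0)
print(len(grid), grid)
```

In this toy input the three nearby pixels share one voxel and the far pixel occupies another, so four pixels yield two occupied voxels. A pixel-aligned predictor would emit one Gaussian per pixel regardless of geometry, whereas predicting one Gaussian per occupied voxel lets the count follow the scene. Fusing several views would additionally require camera-to-world poses, which this sketch omits.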