Splatent: Splatting Diffusion Latents for Novel View Synthesis
Project Website | Paper (arXiv) | GitHub
Splatent is a diffusion-based enhancement framework designed to operate on top of 3D Gaussian Splatting (3DGS) in the latent space of VAEs. It enhances novel view synthesis by leveraging Stable Diffusion Turbo and multi-view attention mechanisms to recover fine-grained details while maintaining multi-view consistency.
To set up the environment, run:
conda env create -f environment.yml
conda activate splatent
pip install --no-build-isolation submodules/feature-3dgs/submodules/custom_rasterizer
pip install --no-build-isolation submodules/feature-3dgs/submodules/simple-knn
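After installation, a quick import check can confirm that the active environment is the one you expect. The snippet below is a minimal sketch, not part of the Splatent repository: the helper name `missing_modules` is hypothetical, and the module names in the example list (`torch`, `diffusers`) are assumptions about what `environment.yml` installs. Adjust the list to match your setup.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    # find_spec returns None for a top-level module that is not installed.
    # Pass top-level names only; dotted names trigger an import of the parent.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed import names, not taken from the Splatent documentation.
    required = ["torch", "diffusers"]
    missing = missing_modules(required)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules were found.")
```

Run it inside the activated `splatent` environment; an empty result only means the modules can be located, not that the compiled extensions work on your GPU.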
To run inference on custom images with the pre-trained weights, use the following command from the source repository:
python src/inference_custom.py \
--input_image /path/to/input \
--ref_image /path/to/reference \
--prompt "A photo of a scene" \
--model_path /path/to/checkpoint.pkl \
--output_dir ./results
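To process several images with the same reference and checkpoint, the command above can be assembled programmatically. The following is a rough sketch under stated assumptions: the helpers `build_command` and `batch_commands` are hypothetical, only the flags shown above are used, and it assumes `--input_image` accepts a single image file and that writing each result to its own output directory is acceptable. Neither assumption is confirmed by the source.

```python
import shlex
from pathlib import Path


def build_command(input_image, ref_image, model_path, output_dir,
                  prompt="A photo of a scene"):
    """Build the argument list for one inference run (flags as documented above)."""
    return [
        "python", "src/inference_custom.py",
        "--input_image", str(input_image),
        "--ref_image", str(ref_image),
        "--prompt", prompt,
        "--model_path", str(model_path),
        "--output_dir", str(output_dir),
    ]


def batch_commands(input_dir, ref_image, model_path, output_root, pattern="*.png"):
    """Yield one command per matching image, in sorted filename order."""
    for image in sorted(Path(input_dir).glob(pattern)):
        # Assumption: one output directory per input image, named after the file stem.
        yield build_command(image, ref_image, model_path,
                            Path(output_root) / image.stem)


if __name__ == "__main__":
    for cmd in batch_commands("/path/to/inputs", "/path/to/reference",
                              "/path/to/checkpoint.pkl", "./results"):
        # Print the shell-quoted command; to execute it instead, pass the list
        # to subprocess.run(cmd, check=True) from the repository root.
        print(shlex.join(cmd))
```

Printing the commands first makes it easy to inspect paths and quoting before committing GPU time to a full batch.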
If you find Splatent useful for your research, please cite the following paper:
@article{splatent2025,
title={Splatent: Splatting Diffusion Latents for Novel View Synthesis},
author={Or Hirschorn and Omer Sela and Inbar Huberman-Spiegelglas and Netalee Efrat and Eli Alshan and Ianir Ideses and Frederic Devernay and Yochai Zvik and Lior Fritz},
year={2025},
eprint={2512.09923},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2512.09923},
}
We thank the authors of the following excellent repositories: Stable Diffusion, Diffusers, Difix3D+, Feature 3DGS, 3D Gaussian Splatting, and LPIPS.