Add model card with metadata and usage instructions

#1 opened by nielsr (HF Staff)
Files changed (1): README.md (+55 -3)
 
---
license: cc-by-nc-nd-4.0
pipeline_tag: image-to-image
---

# Splatent: Splatting Diffusion Latents for Novel View Synthesis

[**Project Website**](https://orhir.github.io/Splatent/) | [**Paper (arXiv)**](https://arxiv.org/abs/2512.09923) | [**GitHub**](https://github.com/orhir/Splatent)

Splatent is a diffusion-based enhancement framework that operates on top of 3D Gaussian Splatting (3DGS) in the latent space of a VAE. Built on Stable Diffusion Turbo with multi-view attention, it recovers fine-grained detail in novel views while maintaining multi-view consistency.

![Splatent Architecture](https://github.com/orhir/Splatent/raw/main/assets/splatent_architecture.png)
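The core idea, rasterizing feature vectors in latent space rather than RGB, can be illustrated with a toy sketch. This is not the paper's implementation; it is a minimal, self-contained illustration of alpha-compositing per-Gaussian latent features into a latent feature map, assuming isotropic screen-space Gaussians that are already sorted front to back.

```python
import numpy as np

def splat_latents(means2d, features, opacities, radii, H=8, W=8):
    """Alpha-composite per-Gaussian latent features into an (H, W, C) map.

    means2d:   (N, 2) screen-space centers, in pixel coordinates
    features:  (N, C) latent feature vector carried by each Gaussian
    opacities: (N,)   per-Gaussian opacity in [0, 1]
    radii:     (N,)   isotropic screen-space std. dev. per Gaussian
    Gaussians are assumed pre-sorted front to back.
    """
    N, C = features.shape
    out = np.zeros((H, W, C))
    transmittance = np.ones((H, W))  # remaining light per pixel
    ys, xs = np.mgrid[0:H, 0:W]
    for i in range(N):
        # Gaussian falloff of this splat's alpha over the pixel grid.
        d2 = (xs - means2d[i, 0]) ** 2 + (ys - means2d[i, 1]) ** 2
        alpha = opacities[i] * np.exp(-0.5 * d2 / radii[i] ** 2)
        w = transmittance * alpha
        out += w[..., None] * features[i]
        transmittance *= 1.0 - alpha
    return out

# Two Gaussians carrying 4-channel latent features on an 8x8 latent grid.
rng = np.random.default_rng(0)
latents = splat_latents(
    means2d=np.array([[2.0, 2.0], [5.0, 5.0]]),
    features=rng.normal(size=(2, 4)),
    opacities=np.array([0.9, 0.8]),
    radii=np.array([1.5, 2.0]),
)
print(latents.shape)  # (8, 8, 4)
```

In Splatent the composited map lives in the VAE's latent space and is refined by the diffusion model before decoding to pixels; the sketch above is only a shape-level illustration of the splatting step.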

## Installation

To set up the environment, run:

```bash
conda env create -f environment.yml
conda activate splatent
pip install --no-build-isolation submodules/feature-3dgs/submodules/custom_rasterizer
pip install --no-build-isolation submodules/feature-3dgs/submodules/simple-knn
```
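After installation, a quick sanity check that the compiled extensions are importable can save debugging time later. The module names below (`custom_rasterizer`, `simple_knn`) are assumptions inferred from the submodule paths above; adjust them if the packages install under different names.

```python
import importlib.util

def check_modules(names):
    """Return a dict mapping each module name to True if it is importable."""
    return {n: importlib.util.find_spec(n) is not None for n in names}

# Assumed package names for the two compiled extensions installed above.
status = check_modules(["custom_rasterizer", "simple_knn"])
for name, ok in status.items():
    print(f"{name}: {'OK' if ok else 'MISSING'}")
```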

## Custom Inference

To run inference on custom images with the pre-trained weights, use the following command from the source repository:

```bash
python src/inference_custom.py \
    --input_image /path/to/input \
    --ref_image /path/to/reference \
    --prompt "A photo of a scene" \
    --model_path /path/to/checkpoint.pkl \
    --output_dir ./results
```
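For batch processing, the command above can be assembled programmatically, e.g. to loop over many input/reference pairs. The helper below is hypothetical (not part of the repository); it only mirrors the flags shown above and would launch the script via `subprocess` from the repository root.

```python
import subprocess

def build_inference_cmd(input_image, ref_image, model_path,
                        prompt="A photo of a scene", output_dir="./results"):
    """Assemble the inference_custom.py invocation with the flags shown above."""
    return [
        "python", "src/inference_custom.py",
        "--input_image", str(input_image),
        "--ref_image", str(ref_image),
        "--prompt", prompt,
        "--model_path", str(model_path),
        "--output_dir", str(output_dir),
    ]

cmd = build_inference_cmd("scene/view01.png", "scene/ref.png",
                          "weights/checkpoint.pkl")
print(" ".join(cmd))
# To actually run it (from the repository root):
# subprocess.run(cmd, check=True)
```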

## Citation

If you find Splatent useful for your research, please cite the following paper:

```bibtex
@article{splatent2025,
  title={Splatent: Splatting Diffusion Latents for Novel View Synthesis},
  author={Or Hirschorn and Omer Sela and Inbar Huberman-Spiegelglas and Netalee Efrat and Eli Alshan and Ianir Ideses and Frederic Devernay and Yochai Zvik and Lior Fritz},
  year={2025},
  eprint={2512.09923},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2512.09923},
}
```

## Acknowledgments

We thank the authors of the excellent repositories [Stable Diffusion](https://github.com/Stability-AI/stablediffusion), [Diffusers](https://github.com/huggingface/diffusers), [Difix3D+](https://github.com/nv-tlabs/Difix3D), [Feature 3DGS](https://github.com/ShijieZhou-UCLA/feature-3dgs), [3D Gaussian Splatting](https://github.com/graphdeco-inria/gaussian-splatting), and [LPIPS](https://github.com/richzhang/PerceptualSimilarity).