---
license: apache-2.0
pipeline_tag: image-to-3d
tags:
- dino
- scene-understanding
- semantic-scene-completion
- unsupervised
library_name: pytorch
---

<div align="center">
<h1>Feed-Forward <i>SceneDINO</i> for Unsupervised Semantic Scene Completion</h1>

[**Aleksandar Jevtić**](https://jev-aleks.github.io/)<sup>*1</sup>
[**Christoph Reich**](https://christophreich1996.github.io/)<sup>*1,2,4,5</sup>
[**Felix Wimbauer**](https://fwmb.github.io/)<sup>1,4</sup>
[**Oliver Hahn**](https://olvrhhn.github.io/)<sup>2</sup>
[**Christian Rupprecht**](https://chrirupp.github.io/)<sup>3</sup>
[**Stefan Roth**](https://www.visinf.tu-darmstadt.de/visual_inference/people_vi/stefan_roth.en.jsp)<sup>2,5,6</sup>
[**Daniel Cremers**](https://cvg.cit.tum.de/members/cremers/)<sup>1,4,5</sup>

<sup>1</sup>TU Munich <sup>2</sup>TU Darmstadt <sup>3</sup>University of Oxford <sup>4</sup>MCML <sup>5</sup>ELIZA <sup>6</sup>hessian.AI *equal contribution

<a href="https://arxiv.org/abs/2507.06230"><img src='https://img.shields.io/badge/ArXiv-grey' alt='Paper PDF'></a>
<a href="https://visinf.github.io/scenedino/"><img src='https://img.shields.io/badge/Project Page-grey' alt='Project Page URL'></a>
<a href="https://huggingface.co/spaces/jev-aleks/SceneDINO"><img src='https://img.shields.io/badge/🤗 Demo-grey' alt='Demo'></a>
<a href="https://opensource.org/licenses/Apache-2.0"><img src='https://img.shields.io/badge/License-Apache%202.0-blue.svg' alt='License'></a>

</div>

## Overview

SceneDINO is an unsupervised, feed-forward method that infers 3D geometry and features from a single image. It is trained with multi-view self-supervision. Distilling and clustering SceneDINO's 3D feature field yields unsupervised semantic scene completion predictions.
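
As a rough illustration of the clustering step mentioned above, the sketch below runs plain k-means over per-point feature vectors and assigns each point a cluster id that acts as a pseudo-class. This is not SceneDINO's implementation: the toy 2-D features, the function name `kmeans_labels`, the Euclidean distance, and the first-`k` initialisation are all assumptions made for illustration. The actual distillation and clustering procedure is described in the paper and the repository.

```python
def kmeans_labels(features, k, iters=20):
    """Cluster feature vectors with plain k-means; return (labels, centroids).

    Hypothetical helper for illustration only, not part of the SceneDINO code base.
    """
    # Deterministic initialisation from the first k points (an assumption
    # chosen for reproducibility; real pipelines usually use a smarter init).
    centroids = [list(f) for f in features[:k]]
    labels = [0] * len(features)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(
                range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(f, centroids[c])),
            )
            for f in features
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [f for f, lab in zip(features, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


# Toy stand-in for a 3D feature field: one feature vector per sampled 3D point.
features = [
    (0.0, 0.1), (0.1, 0.0), (0.05, 0.05),  # points with similar features
    (5.0, 5.1), (5.1, 5.0), (4.9, 5.0),    # a second, well-separated group
]
labels, centroids = kmeans_labels(features, k=2)
print(labels)
```

Points with similar features end up sharing a cluster id. In an unsupervised setting these ids carry no class names by themselves, so they would still need to be matched to semantic categories for evaluation.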

## Installation & Quick Start

Please refer to our [GitHub repository](https://github.com/tum-vision/scenedino) for installation instructions and a quick-start guide.

## Citation

If you find our work useful, please consider giving it a star ⭐ and citing our paper.

```bibtex
@inproceedings{Jevtic:2025:SceneDINO,
    author    = {Aleksandar Jevti{\'c} and
                 Christoph Reich and
                 Felix Wimbauer and
                 Oliver Hahn and
                 Christian Rupprecht and
                 Stefan Roth and
                 Daniel Cremers},
    title     = {Feed-Forward {SceneDINO} for Unsupervised Semantic Scene Completion},
    booktitle = {IEEE/CVF International Conference on Computer Vision (ICCV)},
    year      = {2025},
}
```