Improve model card: Add pipeline tag, paper, project page, and code links
#1 by nielsr (HF Staff) - opened

README.md (changed)
---
license: cc-by-nc-4.0
pipeline_tag: image-to-3d
---

# TUN3D: Towards Real-World Scene Understanding from Unposed Images

This repository contains an implementation of TUN3D, a method for real-world indoor scene understanding from multi-view images.

* **Paper:** [TUN3D: Towards Real-World Scene Understanding from Unposed Images](https://huggingface.co/papers/2509.21388)
* **Project Page:** https://bulatko.github.io/tun3d/
* **Code:** https://github.com/col14m/TUN3D

<div align="center">
<video src="https://github.com/user-attachments/assets/8644a6d7-3a4e-4b1b-b58e-023276ea12ee"></video>
<p><i>TUN3D works with ground-truth point clouds, posed images (with known camera poses), or fully unposed image sets (without poses or depths).</i></p>
</div>

## Abstract

Layout estimation and 3D object detection are two fundamental tasks in indoor scene understanding. When combined, they enable the creation of a compact yet semantically rich spatial representation of a scene. Existing approaches typically rely on point cloud input, which poses a major limitation since most consumer cameras lack depth sensors and visual-only data remains far more common. We address this issue with TUN3D, the first method that tackles joint layout estimation and 3D object detection in real scans, given multi-view images as input, and does not require ground-truth camera poses or depth supervision. Our approach builds on a lightweight sparse-convolutional backbone and employs two dedicated heads: one for 3D object detection and one for layout estimation, leveraging a novel and effective parametric wall representation. Extensive experiments show that TUN3D achieves state-of-the-art performance across three challenging scene understanding benchmarks: (i) using ground-truth point clouds, (ii) using posed images, and (iii) using unposed images. While performing on par with specialized 3D object detection methods, TUN3D significantly advances layout estimation, setting a new benchmark in holistic indoor scene understanding.

## Installation and Usage

The repository is divided into two modules: `Reconstruction` and `Recognition`. Each module requires a separate installation of dependencies. Please refer to the [GitHub repository](https://github.com/col14m/TUN3D) for detailed installation instructions, data preprocessing steps, and guidance on running the model.

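Since the two modules keep separate dependency sets, a from-scratch setup typically starts by cloning the code and preparing one environment per module. A minimal sketch (the environment names are assumptions; the actual package lists and install commands are in the repository's own instructions):

```shell
# Fetch the code (public repository linked above).
git clone https://github.com/col14m/TUN3D

# One virtual environment per module, since dependencies are installed
# separately for Reconstruction and Recognition. Environment names are
# illustrative; install each module's packages per the repo's README.
python3 -m venv TUN3D/.venv-reconstruction
python3 -m venv TUN3D/.venv-recognition
```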
## Prediction examples

#### ScanNet

<p float="left">
<img src="https://github.com/col14m/TUN3D/raw/main/recognition/imgs/predictions_scannet.png" width="900" height="396" />
</p>

#### S3DIS

<p float="left">
<img src="https://github.com/col14m/TUN3D/raw/main/recognition/imgs/predictions_s3dis.png" width="900" height="396" />
</p>

## Citation

If you find this work useful for your research, please cite our paper:

```bibtex
@misc{konushin2025tun3drealworldsceneunderstanding,
      title={TUN3D: Towards Real-World Scene Understanding from Unposed Images},
      author={Anton Konushin and Nikita Drozdov and Bulat Gabdullin and Alexey Zakharov and Anna Vorontsova and Danila Rukhovich and Maksim Kolodiazhnyi},
      year={2025},
      eprint={2509.21388},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2509.21388},
}
```