---
license: mit
datasets:
- JTRNEO/SynRS3D
language:
- en
base_model:
- facebook/dinov2-large
---

# SynRS3D: A Synthetic Dataset for Global 3D Semantic Understanding from Monocular Remote Sensing Imagery

**Authors:** [Jian Song](https://scholar.google.ch/citations?user=CgcMFJsAAAAJ&hl=zh-CN)<sup>1,2</sup>, [Hongruixuan Chen](https://scholar.google.ch/citations?user=XOk4Cf0AAAAJ&hl=zh-CN&oi=ao)<sup>1</sup>, [Weihao Xuan](https://weihaoxuan.com/)<sup>1,2</sup>, [Junshi Xia](https://scholar.google.com/citations?user=n1aKdTkAAAAJ&hl=en)<sup>2</sup>, [Naoto Yokoya](https://scholar.google.co.jp/citations?user=DJ2KOn8AAAAJ&hl=en)<sup>1,2</sup>

<sup>1</sup> The University of Tokyo, <sup>2</sup> RIKEN AIP

**Conference:** Neural Information Processing Systems (NeurIPS) 2024, Spotlight

For more details, please refer to our [paper](https://arxiv.org/pdf/2406.18151) and visit our GitHub [repository](https://github.com/JTRNEO/SynRS3D).

---

### Overview

**TL;DR:** We release two high-performing models for **height estimation** and **land cover mapping**, trained on the SynRS3D dataset with our novel domain adaptation method, **RS3DAda**.

- **Encoder:** Vision Transformer (ViT-L), pretrained with **DINOv2**
- **Decoder:** [DPT](https://arxiv.org/abs/2103.13413), trained from scratch

These models are built for large-scale global 3D semantic understanding from high-resolution remote sensing imagery. Feel free to integrate them into your own projects; a minimal loading sketch is provided at the end of this card.

---

### How to Cite

If you find the RS3DAda models useful in your research, please consider citing:

```
@article{song2024synrs3d,
  title={SynRS3D: A Synthetic Dataset for Global 3D Semantic Understanding from Monocular Remote Sensing Imagery},
  author={Song, Jian and Chen, Hongruixuan and Xuan, Weihao and Xia, Junshi and Yokoya, Naoto},
  journal={arXiv preprint arXiv:2406.18151},
  year={2024}
}
```

---

### Contact

For any questions or feedback, please reach out via email at **song@ms.k.u-tokyo.ac.jp**.

We hope you enjoy using the pretrained RS3DAda models!
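
---

### Example Usage

Below is a minimal sketch of how the DINOv2 ViT-L/14 encoder can be loaded and probed before attaching the released RS3DAda weights. The checkpoint filename (`rs3dada_height.pth`) is an illustrative assumption, and the DPT decoder definition lives in the SynRS3D GitHub repository; consult the repository for the exact model assembly and weight files.

```python
# Minimal loading sketch -- not the authors' official script.
# Assumption: the checkpoint filename "rs3dada_height.pth" is hypothetical;
# the actual weight files and DPT decoder class come from the SynRS3D repo.
import torch

# DINOv2 ViT-L/14 encoder from the official facebookresearch hub entry point.
encoder = torch.hub.load("facebookresearch/dinov2", "dinov2_vitl14")
encoder.eval()

# Extract patch features from a dummy image; DINOv2 expects side lengths
# divisible by the patch size (14), e.g. 518 x 518.
image = torch.randn(1, 3, 518, 518)
with torch.no_grad():
    feats = encoder.forward_features(image)["x_norm_patchtokens"]
print(feats.shape)  # torch.Size([1, 1369, 1024]) for ViT-L/14 at 518 x 518

# Load the released RS3DAda weights and inspect how encoder/decoder
# parameters are named before wiring up the DPT head from the repo.
state_dict = torch.load("rs3dada_height.pth", map_location="cpu")
print(list(state_dict.keys())[:10])
```

The patch tokens above are what a DPT-style decoder consumes: it reassembles them into multi-scale feature maps and regresses a dense prediction (height or land cover) at image resolution.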