---
license: cc-by-nc-sa-4.0
language:
- en
pipeline_tag: depth-estimation
tags:
- depth-estimation
- metric-depth-estimation
- monocular-depth-estimation
- aerial
- UAV
- drone
---

# OccuFly's Aerial DepthAnythingV2

## Introduction

Following the acceptance of OccuFly as a [CVPR 2026 Oral](https://cvpr.thecvf.com/virtual/2026/oral/40308), we release our fine-tuned [DepthAnythingV2](https://depth-anything-v2.github.io/) model, specialized for aerial imagery. It was trained on the [OccuFly dataset](https://markus-42.github.io/publications/2026/occufly/), the first large-scale, real-world benchmark for aerial Metric Monocular Depth Estimation and Semantic Scene Completion.

This model is the depth estimation component of our [OccuFly project](https://markus-42.github.io/publications/2026/occufly/), in which we fine-tuned `DepthAnythingV2-ViT-S` to infer accurate metric depth (in meters) from a single aerial image.

### Key Features

- **Aerial-specialized**: Fine-tuned on diverse aerial imagery from urban, industrial, and rural environments.
- **Multi-altitude performance**: Trained on data captured at flight altitudes of 50 m, 40 m, and 30 m.
- **Seasonal robustness**: Trained on data captured across all seasons for improved generalization.
- **Lightweight**: Uses the ViT-S backbone for efficient inference.

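Because the predictions are metric (in meters), a depth map can be inspected directly, without rescaling. The following is a minimal sketch, assuming the prediction is an HxW NumPy float array. The helper names `summarize_metric_depth` and `depth_to_uint8` are hypothetical and not part of the DepthAnythingV2 code, and the depth map used here is synthetic rather than a real model output.

```python
import numpy as np


def summarize_metric_depth(depth: np.ndarray) -> dict:
    """Return basic statistics (in meters) for an HxW metric depth map."""
    # Ignore non-finite or non-positive pixels, which cannot be valid depths.
    valid = depth[np.isfinite(depth) & (depth > 0)]
    return {
        "min_m": float(valid.min()),
        "max_m": float(valid.max()),
        "mean_m": float(valid.mean()),
    }


def depth_to_uint8(depth: np.ndarray) -> np.ndarray:
    """Min-max normalize a depth map to 0..255 (for visualization only)."""
    d_min, d_max = float(depth.min()), float(depth.max())
    scale = max(d_max - d_min, 1e-8)  # avoid division by zero on flat maps
    return ((depth - d_min) / scale * 255.0).astype(np.uint8)


# Synthetic stand-in for a prediction: a 4x6 ramp from 30 m to 50 m.
depth = np.linspace(30.0, 50.0, 24, dtype=np.float32).reshape(4, 6)
stats = summarize_metric_depth(depth)
vis = depth_to_uint8(depth)
```

The normalized array is meant for display only; any measurement should use the original metric values, since normalization discards the scale in meters.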
## Installation

```bash
git clone https://huggingface.co/spaces/depth-anything/Depth-Anything-V2
cd Depth-Anything-V2
pip install -r requirements.txt
```

## Quickstart

Download the [model checkpoint](https://huggingface.co/markus-42/OccuFly-DepthAnythingV2/resolve/main/OccuFly-DepthAnything2.pth) and place it in your desired directory:

```python
import cv2
import torch

from depth_anything_v2.dpt import DepthAnythingV2

# Load the fine-tuned aerial model (ViT-S configuration)
model = DepthAnythingV2(encoder='vits', features=64, out_channels=[48, 96, 192, 384])
model.load_state_dict(torch.load('OccuFly-DepthAnything2.pth', map_location='cpu'))
model.eval()

# Inference
with torch.no_grad():
    raw_img = cv2.imread('example.jpg')
    depth = model.infer_image(raw_img)  # HxW metric depth map (in meters)
```

## OccuFly Dataset

The model is fine-tuned on [OccuFly](https://huggingface.co/datasets/markus-42/OccuFly), which includes:

- **20,000+ aerial RGB images** with corresponding depth maps
- **Multiple altitudes**: 30 m, 40 m, and 50 m flight altitudes
- **Seasonal diversity**: Spring, Summer, Fall, Winter
- **Multiple environments**: Urban, industrial, rural
- **21 semantic classes** with dense voxel grid annotations

## Citation

If our work was helpful to you, we would appreciate it if you cited our paper and the original DepthAnythingV2 work, or gave the repository a like ❤️

```bibtex
@inproceedings{gross2026occufly,
  title={{OccuFly: A 3D Vision Benchmark for Semantic Scene Completion from the Aerial Perspective}},
  author={Markus Gross and Sai B. Matha and Aya Fahmy and Rui Song and Daniel Cremers and Henri Meess},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2026},
}

@article{depth_anything_v2,
  title={Depth Anything V2},
  author={Yang, Lihe and Kang, Bingyi and Huang, Zilong and Zhao, Zhen and Xu, Xiaogang and Feng, Jiashi and Zhao, Hengshuang},
  journal={arXiv preprint arXiv:2406.09414},
  year={2024}
}
```

## Related Resources

🌐 [OccuFly Project Page](https://markus-42.github.io/publications/2026/occufly/)
🤗 [OccuFly Dataset on HuggingFace](https://huggingface.co/datasets/markus-42/OccuFly)
📜 [OccuFly Paper](https://arxiv.org/abs/2512.20770)
🌐 [Original DepthAnythingV2](https://github.com/DepthAnything/Depth-Anything-V2)

## License

This work is licensed under the [CC BY-NC-SA 4.0 license](https://creativecommons.org/licenses/by-nc-sa/4.0/). See the LICENSE file for the full legal terms.