---
license: cc-by-nc-sa-4.0
language:
- en
pipeline_tag: depth-estimation
tags:
- depth-estimation
- metric-depth-estimation
- monocular-depth-estimation
- aerial
- UAV
- drone
---
# OccuFly's Aerial DepthAnythingV2
## Introduction
Following the acceptance of OccuFly as a [CVPR 2026 Oral](https://cvpr.thecvf.com/virtual/2026/oral/40308), we release our fine-tuned [DepthAnythingV2](https://depth-anything-v2.github.io/) model, specialized for aerial imagery. It was trained on the [OccuFly dataset](https://markus-42.github.io/publications/2026/occufly/), the first large-scale, real-world benchmark for aerial Metric Monocular Depth Estimation and Semantic Scene Completion.
This model is the depth estimation component of our [OccuFly project](https://markus-42.github.io/publications/2026/occufly/), in which we fine-tuned `DepthAnythingV2-ViT-S` to infer accurate metric depth (in meters) from a single aerial image.
### Key Features
- **Aerial-specialized**: Fine-tuned on diverse aerial imagery from urban, industrial, and rural environments.
- **Multi-altitude performance**: Trained on data from 50m, 40m, and 30m altitudes.
- **Seasonal robustness**: Trained on data captured across all seasons for improved generalization.
- **Lightweight**: Uses the ViT-S backbone for efficient inference.
## Installation
```bash
git clone https://huggingface.co/spaces/depth-anything/Depth-Anything-V2
cd Depth-Anything-V2
pip install -r requirements.txt
```
## Quickstart
Download the [model checkpoint](https://huggingface.co/markus-42/OccuFly-DepthAnythingV2/resolve/main/OccuFly-DepthAnything2.pth) and place it in your desired directory:
```python
import cv2
import torch
from depth_anything_v2.dpt import DepthAnythingV2
# Load the fine-tuned aerial model
model = DepthAnythingV2(encoder='vits', features=64, out_channels=[48, 96, 192, 384])
model.load_state_dict(torch.load('OccuFly-DepthAnything2.pth', map_location='cpu'))
model.eval()
# Inference
with torch.no_grad():
    raw_img = cv2.imread('example.jpg')  # BGR image as loaded by OpenCV
    depth = model.infer_image(raw_img)  # HxW metric depth map (meters)
```
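The `depth` array returned above holds floating-point values in meters, so it usually needs a conversion step before it can be viewed or stored as an image. The following is a minimal sketch of two such conversions, using NumPy only. The helper names `depth_to_uint8` and `depth_to_millimeters_uint16` are hypothetical and not part of DepthAnythingV2 or OccuFly, and the synthetic array stands in for a real prediction.

```python
import numpy as np


def depth_to_uint8(depth: np.ndarray) -> np.ndarray:
    """Min-max normalize a metric depth map to 0..255 for quick viewing.

    Hypothetical helper: the result is relative, not metric.
    """
    depth = np.asarray(depth, dtype=np.float32)
    finite = np.isfinite(depth)
    if not finite.any():
        return np.zeros(depth.shape, dtype=np.uint8)
    d_min = float(depth[finite].min())
    d_max = float(depth[finite].max())
    scale = d_max - d_min
    if scale == 0:
        # Constant depth map: nothing to stretch
        return np.zeros(depth.shape, dtype=np.uint8)
    # Non-finite pixels are mapped to the minimum value
    norm = (np.where(finite, depth, d_min) - d_min) / scale
    return np.round(norm * 255.0).astype(np.uint8)


def depth_to_millimeters_uint16(depth: np.ndarray) -> np.ndarray:
    """Encode metric depth (meters) as uint16 millimeters.

    Hypothetical helper: values above 65.535 m are clipped, and
    non-finite or negative values become 0.
    """
    depth = np.nan_to_num(
        np.asarray(depth, dtype=np.float32), nan=0.0, posinf=0.0, neginf=0.0
    )
    mm = np.clip(np.round(depth * 1000.0), 0, 65535)
    return mm.astype(np.uint16)


# Synthetic stand-in for the `depth` array from the Quickstart (meters)
depth = np.linspace(30.0, 50.0, num=12, dtype=np.float32).reshape(3, 4)
vis = depth_to_uint8(depth)               # 8-bit preview image
mm = depth_to_millimeters_uint16(depth)   # 16-bit metric encoding
```

The 8-bit preview discards the metric scale, while the 16-bit millimeter encoding preserves it up to about 65.5 m. Either array can then be written to disk, for example with `cv2.imwrite('depth_mm.png', mm)`, which stores `uint16` input as a 16-bit PNG.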
## OccuFly Dataset
The model is fine-tuned on [OccuFly](https://huggingface.co/datasets/markus-42/OccuFly), which includes:
- **20,000+ aerial RGB images** with corresponding depth maps
- **Multiple altitudes**: 30m, 40m, 50m flight altitudes
- **Seasonal diversity**: Spring, Summer, Fall, Winter
- **Multiple environments**: Urban, industrial, rural
- **21 semantic classes** with dense voxel grid annotations
## Citation
If our work was helpful to you, we would appreciate it if you cited our paper and the original DepthAnythingV2 work, or gave the repository a like ❤️
```bibtex
@inproceedings{gross2026occufly,
title={{OccuFly: A 3D Vision Benchmark for Semantic Scene Completion from the Aerial Perspective}},
author={Markus Gross and Sai B. Matha and Aya Fahmy and Rui Song and Daniel Cremers and Henri Meess},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2026},
}
@article{depth_anything_v2,
title={Depth Anything V2},
author={Yang, Lihe and Kang, Bingyi and Huang, Zilong and Zhao, Zhen and Xu, Xiaogang and Feng, Jiashi and Zhao, Hengshuang},
journal={arXiv preprint arXiv:2406.09414},
year={2024}
}
```
## Related Resources
🌐 [OccuFly Project Page](https://markus-42.github.io/publications/2026/occufly/)<br>
🤗 [OccuFly Dataset on HuggingFace](https://huggingface.co/datasets/markus-42/OccuFly)<br>
📜 [OccuFly Paper](https://arxiv.org/abs/2512.20770)<br>
🌐 [Original DepthAnythingV2](https://github.com/DepthAnything/Depth-Anything-V2)
## License
This work is licensed under the [CC BY-NC-SA 4.0 license](https://creativecommons.org/licenses/by-nc-sa/4.0/). See the LICENSE file for the full legal terms.