---
license: cc-by-nc-4.0
tags:
- depth-estimation
- coreml
- apple-silicon
- vision
- computer-vision
library_name: coreml
---
# Depth Anything V2 - CoreML
Depth Anything V2 models (Base and Large) converted to CoreML format for optimized inference on Apple Silicon (M-series chips).
## Models
| Model | Size | Parameters | Performance (M4 Pro est.) | License |
|-------|------|------------|---------------------------|---------|
| Small F16 | 48 MB | 24.8M | ~30ms (~33 fps) | Apache-2.0 |
| Base F16 | 172 MB | 97.5M | ~60-90ms (~14 fps) | CC-BY-NC-4.0 |
| Large F16 | 590 MB | 335.3M | ~200-300ms (~4 fps) | CC-BY-NC-4.0 |
All models use Float16 precision and run on Apple's Neural Engine + GPU + CPU.
## License
Both **Base** and **Large** models are **CC-BY-NC-4.0** (non-commercial only), following the [official Depth Anything V2 licensing](https://github.com/DepthAnything/Depth-Anything-V2#license).
**For commercial use**, you must use the Small model (Apache-2.0), which is available directly from [Apple's CoreML model zoo](https://developer.apple.com/machine-learning/models/).
## Download
**Base model**:
```bash
curl -L -o DepthAnythingV2BaseF16.mlpackage.tar.gz \
"https://huggingface.co/mrgnw/depth-anything-v2-coreml/resolve/main/DepthAnythingV2BaseF16.mlpackage.tar.gz"
tar -xzf DepthAnythingV2BaseF16.mlpackage.tar.gz
```
**Large model**:
```bash
curl -L -o DepthAnythingV2LargeF16.mlpackage.tar.gz \
"https://huggingface.co/mrgnw/depth-anything-v2-coreml/resolve/main/DepthAnythingV2LargeF16.mlpackage.tar.gz"
tar -xzf DepthAnythingV2LargeF16.mlpackage.tar.gz
```
**Small model** (from Apple):
```bash
curl -L -o DepthAnythingV2SmallF16.mlpackage.zip \
"https://ml-assets.apple.com/coreml/models/Image/DepthEstimation/DepthAnything/DepthAnythingV2SmallF16.mlpackage.zip"
unzip DepthAnythingV2SmallF16.mlpackage.zip
```
## Usage
### Swift
```swift
import CoreML

// An .mlpackage must be compiled to an .mlmodelc before it can be loaded
// at runtime (Xcode does this automatically for models added to a project)
let packageURL = URL(fileURLWithPath: "DepthAnythingV2BaseF16.mlpackage")
let compiledURL = try MLModel.compileModel(at: packageURL)

let config = MLModelConfiguration()
config.computeUnits = .all // Use Neural Engine + GPU + CPU
let model = try MLModel(contentsOf: compiledURL, configuration: config)

// Input: RGB image (1, 3, 518, 518)
// Output: depth map (1, 518, 518)
```
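The model produces relative (inverse) depth rather than metric depth, so the raw values are typically min-max normalized before display. A minimal Python sketch of that post-processing step, using a random array in place of a real `(1, 518, 518)` prediction (the helper name is illustrative, not part of the model package):

```python
import numpy as np

def depth_to_grayscale(depth: np.ndarray) -> np.ndarray:
    """Min-max normalize a (1, H, W) relative depth map to uint8 [0, 255]."""
    d = depth.squeeze().astype(np.float32)           # (H, W)
    d = (d - d.min()) / (d.max() - d.min() + 1e-8)   # scale to [0, 1]
    return (d * 255.0).astype(np.uint8)

# Stand-in for a real model prediction of shape (1, 518, 518)
fake_depth = np.random.rand(1, 518, 518).astype(np.float32)
gray = depth_to_grayscale(fake_depth)
print(gray.shape, gray.dtype)  # (518, 518) uint8
```

The resulting `uint8` array can be written out with any image library (e.g. Pillow's `Image.fromarray`).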
## Performance
**M4 Pro (estimated):**
- Small: ~25-30ms per frame
- Base: ~60-90ms per frame
- Large: ~200-300ms per frame
These estimates are roughly **10-20x faster** than typical ONNX CPU inference, since CoreML can dispatch work to the Apple Neural Engine.
## Citation
```bibtex
@article{yang2024depth,
title={Depth Anything V2},
author={Yang, Lihe and Kang, Bingyi and Huang, Zilong and Zhao, Zhen and Xu, Xiaogang and Feng, Jiashi and Zhao, Hengshuang},
journal={arXiv:2406.09414},
year={2024}
}
```
## Related
- [Original Depth Anything V2](https://github.com/DepthAnything/Depth-Anything-V2)
- [spatial-maker](https://github.com/mrgnw/spatial-maker) - Uses these models for spatial video/photo conversion