---
license: cc-by-nc-4.0
tags:
- depth-estimation
- coreml
- apple-silicon
- vision
- computer-vision
library_name: coreml
---

# Depth Anything V2 - CoreML

Depth Anything V2 models (Base and Large) converted to CoreML format for optimized inference on Apple Silicon (M-series chips).

## Models

| Model | Size | Parameters | Performance (M4 Pro, est.) | License |
|-------|------|------------|----------------------------|---------|
| Small F16 | 48 MB | 24.8M | ~30 ms (~33 fps) | Apache-2.0 |
| Base F16 | 172 MB | 97.5M | ~60-90 ms (~14 fps) | CC-BY-NC-4.0 |
| Large F16 | 590 MB | 335.3M | ~200-300 ms (~4 fps) | CC-BY-NC-4.0 |

All models use Float16 precision and run on Apple's Neural Engine, GPU, and CPU.

## License

Both the **Base** and **Large** models are **CC-BY-NC-4.0** (non-commercial use only), following the [official Depth Anything V2 licensing](https://github.com/DepthAnything/Depth-Anything-V2#license).

**For commercial use**, use the Small model (Apache-2.0), which is available directly from [Apple's CoreML model zoo](https://developer.apple.com/machine-learning/models/).
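The frame rates in the table follow directly from the per-frame latency (fps ≈ 1000 / latency in ms). A minimal sketch of that arithmetic, using the midpoints of the latency ranges above:

```python
def latency_to_fps(latency_ms: float) -> float:
    """Convert a per-frame latency in milliseconds to frames per second."""
    return 1000.0 / latency_ms

print(round(latency_to_fps(30), 1))   # Small: ~33 fps
print(round(latency_to_fps(75), 1))   # Base (60-90 ms midpoint): ~13 fps
print(round(latency_to_fps(250), 1))  # Large (200-300 ms midpoint): ~4 fps
```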
## Download

**Base model**:

```bash
curl -L -o DepthAnythingV2BaseF16.mlpackage.tar.gz \
  "https://huggingface.co/mrgnw/depth-anything-v2-coreml/resolve/main/DepthAnythingV2BaseF16.mlpackage.tar.gz"
tar -xzf DepthAnythingV2BaseF16.mlpackage.tar.gz
```

**Large model**:

```bash
curl -L -o DepthAnythingV2LargeF16.mlpackage.tar.gz \
  "https://huggingface.co/mrgnw/depth-anything-v2-coreml/resolve/main/DepthAnythingV2LargeF16.mlpackage.tar.gz"
tar -xzf DepthAnythingV2LargeF16.mlpackage.tar.gz
```

**Small model** (from Apple):

```bash
curl -L -o DepthAnythingV2SmallF16.mlpackage.zip \
  "https://ml-assets.apple.com/coreml/models/Image/DepthEstimation/DepthAnything/DepthAnythingV2SmallF16.mlpackage.zip"
unzip DepthAnythingV2SmallF16.mlpackage.zip
```

## Usage

### Swift

Note that `MLModel(contentsOf:)` expects a compiled `.mlmodelc` bundle; an `.mlpackage` loaded at runtime must first be compiled with `MLModel.compileModel(at:)` (Xcode does this automatically for models bundled at build time):

```swift
import CoreML

let packageURL = URL(fileURLWithPath: "DepthAnythingV2BaseF16.mlpackage")
// Compile the .mlpackage to a .mlmodelc before loading
let compiledURL = try MLModel.compileModel(at: packageURL)

let config = MLModelConfiguration()
config.computeUnits = .all // Use Neural Engine + GPU + CPU

let model = try MLModel(contentsOf: compiledURL, configuration: config)

// Input: RGB image (1, 3, 518, 518)
// Output: depth map (1, 518, 518)
```

## Performance

**M4 Pro (estimated):**

- Small: ~25-30 ms per frame
- Base: ~60-90 ms per frame
- Large: ~200-300 ms per frame

These estimates are roughly **10-20x faster** than ONNX CPU inference because the models run on the Apple Neural Engine.

## Citation

```bibtex
@article{yang2024depth,
  title={Depth Anything V2},
  author={Yang, Lihe and Kang, Bingyi and Huang, Zilong and Zhao, Zhen and Xu, Xiaogang and Feng, Jiashi and Zhao, Hengshuang},
  journal={arXiv:2406.09414},
  year={2024}
}
```

## Related

- [Original Depth Anything V2](https://github.com/DepthAnything/Depth-Anything-V2)
- [spatial-maker](https://github.com/mrgnw/spatial-maker) - Uses these models for spatial video/photo conversion
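## Visualizing the output

Depth Anything V2 predicts relative (not metric) depth, so the raw `(1, 518, 518)` output is typically min-max normalized to an 8-bit image for display. A minimal NumPy sketch of that post-processing step (the helper name `normalize_depth` is illustrative, not part of the model package):

```python
import numpy as np

def normalize_depth(depth: np.ndarray) -> np.ndarray:
    """Min-max normalize a raw depth map to uint8 [0, 255] for visualization."""
    d = depth.astype(np.float32).squeeze()       # (1, 518, 518) -> (518, 518)
    d = (d - d.min()) / (d.max() - d.min() + 1e-8)  # scale to [0, 1]
    return (d * 255.0).astype(np.uint8)

# Example with a dummy array of the model's output shape
dummy = np.random.rand(1, 518, 518).astype(np.float32)
vis = normalize_depth(dummy)
print(vis.shape, vis.dtype)  # (518, 518) uint8
```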