# Performance-Optimized & Lightweight ONNX Version of DepthPro
This ONNX-based DepthPro model generates high-quality depth maps with minimal overhead. Depth values follow an inverse-depth convention: near points receive higher values and far points lower ones, so a rendered map is bright up close and dark in the distance. This makes the output directly usable in stereo and disparity-based workflows without additional inversion or preprocessing. The model is optimized for efficient inference on standard hardware.
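To illustrate what "no inversion needed" means, here is a minimal NumPy sketch with synthetic values (not real model output): a conventional metric-depth model would need a `1/depth` step to obtain disparity, while this model's output already has disparity ordering and can be normalized directly.

```python
import numpy as np

# Hypothetical raw values for illustration only. This model emits
# disparity-like values: larger = closer to the camera.
model_output = np.array([[8.0, 2.0], [4.0, 0.5]], dtype=np.float32)

# A conventional metric-depth model (larger = farther) would first invert:
metric_depth = np.array([[1.0, 4.0], [2.0, 16.0]], dtype=np.float32)
disparity_from_depth = 1.0 / metric_depth  # same ordering as model_output

# Either map can then be min-max normalized for visualization:
def normalize(d):
    return (d - d.min()) / max(d.max() - d.min(), 1e-6)

vis = normalize(model_output)  # near pixels -> bright, far pixels -> dark
```

With this model the inversion step simply drops out; only the normalization remains.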
## Key Features
- Depth-only ONNX export: Significantly reduced model size while preserving full depth quality
- Skips field-of-view calibration: Outputs raw predicted depth values without the post-processing step, avoiding normalization artifacts and computational overhead
- Disparity-ready output: Compatible with stereo/disparity workflows out of the box - no conversion needed
- FP16 weights: Optimized for GPU acceleration via DirectML for faster inference
- Batch size 1: Benchmarks show single-image batches deliver optimal throughput; larger batches are slower
- Opset 21: Uses modern ONNX operators for broader runtime optimization support
- Aggressive graph optimization: Simplified model graph for reduced computation and faster loading
- Fast inference: Minimal memory footprint and rapid depth map generation
## Technical Specifications
| Property | Value |
|---|---|
| Input shape | (1, 3, 1536, 1536) NCHW |
| Input dtype | float16 |
| Input range | [-1.0, 1.0] (normalized RGB) |
| Output shape | (1, 1536, 1536) |
| Output dtype | float16 |
| Output range | Relative depth (higher = closer) |
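The input contract in the table can be checked with a small NumPy-only sketch. The `np.resize` call below is only a shape placeholder standing in for a real image resample (the Quick Start uses `cv2.resize`); the random array stands in for a photo.

```python
import numpy as np

# Stand-in for a loaded RGB photo (720x1280, uint8)
rgb = np.random.randint(0, 256, (720, 1280, 3), dtype=np.uint8)

# Placeholder for cv2.resize: np.resize only reshapes data, it does NOT resample
rgb = np.resize(rgb, (1536, 1536, 3))

# Scale [0, 255] -> [-1, 1], cast to float16, HWC -> NCHW, add batch dim
x = ((rgb.astype(np.float32) / 127.5) - 1.0).astype(np.float16)
x = np.transpose(x, (2, 0, 1))[np.newaxis]

assert x.shape == (1, 3, 1536, 1536)  # input shape per spec
assert x.dtype == np.float16          # input dtype per spec
assert -1.0 <= x.min() and x.max() <= 1.0
```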
## Requirements
- VRAM: ~5.2 GB
- ONNX Runtime: 1.19.0 or higher
- Python: 3.8 or higher
## Quick Start
```bash
pip install onnxruntime-directml numpy opencv-python
```
```python
import cv2
import numpy as np
import onnxruntime as ort

# Load the model (DirectML first, CPU as fallback)
session = ort.InferenceSession(
    'depthpro_1536x1536_bs1_fp16_opset21_optimized.onnx',
    providers=['DmlExecutionProvider', 'CPUExecutionProvider'],
)
input_name = session.get_inputs()[0].name
output_name = session.get_outputs()[0].name

# Load & preprocess: BGR -> RGB, resize to 1536x1536, scale to [-1, 1], NCHW, fp16
img = cv2.cvtColor(cv2.imread('examples/sample1/source.jpg'), cv2.COLOR_BGR2RGB)
img = cv2.resize(img, (1536, 1536))
img = ((img.astype(np.float32) / 127.5) - 1.0).astype(np.float16)
img = np.transpose(img, (2, 0, 1))[np.newaxis]

# Inference
depth = session.run([output_name], {input_name: img})[0].squeeze().astype(np.float32)

# Clip extreme values and normalize to [0, 1]
depth = np.clip(np.nan_to_num(depth, nan=0.0), -1e3, 1e3)
depth_norm = (depth - depth.min()) / max(depth.max() - depth.min(), 1e-6)

# Save as 8-bit PNG for smaller size ...
cv2.imwrite('depth_frame_0001.png', (depth_norm * 255).round().astype(np.uint8))

# ... or as 16-bit TIFF for higher precision
# (the TIFF compression flags require a recent OpenCV build)
cv2.imwrite(
    'depth_frame_0001.tif',
    (depth_norm * 65535).round().astype(np.uint16),
    [cv2.IMWRITE_TIFF_COMPRESSION, cv2.IMWRITE_TIFF_COMPRESSION_DEFLATE],
)

print('Depth maps saved')
```
## Benchmark: Speed, Size & Depth Map Quality
Benchmarked on an AMD Radeon RX 7900 XTX using ONNX Runtime v1.23.0 with DirectML.
### DepthPro-based models WITHOUT Post-Processing
| Model | Throughput | Model Size | ![]() | ![]() | ![]() | ![]() |
|---|---|---|---|---|---|---|
| apple/DepthPro-hf | 1.5 img/min | 1.8 GB | ![]() | ![]() | ![]() | ![]() |
| Owl3D Precision V2 | 9.6 img/min | 1.2 GB | ![]() | ![]() | ![]() | ![]() |
| This Model | 75.7 img/min | 1.2 GB | ![]() | ![]() | ![]() | ![]() |
### DepthPro-based models WITH Post-Processing
DepthPro's post-processing step calibrates depth values using field-of-view information and normalizes the output. This can cause severe artifacts:
- Crushed contrast: Extreme outlier depth values (e.g., 10,000 m instead of the typical ~130 m maximum observed across various scenes) cause normalization to compress useful depth information into a narrow range, mapping most pixels to extreme near values
- Inconsistent results: These artifacts appear unpredictably, especially with quantized models, but also with full-precision versions
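The crushed-contrast failure mode can also be mitigated downstream with percentile clipping instead of plain min-max normalization. A minimal NumPy sketch with synthetic values (not real model output), using the ~130 m typical range and a single 10,000 m outlier as in the description above:

```python
import numpy as np

# Synthetic depth map spanning ~1-130 m, plus one extreme 10,000 m outlier
depth = np.linspace(1.0, 130.0, 10_000, dtype=np.float32).reshape(100, 100)
depth[0, 0] = 10_000.0

# Naive min-max normalization: the outlier defines the range, so nearly all
# pixels collapse toward 0 (the "crushed contrast" artifact)
naive = (depth - depth.min()) / (depth.max() - depth.min())

# Clipping to the 1st/99th percentiles before normalizing discards the
# outlier and preserves contrast over the useful depth range
lo, hi = np.percentile(depth, [1, 99])
robust = (np.clip(depth, lo, hi) - lo) / max(hi - lo, 1e-6)
```

With the naive scheme the median pixel lands near 0; with percentile clipping it lands near the middle of the range.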
The models below use post-processing and may exhibit these issues depending on the scene:
| Model | Throughput | Model Size | ![]() | ![]() | ![]() | ![]() |
|---|---|---|---|---|---|---|
| apple/DepthPro-hf | 1.5 img/min | 1.8 GB | ![]() | ![]() | ![]() | ![]() |
| DepthPro-ONNX - model_fp16.onnx | 69.4 img/min | 1.8 GB | ![]() | ![]() | ![]() | ![]() |
| DepthPro-ONNX - model_q4f16.onnx | 52.9 img/min | 0.6 GB | ![]() | ![]() | ![]() | ![]() |
| DepthPro-ONNX - model.onnx | 44.0 img/min | 3.5 GB | ![]() | ![]() | ![]() | ![]() |
| DepthPro-ONNX - model_q4.onnx | 33.3 img/min | 0.7 GB | ![]() | ![]() | ![]() | ![]() |
| DepthPro-ONNX - model_quantized.onnx | 17.3 img/min | 0.9 GB | ![]() | ![]() | ![]() | ![]() |
| DepthPro-ONNX - model_uint8.onnx | 17.3 img/min | 0.9 GB | ![]() | ![]() | ![]() | ![]() |
| DepthPro-ONNX - model_int8.onnx | 15.9 img/min | 0.9 GB | ![]() | ![]() | ![]() | ![]() |
| DepthPro-ONNX - model_bnb4.onnx | 1.3 img/min | 0.6 GB | ![]() | ![]() | ![]() | ![]() |
## License / Usage
This ONNX version of DepthPro is licensed under the Apple Machine Learning Research Model License.
- Use is restricted to non-commercial scientific research and academic development.
- Redistribution is allowed only with this license included.
- Do not use Apple's trademarks, logos, or name to promote derivative models.
- Commercial use, product integration, or service deployment is not allowed.
## Model Tree
Jens-Duttke/DepthPro-ONNX-HighPerf is based on apple/DepthPro.