Performance-Optimized & Lightweight ONNX Version of DepthPro

This ONNX-based DepthPro model generates high-quality depth maps with minimal overhead. Depth values are encoded such that near points are bright and far points are dark, making the output directly usable for stereo and disparity-based applications without additional inversion or preprocessing. The model is optimized for efficient inference on standard hardware.

Key Features

  • Depth-only ONNX export: Significantly reduced model size while preserving full depth quality
  • Skips field-of-view calibration: Outputs raw predicted depth values without the post-processing step, avoiding normalization artifacts and computational overhead
  • Disparity-ready output: Compatible with stereo/disparity workflows out of the box; no conversion needed
  • FP16 weights: Optimized for GPU acceleration via DirectML for faster inference
  • Batch size 1: Benchmarks show single-image batches deliver optimal throughput; larger batches are slower
  • Opset 21: Uses modern ONNX operators for broader runtime optimization support
  • Aggressive graph optimization: Simplified model graph for reduced computation and faster loading
  • Fast inference: Minimal memory footprint and rapid depth map generation
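To illustrate what "disparity-ready" means in practice, here is a minimal NumPy sketch of naive stereo view synthesis driven directly by the model's output (higher value = closer = larger horizontal shift). `shift_view` and `max_shift` are illustrative names, not part of this model or any library; a production pipeline would also fill the disocclusion holes this forward warp leaves behind.

```python
import numpy as np

def shift_view(rgb, disparity, max_shift=24):
    """Synthesize a second-eye view by shifting each pixel horizontally
    in proportion to disparity (higher value = closer = larger shift).

    rgb:       (H, W, 3) uint8 source image
    disparity: (H, W) float values normalized to [0, 1]
    """
    h, w = disparity.shape
    shifts = np.round(disparity * max_shift).astype(np.int64)
    out = np.zeros_like(rgb)
    cols = np.arange(w)
    for y in range(h):
        # Near pixels move further; clip to stay inside the frame.
        x_new = np.clip(cols - shifts[y], 0, w - 1)
        out[y, x_new] = rgb[y, cols]  # last write wins on collisions
    return out
```

Because the model already outputs higher values for nearer points, the map can feed this kind of warp without the inversion step other depth models require.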

Technical Specifications

Property        Value
Input shape     (1, 3, 1536, 1536), NCHW
Input dtype     float16
Input range     [-1.0, 1.0] (normalized RGB)
Output shape    (1, 1536, 1536)
Output dtype    float16
Output values   Relative depth (higher = closer)
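The input spec above translates into a small preprocessing helper; this is a sketch (`preprocess` is an illustrative name) of the same uint8-to-float16 mapping the Quick Start performs inline.

```python
import numpy as np

def preprocess(rgb_u8):
    """Convert an (H, W, 3) uint8 RGB image into the tensor layout the
    spec table describes: float16, NCHW, values in [-1, 1]."""
    x = rgb_u8.astype(np.float32) / 127.5 - 1.0   # [0, 255] -> [-1, 1]
    x = x.transpose(2, 0, 1)[np.newaxis]          # HWC -> NCHW, batch of 1
    return x.astype(np.float16)
```

For this model, resize the image to 1536x1536 before calling the helper, since the input shape is fixed.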

Requirements

  • VRAM: ~5.2 GB
  • ONNX Runtime: 1.19.0 or higher
  • Python: 3.8 or higher

Quick Start

pip install onnxruntime-directml numpy opencv-python

import cv2
import numpy as np
import onnxruntime as ort

# Load model
session = ort.InferenceSession('depthpro_1536x1536_bs1_fp16_opset21_optimized.onnx', providers=['DmlExecutionProvider', 'CPUExecutionProvider'])
input_name, output_name = session.get_inputs()[0].name, session.get_outputs()[0].name

# Load & preprocess
img = cv2.cvtColor(cv2.imread('examples/sample1/source.jpg'), cv2.COLOR_BGR2RGB)
img = cv2.resize(img, (1536, 1536))
img = img.astype(np.float32) / 127.5 - 1.0                   # scale [0, 255] -> [-1, 1]
img = img.astype(np.float16).transpose(2, 0, 1)[np.newaxis]  # HWC -> NCHW, batch of 1

# Inference
depth = session.run([output_name], {input_name: img})[0].squeeze().astype(np.float32)

# Clip extreme values and normalize
depth = np.clip(np.nan_to_num(depth, nan=0.0), -1e3, 1e3)
depth_norm = (depth - depth.min()) / max(depth.max() - depth.min(), 1e-6)

# Save 8-bit PNG for smaller size
cv2.imwrite('depth_frame_0001.png', (depth_norm * 255).round().astype(np.uint8))

# Save 16-bit TIFF for higher precision (IMWRITE_TIFF_COMPRESSION_DEFLATE requires a recent OpenCV build)
cv2.imwrite('depth_frame_0001.tif', (depth_norm * 65535).round().astype(np.uint16), [cv2.IMWRITE_TIFF_COMPRESSION, cv2.IMWRITE_TIFF_COMPRESSION_DEFLATE])

print('Depth maps saved')

Benchmark: Speed, Size & Depth Map Quality

Benchmarked on an AMD Radeon RX 7900 XTX using ONNX Runtime v1.23.0 with DirectML.

DepthPro-based models WITHOUT Post-Processing

Model                Throughput     Model Size
apple/DepthPro-hf    1.5 img/min    1.8 GB
Owl3D Precision V2   9.6 img/min    1.2 GB
This Model           75.7 img/min   1.2 GB

DepthPro-based models WITH Post-Processing

DepthPro's post-processing step calibrates depth values using field-of-view information and normalizes the output. This can cause severe artifacts:

  • Crushed contrast: Extreme outlier depth values (e.g., 10,000 m instead of the typical ~130 m maximum observed across various scenes) cause normalization to compress useful depth information into a narrow range, mapping most pixels to extreme near values
  • Inconsistent results: These artifacts appear unpredictably, especially with quantized models, but also with full-precision versions
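One common mitigation for the crushed-contrast problem described above is percentile-clipped normalization instead of raw min-max scaling. The sketch below (the function name and percentile defaults are illustrative choices, not part of the model) shows the idea: a handful of extreme outliers can no longer define the normalization range.

```python
import numpy as np

def robust_normalize(depth, lo_pct=1.0, hi_pct=99.0):
    """Normalize to [0, 1] after clipping at percentiles, so a few extreme
    outliers (e.g. a spurious 10,000 m reading in a ~130 m scene) cannot
    compress the useful depth values into a narrow range."""
    lo, hi = np.percentile(depth, [lo_pct, hi_pct])
    clipped = np.clip(depth, lo, hi)
    return (clipped - lo) / max(hi - lo, 1e-6)
```

The percentile bounds trade off outlier rejection against losing genuine near/far extremes, so they may need tuning per scene.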

The models below use post-processing and may exhibit these issues depending on the scene:

Model                                Throughput     Model Size
apple/DepthPro-hf                    1.5 img/min    1.8 GB
DepthPro-ONNX - model_fp16.onnx      69.4 img/min   1.8 GB
DepthPro-ONNX - model_q4f16.onnx     52.9 img/min   0.6 GB
DepthPro-ONNX - model.onnx           44.0 img/min   3.5 GB
DepthPro-ONNX - model_q4.onnx        33.3 img/min   0.7 GB
DepthPro-ONNX - model_quantized.onnx 17.3 img/min   0.9 GB
DepthPro-ONNX - model_uint8.onnx     17.3 img/min   0.9 GB
DepthPro-ONNX - model_int8.onnx      15.9 img/min   0.9 GB
DepthPro-ONNX - model_bnb4.onnx      1.3 img/min    0.6 GB

License / Usage

This ONNX version of DepthPro is licensed under the Apple Machine Learning Research Model License.

  • Use is restricted to non-commercial scientific research and academic development.
  • Redistribution is allowed only with this license included.
  • Do not use Apple's trademarks, logos, or name to promote derivative models.
  • Commercial use, product integration, or service deployment is not allowed.

Model tree for Jens-Duttke/DepthPro-ONNX-HighPerf

Base model: apple/DepthPro (this model is one of its quantized derivatives)