How to use from the Depth Pro library
# Download checkpoint
pip install huggingface-hub
huggingface-cli download --local-dir checkpoints aarondevstack/DepthPro-1024x1024-coreml
import depth_pro

# Load model and preprocessing transform
model, transform = depth_pro.create_model_and_transforms()
model.eval()

# Load and preprocess an image.
image, _, f_px = depth_pro.load_rgb("example.png")
image = transform(image)

# Run inference.
prediction = model.infer(image, f_px=f_px)

# Results: 1. Depth in meters
depth = prediction["depth"]
# Results: 2. Focal length in pixels
focallength_px = prediction["focallength_px"]
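The returned depth map is metric (meters); for a quick visual check it is common to view the inverse depth instead, since nearby detail is compressed in raw depth values. A minimal NumPy sketch of that conversion (the helper name and the demo array are illustrative, not part of the Depth Pro API):

```python
import numpy as np

def depth_to_inverse_gray(depth_m):
    """Map a metric depth map (meters) to an 8-bit inverse-depth image.

    Near objects become bright and far objects dark, the usual
    convention for depth-map visualization.
    """
    inv = 1.0 / np.clip(depth_m, 1e-3, None)   # guard against zeros
    inv = (inv - inv.min()) / (inv.max() - inv.min() + 1e-8)
    return np.round(inv * 255.0).astype(np.uint8)

# Synthetic 2x2 depth map standing in for prediction["depth"]:
gray = depth_to_inverse_gray(np.array([[1.0, 2.0], [4.0, 8.0]]))
```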

DepthPro CoreML (1024x1024 High-Resolution)

This repository contains the High-Resolution (1024x1024) version of the DepthPro model, optimized for CoreML.

DepthPro is a state-of-the-art monocular depth estimation model that provides sharp, metric-scale depth maps. This 1024px version is specifically designed for High-Quality 3D Exports where edge precision and fine detail preservation are critical.

🚀 Key Features

  • High Fidelity: Captures thin structures (threads, instruments, hair) with superior accuracy compared to the 512px version.
  • Optimized for Symmetric 3D Rendering: Well suited for symmetric stereo shifting in VR/AR, which minimizes visual discomfort.
  • VisionOS Ready: Fully compatible with Apple Vision Pro (optimized for GPU/CPU).
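The "symmetric shifting" above refers to splitting the stereo disparity evenly between the two eyes instead of shifting one view only. A minimal sketch of that computation under a pinhole model; the baseline and focal-length values below are illustrative assumptions, not parameters of this model:

```python
import numpy as np

def symmetric_shift(depth_m, focal_px, baseline_m=0.063):
    """Per-pixel horizontal shifts for the left and right eyes.

    disparity = f * B / Z; shifting each eye by half the disparity in
    opposite directions keeps the virtual image plane centered, which
    is gentler on the viewer than shifting a single eye by the full
    amount.
    """
    disparity = focal_px * baseline_m / np.clip(depth_m, 1e-3, None)
    return disparity / 2.0, -disparity / 2.0

# Two sample depths (1 m and 2 m) with an illustrative focal length:
left, right = symmetric_shift(np.array([1.0, 2.0]), focal_px=1000.0)
```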

📊 Performance & Requirements

| Metric           | Specification                                                |
| ---------------- | ------------------------------------------------------------ |
| Input Resolution | 1024 x 1024 pixels                                           |
| Compute Units    | GPU + CPU (recommended for stability)                        |
| Average Latency  | ~7.5 s per frame (on M2 Ultra / M3 Max)                      |
| Target Use Case  | Offline Video Conversion / High-Quality Spatial Video Export |

To ensure inference stability at this resolution, the model is configured to run on the GPU/CPU path rather than the Apple Neural Engine (ANE), avoiding the ANE's memory limits.
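Because the model returns metric depth together with the focal length in pixels, its output can be back-projected into a metric point cloud for 3D export. A minimal pinhole-camera sketch; placing the principal point at the image center is an assumption here, not something this model card specifies:

```python
import numpy as np

def backproject(depth_m, f_px):
    """Back-project an (H, W) metric depth map into an (H, W, 3) point
    cloud using a pinhole model with the principal point at the image
    center."""
    h, w = depth_m.shape
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth_m / f_px
    y = (v - cy) * depth_m / f_px
    return np.stack([x, y, depth_m], axis=-1)

# A 3x3 fronto-parallel plane at 2 m, with an illustrative focal length:
pts = backproject(np.full((3, 3), 2.0), f_px=1000.0)
```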

📦 Repository Contents

The repository contains the following core components:

  1. DepthPro_transform.mlpackage: Image preprocessing.
  2. DepthPro_encoder.mlpackage: Feature extraction (ViT-Large).
  3. DepthPro_decoder.mlpackage: Multiresolution fusion.
  4. DepthPro_depth.mlpackage: Final depth output and high-res feature generation.
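These four packages are intended to run as a sequential pipeline: transform → encoder → decoder → depth. A schematic of that dataflow with stubbed stages; the tensor shapes and stage signatures are illustrative assumptions, not the real .mlpackage interfaces (in an app, each stub would be a Core ML prediction call on the corresponding package):

```python
import numpy as np

# Stubs standing in for the four .mlpackage stages.
def transform(image):    # DepthPro_transform: preprocessing
    return image.astype(np.float32) / 255.0

def encoder(pixels):     # DepthPro_encoder: ViT-Large feature extraction
    return pixels.mean(axis=-1, keepdims=True)    # placeholder features

def decoder(features):   # DepthPro_decoder: multiresolution fusion
    return features                               # placeholder fusion

def depth_head(fused):   # DepthPro_depth: final depth output
    return fused[..., 0]                          # placeholder depth map

image = np.zeros((1024, 1024, 3), dtype=np.uint8)  # 1024x1024 RGB input
depth = depth_head(decoder(encoder(transform(image))))
```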

🛠 Usage with Swift Transformers

You can download and cache this model dynamically using swift-transformers:

let hub = Hub()
let modelDir = try await hub.snapshot(repoId: "aarondevstack/DepthPro-1024x1024-coreml")
// Load models from the downloaded directory