Depth Anything V2 Large — SafeTensors

Depth Anything V2 (Large, ViT-L backbone) converted to SafeTensors format for safe, fast loading in robotic depth estimation pipelines. 335M parameters for high-quality monocular depth maps.

This model is part of the RobotFlowLabs model library, built for the ANIMA agentic robotics platform — a modular ROS2-native AI system that brings foundation model intelligence to real robots operating in the real world.

Why This Model Exists

Monocular depth estimation is fundamental to robotic navigation and manipulation — robots need to know how far away things are from a single camera. Depth Anything V2 produces state-of-the-art relative depth maps from a single image. The original weights are distributed as raw .pth files; we converted them to SafeTensors format for safe, zero-copy, memory-mapped loading.

Model Details

Property          | Value
------------------|--------------------------------------------
Architecture      | DPT head + ViT-Large encoder
Parameters        | 335M
Encoder           | ViT-L/14 (DINOv2-based)
Input Resolution  | Flexible (recommended 518×518)
Output            | Dense relative depth map
Training          | Synthetic + real depth labels (multi-stage)
Original Model    | depth-anything/Depth-Anything-V2-Large
License           | Apache-2.0

Included Files

depth-anything-v2-large/
├── model.safetensors          # 1.3 GB — Full model weights
└── README.md                  # This file

Quick Start

import cv2
import torch
from safetensors.torch import load_file
from depth_anything_v2.dpt import DepthAnythingV2

# Load SafeTensors weights
state_dict = load_file("model.safetensors")

# Build the ViT-L variant and load the weights into it
model = DepthAnythingV2(encoder='vitl', features=256, out_channels=[256, 512, 1024, 1024])
model.load_state_dict(state_dict)
model.to("cuda").eval()

# Predict depth; infer_image expects a BGR image as read by cv2
image = cv2.imread("example.jpg")
depth = model.infer_image(image)  # (H, W) relative depth map
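
Because the output is relative depth with no fixed units, a common first step is min-max normalizing it to an 8-bit image for quick inspection; a minimal numpy sketch (the helper name is illustrative):

```python
import numpy as np

def depth_to_uint8(depth):
    """Min-max normalize a relative depth map to an 8-bit grayscale
    image for visualization (relative depth has no fixed units)."""
    d = depth.astype(np.float32)
    d = (d - d.min()) / (d.max() - d.min() + 1e-8)
    return (d * 255.0).astype(np.uint8)
```

The result can be written out with `cv2.imwrite` or colormapped for easier reading of near/far structure.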

With Transformers

import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForDepthEstimation

processor = AutoImageProcessor.from_pretrained("depth-anything/Depth-Anything-V2-Large")
model = AutoModelForDepthEstimation.from_pretrained("depth-anything/Depth-Anything-V2-Large")
model.to("cuda").eval()

image = Image.open("example.jpg")
inputs = processor(images=image, return_tensors="pt").to("cuda")
with torch.no_grad():
    depth = model(**inputs).predicted_depth  # (batch, H', W')
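
Note that `predicted_depth` comes back at the processor's working resolution rather than the input image's; a small helper (name illustrative, assuming a `(batch, H', W')` tensor) to bilinearly resize it back:

```python
import torch
import torch.nn.functional as F

def resize_depth(depth, orig_h, orig_w):
    """Bilinearly resize a (batch, H', W') depth prediction back to
    the original image resolution."""
    return F.interpolate(
        depth.unsqueeze(1),            # add a channel dim for interpolate
        size=(orig_h, orig_w),
        mode="bilinear",
        align_corners=False,
    ).squeeze(1)                       # drop the channel dim again
```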

With FORGE (ANIMA Integration)

from forge.vision import VisionEncoderRegistry

depth_estimator = VisionEncoderRegistry.load("depth-anything-v2-large")
depth_map = depth_estimator(image_tensor)  # Relative depth map

Use Cases in ANIMA

Depth estimation is critical across ANIMA modules:

  • Obstacle Avoidance — Real-time depth maps for safe navigation
  • Grasp Planning — Estimate object distance for manipulation reach calculations
  • 3D Reconstruction — Dense depth for point cloud generation from a single camera
  • Safety Zones — Distance-based safety boundaries for human-robot collaboration
  • Path Planning — Identify traversable space and obstacle heights
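
For the point-cloud use case, back-projection follows the standard pinhole model: X = (u - cx)·Z/fx, Y = (v - cy)·Z/fy. A minimal numpy sketch (function name illustrative), assuming metric depth (relative depth must be calibrated first) and known intrinsics fx, fy, cx, cy:

```python
import numpy as np

def depth_to_pointcloud(depth, fx, fy, cx, cy):
    """Back-project a dense (H, W) metric depth map into an (H*W, 3)
    point cloud using the pinhole camera model."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)
```

In a ROS2 pipeline the intrinsics would typically come from the camera's CameraInfo message.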

Depth Anything V2 Family

Model                   | Params | Size   | Best For
------------------------|--------|--------|--------------------------
depth-anything-v2-large | 335M   | 1.3 GB | Highest quality depth
depth-anything-v2-small | 24.8M  | 95 MB  | Real-time edge deployment

Intended Use

Designed For

  • Monocular depth estimation for robotic navigation
  • Dense depth maps for manipulation planning
  • Point cloud generation from RGB cameras
  • Obstacle detection and distance estimation

Limitations

  • Produces relative (not metric) depth — requires calibration for absolute distances
  • Performance degrades on reflective, transparent, or textureless surfaces
  • Single-frame estimation — no temporal consistency for video
  • Inherits biases from training data distribution

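The first limitation is commonly handled with a least-squares scale-and-shift fit against a few known distances (e.g. from sparse LiDAR returns or objects of known size). A minimal numpy sketch (helper name illustrative; in practice the fit is often done in inverse-depth space, since the model's output is disparity-like):

```python
import numpy as np

def align_depth(relative, metric_samples, sample_mask):
    """Fit scale s and shift t so that s * relative + t best matches
    sparse metric measurements (least squares), then apply the fit to
    the whole relative depth map."""
    r = relative[sample_mask]
    A = np.stack([r, np.ones_like(r)], axis=1)  # columns: [relative, 1]
    (s, t), *_ = np.linalg.lstsq(A, metric_samples, rcond=None)
    return s * relative + t
```
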
Out of Scope

  • Safety-critical autonomous driving without additional validation
  • Medical depth estimation
  • Surveillance applications

Attribution

Citation

@article{yang2024depth_anything_v2,
  title={Depth Anything V2},
  author={Yang, Lihe and Kang, Bingyi and Huang, Zilong and Zhao, Zhen and Xu, Xiaogang and Feng, Jiashi and Zhao, Hengshuang},
  journal={arXiv preprint arXiv:2406.09414},
  year={2024}
}

Built with FORGE by RobotFlowLabs
Optimizing foundation models for real robots.
