Spaces:

Aedelon
/

awesome-depth-anything-3

Running

App Files Files Community

awesome-depth-anything-3 / docs /CLI.md

Delanoe Pirard

Deploy to HuggingFace Spaces

18b382b 10 days ago

preview code

raw

history blame

21.4 kB

🚀 Depth Anything 3 Command Line Interface

📋 Table of Contents

📖 Overview
⚡ Quick Start
📚 Command Reference
⚙️ Parameter Details
💡 Usage Examples

📖 Overview

The Depth Anything 3 CLI provides a comprehensive command-line toolkit supporting image depth estimation, video processing, COLMAP dataset handling, and web applications.

The backend service enables cache model to GPU so that we do not need to reload model for each command.

⚡ Quick Start

The CLI can run fully offline or connect to the backend for cached weights and task scheduling:

# 🔧 Start backend service (optional, keeps model resident in GPU memory)
da3 backend --model-dir depth-anything/DA3NESTED-GIANT-LARGE

# 🚀 Use auto mode to process input
da3 auto path/to/input --export-dir ./workspace/scene001

# ♻️ Reuse backend for next job
da3 auto path/to/video.mp4 \
    --export-dir ./workspace/scene002 \
    --use-backend \
    --backend-url http://localhost:8008

Each export directory contains scene.glb, scene.jpg, and optional extras such as depth_vis/ or gs_video/ depending on the requested format.

📚 Command Reference

🤖 auto - Auto Mode

Automatically detect input type and dispatch to the appropriate handler.

Usage:

da3 auto INPUT_PATH [OPTIONS]

Input Type Detection:

🖼️ Single image file (.jpg, .png, .jpeg, .webp, .bmp, .tiff, .tif)
📁 Image directory
🎬 Video file (.mp4, .avi, .mov, .mkv, .flv, .wmv, .webm, .m4v)
📐 COLMAP directory (containing images/ and sparse/ subdirectories)

Parameters:

Parameter	Type	Default	Description
`INPUT_PATH`	str	Required	Input path (image, directory, video, or COLMAP)
`--model-dir`	str	Default model	Model directory path
`--export-dir`	str	`debug`	Export directory
`--export-format`	str	`glb`	Export format (supports `mini_npz`, `glb`, `feat_vis`, etc., can be combined with hyphens)
`--device`	str	`cuda`	Device to use
`--use-backend`	bool	`False`	Use backend service for inference
`--backend-url`	str	`http://localhost:8008`	Backend service URL
`--process-res`	int	`504`	Processing resolution
`--process-res-method`	str	`upper_bound_resize`	Processing resolution method
`--export-feat`	str	`""`	Export features from specified layers, comma-separated (e.g., `"0,1,2"`)
`--auto-cleanup`	bool	`False`	Automatically clean export directory without confirmation
`--fps`	float	`1.0`	[Video] Frame sampling FPS
`--sparse-subdir`	str	`""`	[COLMAP] Sparse reconstruction subdirectory (e.g., `"0"` for `sparse/0/`)
`--align-to-input-ext-scale`	bool	`True`	[COLMAP] Align prediction to input extrinsics scale
`--use-ray-pose`	bool	`False`	Use ray-based pose estimation instead of camera decoder
`--ref-view-strategy`	str	`saddle_balanced`	Reference view selection strategy: `first`, `middle`, `saddle_balanced`, `saddle_sim_range`. See docs
`--conf-thresh-percentile`	float	`40.0`	[GLB] Lower percentile for adaptive confidence threshold
`--num-max-points`	int	`1000000`	[GLB] Maximum number of points in the point cloud
`--show-cameras`	bool	`True`	[GLB] Show camera wireframes in the exported scene
`--feat-vis-fps`	int	`15`	[FEAT_VIS] Frame rate for output video

Examples:

# 🖼️ Auto-process an image
da3 auto path/to/image.jpg --export-dir ./output

# 🎬 Auto-process a video
da3 auto path/to/video.mp4 --fps 2.0 --export-dir ./output

# 🔧 Use backend service
da3 auto path/to/input \
    --export-format mini_npz-glb \
    --use-backend \
    --backend-url http://localhost:8008 \
    --export-dir ./output

🖼️ image - Single Image Processing

Process a single image for camera pose and depth estimation.

Usage:

da3 image IMAGE_PATH [OPTIONS]

Parameters:

Parameter	Type	Default	Description
`IMAGE_PATH`	str	Required	Input image file path
`--model-dir`	str	Default model	Model directory path
`--export-dir`	str	`debug`	Export directory
`--export-format`	str	`glb`	Export format
`--device`	str	`cuda`	Device to use
`--use-backend`	bool	`False`	Use backend service for inference
`--backend-url`	str	`http://localhost:8008`	Backend service URL
`--process-res`	int	`504`	Processing resolution
`--process-res-method`	str	`upper_bound_resize`	Processing resolution method
`--export-feat`	str	`""`	Export feature layer indices (comma-separated)
`--auto-cleanup`	bool	`False`	Automatically clean export directory
`--use-ray-pose`	bool	`False`	Use ray-based pose estimation instead of camera decoder
`--ref-view-strategy`	str	`saddle_balanced`	Reference view selection strategy. See docs
`--conf-thresh-percentile`	float	`40.0`	[GLB] Confidence threshold percentile
`--num-max-points`	int	`1000000`	[GLB] Maximum number of points
`--show-cameras`	bool	`True`	[GLB] Show cameras
`--feat-vis-fps`	int	`15`	[FEAT_VIS] Video frame rate

Examples:

# ✨ Basic usage
da3 image path/to/image.png --export-dir ./output

# ⚡ With backend acceleration
da3 image path/to/image.png \
    --use-backend \
    --backend-url http://localhost:8008 \
    --export-dir ./output

# 🔍 Export feature visualization
da3 image image.jpg \
    --export-format feat_vis \
    --export-feat "9,19,29,39" \
    --export-dir ./results

🗂️ images - Image Directory Processing

Process a directory of images for batch depth estimation.

Usage:

da3 images IMAGES_DIR [OPTIONS]

Parameters:

Parameter	Type	Default	Description
`IMAGES_DIR`	str	Required	Directory path containing images
`--image-extensions`	str	`png,jpg,jpeg`	Image file extensions to process (comma-separated)
`--model-dir`	str	Default model	Model directory path
`--export-dir`	str	`debug`	Export directory
`--export-format`	str	`glb`	Export format
`--device`	str	`cuda`	Device to use
`--use-backend`	bool	`False`	Use backend service for inference
`--backend-url`	str	`http://localhost:8008`	Backend service URL
`--process-res`	int	`504`	Processing resolution
`--process-res-method`	str	`upper_bound_resize`	Processing resolution method
`--export-feat`	str	`""`	Export feature layer indices
`--auto-cleanup`	bool	`False`	Automatically clean export directory
`--use-ray-pose`	bool	`False`	Use ray-based pose estimation instead of camera decoder
`--ref-view-strategy`	str	`saddle_balanced`	Reference view selection strategy. See docs
`--conf-thresh-percentile`	float	`40.0`	[GLB] Confidence threshold percentile
`--num-max-points`	int	`1000000`	[GLB] Maximum number of points
`--show-cameras`	bool	`True`	[GLB] Show cameras
`--feat-vis-fps`	int	`15`	[FEAT_VIS] Video frame rate

Examples:

# 📁 Process directory (defaults to png/jpg/jpeg)
da3 images ./image_folder --export-dir ./output

# 🎯 Custom extensions
da3 images ./dataset --image-extensions "png,jpg,webp" --export-dir ./output

# 🔧 Use backend service
da3 images ./dataset \
    --use-backend \
    --backend-url http://localhost:8008 \
    --export-dir ./output

🎬 video - Video Processing

Process video by extracting frames for depth estimation.

Usage:

da3 video VIDEO_PATH [OPTIONS]

Parameters:

Parameter	Type	Default	Description
`VIDEO_PATH`	str	Required	Input video file path
`--fps`	float	`1.0`	Frame extraction sampling FPS
`--model-dir`	str	Default model	Model directory path
`--export-dir`	str	`debug`	Export directory
`--export-format`	str	`glb`	Export format
`--device`	str	`cuda`	Device to use
`--use-backend`	bool	`False`	Use backend service for inference
`--backend-url`	str	`http://localhost:8008`	Backend service URL
`--process-res`	int	`504`	Processing resolution
`--process-res-method`	str	`upper_bound_resize`	Processing resolution method
`--export-feat`	str	`""`	Export feature layer indices
`--auto-cleanup`	bool	`False`	Automatically clean export directory
`--use-ray-pose`	bool	`False`	Use ray-based pose estimation instead of camera decoder
`--ref-view-strategy`	str	`saddle_balanced`	Reference view selection strategy. See docs
`--conf-thresh-percentile`	float	`40.0`	[GLB] Confidence threshold percentile
`--num-max-points`	int	`1000000`	[GLB] Maximum number of points
`--show-cameras`	bool	`True`	[GLB] Show cameras
`--feat-vis-fps`	int	`15`	[FEAT_VIS] Video frame rate

Examples:

# ✨ Basic video processing
da3 video path/to/video.mp4 --export-dir ./output

# ⚙️ Control frame sampling and resolution
da3 video path/to/video.mp4 \
    --fps 2.0 \
    --process-res 1024 \
    --export-dir ./output

# 🔧 Use backend service
da3 video path/to/video.mp4 \
    --use-backend \
    --backend-url http://localhost:8008 \
    --export-dir ./output

📐 colmap - COLMAP Dataset Processing

Run pose-conditioned depth estimation on COLMAP data.

Usage:

da3 colmap COLMAP_DIR [OPTIONS]

Parameters:

Parameter	Type	Default	Description
`COLMAP_DIR`	str	Required	COLMAP directory containing `images/` and `sparse/` subdirectories
`--sparse-subdir`	str	`""`	Sparse reconstruction subdirectory (e.g., `"0"` for `sparse/0/`)
`--align-to-input-ext-scale`	bool	`True`	Align prediction to input extrinsics scale
`--model-dir`	str	Default model	Model directory path
`--export-dir`	str	`debug`	Export directory
`--export-format`	str	`glb`	Export format
`--device`	str	`cuda`	Device to use
`--use-backend`	bool	`False`	Use backend service for inference
`--backend-url`	str	`http://localhost:8008`	Backend service URL
`--process-res`	int	`504`	Processing resolution
`--process-res-method`	str	`upper_bound_resize`	Processing resolution method
`--export-feat`	str	`""`	Export feature layer indices
`--auto-cleanup`	bool	`False`	Automatically clean export directory
`--use-ray-pose`	bool	`False`	Use ray-based pose estimation instead of camera decoder
`--ref-view-strategy`	str	`saddle_balanced`	Reference view selection strategy. See docs
`--conf-thresh-percentile`	float	`40.0`	[GLB] Confidence threshold percentile
`--num-max-points`	int	`1000000`	[GLB] Maximum number of points
`--show-cameras`	bool	`True`	[GLB] Show cameras
`--feat-vis-fps`	int	`15`	[FEAT_VIS] Video frame rate

Examples:

# 📐 Process COLMAP dataset
da3 colmap ./colmap_dataset --export-dir ./output

# 🎯 Use specific sparse subdirectory and align scale
da3 colmap ./colmap_dataset \
    --sparse-subdir 0 \
    --align-to-input-ext-scale \
    --export-dir ./output

# 🔧 Use backend service
da3 colmap ./colmap_dataset \
    --use-backend \
    --backend-url http://localhost:8008 \
    --export-dir ./output

🔧 backend - Backend Service

Start model backend service with integrated gallery.

Usage:

da3 backend [OPTIONS]

Parameters:

Parameter	Type	Default	Description
`--model-dir`	str	Default model	Model directory path
`--device`	str	`cuda`	Device to use
`--host`	str	`127.0.0.1`	Host address to bind to
`--port`	int	`8008`	Port number to bind to
`--gallery-dir`	str	Default gallery dir	Gallery directory path (optional)

Features:

🎯 Keeps model resident in GPU memory
🔌 Provides REST inference API
📊 Integrated dashboard and status monitoring
🖼️ Optional gallery browser (if --gallery-dir is provided)

Available Endpoints:

🏠 / - Home page
📊 /dashboard - Dashboard
✅ /status - API status
🖼️ /gallery/ - Gallery browser (if enabled)

Examples:

# 🚀 Basic backend service
da3 backend --model-dir depth-anything/DA3NESTED-GIANT-LARGE

# 🖼️ Backend with gallery
da3 backend \
    --model-dir depth-anything/DA3NESTED-GIANT-LARGE \
    --device cuda \
    --host 0.0.0.0 \
    --port 8008 \
    --gallery-dir ./workspace

# 💻 Use CPU
da3 backend --model-dir depth-anything/DA3NESTED-GIANT-LARGE --device cpu

🎨 gradio - Gradio Application

Launch Depth Anything 3 Gradio interactive web application.

Usage:

da3 gradio [OPTIONS]

Parameters:

Parameter	Type	Default	Description
`--model-dir`	str	Required	Model directory path
`--workspace-dir`	str	Required	Workspace directory path
`--gallery-dir`	str	Required	Gallery directory path
`--host`	str	`127.0.0.1`	Host address to bind to
`--port`	int	`7860`	Port number to bind to
`--share`	bool	`False`	Create a public link
`--debug`	bool	`False`	Enable debug mode
`--cache-examples`	bool	`False`	Pre-cache all example scenes at startup
`--cache-gs-tag`	str	`""`	Tag to match scene names for high-res+3DGS caching

Examples:

# 🎨 Basic Gradio application
da3 gradio \
    --model-dir depth-anything/DA3NESTED-GIANT-LARGE \
    --workspace-dir ./workspace \
    --gallery-dir ./gallery

# 🌐 Enable sharing and debug
da3 gradio \
    --model-dir depth-anything/DA3NESTED-GIANT-LARGE \
    --workspace-dir ./workspace \
    --gallery-dir ./gallery \
    --share \
    --debug

# ⚡ Pre-cache examples
da3 gradio \
    --model-dir depth-anything/DA3NESTED-GIANT-LARGE \
    --workspace-dir ./workspace \
    --gallery-dir ./gallery \
    --cache-examples \
    --cache-gs-tag "dl3dv"

🖼️ gallery - Gallery Server

Launch standalone Depth Anything 3 Gallery server.

Usage:

da3 gallery [OPTIONS]

Parameters:

Parameter	Type	Default	Description
`--gallery-dir`	str	Default gallery dir	Gallery root directory
`--host`	str	`127.0.0.1`	Host address to bind to
`--port`	int	`8007`	Port number to bind to
`--open-browser`	bool	`False`	Open browser after launch

Note: The gallery expects each scene folder to contain at least scene.glb and scene.jpg, with optional subfolders such as depth_vis/ or gs_video/.

Examples:

# 🖼️ Basic gallery server
da3 gallery --gallery-dir ./workspace

# 🌐 Custom host and port
da3 gallery \
    --gallery-dir ./workspace \
    --host 0.0.0.0 \
    --port 8007

# 🚀 Auto-open browser
da3 gallery --gallery-dir ./workspace --open-browser

⚙️ Parameter Details

🔧 Common Parameters

--export-dir: Output directory, defaults to debug
--export-format: Export format, supports combining multiple formats with hyphens:
- 📦 mini_npz: Compressed NumPy format
- 🎨 glb: glTF binary format (3D scene)
- 🔍 feat_vis: Feature visualization
- Example: mini_npz-glb exports both formats
--process-res / --process-res-method: Control preprocessing resolution strategy
- process-res: Target resolution (default 504)
- process-res-method: Resize method (default upper_bound_resize)
--auto-cleanup: Remove existing export directory without confirmation
--use-backend / --backend-url: Reuse running backend service
- ⚡ Reduces model loading time
- 🌐 Supports distributed processing
--export-feat: Layer indices for exporting intermediate features (comma-separated)
- Example: "9,19,29,39"

🎨 GLB Export Parameters

--conf-thresh-percentile: Lower percentile for adaptive confidence threshold (default 40.0)
- Used to filter low-confidence points
--num-max-points: Maximum number of points in point cloud (default 1,000,000)
- Controls output file size and performance
--show-cameras: Show camera wireframes in exported scene (default True)

🔍 Feature Visualization Parameters

--feat-vis-fps: Frame rate for feature visualization output video (default 15)

🎬 Video-Specific Parameters

--fps: Video frame extraction sampling rate (default 1.0 FPS)
- Higher values extract more frames

📐 COLMAP-Specific Parameters

--sparse-subdir: Sparse reconstruction subdirectory
- Empty string uses sparse/ directory
- "0" uses sparse/0/ directory
--align-to-input-ext-scale: Align prediction to input extrinsics scale (default True)
- Ensures depth estimation is consistent with COLMAP scale

💡 Usage Examples

1️⃣ Basic Workflow

# 🔧 Start backend service
da3 backend --model-dir depth-anything/DA3NESTED-GIANT-LARGE --host 0.0.0.0 --port 8008

# 🖼️ Process single image
da3 image image.jpg --export-dir ./output1 --use-backend

# 🎬 Process video
da3 video video.mp4 --fps 2.0 --export-dir ./output2 --use-backend

# 📐 Process COLMAP dataset
da3 colmap ./colmap_data --export-dir ./output3 --use-backend

2️⃣ Using Auto Mode

# 🤖 Auto-detect and process
da3 auto ./unknown_input --export-dir ./output

# ⚡ With backend acceleration
da3 auto ./unknown_input \
    --use-backend \
    --backend-url http://localhost:8008 \
    --export-dir ./output

3️⃣ Multi-Format Export

# 📦 Export both NPZ and GLB formats
da3 auto assets/examples/SOH \
    --export-format mini_npz-glb \
    --export-dir ./workspace/soh

# 🔍 Export feature visualization
da3 image image.jpg \
    --export-format feat_vis \
    --export-feat "9,19,29,39" \
    --export-dir ./results

4️⃣ Advanced Configuration

# ⚙️ Custom resolution and point cloud density
da3 image image.jpg \
    --process-res 1024 \
    --num-max-points 2000000 \
    --conf-thresh-percentile 30.0 \
    --export-dir ./output

# 📐 COLMAP advanced options
da3 colmap ./colmap_data \
    --sparse-subdir 0 \
    --align-to-input-ext-scale \
    --process-res 756 \
    --export-dir ./output

5️⃣ Batch Processing Workflow

# 🔧 Start backend
da3 backend \
    --model-dir depth-anything/DA3NESTED-GIANT-LARGE \
    --device cuda \
    --host 0.0.0.0 \
    --port 8008 \
    --gallery-dir ./workspace

# 🔄 Batch process multiple scenes
for scene in scene1 scene2 scene3; do
    da3 auto ./data/$scene \
        --export-dir ./workspace/$scene \
        --use-backend \
        --auto-cleanup
done

# 🖼️ Launch gallery to view results
da3 gallery --gallery-dir ./workspace --open-browser

6️⃣ Web Applications

# 🎨 Launch Gradio application
da3 gradio \
    --model-dir depth-anything/DA3NESTED-GIANT-LARGE \
    --workspace-dir workspace/gradio \
    --gallery-dir ./gallery \
    --host 0.0.0.0 \
    --port 7860 \
    --share

7️⃣ Transformer Feature Visualization

# 🔍 Export Transformer features
# 📦 Combined with numerical output
da3 auto video.mp4 \
    --export-format glb-feat_vis \
    --export-feat "11,21,31" \
    --export-dir ./debug \
    --use-backend

📝 Notes

🔧 Backend Service: Recommended for processing multiple tasks to improve efficiency
💾 GPU Memory: Be mindful of GPU memory usage when processing high-resolution inputs
📁 Export Directory: Use --auto-cleanup to avoid manual confirmation for deletion
🔀 Format Combination: Multiple export formats can be combined with hyphens (e.g., mini_npz-glb-feat_vis)
📐 COLMAP Data: Ensure COLMAP directory structure is correct (contains images/ and sparse/ subdirectories)

❓ Getting Help

View detailed help for any command:

# 📖 View main help
da3 --help

# 🔍 View specific command help
da3 auto --help
da3 image --help
da3 backend --help