# 🚀 Depth Anything 3 Command Line Interface ## 📋 Table of Contents - [📖 Overview](#overview) - [⚡ Quick Start](#quick-start) - [📚 Command Reference](#command-reference) - [🤖 auto - Auto Mode](#auto---auto-mode) - [🖼️ image - Single Image Processing](#image---single-image-processing) - [🗂️ images - Image Directory Processing](#images---image-directory-processing) - [🎬 video - Video Processing](#video---video-processing) - [📐 colmap - COLMAP Dataset Processing](#colmap---colmap-dataset-processing) - [🔧 backend - Backend Service](#backend---backend-service) - [🎨 gradio - Gradio Application](#gradio---gradio-application) - [🖼️ gallery - Gallery Server](#gallery---gallery-server) - [⚙️ Parameter Details](#parameter-details) - [💡 Usage Examples](#usage-examples) ## 📖 Overview The Depth Anything 3 CLI provides a comprehensive command-line toolkit supporting image depth estimation, video processing, COLMAP dataset handling, and web applications. The backend service enables cache model to GPU so that we do not need to reload model for each command. ## ⚡ Quick Start The CLI can run fully offline or connect to the backend for cached weights and task scheduling: ```bash # 🔧 Start backend service (optional, keeps model resident in GPU memory) da3 backend --model-dir depth-anything/DA3NESTED-GIANT-LARGE # 🚀 Use auto mode to process input da3 auto path/to/input --export-dir ./workspace/scene001 # ♻️ Reuse backend for next job da3 auto path/to/video.mp4 \ --export-dir ./workspace/scene002 \ --use-backend \ --backend-url http://localhost:8008 ``` Each export directory contains `scene.glb`, `scene.jpg`, and optional extras such as `depth_vis/` or `gs_video/` depending on the requested format. ## 📚 Command Reference ### 🤖 auto - Auto Mode Automatically detect input type and dispatch to the appropriate handler. **Usage:** ```bash da3 auto INPUT_PATH [OPTIONS] ``` **Input Type Detection:** - 🖼️ Single image file (.jpg, .png, .jpeg, .webp, .bmp, .tiff, .tif) - 📁 Image directory - 🎬 Video file (.mp4, .avi, .mov, .mkv, .flv, .wmv, .webm, .m4v) - 📐 COLMAP directory (containing `images/` and `sparse/` subdirectories) **Parameters:** | Parameter | Type | Default | Description | |-----------|------|---------|-------------| | `INPUT_PATH` | str | Required | Input path (image, directory, video, or COLMAP) | | `--model-dir` | str | Default model | Model directory path | | `--export-dir` | str | `debug` | Export directory | | `--export-format` | str | `glb` | Export format (supports `mini_npz`, `glb`, `feat_vis`, etc., can be combined with hyphens) | | `--device` | str | `cuda` | Device to use | | `--use-backend` | bool | `False` | Use backend service for inference | | `--backend-url` | str | `http://localhost:8008` | Backend service URL | | `--process-res` | int | `504` | Processing resolution | | `--process-res-method` | str | `upper_bound_resize` | Processing resolution method | | `--export-feat` | str | `""` | Export features from specified layers, comma-separated (e.g., `"0,1,2"`) | | `--auto-cleanup` | bool | `False` | Automatically clean export directory without confirmation | | `--fps` | float | `1.0` | [Video] Frame sampling FPS | | `--sparse-subdir` | str | `""` | [COLMAP] Sparse reconstruction subdirectory (e.g., `"0"` for `sparse/0/`) | | `--align-to-input-ext-scale` | bool | `True` | [COLMAP] Align prediction to input extrinsics scale | | `--use-ray-pose` | bool | `False` | Use ray-based pose estimation instead of camera decoder | | `--ref-view-strategy` | str | `saddle_balanced` | Reference view selection strategy: `first`, `middle`, `saddle_balanced`, `saddle_sim_range`. See [docs](funcs/ref_view_strategy.md) | | `--conf-thresh-percentile` | float | `40.0` | [GLB] Lower percentile for adaptive confidence threshold | | `--num-max-points` | int | `1000000` | [GLB] Maximum number of points in the point cloud | | `--show-cameras` | bool | `True` | [GLB] Show camera wireframes in the exported scene | | `--feat-vis-fps` | int | `15` | [FEAT_VIS] Frame rate for output video | **Examples:** ```bash # 🖼️ Auto-process an image da3 auto path/to/image.jpg --export-dir ./output # 🎬 Auto-process a video da3 auto path/to/video.mp4 --fps 2.0 --export-dir ./output # 🔧 Use backend service da3 auto path/to/input \ --export-format mini_npz-glb \ --use-backend \ --backend-url http://localhost:8008 \ --export-dir ./output ``` --- ### 🖼️ image - Single Image Processing Process a single image for camera pose and depth estimation. **Usage:** ```bash da3 image IMAGE_PATH [OPTIONS] ``` **Parameters:** | Parameter | Type | Default | Description | |-----------|------|---------|-------------| | `IMAGE_PATH` | str | Required | Input image file path | | `--model-dir` | str | Default model | Model directory path | | `--export-dir` | str | `debug` | Export directory | | `--export-format` | str | `glb` | Export format | | `--device` | str | `cuda` | Device to use | | `--use-backend` | bool | `False` | Use backend service for inference | | `--backend-url` | str | `http://localhost:8008` | Backend service URL | | `--process-res` | int | `504` | Processing resolution | | `--process-res-method` | str | `upper_bound_resize` | Processing resolution method | | `--export-feat` | str | `""` | Export feature layer indices (comma-separated) | | `--auto-cleanup` | bool | `False` | Automatically clean export directory | | `--use-ray-pose` | bool | `False` | Use ray-based pose estimation instead of camera decoder | | `--ref-view-strategy` | str | `saddle_balanced` | Reference view selection strategy. See [docs](funcs/ref_view_strategy.md) | | `--conf-thresh-percentile` | float | `40.0` | [GLB] Confidence threshold percentile | | `--num-max-points` | int | `1000000` | [GLB] Maximum number of points | | `--show-cameras` | bool | `True` | [GLB] Show cameras | | `--feat-vis-fps` | int | `15` | [FEAT_VIS] Video frame rate | **Examples:** ```bash # ✨ Basic usage da3 image path/to/image.png --export-dir ./output # ⚡ With backend acceleration da3 image path/to/image.png \ --use-backend \ --backend-url http://localhost:8008 \ --export-dir ./output # 🔍 Export feature visualization da3 image image.jpg \ --export-format feat_vis \ --export-feat "9,19,29,39" \ --export-dir ./results ``` --- ### 🗂️ images - Image Directory Processing Process a directory of images for batch depth estimation. **Usage:** ```bash da3 images IMAGES_DIR [OPTIONS] ``` **Parameters:** | Parameter | Type | Default | Description | |-----------|------|---------|-------------| | `IMAGES_DIR` | str | Required | Directory path containing images | | `--image-extensions` | str | `png,jpg,jpeg` | Image file extensions to process (comma-separated) | | `--model-dir` | str | Default model | Model directory path | | `--export-dir` | str | `debug` | Export directory | | `--export-format` | str | `glb` | Export format | | `--device` | str | `cuda` | Device to use | | `--use-backend` | bool | `False` | Use backend service for inference | | `--backend-url` | str | `http://localhost:8008` | Backend service URL | | `--process-res` | int | `504` | Processing resolution | | `--process-res-method` | str | `upper_bound_resize` | Processing resolution method | | `--export-feat` | str | `""` | Export feature layer indices | | `--auto-cleanup` | bool | `False` | Automatically clean export directory | | `--use-ray-pose` | bool | `False` | Use ray-based pose estimation instead of camera decoder | | `--ref-view-strategy` | str | `saddle_balanced` | Reference view selection strategy. See [docs](funcs/ref_view_strategy.md) | | `--conf-thresh-percentile` | float | `40.0` | [GLB] Confidence threshold percentile | | `--num-max-points` | int | `1000000` | [GLB] Maximum number of points | | `--show-cameras` | bool | `True` | [GLB] Show cameras | | `--feat-vis-fps` | int | `15` | [FEAT_VIS] Video frame rate | **Examples:** ```bash # 📁 Process directory (defaults to png/jpg/jpeg) da3 images ./image_folder --export-dir ./output # 🎯 Custom extensions da3 images ./dataset --image-extensions "png,jpg,webp" --export-dir ./output # 🔧 Use backend service da3 images ./dataset \ --use-backend \ --backend-url http://localhost:8008 \ --export-dir ./output ``` --- ### 🎬 video - Video Processing Process video by extracting frames for depth estimation. **Usage:** ```bash da3 video VIDEO_PATH [OPTIONS] ``` **Parameters:** | Parameter | Type | Default | Description | |-----------|------|---------|-------------| | `VIDEO_PATH` | str | Required | Input video file path | | `--fps` | float | `1.0` | Frame extraction sampling FPS | | `--model-dir` | str | Default model | Model directory path | | `--export-dir` | str | `debug` | Export directory | | `--export-format` | str | `glb` | Export format | | `--device` | str | `cuda` | Device to use | | `--use-backend` | bool | `False` | Use backend service for inference | | `--backend-url` | str | `http://localhost:8008` | Backend service URL | | `--process-res` | int | `504` | Processing resolution | | `--process-res-method` | str | `upper_bound_resize` | Processing resolution method | | `--export-feat` | str | `""` | Export feature layer indices | | `--auto-cleanup` | bool | `False` | Automatically clean export directory | | `--use-ray-pose` | bool | `False` | Use ray-based pose estimation instead of camera decoder | | `--ref-view-strategy` | str | `saddle_balanced` | Reference view selection strategy. See [docs](funcs/ref_view_strategy.md) | | `--conf-thresh-percentile` | float | `40.0` | [GLB] Confidence threshold percentile | | `--num-max-points` | int | `1000000` | [GLB] Maximum number of points | | `--show-cameras` | bool | `True` | [GLB] Show cameras | | `--feat-vis-fps` | int | `15` | [FEAT_VIS] Video frame rate | **Examples:** ```bash # ✨ Basic video processing da3 video path/to/video.mp4 --export-dir ./output # ⚙️ Control frame sampling and resolution da3 video path/to/video.mp4 \ --fps 2.0 \ --process-res 1024 \ --export-dir ./output # 🔧 Use backend service da3 video path/to/video.mp4 \ --use-backend \ --backend-url http://localhost:8008 \ --export-dir ./output ``` --- ### 📐 colmap - COLMAP Dataset Processing Run pose-conditioned depth estimation on COLMAP data. **Usage:** ```bash da3 colmap COLMAP_DIR [OPTIONS] ``` **Parameters:** | Parameter | Type | Default | Description | |-----------|------|---------|-------------| | `COLMAP_DIR` | str | Required | COLMAP directory containing `images/` and `sparse/` subdirectories | | `--sparse-subdir` | str | `""` | Sparse reconstruction subdirectory (e.g., `"0"` for `sparse/0/`) | | `--align-to-input-ext-scale` | bool | `True` | Align prediction to input extrinsics scale | | `--model-dir` | str | Default model | Model directory path | | `--export-dir` | str | `debug` | Export directory | | `--export-format` | str | `glb` | Export format | | `--device` | str | `cuda` | Device to use | | `--use-backend` | bool | `False` | Use backend service for inference | | `--backend-url` | str | `http://localhost:8008` | Backend service URL | | `--process-res` | int | `504` | Processing resolution | | `--process-res-method` | str | `upper_bound_resize` | Processing resolution method | | `--export-feat` | str | `""` | Export feature layer indices | | `--auto-cleanup` | bool | `False` | Automatically clean export directory | | `--use-ray-pose` | bool | `False` | Use ray-based pose estimation instead of camera decoder | | `--ref-view-strategy` | str | `saddle_balanced` | Reference view selection strategy. See [docs](funcs/ref_view_strategy.md) | | `--conf-thresh-percentile` | float | `40.0` | [GLB] Confidence threshold percentile | | `--num-max-points` | int | `1000000` | [GLB] Maximum number of points | | `--show-cameras` | bool | `True` | [GLB] Show cameras | | `--feat-vis-fps` | int | `15` | [FEAT_VIS] Video frame rate | **Examples:** ```bash # 📐 Process COLMAP dataset da3 colmap ./colmap_dataset --export-dir ./output # 🎯 Use specific sparse subdirectory and align scale da3 colmap ./colmap_dataset \ --sparse-subdir 0 \ --align-to-input-ext-scale \ --export-dir ./output # 🔧 Use backend service da3 colmap ./colmap_dataset \ --use-backend \ --backend-url http://localhost:8008 \ --export-dir ./output ``` --- ### 🔧 backend - Backend Service Start model backend service with integrated gallery. **Usage:** ```bash da3 backend [OPTIONS] ``` **Parameters:** | Parameter | Type | Default | Description | |-----------|------|---------|-------------| | `--model-dir` | str | Default model | Model directory path | | `--device` | str | `cuda` | Device to use | | `--host` | str | `127.0.0.1` | Host address to bind to | | `--port` | int | `8008` | Port number to bind to | | `--gallery-dir` | str | Default gallery dir | Gallery directory path (optional) | **Features:** - 🎯 Keeps model resident in GPU memory - 🔌 Provides REST inference API - 📊 Integrated dashboard and status monitoring - 🖼️ Optional gallery browser (if `--gallery-dir` is provided) **Available Endpoints:** - 🏠 `/` - Home page - 📊 `/dashboard` - Dashboard - ✅ `/status` - API status - 🖼️ `/gallery/` - Gallery browser (if enabled) **Examples:** ```bash # 🚀 Basic backend service da3 backend --model-dir depth-anything/DA3NESTED-GIANT-LARGE # 🖼️ Backend with gallery da3 backend \ --model-dir depth-anything/DA3NESTED-GIANT-LARGE \ --device cuda \ --host 0.0.0.0 \ --port 8008 \ --gallery-dir ./workspace # 💻 Use CPU da3 backend --model-dir depth-anything/DA3NESTED-GIANT-LARGE --device cpu ``` --- ### 🎨 gradio - Gradio Application Launch Depth Anything 3 Gradio interactive web application. **Usage:** ```bash da3 gradio [OPTIONS] ``` **Parameters:** | Parameter | Type | Default | Description | |-----------|------|---------|-------------| | `--model-dir` | str | Required | Model directory path | | `--workspace-dir` | str | Required | Workspace directory path | | `--gallery-dir` | str | Required | Gallery directory path | | `--host` | str | `127.0.0.1` | Host address to bind to | | `--port` | int | `7860` | Port number to bind to | | `--share` | bool | `False` | Create a public link | | `--debug` | bool | `False` | Enable debug mode | | `--cache-examples` | bool | `False` | Pre-cache all example scenes at startup | | `--cache-gs-tag` | str | `""` | Tag to match scene names for high-res+3DGS caching | **Examples:** ```bash # 🎨 Basic Gradio application da3 gradio \ --model-dir depth-anything/DA3NESTED-GIANT-LARGE \ --workspace-dir ./workspace \ --gallery-dir ./gallery # 🌐 Enable sharing and debug da3 gradio \ --model-dir depth-anything/DA3NESTED-GIANT-LARGE \ --workspace-dir ./workspace \ --gallery-dir ./gallery \ --share \ --debug # ⚡ Pre-cache examples da3 gradio \ --model-dir depth-anything/DA3NESTED-GIANT-LARGE \ --workspace-dir ./workspace \ --gallery-dir ./gallery \ --cache-examples \ --cache-gs-tag "dl3dv" ``` --- ### 🖼️ gallery - Gallery Server Launch standalone Depth Anything 3 Gallery server. **Usage:** ```bash da3 gallery [OPTIONS] ``` **Parameters:** | Parameter | Type | Default | Description | |-----------|------|---------|-------------| | `--gallery-dir` | str | Default gallery dir | Gallery root directory | | `--host` | str | `127.0.0.1` | Host address to bind to | | `--port` | int | `8007` | Port number to bind to | | `--open-browser` | bool | `False` | Open browser after launch | **Note:** The gallery expects each scene folder to contain at least `scene.glb` and `scene.jpg`, with optional subfolders such as `depth_vis/` or `gs_video/`. **Examples:** ```bash # 🖼️ Basic gallery server da3 gallery --gallery-dir ./workspace # 🌐 Custom host and port da3 gallery \ --gallery-dir ./workspace \ --host 0.0.0.0 \ --port 8007 # 🚀 Auto-open browser da3 gallery --gallery-dir ./workspace --open-browser ``` --- ## ⚙️ Parameter Details ### 🔧 Common Parameters - **`--export-dir`**: Output directory, defaults to `debug` - **`--export-format`**: Export format, supports combining multiple formats with hyphens: - 📦 `mini_npz`: Compressed NumPy format - 🎨 `glb`: glTF binary format (3D scene) - 🔍 `feat_vis`: Feature visualization - Example: `mini_npz-glb` exports both formats - **`--process-res`** / **`--process-res-method`**: Control preprocessing resolution strategy - `process-res`: Target resolution (default 504) - `process-res-method`: Resize method (default `upper_bound_resize`) - **`--auto-cleanup`**: Remove existing export directory without confirmation - **`--use-backend`** / **`--backend-url`**: Reuse running backend service - ⚡ Reduces model loading time - 🌐 Supports distributed processing - **`--export-feat`**: Layer indices for exporting intermediate features (comma-separated) - Example: `"9,19,29,39"` ### 🎨 GLB Export Parameters - **`--conf-thresh-percentile`**: Lower percentile for adaptive confidence threshold (default 40.0) - Used to filter low-confidence points - **`--num-max-points`**: Maximum number of points in point cloud (default 1,000,000) - Controls output file size and performance - **`--show-cameras`**: Show camera wireframes in exported scene (default True) ### 🔍 Feature Visualization Parameters - **`--feat-vis-fps`**: Frame rate for feature visualization output video (default 15) ### 🎬 Video-Specific Parameters - **`--fps`**: Video frame extraction sampling rate (default 1.0 FPS) - Higher values extract more frames ### 📐 COLMAP-Specific Parameters - **`--sparse-subdir`**: Sparse reconstruction subdirectory - Empty string uses `sparse/` directory - `"0"` uses `sparse/0/` directory - **`--align-to-input-ext-scale`**: Align prediction to input extrinsics scale (default True) - Ensures depth estimation is consistent with COLMAP scale --- ## 💡 Usage Examples ### 1️⃣ Basic Workflow ```bash # 🔧 Start backend service da3 backend --model-dir depth-anything/DA3NESTED-GIANT-LARGE --host 0.0.0.0 --port 8008 # 🖼️ Process single image da3 image image.jpg --export-dir ./output1 --use-backend # 🎬 Process video da3 video video.mp4 --fps 2.0 --export-dir ./output2 --use-backend # 📐 Process COLMAP dataset da3 colmap ./colmap_data --export-dir ./output3 --use-backend ``` ### 2️⃣ Using Auto Mode ```bash # 🤖 Auto-detect and process da3 auto ./unknown_input --export-dir ./output # ⚡ With backend acceleration da3 auto ./unknown_input \ --use-backend \ --backend-url http://localhost:8008 \ --export-dir ./output ``` ### 3️⃣ Multi-Format Export ```bash # 📦 Export both NPZ and GLB formats da3 auto assets/examples/SOH \ --export-format mini_npz-glb \ --export-dir ./workspace/soh # 🔍 Export feature visualization da3 image image.jpg \ --export-format feat_vis \ --export-feat "9,19,29,39" \ --export-dir ./results ``` ### 4️⃣ Advanced Configuration ```bash # ⚙️ Custom resolution and point cloud density da3 image image.jpg \ --process-res 1024 \ --num-max-points 2000000 \ --conf-thresh-percentile 30.0 \ --export-dir ./output # 📐 COLMAP advanced options da3 colmap ./colmap_data \ --sparse-subdir 0 \ --align-to-input-ext-scale \ --process-res 756 \ --export-dir ./output ``` ### 5️⃣ Batch Processing Workflow ```bash # 🔧 Start backend da3 backend \ --model-dir depth-anything/DA3NESTED-GIANT-LARGE \ --device cuda \ --host 0.0.0.0 \ --port 8008 \ --gallery-dir ./workspace # 🔄 Batch process multiple scenes for scene in scene1 scene2 scene3; do da3 auto ./data/$scene \ --export-dir ./workspace/$scene \ --use-backend \ --auto-cleanup done # 🖼️ Launch gallery to view results da3 gallery --gallery-dir ./workspace --open-browser ``` ### 6️⃣ Web Applications ```bash # 🎨 Launch Gradio application da3 gradio \ --model-dir depth-anything/DA3NESTED-GIANT-LARGE \ --workspace-dir workspace/gradio \ --gallery-dir ./gallery \ --host 0.0.0.0 \ --port 7860 \ --share ``` ### 7️⃣ Transformer Feature Visualization ```bash # 🔍 Export Transformer features # 📦 Combined with numerical output da3 auto video.mp4 \ --export-format glb-feat_vis \ --export-feat "11,21,31" \ --export-dir ./debug \ --use-backend ``` --- ## 📝 Notes 1. **🔧 Backend Service**: Recommended for processing multiple tasks to improve efficiency 2. **💾 GPU Memory**: Be mindful of GPU memory usage when processing high-resolution inputs 3. **📁 Export Directory**: Use `--auto-cleanup` to avoid manual confirmation for deletion 4. **🔀 Format Combination**: Multiple export formats can be combined with hyphens (e.g., `mini_npz-glb-feat_vis`) 5. **📐 COLMAP Data**: Ensure COLMAP directory structure is correct (contains `images/` and `sparse/` subdirectories) --- ## ❓ Getting Help View detailed help for any command: ```bash # 📖 View main help da3 --help # 🔍 View specific command help da3 auto --help da3 image --help da3 backend --help ```