# DA3 Model Selection Guide

## Overview

DA3 provides multiple model series, each optimized for different use cases. This guide helps you choose the right model for YLFF workflows.

## Model Series
### DA3 Main Series

**Models:** DA3-GIANT, DA3-LARGE, DA3-BASE, DA3-SMALL

**Capabilities:**
- ✅ Monocular depth estimation
- ✅ Multi-view depth estimation
- ✅ Pose-conditioned depth estimation
- ✅ Camera pose estimation
- ✅ 3D Gaussian estimation

**Characteristics:**
- Unified depth-ray representation
- Not metric (relative depth; requires scale alignment)
- Varying sizes: Giant (best quality) → Small (fastest)

**Best For:**
- General-purpose visual geometry tasks
- When you need pose estimation but can handle scale alignment
- Fast iteration with smaller models
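Because the Main series outputs relative depth, comparing it against metric ground truth requires a per-image scale alignment. A minimal sketch of a closed-form least-squares scale fit, assuming NumPy arrays (the function name `align_scale` is illustrative, not a YLFF API):

```python
import numpy as np

def align_scale(pred_rel, gt_metric, mask=None):
    """Fit a single scale s minimizing ||s * pred - gt||^2 over valid
    pixels, then return the scale-aligned prediction and the scale."""
    if mask is None:
        mask = np.isfinite(gt_metric) & (gt_metric > 0)
    p = pred_rel[mask].ravel()
    g = gt_metric[mask].ravel()
    s = float(np.dot(p, g) / np.dot(p, p))  # closed-form least squares
    return s * pred_rel, s

# Toy example: relative depth off by an unknown factor of 2.5
gt = np.array([[1.0, 2.0], [3.0, 4.0]])
pred = gt / 2.5
aligned, s = align_scale(pred, gt)  # s recovers 2.5
```

A per-image shift can be fitted the same way by solving the 2-parameter least-squares system instead of the single-scale one.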
### DA3 Metric Series

**Models:** DA3Metric-LARGE

**Capabilities:**
- ✅ Monocular depth estimation
- ✅ Metric depth (real-world scale)

**Characteristics:**
- Specialized for metric depth
- Fine-tuned for real-world scale
- No pose estimation

**Best For:**
- Applications requiring real-world scale
- When you have poses from another source
- Metric depth-only workflows
### DA3 Monocular Series

**Models:** DA3Mono-LARGE

**Capabilities:**
- ✅ High-quality relative monocular depth

**Characteristics:**
- Dedicated to monocular depth
- Superior geometric accuracy vs. disparity-based models
- No pose estimation; not metric

**Best For:**
- Single-image depth estimation
- When geometric accuracy is critical
- When relative depth is sufficient
### DA3 Nested Series

**Models:** DA3NESTED-GIANT-LARGE

**Capabilities:**
- ✅ Monocular depth estimation
- ✅ Multi-view depth estimation
- ✅ Pose-conditioned depth estimation
- ✅ Camera pose estimation
- ✅ Metric depth (real-world scale)

**Characteristics:**
- Combines the giant model with the metric model
- Both pose estimation AND metric depth
- Real-world metric-scale reconstruction
- Recommended for BA validation and fine-tuning

**Best For:**
- ✅ BA validation (needs metric depth + poses)
- ✅ Fine-tuning workflows (needs metric depth + poses)
- ✅ Metric reconstruction at real-world scale
- ✅ When you need both pose and metric depth
## YLFF Recommendations

### For BA Validation

**Recommended:** DA3NESTED-GIANT-LARGE

**Why:**
- Provides both camera poses and metric depth
- Metric depth enables proper comparison with BA (real-world scale)
- Best accuracy for validation workflows

**Usage:**

```bash
# Auto-selects DA3NESTED-GIANT-LARGE
ylff validate arkit assets/examples/ARKit

# Or explicitly specify
ylff validate arkit assets/examples/ARKit \
  --model-name depth-anything/DA3NESTED-GIANT-LARGE
```
### For Fine-Tuning

**Recommended:** DA3NESTED-GIANT-LARGE

**Why:**
- Fine-tuning benefits from metric depth (real-world scale)
- Pose estimation is needed for training
- Best starting point for improvement

**Usage:**

```bash
# Auto-selects DA3NESTED-GIANT-LARGE
ylff train start data/training

# Or explicitly specify
ylff train start data/training \
  --model-name depth-anything/DA3NESTED-GIANT-LARGE
```
### For Fast Experimentation

**Recommended:** DA3-LARGE or DA3-BASE

**Why:**
- Faster inference
- Still provides pose estimation
- Good for quick tests

**Usage:**

```bash
ylff validate sequence path/to/images \
  --model-name depth-anything/DA3-BASE
```
### For Metric Depth Only

**Recommended:** DA3Metric-LARGE

**Why:**
- Specialized for metric depth
- Best accuracy for metric-only tasks

**Note:** This model does not provide pose estimation. Use with external pose sources.
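Since DA3Metric-LARGE produces metric depth but no poses, a common pattern is to back-project its depth maps into world space using poses from another source (e.g. ARKit). A minimal NumPy sketch, assuming a 3×3 pinhole intrinsics matrix `K` and a 4×4 camera-to-world pose `T_wc` (both names illustrative, not YLFF conventions):

```python
import numpy as np

def backproject(depth, K, T_wc):
    """Lift a metric depth map to world-space 3D points using pinhole
    intrinsics K (3x3) and a camera-to-world pose T_wc (4x4)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(depth)], axis=-1).reshape(-1, 3)
    rays = pix @ np.linalg.inv(K).T           # pixel -> camera-space rays
    pts_cam = rays * depth.reshape(-1, 1)     # scale rays by metric depth
    pts_h = np.concatenate([pts_cam, np.ones((pts_cam.shape[0], 1))], axis=1)
    return (pts_h @ T_wc.T)[:, :3]            # transform to world frame

# Toy example: identity intrinsics and pose, constant 2 m depth
pts = backproject(np.full((2, 2), 2.0), np.eye(3), np.eye(4))
```

Because the depth is metric, the resulting point cloud is already at real-world scale and can be fused directly with externally tracked poses.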
## Model Comparison

| Model | Pose Est. | Metric Depth | Speed | Quality | Use Case |
|---|---|---|---|---|---|
| DA3NESTED-GIANT-LARGE | ✅ | ✅ | Medium | Best | BA validation, fine-tuning |
| DA3-GIANT | ✅ | ❌ | Slow | Best | Best quality, non-metric |
| DA3-LARGE | ✅ | ❌ | Medium | High | General purpose |
| DA3-BASE | ✅ | ❌ | Fast | Good | Fast iteration |
| DA3-SMALL | ✅ | ❌ | Fastest | Good | Rapid iteration |
| DA3Metric-LARGE | ❌ | ✅ | Medium | High | Metric depth only |
| DA3Mono-LARGE | ❌ | ❌ | Medium | High | Monocular depth only |
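The capability columns above can also be filtered programmatically. A hypothetical encoding of the table as a capability dictionary (the real registry lives behind `ylff.models.list_available_models()`; the `MODELS` dict and `pick` helper here are illustrative only):

```python
# Hypothetical mirror of the comparison table above.
MODELS = {
    "DA3NESTED-GIANT-LARGE": {"pose": True,  "metric": True,  "speed": "medium"},
    "DA3-GIANT":             {"pose": True,  "metric": False, "speed": "slow"},
    "DA3-LARGE":             {"pose": True,  "metric": False, "speed": "medium"},
    "DA3-BASE":              {"pose": True,  "metric": False, "speed": "fast"},
    "DA3-SMALL":             {"pose": True,  "metric": False, "speed": "fastest"},
    "DA3Metric-LARGE":       {"pose": False, "metric": True,  "speed": "medium"},
    "DA3Mono-LARGE":         {"pose": False, "metric": False, "speed": "medium"},
}

def pick(pose=False, metric=False):
    """Return the models whose capabilities cover the requested ones."""
    return [name for name, caps in MODELS.items()
            if (not pose or caps["pose"]) and (not metric or caps["metric"])]

# Needing both pose and metric depth narrows the choice to the nested model
print(pick(pose=True, metric=True))  # → ['DA3NESTED-GIANT-LARGE']
```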
## Auto-Selection

YLFF automatically selects the best model for each use case:

```python
from ylff.models import get_recommended_model

# For BA validation
model = get_recommended_model("ba_validation")
# Returns: "depth-anything/DA3NESTED-GIANT-LARGE"

# For fine-tuning
model = get_recommended_model("fine_tuning")
# Returns: "depth-anything/DA3NESTED-GIANT-LARGE"

# For fast inference
model = get_recommended_model("fast")
# Returns: "depth-anything/DA3-SMALL"
```
## CLI Usage

### Auto-Select Model

```bash
# YLFF auto-selects DA3NESTED-GIANT-LARGE for BA validation
ylff validate arkit assets/examples/ARKit

# YLFF auto-selects DA3NESTED-GIANT-LARGE for fine-tuning
ylff train start data/training
```

### Explicit Model Selection

```bash
# Use a specific model
ylff validate arkit assets/examples/ARKit \
  --model-name depth-anything/DA3-LARGE

# Use a smaller model for speed
ylff validate sequence path/to/images \
  --model-name depth-anything/DA3-BASE
```
### List Available Models

```python
from ylff.models import list_available_models, get_model_info

# List all models
models = list_available_models()
for name, info in models.items():
    print(f"{name}: {info['description']}")

# Get specific model info
info = get_model_info("depth-anything/DA3NESTED-GIANT-LARGE")
print(info['capabilities'])
print(info['recommended_for'])
```
## Why DA3NESTED-GIANT-LARGE for BA Validation?

**Metric Depth:** BA works at real-world scale, so metric depth enables direct comparison without scale alignment.

**Pose Estimation:** BA validation compares predicted poses with BA-refined poses, which requires a model with pose estimation capability.

**Accuracy:** The nested model combines the best of both worlds: the giant model's quality with the metric model's specialization.

**Consistency:** Metric depth keeps depth values in real-world units, matching BA's output scale.
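One quick way to see the consistency point in practice is to check that predicted depths and BA's triangulated depths agree in scale. A minimal sketch (the `scale_ratio` name is illustrative, not a YLFF API):

```python
import numpy as np

def scale_ratio(pred_depth, ba_depth):
    """Median ratio between predicted and BA depths over valid pixels.
    ~1.0 for a metric model; an arbitrary constant for relative-depth
    models, which would need scale alignment before comparison."""
    valid = np.isfinite(ba_depth) & (ba_depth > 0) & (pred_depth > 0)
    return float(np.median(pred_depth[valid] / ba_depth[valid]))

# Toy example: a prediction that is consistently 2x too large
ba = np.array([1.0, 2.0, 4.0])
r = scale_ratio(2.0 * ba, ba)  # → 2.0
```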
## Performance Considerations

- **DA3NESTED-GIANT-LARGE:** Slower, but most accurate for BA workflows
- **DA3-LARGE:** Good balance for experimentation
- **DA3-BASE:** Faster; good for quick tests
- **DA3-SMALL:** Fastest; acceptable quality for rapid iteration
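Relative speeds depend on your hardware, so it is worth measuring the trade-off yourself. A minimal wall-clock harness, assuming `run_inference` is whatever callable wraps your model's forward pass (illustrative; not part of YLFF):

```python
import time

def benchmark(run_inference, n_warmup=2, n_iters=10):
    """Average wall-clock seconds per call, after warm-up iterations
    (warm-up absorbs one-time costs like model loading and JIT)."""
    for _ in range(n_warmup):
        run_inference()
    t0 = time.perf_counter()
    for _ in range(n_iters):
        run_inference()
    return (time.perf_counter() - t0) / n_iters

# Usage: seconds_per_image = benchmark(lambda: model.infer(image))
avg = benchmark(lambda: None, n_warmup=0, n_iters=3)
```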
## Migration Guide

If you were using DA3-LARGE before:

```bash
# Old (still works)
ylff validate arkit assets/examples/ARKit \
  --model-name depth-anything/DA3-LARGE

# New (recommended, auto-selected)
ylff validate arkit assets/examples/ARKit
# Automatically uses DA3NESTED-GIANT-LARGE
```

The new default provides better results for BA validation due to its metric depth support.