# DA3 Model Selection Guide

## Overview

DA3 provides multiple model series, each optimized for different use cases. This guide helps you choose the right model for YLFF workflows.

## Model Series

### 🌟 DA3 Main Series

**Models**: `DA3-GIANT`, `DA3-LARGE`, `DA3-BASE`, `DA3-SMALL`

**Capabilities**:

- ✅ Monocular depth estimation
- ✅ Multi-view depth estimation
- ✅ Pose-conditioned depth estimation
- ✅ Camera pose estimation
- ✅ 3D Gaussian estimation

**Characteristics**:

- Unified depth-ray representation
- **Not metric** (relative depth, requires scale alignment)
- Varying sizes: Giant (best quality) → Small (fastest)

**Best For**:

- General-purpose visual geometry tasks
- When you need pose estimation but can handle scale alignment
- Fast iteration with smaller models

### 📐 DA3 Metric Series

**Models**: `DA3Metric-LARGE`

**Capabilities**:

- ✅ Monocular depth estimation
- ✅ **Metric depth** (real-world scale)

**Characteristics**:

- Specialized for metric depth
- Fine-tuned for real-world scale
- **No pose estimation**

**Best For**:

- Applications requiring real-world scale
- When you have poses from another source
- Metric depth-only workflows

### 🔍 DA3 Monocular Series

**Models**: `DA3Mono-LARGE`

**Capabilities**:

- ✅ High-quality relative monocular depth

**Characteristics**:

- Dedicated for monocular depth
- Superior geometric accuracy vs. disparity-based models
- **No pose estimation, not metric**

**Best For**:

- Single-image depth estimation
- When geometric accuracy is critical
- Relative depth is sufficient

### 🔗 DA3 Nested Series

**Models**: `DA3NESTED-GIANT-LARGE`

**Capabilities**:

- ✅ Monocular depth estimation
- ✅ Multi-view depth estimation
- ✅ Pose-conditioned depth estimation
- ✅ Camera pose estimation
- ✅ **Metric depth** (real-world scale)

**Characteristics**:

- Combines giant model with metric model
- **Both pose estimation AND metric depth**
- Real-world metric scale reconstruction
- **Recommended for BA validation and fine-tuning**

**Best For**:

- ✅ **BA validation** (needs metric depth + poses)
- ✅ **Fine-tuning workflows** (needs metric depth + poses)
- ✅ Metric reconstruction at real-world scale
- ✅ When you need both pose and metric depth

## YLFF Recommendations

### For BA Validation

**Recommended**: `DA3NESTED-GIANT-LARGE`

**Why**:

- Provides both camera poses and metric depth
- Metric depth enables proper comparison with BA (real-world scale)
- Best accuracy for validation workflows

**Usage**:

```bash
# Auto-selects DA3NESTED-GIANT-LARGE
ylff validate arkit assets/examples/ARKit

# Or explicitly specify
ylff validate arkit assets/examples/ARKit \
    --model-name depth-anything/DA3NESTED-GIANT-LARGE
```

### For Fine-Tuning

**Recommended**: `DA3NESTED-GIANT-LARGE`

**Why**:

- Fine-tuning benefits from metric depth (real-world scale)
- Pose estimation needed for training
- Best starting point for improvement

**Usage**:

```bash
# Auto-selects DA3NESTED-GIANT-LARGE
ylff train start data/training

# Or explicitly specify
ylff train start data/training \
    --model-name depth-anything/DA3NESTED-GIANT-LARGE
```

### For Fast Experimentation

**Recommended**: `DA3-LARGE` or `DA3-BASE`

**Why**:

- Faster inference
- Still provides pose estimation
- Good for quick tests

**Usage**:

```bash
ylff validate sequence path/to/images \
    --model-name depth-anything/DA3-BASE
```

### For Metric Depth Only
**Recommended**: `DA3Metric-LARGE`

**Why**:

- Specialized for metric depth
- Best accuracy for metric-only tasks

**Note**: This model does **not** provide pose estimation. Use with external pose sources.

## Model Comparison

| Model                 | Pose Est. | Metric Depth | Speed   | Quality | Use Case                       |
| --------------------- | --------- | ------------ | ------- | ------- | ------------------------------ |
| DA3NESTED-GIANT-LARGE | ✅        | ✅           | Medium  | Best    | **BA validation, fine-tuning** |
| DA3-GIANT             | ✅        | ❌           | Slow    | Best    | Best quality, non-metric       |
| DA3-LARGE             | ✅        | ❌           | Medium  | High    | General purpose                |
| DA3-BASE              | ✅        | ❌           | Fast    | Good    | Fast iteration                 |
| DA3-SMALL             | ✅        | ❌           | Fastest | Good    | Fastest                        |
| DA3Metric-LARGE       | ❌        | ✅           | Medium  | High    | Metric depth only              |
| DA3Mono-LARGE         | ❌        | ❌           | Medium  | High    | Monocular depth only           |

## Auto-Selection

YLFF automatically selects the best model for each use case:

```python
from ylff.models import get_recommended_model

# For BA validation
model = get_recommended_model("ba_validation")
# Returns: "depth-anything/DA3NESTED-GIANT-LARGE"

# For fine-tuning
model = get_recommended_model("fine_tuning")
# Returns: "depth-anything/DA3NESTED-GIANT-LARGE"

# For fast inference
model = get_recommended_model("fast")
# Returns: "depth-anything/DA3-SMALL"
```

## CLI Usage

### Auto-Select Model

```bash
# YLFF auto-selects DA3NESTED-GIANT-LARGE for BA validation
ylff validate arkit assets/examples/ARKit

# YLFF auto-selects DA3NESTED-GIANT-LARGE for fine-tuning
ylff train start data/training
```

### Explicit Model Selection

```bash
# Use specific model
ylff validate arkit assets/examples/ARKit \
    --model-name depth-anything/DA3-LARGE

# Use smaller model for speed
ylff validate sequence path/to/images \
    --model-name depth-anything/DA3-BASE
```

### List Available Models

```python
from ylff.models import list_available_models, get_model_info

# List all models
models = list_available_models()
for name, info in models.items():
    print(f"{name}: {info['description']}")

# Get specific model info
info = get_model_info("depth-anything/DA3NESTED-GIANT-LARGE")
print(info['capabilities'])
print(info['recommended_for'])
```

## Why DA3NESTED-GIANT-LARGE for BA Validation?

1. **Metric Depth**: BA works in real-world scale. Metric depth enables proper comparison.
2. **Pose Estimation**: BA validation compares predicted poses with BA-refined poses. Need pose estimation capability.
3. **Accuracy**: Nested model combines best of both worlds (giant model quality + metric specialization).
4. **Consistency**: Using metric depth ensures depth values are in real-world units, matching BA's output scale.

## Performance Considerations

- **DA3NESTED-GIANT-LARGE**: Slower but most accurate for BA workflows
- **DA3-LARGE**: Good balance for experimentation
- **DA3-BASE**: Faster, good for quick tests
- **DA3-SMALL**: Fastest, acceptable quality for rapid iteration

## Migration Guide

If you were using `DA3-LARGE` before:

```bash
# Old (still works)
ylff validate arkit assets/examples/ARKit \
    --model-name depth-anything/DA3-LARGE

# New (recommended, auto-selected)
ylff validate arkit assets/examples/ARKit
# Automatically uses DA3NESTED-GIANT-LARGE
```

The new default provides better results for BA validation due to metric depth support.
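
To see why the nested model ends up as the default for BA workflows, it can help to treat the Model Comparison table as data and filter it by required capabilities. The sketch below is illustrative only: `ModelEntry`, `MODEL_TABLE`, and `pick_models` are hypothetical names that are not part of `ylff`, and the rows are simply transcribed from the table in this guide.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ModelEntry:
    """One row of the Model Comparison table (hypothetical helper type)."""
    name: str
    pose: bool    # provides camera pose estimation
    metric: bool  # provides metric (real-world scale) depth
    speed: str


# Rows transcribed from the Model Comparison table in this guide.
MODEL_TABLE = [
    ModelEntry("DA3NESTED-GIANT-LARGE", True, True, "Medium"),
    ModelEntry("DA3-GIANT", True, False, "Slow"),
    ModelEntry("DA3-LARGE", True, False, "Medium"),
    ModelEntry("DA3-BASE", True, False, "Fast"),
    ModelEntry("DA3-SMALL", True, False, "Fastest"),
    ModelEntry("DA3Metric-LARGE", False, True, "Medium"),
    ModelEntry("DA3Mono-LARGE", False, False, "Medium"),
]


def pick_models(need_pose: bool, need_metric: bool) -> List[str]:
    """Return the names of models that satisfy the required capabilities."""
    return [
        m.name
        for m in MODEL_TABLE
        if (m.pose or not need_pose) and (m.metric or not need_metric)
    ]


# BA validation and fine-tuning need both poses and metric depth.
print(pick_models(need_pose=True, need_metric=True))
# → ['DA3NESTED-GIANT-LARGE']
```

Only one row satisfies both requirements, which matches the recommendation above. Relaxing `need_pose` to `False` also admits `DA3Metric-LARGE`, the metric-only option for workflows with external poses.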