# DA3 Model Selection Guide

## Overview

DA3 provides multiple model series, each optimized for different use cases. This guide helps you choose the right model for YLFF workflows.

## Model Series

### DA3 Main Series

**Models**: `DA3-GIANT`, `DA3-LARGE`, `DA3-BASE`, `DA3-SMALL`

**Capabilities**:

- ✅ Monocular depth estimation
- ✅ Multi-view depth estimation
- ✅ Pose-conditioned depth estimation
- ✅ Camera pose estimation
- ✅ 3D Gaussian estimation

**Characteristics**:

- Unified depth-ray representation
- **Not metric** (relative depth, requires scale alignment)
- Varying sizes: Giant (best quality) → Small (fastest)

**Best For**:

- General-purpose visual geometry tasks
- When you need pose estimation but can handle scale alignment
- Fast iteration with smaller models
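
The scale alignment mentioned above can be illustrated with a minimal sketch. It assumes you already have a handful of reference depths in metres (for example from ARKit or BA) at the same pixels as the relative predictions. The function name `align_scale` and the median-ratio estimator are illustrative choices, not part of DA3 or YLFF.

```python
from statistics import median


def align_scale(relative_depth, reference_depth):
    """Estimate one global scale factor as the median ratio of valid pairs.

    Hypothetical helper: both inputs are flat sequences of depths sampled
    at the same pixels. Non-positive values are treated as invalid.
    """
    ratios = [
        ref / rel
        for rel, ref in zip(relative_depth, reference_depth)
        if rel > 0 and ref > 0
    ]
    if not ratios:
        raise ValueError("no valid depth pairs to align")
    scale = median(ratios)
    return scale, [scale * d for d in relative_depth]


# Toy values: the relative prediction is roughly half the reference depth.
relative = [0.5, 1.0, 2.0, 4.0]
reference = [1.0, 2.0, 4.1, 8.0]
scale, aligned = align_scale(relative, reference)
print(scale, aligned)
```

The median is used here because it tolerates a few outlier pixels better than a mean; a least-squares fit is a common alternative.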

### DA3 Metric Series

**Models**: `DA3Metric-LARGE`

**Capabilities**:

- ✅ Monocular depth estimation
- ✅ **Metric depth** (real-world scale)

**Characteristics**:

- Specialized for metric depth
- Fine-tuned for real-world scale
- **No pose estimation**

**Best For**:

- Applications requiring real-world scale
- When you have poses from another source
- Metric depth-only workflows

### DA3 Monocular Series

**Models**: `DA3Mono-LARGE`

**Capabilities**:

- ✅ High-quality relative monocular depth

**Characteristics**:

- Dedicated to monocular depth
- Superior geometric accuracy vs. disparity-based models
- **No pose estimation, not metric**

**Best For**:

- Single-image depth estimation
- When geometric accuracy is critical
- When relative depth is sufficient

### DA3 Nested Series

**Models**: `DA3NESTED-GIANT-LARGE`

**Capabilities**:

- ✅ Monocular depth estimation
- ✅ Multi-view depth estimation
- ✅ Pose-conditioned depth estimation
- ✅ Camera pose estimation
- ✅ **Metric depth** (real-world scale)

**Characteristics**:

- Combines giant model with metric model
- **Both pose estimation AND metric depth**
- Real-world metric scale reconstruction
- **Recommended for BA validation and fine-tuning**

**Best For**:

- ✅ **BA validation** (needs metric depth + poses)
- ✅ **Fine-tuning workflows** (needs metric depth + poses)
- ✅ Metric reconstruction at real-world scale
- ✅ When you need both pose and metric depth

## YLFF Recommendations

### For BA Validation

**Recommended**: `DA3NESTED-GIANT-LARGE`

**Why**:

- Provides both camera poses and metric depth
- Metric depth enables proper comparison with BA (real-world scale)
- Best accuracy for validation workflows

**Usage**:

```bash
# Auto-selects DA3NESTED-GIANT-LARGE
ylff validate arkit assets/examples/ARKit

# Or explicitly specify
ylff validate arkit assets/examples/ARKit \
  --model-name depth-anything/DA3NESTED-GIANT-LARGE
```

### For Fine-Tuning

**Recommended**: `DA3NESTED-GIANT-LARGE`

**Why**:

- Fine-tuning benefits from metric depth (real-world scale)
- Pose estimation needed for training
- Best starting point for improvement

**Usage**:

```bash
# Auto-selects DA3NESTED-GIANT-LARGE
ylff train start data/training

# Or explicitly specify
ylff train start data/training \
  --model-name depth-anything/DA3NESTED-GIANT-LARGE
```

### For Fast Experimentation

**Recommended**: `DA3-LARGE` or `DA3-BASE`

**Why**:

- Faster inference
- Still provides pose estimation
- Good for quick tests

**Usage**:

```bash
ylff validate sequence path/to/images \
  --model-name depth-anything/DA3-BASE
```

### For Metric Depth Only

**Recommended**: `DA3Metric-LARGE`

**Why**:

- Specialized for metric depth
- Best accuracy for metric-only tasks

**Note**: This model does **not** provide pose estimation. Use with external pose sources.
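
To make "use with external pose sources" concrete, here is a minimal sketch of lifting one metric depth sample into world coordinates with a pose obtained elsewhere. It assumes a pinhole camera, depth measured along the optical axis, and a camera-to-world pose given as a 3x3 rotation plus a translation. The helper names and intrinsics values are hypothetical and not part of YLFF or DA3.

```python
def unproject_pixel(u, v, depth_m, fx, fy, cx, cy):
    """Back-project pixel (u, v) with metric z-depth into camera coordinates."""
    x = (u - cx) / fx * depth_m
    y = (v - cy) / fy * depth_m
    return (x, y, depth_m)


def camera_to_world(point_cam, rotation, translation):
    """Apply a camera-to-world pose: X_world = R @ X_cam + t."""
    return tuple(
        sum(rotation[i][j] * point_cam[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


# Assumed example values: identity rotation, camera 1 m along the world z-axis.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
point_cam = unproject_pixel(
    u=420.0, v=240.0, depth_m=2.0, fx=500.0, fy=500.0, cx=320.0, cy=240.0
)
point_world = camera_to_world(point_cam, identity, translation=[0.0, 0.0, 1.0])
print(point_cam, point_world)
```

If your pose source reports world-to-camera transforms instead, invert the pose before applying it.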

## Model Comparison

| Model                 | Pose Est. | Metric Depth | Speed   | Quality | Use Case                       |
| --------------------- | --------- | ------------ | ------- | ------- | ------------------------------ |
| DA3NESTED-GIANT-LARGE | ✅        | ✅           | Medium  | Best    | **BA validation, fine-tuning** |
| DA3-GIANT             | ✅        | ❌           | Slow    | Best    | Best quality, non-metric       |
| DA3-LARGE             | ✅        | ❌           | Medium  | High    | General purpose                |
| DA3-BASE              | ✅        | ❌           | Fast    | Good    | Fast iteration                 |
| DA3-SMALL             | ✅        | ❌           | Fastest | Good    | Fastest                        |
| DA3Metric-LARGE       | ❌        | ✅           | Medium  | High    | Metric depth only              |
| DA3Mono-LARGE         | ❌        | ❌           | Medium  | High    | Monocular depth only           |
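
The capability columns of the comparison can also be expressed as data, which makes it easy to filter by what a workflow requires. The following is an illustrative sketch only: the `MODELS` dictionary and `models_with` helper are hypothetical and simply mirror the pose and metric capabilities described in this guide.

```python
# Capability flags copied from the model descriptions in this guide.
MODELS = {
    "DA3NESTED-GIANT-LARGE": {"pose": True, "metric": True},
    "DA3-GIANT": {"pose": True, "metric": False},
    "DA3-LARGE": {"pose": True, "metric": False},
    "DA3-BASE": {"pose": True, "metric": False},
    "DA3-SMALL": {"pose": True, "metric": False},
    "DA3Metric-LARGE": {"pose": False, "metric": True},
    "DA3Mono-LARGE": {"pose": False, "metric": False},
}


def models_with(pose=False, metric=False):
    """Return the models that offer every capability requested as True."""
    return [
        name
        for name, caps in MODELS.items()
        if (caps["pose"] or not pose) and (caps["metric"] or not metric)
    ]


both = models_with(pose=True, metric=True)
metric_only = models_with(metric=True)
print(both)
print(metric_only)
```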

## Auto-Selection

YLFF automatically selects the best model for each use case:

```python
from ylff.models import get_recommended_model

# For BA validation
model = get_recommended_model("ba_validation")
# Returns: "depth-anything/DA3NESTED-GIANT-LARGE"

# For fine-tuning
model = get_recommended_model("fine_tuning")
# Returns: "depth-anything/DA3NESTED-GIANT-LARGE"

# For fast inference
model = get_recommended_model("fast")
# Returns: "depth-anything/DA3-SMALL"
```
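
For scripts that cannot import `ylff`, the same mapping can be approximated with a plain lookup table. This is a sketch under stated assumptions, not YLFF's actual implementation: `RECOMMENDED`, `recommend_model_sketch`, and `validate_command` are hypothetical names, and the table only repeats the three use cases and return values shown above.

```python
import shlex

# Use-case keys and model names copied from the examples in this guide.
RECOMMENDED = {
    "ba_validation": "depth-anything/DA3NESTED-GIANT-LARGE",
    "fine_tuning": "depth-anything/DA3NESTED-GIANT-LARGE",
    "fast": "depth-anything/DA3-SMALL",
}


def recommend_model_sketch(use_case):
    """Look up a model name, failing loudly on an unknown use case."""
    try:
        return RECOMMENDED[use_case]
    except KeyError:
        known = ", ".join(sorted(RECOMMENDED))
        raise ValueError(
            f"unknown use case {use_case!r}; expected one of: {known}"
        ) from None


def validate_command(dataset_path, use_case):
    """Build the argument list for an explicit `ylff validate arkit` call."""
    model = recommend_model_sketch(use_case)
    return ["ylff", "validate", "arkit", dataset_path, "--model-name", model]


command = validate_command("assets/examples/ARKit", "ba_validation")
printable = shlex.join(command)
print(printable)
```

Building the command as a list keeps it safe to pass to `subprocess.run` without shell quoting issues.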

## CLI Usage

### Auto-Select Model

```bash
# YLFF auto-selects DA3NESTED-GIANT-LARGE for BA validation
ylff validate arkit assets/examples/ARKit

# YLFF auto-selects DA3NESTED-GIANT-LARGE for fine-tuning
ylff train start data/training
```

### Explicit Model Selection

```bash
# Use specific model
ylff validate arkit assets/examples/ARKit \
  --model-name depth-anything/DA3-LARGE

# Use smaller model for speed
ylff validate sequence path/to/images \
  --model-name depth-anything/DA3-BASE
```

### List Available Models

```python
from ylff.models import list_available_models, get_model_info

# List all models
models = list_available_models()
for name, info in models.items():
    print(f"{name}: {info['description']}")

# Get specific model info
info = get_model_info("depth-anything/DA3NESTED-GIANT-LARGE")
print(info['capabilities'])
print(info['recommended_for'])
```

## Why DA3NESTED-GIANT-LARGE for BA Validation?

1. **Metric Depth**: BA works in real-world scale, so metric depth is needed for a like-for-like comparison.
2. **Pose Estimation**: BA validation compares predicted poses with BA-refined poses, so the model must be able to estimate poses.
3. **Accuracy**: The nested model combines giant-model quality with metric specialization.
4. **Consistency**: Metric depth keeps depth values in real-world units, matching BA's output scale.
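
The pose comparison in point 2 can be sketched with two standard error measures: the angle of the relative rotation and the Euclidean distance between camera positions. This is not YLFF's validation code. The function names are hypothetical, and the sketch assumes both poses are expressed in the same world frame and in metres.

```python
import math


def rotation_error_deg(r_pred, r_ref):
    """Angle in degrees of the relative rotation R_pred^T @ R_ref."""
    # trace(A^T @ B) equals the sum of element-wise products of A and B.
    trace = sum(r_pred[i][j] * r_ref[i][j] for i in range(3) for j in range(3))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))


def translation_error_m(t_pred, t_ref):
    """Euclidean distance between two camera positions."""
    return math.dist(t_pred, t_ref)


# Toy example: the reference pose is rotated 90 degrees about the z-axis.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
rot_z_90 = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
rot_err = rotation_error_deg(identity, rot_z_90)
trans_err = translation_error_m([0.0, 0.0, 0.0], [0.03, 0.04, 0.0])
print(rot_err, trans_err)
```

The translation error is only meaningful in metres when the predicted poses come from a metric model; with a relative model the trajectories would first need scale alignment.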

## Performance Considerations

- **DA3NESTED-GIANT-LARGE**: Slower but most accurate for BA workflows
- **DA3-LARGE**: Good balance for experimentation
- **DA3-BASE**: Faster, good for quick tests
- **DA3-SMALL**: Fastest, acceptable quality for rapid iteration

## Migration Guide

If you were using `DA3-LARGE` before:

```bash
# Old (still works)
ylff validate arkit assets/examples/ARKit \
  --model-name depth-anything/DA3-LARGE

# New (recommended, auto-selected)
ylff validate arkit assets/examples/ARKit
# Automatically uses DA3NESTED-GIANT-LARGE
```

The new default provides better results for BA validation due to metric depth support.