---
title: Depth Anything Compare Demo
emoji: π
colorFrom: indigo
colorTo: indigo
sdk: gradio
sdk_version: 5.46.0
app_file: app.py
pinned: false
---
# Depth Anything v1 vs v2 Comparison Demo
A comprehensive comparison tool for **Depth Anything v1** and **Depth Anything v2** models, built with Gradio and optimized for HuggingFace Spaces with ZeroGPU support.
## Features
### Three Comparison Modes
1. **Slider Comparison**: Interactive side-by-side comparison with a draggable slider
2. **Method Comparison**: Traditional side-by-side view with model labels
3. **Single Model**: Run individual models for detailed analysis
### Supported Models
#### Depth Anything v1
- **ViT-S (Small)**: Fastest inference, good quality
- **ViT-B (Base)**: Balanced speed and quality
- **ViT-L (Large)**: Best quality, slower inference
#### Depth Anything v2
- **ViT-Small**: Enhanced small model with improved accuracy
- **ViT-Base**: Balanced performance with v2 improvements
- **ViT-Large**: State-of-the-art depth estimation quality
## Example Images
The demo includes 20+ carefully selected example images showcasing various scenarios:
- Indoor and outdoor scenes
- Different lighting conditions
- Various object types and compositions
- Challenging depth estimation scenarios
## Technical Details
### Architecture
- **Framework**: Gradio 5.x (see `sdk_version` above) with modern UI components
- **Backend**: PyTorch with CUDA acceleration
- **Deployment**: ZeroGPU-optimized for HuggingFace Spaces
- **Memory Management**: Automatic model loading/unloading for efficient GPU usage
### ZeroGPU Optimizations
- `@spaces.GPU` decorators for GPU-intensive functions
- Automatic memory cleanup between inferences
- On-demand model loading to prevent OOM errors
- Efficient resource allocation and deallocation
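The decorator pattern above can be sketched as follows. `spaces.GPU` is the real ZeroGPU decorator from the HuggingFace `spaces` package; the fallback decorator and the `run_inference` function are illustrative, not the app's actual code.

```python
# ZeroGPU decorator pattern (sketch). The try/except fallback lets the same
# code run locally, where the `spaces` package may not be installed.
try:
    import spaces
    gpu = spaces.GPU
except ImportError:
    def gpu(fn):
        # No-op fallback for local development without ZeroGPU.
        return fn

@gpu
def run_inference(image):
    # Placeholder for a GPU-intensive depth-estimation call; on Spaces,
    # the decorator allocates a GPU for the duration of this function.
    return f"depth map for {image}"
```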
### Depth Visualization
- **Colormap**: Spectral_r colormap for intuitive depth representation
- **Normalization**: Min-max scaling for consistent visualization
- **Resolution**: Maintains original image aspect ratios
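The normalization and colormap steps above might look roughly like this sketch (illustrative, not the app's exact rendering code):

```python
import numpy as np
from matplotlib import pyplot as plt

def colorize_depth(depth: np.ndarray) -> np.ndarray:
    """Min-max normalize a depth map and apply the Spectral_r colormap.

    Returns an HxWx3 uint8 RGB image.
    """
    d = depth.astype(np.float32)
    dmin, dmax = float(d.min()), float(d.max())
    if dmax - dmin < 1e-8:            # flat depth map: avoid divide-by-zero
        norm = np.zeros_like(d)
    else:
        norm = (d - dmin) / (dmax - dmin)
    cmap = plt.get_cmap("Spectral_r")
    rgba = cmap(norm)                 # HxWx4 floats in [0, 1]
    return (rgba[..., :3] * 255).astype(np.uint8)
```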
## Installation & Setup
### Local Development
1. **Clone the repository**:
```bash
git clone <repository-url>
cd Depth-Anything-Compare-demo
```
2. **Install dependencies**:
```bash
pip install -r requirements.txt
```
3. **Download model checkpoints** (for local usage):
```bash
# Depth Anything v1 models are downloaded automatically from HuggingFace Hub
# For v2 models, download checkpoints to Depth-Anything-V2/checkpoints/
```
4. **Run locally**:
```bash
python app_local.py # For local development
python app.py # For ZeroGPU deployment
```
### HuggingFace Spaces Deployment
This app is optimized for HuggingFace Spaces with ZeroGPU. Simply:
1. Upload the repository to your HuggingFace Space
2. Set hardware to "ZeroGPU"
3. The app will automatically handle GPU allocation and model loading
## Project Structure
```
Depth-Anything-Compare-demo/
├── app.py                   # ZeroGPU-optimized main application
├── app_local.py             # Local development version
├── requirements.txt         # Python dependencies
├── README.md                # This file
├── assets/
│   └── examples/            # Example images for testing
├── Depth-Anything/          # Depth Anything v1 implementation
│   ├── depth_anything/
│   │   ├── dpt.py           # v1 model architecture
│   │   └── util/            # v1 utilities and transforms
│   └── torchhub/            # Required dependencies
└── Depth-Anything-V2/       # Depth Anything v2 implementation
    ├── depth_anything_v2/
    │   ├── dpt.py           # v2 model architecture
    │   └── dinov2_layers/   # DINOv2 components
    └── assets/
        └── examples/        # v2-specific examples
```
## Configuration
### Model Configuration
Models are configured in the respective config dictionaries:
- `V1_MODEL_CONFIGS`: HuggingFace Hub model identifiers
- `V2_MODEL_CONFIGS`: Local checkpoint paths and architecture parameters
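The two dictionaries might be shaped roughly like the sketch below. The hub IDs, checkpoint paths, and key names are illustrative assumptions; the actual entries in `app.py` may differ.

```python
# Illustrative shape of the model config dictionaries (not the app's
# exact contents). v1 models resolve to HuggingFace Hub identifiers;
# v2 models point at local checkpoint files plus architecture parameters.
V1_MODEL_CONFIGS = {
    "ViT-S": {"hub_id": "LiheYoung/depth_anything_vits14"},
    "ViT-B": {"hub_id": "LiheYoung/depth_anything_vitb14"},
    "ViT-L": {"hub_id": "LiheYoung/depth_anything_vitl14"},
}

V2_MODEL_CONFIGS = {
    "ViT-Small": {
        "checkpoint": "Depth-Anything-V2/checkpoints/depth_anything_v2_vits.pth",
        "encoder": "vits",
    },
    "ViT-Large": {
        "checkpoint": "Depth-Anything-V2/checkpoints/depth_anything_v2_vitl.pth",
        "encoder": "vitl",
    },
    # Further sizes follow the same pattern.
}
```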
### Environment Variables
- `DEVICE`: Automatically detects CUDA availability
- GPU memory is managed automatically by ZeroGPU
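The device auto-detection typically reduces to a one-liner like this sketch (with a CPU fallback added here for environments without PyTorch):

```python
# Typical DEVICE auto-detection (sketch): prefer CUDA when PyTorch sees a
# GPU, otherwise fall back to CPU.
try:
    import torch
    DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
except ImportError:
    DEVICE = "cpu"
```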
## Performance
### Inference Times (Approximate)
- **ViT-S models**: ~1-2 seconds
- **ViT-B models**: ~2-4 seconds
- **ViT-L models**: ~4-8 seconds
*Times vary based on image resolution and GPU availability*
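Timings like the ones above can be collected with a small wall-clock helper; this is a generic sketch, not instrumentation from the app itself.

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(label: str, results: dict):
    # Record wall-clock time for a labelled block into `results`.
    start = time.perf_counter()
    yield
    results[label] = time.perf_counter() - start
```

Usage: `with timed("ViT-S", results): model.infer(image)` leaves the elapsed seconds in `results["ViT-S"]`.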
### Memory Usage
- Optimized for ZeroGPU's memory constraints
- Automatic model unloading prevents OOM errors
- Efficient batch processing for multiple comparisons
## Usage Examples
### Compare v1 vs v2 Models
1. Upload an image or select from examples
2. Choose models from both v1 and v2 families
3. Click "Compare" or "Slider Compare"
4. Analyze the depth estimation differences
### Analyze Single Model Performance
1. Select "Single Model" tab
2. Choose any available model
3. Upload image and click "Run"
4. Examine detailed depth map output
## Contributing
Contributions are welcome! Areas for improvement:
- Additional model variants
- New visualization options
- Performance optimizations
- UI/UX enhancements
## References
- **Depth Anything v1**: [LiheYoung/Depth-Anything](https://github.com/LiheYoung/Depth-Anything)
- **Depth Anything v2**: [DepthAnything/Depth-Anything-V2](https://github.com/DepthAnything/Depth-Anything-V2)
- **Original Papers**:
- [Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data](https://arxiv.org/abs/2401.10891)
- [Depth Anything V2](https://arxiv.org/abs/2406.09414)
## License
This project combines implementations from:
- Depth Anything v1: MIT License
- Depth Anything v2: Apache 2.0 License
- Demo code: MIT License
Please check individual component licenses for specific terms.
## Acknowledgments
- Original Depth Anything authors and contributors
- HuggingFace team for Spaces and ZeroGPU infrastructure
- Gradio team for the excellent UI framework
---
**Note**: This is a demonstration/comparison tool. For production use of the Depth Anything models, please refer to the original repositories and follow their recommended practices.