---
title: Depth Anything Compare Demo
emoji: 👀
colorFrom: indigo
colorTo: indigo
sdk: gradio
sdk_version: 5.46.0
app_file: app.py
pinned: false
---

# Depth Anything v1 vs v2 Comparison Demo

A comprehensive comparison tool for **Depth Anything v1** and **Depth Anything v2** models, built with Gradio and optimized for HuggingFace Spaces with ZeroGPU support.

## 🚀 Features

### Three Comparison Modes

1. **🎚️ Slider Comparison**: Interactive side-by-side comparison with a draggable slider
2. **🔍 Method Comparison**: Traditional side-by-side view with model labels
3. **🔬 Single Model**: Run individual models for detailed analysis

### Supported Models

#### Depth Anything v1
- **ViT-S (Small)**: Fastest inference, good quality
- **ViT-B (Base)**: Balanced speed and quality
- **ViT-L (Large)**: Best quality, slower inference

#### Depth Anything v2
- **ViT-Small**: Enhanced small model with improved accuracy
- **ViT-Base**: Balanced performance with v2 improvements
- **ViT-Large**: State-of-the-art depth estimation quality

## 🖼️ Example Images

The demo includes 20+ carefully selected example images showcasing various scenarios:
- Indoor and outdoor scenes
- Different lighting conditions
- Various object types and compositions
- Challenging depth estimation scenarios

## 🛠️ Technical Details

### Architecture
- **Framework**: Gradio 4.0+ with modern UI components (the Space itself pins `sdk_version: 5.46.0`)
- **Backend**: PyTorch with CUDA acceleration
- **Deployment**: ZeroGPU-optimized for HuggingFace Spaces
- **Memory Management**: Automatic model loading/unloading for efficient GPU usage

### ZeroGPU Optimizations
- `@spaces.GPU` decorators for GPU-intensive functions
- Automatic memory cleanup between inferences
- On-demand model loading to prevent OOM errors
- Efficient resource allocation and deallocation
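
The pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code in `app.py`: `get_model`, `free_memory`, `run_inference`, and the `loaders` mapping are hypothetical names, and the no-op `gpu` fallback is an assumption added so the sketch also runs where the `spaces` package is not installed.

```python
import gc
from typing import Callable, Dict, Optional, Tuple

try:
    import spaces  # available on HuggingFace Spaces
    gpu = spaces.GPU
except ImportError:
    def gpu(fn):
        """No-op stand-in (assumption) so the sketch runs off Spaces."""
        return fn

# Single-slot cache: at most one model stays resident at a time.
_current: Optional[Tuple[str, object]] = None


def free_memory() -> None:
    """Collect Python garbage and, if PyTorch sees CUDA, release cached blocks."""
    gc.collect()
    try:
        import torch
    except ImportError:
        return
    if torch.cuda.is_available():
        torch.cuda.empty_cache()


def get_model(name: str, loaders: Dict[str, Callable[[], object]]) -> object:
    """Load a model on demand, unloading whichever model was held before."""
    global _current
    if _current is not None and _current[0] == name:
        return _current[1]
    _current = None  # drop the previous model before loading the next one
    free_memory()
    model = loaders[name]()
    _current = (name, model)
    return model


@gpu
def run_inference(name, image, loaders):
    """GPU-bound entry point: fetch the model, run it, then clean up."""
    model = get_model(name, loaders)
    try:
        return model(image)
    finally:
        free_memory()
```

Holding a single model at a time trades reload latency for a lower peak memory footprint, which is the usual compromise when comparing several checkpoints on shared hardware.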

### Depth Visualization
- **Colormap**: Spectral_r colormap for intuitive depth representation
- **Normalization**: Min-max scaling for consistent visualization
- **Resolution**: Maintains original image aspect ratios
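
As a rough illustration of the min-max step, the sketch below scales a depth map to the 0-255 range using only the standard library. `normalize_depth` is a hypothetical name, and the nested-list input is an assumption made to keep the example dependency-free; the app itself most likely works on arrays before applying the Spectral_r colormap.

```python
from typing import List


def normalize_depth(depth: List[List[float]]) -> List[List[int]]:
    """Min-max scale a 2D depth map to integers in [0, 255]."""
    flat = [value for row in depth for value in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        # Constant depth map: avoid dividing by zero.
        return [[0 for _ in row] for row in depth]
    return [[round((value - lo) / span * 255) for value in row] for row in depth]


scaled = normalize_depth([[1.0, 2.0], [3.0, 5.0]])
```

Because the scaling is per image, the same colour can correspond to different depths in two different outputs; only the relative ordering within one image is comparable.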

## 📦 Installation & Setup

### Local Development

1. **Clone the repository**:
```bash
git clone <repository-url>
cd Depth-Anything-Compare-demo
```

2. **Install dependencies**:
```bash
pip install -r requirements.txt
```

3. **Download model checkpoints** (for local usage):
```bash
# Depth Anything v1 models are downloaded automatically from HuggingFace Hub
# For v2 models, download checkpoints to Depth-Anything-V2/checkpoints/
```

4. **Run locally**:
```bash
# Run one of the following:
python app_local.py  # Local development version
python app.py        # ZeroGPU version, as deployed on HuggingFace Spaces
```

### HuggingFace Spaces Deployment

This app is optimized for HuggingFace Spaces with ZeroGPU. Simply:

1. Upload the repository to your HuggingFace Space
2. Set hardware to "ZeroGPU"
3. The app will automatically handle GPU allocation and model loading

## 📁 Project Structure

```
Depth-Anything-Compare-demo/
├── app.py                   # ZeroGPU-optimized main application
├── app_local.py             # Local development version
├── requirements.txt         # Python dependencies
├── README.md                # This file
├── assets/
│   └── examples/            # Example images for testing
├── Depth-Anything/          # Depth Anything v1 implementation
│   ├── depth_anything/
│   │   ├── dpt.py           # v1 model architecture
│   │   └── util/            # v1 utilities and transforms
│   └── torchhub/            # Required dependencies
└── Depth-Anything-V2/       # Depth Anything v2 implementation
    ├── depth_anything_v2/
    │   ├── dpt.py           # v2 model architecture
    │   └── dinov2_layers/   # DINOv2 components
    └── assets/
        └── examples/        # v2-specific examples
```

## 🔧 Configuration

### Model Configuration
Models are configured in the respective config dictionaries:
- `V1_MODEL_CONFIGS`: HuggingFace Hub model identifiers
- `V2_MODEL_CONFIGS`: Local checkpoint paths and architecture parameters
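
For orientation, the two dictionaries might look something like the sketch below. Only the dictionary names and the `Depth-Anything-V2/checkpoints/` directory come from this README; the keys, the nested layout, the checkpoint file names, and the `describe` helper are assumptions. The Hub identifiers and architecture parameters follow the upstream repositories as far as I know and should be checked against `app.py`.

```python
# v1: size key -> HuggingFace Hub identifier (layout is an assumption).
V1_MODEL_CONFIGS = {
    "vits": "LiheYoung/depth_anything_vits14",
    "vitb": "LiheYoung/depth_anything_vitb14",
    "vitl": "LiheYoung/depth_anything_vitl14",
}

# v2: size key -> local checkpoint path plus architecture parameters.
# File names are assumed; parameters mirror the upstream v2 configs.
V2_MODEL_CONFIGS = {
    "vits": {
        "checkpoint": "Depth-Anything-V2/checkpoints/depth_anything_v2_vits.pth",
        "encoder": "vits",
        "features": 64,
        "out_channels": [48, 96, 192, 384],
    },
    "vitb": {
        "checkpoint": "Depth-Anything-V2/checkpoints/depth_anything_v2_vitb.pth",
        "encoder": "vitb",
        "features": 128,
        "out_channels": [96, 192, 384, 768],
    },
    "vitl": {
        "checkpoint": "Depth-Anything-V2/checkpoints/depth_anything_v2_vitl.pth",
        "encoder": "vitl",
        "features": 256,
        "out_channels": [256, 512, 1024, 1024],
    },
}


def describe(version: str, size: str) -> str:
    """Return a short label for a model (hypothetical helper)."""
    if version == "v1":
        return f"v1/{size}: hub={V1_MODEL_CONFIGS[size]}"
    cfg = V2_MODEL_CONFIGS[size]
    return f"v2/{size}: checkpoint={cfg['checkpoint']}"
```

Using the same size keys in both dictionaries keeps a v1-versus-v2 comparison down to a single lookup per family.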

### Environment Variables
- `DEVICE`: Set automatically based on CUDA availability
- GPU memory is managed automatically by ZeroGPU
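
Device detection of this kind is commonly written as below. This is a sketch, not the app's code: `pick_device` is a hypothetical helper, and the fallback to `"cpu"` when PyTorch is missing is an assumption added so the snippet runs anywhere.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch can see a CUDA device, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Assumption: without PyTorch, fall back to CPU rather than fail.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


DEVICE = pick_device()
```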

## 📊 Performance

### Inference Times (Approximate)
- **ViT-S models**: ~1-2 seconds
- **ViT-B models**: ~2-4 seconds  
- **ViT-L models**: ~4-8 seconds

*Times vary based on image resolution and GPU availability*
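
To reproduce figures like these on your own hardware, a small wall-clock helper is usually enough. The sketch below uses only the standard library; `timed` is a hypothetical name, and the callable passed to it stands in for whatever inference function the app exposes.

```python
import time
from typing import Any, Callable, Tuple


def timed(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Call fn and return its result together with elapsed seconds."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


# Stand-in workload; replace with a real inference call.
result, seconds = timed(sum, [1, 2, 3])
```

With PyTorch on a GPU, kernels run asynchronously, so calling `torch.cuda.synchronize()` before reading the clock is usually needed for meaningful numbers. The first call also tends to include model loading, so it is worth timing a second run separately.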

### Memory Usage
- Optimized for ZeroGPU's memory constraints
- Automatic model unloading prevents OOM errors
- Efficient batch processing for multiple comparisons

## 🎯 Usage Examples

### Compare v1 vs v2 Models
1. Upload an image or select from examples
2. Choose models from both v1 and v2 families
3. Click "Compare" or "Slider Compare"
4. Analyze the depth estimation differences

### Analyze Single Model Performance
1. Select the "Single Model" tab
2. Choose any available model
3. Upload an image and click "Run"
4. Examine detailed depth map output

## 🤝 Contributing

Contributions are welcome! Areas for improvement:
- Additional model variants
- New visualization options
- Performance optimizations
- UI/UX enhancements

## 📚 References

- **Depth Anything v1**: [LiheYoung/Depth-Anything](https://github.com/LiheYoung/Depth-Anything)
- **Depth Anything v2**: [DepthAnything/Depth-Anything-V2](https://github.com/DepthAnything/Depth-Anything-V2)
- **Original Papers**: 
  - [Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data](https://arxiv.org/abs/2401.10891)
  - [Depth Anything V2](https://arxiv.org/abs/2406.09414)

## 📄 License

This project combines implementations from:
- Depth Anything v1: MIT License
- Depth Anything v2: Apache 2.0 License
- Demo code: MIT License

Please check individual component licenses for specific terms.

## 🙏 Acknowledgments

- Original Depth Anything authors and contributors
- HuggingFace team for Spaces and ZeroGPU infrastructure
- Gradio team for the excellent UI framework

---

**Note**: This is a demonstration/comparison tool. For production use of the Depth Anything models, please refer to the original repositories and follow their recommended practices.