---
title: Depth Anything Compare Demo
emoji: πŸ‘€
colorFrom: indigo
colorTo: indigo
sdk: gradio
sdk_version: 5.46.0
app_file: app.py
pinned: false
---

# Depth Anything v1 vs v2 Comparison Demo

A comparison tool for the Depth Anything v1 and Depth Anything v2 models, built with Gradio and optimized for HuggingFace Spaces with ZeroGPU support.

πŸš€ Features

### Three Comparison Modes

1. 🎚️ Slider Comparison: Interactive side-by-side comparison with a draggable slider
2. 🔍 Method Comparison: Traditional side-by-side view with model labels
3. 🔬 Single Model: Run individual models for detailed analysis

### Supported Models

#### Depth Anything v1

- ViT-S (Small): Fastest inference, good quality
- ViT-B (Base): Balanced speed and quality
- ViT-L (Large): Best quality, slower inference

#### Depth Anything v2

- ViT-Small: Enhanced small model with improved accuracy
- ViT-Base: Balanced performance with v2 improvements
- ViT-Large: State-of-the-art depth estimation quality

πŸ–ΌοΈ Example Images

The demo includes 20+ carefully selected example images showcasing various scenarios:

  • Indoor and outdoor scenes
  • Different lighting conditions
  • Various object types and compositions
  • Challenging depth estimation scenarios

πŸ› οΈ Technical Details

Architecture

  • Framework: Gradio 4.0+ with modern UI components
  • Backend: PyTorch with CUDA acceleration
  • Deployment: ZeroGPU-optimized for HuggingFace Spaces
  • Memory Management: Automatic model loading/unloading for efficient GPU usage

### ZeroGPU Optimizations

- `@spaces.GPU` decorators for GPU-intensive functions
- Automatic memory cleanup between inferences
- On-demand model loading to prevent OOM errors
- Efficient resource allocation and deallocation
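The decorator pattern above can be sketched as follows. This is a minimal illustration, not the app's actual code: `predict_depth` and the import fallback are assumptions, and on Spaces the `spaces` package is provided by the platform.

```python
import torch

try:
    import spaces  # ZeroGPU helper, available on HuggingFace Spaces
    gpu = spaces.GPU
except ImportError:  # local fallback so this sketch runs anywhere
    def gpu(fn):
        return fn

@gpu  # on Spaces, a GPU is allocated only for the duration of this call
def predict_depth(model, image):
    """Run one inference, then release GPU memory (names are illustrative)."""
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = model.to(device)
    with torch.no_grad():
        depth = model(image.to(device))
    model.to("cpu")               # unload the model from the GPU
    if device == "cuda":
        torch.cuda.empty_cache()  # cleanup between inferences
    return depth.cpu()
```

Moving the model back to the CPU and emptying the cache after each call is what keeps a single ZeroGPU allocation from accumulating resident weights across requests.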

### Depth Visualization

- Colormap: `Spectral_r` colormap for intuitive depth representation
- Normalization: Min-max scaling for consistent visualization
- Resolution: Maintains original image aspect ratios
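The colormap and normalization steps combine into a few lines; this sketch uses matplotlib's `Spectral_r`, and the function name is illustrative rather than the app's actual helper.

```python
import numpy as np
import matplotlib.pyplot as plt

def colorize_depth(depth: np.ndarray) -> np.ndarray:
    """Map a raw depth map (H, W) to an RGB uint8 image via Spectral_r."""
    d_min, d_max = depth.min(), depth.max()
    norm = (depth - d_min) / (d_max - d_min + 1e-8)  # min-max scale to [0, 1]
    rgba = plt.get_cmap("Spectral_r")(norm)          # (H, W, 4) floats in [0, 1]
    return (rgba[..., :3] * 255).astype(np.uint8)    # drop alpha, to uint8
```

The small epsilon guards against division by zero on a constant-depth input.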

πŸ“¦ Installation & Setup

Local Development

  1. Clone the repository:
git clone <repository-url>
cd Depth-Anything-Compare-demo
  1. Install dependencies:
pip install -r requirements.txt
  1. Download model checkpoints (for local usage):
# Depth Anything v1 models are downloaded automatically from HuggingFace Hub
# For v2 models, download checkpoints to Depth-Anything-V2/checkpoints/
  1. Run locally:
python app_local.py  # For local development
python app.py        # For ZeroGPU deployment

### HuggingFace Spaces Deployment

This app is optimized for HuggingFace Spaces with ZeroGPU. Simply:

  1. Upload the repository to your HuggingFace Space
  2. Set hardware to "ZeroGPU"
  3. The app will automatically handle GPU allocation and model loading

πŸ“ Project Structure

Depth-Anything-Compare-demo/
β”œβ”€β”€ app.py                 # ZeroGPU-optimized main application
β”œβ”€β”€ app_local.py          # Local development version
β”œβ”€β”€ requirements.txt      # Python dependencies
β”œβ”€β”€ README.md            # This file
β”œβ”€β”€ assets/
β”‚   └── examples/        # Example images for testing
β”œβ”€β”€ Depth-Anything/      # Depth Anything v1 implementation
β”‚   β”œβ”€β”€ depth_anything/
β”‚   β”‚   β”œβ”€β”€ dpt.py      # v1 model architecture
β”‚   β”‚   └── util/       # v1 utilities and transforms
β”‚   └── torchhub/       # Required dependencies
└── Depth-Anything-V2/   # Depth Anything v2 implementation
    β”œβ”€β”€ depth_anything_v2/
    β”‚   β”œβ”€β”€ dpt.py      # v2 model architecture
    β”‚   └── dinov2_layers/ # DINOv2 components
    └── assets/
        └── examples/    # v2-specific examples

πŸ”§ Configuration

Model Configuration

Models are configured in the respective config dictionaries:

  • V1_MODEL_CONFIGS: HuggingFace Hub model identifiers
  • V2_MODEL_CONFIGS: Local checkpoint paths and architecture parameters
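The two dictionaries might look roughly like the sketch below. The repo ids, checkpoint paths, and architecture parameters shown here are assumptions illustrating the shape, not necessarily the values in `app.py`.

```python
# Hypothetical contents; entries are assumptions, not the app's actual values.
V1_MODEL_CONFIGS = {
    "ViT-S": "LiheYoung/depth_anything_vits14",  # HuggingFace Hub id
    "ViT-B": "LiheYoung/depth_anything_vitb14",
    "ViT-L": "LiheYoung/depth_anything_vitl14",
}

V2_MODEL_CONFIGS = {
    "ViT-Small": {
        "checkpoint": "Depth-Anything-V2/checkpoints/depth_anything_v2_vits.pth",
        "encoder": "vits",                 # architecture parameters
        "features": 64,
        "out_channels": [48, 96, 192, 384],
    },
    "ViT-Large": {
        "checkpoint": "Depth-Anything-V2/checkpoints/depth_anything_v2_vitl.pth",
        "encoder": "vitl",
        "features": 256,
        "out_channels": [256, 512, 1024, 1024],
    },
}
```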

### Environment Variables

- `DEVICE`: Automatically detects CUDA availability
- GPU memory is managed automatically by ZeroGPU
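The `DEVICE` detection amounts to a one-liner, sketched here with PyTorch:

```python
import torch

# Pick the GPU when CUDA is available, otherwise fall back to CPU.
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
```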

πŸ“Š Performance

Inference Times (Approximate)

  • ViT-S models: ~1-2 seconds
  • ViT-B models: ~2-4 seconds
  • ViT-L models: ~4-8 seconds

Times vary based on image resolution and GPU availability

### Memory Usage

- Optimized for ZeroGPU's memory constraints
- Automatic model unloading prevents OOM errors
- Efficient batch processing for multiple comparisons
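The unload-before-load pattern behind these points can be sketched as below; `get_model` and `build_fn` are placeholder names for whatever constructs the selected model, not the app's actual identifiers.

```python
import gc
import torch

_current_model = None  # keep at most one model resident at a time

def get_model(build_fn):
    """Build a model on demand, unloading any previously loaded one first."""
    global _current_model
    if _current_model is not None:
        _current_model = None         # drop the old reference
        gc.collect()                  # reclaim it promptly
        if torch.cuda.is_available():
            torch.cuda.empty_cache()  # return freed memory to the GPU pool
    _current_model = build_fn()
    return _current_model
```

Dropping the old reference before constructing the new model is what keeps peak memory near one model's footprint instead of two.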

## 🎯 Usage Examples

### Compare v1 vs v2 Models

1. Upload an image or select from examples
2. Choose models from both v1 and v2 families
3. Click "Compare" or "Slider Compare"
4. Analyze the depth estimation differences

### Analyze Single Model Performance

1. Select the "Single Model" tab
2. Choose any available model
3. Upload an image and click "Run"
4. Examine the detailed depth map output

## 🤝 Contributing

Contributions are welcome! Areas for improvement:

- Additional model variants
- New visualization options
- Performance optimizations
- UI/UX enhancements

πŸ“š References

πŸ“„ License

This project combines implementations from:

  • Depth Anything v1: MIT License
  • Depth Anything v2: Apache 2.0 License
  • Demo code: MIT License

Please check individual component licenses for specific terms.

πŸ™ Acknowledgments

  • Original Depth Anything authors and contributors
  • HuggingFace team for Spaces and ZeroGPU infrastructure
  • Gradio team for the excellent UI framework

Note: This is a demonstration/comparison tool. For production use of the Depth Anything models, please refer to the original repositories and follow their recommended practices.