---
title: Depth Anything Compare Demo
emoji: πŸ‘€
colorFrom: indigo
colorTo: indigo
sdk: gradio
sdk_version: 5.46.0
app_file: app.py
pinned: false
---

# Depth Anything v1 vs v2 Comparison Demo

A comparison tool for the Depth Anything v1 and Depth Anything v2 models, built with Gradio and optimized for HuggingFace Spaces with ZeroGPU support.

πŸš€ Features

### Three Comparison Modes

1. 🎚️ Slider Comparison: Interactive side-by-side comparison with a draggable slider
2. 🔍 Method Comparison: Traditional side-by-side view with model labels
3. 🔬 Single Model: Run individual models for detailed analysis

### Supported Models

#### Depth Anything v1

- ViT-S (Small): Fastest inference, good quality
- ViT-B (Base): Balanced speed and quality
- ViT-L (Large): Best quality, slower inference

#### Depth Anything v2

- ViT-Small: Enhanced small model with improved accuracy
- ViT-Base: Balanced performance with v2 improvements
- ViT-Large: State-of-the-art depth estimation quality

πŸ–ΌοΈ Example Images

The demo includes 20+ carefully selected example images showcasing various scenarios:

  • Indoor and outdoor scenes
  • Different lighting conditions
  • Various object types and compositions
  • Challenging depth estimation scenarios

πŸ› οΈ Technical Details

Architecture

  • Framework: Gradio 4.0+ with modern UI components
  • Backend: PyTorch with CUDA acceleration
  • Deployment: ZeroGPU-optimized for HuggingFace Spaces
  • Memory Management: Automatic model loading/unloading for efficient GPU usage

### ZeroGPU Optimizations

- `@spaces.GPU` decorators for GPU-intensive functions
- Automatic memory cleanup between inferences
- On-demand model loading to prevent OOM errors
- Efficient resource allocation and deallocation
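The decorator pattern above can be sketched as follows. This is a minimal illustration, not the app's actual code: `predict_depth` and the import fallback are assumptions, and on Spaces the `spaces` package is provided by the platform.

```python
import torch

try:
    import spaces  # ZeroGPU helper, available on HuggingFace Spaces
    gpu = spaces.GPU
except ImportError:  # local fallback so this sketch runs anywhere
    def gpu(fn):
        return fn

@gpu  # on Spaces, a GPU is allocated only for the duration of this call
def predict_depth(model, image):
    """Run one inference, then release GPU memory (names are illustrative)."""
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = model.to(device)
    with torch.no_grad():
        depth = model(image.to(device))
    model.to("cpu")               # unload the model from the GPU
    if device == "cuda":
        torch.cuda.empty_cache()  # cleanup between inferences
    return depth.cpu()
```

Moving the model back to the CPU and emptying the cache after each call is what keeps a single ZeroGPU allocation from accumulating resident weights across requests.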

### Depth Visualization

- Colormap: `Spectral_r` colormap for intuitive depth representation
- Normalization: Min-max scaling for consistent visualization
- Resolution: Maintains original image aspect ratios
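The colormap and normalization steps combine into a few lines; this sketch uses matplotlib's `Spectral_r`, and the function name is illustrative rather than the app's actual helper.

```python
import numpy as np
import matplotlib.pyplot as plt

def colorize_depth(depth: np.ndarray) -> np.ndarray:
    """Map a raw depth map (H, W) to an RGB uint8 image via Spectral_r."""
    d_min, d_max = depth.min(), depth.max()
    norm = (depth - d_min) / (d_max - d_min + 1e-8)  # min-max scale to [0, 1]
    rgba = plt.get_cmap("Spectral_r")(norm)          # (H, W, 4) floats in [0, 1]
    return (rgba[..., :3] * 255).astype(np.uint8)    # drop alpha, to uint8
```

The small epsilon guards against division by zero on a constant-depth input.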

πŸ“¦ Installation & Setup

Local Development

  1. Clone the repository:
git clone <repository-url>
cd Depth-Anything-Compare-demo
  1. Install dependencies:
pip install -r requirements.txt
  1. Download model checkpoints (for local usage):
# Depth Anything v1 models are downloaded automatically from HuggingFace Hub
# For v2 models, download checkpoints to Depth-Anything-V2/checkpoints/
  1. Run locally:
python app_local.py  # For local development
python app.py        # For ZeroGPU deployment

### HuggingFace Spaces Deployment

This app is optimized for HuggingFace Spaces with ZeroGPU. Simply:

  1. Upload the repository to your HuggingFace Space
  2. Set hardware to "ZeroGPU"
  3. The app will automatically handle GPU allocation and model loading

πŸ“ Project Structure

Depth-Anything-Compare-demo/
β”œβ”€β”€ app.py                 # ZeroGPU-optimized main application
β”œβ”€β”€ app_local.py          # Local development version
β”œβ”€β”€ requirements.txt      # Python dependencies
β”œβ”€β”€ README.md            # This file
β”œβ”€β”€ assets/
β”‚   └── examples/        # Example images for testing
β”œβ”€β”€ Depth-Anything/      # Depth Anything v1 implementation
β”‚   β”œβ”€β”€ depth_anything/
β”‚   β”‚   β”œβ”€β”€ dpt.py      # v1 model architecture
β”‚   β”‚   └── util/       # v1 utilities and transforms
β”‚   └── torchhub/       # Required dependencies
└── Depth-Anything-V2/   # Depth Anything v2 implementation
    β”œβ”€β”€ depth_anything_v2/
    β”‚   β”œβ”€β”€ dpt.py      # v2 model architecture
    β”‚   └── dinov2_layers/ # DINOv2 components
    └── assets/
        └── examples/    # v2-specific examples

πŸ”§ Configuration

Model Configuration

Models are configured in the respective config dictionaries:

  • V1_MODEL_CONFIGS: HuggingFace Hub model identifiers
  • V2_MODEL_CONFIGS: Local checkpoint paths and architecture parameters
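The two dictionaries might look roughly like the sketch below. The repo ids, checkpoint paths, and architecture parameters shown here are assumptions illustrating the shape, not necessarily the values in `app.py`.

```python
# Hypothetical contents; entries are assumptions, not the app's actual values.
V1_MODEL_CONFIGS = {
    "ViT-S": "LiheYoung/depth_anything_vits14",  # HuggingFace Hub id
    "ViT-B": "LiheYoung/depth_anything_vitb14",
    "ViT-L": "LiheYoung/depth_anything_vitl14",
}

V2_MODEL_CONFIGS = {
    "ViT-Small": {
        "checkpoint": "Depth-Anything-V2/checkpoints/depth_anything_v2_vits.pth",
        "encoder": "vits",                 # architecture parameters
        "features": 64,
        "out_channels": [48, 96, 192, 384],
    },
    "ViT-Large": {
        "checkpoint": "Depth-Anything-V2/checkpoints/depth_anything_v2_vitl.pth",
        "encoder": "vitl",
        "features": 256,
        "out_channels": [256, 512, 1024, 1024],
    },
}
```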

### Environment Variables

- `DEVICE`: Automatically detects CUDA availability
- GPU memory is managed automatically by ZeroGPU
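The `DEVICE` detection amounts to a one-liner, sketched here with PyTorch:

```python
import torch

# Pick the GPU when CUDA is available, otherwise fall back to CPU.
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
```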

πŸ“Š Performance

Inference Times (Approximate)

  • ViT-S models: ~1-2 seconds
  • ViT-B models: ~2-4 seconds
  • ViT-L models: ~4-8 seconds

Times vary based on image resolution and GPU availability

### Memory Usage

- Optimized for ZeroGPU's memory constraints
- Automatic model unloading prevents OOM errors
- Efficient batch processing for multiple comparisons
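The unload-before-load pattern behind these points can be sketched as below; `get_model` and `build_fn` are placeholder names for whatever constructs the selected model, not the app's actual identifiers.

```python
import gc
import torch

_current_model = None  # keep at most one model resident at a time

def get_model(build_fn):
    """Build a model on demand, unloading any previously loaded one first."""
    global _current_model
    if _current_model is not None:
        _current_model = None         # drop the old reference
        gc.collect()                  # reclaim it promptly
        if torch.cuda.is_available():
            torch.cuda.empty_cache()  # return freed memory to the GPU pool
    _current_model = build_fn()
    return _current_model
```

Dropping the old reference before constructing the new model is what keeps peak memory near one model's footprint instead of two.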

## 🎯 Usage Examples

### Compare v1 vs v2 Models

1. Upload an image or select from examples
2. Choose models from both v1 and v2 families
3. Click "Compare" or "Slider Compare"
4. Analyze the depth estimation differences

### Analyze Single Model Performance

1. Select the "Single Model" tab
2. Choose any available model
3. Upload an image and click "Run"
4. Examine the detailed depth map output

## 🤝 Contributing

Contributions are welcome! Areas for improvement:

- Additional model variants
- New visualization options
- Performance optimizations
- UI/UX enhancements

πŸ“š References

πŸ“„ License

This project combines implementations from:

  • Depth Anything v1: MIT License
  • Depth Anything v2: Apache 2.0 License
  • Demo code: MIT License

Please check individual component licenses for specific terms.

πŸ™ Acknowledgments

  • Original Depth Anything authors and contributors
  • HuggingFace team for Spaces and ZeroGPU infrastructure
  • Gradio team for the excellent UI framework

Note: This is a demonstration/comparison tool. For production use of the Depth Anything models, please refer to the original repositories and follow their recommended practices.