shriarul5273's picture
Added Depth-Anything-3 models
e7682cd

A newer version of the Gradio SDK is available: 6.1.0

Upgrade

Depth Anything 3 AnySize

πŸ”„ Key Modifications from the Original Repo

  • Native-Resolution Inputs: Images are now processed at their original resolution by default. During inference, inputs are padded to the ViT patch size, and outputs (depth/confidence/sky maps and processed images) are cropped back to the source height and width. Using larger inputs now will increase memory and compute requirements.
  • Updated Defaults: The CLI defaults to --process-res None --process-res-method keep, and the API uses process_res=None, process_res_method="keep". See docs/CLI.md and docs/API.md for details.
  • Optional Downscaling: For faster inference and lower memory usage, set process_res (e.g., 720) with a resize strategy like --process-res-method upper_bound_resize.
  • Original Baseline: Previously, images were resized to 504 px on the long side.
  • Implementation Details: Input padding is handled in src/depth_anything_3/utils/io/input_processor.py, and output cropping is managed in src/depth_anything_3/api.py.