+---
+license: apache-2.0
+base_model: depth-anything/Depth-Anything-V2-Small
+tags:
+  - robotics
+  - edge-deployment
+  - anima
+  - forge
+  - depth-estimation
+  - monocular-depth
+  - safetensors
+  - vision
+  - ros2
+  - jetson
+  - real-time
+library_name: transformers
+pipeline_tag: depth-estimation
+model-index:
+  - name: depth-anything-v2-small
+    results:
+      - task:
+          type: depth-estimation
+        metrics:
+          - name: Model Size (MB)
+            type: model_size
+            value: 95
+---
+# Depth Anything V2 Small — SafeTensors
+> Depth Anything V2 (Small, ViT-S backbone) converted to SafeTensors for real-time robotic depth estimation. At **95 MB**, it is the smallest model in the Depth Anything V2 family, which makes it a good fit for memory-constrained edge devices such as Jetson Nano.
+This model is part of the **[RobotFlowLabs](https://huggingface.co/robotflowlabs)** model library, built for the **ANIMA** agentic robotics platform.
+## Why This Model Exists
+Depth estimation needs to run alongside segmentation, feature extraction, and action models, all on the same edge GPU. At 95 MB, Depth Anything V2 Small leaves most of the memory budget for the rest of the perception stack while still producing high-quality relative depth maps. The weights were converted from the original `.pth` checkpoint to SafeTensors, which avoids pickle-based code execution on load and supports zero-copy loading.
+## Model Details
+| Property | Value |
+|----------|-------|
+| **Architecture** | DPT head + ViT-Small encoder |
+| **Parameters** | 24.8M |
+| **Encoder** | ViT-S/14 (DINOv2-based) |
+| **Input Resolution** | Flexible (recommended 518×518) |
+| **Output** | Dense relative depth map |
+| **Original Model** | [`depth-anything/Depth-Anything-V2-Small`](https://huggingface.co/depth-anything/Depth-Anything-V2-Small) |
+| **License** | Apache-2.0 |
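Because the encoder is ViT-S/14, input height and width are normally kept at multiples of the 14-pixel patch size. The sketch below shows one way to pick such a size near the recommended 518 pixels while roughly preserving aspect ratio. `snap_to_patch_grid` is a hypothetical helper written for illustration; it is not part of the upstream repository, which applies its own resize transform internally.

```python
def snap_to_patch_grid(height, width, target=518, patch=14):
    """Return (new_height, new_width), both multiples of `patch`.

    The image is scaled so its shorter side is about `target` pixels,
    then each side is rounded to the nearest multiple of `patch`.
    """
    scale = target / min(height, width)
    new_h = max(patch, round(height * scale / patch) * patch)
    new_w = max(patch, round(width * scale / patch) * patch)
    return new_h, new_w
```

For a 480×640 camera frame this yields a size whose sides are both divisible by 14, so the image tiles cleanly into patches.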
+## Quick Start
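+```python
+import cv2
+from safetensors.torch import load_file
+# Requires the original Depth-Anything-V2 repository on PYTHONPATH
+from depth_anything_v2.dpt import DepthAnythingV2
+
+# ViT-S configuration
+model = DepthAnythingV2(encoder='vits', features=64, out_channels=[48, 96, 192, 384])
+model.load_state_dict(load_file("model.safetensors"))
+model.to("cuda").eval()
+
+image = cv2.imread("frame.png")  # placeholder path; BGR uint8 array, HxWx3
+depth = model.infer_image(image)  # HxW array of relative depth values
+```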
+## Use Cases in ANIMA
+- **Real-Time Obstacle Avoidance**: Lowest-latency variant in the Depth Anything V2 family, aimed at depth estimation for navigation at camera frame rate
+- **Grasp Distance**: Quick relative depth estimate for reach planning
+- **Mobile Robots**: Small enough for Jetson Nano-class devices that also run other models
+- **Multi-Camera Setups**: Small enough to run one instance per camera
+## Depth Anything V2 Family
+| Model | Params | Size | Best For |
+|-------|--------|------|----------|
+| [depth-anything-v2-large](https://huggingface.co/robotflowlabs/depth-anything-v2-large) | 335M | 1.3 GB | Highest quality depth |
+| **[depth-anything-v2-small](https://huggingface.co/robotflowlabs/depth-anything-v2-small)** | **24.8M** | **95 MB** | **Real-time edge deployment** |
+## Limitations
+- Relative depth only — not metric (needs calibration for absolute distances)
+- Lower accuracy than the Large variant on complex scenes
+- Single-frame estimation — no temporal consistency
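The calibration mentioned in the first limitation can be sketched as a scale-and-shift fit. This assumes the model output behaves like affine-invariant inverse depth (larger values are closer) and that a few metric reference distances are available, for example from a range sensor at known pixels. Both function names are hypothetical and not part of the upstream code; treat this as a starting point rather than a validated calibration procedure.

```python
def fit_scale_shift(pred_values, metric_depths):
    """Least-squares fit of (s, t) so that s * pred + t ~= 1 / metric_depth."""
    xs = [float(p) for p in pred_values]
    ys = [1.0 / float(d) for d in metric_depths]  # work in inverse-depth space
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two reference measurements")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0.0:
        raise ValueError("reference predictions must not all be equal")
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s = cov_xy / var_x
    t = mean_y - s * mean_x
    return s, t


def to_metric(pred_value, s, t, eps=1e-6):
    """Map one relative prediction to metres using the fitted scale and shift."""
    inv_depth = max(s * pred_value + t, eps)  # guard against division by zero
    return 1.0 / inv_depth
```

In use, `fit_scale_shift` would be called once with the model outputs at the reference pixels and the matching measured distances, and `to_metric` applied to the remaining predictions. The fit is only valid for the frame (or scene) it was computed on, since relative predictions can shift between frames.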
+## Attribution
+- **Original Model**: [`depth-anything/Depth-Anything-V2-Small`](https://huggingface.co/depth-anything/Depth-Anything-V2-Small) by HKU & TikTok
+- **License**: [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0)
+- **Paper**: [Depth Anything V2](https://arxiv.org/abs/2406.09414) — Yang et al., 2024
+- **Converted by**: [RobotFlowLabs](https://huggingface.co/robotflowlabs) using [FORGE](https://github.com/robotflowlabs/forge)
+## Citation
+```bibtex
+@article{yang2024depth_anything_v2,
+  title={Depth Anything V2},
+  author={Yang, Lihe and Kang, Bingyi and Huang, Zilong and Zhao, Zhen and Xu, Xiaogang and Feng, Jiashi and Zhao, Hengshuang},
+  journal={arXiv preprint arXiv:2406.09414},
+  year={2024}
+}
+```
+---
+<p align="center">
+  <b>Built with FORGE by <a href="https://huggingface.co/robotflowlabs">RobotFlowLabs</a></b><br>
+  Optimizing foundation models for real robots.
+</p>

model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:07635472183d063a80e6b2d78645ab944a9210fd973b1c88e4b7e3ba23981f75
+size 99165428