---
license: apache-2.0
base_model: depth-anything/Depth-Anything-V2-Small
tags:
- robotics
- edge-deployment
- anima
- forge
- depth-estimation
- monocular-depth
- safetensors
- vision
- ros2
- jetson
- real-time
library_name: transformers
pipeline_tag: depth-estimation
model-index:
- name: depth-anything-v2-small
  results:
  - task:
      type: depth-estimation
    metrics:
    - name: Model Size (MB)
      type: model_size
      value: 95
---

# Depth Anything V2 Small — SafeTensors

> Depth Anything V2 (Small, ViT-S backbone) converted to SafeTensors for real-time robotic depth estimation. At roughly **95 MB**, it is among the lightest production-quality monocular depth models available, making it well suited to edge devices such as the Jetson Nano.

This model is part of the **[RobotFlowLabs](https://huggingface.co/robotflowlabs)** model library, built for the **ANIMA** agentic robotics platform.

## Why This Model Exists

Depth estimation needs to run alongside segmentation, feature-extraction, and action models, all on the same edge GPU. At 95 MB, Depth Anything V2 Small is small enough to fit into any perception stack while still producing high-quality relative depth maps. The weights were converted from the raw `.pth` checkpoint to SafeTensors for safe, zero-copy loading.

## Model Details

| Property | Value |
|----------|-------|
| **Architecture** | DPT head + ViT-Small encoder |
| **Parameters** | 24.8M |
| **Encoder** | ViT-S/14 (DINOv2-based) |
| **Input Resolution** | Flexible (recommended 518×518) |
| **Output** | Dense relative depth map |
| **Original Model** | [`depth-anything/Depth-Anything-V2-Small`](https://huggingface.co/depth-anything/Depth-Anything-V2-Small) |
| **License** | Apache-2.0 |

## Quick Start

```python
import cv2
from safetensors.torch import load_file

# DepthAnythingV2 is defined in the official repository:
# https://github.com/DepthAnything/Depth-Anything-V2
from depth_anything_v2.dpt import DepthAnythingV2

# ViT-S configuration matching this checkpoint
model = DepthAnythingV2(encoder="vits", features=64, out_channels=[48, 96, 192, 384])
model.load_state_dict(load_file("model.safetensors"))
model.to("cuda").eval()

image = cv2.imread("example.jpg")  # BGR image, as infer_image expects
depth = model.infer_image(image)   # (H, W) relative depth map as a NumPy array
```

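The returned map is relative depth, so downstream consumers usually min-max normalize it before visualization or thresholding. A minimal sketch, NumPy only, where the synthetic gradient stands in for a real model prediction:

```python
import numpy as np

def normalize_depth(depth: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Min-max normalize a relative depth map to [0, 1]."""
    d_min, d_max = depth.min(), depth.max()
    return (depth - d_min) / (d_max - d_min + eps)

# Synthetic gradient standing in for a real (H, W) prediction
depth = np.tile(np.linspace(0.0, 10.0, 64), (48, 1))
vis = (normalize_depth(depth) * 255).round().astype(np.uint8)  # ready for cv2.imwrite
```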
## Use Cases in ANIMA

- **Real-Time Obstacle Avoidance** — Fast depth estimation for navigation at camera framerate
- **Grasp Distance** — Quick depth estimates for reach planning
- **Mobile Robots** — Fits on Jetson Nano-class devices alongside other models
- **Multi-Camera Setups** — Small enough to run one instance per camera

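For the obstacle-avoidance case, a depth map can be reduced to a binary "too close" mask with a simple threshold. A hedged illustration (NumPy only; the threshold value and the larger-means-closer convention are assumptions to adapt per deployment):

```python
import numpy as np

def obstacle_mask(depth: np.ndarray, thresh: float) -> np.ndarray:
    """Mark pixels whose relative depth value exceeds `thresh` as obstacles.

    Assumes larger values mean closer, as with disparity-style outputs.
    """
    return depth > thresh

# Tiny 2x2 example standing in for a full-resolution depth map
depth = np.array([[0.1, 0.9], [0.5, 0.2]])
mask = obstacle_mask(depth, thresh=0.6)
blocked = bool(mask.any())  # True when anything is within the stop distance
```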
## Depth Anything V2 Family

| Model | Params | Size | Best For |
|-------|--------|------|----------|
| [depth-anything-v2-large](https://huggingface.co/robotflowlabs/depth-anything-v2-large) | 335M | 1.3 GB | Highest quality depth |
| **[depth-anything-v2-small](https://huggingface.co/robotflowlabs/depth-anything-v2-small)** | **24.8M** | **95 MB** | **Real-time edge deployment** |

## Limitations

- Relative depth only — not metric (needs calibration for absolute distances)
- Lower accuracy than the Large variant on complex scenes
- Single-frame estimation — no temporal consistency

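The first limitation can often be worked around when a few metric reference distances are available (e.g. from a rangefinder or objects of known size): fit a per-scene scale and shift by least squares and apply it to the whole map. A minimal sketch under that assumption; whether to fit on depth or inverse depth depends on the model's output convention, so treat this as an illustration:

```python
import numpy as np

def fit_scale_shift(pred: np.ndarray, metric: np.ndarray) -> tuple[float, float]:
    """Least-squares fit of metric ≈ s * pred + t at sparse reference pixels."""
    A = np.stack([pred, np.ones_like(pred)], axis=1)  # (N, 2) design matrix
    (s, t), *_ = np.linalg.lstsq(A, metric, rcond=None)
    return float(s), float(t)

# Synthetic check: relative values that are exactly metric depth scaled/shifted
rng = np.random.default_rng(0)
pred = rng.uniform(0.1, 1.0, size=20)     # model output at reference pixels
metric = 3.0 * pred + 0.5                 # measured distances (s=3.0, t=0.5)
s, t = fit_scale_shift(pred, metric)
metric_map = s * pred + t                 # apply to the full depth map in practice
```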
## Attribution

- **Original Model**: [`depth-anything/Depth-Anything-V2-Small`](https://huggingface.co/depth-anything/Depth-Anything-V2-Small) by HKU & TikTok
- **License**: [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0)
- **Paper**: [Depth Anything V2](https://arxiv.org/abs/2406.09414) — Yang et al., 2024
- **Converted by**: [RobotFlowLabs](https://huggingface.co/robotflowlabs) using [FORGE](https://github.com/robotflowlabs/forge)

## Citation

```bibtex
@article{yang2024depth_anything_v2,
  title={Depth Anything V2},
  author={Yang, Lihe and Kang, Bingyi and Huang, Zilong and Zhao, Zhen and Xu, Xiaogang and Feng, Jiashi and Zhao, Hengshuang},
  journal={arXiv preprint arXiv:2406.09414},
  year={2024}
}
```

---

<p align="center">
  <b>Built with FORGE by <a href="https://huggingface.co/robotflowlabs">RobotFlowLabs</a></b><br>
  Optimizing foundation models for real robots.
</p>