Commit 9effc01 by vietanhdev (verified) · 1 Parent(s): b112164

docs: rewrite model card with variants table, citation, AnyLabeling cross-link

Files changed (1)
  1. README.md +97 -101
README.md CHANGED
@@ -1,101 +1,97 @@
- ---
- license: apache-2.0
- tags:
- - image-segmentation
- - segment-anything
- - onnx
- - onnxruntime
- library_name: onnxruntime
- ---
-
- # Segment Anything (SAM + MobileSAM) — ONNX Models
-
- ONNX-exported versions of Meta's [Segment Anything Model (SAM)](https://github.com/facebookresearch/segment-anything) and [MobileSAM](https://github.com/ChaoningZhang/MobileSAM), ready for CPU/GPU inference with [ONNX Runtime](https://onnxruntime.ai/) — no PyTorch required at runtime.
-
- These models are used by **[AnyLabeling](https://github.com/vietanhdev/anylabeling)** for AI-assisted image annotation, and exported by **[samexporter](https://github.com/vietanhdev/samexporter)**.
-
- ## Available Models
-
- | File | Variant | Encoder size | Notes |
- |------|---------|-------------|-------|
- | `sam_vit_b_01ec64.zip` | SAM ViT-B | ~90 MB | Fastest, lowest accuracy |
- | `sam_vit_b_01ec64_quant.zip` | SAM ViT-B (Quant) | ~25 MB | Quantized — smaller & faster |
- | `sam_vit_l_0b3195.zip` | SAM ViT-L | ~330 MB | Good balance |
- | `sam_vit_l_0b3195_quant.zip` | SAM ViT-L (Quant) | ~83 MB | Quantized — smaller & faster |
- | `sam_vit_h_4b8939.zip` | SAM ViT-H | ~630 MB | Highest accuracy |
- | `sam_vit_h_4b8939_quant.zip` | SAM ViT-H (Quant) | ~158 MB | Quantized — smaller & faster |
- | `mobile_sam_20230629.zip` | MobileSAM | ~9 MB | Ultra-lightweight |
-
- Each zip contains two ONNX files: an **encoder** (runs once per image) and a **decoder** (runs interactively for each prompt).
-
- ## Prompt Types
-
- - **Point** (`+point` / `-point`): click to include/exclude regions
- - **Rectangle**: draw a bounding box around the target object
-
- ## Use with AnyLabeling (Recommended)
-
- [AnyLabeling](https://github.com/vietanhdev/anylabeling) is a desktop annotation tool with a built-in model manager that downloads, caches, and runs these models automatically — no coding required.
-
- 1. Install: `pip install anylabeling`
- 2. Launch: `anylabeling`
- 3. Click the **Brain** button → select a SAM model from the dropdown
- 4. Use point or rectangle prompts to segment objects
-
- [![AnyLabeling demo](https://user-images.githubusercontent.com/18329471/236625792-07f01838-3f69-48b0-a12e-30bad27bd921.gif)](https://github.com/vietanhdev/anylabeling)
-
- ## Use Programmatically with ONNX Runtime
-
- ```python
- import urllib.request, zipfile, pathlib
- # Download and extract
- url = "https://huggingface.co/vietanhdev/segment-anything-onnx-models/resolve/main/sam_vit_b_01ec64.zip"
- urllib.request.urlretrieve(url, "sam_vit_b_01ec64.zip")
- with zipfile.ZipFile("sam_vit_b_01ec64.zip") as z:
-     z.extractall("sam_vit_b_01ec64")
- ```
-
- Then use [samexporter](https://github.com/vietanhdev/samexporter)'s inference module:
-
- ```bash
- pip install samexporter
- python -m samexporter.inference \
-     --encoder_model sam_vit_b_01ec64/sam_vit_b_encoder.onnx \
-     --decoder_model sam_vit_b_01ec64/sam_vit_b_decoder.onnx \
-     --image photo.jpg \
-     --prompt prompt.json \
-     --output result.png
- ```
-
- ## Re-export from Source
-
- To re-export or customize the models using [samexporter](https://github.com/vietanhdev/samexporter):
-
- ```bash
- pip install samexporter
- # Export SAM ViT-H encoder + decoder
- python -m samexporter.export_encoder \
-     --checkpoint original_models/sam_vit_h_4b8939.pth \
-     --output output_models/sam_vit_h_4b8939.encoder.onnx \
-     --model-type vit_h --use-preprocess
- python -m samexporter.export_decoder \
-     --checkpoint original_models/sam_vit_h_4b8939.pth \
-     --output output_models/sam_vit_h_4b8939.decoder.onnx \
-     --model-type vit_h --return-single-mask
- # Or convert all SAM variants at once:
- bash convert_all_meta_sam.sh
- ```
-
- ## Related Repositories
-
- | Repo | Description |
- |------|-------------|
- | [vietanhdev/samexporter](https://github.com/vietanhdev/samexporter) | Export scripts, inference code, conversion tools |
- | [vietanhdev/anylabeling](https://github.com/vietanhdev/anylabeling) | Desktop annotation app powered by these models |
- | [facebookresearch/segment-anything](https://github.com/facebookresearch/segment-anything) | Original SAM by Meta |
- | [ChaoningZhang/MobileSAM](https://github.com/ChaoningZhang/MobileSAM) | Original MobileSAM |
-
- ## License
-
- The ONNX models are derived from Meta's SAM and MobileSAM, both released under the **Apache 2.0** license.
- The export code is part of [samexporter](https://github.com/vietanhdev/samexporter), released under the **MIT** license.
 
+ ---
+ license: apache-2.0
+ pipeline_tag: image-segmentation
+ library_name: onnx
+ tags:
+ - onnxruntime
+ - onnx
+ - segment-anything
+ - image-segmentation
+ - edge-ai
+ - anylabeling
+ authors:
+ - Viet-Anh Nguyen
+ ---
+
+ # Segment Anything (SAM) — ONNX Models
+
+ ONNX exports of Meta's original [Segment Anything](https://github.com/facebookresearch/segment-anything) family, plus [MobileSAM](https://github.com/ChaoningZhang/MobileSAM), packaged for direct use with [`onnxruntime`](https://onnxruntime.ai) and [AnyLabeling](https://github.com/vietanhdev/anylabeling).
+
+ ## Why this repo exists
+
+ Running SAM through the original PyTorch checkpoint is heavy on a CPU laptop or an edge device. ONNX gives you a portable, dependency-light runtime that works in Python, C++, JavaScript, and most embedded targets. These exports are the ones AnyLabeling consumes for its smart-labeling features.
+
+ ## Variants
+
+ Each `.zip` bundles the encoder + decoder ONNX files for that backbone.
+
+ | File | Backbone | Size | Notes |
+ |---|---|---|---|
+ | `mobile_sam_20230629.zip` | MobileSAM | 35 MB | Smallest — best for mobile / low-power |
+ | `mobile_sam_20230629_quant.zip` | MobileSAM | 10.5 MB | Quantized MobileSAM |
+ | `sam_vit_b_01ec64.zip` | ViT-B | 332 MB | Base |
+ | `sam_vit_b_01ec64_quant.zip` | ViT-B | 72 MB | Quantized base |
+ | `sam_vit_l_0b3195.zip` | ViT-L | 1.1 GB | Large |
+ | `sam_vit_l_0b3195_quant.zip` | ViT-L | 213 MB | Quantized large |
+ | `sam_vit_h_4b8939.zip` | ViT-H | 2.3 GB | Huge — best quality |
+ | `sam_vit_h_4b8939_quant.zip` | ViT-H | 422 MB | Quantized huge |
+
+ ## Quick start
+
+ ```bash
+ pip install huggingface_hub onnxruntime
+ ```
+
+ ```python
+ from huggingface_hub import hf_hub_download
+ import zipfile, onnxruntime as ort
+
+ zip_path = hf_hub_download(repo_id="vietanhdev/segment-anything-onnx-models",
+                            filename="sam_vit_b_01ec64_quant.zip")
+ with zipfile.ZipFile(zip_path) as z:
+     z.extractall("./sam_vit_b_quant")
+
+ session = ort.InferenceSession("./sam_vit_b_quant/encoder.onnx",
+                                providers=["CPUExecutionProvider"])
+ # Inspect expected inputs:
+ print([(i.name, i.shape, i.type) for i in session.get_inputs()])
+ ```
+
+ For the full image → mask pipeline (encoder + decoder + prompt handling), see how AnyLabeling wires it: <https://github.com/vietanhdev/anylabeling>
+
+ ## Use with AnyLabeling
+
+ These models drop into AnyLabeling's auto-labeling backend without conversion. See the [AnyLabeling docs](https://github.com/vietanhdev/anylabeling) for the model-config wiring.
+
+ ## Source weights
+
+ - Original SAM weights & license: <https://github.com/facebookresearch/segment-anything>
+ - MobileSAM: <https://github.com/ChaoningZhang/MobileSAM>
+
+ This repo redistributes the same weights in ONNX format. License unchanged from upstream releases (Apache 2.0).
+
+ ## Citation
+
+ ```bibtex
+ @misc{nguyen2026sam_onnx,
+   author = {Nguyen, Viet-Anh and {Neural Research Lab}},
+   title = {Segment Anything ONNX Models},
+   year = {2026},
+   url = {https://huggingface.co/vietanhdev/segment-anything-onnx-models}
+ }
+ ```
+
+ For the underlying model, cite Meta's original SAM paper:
+
+ ```bibtex
+ @article{kirillov2023sam,
+   title = {Segment Anything},
+   author = {Kirillov, Alexander and Mintun, Eric and Ravi, Nikhila and Mao, Hanzi and Rolland, Chloe and Gustafson, Laura and Xiao, Tete and Whitehead, Spencer and Berg, Alexander C. and Lo, Wan-Yen and Doll{\'a}r, Piotr and Girshick, Ross},
+   journal = {arXiv:2304.02643},
+   year = {2023}
+ }
+ ```
+
+ ## Acknowledgments
+
+ Thanks to Meta AI Research for releasing the SAM family, and to the MobileSAM team for their efficient distillation. This repo packages their work for edge inference.
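
Note on the diff above: the removed README refers to files named like `sam_vit_b_encoder.onnx`, while the added quick start opens `encoder.onnx`, so the exact file names inside each zip are not settled by this commit. A minimal sketch of one way to avoid hard-coding them is shown below. The helper `find_sam_onnx_pair` is hypothetical (it is not part of this repo, samexporter, or AnyLabeling), and it assumes only that each extracted `.onnx` file name contains either "encoder" or "decoder".

```python
from pathlib import Path


def find_sam_onnx_pair(extract_dir):
    """Return (encoder_path, decoder_path) found under extract_dir.

    Hypothetical helper: assumes each ONNX file name contains "encoder"
    or "decoder", as in the paths quoted in the two README versions.
    """
    # Search recursively, since a zip may extract into a subfolder.
    onnx_files = sorted(Path(extract_dir).rglob("*.onnx"))
    encoders = [p for p in onnx_files if "encoder" in p.name.lower()]
    decoders = [p for p in onnx_files if "decoder" in p.name.lower()]
    # Fail loudly rather than guess when the layout is not as assumed.
    if len(encoders) != 1 or len(decoders) != 1:
        names = [p.name for p in onnx_files]
        raise FileNotFoundError(
            f"expected one encoder and one decoder under {extract_dir}, found {names}"
        )
    return encoders[0], decoders[0]
```

The two returned paths can then be passed as strings to `onnxruntime.InferenceSession`, in place of the fixed `"./sam_vit_b_quant/encoder.onnx"` path used in the quick start.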