# Segment Anything 3 (SAM 3) ONNX Models
ONNX-exported version of Meta's Segment Anything Model 3 (SAM 3), an open-vocabulary segmentation model that accepts text prompts in addition to points and rectangles.
SAM 3 uses a CLIP-based language encoder to let you describe objects in natural language (e.g., "truck", "person with hat") and segment them without task-specific training.
These models power AI-assisted image annotation in AnyLabeling and were exported with samexporter.
## Available Models

| File | Contents | Description |
|---|---|---|
| `sam3_vit_h.zip` | 3 ONNX files | SAM 3 ViT-H (all components) |
The zip contains three ONNX components that work together:
| ONNX File | Role | Runs |
|---|---|---|
| `sam3_image_encoder.onnx` | Extracts visual features from the input image | Once per image |
| `sam3_language_encoder.onnx` | Encodes text prompt tokens into feature vectors | Once per text query |
| `sam3_decoder.onnx` | Produces segmentation masks given image + language features | Per prompt |
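If you want to drive the components yourself, a minimal loading sketch with ONNX Runtime might look like the following. The `component_paths` and `load_sessions` helpers are hypothetical conveniences, and the code assumes `onnxruntime` is installed; it makes no assumptions about the models' tensor names.

```python
import os

# The three component files expected inside sam3_vit_h.zip.
COMPONENTS = {
    "image_encoder": "sam3_image_encoder.onnx",
    "language_encoder": "sam3_language_encoder.onnx",
    "decoder": "sam3_decoder.onnx",
}

def component_paths(model_dir):
    """Map each component role to its expected file path under model_dir."""
    return {role: os.path.join(model_dir, name) for role, name in COMPONENTS.items()}

def load_sessions(model_dir):
    """Create one ONNX Runtime session per component (requires onnxruntime)."""
    import onnxruntime as ort  # deferred import; install with: pip install onnxruntime
    return {
        role: ort.InferenceSession(path, providers=["CPUExecutionProvider"])
        for role, path in component_paths(model_dir).items()
    }
```

Loading all three sessions up front lets you keep the expensive image encoder warm while issuing many decoder calls interactively.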
## Prompt Types
SAM 3 supports three prompt modalities:
| Prompt | Description |
|---|---|
| Text | Natural-language description, e.g. "truck" (unique to SAM 3) |
| Point | Click to add positive (+) or negative (−) points that include or exclude regions |
| Rectangle | Draw a bounding box around the target object |
Text prompts are the recommended workflow: they enable open-vocabulary detection, so you can label any object class without retraining.
## Use with AnyLabeling (Recommended)

AnyLabeling is a desktop annotation tool with a built-in model manager that downloads, caches, and runs these models automatically; no coding required.
- Install: `pip install anylabeling`
- Launch: `anylabeling`
- Click the Brain button and select Segment Anything 3 (ViT-H) from the dropdown
- Type a text description (e.g., `truck`) in the text prompt field
- Optionally refine with point/rectangle prompts
## Use Programmatically with ONNX Runtime

Download and extract the model bundle:

```python
import urllib.request, zipfile

url = "https://huggingface.co/vietanhdev/segment-anything-3-onnx-models/resolve/main/sam3_vit_h.zip"
urllib.request.urlretrieve(url, "sam3_vit_h.zip")
with zipfile.ZipFile("sam3_vit_h.zip") as z:
    z.extractall("sam3")
```
Then use samexporter's inference module:

```shell
pip install samexporter

# Text prompt
python -m samexporter.inference \
    --sam_variant sam3 \
    --encoder_model sam3/sam3_image_encoder.onnx \
    --decoder_model sam3/sam3_decoder.onnx \
    --language_encoder_model sam3/sam3_language_encoder.onnx \
    --image photo.jpg \
    --prompt prompt.json \
    --text_prompt "truck" \
    --output result.png
```
Example `prompt.json` for a text-only query:

```json
[{"type": "text", "data": "truck"}]
```
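Prompt types can presumably be combined in the same file to refine a text query with a positive click; the point entry below follows the prompt convention samexporter uses for earlier SAM variants and is an illustrative assumption for SAM 3 (the coordinates are arbitrary):

```json
[
  {"type": "text", "data": "truck"},
  {"type": "point", "data": [575, 750], "label": 1}
]
```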
## Model Architecture
SAM 3 follows the same encoder/decoder pattern as SAM and SAM 2, with an added CLIP-based language branch:
```
Input image ──► Image Encoder ──────────┐
                                        ▼
Text prompt ──► Language Encoder ──► Decoder ──► Masks + Scores + Boxes
                                        ▲
Optional: point / box prompts ──────────┘
```
The image encoder runs once per image and caches features. The language encoder runs once per text query. The decoder is lightweight and runs interactively for each prompt combination.
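The run-frequency split above can be sketched as a simple caching loop. The stubs below are pure-Python stand-ins for the three ONNX sessions (`encode_image`, `encode_text`, and `decode` are hypothetical placeholders, not samexporter APIs):

```python
from functools import lru_cache

calls = {"image": 0, "text": 0, "decode": 0}

@lru_cache(maxsize=8)
def encode_image(image_path):
    # Stand-in for sam3_image_encoder.onnx: heavy, so cache once per image.
    calls["image"] += 1
    return f"feat({image_path})"

@lru_cache(maxsize=32)
def encode_text(prompt):
    # Stand-in for sam3_language_encoder.onnx: cache once per unique text query.
    calls["text"] += 1
    return f"emb({prompt})"

def decode(image_path, prompt):
    # Stand-in for sam3_decoder.onnx: lightweight, runs for every prompt combination.
    calls["decode"] += 1
    return (encode_image(image_path), encode_text(prompt))

# Interactive session: one image, two text queries, three decoder calls.
decode("photo.jpg", "truck")
decode("photo.jpg", "truck")   # both encoders hit their caches
decode("photo.jpg", "person")  # only the language encoder runs again
print(calls)  # {'image': 1, 'text': 2, 'decode': 3}
```

The counters make the cost model concrete: repeated prompts on the same image only ever pay for the cheap decoder pass.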
## Re-export from Source

To re-export or customize the models using samexporter:

```shell
pip install samexporter

# Export all three SAM 3 ONNX components
python -m samexporter.export_sam3 --output_dir output_models/sam3

# Or use the convenience script:
bash convert_sam3.sh
```
## Custom Model Config for AnyLabeling

To use a locally re-exported SAM 3 as a custom model in AnyLabeling, create a `config.yaml`:

```yaml
type: segment_anything
name: sam3_vit_h_custom
display_name: Segment Anything 3 (ViT-H)
encoder_model_path: sam3_image_encoder.onnx
decoder_model_path: sam3_decoder.onnx
language_encoder_path: sam3_language_encoder.onnx
input_size: 1008
max_height: 1008
max_width: 1008
```
Then load it via Brain button → Load Custom Model in AnyLabeling.
## Related Repositories
| Repo | Description |
|---|---|
| vietanhdev/samexporter | Export scripts, inference code, conversion tools |
| vietanhdev/anylabeling | Desktop annotation app powered by these models |
| facebook/sam3 | Original SAM 3 PyTorch checkpoint by Meta |
## License
The ONNX models are derived from Meta's SAM 3, released under the SAM License. The export code is part of samexporter, released under the MIT license.