---
license: apache-2.0
tags:
  - image-segmentation
  - segment-anything
  - segment-anything-3
  - open-vocabulary
  - text-to-segmentation
  - onnx
  - onnxruntime
library_name: onnxruntime
base_model:
  - facebook/sam3
---

# Segment Anything 3 (SAM 3) ONNX Models

ONNX-exported version of Meta's Segment Anything Model 3 (SAM 3), an open-vocabulary segmentation model that accepts text prompts in addition to points and rectangles.

SAM 3 uses a CLIP-based language encoder to let you describe objects in natural language (e.g., "truck", "person with hat") and segment them without task-specific training.

These models are used by AnyLabeling for AI-assisted image annotation and were exported with samexporter.

## Available Models

| File | Contents | Description |
|------|----------|-------------|
| `sam3_vit_h.zip` | 3 ONNX files | SAM 3 ViT-H (all components) |

The zip contains three ONNX components that work together:

| ONNX File | Role | Runs |
|-----------|------|------|
| `sam3_image_encoder.onnx` | Extracts visual features from the input image | Once per image |
| `sam3_language_encoder.onnx` | Encodes text prompt tokens into feature vectors | Once per text query |
| `sam3_decoder.onnx` | Produces segmentation masks given image + language features | Per prompt |
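To consume the components directly, each file can be loaded as an ONNX Runtime inference session. A minimal loader sketch (the helper name and the deferred import are conveniences for this example, not part of samexporter):

```python
import os

SAM3_COMPONENTS = [
    "sam3_image_encoder.onnx",
    "sam3_language_encoder.onnx",
    "sam3_decoder.onnx",
]

def load_sam3_sessions(model_dir):
    """Load the three SAM 3 ONNX components as onnxruntime sessions."""
    paths = [os.path.join(model_dir, name) for name in SAM3_COMPONENTS]
    missing = [p for p in paths if not os.path.exists(p)]
    if missing:
        raise FileNotFoundError(f"Missing ONNX files: {missing}")
    # Deferred import so the file check above runs even without onnxruntime installed
    import onnxruntime as ort
    return {
        name.rsplit(".", 1)[0]: ort.InferenceSession(path)
        for name, path in zip(SAM3_COMPONENTS, paths)
    }
```

The returned dict keys (`sam3_image_encoder`, etc.) make it easy to wire the three sessions together in the order shown in the architecture section below.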

## Prompt Types

SAM 3 supports three prompt modalities:

| Prompt | Description |
|--------|-------------|
| Text | Natural-language description, e.g. `"truck"` (unique to SAM 3) |
| Point | Click +point / -point to include or exclude regions |
| Rectangle | Draw a bounding box around the target object |

Text prompts are the recommended workflow: they drive open-vocabulary detection, so you can label any object class without retraining.

## Use with AnyLabeling (Recommended)

AnyLabeling is a desktop annotation tool with a built-in model manager that downloads, caches, and runs these models automatically, with no coding required.

1. Install: `pip install anylabeling`
2. Launch: `anylabeling`
3. Click the **Brain** button → select **Segment Anything 3 (ViT-H)** from the dropdown
4. Type a text description (e.g., `truck`) in the text prompt field
5. Optionally refine with point/rectangle prompts

*(AnyLabeling demo image)*

## Use Programmatically with ONNX Runtime

Download and extract the model zip:

```python
import urllib.request
import zipfile

# Download the packaged SAM 3 components from Hugging Face
url = "https://huggingface.co/vietanhdev/segment-anything-3-onnx-models/resolve/main/sam3_vit_h.zip"
urllib.request.urlretrieve(url, "sam3_vit_h.zip")

# Extract the three ONNX files into ./sam3
with zipfile.ZipFile("sam3_vit_h.zip") as z:
    z.extractall("sam3")
```

Then use samexporter's inference module:

```bash
pip install samexporter
```

```bash
# Run inference with a text prompt
python -m samexporter.inference \
    --sam_variant sam3 \
    --encoder_model sam3/sam3_image_encoder.onnx \
    --decoder_model sam3/sam3_decoder.onnx \
    --language_encoder_model sam3/sam3_language_encoder.onnx \
    --image photo.jpg \
    --prompt prompt.json \
    --text_prompt "truck" \
    --output result.png
```

Example `prompt.json` for a text-only query:

```json
[{"type": "text", "data": "truck"}]
```
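Prompts of different types can be combined in one list. A sketch of writing such a file: the text record follows the schema above, but the point record's field names (`data`, `label`) are assumptions for illustration, not a documented format:

```python
import json

prompts = [
    # Text prompt, using the documented type/data schema
    {"type": "text", "data": "truck"},
    # Refinement click; the field names here are assumed (label 1 = include, 0 = exclude)
    {"type": "point", "data": [450, 320], "label": 1},
]

with open("prompt.json", "w") as f:
    json.dump(prompts, f, indent=2)
```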

## Model Architecture

SAM 3 follows the same encoder/decoder pattern as SAM and SAM 2, with an added CLIP-based language branch:

```
Input image  ──► Image Encoder ──────────┐
                                         ▼
Text prompt  ──► Language Encoder ──► Decoder ──► Masks + Scores + Boxes
                                         ▲
Optional: point / box prompts ───────────┘
```

The image encoder runs once per image and caches features. The language encoder runs once per text query. The decoder is lightweight and runs interactively for each prompt combination.
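That run-frequency split can be sketched as a small caching wrapper. The encoder and decoder bodies below are stand-ins for the ONNX sessions; only the caching pattern is the point:

```python
class Sam3Pipeline:
    """Sketch of SAM 3's caching pattern: heavy encoders run once per
    image/text, the light decoder runs for every prompt combination."""

    def __init__(self):
        self._image_cache = {}  # image_id -> cached image features
        self._text_cache = {}   # text -> cached language features

    def _encode_image(self, image_id):
        # Stand-in for sam3_image_encoder.onnx (expensive, once per image)
        return f"img_features({image_id})"

    def _encode_text(self, text):
        # Stand-in for sam3_language_encoder.onnx (once per text query)
        return f"text_features({text})"

    def segment(self, image_id, text):
        # Reuse cached features when available, encode otherwise
        img = self._image_cache.setdefault(image_id, self._encode_image(image_id))
        txt = self._text_cache.setdefault(text, self._encode_text(text))
        # Stand-in for sam3_decoder.onnx (cheap, runs per prompt)
        return f"masks({img}, {txt})"
```

Changing the text prompt on the same image only re-runs the language encoder and decoder, which is why interactive labeling stays responsive.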

## Re-export from Source

To re-export or customize the models using samexporter:

```bash
pip install samexporter

# Export all three SAM 3 ONNX components
python -m samexporter.export_sam3 --output_dir output_models/sam3

# Or use the convenience script:
bash convert_sam3.sh
```

## Custom Model Config for AnyLabeling

To use a locally re-exported SAM 3 as a custom model in AnyLabeling, create a `config.yaml`:

```yaml
type: segment_anything
name: sam3_vit_h_custom
display_name: Segment Anything 3 (ViT-H)
encoder_model_path: sam3_image_encoder.onnx
decoder_model_path: sam3_decoder.onnx
language_encoder_path: sam3_language_encoder.onnx
input_size: 1008
max_height: 1008
max_width: 1008
```
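Before loading the config, it can help to sanity-check that the required keys are present and the referenced ONNX files exist next to it. A convenience sketch (this checker is not part of AnyLabeling):

```python
import os

REQUIRED_KEYS = {
    "type", "name", "display_name",
    "encoder_model_path", "decoder_model_path", "language_encoder_path",
    "input_size", "max_height", "max_width",
}

def check_sam3_config(config, base_dir="."):
    """Return a list of problems found in a SAM 3 custom-model config dict."""
    problems = [f"missing key: {k}" for k in sorted(REQUIRED_KEYS - config.keys())]
    if config.get("type") != "segment_anything":
        problems.append("type should be 'segment_anything'")
    # Model paths in the config are relative to the config file's directory
    for key in ("encoder_model_path", "decoder_model_path", "language_encoder_path"):
        path = config.get(key)
        if path and not os.path.exists(os.path.join(base_dir, path)):
            problems.append(f"{key} not found: {path}")
    return problems
```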

Then load it via **Brain** button → **Load Custom Model** in AnyLabeling.

## Related Repositories

| Repo | Description |
|------|-------------|
| `vietanhdev/samexporter` | Export scripts, inference code, conversion tools |
| `vietanhdev/anylabeling` | Desktop annotation app powered by these models |
| `facebook/sam3` | Original SAM 3 PyTorch checkpoint by Meta |

## License

The ONNX models are derived from Meta's SAM 3, released under the SAM License. The export code is part of samexporter, released under the MIT license.