mlx-community/sam3-8bit

@@ -1,110 +1,50 @@
 ---
 library_name: mlx
-base_model: facebook/sam3
 tags:
-- mlx
 - sam3
-- segmentation
-- detection
-- tracking
 ---
-# sam3-8bit
-[facebook/sam3](https://huggingface.co/facebook/sam3) converted to MLX (8-bit quantized, 1.04 GB).
-Open-vocabulary **object detection**, **instance segmentation**, and **video tracking** on Apple Silicon (~860M parameters).
-## Quick Start
 ```bash
-pip install mlx-vlm
-```
-```python
-from PIL import Image
-from mlx_vlm.utils import load_model, get_model_path
-from mlx_vlm.models.sam3.generate import Sam3Predictor
-from mlx_vlm.models.sam3.processing_sam3 import Sam3Processor
-model_path = get_model_path("mlx-community/sam3-8bit")
-model = load_model(model_path)
-processor = Sam3Processor.from_pretrained(str(model_path))
-predictor = Sam3Predictor(model, processor, score_threshold=0.3)
 ```
-## Object Detection
-```python
-image = Image.open("photo.jpg")
-result = predictor.predict(image, text_prompt="a dog")
-for i in range(len(result.scores)):
-    x1, y1, x2, y2 = result.boxes[i]
-    print(f"[{result.scores[i]:.2f}] box=({x1:.0f}, {y1:.0f}, {x2:.0f}, {y2:.0f})")
-```
-## Instance Segmentation
-```python
-result = predictor.predict(image, text_prompt="a person")
-# result.boxes   -> (N, 4) xyxy bounding boxes
-# result.masks   -> (N, H, W) binary segmentation masks
-# result.scores  -> (N,) confidence scores
-import numpy as np
-overlay = np.array(image).copy()
-W, H = image.size
-for i in range(len(result.scores)):
-    mask = result.masks[i]
-    if mask.shape != (H, W):
-        mask = np.array(Image.fromarray(mask.astype(np.float32)).resize((W, H)))
-    binary = mask > 0
-    overlay[binary] = (overlay[binary] * 0.5 + np.array([255, 0, 0]) * 0.5).astype(np.uint8)
-```
-## Box-Guided Detection
-```python
-import numpy as np
-boxes = np.array([[100, 50, 400, 350]])  # xyxy pixel coords
-result = predictor.predict(image, text_prompt="a cat", boxes=boxes)
-```
-## Semantic Segmentation
-```python
-import mlx.core as mx
-inputs = processor.preprocess_image(image)
-text_inputs = processor.preprocess_text("a cat")
-outputs = model.detect(
-    mx.array(inputs["pixel_values"]),
-    mx.array(text_inputs["input_ids"]),
-    mx.array(text_inputs["attention_mask"]),
-)
-mx.eval(outputs)
-pred_masks = outputs["pred_masks"]      # (B, 200, 288, 288) instance masks
-semantic_seg = outputs["semantic_seg"]  # (B, 1, 288, 288) semantic segmentation
-```
-## Video Tracking (CLI)
 ```bash
-python -m mlx_vlm.models.sam3.track_video --video input.mp4 --prompt "a car" --model mlx-community/sam3-8bit
 ```
-| Flag | Default | Description |
-|------|---------|-------------|
-| `--video` | *(required)* | Input video path |
-| `--prompt` | *(required)* | Text prompt |
-| `--output` | `<input>_tracked.mp4` | Output video path |
-| `--model` | `facebook/sam3` | Model path or HF repo |
-| `--threshold` | `0.15` | Score threshold |
-| `--every` | `2` | Detect every N frames |
-## Original Model
-[facebook/sam3](https://huggingface.co/facebook/sam3) · [Paper](https://ai.meta.com/blog/segment-anything-model-3/) · [Code](https://github.com/facebookresearch/sam3)

 ---
+license: other
+extra_gated_fields:
+  First Name: text
+  Last Name: text
+  Date of birth: date_picker
+  Country: country
+  Affiliation: text
+  Job title:
+    type: select
+    options:
+    - Student
+    - Research Graduate
+    - AI researcher
+    - AI developer/engineer
+    - Reporter
+    - Other
+  geo: ip_location
+  ? By clicking Submit below I accept the terms of the license and acknowledge that
+    the information I provide will be collected stored processed and shared in accordance
+    with the Meta Privacy Policy
+  : checkbox
+extra_gated_description: The information you provide will be collected, stored, processed
+  and shared in accordance with the [Meta Privacy Policy](https://www.facebook.com/privacy/policy/).
+extra_gated_button_content: Submit
+language:
+- en
+pipeline_tag: mask-generation
 library_name: mlx
 tags:
 - sam3
+- mlx
+base_model: facebook/sam3
 ---
+# mlx-community/sam3-8bit
+This model was converted to MLX format from [`facebook/sam3`](https://huggingface.co/facebook/sam3)
+using mlx-vlm version **0.4.2**.
+Refer to the [original model card](https://huggingface.co/facebook/sam3) for more details on the model.
+## Use with mlx
 ```bash
+pip install -U mlx-vlm
 ```
 ```bash
+python -m mlx_vlm.generate --model mlx-community/sam3-8bit --max-tokens 100 --temperature 0.0 --prompt "Describe this image." --image <path_to_image>
 ```