---
title: SAM3 Promptable Concept Segmentation
emoji: 🎯
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
license: apache-2.0
short_description: SAM3 inference with text prompts and SAM2 API compatibility
---

# SAM3 Promptable Concept Segmentation

This Space provides both a **web interface** and a **REST API** for SAM3 (Segment Anything Model 3) inference.

## πŸš€ Key Features

- **πŸ†• Text Prompts**: Segment objects using natural language descriptions (e.g., "kitten", "car", "person wearing red shirt")
- **πŸ”„ SAM2 Compatible**: Drop-in replacement for existing SAM2 inference endpoints
- **πŸ“Š High Quality**: Uses official SAM3 post-processing for single high-confidence masks
- **πŸ”Œ Dual APIs**: Simple Gradio API + SAM2-compatible inference endpoint format
- **⚑ Fast**: Optimized for production use with proper confidence thresholding

## πŸ“– Usage

### Web Interface
Simply upload an image, enter a text description of what you want to segment, and adjust the confidence threshold.

### API Usage

#### 1. Simple Text API (Gradio format)
```python
import requests
import base64

# Encode your image to base64
with open("image.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

# Make API request
response = requests.post(
    "https://your-username-sam3-api.hf.space/api/predict",
    json={
        "data": [image_b64, "kitten", 0.5]
    }
)

result = response.json()
```

#### 2. SAM2/SAM3 Compatible API (Inference Endpoint format)
```python
import requests
import base64

# Encode your image to base64
with open("image.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

# SAM3 Text Prompts (NEW)
response = requests.post(
    "https://your-username-sam3-api.hf.space/api/sam2_compatible",
    json={
        "data": [{
            "inputs": {
                "image": image_b64,
                "text_prompts": ["kitten", "toy"],
                "confidence_threshold": 0.5
            }
        }]
    }
)

# SAM2 Compatible (Points/Boxes)
response = requests.post(
    "https://your-username-sam3-api.hf.space/api/sam2_compatible",
    json={
        "data": [{
            "inputs": {
                "image": image_b64,
                "boxes": [[100, 100, 200, 200]],
                "confidence_threshold": 0.5
            }
        }]
    }
)

result = response.json()
```

## πŸ”§ API Parameters

### SAM2-Compatible API Input
```json
{
  "inputs": {
    "image": "base64_encoded_image_string",

    // SAM3 NEW: Text-based prompts
    "text_prompts": ["person", "car"],  // List of text descriptions

    // SAM2 COMPATIBLE: Point-based prompts
    "points": [[[x1, y1]], [[x2, y2]]],  // Points for each object
    "point_labels": [[1], [1]],  // Labels for each point (1=foreground, 0=background)

    // SAM2 COMPATIBLE: Bounding box prompts
    "boxes": [[x1, y1, x2, y2], [x1, y1, x2, y2]],  // Bounding boxes
    "box_labels": [1, 0],  // Labels for each box (1=positive, 0=negative/exclude)

    "multimask_output": false,  // Optional, defaults to False
    "confidence_threshold": 0.5  // Optional, minimum confidence for returned masks
  }
}
```
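As a sketch of how these fields fit together, the snippet below assembles and serializes a point-prompt request body in Python. The base64 string is a placeholder, and the coordinates are illustrative; only the structure matters here:

```python
import json

# Build the SAM2-compatible envelope with point prompts
payload = {
    "inputs": {
        "image": "<base64_encoded_image_string>",  # placeholder, not a real image
        "points": [[[120, 80]], [[300, 210]]],     # one point list per object
        "point_labels": [[1], [1]],                # 1 = foreground, 0 = background
        "multimask_output": False,
        "confidence_threshold": 0.5,
    }
}

# The API expects the payload wrapped in a "data" list
body = json.dumps({"data": [payload]})
print(json.loads(body)["data"][0]["inputs"]["points"][0])  # [[120, 80]]
```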

### API Response
```json
{
  "masks": ["base64_encoded_mask_1", "base64_encoded_mask_2"],
  "scores": [0.95, 0.87],
  "num_objects": 2,
  "sam_version": "3.0",
  "success": true
}
```
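Each entry in `masks` is a base64-encoded image. The response format above does not specify the encoding, so the sketch below assumes PNG-encoded single-channel masks and shows a round trip: encoding a synthetic 4Γ—4 mask, then decoding it back into a boolean NumPy array with Pillow.

```python
import base64
import io

import numpy as np
from PIL import Image

def decode_mask(mask_b64: str) -> np.ndarray:
    """Decode a base64-encoded mask image into a boolean NumPy array."""
    raw = base64.b64decode(mask_b64)
    img = Image.open(io.BytesIO(raw)).convert("L")  # force single channel
    return np.array(img) > 127  # threshold to foreground/background

# Round-trip demo with a synthetic 4x4 mask (2x2 foreground block)
demo = np.zeros((4, 4), dtype=np.uint8)
demo[1:3, 1:3] = 255
buf = io.BytesIO()
Image.fromarray(demo).save(buf, format="PNG")
demo_b64 = base64.b64encode(buf.getvalue()).decode()

mask = decode_mask(demo_b64)
print(mask.shape, int(mask.sum()))  # (4, 4) 4
```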

## πŸ†š SAM3 vs SAM2

| Feature | SAM2 | SAM3 |
|---------|------|------|
| **Text Prompts** | ❌ | βœ… Natural language descriptions |
| **Point Prompts** | βœ… | βœ… (compatible) |
| **Box Prompts** | βœ… | βœ… (compatible) |
| **Quality** | High | Higher (concept-aware) |
| **API Format** | HF Inference Endpoints | βœ… Compatible + Extensions |

## πŸ”¬ Technical Details

- **Model**: `facebook/sam3` from HuggingFace Transformers
- **Post-processing**: Official `post_process_instance_segmentation()` API
- **Framework**: Gradio 5.49.1 with automatic API generation
- **Dependencies**: Latest transformers with SAM3 support
- **Deployment**: HuggingFace Spaces (avoids Inference Toolkit compatibility issues)

## πŸ“š References

- [SAM3 Model Card](https://huggingface.co/facebook/sam3)
- [SAM3 Paper](https://ai.meta.com/research/publications/segment-anything-model-3/)
- [Transformers SAM3 Documentation](https://huggingface.co/docs/transformers/model_doc/sam3)