---
license: mit
---


# EdgeSAM - Efficient Segment Anything Model

EdgeSAM is an accelerated variant of the Segment Anything Model (SAM) optimized for edge devices using ONNX Runtime.

## Model Files

- `edge_sam_3x_encoder.onnx` - Image encoder (1024x1024 input)
- `edge_sam_3x_decoder.onnx` - Mask decoder with prompt support

## Usage

### API Request Format

```python
import base64

import requests

# Encode your image
with open("image.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

# Make request
response = requests.post(
    "https://YOUR-ENDPOINT-URL",
    json={
        "inputs": image_b64,
        "parameters": {
            "point_coords": [[512, 512]],  # Click point in 1024x1024 space
            "point_labels": [1],           # 1 = foreground, 0 = background
            "return_mask_image": True
        }
    }
)

result = response.json()
```

### Response Format

```json
[
  {
    "mask_shape": [1024, 1024],
    "has_object": true,
    "mask": "<base64_encoded_png>"
  }
]
```
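The `mask` field is a base64-encoded PNG. A minimal sketch of decoding it back to raw bytes (the `decode_mask` helper below is illustrative, not part of the handler):

```python
import base64

def decode_mask(entry):
    """Return the raw PNG bytes of the mask, or None if no object was found."""
    if not entry.get("has_object", False):
        return None
    return base64.b64decode(entry["mask"])

# Stubbed response entry (the "mask" value here is not a real mask):
entry = {"mask_shape": [1024, 1024], "has_object": True,
         "mask": base64.b64encode(b"\x89PNG\r\n").decode()}
png_bytes = decode_mask(entry)
```

The decoded bytes are a complete PNG file, so they can be written to disk or loaded with Pillow via `Image.open(io.BytesIO(png_bytes))`.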

### Parameters

- **point_coords**: Array of `[x, y]` coordinates in 1024x1024 space (optional)
- **point_labels**: Array of labels (`1` = foreground, `0` = background), one per point (optional)
- **box_coords**: Bounding box `[x1, y1, x2, y2]` (optional; not yet implemented)
- **return_mask_image**: Return a base64-encoded PNG mask (default: `true`)
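Multiple points can be combined, for instance one foreground click plus one background click to exclude a region (coordinate values here are illustrative):

```json
{
  "point_coords": [[512, 400], [200, 900]],
  "point_labels": [1, 0]
}
```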



### Coordinate System

All coordinates should be in **1024x1024** space, regardless of the original image size. The handler automatically resizes input images to 1024x1024 before processing.

Example: for a click at the center of any image, use `[512, 512]`.
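Clicks captured on the original image therefore need to be rescaled before being sent. A small sketch (`to_model_coords` is a hypothetical helper, not part of the API):

```python
def to_model_coords(x, y, orig_w, orig_h, target=1024):
    """Scale a pixel coordinate from the original image into 1024x1024 model space."""
    return [round(x * target / orig_w), round(y * target / orig_h)]

# The center of a 1920x1080 image maps to the model-space center:
to_model_coords(960, 540, 1920, 1080)  # → [512, 512]
```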



## Local Testing

```bash
# Install dependencies
pip install -r requirements.txt

# Run test script
python test_handler.py
```

This will create:

- `test_input.png` - Test image with a red circle
- `test_output_mask.png` - Generated segmentation mask
- `test_output_overlay.png` - Overlay visualization



## Technical Details

- **Input**: RGB images (auto-resized to 1024x1024)
- **Preprocessing**: Pixel values normalized to the [0, 1] range (`/ 255.0`)
- **Hardware**: Supports CUDA GPUs with automatic CPU fallback
- **Framework**: ONNX Runtime Web compatible
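The input pipeline described above can be sketched as follows. This is an illustration of the resize-and-normalize step, assuming the common NCHW input layout; the handler's actual code may differ:

```python
import numpy as np
from PIL import Image

def preprocess(img, size=1024):
    """Resize an RGB image to size x size and scale pixels to [0, 1]."""
    img = img.convert("RGB").resize((size, size))
    arr = np.asarray(img, dtype=np.float32) / 255.0  # (size, size, 3), values in [0, 1]
    return np.transpose(arr, (2, 0, 1))[None]        # (1, 3, size, size) batch for the encoder

x = preprocess(Image.new("RGB", (640, 480), (128, 64, 32)))
print(x.shape)  # (1, 3, 1024, 1024)
```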