---
license: mit
---

# EdgeSAM - Efficient Segment Anything Model

EdgeSAM is an accelerated variant of the Segment Anything Model (SAM) optimized for edge devices using ONNX Runtime.

## Model Files

- `edge_sam_3x_encoder.onnx` - Image encoder (1024x1024 input)
- `edge_sam_3x_decoder.onnx` - Mask decoder with prompt support

## Usage

### API Request Format

```python
import base64

import requests

# Encode your image
with open("image.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

# Make request
response = requests.post(
    "https://YOUR-ENDPOINT-URL",
    json={
        "inputs": image_b64,
        "parameters": {
            "point_coords": [[512, 512]],  # Click point in 1024x1024 space
            "point_labels": [1],           # 1 = foreground, 0 = background
            "return_mask_image": True,
        },
    },
)
result = response.json()
```

### Response Format

```json
[
  {
    "mask_shape": [1024, 1024],
    "has_object": true,
    "mask": ""
  }
]
```

### Parameters

- **point_coords**: Array of `[x, y]` coordinates in 1024x1024 space (optional)
- **point_labels**: Array of labels (1=foreground, 0=background) corresponding to points (optional)
- **box_coords**: Bounding box `[x1, y1, x2, y2]` (optional, not yet implemented)
- **return_mask_image**: Return base64-encoded PNG mask (default: `true`)

### Coordinate System

All coordinates must be in **1024x1024** space, regardless of the original image size. The handler automatically resizes input images to 1024x1024 before processing.

Example: For a click at the center of any image, use `[512, 512]`.
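Because prompts are always interpreted in 1024x1024 space, clicks captured on the original image need to be rescaled before being sent. A minimal helper sketch (the function name `to_model_space` is illustrative, not part of the API):

```python
def to_model_space(x, y, orig_w, orig_h, target=1024):
    """Map a click (x, y) on the original image into 1024x1024 model space."""
    return [round(x * target / orig_w), round(y * target / orig_h)]


# A click at the center of an 800x600 image maps to the center of model space
point = to_model_space(400, 300, orig_w=800, orig_h=600)  # [512, 512]
```

The mapped point can then be passed directly as `point_coords` in the request payload above.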
## Local Testing

```bash
# Install dependencies
pip install -r requirements.txt

# Run test script
python test_handler.py
```

This will create:

- `test_input.png` - Test image with red circle
- `test_output_mask.png` - Generated segmentation mask
- `test_output_overlay.png` - Overlay visualization

## Technical Details

- **Input**: RGB images (auto-resized to 1024x1024)
- **Preprocessing**: Normalized to [0, 1] range (`/ 255.0`)
- **Hardware**: Supports CUDA GPU with automatic CPU fallback
- **Framework**: ONNX Runtime Web compatible
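The preprocessing described above (resize to 1024x1024, scale to [0, 1]) can be sketched with Pillow and NumPy. This is an assumption about the handler's internals based on the stated details, not its actual code; the exact tensor layout expected by `edge_sam_3x_encoder.onnx` may differ:

```python
import numpy as np
from PIL import Image


def preprocess(img, size=1024):
    """Resize an image to size x size RGB and scale pixels to [0, 1].

    Returns a float32 array in NCHW layout (1, 3, size, size), a common
    input shape for ONNX image encoders (assumed here, not confirmed).
    """
    img = img.convert("RGB").resize((size, size))
    arr = np.asarray(img, dtype=np.float32) / 255.0  # HWC, values in [0, 1]
    return arr.transpose(2, 0, 1)[None]              # -> 1 x 3 x size x size


# Example: a solid-red 640x480 test image
batch = preprocess(Image.new("RGB", (640, 480), (255, 0, 0)))
print(batch.shape)  # (1, 3, 1024, 1024)
```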