🚀 Deploying MedSAM to a HuggingFace Space
Step-by-Step Guide
Step 1: Create New Space
Go to https://huggingface.co/new-space and fill in the details:
- Owner: your username
- Space name: medsam-inference
- License: Apache 2.0
- SDK: Gradio
- Space hardware:
  - Start with CPU basic (free)
  - Upgrade to T4 small ($0.60/hour) for better performance
- Visibility: Public or Private
Click Create Space
Step 2: Upload Files
You have two options:
Option A: Using Git (Recommended)
```bash
# Clone your new Space
git clone https://huggingface.co/spaces/YOUR_USERNAME/medsam-inference
cd medsam-inference

# Copy files from this directory
cp /path/to/huggingface_space/* .

# Download your model from HuggingFace
# Option 1: Download via Python
python3 << EOF
from huggingface_hub import hf_hub_download

hf_hub_download(
    repo_id="Aniketg6/Fine-Tuned-MedSAM",
    filename="medsam_vit_b.pth",
    local_dir=".",
    local_dir_use_symlinks=False
)
EOF

# Option 2: Download manually
# Go to https://huggingface.co/Aniketg6/Fine-Tuned-MedSAM
# Download medsam_vit_b.pth (375 MB) and place it in this directory

# Initialize Git LFS (for large files)
git lfs install
git lfs track "*.pth"

# Add, commit, and push
git add .
git commit -m "Initial commit: MedSAM inference API"
git push
```
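For reference, `git lfs track "*.pth"` records the rule in `.gitattributes`; committing that file is what tells the Space to treat the model weights as an LFS object. The entry it writes looks like this:

```
*.pth filter=lfs diff=lfs merge=lfs -text
```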
Option B: Using Web Interface
- In your Space, click the Files tab
- Click Add file → Upload files
- Upload: app.py, requirements.txt, README.md, .gitattributes
- For the model file (medsam_vit_b.pth):
  - Download it from https://huggingface.co/Aniketg6/Fine-Tuned-MedSAM
  - Upload it to your Space (375 MB)
Step 3: Wait for Build
- HuggingFace will automatically build your Space
- Check the Logs tab for build progress
- Should take 3-5 minutes
- Once done, your Space will be live!
Step 4: Test Your Space
Visit: https://huggingface.co/spaces/YOUR_USERNAME/medsam-inference
You should see:
- ✅ Interactive UI with two tabs
- ✅ API Interface for programmatic access
- ✅ Simple Interface for manual testing
Step 5: Get Your API Endpoint
Your API endpoint is:
https://YOUR_USERNAME-medsam-inference.hf.space/api/predict
Or use Gradio's direct endpoint:
https://YOUR_USERNAME-medsam-inference.hf.space/run/predict
Testing Your Space
Test via Web UI
- Go to your Space URL
- Click Simple Interface tab
- Upload an image
- Enter X, Y coordinates
- Click Segment
- See the mask output!
Test via Python
```python
import requests
import json
import base64
import numpy as np

# Your Space URL
SPACE_URL = "https://YOUR_USERNAME-medsam-inference.hf.space"

def call_medsam_space(image_path, point_coords, point_labels, multimask=True):
    """
    Call your MedSAM Space.

    Args:
        image_path: Path to image file
        point_coords: List of [x, y] coordinates, e.g., [[100, 150]]
        point_labels: List of labels (1=foreground, 0=background), e.g., [1]
        multimask: Whether to output multiple masks

    Returns:
        Dictionary with masks and scores
    """
    # Read and encode image
    with open(image_path, "rb") as f:
        img_base64 = base64.b64encode(f.read()).decode()

    # Prepare points JSON
    points_json = json.dumps({
        "coords": point_coords,
        "labels": point_labels,
        "multimask_output": multimask
    })

    # Call API
    response = requests.post(
        f"{SPACE_URL}/api/predict",
        json={
            "data": [
                f"data:image/jpeg;base64,{img_base64}",
                points_json
            ]
        }
    )

    # Parse result
    result = response.json()
    output_json = result["data"][0]  # Gradio wraps output in a data array
    return json.loads(output_json)

# Example usage
if __name__ == "__main__":
    result = call_medsam_space(
        image_path="test_image.jpg",
        point_coords=[[200, 150]],
        point_labels=[1],
        multimask=True
    )

    if result["success"]:
        print("✅ Segmentation successful!")
        print(f"  Number of masks: {result['num_masks']}")
        print(f"  Scores: {result['scores']}")

        # Get best mask
        best_idx = np.argmax(result["scores"])
        best_mask_data = result["masks"][best_idx]["mask_data"]
        best_mask = np.array(best_mask_data, dtype=bool)
        print(f"  Best mask shape: {best_mask.shape}")
    else:
        print(f"❌ Error: {result['error']}")
```
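Once you have a mask back, it is often easier to eyeball it as an image than as nested lists. This helper is a small sketch, assuming the `mask_data` nested-list format shown above; the demo uses a tiny synthetic mask rather than a real API response:

```python
import numpy as np
from PIL import Image

def save_mask_png(mask_data, path):
    """Convert a nested-list boolean mask to a black-and-white PNG."""
    mask = np.array(mask_data, dtype=bool)
    # Scale True/False to 255/0 so the mask is visible as a grayscale image
    Image.fromarray((mask * 255).astype(np.uint8)).save(path)
    return mask.shape

# Example with a small synthetic mask
shape = save_mask_png([[True, False], [False, True]], "mask.png")
print(shape)  # (2, 2)
```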
Integration with Your Backend
Now update your app.py to use this Space:
```python
# In backend/app.py or backend/hf_inference.py
import requests
import json
import base64
from io import BytesIO
from PIL import Image
import numpy as np

# Your Space URL
MEDSAM_SPACE_URL = "https://YOUR_USERNAME-medsam-inference.hf.space/api/predict"

def call_medsam_space(image_array, point_coords, point_labels, multimask_output=True):
    """
    Call the MedSAM Space API.

    Args:
        image_array: numpy array of the image
        point_coords: numpy array, e.g. [[x, y]]
        point_labels: numpy array, e.g. [1] or [0]
        multimask_output: bool

    Returns:
        masks, scores, logits (matching the original SAM interface; logits is None)
    """
    try:
        # Convert numpy array to base64-encoded PNG
        image = Image.fromarray(image_array)
        buffered = BytesIO()
        image.save(buffered, format="PNG")
        img_base64 = base64.b64encode(buffered.getvalue()).decode()

        # Prepare points JSON
        points_json = json.dumps({
            "coords": point_coords.tolist(),
            "labels": point_labels.tolist(),
            "multimask_output": multimask_output
        })

        # Call Space API
        response = requests.post(
            MEDSAM_SPACE_URL,
            json={
                "data": [
                    f"data:image/png;base64,{img_base64}",
                    points_json
                ]
            },
            timeout=60
        )

        # Parse result
        result = response.json()
        output_json = result["data"][0]
        output = json.loads(output_json)

        if not output["success"]:
            raise Exception(output["error"])

        # Convert back to numpy arrays (matching the SAM interface)
        masks = np.array([np.array(m["mask_data"], dtype=bool) for m in output["masks"]])
        scores = np.array(output["scores"])

        return masks, scores, None  # None for logits (not needed)

    except Exception as e:
        print(f"Error calling MedSAM Space: {e}")
        raise

# Replace your SAM predictor calls with this:
#
# OLD:
#   sam_predictor.set_image(image_array)
#   masks, scores, _ = sam_predictor.predict(
#       point_coords=np.array([[x, y]]),
#       point_labels=np.array([1]),
#       multimask_output=True
#   )
#
# NEW:
#   masks, scores, _ = call_medsam_space(
#       image_array,
#       point_coords=np.array([[x, y]]),
#       point_labels=np.array([1]),
#       multimask_output=True
#   )
```
Cost & Performance
Free Tier (CPU Basic):
- ✅ Free!
- ⚠️ Slower inference (~5-10 seconds per image)
- ⚠️ May sleep after inactivity
- ✅ Good for testing and low usage
Paid Tier (T4 Small GPU):
- 💰 $0.60/hour (~$432/month if always on)
- ✅ Fast inference (~1-2 seconds per image)
- ✅ No sleep mode
- ✅ Better for production
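The monthly estimate above is just the hourly rate extrapolated to an always-on 30-day month:

```python
hourly = 0.60                    # T4 small rate, $/hour
monthly = hourly * 24 * 30       # always on, 30-day month
print(f"${monthly:.2f}/month")   # $432.00/month
```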
Upgrade to GPU:
- Go to your Space settings
- Click Settings tab
- Under Space hardware, select T4 small
- Click Update
Troubleshooting
"Application startup failed"
- Check the logs for errors
- Make sure medsam_vit_b.pth is uploaded
- Verify requirements.txt is correct
"Out of memory"
- Upgrade to GPU hardware
- Reduce image size before sending
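A minimal sketch of the "reduce image size" tip, downscaling before encoding; the 1024-pixel cap is an arbitrary choice for illustration, not a MedSAM requirement:

```python
from PIL import Image

def downscale(image, max_side=1024):
    """Shrink image so its longest side is at most max_side, keeping aspect ratio."""
    w, h = image.size
    scale = max_side / max(w, h)
    if scale >= 1:
        return image  # already small enough
    return image.resize((int(w * scale), int(h * scale)), Image.LANCZOS)

img = Image.new("RGB", (4096, 2048))
small = downscale(img)
print(small.size)  # (1024, 512)
```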
"Space is sleeping"
- Free tier spaces sleep after 48h inactivity
- First request will wake it up (takes 10-20s)
- Upgrade to paid tier for always-on
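Since the first request to a sleeping Space can fail or hang while it wakes, a retry with exponential backoff is a reasonable client-side workaround. A sketch, with the request injected as a callable so the demo below can use a stub instead of a live Space:

```python
import time

def call_with_retry(fn, attempts=4, base_delay=1.0):
    """Retry fn() with exponential backoff, e.g. to ride out a Space waking up."""
    for i in range(attempts):
        try:
            return fn()
        except Exception:
            if i == attempts - 1:
                raise  # out of attempts, surface the error
            time.sleep(base_delay * (2 ** i))  # 1s, 2s, 4s, ...

# Quick check with a stub that fails twice, then succeeds
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("space still waking up")
    return "ok"

result = call_with_retry(flaky, base_delay=0.01)
print(result)  # ok
```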
API returns error
- Check input format matches examples
- Verify coordinates are within image bounds
- Check Space logs for detailed errors
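The bounds check above can be done client-side before making the request; a small helper (illustrative name, pixel coordinates assumed zero-indexed):

```python
def points_in_bounds(coords, width, height):
    """Check that every [x, y] point lies inside a width x height image."""
    return all(0 <= x < width and 0 <= y < height for x, y in coords)

ok = points_in_bounds([[200, 150]], 640, 480)
bad = points_in_bounds([[700, 150]], 640, 480)
print(ok, bad)  # True False
```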
Next Steps
- ✅ Deploy Space
- ✅ Test via web UI
- ✅ Test via Python script
- ✅ Integrate with your backend
- ✅ Deploy your backend to Vercel/Railway
- ✅ Deploy frontend to Vercel
- 🎉 Done!
Alternative: Use Inference Endpoints
For production, consider HuggingFace Inference Endpoints:
- Dedicated infrastructure
- Auto-scaling
- Better performance
- $0.60/hour minimum
See: https://huggingface.co/inference-endpoints
Questions? Check HuggingFace Spaces docs: https://huggingface.co/docs/hub/spaces