Integration Guide - Use HF Space in Your Backend
Quick Integration (3 Steps)
Step 1: Copy the client file
# Copy the client to your backend directory
cp medsam_space_client.py ../medsam_space_client.py
Step 2: Update your app.py
Find this code in app.py (around lines 86-104):
# OLD CODE - Remove this:
sam_checkpoint = "models/sam_vit_h_4b8939.pth"
model_type = "vit_b"
sam = None
sam_predictor = None
try:
    if os.path.exists(sam_checkpoint):
        sam = sam_model_registry[model_type](checkpoint=sam_checkpoint)
        sam.to(device=device)
        sam_predictor = SamPredictor(sam)
        print("SAM model loaded successfully")
    else:
        print(f"Warning: SAM checkpoint not found at {sam_checkpoint}")
except Exception as e:
    print(f"Warning: Failed to load SAM model: {e}")
Replace with:
# NEW CODE - Add this:
from medsam_space_client import MedSAMSpacePredictor
# Initialize Space predictor
MEDSAM_SPACE_URL = os.getenv(
    'MEDSAM_SPACE_URL',
    'https://YOUR_USERNAME-medsam-inference.hf.space/api/predict'
)
sam_predictor = MedSAMSpacePredictor(MEDSAM_SPACE_URL)
print("✅ MedSAM Space predictor initialized")
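The default URL above still contains the YOUR_USERNAME placeholder, so a missing environment variable would only show up later as a failed request. A minimal sketch of a startup check follows; `resolve_space_url` is a hypothetical helper name, not part of medsam_space_client.py, and the https-only rule is an assumption.

```python
import os

PLACEHOLDER = "YOUR_USERNAME"  # marker used in this guide's example URLs


def resolve_space_url(env=None):
    """Return the configured Space URL, or raise if it is missing or a placeholder."""
    env = os.environ if env is None else env
    url = env.get("MEDSAM_SPACE_URL", "").strip()
    if not url:
        raise RuntimeError("MEDSAM_SPACE_URL is not set")
    if PLACEHOLDER in url:
        raise RuntimeError("MEDSAM_SPACE_URL still contains the YOUR_USERNAME placeholder")
    if not url.startswith("https://"):
        # Assumption: Spaces are served over https only.
        raise RuntimeError("MEDSAM_SPACE_URL should be an https:// URL")
    return url
```

In app.py this could replace the os.getenv call, for example `sam_predictor = MedSAMSpacePredictor(resolve_space_url())`, so that a misconfiguration fails at startup rather than on the first segmentation request.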
Step 3: Update your .env
cd backend
echo "MEDSAM_SPACE_URL=https://YOUR_USERNAME-medsam-inference.hf.space/api/predict" >> .env
That's it! Your code now uses the HF Space API!
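One caveat on Step 3: os.getenv only reads the process environment, so the .env file has an effect only if something loads it before app.py calls os.getenv. If your backend does not already do that (for example through python-dotenv), a minimal standard-library sketch could look like this. `load_env_file` is a hypothetical helper, and it handles only plain KEY=VALUE lines.

```python
import os


def load_env_file(path=".env", environ=None):
    """Copy KEY=VALUE pairs from a .env file into the environment.

    Keys that are already set are left untouched. Returns the pairs that were added.
    """
    environ = os.environ if environ is None else environ
    loaded = {}
    with open(path, encoding="utf-8") as handle:
        for raw_line in handle:
            line = raw_line.strip()
            # Skip blank lines, comments, and anything that is not KEY=VALUE.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            key = key.strip()
            value = value.strip().strip('"').strip("'")
            if key and key not in environ:
                environ[key] = value
                loaded[key] = value
    return loaded
```

Calling `load_env_file()` near the top of app.py, before MEDSAM_SPACE_URL is read, would make the value from Step 3 visible to os.getenv.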
What Changes?
✅ These STAY THE SAME (No changes needed!)
All your endpoint code stays exactly the same:
@app.route('/api/segment', methods=['POST'])
def segment_with_sam():
    # ... existing code ...
    # This works exactly the same!
    sam_predictor.set_image(image_array)
    masks, scores, _ = sam_predictor.predict(
        point_coords=np.array([[x, y]]),
        point_labels=np.array([1]),
        multimask_output=True
    )
    # Get the best mask
    best_mask = masks[np.argmax(scores)]
    # ... rest of your code ...
What's Different
Before (Local SAM):
- Loads 2.5GB model into memory
- Uses GPU/CPU for inference
- Fast but requires resources
After (HF Space):
- No model loading
- API call to HF Space
- Slower per request (each prediction is a network round trip), but no local model memory or GPU needed
Complete Example
Here's a complete before/after comparison:
BEFORE (app.py with local SAM):
from segment_anything import sam_model_registry, SamPredictor
# Initialize SAM locally (loads 2.5GB model)
sam = sam_model_registry["vit_b"](checkpoint="models/sam_vit_h_4b8939.pth")
sam.to(device=device)
sam_predictor = SamPredictor(sam)
@app.route('/api/segment', methods=['POST'])
def segment():
    data = request.json
    image_data = data.get('image')
    x, y = data.get('x'), data.get('y')
    # Decode image
    image_bytes = base64.b64decode(image_data.split(',')[1])
    image = Image.open(BytesIO(image_bytes))
    image_array = np.array(image.convert('RGB'))
    # Segment with SAM
    sam_predictor.set_image(image_array)
    masks, scores, _ = sam_predictor.predict(
        point_coords=np.array([[x, y]]),
        point_labels=np.array([1]),
        multimask_output=True
    )
    # Get best mask
    best_mask = masks[np.argmax(scores)]
    return jsonify({'success': True})
AFTER (app.py with HF Space):
from medsam_space_client import MedSAMSpacePredictor
# Initialize Space predictor (no model loading!)
sam_predictor = MedSAMSpacePredictor(
    "https://YOUR_USERNAME-medsam-inference.hf.space/api/predict"
)

@app.route('/api/segment', methods=['POST'])
def segment():
    data = request.json
    image_data = data.get('image')
    x, y = data.get('x'), data.get('y')
    # Decode image
    image_bytes = base64.b64decode(image_data.split(',')[1])
    image = Image.open(BytesIO(image_bytes))
    image_array = np.array(image.convert('RGB'))
    # Segment with SAM Space (SAME CODE!)
    sam_predictor.set_image(image_array)
    masks, scores, _ = sam_predictor.predict(
        point_coords=np.array([[x, y]]),
        point_labels=np.array([1]),
        multimask_output=True
    )
    # Get best mask (SAME CODE!)
    best_mask = masks[np.argmax(scores)]
    return jsonify({'success': True})
Notice: Only the initialization changed! Everything else is identical! ✨
Testing
1. Test the client directly:
# test_client.py
from medsam_space_client import MedSAMSpacePredictor
import numpy as np
from PIL import Image
# Initialize
predictor = MedSAMSpacePredictor(
    "https://YOUR_USERNAME-medsam-inference.hf.space/api/predict"
)
# Load test image (converted to RGB, as the endpoint does)
image = np.array(Image.open("test_image.jpg").convert("RGB"))
# Set image
predictor.set_image(image)
# Predict
masks, scores, _ = predictor.predict(
    point_coords=np.array([[200, 150]]),
    point_labels=np.array([1]),
    multimask_output=True
)
print(f"✅ Got {len(masks)} masks")
print(f"   Scores: {scores}")
print(f"   Best score: {scores.max():.4f}")
2. Test your full backend:
# Start your backend
python app.py
# In another terminal, test the endpoint
curl -X POST http://localhost:5000/api/segment \
  -H "Content-Type: application/json" \
  -d '{
    "image": "...",
    "x": 200,
    "y": 150
  }'
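The endpoint shown earlier decodes the image with `image_data.split(',')[1]`, so the "image" field has to be a data URL containing a comma rather than bare base64. As a sketch of how such a request body could be built, `build_segment_payload` below is a hypothetical helper, and the image/jpeg MIME type is an assumption matching the test_image.jpg file used above.

```python
import base64
import json


def build_segment_payload(image_bytes, x, y, mime="image/jpeg"):
    """Build the JSON body for /api/segment with the image as a data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {"image": f"data:{mime};base64,{encoded}", "x": x, "y": y}


def payload_to_json(payload):
    """Serialize the payload, e.g. to save it to a file for use with curl -d @file."""
    return json.dumps(payload)
```

For example, reading test_image.jpg in binary mode and passing its bytes with x=200 and y=150 yields a body you can write to payload.json and send with `curl -d @payload.json`.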
Deployment
Your backend is now lightweight enough to be deployed to Vercel!
Update requirements.txt for Vercel:
# requirements_vercel.txt
Flask==2.3.3
Flask-CORS==4.0.0
requests==2.31.0
Pillow>=10.0.0
numpy>=1.24.0
# No torch, no segment-anything!
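Since the whole point of this list is that torch and segment-anything are gone, it may be worth checking the file before deploying. The following is a rough sketch only: `find_heavy_requirements` is a hypothetical helper, and it parses just simple `name==version` style lines.

```python
import re

# Packages this guide says the lightweight backend should no longer need.
HEAVY_PACKAGES = {"torch", "segment-anything"}


def find_heavy_requirements(requirements_text):
    """Return the heavy packages still listed in a requirements file's text."""
    found = []
    for raw_line in requirements_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        # The package name is everything before the first version specifier or extra.
        name = re.split(r"[<>=!~\[;\s]", line, maxsplit=1)[0]
        name = name.lower().replace("_", "-")
        if name in HEAVY_PACKAGES:
            found.append(name)
    return found
```

An empty list for your requirements file suggests the model dependencies are no longer listed directly. Packages pulled in indirectly by other dependencies are not detected.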
Deploy to Vercel:
cd backend
# Create vercel.json
cat > vercel.json << 'EOF'
{
  "version": 2,
  "builds": [{"src": "app.py", "use": "@vercel/python"}],
  "routes": [{"src": "/(.*)", "dest": "app.py"}]
}
EOF
# Deploy
vercel
vercel env add MEDSAM_SPACE_URL
# Paste: https://YOUR_USERNAME-medsam-inference.hf.space/api/predict
vercel --prod
Performance
Local SAM:
- ✅ Fast: 1-3 seconds
- ❌ Memory: 2.5GB+
- ❌ Requires GPU for speed
HF Space (Free CPU):
- ⚠️ Slower: 5-10 seconds
- ✅ Memory: None (API call)
- ⚠️ May sleep (first request slow)
HF Space (GPU T4):
- ✅ Fast: 1-2 seconds
- ✅ Memory: None (API call)
- ✅ Always on
- 💰 Cost: $0.60/hour
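To put the hourly GPU price in context, here is a small arithmetic sketch. It assumes the $0.60/hour figure quoted above and a 30-day month; `monthly_gpu_cost` is a hypothetical helper, and actual billing may differ.

```python
def monthly_gpu_cost(hourly_rate, hours_per_day=24, days=30):
    """Estimate the cost of keeping a paid Space running, rounded to cents."""
    return round(hourly_rate * hours_per_day * days, 2)


always_on = monthly_gpu_cost(0.60)        # 0.60 * 24 * 30
working_hours = monthly_gpu_cost(0.60, 8)  # 0.60 * 8 * 30
```

At that rate an always-on T4 Space comes to about $432 per month, and about $144 if it only runs 8 hours a day.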
Troubleshooting
"Failed to get prediction from MedSAM Space"
→ Check MEDSAM_SPACE_URL is correct
→ Check Space is running (visit URL in browser)
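For the browser check, the page to open is the Space's root address rather than the /api/predict endpoint. A minimal sketch for deriving it from the configured URL follows; `space_base_url` is a hypothetical helper name.

```python
from urllib.parse import urlsplit


def space_base_url(predict_url):
    """Strip the path from the predict URL, leaving the Space's root address."""
    parts = urlsplit(predict_url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not an absolute URL: {predict_url!r}")
    return f"{parts.scheme}://{parts.netloc}/"
```

Passing your MEDSAM_SPACE_URL value returns the address to paste into the browser.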
First request is very slow (20-30s)
→ Normal! Free tier Spaces sleep after inactivity
→ They wake up on first request
→ Subsequent requests are faster
"Request timeout"
→ Space might be overloaded
→ Try again in a minute
→ Or upgrade to GPU tier
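The "try again" advice for sleeping or overloaded Spaces can be automated with a small retry wrapper. This is a sketch under assumptions: it assumes the client raises an exception when a request fails or times out (the behavior of medsam_space_client.py is not shown in this guide), and `call_with_retry` along with its default delays are illustrative choices.

```python
import time


def call_with_retry(func, attempts=3, base_delay=5.0, sleep=time.sleep):
    """Call func(); on failure wait base_delay, then twice that, and so on, and retry.

    The last failure is re-raised once all attempts are used up.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise
            # Exponential backoff gives a sleeping Space time to wake up.
            sleep(base_delay * (2 ** attempt))
```

In the endpoint, the prediction call could be wrapped in a lambda, for example `call_with_retry(lambda: sam_predictor.predict(point_coords=..., point_labels=..., multimask_output=True))`, keeping the arguments you already pass.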
Summary
✅ What you did:
- Copied medsam_space_client.py to backend
- Changed 5 lines in app.py (just initialization)
- Added MEDSAM_SPACE_URL to .env
✅ What stays the same:
- All your endpoint code
- All your SAM prediction calls
- Your entire application logic
✅ What you gained:
- No more 2.5GB model in memory
- Can deploy to Vercel/serverless
- Model hosted on HuggingFace (free!)
Your backend is now cloud-ready!