# 🔗 Integration Guide - Use HF Space in Your Backend
## Quick Integration (3 Steps)
### Step 1: Copy the client file
```bash
# Copy the client to your backend directory
cp medsam_space_client.py ../medsam_space_client.py
```
### Step 2: Update your app.py
Find this code in `app.py` (around lines 86-104):
```python
# OLD CODE - Remove this:
sam_checkpoint = "models/sam_vit_h_4b8939.pth"
model_type = "vit_b"
sam = None
sam_predictor = None

try:
    if os.path.exists(sam_checkpoint):
        sam = sam_model_registry[model_type](checkpoint=sam_checkpoint)
        sam.to(device=device)
        sam_predictor = SamPredictor(sam)
        print("SAM model loaded successfully")
    else:
        print(f"Warning: SAM checkpoint not found at {sam_checkpoint}")
except Exception as e:
    print(f"Warning: Failed to load SAM model: {e}")
```
Replace with:
```python
# NEW CODE - Add this:
from medsam_space_client import MedSAMSpacePredictor

# Initialize Space predictor
MEDSAM_SPACE_URL = os.getenv(
    'MEDSAM_SPACE_URL',
    'https://YOUR_USERNAME-medsam-inference.hf.space/api/predict'
)
sam_predictor = MedSAMSpacePredictor(MEDSAM_SPACE_URL)
print("✓ MedSAM Space predictor initialized")
```
### Step 3: Update your .env
```bash
cd backend
echo "MEDSAM_SPACE_URL=https://YOUR_USERNAME-medsam-inference.hf.space/api/predict" >> .env
```
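`os.getenv` only reads the process environment, so the `.env` entry takes effect only if something loads it (python-dotenv is one common option). As a minimal sketch, assuming you would like the backend to fail fast on a missing or placeholder URL, a startup check might look like the following. `resolve_space_url` is a hypothetical helper written for this guide, not part of `medsam_space_client.py`.

```python
import os
from urllib.parse import urlparse


def resolve_space_url(env=None):
    """Return the configured Space URL, or raise ValueError explaining what is wrong.

    `env` defaults to os.environ; passing a dict makes the check easy to test.
    """
    env = os.environ if env is None else env
    url = env.get("MEDSAM_SPACE_URL", "").strip()
    if not url:
        raise ValueError("MEDSAM_SPACE_URL is not set")
    if "YOUR_USERNAME" in url:
        # The template value from this guide was copied without editing.
        raise ValueError("MEDSAM_SPACE_URL still contains the YOUR_USERNAME placeholder")
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"MEDSAM_SPACE_URL is not a valid http(s) URL: {url!r}")
    return url
```

You could then pass `resolve_space_url()` to `MedSAMSpacePredictor` instead of relying on the hard-coded default.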
**That's it!** Your code now uses the HF Space API! 🎉
---
## What Changes?
### ✅ These STAY THE SAME (No changes needed!)
All your endpoint code stays exactly the same:
```python
@app.route('/api/segment', methods=['POST'])
def segment_with_sam():
    # ... existing code ...

    # This works exactly the same!
    sam_predictor.set_image(image_array)
    masks, scores, _ = sam_predictor.predict(
        point_coords=np.array([[x, y]]),
        point_labels=np.array([1]),
        multimask_output=True
    )

    # Get the best mask
    best_mask = masks[np.argmax(scores)]

    # ... rest of your code ...
```
### 🔄 What's Different
**Before (Local SAM):**
- Loads 2.5GB model into memory
- Uses GPU/CPU for inference
- Fast, but needs local memory and ideally a GPU
**After (HF Space):**
- No model loading
- API call to HF Space
- Slower per request (network round trip plus Space inference), but no local model memory or GPU needed
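To make the "API call" idea concrete, here is a rough sketch of how a drop-in predictor could mirror the `set_image` / `predict` calls while forwarding the work over HTTP. This is not the contents of `medsam_space_client.py`: the class name `SpacePredictorSketch`, the `post_json` helper, and the whole request/response schema (`image_b64`, `shape`, `masks`, `scores`) are assumptions made for illustration, and the real Space may expect something different.

```python
import base64
import json
import urllib.request

import numpy as np


def post_json(url, payload, timeout=60):
    """Default transport: POST a JSON body and decode the JSON reply."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


class SpacePredictorSketch:
    """Hypothetical stand-in exposing the same two calls as SamPredictor."""

    def __init__(self, space_url, transport=post_json):
        self.space_url = space_url
        self.transport = transport  # injectable so it can be faked in tests
        self._image = None

    def set_image(self, image):
        # No network call here: just remember the image for the next predict().
        self._image = np.ascontiguousarray(image, dtype=np.uint8)

    def predict(self, point_coords, point_labels, multimask_output=True):
        if self._image is None:
            raise RuntimeError("call set_image() before predict()")
        payload = {  # ASSUMED schema, not the real Space API
            "image_b64": base64.b64encode(self._image.tobytes()).decode("ascii"),
            "shape": list(self._image.shape),
            "point_coords": np.asarray(point_coords).tolist(),
            "point_labels": np.asarray(point_labels).tolist(),
            "multimask_output": bool(multimask_output),
        }
        reply = self.transport(self.space_url, payload)
        masks = np.asarray(reply["masks"], dtype=bool)
        scores = np.asarray(reply["scores"], dtype=float)
        # Third slot is kept so `masks, scores, _ = ...` still unpacks.
        return masks, scores, None
```

Because `set_image` only caches the array, all latency lands in `predict`, which is where a slow or sleeping Space would show up.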
---
## Complete Example
Here's a complete before/after comparison:
### BEFORE (app.py with local SAM):
```python
from segment_anything import sam_model_registry, SamPredictor

# Initialize SAM locally (loads 2.5GB model)
sam = sam_model_registry["vit_b"](checkpoint="models/sam_vit_h_4b8939.pth")
sam.to(device=device)
sam_predictor = SamPredictor(sam)

@app.route('/api/segment', methods=['POST'])
def segment():
    data = request.json
    image_data = data.get('image')
    x, y = data.get('x'), data.get('y')

    # Decode image
    image_bytes = base64.b64decode(image_data.split(',')[1])
    image = Image.open(BytesIO(image_bytes))
    image_array = np.array(image.convert('RGB'))

    # Segment with SAM
    sam_predictor.set_image(image_array)
    masks, scores, _ = sam_predictor.predict(
        point_coords=np.array([[x, y]]),
        point_labels=np.array([1]),
        multimask_output=True
    )

    # Get best mask
    best_mask = masks[np.argmax(scores)]

    return jsonify({'success': True})
```
### AFTER (app.py with HF Space):
```python
from medsam_space_client import MedSAMSpacePredictor

# Initialize Space predictor (no model loading!)
sam_predictor = MedSAMSpacePredictor(
    "https://YOUR_USERNAME-medsam-inference.hf.space/api/predict"
)

@app.route('/api/segment', methods=['POST'])
def segment():
    data = request.json
    image_data = data.get('image')
    x, y = data.get('x'), data.get('y')

    # Decode image
    image_bytes = base64.b64decode(image_data.split(',')[1])
    image = Image.open(BytesIO(image_bytes))
    image_array = np.array(image.convert('RGB'))

    # Segment with SAM Space (SAME CODE!)
    sam_predictor.set_image(image_array)
    masks, scores, _ = sam_predictor.predict(
        point_coords=np.array([[x, y]]),
        point_labels=np.array([1]),
        multimask_output=True
    )

    # Get best mask (SAME CODE!)
    best_mask = masks[np.argmax(scores)]

    return jsonify({'success': True})
```
**Notice:** Only the initialization changed! Everything else is identical! ✨
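One thing both versions share: `data.get('x')` and `data.get('y')` go straight into `np.array`, so a missing field ends up as `None` in the point array. With a remote Space that can mean a wasted round trip before anything fails. A small validation step, sketched here under the assumption that clicks are pixel coordinates inside the decoded image, can reject bad input early. `parse_click` is a hypothetical helper, not something the client provides.

```python
def parse_click(data, width, height):
    """Return (x, y) as ints inside a width x height image, or raise ValueError."""
    if not isinstance(data, dict):
        raise ValueError("request body must be a JSON object")
    try:
        x, y = int(data["x"]), int(data["y"])
    except (KeyError, TypeError, ValueError):
        # Missing keys, None, or non-numeric strings all land here.
        raise ValueError("'x' and 'y' must be present and numeric") from None
    if not (0 <= x < width and 0 <= y < height):
        raise ValueError(f"point ({x}, {y}) is outside the {width}x{height} image")
    return x, y
```

In the route you might take `height, width = image_array.shape[:2]`, call `parse_click(data, width, height)`, and return a 400 response with the error message when it raises.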
---
## Testing
### 1. Test the client directly:
```python
# test_client.py
from medsam_space_client import MedSAMSpacePredictor
import numpy as np
from PIL import Image

# Initialize
predictor = MedSAMSpacePredictor(
    "https://YOUR_USERNAME-medsam-inference.hf.space/api/predict"
)

# Load test image (convert to RGB, as the backend route does)
image = np.array(Image.open("test_image.jpg").convert("RGB"))

# Set image
predictor.set_image(image)

# Predict
masks, scores, _ = predictor.predict(
    point_coords=np.array([[200, 150]]),
    point_labels=np.array([1]),
    multimask_output=True
)

print(f"✅ Got {len(masks)} masks")
print(f"   Scores: {scores}")
print(f"   Best score: {scores.max():.4f}")
```
### 2. Test your full backend:
```bash
# Start your backend
python app.py
# In another terminal, test the endpoint
curl -X POST http://localhost:5000/api/segment \
-H "Content-Type: application/json" \
-d '{
"image": "data:image/jpeg;base64,/9j/4AAQ...",
"x": 200,
"y": 150
}'
```
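If you would rather not paste a base64 string into `curl` by hand, the same request can be built in Python. The sketch below uses only the standard library and assumes the `/api/segment` route shown earlier, which expects a data URL in `image` plus `x` and `y`. The helper names `build_segment_payload` and `post_segment` are made up for this example.

```python
import base64
import json
import urllib.request


def build_segment_payload(image_bytes, x, y, mime="image/jpeg"):
    """Build the JSON body for /api/segment: a data URL plus the click point."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {"image": f"data:{mime};base64,{encoded}", "x": x, "y": y}


def post_segment(payload, url="http://localhost:5000/api/segment", timeout=120):
    """POST the payload to the backend and return the decoded JSON reply."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example usage (needs the backend running and a local test_image.jpg):
#   with open("test_image.jpg", "rb") as f:
#       print(post_segment(build_segment_payload(f.read(), 200, 150)))
```

The generous default timeout is a guess meant to cover a Space that is waking up; tune it to what you actually observe.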
---
## Deployment
Your backend is now lightweight enough to deploy to Vercel!
### Update requirements.txt for Vercel:
```txt
# requirements_vercel.txt
Flask==2.3.3
Flask-CORS==4.0.0
requests==2.31.0
Pillow>=10.0.0
numpy>=1.24.0
# No torch, no segment-anything!
```
### Deploy to Vercel:
```bash
cd backend
# Create vercel.json
cat > vercel.json << 'EOF'
{
  "version": 2,
  "builds": [{"src": "app.py", "use": "@vercel/python"}],
  "routes": [{"src": "/(.*)", "dest": "app.py"}]
}
EOF
# Deploy
vercel
vercel env add MEDSAM_SPACE_URL
# Paste: https://YOUR_USERNAME-medsam-inference.hf.space/api/predict
vercel --prod
```
---
## Performance
### Local SAM:
- ✅ Fast: 1-3 seconds
- ❌ Memory: 2.5GB+
- ❌ Requires GPU for speed
### HF Space (Free CPU):
- ⚠️ Slower: 5-10 seconds
- ✅ Memory: None (API call)
- ⚠️ May sleep (first request slow)
### HF Space (GPU T4):
- ✅ Fast: 1-2 seconds
- ✅ Memory: None (API call)
- ✅ Always on
- 💰 Cost: $0.60/hour
---
## Troubleshooting
### "Failed to get prediction from MedSAM Space"
→ Check MEDSAM_SPACE_URL is correct
→ Check Space is running (visit URL in browser)
### First request is very slow (20-30s)
→ Normal! Free tier Spaces sleep after inactivity
→ They wake up on first request
→ Subsequent requests are faster
### "Request timeout"
→ Space might be overloaded
→ Try again in a minute
→ Or upgrade to GPU tier
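
Since cold starts and timeouts are expected on the free tier, "try again" can be automated. The following is a minimal retry-with-backoff sketch, not something the client is known to do for you. `call_with_retry` is a hypothetical helper, and because this guide does not say which exception type the client raises, it catches `Exception` broadly; narrow that once you know.

```python
import time


def call_with_retry(fn, attempts=3, base_delay=2.0, sleep=time.sleep):
    """Call fn(); on failure wait base_delay * 2**n seconds and try again.

    Gives up after `attempts` tries and re-raises the last error.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: surface the last error to the caller
            sleep(base_delay * 2 ** attempt)
```

In the route this could wrap the prediction call, for example `call_with_retry(lambda: sam_predictor.predict(point_coords=..., point_labels=..., multimask_output=True))`. Keep the total wait below whatever request timeout your hosting platform enforces.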
---
## Summary
✅ **What you did:**
1. Copied `medsam_space_client.py` to backend
2. Changed 5 lines in `app.py` (just initialization)
3. Added `MEDSAM_SPACE_URL` to `.env`
✅ **What stays the same:**
- All your endpoint code
- All your SAM prediction calls
- Your entire application logic
✅ **What you gained:**
- No more 2.5GB model in memory
- Can deploy to Vercel/serverless
- Model hosted on HuggingFace (free!)
🎉 **Your backend is now cloud-ready!**