
	# 🔗 Integration Guide - Use HF Space in Your Backend

	## Quick Integration (3 Steps)

	### Step 1: Copy the client file

	```bash
	# Copy the client to your backend directory
	cp medsam_space_client.py ../medsam_space_client.py
	```

	### Step 2: Update your app.py

	Find this code in `app.py` (around lines 86-104):

	```python
	# OLD CODE - Remove this:
	sam_checkpoint = "models/sam_vit_h_4b8939.pth"
	model_type = "vit_h"  # must match the checkpoint: sam_vit_h_4b8939.pth holds the ViT-H weights
	sam = None
	sam_predictor = None

	try:
	    if os.path.exists(sam_checkpoint):
	        sam = sam_model_registry[model_type](checkpoint=sam_checkpoint)
	        sam.to(device=device)
	        sam_predictor = SamPredictor(sam)
	        print("SAM model loaded successfully")
	    else:
	        print(f"Warning: SAM checkpoint not found at {sam_checkpoint}")
	except Exception as e:
	    print(f"Warning: Failed to load SAM model: {e}")
	```

	Replace with:

	```python
	# NEW CODE - Add this:
	from medsam_space_client import MedSAMSpacePredictor

	# Initialize Space predictor
	MEDSAM_SPACE_URL = os.getenv(
	    'MEDSAM_SPACE_URL',
	    'https://YOUR_USERNAME-medsam-inference.hf.space/api/predict',
	)

	sam_predictor = MedSAMSpacePredictor(MEDSAM_SPACE_URL)
	print("✓ MedSAM Space predictor initialized")
	```

	### Step 3: Update your .env

	```bash
	cd backend
	echo "MEDSAM_SPACE_URL=https://YOUR_USERNAME-medsam-inference.hf.space/api/predict" >> .env
	```

	That's it! Your code now uses the HF Space API! 🎉
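A wrong or missing URL otherwise only shows up at the first prediction call, so it can help to check `MEDSAM_SPACE_URL` once at startup. Below is a minimal sketch using only the standard library. The helper name `get_space_url` is hypothetical (it is not part of `medsam_space_client.py`), and it assumes the variable has already been exported into the process environment, for example by a dotenv loader, since Python itself does not read `.env` files.

```python
import os
from urllib.parse import urlparse


def get_space_url(env=None):
    """Return MEDSAM_SPACE_URL, or raise RuntimeError with a clear message.

    Hypothetical helper for illustration; not part of medsam_space_client.py.
    """
    env = os.environ if env is None else env
    url = env.get("MEDSAM_SPACE_URL", "").strip()
    if not url:
        raise RuntimeError("MEDSAM_SPACE_URL is not set")
    # Catch the common mistake of leaving the template value in place.
    if "YOUR_USERNAME" in url:
        raise RuntimeError("MEDSAM_SPACE_URL still contains the YOUR_USERNAME placeholder")
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise RuntimeError(f"MEDSAM_SPACE_URL is not a valid http(s) URL: {url!r}")
    return url
```

With this in place, `sam_predictor = MedSAMSpacePredictor(get_space_url())` fails fast with a readable error instead of silently falling back to the placeholder URL.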

	---

	## What Changes?

	### ✅ These STAY THE SAME (No changes needed!)

	All your endpoint code stays exactly the same:

	```python
	@app.route('/api/segment', methods=['POST'])
	def segment_with_sam():
	    # ... existing code ...

	    # This works exactly the same!
	    sam_predictor.set_image(image_array)
	    masks, scores, _ = sam_predictor.predict(
	        point_coords=np.array([[x, y]]),
	        point_labels=np.array([1]),
	        multimask_output=True
	    )

	    # Get the best mask
	    best_mask = masks[np.argmax(scores)]

	    # ... rest of your code ...
	```

	### 🔄 What's Different

	Before (Local SAM):
	- Loads a 2.5GB model into memory
	- Runs inference on your own GPU/CPU
	- Fast, but needs local memory and compute

	After (HF Space):
	- No model loading
	- Each prediction is an API call to the HF Space
	- Slower on the free CPU tier (see Performance), but no local model memory or compute

	---

	## Complete Example

	Here's a complete before/after comparison:

	### BEFORE (app.py with local SAM):

	```python
	import base64
	from io import BytesIO

	import numpy as np
	import torch
	from flask import Flask, jsonify, request
	from PIL import Image
	from segment_anything import sam_model_registry, SamPredictor

	# Assumed setup; your app.py already defines `app` and `device`
	app = Flask(__name__)
	device = "cuda" if torch.cuda.is_available() else "cpu"

	# Initialize SAM locally (loads 2.5GB model)
	sam = sam_model_registry["vit_h"](checkpoint="models/sam_vit_h_4b8939.pth")
	sam.to(device=device)
	sam_predictor = SamPredictor(sam)

	@app.route('/api/segment', methods=['POST'])
	def segment():
	    data = request.json
	    image_data = data.get('image')
	    x, y = data.get('x'), data.get('y')

	    # Decode image (expects a data URL: "data:image/...;base64,<payload>")
	    image_bytes = base64.b64decode(image_data.split(',')[1])
	    image = Image.open(BytesIO(image_bytes))
	    image_array = np.array(image.convert('RGB'))

	    # Segment with SAM
	    sam_predictor.set_image(image_array)
	    masks, scores, _ = sam_predictor.predict(
	        point_coords=np.array([[x, y]]),
	        point_labels=np.array([1]),
	        multimask_output=True
	    )

	    # Get best mask
	    best_mask = masks[np.argmax(scores)]

	    return jsonify({'success': True})
	```

	### AFTER (app.py with HF Space):

	```python
	import base64
	from io import BytesIO

	import numpy as np
	from flask import Flask, jsonify, request
	from PIL import Image
	from medsam_space_client import MedSAMSpacePredictor

	# Assumed setup; your app.py already defines `app`
	app = Flask(__name__)

	# Initialize Space predictor (no model loading!)
	sam_predictor = MedSAMSpacePredictor(
	    "https://YOUR_USERNAME-medsam-inference.hf.space/api/predict"
	)

	@app.route('/api/segment', methods=['POST'])
	def segment():
	    data = request.json
	    image_data = data.get('image')
	    x, y = data.get('x'), data.get('y')

	    # Decode image (expects a data URL: "data:image/...;base64,<payload>")
	    image_bytes = base64.b64decode(image_data.split(',')[1])
	    image = Image.open(BytesIO(image_bytes))
	    image_array = np.array(image.convert('RGB'))

	    # Segment with SAM Space (SAME CODE!)
	    sam_predictor.set_image(image_array)
	    masks, scores, _ = sam_predictor.predict(
	        point_coords=np.array([[x, y]]),
	        point_labels=np.array([1]),
	        multimask_output=True
	    )

	    # Get best mask (SAME CODE!)
	    best_mask = masks[np.argmax(scores)]

	    return jsonify({'success': True})
	```

	Notice: Only the initialization changed! Everything else is identical! ✨
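Both versions decode the image with `image_data.split(',')[1]`, which raises an `IndexError` when the client sends bare base64 without the `data:...;base64,` prefix and an `AttributeError` when the field is missing. If you want the endpoint to return a clean error instead, a more defensive decoder might look like the sketch below. The helper name `decode_image_payload` is hypothetical and only uses the standard library.

```python
import base64
import binascii


def decode_image_payload(image_data):
    """Return raw image bytes from a data URL or a bare base64 string.

    Hypothetical helper; raises ValueError on missing or malformed input.
    """
    if not isinstance(image_data, str) or not image_data:
        raise ValueError("missing 'image' field")
    if image_data.startswith("data:"):
        # Data URLs look like "data:image/jpeg;base64,<payload>"; keep the payload.
        _, sep, payload = image_data.partition(",")
        if not sep:
            raise ValueError("malformed data URL")
    else:
        payload = image_data
    try:
        # validate=True rejects characters outside the base64 alphabet.
        return base64.b64decode(payload, validate=True)
    except binascii.Error as exc:
        raise ValueError("image is not valid base64") from exc
```

In the endpoint you could then replace the `split` line with `image_bytes = decode_image_payload(data.get('image'))` and turn a `ValueError` into a 400 response.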

	---

	## Testing

	### 1. Test the client directly:

	```python
	# test_client.py
	from medsam_space_client import MedSAMSpacePredictor
	import numpy as np
	from PIL import Image

	# Initialize
	predictor = MedSAMSpacePredictor(
	    "https://YOUR_USERNAME-medsam-inference.hf.space/api/predict"
	)

	# Load test image (convert to RGB, as app.py does, so grayscale/RGBA files also work)
	image = np.array(Image.open("test_image.jpg").convert("RGB"))

	# Set image
	predictor.set_image(image)

	# Predict
	masks, scores, _ = predictor.predict(
	    point_coords=np.array([[200, 150]]),
	    point_labels=np.array([1]),
	    multimask_output=True
	)

	print(f"✅ Got {len(masks)} masks")
	print(f" Scores: {scores}")
	print(f" Best score: {scores.max():.4f}")
	```

	### 2. Test your full backend:

	```bash
	# Start your backend
	python app.py

	# In another terminal, test the endpoint
	curl -X POST http://localhost:5000/api/segment \
	-H "Content-Type: application/json" \
	-d '{
	"image": "data:image/jpeg;base64,/9j/4AAQ...",
	"x": 200,
	"y": 150
	}'
	```
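The base64 string in the curl command above is truncated, so you need a real payload to run it. One way to build the request body is sketched below: it wraps image bytes in a data URL and adds the click point, matching the fields the endpoint reads (`image`, `x`, `y`). The function name `build_segment_payload` is hypothetical.

```python
import base64
import json


def build_segment_payload(image_bytes, x, y, mime="image/jpeg"):
    """Build the JSON body for /api/segment: a data-URL image plus a click point."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({"image": f"data:{mime};base64,{encoded}", "x": x, "y": y})
```

Write the returned string to a file such as `payload.json`, then pass it to curl with `-d @payload.json` instead of pasting the base64 inline.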

	---

	## Deployment

	Your backend is now lightweight enough to deploy to Vercel!

	### Update requirements.txt for Vercel:

	```txt
	# requirements.txt (Vercel build)
	Flask==2.3.3
	Flask-CORS==4.0.0
	requests==2.31.0
	Pillow>=10.0.0
	numpy>=1.24.0

	# No torch, no segment-anything!
	```

	### Deploy to Vercel:

	```bash
	cd backend

	# Create vercel.json
	cat > vercel.json << 'EOF'
	{
	  "version": 2,
	  "builds": [{"src": "app.py", "use": "@vercel/python"}],
	  "routes": [{"src": "/(.*)", "dest": "app.py"}]
	}
	EOF

	# Deploy
	vercel
	vercel env add MEDSAM_SPACE_URL
	# Paste: https://YOUR_USERNAME-medsam-inference.hf.space/api/predict
	vercel --prod
	```

	---

	## Performance

	### Local SAM:
	- ✅ Fast: 1-3 seconds
	- ❌ Memory: 2.5GB+
	- ❌ Requires GPU for speed

	### HF Space (Free CPU):
	- ⚠️ Slower: 5-10 seconds
	- ✅ Memory: None (API call)
	- ⚠️ May sleep (first request slow)

	### HF Space (GPU T4):
	- ✅ Fast: 1-2 seconds
	- ✅ Memory: None (API call)
	- ✅ Always on
	- 💰 Cost: $0.60/hour
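To put the GPU price in perspective, here is a quick estimate of the monthly bill at the $0.60/hour rate quoted above. This is only arithmetic under simple assumptions: billing is purely per hour of uptime, a month is 30 days, and the rate has not changed, so check current Hugging Face pricing before relying on it. The calculation uses integer cents to avoid floating-point rounding noise.

```python
T4_CENTS_PER_HOUR = 60  # $0.60/hour as quoted in this guide; assumed, may be outdated


def monthly_cost_usd(hours_per_day, days=30, cents_per_hour=T4_CENTS_PER_HOUR):
    """Estimated cost in dollars for running the Space `hours_per_day` every day."""
    return hours_per_day * days * cents_per_hour / 100


print(monthly_cost_usd(24))  # always on      -> 432.0
print(monthly_cost_usd(8))   # 8 hours a day  -> 144.0
```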

	---

	## Troubleshooting

	### "Failed to get prediction from MedSAM Space"
	- Check that `MEDSAM_SPACE_URL` is correct
	- Check that the Space is running (visit the URL in a browser)

	### First request is very slow (20-30s)
	- Normal! Free-tier Spaces sleep after inactivity
	- They wake up on the first request
	- Subsequent requests are faster

	### "Request timeout"
	- The Space might be overloaded
	- Try again in a minute
	- Or upgrade to the GPU tier
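Since a sleeping or busy Space usually recovers within a minute, retrying with a growing delay handles most of these cases automatically. The wrapper below is a minimal sketch, not part of `medsam_space_client.py`. The name `call_with_retry` is hypothetical, and it assumes the client raises an exception when a request fails or times out, which you should confirm against the client source.

```python
import time


def call_with_retry(fn, attempts=3, base_delay=5.0, sleep=time.sleep):
    """Call fn(); on failure wait base_delay, then 2x, 4x, ... and try again.

    Re-raises the last exception once all attempts are used up.
    `sleep` is injectable so the delays can be skipped in tests.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))
```

For example, `call_with_retry(lambda: sam_predictor.predict(point_coords=np.array([[x, y]]), point_labels=np.array([1]), multimask_output=True))` gives a sleeping Space up to three chances, waiting 5 and then 10 seconds between attempts.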

	---

	## Summary

	✅ What you did:
	1. Copied `medsam_space_client.py` to backend
	2. Replaced the SAM initialization block in `app.py` with about 5 lines
	3. Added `MEDSAM_SPACE_URL` to `.env`

	✅ What stays the same:
	- All your endpoint code
	- All your SAM prediction calls
	- Your entire application logic

	✅ What you gained:
	- No more 2.5GB model in memory
	- Can deploy to Vercel/serverless
	- Model hosted on Hugging Face (free on the CPU tier)

	🎉 Your backend is now cloud-ready!