# 🚀 Deploying MedSAM to HuggingFace Space

## Step-by-Step Guide

### Step 1: Create New Space

1. Go to: https://huggingface.co/new-space
2. Fill in details:
   - **Owner:** Your username
   - **Space name:** `medsam-inference`
   - **License:** Apache 2.0
   - **Select SDK:** Gradio
   - **Space hardware:**
     - Start with **CPU basic** (free)
     - Upgrade to **T4 small** ($0.60/hour) for better performance
   - **Visibility:** Public or Private
3. Click **Create Space**

### Step 2: Upload Files

You have two options:

#### Option A: Using Git (Recommended)

```bash
# Clone your new Space
git clone https://huggingface.co/spaces/YOUR_USERNAME/medsam-inference
cd medsam-inference

# Copy files from this directory
cp /path/to/huggingface_space/* .

# Download your model from HuggingFace
# Option 1: Download via Python
python3 << EOF
from huggingface_hub import hf_hub_download
hf_hub_download(
    repo_id="Aniketg6/Fine-Tuned-MedSAM",
    filename="medsam_vit_b.pth",
    local_dir=".",
    local_dir_use_symlinks=False
)
EOF

# Option 2: Download manually
# Go to https://huggingface.co/Aniketg6/Fine-Tuned-MedSAM
# Download medsam_vit_b.pth (375 MB)
# Place it in this directory

# Initialize Git LFS (for large files)
git lfs install
git lfs track "*.pth"

# Add and commit
git add .
git commit -m "Initial commit: MedSAM inference API"
git push
```

#### Option B: Using Web Interface

1. In your Space, click the **Files** tab
2. Click **Add file** → **Upload files**
3. Upload:
   - `app.py`
   - `requirements.txt`
   - `README.md`
   - `.gitattributes`
4. For the model file (`medsam_vit_b.pth`):
   - Download it from https://huggingface.co/Aniketg6/Fine-Tuned-MedSAM
   - Upload it to your Space (375 MB)

### Step 3: Wait for Build

- HuggingFace will automatically build your Space
- Check the **Logs** tab for build progress
- The build should take 3-5 minutes
- Once done, your Space will be live!
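Before wiring up a client, it helps to see the exact request body the Gradio `/api/predict` route expects. Below is a minimal, stdlib-only sketch of building that payload; the `coords`/`labels`/`multimask_output` field names follow the `app.py` described in this guide, and the byte string is a stand-in for real image data:

```python
import base64
import json

def build_payload(image_bytes, coords, labels, multimask=True):
    """Build the JSON body for the Space's /api/predict route.

    Gradio expects the inputs wrapped in a single "data" list:
    a base64 data URL for the image, plus the points as a JSON string.
    """
    img_b64 = base64.b64encode(image_bytes).decode()
    points_json = json.dumps({
        "coords": coords,                 # e.g. [[200, 150]]
        "labels": labels,                 # 1 = foreground, 0 = background
        "multimask_output": multimask,
    })
    return {"data": [f"data:image/png;base64,{img_b64}", points_json]}

# Dummy bytes for illustration only; pass the contents of a real PNG in practice.
payload = build_payload(b"\x89PNG\r\n", [[200, 150]], [1])
```

This is the same dictionary shape that the test script and backend helper later in this guide send with `requests.post(..., json=...)`.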
### Step 4: Test Your Space

Visit: `https://huggingface.co/spaces/YOUR_USERNAME/medsam-inference`

You should see:
- ✅ Interactive UI with two tabs
- ✅ API Interface for programmatic access
- ✅ Simple Interface for manual testing

### Step 5: Get Your API Endpoint

Your API endpoint is:

```
https://YOUR_USERNAME-medsam-inference.hf.space/api/predict
```

Or use Gradio's direct endpoint:

```
https://YOUR_USERNAME-medsam-inference.hf.space/run/predict
```

---

## Testing Your Space

### Test via Web UI

1. Go to your Space URL
2. Click the **Simple Interface** tab
3. Upload an image
4. Enter X, Y coordinates
5. Click **Segment**
6. See the mask output!

### Test via Python

```python
import requests
import json
import base64
import numpy as np

# Your Space URL
SPACE_URL = "https://YOUR_USERNAME-medsam-inference.hf.space"

def call_medsam_space(image_path, point_coords, point_labels, multimask=True):
    """
    Call your MedSAM Space.

    Args:
        image_path: Path to image file
        point_coords: List of [x, y] coordinates, e.g., [[100, 150]]
        point_labels: List of labels (1=foreground, 0=background), e.g., [1]
        multimask: Whether to output multiple masks

    Returns:
        Dictionary with masks and scores
    """
    # Read and encode image
    with open(image_path, "rb") as f:
        img_base64 = base64.b64encode(f.read()).decode()

    # Prepare points JSON
    points_json = json.dumps({
        "coords": point_coords,
        "labels": point_labels,
        "multimask_output": multimask
    })

    # Call API
    response = requests.post(
        f"{SPACE_URL}/api/predict",
        json={
            "data": [
                f"data:image/jpeg;base64,{img_base64}",
                points_json
            ]
        }
    )

    # Parse result
    result = response.json()
    output_json = result["data"][0]  # Gradio wraps the output in a data array
    return json.loads(output_json)

# Example usage
if __name__ == "__main__":
    result = call_medsam_space(
        image_path="test_image.jpg",
        point_coords=[[200, 150]],
        point_labels=[1],
        multimask=True
    )

    if result['success']:
        print("✅ Segmentation successful!")
        print(f"   Number of masks: {result['num_masks']}")
        print(f"   Scores: {result['scores']}")

        # Get best mask
        best_idx = np.argmax(result['scores'])
        best_mask_data = result['masks'][best_idx]['mask_data']
        best_mask = np.array(best_mask_data, dtype=bool)
        print(f"   Best mask shape: {best_mask.shape}")
    else:
        print(f"❌ Error: {result['error']}")
```

---

## Integration with Your Backend

Now update your `app.py` to use this Space:

```python
# In backend/app.py or backend/hf_inference.py

import requests
import json
import base64
from io import BytesIO
from PIL import Image
import numpy as np

# Your Space URL
MEDSAM_SPACE_URL = "https://YOUR_USERNAME-medsam-inference.hf.space/api/predict"

def call_medsam_space(image_array, point_coords, point_labels, multimask_output=True):
    """
    Call the MedSAM Space API.

    Args:
        image_array: numpy array of the image
        point_coords: numpy array [[x, y]]
        point_labels: numpy array [1] or [0]
        multimask_output: bool

    Returns:
        masks, scores, None (matching the original SAM predictor interface)
    """
    try:
        # Convert numpy array to base64
        image = Image.fromarray(image_array)
        buffered = BytesIO()
        image.save(buffered, format="PNG")
        img_base64 = base64.b64encode(buffered.getvalue()).decode()

        # Prepare points JSON
        points_json = json.dumps({
            "coords": point_coords.tolist(),
            "labels": point_labels.tolist(),
            "multimask_output": multimask_output
        })

        # Call Space API
        response = requests.post(
            MEDSAM_SPACE_URL,
            json={
                "data": [
                    f"data:image/png;base64,{img_base64}",
                    points_json
                ]
            },
            timeout=60
        )

        # Parse result
        result = response.json()
        output_json = result["data"][0]
        output = json.loads(output_json)

        if not output['success']:
            raise Exception(output['error'])

        # Convert back to numpy arrays (matching the SAM interface)
        masks = []
        for mask_data in output['masks']:
            mask = np.array(mask_data['mask_data'], dtype=bool)
            masks.append(mask)

        masks = np.array(masks)
        scores = np.array(output['scores'])

        return masks, scores, None  # None for logits (not needed)

    except Exception as e:
        print(f"Error calling MedSAM Space: {e}")
        raise

# Replace your SAM predictor calls with this:
#
# OLD:
# sam_predictor.set_image(image_array)
# masks, scores, _ = sam_predictor.predict(
#     point_coords=np.array([[x, y]]),
#     point_labels=np.array([1]),
#     multimask_output=True
# )
#
# NEW:
# masks, scores, _ = call_medsam_space(
#     image_array,
#     point_coords=np.array([[x, y]]),
#     point_labels=np.array([1]),
#     multimask_output=True
# )
```

---

## Cost & Performance

### Free Tier (CPU Basic)
- ✅ **Free!**
- ⚠️ Slower inference (~5-10 seconds per image)
- ⚠️ May sleep after inactivity
- ✅ Good for testing and low usage

### Paid Tier (T4 Small GPU)
- 💰 **$0.60/hour** (~$432/month if always on)
- ✅ Fast inference (~1-2 seconds per image)
- ✅ No sleep mode
- ✅ Better for production

### Upgrade to GPU
1. Go to your Space
2. Click the **Settings** tab
3. Under **Space hardware**, select **T4 small**
4. Click **Update**

---

## Troubleshooting

### "Application startup failed"
- Check the logs for errors
- Make sure `medsam_vit_b.pth` is uploaded
- Verify `requirements.txt` is correct

### "Out of memory"
- Upgrade to GPU hardware
- Reduce image size before sending

### "Space is sleeping"
- Free-tier Spaces sleep after 48 hours of inactivity
- The first request wakes it up (takes 10-20 seconds)
- Upgrade to a paid tier for always-on

### API returns an error
- Check that the input format matches the examples
- Verify coordinates are within image bounds
- Check the Space logs for detailed errors

---

## Next Steps

1. ✅ Deploy the Space
2. ✅ Test via the web UI
3. ✅ Test via the Python script
4. ✅ Integrate with your backend
5. ✅ Deploy your backend to Vercel/Railway
6. ✅ Deploy the frontend to Vercel
7. 🎉 Done!

---

## Alternative: Use Inference Endpoints

For production, consider **HuggingFace Inference Endpoints**:
- Dedicated infrastructure
- Auto-scaling
- Better performance
- $0.60/hour minimum

See: https://huggingface.co/inference-endpoints

---

**Questions? Check the HuggingFace Spaces docs:**
https://huggingface.co/docs/hub/spaces
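As a sanity check on the pricing figures in the Cost & Performance section, the always-on monthly cost follows directly from the hourly rate. A quick back-of-envelope (integer cents avoid float rounding; the 8-hours-per-day scenario is only an illustration of what pausing an idle Space can save):

```python
# T4 small: $0.60/hour, as quoted in the Cost & Performance section.
hourly_cents = 60

always_on = hourly_cents * 24 * 30 / 100   # dollars per 30-day month, always on
part_time = hourly_cents * 8 * 30 / 100    # hypothetical 8 h/day usage

print(always_on)  # 432.0  (matches the ~$432/month figure)
print(part_time)  # 144.0
```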