# 🚀 Deploying MedSAM to HuggingFace Space

## Step-by-Step Guide

### Step 1: Create New Space

1. Go to: https://huggingface.co/new-space
2. Fill in details:
   - **Owner:** Your username
   - **Space name:** `medsam-inference`
   - **License:** Apache 2.0
   - **Select SDK:** Gradio
   - **Space hardware:** 
     - Start with **CPU basic** (free)
     - Upgrade to **T4 small** ($0.60/hour) for better performance
   - **Visibility:** Public or Private

3. Click **Create Space**

### Step 2: Upload Files

You have two options:

#### Option A: Using Git (Recommended)

```bash
# Clone your new Space
git clone https://huggingface.co/spaces/YOUR_USERNAME/medsam-inference
cd medsam-inference

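# Copy files from this directory ("*" skips dotfiles, so copy .gitattributes explicitly)
cp /path/to/huggingface_space/* .
cp /path/to/huggingface_space/.gitattributes .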

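# Download your model from HuggingFace
# Option 1: Download via Python
python3 << 'EOF'
from huggingface_hub import hf_hub_download

# Saves medsam_vit_b.pth as a real file in the current directory.
# local_dir_use_symlinks is deprecated and ignored by recent huggingface_hub
# releases; it is kept here for older versions that would otherwise symlink.
hf_hub_download(
    repo_id="Aniketg6/Fine-Tuned-MedSAM",
    filename="medsam_vit_b.pth",
    local_dir=".",
    local_dir_use_symlinks=False
)
EOF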

# Option 2: Download manually
# Go to https://huggingface.co/Aniketg6/Fine-Tuned-MedSAM
# Download medsam_vit_b.pth (375 MB)
# Place it in this directory

# Initialize Git LFS (for large files)
git lfs install
git lfs track "*.pth"

# Add and commit
git add .
git commit -m "Initial commit: MedSAM inference API"
git push
```
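
Before pushing, it can help to confirm which files are large enough to need Git LFS. The sketch below is a hypothetical helper, not part of any HuggingFace tooling: the name `find_large_files` and the 10 MB default threshold are assumptions for illustration. It only lists files above a size limit so you can check that each one matches a tracked pattern such as `*.pth`.

```python
import os

def find_large_files(root, limit_bytes=10 * 1024 * 1024):
    """Return sorted (relative_path, size) pairs for files larger than limit_bytes."""
    large = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip Git's own metadata directory.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            size = os.path.getsize(path)
            if size > limit_bytes:
                large.append((os.path.relpath(path, root), size))
    return sorted(large)

# Example: report large files in the current directory (the cloned Space).
for rel_path, size in find_large_files("."):
    print(f"{rel_path}: {size / 1e6:.1f} MB (check that Git LFS tracks it)")
```

Run it from the cloned Space directory after copying the model; `medsam_vit_b.pth` should appear in the output and should be covered by the `git lfs track "*.pth"` rule.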

#### Option B: Using Web Interface

1. In your Space, click the **Files** tab
2. Click **Add file** → **Upload files**
3. Upload:
   - `app.py`
   - `requirements.txt`
   - `README.md`
   - `.gitattributes`
4. For the model file (`medsam_vit_b.pth`):
   - Download from: https://huggingface.co/Aniketg6/Fine-Tuned-MedSAM
   - Upload to your Space (375 MB)

### Step 3: Wait for Build

- HuggingFace will automatically build your Space
- Check the **Logs** tab for build progress
- The build typically takes 3-5 minutes
- Once done, your Space will be live!

### Step 4: Test Your Space

Visit: `https://huggingface.co/spaces/YOUR_USERNAME/medsam-inference`

You should see:
- ✅ Interactive UI with two tabs
- ✅ API Interface for programmatic access
- ✅ Simple Interface for manual testing

### Step 5: Get Your API Endpoint

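Your API endpoint is:
```
https://YOUR_USERNAME-medsam-inference.hf.space/api/predict
```

Or use Gradio's direct endpoint:
```
https://YOUR_USERNAME-medsam-inference.hf.space/run/predict
```

Both paths are Gradio 3.x routes. Newer Gradio versions may expose different routes, so confirm the exact path via the **Use via API** link at the bottom of your Space.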
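
If you manage more than one Space, the URL pattern above can be built in code. This is a minimal sketch: `space_base_url` and `predict_url` are hypothetical helper names, and it assumes the username and Space name appear in the subdomain exactly as typed. Names with unusual characters may be rewritten by HuggingFace, so compare the result against the URL your Space actually uses.

```python
def space_base_url(username, space_name):
    """Build the direct *.hf.space URL following the pattern shown above."""
    return f"https://{username}-{space_name}.hf.space"

def predict_url(username, space_name, route="/api/predict"):
    """Append an API route (assumed here to be /api/predict) to the base URL."""
    return space_base_url(username, space_name) + route

print(predict_url("YOUR_USERNAME", "medsam-inference"))
print(predict_url("YOUR_USERNAME", "medsam-inference", route="/run/predict"))
```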

---

## Testing Your Space

### Test via Web UI

1. Go to your Space URL
2. Click the **Simple Interface** tab
3. Upload an image
4. Enter X, Y coordinates
5. Click **Segment**
6. See the mask output!

### Test via Python

```python
import requests
import json
import base64
import numpy as np

# Your Space URL
SPACE_URL = "https://YOUR_USERNAME-medsam-inference.hf.space"

def call_medsam_space(image_path, point_coords, point_labels, multimask=True):
    """
    Call your MedSAM Space
    
    Args:
        image_path: Path to image file
        point_coords: List of [x, y] coordinates, e.g., [[100, 150]]
        point_labels: List of labels (1=foreground, 0=background), e.g., [1]
        multimask: Whether to output multiple masks
    
    Returns:
        Dictionary with masks and scores
    """
    # Read and encode image
    with open(image_path, "rb") as f:
        img_base64 = base64.b64encode(f.read()).decode()
    
    # Prepare points JSON
    points_json = json.dumps({
        "coords": point_coords,
        "labels": point_labels,
        "multimask_output": multimask
    })
    
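    # Call API
    response = requests.post(
        f"{SPACE_URL}/api/predict",
        json={
            "data": [
                f"data:image/jpeg;base64,{img_base64}",  # MIME type assumes a JPEG input
                points_json
            ]
        },
        timeout=60  # Allow time for a sleeping Space to wake up
    )
    response.raise_for_status()  # Fail early on HTTP errors
    
    # Parse result
    result = response.json()
    output_json = result["data"][0]  # Gradio wraps output in data array
    
    return json.loads(output_json)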

# Example usage
if __name__ == "__main__":
    result = call_medsam_space(
        image_path="test_image.jpg",
        point_coords=[[200, 150]],
        point_labels=[1],
        multimask=True
    )
    
    if result['success']:
        print("✅ Segmentation successful!")
        print(f"   Number of masks: {result['num_masks']}")
        print(f"   Scores: {result['scores']}")
        
        # Get best mask
        best_idx = np.argmax(result['scores'])
        best_mask_data = result['masks'][best_idx]['mask_data']
        best_mask = np.array(best_mask_data, dtype=bool)
        print(f"   Best mask shape: {best_mask.shape}")
    else:
        print(f"❌ Error: {result['error']}")
```
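
The example above hardcodes `image/jpeg` in the data URI. If you send PNGs or other formats, the request body can be built separately and checked before any network call. The following is a sketch under assumptions: `build_payload` is a hypothetical helper, and the payload shape simply mirrors the request in the example above.

```python
import base64
import json
import mimetypes

def build_payload(image_bytes, filename, point_coords, point_labels, multimask=True):
    """Build the JSON body used by the request example above."""
    if len(point_coords) != len(point_labels):
        raise ValueError("point_coords and point_labels must have the same length")
    # Guess the MIME type from the file name; fall back to JPEG as in the example.
    mime, _ = mimetypes.guess_type(filename)
    if mime is None or not mime.startswith("image/"):
        mime = "image/jpeg"
    img_base64 = base64.b64encode(image_bytes).decode("ascii")
    points_json = json.dumps({
        "coords": point_coords,
        "labels": point_labels,
        "multimask_output": multimask,
    })
    return {"data": [f"data:{mime};base64,{img_base64}", points_json]}

# Example with placeholder bytes; a real call would read the image file instead.
payload = build_payload(b"not a real image", "scan.png", [[200, 150]], [1])
print(payload["data"][0].split(";")[0])
```

The returned dictionary can be passed as the `json=` argument of `requests.post`.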

---

## Integration with Your Backend

Now update your backend's `app.py` (or a helper module such as `hf_inference.py`) to call this Space:

```python
# In backend/app.py or backend/hf_inference.py

import requests
import json
import base64
from io import BytesIO
from PIL import Image
import numpy as np

# Your Space URL
MEDSAM_SPACE_URL = "https://YOUR_USERNAME-medsam-inference.hf.space/api/predict"

def call_medsam_space(image_array, point_coords, point_labels, multimask_output=True):
    """
    Call MedSAM Space API
    
    Args:
        image_array: numpy array of image
        point_coords: numpy array [[x, y]]
        point_labels: numpy array [1] or [0]
        multimask_output: bool
    
    Returns:
        masks, scores, None (a 3-tuple matching the original SAM interface; logits are not returned)
    """
    try:
        # Convert numpy array to base64
        image = Image.fromarray(image_array)
        buffered = BytesIO()
        image.save(buffered, format="PNG")
        img_base64 = base64.b64encode(buffered.getvalue()).decode()
        
        # Prepare points JSON
        points_json = json.dumps({
            "coords": point_coords.tolist(),
            "labels": point_labels.tolist(),
            "multimask_output": multimask_output
        })
        
        # Call Space API
        response = requests.post(
            MEDSAM_SPACE_URL,
            json={
                "data": [
                    f"data:image/png;base64,{img_base64}",
                    points_json
                ]
            },
            timeout=60
        )
        
        # Parse result
        response.raise_for_status()  # Surface HTTP errors before parsing JSON
        result = response.json()
        output_json = result["data"][0]
        output = json.loads(output_json)
        
        if not output['success']:
            raise RuntimeError(output['error'])
        
        # Convert back to numpy arrays (matching SAM interface)
        masks = []
        for mask_data in output['masks']:
            mask = np.array(mask_data['mask_data'], dtype=bool)
            masks.append(mask)
        
        masks = np.array(masks)
        scores = np.array(output['scores'])
        
        return masks, scores, None  # Return None for logits (not needed)
        
    except Exception as e:
        print(f"Error calling MedSAM Space: {e}")
        raise

# Replace your SAM predictor calls with this:
# OLD:
# sam_predictor.set_image(image_array)
# masks, scores, _ = sam_predictor.predict(
#     point_coords=np.array([[x, y]]),
#     point_labels=np.array([1]),
#     multimask_output=True
# )

# NEW:
# masks, scores, _ = call_medsam_space(
#     image_array,
#     point_coords=np.array([[x, y]]),
#     point_labels=np.array([1]),
#     multimask_output=True
# )
```
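
The parsing step can be tested without a running Space by feeding it a hand-written response. This sketch assumes the response fields used in the code above (`success`, `scores`, `masks[i]['mask_data']`, `error`) and that `mask_data` is a 2D nested list. The names `parse_medsam_response` and `mask_area` are hypothetical helpers.

```python
import json

def parse_medsam_response(response_body):
    """Return (best_mask, best_score) from a decoded Gradio response dict."""
    output = json.loads(response_body["data"][0])
    if not output.get("success"):
        raise RuntimeError(output.get("error", "unknown error"))
    scores = output["scores"]
    # Pick the index of the highest-scoring mask.
    best_idx = max(range(len(scores)), key=lambda i: scores[i])
    return output["masks"][best_idx]["mask_data"], scores[best_idx]

def mask_area(mask):
    """Count foreground pixels in a 2D mask given as nested lists."""
    return sum(1 for row in mask for value in row if value)

# Hand-written stand-in for a real response, for illustration only.
fake_response = {"data": [json.dumps({
    "success": True,
    "num_masks": 2,
    "scores": [0.71, 0.93],
    "masks": [
        {"mask_data": [[0, 1], [0, 0]]},
        {"mask_data": [[1, 1], [0, 1]]},
    ],
})]}
best_mask, best_score = parse_medsam_response(fake_response)
print(best_score, mask_area(best_mask))
```

Keeping parsing separate from the HTTP call makes it easier to unit test the backend when the Space is asleep or unavailable.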

---

## Cost & Performance

### Free Tier (CPU Basic):
- ✅ **Free!**
- ⚠️ Slower inference (~5-10 seconds per image)
- ⚠️ May sleep after inactivity
- ✅ Good for testing and low usage

### Paid Tier (T4 Small GPU):
- 💰 **$0.60/hour** (~$432/month if always on)
- ✅ Fast inference (~1-2 seconds per image)
- ✅ No automatic sleep by default
- ✅ Better for production
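
The monthly figure above is the hourly rate multiplied by 24 hours and a 30-day month. The sketch below reproduces that arithmetic and shows how the estimate changes if the Space only runs part of the day. It assumes billing follows running time, which you should check against current HuggingFace pricing; `monthly_cost` is a hypothetical helper.

```python
def monthly_cost(hourly_rate, hours_per_day=24, days=30):
    """Estimate monthly cost in dollars for hardware billed by the hour."""
    return hourly_rate * hours_per_day * days

# Rate taken from the guide above; adjust if pricing has changed.
T4_SMALL_RATE = 0.60

print(f"Always on: ${monthly_cost(T4_SMALL_RATE):.2f}")
print(f"8 h/day:   ${monthly_cost(T4_SMALL_RATE, hours_per_day=8):.2f}")
```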

### Upgrade to GPU:

1. Go to your Space
2. Click the **Settings** tab
3. Under **Space hardware**, select **T4 small**
4. Click **Update**

---

## Troubleshooting

### "Application startup failed"
- Check logs for errors
- Make sure `medsam_vit_b.pth` is uploaded
- Verify `requirements.txt` is correct

### "Out of memory"
- Upgrade to GPU hardware
- Reduce image size before sending
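
If you reduce the image size before sending, the prompt points must be scaled by the same factor. The sketch below computes the target size and the scaled points in plain Python; the actual pixel resize would be done separately (for example with PIL's `Image.resize`). The helper names and the `max_side=1024` default are assumptions for illustration, and the returned mask would presumably be at the reduced resolution.

```python
def fit_within(width, height, max_side=1024):
    """Return (new_width, new_height, scale) so the longer side is at most max_side."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height, 1.0
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale)), scale

def scale_points(point_coords, scale, new_width, new_height):
    """Map [x, y] points from the original image onto the resized image."""
    scaled = []
    for x, y in point_coords:
        # Clamp so rounding can never push a point outside the resized image.
        sx = min(int(x * scale), new_width - 1)
        sy = min(int(y * scale), new_height - 1)
        scaled.append([sx, sy])
    return scaled

new_w, new_h, scale = fit_within(4000, 3000, max_side=1000)
print(new_w, new_h, scale_points([[400, 200]], scale, new_w, new_h))
```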

### "Space is sleeping"
- Free tier spaces sleep after 48h inactivity
- First request will wake it up (takes 10-20s)
- Upgrade to paid tier for always-on
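
Because the first request to a sleeping Space can fail or time out while it wakes up, a client may want to retry with increasing delays. This is a generic sketch, not something the Space provides: `call_with_retry`, the attempt count, and the delays are all assumptions to tune for your setup.

```python
import time

def call_with_retry(func, attempts=4, base_delay=5.0, sleep=time.sleep):
    """Call func(); on failure wait base_delay * 2**n seconds, then retry."""
    last_error = None
    for attempt in range(attempts):
        try:
            return func()
        except Exception as exc:  # In real code, catch only network/timeout errors.
            last_error = exc
            if attempt < attempts - 1:
                sleep(base_delay * (2 ** attempt))
    raise last_error

# Usage idea (call_medsam_space is the function defined earlier in this guide):
# masks, scores, _ = call_with_retry(
#     lambda: call_medsam_space(image_array, point_coords, point_labels)
# )
```

The `sleep` parameter is injectable so the retry logic can be tested without real waiting.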

### API returns error
- Check input format matches examples
- Verify coordinates are within image bounds
- Check Space logs for detailed errors
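
Several of these checks can run on the client before a request is sent. The sketch below validates the point prompt against the image size; `validate_points` is a hypothetical helper, and the rules (labels are 0 or 1, points are `[x, y]` pairs inside the image) follow the input format described earlier in this guide.

```python
def validate_points(width, height, point_coords, point_labels):
    """Return a list of problems; an empty list means the prompt looks valid."""
    problems = []
    if len(point_coords) != len(point_labels):
        problems.append(
            f"{len(point_coords)} points but {len(point_labels)} labels"
        )
    for i, point in enumerate(point_coords):
        if len(point) != 2:
            problems.append(f"point {i} is not an [x, y] pair")
            continue
        x, y = point
        if not (0 <= x < width and 0 <= y < height):
            problems.append(
                f"point {i} ({x}, {y}) is outside the {width}x{height} image"
            )
    for i, label in enumerate(point_labels):
        if label not in (0, 1):
            problems.append(f"label {i} must be 0 or 1, got {label!r}")
    return problems

# Example: the second point lies outside a 512x512 image.
for problem in validate_points(512, 512, [[200, 150], [600, 10]], [1, 0]):
    print(problem)
```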

---

## Next Steps

1. ✅ Deploy Space
2. ✅ Test via web UI
3. ✅ Test via Python script
4. ✅ Integrate with your backend
5. ✅ Deploy your backend to Vercel/Railway
6. ✅ Deploy frontend to Vercel
7. 🎉 Done!

---

## Alternative: Use Inference Endpoints

For production, consider **HuggingFace Inference Endpoints**:
- Dedicated infrastructure
- Auto-scaling
- Better performance
- Billed hourly by instance type (check the pricing page for current rates)

See: https://huggingface.co/inference-endpoints

---

**Questions? Check HuggingFace Spaces docs:**
https://huggingface.co/docs/hub/spaces