
Deployment Guide for Hugging Face Spaces

📁 File Structure

Make sure your repository has the following structure:

your-space/
├── main.py                 # Main FastAPI application
├── app.py                  # Alternative entry point
├── requirements.txt        # Python dependencies
├── Dockerfile              # Docker configuration
├── README.md               # Space documentation
├── .gitignore              # Git ignore rules
├── .dockerignore           # Docker ignore rules
└── DEPLOYMENT_GUIDE.md     # This file
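Before pushing, the layout can be checked with a short script. This is a minimal sketch: the helper name `missing_files` is hypothetical, and the required-file list simply mirrors the tree shown in this section.

```python
from pathlib import Path

# File names taken from the structure in this guide; adjust if your Space differs.
REQUIRED_FILES = [
    "main.py",
    "app.py",
    "requirements.txt",
    "Dockerfile",
    "README.md",
    ".gitignore",
    ".dockerignore",
]


def missing_files(root="."):
    """Return the required files that are not present directly under `root`."""
    root_path = Path(root)
    return [name for name in REQUIRED_FILES if not (root_path / name).is_file()]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All expected files are present.")
```

Run it from the repository root; an empty result means every file in the list was found.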

🚀 Step-by-Step Deployment

1. Create a New Space

  1. Go to Hugging Face Spaces
  2. Click "Create new Space"
  3. Fill in the details:
    • Space name: plant-disease-api (or your preferred name)
    • License: Apache 2.0
    • SDK: Docker
    • Hardware: CPU Basic (upgrade to GPU if needed)
    • Visibility: Public or Private

2. Configure the Space

The README.md file already contains the necessary YAML frontmatter:

---
title: Plant Disease Prediction API
emoji: 🌱
colorFrom: green
colorTo: blue
sdk: docker
pinned: false
license: apache-2.0
app_port: 7860
---

3. Upload Files

You can either:

Option A: Git Clone and Push

git clone https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
cd YOUR_SPACE_NAME
# Copy all files to this directory
git add .
git commit -m "Initial deployment"
git push

Option B: Web Interface

  • Upload files directly through the Hugging Face web interface
  • Drag and drop or use the file upload feature

4. Environment Variables (Optional)

If you need to set custom environment variables:

  1. Go to your Space settings
  2. Add environment variables:
    • HF_MODEL_REPO: Your model repository
    • HF_MODEL_FILENAME: Your model filename
    • HF_HOME: Cache directory (default: /tmp/huggingface)
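On the application side, these variables could be read as in the sketch below. The variable names and the /tmp/huggingface default come from this guide; the function name `load_settings` and the choice to return None for unset values are assumptions, not the app's actual code.

```python
import os


def load_settings(env=None):
    """Read the model settings from environment variables.

    A value of None means the variable was not set, so the caller
    should fall back to the application's built-in default.
    """
    env = os.environ if env is None else env
    return {
        "model_repo": env.get("HF_MODEL_REPO"),
        "model_filename": env.get("HF_MODEL_FILENAME"),
        # Default cache directory as documented in this guide
        "cache_dir": env.get("HF_HOME", "/tmp/huggingface"),
    }
```

Passing a plain dictionary instead of `os.environ` makes the function easy to exercise in tests.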

5. Monitor Deployment

  1. Go to your Space page
  2. Check the "Logs" tab for build progress
  3. Wait for the status to change from "Building" to "Running"

🔧 Configuration Details

Port Configuration

  • Docker Spaces route traffic to the port set by app_port in the README.md frontmatter (7860 here, which is also the default)
  • The Dockerfile and the application are both configured to listen on this port

Model Loading

  • The model is downloaded from the Hugging Face Hub on first startup
  • Subsequent startups reuse the cached model, which is faster as long as the cache directory persists
  • Pre-warming the model at startup keeps the first prediction fast
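The load-once and pre-warm behavior described here could be structured roughly as follows. This is a sketch, not the application's actual code: `LazyModel`, `loader`, and `warmup` are hypothetical names. In a real app the loader would presumably download the model file from the Hub and load it with Keras, and the warmup would run one dummy prediction.

```python
import threading


class LazyModel:
    """Load a model once on first use, optionally warm it up, then reuse it."""

    def __init__(self, loader, warmup=None):
        self._loader = loader    # callable returning the loaded model
        self._warmup = warmup    # optional callable taking the model
        self._model = None
        self._lock = threading.Lock()

    def get(self):
        # Double-checked locking: only the first caller pays the load cost,
        # and concurrent first requests do not load the model twice.
        if self._model is None:
            with self._lock:
                if self._model is None:
                    model = self._loader()
                    if self._warmup is not None:
                        self._warmup(model)
                    self._model = model
        return self._model
```

Calling `get()` during application startup moves the download and warmup cost out of the first user request.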

Resource Requirements

  • Memory: ~2-3 GB for TensorFlow + model
  • CPU: Minimum 2 cores recommended
  • Storage: ~1 GB for model and dependencies

🐛 Troubleshooting

Common Issues

  1. Build Fails

    • Check the logs in the Space interface
    • Verify that all files were uploaded correctly
    • Ensure requirements.txt lists compatible package versions
  2. Model Loading Errors

    • Verify that HF_MODEL_REPO and HF_MODEL_FILENAME are correct
    • Check that the model exists and is accessible
    • Confirm the model format (it should be a .keras file)
  3. Memory Issues

    • Upgrade to a larger hardware tier
    • Optimize model loading in code
    • Clear unnecessary cache
  4. Port Issues

    • Ensure the application runs on port 7860
    • Check the Dockerfile EXPOSE directive
    • Verify app_port in the README.md frontmatter
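The port checks can be automated with a small script. This is a rough sketch under stated assumptions: the function names are hypothetical, the parsing is deliberately simple (it scans the whole README for an `app_port:` line rather than parsing the frontmatter properly), and it only compares the two files against the expected port.

```python
import re


def frontmatter_port(readme_text):
    """Return the app_port value found in the README, or None if absent."""
    match = re.search(r"^app_port:\s*(\d+)\s*$", readme_text, flags=re.MULTILINE)
    return int(match.group(1)) if match else None


def exposed_ports(dockerfile_text):
    """Return the ports named in EXPOSE lines (protocol suffixes are ignored)."""
    ports = []
    for line in dockerfile_text.splitlines():
        parts = line.strip().split()
        if parts and parts[0].upper() == "EXPOSE":
            for token in parts[1:]:
                number = token.split("/")[0]
                if number.isdigit():
                    ports.append(int(number))
    return ports


def port_problems(readme_text, dockerfile_text, expected=7860):
    """Return human-readable descriptions of any port mismatches."""
    problems = []
    app_port = frontmatter_port(readme_text)
    # A missing app_port is not flagged: Docker Spaces default to 7860.
    if app_port is not None and app_port != expected:
        problems.append(f"README app_port is {app_port}, expected {expected}")
    if expected not in exposed_ports(dockerfile_text):
        problems.append(f"Dockerfile does not EXPOSE {expected}")
    return problems
```

Read README.md and Dockerfile as text, pass both to `port_problems`, and an empty list means the two files agree on the expected port.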

Debug Commands

Add these to your main.py for debugging:

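import logging
import sys

import psutil  # third-party: add psutil to requirements.txt
import tensorflow as tf

# The root logger defaults to WARNING, so enable INFO to see these messages
logging.basicConfig(level=logging.INFO)

# Log system info
logging.info(f"Total memory: {psutil.virtual_memory().total / 1e9:.2f} GB")
logging.info(f"CPU cores: {psutil.cpu_count()}")
logging.info(f"Python version: {sys.version}")
logging.info(f"TensorFlow version: {tf.__version__}")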

📊 Testing Your Deployment

Health Check

curl https://YOUR_USERNAME-YOUR_SPACE_NAME.hf.space/health

Test Prediction

curl -X POST "https://YOUR_USERNAME-YOUR_SPACE_NAME.hf.space/predict" \
     -F "files=@your_test_image.jpg"

Interactive API Docs

Visit: https://YOUR_USERNAME-YOUR_SPACE_NAME.hf.space/docs
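
The health check can also be scripted, for example in a post-deploy smoke test. The sketch below uses only the standard library; the function names are hypothetical, the URL pattern is copied from the curl commands in this section, and the shape of the /health response body is not assumed.

```python
import urllib.request


def space_url(username, space_name, path="/"):
    """Build a Space URL following the pattern used in the curl commands."""
    return f"https://{username}-{space_name}.hf.space{path}"


def check_health(username, space_name, timeout=10):
    """Call /health and return (HTTP status, response body as text).

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses and
    urllib.error.URLError when the Space cannot be reached.
    """
    url = space_url(username, space_name, "/health")
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")


# Example usage (performs a real network request):
# status, body = check_health("YOUR_USERNAME", "YOUR_SPACE_NAME")
# print(status, body)
```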

🔄 Updates and Maintenance

Updating Your Space

  1. Make changes to your local files
  2. Push to the Space repository
  3. Space will automatically rebuild and redeploy

Monitoring Performance

  • Check Space logs regularly
  • Monitor response times
  • Watch for memory usage spikes

Scaling Options

  • Upgrade hardware tier for better performance
  • Consider GPU hardware for faster inference
  • Implement caching for frequently used predictions

🔒 Security Considerations

  • Keep your Space public if clients need to call the API without a Hugging Face token
  • Don't include sensitive credentials in code
  • Use environment variables for configuration, and Space secrets for sensitive values
  • Monitor usage to prevent abuse

📈 Performance Optimization

Model Optimization

  • Use model quantization for smaller size
  • Implement model pruning if needed
  • Cache predictions when possible
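One way to cache predictions is to key them on a hash of the uploaded image bytes, so identical uploads skip inference. The sketch below is illustrative only: `PredictionCache` and `get_or_compute` are hypothetical names, `predict` stands for whatever function the app uses to run the model, and the cache lives in process memory, so it is lost on restart.

```python
import hashlib
from collections import OrderedDict


class PredictionCache:
    """Small in-memory LRU cache keyed by the SHA-256 digest of the image bytes."""

    def __init__(self, max_entries=256):
        self.max_entries = max_entries
        self._entries = OrderedDict()

    def get_or_compute(self, image_bytes, predict):
        key = hashlib.sha256(image_bytes).hexdigest()
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key]
        result = predict(image_bytes)
        self._entries[key] = result
        if len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # evict the least recently used entry
        return result
```

Bounding the cache with `max_entries` matters on a Space with only a few GB of memory, since cached results compete with the model for RAM.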

API Optimization

  • Add request rate limiting
  • Implement response caching
  • Optimize image preprocessing
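
Request rate limiting can be sketched with a per-client sliding window. The class below is an assumption-laden illustration, not part of the app: the name `SlidingWindowLimiter`, the default limits, and the idea of using a client identifier such as an IP address are all hypothetical, and the state is per process only.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` per client within `window_seconds`."""

    def __init__(self, max_requests=30, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock                 # injectable for testing
        self._hits = defaultdict(deque)     # client id -> request timestamps

    def allow(self, client_id):
        now = self._clock()
        hits = self._hits[client_id]
        # Drop timestamps that have aged out of the window
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

In a FastAPI handler, a request for which `allow()` returns False would typically be answered with HTTP 429 (Too Many Requests).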

Need Help?