HuggingFace Spaces CI/CD Configuration

🤗 HuggingFace Native CI/CD Options

1. Space Webhooks & Auto-Deploy

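A Space rebuilds automatically when a commit is pushed to its Git repository. That covers:

  • Application code changes
  • Dependency updates (for example, an edited requirements.txt)
  • Configuration changes (for example, the metadata block in README.md)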

2. Health Checks & Monitoring

Built-in capabilities:

  • Automatic restart on crashes
  • Memory usage monitoring
  • Build status tracking
  • Runtime error logging
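
Build status can also be polled from a script rather than watched in the UI. The sketch below waits until a Space reports the RUNNING stage. The name wait_for_stage and its parameters are hypothetical, and the stage names are assumed to match the strings exposed by huggingface_hub's get_space_runtime. The stage source is passed in as a callable so the logic can be tried without network access.

```python
import time
from typing import Callable

# Stages treated as terminal failures (assumed from huggingface_hub's SpaceStage values).
FAILED_STAGES = {"BUILD_ERROR", "CONFIG_ERROR", "RUNTIME_ERROR", "NO_APP_FILE"}


def wait_for_stage(get_stage: Callable[[], str],
                   target: str = "RUNNING",
                   timeout: float = 600.0,
                   interval: float = 15.0,
                   sleep: Callable[[float], None] = time.sleep,
                   clock: Callable[[], float] = time.monotonic) -> str:
    """Poll get_stage() until it returns target; raise on a failure stage or timeout."""
    deadline = clock() + timeout
    while True:
        stage = get_stage()
        if stage == target:
            return stage
        if stage in FAILED_STAGES:
            raise RuntimeError(f"Space entered failure stage: {stage}")
        if clock() >= deadline:
            raise TimeoutError(f"Space still in stage {stage} after {timeout}s")
        # Not ready yet: wait before asking again
        sleep(interval)
```

With huggingface_hub installed, the callable could be something like lambda: HfApi().get_space_runtime("your-org/your-space-staging").stage.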

3. Custom Build Scripts

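HF Spaces can run custom startup automation when the script is wired in explicitly, for example as the CMD of a Docker Space's Dockerfile. The .hf/startup.sh path below is this project's convention; Spaces is not known to run it automatically.

#!/bin/bash
# .hf/startup.sh - Runs during Space startup
set -euo pipefail  # Abort startup if any step fails

echo "🚀 Starting HuggingFace Space with custom setup..."

# Install additional dependencies
pip install -r requirements.txt

# Run custom validation
python scripts/validate_services.py

# Start health monitoring in the background
python scripts/hf_health_monitor.py &

# Start the main application (exec so it receives container signals)
exec python app.py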

4. Environment-Based Testing

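# .hf.yml configuration for testing (project convention)
# Spaces takes variables from the Space's Settings page (Variables and secrets),
# so mirror these values there for them to reach the running app.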
variables:
  ENVIRONMENT: "production"
  RUN_TESTS_ON_STARTUP: "true"
  TEST_TIMEOUT: "300"
  HEALTH_CHECK_INTERVAL: "60"
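
Because every value arrives in the application as a string, it helps to parse the variables once at startup. The following is a minimal sketch, assuming the four variable names listed in the configuration; StartupConfig and load_config are hypothetical names, not part of any HuggingFace API.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class StartupConfig:
    environment: str
    run_tests_on_startup: bool
    test_timeout: int           # seconds
    health_check_interval: int  # seconds


def load_config(env: Optional[Mapping[str, str]] = None) -> StartupConfig:
    """Build a StartupConfig from environment variables, falling back to the documented values."""
    env = os.environ if env is None else env
    return StartupConfig(
        environment=env.get("ENVIRONMENT", "production"),
        # Only the literal string "true" (any case) enables startup tests
        run_tests_on_startup=env.get("RUN_TESTS_ON_STARTUP", "true").strip().lower() == "true",
        test_timeout=int(env.get("TEST_TIMEOUT", "300")),
        health_check_interval=int(env.get("HEALTH_CHECK_INTERVAL", "60")),
    )
```

Passing a plain dictionary instead of os.environ keeps the parser easy to test.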

5. Multi-Space Deployment Pipeline

  • Development Space: Auto-deploy from feature branches
  • Staging Space: Auto-deploy from main branch
  • Production Space: Manual promotion after validation
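
The routing rule behind this pipeline can be written down as a small function. This is a sketch under stated assumptions: feature branches are assumed to start with feature/, the development Space id is hypothetical, and target_space is an illustrative name.

```python
from typing import Optional

# Space ids are placeholders; the development id is hypothetical.
DEV_SPACE = "your-org/your-space-dev"
STAGING_SPACE = "your-org/your-space-staging"
PRODUCTION_SPACE = "your-org/your-space"


def target_space(git_ref: str, promote: bool = False) -> Optional[str]:
    """Return the Space a push to git_ref should deploy to, or None for no deployment."""
    branch = git_ref.removeprefix("refs/heads/")  # requires Python 3.9+
    if branch == "main":
        # main goes to staging; production only on explicit (manual) promotion
        return PRODUCTION_SPACE if promote else STAGING_SPACE
    if branch.startswith("feature/"):
        return DEV_SPACE
    return None
```

Keeping promotion behind an explicit flag mirrors the manual step described for the production Space.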

🔧 HuggingFace Actions (Third-Party)

GitHub Actions for HF Spaces

# .github/workflows/hf-spaces-ci.yml
name: HuggingFace Spaces CI/CD

on:
  push:
    branches: [main]

jobs:
  deploy-to-hf-staging:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Deploy to HF Staging
        # Action name as given in this guide; confirm it is published before relying on it
        uses: huggingface/hf-space-action@v1
        with:
          space-id: 'your-org/your-space-staging'
          hf-token: ${{ secrets.HF_TOKEN }}

  run-hf-tests:
    needs: deploy-to-hf-staging
    runs-on: ubuntu-latest
    steps:
      - name: Test HF Space
        run: |
          # Poll the health endpoint until the Space is ready (up to about 5 minutes)
          curl -f --retry 10 --retry-delay 30 --retry-all-errors \
            https://your-org-your-space-staging.hf.space/health

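  promote-to-production:
    needs: run-hf-tests
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    # Gate on a GitHub environment; add required reviewers to it for manual promotion
    environment: production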
    steps:
      - name: Deploy to Production
        uses: huggingface/hf-space-action@v1
        with:
          space-id: 'your-org/your-space'
          hf-token: ${{ secrets.HF_TOKEN }}
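
The deployment step can also be scripted in Python instead of relying on a marketplace action. The sketch below assumes the huggingface_hub package, whose HfApi.upload_folder accepts repo_type="space". The function deploy_space and the ignore patterns are hypothetical, and the API object is passed in so the function can be tried with a stub.

```python
from typing import Any, Sequence

# Hypothetical patterns for files to leave out of the upload.
DEFAULT_IGNORE = (".git/*", "*.pyc", "__pycache__/*")


def deploy_space(api: Any, space_id: str, folder: str = ".",
                 ignore: Sequence[str] = DEFAULT_IGNORE,
                 message: str = "Deploy from CI") -> Any:
    """Upload folder to the Space repository space_id; the resulting commit triggers a rebuild."""
    if "/" not in space_id:
        raise ValueError("space_id must look like 'owner/name'")
    # api is expected to behave like huggingface_hub.HfApi
    return api.upload_folder(
        folder_path=folder,
        repo_id=space_id,
        repo_type="space",
        ignore_patterns=list(ignore),
        commit_message=message,
    )
```

In CI this might be called as deploy_space(HfApi(token=os.environ["HF_TOKEN"]), "your-org/your-space-staging").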

🛠️ Custom HF Space Automation

Space Build Hooks

# scripts/hf_build_hooks.py
"""
Custom build hooks for HuggingFace Spaces
"""
import os
import subprocess
import sys
import time

import requests


def pre_build_validation():
    """Run validation before space builds"""
    print("🔍 Running pre-build validation...")

    # Run the end-to-end tests with the same interpreter that runs this script
    result = subprocess.run([sys.executable, 'scripts/test_e2e_pipeline.py'],
                            capture_output=True, text=True)

    if result.returncode != 0:
        print("❌ Pre-build tests failed!")
        print(result.stderr)
        sys.exit(1)

    print("✅ Pre-build validation passed!")


def post_deploy_health_check(attempts=10, delay=30):
    """Health check after deployment"""
    space_url = os.getenv('SPACE_URL', 'http://localhost:7860')

    for attempt in range(1, attempts + 1):
        try:
            response = requests.get(f"{space_url}/health", timeout=30)
            if response.status_code == 200:
                print("✅ Health check passed!")
                return
            print(f"⏳ Health check attempt {attempt} returned HTTP {response.status_code}")
        except requests.RequestException as e:
            print(f"⏳ Health check attempt {attempt} failed: {e}")
        # Wait before retrying, whether the request errored or returned non-200
        if attempt < attempts:
            time.sleep(delay)

    print(f"❌ Health check failed after {attempts} attempts!")
    sys.exit(1)


if __name__ == "__main__":
    # BUILD_STAGE selects which hook to run: 'pre' or 'post'
    stage = os.getenv('BUILD_STAGE')
    if stage == 'pre':
        pre_build_validation()
    elif stage == 'post':
        post_deploy_health_check()

📊 Monitoring & Alerting

Space Health Monitor

# scripts/hf_health_monitor.py
"""
Continuous health monitoring for HF Spaces
"""
import logging
import os
import time
from datetime import datetime

import psutil
import requests


class HFSpaceMonitor:
    def __init__(self):
        self.check_interval = int(os.getenv('HEALTH_CHECK_INTERVAL', 60))
        self.webhook_url = os.getenv('SLACK_WEBHOOK_URL')

    def check_health(self):
        """Check space health"""
        try:
            # Check memory usage
            memory_percent = psutil.virtual_memory().percent

            # Check disk usage
            disk_percent = psutil.disk_usage('/').percent

            # Check application health
            response = requests.get('http://localhost:7860/health', timeout=10)

            if memory_percent > 90 or disk_percent > 90 or response.status_code != 200:
                self.alert(f"Health check failed: Memory={memory_percent}%, Disk={disk_percent}%, HTTP={response.status_code}")
            else:
                logging.info(f"✅ Health check passed: Memory={memory_percent}%, Disk={disk_percent}%")

        except Exception as e:
            self.alert(f"Health check error: {e}")

    def alert(self, message):
        """Send alert notification"""
        logging.error(message)

        if self.webhook_url:
            payload = {
                "text": f"🚨 HF Space Alert: {message}",
                "timestamp": datetime.now().isoformat()
            }
            try:
                requests.post(self.webhook_url, json=payload, timeout=10)
            except requests.RequestException as e:
                # A failed notification must not stop the monitoring loop
                logging.error(f"Could not send alert: {e}")

    def run(self):
        """Start monitoring loop"""
        while True:
            self.check_health()
            time.sleep(self.check_interval)


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)  # Make info-level results visible
    monitor = HFSpaceMonitor()
    monitor.run()
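
The threshold decision inside check_health can be separated from the I/O so that it is testable on its own. The sketch below is one possible way to do that; health_problems is a hypothetical helper, and the 90% default simply mirrors the limit used in the monitor.

```python
from typing import List


def health_problems(memory_percent: float, disk_percent: float,
                    status_code: int, threshold: float = 90.0) -> List[str]:
    """Return a description for each failed check; an empty list means healthy."""
    problems = []
    if memory_percent > threshold:
        problems.append(f"memory at {memory_percent:.0f}%")
    if disk_percent > threshold:
        problems.append(f"disk at {disk_percent:.0f}%")
    if status_code != 200:
        problems.append(f"HTTP {status_code}")
    return problems
```

The monitor could then alert only when the returned list is non-empty and join its entries into the alert message.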