
Bug Tracker: HuggingFace Spaces Deployment

This directory tracks bugs found during deployment to HuggingFace Spaces.

Active Bugs

None currently.

Fixed Bugs

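| ID  | Title                                    | Severity | Status |
|-----|------------------------------------------|----------|--------|
| 001 | CORS regex blocking static file requests | Critical | FIXED  |
| 002 | HTTP vs HTTPS URL mismatch behind proxy  | High     | FIXED  |
| 003 | Gateway timeout for long ML inference    | Medium   | FIXED  |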

HF Spaces Deployment Checklist

Last audit: 2025-12-12

| Check | Status | Notes |
|-------|--------|-------|
| CORS regex matches both URL formats | PASS | r"https://.*stroke-viewer-frontend.*\.hf\.space" |
| All URLs use HTTPS | PASS | --proxy-headers flag in Dockerfile |
| File outputs to /tmp/ | PASS | Uses /tmp/stroke-results/ |
| Static files mounted after dir exists | PASS | mkdir() before app.mount() in main.py |
| HF_SPACES env var set | PASS | Set in Dockerfile |
| Using port 7860 | PASS | Configured in Dockerfile CMD |
| Inference timeout handled | PASS | Async job queue pattern (no timeout risk) |
| Error responses return JSON | PASS | HTTPException with detail |
| CORS preflight (OPTIONS) handled | PASS | CORSMiddleware handles automatically |
| Progress updates for long tasks | PASS | Polling with ProgressIndicator component |

Common HuggingFace Spaces Pitfalls

Based on research and experience, here are common issues to watch for:

1. CORS Configuration

  • HF Spaces URLs use single hyphens: {username}-{spacename}.hf.space
  • Proxy/embed URLs may use double hyphens: {username}--{spacename}--{hash}.hf.space
  • Use an origin regex loose enough to match both formats (see the pattern in the deployment checklist)
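
As a quick sanity check of the two URL formats above, the checklist pattern can be exercised with Python's standard `re` module. This is a minimal sketch, not the project's actual middleware setup: the `origin_allowed` helper and the example origins are hypothetical, and the sketch uses `fullmatch` on the assumption that the whole Origin value must match; the matching behavior of the CORS middleware version in use should be verified separately.

```python
import re

# Pattern quoted in the deployment checklist.
ORIGIN_PATTERN = re.compile(r"https://.*stroke-viewer-frontend.*\.hf\.space")


def origin_allowed(origin: str) -> bool:
    """Return True if the whole Origin header value matches the pattern."""
    return ORIGIN_PATTERN.fullmatch(origin) is not None


# Hypothetical origins, used only to exercise both URL formats.
direct = "https://someuser-stroke-viewer-frontend.hf.space"
embed = "https://someuser--stroke-viewer-frontend--abc123.hf.space"

for origin in (direct, embed, "http://someuser-stroke-viewer-frontend.hf.space"):
    print(origin, origin_allowed(origin))
```

The plain-HTTP origin is rejected because the pattern requires the `https://` scheme.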

2. HTTPS Behind Proxy

  • HF Spaces terminates SSL at their proxy
  • Uvicorn sees HTTP internally
  • Add --proxy-headers to trust X-Forwarded-Proto
  • Or explicitly set BACKEND_PUBLIC_URL environment variable
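
The two options above can be combined when building absolute URLs for the frontend. The following is a rough sketch under stated assumptions: `public_base_url` is a hypothetical helper (not taken from the project's code), header names are assumed to be lower-cased already, and an explicit `BACKEND_PUBLIC_URL` is assumed to take priority over the forwarded scheme.

```python
import os
from typing import Mapping, Optional


def public_base_url(
    headers: Mapping[str, str],
    host: str,
    internal_scheme: str = "http",
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Work out the externally visible base URL of the backend."""
    env = os.environ if env is None else env

    # Assumption: an explicit override wins over anything inferred.
    override = env.get("BACKEND_PUBLIC_URL")
    if override:
        return override.rstrip("/")

    # The proxy terminates TLS, so the app itself only sees plain HTTP.
    # X-Forwarded-Proto carries the scheme the client actually used.
    forwarded = headers.get("x-forwarded-proto", internal_scheme)
    scheme = forwarded.split(",")[0].strip()  # first hop if several are listed
    return f"{scheme}://{host}"
```

With `--proxy-headers` enabled, Uvicorn performs the forwarded-scheme handling itself, so a helper like this would only matter where URLs are assembled by hand.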

3. File System Restrictions

  • Only /tmp is writable
  • Use /tmp/stroke-results for output files
  • Ensure directories are created with proper permissions
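
A minimal sketch of creating the output directory before anything writes to it (or mounts it) might look like this. The `ensure_job_dir` helper and its per-job layout are illustrative assumptions; only the `/tmp/stroke-results` root comes from the notes above.

```python
from pathlib import Path

RESULTS_ROOT = Path("/tmp/stroke-results")


def ensure_job_dir(job_id: str, root: Path = RESULTS_ROOT) -> Path:
    """Create (if needed) and return the output directory for one job."""
    job_dir = root / job_id
    # parents=True creates the root on first use; exist_ok=True makes the
    # call safe to repeat. The default mode is still subject to the umask.
    job_dir.mkdir(parents=True, exist_ok=True)
    return job_dir
```

Calling the same function at startup for the root directory is one way to guarantee the directory exists before static files are mounted.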

4. Static Files

  • Mount static files AFTER directory exists
  • Ensure CORS allows file fetches from frontend origin
  • Files served from /files/... must be accessible

5. Environment Variables

  • HF_SPACES=1 indicates running on HF Spaces
  • SPACE_ID contains the space identifier
  • Use these to detect production environment
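
Environment detection along these lines could be sketched as follows. The helper name `running_on_hf_spaces` is hypothetical, and treating a non-empty `SPACE_ID` as sufficient on its own is an assumption rather than documented project behavior.

```python
import os
from typing import Mapping, Optional


def running_on_hf_spaces(env: Optional[Mapping[str, str]] = None) -> bool:
    """Best-effort check for the HF Spaces production environment."""
    env = os.environ if env is None else env
    # HF_SPACES=1 is set in the Dockerfile; SPACE_ID is used as a fallback.
    return env.get("HF_SPACES") == "1" or bool(env.get("SPACE_ID"))
```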

6. Gateway Timeouts (SOLVED)

  • HF Spaces proxy has ~60 second timeout
  • Solution: Async job queue pattern with polling
  • POST returns immediately with job ID
  • Frontend polls GET /api/jobs/{id} for progress
  • See Bug 003 and Spec
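
The async job pattern described above can be illustrated with a small in-memory sketch. This is not the project's implementation: the function names, the dictionary-based job store, and the "failed" status are assumptions made for illustration, and a real endpoint would wrap `create_job` in the POST handler and `get_job` in the GET handler.

```python
import threading
import uuid
from typing import Any, Callable, Dict

# In-memory job store (assumption: a single process, no persistence).
_jobs: Dict[str, Dict[str, Any]] = {}
_lock = threading.Lock()

ProgressFn = Callable[[int, str], None]


def _update(job_id: str, **fields: Any) -> None:
    with _lock:
        _jobs[job_id].update(fields)


def create_job(work: Callable[[ProgressFn], Any]) -> str:
    """Register a job, start it in a background thread, return its ID at once."""
    job_id = uuid.uuid4().hex
    with _lock:
        _jobs[job_id] = {
            "status": "pending",
            "progress": 0,
            "progressMessage": "",
            "result": None,
            "error": None,
        }

    def report(progress: int, message: str) -> None:
        _update(job_id, progress=progress, progressMessage=message)

    def runner() -> None:
        _update(job_id, status="running")
        try:
            result = work(report)
            _update(job_id, status="completed", progress=100, result=result)
        except Exception as exc:  # surface failures to the poller
            _update(job_id, status="failed", error=str(exc))

    threading.Thread(target=runner, daemon=True).start()
    return job_id


def get_job(job_id: str) -> Dict[str, Any]:
    """Return a snapshot of the job record for a status request."""
    with _lock:
        return dict(_jobs[job_id])
```

Because `create_job` returns before the work finishes, the request that starts the job stays well under the proxy timeout regardless of how long inference takes.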

E2E Flow (v2.0 - Async Job Pattern)

The complete flow from frontend to backend and back:

1. Frontend loads
   ├── CaseSelector fetches GET /api/cases
   ├── CORS: origin regex must match frontend URL
   └── Response: JSON list of case IDs

2. User runs segmentation
   ├── App calls POST /api/segment {case_id, fast_mode}
   ├── Backend creates job record
   └── Response: 202 Accepted + {jobId, status: "pending"}

3. Frontend polls for status
   ├── GET /api/jobs/{jobId} every 2 seconds
   ├── Response: {status, progress, progressMessage}
   └── ProgressIndicator shows real-time updates

4. Backend processes (in background thread)
   ├── Job status: "running"
   ├── Progress updates: 10% → 30% → 85% → 95%
   ├── Runs DeepISLES inference
   └── Writes results to /tmp/stroke-results/{jobId}/

5. Job completes
   ├── Status: "completed"
   ├── Result includes file URLs
   └── Frontend stops polling

6. Frontend receives result
   ├── Updates state with URLs
   ├── Passes URLs to NiiVueViewer
   └── Shows metrics in MetricsPanel

7. NiiVue fetches static files
   ├── Cross-origin fetch to backend /files/...
   ├── CORS headers on static file response
   └── Binary NIfTI files download

8. Viewer displays
   └── NIfTI volumes rendered in WebGL canvas
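
The polling part of this flow (steps 3 and 5) can be sketched as a simple loop. The real frontend is a browser application, so the Python below only illustrates the logic. `poll_job` and its injectable `fetch_status` callable are hypothetical, and the "failed" status and overall timeout are assumptions not stated in the flow above.

```python
import time
from typing import Any, Callable, Dict


def poll_job(
    fetch_status: Callable[[str], Dict[str, Any]],
    job_id: str,
    interval: float = 2.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll a job until it reaches a terminal status, then return its record.

    `fetch_status` stands in for a GET /api/jobs/{jobId} request.
    """
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_status(job_id)
        if job["status"] in ("completed", "failed"):
            return job  # terminal state: stop polling
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout}s")
        sleep(interval)  # wait before the next status request
```

Passing `fetch_status` and `sleep` as parameters keeps the loop testable without a running backend.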

API Endpoints (v2.0)

| Method | Endpoint                  | Description                            |
|--------|---------------------------|----------------------------------------|
| GET    | /api/cases                | List available cases                   |
| POST   | /api/segment              | Create segmentation job (202 Accepted) |
| GET    | /api/jobs/{id}            | Get job status/progress/results        |
| GET    | /files/{jobId}/{caseId}/* | Static NIfTI files                     |
| GET    | /                         | Health check                           |
| GET    | /health                   | Detailed health with job count         |
