# Bug Tracker: HuggingFace Spaces Deployment

This directory tracks bugs found during deployment to HuggingFace Spaces.

## Active Bugs

| ID | Title | Severity | Status |
|----|-------|----------|--------|
| [004](./004-staticfiles-cors-middleware-not-applied.md) | CORS/CORP middleware not applied to mounted StaticFiles | **CRITICAL** | OPEN - Awaiting Senior Review |

## Fixed Bugs

| ID | Title | Severity | Status |
|----|-------|----------|--------|
| [001](./001-cors-static-files-hf-spaces.md) | CORS regex blocking static file requests | Critical | FIXED |
| [002](./002-http-vs-https-proxy-headers.md) | HTTP vs HTTPS URL mismatch behind proxy | High | FIXED |
| [003](./003-gateway-timeout-long-inference.md) | Gateway timeout for long ML inference | Medium | FIXED |

## HF Spaces Deployment Checklist

Last audit: 2025-12-12

| Check | Status | Notes |
|-------|--------|-------|
| CORS regex matches both URL formats | N/A | Replaced with exact-match list (PR #38) |
| **CORS on StaticFiles mount** | **FAIL** | BUG-004: Middleware doesn't apply to mounted apps |
| All URLs use HTTPS | PASS | `--proxy-headers` flag in Dockerfile |
| File outputs to /tmp/ | PASS | Uses `/tmp/stroke-results/` |
| Static files mounted after dir exists | PASS | `mkdir()` before `app.mount()` in main.py |
| HF_SPACES env var set | PASS | Set in Dockerfile |
| Using port 7860 | PASS | Configured in Dockerfile CMD |
| Inference timeout handled | PASS | Async job queue pattern (no timeout risk) |
| Error responses return JSON | PASS | HTTPException with detail |
| CORS preflight (OPTIONS) handled | PASS | CORSMiddleware handles automatically |
| Progress updates for long tasks | PASS | Polling with ProgressIndicator component |

## Common HuggingFace Spaces Pitfalls

Based on research and experience, here are common issues to watch for:

### 1. CORS Configuration
- HF Spaces URLs use single hyphens: `{username}-{spacename}.hf.space`
- Proxy/embed URLs may use double hyphens: `{username}--{spacename}--{hash}.hf.space`
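- Allow both formats, either with an origin regex anchored to `.hf.space` or with an exact-match origin list (this project switched to an exact-match list in PR #38)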
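
As a rough illustration of the two hostname shapes above, the following sketch checks an origin with the standard library only. The pattern and the `is_allowed_origin` helper are hypothetical and are not taken from this project's code; lowercase hostnames are assumed.

```python
import re

# Hypothetical pattern (an assumption, not the project's config). It accepts
#   https://{username}-{spacename}.hf.space
#   https://{username}--{spacename}--{hash}.hf.space
# and is applied with fullmatch so look-alike hosts such as
# "x-y.hf.space.evil.com" are rejected.
HF_SPACE_ORIGIN = re.compile(r"https://[a-z0-9]+(?:-{1,2}[a-z0-9]+)+\.hf\.space")


def is_allowed_origin(origin: str) -> bool:
    """Return True if the origin looks like an HF Spaces URL in either format."""
    return HF_SPACE_ORIGIN.fullmatch(origin) is not None
```

An exact-match list is stricter than any regex, so a pattern like this is best treated as a fallback for origins that cannot be enumerated ahead of time.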

### 2. HTTPS Behind Proxy
- HF Spaces terminates SSL at their proxy
- Uvicorn sees HTTP internally
- Add `--proxy-headers` to trust `X-Forwarded-Proto`
- Or explicitly set `BACKEND_PUBLIC_URL` environment variable
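
The two options above can be combined when building file URLs by hand. This is a minimal sketch under stated assumptions: `public_base_url` is a hypothetical helper, header keys are assumed to be lower-cased, and the explicit override is assumed to take priority.

```python
import os
from typing import Mapping


def public_base_url(headers: Mapping[str, str], host: str, scheme: str = "http") -> str:
    """Pick the externally visible base URL for links returned to the browser."""
    # An explicit override wins; BACKEND_PUBLIC_URL is the variable named above.
    override = os.environ.get("BACKEND_PUBLIC_URL")
    if override:
        return override.rstrip("/")
    # Otherwise fall back to the proxy's X-Forwarded-Proto header, defaulting
    # to the scheme the server itself observed.
    forwarded = headers.get("x-forwarded-proto", scheme)
    # The header can hold a comma-separated list when proxies are chained;
    # the first entry is the scheme the client used.
    proto = forwarded.split(",")[0].strip()
    return f"{proto}://{host}"
```

When Uvicorn runs with `--proxy-headers`, the request scheme is already corrected by the server, so a helper like this only matters for code that assembles URLs outside the request object.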

### 3. File System Restrictions
- Only `/tmp` is writable
- Use `/tmp/stroke-results` for output files
- Ensure directories are created with proper permissions
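
A small sketch of creating the output directory defensively. `ensure_results_dir` and `RESULTS_ROOT` are hypothetical names used for illustration; only the `/tmp/stroke-results` path comes from this document.

```python
import os
from pathlib import Path

# Path taken from the notes above; the constant name is an assumption.
RESULTS_ROOT = Path("/tmp/stroke-results")


def ensure_results_dir(path: Path) -> Path:
    """Create the directory (and parents) if missing and verify it is writable."""
    path.mkdir(parents=True, exist_ok=True)
    if not os.access(path, os.W_OK):
        raise PermissionError(f"{path} exists but is not writable")
    return path
```

Calling `ensure_results_dir(RESULTS_ROOT / job_id)` before writing keeps each job's files separate, where `job_id` stands for whatever identifier the caller uses.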

### 4. Static Files
- Mount static files AFTER directory exists
- Ensure CORS allows file fetches from frontend origin
- Files served from `/files/...` must be accessible
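
One possible way to make sure file responses carry cross-origin headers is to wrap the static-file app in a thin ASGI layer. The sketch below is illustrative only and is not the fix adopted for BUG-004; `StaticCORSHeaders` is a hypothetical class, and the allowed origin must be supplied by the caller.

```python
class StaticCORSHeaders:
    """Wrap any ASGI app and append CORS/CORP headers to its HTTP responses."""

    def __init__(self, app, allow_origin: str):
        self.app = app
        self.extra = [
            (b"access-control-allow-origin", allow_origin.encode("latin-1")),
            (b"cross-origin-resource-policy", b"cross-origin"),
            # Tell caches that the response depends on the request's Origin.
            (b"vary", b"Origin"),
        ]

    async def __call__(self, scope, receive, send):
        # Only HTTP responses have headers to modify; pass everything else through.
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        async def send_with_headers(message):
            if message["type"] == "http.response.start":
                headers = list(message.get("headers", []))
                message = {**message, "headers": headers + self.extra}
            await send(message)

        await self.app(scope, receive, send_with_headers)
```

Because a mount accepts any ASGI application, the wrapped object could be passed where the plain `StaticFiles` instance is mounted today. The wrapper does not check whether the inner app already set the same headers, so it should not be stacked on top of another layer that adds them.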

### 5. Environment Variables
- `HF_SPACES=1` indicates running on HF Spaces
- `SPACE_ID` contains the space identifier
- Use these to detect production environment
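
Detection can be as simple as the following sketch. The helper name `running_on_hf_spaces` is hypothetical, and treating a non-empty `SPACE_ID` as sufficient is an assumption.

```python
import os
from typing import Mapping, Optional


def running_on_hf_spaces(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when the environment looks like an HF Spaces container."""
    # Accept an explicit mapping so the check is easy to unit-test.
    env = os.environ if env is None else env
    return env.get("HF_SPACES") == "1" or bool(env.get("SPACE_ID"))
```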

### 6. chmod "Operation not permitted" Warnings (HARMLESS)
DeepISLES tries to chmod model weight files but fails due to container permissions:
```
chmod: changing permissions of '/app/weights/SEALS/...': Operation not permitted
```
These are **benign warnings**, not errors. The container can still READ the files.

### 7. Gateway Timeouts (SOLVED)
- HF Spaces proxy has ~60 second timeout
- Solution: Async job queue pattern with polling
- POST returns immediately with job ID
- Frontend polls GET /api/jobs/{id} for progress
- See [Bug 003](./003-gateway-timeout-long-inference.md) and [Spec](../specs/async-job-queue.md)
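
The pattern above can be sketched with an in-memory job table and a background thread. This is a simplified illustration, not the project's implementation: `create_job`, `get_job`, and the `"failed"` status are assumptions, while the `status`, `progress`, and `progressMessage` fields follow the names used in this document.

```python
import threading
import uuid
from typing import Callable, Dict

_jobs: Dict[str, dict] = {}
_lock = threading.Lock()


def create_job(work: Callable[[Callable[[int, str], None]], dict]) -> str:
    """Register a job, start it in a background thread, and return its ID at once."""
    job_id = uuid.uuid4().hex
    with _lock:
        _jobs[job_id] = {"status": "pending", "progress": 0,
                         "progressMessage": "", "result": None}

    def report(progress: int, message: str) -> None:
        # Called by the worker to publish progress for pollers.
        with _lock:
            _jobs[job_id].update(progress=progress, progressMessage=message)

    def run() -> None:
        with _lock:
            _jobs[job_id]["status"] = "running"
        try:
            result = work(report)
            with _lock:
                _jobs[job_id].update(status="completed", progress=100, result=result)
        except Exception as exc:  # "failed" is an assumed status name
            with _lock:
                _jobs[job_id].update(status="failed", progressMessage=str(exc))

    threading.Thread(target=run, daemon=True).start()
    return job_id


def get_job(job_id: str) -> dict:
    """Return a snapshot of the job record (raises KeyError for unknown IDs)."""
    with _lock:
        return dict(_jobs[job_id])
```

The POST handler would call `create_job` and return the ID immediately, so the request finishes well within the proxy timeout, and the GET handler would return `get_job(job_id)`. An in-memory table is lost on restart, which is acceptable for a sketch but worth noting.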

## E2E Flow (v2.0 - Async Job Pattern)

The complete flow from frontend to backend and back:

```text
1. Frontend loads
   ├── CaseSelector fetches GET /api/cases
   ├── CORS: origin regex must match frontend URL
   └── Response: JSON list of case IDs

2. User runs segmentation
   ├── App calls POST /api/segment {case_id, fast_mode}
   ├── Backend creates job record
   └── Response: 202 Accepted + {jobId, status: "pending"}

3. Frontend polls for status
   ├── GET /api/jobs/{jobId} every 2 seconds
   ├── Response: {status, progress, progressMessage}
   └── ProgressIndicator shows real-time updates

4. Backend processes (in background thread)
   ├── Job status: "running"
   ├── Progress updates: 10% → 30% → 85% → 95%
   ├── Runs DeepISLES inference
   └── Writes results to /tmp/stroke-results/{jobId}/

5. Job completes
   ├── Status: "completed"
   ├── Result includes file URLs
   └── Frontend stops polling

6. Frontend receives result
   ├── Updates state with URLs
   ├── Passes URLs to NiiVueViewer
   └── Shows metrics in MetricsPanel

7. NiiVue fetches static files
   ├── Cross-origin fetch to backend /files/...
   ├── ⚠️ BUG-004: StaticFiles mount doesn't get CORS headers!
   ├── Browser blocks fetch (no Access-Control-Allow-Origin)
   └── "Failed to load volume: Failed to fetch"

8. Viewer displays
   └── NIfTI volumes rendered in WebGL canvas
```
```
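
Steps 3 and 5 of the flow amount to a polling loop with a stop condition. The sketch below expresses that loop in Python for consistency with the backend; the real frontend is a browser app, and `poll_until_done`, its parameters, and the `"failed"` status are assumptions made for illustration.

```python
import time
from typing import Callable


def poll_until_done(
    fetch_status: Callable[[], dict],
    interval: float = 2.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status every `interval` seconds until the job finishes."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_status()
        # Stop polling once the job reaches a terminal state.
        if job["status"] in ("completed", "failed"):
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError("job did not finish before the polling timeout")
        sleep(interval)
```

Here `fetch_status` stands in for whatever performs `GET /api/jobs/{jobId}` and parses the JSON body. Injecting `sleep` keeps the loop testable without real delays.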

## API Endpoints (v2.0)

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | /api/cases | List available cases |
| POST | /api/segment | Create segmentation job (202 Accepted) |
| GET | /api/jobs/{id} | Get job status/progress/results |
| GET | /files/{jobId}/{caseId}/* | Static NIfTI files |
| GET | / | Health check |
| GET | /health | Detailed health with job count |

## Sources

- [Deploying FastAPI on HuggingFace Spaces](https://huggingface.co/blog/HemanthSai7/deploy-applications-on-huggingface-spaces)
- [HF Spaces Restrictions](https://medium.com/@na.mazaheri/deploying-a-fastapi-app-on-hugging-face-spaces-and-handling-all-its-restrictions-d494d97a78fa)
- [FastAPI HTTPS Discussion](https://github.com/fastapi/fastapi/discussions/6670)
- [HF Docker Spaces Docs](https://huggingface.co/docs/hub/en/spaces-sdks-docker)
- [504 Gateway Timeout - HF Forums](https://discuss.huggingface.co/t/504-gateway-timeout-with-http-request/24018)
- [FastAPI Background Tasks](https://fastapi.tiangolo.com/tutorial/background-tasks/)
- [FastAPI Polling Strategy](https://openillumi.com/en/en-fastapi-long-task-progress-polling/)