# Hugging Face Spaces Deployment Guide
## What is Hugging Face Spaces?
**Hugging Face Spaces** is a free hosting platform for machine learning demos and applications. It allows you to:
- ✅ Deploy web apps for free (with resource limits)
- ✅ Set environment variables and secrets securely
- ✅ Use Docker for full customization
- ✅ Get a public URL accessible worldwide
- ✅ Integrate with GitHub for continuous deployment
### Key Features
- **Free tier**: 2 vCPU, 8GB RAM per Space
- **Public/Private**: Choose visibility level
- **Auto-builds**: Redeploy on GitHub push (with GitHub integration)
- **Secrets management**: Store API tokens securely
- **Multiple SDK support**: Gradio, Streamlit, Docker, Python
---
## How Does Hugging Face Spaces Work?
### 1. **Creation Phase**
You create a new Space and choose an SDK (Gradio, Streamlit, Docker, etc.)
```
┌──────────────────────────────────────────┐
│ Hugging Face Spaces Dashboard            │
│   ├─ Create New Space                    │
│   ├─ Choose SDK: Docker  ← [We use this] │
│   ├─ Set Name: audit-repair-env          │
│   ├─ Set License: MIT                    │
│   └─ Create                              │
└──────────────────────────────────────────┘
```
### 2. **Build Phase**
HF Spaces pulls your code (from GitHub) and builds a Docker image
```
GitHub Repo                    Hugging Face Spaces
     │                                │
     ├─ Dockerfile        ──────►  Build Server
     ├─ requirements.txt              │
     ├─ inference.py            Builds Docker Image
     ├─ server.py               Creates Container
     └─ demo.py                 Allocates Resources
                                      │
                               Pushes to Registry
```
### 3. **Runtime Phase**
The container runs on HF's infrastructure with:
- Assigned vCPU/RAM
- Public HTTP endpoint
- Environment variables & secrets
```
Public URL
     │
     ├─ https://huggingface.co/spaces/username/audit-repair-env
     │       │
     │       └─ Routes to Container
     │              ├─ :7860 (Gradio Demo)
     │              └─ :8000 (FastAPI Server - optional)
     │
     └─ Processes Requests
            ├─ Receives HTTP request
            ├─ Runs inference.py / demo.py
            └─ Returns response
```
### 4. **Lifecycle**
- **Sleeping**: Space goes to sleep after 48 hours of inactivity
- **Paused**: You can manually pause spaces
- **Running**: Active and processing requests
- **Error**: Logs visible in Space page
---
## Step-by-Step Deployment
### Step 1: Prepare Your GitHub Repository
**Requirement**: Public GitHub repo with your code
```bash
git init
git add .
git commit -m "Initial commit"
git remote add origin https://github.com/YOUR_USERNAME/audit-repair-env.git
git branch -M main
git push -u origin main
```
**File checklist**:
- ✅ `inference.py` (root directory)
- ✅ `server.py`
- ✅ `tasks.py`
- ✅ `requirements.txt`
- ✅ `demo.py`
- ✅ `Dockerfile`
- ✅ `README.md`
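For the Docker SDK, Spaces reads its deployment settings from YAML front matter at the top of `README.md`. A minimal sketch (the `app_port` value assumes the Gradio demo listens on 7860, as in the Dockerfile later in this guide):

```yaml
---
title: Audit Repair Env
sdk: docker
app_port: 7860
license: mit
---
```

Without `sdk: docker`, the Space defaults to another SDK and will not use your `Dockerfile`.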
### Step 2: Create Hugging Face Spaces
1. Go to [huggingface.co/spaces](https://huggingface.co/spaces)
2. Click **"Create new Space"**
3. Fill in:
- **Owner**: Your HF username
- **Space name**: `audit-repair-env` (or your choice)
- **License**: MIT
- **SDK**: Docker ← **IMPORTANT**
4. Click **"Create Space"**
### Step 3: Connect to GitHub (Auto-Deployment)
In your **Space Settings**:
1. Go to **Space** → **Settings** (gear icon)
2. Scroll to **"Linked Repository"**
3. Click **"Link a repository"**
4. Select your GitHub repo: `username/audit-repair-env`
5. Choose **"Simple"** or **"Sync"** mode
- **Simple**: Manual redeploy via button
- **Sync**: Auto-redeploy on GitHub push (recommended)
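If the linking UI is unavailable for your account, the same auto-deployment can be set up with a small GitHub Actions workflow that force-pushes `main` to the Space's git remote on every push. A sketch, assuming you replace `USER` with your usernames and store your `HF_TOKEN` as a GitHub repository secret:

```yaml
# .github/workflows/sync-to-hf.yml
name: Sync to Hugging Face Space
on:
  push:
    branches: [main]
jobs:
  sync:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0  # full history, needed for a clean push
      - name: Push to Space
        env:
          HF_TOKEN: ${{ secrets.HF_TOKEN }}
        run: |
          git push --force https://USER:$HF_TOKEN@huggingface.co/spaces/USER/audit-repair-env main
```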
### Step 4: Set Environment Variables & Secrets
In **Space Settings**:
1. Scroll to **"Repository secrets"**
2. Click **"Add secret"**
3. Add:
```
Name: HF_TOKEN
Value: hf_your_actual_token_here
```
4. Add:
```
Name: API_BASE_URL
Value: https://router.huggingface.co/v1
```
5. Add:
```
Name: MODEL_NAME
Value: Qwen/Qwen2.5-72B-Instruct
```
**⚠️ NOTE**: Secrets are exposed to the running container as environment variables at runtime. If you also need a secret during the Docker build, expose it with BuildKit's `RUN --mount=type=secret` syntax rather than baking it into an image layer.
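Inside the running container, the secrets appear as ordinary environment variables. A sketch of how `inference.py` might read them, failing fast if the token is missing (the function name and defaults are illustrative; the variable names match the secrets above):

```python
import os

def load_config(env=os.environ):
    """Read Space secrets; Spaces injects them as environment variables."""
    token = env.get("HF_TOKEN")
    if not token:
        raise RuntimeError("HF_TOKEN is not set; add it under Repository secrets")
    return {
        "token": token,
        "base_url": env.get("API_BASE_URL", "https://router.huggingface.co/v1"),
        "model": env.get("MODEL_NAME", "Qwen/Qwen2.5-72B-Instruct"),
    }
```

Failing fast at startup turns a cryptic mid-request auth error into a one-line message in the Space Logs tab.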
### Step 5: Check Logs & Verify Deployment
1. Go to your Space URL: `https://huggingface.co/spaces/username/audit-repair-env`
2. Click **"Logs"** tab to see build output
3. Wait for status: **"Running"**
4. Click the **"App"** link to access your demo
---
## Dockerfile Setup for Spaces
Your `Dockerfile` should be:
```dockerfile
FROM python:3.10-slim
WORKDIR /app
# Copy everything
COPY . .
# Install dependencies
RUN pip install --no-cache-dir -r requirements.txt
# Expose port for Gradio (or FastAPI)
EXPOSE 7860
# Run Gradio demo by default
CMD ["python", "demo.py"]
```
**Alternative** (run both server + demo):
```dockerfile
FROM python:3.10-slim
WORKDIR /app
COPY . .
RUN pip install --no-cache-dir -r requirements.txt
EXPOSE 7860 8000
# Create startup script
RUN printf '#!/bin/bash\npython server.py &\npython demo.py\n' > /app/start.sh
RUN chmod +x /app/start.sh
CMD ["/app/start.sh"]
```
---
## Troubleshooting Common Issues
### Issue: "Build Failed"
```
❌ Docker build failed
```
**Fixes**:
1. Check Logs tab for error messages
2. Verify `requirements.txt` syntax
3. Ensure `Dockerfile` references correct files
4. Check for permission issues
### Issue: "Application Error" on Load
```
❌ Application Error: Connection refused
```
**Fixes**:
1. Verify the app binds to `0.0.0.0:7860` (for Gradio: `demo.launch(server_name="0.0.0.0", server_port=7860)`)
2. Check environment variables are set
3. Look at Space Logs for exceptions
4. Ensure HF_TOKEN is valid
### Issue: "HF_TOKEN not valid"
```
❌ Error initializing client: Invalid token
```
**Fixes**:
1. Generate new token at [huggingface.co/settings/tokens](https://huggingface.co/settings/tokens)
2. Make sure it has API access
3. Update secret in Space Settings
4. Rebuild Space
### Issue: "Model not found"
```
❌ Error: MODEL_NAME 'Qwen/Qwen2.5-72B-Instruct' not found
```
**Fixes**:
1. Verify model exists on Hugging Face Hub
2. Check if you have access (private models need approval)
3. Use inference API endpoint instead:
```
API_BASE_URL=https://api-inference.huggingface.co/v1
```
4. Ensure HF_TOKEN is set
### Issue: "Out of Memory"
```
❌ Killed due to resource limit
```
**Fixes**:
- Free tier is 2 vCPU / 8GB RAM
- Reduce model size
- Use a smaller LLM (e.g., `mistral-7b`)
- Consider upgrading to paid hardware (usually not needed)
- Optimize inference batch size
### Issue: Space Falls Asleep
```
⚠️ This Space has been sleeping for 48 hours
```
**Explanation**: HF Spaces sleep after inactivity to save resources
**Solutions**:
1. Upgrade to paid tier (stays warm)
2. Add uptime monitoring (pings Space regularly)
3. Use HF Pro subscription
---
## Performance Optimization
### For Spaces with Free Tier (2 vCPU, 8GB RAM)
**1. Use Quantized Models**
```python
# Instead of full precision 72B
MODEL_NAME = "Qwen/Qwen2.5-32B-Instruct-GGUF" # Smaller, quantized
```
**2. Cache Client**
```python
import os
from functools import cache
from openai import OpenAI

@cache  # one client instance, reused across requests
def get_openai_client():
    return OpenAI(base_url=os.environ["API_BASE_URL"], api_key=os.environ["HF_TOKEN"])
```
**3. Limit Request Size**
```python
MAX_TOKENS = 150 # Reduce from 300
TEMPERATURE = 0.1 # Lower temp = faster convergence
```
**4. Async Requests** (if multiple concurrent users)
```python
import asyncio
# Use async/await for non-blocking I/O
```
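A minimal sketch of the pattern, with a stand-in coroutine where a real async client call (e.g. via `openai.AsyncOpenAI`) would go:

```python
import asyncio

async def run_inference(prompt: str) -> str:
    # Placeholder for a real async LLM call; the sleep stands in for network I/O
    await asyncio.sleep(0.01)
    return f"response to: {prompt}"

async def handle_many(prompts):
    # gather() issues the calls concurrently instead of serially,
    # so one slow request doesn't block the others
    return await asyncio.gather(*(run_inference(p) for p in prompts))

results = asyncio.run(handle_many(["easy", "medium", "hard"]))
```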
---
## Real-World Example: Workflow
```
1. Developer makes changes locally
   ├─ git commit -am "Fix HF_TOKEN validation"
   └─ git push origin main

2. GitHub notifies HF Spaces
   ├─ HF detects push to linked repo
   └─ Triggers automatic build

3. HF Spaces builds Docker image
   ├─ Pulls latest code from main branch
   ├─ Runs: pip install -r requirements.txt
   ├─ Loads secrets (HF_TOKEN, API_BASE_URL, etc.)
   └─ Runs: python demo.py

4. Container starts running
   ├─ Gradio interface initializes on :7860
   ├─ FastAPI server (optional) on :8000
   └─ Public URL becomes active

5. User accesses Space URL
   ├─ Browser loads Gradio interface
   ├─ User selects task (easy/medium/hard)
   ├─ Clicks "Run Inference"
   └─ inference.py executes with LLM calls

6. LLM calls routed via:
   API_BASE_URL (router.huggingface.co/v1)
        ↓
   HF Token used for authentication
        ↓
   Model (Qwen/Qwen2.5-72B-Instruct) queried
        ↓
   Response returned to inference.py
        ↓
   Results shown in Gradio UI
```
---
## Security Best Practices
### ✅ DO
- Set HF_TOKEN as a **secret** in Space settings
- Use `.gitignore` to prevent token from being committed:
```
.env
.env.local
*.key
secrets/
```
- Validate all user inputs
- Use HTTPS (handled by HF automatically)
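"Validate all user inputs" can be as simple as whitelisting the task name and capping prompt length before anything reaches the LLM. A sketch (the names and limits are illustrative, not taken from the repo):

```python
ALLOWED_TASKS = {"easy", "medium", "hard"}
MAX_PROMPT_CHARS = 4000

def validate_request(task: str, prompt: str) -> str:
    """Reject malformed input before it is sent to the model."""
    if task not in ALLOWED_TASKS:
        raise ValueError(f"unknown task: {task!r}")
    if not prompt or not prompt.strip():
        raise ValueError("prompt is empty")
    return prompt[:MAX_PROMPT_CHARS]  # hard cap on request size
```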
### ❌ DON'T
- Commit API keys to GitHub
- Expose secrets in logs
- Store sensitive data in code
- Leave Space public if handling private data
---
## Next Steps
1. **Verify locally first**:
```bash
export HF_TOKEN="your_token"
export API_BASE_URL="https://router.huggingface.co/v1"
python inference.py # Run submission tests
python demo.py # Test Gradio UI
```
2. **Push to GitHub**:
```bash
git add -A
git commit -m "Ready for HF Spaces deployment"
git push origin main
```
3. **Create & Link Space**:
- Create Space on HF
- Link GitHub repo
- Set secrets in Settings
- Wait for build
4. **Test on Spaces**:
- Access public URL
- Run test inference
- Share link with community
---
## Additional Resources
- [Hugging Face Spaces Docs](https://huggingface.co/docs/hub/spaces)
- [Docker Spaces Guide](https://huggingface.co/docs/hub/spaces-config-reference#docker)
- [Gradio Documentation](https://www.gradio.app/)
- [OpenAI Python Client](https://github.com/openai/openai-python)
- [HF Inference API Docs](https://huggingface.co/docs/api-inference)
---
**Good luck with your submission! 🚀**