# Hugging Face Spaces Deployment Guide

## What is Hugging Face Spaces?

**Hugging Face Spaces** is a free hosting platform for machine learning demos and applications. It allows you to:

- βœ… Deploy web apps for free (with resource limits)
- βœ… Set environment variables and secrets securely
- βœ… Use Docker for full customization
- βœ… Get a public URL accessible worldwide
- βœ… Integrate with GitHub for continuous deployment

### Key Features
- **Free tier**: 2 vCPU, 8GB RAM per Space
- **Public/Private**: Choose visibility level
- **Auto-builds**: Redeploy on GitHub push (with GitHub integration)
- **Secrets management**: Store API tokens securely
- **Multiple SDK support**: Gradio, Streamlit, Docker, Python

---

## How Does Hugging Face Spaces Work?

### 1. **Creation Phase**
You create a new Space and choose an SDK (Gradio, Streamlit, Docker, etc.)

```
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  Hugging Face Spaces Dashboard          β”‚
β”‚  β”œβ”€ Create New Space                    β”‚
β”‚  β”œβ”€ Choose SDK: Docker ← [We use this]  β”‚
β”‚  β”œβ”€ Set Name: audit-repair-env          β”‚
β”‚  β”œβ”€ Set License: MIT                    β”‚
β”‚  └─ Create                              β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
```

### 2. **Build Phase**
HF Spaces pulls your code (from GitHub) and builds a Docker image

```
GitHub Repo              Hugging Face Spaces
    β”‚                           β”‚
    β”œβ”€ Dockerfile      ────→   Build Server
    β”œβ”€ requirements.txt        β”‚
    β”œβ”€ inference.py       Builds Docker Image
    β”œβ”€ server.py          Creates Container
    └─ demo.py            Allocates Resources
                               β”‚
                          Pushes to Registry
```

### 3. **Runtime Phase**
The container runs on HF's infrastructure with:
- Assigned vCPU/RAM
- Public HTTP endpoint
- Environment variables & secrets

```
Public URL
    β”‚
    β”œβ”€ https://huggingface.co/spaces/username/audit-repair-env
    β”‚
    β”œβ”€ Routes to Container
    β”‚     β”œβ”€ :7860 (Gradio Demo)
    β”‚     └─ :8000 (FastAPI Server - optional)
    β”‚
    └─ Processes Requests
        β”œβ”€ Receives HTTP request
        β”œβ”€ Runs inference.py / demo.py
        └─ Returns response
```

### 4. **Lifecycle**
- **Sleeping**: Space goes to sleep after 48 hours of inactivity
- **Paused**: You can manually pause spaces
- **Running**: Active and processing requests
- **Error**: Logs visible in Space page

---

## Step-by-Step Deployment

### Step 1: Prepare Your GitHub Repository

**Requirement**: Public GitHub repo with your code

```bash
git init
git add .
git commit -m "Initial commit"
git remote add origin https://github.com/YOUR_USERNAME/audit-repair-env.git
git branch -M main
git push -u origin main
```

**File checklist**:
- βœ… `inference.py` (root directory)
- βœ… `server.py`
- βœ… `tasks.py`
- βœ… `requirements.txt`
- βœ… `demo.py`
- βœ… `Dockerfile`
- βœ… `README.md`

### Step 2: Create Hugging Face Spaces

1. Go to [huggingface.co/spaces](https://huggingface.co/spaces)
2. Click **"Create new Space"**
3. Fill in:
   - **Owner**: Your HF username
   - **Space name**: `audit-repair-env` (or your choice)
   - **License**: MIT
   - **SDK**: Docker ← **IMPORTANT**
4. Click **"Create Space"**

### Step 3: Connect to GitHub (Auto-Deployment)

In your **Space Settings**:

1. Go to **Space** β†’ **Settings** (gear icon)
2. Scroll to **"Linked Repository"**
3. Click **"Link a repository"**
4. Select your GitHub repo: `username/audit-repair-env`
5. Choose **"Simple"** or **"Sync"** mode
   - **Simple**: Manual redeploy via button
   - **Sync**: Auto-redeploy on GitHub push (recommended)

### Step 4: Set Environment Variables & Secrets

In **Space Settings**:

1. Scroll to **"Repository secrets"**
2. Click **"Add secret"**
3. Add:
   ```
   Name: HF_TOKEN
   Value: hf_your_actual_token_here
   ```

4. Add:
   ```
   Name: API_BASE_URL
   Value: https://router.huggingface.co/v1
   ```

5. Add:
   ```
   Name: MODEL_NAME
   Value: Qwen/Qwen2.5-72B-Instruct
   ```

**⚠️ NOTE**: Secrets are exposed to your app as environment variables at runtime. If you also need a secret during the Docker build itself, mount it in the `Dockerfile` with `RUN --mount=type=secret,id=HF_TOKEN,...` rather than baking it into the image.
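At runtime, the app can read these secrets straight from the environment. A minimal sketch (the variable names match the secrets above; the defaults and the `looks_like_hf_token` helper are illustrative assumptions, not part of the project):

```python
import os

def looks_like_hf_token(token: str) -> bool:
    """Cheap sanity check: HF user access tokens start with 'hf_'."""
    return token.startswith("hf_")

def load_config(env=os.environ) -> dict:
    """Read Space secrets as environment variables, with illustrative defaults."""
    cfg = {
        "HF_TOKEN": env.get("HF_TOKEN", ""),
        "API_BASE_URL": env.get("API_BASE_URL", "https://router.huggingface.co/v1"),
        "MODEL_NAME": env.get("MODEL_NAME", "Qwen/Qwen2.5-72B-Instruct"),
    }
    if not looks_like_hf_token(cfg["HF_TOKEN"]):
        print("WARNING: HF_TOKEN missing or malformed; set it in Space Settings.")
    return cfg
```

Warning instead of raising keeps the Space booting so the Logs tab can show what is missing.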

### Step 5: Check Logs & Verify Deployment

1. Go to your Space URL: `https://huggingface.co/spaces/username/audit-repair-env`
2. Click **"Logs"** tab to see build output
3. Wait for status: **"Running"**
4. Click the **"App"** link to access your demo

---

## Dockerfile Setup for Spaces

Your `Dockerfile` should be:

```dockerfile
FROM python:3.10-slim

WORKDIR /app

# Copy everything
COPY . .

# Install dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Expose port for Gradio (or FastAPI)
EXPOSE 7860

# Run Gradio demo by default
CMD ["python", "demo.py"]
```

**Alternative** (run both server + demo):
```dockerfile
FROM python:3.10-slim

WORKDIR /app
COPY . .
RUN pip install --no-cache-dir -r requirements.txt

EXPOSE 7860 8000

# Create startup script (printf interprets \n reliably, unlike echo)
RUN printf '#!/bin/bash\npython server.py &\npython demo.py\n' > /app/start.sh
RUN chmod +x /app/start.sh

CMD ["/app/start.sh"]
```

---

## Troubleshooting Common Issues

### Issue: "Build Failed"
```
❌ Docker build failed
```

**Fixes**:
1. Check Logs tab for error messages
2. Verify `requirements.txt` syntax
3. Ensure `Dockerfile` references correct files
4. Check for permission issues

### Issue: "Application Error" on Load
```
❌ Application Error: Connection refused
```

**Fixes**:
1. Verify app runs on `0.0.0.0:7860`
2. Check environment variables are set
3. Look at Space Logs for exceptions
4. Ensure HF_TOKEN is valid
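Fix 1 above is the most common culprit: the app must bind to `0.0.0.0`, not `127.0.0.1`, or the Spaces proxy cannot reach it. A small sketch, assuming a Gradio object named `demo` in `demo.py` (the `PORT` environment fallback is an assumption; 7860 is the Spaces default):

```python
import os

def launch_kwargs() -> dict:
    """Bind to all interfaces on the port the Spaces proxy expects."""
    return {
        "server_name": "0.0.0.0",  # not 127.0.0.1: the proxy must reach the container
        "server_port": int(os.environ.get("PORT", 7860)),
    }

# In demo.py:
# demo.launch(**launch_kwargs())
```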



### Issue: "HF_TOKEN not valid"
```
❌ Error initializing client: Invalid token
```

**Fixes**:
1. Generate new token at [huggingface.co/settings/tokens](https://huggingface.co/settings/tokens)
2. Make sure it has API access
3. Update secret in Space Settings
4. Rebuild Space
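You can check a token before rebuilding by calling the Hub's `whoami` endpoint directly; a 200 means the token authenticates, a 401 means it does not. A minimal sketch using `requests`:

```python
import requests

WHOAMI_URL = "https://huggingface.co/api/whoami-v2"

def auth_header(token: str) -> dict:
    """HF APIs expect a standard Bearer authorization header."""
    return {"Authorization": f"Bearer {token}"}

def token_is_valid(token: str) -> bool:
    """True if the token authenticates against the Hub."""
    resp = requests.get(WHOAMI_URL, headers=auth_header(token), timeout=10)
    return resp.status_code == 200
```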

### Issue: "Model not found"
```
❌ Error: MODEL_NAME 'Qwen/Qwen2.5-72B-Instruct' not found
```

**Fixes**:
1. Verify model exists on Hugging Face Hub
2. Check if you have access (private models need approval)
3. Use inference API endpoint instead:
   ```
   API_BASE_URL=https://api-inference.huggingface.co/v1
   ```
4. Ensure HF_TOKEN is set



### Issue: "Out of Memory"
```
❌ Killed due to resource limit
```

**Fixes**:
- Free tier is 2 vCPU / 8GB RAM
- Reduce model size
- Use a smaller LLM (e.g., `mistral-7b`)
- Consider upgrading to paid hardware (usually not needed)
- Optimize inference batch size



### Issue: Space Falls Asleep
```
⚠️ This space has been sleeping for 48 hours
```

**Explanation**: HF Spaces sleep after inactivity to save resources.

**Solutions**:
1. Upgrade to paid tier (stays warm)
2. Add uptime monitoring (pings Space regularly)
3. Use HF Pro subscription
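Uptime monitoring can be as small as an external script that requests the Space well inside the 48-hour window. A sketch (the Space URL and 24-hour interval are illustrative assumptions):

```python
import time
import urllib.request

SPACE_URL = "https://huggingface.co/spaces/username/audit-repair-env"  # illustrative

def sleep_seconds(interval_hours: float) -> int:
    """Convert the ping interval to seconds; keep it well under 48 hours."""
    return int(interval_hours * 3600)

def ping_once(url: str = SPACE_URL) -> int:
    """One GET request; any response counts as activity."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.status

def keep_alive(interval_hours: float = 24.0) -> None:
    """Ping the Space forever, once per interval."""
    while True:
        ping_once()
        time.sleep(sleep_seconds(interval_hours))
```

A hosted cron service pointed at the Space URL achieves the same thing without a long-running script.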



---



## Performance Optimization

### For Spaces with Free Tier (2 vCPU, 8GB RAM)

**1. Use Quantized Models**
```python
# Instead of full precision 72B
MODEL_NAME = "Qwen/Qwen2.5-32B-Instruct-GGUF"  # Smaller, quantized
```



**2. Cache Client**
```python
from functools import cache
from openai import OpenAI

@cache
def get_openai_client():
    return OpenAI(base_url=API_BASE_URL, api_key=HF_TOKEN)
```

**3. Limit Request Size**
```python
MAX_TOKENS = 150   # Reduce from 300
TEMPERATURE = 0.1  # Lower temp = faster convergence
```
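Capping the input side helps too. A sketch of a prompt clamp (the character limit and the ~4-chars-per-token rule of thumb are assumptions, not exact token counts):

```python
MAX_TOKENS = 150         # completion budget
MAX_PROMPT_CHARS = 4000  # rough input cap; ~4 chars/token is only a heuristic

def clamp_prompt(prompt: str, limit: int = MAX_PROMPT_CHARS) -> str:
    """Truncate over-long inputs instead of sending them to the model."""
    if len(prompt) <= limit:
        return prompt
    return prompt[:limit]

# Then pass max_tokens=MAX_TOKENS and temperature=0.1 in the completion call.
```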

**4. Async Requests** (if multiple concurrent users)
```python
import asyncio
# Use async/await for non-blocking I/O
```
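The async pattern can be sketched with `asyncio.gather` over a placeholder coroutine; in the real app the placeholder would be replaced by an async client call (e.g. `openai.AsyncOpenAI`, an assumption here):

```python
import asyncio

async def call_llm(prompt: str) -> str:
    """Placeholder for a real async client call."""
    await asyncio.sleep(0)  # simulate non-blocking I/O
    return f"response to: {prompt}"

async def run_batch(prompts: list[str]) -> list[str]:
    """Issue all requests concurrently instead of one at a time."""
    return await asyncio.gather(*(call_llm(p) for p in prompts))

results = asyncio.run(run_batch(["task 1", "task 2"]))
```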

---

## Real-World Example: Workflow

```
1. Developer makes changes locally
   β”œβ”€ git commit -am "Fix HF_TOKEN validation"
   └─ git push origin main

2. GitHub notifies HF Spaces
   β”œβ”€ HF detects push to linked repo
   └─ Triggers automatic build

3. HF Spaces builds Docker image
   β”œβ”€ Pulls latest code from main branch
   β”œβ”€ Runs: pip install -r requirements.txt
   β”œβ”€ Loads secrets (HF_TOKEN, API_BASE_URL, etc.)
   └─ Runs: python demo.py

4. Container starts running
   β”œβ”€ Gradio interface initializes on :7860
   β”œβ”€ FastAPI server (optional) on :8000
   └─ Public URL becomes active

5. User accesses Space URL
   β”œβ”€ Browser loads Gradio interface
   β”œβ”€ User selects task (easy/medium/hard)
   β”œβ”€ Clicks "Run Inference"
   └─ inference.py executes with LLM calls

6. LLM calls routed via:
   API_BASE_URL (router.huggingface.co/v1)
       ↓
   HF Token used for authentication
       ↓
   Model (Qwen/Qwen2.5-72B-Instruct) queried
       ↓
   Response returned to inference.py
       ↓
   Results shown in Gradio UI
```

---

## Security Best Practices

### βœ… DO
- Set HF_TOKEN as a **secret** in Space settings
- Use `.gitignore` to prevent token from being committed:
  ```
  .env
  .env.local
  *.key
  secrets/
  ```
- Validate all user inputs
- Use HTTPS (handled by HF automatically)

### ❌ DON'T
- Commit API keys to GitHub
- Expose secrets in logs
- Store sensitive data in code
- Leave Space public if handling private data



---



## Next Steps

1. **Verify locally first**:
   ```bash
   export HF_TOKEN="your_token"
   export API_BASE_URL="https://router.huggingface.co/v1"
   python inference.py  # Run submission tests
   python demo.py       # Test Gradio UI
   ```

2. **Push to GitHub**:
   ```bash
   git add -A
   git commit -m "Ready for HF Spaces deployment"
   git push origin main
   ```

3. **Create & Link Space**:
   - Create Space on HF
   - Link GitHub repo
   - Set secrets in Settings
   - Wait for build

4. **Test on Spaces**:
   - Access public URL
   - Run test inference
   - Share link with community



---



## Additional Resources

- [Hugging Face Spaces Docs](https://huggingface.co/docs/hub/spaces)
- [Docker Spaces Guide](https://huggingface.co/docs/hub/spaces-config-reference#docker)
- [Gradio Documentation](https://www.gradio.app/)
- [OpenAI Python Client](https://github.com/openai/openai-python)
- [HF Inference API Docs](https://huggingface.co/docs/api-inference)



---



**Good luck with your submission! πŸš€**