Commit eb87c73 · Parent(s): 2ce4373
Simplify: Remove proxy service, keep only audio file upload for HF Spaces
- Removed proxy_service directory and all proxy-related code
- Removed httpx dependency (no longer needed)
- Removed YOUTUBE_PROXY_URL configuration
- Simplified error messages to suggest audio upload endpoint
- Updated README with clear HF Spaces limitations
- Added HF_SPACES_GUIDE.md with detailed deployment instructions
- YouTube extraction endpoint remains but documented as self-hosted only
- Audio upload endpoint (/transcribe) works perfectly on HF Spaces
- HF_SPACES_GUIDE.md +442 -0
- README.md +28 -70
- app/apis/subtitles/service.py +4 -94
- app/core/config.py +0 -3
- poetry.lock +1 -48
- proxy_service/Dockerfile +0 -35
- proxy_service/README.md +0 -409
- proxy_service/main.py +0 -216
- proxy_service/render.yaml +0 -21
- proxy_service/requirements.txt +0 -6
- pyproject.toml +0 -1
HF_SPACES_GUIDE.md
ADDED
|
@@ -0,0 +1,442 @@
|
|
| 1 |
+
# Hugging Face Spaces Deployment Guide
|
| 2 |
+
|
| 3 |
+
## 🎯 Overview
|
| 4 |
+
|
| 5 |
+
This guide explains how to deploy and use the Multi-Utility Server on Hugging Face Spaces, including limitations and workarounds.
|
| 6 |
+
|
| 7 |
+
## 🚀 Quick Deployment
|
| 8 |
+
|
| 9 |
+
### Step 1: Create a Space
|
| 10 |
+
|
| 11 |
+
1. Go to [Hugging Face Spaces](https://huggingface.co/spaces)
|
| 12 |
+
2. Click **"Create new Space"**
|
| 13 |
+
3. Choose:
|
| 14 |
+
- **Space name:** Your choice
|
| 15 |
+
- **SDK:** Docker
|
| 16 |
+
- **Visibility:** Public or Private
|
| 17 |
+
4. Click **"Create Space"**
|
| 18 |
+
|
| 19 |
+
### Step 2: Configure Secrets
|
| 20 |
+
|
| 21 |
+
1. Go to your Space's **Settings** → **Repository secrets**
|
| 22 |
+
2. Add a new secret:
|
| 23 |
+
- **Name:** `API_KEYS`
|
| 24 |
+
- **Value:** `your-secure-api-key-here` (comma-separated for multiple keys)
|
| 25 |
+
3. Save
|
| 26 |
+
|
| 27 |
+
### Step 3: Push Code
|
| 28 |
+
|
| 29 |
+
```bash
|
| 30 |
+
# Clone your space
|
| 31 |
+
git clone https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
|
| 32 |
+
cd YOUR_SPACE_NAME
|
| 33 |
+
|
| 34 |
+
# Add this repository as a remote
|
| 35 |
+
git remote add source https://github.com/YOUR_REPO/multiutility-server.git
|
| 36 |
+
git pull source main --allow-unrelated-histories
|
| 37 |
+
|
| 38 |
+
# Push to HF Spaces
|
| 39 |
+
git push origin main
|
| 40 |
+
```
|
| 41 |
+
|
| 42 |
+
Or connect your GitHub repository directly in Space settings.
|
| 43 |
+
|
| 44 |
+
## 📊 Feature Availability on HF Spaces
|
| 45 |
+
|
| 46 |
+
| Feature | Status | Endpoint |
|
| 47 |
+
|---------|--------|----------|
|
| 48 |
+
| **Text Embeddings** | ✅ Works | `POST /api/v1/embeddings/generate` |
|
| 49 |
+
| **Audio File Transcription** | ✅ Works | `POST /api/v1/subtitles/transcribe` |
|
| 50 |
+
| **YouTube Subtitle Extraction** | ❌ Blocked | `POST /api/v1/subtitles/extract` |
|
| 51 |
+
| **Health Checks** | ✅ Works | `GET /health` |
|
| 52 |
+
|
| 53 |
+
## ⚠️ Network Limitations
|
| 54 |
+
|
| 55 |
+
### What's Blocked
|
| 56 |
+
|
| 57 |
+
Hugging Face Spaces runs in a sandboxed environment that **blocks external internet access** for security reasons. This means:
|
| 58 |
+
|
| 59 |
+
- ❌ Cannot download from YouTube directly
|
| 60 |
+
- ❌ Cannot access external APIs
|
| 61 |
+
- ❌ Cannot perform web scraping
|
| 62 |
+
|
| 63 |
+
### What Works
|
| 64 |
+
|
| 65 |
+
- ✅ File uploads from users
|
| 66 |
+
- ✅ AI model inference (Whisper, embeddings)
|
| 67 |
+
- ✅ Returning results to users
|
| 68 |
+
- ✅ Internal HF services
|
| 69 |
+
|
| 70 |
+
## 🎤 Audio Transcription Workflow
|
| 71 |
+
|
| 72 |
+
Since YouTube downloads don't work on HF Spaces, use this workflow instead:
|
| 73 |
+
|
| 74 |
+
### Option 1: User Downloads Audio Locally
|
| 75 |
+
|
| 76 |
+
**Step 1:** User downloads audio using [yt-dlp](https://github.com/yt-dlp/yt-dlp)
|
| 77 |
+
```bash
|
| 78 |
+
# Install yt-dlp
|
| 79 |
+
pip install yt-dlp
|
| 80 |
+
|
| 81 |
+
# Download audio from YouTube
|
| 82 |
+
yt-dlp -x --audio-format mp3 "https://www.youtube.com/watch?v=VIDEO_ID" -o "audio.%(ext)s"
|
| 83 |
+
```
|
| 84 |
+
|
| 85 |
+
**Step 2:** User uploads audio to your HF Space
|
| 86 |
+
```bash
|
| 87 |
+
curl -X POST https://YOUR_SPACE.hf.space/api/v1/subtitles/transcribe \
|
| 88 |
+
-H "x-api-key: your-api-key" \
|
| 89 |
+
-F "file=@audio.mp3" \
|
| 90 |
+
-F "lang=en"
|
| 91 |
+
```
|
| 92 |
+
|
| 93 |
+
**Step 3:** Receive transcription
|
| 94 |
+
```json
|
| 95 |
+
{
|
| 96 |
+
"status": "success",
|
| 97 |
+
"language": "en",
|
| 98 |
+
"file_name": "audio.mp3",
|
| 99 |
+
"transcription": [
|
| 100 |
+
"First segment of transcribed text",
|
| 101 |
+
"Second segment of transcribed text",
|
| 102 |
+
"..."
|
| 103 |
+
]
|
| 104 |
+
}
|
| 105 |
+
```
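A client usually wants plain text rather than a list of segments. The sketch below assumes the response shape shown above (`status`, `transcription`) and that failures carry a `message` field, as in the error example in the troubleshooting section; the helper name `transcript_text` is illustrative and not part of the server.

```python
from typing import Any, Dict


def transcript_text(payload: Dict[str, Any]) -> str:
    """Join the segments of a successful transcription response into one string."""
    # Assumption: any status other than "success" is a failure with a "message" field.
    if payload.get("status") != "success":
        raise ValueError(payload.get("message", "transcription failed"))
    segments = payload.get("transcription", [])
    # Skip empty segments and trim whitespace around each remaining one.
    return " ".join(s.strip() for s in segments if s and s.strip())


if __name__ == "__main__":
    sample = {
        "status": "success",
        "language": "en",
        "file_name": "audio.mp3",
        "transcription": ["First segment", "Second segment"],
    }
    print(transcript_text(sample))
```

Raising on a non-success status keeps the calling code from silently treating an error payload as an empty transcript.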
|
| 106 |
+
|
| 107 |
+
### Option 2: Browser-Based Upload
|
| 108 |
+
|
| 109 |
+
Create a simple HTML form for users:
|
| 110 |
+
|
| 111 |
+
```html
|
| 112 |
+
<!DOCTYPE html>
|
| 113 |
+
<html>
|
| 114 |
+
<body>
|
| 115 |
+
<h2>Audio Transcription</h2>
|
| 116 |
+
<form id="uploadForm">
|
| 117 |
+
<input type="file" id="audioFile" accept="audio/*" required>
|
| 118 |
+
<select id="language">
|
| 119 |
+
<option value="en">English</option>
|
| 120 |
+
<option value="es">Spanish</option>
|
| 121 |
+
<option value="fr">French</option>
|
| 122 |
+
</select>
|
| 123 |
+
<button type="submit">Transcribe</button>
|
| 124 |
+
</form>
|
| 125 |
+
|
| 126 |
+
<div id="result"></div>
|
| 127 |
+
|
| 128 |
+
<script>
|
| 129 |
+
document.getElementById('uploadForm').onsubmit = async (e) => {
|
| 130 |
+
e.preventDefault();
|
| 131 |
+
const formData = new FormData();
|
| 132 |
+
formData.append('file', document.getElementById('audioFile').files[0]);
|
| 133 |
+
formData.append('lang', document.getElementById('language').value);
|
| 134 |
+
|
| 135 |
+
const response = await fetch('https://YOUR_SPACE.hf.space/api/v1/subtitles/transcribe', {
|
| 136 |
+
method: 'POST',
|
| 137 |
+
headers: { 'x-api-key': 'your-api-key' },
|
| 138 |
+
body: formData
|
| 139 |
+
});
|
| 140 |
+
|
| 141 |
+
const result = await response.json();
|
| 142 |
+
document.getElementById('result').innerHTML =
|
| 143 |
+
'<pre>' + JSON.stringify(result, null, 2) + '</pre>';
|
| 144 |
+
};
|
| 145 |
+
</script>
|
| 146 |
+
</body>
|
| 147 |
+
</html>
|
| 148 |
+
```
|
| 149 |
+
|
| 150 |
+
## 📝 API Usage Examples
|
| 151 |
+
|
| 152 |
+
### Text Embeddings (Works on HF Spaces)
|
| 153 |
+
|
| 154 |
+
```python
|
| 155 |
+
import requests
|
| 156 |
+
|
| 157 |
+
url = "https://YOUR_SPACE.hf.space/api/v1/embeddings/generate"
|
| 158 |
+
headers = {
|
| 159 |
+
"Content-Type": "application/json",
|
| 160 |
+
"x-api-key": "your-api-key"
|
| 161 |
+
}
|
| 162 |
+
data = {
|
| 163 |
+
"texts": [
|
| 164 |
+
"Hello, how are you?",
|
| 165 |
+
"Machine learning is fascinating"
|
| 166 |
+
],
|
| 167 |
+
"normalize": True
|
| 168 |
+
}
|
| 169 |
+
|
| 170 |
+
response = requests.post(url, headers=headers, json=data)
|
| 171 |
+
print(response.json())
|
| 172 |
+
```
|
| 173 |
+
|
| 174 |
+
### Audio File Transcription (Works on HF Spaces)
|
| 175 |
+
|
| 176 |
+
```python
|
| 177 |
+
import requests
|
| 178 |
+
|
| 179 |
+
url = "https://YOUR_SPACE.hf.space/api/v1/subtitles/transcribe"
|
| 180 |
+
headers = {"x-api-key": "your-api-key"}
|
| 181 |
+
|
| 182 |
+
with open("audio.mp3", "rb") as audio_file:
|
| 183 |
+
files = {"file": audio_file}
|
| 184 |
+
data = {"lang": "en"}
|
| 185 |
+
response = requests.post(url, headers=headers, files=files, data=data)
|
| 186 |
+
|
| 187 |
+
print(response.json())
|
| 188 |
+
```
|
| 189 |
+
|
| 190 |
+
### YouTube Extraction (Does NOT Work on HF Spaces)
|
| 191 |
+
|
| 192 |
+
```python
|
| 193 |
+
# ❌ This will fail on HF Spaces with a network error
|
| 194 |
+
import requests
|
| 195 |
+
|
| 196 |
+
url = "https://YOUR_SPACE.hf.space/api/v1/subtitles/extract"
|
| 197 |
+
headers = {
|
| 198 |
+
"Content-Type": "application/json",
|
| 199 |
+
"x-api-key": "your-api-key"
|
| 200 |
+
}
|
| 201 |
+
data = {
|
| 202 |
+
"url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
|
| 203 |
+
"lang": "en"
|
| 204 |
+
}
|
| 205 |
+
|
| 206 |
+
response = requests.post(url, headers=headers, json=data)
|
| 207 |
+
# Error: Network connectivity issue
|
| 208 |
+
```
|
| 209 |
+
|
| 210 |
+
## 🔧 Configuration
|
| 211 |
+
|
| 212 |
+
### Required Environment Variables
|
| 213 |
+
|
| 214 |
+
Set these in HF Spaces **Repository secrets**:
|
| 215 |
+
|
| 216 |
+
| Variable | Description | Example |
|
| 217 |
+
|----------|-------------|---------|
|
| 218 |
+
| `API_KEYS` | Comma-separated API keys | `key1,key2,key3` |
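To make the comma-separated format concrete, here is a minimal sketch of how such a value could be split and checked. The server's actual key handling is not shown in this commit, so `parse_api_keys` and `is_authorized` are hypothetical names used only for illustration.

```python
import hmac
from typing import Set


def parse_api_keys(raw: str) -> Set[str]:
    """Split a comma-separated API_KEYS value into a set of non-empty keys."""
    return {part.strip() for part in raw.split(",") if part.strip()}


def is_authorized(candidate: str, valid_keys: Set[str]) -> bool:
    """Check a presented key against every configured key."""
    matched = False
    for key in valid_keys:
        # compare_digest avoids leaking key contents through timing differences.
        if hmac.compare_digest(candidate.encode(), key.encode()):
            matched = True
    return matched


if __name__ == "__main__":
    keys = parse_api_keys("key1, key2,,key3")
    print(sorted(keys))
    print(is_authorized("key2", keys))
```

Stripping whitespace and dropping empty parts means a stray space or trailing comma in the secret does not create an unusable or empty key.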
|
| 219 |
+
|
| 220 |
+
### Optional Environment Variables
|
| 221 |
+
|
| 222 |
+
| Variable | Description | Default |
|
| 223 |
+
|----------|-------------|---------|
|
| 224 |
+
| `CORS_ORIGINS` | Allowed origins | `*` |
|
| 225 |
+
| `RATE_LIMIT_REQUESTS` | Requests per minute | `100` |
|
| 226 |
+
| `LOG_LEVEL` | Logging level | `INFO` |
|
| 227 |
+
| `WHISPER_MODEL` | Whisper model size | `base` |
|
| 228 |
+
| `EMBEDDING_MODEL` | HuggingFace model | `mixedbread-ai/mxbai-embed-large-v1` |
|
| 229 |
+
|
| 230 |
+
### Whisper Model Options
|
| 231 |
+
|
| 232 |
+
| Model | Parameters | Speed | Accuracy |
|
| 233 |
+
|-------|------------|-------|----------|
|
| 234 |
+
| `tiny` | 39 M | Fastest | Lowest |
|
| 235 |
+
| `base` | 74 M | Fast | Good |
|
| 236 |
+
| `small` | 244 M | Medium | Better |
|
| 237 |
+
| `medium` | 769 M | Slow | Best |
|
| 238 |
+
|
| 239 |
+
**Recommendation for HF Spaces:** Use `base` or `small` for a good balance of speed and accuracy on CPU-only hardware.
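A small guard around the `WHISPER_MODEL` variable can prevent a typo from selecting a model that does not exist. This is a sketch under assumptions: the function name `resolve_whisper_model` is hypothetical, and the allowed list only mirrors the sizes in the table above rather than the server's real configuration code.

```python
import os
from typing import Mapping, Optional

# Sizes listed in the table above; larger Whisper checkpoints are not covered here.
ALLOWED_MODELS = ("tiny", "base", "small", "medium")
DEFAULT_MODEL = "base"


def resolve_whisper_model(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the configured model size, falling back to the default if invalid."""
    env = os.environ if env is None else env
    name = env.get("WHISPER_MODEL", DEFAULT_MODEL).strip().lower()
    return name if name in ALLOWED_MODELS else DEFAULT_MODEL


if __name__ == "__main__":
    print(resolve_whisper_model())
```

Passing the environment as a mapping keeps the function easy to test without modifying `os.environ`.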
|
| 240 |
+
|
| 241 |
+
## 🐛 Troubleshooting
|
| 242 |
+
|
| 243 |
+
### Issue: Build fails with poetry.lock error
|
| 244 |
+
|
| 245 |
+
**Error:**
|
| 246 |
+
```
|
| 247 |
+
The lock file might not be compatible with the current version of Poetry
|
| 248 |
+
```
|
| 249 |
+
|
| 250 |
+
**Solution:**
|
| 251 |
+
```bash
|
| 252 |
+
poetry lock
|
| 253 |
+
git add poetry.lock
|
| 254 |
+
git commit -m "Update poetry.lock"
|
| 255 |
+
git push
|
| 256 |
+
```
|
| 257 |
+
|
| 258 |
+
### Issue: "Unauthorized" error
|
| 259 |
+
|
| 260 |
+
**Error:**
|
| 261 |
+
```json
|
| 262 |
+
{"detail": "Unauthorized: Invalid or missing API key"}
|
| 263 |
+
```
|
| 264 |
+
|
| 265 |
+
**Solution:**
|
| 266 |
+
- Verify `API_KEYS` secret is set in Space settings
|
| 267 |
+
- Include `x-api-key` header in your requests
|
| 268 |
+
- Check for typos in the API key
|
| 269 |
+
|
| 270 |
+
### Issue: YouTube extraction fails
|
| 271 |
+
|
| 272 |
+
**Error:**
|
| 273 |
+
```json
|
| 274 |
+
{
|
| 275 |
+
"status": "error",
|
| 276 |
+
"message": "Network connectivity issue: Unable to reach YouTube..."
|
| 277 |
+
}
|
| 278 |
+
```
|
| 279 |
+
|
| 280 |
+
**Solution:**
|
| 281 |
+
This is expected on HF Spaces. Use the audio upload endpoint instead:
|
| 282 |
+
1. Download audio locally with yt-dlp
|
| 283 |
+
2. Upload to `/api/v1/subtitles/transcribe`
|
| 284 |
+
|
| 285 |
+
### Issue: Out of memory
|
| 286 |
+
|
| 287 |
+
**Error:**
|
| 288 |
+
```
|
| 289 |
+
Container killed due to memory limit
|
| 290 |
+
```
|
| 291 |
+
|
| 292 |
+
**Solution:**
|
| 293 |
+
- Use a smaller Whisper model: `WHISPER_MODEL=tiny` or `WHISPER_MODEL=base`
|
| 294 |
+
- Process shorter audio files
|
| 295 |
+
- Consider upgrading to HF Spaces Pro (more RAM)
|
| 296 |
+
|
| 297 |
+
### Issue: Slow transcription
|
| 298 |
+
|
| 299 |
+
**Solution:**
|
| 300 |
+
- Use a smaller Whisper model (`tiny` or `base`)
|
| 301 |
+
- Process shorter audio segments
|
| 302 |
+
- Note: HF Spaces free tier uses CPU (no GPU)
|
| 303 |
+
|
| 304 |
+
## 📈 Performance Tips
|
| 305 |
+
|
| 306 |
+
### 1. Choose the Right Whisper Model
|
| 307 |
+
|
| 308 |
+
```bash
|
| 309 |
+
# Fast but less accurate (good for testing)
|
| 310 |
+
WHISPER_MODEL=tiny
|
| 311 |
+
|
| 312 |
+
# Balanced (recommended for production)
|
| 313 |
+
WHISPER_MODEL=base
|
| 314 |
+
|
| 315 |
+
# Accurate but slow (only if you need high quality)
|
| 316 |
+
WHISPER_MODEL=small
|
| 317 |
+
```
|
| 318 |
+
|
| 319 |
+
### 2. Optimize Audio Files
|
| 320 |
+
|
| 321 |
+
```bash
|
| 322 |
+
# Convert to optimal format before upload
|
| 323 |
+
ffmpeg -i input.wav -ar 16000 -ac 1 -c:a libmp3lame output.mp3
|
| 324 |
+
```
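When the conversion runs inside a script rather than by hand, the same command can be built as an argument list. The sketch below assumes `ffmpeg` is on the `PATH`; the function names are illustrative, and `-y` (overwrite the output file without prompting) is an addition to the command above so that a non-interactive run does not wait for input.

```python
import subprocess
from typing import List


def build_ffmpeg_cmd(src: str, dst: str, sample_rate: int = 16000) -> List[str]:
    """Build the ffmpeg argument list for a mono, resampled MP3."""
    return [
        "ffmpeg",
        "-y",                      # overwrite dst if it exists (added for scripting)
        "-i", src,                 # input file
        "-ar", str(sample_rate),   # output sample rate in Hz
        "-ac", "1",                # downmix to a single channel
        "-c:a", "libmp3lame",      # encode audio as MP3
        dst,
    ]


def convert(src: str, dst: str) -> None:
    """Run ffmpeg and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_ffmpeg_cmd(src, dst), check=True)


if __name__ == "__main__":
    print(" ".join(build_ffmpeg_cmd("input.wav", "output.mp3")))
```

Passing a list instead of a shell string avoids quoting problems with file names that contain spaces.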
|
| 325 |
+
|
| 326 |
+
### 3. Rate Limiting
|
| 327 |
+
|
| 328 |
+
The server has rate limiting enabled:
|
| 329 |
+
- Default: 100 requests per minute
|
| 330 |
+
- Adjust via `RATE_LIMIT_REQUESTS` environment variable
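Batch clients can stay under the limit by throttling themselves before sending. The following is a client-side sketch only: the server's own limiting algorithm is not shown in this commit, and the class name `SlidingWindowLimiter` and its parameters are assumptions chosen to match the default of 100 requests per minute.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Allow at most max_requests calls within any window_seconds interval."""

    def __init__(
        self,
        max_requests: int = 100,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._sent: Deque[float] = deque()

    def try_acquire(self) -> bool:
        """Record a request and return True, or return False if the window is full."""
        now = self._clock()
        # Forget requests that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            self._sent.append(now)
            return True
        return False


if __name__ == "__main__":
    limiter = SlidingWindowLimiter(max_requests=2)
    print([limiter.try_acquire() for _ in range(3)])
```

The injectable `clock` exists so the behaviour can be checked without sleeping; real callers can rely on the default monotonic clock.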
|
| 331 |
+
|
| 332 |
+
## 🔒 Security Best Practices
|
| 333 |
+
|
| 334 |
+
### 1. Use Strong API Keys
|
| 335 |
+
|
| 336 |
+
```bash
|
| 337 |
+
# Generate a secure API key
|
| 338 |
+
openssl rand -base64 32
|
| 339 |
+
```
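Where `openssl` is not available, Python's standard library offers an equivalent. This is a minimal sketch; `generate_api_key` is an illustrative name, and the 32-byte length simply mirrors the command above.

```python
import secrets


def generate_api_key(num_bytes: int = 32) -> str:
    """Return a URL-safe random key built from num_bytes of OS randomness."""
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # Generate three keys and join them in the comma-separated form API_KEYS expects.
    print(",".join(generate_api_key() for _ in range(3)))
```

The URL-safe alphabet contains no commas, so generated keys can be joined directly into a single `API_KEYS` value.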
|
| 340 |
+
|
| 341 |
+
### 2. Rotate Keys Regularly
|
| 342 |
+
|
| 343 |
+
Update `API_KEYS` in Space secrets monthly.
|
| 344 |
+
|
| 345 |
+
### 3. Monitor Usage
|
| 346 |
+
|
| 347 |
+
Check Space logs regularly:
|
| 348 |
+
- Settings → Logs
|
| 349 |
+
- Look for suspicious activity
|
| 350 |
+
|
| 351 |
+
### 4. Use Private Spaces for Sensitive Data
|
| 352 |
+
|
| 353 |
+
Consider making your Space private if handling sensitive content.
|
| 354 |
+
|
| 355 |
+
## 💰 Cost Considerations
|
| 356 |
+
|
| 357 |
+
### Free Tier
|
| 358 |
+
|
| 359 |
+
- ✅ Unlimited inference
|
| 360 |
+
- ✅ 16GB RAM
|
| 361 |
+
- ✅ 2 vCPU
|
| 362 |
+
- ⚠️ CPU-only (no GPU)
|
| 363 |
+
- ⚠️ May sleep after inactivity
|
| 364 |
+
|
| 365 |
+
### Spaces Pro ($5/month per Space)
|
| 366 |
+
|
| 367 |
+
- ✅ Always-on
|
| 368 |
+
- ✅ Better performance
|
| 369 |
+
- ✅ More resources
|
| 370 |
+
- ✅ Custom domains
|
| 371 |
+
|
| 372 |
+
## 🎓 Best Practices
|
| 373 |
+
|
| 374 |
+
### 1. Document the Workflow
|
| 375 |
+
|
| 376 |
+
Add a README to your Space explaining:
|
| 377 |
+
- How to download audio locally
|
| 378 |
+
- How to use the upload endpoint
|
| 379 |
+
- Supported audio formats
|
| 380 |
+
|
| 381 |
+
### 2. Provide Examples
|
| 382 |
+
|
| 383 |
+
Include example API calls and code snippets.
|
| 384 |
+
|
| 385 |
+
### 3. Set Expectations
|
| 386 |
+
|
| 387 |
+
Clearly state that YouTube direct extraction doesn't work on HF Spaces.
|
| 388 |
+
|
| 389 |
+
### 4. Offer Alternatives
|
| 390 |
+
|
| 391 |
+
Suggest self-hosted deployment for users who need YouTube extraction.
|
| 392 |
+
|
| 393 |
+
## 🚀 Alternative Deployment
|
| 394 |
+
|
| 395 |
+
If you need YouTube extraction, consider:
|
| 396 |
+
|
| 397 |
+
### Self-Hosted Options
|
| 398 |
+
|
| 399 |
+
1. **Docker on VPS** (DigitalOcean, Linode)
|
| 400 |
+
- Cost: $4-12/month
|
| 401 |
+
- Full control
|
| 402 |
+
- All features work
|
| 403 |
+
|
| 404 |
+
2. **Cloud Platforms** (AWS, GCP, Azure)
|
| 405 |
+
- Scalable
|
| 406 |
+
- More expensive
|
| 407 |
+
- Enterprise-grade
|
| 408 |
+
|
| 409 |
+
3. **Railway/Render**
|
| 410 |
+
- Easy deployment
|
| 411 |
+
- $5-20/month
|
| 412 |
+
- Good middle ground
|
| 413 |
+
|
| 414 |
+
## 📚 Additional Resources
|
| 415 |
+
|
| 416 |
+
- [Hugging Face Spaces Documentation](https://huggingface.co/docs/hub/spaces)
|
| 417 |
+
- [yt-dlp Documentation](https://github.com/yt-dlp/yt-dlp)
|
| 418 |
+
- [Whisper Model Information](https://github.com/openai/whisper)
|
| 419 |
+
- [FastAPI Documentation](https://fastapi.tiangolo.com/)
|
| 420 |
+
|
| 421 |
+
## 🆘 Support
|
| 422 |
+
|
| 423 |
+
For issues:
|
| 424 |
+
1. Check Space logs (Settings → Logs)
|
| 425 |
+
2. Verify environment variables are set
|
| 426 |
+
3. Test with simple requests first
|
| 427 |
+
4. Check API key is correct
|
| 428 |
+
5. Review this guide for common issues
|
| 429 |
+
|
| 430 |
+
## ✅ Success Checklist
|
| 431 |
+
|
| 432 |
+
After deployment, verify:
|
| 433 |
+
|
| 434 |
+
- [ ] Space builds successfully
|
| 435 |
+
- [ ] Health check works: `GET /health`
|
| 436 |
+
- [ ] Embeddings endpoint works
|
| 437 |
+
- [ ] Audio upload endpoint works
|
| 438 |
+
- [ ] API key authentication works
|
| 439 |
+
- [ ] Rate limiting is configured
|
| 440 |
+
- [ ] Documentation is clear for users
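The health-check item can be automated with a few lines of standard-library Python. The sketch assumes `GET /health` needs no API key and returns HTTP 200 when the service is up; neither detail is confirmed by this guide, and the function names are illustrative.

```python
from typing import Callable
from urllib import error, request


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code of a GET request, including error statuses."""
    try:
        with request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except error.HTTPError as exc:
        # 4xx/5xx responses arrive as exceptions; report their status code.
        return exc.code
    # Connection failures (urllib.error.URLError) are left to propagate.


def check_health(base_url: str, get_status: Callable[[str], int] = http_status) -> bool:
    """Return True if the Space's /health endpoint answers with 200."""
    return get_status(base_url.rstrip("/") + "/health") == 200


if __name__ == "__main__":
    # Replace YOUR_SPACE with the real Space subdomain before running.
    print(check_health("https://YOUR_SPACE.hf.space"))
```

The `get_status` parameter lets the check be exercised with a stub, which is useful in CI where the Space may not be reachable.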
|
| 441 |
+
|
| 442 |
+
**Your HF Space is ready to use! 🎉**
|
README.md
CHANGED
|
@@ -34,10 +34,7 @@ A centralized, extensible FastAPI server providing reusable APIs with robust aut
|
|
| 34 |
| **Subtitles** | `POST /api/v1/subtitles/transcribe` | Transcribe uploaded audio with Whisper ✅ |
|
| 35 |
| **Embeddings** | `POST /api/v1/embeddings/generate` | Generate text embeddings (1024-dim) |
|
| 36 |
|
| 37 |
-
> ⚠️ **Note:** The YouTube extraction endpoint requires external network access
|
| 38 |
-
> - ✅ Use the `/transcribe` endpoint (upload audio files)
|
| 39 |
-
> - ✅ Deploy a [proxy service](#bypassing-hf-spaces-restrictions) to enable YouTube downloads
|
| 40 |
-
> - ⚠️ Or use self-hosted deployment for direct access
|
| 41 |
|
| 42 |
## Quick Start
|
| 43 |
|
|
@@ -72,7 +69,6 @@ docker run -p 7860:7860 -e API_KEYS=your-key multiutility-server
|
|
| 72 |
| `LOG_LEVEL` | Logging level | `INFO` |
|
| 73 |
| `WHISPER_MODEL` | Whisper model size | `base` |
|
| 74 |
| `EMBEDDING_MODEL` | HuggingFace model | `mixedbread-ai/mxbai-embed-large-v1` |
|
| 75 |
-
| `YOUTUBE_PROXY_URL` | Proxy service URL (optional) | - |
|
| 76 |
|
| 77 |
## API Usage
|
| 78 |
|
|
@@ -88,7 +84,7 @@ curl -H "x-api-key: your-api-key" http://localhost:8000/api/v1/...
|
|
| 88 |
|
| 89 |
#### Extract from YouTube URL
|
| 90 |
|
| 91 |
-
> ⚠️ **Important:** This endpoint requires network access to YouTube
|
| 92 |
|
| 93 |
```bash
|
| 94 |
curl -X POST http://localhost:8000/api/v1/subtitles/extract \
|
|
@@ -165,19 +161,22 @@ app/
|
|
| 165 |
|
| 166 |
### Hugging Face Spaces
|
| 167 |
|
| 168 |
-
⚠️ **Network Limitation:** HF Spaces blocks external internet access
|
| 169 |
|
| 170 |
-
**
|
| 171 |
- ✅ `/api/v1/subtitles/transcribe` - Upload audio files for transcription
|
| 172 |
- ✅ `/api/v1/embeddings/generate` - Generate text embeddings
|
| 173 |
-
-
|
| 174 |
|
| 175 |
-
**
|
| 176 |
1. Create a Docker Space
|
| 177 |
2. Set `API_KEYS` secret in Space settings
|
| 178 |
3. Push repository
|
| 179 |
|
| 180 |
-
**
|
|
|
|
|
|
|
|
|
|
| 181 |
|
| 182 |
### Docker Compose
|
| 183 |
|
|
@@ -185,70 +184,29 @@ app/
|
|
| 185 |
docker-compose up --build
|
| 186 |
```
|
| 187 |
|
| 188 |
-
##
|
| 189 |
|
| 190 |
-
|
| 191 |
-
Hugging Face Spaces blocks external network access, preventing YouTube downloads.
|
| 192 |
|
| 193 |
-
###
|
| 194 |
-
Deploy the included proxy service on a platform **with** internet access (Railway, Render, etc.) to act as an intermediary.
|
| 195 |
|
|
|
|
|
|
|
|
|
|
| 196 |
```
|
| 197 |
-
HF Spaces → Proxy Service → YouTube → Proxy → HF Spaces → Whisper
|
| 198 |
-
```
|
| 199 |
|
| 200 |
-
###
|
| 201 |
-
|
| 202 |
-
|
| 203 |
-
|
| 204 |
-
|
| 205 |
-
|
| 206 |
-
|
| 207 |
-
|
| 208 |
-
|
| 209 |
-
|
| 210 |
-
|
| 211 |
-
|
| 212 |
-
|
| 213 |
-
**Render.com:**
|
| 214 |
-
- Push `proxy_service/` to GitHub
|
| 215 |
-
- Create new Web Service on Render
|
| 216 |
-
- Connect repo, Render auto-detects configuration
|
| 217 |
-
|
| 218 |
-
**Docker (Self-hosted):**
|
| 219 |
-
```bash
|
| 220 |
-
cd proxy_service
|
| 221 |
-
docker build -t youtube-proxy .
|
| 222 |
-
docker run -p 8080:8080 youtube-proxy
|
| 223 |
-
```
|
| 224 |
-
|
| 225 |
-
2. **Configure Main Server:**
|
| 226 |
-
```bash
|
| 227 |
-
# In HF Spaces secrets or .env file
|
| 228 |
-
YOUTUBE_PROXY_URL=https://your-proxy.railway.app/download
|
| 229 |
-
```
|
| 230 |
-
|
| 231 |
-
3. **Test:**
|
| 232 |
-
```bash
|
| 233 |
-
curl -X POST https://your-space.hf.space/api/v1/subtitles/extract \
|
| 234 |
-
-H "Content-Type: application/json" \
|
| 235 |
-
-H "x-api-key: your-key" \
|
| 236 |
-
-d '{"url": "https://youtube.com/watch?v=dQw4w9WgXcQ", "lang": "en"}'
|
| 237 |
-
```
|
| 238 |
-
|
| 239 |
-
### How It Works
|
| 240 |
-
1. Main server tries direct YouTube download
|
| 241 |
-
2. If blocked (network error), automatically falls back to proxy
|
| 242 |
-
3. Proxy downloads audio and returns file
|
| 243 |
-
4. Main server transcribes with Whisper
|
| 244 |
-
|
| 245 |
-
See `proxy_service/README.md` for detailed deployment instructions and platform comparisons.
|
| 246 |
-
|
| 247 |
-
### Free Deployment Options
|
| 248 |
-
- **Railway:** 500 hours/month free
|
| 249 |
-
- **Render:** Free tier with auto-sleep
|
| 250 |
-
- **Fly.io:** 3 VMs free tier
|
| 251 |
-
- **Google Cloud Run:** 2M requests/month free
|
| 252 |
|
| 253 |
## Development
|
| 254 |
|
|
|
|
| 34 |
| **Subtitles** | `POST /api/v1/subtitles/transcribe` | Transcribe uploaded audio with Whisper ✅ |
|
| 35 |
| **Embeddings** | `POST /api/v1/embeddings/generate` | Generate text embeddings (1024-dim) |
|
| 36 |
|
| 37 |
+
> ⚠️ **Note on HF Spaces:** The YouTube extraction endpoint (`/extract`) requires external network access and will **not work on Hugging Face Spaces** due to platform restrictions. Instead, use the **audio file upload endpoint** (`/transcribe`) which works perfectly on HF Spaces. For YouTube extraction, use a self-hosted deployment.
|
|
|
|
|
|
|
|
|
|
| 38 |
|
| 39 |
## Quick Start
|
| 40 |
|
|
|
|
| 69 |
| `LOG_LEVEL` | Logging level | `INFO` |
|
| 70 |
| `WHISPER_MODEL` | Whisper model size | `base` |
|
| 71 |
| `EMBEDDING_MODEL` | HuggingFace model | `mixedbread-ai/mxbai-embed-large-v1` |
|
|
|
|
| 72 |
|
| 73 |
## API Usage
|
| 74 |
|
|
|
|
| 84 |
|
| 85 |
#### Extract from YouTube URL
|
| 86 |
|
| 87 |
+
> ⚠️ **Important:** This endpoint requires network access to YouTube and will **not work on Hugging Face Spaces**. Use the audio file upload endpoint below instead, or deploy on a self-hosted environment.
|
| 88 |
|
| 89 |
```bash
|
| 90 |
curl -X POST http://localhost:8000/api/v1/subtitles/extract \
|
|
|
|
| 161 |
|
| 162 |
### Hugging Face Spaces
|
| 163 |
|
| 164 |
+
⚠️ **Network Limitation:** HF Spaces blocks external internet access, so YouTube downloads are not possible.
|
| 165 |
|
| 166 |
+
**What works on HF Spaces:**
|
| 167 |
- ✅ `/api/v1/subtitles/transcribe` - Upload audio files for transcription
|
| 168 |
- ✅ `/api/v1/embeddings/generate` - Generate text embeddings
|
| 169 |
+
- ❌ `/api/v1/subtitles/extract` - YouTube downloads (requires self-hosted deployment)
|
| 170 |
|
| 171 |
+
**Setup:**
|
| 172 |
1. Create a Docker Space
|
| 173 |
2. Set `API_KEYS` secret in Space settings
|
| 174 |
3. Push repository
|
| 175 |
|
| 176 |
+
**Recommended workflow for subtitles:**
|
| 177 |
+
1. Download audio locally using [yt-dlp](https://github.com/yt-dlp/yt-dlp): `yt-dlp -x --audio-format mp3 VIDEO_URL`
|
| 178 |
+
2. Upload the audio file to `/api/v1/subtitles/transcribe` endpoint
|
| 179 |
+
3. Receive transcription from Whisper
|
| 180 |
|
| 181 |
### Docker Compose
|
| 182 |
|
|
|
|
| 184 |
docker-compose up --build
|
| 185 |
```
|
| 186 |
|
| 187 |
+
## Alternative: Self-Hosted Deployment for YouTube Extraction
|
| 188 |
|
| 189 |
+
If you need YouTube subtitle extraction, deploy the server on a platform with internet access:
|
|
|
|
| 190 |
|
| 191 |
+
### Docker (VPS/Cloud VM)
|
|
|
|
| 192 |
|
| 193 |
+
```bash
|
| 194 |
+
docker build -t multiutility-server .
|
| 195 |
+
docker run -p 7860:7860 -e API_KEYS=your-key multiutility-server
|
| 196 |
```
|
|
|
|
|
|
|
| 197 |
|
| 198 |
+
### Cloud Platforms
|
| 199 |
+
|
| 200 |
+
- **Railway:** Direct Docker deployment
|
| 201 |
+
- **Render:** Connect GitHub repo, auto-deploy
|
| 202 |
+
- **DigitalOcean:** Deploy on Droplet ($4-12/month)
|
| 203 |
+
- **AWS/GCP/Azure:** Use ECS, Cloud Run, or App Service
|
| 204 |
+
|
| 205 |
+
### Benefits of Self-Hosted
|
| 206 |
+
- ✅ Direct YouTube access (no restrictions)
|
| 207 |
+
- ✅ Full control over resources
|
| 208 |
+
- ✅ No usage limits
|
| 209 |
+
- ✅ All features work natively
|
| 210 |
|
| 211 |
## Development
|
| 212 |
|
app/apis/subtitles/service.py
CHANGED
|
@@ -8,7 +8,6 @@ import threading
|
|
| 8 |
from pathlib import Path
|
| 9 |
from typing import TYPE_CHECKING, List, Optional, Tuple
|
| 10 |
|
| 11 |
-
import httpx
|
| 12 |
from cachetools import TTLCache
|
| 13 |
|
| 14 |
from app.apis.subtitles.utils import extract_video_id
|
|
@@ -79,10 +78,7 @@ class SubtitleService:
|
|
| 79 |
return SUBTITLE_CACHE[cache_key]
|
| 80 |
|
| 81 |
with tempfile.TemporaryDirectory() as temp_dir:
|
| 82 |
-
|
| 83 |
-
audio_path = await self._download_audio_with_fallback(
|
| 84 |
-
url, temp_dir, video_id
|
| 85 |
-
)
|
| 86 |
|
| 87 |
if not audio_path or not audio_path.exists():
|
| 88 |
raise SubtitleExtractionError("Failed to download audio from video")
|
|
@@ -96,51 +92,8 @@ class SubtitleService:
|
|
| 96 |
SUBTITLE_CACHE[cache_key] = result
|
| 97 |
return result
|
| 98 |
|
| 99 |
-
async def _download_audio_with_fallback(
|
| 100 |
-
self, url: str, temp_dir: str, video_id: str
|
| 101 |
-
) -> Path:
|
| 102 |
-
"""
|
| 103 |
-
Download audio with fallback to proxy service.
|
| 104 |
-
|
| 105 |
-
Tries direct yt-dlp download first. If that fails due to network restrictions
|
| 106 |
-
(e.g., on HF Spaces), falls back to proxy service if configured.
|
| 107 |
-
"""
|
| 108 |
-
try:
|
| 109 |
-
# Try direct download first
|
| 110 |
-
return await self._download_audio(url, temp_dir, video_id)
|
| 111 |
-
except SubtitleExtractionError as e:
|
| 112 |
-
error_msg = str(e)
|
| 113 |
-
|
| 114 |
-
# Check if it's a network connectivity issue
|
| 115 |
-
if (
|
| 116 |
-
"Network connectivity issue" in error_msg
|
| 117 |
-
or "Failed to resolve" in error_msg
|
| 118 |
-
):
|
| 119 |
-
# Try proxy service if configured
|
| 120 |
-
if settings.youtube_proxy_url:
|
| 121 |
-
logger.info(
|
| 122 |
-
f"Direct download failed, attempting proxy download via {settings.youtube_proxy_url}"
|
| 123 |
-
)
|
| 124 |
-
try:
|
| 125 |
-
return await self._download_audio_via_proxy(
|
| 126 |
-
url, temp_dir, video_id
|
| 127 |
-
)
|
| 128 |
-
except Exception as proxy_error:
|
| 129 |
-
logger.error(f"Proxy download also failed: {proxy_error}")
|
| 130 |
-
raise SubtitleExtractionError(
|
| 131 |
-
f"Both direct and proxy downloads failed. Direct: {error_msg}. "
|
| 132 |
-
f"Proxy: {str(proxy_error)}"
|
| 133 |
-
)
|
| 134 |
-
else:
|
| 135 |
-
logger.warning(
|
| 136 |
-
"No proxy service configured, cannot bypass network restriction"
|
| 137 |
-
)
|
| 138 |
-
|
| 139 |
-
# Re-raise original error if not a network issue or no proxy available
|
| 140 |
-
raise
|
| 141 |
-
|
| 142 |
async def _download_audio(self, url: str, temp_dir: str, video_id: str) -> Path:
|
| 143 |
-
"""Download audio from video URL using yt-dlp
|
| 144 |
cmd = [
|
| 145 |
sys.executable,
|
| 146 |
"-m",
|
|
@@ -178,7 +131,8 @@ class SubtitleService:
|
|
| 178 |
raise SubtitleExtractionError(
|
| 179 |
"Network connectivity issue: Unable to reach YouTube. "
|
| 180 |
"This service may be running in a sandboxed environment (e.g., Hugging Face Spaces) "
|
| 181 |
-
"that blocks external internet access. Please use
|
|
|
|
| 182 |
)
|
| 183 |
|
| 184 |
if "Video unavailable" in error_msg or "Private video" in error_msg:
|
|
@@ -201,50 +155,6 @@ class SubtitleService:
|
|
| 201 |
except asyncio.TimeoutError:
|
| 202 |
raise DownloadTimeoutError("Timeout while downloading audio")
|
| 203 |
|
| 204 |
-
async def _download_audio_via_proxy(
|
| 205 |
-
self, url: str, temp_dir: str, video_id: str
|
| 206 |
-
) -> Path:
|
| 207 |
-
"""
|
| 208 |
-
Download audio via external proxy service.
|
| 209 |
-
|
| 210 |
-
The proxy service should accept POST requests with JSON body:
|
| 211 |
-
{"url": "youtube_url"} and return the audio file.
|
| 212 |
-
"""
|
| 213 |
-
if not settings.youtube_proxy_url:
|
| 214 |
-
raise SubtitleExtractionError("Proxy URL not configured")
|
| 215 |
-
|
| 216 |
-
output_path = Path(temp_dir) / f"{video_id}.mp3"
|
| 217 |
-
|
| 218 |
-
logger.info(f"Requesting audio download from proxy: {url}")
|
| 219 |
-
|
| 220 |
-
try:
|
| 221 |
-
async with httpx.AsyncClient(timeout=self.timeout_download) as client:
|
| 222 |
-
response = await client.post(
|
| 223 |
-
settings.youtube_proxy_url,
|
| 224 |
-
json={"url": url, "format": "mp3"},
|
| 225 |
-
follow_redirects=True,
|
| 226 |
-
)
|
| 227 |
-
|
| 228 |
-
if response.status_code != 200:
|
| 229 |
-
error_msg = response.text[:200]
|
| 230 |
-
raise SubtitleExtractionError(
|
| 231 |
-
f"Proxy service returned status {response.status_code}: {error_msg}"
|
| 232 |
-
)
|
| 233 |
-
|
| 234 |
-
# Save the downloaded audio
|
| 235 |
-
output_path.write_bytes(response.content)
|
| 236 |
-
|
| 237 |
-
if not output_path.exists() or output_path.stat().st_size == 0:
|
| 238 |
-
raise SubtitleExtractionError("Proxy returned empty audio file")
|
| 239 |
-
|
| 240 |
-
logger.info(f"Audio downloaded via proxy: {output_path}")
|
| 241 |
-
return output_path
|
| 242 |
-
|
| 243 |
-
except httpx.TimeoutException:
|
| 244 |
-
raise DownloadTimeoutError("Timeout while downloading audio via proxy")
|
| 245 |
-
except httpx.RequestError as e:
|
| 246 |
-
raise SubtitleExtractionError(f"Proxy request failed: {str(e)}")
|
| 247 |
-
|
| 248 |
async def _transcribe_audio(self, audio_path: Path, lang: str) -> List[str]:
|
| 249 |
"""Transcribe audio file using Whisper."""
|
| 250 |
self._load_whisper_model()
|
|
|
|
| 8 |
from pathlib import Path
|
| 9 |
from typing import TYPE_CHECKING, List, Optional, Tuple
|
| 10 |
|
|
|
|
| 11 |
from cachetools import TTLCache
|
| 12 |
|
| 13 |
from app.apis.subtitles.utils import extract_video_id
|
|
|
|
| 78 |
return SUBTITLE_CACHE[cache_key]
|
| 79 |
|
| 80 |
with tempfile.TemporaryDirectory() as temp_dir:
|
| 81 |
+
audio_path = await self._download_audio(url, temp_dir, video_id)
|
| 82 |
|
| 83 |
if not audio_path or not audio_path.exists():
|
| 84 |
raise SubtitleExtractionError("Failed to download audio from video")
|
|
|
|
| 92 |
SUBTITLE_CACHE[cache_key] = result
|
| 93 |
return result
|
| 94 |
|
| 95 |
async def _download_audio(self, url: str, temp_dir: str, video_id: str) -> Path:
|
| 96 |
+
"""Download audio from video URL using yt-dlp."""
|
| 97 |
cmd = [
|
| 98 |
sys.executable,
|
| 99 |
"-m",
|
|
|
|
| 131 |
raise SubtitleExtractionError(
|
| 132 |
"Network connectivity issue: Unable to reach YouTube. "
|
| 133 |
"This service may be running in a sandboxed environment (e.g., Hugging Face Spaces) "
|
| 134 |
+
"that blocks external internet access. Please use the audio file upload endpoint "
|
| 135 |
+
"(/api/v1/subtitles/transcribe) instead, or use a self-hosted deployment."
|
| 136 |
)
|
| 137 |
|
| 138 |
if "Video unavailable" in error_msg or "Private video" in error_msg:
|
|
|
|
| 155 |
except asyncio.TimeoutError:
|
| 156 |
raise DownloadTimeoutError("Timeout while downloading audio")
|
| 157 |
|
| 158 |
async def _transcribe_audio(self, audio_path: Path, lang: str) -> List[str]:
|
| 159 |
"""Transcribe audio file using Whisper."""
|
| 160 |
self._load_whisper_model()
|
app/core/config.py
CHANGED
|
@@ -31,9 +31,6 @@ class Settings(BaseSettings):
|
|
| 31 |
# Embedding configuration
|
| 32 |
embedding_model: str = "mixedbread-ai/mxbai-embed-large-v1"
|
| 33 |
|
| 34 |
-
# Proxy configuration for bypassing HF Spaces network restrictions
|
| 35 |
-
youtube_proxy_url: str = "" # Optional proxy service URL for YouTube downloads
|
| 36 |
-
|
| 37 |
# Server configuration
|
| 38 |
host: str = "0.0.0.0"
|
| 39 |
port: int = 8000
|
|
|
|
| 31 |
# Embedding configuration
|
| 32 |
embedding_model: str = "mixedbread-ai/mxbai-embed-large-v1"
|
| 33 |
|
| 34 |
# Server configuration
|
| 35 |
host: str = "0.0.0.0"
|
| 36 |
port: int = 8000
|
poetry.lock
CHANGED
|
@@ -672,28 +672,6 @@ files = [
|
|
| 672 |
[package.extras]
|
| 673 |
tests = ["pytest"]
|
| 674 |
|
| 675 |
-
[[package]]
|
| 676 |
-
name = "httpcore"
|
| 677 |
-
version = "1.0.9"
|
| 678 |
-
description = "A minimal low-level HTTP client."
|
| 679 |
-
optional = false
|
| 680 |
-
python-versions = ">=3.8"
|
| 681 |
-
groups = ["main"]
|
| 682 |
-
files = [
|
| 683 |
-
{file = "httpcore-1.0.9-py3-none-any.whl", hash = "sha256:2d400746a40668fc9dec9810239072b40b4484b640a8c38fd654a024c7a1bf55"},
|
| 684 |
-
{file = "httpcore-1.0.9.tar.gz", hash = "sha256:6e34463af53fd2ab5d807f399a9b45ea31c3dfa2276f15a2c3f00afff6e176e8"},
|
| 685 |
-
]
|
| 686 |
-
|
| 687 |
-
[package.dependencies]
|
| 688 |
-
certifi = "*"
|
| 689 |
-
h11 = ">=0.16"
|
| 690 |
-
|
| 691 |
-
[package.extras]
|
| 692 |
-
asyncio = ["anyio (>=4.0,<5.0)"]
|
| 693 |
-
http2 = ["h2 (>=3,<5)"]
|
| 694 |
-
socks = ["socksio (==1.*)"]
|
| 695 |
-
trio = ["trio (>=0.22.0,<1.0)"]
|
| 696 |
-
|
| 697 |
[[package]]
|
| 698 |
name = "httptools"
|
| 699 |
version = "0.6.4"
|
|
@@ -750,31 +728,6 @@ files = [
|
|
| 750 |
[package.extras]
|
| 751 |
test = ["Cython (>=0.29.24)"]
|
| 752 |
|
| 753 |
-
[[package]]
|
| 754 |
-
name = "httpx"
|
| 755 |
-
version = "0.25.2"
|
| 756 |
-
description = "The next generation HTTP client."
|
| 757 |
-
optional = false
|
| 758 |
-
python-versions = ">=3.8"
|
| 759 |
-
groups = ["main"]
|
| 760 |
-
files = [
|
| 761 |
-
{file = "httpx-0.25.2-py3-none-any.whl", hash = "sha256:a05d3d052d9b2dfce0e3896636467f8a5342fb2b902c819428e1ac65413ca118"},
|
| 762 |
-
{file = "httpx-0.25.2.tar.gz", hash = "sha256:8b8fcaa0c8ea7b05edd69a094e63a2094c4efcb48129fb757361bc423c0ad9e8"},
|
| 763 |
-
]
|
| 764 |
-
|
| 765 |
-
[package.dependencies]
|
| 766 |
-
anyio = "*"
|
| 767 |
-
certifi = "*"
|
| 768 |
-
httpcore = "==1.*"
|
| 769 |
-
idna = "*"
|
| 770 |
-
sniffio = "*"
|
| 771 |
-
|
| 772 |
-
[package.extras]
|
| 773 |
-
brotli = ["brotli ; platform_python_implementation == \"CPython\"", "brotlicffi ; platform_python_implementation != \"CPython\""]
|
| 774 |
-
cli = ["click (==8.*)", "pygments (==2.*)", "rich (>=10,<14)"]
|
| 775 |
-
http2 = ["h2 (>=3,<5)"]
|
| 776 |
-
socks = ["socksio (==1.*)"]
|
| 777 |
-
|
| 778 |
[[package]]
|
| 779 |
name = "huggingface-hub"
|
| 780 |
version = "0.36.0"
|
|
@@ -3206,4 +3159,4 @@ test = ["pytest (>=8.1,<9.0)", "pytest-rerunfailures (>=14.0,<15.0)"]
|
|
| 3206 |
[metadata]
|
| 3207 |
lock-version = "2.1"
|
| 3208 |
python-versions = "^3.11"
|
| 3209 |
-
content-hash = "
|
|
|
|
| 672 |
[package.extras]
|
| 673 |
tests = ["pytest"]
|
| 674 |
|
| 675 |
[[package]]
|
| 676 |
name = "httptools"
|
| 677 |
version = "0.6.4"
|
|
|
|
| 728 |
[package.extras]
|
| 729 |
test = ["Cython (>=0.29.24)"]
|
| 730 |
|
| 731 |
[[package]]
|
| 732 |
name = "huggingface-hub"
|
| 733 |
version = "0.36.0"
|
|
|
|
| 3159 |
[metadata]
|
| 3160 |
lock-version = "2.1"
|
| 3161 |
python-versions = "^3.11"
|
| 3162 |
+
content-hash = "ec39fc9067b87ef79eb93b123db27d3f8f462a61b46f0475263bd2a431f65fea"
|
proxy_service/Dockerfile
DELETED
|
@@ -1,35 +0,0 @@
|
|
| 1 |
-
# Dockerfile for YouTube Audio Proxy Service
|
| 2 |
-
# Lightweight FastAPI service for downloading YouTube audio
|
| 3 |
-
|
| 4 |
-
FROM python:3.11-slim
|
| 5 |
-
|
| 6 |
-
# Set working directory
|
| 7 |
-
WORKDIR /app
|
| 8 |
-
|
| 9 |
-
# Install system dependencies for yt-dlp
|
| 10 |
-
RUN apt-get update && apt-get install -y \
|
| 11 |
-
ffmpeg \
|
| 12 |
-
&& rm -rf /var/lib/apt/lists/*
|
| 13 |
-
|
| 14 |
-
# Copy requirements first for better caching
|
| 15 |
-
COPY requirements.txt .
|
| 16 |
-
|
| 17 |
-
# Install Python dependencies
|
| 18 |
-
RUN pip install --no-cache-dir -r requirements.txt
|
| 19 |
-
|
| 20 |
-
# Copy application code
|
| 21 |
-
COPY main.py .
|
| 22 |
-
|
| 23 |
-
# Expose port
|
| 24 |
-
EXPOSE 8080
|
| 25 |
-
|
| 26 |
-
# Set environment variables
|
| 27 |
-
ENV PORT=8080
|
| 28 |
-
ENV PYTHONUNBUFFERED=1
|
| 29 |
-
|
| 30 |
-
# Health check
|
| 31 |
-
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
|
| 32 |
-
CMD python -c "import requests; requests.get('http://localhost:8080/health')"
|
| 33 |
-
|
| 34 |
-
# Run the application
|
| 35 |
-
CMD ["python", "main.py"]
|
proxy_service/README.md
DELETED
|
@@ -1,409 +0,0 @@
|
|
| 1 |
-
# YouTube Audio Proxy Service
|
| 2 |
-
|
| 3 |
-
A lightweight FastAPI microservice that downloads YouTube audio files. Designed to bypass network restrictions in sandboxed environments like Hugging Face Spaces.
|
| 4 |
-
|
| 5 |
-
## 🎯 Purpose
|
| 6 |
-
|
| 7 |
-
Hugging Face Spaces and similar platforms block external internet access for security reasons. This proxy service runs on a platform **with** internet access and acts as an intermediary for YouTube downloads.
|
| 8 |
-
|
| 9 |
-
## 🏗️ Architecture
|
| 10 |
-
|
| 11 |
-
```
|
| 12 |
-
┌─────────────────────┐ ┌──────────────────┐ ┌─────────────┐
|
| 13 |
-
│ HF Spaces Server │ ─────▶ │ Proxy Service │ ─────▶ │ YouTube │
|
| 14 |
-
│ (No Internet) │ │ (Has Internet) │ │ │
|
| 15 |
-
└─────────────────────┘ └──────────────────┘ └─────────────┘
|
| 16 |
-
│ │
|
| 17 |
-
│ │
|
| 18 |
-
▼ ▼
|
| 19 |
-
Transcribes Downloads Audio
|
| 20 |
-
with Whisper & Returns File
|
| 21 |
-
```
|
| 22 |
-
|
| 23 |
-
## 🚀 Quick Start
|
| 24 |
-
|
| 25 |
-
### Local Testing
|
| 26 |
-
|
| 27 |
-
```bash
|
| 28 |
-
cd proxy_service
|
| 29 |
-
|
| 30 |
-
# Install dependencies
|
| 31 |
-
pip install -r requirements.txt
|
| 32 |
-
|
| 33 |
-
# Run server
|
| 34 |
-
python main.py
|
| 35 |
-
```
|
| 36 |
-
|
| 37 |
-
Server starts at `http://localhost:8080`
|
| 38 |
-
|
| 39 |
-
### Test the Endpoint
|
| 40 |
-
|
| 41 |
-
```bash
|
| 42 |
-
curl -X POST http://localhost:8080/download \
|
| 43 |
-
-H "Content-Type: application/json" \
|
| 44 |
-
-d '{"url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ", "format": "mp3"}'
|
| 45 |
-
```
|
| 46 |
-
|
| 47 |
-
## 📦 Deployment Options
|
| 48 |
-
|
| 49 |
-
### Option 1: Railway.app (Recommended - Free Tier Available)
|
| 50 |
-
|
| 51 |
-
1. **Install Railway CLI:**
|
| 52 |
-
```bash
|
| 53 |
-
npm install -g @railway/cli
|
| 54 |
-
```
|
| 55 |
-
|
| 56 |
-
2. **Login and deploy:**
|
| 57 |
-
```bash
|
| 58 |
-
railway login
|
| 59 |
-
railway init
|
| 60 |
-
railway up
|
| 61 |
-
```
|
| 62 |
-
|
| 63 |
-
3. **Get your service URL:**
|
| 64 |
-
```bash
|
| 65 |
-
railway domain
|
| 66 |
-
```
|
| 67 |
-
|
| 68 |
-
4. **Configure main server:**
|
| 69 |
-
```bash
|
| 70 |
-
# In your main .env file
|
| 71 |
-
YOUTUBE_PROXY_URL=https://your-service.railway.app/download
|
| 72 |
-
```
|
| 73 |
-
|
| 74 |
-
**Pros:** Easy deployment, free tier, automatic HTTPS, good performance
|
| 75 |
-
**Cons:** Free tier has usage limits
|
| 76 |
-
|
| 77 |
-
---
|
| 78 |
-
|
| 79 |
-
### Option 2: Render.com (Free Tier Available)
|
| 80 |
-
|
| 81 |
-
1. **Create a new Web Service** on [Render.com](https://render.com)
|
| 82 |
-
|
| 83 |
-
2. **Connect your Git repository** or deploy manually
|
| 84 |
-
|
| 85 |
-
3. **Configure:**
|
| 86 |
-
- Build Command: `pip install -r requirements.txt`
|
| 87 |
-
- Start Command: `python main.py`
|
| 88 |
-
- Or use the included `render.yaml` for automatic configuration
|
| 89 |
-
|
| 90 |
-
4. **Copy the service URL** (e.g., `https://your-service.onrender.com`)
|
| 91 |
-
|
| 92 |
-
5. **Update main server:**
|
| 93 |
-
```bash
|
| 94 |
-
YOUTUBE_PROXY_URL=https://your-service.onrender.com/download
|
| 95 |
-
```
|
| 96 |
-
|
| 97 |
-
**Pros:** Free tier, simple setup, automatic SSL
|
| 98 |
-
**Cons:** Free tier sleeps after inactivity (cold starts)
|
| 99 |
-
|
| 100 |
-
---
|
| 101 |
-
|
| 102 |
-
### Option 3: Docker (Self-Hosted)
|
| 103 |
-
|
| 104 |
-
```bash
|
| 105 |
-
# Build image
|
| 106 |
-
docker build -t youtube-proxy .
|
| 107 |
-
|
| 108 |
-
# Run container
|
| 109 |
-
docker run -p 8080:8080 youtube-proxy
|
| 110 |
-
|
| 111 |
-
# Or use docker-compose
|
| 112 |
-
docker-compose up -d
|
| 113 |
-
```
|
| 114 |
-
|
| 115 |
-
**docker-compose.yml example:**
|
| 116 |
-
```yaml
|
| 117 |
-
version: '3.8'
|
| 118 |
-
services:
|
| 119 |
-
proxy:
|
| 120 |
-
build: .
|
| 121 |
-
ports:
|
| 122 |
-
- "8080:8080"
|
| 123 |
-
restart: unless-stopped
|
| 124 |
-
environment:
|
| 125 |
-
- PORT=8080
|
| 126 |
-
```
|
| 127 |
-
|
| 128 |
-
**Pros:** Full control, no usage limits
|
| 129 |
-
**Cons:** Requires server infrastructure
|
| 130 |
-
|
| 131 |
-
---
|
| 132 |
-
|
| 133 |
-
### Option 4: Fly.io (Free Tier Available)
|
| 134 |
-
|
| 135 |
-
```bash
|
| 136 |
-
# Install flyctl
|
| 137 |
-
curl -L https://fly.io/install.sh | sh
|
| 138 |
-
|
| 139 |
-
# Login and launch
|
| 140 |
-
flyctl auth login
|
| 141 |
-
flyctl launch
|
| 142 |
-
|
| 143 |
-
# Deploy
|
| 144 |
-
flyctl deploy
|
| 145 |
-
```
|
| 146 |
-
|
| 147 |
-
**Pros:** Good free tier, edge network, fast
|
| 148 |
-
**Cons:** Requires credit card for verification
|
| 149 |
-
|
| 150 |
-
---
|
| 151 |
-
|
| 152 |
-
### Option 5: AWS Lambda (Serverless)
|
| 153 |
-
|
| 154 |
-
Use [Mangum](https://mangum.io/) to deploy FastAPI to AWS Lambda:
|
| 155 |
-
|
| 156 |
-
```python
|
| 157 |
-
# lambda_handler.py
|
| 158 |
-
from mangum import Mangum
|
| 159 |
-
from main import app
|
| 160 |
-
|
| 161 |
-
handler = Mangum(app)
|
| 162 |
-
```
|
| 163 |
-
|
| 164 |
-
**Pros:** Scales automatically, pay-per-use
|
| 165 |
-
**Cons:** More complex setup, cold starts
|
| 166 |
-
|
| 167 |
-
---
|
| 168 |
-
|
| 169 |
-
### Option 6: Google Cloud Run (Free Tier)
|
| 170 |
-
|
| 171 |
-
```bash
|
| 172 |
-
# Build and deploy
|
| 173 |
-
gcloud run deploy youtube-proxy \
|
| 174 |
-
--source . \
|
| 175 |
-
--platform managed \
|
| 176 |
-
--region us-central1 \
|
| 177 |
-
--allow-unauthenticated
|
| 178 |
-
```
|
| 179 |
-
|
| 180 |
-
**Pros:** Generous free tier, auto-scaling
|
| 181 |
-
**Cons:** Requires Google Cloud account
|
| 182 |
-
|
| 183 |
-
## 🔧 Configuration
|
| 184 |
-
|
| 185 |
-
The proxy service accepts these environment variables:
|
| 186 |
-
|
| 187 |
-
| Variable | Description | Default |
|
| 188 |
-
|----------|-------------|---------|
|
| 189 |
-
| `PORT` | Server port | `8080` |
|
| 190 |
-
| `PYTHONUNBUFFERED` | Python output buffering | `1` |
|
| 191 |
-
|
| 192 |
-
## 📡 API Endpoints
|
| 193 |
-
|
| 194 |
-
### `POST /download`
|
| 195 |
-
|
| 196 |
-
Download YouTube audio and return the file.
|
| 197 |
-
|
| 198 |
-
**Request:**
|
| 199 |
-
```json
|
| 200 |
-
{
|
| 201 |
-
"url": "https://www.youtube.com/watch?v=VIDEO_ID",
|
| 202 |
-
"format": "mp3"
|
| 203 |
-
}
|
| 204 |
-
```
|
| 205 |
-
|
| 206 |
-
**Supported formats:** `mp3`, `m4a`, `wav`, `opus`
|
| 207 |
-
|
| 208 |
-
**Response:** Binary audio file
|
| 209 |
-
|
| 210 |
-
**Status Codes:**
|
| 211 |
-
- `200`: Success - returns audio file
|
| 212 |
-
- `400`: Invalid request (bad URL or format)
|
| 213 |
-
- `403`: Video is private
|
| 214 |
-
- `404`: Video not found
|
| 215 |
-
- `500`: Download failed
|
| 216 |
-
- `504`: Download timeout
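
The mapping from yt-dlp failures to the status codes listed above could be expressed roughly as follows. This is a minimal sketch based on the error strings the service checks for; `status_for_error` is a hypothetical helper name, not part of the service.

```python
def status_for_error(stderr_text: str) -> int:
    """Map yt-dlp stderr output to an HTTP status code (illustrative only)."""
    # Checked in the same order as the service's download handler.
    if "Video unavailable" in stderr_text:
        return 404  # video not found or unavailable
    if "Private video" in stderr_text:
        return 403  # video is private
    return 500  # any other download failure


not_found = status_for_error("ERROR: Video unavailable")
private = status_for_error("ERROR: Private video. Sign in to view.")
other = status_for_error("ERROR: something else went wrong")
```

Timeouts (`504`) and request validation errors (`400`) are raised elsewhere in the request path, so they are not part of this mapping.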
|
| 217 |
-
|
| 218 |
-
---
|
| 219 |
-
|
| 220 |
-
### `GET /health`
|
| 221 |
-
|
| 222 |
-
Health check endpoint.
|
| 223 |
-
|
| 224 |
-
**Response:**
|
| 225 |
-
```json
|
| 226 |
-
{
|
| 227 |
-
"status": "healthy",
|
| 228 |
-
"service": "youtube-audio-proxy",
|
| 229 |
-
"yt_dlp_available": true
|
| 230 |
-
}
|
| 231 |
-
```
|
| 232 |
-
|
| 233 |
-
---
|
| 234 |
-
|
| 235 |
-
### `GET /`
|
| 236 |
-
|
| 237 |
-
Service information and usage instructions.
|
| 238 |
-
|
| 239 |
-
## 🔗 Connecting to Main Server
|
| 240 |
-
|
| 241 |
-
After deploying the proxy service:
|
| 242 |
-
|
| 243 |
-
1. **Copy the service URL** (e.g., `https://your-proxy.railway.app`)
|
| 244 |
-
|
| 245 |
-
2. **Update main server configuration:**
|
| 246 |
-
|
| 247 |
-
**Option A: Environment Variable**
|
| 248 |
-
```bash
|
| 249 |
-
export YOUTUBE_PROXY_URL=https://your-proxy.railway.app/download
|
| 250 |
-
```
|
| 251 |
-
|
| 252 |
-
**Option B: .env file**
|
| 253 |
-
```env
|
| 254 |
-
YOUTUBE_PROXY_URL=https://your-proxy.railway.app/download
|
| 255 |
-
```
|
| 256 |
-
|
| 257 |
-
**Option C: Docker**
|
| 258 |
-
```bash
|
| 259 |
-
docker run -e YOUTUBE_PROXY_URL=https://your-proxy.railway.app/download ...
|
| 260 |
-
```
|
| 261 |
-
|
| 262 |
-
3. **Verify configuration:**
|
| 263 |
-
```bash
|
| 264 |
-
# Test the subtitle extraction endpoint
|
| 265 |
-
curl -X POST https://your-hf-space.hf.space/api/v1/subtitles/extract \
|
| 266 |
-
-H "Content-Type: application/json" \
|
| 267 |
-
-H "x-api-key: your-key" \
|
| 268 |
-
-d '{"url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ", "lang": "en"}'
|
| 269 |
-
```
|
| 270 |
-
|
| 271 |
-
## 🔒 Security Considerations
|
| 272 |
-
|
| 273 |
-
### Rate Limiting
|
| 274 |
-
|
| 275 |
-
Consider adding rate limiting to prevent abuse:
|
| 276 |
-
|
| 277 |
-
```python
|
| 278 |
-
from slowapi import Limiter, _rate_limit_exceeded_handler
|
| 279 |
-
from slowapi.util import get_remote_address
|
| 280 |
-
|
| 281 |
-
limiter = Limiter(key_func=get_remote_address)
|
| 282 |
-
app.state.limiter = limiter
|
| 283 |
-
|
| 284 |
-
@app.post("/download")
|
| 285 |
-
@limiter.limit("10/minute")
|
| 286 |
-
async def download_audio(request: Request, payload: DownloadRequest):  # slowapi needs a starlette Request parameter named "request"
|
| 287 |
-
...
|
| 288 |
-
```
|
| 289 |
-
|
| 290 |
-
### Authentication
|
| 291 |
-
|
| 292 |
-
Add API key authentication for production:
|
| 293 |
-
|
| 294 |
-
```python
|
| 295 |
-
from fastapi import Header, HTTPException
|
| 296 |
-
|
| 297 |
-
async def verify_api_key(x_api_key: str = Header(...)):
|
| 298 |
-
if x_api_key not in VALID_API_KEYS:
|
| 299 |
-
raise HTTPException(status_code=401, detail="Invalid API key")
|
| 300 |
-
```
|
| 301 |
-
|
| 302 |
-
### CORS Configuration
|
| 303 |
-
|
| 304 |
-
Update CORS settings for production:
|
| 305 |
-
|
| 306 |
-
```python
|
| 307 |
-
app.add_middleware(
|
| 308 |
-
CORSMiddleware,
|
| 309 |
-
allow_origins=["https://your-main-service.com"], # Specific origins
|
| 310 |
-
allow_credentials=True,
|
| 311 |
-
allow_methods=["POST"],
|
| 312 |
-
allow_headers=["Content-Type"],
|
| 313 |
-
)
|
| 314 |
-
```
|
| 315 |
-
|
| 316 |
-
## 📊 Monitoring
|
| 317 |
-
|
| 318 |
-
### Health Checks
|
| 319 |
-
|
| 320 |
-
All deployment platforms support health checks via `/health` endpoint.
|
| 321 |
-
|
| 322 |
-
### Logging
|
| 323 |
-
|
| 324 |
-
Add structured logging for monitoring:
|
| 325 |
-
|
| 326 |
-
```python
|
| 327 |
-
import logging
|
| 328 |
-
|
| 329 |
-
logging.basicConfig(
|
| 330 |
-
level=logging.INFO,
|
| 331 |
-
format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
|
| 332 |
-
)
|
| 333 |
-
|
| 334 |
-
logger = logging.getLogger(__name__)
|
| 335 |
-
```
|
| 336 |
-
|
| 337 |
-
## 🐛 Troubleshooting
|
| 338 |
-
|
| 339 |
-
### "Video unavailable" error
|
| 340 |
-
- Check if the video is private or region-restricted
|
| 341 |
-
- Verify the URL is correct
|
| 342 |
-
- Try the video on youtube.com directly
|
| 343 |
-
|
| 344 |
-
### Timeout errors
|
| 345 |
-
- Increase timeout in main server config: `YT_DLP_TIMEOUT_DOWNLOAD=180`
|
| 346 |
-
- Check proxy service logs
|
| 347 |
-
- Consider upgrading server resources
|
| 348 |
-
|
| 349 |
-
### "yt-dlp not found" error
|
| 350 |
-
- Ensure `yt-dlp` is in requirements.txt
|
| 351 |
-
- Verify ffmpeg is installed (required for audio conversion)
|
| 352 |
-
- Check Docker image includes system dependencies
|
| 353 |
-
|
| 354 |
-
### Slow downloads
|
| 355 |
-
- Upgrade proxy service plan for better resources
|
| 356 |
-
- Use a region closer to your main service
|
| 357 |
-
- Consider caching frequently requested videos
|
| 358 |
-
|
| 359 |
-
## 💰 Cost Estimates
|
| 360 |
-
|
| 361 |
-
### Free Tier Options
|
| 362 |
-
- **Railway:** 500 hours/month free, then $5/month
|
| 363 |
-
- **Render:** 750 hours/month free, sleep after 15min inactivity
|
| 364 |
-
- **Fly.io:** 3 shared-cpu VMs free
|
| 365 |
-
- **Google Cloud Run:** 2 million requests/month free
|
| 366 |
-
|
| 367 |
-
### Paid Options
|
| 368 |
-
- **Railway:** $5-20/month for consistent uptime
|
| 369 |
-
- **AWS Lambda:** ~$0.20 per 1 million requests
|
| 370 |
-
- **DigitalOcean:** $4/month for basic droplet
|
| 371 |
-
|
| 372 |
-
## 🎓 How It Works
|
| 373 |
-
|
| 374 |
-
1. **Main server** receives subtitle extraction request
|
| 375 |
-
2. **Main server** tries direct YouTube download via `yt-dlp`
|
| 376 |
-
3. **If blocked** (network error), falls back to proxy service
|
| 377 |
-
4. **Proxy service** downloads audio using `yt-dlp` (has internet access)
|
| 378 |
-
5. **Proxy service** returns audio file bytes
|
| 379 |
-
6. **Main server** saves audio to temp directory
|
| 380 |
-
7. **Main server** transcribes audio with Whisper
|
| 381 |
-
8. Returns subtitles to user
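
The fallback in steps 2 to 4 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: `NetworkBlockedError`, `fetch_audio`, `blocked_direct`, and `fake_proxy` are hypothetical names introduced here, and the two callables stand in for the real yt-dlp call and the HTTP request to the proxy.

```python
from typing import Callable, Optional


class NetworkBlockedError(Exception):
    """Hypothetical error: the direct download could not reach YouTube."""


def fetch_audio(
    url: str,
    direct: Callable[[str], bytes],
    proxy: Optional[Callable[[str], bytes]] = None,
) -> bytes:
    """Try the direct download first; use the proxy only on a network block."""
    try:
        return direct(url)
    except NetworkBlockedError:
        if proxy is None:
            raise  # no proxy configured, so surface the original error
        return proxy(url)


def blocked_direct(url: str) -> bytes:
    # Stand-in for a yt-dlp call made from a sandbox without outbound access.
    raise NetworkBlockedError(f"cannot reach host for {url}")


def fake_proxy(url: str) -> bytes:
    # Stand-in for a POST to the proxy's /download endpoint.
    return b"audio-bytes"


audio = fetch_audio(
    "https://www.youtube.com/watch?v=VIDEO_ID", blocked_direct, fake_proxy
)
```

Only the network-block error triggers the fallback in this sketch; other failures, such as a private video, would propagate unchanged because the proxy could not fix them either.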
|
| 382 |
-
|
| 383 |
-
## 🔄 Updates
|
| 384 |
-
|
| 385 |
-
Keep yt-dlp updated for best compatibility:
|
| 386 |
-
|
| 387 |
-
```bash
|
| 388 |
-
pip install --upgrade yt-dlp
|
| 389 |
-
```
|
| 390 |
-
|
| 391 |
-
## 📝 License
|
| 392 |
-
|
| 393 |
-
Same as main project (MIT License)
|
| 394 |
-
|
| 395 |
-
## 🤝 Contributing
|
| 396 |
-
|
| 397 |
-
To improve this proxy service:
|
| 398 |
-
1. Add caching for frequently requested videos
|
| 399 |
-
2. Implement video quality selection
|
| 400 |
-
3. Add support for playlists
|
| 401 |
-
4. Improve error handling and logging
|
| 402 |
-
5. Add metrics and analytics
|
| 403 |
-
|
| 404 |
-
## 📚 Additional Resources
|
| 405 |
-
|
| 406 |
-
- [yt-dlp Documentation](https://github.com/yt-dlp/yt-dlp)
|
| 407 |
-
- [FastAPI Documentation](https://fastapi.tiangolo.com/)
|
| 408 |
-
- [Railway Documentation](https://docs.railway.app/)
|
| 409 |
-
- [Render Documentation](https://render.com/docs)
|
proxy_service/main.py
DELETED
|
@@ -1,216 +0,0 @@
|
|
| 1 |
-
"""
|
| 2 |
-
YouTube Audio Download Proxy Service
|
| 3 |
-
|
| 4 |
-
A simple FastAPI service that downloads YouTube audio and returns it.
|
| 5 |
-
Deploy this on platforms with internet access (Vercel, Railway, Render, etc.)
|
| 6 |
-
to bypass Hugging Face Spaces network restrictions.
|
| 7 |
-
|
| 8 |
-
Usage:
|
| 9 |
-
uvicorn main:app --host 0.0.0.0 --port 8080
|
| 10 |
-
|
| 11 |
-
Deployment:
|
| 12 |
-
- Vercel: Use vercel.json configuration
|
| 13 |
-
- Railway: Direct deployment
|
| 14 |
-
- Render: Use render.yaml configuration
|
| 15 |
-
- Docker: Standard FastAPI Docker setup
|
| 16 |
-
"""
|
| 17 |
-
|
| 18 |
-
import asyncio
|
| 19 |
-
import os
|
| 20 |
-
import sys
|
| 21 |
-
import tempfile
|
| 22 |
-
from pathlib import Path
|
| 23 |
-
from typing import Optional
|
| 24 |
-
|
| 25 |
-
from fastapi import FastAPI, HTTPException
|
| 26 |
-
from fastapi.middleware.cors import CORSMiddleware
|
| 27 |
-
from fastapi.responses import FileResponse, JSONResponse
|
| 28 |
-
from pydantic import BaseModel, HttpUrl, field_validator
|
| 29 |
-
|
| 30 |
-
# Initialize FastAPI app
|
| 31 |
-
app = FastAPI(
|
| 32 |
-
title="YouTube Audio Proxy Service",
|
| 33 |
-
description="Proxy service for downloading YouTube audio in restricted environments",
|
| 34 |
-
version="1.0.0",
|
| 35 |
-
)
|
| 36 |
-
|
| 37 |
-
# Configure CORS - allow all origins for proxy service
|
| 38 |
-
app.add_middleware(
|
| 39 |
-
CORSMiddleware,
|
| 40 |
-
allow_origins=["*"],
|
| 41 |
-
allow_credentials=True,
|
| 42 |
-
allow_methods=["*"],
|
| 43 |
-
allow_headers=["*"],
|
| 44 |
-
)
|
| 45 |
-
|
| 46 |
-
|
| 47 |
-
class DownloadRequest(BaseModel):
|
| 48 |
-
"""Request model for audio download."""
|
| 49 |
-
|
| 50 |
-
url: HttpUrl
|
| 51 |
-
format: str = "mp3"
|
| 52 |
-
|
| 53 |
-
@field_validator("url")
|
| 54 |
-
@classmethod
|
| 55 |
-
def validate_youtube_url(cls, v: HttpUrl) -> HttpUrl:
|
| 56 |
-
"""Validate that the URL is a YouTube URL."""
|
| 57 |
-
url_str = str(v)
|
| 58 |
-
if not any(domain in url_str for domain in ["youtube.com", "youtu.be"]):
|
| 59 |
-
raise ValueError("URL must be a valid YouTube URL")
|
| 60 |
-
return v
|
| 61 |
-
|
| 62 |
-
@field_validator("format")
|
| 63 |
-
@classmethod
|
| 64 |
-
def validate_format(cls, v: str) -> str:
|
| 65 |
-
"""Validate audio format."""
|
| 66 |
-
allowed_formats = {"mp3", "m4a", "wav", "opus"}
|
| 67 |
-
if v.lower() not in allowed_formats:
|
| 68 |
-
raise ValueError(f"Format must be one of {allowed_formats}, got '{v}'")
|
| 69 |
-
return v.lower()
|
| 70 |
-
|
| 71 |
-
|
| 72 |
-
@app.get("/")
|
| 73 |
-
async def root():
|
| 74 |
-
"""Root endpoint with service information."""
|
| 75 |
-
return {
|
| 76 |
-
"service": "YouTube Audio Proxy",
|
| 77 |
-
"version": "1.0.0",
|
| 78 |
-
"status": "operational",
|
| 79 |
-
"endpoints": {
|
| 80 |
-
"download": "POST /download",
|
| 81 |
-
"health": "GET /health",
|
| 82 |
-
},
|
| 83 |
-
"usage": {
|
| 84 |
-
"method": "POST",
|
| 85 |
-
"url": "/download",
|
| 86 |
-
"body": {
|
| 87 |
-
"url": "https://www.youtube.com/watch?v=VIDEO_ID",
|
| 88 |
-
"format": "mp3",
|
| 89 |
-
},
|
| 90 |
-
},
|
| 91 |
-
}
|
| 92 |
-
|
| 93 |
-
|
| 94 |
-
@app.get("/health")
|
| 95 |
-
async def health_check():
|
| 96 |
-
"""Health check endpoint."""
|
| 97 |
-
return {
|
| 98 |
-
"status": "healthy",
|
| 99 |
-
"service": "youtube-audio-proxy",
|
| 100 |
-
"yt_dlp_available": True,
|
| 101 |
-
}
|
| 102 |
-
|
| 103 |
-
|
| 104 |
-
@app.post("/download")
|
| 105 |
-
async def download_audio(request: DownloadRequest):
|
| 106 |
-
"""
|
| 107 |
-
Download YouTube audio and return the file.
|
| 108 |
-
|
| 109 |
-
Args:
|
| 110 |
-
request: Contains YouTube URL and desired audio format
|
| 111 |
-
|
| 112 |
-
Returns:
|
| 113 |
-
Audio file in requested format
|
| 114 |
-
"""
|
| 115 |
-
temp_dir = None
|
| 116 |
-
|
| 117 |
-
try:
|
| 118 |
-
# Create temporary directory
|
| 119 |
-
temp_dir = tempfile.mkdtemp()
|
| 120 |
-
output_template = str(Path(temp_dir) / f"audio.%(ext)s")
|
| 121 |
-
|
| 122 |
-
# Build yt-dlp command
|
| 123 |
-
cmd = [
|
| 124 |
-
sys.executable,
|
| 125 |
-
"-m",
|
| 126 |
-
"yt_dlp",
|
| 127 |
-
"--extract-audio",
|
| 128 |
-
"--audio-format",
|
| 129 |
-
request.format,
|
| 130 |
-
"--audio-quality",
|
| 131 |
-
"5",
|
| 132 |
-
"--no-warnings",
|
| 133 |
-
"--no-playlist",
|
| 134 |
-
"--output",
|
| 135 |
-
output_template,
|
| 136 |
-
str(request.url),
|
| 137 |
-
]
|
| 138 |
-
|
| 139 |
-
# Execute download
|
| 140 |
-
process = await asyncio.create_subprocess_exec(
|
| 141 |
-
*cmd,
|
| 142 |
-
stdout=asyncio.subprocess.PIPE,
|
| 143 |
-
stderr=asyncio.subprocess.PIPE,
|
| 144 |
-
)
|
| 145 |
-
|
| 146 |
-
stdout, stderr = await asyncio.wait_for(process.communicate(), timeout=120)
|
| 147 |
-
|
| 148 |
-
if process.returncode != 0:
|
| 149 |
-
error_msg = stderr.decode("utf-8", errors="ignore")
|
| 150 |
-
|
| 151 |
-
# Parse common errors
|
| 152 |
-
if "Video unavailable" in error_msg:
|
| 153 |
-
raise HTTPException(
|
| 154 |
-
status_code=404, detail="Video not found or unavailable"
|
| 155 |
-
)
|
| 156 |
-
elif "Private video" in error_msg:
|
| 157 |
-
raise HTTPException(status_code=403, detail="Video is private")
|
| 158 |
-
else:
|
| 159 |
-
raise HTTPException(
|
| 160 |
-
status_code=500,
|
| 161 |
-
detail=f"Download failed: {error_msg[:200]}",
|
| 162 |
-
)
|
| 163 |
-
|
| 164 |
-
# Find downloaded file
|
| 165 |
-
audio_files = list(Path(temp_dir).glob(f"audio.*"))
|
| 166 |
-
|
| 167 |
-
if not audio_files:
|
| 168 |
-
raise HTTPException(
|
| 169 |
-
status_code=500, detail="Audio file not found after download"
|
| 170 |
-
)
|
| 171 |
-
|
| 172 |
-
audio_file = audio_files[0]
|
| 173 |
-
|
| 174 |
-
# Return the audio file
|
| 175 |
-
return FileResponse(
|
| 176 |
-
path=str(audio_file),
|
| 177 |
-
media_type=f"audio/{request.format}",
|
| 178 |
-
filename=f"audio.{request.format}",
|
| 179 |
-
background=None, # Don't delete yet
|
| 180 |
-
)
|
| 181 |
-
|
| 182 |
-
except asyncio.TimeoutError:
|
| 183 |
-
raise HTTPException(status_code=504, detail="Download timeout (exceeded 120s)")
|
| 184 |
-
except HTTPException:
|
| 185 |
-
raise
|
| 186 |
-
except Exception as e:
|
| 187 |
-
raise HTTPException(status_code=500, detail=f"Unexpected error: {str(e)}")
|
| 188 |
-
finally:
|
| 189 |
-
# NOTE: directories created by tempfile.mkdtemp() are NOT removed automatically
|
| 190 |
-
# For production, delete temp_dir explicitly once the response has been sent
|
| 191 |
-
pass
|
| 192 |
-
|
| 193 |
-
|
| 194 |
-
@app.exception_handler(Exception)
|
| 195 |
-
async def global_exception_handler(request, exc):
|
| 196 |
-
"""Global exception handler."""
|
| 197 |
-
return JSONResponse(
|
| 198 |
-
status_code=500,
|
| 199 |
-
content={
|
| 200 |
-
"status": "error",
|
| 201 |
-
"message": str(exc),
|
| 202 |
-
"detail": "An unexpected error occurred",
|
| 203 |
-
},
|
| 204 |
-
)
|
| 205 |
-
|
| 206 |
-
|
| 207 |
-
if __name__ == "__main__":
|
| 208 |
-
import uvicorn
|
| 209 |
-
|
| 210 |
-
port = int(os.environ.get("PORT", 8080))
|
| 211 |
-
uvicorn.run(
|
| 212 |
-
"main:app",
|
| 213 |
-
host="0.0.0.0",
|
| 214 |
-
port=port,
|
| 215 |
-
reload=False,
|
| 216 |
-
)
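
The `finally` block in `download_audio` leaves the directory created by `tempfile.mkdtemp()` on disk, because the caller is responsible for deleting it. One possible way to handle cleanup is sketched below with the standard library only. `remove_temp_dir` is a hypothetical helper, and deferring it through Starlette's `BackgroundTask` is an assumption about how it might be wired into the handler, not something this service did.

```python
import shutil
import tempfile
from pathlib import Path


def remove_temp_dir(path: str) -> None:
    """Delete a temporary directory tree; do nothing if it is already gone."""
    shutil.rmtree(path, ignore_errors=True)


# Simulate the handler: create the directory, write a file, then clean up.
temp_dir = tempfile.mkdtemp()
audio_file = Path(temp_dir) / "audio.mp3"
audio_file.write_bytes(b"placeholder")

# In a FastAPI handler this call would be deferred until the file has been
# sent, for example by passing
#   background=BackgroundTask(remove_temp_dir, temp_dir)
# to FileResponse (assumed wiring, shown here only as a comment).
remove_temp_dir(temp_dir)
still_exists = Path(temp_dir).exists()
```

Deleting the directory before the response is streamed would break the download, which is why the cleanup has to run after the response rather than inside `finally`.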
|
proxy_service/render.yaml
DELETED
|
@@ -1,21 +0,0 @@
|
|
| 1 |
-
# Render.com deployment configuration for YouTube Audio Proxy Service
|
| 2 |
-
# This service provides YouTube audio downloads for restricted environments
|
| 3 |
-
|
| 4 |
-
services:
|
| 5 |
-
- type: web
|
| 6 |
-
name: youtube-audio-proxy
|
| 7 |
-
env: docker
|
| 8 |
-
dockerfilePath: ./Dockerfile
|
| 9 |
-
plan: free
|
| 10 |
-
region: oregon
|
| 11 |
-
healthCheckPath: /health
|
| 12 |
-
envVars:
|
| 13 |
-
- key: PORT
|
| 14 |
-
value: 8080
|
| 15 |
-
- key: PYTHONUNBUFFERED
|
| 16 |
-
value: 1
|
| 17 |
-
autoDeploy: true
|
| 18 |
-
disk:
|
| 19 |
-
name: temp-storage
|
| 20 |
-
mountPath: /tmp
|
| 21 |
-
sizeGB: 1
|
proxy_service/requirements.txt
DELETED
|
@@ -1,6 +0,0 @@
|
|
| 1 |
-
fastapi==0.104.1
|
| 2 |
-
uvicorn[standard]==0.24.0
|
| 3 |
-
pydantic==2.5.0
|
| 4 |
-
pydantic-settings==2.1.0
|
| 5 |
-
yt-dlp==2023.11.16
|
| 6 |
-
httpx==0.25.2
|
pyproject.toml
CHANGED
|
@@ -20,7 +20,6 @@ cachetools = "^5.3.0"
|
|
| 20 |
sentence-transformers = "^2.2.2"
|
| 21 |
torch = "^2.0.0"
|
| 22 |
faster-whisper = "^1.0.0"
|
| 23 |
-
httpx = "^0.25.2"
|
| 24 |
|
| 25 |
[tool.poetry.group.dev.dependencies]
|
| 26 |
pytest = "^7.4.3"
|
|
|
|
| 20 |
sentence-transformers = "^2.2.2"
|
| 21 |
torch = "^2.0.0"
|
| 22 |
faster-whisper = "^1.0.0"
|
|
|
|
| 23 |
|
| 24 |
[tool.poetry.group.dev.dependencies]
|
| 25 |
pytest = "^7.4.3"
|