---
title: Multi-Utility FastAPI Server
emoji: 🌐
colorFrom: indigo
colorTo: blue
sdk: docker
sdk_version: "0.95.2"
app_file: main.py
pinned: false
---
# Multi-Utility FastAPI Server
A centralized, extensible FastAPI server providing reusable APIs with robust authentication, rate limiting, and logging.
[![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/downloads/)
[![FastAPI](https://img.shields.io/badge/FastAPI-0.104+-green.svg)](https://fastapi.tiangolo.com/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
## Features
- **Modular Architecture** - Easy to add new APIs
- **API Key Authentication** - Secure, timing-safe key validation
- **Rate Limiting** - Configurable per-endpoint limits with `slowapi`
- **Result Caching** - TTL-based caching with `cachetools`
- **Structured Logging** - Loguru with console/file output
- **Docker Ready** - Multi-stage, cache-optimized Dockerfile
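
As an illustration of what "timing-safe key validation" can look like, here is a minimal sketch built on the standard library's `hmac.compare_digest`. The function name `is_valid_api_key` and its signature are hypothetical and not taken from this repository; the actual middleware may be structured differently.

```python
import hmac


def is_valid_api_key(candidate: str, allowed_keys: list[str]) -> bool:
    """Return True if `candidate` equals one of `allowed_keys` (hypothetical helper)."""
    candidate_bytes = candidate.encode("utf-8")
    matched = False
    for key in allowed_keys:
        # compare_digest does not short-circuit at the first differing byte,
        # unlike the == operator on strings.
        if hmac.compare_digest(candidate_bytes, key.encode("utf-8")):
            matched = True  # no early return, so every configured key is always checked
    return matched
```

A plain `candidate in allowed_keys` check would compare with `==`, which can return sooner for keys that differ early and so leaks timing information.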
## APIs
| API | Endpoint | Description |
|-----|----------|-------------|
| **Subtitles** | `POST /api/v1/subtitles/extract` | Transcribe YouTube video with Whisper ⚠️ |
| **Subtitles** | `POST /api/v1/subtitles/transcribe` | Transcribe uploaded audio with Whisper ✅ |
| **Embeddings** | `POST /api/v1/embeddings/generate` | Generate text embeddings (1024-dim) |
> ⚠️ **Note on HF Spaces:** The YouTube extraction endpoint (`/extract`) requires external network access and will **not work on Hugging Face Spaces** due to platform restrictions. On HF Spaces, use the **audio file upload endpoint** (`/transcribe`) instead. For YouTube extraction, use a self-hosted deployment.
## Quick Start
### Local Development
```bash
# Install dependencies
poetry install
# Configure environment
cp .env.example .env
# Edit .env with your API keys
# Run server
poetry run uvicorn app.main:app --reload
```
### Docker
```bash
docker build -t multiutility-server .
docker run -p 7860:7860 -e API_KEYS=your-key multiutility-server
```
## Configuration
| Variable | Description | Default |
|----------|-------------|---------|
| `API_KEYS` | Comma-separated API keys (required) | - |
| `CORS_ORIGINS` | Allowed origins | `*` |
| `RATE_LIMIT_REQUESTS` | Requests per minute | `100` |
| `LOG_LEVEL` | Logging level | `INFO` |
| `WHISPER_MODEL` | Whisper model size | `base` |
| `EMBEDDING_MODEL` | HuggingFace model | `mixedbread-ai/mxbai-embed-large-v1` |
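
The table above can be read as a small settings loader. The sketch below is one possible way to parse these variables with the standard library; the `Settings` class and `load_settings` function are hypothetical names, and the project's real configuration code in `app/core/` may work differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container mirroring the variables in the table above."""
    api_keys: tuple
    cors_origins: str
    rate_limit_requests: int
    log_level: str
    whisper_model: str
    embedding_model: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    env = os.environ if env is None else env
    # API_KEYS is comma-separated; drop surrounding whitespace and empty entries.
    keys = tuple(k.strip() for k in env.get("API_KEYS", "").split(",") if k.strip())
    if not keys:
        raise ValueError("API_KEYS is required and must contain at least one key")
    return Settings(
        api_keys=keys,
        cors_origins=env.get("CORS_ORIGINS", "*"),
        rate_limit_requests=int(env.get("RATE_LIMIT_REQUESTS", "100")),
        log_level=env.get("LOG_LEVEL", "INFO"),
        whisper_model=env.get("WHISPER_MODEL", "base"),
        embedding_model=env.get("EMBEDDING_MODEL", "mixedbread-ai/mxbai-embed-large-v1"),
    )
```

Failing fast when `API_KEYS` is empty matches the "required" note in the table; all other variables fall back to the listed defaults.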
## API Usage
### Authentication
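All endpoints (except health checks) require the `x-api-key` header. The examples in this section target the local development server on uvicorn's default port 8000; with the Docker commands above, use port 7860 instead.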
```bash
curl -H "x-api-key: your-api-key" http://localhost:8000/api/v1/...
```
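
The same header can be set from Python. This is a minimal client-side sketch using only the standard library; `BASE_URL` and `build_request` are hypothetical names, and the base URL assumes a local server on port 8000.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: local development server


def build_request(path: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build an authenticated JSON POST request (hypothetical helper)."""
    return urllib.request.Request(
        url=f"{BASE_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "x-api-key": api_key,  # header name taken from the curl example
        },
    )
```

To send the request, pass the result to `urllib.request.urlopen` and decode the JSON body of the response.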
### Subtitles API
#### Extract from YouTube URL
> ⚠️ **Important:** This endpoint requires network access to YouTube and will **not work on Hugging Face Spaces**. Use the audio file upload endpoint below instead, or deploy on a self-hosted environment.
```bash
curl -X POST http://localhost:8000/api/v1/subtitles/extract \
-H "Content-Type: application/json" \
-H "x-api-key: your-key" \
-d '{"url": "https://youtube.com/watch?v=VIDEO_ID", "lang": "en"}'
```
**Response:**
```json
{
"status": "success",
"video_id": "VIDEO_ID",
"language": "en",
"subtitles": ["Line 1", "Line 2", "..."]
}
```
#### Transcribe Audio File
> ✅ **Works everywhere:** This endpoint accepts direct audio file uploads and works in all environments including HF Spaces.
```bash
curl -X POST http://localhost:8000/api/v1/subtitles/transcribe \
-H "x-api-key: your-key" \
-F "file=@audio.mp3" \
-F "lang=en"
```
**Supported formats:** mp3, wav, m4a, flac, ogg, webm
**Response:**
```json
{
"status": "success",
"language": "en",
"file_name": "audio.mp3",
"transcription": ["Segment 1", "Segment 2", "..."]
}
```
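
Because `transcription` is a list of segments, clients usually join it into a single string. The helper below is a sketch based only on the response shape shown above; the function name is hypothetical, and treating any `status` other than `"success"` as an error is an assumption about the server's error format.

```python
import json


def transcript_text(response_body: str) -> str:
    """Join the transcription segments of a /transcribe response into one string."""
    data = json.loads(response_body)
    if data.get("status") != "success":
        # Assumption: failed requests do not report status "success".
        raise ValueError(f"transcription failed: {data!r}")
    segments = (segment.strip() for segment in data["transcription"])
    return " ".join(segment for segment in segments if segment)
```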
### Embeddings API
```bash
curl -X POST http://localhost:8000/api/v1/embeddings/generate \
-H "Content-Type: application/json" \
-H "x-api-key: your-key" \
-d '{"texts": ["Hello world", "Another text"], "normalize": true}'
```
**Response:**
```json
{
"status": "success",
"embeddings": [[0.1, 0.2, ...], [0.3, 0.4, ...]],
"model": "mixedbread-ai/mxbai-embed-large-v1",
"dimensions": 1024
}
```
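
A common next step is comparing two returned vectors. The sketch below computes cosine similarity in plain Python; `cosine_similarity` is an illustrative helper, not part of this server. If the vectors were requested with `"normalize": true` and are unit length, the result equals their dot product.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two embedding vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)
```

For example, passing `embeddings[0]` and `embeddings[1]` from the response above yields a score between -1 and 1, where higher means more similar.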
## Project Structure
```
app/
├── main.py          # FastAPI application
├── core/            # Config, logging, exceptions
├── middleware/      # Auth, rate limiting
└── apis/
    ├── subtitles/   # YouTube subtitle extraction
    └── embeddings/  # Text embedding generation
```
## Deployment
### Hugging Face Spaces
⚠️ **Network Limitation:** HF Spaces blocks external internet access, so YouTube downloads are not possible.
**What works on HF Spaces:**
- ✅ `/api/v1/subtitles/transcribe` - Upload audio files for transcription
- ✅ `/api/v1/embeddings/generate` - Generate text embeddings
- ❌ `/api/v1/subtitles/extract` - YouTube downloads (requires self-hosted deployment)
**Setup:**
1. Create a Docker Space
2. Set `API_KEYS` secret in Space settings
3. Push repository
**Recommended workflow for subtitles:**
1. Download audio locally using [yt-dlp](https://github.com/yt-dlp/yt-dlp): `yt-dlp -x --audio-format mp3 VIDEO_URL`
2. Upload the audio file to the `/api/v1/subtitles/transcribe` endpoint
3. Receive the Whisper transcription in the response
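
Step 2 of the workflow above is a `multipart/form-data` upload, which Python's standard library does not encode for you. The following is a minimal sketch of such an encoder; `encode_multipart` is a hypothetical helper, and the field names `file` and `lang` are taken from the curl example in the Subtitles API section. It does not escape quotes or newlines in names, so it is only suitable for simple, trusted values.

```python
import uuid
from typing import Dict, Tuple


def encode_multipart(
    fields: Dict[str, str],
    file_field: str,
    file_name: str,
    file_bytes: bytes,
    content_type: str = "application/octet-stream",
) -> Tuple[bytes, str]:
    """Return (body, Content-Type header value) for a multipart/form-data upload."""
    boundary = uuid.uuid4().hex  # random boundary, unlikely to occur in the payload
    parts = []
    for name, value in fields.items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n".encode("utf-8")
        )
    parts.append(
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{file_field}"; filename="{file_name}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n".encode("utf-8")
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))  # closing delimiter
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"
```

The returned body and header value can be sent with `urllib.request.Request(url, data=body, method="POST", headers={"Content-Type": header, "x-api-key": key})`, where `url` points at the transcribe endpoint.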
### Docker Compose
```bash
docker-compose up --build
```
## Alternative: Self-Hosted Deployment for YouTube Extraction
If you need YouTube subtitle extraction, deploy the server on a platform with internet access:
### Docker (VPS/Cloud VM)
```bash
docker build -t multiutility-server .
docker run -p 7860:7860 -e API_KEYS=your-key multiutility-server
```
### Cloud Platforms
- **Railway:** Direct Docker deployment
- **Render:** Connect GitHub repo, auto-deploy
- **DigitalOcean:** Deploy on Droplet ($4-12/month)
- **AWS/GCP/Azure:** Use ECS, Cloud Run, or App Service
### Benefits of Self-Hosted
- ✅ Direct YouTube access (no restrictions)
- ✅ Full control over resources
- ✅ No usage limits
- ✅ All features work natively
## Development
```bash
# Run tests
poetry run pytest
# Type checking
poetry run mypy app/
# Format code
poetry run black . && poetry run isort .
```
## License
MIT License - see [LICENSE](LICENSE) for details.