metadata
title: Multi-Utility FastAPI Server
emoji: 🌐
colorFrom: indigo
colorTo: blue
sdk: docker
sdk_version: 0.95.2
app_file: main.py
pinned: false

Multi-Utility FastAPI Server

A centralized, extensible FastAPI server providing reusable APIs with robust authentication, rate limiting, and logging.

Python 3.11+ | FastAPI | License: MIT

Features

  • Modular Architecture - Easy to add new APIs
  • API Key Authentication - Secure, timing-safe key validation
  • Rate Limiting - Configurable per-endpoint limits with slowapi
  • Result Caching - TTL-based caching with cachetools
  • Structured Logging - Loguru with console/file output
  • Docker Ready - Multi-stage, cache-optimized Dockerfile
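To illustrate what "timing-safe key validation" can look like, here is a minimal sketch using only the standard library. It is not taken from this repository: the function names `parse_api_keys` and `is_valid_api_key` are hypothetical, and the server's actual middleware may be structured differently.

```python
import hmac


def parse_api_keys(raw: str) -> list[str]:
    """Split a comma-separated API_KEYS value, dropping blanks and whitespace."""
    return [key.strip() for key in raw.split(",") if key.strip()]


def is_valid_api_key(candidate: str, allowed_keys: list[str]) -> bool:
    """Check the candidate against every allowed key using constant-time compares."""
    candidate_bytes = candidate.encode("utf-8")
    matched = False
    for key in allowed_keys:
        # hmac.compare_digest does not short-circuit on the first differing
        # byte, so response time does not reveal how much of a key matched.
        # The loop also has no early exit, so every key is always checked.
        if hmac.compare_digest(candidate_bytes, key.encode("utf-8")):
            matched = True
    return matched
```

A plain `candidate in allowed_keys` check would also work functionally, but ordinary string comparison can return as soon as two characters differ, which is the behavior a timing-safe comparison is meant to avoid.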

APIs

| API | Endpoint | Description |
|-----|----------|-------------|
| Subtitles | POST /api/v1/subtitles/extract | Transcribe YouTube video with Whisper ⚠️ |
| Subtitles | POST /api/v1/subtitles/transcribe | Transcribe uploaded audio with Whisper ✅ |
| Embeddings | POST /api/v1/embeddings/generate | Generate text embeddings (1024-dim) |

⚠️ Note on HF Spaces: The YouTube extraction endpoint (/extract) needs outbound network access to YouTube and does not work on Hugging Face Spaces because of platform restrictions. On Spaces, use the audio file upload endpoint (/transcribe) instead. For YouTube extraction, use a self-hosted deployment.

Quick Start

Local Development

# Install dependencies
poetry install

# Configure environment
cp .env.example .env
# Edit .env with your API keys

# Run server
poetry run uvicorn app.main:app --reload

Docker

docker build -t multiutility-server .
docker run -p 7860:7860 -e API_KEYS=your-key multiutility-server

Configuration

| Variable | Description | Default |
|----------|-------------|---------|
| API_KEYS | Comma-separated API keys (required) | - |
| CORS_ORIGINS | Allowed origins | * |
| RATE_LIMIT_REQUESTS | Requests per minute | 100 |
| LOG_LEVEL | Logging level | INFO |
| WHISPER_MODEL | Whisper model size | base |
| EMBEDDING_MODEL | HuggingFace model | mixedbread-ai/mxbai-embed-large-v1 |
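As a rough illustration of how these variables and defaults fit together, the sketch below reads them from the environment with the standard library. The `Settings` class and `load_settings` function are hypothetical names for this example. The server's own config module (under app/core/) may use a different mechanism.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    api_keys: tuple[str, ...]
    cors_origins: str
    rate_limit_requests: int
    log_level: str
    whisper_model: str
    embedding_model: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from environment variables, applying the documented defaults."""
    env = os.environ if env is None else env
    api_keys = tuple(k.strip() for k in env.get("API_KEYS", "").split(",") if k.strip())
    if not api_keys:
        # API_KEYS is the only variable without a default.
        raise ValueError("API_KEYS is required and must contain at least one key")
    return Settings(
        api_keys=api_keys,
        cors_origins=env.get("CORS_ORIGINS", "*"),
        rate_limit_requests=int(env.get("RATE_LIMIT_REQUESTS", "100")),
        log_level=env.get("LOG_LEVEL", "INFO"),
        whisper_model=env.get("WHISPER_MODEL", "base"),
        embedding_model=env.get("EMBEDDING_MODEL", "mixedbread-ai/mxbai-embed-large-v1"),
    )
```

Passing a plain dict instead of relying on `os.environ` makes the defaults easy to check in tests, for example `load_settings({"API_KEYS": "k1,k2"})`.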

API Usage

Authentication

All endpoints (except health checks) require the x-api-key header:

curl -H "x-api-key: your-api-key" http://localhost:8000/api/v1/...

Subtitles API

Extract from YouTube URL

⚠️ Important: This endpoint requires network access to YouTube and will not work on Hugging Face Spaces. Use the audio file upload endpoint below instead, or deploy on a self-hosted environment.

curl -X POST http://localhost:8000/api/v1/subtitles/extract \
  -H "Content-Type: application/json" \
  -H "x-api-key: your-key" \
  -d '{"url": "https://youtube.com/watch?v=VIDEO_ID", "lang": "en"}'

Response:

{
  "status": "success",
  "video_id": "VIDEO_ID",
  "language": "en",
  "subtitles": ["Line 1", "Line 2", "..."]
}
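For callers who prefer Python over curl, the same request can be sketched with the standard library. The helper names `build_extract_request` and `subtitles_from_response` are hypothetical, and `BASE_URL` assumes a local development server. The shape of an error response is not documented here, so the sketch treats anything other than "status": "success" as a failure.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: local dev server


def build_extract_request(video_url: str, lang: str, api_key: str) -> urllib.request.Request:
    """Build the POST request for /api/v1/subtitles/extract without sending it."""
    payload = json.dumps({"url": video_url, "lang": lang}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/subtitles/extract",
        data=payload,
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


def subtitles_from_response(body: bytes) -> list[str]:
    """Return the subtitle lines from a successful response body."""
    data = json.loads(body)
    if data.get("status") != "success":
        raise RuntimeError(f"extraction failed: {data}")
    return data["subtitles"]
```

To send the request, pass it to `urllib.request.urlopen(request)` and hand the bytes from `response.read()` to `subtitles_from_response`.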

Transcribe Audio File

✅ Works everywhere: This endpoint accepts direct audio file uploads and works in all environments including HF Spaces.

curl -X POST http://localhost:8000/api/v1/subtitles/transcribe \
  -H "x-api-key: your-key" \
  -F "file=@audio.mp3" \
  -F "lang=en"

Supported formats: mp3, wav, m4a, flac, ogg, webm

Response:

{
  "status": "success",
  "language": "en",
  "file_name": "audio.mp3",
  "transcription": ["Segment 1", "Segment 2", "..."]
}
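The upload above is a multipart/form-data request with a `file` part and a `lang` field. A minimal standard-library sketch of the same request follows. The function name `build_transcribe_request` is hypothetical, the default base URL assumes a local server, and the client-side extension check only mirrors the supported-formats list above (the server's own validation may differ).

```python
import mimetypes
import urllib.request
import uuid
from pathlib import PurePath

SUPPORTED_FORMATS = {"mp3", "wav", "m4a", "flac", "ogg", "webm"}


def build_transcribe_request(
    file_name: str,
    audio: bytes,
    lang: str,
    api_key: str,
    base_url: str = "http://localhost:8000",  # assumption: local dev server
) -> urllib.request.Request:
    """Build a multipart POST for /api/v1/subtitles/transcribe without sending it."""
    suffix = PurePath(file_name).suffix.lstrip(".").lower()
    if suffix not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported audio format: {suffix or file_name!r}")

    boundary = uuid.uuid4().hex
    content_type = mimetypes.guess_type(file_name)[0] or "application/octet-stream"
    # Each part is: boundary line, part headers, blank line, part content.
    body = b"\r\n".join([
        f"--{boundary}".encode(),
        b'Content-Disposition: form-data; name="lang"',
        b"",
        lang.encode("utf-8"),
        f"--{boundary}".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{file_name}"'.encode(),
        f"Content-Type: {content_type}".encode(),
        b"",
        audio,
        f"--{boundary}--".encode(),  # closing boundary
        b"",
    ])
    return urllib.request.Request(
        f"{base_url}/api/v1/subtitles/transcribe",
        data=body,
        headers={
            "Content-Type": f"multipart/form-data; boundary={boundary}",
            "x-api-key": api_key,
        },
        method="POST",
    )
```

In practice a library such as requests or httpx handles the multipart encoding for you. The manual version is shown only to make the wire format explicit.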

Embeddings API

curl -X POST http://localhost:8000/api/v1/embeddings/generate \
  -H "Content-Type: application/json" \
  -H "x-api-key: your-key" \
  -d '{"texts": ["Hello world", "Another text"], "normalize": true}'

Response:

{
  "status": "success",
  "embeddings": [[0.1, 0.2, ...], [0.3, 0.4, ...]],
  "model": "mixedbread-ai/mxbai-embed-large-v1",
  "dimensions": 1024
}
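A common next step with the returned vectors is to compare texts by cosine similarity. The sketch below is a generic helper, not part of this server. It assumes only that `embeddings` is a list of equal-length float lists, as in the response above.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)
```

Given a parsed response `data`, `cosine_similarity(data["embeddings"][0], data["embeddings"][1])` scores the first two input texts. If `"normalize": true` yields unit-length vectors (an assumption about the server's behavior), the dot product alone gives the same value.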

Project Structure

app/
├── main.py                 # FastAPI application
├── core/                   # Config, logging, exceptions
├── middleware/             # Auth, rate limiting
└── apis/
    ├── subtitles/          # YouTube subtitle extraction
    └── embeddings/         # Text embedding generation

Deployment

Hugging Face Spaces

⚠️ Network Limitation: HF Spaces blocks external internet access, so YouTube downloads are not possible.

What works on HF Spaces:

  • ✅ /api/v1/subtitles/transcribe - Upload audio files for transcription
  • ✅ /api/v1/embeddings/generate - Generate text embeddings
  • ❌ /api/v1/subtitles/extract - YouTube downloads (requires self-hosted deployment)

Setup:

  1. Create a Docker Space
  2. Set API_KEYS secret in Space settings
  3. Push repository

Recommended workflow for subtitles:

  1. Download audio locally using yt-dlp: yt-dlp -x --audio-format mp3 VIDEO_URL
  2. Upload the audio file to /api/v1/subtitles/transcribe endpoint
  3. Receive transcription from Whisper
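Step 1 of this workflow can be scripted. The sketch below wraps the yt-dlp command from the list in a small Python helper. The function names are hypothetical, and the `-o` output template is an addition for this example (it names the file after the video ID) rather than something this README prescribes.

```python
import shutil
import subprocess


def ytdlp_audio_command(video_url: str, output_template: str = "%(id)s.%(ext)s") -> list[str]:
    """Return the yt-dlp argument list that extracts audio as mp3."""
    return ["yt-dlp", "-x", "--audio-format", "mp3", "-o", output_template, video_url]


def download_audio(video_url: str) -> None:
    """Run yt-dlp on the local machine; raises if yt-dlp is missing or fails."""
    if shutil.which("yt-dlp") is None:
        raise RuntimeError("yt-dlp was not found on PATH")
    # check=True raises CalledProcessError on a non-zero exit code.
    subprocess.run(ytdlp_audio_command(video_url), check=True)
```

After the download finishes, upload the resulting mp3 to /api/v1/subtitles/transcribe as shown in the Transcribe Audio File section.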

Docker Compose

docker-compose up --build

Alternative: Self-Hosted Deployment for YouTube Extraction

If you need YouTube subtitle extraction, deploy the server on a platform with internet access:

Docker (VPS/Cloud VM)

docker build -t multiutility-server .
docker run -p 7860:7860 -e API_KEYS=your-key multiutility-server

Cloud Platforms

  • Railway: Direct Docker deployment
  • Render: Connect GitHub repo, auto-deploy
  • DigitalOcean: Deploy on Droplet ($4-12/month)
  • AWS/GCP/Azure: Use ECS, Cloud Run, or App Service

Benefits of Self-Hosted

  • ✅ Direct YouTube access (no platform network restrictions)
  • ✅ Full control over resources
  • ✅ No platform-imposed usage limits
  • ✅ All features work natively

Development

# Run tests
poetry run pytest

# Type checking
poetry run mypy app/

# Format code
poetry run black . && poetry run isort .

License

MIT License - see LICENSE for details.