---
title: WhisperSelf ML Service
emoji: 🎙️
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
---

# WhisperSelf ML Service

This folder contains the ML service for speech transcription, built on faster-whisper and FastAPI.

## What This Project Includes

- ML API service: ml/serve.py
- Model config: ml/config.yaml
- Python dependencies: ml/requirements.txt
- Fine-tuning scripts: ml/finetune/
- Model download helper: scripts/download_model.py
- Docker files: docker/Dockerfile.ml and docker/docker-compose.yml

## 1) Run Locally (Python)

### Prerequisites

- Python 3.11+
- ffmpeg installed and available in PATH

### Setup

```powershell
cd Transcription
python -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install --upgrade pip
pip install -r ml\requirements.txt
```

### Download model weights

```powershell
python scripts\download_model.py --model large-v3 --output .\models
```

### Start ML server

```powershell
cd ml
uvicorn serve:app --host 0.0.0.0 --port 8000 --reload
```

### Health check

Open:

- http://localhost:8000/health

### Test transcription endpoint

```powershell
# PowerShell continues a line with a trailing backtick (the caret ^ only works in cmd.exe).
curl.exe -X POST "http://localhost:8000/transcribe" `
  -F "file=@C:\path\to\audio.wav" `
  -F "model=small" `
  -F "language=auto" `
  -F "task=transcribe"
```

## 2) Run With Docker

From the docker folder:

```powershell
cd Transcription\docker
docker compose up --build ml
```

The ML service will be available at:

- http://localhost:8000

## 3) Environment Variables (Important)

These can be set in your host environment or in the container:

- MODEL_PATH (default: ../models/large-v3)
- WHISPER_DEVICE (default: cpu)
- WHISPER_COMPUTE_TYPE (default: int8)
- WHISPER_LANGUAGE (default: en)
- WHISPER_TASK (default: transcribe)
- WHISPER_BEAM_SIZE (default: 1)
- WHISPER_BEST_OF (default: 1)
- WHISPER_VAD_FILTER (default: true)
- WHISPER_CONDITION_ON_PREVIOUS_TEXT (default: false)
- WHISPER_CPU_THREADS (default: number of CPUs)
- WHISPER_NUM_WORKERS (default: 1)
- JOB_RETENTION_SECONDS (default: 3600)

## 4) Host On Hugging Face Space (Docker)

1. Create a new Hugging Face Space and choose the Docker SDK.
2. Push the contents of this folder to the Space repository.
3. Ensure there is a Dockerfile at the repository root (Hugging Face builds from the root).
4. Expose port 8000 in the Dockerfile and set app_port: 8000 in the Space's README front matter. Docker Spaces route traffic to port 7860 unless app_port says otherwise.
5. The start command should run uvicorn serve:app --host 0.0.0.0 --port 8000.

If your Space is only for inference and the models are large, prefer downloading model weights at build or start time, or use a smaller model (small/base), to avoid storage and startup issues.

## 5) API Endpoints

- GET /health
- POST /transcribe
- POST /transcribe/jobs
- GET /transcribe/jobs/{job_id}
- DELETE /transcribe/jobs/{job_id}

## 6) Common Errors and Fixes

- Error: app.py not found on Hugging Face
  - Cause: Space configured as Gradio/Streamlit instead of Docker.
  - Fix: Use sdk: docker and provide a root Dockerfile.
- Error: No module named faster_whisper
  - Fix: pip install -r ml/requirements.txt
- Error: ffmpeg not found
  - Fix: Install ffmpeg on the host or use a Docker image that installs ffmpeg.
- Slow startup / memory issues
  - Fix: Use model=small and WHISPER_COMPUTE_TYPE=int8.

## 7) Quick Production Tips

- Keep the model small or medium for free-tier hosting.
- Add request timeouts and upload-size limits in the reverse proxy.
- Keep health checks enabled on /health.
- Monitor disk usage when caching model weights.
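
To make the defaults in the Environment Variables section concrete, here is a minimal Python sketch of how a service could read them. The variable names and default values come from the list in section 3; the `WhisperSettings` class, the `load_settings` and `_env_bool` helpers, and the accepted boolean spellings are assumptions for illustration, and the actual parsing in ml/serve.py may differ.

```python
import os
from dataclasses import dataclass


def _env_bool(name: str, default: bool) -> bool:
    """Hypothetical helper: treat 1/true/yes/on (any case) as True."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class WhisperSettings:
    """Illustrative container; not the service's real settings object."""
    model_path: str
    device: str
    compute_type: str
    language: str
    task: str
    beam_size: int
    best_of: int
    vad_filter: bool
    condition_on_previous_text: bool
    cpu_threads: int
    num_workers: int
    job_retention_seconds: int


def load_settings() -> WhisperSettings:
    """Read each documented variable, falling back to its documented default."""
    env = os.environ
    return WhisperSettings(
        model_path=env.get("MODEL_PATH", "../models/large-v3"),
        device=env.get("WHISPER_DEVICE", "cpu"),
        compute_type=env.get("WHISPER_COMPUTE_TYPE", "int8"),
        language=env.get("WHISPER_LANGUAGE", "en"),
        task=env.get("WHISPER_TASK", "transcribe"),
        beam_size=int(env.get("WHISPER_BEAM_SIZE", "1")),
        best_of=int(env.get("WHISPER_BEST_OF", "1")),
        vad_filter=_env_bool("WHISPER_VAD_FILTER", True),
        condition_on_previous_text=_env_bool(
            "WHISPER_CONDITION_ON_PREVIOUS_TEXT", False
        ),
        # "number of CPUs" default; os.cpu_count() can return None, so fall back to 1.
        cpu_threads=int(env.get("WHISPER_CPU_THREADS", str(os.cpu_count() or 1))),
        num_workers=int(env.get("WHISPER_NUM_WORKERS", "1")),
        job_retention_seconds=int(env.get("JOB_RETENTION_SECONDS", "3600")),
    )
```

Calling `load_settings()` with none of the variables set yields the defaults listed in section 3; exporting, for example, `WHISPER_DEVICE=cuda` before starting the process would change only that field.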
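
The tip about keeping health checks enabled on /health can be scripted with the Python standard library. The sketch below only checks for an HTTP 200 status, because this README does not document the response body; the `is_healthy` function and its `opener` parameter (there so the network call can be swapped out in tests) are hypothetical names, not part of the service.

```python
import urllib.request


def is_healthy(
    base_url: str = "http://localhost:8000",
    timeout: float = 5.0,
    opener=urllib.request.urlopen,
) -> bool:
    """Return True when GET {base_url}/health answers with HTTP 200."""
    url = base_url.rstrip("/") + "/health"
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError (4xx/5xx), refused connections, and timeouts
        # are all OSError subclasses, so any of them counts as unhealthy.
        return False
```

A deployment script could call `is_healthy()` in a retry loop after `docker compose up` and stop waiting once it returns True.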