Shubham32142

Add Docker support and implement Whisper transcription service

a4a3878 about 1 month ago


	---
	title: WhisperSelf ML Service
	emoji: 🎙️
	colorFrom: blue
	colorTo: green
	sdk: docker
	pinned: false
	---

	# WhisperSelf ML Service

	This folder contains the ML service for speech transcription using faster-whisper and FastAPI.

	## What This Project Includes

	- ML API service: ml/serve.py
	- Model config: ml/config.yaml
	- Python dependencies: ml/requirements.txt
	- Fine-tuning scripts: ml/finetune/
	- Model download helper: scripts/download_model.py
	- Docker files: docker/Dockerfile.ml and docker/docker-compose.yml

	## 1) Run Locally (Python)

	### Prerequisites

	- Python 3.11+
	- ffmpeg installed and available in PATH

	### Setup

	```powershell
	cd Transcription
	python -m venv .venv
	.\.venv\Scripts\Activate.ps1
	pip install --upgrade pip
	pip install -r ml\requirements.txt
	```

	### Download model weights

	```powershell
	python scripts\download_model.py --model large-v3 --output .\models
	```

	### Start ML server

	```powershell
	cd ml
	uvicorn serve:app --host 0.0.0.0 --port 8000 --reload
	```

	### Health check

	Open:

	- http://localhost:8000/health
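The same check can be scripted, for example as a readiness probe before sending audio. The sketch below assumes only that GET /health answers with HTTP 200 when the service is up; the response body is not documented here, so it is ignored. The function name is_healthy is illustrative and not part of the service.

```python
import urllib.error
import urllib.request


def is_healthy(base_url: str = "http://localhost:8000", timeout: float = 5.0) -> bool:
    """Return True if GET {base_url}/health answers with HTTP 200."""
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, or a non-2xx HTTP status.
        return False
```

Calling is_healthy() with no arguments probes the local server started above.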

	### Test transcription endpoint

	```powershell
	# PowerShell continues a line with a backtick; the caret (^) only works in cmd.exe.
	curl.exe -X POST "http://localhost:8000/transcribe" `
	  -F "file=@C:\path\to\audio.wav" `
	  -F "model=small" `
	  -F "language=auto" `
	  -F "task=transcribe"
	```
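For callers that prefer Python over curl, the request can be sketched with the standard library alone. The form field names (file, model, language, task) come from the curl command above; the helper names build_multipart and transcribe are hypothetical, and the response is returned as raw text because its schema is not documented here.

```python
import os
import urllib.request
import uuid
from typing import Optional


def build_multipart(fields: dict, file_name: str, file_bytes: bytes,
                    boundary: Optional[str] = None) -> tuple:
    """Encode text fields plus one file part named "file" as multipart/form-data."""
    boundary = boundary or uuid.uuid4().hex
    chunks = []
    for name, value in fields.items():
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
        )
        chunks.append(header.encode("utf-8") + str(value).encode("utf-8") + b"\r\n")
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    )
    chunks.append(file_header.encode("utf-8") + file_bytes + b"\r\n")
    chunks.append(f"--{boundary}--\r\n".encode("utf-8"))  # closing delimiter
    return b"".join(chunks), f"multipart/form-data; boundary={boundary}"


def transcribe(path: str, base_url: str = "http://localhost:8000") -> str:
    """POST an audio file to /transcribe and return the raw response text."""
    with open(path, "rb") as audio_file:
        audio = audio_file.read()
    fields = {"model": "small", "language": "auto", "task": "transcribe"}
    body, content_type = build_multipart(fields, os.path.basename(path), audio)
    request = urllib.request.Request(
        base_url.rstrip("/") + "/transcribe",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

With the server running, print(transcribe(r"C:\path\to\audio.wav")) sends the same request as the curl command.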

	## 2) Run With Docker

	From the docker folder:

	```powershell
	cd Transcription\docker
	docker compose up --build ml
	```

	The ML service will be available at:

	- http://localhost:8000

	## 3) Environment Variables (Important)

	These can be set in your host environment or container:

	- MODEL_PATH (default: ../models/large-v3)
	- WHISPER_DEVICE (default: cpu)
	- WHISPER_COMPUTE_TYPE (default: int8)
	- WHISPER_LANGUAGE (default: en)
	- WHISPER_TASK (default: transcribe)
	- WHISPER_BEAM_SIZE (default: 1)
	- WHISPER_BEST_OF (default: 1)
	- WHISPER_VAD_FILTER (default: true)
	- WHISPER_CONDITION_ON_PREVIOUS_TEXT (default: false)
	- WHISPER_CPU_THREADS (default: number of CPUs)
	- WHISPER_NUM_WORKERS (default: 1)
	- JOB_RETENTION_SECONDS (default: 3600)
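To illustrate how these variables and defaults fit together, here is a minimal sketch of a settings loader. It is not taken from ml/serve.py, whose actual parsing is not shown in this README; the function load_settings, the dictionary keys, and the accepted boolean spellings are all assumptions.

```python
import os
from typing import Mapping, Optional


def _as_bool(value: str) -> bool:
    # Assumption: these spellings count as true; anything else is false.
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read the documented variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return {
        "model_path": env.get("MODEL_PATH", "../models/large-v3"),
        "device": env.get("WHISPER_DEVICE", "cpu"),
        "compute_type": env.get("WHISPER_COMPUTE_TYPE", "int8"),
        "language": env.get("WHISPER_LANGUAGE", "en"),
        "task": env.get("WHISPER_TASK", "transcribe"),
        "beam_size": int(env.get("WHISPER_BEAM_SIZE", "1")),
        "best_of": int(env.get("WHISPER_BEST_OF", "1")),
        "vad_filter": _as_bool(env.get("WHISPER_VAD_FILTER", "true")),
        "condition_on_previous_text": _as_bool(
            env.get("WHISPER_CONDITION_ON_PREVIOUS_TEXT", "false")
        ),
        # Default is the number of CPUs; fall back to 1 if it cannot be detected.
        "cpu_threads": int(env.get("WHISPER_CPU_THREADS", str(os.cpu_count() or 1))),
        "num_workers": int(env.get("WHISPER_NUM_WORKERS", "1")),
        "job_retention_seconds": int(env.get("JOB_RETENTION_SECONDS", "3600")),
    }
```

Passing a plain dictionary instead of os.environ makes it easy to check which values a given container configuration would produce.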

	## 4) Host On Hugging Face Space (Docker)

	1. Create a new Hugging Face Space and choose Docker SDK.
	2. Push this folder content to that Space repository.
	3. Ensure there is a Dockerfile at repository root (Hugging Face builds from root).
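	4. Expose port 8000 in the Dockerfile and set app_port: 8000 in the README front matter, because Docker Spaces route traffic to port 7860 by default.
	5. The start command should run uvicorn serve:app --host 0.0.0.0 --port 8000.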

	If the Space only serves inference and the model is large, download the weights at build or start time, or use a smaller model (small or base), to avoid storage and startup problems.

	## 5) API Endpoints

	- GET /health
	- POST /transcribe
	- POST /transcribe/jobs
	- GET /transcribe/jobs/{job_id}
	- DELETE /transcribe/jobs/{job_id}
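The job endpoints suggest a submit-then-poll workflow: create a job, then query GET /transcribe/jobs/{job_id} until it finishes. The sketch below shows only the polling loop and keeps the HTTP call and the completion test as injected callables, because the job response schema is not documented here. The names job_url and wait_for_job are hypothetical.

```python
import time
from typing import Callable


def job_url(base_url: str, job_id: str) -> str:
    """Build the URL used by GET and DELETE /transcribe/jobs/{job_id}."""
    return f"{base_url.rstrip('/')}/transcribe/jobs/{job_id}"


def wait_for_job(
    fetch_job: Callable[[], dict],
    is_finished: Callable[[dict], bool],
    interval: float = 2.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_job repeatedly until is_finished(job) is true or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_job()
        if is_finished(job):
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError("job did not finish before the timeout")
        sleep(interval)
```

In practice fetch_job would issue a GET request to job_url(...) and decode the JSON body, and is_finished would inspect whichever field the service uses to report completion. Keep the polling window shorter than JOB_RETENTION_SECONDS, since the variable name suggests finished jobs are discarded after that period.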

	## 6) Common Errors and Fixes

	- Error: app.py not found on Hugging Face
	  - Cause: Space configured as Gradio/Streamlit instead of Docker.
	  - Fix: Use sdk: docker and provide a root Dockerfile.

	- Error: No module named faster_whisper
	  - Fix: pip install -r ml/requirements.txt

	- Error: ffmpeg not found
	  - Fix: Install ffmpeg on the host or use a Docker image that installs ffmpeg.

	- Error: Slow startup or memory issues
	  - Fix: Use model=small and WHISPER_COMPUTE_TYPE=int8.
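Two of the errors above (missing ffmpeg, missing faster_whisper) can be caught before the server starts. The following is a minimal preflight sketch, not part of the project; the function name missing_prerequisites is hypothetical, and the lookups are passed in as parameters only so the check can be exercised without changing the machine.

```python
import importlib.util
import shutil
import sys
from typing import Callable, List, Optional


def missing_prerequisites(
    which: Callable[[str], Optional[str]] = shutil.which,
    find_spec: Callable[[str], object] = importlib.util.find_spec,
    version: tuple = sys.version_info[:2],
) -> List[str]:
    """Return a human-readable list of unmet prerequisites (empty if all are met)."""
    problems = []
    if version < (3, 11):
        problems.append("Python 3.11+ is required")
    if which("ffmpeg") is None:
        problems.append("ffmpeg not found in PATH")
    if find_spec("faster_whisper") is None:
        problems.append(
            "faster_whisper is not installed (pip install -r ml/requirements.txt)"
        )
    return problems
```

Running print(missing_prerequisites()) in the activated virtual environment lists whatever still needs to be installed.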

	## 7) Quick Production Tips

	- Use the small or medium model on free-tier hosting.
	- Add request timeouts and upload-size limits in the reverse proxy.
	- Keep health checks enabled on /health.
	- Monitor disk usage when caching model weights.