Migrate to faster-whisper with INT8 quantization for ~4x speedup

90b0434 verified about 1 month ago


	# ── Tahkik Inference Space ──────────────────────────────────────────────────
	# Uses faster-whisper (CTranslate2 INT8) for ~4x faster inference vs PyTorch.
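	# To enable GPU (T4/L4/A100), switch to a CUDA runtime base image that bundles
	# cuDNN and use a full version tag, for example:
	#   FROM nvidia/cuda:12.1.1-cudnn8-runtime-ubuntu22.04
	# (match the CUDA/cuDNN versions to the installed CTranslate2 release).
	# CUDA images ship without Python, so install python3 and pip in the image,
	# then set compute_type="float16" in main.py.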
	# ────────────────────────────────────────────────────────────────────────────

	FROM python:3.10-slim

	# HF Spaces requires a non-root user with UID 1000.
	RUN useradd -m -u 1000 user

	WORKDIR /home/user/app

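	# Install dependencies as root, before switching user, so packages land in the
	# system site-packages and the uvicorn entry point is on PATH. Copying
	# requirements.txt first lets Docker reuse this layer when only code changes.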
	COPY --chown=user requirements.txt .
	RUN pip install --no-cache-dir -r requirements.txt

	# Copy application code.
	COPY --chown=user . .

	# Redirect Hugging Face model/cache downloads to /tmp, which is writable by
	# the non-root user at runtime in Spaces.
	ENV HF_HOME=/tmp/huggingface_cache
	ENV HF_HUB_DISABLE_PROGRESS_BARS=1
	ENV CT2_VERBOSE=0

	USER user

	EXPOSE 7860

	CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "7860"]
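
	# ── GPU variant (sketch, not part of the deployed image) ────────────────────
	# A minimal, untested outline of the GPU switch described in the header, kept
	# as comments so this file still builds the CPU image. Assumptions: the image
	# tag, the apt package names, and the python3 -m uvicorn entry point are
	# illustrative; check the CUDA/cuDNN versions your CTranslate2 release needs.
	#
	#   FROM nvidia/cuda:12.1.1-cudnn8-runtime-ubuntu22.04
	#
	#   # The CUDA base image has no Python, so install it first.
	#   RUN apt-get update \
	#       && apt-get install -y --no-install-recommends python3 python3-pip \
	#       && rm -rf /var/lib/apt/lists/*
	#
	#   RUN useradd -m -u 1000 user
	#   WORKDIR /home/user/app
	#
	#   COPY --chown=user requirements.txt .
	#   RUN pip3 install --no-cache-dir -r requirements.txt
	#   COPY --chown=user . .
	#
	#   ENV HF_HOME=/tmp/huggingface_cache
	#   USER user
	#   EXPOSE 7860
	#   CMD ["python3", "-m", "uvicorn", "main:app", "--host", "0.0.0.0", "--port", "7860"]
	#
	# With this variant, main.py would also need compute_type="float16", and the
	# Space must be assigned GPU hardware for the CUDA libraries to be usable.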